Usability of AST traversal in white-/blacklist

Long story short: the [blacklist validation](https://github.com/apluslms/python-grader-utils/tree/master/graderutils#validation-tasks) feature seems scarcely documented (#42) and buggy (this issue). What do the users actually need?

## User story

### Task: What the user tries to do

For the _CS-A1141 Data Structures and Algorithms Y_ course, I was investigating how to detect whether a student has submitted an exercise template without modifications. Students do this for three reasons:

1. having learned on the _Basics of programming Y1_ course that submitting a template reveals what kinds of graded unit tests are used;
2. testing whether one can gain some easy exercise points by just submitting the template;
3. by accident.

### Interaction with Python-grader-utils

I tried to implement this feature with Python-grader-utils as follows.

There is an exercise template like this:

File **eratosthenes.py**
```Python
# Implement the sieve_of_eratosthenes function here below

def sieve_of_eratosthenes(n):
    """ Return the number of prime numbers between the range (2, n) """

    raise NotImplementedError("sieve_of_eratosthenes function is missing!") #Remove this line

# Driver program
def main():
    n = 10
    nof_primes = sieve_of_eratosthenes(n)
    print("The number of primes between the range (2, {:d}) is: {:d}"
        .format(n, nof_primes))

if __name__ == "__main__":
    main()
```

Using the [blacklist validation](https://github.com/apluslms/python-grader-utils/tree/master/graderutils#validation-tasks) of Python-grader-utils, I implemented the following rule: if the student's code contains the text `NotImplementedError`, the student has likely submitted an unmodified exercise template, and therefore no further grading should be done. This was done with the following configuration.

File **test_config.yaml**
```yaml
test_groups:
  - module: tests
    description: Local tests
  - module: grader_tests
    description: Grader tests

validation:
  display_name: "Is the file structurally correct?"
  tasks:
  - type: python_syntax
    file: eratosthenes.py
    display_name: "The file is not a syntactically correct Python program."
  - type: python_import
    display_name: "The file has a function named `sieve_of_eratosthenes`."
    file: eratosthenes.py
    attrs:
      sieve_of_eratosthenes: function
  - type: python_blacklist
    display_name: "File does not contain restricted syntax"
    description: "There should not be NotImplementedError exceptions."
    file: eratosthenes.py
    node_dump_regexp:
      "NotImplementedError": "NotImplementedError found"
```

In principle, this works: when I submit the exercise template above, an error message is shown and further grader tests are not run. When I submit a correct solution, it is graded normally.

### Problem

However, in the case of submitting an unmodified exercise template, the output of Python-grader-utils, visible to the student, is rather complex and redundant:

<img width="892" height="820" alt="Image" src="https://github.com/user-attachments/assets/b80efb35-eef4-41b7-80d9-ae4c2066ad5f" />

## Analysis of the problem

In the validation error above, Python-grader-utils generates a similar error message five times:

- once on the file (module) level (line -1 in the message)
- once on the function level (line 3 in the message)
- three times on the statement level (line 5 in the message)

It seems this is related to the way the abstract syntax tree is traversed with [`ast.walk()` in `_check_python_restricted_syntax()`](https://github.com/apluslms/python-grader-utils/blob/master/graderutils/validation.py#L71-L104). Sure, it is a handy library function, but is it really suitable for the job?
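The repetition can be reproduced outside the grader. The following is a minimal sketch, not the actual graderutils code: it assumes that the check dumps every node visited by `ast.walk()` with `ast.dump()` and searches the dump with the configured regular expression, and that nodes without a `lineno` attribute are reported as line -1. The names `SOURCE`, `PATTERN`, and `matching_nodes` are made up for this illustration.

```python
import ast
import re

# A reduced version of the exercise template (two lines only).
SOURCE = (
    "def sieve_of_eratosthenes(n):\n"
    "    raise NotImplementedError('missing')\n"
)
PATTERN = "NotImplementedError"


def matching_nodes(source, pattern):
    """Return (node type, line number) for every node whose dump matches."""
    matches = []
    for node in ast.walk(ast.parse(source)):
        # The dump of a node includes the dumps of all its descendants,
        # so every ancestor of the offending node matches as well.
        if re.search(pattern, ast.dump(node)):
            matches.append((type(node).__name__, getattr(node, "lineno", -1)))
    return matches


for name, lineno in matching_nodes(SOURCE, PATTERN):
    print(name, lineno)
```

Under these assumptions, the single `raise` statement yields five matches: `Module`, `FunctionDef`, `Raise`, `Call`, and `Name`. That agrees with the one module-level, one function-level, and three statement-level messages listed above.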

Furthermore, consider a use case where an exercise template has several functions that the student is supposed to implement, e.g. `foo()`, `bar()`, and `baz()`. Each of these initially contains a `raise NotImplementedError` statement. Ideally, in my opinion, an automated grader should support grading a partially implemented solution. For example, the student could first implement `foo()` and submit it to see how it is graded. However, if we want the automated detection of an unmodified exercise template as described above, the student's solution is not graded until _all_ `raise NotImplementedError` statements have been removed. Of course, the student could temporarily replace each `raise` statement with, e.g., a `pass` statement, but `pass` is silent, nonfunctional code, whereas `raise NotImplementedError` actively reminds the student that they still have work to do when the program is run against unit tests. From this point of view, simple blacklisting of a certain statement is not sufficient.
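To illustrate what a per-function check could look like, here is a rough sketch that reports which top-level functions still raise `NotImplementedError`, so that a grader could skip only the tests for those functions. It is not part of Python-grader-utils; the names `unimplemented_functions`, `_raises_not_implemented`, and `TEMPLATE` are hypothetical, and the sketch only looks at plain `def` statements on the module level.

```python
import ast

# Hypothetical partially implemented submission: foo() is done,
# bar() and baz() still contain the template's placeholder.
TEMPLATE = (
    "def foo(n):\n"
    "    return n + 1\n"
    "\n"
    "def bar(n):\n"
    "    raise NotImplementedError\n"
    "\n"
    "def baz(n):\n"
    "    raise NotImplementedError('baz is missing!')\n"
)


def _raises_not_implemented(raise_node):
    """True for `raise NotImplementedError` and `raise NotImplementedError(...)`."""
    exc = raise_node.exc
    if isinstance(exc, ast.Call):
        exc = exc.func
    return isinstance(exc, ast.Name) and exc.id == "NotImplementedError"


def unimplemented_functions(source):
    """Return the names of top-level functions that still raise NotImplementedError."""
    stubs = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        # Inspect only Raise nodes inside this function, so each function
        # is reported at most once regardless of how the tree nests.
        if any(
            isinstance(inner, ast.Raise) and _raises_not_implemented(inner)
            for inner in ast.walk(node)
        ):
            stubs.append(node.name)
    return stubs


print(unimplemented_functions(TEMPLATE))  # ['bar', 'baz']
```

Matching on the `Raise` node itself, rather than on the dump of every node, gives one result per function and leaves `foo()` free to be graded.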

At this point, I decided to abandon the solution based on Python-grader-utils and instead write a custom structural checker for the student's Python program inside the graded unit tests on the CS-A1141 course.
