Usability of AST traversal in white-/blacklist #43

@atilantera

Long story short: the blacklist validation feature seems scarcely documented (#42) and buggy (this issue). What do the users actually need?

User story

Task: What the user tries to do

For the CS-A1141 Data Structures and Algorithms Y course, I was investigating how to detect whether a student has submitted an exercise template without modifications. Students do this for three reasons:

  1. having learned on the Basics of programming Y1 course that submitting a template reveals what kinds of graded unit tests are used;
  2. testing whether one can gain some easy exercise points by just submitting the template;
  3. by accident.

Interaction with Python-grader-utils

I tried to implement this feature with Python-grader-utils as follows.

There is an exercise template like this:

File eratosthenes.py

# Implement the sieve_of_eratosthenes function here below

def sieve_of_eratosthenes(n):
    """ Return the number of prime numbers between the range (2, n) """

    raise NotImplementedError("sieve_of_eratosthenes function is missing!") #Remove this line

# Driver program
def main():
    n = 10
    nof_primes = sieve_of_eratosthenes(n)
    print("The number of primes between the range (2, {:d}) is: {:d}"
        .format(n, nof_primes))

if __name__ == "__main__":
    main()

Using the blacklist validation of Python-grader-utils, I implemented the following check: if the student's code contains the text NotImplementedError, the student has likely submitted an unmodified exercise template, and no further grading should be done. This was done with the following configuration.

File test_config.yaml

test_groups:
  - module: tests
    description: Local tests
  - module: grader_tests
    description: Grader tests

validation:
  display_name: "Is the file structurally correct?"
  tasks:
  - type: python_syntax
    file: eratosthenes.py
    display_name: "The file is not a syntactically correct Python program."
  - type: python_import
    display_name: "The file has a function named `sieve_of_eratosthenes`."
    file: eratosthenes.py
    attrs:
      sieve_of_eratosthenes: function
  - type: python_blacklist
    display_name: "File does not contain restricted syntax"
    description: "There should not be NotImplementedError exceptions."
    file: eratosthenes.py
    node_dump_regexp:
      "NotImplementedError": "NotImplementedError found"

In principle, this works: When I submit the exercise template as above, there is an error message and further grader tests are not run. When I submit a correct solution, it is graded normally.

Problem

However, in the case of submitting an unmodified exercise template, the output of Python-grader-utils, visible to the student, is rather complex and redundant:

[Image: screenshot of the validation error output shown to the student]

Analysis of the problem

In the validation error above, Python-grader-utils generates a similar error message five times:

  • once on the file (module) level (line -1 in the message)
  • once on the function level (line 3 in the message)
  • three times on the statement level (line 5 in the message)

It seems this is related to the way the abstract syntax tree is traversed with ast.walk() in _check_python_restricted_syntax(). Sure, it is a handy library function, but is it really suitable for the job?
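
The five-fold repetition can be reproduced with a minimal sketch. It assumes that the checker matches the regular expression against the ast.dump() of every node yielded by ast.walk(); I have not verified that _check_python_restricted_syntax() does exactly this. The helpers matching_nodes and innermost_matches are hypothetical names for illustration, not part of Python-grader-utils. Because the dump of a node includes the dumps of all its descendants, every ancestor of the offending name matches as well.

```python
import ast
import re

# Shortened version of the exercise template (hypothetical test input).
SOURCE = '''\
def sieve_of_eratosthenes(n):
    """ Return the number of prime numbers between the range (2, n) """
    raise NotImplementedError("sieve_of_eratosthenes function is missing!")
'''


def matching_nodes(source, pattern):
    """Return (node type, line number) for every node whose dump matches.

    Assumed to mimic the blacklist check: the dump of a node contains the
    dumps of its descendants, so each ancestor of a match also matches.
    """
    regexp = re.compile(pattern)
    matches = []
    for node in ast.walk(ast.parse(source)):
        if regexp.search(ast.dump(node)):
            # Module nodes have no line number; use -1 as a placeholder.
            matches.append((type(node).__name__, getattr(node, "lineno", -1)))
    return matches


def innermost_matches(source, pattern):
    """Return only the matching nodes that have no matching child node."""
    regexp = re.compile(pattern)
    matches = []
    for node in ast.walk(ast.parse(source)):
        if not regexp.search(ast.dump(node)):
            continue
        children = ast.iter_child_nodes(node)
        if any(regexp.search(ast.dump(child)) for child in children):
            continue  # a descendant will be reported instead
        matches.append((type(node).__name__, getattr(node, "lineno", -1)))
    return matches


if __name__ == "__main__":
    for match in matching_nodes(SOURCE, "NotImplementedError"):
        print("all:", match)
    for match in innermost_matches(SOURCE, "NotImplementedError"):
        print("innermost:", match)
```

With this input, the first function reports the Module, FunctionDef, Raise, Call, and Name nodes, which corresponds to the one module-level, one function-level, and three statement-level messages listed above. Reporting only the innermost match is one possible way to reduce the output to a single message; whether that suits every blacklist rule is an open question.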

Furthermore, consider a use case where an exercise template has several functions that the student is supposed to implement, e.g. foo(), bar(), and baz(). Each of these initially contains a raise NotImplementedError statement. Ideally, in my opinion, an automated grader should support grading a partially implemented solution. For example, the student could first implement foo() and submit it to see how it is graded. However, if we want automated detection of an unmodified exercise template as described above, the student's solution is not graded until all raise NotImplementedError statements have been removed. Of course, the student could temporarily replace all the raise statements with, e.g., a pass statement, but pass is silent, nonfunctional code, while the raise statement actively reminds the student that they still have work to do when the program is run against unit tests. From this point of view, simple blacklisting of a certain statement is not sufficient.
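
To illustrate the per-function alternative, here is a rough sketch of a check that reports which top-level functions still raise NotImplementedError. It is not the checker used on the course and not part of Python-grader-utils; the function names unimplemented_functions and _raises_not_implemented, and the example submission, are hypothetical.

```python
import ast

# Hypothetical partial submission: foo() is done, bar() and baz() are not.
SUBMISSION = '''\
def foo(n):
    return n + 1

def bar(n):
    raise NotImplementedError("bar function is missing!")

def baz(n):
    raise NotImplementedError
'''


def _raises_not_implemented(raise_node):
    """Return True if the raise statement raises NotImplementedError."""
    exc = raise_node.exc
    if isinstance(exc, ast.Call):
        # raise NotImplementedError("message") wraps the name in a call.
        exc = exc.func
    return isinstance(exc, ast.Name) and exc.id == "NotImplementedError"


def unimplemented_functions(source):
    """Return names of top-level functions that still raise NotImplementedError."""
    names = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        for child in ast.walk(node):
            if isinstance(child, ast.Raise) and _raises_not_implemented(child):
                names.append(node.name)
                break  # one report per function is enough
    return names


if __name__ == "__main__":
    print(unimplemented_functions(SUBMISSION))
```

With such a list, the grader could skip only the tests of the functions that are still missing, and treat the submission as an unmodified template only when every expected function is on the list.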

At this point, I decided to abandon the solution based on Python-grader-utils and instead write a custom structural checker for the student's Python program inside the graded unit tests on the CS-A1141 course.
