Too many false positives #4

@tshprecher

The classifier yields too many false positives. Some ideas on improving this:

  1. Prune results that occur more than once in a file. It's reasonable to assume that a typo will not occur more than once in a file.
  2. Prune results that occur more than once across all the files. Same as #1 ("The set of english words is poor (missing words like "called" and "having"!)"), except slightly more strict.
  3. Perhaps instead of skipping the symbols in code, store them and exclude any references to them in comments.
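The three ideas above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: the function names, the `{filename: [flagged words]}` input shape, and the identifier regex are all assumptions made for the example, and matching is case-sensitive.

```python
import re
from collections import Counter
from typing import Dict, List

# Hypothetical input shape: filename -> words the classifier flagged as typos.
Candidates = Dict[str, List[str]]


def prune_repeated_in_file(candidates: Candidates) -> Candidates:
    """Idea 1: drop words flagged more than once within the same file."""
    pruned = {}
    for path, words in candidates.items():
        counts = Counter(words)
        pruned[path] = [w for w in words if counts[w] == 1]
    return pruned


def prune_repeated_across_files(candidates: Candidates) -> Candidates:
    """Idea 2: drop words flagged more than once across all files combined."""
    totals = Counter(w for words in candidates.values() for w in words)
    return {
        path: [w for w in words if totals[w] == 1]
        for path, words in candidates.items()
    }


# Assumed definition of a "symbol": a C-style identifier.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def prune_code_symbols(words: List[str], code_text: str) -> List[str]:
    """Idea 3: drop flagged comment words that also appear as symbols in code."""
    symbols = set(IDENTIFIER.findall(code_text))
    return [w for w in words if w not in symbols]


if __name__ == "__main__":
    flagged = {"a.c": ["recieve", "foo", "foo"], "b.c": ["recieve", "teh"]}
    print(prune_repeated_in_file(flagged))
    print(prune_repeated_across_files(flagged))
    print(prune_code_symbols(["strncpy", "teh"], "strncpy(dst, src, n);"))
```

Note the trade-off: a real typo that happens to be repeated (for example, a misspelling copied between files) would be pruned by ideas 1 and 2, so this trades some recall for precision.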
