Open
Description
For some applications, it would be nice to have a mapping back from the token/mask (the redacted value) to the original text, e.g.:
patient [NAAM-1] met tel. nr. [TELEFOONNUMMER-1]
should have a mapping:
{'NAAM-1': 'Jan Jansen', 'TELEFOONNUMMER-1': '0612345678'}
That allows re-identification downstream. This requires some thought, because:
- Annotations are frozen dataclasses, so it's hard to set a value on them (i.e. the mask)
- It's not really straightforward to require every `docdeid.process.Redactor` to add a mask to an annotation. For redactors like `RedactAllText`, this does not even make sense.
- The current Deduce redactor does fuzzy matching, so a token potentially maps to multiple original values
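
To make the idea concrete, here is a minimal sketch of one possible approach: keep the mask out of the annotation entirely and let the redaction step return the mapping alongside the redacted text. Everything in it is an assumption for illustration. The `Annotation` class is a hypothetical stand-in rather than the docdeid class, `redact_with_mapping` and `reidentify` are made-up names, and matching is exact rather than fuzzy.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for an annotation; NOT the docdeid class."""
    text: str
    start_char: int
    end_char: int
    tag: str


def redact_with_mapping(text, annotations):
    """Redact `text` and return (redacted_text, {mask: original}).

    The mask lives in the returned mapping, so the frozen annotations
    are never mutated. Matching is exact; with fuzzy matching the
    values would have to be lists of originals instead of one string.
    """
    mapping = {}   # mask -> original text
    seen = {}      # (tag, original text) -> mask already assigned
    counters = {}  # tag -> number of distinct values seen so far
    pieces = []
    cursor = 0
    for ann in sorted(annotations, key=lambda a: a.start_char):
        key = (ann.tag, ann.text)
        if key not in seen:
            counters[ann.tag] = counters.get(ann.tag, 0) + 1
            mask = f"{ann.tag.upper()}-{counters[ann.tag]}"
            seen[key] = mask
            mapping[mask] = ann.text
        pieces.append(text[cursor:ann.start_char])
        pieces.append(f"[{seen[key]}]")
        cursor = ann.end_char
    pieces.append(text[cursor:])
    return "".join(pieces), mapping


def reidentify(redacted, mapping):
    """Put the original values back; unknown masks are left untouched."""
    return re.sub(
        r"\[([^\[\]]+)\]",
        lambda m: mapping.get(m.group(1), m.group(0)),
        redacted,
    )
```

With this shape, redactors for which a mask makes no sense (such as one that redacts all text) could simply return an empty mapping. The fuzzy-matching case is not handled here and would still need a decision on how to represent multiple originals per mask.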