Releases: vmenger/deduce

v2.4.2

21 Nov 11:37
a94904f

2.4.2 (2023-11-21)

Changed

  • name lookup list contents, extending names and adding more exceptions

v2.4.1

15 Nov 14:46
2b28022

2.4.1 (2023-11-15)

Added

  • detection of initials Ch., Chr., Ph. and Th.

v2.4.0

15 Nov 13:44
4f74303

2.4.0 (2023-11-15)

Added

  • logic for detecting hospitals, with added whitelist and separate annotator

Changed

  • logic for detecting (non-hospital) institutions, with extended lookup list

Removed

  • the separate Altrecht annotator, now included in the lookup list

v2.3.1

01 Nov 13:33
426c08a

2.3.1 (2023-11-01)

Fixed

  • include data files recursively in package

v2.3.0

25 Oct 10:52
66f0e5e

2.3.0 (2023-10-25)

Added

  • lookup lists (and logic) for Dutch provinces, regions, municipalities and streets

Changed

  • name of residences annotator to placenames, now includes provinces, regions and municipalities
  • lookup lists (and logic) for residences
  • logic for streets, house numbers and house number letters

v2.2.0

28 Sep 09:47
3ccd61c

2.2.0 (2023-09-28)

Changed

  • tokenizer logic:
    • a token is now a sequence of alphanumeric characters, a single newline, or a single special character.
    • whitespace is no longer considered a token
  • moved token pattern logic to config, using a new TokenPatternAnnotator
  • moved context pattern logic to config, using a new ContextAnnotator
  • many updates to name detection logic
    • lookup list optimizations
    • added, removed and simplified patterns
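The token definition in this release (a run of alphanumeric characters, a single newline, or a single special character, with other whitespace dropped) can be illustrated with a short regular expression. This is only a sketch of the rule as worded in the changelog: `TOKEN_PATTERN` and `tokenize` are hypothetical names, and deduce's own tokenizer may differ in detail.

```python
import re

# Hypothetical pattern, not taken from deduce's source:
#   [^\W_]+  a run of alphanumeric characters (letters and digits, no underscore)
#   \n       a single newline
#   \S       any other single non-whitespace character ("special character")
# Spaces and tabs match none of the alternatives, so they never become tokens.
TOKEN_PATTERN = re.compile(r"[^\W_]+|\n|\S")


def tokenize(text: str) -> list[str]:
    """Split text into tokens following the rule described above."""
    return [match.group(0) for match in TOKEN_PATTERN.finditer(text)]


tokens = tokenize("Jan  Jansen,\n12-03")
print(tokens)
```

Under this sketch, repeated spaces vanish, while the comma, the newline and the hyphen each become a token of their own.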

v2.1.0

07 Aug 12:46
02349b8

2.1.0 (2023-08-07)

Added

  • a component for deidentifying BSN numbers (Dutch citizen service numbers)
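The changelog does not say how candidate BSNs are validated. As an illustration only, the sketch below assumes the standard Dutch eleven-test (elfproef) for nine-digit numbers; the function names `is_valid_bsn` and `find_bsns` are hypothetical and not part of deduce's API.

```python
import re

# Weights of the eleven-test for a 9-digit BSN: 9, 8, ..., 2 and -1 for the last digit.
BSN_WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)


def is_valid_bsn(candidate: str) -> bool:
    """Return True if candidate is 9 digits whose weighted sum is divisible by 11."""
    if not re.fullmatch(r"\d{9}", candidate):
        return False
    total = sum(int(digit) * weight for digit, weight in zip(candidate, BSN_WEIGHTS))
    return total % 11 == 0


def find_bsns(text: str) -> list[str]:
    """Return all standalone 9-digit numbers in text that pass the eleven-test."""
    return [m.group(0) for m in re.finditer(r"\b\d{9}\b", text) if is_valid_bsn(m.group(0))]


print(find_bsns("BSN 111222333, dossier 123456789"))
```

A checksum of this kind keeps arbitrary nine-digit numbers from being tagged as BSNs; whether deduce applies exactly this check is an assumption here.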

Changed

  • updated dependencies
  • by default, deduce now recognizes and tags BSN numbers
  • by default, deduce now recognizes all other 7+ digit numbers as identifiers
  • improved regular expressions for e-mail address and url matching, with separate tags
  • logic for detecting phone numbers (improvements for hyphens, whitespace, false positive identifiers)
  • improved regular expression for age matching
  • date detection logic:
    • now only recognizes combinations of day, month and year (day/month combinations caused many false positives)
    • detects year-month-day format in addition to day-month-year
  • loading a custom config now only replaces the config options that are explicitly set, using defaults for those not included in the custom config
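The custom config behaviour described in the last item amounts to overlaying user-supplied options on a copy of the defaults. The following is a minimal sketch under stated assumptions: the option names in `DEFAULT_CONFIG` and the function `load_config` are hypothetical, and the merge is done at the top level only, since the changelog does not say how nested options are handled.

```python
from copy import deepcopy

# Hypothetical default options, for illustration only (not deduce's real config keys).
DEFAULT_CONFIG = {
    "open_char": "<",
    "close_char": ">",
    "merge_adjacent": True,
}


def load_config(custom: dict) -> dict:
    """Return the defaults with only the explicitly set custom options replaced."""
    merged = deepcopy(DEFAULT_CONFIG)  # never mutate the shared defaults
    merged.update(custom)              # options missing from custom keep their default
    return merged


config = load_config({"open_char": "["})
print(config)
```

Here only `open_char` changes; `close_char` and `merge_adjacent` fall back to their default values.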

Fixed

  • annotations can no longer be counted as adjacent when separated by newline or tab (and will thus not be merged)
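To make the fix concrete, the sketch below treats two annotations as adjacent only when the text between them is a short run of spaces or simple punctuation, so a newline or tab in the gap prevents merging. The gap pattern and the name `are_adjacent` are assumptions for illustration; the changelog does not give deduce's actual rule.

```python
import re

# Hypothetical rule: the gap may hold at most two characters out of ". - , space".
# Newlines and tabs are deliberately absent from the character class.
GAP_PATTERN = re.compile(r"[.\-, ]{0,2}")


def are_adjacent(text: str, left_end: int, right_start: int) -> bool:
    """Return True if the text between two annotation spans allows merging them."""
    gap = text[left_end:right_start]
    return GAP_PATTERN.fullmatch(gap) is not None


print(are_adjacent("Jan Jansen", 3, 4))   # gap is a single space
print(are_adjacent("Jan\nJansen", 3, 4))  # gap is a newline
```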

Removed

  • a separate patient identifier tag, now superseded by a generic tag
  • detection of day/month combinations for dates, as this caused many false positives (e.g. lab values, numeric scores)
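The effect of requiring a full day, month and year can be sketched with a regular expression for numeric dates. This is an illustrative pattern only: `DATE_PATTERN` and `find_dates` are hypothetical, cover only numeric formats, and are not taken from deduce's source.

```python
import re

DAY = r"(?:0?[1-9]|[12]\d|3[01])"
MONTH = r"(?:0?[1-9]|1[0-2])"
SEP = r"[-/. ]"

# Hypothetical pattern: year-month-day (4-digit year) or day-month-year (2- or 4-digit year).
# A bare day/month pair such as "3/4" has no third component and therefore never matches.
DATE_PATTERN = re.compile(
    rf"\b(?:19|20)\d{{2}}{SEP}{MONTH}{SEP}{DAY}\b"
    rf"|\b{DAY}{SEP}{MONTH}{SEP}(?:19|20)?\d{{2}}\b"
)


def find_dates(text: str) -> list[str]:
    """Return numeric dates that contain a day, a month and a year."""
    return [match.group(0) for match in DATE_PATTERN.finditer(text)]


print(find_dates("Opname 12-03-2021, lab 3/4, controle 2021-03-12"))
```

In this example the lab value `3/4` is left alone, while both full dates are found.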

Deprecated

  • backwards compatibility, which was temporarily added to ease the transition from v1 to v2

v2.0.3

06 Apr 08:47
53616e7

2.0.3 (2023-04-06)

Fixed

  • removed 'decibutus' from list of institutions as it caused many false positives

v2.0.2

28 Mar 14:56
1d2d37c

2.0.2 (2023-03-28)

Changed

  • upgraded dependencies, including markdown-it-py which had a vulnerability

v2.0.1

09 Dec 11:22
4c70a5f

2.0.1 (2022-12-09)

Changed

  • updated dependencies