HECTA-UoM/M3

M3: Manchester Medical Mining

Placeholder for the M3 packages to be shared, including code, tools, some learning data, and sample test data.

Presentations and Reports

Arooj Hussain, Haifa Alrdahi, Hendrik Šuvalov, Lifeng Han, Meghna Jani, Will Dixon, Goran Nenadic. "M3: Extracting medication and related attributes from outpatient letters". HealTAC 2023: Healthcare Text Analytics Conference 2023, Manchester, June 14-16, 2023. Poster link: M3-Poster | Presentation-photo | Presentation-film

We fine-tuned the following models, with these outcomes:

  1. Clinical-xlm-Roberta (Clinical-XR): fine-tuned for the medication extraction task

  2. Extended-Med7 (Med7+): extended to 9 labels

Evaluation Metrics

  • Strict: exact boundary surface string match and entity type
  • Exact: exact boundary match over the surface string, regardless of the type
  • Partial: partial boundary match over the surface string, regardless of the type
  • Type: some overlap between the system tagged entity and the gold annotation is required
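The four matching schemes can be illustrated with a small sketch that compares one predicted span against one gold span. This is not the repository's evaluation code: the `Span` type, the character offsets, and the label names are hypothetical, and the sketch assumes that the Type scheme also requires the entity label to agree (as in the SemEval-2013 style of NER evaluation that these names appear to follow).

```python
from typing import Dict, NamedTuple


class Span(NamedTuple):
    """A tagged entity: character offsets (end exclusive) plus a label."""
    start: int
    end: int
    label: str


def overlaps(a: Span, b: Span) -> bool:
    """True when the two spans share at least one character."""
    return a.start < b.end and b.start < a.end


def match_flags(pred: Span, gold: Span) -> Dict[str, bool]:
    """Report which of the four schemes count `pred` as matching `gold`."""
    same_bounds = (pred.start, pred.end) == (gold.start, gold.end)
    any_overlap = overlaps(pred, gold)
    return {
        "strict": same_bounds and pred.label == gold.label,
        "exact": same_bounds,
        "partial": any_overlap,
        # Assumption: Type needs some overlap AND the same label.
        "type": any_overlap and pred.label == gold.label,
    }


gold = Span(10, 19, "Drug")                    # hypothetical gold annotation
print(match_flags(Span(10, 19, "Drug"), gold))      # same boundary, same label
print(match_flags(Span(10, 19, "Strength"), gold))  # same boundary, wrong label
print(match_flags(Span(10, 15, "Drug"), gold))      # partial boundary, same label
```

A full scorer would also count missed and spurious entities and usually gives partial matches half credit; the sketch only shows how the four criteria differ.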

Model instructions and running logs

Med7+ instructions: step-by-step documentation using Colab

Med7+ generated outputs are available as Excel files in 'generated_output_excels.zip' (65MB). All generated data (691MB), including the NER, tok2vec, and vocab folders as listed below, can also be downloaded.

Med7+ Colab file download

Clinical-XLM-R instructions using Colab: coming soon. Click here for the saved models (1GB+).

Evaluation Scores

Direct deployment of Med7 on the 15% test set.

Med7+: performance of the fine-tuned Med7+ on the n2c2-2018 shared task data, using our own data split (70/15/15%) over all 505 original annotated letters:

  • generating 9 labels, vs 7 from original Med7
  • micro: precision 91%, recall 88%, f1 89%. weighted: precision 90%, recall 88%, f1 89%.
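The difference between the micro and weighted averages reported above can be sketched as follows. The per-label counts are invented purely for illustration (they are not results from this repository), and the helper names are hypothetical. Micro averaging pools the counts across labels before computing the scores, while weighted averaging computes the scores per label and then averages them by each label's support (its number of gold entities).

```python
from typing import Dict, Tuple

Counts = Dict[str, Dict[str, int]]


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from true/false positive and false negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro_average(counts: Counts) -> Tuple[float, float, float]:
    """Pool the counts over all labels, then score once."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    return prf(tp, fp, fn)


def weighted_average(counts: Counts) -> Tuple[float, float, float]:
    """Score each label, then average weighted by support (tp + fn)."""
    support = {label: c["tp"] + c["fn"] for label, c in counts.items()}
    total = sum(support.values())
    scores = {label: prf(c["tp"], c["fp"], c["fn"]) for label, c in counts.items()}
    p, r, f = (
        sum(scores[label][i] * support[label] for label in counts) / total
        for i in range(3)
    )
    return p, r, f


# Hypothetical counts for two labels, for illustration only.
example = {
    "Drug": {"tp": 90, "fp": 10, "fn": 10},
    "Route": {"tp": 5, "fp": 5, "fn": 15},
}
print("micro   :", micro_average(example))
print("weighted:", weighted_average(example))
```

With counts like these, the two averages give the same recall but different precision, which is why both are worth reporting when label frequencies are unbalanced.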

Clinical-xlm-Roberta:

'overall_precision': 0.8798480837840948, 'overall_recall': 0.9014267185473411, 'overall_f1': 0.8905066977285965, 'overall_accuracy': 0.9676871902790035
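As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below uses only the three values printed above; the function name is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the Clinical-xlm-Roberta log line above.
reported = {
    "overall_precision": 0.8798480837840948,
    "overall_recall": 0.9014267185473411,
    "overall_f1": 0.8905066977285965,
}

recomputed = f1_score(reported["overall_precision"], reported["overall_recall"])
print(f"recomputed F1 = {recomputed:.6f}, reported F1 = {reported['overall_f1']:.6f}")
```

The two values agree, so the three reported figures are internally consistent (about 88.0% precision, 90.1% recall, 89.1% F1).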

Acknowledgement

  • Med7
  • xlm-Roberta-base

References

@misc{alrdahi2023medmine,
  title={MedMine: Examining Pre-trained Language Models on Medication Mining},
  author={Haifa Alrdahi and Lifeng Han and Hendrik Šuvalov and Goran Nenadic},
  year={2023},
  eprint={2308.03629},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
