Placeholder for the M3 packages to be shared, including code, tools, some learning data, and sample testing data.
Arooj Hussain, Haifa Alrdahi, Hendrik Šuvalov, Lifeng Han, Meghna Jani, Will Dixon, Goran Nenadic. "M3: Extracting medication and related attributes from outpatient letters". HealTAC 2023: Healthcare Text Analytics Conference, Manchester, June 14-16, 2023. Poster link: M3-Poster | Presentation-photo | Presentation-film
The models hosted here are listed below. For more models (LLMs and TransformerCRF), go to this link.
Fine-tuned models and outcomes:
- Clinical-XLM-RoBERTa (Clinical-XR): fine-tuned for the medication extraction task
- Extended-Med7 (Med7+): extended to 9 labels
Evaluation criteria:
- Strict: exact boundary match over the surface string and the entity type
- Exact: exact boundary match over the surface string, regardless of the type
- Partial: partial boundary match over the surface string, regardless of the type
- Type: some overlap between the system-tagged entity and the gold annotation is required
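The four criteria above can be sketched as a small helper, assuming entities are represented as `(start, end, type)` tuples over character offsets. The function name and representation are illustrative, not taken from the M3 codebase:

```python
def match_scheme(pred, gold):
    """Return which of the four matching schemes a (pred, gold) pair satisfies.

    pred, gold: (start, end, type) tuples with half-open character offsets.
    """
    p_start, p_end, p_type = pred
    g_start, g_end, g_type = gold
    overlaps = p_start < g_end and g_start < p_end
    same_boundary = (p_start, p_end) == (g_start, g_end)
    matches = []
    if same_boundary and p_type == g_type:
        matches.append("strict")   # exact boundary and entity type
    if same_boundary:
        matches.append("exact")    # exact boundary, type ignored
    if overlaps:
        matches.append("partial")  # any boundary overlap, type ignored
    if overlaps and p_type == g_type:
        matches.append("type")     # some overlap with matching type
    return matches
```

For example, a prediction with the right span but the wrong label counts under Exact and Partial but not Strict or Type.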
Med7+ instructions using Colab: step-by-step documentation
Clinical-XLM-R instructions using Colab: coming soon. Click here for saved models (1GB+).
Direct deployment of Med7 on the 15% testing set.
Med7+: fine-tuned Med7+ performance on the n2c2-2018 shared-task data, using our own data split (70/15/15%) over the 505 original annotated letters:
- generates 9 labels, vs. 7 from the original Med7
- micro: precision 91%, recall 88%, F1 89%; weighted: precision 90%, recall 88%, F1 89%
Clinical-XLM-RoBERTa:
{'overall_precision': 0.8798480837840948, 'overall_recall': 0.9014267185473411, 'overall_f1': 0.8905066977285965, 'overall_accuracy': 0.9676871902790035}
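As a quick consistency check on the numbers above, the reported overall F1 is the harmonic mean of the overall precision and recall (assuming a seqeval-style entity-level evaluation, which reports scores in this dict format):

```python
# Reported Clinical-XLM-RoBERTa scores from the results above.
p = 0.8798480837840948
r = 0.9014267185473411

# F1 as the harmonic mean of precision and recall.
f1 = 2 * p * r / (p + r)
```

This reproduces the reported `overall_f1` of 0.8905 to within floating-point rounding.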
@misc{alrdahi2023medmine,
  title={MedMine: Examining Pre-trained Language Models on Medication Mining},
  author={Haifa Alrdahi and Lifeng Han and Hendrik Šuvalov and Goran Nenadic},
  year={2023},
  eprint={2308.03629},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}




