Collaborative and privacy-preserving workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions
This repository contains the computer code used for the article:
@unpublished{petitjean2023nlp,
author = {Thomas Petit-Jean and Christel Gerardin and Emmanuelle Berthelot and Gilles Chatellier and Marie Frank and Xavier Tannier and Emmanuelle Kempf and Romain Bey},
title = {Collaborative and privacy-preserving workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions},
note = {Manuscript submitted for publication},
year = {2023}
}
The code has been executed on the database of the Greater Paris University Hospitals.
- Code of article after review.
- edsml: library developed to facilitate the development of the qualification algorithms. Based on pytorch and pytorch-lightning.
- ecci_qualifier: source code of the machine-learning-based qualification algorithm.
- ecci: code used during training for document selection, the CLAIM pipeline, or the annotation process.
- analysis: code used to generate the results for the paper.
We would like to thank Assistance Publique – Hôpitaux de Paris and AP-HP Foundation for funding this project.