Paul Avillach holds an MD in public health and epidemiology from the University of Bordeaux and a PhD in biomedical informatics from the University of Marseilles. His research focuses on developing novel methods and techniques for integrating multiple heterogeneous clinical cohorts, electronic health record data, and multiple types of genomics data to encompass biological observations. He is PI and Co-Investigator on several large projects at DBMI, including the BD2K PIC-SURE Center of Excellence, the Global Rare Diseases Registry project, the PCORI ARCH project, and the PCORI Phelan-McDermid Syndrome project.

Visit the Avillach Lab for more information.


Publications

  1. A high-throughput phenotyping algorithm is portable from adult to pediatric populations. J Am Med Inform Assoc. 2021 Feb 17.
  2. Validation of an Internationally Derived Patient Severity Phenotype to Support COVID-19 Analytics from Electronic Health Record Data. J Am Med Inform Assoc. 2021 Feb 10.
  3. GenoPheno: cataloging large-scale phenotypic and next-generation sequencing data within human datasets. Brief Bioinform. 2021 Jan 18; 22(1):55-65.
  4. What Every Reader Should Know About Studies Using Electronic Health Record Data but May be Afraid to Ask. J Med Internet Res. 2021 Jan 10.
  5. Vascular and metabolic risk factor differences prior to dementia diagnosis: a multidatabase case-control study using European electronic health records. BMJ Open. 2020 Nov 14; 10(11):e038753.
  6. International electronic health record-derived COVID-19 clinical course profiles: the 4CE consortium. NPJ Digit Med. 2020; 3:109.
  7. Scalability and cost-effectiveness analysis of whole genome-wide association studies on Google Cloud Platform and Amazon Web Services. J Am Med Inform Assoc. 2020 Jul 27.
  8. Development and validation of a Paediatric Early Warning Score for use in the emergency department: a multicentre study. Lancet Child Adolesc Health. 2020 Aug; 4(8):583-591.
  9. A Semi-Automated Approach for Multilingual Terminology Matching: Mapping the French Version of the ICD-10 to the ICD-10 CM. Stud Health Technol Inform. 2020 Jun 16; 270:18-22.
  10. Treatment pathway analysis of newly diagnosed dementia patients in four electronic health record databases in Europe. Soc Psychiatry Psychiatr Epidemiol. 2021 Mar; 56(3):409-416.
  11. Methotrexate and relative risk of dementia amongst patients with rheumatoid arthritis: a multi-national multi-database case-control study. Alzheimers Res Ther. 2020 Apr 06; 12(1):38.
  12. Non-alcoholic fatty liver disease and risk of incident acute myocardial infarction and stroke: findings from matched cohort study of 18 million European adults. BMJ. 2019 Oct 08; 367:l5367.
  13. The Genomics Research and Innovation Network: creating an interoperable, federated, genomics learning system. Genet Med. 2020 Feb; 22(2):371-380.
  14. Use of natural language processing in electronic medical records to identify pregnant women with suicidal behavior: towards a solution to the complex classification problem. Eur J Epidemiol. 2019 Feb; 34(2):153-162.
  15. Real-world data reveal a diagnostic gap in non-alcoholic fatty liver disease. BMC Med. 2018 Aug 13; 16(1):130.
  16. Screening pregnant women for suicidal behavior in electronic medical records: diagnostic codes vs. clinical notes processed by natural language processing. BMC Med Inform Decis Mak. 2018 May 29; 18(1):30.
  17. Adverse obstetric and neonatal outcomes complicated by psychosis among pregnant women in the United States. BMC Pregnancy Childbirth. 2018 May 02; 18(1):120.
  18. Rcupcake: an R package for querying and analyzing biomedical data through the BD2K PIC-SURE RESTful API. Bioinformatics. 2018 Apr 15; 34(8):1431-1432.
  19. Adverse obstetric outcomes during delivery hospitalizations complicated by suicidal behavior among US pregnant women. PLoS One. 2018; 13(2):e0192943.
  20. Health assessment of French university students and risk factors associated with mental health disorders. PLoS One. 2017; 12(11):e0188187.
  21. Phelan-McDermid syndrome data network: Integrating patient reported outcomes with clinical notes and curated genetic reports. Am J Med Genet B Neuropsychiatr Genet. 2018 Oct; 177(7):613-624.
  22. Dementia prevalence and incidence in a federation of European Electronic Health Record databases: The European Medical Informatics Framework resource. Alzheimers Dement. 2018 Feb; 14(2):130-139.
  23. CodeMapper: semiautomatic coding of case definitions. A contribution from the ADVANCE project. Pharmacoepidemiol Drug Saf. 2017 Aug; 26(8):998-1005.
  24. The Georges Pompidou University Hospital Clinical Data Warehouse: A 8-years follow-up experience. Int J Med Inform. 2017 Jun; 102:21-28.
  25. Identifying Cases of Type 2 Diabetes in Heterogeneous Data Sources: Strategy from the EMIF Project. PLoS One. 2016; 11(8):e0160648.
  26. An informatics research agenda to support precision medicine: seven key areas. J Am Med Inform Assoc. 2016 Jul; 23(4):791-5.
  27. Data Extraction and Management in Networks of Observational Health Care Databases for Scientific Research: A Comparison of EU-ADR, OMOP, Mini-Sentinel and MATRICE Strategies. EGEMS (Wash DC). 2016; 4(1):1189.
  28. Evaluating the Impact of Computerized Provider Order Entry on Medical Students Training at Bedside: A Randomized Controlled Trial. PLoS One. 2015; 10(9):e0138094.
  29. Detection of Drug-Drug Interactions Inducing Acute Kidney Injury by Electronic Health Records Mining. Drug Saf. 2015 Sep; 38(9):799-809.
  30. [Limiting a Medline/PubMed query to the "best" articles using the JCR relative impact factor]. Rev Epidemiol Sante Publique. 2014 Dec; 62(6):361-5.
  31. Guide to good practices to ensure privacy protection in secondary use of medical records. Rev Epidemiol Sante Publique. 2014 Jun; 62(3):207-14.
  32. Etiologies and diagnostic work-up of extreme macrocytosis defined by an erythrocyte mean corpuscular volume over 130 fL: A study of 109 patients. Am J Hematol. 2014 Jun; 89(6):665-6.
  33. Translational research platforms integrating clinical and omics data: a review of publicly available solutions. Brief Bioinform. 2015 Mar; 16(2):280-90.
  34. Signal detection of potentially drug-induced acute liver injury in children using a multi-country healthcare database network. Drug Saf. 2014 Feb; 37(2):99-108.
  35. Urinary retinol binding protein is a marker of the extent of interstitial kidney fibrosis. PLoS One. 2014; 9(1):e84708.
  36. Phenome-wide association studies on a quantitative trait: application to TPMT enzyme activity and thiopurine therapy in pharmacogenomics. PLoS Comput Biol. 2013; 9(12):e1003405.
  37. Gathering and exploring scientific knowledge in pharmacovigilance. PLoS One. 2013; 8(12):e83016.
  38. Characteristics and outcomes of sudden cardiac arrest during sports in women. Circ Arrhythm Electrophysiol. 2013 Dec; 6(6):1185-91.
  39. Drug-induced acute myocardial infarction: identifying 'prime suspects' from electronic healthcare records-based surveillance system. PLoS One. 2013; 8(8):e72148.
  40. Major regional disparities in outcomes after sudden cardiac arrest during sports. Eur Heart J. 2013 Dec; 34(47):3632-40.
  41. Pilot evaluation of an automated method to decrease false-positive signals induced by co-prescriptions in spontaneous reporting databases. Pharmacoepidemiol Drug Saf. 2014 Feb; 23(2):186-94.
  42. A reference standard for evaluation of methods for drug safety signal detection using electronic healthcare record databases. Drug Saf. 2013 Jan; 36(1):13-23.
  43. The EU-ADR Web Platform: delivering advanced pharmacovigilance tools. Pharmacoepidemiol Drug Saf. 2013 May; 22(5):459-67.
  44. Design and validation of an automated method to detect known adverse drug reactions in MEDLINE: a contribution from the EU-ADR project. J Am Med Inform Assoc. 2013 May 01; 20(3):446-52.
  45. Effect of competition bias in safety signal generation: analysis of a research database of spontaneous reports in France. Drug Saf. 2012 Oct 01; 35(10):855-64.
  46. Risk factors and clinical outcome of unsuspected pulmonary embolism in cancer patients: a case-control study. J Thromb Haemost. 2012 Oct; 10(10):2032-8.
  47. Harmonization process for the identification of medical events in eight European healthcare databases: the experience from the EU-ADR project. J Am Med Inform Assoc. 2013 Jan 01; 20(1):184-92.
  48. Automatic filtering and substantiation of drug safety signals. PLoS Comput Biol. 2012; 8(4):e1002457.
  49. EU-ADR healthcare database network vs. spontaneous reporting system database: preliminary comparison of signal detection. Stud Health Technol Inform. 2011; 166:25-30.
  50. A potential competition bias in the detection of safety signals from spontaneous reporting databases. Pharmacoepidemiol Drug Saf. 2010 Nov; 19(11):1166-71.
  51. Design and evaluation of a semantic approach for the homogeneous identification of events in eight patient databases: a contribution to the European EU-ADR project. Stud Health Technol Inform. 2010; 160(Pt 2):1085-9.
  52. A semantic approach for the homogeneous identification of events in eight patient databases: a contribution to the European eu-ADR project. Stud Health Technol Inform. 2009; 150:190-4.
  53. Using discharge abstracts to evaluate a regional perinatal network: assessment of the linkage procedure of anonymous data. Int J Telemed Appl. 2009; 2009:181842.
  54. Improving the quality of the coding of primary diagnosis in standardized discharge summaries. Health Care Manag Sci. 2008 Jun; 11(2):147-51.
  55. Using knowledge for indexing health web resources in a quality-controlled gateway. Stud Health Technol Inform. 2008; 136:205-10.
  56. Building application-related patient identifiers: what solution for a European country? Int J Telemed Appl. 2008; 678302.
  57. A model for indexing medical documents combining statistical and symbolic knowledge. AMIA Annu Symp Proc. 2007 Oct 11; 31-5.
  58. Interoperability issues regarding patient identification in Europe. Annu Int Conf IEEE Eng Med Biol Soc. 2007; 2007:6161.
  59. Proposal of a French health identification number interoperable at the European level. Stud Health Technol Inform. 2007; 129(Pt 1):503-7.
  60. How to manage secure direct access of European patients to their computerized medical record and personal medical record. Stud Health Technol Inform. 2007; 127:246-55.