Using social media, Internet searches, and electronic health records to predict incidence of flu and dengue in multiple locations worldwide. Using electronic health records to predict outcomes in pediatric intensive care units.


Mauricio Santillana is an Assistant Professor at Harvard Medical School, a faculty member in the Computational Health Informatics Program at Boston Children’s Hospital, and an associate at the Harvard Institute for Applied and Computational Sciences. Mauricio enjoys working with clinicians in the design of decision-making support tools.

Mauricio is a physicist and applied mathematician with expertise in mathematical modeling and scientific computing. He has worked in multiple research areas, frequently analyzing large data sets to understand and predict the behavior of complex systems. His research modeling population growth patterns has informed policymakers in Mexico and Texas. His research in numerical analysis and computational fluid dynamics has been used to improve models of coastal floods due to hurricanes and to improve the performance of global atmospheric chemistry models. In recent years, his main interest has been developing mathematical models to improve healthcare. Specifically, he has leveraged large data sets from Internet-based services (such as Google, Twitter, Flu Near You, and weather data) and electronic health records (EHR) to predict disease incidence in multiple locations worldwide and to predict outcomes in hospitalized patients. Dr. Santillana has advised the CDC and the White House on the development of population-wide disease forecasting tools.

Mauricio received a B.S. in physics with highest honors from the Universidad Nacional Autónoma de México in Mexico City, and a master’s and PhD in computational and applied mathematics from the University of Texas at Austin. He first joined Harvard as a postdoctoral fellow at the Harvard Center for the Environment and has been a lecturer in applied mathematics at Harvard SEAS, where he received two awards for excellence in teaching.


Publications

  1. A nowcasting framework for correcting for reporting delays in malaria surveillance. PLoS Comput Biol. 2021 11; 17(11):e1009570.
  2. Association Between Social Media Use and Self-reported Symptoms of Depression in US Adults. JAMA Netw Open. 2021 11 01; 4(11):e2136113.
  3. SARS-CoV-2 RNA concentrations in wastewater foreshadow dynamics and clinical presentation of new COVID-19 cases. Sci Total Environ. 2022 Jan 20; 805:150121.
  4. Gender-specificity of resilience in major depressive disorder. Depress Anxiety. 2021 10; 38(10):1026-1033.
  5. Estimating the cumulative incidence of COVID-19 in the United States using influenza surveillance, virologic testing, and mortality data: Four complementary approaches. PLoS Comput Biol. 2021 06; 17(6):e1008994.
  6. Toward the use of neural networks for influenza prediction at multiple spatial resolutions. Sci Adv. 2021 Jun; 7(25).
  7. A dynamic, ensemble learning approach to forecast dengue fever epidemic years in Brazil using weather and population susceptibility cycles. J R Soc Interface. 2021 06; 18(179):20201006.
  8. Factors Associated With Self-reported Symptoms of Depression Among Adults With and Without a Previous COVID-19 Diagnosis. JAMA Netw Open. 2021 06 01; 4(6):e2116612.
  9. High coverage COVID-19 mRNA vaccination rapidly controls SARS-CoV-2 transmission in Long-Term Care Facilities. medRxiv. 2021 May 24.
  10. Using heterogeneous data to identify signatures of dengue outbreaks at fine spatio-temporal scales across Brazil. PLoS Negl Trop Dis. 2021 05; 15(5):e0009392.
  11. Influenza forecasting for French regions combining EHR, web and climatic data sources with a machine learning ensemble approach. PLoS One. 2021; 16(5):e0250890.
  12. Socioeconomic status determines COVID-19 incidence and related mortality in Santiago, Chile. Science. 2021 05 28; 372(6545).
  13. High coverage COVID-19 mRNA vaccination rapidly controls SARS-CoV-2 transmission in Long-Term Care Facilities. Res Sq. 2021 Apr 12.
  14. Comparison of post-COVID depression and major depressive disorder. medRxiv. 2021 Apr 04.
  15. Avoidable Serum Potassium Testing in the Cardiac ICU: Development and Testing of a Machine-Learning Model. Pediatr Crit Care Med. 2021 04 01; 22(4):392-400.
  16. Persistence of symptoms up to 10 months following acute COVID-19 illness. medRxiv. 2021 Mar 08.
  17. An early warning approach to monitor COVID-19 activity with multiple digital traces in near real time. Sci Adv. 2021 03; 7(10).
  18. Association of Acute Symptoms of COVID-19 and Symptoms of Depression in Adults. JAMA Netw Open. 2021 03 01; 4(3):e213223.
  19. Socioeconomic status determines COVID-19 incidence and related mortality in Santiago, Chile. medRxiv. 2021 Jan 15.
  20. Incorporating human mobility data improves forecasts of Dengue fever in Thailand. Sci Rep. 2021 01 13; 11(1):923.
  21. COVID-19: US federal accountability for entry, spread, and inequities-lessons for the future. Eur J Epidemiol. 2020 Nov; 35(11):995-1006.
  22. Rates of increase of antibiotic resistance and ambient temperature in Europe: a cross-national analysis of 28 countries between 2000 and 2016. Euro Surveill. 2020 11; 25(45).
  23. The role of environmental factors on transmission rates of the COVID-19 outbreak: an initial assessment in two spatial scales. Sci Rep. 2020 10 12; 10(1):17002.
  24. Correction: Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models. J Med Internet Res. 2020 Sep 22; 22(9):e23996.
  25. Adding Continuous Vital Sign Information to Static Clinical Data Improves the Prediction of Length of Stay After Intubation: A Data-Driven Machine Learning Approach. Respir Care. 2020 Sep; 65(9):1367-1377.
  26. Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models. J Med Internet Res. 2020 08 17; 22(8):e20285.
  27. Real-time estimation of disease activity in emerging outbreaks using internet search information. PLoS Comput Biol. 2020 08; 16(8):e1008117.
  28. Estimating the Cumulative Incidence of COVID-19 in the United States Using Four Complementary Approaches. medRxiv. 2020 Aug 07.
  29. SARS-CoV-2 titers in wastewater foreshadow dynamics and clinical presentation of new COVID-19 cases. medRxiv. 2020 Jul 06.
  30. An Early Warning Approach to Monitor COVID-19 Activity with Multiple Digital Traces in Near Real-Time. ArXiv. 2020 Jul 01.
  31. Communicating Benefits from Vaccines Beyond Preventing Infectious Diseases. Infect Dis Ther. 2020 Sep; 9(3):467-480.
  32. Effect of non-pharmaceutical interventions to contain COVID-19 in China. Nature. 2020 09; 585(7825):410-413.
  33. Patients with Cancer Appear More Vulnerable to SARS-CoV-2: A Multicenter Study during the COVID-19 Outbreak. Cancer Discov. 2020 06; 10(6):783-791.
  34. A machine learning methodology for real-time forecasting of the 2019-2020 COVID-19 outbreak using Internet searches, news alerts, and estimates from mechanistic models. ArXiv. 2020 Apr 08.
  35. Aggregated mobility data could help fight COVID-19. Science. 2020 04 10; 368(6487):145-146.
  36. The Role of Environmental Factors on Transmission Rates of the COVID-19 Outbreak: An Initial Assessment in Two Spatial Scales. SSRN. 2020 Mar 12; 3552677.
  37. Effect of non-pharmaceutical interventions for containing the COVID-19 outbreak in China. medRxiv. 2020 Mar 06.
  38. Fitbit-informed influenza forecasts. Lancet Digit Health. 2020 02; 2(2):e54-e55.
  39. Internet search query data improve forecasts of daily emergency department volume. J Am Med Inform Assoc. 2019 12 01; 26(12):1574-1583.
  40. Noninvasive Ventilation Is Interrupted Frequently and Mostly Used at Night in the Pediatric Intensive Care Unit. Respir Care. 2020 Mar; 65(3):341-346.
  41. Differences in Regional Patterns of Influenza Activity Across Surveillance Systems in the United States: Comparative Evaluation. JMIR Public Health Surveill. 2019 Sep 14; 5(4):e13403.
  42. Improved Real-Time Influenza Surveillance: Using Internet Search Data in Eight Latin American Countries. JMIR Public Health Surveill. 2019 Apr 04; 5(2):e12214.
  43. Genomic, epidemiological and digital surveillance of Chikungunya virus in the Brazilian Amazon. PLoS Negl Trop Dis. 2019 03; 13(3):e0007065.
  44. Improved state-level influenza nowcasting in the United States leveraging Internet-based data and network approaches. Nat Commun. 2019 01 11; 10(1):147.
  45. Enhancing Situational Awareness to Prevent Infectious Disease Outbreaks from Becoming Catastrophic. Curr Top Microbiol Immunol. 2019; 424:59-74.
  46. Estimation of Pneumonic Plague Transmission in Madagascar, August-November 2017. PLoS Curr. 2018 Nov 01; 10.
  47. Comparison of crowd-sourced, electronic health records based, and traditional health-care based influenza-tracking systems at multiple spatial resolutions in the United States of America. BMC Infect Dis. 2018 08 15; 18(1):403.
  48. Relatedness of the Incidence Decay with Exponential Adjustment (IDEA) Model, "Farr's Law" and SIR Compartmental Difference Equation Models. Infectious Disease Modelling. 2018; 3(1):1-12.
  49. Antibiotic Resistance Increases with Local Temperature. Nature Climate Change. 2018; (8):510-514.
  50. Antibiotic Resistance Increases with Local Temperature. Nat Clim Chang. 2018 Jun; 8(6):510-514.
  51. Relatedness of the incidence decay with exponential adjustment (IDEA) model, "Farr's law" and SIR compartmental difference equation models. Infect Dis Model. 2018; 3:1-12.
  52. Accurate Influenza Monitoring and Forecasting Using Novel Internet Data Streams: A Case Study in the Boston Metropolis. JMIR Public Health Surveill. 2018 Jan 09; 4(1):e4.
  53. Combining Participatory Influenza Surveillance with Modeling and Forecasting: Three Alternative Approaches. JMIR Public Health Surveill. 2017 Nov 01; 3(4):e83.
  54. County-level assessment of United States kindergarten vaccination rates for measles mumps rubella (MMR) for the 2014-2015 school year. Vaccine. 2017 11 07; 35(47):6444-6450.
  55. Advances in using Internet searches to track dengue. PLoS Comput Biol. 2017 Jul; 13(7):e1005607.
  56. Using electronic health records and Internet search information for accurate influenza forecasting. BMC Infect Dis. 2017 05 08; 17(1):332.
  57. Determinants of Participants' Follow-Up and Characterization of Representativeness in Flu Near You, A Participatory Disease Surveillance System. JMIR Public Health Surveill. 2017 Apr 07; 3(2):e18.
  58. Forecasting Zika Incidence in the 2016 Latin America Outbreak Combining Traditional Disease Surveillance with Search, Social Media, and News Report Data. PLoS Negl Trop Dis. 2017 01; 11(1):e0005295.
  62. Evaluating the performance of infectious disease forecasts: A comparison of climate-driven and seasonal dengue forecasts for Mexico. Sci Rep. 2016 Sep 26; 6:33707.
  63. Editorial Commentary: Perspectives on the Future of Internet Search Engines and Biosurveillance Systems. Clin Infect Dis. 2017 01 01; 64(1):42-43.
  64. Utilizing Nontraditional Data Sources for Near Real-Time Estimation of Transmission Dynamics During the 2015-2016 Colombian Zika Virus Disease Outbreak. JMIR Public Health Surveill. 2016 Jun 01; 2(1):e30.
  65. Cloud-based Electronic Health Records for Real-time, Region-specific Influenza Surveillance. Sci Rep. 2016 05 11; 6:25732.
  66. Accurate estimation of influenza epidemics using Google search data via ARGO. Proc Natl Acad Sci U S A. 2015 Nov 24; 112(47):14473-8.
  67. Combining Search, Social Media, and Traditional Data Sources to Improve Influenza Surveillance. PLoS Comput Biol. 2015 Oct; 11(10):e1004513.
  68. Flu Near You: Crowdsourced Symptom Reporting Spanning 2 Influenza Seasons. Am J Public Health. 2015 Oct; 105(10):2124-30.
  69. 2014 ebola outbreak: media events track changes in observed reproductive number. PLoS Curr. 2015 Apr 28; 7.
  70. A case study of the New York City 2012-2013 influenza season with daily geocoded Twitter data from temporal and spatiotemporal perspectives. J Med Internet Res. 2014 Oct 20; 16(10):e236.
  71. Using clinicians' search query data to monitor influenza epidemics. Clin Infect Dis. 2014 Nov 15; 59(10):1446-50.
  72. What can digital disease detection learn from (an external revision to) Google Flu Trends? Am J Prev Med. 2014 Sep; 47(3):341-7.
  73. Evaluation of Internet-based dengue query data: Google Dengue Trends. PLoS Negl Trop Dis. 2014 Feb; 8(2):e2713.
  74. Gradient-based estimation of Manning’s friction coefficient from noisy data. Journal of Computational and Applied Mathematics. 2013; (238):1–13.
  75. Quantifying the loss of information in source attribution problems using the adjoint method in global models of atmospheric chemical transport. arXiv preprint arXiv:1311.6315. 2013.
  76. A numerical approach to study the properties of solutions of the diffusive wave approximation of the shallow water equations. Computational Geosciences. 2010; 1(14):31-53.
  77. A local discontinuous Galerkin method for a doubly nonlinear diffusion equation arising in shallow water modeling. Computer Methods in Applied Mechanics and Engineering. 2010; 23(199):1424–1436.
  78. Estimating small-area population growth using geographic-knowledge-guided cellular automata. International Journal of Remote Sensing. 2010; 21(31):5689–5707.
  79. An adaptive reduction algorithm for efficient chemical calculations in global atmospheric chemistry models. Atmospheric Environment. 2010; 35(44):4426–4431.
  80. On the diffusive wave approximation of the shallow water equations. European Journal of Applied Mathematics. 2008; 05(19):575–606.