Disease Surveillance in the Journal of Biomedical Informatics

Background: In tropical areas around the world, certain vector-borne diseases have become endemic or hyperendemic.

Developing nations commonly face difficulties in establishing the incidence of various diseases, especially vector-borne diseases with complex etiologies and a broad spectrum of presentations. One alternative approach to characterizing disease outbreaks is to develop proxy information from online news articles. Such sources are being evaluated for applications in disease surveillance, early outbreak detection, and epidemiological research. This study examines the potential of news articles for elucidating outbreaks of dengue in India and Zika in Brazil.

Objective: This study is designed to assess the potential usefulness of news articles in tracking case numbers of dengue and Zika through an improved understanding of how news outlets report on disease. We specifically examine the possibility of providing near real-time reporting on the development of dengue and Zika outbreaks.

Methods: Newspaper articles related to dengue fever in India and Zika disease in Brazil were extracted from the LexisNexis database. We targeted news articles available from five popular international news sources and two local newspapers in each country. The articles were processed into yearly and weekly time series of the number of articles concerned with dengue and Zika, to test their suitability as proxies for disease prevalence. The collections of articles were analyzed using a text mining toolkit that subdivides a collection of news articles into smaller clusters to study the topical focus of articles and their relevance to tracking diseases.
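The time-series step described above can be illustrated with a minimal Python sketch. It assumes the publication date of each retrieved article is already available as a `datetime.date`. The function names `weekly_counts` and `yearly_counts` and the sample dates are hypothetical and are not taken from the paper, whose actual processing pipeline is not described here.

```python
from collections import Counter
from datetime import date


def weekly_counts(dates):
    """Count articles per ISO week, keyed by (ISO year, ISO week number)."""
    counts = Counter()
    for d in dates:
        iso = d.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


def yearly_counts(dates):
    """Count articles per calendar year."""
    return dict(Counter(d.year for d in dates))


# Hypothetical publication dates, for illustration only (not study data).
sample_dates = [
    date(2016, 1, 4),
    date(2016, 1, 6),
    date(2016, 2, 8),
    date(2017, 3, 1),
]

weekly = weekly_counts(sample_dates)
yearly = yearly_counts(sample_dates)
```

ISO week numbering is one possible choice of weekly bin. Around the turn of the year, the ISO year of a date can differ from its calendar year, so weekly and yearly totals should be compared with that in mind.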

Results: For dengue fever in India, the local newspapers provide a better source of information than international newspapers. The multi-year analysis (2010-2016) suggests that the numbers of dengue cases are strongly correlated with the numbers of news reports, with an R² value of 0.88. For Zika disease in Brazil, the news reports provided useful information on the timing of the Zika outbreak. Reporting increased sharply at the beginning of 2016, peaked in weeks 5 to 8, and then decreased sharply. The numbers of articles remained low for the remainder of 2016 and 2017. Comparisons with reported cases again show article numbers to be a useful proxy for the prevalence of Zika in Brazil.
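As an illustration of how an R² value of this kind might be obtained, the sketch below computes the coefficient of determination for a simple linear fit between two series, which equals the squared Pearson correlation. This is an assumption: the summary does not state how the reported R² was calculated. The function name `r_squared` and the input numbers are hypothetical and are not the study's data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric series.

    For a simple linear regression of y on x, this equals R².
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical yearly article counts and case counts, for illustration only.
articles_per_year = [1, 2, 3]
cases_per_year = [2, 4, 6]

perfect_fit = r_squared(articles_per_year, cases_per_year)
weaker_fit = r_squared([1, 2, 3], [1, 3, 2])
```

With only a handful of yearly points, as in a 2010-2016 series, a value computed this way is sensitive to individual years, so it is best read alongside the underlying counts.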

Conclusions: The paper describes a strategy that uses newspaper articles as proxies to monitor outbreaks of infectious diseases and to study their epidemiology. It has potential applicability in some developing countries and regions with relatively poor medical infrastructure and records. Clearly, large national newspapers in India provide a better source of information on diseases than international outlets. This approach has potential with selected diseases in a few selected countries. Article numbers internationally appear to vary in proportion to the perceived health impact.

Publication Summary

  • Geosyntec Authors: Yiding Zhang
  • All Authors: Yiding Zhang, Geosyntec Consultants; Motomu Ibaraki; Franklin W. Schwartz
  • Title: Disease Surveillance Using Online News: Dengue and Zika in Tropical Countries
  • Event or Publication: Publication
  • Practice Areas: Occupational Safety & Health
  • Citation: Yiding Zhang, Ph.D. (California) coauthored a paper entitled "Disease Surveillance Using Online News: Dengue and Zika in Tropical Countries" published in the Journal of Biomedical Informatics on January 3, 2020.
  • Date: January 3, 2020
  • Publication Type: Journal Article