Detecting Devastating Diseases in Search Logs
John Paparrizos, Columbia University; Ryen White*, Microsoft; Eric Horvitz, Microsoft Research
Web search queries can offer a unique population-scale window onto streams of evidence that are useful for detecting the emergence of health conditions. We explore the promise of harnessing behavioral signals in search logs to provide advance warning about the presence of devastating diseases such as pancreatic cancer. Pancreatic cancer is often diagnosed too late to be treated effectively as the cancer has usually metastasized by the time of diagnosis. Symptoms of the early stages of the illness are often subtle and nonspecific. We identify searchers who issue credible, first-person diagnostic queries for pancreatic cancer and we learn models from prior search histories that predict which searchers will later input such queries. We show that we can infer the likelihood of seeing the rise of diagnostic queries months before they appear and characterize the tradeoff between predictivity and false positive rate. The findings highlight the potential of harnessing search logs for the early detection of pancreatic cancer and more generally for harnessing search systems to reduce health risks for individuals.
Filed under: Classification