Calling all earth and environmental scientists, students and researchers! Please join us at Earth Day on August 5th!

Why at SIGKDD? Why now?

Earth is home to all of us. However, our beloved planet and civilization are facing major challenges from climate change and environmental degradation. Extreme events are becoming more extreme and more frequent. Surface water is more polluted, and greenhouse gases have increased in the atmosphere. Moreover, polar ice caps and glaciers, the largest freshwater source on Earth, are losing ice, leading to sea-level rise.

Knowledge discovery and data mining (KDD) is crucial to addressing these and other challenges facing our changing planet. For example, Earth data (i.e., geo-referenced data such as in-situ and remotely sensed Earth observations, census records, and trajectories) records biological, physical, and social changes and helps us understand them. It can help forecast rates of sea-level change in polar ice shelves and predict critical atmospheric and geospace events. It is also important for many societal priorities, including security, public health, smart cities, transportation, climate, environment, food, energy, and water.

Earth data has unique characteristics that pose challenges for data science. For some applications, the boundaries of geospatial objects can be amorphous and can deform dynamically over time. In addition, training machine learning models is difficult because training samples are scarce: (1) significant Earth events can be very rare (e.g., earthquakes, cyclones), and (2) ground truth data are labor-intensive and time-consuming to collect. Earth Day will bring together thought leaders in academia, industry, and government to explore this area and discuss opportunities to overcome the challenges that Earth faces today.

Earth Day Session Program

Morning: Three Earth-related workshops

Fragile Earth: Theory Guided Data Science to Enhance Scientific Discovery

Data Mining and AI for Conservation

Urban Computing

12:00 pm-1:00 pm: Lunch

1:30 pm-1:45 pm: Welcome, Rationale for Earth Day theme

1:45 pm-3:00 pm: Session 1: Importance of Earth data sets and use cases

Earth data (e.g., remote sensing imagery, GPS time service, location traces) has already transformed our lives by improving the monitoring of global weather and agriculture, which enables early warning of hurricanes and inclement weather as well as of food shortage risks due to crop stresses or failures. Further, with 2 billion receivers in use for location and time services, GPS has become critical infrastructure for the world economy, with use cases ranging from precision agriculture to navigation to ride sharing to smart cities. These success stories are only a beginning, and many transformative opportunities lie ahead. For example, the 2011 McKinsey Big Data report estimated that location trace data will generate about $600 billion annually by 2020. In addition, a 2019 U.S. National Academy report projects $1.6 trillion in savings for energy generation and use from Earth data by 2035. Furthermore, government and industry have recently started major initiatives such as NASA Earth Exchange, Amazon's Earth on AWS, Google Earth Engine, Microsoft's AI for Earth, and NSF Navigating the New Arctic to meet grand challenges facing our changing planet, such as conservation, climate change, and environmental sustainability. This session will explore the tremendous value of Earth data for civil society, prosperity, and good governance via a keynote and a panel discussion.


Keynote (30 min): Dr. Ramakrishna Nemani, NASA Earth Exchange

Panel (45 min): Societal value of Earth Data (Chair: James Hodson, AI for Good Foundation)

Questions for Panelists:

  • 1a. Civil Society View: What is the societal significance of Earth data, and what are its most important use cases?
  • 1b. Business View: What will the annual value of Earth Data be in 2030 or 2040? Why?
  • 1c. Governance View: What is the role of Earth data in good governance?
  • 2. List important types, sources and accessible repositories of Earth data.
  • 3. List unique characteristics of Earth data and unique needs of its use cases.
  • 4. List strengths and weaknesses of current data mining techniques for Earth Data.

4:00 pm-5:15 pm: Session 2: Earth Day-related data mining challenges and opportunities

Data mining methods have found success in analyzing many complicated systems, such as e-commerce and the use cases explored in the Earth Day-aligned SIGKDD workshops. However, many questions remain open due to unique Earth data challenges such as spatio-temporal auto-correlation, heterogeneity, scale-dependence, measurement errors, and the modifiable areal unit problem. For example, a recent Geophysical Research Letters paper noted that "failure to account for dependence between [Physical] models, variables, locations and seasons yield misleading results". Additional challenges are noted in recent community papers from the NSF IS-GEO Research Coordination Network and the University Consortium for Geographic Information Science. Court debates over gerrymandering also raise transparency concerns, because changing the choice of spatial partitions can alter statistical results. This session explores these challenges and opportunities via a keynote and a panel discussion.


Keynote (30 min): Prof. Harvey Miller, Director, Center for Urban and Regional Analysis, OSU

Panel (45 min): Data Mining Challenges and Opportunities (Chair: Prof. Shashi Shekhar, McKnight Distinguished University Professor, UMN)

Questions for panelists:

  1. List knowledge gaps between Earth data mining needs and data mining state of the art.
  2. What new research is needed to fill the knowledge gaps?
  3. What are the data mining grand challenges with respect to analyzing Earth data?
  4. Is there a need for SIGKDD community action? If so, suggest community actions.


Shashi Shekhar, University of Minnesota – Twin Cities (Co-chair)

James Hodson, AI for Good Foundation (Co-chair)

Lucas Joppa, Microsoft Research (Co-chair)

Chaitanya Baru, University of California, San Diego

Vandana Janeja, University of Maryland, Baltimore County

Hui Xiong, Rutgers University & Baidu (Beijing, China)

Jieping Ye, University of Michigan, Ann Arbor & DiDi Chuxing (China)

Xun Zhou, University of Iowa

Ramasamy Uthurusamy, General Motors Corporation

Chid Apte, IBM T. J. Watson Research Center

Naoki Abe, IBM T. J. Watson Research Center

Vani Mandava, Microsoft Research

Meredith Lee, West Big Data Hub

Melissa Cragin, San Diego Supercomputer Center (formerly Midwest Big Data Hub)

Lea Shanley, South Big Data Hub

