Applied Data Science Invited Speakers

The Applied Data Science Invited Talks will provide a venue for leading experts in the world of applied data mining and knowledge discovery. These invited talks will feature highly influential speakers who have directly contributed to successful data mining applications in their respective fields. The talks and discussions will focus on innovative, leading-edge, large-scale industry or government applications of data mining in areas such as finance, healthcare, bioinformatics, public policy, infrastructure, telecommunications, social media, and computational advertising.

Keynote: Jai Ranganathan

Spinning the AI Pinwheel

Advances in supervised machine learning have frequently been fueled by access to large-scale labeled data. In business, however, natural labels may not exist. In these cases, a common industry playbook is to have human annotators manually label enough data points to train a model. This talk describes a general structure for accelerating the annotation process using artificial intelligence and combining it with model quality assurance (QA).

In this talk we walk through this process in detail. We start with rich manual annotation of a small number of unlabeled data points. These can then be used to train a series of coarse predictive models that are used to prepopulate some default selections in the annotation tool and speed up annotator performance. With more data points, models can be retrained on a regular cadence and less human intervention is required. Finally, models can provide defaults for all fields, and re-training continues until the annotator override rate reaches a production-grade level.
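The loop described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the speaker's implementation: the names `train_majority_model`, `annotate_batch`, and `run_pinwheel`, the lookup-table "model", the simulated annotator (`oracle`), and the 5% override target are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


def train_majority_model(labeled):
    """Coarse stand-in model (assumption): map each feature value to the
    label it has most often received so far."""
    counts = defaultdict(Counter)
    for feature, label in labeled:
        counts[feature][label] += 1
    return {f: c.most_common(1)[0][0] for f, c in counts.items()}


def annotate_batch(batch, model, oracle):
    """Simulate one annotation round.

    The model prepopulates a default for each item; the simulated annotator
    (`oracle`) overrides it whenever it is wrong. A missing default also
    counts as an override, because the label must be entered by hand.
    """
    labeled, overrides = [], 0
    for feature in batch:
        default = model.get(feature)  # None when the model has no suggestion
        truth = oracle(feature)
        if default != truth:
            overrides += 1
        labeled.append((feature, truth))
    return labeled, overrides / len(batch)


def run_pinwheel(batches, oracle, target_override_rate=0.05):
    """Annotate, retrain, repeat until the override rate meets the target.

    The 0.05 target is a hypothetical stand-in for a production-grade level.
    """
    labeled, model, history = [], {}, []
    for batch in batches:
        new_labels, rate = annotate_batch(batch, model, oracle)
        history.append(rate)
        labeled.extend(new_labels)
        model = train_majority_model(labeled)  # retrain on a regular cadence
        if rate <= target_override_rate:
            break
    return model, history


# Toy usage: the "true" label is the parity of an integer feature.
def parity(x):
    return "even" if x % 2 == 0 else "odd"


batches = [[0, 1, 2, 3], [0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5]]
model, history = run_pinwheel(batches, parity)
print(history)
```

In this toy run the first batch is annotated fully by hand because no model exists yet, and the override rate falls in later rounds as the retrained model supplies more correct defaults. A real system would replace the lookup table with an actual predictive model and measure overrides from annotator activity in the annotation tool.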

Tradeoffs of this type of approach include balancing the increased annotation efficiency with engineering costs associated with building annotation and quality assurance tools. We will walk through these tradeoffs, which depend on the problem class and complexity of the model.

Finally, we will include a detailed industry case study based on the use of artificial intelligence in the annotation process at KeepTruckin, where we use annotation to label vehicle location history data.

Jai Ranganathan is VP of Product & Data Science at KeepTruckin. Prior to joining KeepTruckin, Jai worked at Uber where he served as Senior Director of Data Science & Product, managing Machine Learning & AI, data, marketing systems, and operations tooling. Before that, Jai served as Senior Director of Product leading Machine Learning at Cloudera.
