Large Scale Machine Learning Systems
Curated by: Eric P. Xing and Qirong Ho
The rise of Big Data has created demand for complex Machine Learning (ML) models with millions to billions of parameters, which promise adequate capacity to digest massive datasets and offer powerful predictive analytics (such as high-dimensional latent features, intermediate representations, and decision functions). To support the computational needs of ML algorithms at such scales, an ML system often needs to run on distributed clusters with tens to thousands of machines; however, implementing algorithms and writing systems software for such clusters demands significant design and engineering effort. A recent and increasingly popular trend toward industrial-scale machine learning is to explore new principles and strategies for either highly specialized, monolithic designs for large-scale vertical applications (such as distributed topic models or regression models), or flexible, easily programmable, general-purpose distributed ML platforms, such as GraphLab, based on vertex programming, and Petuum, based on the parameter server.

It is now recognized that, in addition to familiarity with distributed system architectures and programming, large-scale ML systems can benefit greatly from ML-rooted statistical and algorithmic insights, which lead to principles and strategies unique to distributed machine learning programs. These principles and strategies shed light on the following key questions:

- How should an ML program be distributed over a cluster?
- How should ML computation be bridged with inter-machine communication?
- How should such communication be performed?
- What should be communicated between machines?

They span a broad continuum from application, to engineering, to theoretical research and development of Big ML systems and architectures. The ultimate goal of large-scale ML systems research is to understand how these principles and strategies can be made efficient, generally applicable, and easy to program and deploy, while remaining supported by scientifically validated correctness and scaling guarantees.
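To make the "bridging computation with communication" question concrete, below is a minimal, single-process sketch of a parameter-server-style, data-parallel update with a bounded-staleness gate, in the spirit of the Stale Synchronous Parallel (SSP) model used by systems such as Petuum. The `ParameterServer` class, its methods, the toy least-squares problem, and all constants are illustrative assumptions for this sketch, not any real system's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: recover w_true from y = X @ w_true,
# with the rows of X partitioned across "workers" (data parallelism).
dim, n, n_workers = 5, 400, 4
X = rng.normal(size=(n, dim))
w_true = rng.normal(size=dim)
y = X @ w_true
shards = np.array_split(np.arange(n), n_workers)

class ParameterServer:
    """Central parameter table plus per-worker iteration clocks (illustrative)."""
    def __init__(self, dim, n_workers, staleness=2):
        self.w = np.zeros(dim)
        self.clock = [0] * n_workers
        self.staleness = staleness

    def can_proceed(self, wid):
        # SSP rule: a worker may run ahead of the slowest worker by at
        # most `staleness` iterations. In a real system this check would
        # block the caller; here it simply gates the simulation loop.
        return self.clock[wid] - min(self.clock) <= self.staleness

    def pull(self):
        return self.w.copy()            # read (possibly stale) parameters

    def push(self, wid, grad, lr=0.001):
        self.w -= lr * grad             # additive update from one worker
        self.clock[wid] += 1

ps = ParameterServer(dim, n_workers)
for it in range(200):
    for wid in range(n_workers):        # round-robin stands in for
        if not ps.can_proceed(wid):     # truly concurrent workers
            continue
        w = ps.pull()
        rows = shards[wid]
        # Gradient of 0.5 * ||X_s w - y_s||^2 on this worker's shard.
        grad = X[rows].T @ (X[rows] @ w - y[rows])
        ps.push(wid, grad)

print("parameter error:", np.linalg.norm(ps.w - w_true))
```

The staleness bound is the design knob: setting it to zero recovers fully synchronous (BSP-style) execution, while a small positive bound lets fast workers keep computing past stragglers, trading a controlled amount of parameter staleness for throughput while preserving convergence guarantees.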
- KDD 2015 tutorial slides “A New Look at the System, Algorithm and Theory Foundations of Distributed Machine Learning”: http://petuum.github.io/papers/SysAlgTheoryKDD2015.pdf
- KDD 2015 video about the “Petuum” system: https://www.youtube.com/watch?v=vwXolaBQfaU&index=89&list=PLn0nrSd4xjjaNzvUtxHzU64xTz4Y_XNK9
- Scalable Machine Learning class @ Berkeley: https://www.youtube.com/watch?v=iyHEF8IjKgU&list=PLOxR6w3fIHWzljtDh7jKSx_cuSxEOCayP
Related KDD 2016 Papers
Title | Authors
---|---
Accelerated Stochastic Block Coordinate Descent with Optimal Sampling | Aston Zhang* (UIUC); Quanquan Gu (University of Virginia)
Parallel Dual Coordinate Descent Method for Large-scale Linear Classification in Multi-core Environments | Wei-Lin Chiang (National Taiwan University); Mu-Chu Lee (National Taiwan University); Chih-Jen Lin* (National Taiwan University)
Stochastic Optimization Techniques for Quantification Performance Measures | Harikrishna Narasimhan (IACS, Harvard University); Shuai Li (University of Insubria); Purushottam Kar* (IIT Kanpur); Sanjay Chawla (QCRI-HBKU, Qatar); Fabrizio Sebastiani (QCRI-HBKU, Qatar)
Safe Pattern Pruning: An Efficient Approach for Predictive Pattern Mining | Kazuya Nakagawa (Nagoya Institute of Technology); Shinya Suzumura (Nagoya Institute of Technology); Masayuki Karasuyama; Koji Tsuda (University of Tokyo); Ichiro Takeuchi* (Nagoya Institute of Technology, Japan)
Fast Component Pursuit for Large-Scale Inverse Covariance Estimation | Lei Han* (Rutgers University); Yu Zhang (Hong Kong University of Science and Technology); Tong Zhang (Rutgers University)
Parallel Lasso Screening for Big Data Optimization | Qingyang Li* (Arizona State University); Shuang Qiu (University of Michigan); Shuiwang Ji (Washington State University); Jieping Ye (University of Michigan, Ann Arbor); Jie Wang (University of Michigan)
Compressing Graphs and Indexes with Recursive Graph Bisection | Laxman Dhulipala (Carnegie Mellon University); Igor Kabiljo (Facebook); Brian Karrer (Facebook); Giuseppe Ottaviano (Facebook); Sergey Pupyrev* (Facebook); Alon Shalita (Facebook)
Convex Optimization for Linear Query Processing under Approximate Differential Privacy | Ganzhao Yuan* (SCUT); Yin Yang; Zhenjie Zhang; Zhifeng Hao