AESOP: Automatic Policy Learning for Predicting and Mitigating Network Service Impairments
Supratim Deb (AT&T Labs);Zihui Ge (AT&T Labs);Sastry Isukapalli (AT&T Labs);Sarat Puthenpura (AT&T Labs);Shobha Venkataraman (AT&T Labs);He Yan (AT&T Labs);Jennifer Yates (AT&T Labs)
Efficient management and control of modern (4G LTE) and next-generation (5G) cellular networks is of paramount importance, as networks must maintain highly reliable service quality while supporting rapid growth in traffic demand and new application services. Fast response to network service degradation will be key for networks to deliver the promised benefits to end users. Evolving network management towards data-driven automation would have a dramatic impact on the quality and speed of mitigation. In this paper, we present AESOP, a data-driven intelligent system that facilitates automatic learning of policies and rules for triggering remedial troubleshooting actions in networks. AESOP combines best practices of operations’ intelligence with a variety of measurement data to learn and validate operational policies that mitigate service problems in networks. The AESOP design addresses key challenges such as learning from high-dimensional noisy data, capturing multiple fault models, the very high service cost of false positives, and an evolving network infrastructure that leads to changing data distributions. We present the design of our system and report results from our ongoing experiments that demonstrate the operational efficiency that can be gained from such policy-driven automation.