Let's See Your Digits: Anomalous-State Detection using Benford's Law
Samuel Maurus (Technical University of Munich); Claudia Plant (University of Vienna)
Abstract
Benford’s Law describes a curious “naturally occurring” phenomenon in which the leading digits of numerical data follow a precise, non-uniform distribution in which smaller digits occur more often. In this paper we begin by showing that system metrics generated by many modern information systems, such as Twitter, Wikipedia, YouTube and GitHub, obey this law. We then propose a novel unsupervised approach called BenFound that exploits this property to detect anomalous system events. BenFound tracks the “Benfordness” of key system metrics, like the follower counts of tweeting Twitter users or the change deltas in Wikipedia page edits. It then applies a novel Benford-conformity test in real time to identify “non-Benford events”. We investigate a variety of such events, showing that they correspond to unnatural and often undesired system interactions like spamming, hashtag-hijacking and denial-of-service attacks. The result is a technically uncomplicated and effective “red flagging” technique. Although not without its limitations, it is highly efficient and requires neither obscure parameters, nor text streams, nor natural-language processing.
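To make the notion of “Benfordness” concrete, the following minimal Python sketch compares the observed leading-digit frequencies of a batch of values with the frequencies Benford’s Law predicts, P(d) = log10(1 + 1/d) for d = 1, ..., 9. It is an illustration only: the abstract does not specify BenFound’s conformity test, so a plain mean absolute deviation is used here as a stand-in, and the function names (`leading_digit`, `digit_frequencies`, `benford_deviation`) are hypothetical. The sketch also assumes the metric values are nonzero integers, such as follower counts or edit deltas.

```python
import math
from collections import Counter

# Expected leading-digit probabilities under Benford's Law:
# P(d) = log10(1 + 1/d) for d = 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(n: int) -> int:
    """Return the first decimal digit (1-9) of a nonzero integer."""
    if n == 0:
        raise ValueError("zero has no leading digit")
    return int(str(abs(n))[0])


def digit_frequencies(values):
    """Observed relative frequency of each leading digit 1..9 (zeros are skipped)."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def benford_deviation(values) -> float:
    """Mean absolute deviation from Benford's Law.

    Assumption: this simple distance is a stand-in for illustration,
    not the conformity test proposed in the paper.
    """
    freqs = digit_frequencies(values)
    return sum(abs(freqs[d] - BENFORD[d]) for d in range(1, 10)) / 9


# Powers of two are a classic Benford-conforming sequence, whereas the
# integers 100..999 have uniformly distributed leading digits.
powers_of_two = [2 ** k for k in range(1, 200)]
uniform_digits = list(range(100, 1000))

print("powers of two :", benford_deviation(powers_of_two))
print("uniform digits:", benford_deviation(uniform_digits))
```

A low deviation indicates that a batch of values is close to Benford’s Law; in a streaming setting, such a score could be recomputed over a sliding window of recent metric values, with unusually high scores flagged for inspection.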