Abstract

Many socio-economic indicators are sensitive to real-world events. Proper characterization of the events can help to identify the relevant events that drive fluctuations in these indicators. In this paper, we propose a novel generative model of real-world events and employ it to extract events from a large corpus of news articles. We introduce the notion of an event class, which is an abstract grouping of similarly themed events. These event classes are manifested in news articles in the form of event triggers which are specific words that describe the actions or incidents reported in any article. We use the extracted events to predict fluctuations in different socioeconomic indicators. Specifically, we focus on food prices and predict the price of 12 different crops based on real-world events that potentially influence food price volatility, such as transport strikes, festivals etc. Our experiments demonstrate that incorporating event information in the prediction tasks reduces the root mean square error (RMSE) of prediction by 22% compared to the standard ARIMA model. We also predict sudden increases in the food prices (i.e. spikes) using events as features, and achieve an average 5-10% increase in accuracy compared to baseline models, including an LDA topic-model based predictive model.


Filed under: Mining Rich Data Types


Comments