An Extensible Event Extraction System With Cross-Media Event Resolution
Fabio Petroni (Thomson Reuters); Natraj Raman (Thomson Reuters); Timothy Nugent (Thomson Reuters); Armineh Nourbakhsh (Thomson Reuters); Zarko Panic (Thomson Reuters); Sameena Shah (Thomson Reuters); Jochen L. Leidner (Thomson Reuters)
The automatic extraction of breaking news events from natural language text is a valuable capability for decision support systems. Traditional systems tend to focus on extracting events from a single media source and often ignore cross-media references. Here, we describe a large-scale automated system for extracting natural disasters and critical events from both newswire text and social media. We outline a comprehensive architecture that can identify, categorize, and summarize seven event types: floods, storms, fires, armed conflict, terrorism, infrastructure breakdown, and labour unavailability. The system comprises fourteen modules and is equipped with a novel coreference mechanism capable of linking events extracted from the two complementary data sources. Additionally, the system is easily extensible to accommodate new event types. Our experimental evaluation demonstrates the effectiveness of the system.
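To make the two ideas in the abstract concrete, extensible event types and cross-media linking, the following is a minimal illustrative sketch and not the system's actual implementation. The names `EVENT_TYPES`, `register_event_type`, `Event`, `same_event`, and `link_events` are hypothetical, and the linking rule (same event type, same location string, timestamps within a 24-hour window) is an assumed heuristic standing in for the coreference mechanism, whose details are not given here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical registry holding the seven event types named in the abstract.
EVENT_TYPES = {
    "flood", "storm", "fire", "armed_conflict",
    "terrorism", "infrastructure_breakdown", "labour_unavailability",
}


def register_event_type(name: str) -> None:
    """Add a new event type to the registry (illustrative extensibility hook)."""
    EVENT_TYPES.add(name)


@dataclass(frozen=True)
class Event:
    source: str          # assumed values: "news" or "social"
    event_type: str      # one of EVENT_TYPES
    location: str
    timestamp: datetime


def same_event(a: Event, b: Event,
               window: timedelta = timedelta(hours=24)) -> bool:
    """Assumed heuristic: different sources, same type and location, close in time."""
    return (
        a.source != b.source
        and a.event_type == b.event_type
        and a.location.lower() == b.location.lower()
        and abs(a.timestamp - b.timestamp) <= window
    )


def link_events(news: list, social: list) -> list:
    """Return (news_event, social_event) pairs that the heuristic judges to corefer."""
    return [(n, s) for n in news for s in social if same_event(n, s)]
```

Under these assumptions, a newswire flood report and a social media flood post from the same city a few hours apart would be linked, while a fire post from the same city would not. A real coreference component would likely need fuzzier matching on location and content than exact string equality.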