Automatic Event Coding Framework for Spanish Political News Articles

Sayeed Salam, Lamisah Khan, Amir El-Ghamry, Patrick Brandt, Jennifer Holmes, Vito D'Orazio, Javier Osorio

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

Today, Spanish-speaking countries face widespread political crises. These conflicts are reported in a large volume of Spanish news articles from Spanish agencies. Our goal is to create a fully functioning system that parses Spanish text in real time and generates event codes at scale. Rather than translating Spanish text into English and using English event coders, we aim to create a tool that works on raw Spanish text with Spanish event coders for better flexibility, coverage, and cost. To accommodate the processing of a large number of Spanish articles, we adapt a distributed framework based on Apache Spark. We highlight how to extend the existing ontology to support the automated coding process for Spanish texts. We also present experimental data that gives insight into the data collection process, including filtering unrelated articles, scaling the framework, and gathering basic statistics on the dataset.

Original language: English (US)
Title of host publication: Proceedings - 2020 IEEE 6th Intl Conference on Big Data Security on Cloud, BigDataSecurity 2020, 2020 IEEE Intl Conference on High Performance and Smart Computing, HPSC 2020 and 2020 IEEE Intl Conference on Intelligent Data and Security, IDS 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 246-253
Number of pages: 8
ISBN (Electronic): 9781728168739
State: Published - May 2020
Externally published: Yes
Event: 6th IEEE International Conference on Big Data Security on Cloud, BigDataSecurity 2020, 6th IEEE International Conference on High Performance and Smart Computing, HPSC 2020 and 5th IEEE International Conference on Intelligent Data and Security, IDS 2020 - Baltimore, United States
Duration: May 25 2020 to May 27 2020

Publication series

Name: Proceedings - 2020 IEEE 6th Intl Conference on Big Data Security on Cloud, BigDataSecurity 2020, 2020 IEEE Intl Conference on High Performance and Smart Computing, HPSC 2020 and 2020 IEEE Intl Conference on Intelligent Data and Security, IDS 2020

Conference

Conference: 6th IEEE International Conference on Big Data Security on Cloud, BigDataSecurity 2020, 6th IEEE International Conference on High Performance and Smart Computing, HPSC 2020 and 5th IEEE International Conference on Intelligent Data and Security, IDS 2020
Country/Territory: United States
City: Baltimore
Period: 5/25/20 to 5/27/20

Keywords

  • Apache Spark
  • Automated Event Coder
  • BERT
  • Multilingual
  • NLP
  • Universal Dependency

ASJC Scopus subject areas

  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
