Open Access Research

Analysis of syntactic and semantic features for fine-grained event-spatial understanding in outbreak news reports

Hutchatai Chanlekha and Nigel Collier

National Institute of Informatics, Hitotsubashi 2-1-2, Chiyoda-ku, Tokyo, Japan

Journal of Biomedical Semantics 2010, 1:3 doi:10.1186/2041-1480-1-3

Published: 31 March 2010

Abstract

Background

Previous studies have suggested that epidemiological reasoning needs fine-grained modelling of events, especially their spatial and temporal attributes. While the temporal analysis of events has been intensively studied, far less attention has been paid to their spatial analysis. This article aims to fill that gap by addressing automatic event-spatial attribute analysis in support of health surveillance and epidemiological reasoning.

Results

In this work, we propose a methodology that analyzes each event reported in a news article in detail to recover the most specific location where it occurs. Various features for recognizing the spatial attributes of events were studied and incorporated into models trained with several machine learning techniques. The best performance for spatial attribute recognition is promising: an F-score of 85.9% (86.75% precision, 85.1% recall).
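The reported figures can be checked for internal consistency. The sketch below assumes the F-score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name `f_score` is illustrative rather than taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean (F1).

    Assumption: the abstract's "F-score" is F1. Inputs are fractions in [0, 1].
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    denominator = beta ** 2 * precision + recall
    if denominator == 0.0:
        # Both precision and recall are zero: define the score as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Precision and recall as reported in the abstract.
reported = f_score(precision=0.8675, recall=0.851)
print(f"{reported:.1%}")  # prints 85.9%
```

Under that assumption the computed value matches the 85.9% stated above.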

Conclusions

We extended our work on event-spatial attribute recognition by focusing on three machine learning techniques: conditional random fields (CRF), support vector machines (SVM), and decision trees. Our approach avoided the costly development of an external knowledge base by relying on feature sources that can be acquired locally from the analyzed document. The results showed that the CRF model performed best. Our study indicated that the nearest location and the previous event location are the most important features for the CRF and SVM models, while the location extracted from the verb's subject is the most important feature for the decision tree model.

