This article is part of the supplement: Proceedings of the Fourth International Symposium on Semantic Mining in Biomedicine (SMBM)
Linguistic scope-based and biological event-based speculation and negation annotations in the BioScope and Genia Event corpora
1 Research Group on Artificial Intelligence, Hungarian Academy of Sciences, Szeged, Hungary
2 UKP Lab, Technische Universität Darmstadt, Darmstadt, Germany
3 Department of Informatics, University of Szeged, Szeged, Hungary
4 Tsujii Laboratory, University of Tokyo, Tokyo, Japan
5 Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart, Stuttgart, Germany
Journal of Biomedical Semantics 2011, 2(Suppl 5):S8 doi:10.1186/2041-1480-2-S5-S8 Published: 6 October 2011
The treatment of negation and hedging in natural language processing has received much interest recently, especially in the biomedical domain. However, open-access corpora annotated for negation and/or speculation are scarce for training and testing applications, and those that exist sometimes follow different design principles. In this paper, the annotation principles of the two largest corpora containing annotation for negation and speculation, BioScope and Genia Event, are compared. BioScope marks linguistic cues and their scopes for negation and hedging, while in Genia biological events are marked for uncertainty and/or negation.
Differences between the annotations of the two corpora are thematically categorized and the frequency of each category is estimated. We found that most of the differences arise because scopes, which cover text spans, deal with the key events, and each argument of these events (including events within events) is under the scope as well. In contrast, Genia deals with the modality of events within events independently.
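The contrast between span-based scopes and event-level marking can be illustrated with a minimal sketch. The sentence, the scope boundaries, and the trigger words below are invented for illustration and are not taken from either corpus; the `Span`, `span_of`, and `triggers_in_scope` names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset

    def contains(self, other: "Span") -> bool:
        # True when `other` lies entirely inside this span
        return self.start <= other.start and other.end <= self.end


def span_of(text: str, phrase: str) -> Span:
    # Locate the first occurrence of `phrase` in `text`
    start = text.index(phrase)
    return Span(start, start + len(phrase))


def triggers_in_scope(scope: Span, triggers: Dict[str, Span]) -> List[str]:
    # Every event trigger whose span falls inside the scope span
    return sorted(name for name, span in triggers.items() if scope.contains(span))


# Invented example sentence; "suggest" plays the role of the hedge cue.
sentence = "These results suggest that IL-2 expression requires NF-kB activation."

# Assumed scope: from the cue to the end of the clause.
scope = span_of(sentence, "suggest that IL-2 expression requires NF-kB activation")

# Assumed event triggers: one key event ("requires") and two nested events.
triggers = {w: span_of(sentence, w) for w in ("requires", "expression", "activation")}

print(triggers_in_scope(scope, triggers))
```

Under these assumptions, a purely span-based reading places all three triggers under the hedge scope, nested events included, whereas an event-level annotation is free to mark only the key event as speculated and to judge the nested events separately.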
The analysis of multiple layers of annotation (linguistic scopes and biological events) showed that the detection of negation/hedge keywords and their scopes can contribute to determining the modality of key events (denoted by the main predicate). On the other hand, for the detection of the negation and speculation status of events within events, additional syntax-based rules investigating the dependency path between the modality cue and the event cue have to be employed.
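As a rough illustration of what inspecting the dependency path between a modality cue and an event cue might involve, the sketch below computes that path in a toy dependency tree. The tree, the token names, and the one-edge threshold at the end are assumptions made for this example; the abstract does not specify the actual rules.

```python
from typing import Dict, List, Optional


def path_to_root(heads: Dict[str, Optional[str]], token: str) -> List[str]:
    # Follow head links upward until the root (whose head is None)
    path = [token]
    while heads[token] is not None:
        token = heads[token]
        path.append(token)
    return path


def dependency_path(heads: Dict[str, Optional[str]], a: str, b: str) -> List[str]:
    # Path from a to b through their lowest common ancestor
    up_a = path_to_root(heads, a)
    up_b = path_to_root(heads, b)
    on_b_path = set(up_b)
    lca = next(t for t in up_a if t in on_b_path)
    return up_a[: up_a.index(lca) + 1] + list(reversed(up_b[: up_b.index(lca)]))


# Toy tree for an invented sentence; each token maps to its head.
# Tokens are assumed unique so that plain strings can serve as node ids.
heads = {
    "suggest": None,          # root, acting as the hedge cue
    "results": "suggest",
    "requires": "suggest",    # key event trigger
    "expression": "requires", # nested event trigger
    "activation": "requires", # nested event trigger
}

for event in ("requires", "expression", "activation"):
    path = dependency_path(heads, "suggest", event)
    # Hypothetical rule: only an event one edge away from the cue inherits its modality
    print(event, path, "direct" if len(path) - 1 == 1 else "indirect")
```

In this toy setting the key event is one edge from the cue while the nested events are two edges away, which is the kind of distinction a path-based rule could use to treat events within events differently from the key event.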