Optimizing feature representation for automated systematic review work prioritization

AMIA Annu Symp Proc. 2008 Nov 6:2008:121-5.

Author

Aaron M Cohen¹

Affiliation

¹ Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA.

PMID: 18998798
PMCID: PMC2656096

Abstract

Automated document classification can be a valuable tool for enhancing the efficiency of creating and updating systematic reviews (SRs) for evidence-based medicine. One way document classification can help is work prioritization: given a set of documents, order them so that the documents most likely to be useful appear first. We evaluated several alternative classification feature systems, including unigram, n-gram, MeSH, and natural language processing (NLP) feature sets, for their usefulness on 15 SR tasks, using the area under the receiver operating characteristic curve (AUC) as the performance measure. We also examined the impact of topic-specific training data compared to general SR inclusion data. The best feature set used a combination of n-gram and MeSH features. NLP-based features were not found to improve performance. Furthermore, topic-specific training data usually provided a significant performance gain over more general SR training data.
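
To make the work-prioritization setup concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes binary features built from word n-grams plus MeSH terms, a ranking by descending classifier score, and the area under the ROC curve computed as the fraction of (included, excluded) document pairs that are ranked correctly. The function names, the `NGRAM:`/`MESH:` prefixes, the tokenization, and the choice of `max_n = 2` are all hypothetical; the abstract does not specify them, and the classifier that would produce the scores is left out.

```python
import re
from typing import Iterable, List, Sequence, Set


def extract_features(text: str, mesh_terms: Iterable[str], max_n: int = 2) -> Set[str]:
    """Build a binary feature set from word n-grams (n = 1..max_n) plus MeSH terms.

    Assumption: lowercase alphanumeric tokenization; the paper's own
    preprocessing is not described in the abstract.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    features: Set[str] = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features.add("NGRAM:" + " ".join(tokens[i:i + n]))
    # Prefix MeSH terms so they never collide with text-derived features.
    features.update("MESH:" + term.strip().lower() for term in mesh_terms)
    return features


def prioritize(doc_ids: Sequence[str], scores: Sequence[float]) -> List[str]:
    """Order documents so the highest-scoring (most likely useful) come first."""
    ranked = sorted(zip(doc_ids, scores), key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in ranked]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random included document (label 1)
    outscores a random excluded one (label 0); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))
```

Under these assumptions, an AUC of 1.0 means every included document is ranked above every excluded one, and 0.5 corresponds to a random ordering. The pairwise loop is quadratic and is meant only to show the definition; a rank-based computation would be preferable for large document sets.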

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Abstracting and Indexing* / methods
Artificial Intelligence*
Documentation* / classification
Documentation* / methods
Evidence-Based Medicine*
Health Priorities*
Natural Language Processing*
Oregon
Pattern Recognition, Automated / methods
Systematic Reviews as Topic
Workload
