Martini: using literature keywords to compare gene sets

Theodoros G Soldatos, Seán I O'Donoghue, Venkata P Satagopam, Lars J Jensen, Nigel P Brown, Adriano Barbosa-Silva, Reinhard Schneider

Publication: Contribution to journal > Journal article > Research > peer-reviewed

25 Citations (Scopus)

Abstract

Life scientists are often interested in comparing two gene sets to gain insight into differences between two distinct, but related, phenotypes or conditions. Several tools have been developed for comparing gene sets, most of which find Gene Ontology (GO) terms that are significantly over-represented in one gene set. However, such tools often return GO terms that are too generic or too few to be informative. Here, we present Martini, an easy-to-use tool for comparing gene sets. Martini is based, not on GO, but on keywords extracted from Medline abstracts; Martini also supports a much wider range of species than comparable tools. To evaluate Martini we created a benchmark based on the human cell cycle, and we tested several comparable tools (CoPub, FatiGO, Marmite and ProfCom). Martini had the best benchmark performance, delivering a more detailed and accurate description of function. Martini also gave best or equal performance with three other datasets (related to Arabidopsis, melanoma and ovarian cancer), suggesting that Martini represents an advance in the automated comparison of gene sets. In agreement with previous studies, our results further suggest that literature-derived keywords are a richer source of gene-function information than GO annotations. Martini is freely available at http://martini.embl.de.
Original language: English
Journal: Nucleic Acids Research
Volume: 38
Issue number: 1
Pages (from-to): 26-38
Number of pages: 13
ISSN: 0305-1048
DOI
Status: Published - 2010
Published externally: Yes
