SE132:/DS1

From Metabolonote

Sample Set Information

ID SE132
Title TargetSearch - a Bioconductor package for the efficient preprocessing of GC-MS metabolite profiling data
Description We introduce the TargetSearch package, an open source tool that provides a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language, with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R.

TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by analysing both a set of known chemical standard mixtures and a biological experiment. In addition, we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data.

Authors Álvaro Cuadros-Inostroza, Camila Caldana, Henning Redestig, Miyako Kusano, Jan Lisec, Hugo Peña-Cortés, Lothar Willmitzer and Matthew A Hannah
Reference Cuadros-Inostroza et al. (2009) BMC Bioinformatics 10:428
Comment Total of 27 chromatograms with replicates of three different standard mixtures. Provided for validation of the Bioconductor package.

The raw files were stored in DROP Met as "Mixture Dilution Series for Pre-processing Validation"



The raw data files are available on the DROP Met website in the PRIMe database of RIKEN.

Data Analysis Details Information

ID DS1
Title TargetSearch Standard
Description A pre-processing analysis similar to the one described for the standard mixture dataset was performed. We used fatty acid methyl esters (FAMEs) as RI marker standards (Additional file 8) and an in-house reference library composed of 153 metabolites (Additional file 9). This library was manually curated and, in addition to known metabolites, includes several unknown metabolites that have been observed in previous experiments.


After running TargetSearch, we manually compared the final profile with the deconvoluted peak profiles obtained with LECO (see Additional file 10 for the full profile with manual annotations). The profile list contained 138 entries in total, of which 131 metabolites were unambiguously assigned. The 7 ambiguous assignments corresponded to 14 metabolites (2 metabolites per entry). The remaining 8 metabolites (of the 153) were present neither in the final profile nor in the chromatograms, as confirmed by manual inspection. Based on our previous experience, we routinely consider metabolites to be present only if at least 3 correlating masses at the correct RI are identified (this excludes the duplicate isotope pairs that are often observed to correlate). Taking this into account, we would consider 101 metabolites to be identified in this experiment. We thus checked these manually and found that 96 were correctly assigned (these were all assigned a similarity score above 600); 1 ambiguity was wrongly resolved, i.e., the correct metabolite was not the one suggested; 1 ambiguity could not be resolved manually, due to similar RIs and reference spectra; 1 metabolite was not found in the chromatograms; and 2 metabolites were not found, but 2 peaks (unknowns) were found at their expected retention time. The latter could have been anticipated from the profile, since the similarity score reported by TargetSearch was below 400.
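
The counts quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of TargetSearch (which is an R package). All numbers are taken from the description; the names `library_tally`, `MANUAL_CHECK` and `is_considered_present` are hypothetical and introduced only for illustration.

```python
# Counts quoted in the description above (values taken from the text).
LIBRARY_SIZE = 153          # metabolites in the in-house reference library
PROFILE_ENTRIES = 138       # entries in the final profile list
UNAMBIGUOUS = 131           # entries assigned to exactly one metabolite
AMBIGUOUS_ENTRIES = 7       # entries matching two metabolites each
METABOLITES_PER_AMBIGUOUS_ENTRY = 2
NOT_DETECTED = 8            # absent from both profile and chromatograms


def library_tally():
    """Return the number of library metabolites accounted for by the profile."""
    ambiguous_metabolites = AMBIGUOUS_ENTRIES * METABOLITES_PER_AMBIGUOUS_ENTRY
    return UNAMBIGUOUS + ambiguous_metabolites + NOT_DETECTED


# Outcome of the manual check of the metabolites considered identified.
MANUAL_CHECK = {
    "correctly assigned": 96,
    "ambiguity wrongly resolved": 1,
    "ambiguity not resolvable manually": 1,
    "not found in chromatograms": 1,
    "not found, unknown peak at expected retention time": 2,
}


def is_considered_present(n_correlating_masses, min_masses=3):
    """Hypothetical helper for the presence rule stated in the text:
    at least 3 correlating masses at the correct RI (duplicate isotope
    pairs are assumed to have been excluded from the count already)."""
    return n_correlating_masses >= min_masses


if __name__ == "__main__":
    print("library metabolites accounted for:", library_tally())
    print("metabolites checked manually:", sum(MANUAL_CHECK.values()))
    print("fraction correctly assigned:",
          MANUAL_CHECK["correctly assigned"] / sum(MANUAL_CHECK.values()))
```

Under these figures, the unambiguous and ambiguous entries add up to the 138 profile entries, the tally covers all 153 library metabolites, and the manual-check categories sum to the 101 metabolites considered identified.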


