SE132:/S1/M1/D1

From Metabolonote

Sample Set Information

ID SE132
Title TargetSearch - a Bioconductor package for the efficient preprocessing of GC-MS metabolite profiling data
Description We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R.

TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by analysing both a set of known chemical standard mixtures and a biological experiment. In addition, we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data.

Authors Álvaro Cuadros-Inostroza, Camila Caldana, Henning Redestig, Miyako Kusano, Jan Lisec, Hugo Peña-Cortés, Lothar Willmitzer and Matthew A Hannah
Reference Cuadros-Inostroza et al. (2009) BMC Bioinformatics 10:428
Comment Total of 27 chromatograms with replicates of three different standard mixtures. Provided for validation of the Bioconductor package.

The raw files were stored in DROP Met as "Mixture Dilution Series for Pre-processing Validation"



The raw data files are available on the DROP Met website in the PRIMe database of RIKEN.

Sample Information

ID S1
Title wild-type Col-0 Arabidopsis thaliana seedling
Organism - Scientific Name Arabidopsis thaliana
Organism - ID NCBI taxonomy 3702
Compound - ID
Compound - Source
Preparation The biological dataset consisted of wild-type Col-0 Arabidopsis thaliana seedling samples. Three-week-old seedlings, grown on solid MS medium with 1% sucrose, were kept for 4 h under either continuous light or darkness.
Sample Preparation Details ID
Comment

Analytical Method Information

ID M1
Title GC-TOF MS
Method Details ID MS1
Sample Amount
Comment

Analytical Method Details Information

ID MS1
Title GC-TOF MS
Instrument Agilent 6890 gas chromatograph - Leco Pegasus 2 time-of-flight mass spectrometer (LECO)
Instrument Type
Ionization EI
Ion Mode Positive
Description Metabolites were extracted from pools of seedlings in a total of four biological replicates. Extraction and derivatisation of metabolites from leaves for GC-MS analysis were performed as outlined by Lisec et al.

GC-MS data were obtained using an Agilent 7683 series autosampler (Agilent Technologies GmbH, Waldbronn, Germany), coupled to an Agilent 6890 gas chromatograph - Leco Pegasus 2 time-of-flight mass spectrometer (LECO, St. Joseph, MI, USA). Chromatogram acquisition parameters were identical to those previously described by Weckwerth et al.

Comment_of_details Lisec et al. Nat Protoc. (2006) 1(1):387-96.

Weckwerth et al. Proteomics. (2004) Jan;4(1):78-83.



Data Analysis Information

ID D1
Title TargetSearch Standard
Data Analysis Details ID DS1
Recommended decimal places of m/z
Comment



Data Analysis Details Information

ID DS1
Title TargetSearch Standard
Description A pre-processing analysis similar to the one described for the standard mixture dataset was performed. We used fatty acid methyl esters (FAMEs) as RI marker standards (Additional file 8) and an in-house reference library composed of 153 metabolites (Additional file 9). This library was manually curated and, in addition to known metabolites, includes several unknown metabolites that have been observed in previous experiments.


After running TargetSearch, we manually compared the final profile with the deconvoluted peak profiles obtained with LECO (see Additional file 10 for the full profile with manual annotations). The profile list contained 138 entries in total, of which 131 metabolites were unambiguously assigned. The 7 ambiguous assignments corresponded to 14 metabolites (2 metabolites per entry). The remaining 8 metabolites (of the 153) were present neither in the final profile nor in the chromatograms, as confirmed by manual inspection. Based on our previous experience, we routinely consider metabolites to be present only if at least 3 correlating masses at the correct RI are identified (this excludes the duplicate isotope pairs that are often observed to correlate). Taking this into account, we would consider 101 metabolites to be identified in this experiment. We checked these manually and found that 96 were correctly assigned (all with a similarity score above 600); 1 ambiguity was wrongly resolved, i.e., the correct metabolite was not the one suggested; 1 ambiguity could not be resolved manually, due to similar RIs and reference spectra; 1 metabolite was not found in the chromatograms; and 2 metabolites were not found, but 2 peaks (unknowns) were found at their expected retention times. The latter could be anticipated from the profile, since the similarity score reported by TargetSearch was below 400.
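
The counts and the acceptance rule in this description can be checked with a short sketch. It is illustrative only and is not part of TargetSearch (an R package): the `Hit` record and the helpers `is_present` and `needs_review` are hypothetical names. The thresholds (at least 3 correlating masses, similarity score below 400) and all counts are taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one library metabolite in the final profile."""
    name: str
    correlating_masses: int  # masses that correlate at the expected RI
    score: float             # spectral similarity score


def is_present(hit: Hit, min_masses: int = 3) -> bool:
    """Rule from the text: present only with at least 3 correlating masses."""
    return hit.correlating_masses >= min_masses


def needs_review(hit: Hit, low_score: float = 400.0) -> bool:
    """Scores below 400 flagged the two misassigned unknowns in the text."""
    return hit.score < low_score


# Bookkeeping for the profile, using the numbers reported above.
library_size = 153
profile_entries = 138
unambiguous = 131
ambiguous_entries = profile_entries - unambiguous              # 7 entries
metabolites_in_profile = unambiguous + 2 * ambiguous_entries   # 2 per ambiguous entry
absent_from_profile = library_size - metabolites_in_profile

# Outcome of the manual check of the metabolites considered identified.
manual_check = {
    "correctly assigned": 96,
    "ambiguity wrongly resolved": 1,
    "ambiguity not resolvable": 1,
    "not found in chromatograms": 1,
    "unknown peak at expected retention time": 2,
}
identified = sum(manual_check.values())
fraction_correct = manual_check["correctly assigned"] / identified
```

Under these assumptions the tallies reproduce the 7 ambiguous entries, the 8 library metabolites absent from the profile, and the 101 metabolites considered identified, of which 96 were confirmed by manual inspection.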

Comment_of_details