SE124:/DS1
Sample Set Information
ID | TSE3 |
---|---|
Title | Integrated strategy for unknown EI-MS identification using quality control calibration curve, multivariate analysis, EI-MS spectral database, and retention index prediction |
Description | Compound identification using unknown electron ionization (EI) mass spectra in gas chromatography coupled with mass spectrometry (GC-MS) is challenging in untargeted metabolomics, natural product chemistry, or exposome research. While the total count of EI-MS records included in publicly or commercially available databases is over 900 000, efficient use of this huge database has not been achieved in metabolomics. Therefore, we proposed a "four-step" strategy for the identification of biologically significant metabolites using an integrated cheminformatics approach: (i) quality control calibration curve to reduce background noise, (ii) variable selection by hypothesis testing in principal component analysis for the efficient selection of target peaks, (iii) searching the EI-MS spectral database, and (iv) retention index (RI) filtering in combination with RI predictions. In this study, the new MS-FINDER spectral search engine was developed and utilized for searching EI-MS databases using mass spectral similarity with the evaluation of false discovery rate. Moreover, in silico derivatization software, MetaboloDerivatizer, was developed to calculate the chemical properties of derivative compounds, and all retention indexes in EI-MS databases were predicted using a simple mathematical model. The strategy was showcased in the identification of three novel metabolites (butane-1,2,3-triol, 3-deoxyglucosone, and palatinitol) in the Chinese medicine Senkyu for quality assessment, as validated using authentic standard compounds. All tools and curated public EI-MS databases are freely available in the 'Computational MS-based metabolomics' section of the RIKEN PRIMe Web site (http://prime.psc.riken.jp). |
Authors | Teruko Matsuo, Hiroshi Tsugawa, Hiromi Miyagawa, Eiichiro Fukusaki |
Reference | Matsuo et al. Analytical Chemistry (2017) 89(12):6766–6773 |
Comment |
The raw data files are available at the DROP Met website in the RIKEN PRIMe database.
Data Analysis Details Information
ID | DS1 |
---|---|
Title | Data Processing |
Description | NetCDF format files exported from GCMSsolutions (Shimadzu Co., Kyoto, Japan) were converted to Analysis Base Framework (ABF) format files using a free ABF file converter (http://www.reifycs.com/AbfConverter/index.html). MS-DIAL (version 2.48) software was downloaded from the RIKEN PRIMe Web site and used for data processing of the GC–MS data set. The parameters were set as follows: smoothing level, 3; minimum peak height, 2000; average peak width, 20; default values were used for all other parameters. The MSP format file (EI–MS reference library) was created using our in-house database and can also be downloaded from the RIKEN PRIMe Web site (entitled Osaka Univ. DB). Note that the ranking of structure candidates was based on mass spectral similarity, which was the total score of dot product, reverse dot product, and existence percentage of fragment ions (weighted 2:2:1, respectively) in combination with RI similarity. The mathematical functions for these scores follow our previous report. After automatic data processing was finished, the identification results were manually curated with the MS-DIAL graphical user interface by a GC–MS expert, and false-positive identifications were relabeled as “unknown”. A total of 1975 chromatographic peaks (labeled as aligned spots) were created, comprising 127 identified and 1848 unknown peaks (Supporting Information Table S2). All GC–MS data files can be downloaded from the RIKEN Dropmet Web site (http://prime.psc.riken.jp/?action=drop_index). |
Comment_of_details |
The raw data files are available at the DROP Met website in the RIKEN PRIMe database.
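
The 2:2:1 ranking score described under Data Processing can be illustrated with a minimal sketch. It assumes nominal-mass spectra stored as m/z-to-intensity dictionaries and uses plain cosine similarity for both the dot product and the reverse dot product. The exact functions are those of the authors' previous report and are not reproduced here, so this is an approximation rather than the MS-DIAL implementation. The RI similarity term is left out because the record does not state how it is combined. The function names `cosine` and `spectral_score` are hypothetical.

```python
import math


def cosine(a, b, keys):
    """Cosine similarity of two spectra restricted to the given m/z keys."""
    num = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(a.get(k, 0.0) ** 2 for k in keys))
    norm_b = math.sqrt(sum(b.get(k, 0.0) ** 2 for k in keys))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return num / (norm_a * norm_b)


def spectral_score(query, reference):
    """Weighted total of dot product, reverse dot product, and presence (2:2:1).

    Assumption: each component lies in [0, 1] and the total is normalized
    by the sum of the weights, so the result also lies in [0, 1].
    """
    ref_keys = set(reference)
    all_keys = set(query) | ref_keys

    # Dot product: compare over every ion seen in either spectrum.
    dot = cosine(query, reference, all_keys)
    # Reverse dot product: compare only over ions present in the reference,
    # so extra query ions (e.g. co-eluting background) are ignored.
    reverse = cosine(query, reference, ref_keys)
    # Existence percentage: fraction of reference ions found in the query.
    if ref_keys:
        presence = sum(1 for k in ref_keys if query.get(k, 0.0) > 0) / len(ref_keys)
    else:
        presence = 0.0

    return (2 * dot + 2 * reverse + 1 * presence) / 5


# Hypothetical spectra: the query carries one extra ion (m/z 207).
reference = {73: 100.0, 147: 50.0}
query = {73: 100.0, 147: 50.0, 207: 80.0}
print(round(spectral_score(query, reference), 3))
```

In this example the extra query ion lowers only the dot product; the reverse dot product and the existence percentage stay at their maximum, which is why the three terms are usually reported together.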