Identifying compounds from unknown electron ionization (EI) mass spectra in gas chromatography coupled with mass spectrometry (GC-MS) remains challenging in untargeted metabolomics, natural product chemistry, and exposome research. Although publicly and commercially available databases together contain more than 900 000 EI-MS records, this large resource has not yet been used efficiently in metabolomics. We therefore propose a "four-step" strategy for identifying biologically significant metabolites using an integrated cheminformatics approach: (i) a quality control calibration curve to reduce background noise, (ii) variable selection by hypothesis testing in principal component analysis for the efficient selection of target peaks, (iii) searching of the EI-MS spectral database, and (iv) retention index (RI) filtering in combination with RI prediction. In this study, the new MS-FINDER spectral search engine was developed and used to search EI-MS databases by mass spectral similarity, with evaluation of the false discovery rate. Moreover, in silico derivatization software, MetaboloDerivatizer, was developed to calculate the chemical properties of derivatized compounds, and all retention indices in the EI-MS databases were predicted using a simple mathematical model. The strategy was showcased by identifying three novel metabolites (butane-1,2,3-triol, 3-deoxyglucosone, and palatinitol) in the Chinese medicine Senkyu for quality assessment, with the identifications validated using authentic standard compounds. All tools and curated public EI-MS databases are freely available in the 'Computational MS-based metabolomics' section of the RIKEN PRIMe Web site (http://prime.psc.riken.jp).
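
Steps (iii) and (iv) can be illustrated with a minimal sketch: rank library spectra by spectral similarity, then discard candidates whose retention index lies too far from the observed one. This is not the MS-FINDER scoring function. The cosine score on nominal-mass bins, the thresholds `min_score` and `ri_tolerance`, the function names, and the three toy library entries are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {nominal m/z: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_library(query_spec, query_ri, library, min_score=0.7, ri_tolerance=50.0):
    """Return (name, score) hits passing both the similarity and the RI filter.

    min_score and ri_tolerance are illustrative defaults, not published values.
    """
    hits = []
    for entry in library:
        score = cosine_similarity(query_spec, entry["spectrum"])
        if score < min_score:
            continue  # step (iii): spectrum too dissimilar
        if abs(entry["ri"] - query_ri) > ri_tolerance:
            continue  # step (iv): measured or predicted RI too far from the query
        hits.append((entry["name"], score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy library (hypothetical entries; intensities and RI values are made up).
LIBRARY = [
    {"name": "candidate_A", "ri": 1210.0, "spectrum": {73: 100.0, 147: 40.0, 205: 25.0}},
    {"name": "candidate_B", "ri": 1650.0, "spectrum": {73: 100.0, 147: 40.0, 205: 25.0}},
    {"name": "candidate_C", "ri": 1200.0, "spectrum": {57: 100.0, 71: 60.0, 85: 30.0}},
]

query_spectrum = {73: 95.0, 147: 45.0, 205: 20.0}
hits = search_library(query_spectrum, query_ri=1205.0, library=LIBRARY)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

In this toy case candidate_A and candidate_B have identical spectra, so similarity alone cannot separate them; only the RI filter removes candidate_B. This is the situation in which RI prediction for library records lacking a measured RI becomes useful.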