matchms

Spectral similarity and compound identification for metabolomics. Use for comparing mass spectra, computing similarity scores (cosine, modified cosine), and identifying unknown compounds from spectral libraries. Best for metabolite identification, spectral matching, library searching. For full LC-MS/MS proteomics pipelines use pyopenms.

Install

Download and extract to your skills directory

Copy command and send to OpenClaw for auto-install:

Download and install this skill https://openskills.cc/api/download?slug=k-dense-ai-scientific-skills-matchms&locale=en&source=copy

Matchms - Mass Spectral Similarity and Metabolite Identification Tool

Overview of Capabilities


Matchms is an open-source Python library for mass spectrometry data processing and spectral similarity analysis. It supports importing mass spectra from multiple formats, computing similarity scores, performing spectral library matching, and identifying unknown compounds.

Use Cases

1. Metabolite Identification and Spectral Matching


To identify unknown metabolites from mass spectrometry data, Matchms compares your spectra against reference libraries and ranks the candidates by a match score such as cosine similarity or modified cosine similarity, so the most similar known compounds appear first. It supports GNPS-format library searches and is suitable for metabolomics research and drug discovery.
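
The matching step described above can be sketched in plain Python. This is a simplified illustration of a greedy cosine score and a ranked library search, not the Matchms implementation: the function names, the toy library, and the 0.1 m/z tolerance are all assumptions made for the example.

```python
from math import sqrt

def cosine_greedy(peaks_a, peaks_b, tolerance=0.1):
    """Return (score, n_matches) for two lists of (mz, intensity) tuples."""
    # Collect every candidate pair whose m/z values agree within the tolerance.
    candidates = []
    for i, (mz_a, int_a) in enumerate(peaks_a):
        for j, (mz_b, int_b) in enumerate(peaks_b):
            if abs(mz_a - mz_b) <= tolerance:
                candidates.append((int_a * int_b, i, j))
    # Greedy assignment: strongest pairs first, each peak used at most once.
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    total = 0.0
    for product, i, j in candidates:
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        total += product
    norm = sqrt(sum(x * x for _, x in peaks_a)) * sqrt(sum(x * x for _, x in peaks_b))
    score = total / norm if norm > 0 else 0.0
    return score, len(used_a)

def search_library(query, library, tolerance=0.1, min_matches=1):
    """Score the query against every library entry and sort best-first."""
    hits = []
    for name, peaks in library.items():
        score, n_matches = cosine_greedy(query, peaks, tolerance)
        if n_matches >= min_matches:
            hits.append((name, score, n_matches))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical reference library and query spectrum (made-up values).
library = {
    "compound_A": [(100.0, 0.7), (150.0, 0.2), (200.0, 1.0)],
    "compound_B": [(105.0, 0.4), (150.0, 1.0), (190.0, 0.1)],
}
query = [(100.02, 0.7), (150.01, 0.2), (200.0, 1.0)]

hits = search_library(query, library)
for name, score, n_matches in hits:
    print(f"{name}: score={score:.3f}, matched peaks={n_matches}")
```

Reporting the number of matched peaks alongside the score is a deliberate choice: a high score that rests on a single shared peak is weaker evidence than the same score from many peaks.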

2. Mass Spectrometry Data Preprocessing and Quality Control


Raw data must be cleaned and standardized before analysis. Matchms provides over 40 filters for quality control steps such as peak intensity normalization, metadata normalization, precursor ion removal, and minimum peak count requirements, which helps keep data quality consistent for downstream analyses.
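
To make the filtering steps concrete, here is a minimal plain-Python sketch of three of them: precursor peak removal, intensity normalization, and a minimum peak count. The helper names, the 1.0 m/z window, and the example values are assumptions for illustration; they are not the Matchms filter functions.

```python
def drop_precursor_peak(peaks, precursor_mz, window=1.0):
    """Remove peaks that lie within `window` m/z of the precursor."""
    return [(mz, i) for mz, i in peaks if abs(mz - precursor_mz) > window]

def scale_to_base_peak(peaks):
    """Scale intensities so the most intense peak becomes 1.0."""
    top = max((i for _, i in peaks), default=0.0)
    if top <= 0:
        return list(peaks)
    return [(mz, i / top) for mz, i in peaks]

def enough_peaks(peaks, n_required):
    """Return the peaks unchanged, or None to reject a sparse spectrum."""
    return peaks if len(peaks) >= n_required else None

def clean(peaks, precursor_mz, n_required=3):
    """Apply the three steps in order; None means the spectrum was rejected."""
    peaks = drop_precursor_peak(peaks, precursor_mz)
    peaks = scale_to_base_peak(peaks)
    return enough_peaks(peaks, n_required)

# Made-up spectrum as (mz, intensity) pairs with a precursor at m/z 300.2.
raw = [(50.0, 20.0), (120.5, 400.0), (250.1, 1000.0), (300.2, 800.0)]
cleaned = clean(raw, precursor_mz=300.2)

# A spectrum with too few fragment peaks is rejected.
sparse = clean([(90.0, 10.0), (300.2, 50.0)], precursor_mz=300.2)

print(cleaned)
print(sparse)
```

Returning None for a rejected spectrum lets later steps skip it without raising an error, which is convenient when thousands of spectra pass through the same checks.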

3. Large-Scale Spectral Similarity Comparisons


When processing large numbers of mass spectra, Matchms supports batch computation of similarity matrices for spectral clustering, network analysis, and exploration of relationships between samples. You can build reproducible multi-step processing pipelines suitable for automated analysis workflows.
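
An all-against-all comparison can be sketched as follows. To keep the example short it uses a binned-vector cosine rather than tolerance-based peak matching, which is a simplification: two peaks close to a bin edge can fall into different bins. The function names, bin width, and 0.7 threshold are assumptions for illustration.

```python
from math import sqrt

def binned_vector(peaks, bin_width=1.0):
    """Sum intensities into fixed-width m/z bins, stored as a sparse dict."""
    vector = {}
    for mz, intensity in peaks:
        index = int(mz // bin_width)
        vector[index] = vector.get(index, 0.0) + intensity
    return vector

def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * vec_b.get(key, 0.0) for key, value in vec_a.items())
    norm = sqrt(sum(v * v for v in vec_a.values())) * sqrt(sum(v * v for v in vec_b.values()))
    return dot / norm if norm else 0.0

def similarity_matrix(spectra, bin_width=1.0):
    """Symmetric matrix of pairwise scores; only the upper half is computed."""
    vectors = [binned_vector(peaks, bin_width) for peaks in spectra]
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            matrix[i][j] = matrix[j][i] = cosine(vectors[i], vectors[j])
    return matrix

def edges_above(matrix, threshold):
    """Index pairs whose score reaches the threshold, e.g. as network edges."""
    n = len(matrix)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if matrix[i][j] >= threshold]

# Three made-up spectra: the first two are similar, the third is unrelated.
spectra = [
    [(100.2, 1.0), (150.4, 0.5)],
    [(100.3, 1.0), (150.1, 0.4)],
    [(80.0, 1.0), (210.7, 0.9)],
]
matrix = similarity_matrix(spectra)
edges = edges_above(matrix, threshold=0.7)
print(edges)
```

The resulting edge list is the usual starting point for clustering or a similarity network, with each spectrum as a node.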

Core Features

1. Multi-Format Mass Spectrometry Data Import/Export


Supports major mass spectrometry formats such as mzML, mzXML, MGF, MSP, and JSON. Data can be imported from instrument data that has been converted to an open format (mzML, mzXML) or from spectral libraries like GNPS, and processed results can be exported in standard formats. Whatever the data source, you can process it uniformly with Matchms.
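
For readers unfamiliar with these formats, the sketch below parses a small MGF snippet by hand to show what such a file contains: blocks delimited by BEGIN IONS and END IONS, KEY=VALUE metadata lines, and m/z intensity pairs. Matchms ships its own loaders, so this reader is only an illustration; the function name and the sample record are assumptions.

```python
import io

def read_mgf(handle):
    """Yield one dict per BEGIN IONS block: {'metadata': {...}, 'peaks': [...]}."""
    spectrum = None
    for raw_line in handle:
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            spectrum = {"metadata": {}, "peaks": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is not None:
            if "=" in line:
                # Metadata line such as TITLE=... or PEPMASS=...
                key, value = line.split("=", 1)
                spectrum["metadata"][key.strip().lower()] = value.strip()
            else:
                # Peak line: m/z and intensity separated by whitespace.
                mz, intensity = line.split()[:2]
                spectrum["peaks"].append((float(mz), float(intensity)))

# Made-up MGF content; PEPMASS holds the precursor m/z.
sample = """\
BEGIN IONS
TITLE=example spectrum
PEPMASS=301.14
CHARGE=1+
100.05 12.0
150.10 80.5
END IONS
"""

spectra = list(read_mgf(io.StringIO(sample)))
print(spectra[0]["metadata"])
print(spectra[0]["peaks"])
```

Metadata values are kept as strings here; converting fields such as the precursor m/z to numbers is the kind of metadata normalization a processing step would handle.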

2. Multiple Similarity Computation Algorithms


Provides a variety of similarity functions including CosineGreedy (fast cosine similarity), ModifiedCosine (accounts for precursor mass differences), NeutralLossesCosine (neutral loss patterns), and FingerprintSimilarity (molecular fingerprints), allowing you to choose the most appropriate algorithm for your specific analysis needs.
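
The difference between a plain cosine and a modified cosine can be shown with a small plain-Python sketch. In the modified variant a peak pair also counts as a match when the two m/z values differ by the precursor mass difference. The function below is a simplified illustration with assumed names and made-up spectra, not the Matchms classes themselves.

```python
from math import sqrt

def greedy_cosine(peaks_a, peaks_b, tolerance=0.1, shift=0.0):
    """Greedy cosine score; pairs match directly or after adding `shift` to A."""
    candidates = []
    for i, (mz_a, int_a) in enumerate(peaks_a):
        for j, (mz_b, int_b) in enumerate(peaks_b):
            direct = abs(mz_a - mz_b) <= tolerance
            shifted = abs(mz_a + shift - mz_b) <= tolerance
            if direct or shifted:
                candidates.append((int_a * int_b, i, j))
    # Strongest pairs first, each peak used at most once.
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    total = 0.0
    for product, i, j in candidates:
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        total += product
    norm = sqrt(sum(x * x for _, x in peaks_a)) * sqrt(sum(x * x for _, x in peaks_b))
    return total / norm if norm > 0 else 0.0

# Two made-up analogs: B carries a modification of +14 m/z on the precursor,
# so two of its fragments are shifted by the same amount.
precursor_a, peaks_a = 300.0, [(100.0, 0.5), (180.0, 1.0), (285.0, 0.3)]
precursor_b, peaks_b = 314.0, [(100.0, 0.5), (194.0, 1.0), (299.0, 0.3)]

plain = greedy_cosine(peaks_a, peaks_b)
modified = greedy_cosine(peaks_a, peaks_b, shift=precursor_b - precursor_a)
print(f"plain cosine:    {plain:.3f}")
print(f"modified cosine: {modified:.3f}")
```

With a plain cosine only the unshifted fragment matches, so the two analogs look dissimilar. Allowing the shift recovers the shared fragmentation pattern, which is why the modified variant is the usual choice when searching for structurally related compounds rather than exact matches.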

3. Customizable Spectral Processing Pipelines


With SpectrumProcessor you can combine multiple filtering steps to build reproducible analysis workflows. From metadata normalization and peak filtering to similarity calculation, the entire process can be saved and reused, ensuring consistency and traceability of analyses.
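
The idea of chaining filters into a reusable pipeline can be sketched without the library. The `MiniProcessor` class below is a hypothetical stand-in written for this example, not the SpectrumProcessor API, and the filters and spectra are likewise assumptions.

```python
def lowercase_metadata_keys(spectrum):
    """Harmonize metadata keys so later steps can rely on one spelling."""
    metadata = {key.lower(): value for key, value in spectrum["metadata"].items()}
    return {**spectrum, "metadata": metadata}

def require_precursor_mz(spectrum):
    """Reject spectra that carry no precursor m/z."""
    return spectrum if "precursor_mz" in spectrum["metadata"] else None

def keep_top_n_peaks(n):
    """Build a filter that keeps the n most intense peaks, sorted by m/z."""
    def _filter(spectrum):
        top = sorted(spectrum["peaks"], key=lambda peak: peak[1], reverse=True)[:n]
        return {**spectrum, "peaks": sorted(top)}
    _filter.__name__ = f"keep_top_{n}_peaks"
    return _filter

class MiniProcessor:
    """Apply an ordered list of filters; a filter returning None drops the spectrum."""

    def __init__(self, steps):
        self.steps = list(steps)

    def describe(self):
        """Names of the steps in order, useful for logging what was applied."""
        return [step.__name__ for step in self.steps]

    def process(self, spectrum):
        for step in self.steps:
            spectrum = step(spectrum)
            if spectrum is None:
                return None
        return spectrum

    def process_all(self, spectra):
        processed = (self.process(spectrum) for spectrum in spectra)
        return [spectrum for spectrum in processed if spectrum is not None]

# Two made-up spectra; the second lacks a precursor m/z and is dropped.
first = {
    "metadata": {"PRECURSOR_MZ": 301.1, "Name": "example"},
    "peaks": [(50.0, 5.0), (80.0, 90.0), (120.0, 40.0), (200.0, 100.0)],
}
second = {"metadata": {"Name": "no precursor"}, "peaks": [(60.0, 10.0)]}

processor = MiniProcessor([lowercase_metadata_keys, require_precursor_mz, keep_top_n_peaks(3)])
steps = processor.describe()
cleaned = processor.process_all([first, second])
print(steps)
print(cleaned)
```

Because the step list is plain data, the same processor can be applied to query spectra and library spectra alike, and the output of `describe()` can be stored next to the results to record how they were produced.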

Frequently Asked Questions

How do I choose between Matchms and pyopenms?


They have different focuses. Matchms concentrates on spectral similarity and metabolite identification, which suits spectral library matching and compound identification. pyopenms provides Python bindings to OpenMS, a broader toolkit for LC-MS/MS data processing that includes full proteomics pipelines. If you only need spectral comparison and library searching, Matchms is lighter and easier to use.

What mass spectrometry file formats are supported?


Matchms supports mzML and mzXML (open formats for instrument data), MGF (Mascot Generic Format), MSP (spectral library format), JSON (GNPS-compatible), USI references, and other formats, covering the common mass spectrometry data exchange formats.

How do I handle analyses that require molecular structure information?


Install the matchms chemistry extra to work with molecular structure representations such as SMILES and InChI: uv pip install "matchms[chemistry]" (the quotes keep shells such as zsh from interpreting the square brackets). After installation you can use molecular fingerprint similarity, derivation of chemical information, and other related functions.
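
Fingerprint similarity compares structures rather than spectra. A common measure is the Tanimoto (Jaccard) coefficient over the bits that are set in each fingerprint. The sketch below uses hand-written bit sets as stand-ins: real fingerprints would be derived from SMILES or InChI by a cheminformatics toolkit, and the function name and values here are assumptions for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared set bits divided by all set bits."""
    bits_a, bits_b = set(bits_a), set(bits_b)
    union = bits_a | bits_b
    if not union:
        return 0.0  # two empty fingerprints: treated as no similarity here
    return len(bits_a & bits_b) / len(union)

# Made-up fingerprints given as the positions of the bits that are set.
fingerprint_a = {1, 4, 7, 9}
fingerprint_b = {1, 4, 8, 9, 12}

similarity = tanimoto(fingerprint_a, fingerprint_b)
print(f"Tanimoto similarity: {similarity:.2f}")
```

The two fingerprints share three bits out of six distinct bits, so the score is 0.5. Comparing structural scores like this with spectral scores is one way to check how well a spectral similarity measure reflects chemical similarity.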