Source code for SpecEncoder: Deep Metric Learning for Accurate Peptide Identification in Proteomics
Free for academic use. Licensed under LGPL.
Visit https://predfull.com/ for related work.
- 2024.03.07: Second version.
- 2023.10.28: First version.
The model is based on the structure of residual convolutional networks.
Different workflows:
We recommend installing the dependencies via Anaconda:
- Python >= 3.7
- Tensorflow >= 2.5.0
- Pandas >= 0.20
- pyteomics
- numba
- Tensorflow-addons
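As a quick sanity check after installation, the snippet below reports which of the listed packages are visible to the current Python environment. It is a minimal sketch, not part of this repository: the helper name `installed_versions` is hypothetical, the distribution names are assumed to match the usual PyPI names, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names are assumptions based on the dependency list above.
report = installed_versions(
    ["tensorflow", "pandas", "pyteomics", "numba", "tensorflow-addons"]
)
for name, version in report.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Compare the printed versions against the minimums listed above before running the tools.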
After cloning this project, download the pre-trained model (encoder.h5) from zenodo.org and place it in the SpecEncoder folder.
First, convert the query and database spectra into vectors:
python encode.py --query query.mgf --model model.h5 --output query.npy
python encode.py --query database.mgf --model model.h5 --output database.npy
Then run the search:
python search.py --query query.npy --db database.npy --output result.tsv
Typical speed: encoding runs at around 700 spectra per second on an NVIDIA A6000 GPU.
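To give a feel for what searching over encoded vectors involves, here is a minimal NumPy sketch of nearest-neighbour lookup by cosine similarity. It is not the logic of search.py: it assumes each .npy file holds a 2-D array with one row per spectrum, that cosine similarity is a reasonable score, and that no row is all zeros. The function name `cosine_top_k` and the toy arrays are hypothetical.

```python
import numpy as np


def cosine_top_k(query, database, k=1):
    """Return (indices, scores) of the k most similar database rows per query row."""
    # L2-normalise rows so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    d = database / np.linalg.norm(database, axis=1, keepdims=True)
    sims = q @ d.T  # shape: (n_query, n_database)
    # Sort each row by descending similarity and keep the first k columns.
    idx = np.argsort(-sims, axis=1)[:, :k]
    scores = np.take_along_axis(sims, idx, axis=1)
    return idx, scores


# Toy arrays standing in for np.load("query.npy") and np.load("database.npy").
database = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([[0.9, 0.1], [0.5, 0.5]])
idx, scores = cosine_top_k(query, database, k=1)
print(idx[:, 0])  # best-matching database row for each query
```

For large databases the full similarity matrix may not fit in memory, so processing the queries in batches would be a natural extension.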
See train.py for sample training code.