This repository contains the NbX models for the re-ranking of nanobody–antigen binding poses.
Tam, C.; Kumar, A.; Zhang, K.Y.J. NbX: Machine Learning-Guided Re-Ranking of Nanobody–Antigen Binding Poses. Pharmaceuticals 2021, 14, 968. https://doi.org/10.3390/ph14100968
The five models inside the `model` folder are the XGBoost models shown in Figure 1 of our NbX paper, i.e. the 5-fold validated models trained and tested on Nb-Ag complexes whose pairwise Ag structural alignment quality score is < 0.9, to minimize train-test information leakage.
Model | PR-AUC |
---|---|
model_0001 | 0.229 |
model_0002 | 0.169 |
model_0003 | 0.349 (best) |
model_0004 | 0.205 |
model_0005 | 0.276 |
To run an example: do step1, step2 then go directly to step5.
To run with your Nb-Ag complex structures: start from step1.
- First, run:
git clone https://github.com/johnnytam100/NbX.git
cd NbX
conda env create -f environment.yml
conda activate nbx
The resulting environment should have the following libraries installed. If any of them is missing, install it manually:
pip install biopandas==0.4.1
pip install xgboost==0.90
pip install scikit-learn==0.22.2.post1
pip install joblib==1.1.0
pip install dill==0.3.5.1
- Secondly, in the activated `nbx` environment, manually install:
(1) PyRosetta (https://www.pyrosetta.org/downloads)
- NbX was tested on a PyRosetta installation with `pyrosetta-2022.23+release.f1e0f6d7bf7-cp38-cp38-linux_x86_64.whl`.
(2) FoldX (https://foldxsuite.crg.eu/)
(3) DockQ (optional, https://github.com/bjornwallner/DockQ)
(4) Rosetta (optional, https://new.rosettacommons.org/demos/latest/tutorials/install_build/install_build)
Change the following paths inside NbX_feature_prep.py
path_to_python = "/home/cltam/anaconda3/envs/nbx/bin/python"
- Get this path inside the activated `nbx` environment with the command `which python`.
path_to_foldx = "/data/cltam/script/FoldX/foldx_20221231"
path_to_dockq = "/data/cltam/script/DockQ/"
(optional; be careful not to omit the trailing `/` in this path)
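Before running feature preparation, it can help to sanity-check these three paths. The sketch below is not part of NbX: the helper name `check_nbx_paths` is hypothetical, and it assumes only that the Python and FoldX paths point at files and that the optional DockQ path is a directory written with a trailing `/`.

```python
import os


def check_nbx_paths(path_to_python, path_to_foldx, path_to_dockq=None):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper for illustration, not part of NbX.
    """
    problems = []
    # Assumption: both of these point at executable files, as in the examples above.
    for name, path in (("path_to_python", path_to_python),
                       ("path_to_foldx", path_to_foldx)):
        if not os.path.isfile(path):
            problems.append(f"{name}: no file at {path}")
    if path_to_dockq is not None:  # DockQ is optional
        if not path_to_dockq.endswith("/"):
            problems.append("path_to_dockq: missing trailing '/'")
        if not os.path.isdir(path_to_dockq):
            problems.append(f"path_to_dockq: no directory at {path_to_dockq}")
    return problems
```

Calling `check_nbx_paths(path_to_python, path_to_foldx, path_to_dockq)` with the values you set in `NbX_feature_prep.py` should return an empty list when everything is in place.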
Before any docking, please renumber your nanobody with PyIgClassify (http://dunbrack2.fccc.edu/pyigclassify/)
If your Nb (or Nb-Ag complex) structure is confidential and you do not want to submit it to a web server, modify the CDR start and end residue numbers inside `NbX_feature_prep.py` instead (search for "CDR1_start_residue").
cp (path to your Nb-Ag complex structures .pdb) ./run_NbX
cd run_NbX
cp -r ../model ../NbX_feature_prep.py ../aaDescriptors.csv ../NbX_predict.py ./
python NbX_feature_prep.py --antigen_chain A --antibody_chain H
python NbX_feature_prep.py --antigen_chain A --antibody_chain H --native 6oq8_complex.pdb
python NbX_predict.py
Important concept in NbX re-ranking: sort `mean_predicted_CAPRI_binary_proba` in `NbX_prediction.csv` in descending order to obtain the re-ranked results.
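The descending sort can be done in a spreadsheet or with a few lines of Python. The sketch below uses only the standard library and assumes that `NbX_prediction.csv` has a header row containing the `mean_predicted_CAPRI_binary_proba` column; no other column is assumed. The function name `rerank` is hypothetical and not part of NbX.

```python
import csv


def rerank(csv_path, score_column="mean_predicted_CAPRI_binary_proba"):
    """Read a prediction CSV and return its rows sorted by score, highest first.

    Hypothetical helper; each row is a dict keyed by the CSV header names.
    """
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    # Values are read as text, so convert to float before comparing.
    return sorted(rows, key=lambda row: float(row[score_column]), reverse=True)
```

For example, `rerank("NbX_prediction.csv")[0]` would be the pose that NbX considers most likely to be native-like for that Nb-Ag pair.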
To mimic the NbX benchmark setting, you can perform RosettaDock refinement of your Nb-Ag complex structures before feature preparation.
step 1 : change `ROSETTA_PATH` in `RosettaDock.sh` to your Rosetta path.
step 2 : sh RosettaDock.sh
- "Garbage in, garbage out". NbX is not a docking method but a re-ranking method; its ability to suggest native-like solutions depends entirely on the quality of the input Nb-Ag complex structures.
Take action:
  - Use a docking algorithm that is well-tested on predicting native-like Nb-Ag complex structures, no matter how the docking method ranks them.
  - We used Nb-Ag complex structures from ClusPro -> RosettaDock full-atom refinement to benchmark NbX. Please use equivalent or better docking methods.
- NbX is largely unable to model a single classification threshold that applies across all tested Nb-Ag complexes to distinguish non-native-like (0) from native-like (1) Nb-Ag complex structures.
Take action:
  - Sort `mean_predicted_CAPRI_binary_proba` in `NbX_prediction.csv` in descending order (i.e. the mean native-like probability of the 5-fold validated NbX models). This is our NbX re-rank for you.
  - Do consider top ranks as more probable native-like Nb-Ag complex structures compared to the lower ranks.
  - Avoid applying a single classification threshold on the absolute value of the probability to distinguish non-native-like (0) from native-like (1) among unrelated Nb-Ag pairs.
  - Avoid comparing the absolute value of the predicted probability among unrelated Nb-Ag pairs.
  - 202206 Update: Use the following distributions of `mean_predicted_CAPRI_binary_proba` among 1) crystal, 2) native-like and 3) non-native-like Nb-Ag complex structures to guide your selection of native-like Nb-Ag complex structures.
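In practice, the advice above amounts to ranking poses within one Nb-Ag pair and never comparing raw probabilities across pairs. A minimal sketch follows, using made-up probabilities and hypothetical pair and pose labels (NbX itself does not define a `pair` field, and `rank_within_pairs` is not part of NbX):

```python
from collections import defaultdict


def rank_within_pairs(records):
    """Group (pair, pose, probability) records by pair and rank each group separately.

    Returns {pair: [(rank, pose, probability), ...]} where rank 1 is the highest
    probability within that pair.
    """
    grouped = defaultdict(list)
    for pair, pose, proba in records:
        grouped[pair].append((pose, proba))
    ranked = {}
    for pair, poses in grouped.items():
        ordered = sorted(poses, key=lambda item: item[1], reverse=True)
        ranked[pair] = [(rank, pose, proba)
                        for rank, (pose, proba) in enumerate(ordered, start=1)]
    return ranked


# Made-up probabilities for two unrelated, hypothetical Nb-Ag pairs.
records = [
    ("pair_A", "pose_1", 0.62), ("pair_A", "pose_2", 0.81), ("pair_A", "pose_3", 0.55),
    ("pair_B", "pose_1", 0.12), ("pair_B", "pose_2", 0.07), ("pair_B", "pose_3", 0.19),
]
ranked = rank_within_pairs(records)
```

With these illustrative numbers, a fixed cutoff of 0.5 would accept every pose of `pair_A` and reject every pose of `pair_B`, whereas the per-pair ranking still yields a top candidate for each pair.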