pfnet-research/pocket_detection

Protein pocket detection

GitHub repository for the paper: Protein ligand binding site prediction using graph transformer neural network, by Ryuichiro Ishitani, Mizuki Takemoto, and Kentaro Tomii.

Install

This package runs in the following environment:

  • Python: 3.10
  • PyTorch: 2.0.1
  • PyG: 2.5.3

The other package dependencies are described in requirements.txt.

To install the package and dependencies:

pip install .
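After installation, it can help to confirm that the installed versions match the ones listed above. The following is a minimal sketch using only the standard library; the distribution names `torch` and `torch_geometric` are assumed to be the names under which PyTorch and PyG are installed, and the helper names are hypothetical.

```python
import sys
from importlib import metadata

# Versions listed in this README. The distribution names are an assumption
# (the usual PyPI names for PyTorch and PyG).
EXPECTED = {"torch": "2.0.1", "torch_geometric": "2.5.3"}


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each distribution name to (expected, found, matches)."""
    report = {}
    for name, want in expected.items():
        found = installed_version(name)
        # Ignore local build suffixes such as "+cu118" when comparing.
        matches = found is not None and found.split("+")[0] == want
        report[name] = (want, found, matches)
    return report


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])), "(README: 3.10)")
    for name, (want, found, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {found}, {status}")
```

A mismatch does not necessarily mean the package will fail; it only flags a difference from the environment listed above.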

Fpocket

Install Fpocket version 4.2 in a location that is on your PATH. In the following example, it is installed under /usr/local.

$ git clone https://github.com/Discngine/fpocket.git
$ cd fpocket
$ git checkout 4.2
$ make
$ sudo make install
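To confirm that the Fpocket executable is reachable through PATH after installation, a quick check such as the following may be useful. It is only a sketch: it assumes the installed binary is named `fpocket`, and `find_fpocket` is a hypothetical helper, not part of this package.

```python
import shutil


def find_fpocket(executable="fpocket"):
    """Return the absolute path of the executable found on PATH, or None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_fpocket()
    if path is None:
        print("fpocket was not found on PATH")
    else:
        print("fpocket found at", path)
```

This only checks that the executable can be located; it does not verify that the installed version is 4.2.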

Usage

Download dataset and model weights

Download the model weights archive from Zenodo (DOI) and save it to the examples directory.

Extract the archive in the examples directory.

$ tar xJvf best_models.tar.xz

The five model files (fold0_best_model.pt through fold4_best_model.pt) will be extracted to the examples directory.
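A short check that all five files were extracted could look like the sketch below. The file names follow the pattern given above; the function name `missing_model_files` and the relative path `examples` are illustrative assumptions.

```python
from pathlib import Path


def missing_model_files(directory, n_folds=5):
    """Return the expected model file names that are absent from directory."""
    directory = Path(directory)
    expected = [f"fold{i}_best_model.pt" for i in range(n_folds)]
    return [name for name in expected if not (directory / name).is_file()]


if __name__ == "__main__":
    # "examples" is assumed to be relative to the current working directory.
    missing = missing_model_files("examples")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All five model files are present.")
```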

Inference using PDB file

This repository includes a sample file, 1SQN.pdb. To make predictions on it using the model weights downloaded from Zenodo as described above, no edits to the configuration file are needed.

If you intend to make predictions on your own PDB file, prepare the PDB file for the target protein you want to predict. First, remove water molecules, ligands, and other non-amino-acid residues. Then, modify run_pdb_infer.sh as described below.
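The cleaning step can be approximated with a small text filter. The sketch below is one possible approach, not the authors' procedure: it keeps ATOM records whose residue name is one of the 20 standard amino acids, keeps TER and END records, and drops everything else (including all HETATM records, so modified residues such as selenomethionine are also removed). The function name `strip_non_protein` is hypothetical.

```python
# Three-letter codes of the 20 standard amino acids.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def strip_non_protein(pdb_text):
    """Keep standard amino-acid ATOM records plus TER/END; drop the rest."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "ATOM":
            # Residue name occupies columns 18-20 of a PDB ATOM record.
            if line[17:20] in STANDARD_RESIDUES:
                kept.append(line)
        elif record in {"TER", "END"}:
            kept.append(line)
    return "\n".join(kept) + "\n"
```

Writing the returned text to a new file leaves the original PDB file untouched. For structures with alternate locations, multiple models, or non-standard residues that should be retained, a dedicated structure-handling library is likely a better choice.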

sampler.pdb_files=["<Path to PDB file>"] \

To carry out predictions on multiple PDB files at once, make the following adjustments to the run_pdb_infer.sh file.

sampler.pdb_files=["<Path to PDB file1>","<Path to PDB file2>"] \
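When many files are involved, the list value can be generated rather than typed by hand. The following sketch builds the `sampler.pdb_files=[...]` argument in the same form as shown above, with no spaces around commas or brackets; `format_pdb_files_override` is a hypothetical helper, and the result still has to be pasted into run_pdb_infer.sh manually.

```python
def format_pdb_files_override(paths):
    """Build the sampler.pdb_files argument with no spaces around separators."""
    # Each path is wrapped in double quotes and joined with bare commas.
    quoted = ",".join(f'"{str(path).strip()}"' for path in paths)
    return f"sampler.pdb_files=[{quoted}]"


if __name__ == "__main__":
    # Placeholder file names for illustration only.
    print(format_pdb_files_override(["protein_a.pdb", "protein_b.pdb"]))
```

Paths that themselves contain spaces are not handled by this sketch and would need additional quoting in the shell script.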

Make sure there are no spaces before or after commas or brackets. Then run run_pdb_infer.sh as follows:

bash run_pdb_infer.sh

After the computation is finished, a CSV file named infer_results.csv will be generated. This CSV file contains the following columns:

  • PDB_ID: Name of the input file
  • pred_0 to pred_4: Predicted value (between 0 and 1) from each of the five models, where higher values indicate better predictions.
  • pred_aver: Average of the predicted values across the five models
  • pred_std: Standard deviation of the predicted values across the five models
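As an illustration of working with this output, the sketch below parses the CSV text and sorts the rows by pred_aver in descending order. It relies only on the column names listed above and on the standard library; the function name `rank_by_average` is hypothetical.

```python
import csv
import io


def rank_by_average(csv_text):
    """Parse CSV text and return its rows sorted by pred_aver, highest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Values are read as strings, so convert before comparing.
    return sorted(rows, key=lambda row: float(row["pred_aver"]), reverse=True)
```

For example, passing the contents of infer_results.csv (read with `open("infer_results.csv", newline="").read()`) returns a list of dictionaries keyed by column name, so `row["PDB_ID"]` and `row["pred_std"]` can be inspected alongside the average.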

Citation

If you find our work relevant to your research, please cite:

@article{ishitani2024pokeformer,
    title={Protein ligand binding site prediction using graph transformer neural network},
    author={Ryuichiro Ishitani and Mizuki Takemoto and Kentaro Tomii},
    year={2024},
    journal={}
}
