Hongtu Liu

Contents

1. Help for using HDOCK server

Help for using HDOCK server

1 How to provide input for docked molecules

The HDOCK server predicts the binding complexes between two molecules, such as proteins and nucleic acids, using a hybrid docking strategy. Users therefore need to provide input for the two molecules to be docked. The HDOCK server accepts four types of input for each molecule:

Upload your pdb file in PDB format.
Provide your pdb file as PDB ID:ChainID (e.g. 1CGI:E).
Copy and paste your protein sequence in FASTA format.
Upload your protein sequence file in FASTA format.

Only ONE type of input is needed for each molecule.

If more than one type of input is provided, the first one will be used. For the “PDB ID:ChainID” input, the user can provide a single chain ID or multiple chain IDs. For example, “1CGI:E” stands for chain E of the pdb file 1CGI; “1AHW:AB” stands for chains A and B of the pdb file 1AHW. If only a sequence is provided, the server will automatically construct a model structure from a homologous template in the Protein Data Bank using an in-house modeling pipeline of HH Suite, Clustalw2, and MODELLER. Users are recommended to submit their own pdb file if the protein contains multiple chains, as our pipeline is currently designed to model single-chain proteins.
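The “PDB ID:ChainID” convention can be checked locally before submitting a job. The following is a minimal sketch, not the server's own parser: the function name `parse_pdb_chain_spec` is hypothetical, and it assumes a four-character PDB ID and one character per chain ID, as in the examples above.

```python
def parse_pdb_chain_spec(spec):
    """Split a 'PDBID:ChainIDs' string into (pdb_id, [chain, ...]).

    Assumption: the PDB ID has four characters and every character
    after the colon is one chain ID, as in '1CGI:E' or '1AHW:AB'.
    """
    pdb_id, sep, chains = spec.strip().partition(":")
    if not sep or len(pdb_id) != 4 or not chains:
        raise ValueError(f"expected 'PDBID:ChainIDs', got {spec!r}")
    return pdb_id.upper(), list(chains)


print(parse_pdb_chain_spec("1CGI:E"))   # ('1CGI', ['E'])
print(parse_pdb_chain_spec("1AHW:AB"))  # ('1AHW', ['A', 'B'])
```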

**NOTE:** For docking efficiency, if one molecule is much larger than the other, it is recommended to input the larger one as the receptor.

2 How to specify the binding site [optional]

HDOCK performs global docking to predict the binding complexes between two molecules, so no information about the binding site is required for a docking job. However, the server also gives users the option to specify binding site residues if such information is available, so that the predicted models will have a higher accuracy. Two types of binding site information can be provided.

Binding site residues on the receptor or ligand
The binding site residues are provided in the text box like this

195:A, 203-206:A, 108:B

which stands for residues 195 and 203-206 of chain A, and residue 108 of chain B. Note that the residues on a line must be separated by commas.

The binding site residues may also be submitted as a file that looks like this

195:A
203-206:A
108:B

The residues are put on different lines in the file.
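Both forms carry the same information, which a small script can expand into individual residues. This is a sketch under stated assumptions rather than the server's implementation: `parse_binding_site` is a hypothetical helper, residue numbers are assumed to be non-negative integers, and chain IDs are assumed to be a single letter or digit.

```python
import re

# num:chainID or num1-num2:chainID, e.g. "195:A" or "203-206:A"
_RESIDUE = re.compile(r"^(\d+)(?:-(\d+))?:([A-Za-z0-9])$")


def parse_binding_site(text):
    """Expand a residue specification into a list of (chain, resnum)."""
    residues = []
    # Accept the one-line comma form and the one-entry-per-line file form.
    for token in re.split(r"[,\n]", text):
        token = token.strip()
        if not token:
            continue
        match = _RESIDUE.match(token)
        if match is None:
            raise ValueError(f"bad residue specification: {token!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        chain = match.group(3)
        residues.extend((chain, n) for n in range(start, end + 1))
    return residues


print(parse_binding_site("195:A, 203-206:A, 108:B"))
```

The text-box string and the file contents shown above expand to the same six residues.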

Distance restraints between interacting residues
Users may provide such information directly on one line in the text box, like

195:A 236:B 8, 215-218:A 306:B 6

where the distance between residue 195 of chain A on the receptor and residue 236 of chain B on the ligand will be within 8 Å, and the distance between residues 215-218 of chain A on the receptor and residue 306 of chain B on the ligand will be within 6 Å. Likewise, the above distance restraints can also be provided as a file that looks like this

195:A 236:B 8 
215-218:A 306:B 6

**NOTE:** For each restraint, the first field is for the receptor, the second field is for the ligand, and the third field is for the constrained distance. The residue representation must be in num:chainID or num1-num2:chainID format, where the residue number and chain ID refer to the input structure if the input is a structure, or the modeled structure if the input is a sequence.

**CAUTION:** For the 3D structure modeled by the server, the chain ID is set to “A” for a single-chain molecule. The numbering of residues is consistent with that in the input sequence.
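The three-field restraint format described above can be validated before submission. The sketch below is illustrative only: `parse_restraints` and `_parse_spec` are hypothetical names, and the code assumes non-negative residue numbers and single-character chain IDs.

```python
import re

_SPEC = re.compile(r"^(\d+)(?:-(\d+))?:([A-Za-z0-9])$")


def _parse_spec(spec):
    """Turn 'num:chain' or 'num1-num2:chain' into (chain, first, last)."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"bad residue specification: {spec!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    return match.group(3), first, last


def parse_restraints(text):
    """Parse restraints into (receptor, ligand, max_distance) tuples."""
    restraints = []
    # Entries are separated by commas (text box) or newlines (file).
    for entry in re.split(r"[,\n]", text):
        fields = entry.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {entry.strip()!r}")
        receptor, ligand, distance = fields
        restraints.append(
            (_parse_spec(receptor), _parse_spec(ligand), float(distance))
        )
    return restraints


for restraint in parse_restraints("195:A 236:B 8, 215-218:A 306:B 6"):
    print(restraint)
```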

3 SAXS experimental data curve

Small-angle X-ray scattering (SAXS) experimental data can be provided as a post-docking filter for ranking the binding modes predicted by HDOCK. The SAXS data file contains three columns, q, I(q), and error, like this

    0.0000E+00  1.4612E+07  3.0685E+03
    1.0000E-03  1.4743E+07  4.8653E+03
    2.0000E-03  1.4827E+07  7.3394E+03
    3.0000E-03  1.4685E+07  1.0573E+04
    4.0000E-03  1.4674E+07  1.3206E+04
    5.0000E-03  1.4659E+07  1.5831E+04
    6.0000E-03  1.4729E+07  1.5466E+04
    7.0000E-03  1.4707E+07  1.7649E+04
    8.0000E-03  1.4594E+07  2.3642E+04
    9.0000E-03  1.4787E+07  2.8835E+04

With the SAXS experimental curve, the binding models will be ranked according to a weighted combination of the docking energy score calculated by our scoring function and the CHI value, which measures how well the predicted binding modes fit the SAXS experimental data.
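Before uploading, it can help to confirm that a data file really has the three-column layout shown above. The following is a minimal sketch: `parse_saxs` is a hypothetical helper, and skipping blank lines and lines starting with `#` is an assumption rather than documented server behaviour.

```python
def parse_saxs(text):
    """Read three-column SAXS data into a list of (q, I(q), error) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines can be ignored.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
        q, intensity, error = (float(x) for x in fields)
        rows.append((q, intensity, error))
    return rows


sample = """
    0.0000E+00  1.4612E+07  3.0685E+03
    1.0000E-03  1.4743E+07  4.8653E+03
"""
for q, intensity, error in parse_saxs(sample):
    print(q, intensity, error)
```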

4 Post-docking process (optional)

This step is for advanced users who want to obtain more than 100 predicted complex models or filter the docked complex models with their own experimental information. The downloaded package contains an HDOCK output file, with a name like hdock_5c984053e4b83.out, that includes all 4392 docking solutions like this

Grid spacing: 1.200
Angle step: 15.000
Initial rotation: 0.00000 0.00000 0.00000
1CGI_r_b.pdb 23.562 26.523 22.675
1CGI_l_b.pdb 47.776 34.961 33.826
1.27246 0.01055 5.02167 -0.328 -0.164 0.264 -445.20 0.45 1.00
2.80075 0.00162 3.49381 -0.286 -0.209 0.111 -444.37 0.38 1.00
0.02137 0.00051 -0.00948 -0.267 -0.212 0.104 -444.28 0.36 1.00
2.98094 0.00164 3.31735 -0.237 -0.259 0.116 -444.15 0.37 1.00
3.04247 0.00300 3.25767 -0.340 -0.315 0.134 -442.80 0.49 1.00
…

where the first 5 lines have the following definitions

The 1st line is the grid spacing for the three (x, y, z) translational degrees of freedom.
The 2nd line is the Euler angle step for the three rotational degrees of freedom.
The 3rd line is the initial rotation of the ligand before docking (optional).
The 4th line gives the receptor file and its center of geometry.
The 5th line gives the ligand file and its center of geometry.

From the 6th line onward are the predicted binding modes, each of which is represented by three translations, three rotations, its binding score, its RMSD from the initial ligand orientation, and the translational ID for the rotation.
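For users who want to inspect or filter the solutions themselves, the layout described above can be read with a short script. This is a sketch, not an official parser: `parse_hdock_out` and the `Solution` fields are hypothetical names, the column order follows the description above, and the "(optional)" 3rd line is interpreted here as a line that may be absent.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    translation: tuple
    rotation: tuple
    score: float
    rmsd: float
    translation_id: float


def parse_hdock_out(text):
    """Return (header dict, list of Solution) from HDOCK output text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = {
        "grid_spacing": float(lines[0].split(":")[1]),
        "angle_step": float(lines[1].split(":")[1]),
    }
    idx = 2
    # Assumption: the initial-rotation line may be missing.
    if lines[idx].startswith("Initial rotation"):
        values = lines[idx].split(":")[1].split()
        header["initial_rotation"] = tuple(float(x) for x in values)
        idx += 1
    for key in ("receptor", "ligand"):
        name, *center = lines[idx].split()
        header[key] = (name, tuple(float(x) for x in center))
        idx += 1
    solutions = []
    for line in lines[idx:]:
        fields = line.split()
        if len(fields) != 9:
            continue  # skip anything that is not a 9-column solution line
        v = [float(x) for x in fields]
        solutions.append(Solution(tuple(v[0:3]), tuple(v[3:6]), v[6], v[7], v[8]))
    return header, solutions


sample = """Grid spacing: 1.200
Angle step: 15.000
Initial rotation: 0.00000 0.00000 0.00000
1CGI_r_b.pdb 23.562 26.523 22.675
1CGI_l_b.pdb 47.776 34.961 33.826
1.27246 0.01055 5.02167 -0.328 -0.164 0.264 -445.20 0.45 1.00
2.80075 0.00162 3.49381 -0.286 -0.209 0.111 -444.37 0.38 1.00
"""
header, solutions = parse_hdock_out(sample)
print(header["receptor"], len(solutions), solutions[0].score)
```

With the solutions in memory, they can be sorted or filtered by score or RMSD before generating models.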

Users can download our “createpl_linux” program and run it locally to generate complex models like this

createpl_linux hdock_5c984053e4b83.out top100.pdb -nmax 100 -complex -models

where binding site residues or restraints can be applied to filter the complex models. Users can type

createpl_linux

for detailed usage of the program.

After generating the complex models, users may also use a third-party program like FoXS to calculate the SAXS CHI values of the models based on their small-angle X-ray scattering (SAXS) profile file.