
To show how the web server can be used for protein binding site prediction, we present case studies that analyse protein-drug interactions from different perspectives.
An uncharacterized protein is a protein whose function cannot be defined from sequence or structural similarity searches. A hypothetical protein, by contrast, is a protein whose existence has been predicted from the genome but whose expression has not been confirmed experimentally. While various sequence- and structure-based tools enable function prediction for these proteins, predicting drug binding sites, or the ability of the protein to bind drug molecules, is one potential approach to functional annotation.


One example of a hypothetical protein is an uncharacterized protein from Homo sapiens (PDB ID: 2q4k) obtained from the Joint Center for Structural Genomics (JCSG). In the ‘Search for amino acid arrangements similar to known drug binding sites in known target structures / protein-drug complexes’ interface, a PDB ID search for ‘2q4k’ returns a representative structure at 90% sequence identity, BASOPHILIC LEUKEMIA EXPRESSED PROTEIN BLES03 (PDB ID: 1ztp). Clicking the 'Details' button takes the user to the results page, which contains structural details and a visualization of the protein structure, as well as a list of similar amino acid patterns derived from ASSAM searches. Users may filter the list by RMSD, Z-score and sequence identity values.
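The same kind of filtering can be reproduced offline on a downloaded results list. The following is a minimal sketch, not the server's own code: the `Match` record, its field names and the threshold directions (lower RMSD, higher Z-score and higher sequence identity treated as better) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Match:
    """One row of a results list (hypothetical field names)."""
    hit_id: str
    rmsd: float          # angstroms; lower means a closer superposition
    z_score: float       # assumed here: higher is better
    seq_identity: float  # percent, 0-100


def filter_matches(
    matches: Iterable[Match],
    max_rmsd: Optional[float] = None,
    min_z_score: Optional[float] = None,
    min_seq_identity: Optional[float] = None,
) -> List[Match]:
    """Keep rows that pass every threshold that was supplied.

    A threshold left as None is not applied. The surviving rows are
    returned sorted by ascending RMSD.
    """
    kept = []
    for m in matches:
        if max_rmsd is not None and m.rmsd > max_rmsd:
            continue
        if min_z_score is not None and m.z_score < min_z_score:
            continue
        if min_seq_identity is not None and m.seq_identity < min_seq_identity:
            continue
        kept.append(m)
    return sorted(kept, key=lambda m: m.rmsd)
```

Calling `filter_matches(rows, max_rmsd=1.0)` on a list of `Match` rows would keep only the rows at or below 1.0 Å; the cutoff itself is an arbitrary example value, not a recommended threshold.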

If the search returns no matches, users may use the 'Search a protein structure for amino acid residue arrangements similar to a known drug binding site' interface, where they can upload a PDB-formatted structure or enter a PDB ID to search for potential drug binding motifs with SPRITE. A SPRITE search compares a query protein structure against a data set of known drug binding sites and yields matching amino acid patterns.
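When a local PDB-formatted file is needed for the upload option, one way to obtain it is from the RCSB PDB file download service. The sketch below is an illustration only: the helper names `pdb_download_url`, `fetch_pdb` and `count_atom_records` are hypothetical, and the ATOM-record count is just a quick sanity check before uploading, not something the server requires.

```python
import urllib.request

# Public RCSB download location for PDB-formatted entries.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_download_url(pdb_id: str) -> str:
    """Build the download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id.upper())


def fetch_pdb(pdb_id: str, out_path: str) -> str:
    """Download the entry and save it to out_path (needs network access)."""
    with urllib.request.urlopen(pdb_download_url(pdb_id), timeout=30) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return out_path


def count_atom_records(pdb_text: str) -> int:
    """Count ATOM records as a rough check that the file holds coordinates."""
    return sum(1 for line in pdb_text.splitlines() if line.startswith("ATOM  "))
```

For the case study, `fetch_pdb("2q4k", "2q4k.pdb")` would save a file that can then be uploaded through the interface.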

PDB ID 2q4k does not map to any functional annotation (https://www.uniprot.org/uniprot/Q9H3H3). Sequence comparison indicated that the protein does not share sequence or structural similarity with known proteins.
A query for '2q4k' returns a list of similar patterns for right-handed and left-handed superpositions of amino acid patterns.

Users may choose either to view the list of matches (left-handed and right-handed superpositions of amino acid patterns) or to download the results file. The results page shows a list of similar amino acid patterns found by sub-structural similarity searches of the query protein against the database of known binding sites used in Drug ReposER.
Users may filter the results by residue number or RMSD value. Clicking a DrReposER ID in the second column opens a new page for the selected DrReposER ID with details of the matched binding site.


Users may view one or more potential motifs for the query protein by selecting patterns in the first column. Selecting a single pattern gives a clearer view in the NGL viewer.
The NGL viewer offers several display options, showing either the superposition of matched residues or the actual known binding sites.
Clicking the 'Superposed motifs' button shows only the matched residues that are superposed onto the residues of the query protein.
Clicking the 'Hit binding sites' button shows the actual known binding site that matches the query structure.
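The RMSD reported for each match summarises how closely the superposed residues agree. As a minimal sketch of the standard definition, and assuming the two coordinate sets are already superposed and listed in the same atom order (the function name `rmsd` and the input layout are illustrative, and the server's own calculation is not described here):

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between paired, pre-superposed atoms.

    coords_a[i] is compared with coords_b[i]; no fitting is performed.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

A value of 0 means the paired atoms coincide exactly; larger values indicate a looser fit between the query residues and the known binding site.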
Predicted binding sites obtained from structural similarity searches can then be used for molecular docking analysis and validated further through experiments.
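As one possible way to carry a predicted site into a docking study, the residues of a selected match can be turned into a search box. The sketch below is an assumption-laden illustration rather than part of the web server: `residue_atoms` and `docking_box` are hypothetical helpers, the input is taken to be standard fixed-column PDB text, and the 5 Å padding is an arbitrary default.

```python
from typing import Iterable, List, Set, Tuple

Coord = Tuple[float, float, float]


def residue_atoms(pdb_text: str, residues: Set[Tuple[str, int]]) -> List[Coord]:
    """Collect ATOM coordinates for the given (chain ID, residue number) pairs.

    Uses the fixed PDB columns: chain ID in column 22, residue number in
    columns 23-26, and x, y, z in columns 31-38, 39-46 and 47-54.
    """
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM  ") or len(line) < 54:
            continue
        try:
            key = (line[21], int(line[22:26]))
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        except ValueError:
            continue  # skip malformed records
        if key in residues:
            coords.append(xyz)
    return coords


def docking_box(coords: Iterable[Coord], padding: float = 5.0) -> Tuple[Coord, Coord]:
    """Return (center, size) of an axis-aligned box around the atoms.

    The box spans the atoms' extent plus `padding` angstroms on every side.
    """
    coords = list(coords)
    if not coords:
        raise ValueError("no coordinates supplied")
    axes = list(zip(*coords))  # three tuples: all x, all y, all z
    center = tuple((min(a) + max(a)) / 2 for a in axes)
    size = tuple((max(a) - min(a)) + 2 * padding for a in axes)
    return center, size
```

The chain IDs and residue numbers passed to `residue_atoms` would come from the match selected on the results page, and the resulting center and size can be supplied to whichever docking program is used.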