SPRITE and ASSAM: webservers for the searching of side chain 3D-motifs and sites in protein structures

Email info@mfrlab.org for queries and to report errors.

Graph theory, 3D search, Amino acid side chain patterns

SPRITE Input file type:

  • SPRITE accepts ASCII PDB formatted coordinate files of protein structures.
  • PDB files in which nucleic acid chains (such as RNA) or other macromolecular chains are bound to proteins are also accepted and processed.
  • Files of structures solved by NMR that contain multiple conformations will be processed, but a single hit will likely be reported several times, once for each conformation.
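
As a rough illustration of the NMR point above, the sketch below counts the MODEL records in a PDB-formatted text and, optionally, keeps only the first conformation before upload. The function names `count_models` and `keep_first_model` are hypothetical helpers, not part of SPRITE, and trimming to the first model is only one possible way to avoid repeated hits; it assumes the first conformation is an acceptable representative.

```python
def count_models(pdb_text):
    """Count MODEL records (one per conformation in a multi-model PDB file)."""
    return sum(1 for line in pdb_text.splitlines()
               if line[:6].strip() == "MODEL")


def keep_first_model(pdb_text):
    """Return the PDB text with every model after the first one removed.

    Records outside MODEL/ENDMDL blocks (header, END, etc.) are kept.
    """
    kept = []
    models_seen = 0
    skipping = False
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            models_seen += 1
            skipping = models_seen > 1   # drop the second and later models
        if not skipping:
            kept.append(line)
        if record == "ENDMDL":
            skipping = False             # records after a model are kept again
    return "\n".join(kept) + "\n"
```

A file for which `count_models` returns more than 1 is the case in which repeated hits should be expected.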
ASSAM Input file type:
  • ASSAM accepts ASCII PDB formatted coordinate files containing not more than 12 amino acid residues.
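
The 12-residue limit can be checked locally before submitting a pattern. The sketch below is a minimal example that counts distinct residues using the fixed column positions of ATOM records in the PDB format; `count_residues` and `fits_assam_limit` are hypothetical names, and whether ASSAM itself counts HETATM records or alternate locations in the same way is an assumption not confirmed by the text above.

```python
MAX_ASSAM_RESIDUES = 12  # limit stated in the ASSAM input description


def count_residues(pdb_text):
    """Count distinct residues among the ATOM records of PDB-formatted text."""
    residues = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            # PDB fixed columns (1-based): residue name 18-20, chain 22,
            # residue number 23-26, insertion code 27.
            key = (line[21:22], line[22:26].strip(), line[26:27], line[17:20])
            residues.add(key)
    return len(residues)


def fits_assam_limit(pdb_text):
    """True if the text has at least one and at most 12 residues."""
    return 0 < count_residues(pdb_text) <= MAX_ASSAM_RESIDUES
```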

The SPRITE and ASSAM search approach
The basic concept behind the search methodology has been described previously (see Artymiuk et al. 1994). Briefly, a protein structure is represented as a graph in which the nodes are individual amino acid side chains and the edges are the geometric relationships between pairs of nodes. Each side chain is reduced to two pseudo-atoms, whose positions are chosen to emphasise the functional part of that side chain; the two pseudo-atoms define a vector, and each such vector corresponds to one node of the graph. The geometric relationship between a pair of residues is defined by distances calculated between the corresponding vectors, and these distances form the graph edge that joins the two nodes. Specifically, if S, M and E denote the start, middle and end of a vector, respectively, then each edge carries five distances: SS, SE, ES, EE and MM (although only a subset of these five is normally used to specify a query pattern). SPRITE and ASSAM employ a similar search engine; the major differences between them are the target database that is searched and the query structure provided as input.
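
The edge description can be made concrete with a small sketch. It assumes that the pseudo-atom coordinates for each side chain are already known, and that M is the midpoint of S and E (the text only calls it the "middle"). The names `SideChainVector`, `edge_distances` and `edges_match`, and the tolerance value, are hypothetical and are not taken from the SPRITE or ASSAM code.

```python
import math
from typing import Dict, Iterable, NamedTuple, Tuple

Point = Tuple[float, float, float]


class SideChainVector(NamedTuple):
    """One graph node: a vector between two pseudo-atoms of a side chain."""
    start: Point  # S
    end: Point    # E

    @property
    def middle(self) -> Point:
        # Assumption: M is the midpoint of S and E.
        return tuple((s + e) / 2.0 for s, e in zip(self.start, self.end))


def edge_distances(a: SideChainVector, b: SideChainVector) -> Dict[str, float]:
    """One graph edge: the five inter-vector distances named in the text."""
    return {
        "SS": math.dist(a.start, b.start),
        "SE": math.dist(a.start, b.end),
        "ES": math.dist(a.end, b.start),
        "EE": math.dist(a.end, b.end),
        "MM": math.dist(a.middle, b.middle),
    }


def edges_match(query: Dict[str, float], target: Dict[str, float],
                keys: Iterable[str] = ("SS", "EE", "MM"),
                tolerance: float = 1.0) -> bool:
    """Compare only a chosen subset of distances, within a tolerance.

    The default subset and tolerance are illustrative assumptions only.
    """
    return all(abs(query[k] - target[k]) <= tolerance for k in keys)
```

For example, two parallel vectors `SideChainVector((0, 0, 0), (2, 0, 0))` and `SideChainVector((0, 3, 0), (2, 3, 0))` give SS, EE and MM distances of 3, and SE and ES distances equal to the square root of 13.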

SPRITE output

  • SPRITE outputs hits to patterns in the database as a list (see pre-computed examples).
  • The SPRITE output presents a list of the amino acid residues (see example) in the query structure that match known side chain arrangements reported in the literature or extracted from databases such as PDBSum and CSA.
  • Individual hits can then be selected for viewing in a Jmol molecular viewer browser plugin window.
ASSAM output
  • ASSAM outputs a list of PDB structures from the database that have similarly arranged amino acid side chain residues to the input pattern (see pre-computed example).
Utility of SPRITE
  • Currently, the primary users of SPRITE are protein crystallography and structural bioinformatics groups.
  • SPRITE is expected to be highly useful for identifying conserved 3D arrangements of amino acid side chains in cases where there is no sequence similarity and limited or no fold similarity.
Utility of ASSAM
  • ASSAM is particularly useful when a set of residues that appear to have functional or structural importance has been identified and the user would like to find out whether the same arrangement of residues is conserved in other structures available in the PDB.
References for ASSAM
  1. Nadzirin N, Gardiner E, Willett P, Artymiuk PJ, Firdaus-Raih M. (2012) SPRITE and ASSAM: web servers for side chain 3D-motif searching in protein structures. Nucleic Acids Res., Jul;40(Web Server issue):W380-6. Epub 2012 May 9.
  2. Spriggs RV, Artymiuk PJ, Willett P. (2003) Searching for patterns of amino acids in 3D protein structures. J Chem Inf Comput Sci., Mar-Apr;43(2):412-21. 
  3. Artymiuk PJ, Poirrette AR, Grindley HM, Rice DW, Willett P. (1994) A graph-theoretic approach to the identification of three-dimensional patterns of amino acid side-chains in protein structures. J Mol Biol., Oct 21;243(2):327-44. 

Computational resources provided by the Genome Computing Centre, Malaysia Genome Institute
