We describe an algorithm for phasing protein crystal X-ray diffraction data that identifies, retrieves, refines and exploits general tertiary structural information from small fragments available in the Protein Data Bank. Through unspecific molecular replacement combined with density modification, the algorithm successfully phased all-helical, mixed alpha-beta, and all-beta protein structures. The method is available as a software implementation, Borges.

2014 
Main

With structural knowledge available from the more than 80,000 macromolecular crystal structures recorded in the Protein Data Bank (PDB) [1], it should be feasible to solve the 'crystallographic phase problem' for any new structure through computation [2]. The crystallographic phase problem arises because the X-ray diffraction experiment determines only the diffracted intensities and not the phases, yet the missing phases are essential to compute the structure. Initial phases are usually derived from measurements on heavy-atom or anomalous-scatterer derivatives, which increases the experimental effort and timescale of the crystallographic study, as many derivatives turn out to be unsuccessful. Molecular replacement phasing [3, 4], on the other hand, works by locating a related model within the crystallographic unit cell to best account for the experimental diffraction data. Typically, homologs for molecular replacement are retrieved by finding closely related sequences. More recently, molecular replacement using remote homologs has been made successful by combining it with modeling in the program Rosetta [5]. This requires composing a fairly complete structural hypothesis within a 1.5-Å r.m.s. deviation from the true structure. Alternatively, as little as 10% of the total main-chain structure is enough to achieve phasing at 2-Å resolution, provided that it is almost identical to part of the target structure and accurately placed (r.m.s. deviation <0.5 Å) [6]. Our previous program ARCIMBOLDO [6], for ab initio phasing from the native data alone, combines fragment location with Phaser [7] and density modification and autotracing with SHELXE [8] in a supercomputing environment [9]. By applying secondary-structure constraints and density modification [10], it overcomes the resolution and size limitations of direct methods based on constraints derived from atomicity [11].
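The phase problem described above can be illustrated with a one-dimensional toy model. The sketch below is not part of Borges or ARCIMBOLDO; the eight-point "density", the function names and the sign convention of the transform are assumptions chosen purely for illustration. It computes structure factors from a known density, then shows that the density is recovered when amplitudes are combined with the true phases but not when the phases are discarded, which is the situation after a diffraction experiment.

```python
import cmath
import math


def structure_factors(density):
    """Discrete Fourier transform of a 1D density sampled on a grid."""
    n = len(density)
    return [
        sum(rho * cmath.exp(2j * math.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]


def synthesize(factors):
    """Inverse transform: rebuild the density map from complex structure factors."""
    n = len(factors)
    return [
        (sum(f * cmath.exp(-2j * math.pi * h * x / n) for h, f in enumerate(factors)) / n).real
        for x in range(n)
    ]


def max_error(a, b):
    """Largest pointwise difference between two maps."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical toy "unit cell": two atoms of different weight on an 8-point grid.
density = [0.0, 6.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0]

factors = structure_factors(density)
amplitudes = [abs(f) for f in factors]       # measurable (square root of intensity)
phases = [cmath.phase(f) for f in factors]   # lost in the experiment

# Amplitudes combined with the true phases reproduce the density.
with_phases = synthesize([cmath.rect(a, p) for a, p in zip(amplitudes, phases)])

# Amplitudes alone (all phases set to zero) give a different map.
without_phases = synthesize([complex(a, 0.0) for a in amplitudes])

print("error with true phases:", max_error(with_phases, density))
print("error with zero phases:", max_error(without_phases, density))
```

In this toy case, molecular replacement would correspond to borrowing approximate phases from a placed model rather than setting them to zero.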
By sequentially searching for polyalanine helices, ARCIMBOLDO generates hypotheses without specific previous structural knowledge that, if close enough to the true structure, can be expanded to a full solution. The limiting condition is that the search for the first fragment must contain a correct solution; this becomes increasingly challenging for larger structures as the signal becomes weaker. One possible solution is to locate larger, composite fragments, by using tertiary rather than secondary structure. In this particular scenario, modeling has limited use, as both the sequence and the context of the fragment are unknown and optimization would be largely underdetermined. We describe an algorithm and software tool, Borges (http://chango.ibmb.csic.es/BORGES/), that uses tertiary-structure searching in the PDB to solve the crystallographic phase problem.
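How close a placed fragment is to the true structure is quantified in the text as an r.m.s. deviation. As a minimal sketch, assuming the fragment and the target atoms are already matched one to one and expressed in the same frame (no superposition is performed), the quantity can be computed as follows. The coordinates and the function name are hypothetical and are not taken from Borges.

```python
import math


def rmsd(model, target):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the points are paired in order and already in the same frame.
    """
    if len(model) != len(target) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, target)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Hypothetical coordinates in Å: three atoms, each displaced by 0.3 Å.
target = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.3, 0.0, 0.0), (3.8, 0.3, 0.0), (7.6, 0.0, 0.3)]

fragment_rmsd = rmsd(model, target)
print(f"r.m.s. deviation: {fragment_rmsd:.2f} Å")
print("below the 0.5 Å threshold quoted above:", fragment_rmsd < 0.5)
```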