Prune
After residues have been selected, one may wish to remove some of them if they do not fit a given criterion. The functions below allow one to do so.
All functions are available in both C++ and Python. In Python, the function names are prefixed with prune_ instead of using the namespace resolution operator (for example, lemon::prune::cofactors becomes lemon.prune_cofactors).
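The naming rule can be illustrated with a tiny helper. This is only a sketch: `python_name` is a hypothetical function written for this page and is not part of Lemon. It encodes the stated rule and nothing more; note that the Python example further down calls `lemon.keep_interactions` without the prefix, so it is worth checking the names exposed by the installed bindings.

```python
def python_name(cpp_name: str) -> str:
    """Map a C++ qualified name to the Python spelling described above.

    Hypothetical helper for illustration only; it is not provided by Lemon.
    """
    parts = cpp_name.split("::")
    # The first part is the top-level module; the remaining namespace and
    # function name are joined with underscores.
    return parts[0] + "." + "_".join(parts[1:])


print(python_name("lemon::prune::cofactors"))
```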
Provided functions
- namespace lemon::prune
  Prune selected residues by removing them based on a criterion.
Functions
- template<typename Container>
  Container identical_residues(const chemfiles::Frame &frame, Container &residue_ids)
  Remove residues which are biological copies of one another in a crystal.
  Many crystal structures in the PDB contain two identical copies of a biological macromolecule. Since these copies are functionally identical, users who only want to analyze a unique set of protein chains may want to remove the duplicated residues. This function performs this operation on a given set of residue IDs by comparing all the residues in the frame's biological assemblies. If a residue in one assembly has the same ID as a residue in a different assembly, then the copied residue is removed.
  - Parameters
    - [in] frame: The frame containing residues of interest.
    - [inout] residue_ids: The residue IDs to be pruned.
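The rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not Lemon's implementation: residues are modelled as dictionaries with hypothetical `id` and `assembly` keys, `residue_ids` is a list of indices into that list, and the first residue seen with a given ID is the one that is kept.

```python
def prune_identical_sketch(residues, residue_ids):
    """Drop selected residues whose ID was already kept from another assembly.

    residues: list of dicts with hypothetical 'id' and 'assembly' keys.
    residue_ids: list of indices into `residues`; modified in place.
    """
    first_assembly = {}  # residue ID -> assembly where it was first kept
    kept = []
    for index in residue_ids:
        res = residues[index]
        seen_in = first_assembly.setdefault(res["id"], res["assembly"])
        if seen_in == res["assembly"]:
            kept.append(index)  # first occurrence, or same assembly
        # Otherwise the same ID appeared in a different assembly: a copy.
    residue_ids[:] = kept
    return residue_ids


residues = [
    {"id": 101, "assembly": "1"},
    {"id": 102, "assembly": "1"},
    {"id": 101, "assembly": "2"},  # copy of the first residue
]
selected = [0, 1, 2]
prune_identical_sketch(residues, selected)
print(selected)  # [0, 1]
```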
- template<typename Container>
  Container cofactors(const chemfiles::Frame &frame, Container &residue_ids, const ResidueNameSet &rns)
  Remove residues which are typically present in many crystal structures.
  A common set of cofactors is present in many crystal structures, such as sugars and fatty acids used to induce crystallization. Users may wish to remove these cofactors because they can otherwise match other criteria set by the user (such as being a small molecule).
  - Parameters
    - [in] frame: The frame containing residues of interest.
    - [inout] residue_ids: The residue IDs to be pruned.
    - [in] rns: The residue names that one wishes to remove from residue_ids.
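Name-based pruning of this kind amounts to filtering the selection against a set of names. The sketch below models that with plain Python containers; `prune_names_sketch` and the residue names used are illustrative assumptions and do not reflect the contents of Lemon's predefined name sets.

```python
def prune_names_sketch(residue_names, residue_ids, names_to_remove):
    """Remove, in place, every selected residue whose name is in the set.

    residue_names: list mapping residue index -> residue name.
    residue_ids: list of selected indices; modified in place.
    """
    residue_ids[:] = [
        i for i in residue_ids if residue_names[i] not in names_to_remove
    ]
    return residue_ids


residue_names = ["LIG", "GOL", "HEM", "PLM"]  # illustrative names only
selected = [0, 1, 2, 3]
# Two passes with different name sets, as in the examples on this page.
prune_names_sketch(residue_names, selected, {"GOL"})
prune_names_sketch(residue_names, selected, {"PLM"})
print(selected)  # [0, 2]
```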
- template<typename Container1, typename Container2 = Container1>
  Container1 interactions(const chemfiles::Frame &frame, Container1 &residue_ids, const Container2 &interaction_ids, double distance_cutoff = DEFAULT_DISTANCE, bool keep = true)
- template<typename Container1, typename Container2 = Container1>
  Container1 keep_interactions(const chemfiles::Frame &frame, Container1 &residue_ids, const Container2 &interaction_ids, double distance_cutoff = DEFAULT_DISTANCE)
  Remove residues which do not interact with a given set of other residues.
  This function is designed to remove residues which do not have a desired interaction with the surrounding protein environment. For example, if a user is interested in small molecules that interact with a Heme group, they can use this function to remove all residues that do not have this interaction.
  - Parameters
    - [in] frame: The frame containing residues of interest.
    - [inout] residue_ids: The residue IDs to be pruned.
    - [in] interaction_ids: The residue IDs that the user wishes the residue_ids to interact with.
    - [in] distance_cutoff: The distance within which a residue in residue_ids must lie from a checked residue to be kept.
- template<typename Container1, typename Container2 = Container1>
  Container1 remove_interactions(const chemfiles::Frame &frame, Container1 &residue_ids, const Container2 &interaction_ids, double distance_cutoff = DEFAULT_DISTANCE)
  Remove residues which do interact with a given set of other residues.
  This function is designed to remove residues which have an undesirable interaction with the surrounding protein environment. For example, if a user is interested in small molecules that do not interact with water, they can use this function to remove all residues that interact with water.
  - Parameters
    - [in] frame: The frame containing residues of interest.
    - [inout] residue_ids: The residue IDs to be pruned.
    - [in] interaction_ids: The residue IDs that the user wishes the residue_ids to not interact with.
    - [in] distance_cutoff: The distance within which a residue in residue_ids must lie from a checked residue to be removed.
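The keep and remove variants differ only in which side of the distance test survives, which is what the `keep` flag on `interactions` suggests. The following plain-Python sketch makes that concrete. It is not Lemon's implementation, and it rests on several assumptions: residues are modelled as lists of 3D atom coordinates, two residues "interact" when any pair of their atoms lies within the cutoff, and a residue is never tested against itself.

```python
import math

DEFAULT_DISTANCE = 6.0  # mirrors the documented default


def within_cutoff(atoms_a, atoms_b, cutoff):
    """True if any atom of one residue is within `cutoff` of the other."""
    return any(math.dist(a, b) <= cutoff for a in atoms_a for b in atoms_b)


def prune_interactions_sketch(coords, residue_ids, interaction_ids,
                              cutoff=DEFAULT_DISTANCE, keep=True):
    """Filter residue_ids in place by proximity to interaction_ids.

    coords: dict mapping residue id -> list of (x, y, z) tuples.
    keep=True keeps interacting residues; keep=False removes them.
    """
    kept = []
    for rid in residue_ids:
        hit = any(
            within_cutoff(coords[rid], coords[other], cutoff)
            for other in interaction_ids
            if other != rid  # assumption: ignore self-interaction
        )
        if hit == keep:
            kept.append(rid)
    residue_ids[:] = kept
    return residue_ids


coords = {
    0: [(0.0, 0.0, 0.0)],   # the residue of interest, e.g. a heme group
    1: [(3.0, 0.0, 0.0)],   # 3 units away: inside the cutoff
    2: [(20.0, 0.0, 0.0)],  # 20 units away: outside the cutoff
}
near = prune_interactions_sketch(coords, [1, 2], [0])
far = prune_interactions_sketch(coords, [1, 2], [0], keep=False)
print(near, far)  # [1] [2]
```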
- template<typename Container1, typename Container2 = Container1>
  Container1 intersection(Container1 &residue_ids, const Container2 &intersection_ids)
  Turns residue_ids into the intersection between it and intersection_ids.
  This function is designed to keep residues which have a desirable intersection with another set of residue IDs.
  - Parameters
    - [inout] residue_ids: The residue IDs to be pruned.
    - [in] intersection_ids: The residue IDs that the user wishes the residue_ids to also be in.
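The [inout] semantics can be sketched as an in-place intersection. `prune_intersection_sketch` is a hypothetical stand-in using Python lists; preserving the original order of `residue_ids` is an assumption of this sketch, not a documented guarantee.

```python
def prune_intersection_sketch(residue_ids, intersection_ids):
    """Keep, in place, only the ids that also occur in intersection_ids."""
    allowed = set(intersection_ids)
    residue_ids[:] = [i for i in residue_ids if i in allowed]
    return residue_ids


selected = [1, 2, 3, 4]
prune_intersection_sketch(selected, [4, 2, 9])
print(selected)  # [2, 4]
```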
- template<typename Container>
  Container has_property(const chemfiles::Frame &frame, Container &residue_ids, const std::string &property_name, const chemfiles::Property &property)
  Keeps residues with a given property.
  This function is designed to keep residues which have a desirable property.
  - Parameters
    - [in] frame: The frame containing residues of interest.
    - [inout] residue_ids: The residue IDs to be pruned.
    - [in] property_name: The name of the property to keep.
    - [in] property: The property that the residues must have to be kept.
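A property filter of this kind can be sketched as follows. The sketch assumes that "having" a property means the named property exists on the residue and equals the supplied value; the documentation above does not spell out the comparison, so treat that as an assumption. Residues are modelled as dictionaries, and the `chain` key is a made-up property name.

```python
def prune_has_property_sketch(residue_props, residue_ids, name, value):
    """Keep, in place, residues whose property `name` exists and equals `value`.

    residue_props: list mapping residue index -> dict of properties.
    """
    residue_ids[:] = [
        i for i in residue_ids
        if name in residue_props[i] and residue_props[i][name] == value
    ]
    return residue_ids


residue_props = [{"chain": "A"}, {"chain": "B"}, {}]  # hypothetical data
selected = [0, 1, 2]
prune_has_property_sketch(residue_props, selected, "chain", "A")
print(selected)  # [0]
```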
Variables
- auto constexpr DEFAULT_DISTANCE = 6.0
  The default distance used for pruning.
Example
C++
The following example demonstrates how to remove cofactors and other ‘common’ residues from a selection.
    auto worker = [](const chemfiles::Frame& entry,
                     const std::string& pdbid) -> std::string {
        // Selection phase
        auto smallm = lemon::select::small_molecules(entry);
        if (smallm.empty()) {
            return std::string("");
        }

        // Pruning phase
        lemon::prune::identical_residues(entry, smallm);
        lemon::prune::cofactors(entry, smallm, lemon::common_cofactors);
        lemon::prune::cofactors(entry, smallm, lemon::common_fatty_acids);

        // Output phase
        return pdbid + lemon::count::print_residue_names(entry, smallm);
    };
This example extends the previous one to show how one can find only the small molecules interacting with a Heme group.
    auto worker = [distance](const chemfiles::Frame& entry,
                             const std::string& pdbid) -> std::string {
        // Selection phase
        auto hemegs = lemon::select::specific_residues(
            entry, {"HEM", "HEA", "HEB", "HEC"});
        auto smallm = lemon::select::small_molecules(entry);

        // Pruning phase
        lemon::prune::identical_residues(entry, smallm);
        lemon::prune::cofactors(entry, smallm, lemon::common_cofactors);
        lemon::prune::cofactors(entry, smallm, lemon::common_fatty_acids);
        lemon::prune::keep_interactions(entry, smallm, hemegs, distance);

        // Output phase
        return pdbid + lemon::count::print_residue_names(entry, smallm);
    };
Python
These examples are available in Python as:
    import lemon

    class MyWorkflow(lemon.Workflow):
        def worker(self, entry, pdbid):
            import lemon

            # Selection phase
            smallm = lemon.select_small_molecules(entry, lemon.small_molecule_types, 10)

            # Pruning phase
            lemon.prune_identical_residues(entry, smallm)
            lemon.prune_cofactors(entry, smallm, lemon.common_cofactors)
            lemon.prune_cofactors(entry, smallm, lemon.common_fatty_acids)

            # Output phase
            return pdbid + lemon.count_print_residue_names(entry, smallm)

    import lemon

    class MyWorkflow(lemon.Workflow):
        def worker(self, entry, pdbid):
            import lemon

            # Selection phase
            heme_names = set()
            heme_names.add(lemon.ResidueName("HEM"))
            heme_names.add(lemon.ResidueName("HEA"))
            heme_names.add(lemon.ResidueName("HEB"))
            heme_names.add(lemon.ResidueName("HEC"))
            hemegs = lemon.select_specific_residues(entry, heme_names)
            smallm = lemon.select_small_molecules(entry, lemon.small_molecule_types, 10)

            # Pruning phase
            lemon.prune_identical_residues(entry, smallm)
            lemon.prune_cofactors(entry, smallm, lemon.common_cofactors)
            lemon.prune_cofactors(entry, smallm, lemon.common_fatty_acids)
            lemon.keep_interactions(entry, smallm, hemegs, 6.0)

            # Output phase
            return pdbid + lemon.count_print_residue_names(entry, smallm) + '\n'

        def finalize(self):
            pass