Select¶
Most Lemon workflows start by selecting one or more sets of residues and performing operations on them. The functions below are available in both C++ and Python, but there are a few implementation differences users should note.
First, the C++ versions of all functions are implemented as templates, allowing users to choose their preferred container for storing the resulting residues. Second, each function has two overloads. One takes a container initialized by the user as an argument and populates it using the container’s insert method. No template arguments are required because they can be deduced at compile time. The other overload initializes and populates a new container specified by a template argument (the signatures below show std::vector<uint64_t> as the default). The choice of an appropriate container is left to the user.
In Python, both overloads are available as well. However, due to restrictions imposed by Python generics, the user must use the ResidueIDs container. All Python functions are prefixed with select_.
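The two calling conventions described above can be sketched without Lemon at all. The functions heavy_residues_into and heavy_residues below are hypothetical stand-ins operating on plain Python lists; they are not part of Lemon's bindings and only illustrate the difference between populating a caller-owned container and returning a new one.

```python
from typing import List, Sequence


def heavy_residues_into(out: List[int], heavy_atom_counts: Sequence[int],
                        minimum: int) -> None:
    """Style 1: add matching residue locations to a container the caller owns."""
    for location, count in enumerate(heavy_atom_counts):
        if count >= minimum:
            out.append(location)


def heavy_residues(heavy_atom_counts: Sequence[int], minimum: int) -> List[int]:
    """Style 2: create a new container, populate it, and return it."""
    out: List[int] = []
    heavy_residues_into(out, heavy_atom_counts, minimum)
    return out


# Hypothetical heavy-atom counts, one per residue in a frame.
counts = [1, 12, 3, 25]

existing = [99]                      # caller-owned container with prior content
heavy_residues_into(existing, counts, 10)

fresh = heavy_residues(counts, 10)   # brand-new container
```

The first style keeps whatever the container already held, which is convenient when several selections should accumulate into one collection. The second is shorter when a fresh result is all that is needed.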
Provided selectors¶
- namespace lemon::select¶
Functions to select various residues based on a given criterion.
Functions
- template<typename Container = std::vector<uint64_t>>
Container small_molecules(const chemfiles::Frame &frame, const std::unordered_set<std::string> &types = small_molecule_types, size_t min_heavy_atoms = 10)¶
Select small molecules in a given frame
Use this function to find small molecules in a given frame. A small molecule is defined as an entity whose chemical composition is one of the accepted types. The selected entity must also have at least a specified number of heavy atoms (default 10), so that common residues such as water and metal ions are not selected.
- Return
The selected residue locations
- Parameters
[in] frame: The entry containing molecules of interest.
[in] types: A set of std::string containing the accepted chemical compositions. Defaults are NON-POLYMER, OTHER, PEPTIDE-LIKE.
[in] min_heavy_atoms: The minimum number of non-hydrogen atoms for a residue to be classed as a small molecule.
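To make the two-part criterion concrete, here is a minimal sketch that applies it to mock data. MockResidue and small_molecule_locations are hypothetical names invented for this illustration, and a "location" is assumed to be the residue's index in the frame's residue list. The code does not call Lemon or chemfiles.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence

# Default accepted compositions, as listed in the parameter description.
DEFAULT_TYPES = frozenset({"NON-POLYMER", "OTHER", "PEPTIDE-LIKE"})


@dataclass
class MockResidue:
    """Hypothetical stand-in for a residue; not a chemfiles or Lemon type."""
    name: str
    composition: str
    elements: List[str]  # one element symbol per atom


def small_molecule_locations(residues: Sequence[MockResidue],
                             types: FrozenSet[str] = DEFAULT_TYPES,
                             min_heavy_atoms: int = 10) -> List[int]:
    """Return indices of residues that pass both the type and size checks."""
    selected = []
    for location, residue in enumerate(residues):
        if residue.composition not in types:
            continue  # wrong chemical composition
        heavy = sum(1 for element in residue.elements if element != "H")
        if heavy >= min_heavy_atoms:
            selected.append(location)
    return selected


residues = [
    MockResidue("HOH", "NON-POLYMER", ["O", "H", "H"]),
    MockResidue("LIG", "NON-POLYMER", ["C"] * 12 + ["H"] * 6),
    MockResidue("TRP", "L-PEPTIDE LINKING",
                ["C"] * 11 + ["N"] * 2 + ["O"] + ["H"] * 10),
]
selected = small_molecule_locations(residues)
```

Water passes the composition check but fails the heavy-atom threshold, while the tryptophan residue is large enough but has a polymer composition, so only the ligand at index 1 is kept.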
- template<typename Container = std::vector<uint64_t>>
Container metal_ions(const chemfiles::Frame &frame)¶
Select metal ions in a given frame
This function populates the residue IDs of metal ions. We define a metal ion as a residue with a single, positively charged ion.
- Return
The selected residue locations
- Parameters
[in] frame: The entry containing metal ions of interest.
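One way to read the definition above is "a residue made of exactly one atom whose charge is positive". The sketch below applies that reading to mock data. MockAtom and metal_ion_locations are hypothetical names, and this interpretation is an assumption rather than a description of Lemon's internals.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MockAtom:
    """Hypothetical stand-in for an atom; not a chemfiles type."""
    element: str
    charge: float


def metal_ion_locations(residues: Sequence[List[MockAtom]]) -> List[int]:
    """Return indices of single-atom residues carrying a positive charge."""
    return [
        location
        for location, atoms in enumerate(residues)
        if len(atoms) == 1 and atoms[0].charge > 0
    ]


residues = [
    [MockAtom("Zn", 2.0)],                                      # selected
    [MockAtom("Cl", -1.0)],                                     # negative charge
    [MockAtom("O", 0.0), MockAtom("H", 0.0), MockAtom("H", 0.0)],  # three atoms
    [MockAtom("Na", 1.0)],                                      # selected
]
metal_locations = metal_ion_locations(residues)
```

Under this reading, a single-atom anion such as chloride is rejected by the charge test and water is rejected by the atom count.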
- template<typename Container = std::vector<uint64_t>>
Container nucleic_acids(const chemfiles::Frame &frame)¶
Select nucleic acid residues in a given frame
This function populates the residue IDs of nucleic acid residues. We define a nucleic acid as a residue with a chemical composition containing the RNA or DNA substring.
- Return
The selected residue locations
- Parameters
[in] frame: The entry containing nucleic acid residues.
- template<typename Container = std::vector<uint64_t>>
Container peptides(const chemfiles::Frame &frame)¶
Select peptide residues in a given frame
This function populates the residue IDs of peptide residues. We define a peptide as a residue with a chemical composition containing the PEPTIDE substring which is not PEPTIDE-LIKE.
- Return
The selected residue locations
- Parameters
[in] frame: The entry containing peptide residues.
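The composition-based rules for nucleic acids and peptides are plain string tests, sketched below on a list of example composition strings. The helper names are hypothetical, the composition values are illustrative, and treating "which is not PEPTIDE-LIKE" as an exact comparison is an assumption.

```python
from typing import Callable, List, Sequence


def is_nucleic_acid(composition: str) -> bool:
    """True when the composition mentions RNA or DNA."""
    return "RNA" in composition or "DNA" in composition


def is_peptide(composition: str) -> bool:
    """True for PEPTIDE compositions other than PEPTIDE-LIKE (assumed exact match)."""
    return "PEPTIDE" in composition and composition != "PEPTIDE-LIKE"


def locations_where(compositions: Sequence[str],
                    predicate: Callable[[str], bool]) -> List[int]:
    """Return the indices of the compositions accepted by the predicate."""
    return [i for i, composition in enumerate(compositions) if predicate(composition)]


# One composition string per residue, in residue order.
compositions = [
    "L-PEPTIDE LINKING",
    "PEPTIDE-LIKE",
    "RNA LINKING",
    "DNA LINKING",
    "NON-POLYMER",
]
nucleic_locations = locations_where(compositions, is_nucleic_acid)
peptide_locations = locations_where(compositions, is_peptide)
```

Note that PEPTIDE-LIKE contains the PEPTIDE substring, which is why the peptide rule needs the explicit exclusion.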
- template<typename Container = std::vector<uint64_t>>
Container residue_ids(const chemfiles::Frame &frame, const std::set<uint64_t> &resis)¶
Select residues with a given ID in a given frame
This function populates the locations of residues whose IDs are in a given ID set.
- Return
The selected residue locations
- Parameters
[in] frame: The entry containing residues of interest.
[in] resis: The set of residue IDs of interest.
- template<typename Container = std::vector<uint64_t>>
Container specific_residues(const chemfiles::Frame &frame, const ResidueNameSet &resnames)¶
Select residues with a given name in a given frame
This function returns the locations of residues whose names are in a given name set.
- Return
The selected residue locations
- Parameters
[in] frame: The entry containing residues of interest.
[in] resnames: The set of residue names of interest.
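Selecting by ID and selecting by name both reduce to a set-membership test, and both return locations rather than the IDs or names themselves. The sketch below shows this on mock (name, ID) pairs. The function names are hypothetical, and a plain Python set of strings stands in for ResidueNameSet.

```python
from typing import List, Sequence, Set, Tuple

# Mock residues as (name, residue ID) pairs, in frame order.
residues: List[Tuple[str, int]] = [
    ("HOH", 301),
    ("ZN", 401),
    ("ALA", 12),
    ("ZN", 402),
]


def locations_with_ids(residues: Sequence[Tuple[str, int]],
                       ids: Set[int]) -> List[int]:
    """Return indices of residues whose ID is in the given set."""
    return [loc for loc, (_, resid) in enumerate(residues) if resid in ids]


def locations_with_names(residues: Sequence[Tuple[str, int]],
                         names: Set[str]) -> List[int]:
    """Return indices of residues whose name is in the given set."""
    return [loc for loc, (name, _) in enumerate(residues) if name in names]


by_id = locations_with_ids(residues, {12, 402})
by_name = locations_with_names(residues, {"ZN"})
```

The residue with ID 12 sits at location 2, which shows why an ID and a location should not be used interchangeably.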
- template<typename Container = std::vector<uint64_t>>
Container residue_property(const chemfiles::Frame &frame, const std::string &property_name, const chemfiles::Property &property)¶
Select residues with a property
This function returns the locations of residues that have a given property.
- Return
The selected residue locations
- Parameters
[in] frame: The entry containing residues of interest.
[in] property_name: The name of the property to select.
[in] property: The property of interest.
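A property-based selection can be pictured as comparing one named value per residue. In the sketch below each residue is a plain dict of properties. The function name and the property names are illustrative only, and matching on both name and value is an assumption about how the selector behaves.

```python
from typing import Any, Dict, List, Sequence

_MISSING = object()  # sentinel so an absent property never matches


def locations_with_property(residues: Sequence[Dict[str, Any]],
                            property_name: str, value: Any) -> List[int]:
    """Return indices of residues whose named property equals the value."""
    return [
        location
        for location, properties in enumerate(residues)
        if properties.get(property_name, _MISSING) == value
    ]


# Mock per-residue properties, in frame order.
residues = [
    {"chainname": "A", "composition_type": "L-PEPTIDE LINKING"},
    {"chainname": "B", "composition_type": "NON-POLYMER"},
    {"chainname": "A"},
]
chain_a = locations_with_property(residues, "chainname", "A")
non_polymer = locations_with_property(residues, "composition_type", "NON-POLYMER")
```

The third residue has no composition_type entry at all, so the sentinel keeps it out of the second selection.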
Example¶
C++¶
auto worker = [](const chemfiles::Frame& entry,
                 const std::string& pdbid) -> std::string {
    // Selection phase
    auto metal_ids = lemon::select::metal_ions(entry);

    // No pruning, straight to the output phase
    return pdbid + lemon::count::print_residue_names(entry, metal_ids);
};

auto collector = lemon::print_combine(std::cout);
Python¶
import lemon

distance = 6.0

class MyWorkflow(lemon.Workflow):
    def worker(self, entry, pdbid):
        import lemon

        # Selection phase
        metals = lemon.select_metal_ions(entry)

        # Output phase
        return pdbid + lemon.count_print_residue_names(entry, metals) + '\n'

    def finalize(self):
        pass