Tools for creating machine-learning datasets from macromolecular structure
Project description
Macromolecule Census is a set of tools for creating machine-learning datasets from macromolecular structure data, especially data made available by the Protein Data Bank (PDB). These tools are meant to:
Filter for high-quality (e.g. high resolution, low R-factor), low-redundancy (i.e. below a sequence identity cutoff) structures.
Make robust training/validation/test splits by accounting for domain-level structural similarities.
Store atomic coordinates in a compact, portable, standard format (SQLite).
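The three goals above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not Macromolecule Census's own API: the record fields, the thresholds (2.5 Å resolution, 0.25 R-factor), the cluster labels, the table layout, and the helper names `is_high_quality` and `split_for_cluster` are all hypothetical. The cluster label stands in for whatever domain-level similarity grouping is available; assigning splits per cluster rather than per structure is what keeps similar structures out of different splits.

```python
import hashlib
import sqlite3

# Made-up records: IDs, values, and cluster labels are placeholders.
STRUCTURES = [
    {"id": "demo1", "resolution_A": 1.8, "r_factor": 0.19, "cluster": "A"},
    {"id": "demo2", "resolution_A": 3.4, "r_factor": 0.28, "cluster": "A"},
    {"id": "demo3", "resolution_A": 2.0, "r_factor": 0.21, "cluster": "A"},
    {"id": "demo4", "resolution_A": 1.5, "r_factor": 0.17, "cluster": "B"},
]
# Made-up coordinates: (structure id, atom name, x, y, z) in angstroms.
ATOMS = [
    ("demo1", "CA", 1.0, 2.0, 3.0),
    ("demo3", "CA", 4.0, 5.0, 6.0),
    ("demo4", "CA", 7.0, 8.0, 9.0),
]


def is_high_quality(s, max_resolution_A=2.5, max_r_factor=0.25):
    """Keep structures at or below both (assumed) thresholds."""
    return s["resolution_A"] <= max_resolution_A and s["r_factor"] <= max_r_factor


def split_for_cluster(cluster, val_frac=0.1, test_frac=0.1):
    """Map a cluster label to a split deterministically.

    Hashing the label (not the structure ID) puts every member of a
    cluster in the same split.
    """
    digest = hashlib.sha256(cluster.encode()).digest()
    u = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


# 1. Quality filter.
kept = [s for s in STRUCTURES if is_high_quality(s)]

# 2. Cluster-aware split assignment.
splits = {s["id"]: split_for_cluster(s["cluster"]) for s in kept}

# 3. Store the splits and coordinates in SQLite (in memory for this sketch).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE structure (id TEXT PRIMARY KEY, split TEXT NOT NULL)")
db.execute(
    "CREATE TABLE atom ("
    "structure_id TEXT REFERENCES structure(id), "
    "name TEXT, x REAL, y REAL, z REAL)"
)
db.executemany("INSERT INTO structure VALUES (?, ?)", splits.items())
db.executemany(
    "INSERT INTO atom VALUES (?, ?, ?, ?, ?)",
    [a for a in ATOMS if a[0] in splits],
)
db.commit()
```

Replacing `":memory:"` with a file path would give a single portable database file. Hash-based assignment only approximates the requested split fractions when there are few clusters, so a real pipeline would likely balance split sizes explicitly.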
Source Distribution
macromol_census-0.2.0.tar.gz
(26.3 kB)
Built Distribution
Hashes for macromol_census-0.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0e3bc37571803dd89efb764a3051f2f5939df8fb92b9b285f588e5d2705210d5
MD5 | 29367fd0b196e0f9f64847e5f65bb918
BLAKE2b-256 | 3ba483d580c9dce03d352b15be945598b7f3463b2be3eede4d597a03991db81b