macromol-census

Tools for creating machine-learning datasets from macromolecular structure

Project description

Macromolecule Census is a set of tools for creating machine-learning datasets from macromolecular structure data, especially the data made available by the Protein Data Bank (PDB). These tools are meant to do the following:

Filter for high-quality (e.g. high resolution, low R-factor), low-redundancy (i.e. sequence identity cutoffs) structures.
Make robust training/validation/test splits by accounting for domain-level structural similarities.
Store atomic coordinates in a compact, portable, standard format (SQLite).
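
The goals above can be illustrated with a minimal sketch that uses only the Python standard library. It does not use the macromol-census API: the record fields (`resolution_A`, `r_free`, `cluster`, `atoms`), the quality thresholds, the split fractions, and the table layout are all hypothetical. The `cluster` label stands in for a precomputed group of structurally similar entries, and sequence-identity redundancy filtering is not shown.

```python
import hashlib
import sqlite3

# Hypothetical records; every field name and value here is illustrative only.
structures = [
    {"id": "s1", "resolution_A": 1.5, "r_free": 0.20, "cluster": "c1",
     "atoms": [("N", 0.0, 0.0, 0.0), ("C", 1.5, 0.0, 0.0)]},
    {"id": "s2", "resolution_A": 3.5, "r_free": 0.31, "cluster": "c1",
     "atoms": [("C", 0.1, 0.2, 0.3)]},
    {"id": "s3", "resolution_A": 1.9, "r_free": 0.22, "cluster": "c1",
     "atoms": [("O", 2.0, 1.0, 0.5)]},
    {"id": "s4", "resolution_A": 2.1, "r_free": 0.24, "cluster": "c2",
     "atoms": [("C", 4.0, 4.0, 4.0)]},
]


def is_high_quality(s, max_resolution_A=2.5, max_r_free=0.25):
    """Keep structures at or below both (assumed) quality thresholds."""
    return s["resolution_A"] <= max_resolution_A and s["r_free"] <= max_r_free


def assign_split(cluster, val_frac=0.1, test_frac=0.1):
    """Map a cluster label to a split by hashing it, so that every member
    of a cluster lands in the same split and the result is reproducible."""
    digest = hashlib.sha256(cluster.encode("utf-8")).digest()
    u = int.from_bytes(digest[:8], "big") / 2**64  # uniform-ish in [0, 1)
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


def build_db(records, path=":memory:"):
    """Store split assignments and atomic coordinates in SQLite."""
    db = sqlite3.connect(path)
    with db:  # commits on success
        db.execute(
            "CREATE TABLE structure (id TEXT PRIMARY KEY, cluster TEXT, split TEXT)"
        )
        db.execute(
            "CREATE TABLE atom (structure_id TEXT, element TEXT, x REAL, y REAL, z REAL)"
        )
        for s in records:
            db.execute(
                "INSERT INTO structure VALUES (?, ?, ?)",
                (s["id"], s["cluster"], assign_split(s["cluster"])),
            )
            db.executemany(
                "INSERT INTO atom VALUES (?, ?, ?, ?, ?)",
                [(s["id"], *atom) for atom in s["atoms"]],
            )
    return db


kept = [s for s in structures if is_high_quality(s)]
db = build_db(kept)
for row in db.execute("SELECT id, cluster, split FROM structure ORDER BY id"):
    print(row)
```

Assigning splits per cluster rather than per structure is what keeps similar structures from leaking between training and test data. Hashing the label is just one simple way to make that assignment deterministic.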

Release history

0.2.0 (May 13, 2024)
0.1.0 (May 1, 2024, this version)

Download files

Source Distribution

macromol_census-0.1.0.tar.gz (26.3 kB), uploaded May 1, 2024

Built Distribution

macromol_census-0.1.0-py3-none-any.whl (33.6 kB), Python 3, uploaded May 1, 2024

Hashes for macromol_census-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`fe184606446c8b99d97ed00369528a1023b8fc9adf4af2212e29e6b1abf5101d`
MD5	`81b6e396d7d5cf713b4eefa8cdfe23f0`
BLAKE2b-256	`e52c8a80dbf8f10d28c45cde76258f4d87b407c8c13f34b2bf6df07f01b6fea6`

Hashes for macromol_census-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`25e64dc1e98ef36e9bfa13bc3b7d74af2fe0ceb6b29d131f5cf7735008f78da1`
MD5	`f2c14f94132c98fc925be9988f6ef1ba`
BLAKE2b-256	`34ff8deb21d82ca19fbb9c692df789d32a7530976d8c36111bfb75918ee4f87e`
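
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of` and `matches_digest` are hypothetical, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (assumes the source distribution is in the current directory):
# matches_digest(
#     "macromol_census-0.1.0.tar.gz",
#     "fe184606446c8b99d97ed00369528a1023b8fc9adf4af2212e29e6b1abf5101d",
# )
```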