🤢 Sickbay: Clinical data model for the Consortium for Molecular and Cellular Characterization of Screen-Detected Lesions.
👩‍⚕️ MCL Sickbay
"MCL Sickbay" is the data model and object-relational mapping for the clinical data application of the Consortium for Molecular and Cellular Characterization of Screen-Detected Lesions.
🏃‍♀️ Getting Started
The "Sickbay" software provides a Python-based API into a data model (a series of related classes) and uses SQLAlchemy as the object-relational mapper. This section will help you get started.
📀 The Database
For this project, we're using PostgreSQL. You can create a PostgreSQL database to use with this software as follows:
dropdb --if-exists clinical_data
dropuser --if-exists mcl
createuser \
    --createdb \
    --inherit \
    --login \
    --no-createrole \
    --no-superuser \
    mcl
createdb --encoding=UTF8 --owner=mcl clinical_data
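Once the database exists, client code needs a connection URL for it. SQLAlchemy's create_engine accepts URLs of the form postgresql://user:password@host:port/database. The following is a minimal sketch that builds such a URL for the mcl user and clinical_data database created above; the helper name postgresql_url, the host localhost, the port 5432, and the absence of a password are assumptions, not settings documented by this project.

```python
from urllib.parse import quote


def postgresql_url(user, database, host='localhost', port=5432, password=None):
    """Build a URL of the form postgresql://user[:password]@host:port/database.

    Hypothetical helper for illustration; it is not part of mcl.sickbay.
    """
    # Percent-encode each piece so characters such as "@" or "/" are safe
    credentials = quote(user, safe='')
    if password is not None:
        credentials += ':' + quote(password, safe='')
    db_name = quote(database, safe='')
    return f'postgresql://{credentials}@{host}:{port}/{db_name}'


# User and database names match the createuser/createdb commands above
url = postgresql_url('mcl', 'clinical_data')
print(url)
```

The resulting string can then be handed to whatever creates the database engine in your own application.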
🖥 The Software
To use this software, simply add mcl.sickbay as a dependency to your project or install it into your Python virtual environment.
You can develop, build, and test the package locally as follows:
python3 bootstrap.py
bin/buildout
bin/test
You can run bin/create-demo-db (renamed create-clinical-db as of release 0.0.5) to populate a PostgreSQL database with the schema of the Sickbay data model. Add -add-test-data to include some test data.
To publish this software, try Twine.
🔢 Versioning
We use the SemVer philosophy for versioning this software. For versions available, see the release history.
📦 Additional Resources
Some resources that provide further context for this software are as follows:
👥 Contributing
Well it's wide open right now, but later you might look at open issues, forking the project, and submitting a pull request.
📃 License
The project is licensed under the Apache version 2 license.
📜 Changelog
This documents the changes from release to release.
0.0.10
For issue https://github.com/MCLConsortium/mcl.sickbay/issues/1:
- On ClinicalCore:
  - The race attribute is now a 1-to-many mapping to CoreRace via core_races
  - The type_tobacco_used is now a 1-to-many mapping to CoreTobacco via core_tobaccos
  - The attribute days_to_birth is now required
- On Biospecimen:
  - The enumeration for Precancers has a whole bunch of new permitted values
- On BreastOrgan:
  - The enumeration for PrecancerousHistopathology contains values for "unknown" and "data not available"
  - The enumeration for BreastSite now has an unknown value
  - A new value pending is available for GeneticTestingAnswer, TestResults, EstrogenTestResults
  - The enumeration HER2Results adds pending and unknown values
  - The enumeration BreastImagingWorkup adds an unknown value
  - The enumeration BIRADSTissues adds values for "unknown" and "data not available"
- New LungOrgan plus (bogus) test data for it
- New PancreasOrgan plus (bogus) test data for it
- Updated ProstateOrgan
  - Previously, this was just a placeholder to test multiple inheritance from the common Organ class in terms of both Python class hierarchy and database hierarchy
  - Now it's completely filled out with the v0 prostate common data elements with its numerous controlled vocabularies
- Expanded enumerations: ClinicalMStage7, TStage7, ClinicalNStage7, GroupStage7, MarginalStatus
- New enumerations, far too many to enumerate 😏
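The 1-to-many mappings described for ClinicalCore above might look roughly like the following in SQLAlchemy. This is a minimal sketch, not the package's actual model: the table names, the choice of participant_ID as primary key, the surrogate key on CoreRace, the plain string race column (the real model uses enumerations), and the in-memory SQLite engine are all assumptions made so the example runs without PostgreSQL. Only the class names, the core_races attribute, and the required days_to_birth field come from this changelog.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class ClinicalCore(Base):
    __tablename__ = 'clinical_core'  # assumed table name
    participant_ID = Column(String(50), primary_key=True)  # assumed primary key
    days_to_birth = Column(Integer, nullable=False)  # "now required"
    # One participant, many CoreRace rows
    core_races = relationship('CoreRace', back_populates='clinical_core')


class CoreRace(Base):
    __tablename__ = 'core_race'  # assumed table name
    identifier = Column(Integer, primary_key=True)  # assumed surrogate key
    race = Column(String, nullable=False)  # simplified; assumed plain string
    participant_ID = Column(String(50), ForeignKey('clinical_core.participant_ID'))
    clinical_core = relationship('ClinicalCore', back_populates='core_races')


# In-memory SQLite keeps the sketch self-contained (assumption; the project uses PostgreSQL)
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    participant = ClinicalCore(participant_ID='demo-1', days_to_birth=-12345)  # arbitrary values
    participant.core_races.append(CoreRace(race='White'))
    participant.core_races.append(CoreRace(race='Asian'))
    session.add(participant)
    session.commit()
    races = sorted(r.race for r in participant.core_races)
```

A mapping from type_tobacco_used to CoreTobacco via core_tobaccos would presumably follow the same pattern.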
For issue https://github.com/MCLConsortium/mcl.sickbay/issues/4:
- All fields in LabCASMetadata are now String.
For issue https://github.com/MCLConsortium/mcl.sickbay/issues/3:
- inscribed_clinicalCore_participant_ID is a new field on PriorLesion, CoreRace, and CoreTobacco
- inscribed_biospecimen_identifier is a new field on AdjacentSpecimen
For issue https://github.com/MCLConsortium/mcl.sickbay/issues/5:
- The following updates diverge from the data dictionaries of the common data elements:
  - participant_ID is now 50 characters (along with foreign keys and inscribed fields), up from 14
  - specimen_ID is now 50 characters (along with foreign keys and inscribed fields), up from 16
And finally, for issue https://github.com/MCLConsortium/mcl.sickbay/issues/6 … we add unknown to all enumerations that didn't have it already.
0.0.9
- Rename inscribed_participant_ID → inscribed_clinicalCore_participant_ID
- Rename inscribed_specimen_ID → inscribed_biospecimen_specimen_ID
0.0.8
- Addresses https://github.com/MCLConsortium/mcl.sickbay/issues/2 by:
  - Adding inscribed_participant_ID and inscribed_specimen_ID to Genomics
  - Adding inscribed_participant_ID and inscribed_specimen_ID to Imaging
  - Adding inscribed_participant_ID to Biospecimen
  - (It also adds some test data to these fields.)
0.0.7
In this release:
- The labcasFileURL field is now just labcasID; everything else is the same except the name (and the semantics; it no longer is used to hold URLs)
- The Organ class now has an inscribed_participant_ID field you can use to note a future participant ID association with a ClinicalCore
- All enumerations now use advanced enumerations for their base class.
- All enumerations now have a case-insensitive lookup.
In practice, that last bullet means:
>>> from mcl.sickbay.model.enums import Race
>>> Race.black_or_african_american == Race('Black or African American')
True
>>> Race.black_or_african_american == Race['Black or African American']
True
>>> Race.black_or_african_american == Race['black or african american']
True
>>> Race('black or african american')
Traceback (most recent call last):
...
ValueError: 'black or african american' is not a valid Race
So if you want case-insensitive lookups, use brackets, not parentheses.
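To illustrate the distinction between brackets and parentheses, here is a minimal sketch of how a case-insensitive bracket lookup could be built on the standard library's enum module. It is not the package's implementation (the changelog says Sickbay uses advanced enumerations as the base class); the metaclass name CaseInsensitiveMeta and the two-member Race enumeration are hypothetical stand-ins.

```python
import enum


class CaseInsensitiveMeta(enum.EnumMeta):
    """Hypothetical metaclass: bracket lookups match names or values, ignoring case."""

    def __getitem__(cls, key):
        try:
            # Exact member-name lookup, the standard Enum behavior
            return super().__getitem__(key)
        except KeyError:
            wanted = key.casefold()
            for member in cls:
                if member.name.casefold() == wanted or str(member.value).casefold() == wanted:
                    return member
            raise  # nothing matched: re-raise the original KeyError


class Race(enum.Enum, metaclass=CaseInsensitiveMeta):
    # Stand-in for mcl.sickbay.model.enums.Race with only two members
    black_or_african_american = 'Black or African American'
    white = 'White'


by_label = Race['black or african american']  # brackets: case-insensitive
by_value = Race('Black or African American')  # parentheses: exact value only
```

Because only __getitem__ is overridden, calling Race('black or african american') still raises ValueError, matching the behavior shown in the session above.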
0.0.6
In this release:
- Base metadata for all classes now includes:
  - consortium, a nullable string that can be used to contain an RDF URI to the consortium that originated the data, such as https://mcl.nci.nih.gov/ for the Consortium for Molecular and Cellular Characterization of Screen-Detected Lesions.
  - protocolID, a nullable integer that identifies the research protocol that generated the data.
- Kristen's sample data (--add-sample-data) includes these consortium and protocol IDs
0.0.5
This release fixes:
- In BreastOrgan, the field her2_in_situ_hybridization was the wrong enumerated type. It should've been HER2InSituHybridization.
- In the enums, add the type HER2InSituHybridization.
- Add test data from 12_78_BreastCore_20200625_0.
- Removed foreign key constraint from Biospecimen.specimen_parent_ID because the parent ID may be either another biospecimen or a participant (clinical core) object.
- New class AdjacentSpecimen to work around circular dependency problem of having adjacent specimens directly on Biospecimen.
- New JSON serialization for adjacent_specimens on Biospecimen
- Misspelled enumeration AnatomicalSite: pancrease → pancreas
- Change create-demo-db to create-clinical-db since this is no longer a demo but the real deal
- Transition from old style setup.py to everything in setup.cfg
In this release, 0.0.5, we also finally start keeping a changelog 😮