Merge pull request #154 from CanDIG/mshadbolt/DIG-1217-create-ER-diagram
DIG:1217 Create ER diagram workflow and add to READMEs
SonQBChau authored Oct 26, 2023
2 parents 8224813 + 9510fea commit ca867ea
Showing 6 changed files with 704 additions and 1 deletion.
63 changes: 63 additions & 0 deletions README.md
@@ -124,6 +124,69 @@ coverage html

## MOHCCN Clinical Data Model

Katsu uses an underlying data model that is a compatible interpretation of the MOHCCN data model but does not match it exactly. Katsu is currently compliant with version 2 of the model, released in February 2023. Some relationships between objects have been modified to avoid excessive complexity in the katsu database and to allow the submission of data that is incomplete compared to the MOHCCN gold standard requirements. Permissible values for controlled fields are not validated by the underlying database.

The katsu MoH model is explicitly defined as a set of classes in [models.py](chord_metadata_service/mohpackets/models.py). Permissible values for controlled fields, conditionally required fields and relationships are enforced by the [serializers](chord_metadata_service/mohpackets), [clinical ETL](https://github.com/CanDIG/clinical_ETL_code) and [ingest](https://github.com/CanDIG/candigv2-ingest) validation steps.

An overview diagram of how objects in the katsu model relate to one another is shown below. A more detailed entity relationship diagram containing field-level information can be found in the [mohpackets docs folder](chord_metadata_service/mohpackets/docs/er_diagram.md).

```mermaid
---
title: katsu object level MoH ER diagram
---
erDiagram
Program ||--o{ Donor : ""
Program ||--o{ PrimaryDiagnosis : ""
Program ||--o{ Comorbidity : ""
Program ||--o{ Biomarker : ""
Program ||--o{ Exposure : ""
Program ||--o{ FollowUp : ""
Program ||--o{ Specimen : ""
Program ||--o{ Treatment : ""
Program ||--o{ SampleRegistration : ""
Program ||--o{ Chemotherapy : ""
Program ||--o{ HormoneTherapy : ""
Program ||--o{ Immunotherapy : ""
Program ||--o{ Radiation : ""
Program ||--o{ Surgery : ""
Donor ||--o{ PrimaryDiagnosis : ""
Donor ||--o{ Comorbidity : ""
Donor ||--o{ Biomarker : ""
Donor ||--o{ Exposure : ""
Donor ||--o{ FollowUp : ""
Donor ||--o{ Specimen : ""
Donor ||--o{ Treatment : ""
Donor ||--o{ SampleRegistration : ""
Donor ||--o{ Chemotherapy : ""
Donor ||--o{ HormoneTherapy : ""
Donor ||--o{ Immunotherapy : ""
Donor ||--o{ Radiation : ""
Donor ||--o{ Surgery : ""
PrimaryDiagnosis ||--o{ Specimen : ""
PrimaryDiagnosis ||--o{ Treatment : ""
PrimaryDiagnosis ||--o{ FollowUp : ""
Specimen ||--o{ SampleRegistration : ""
Treatment ||--o{ Chemotherapy : ""
Treatment ||--o{ HormoneTherapy : ""
Treatment ||--o{ Immunotherapy : ""
Treatment ||--o| Radiation : ""
Treatment ||--o| Surgery : ""
Treatment ||--o{ FollowUp : ""
```
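
In Mermaid ER notation, `||--o{` reads as "exactly one parent to zero or more children" and `||--o|` as "exactly one parent to zero or one child". The following is a minimal sketch, not part of katsu, that reads edges of this form from diagram text and prints them in words. The names `parse_edges`, `EDGE`, `CHILD_CARDINALITY` and `SAMPLE` are hypothetical, and the sample is only a short excerpt of the diagram above.

```python
import re

# Child-side cardinality markers used in the diagram above (Mermaid ER syntax).
CHILD_CARDINALITY = {
    "o{": "zero or more",
    "o|": "zero or one",
}

# Matches lines such as: Treatment ||--o| Surgery : ""
EDGE = re.compile(r"^\s*(\w+)\s+\|\|--(o\{|o\|)\s+(\w+)\s*:")


def parse_edges(mermaid_text):
    """Return (parent, child, cardinality) tuples for each matching edge line."""
    edges = []
    for line in mermaid_text.splitlines():
        match = EDGE.match(line)
        if match:
            parent, marker, child = match.groups()
            edges.append((parent, child, CHILD_CARDINALITY[marker]))
    return edges


# Excerpt of the diagram, for illustration only.
SAMPLE = """
erDiagram
Program ||--o{ Donor : ""
Donor ||--o{ Treatment : ""
Treatment ||--o{ Chemotherapy : ""
Treatment ||--o| Radiation : ""
Treatment ||--o| Surgery : ""
"""

for parent, child, cardinality in parse_edges(SAMPLE):
    print(f"each {parent} has {cardinality} {child}")
```

Read this way, the diagram shows that a **Treatment** can have many **Chemotherapy** records but at most one **Radiation** and one **Surgery** record.
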
### General notes

* The primary key for **Program** is `program_id`, which should be unique across all instances of the CanDIG platform.
* For all other objects, the primary key is `submitter_<object_name>_id`, a user-provided identifier that should be unique across all instances of an object within a program.
* All objects are explicitly linked with foreign keys to a **Program** and to the **Donor** the object derives from.
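
The uniqueness rule for submitter identifiers can be checked before ingest with a small helper. This is a sketch under stated assumptions: records are plain dictionaries, the helper name `duplicate_keys` and the sample identifiers are hypothetical, and katsu's own validation may work differently.

```python
from collections import Counter


def duplicate_keys(records, id_field):
    """Return sorted (program_id, identifier) pairs that occur more than once."""
    counts = Counter((record["program_id"], record[id_field]) for record in records)
    return sorted(key for key, count in counts.items() if count > 1)


# Made-up donor records for illustration.
donors = [
    {"program_id": "PROGRAM_A", "submitter_donor_id": "DONOR_1"},
    {"program_id": "PROGRAM_A", "submitter_donor_id": "DONOR_2"},
    # Same identifier in a different program: allowed by the rule above.
    {"program_id": "PROGRAM_B", "submitter_donor_id": "DONOR_1"},
    # Same identifier repeated within PROGRAM_A: a violation.
    {"program_id": "PROGRAM_A", "submitter_donor_id": "DONOR_1"},
]

print(duplicate_keys(donors, "submitter_donor_id"))
# prints [('PROGRAM_A', 'DONOR_1')]
```

Keying on the pair `(program_id, identifier)` reflects that identifiers only need to be unique within a program, not across programs.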

### Deviations from the MOHCCN model

* **Biomarker** is explicitly linked to **Donor** with a foreign key. It should also be linked to a specific clinical event by storing a `specimen`, `primary_diagnosis`, `treatment` or `follow_up` submitter ID (for example, `submitter_specimen_id`) in the **Biomarker** object. If it isn't linked to a clinical event, it should have `test_date` specified.
* **Surgery** is explicitly linked with a foreign key to a **Treatment**. It can also store a `submitter_specimen_id` to indicate which specimen was derived from the surgery; this is not a foreign key relationship.
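
The **Biomarker** linking rule can be expressed as a short check. This is an illustrative sketch only: the real enforcement lives in the serializers, clinical ETL and ingest steps, the function name `biomarker_link_problem` is hypothetical, and it assumes that at least one (rather than exactly one) clinical event identifier is acceptable. The field names are taken from the **Biomarker** class in `classes.mmd`.

```python
# Fields on Biomarker that can point at a clinical event.
LINK_FIELDS = (
    "submitter_specimen_id",
    "submitter_primary_diagnosis_id",
    "submitter_treatment_id",
    "submitter_follow_up_id",
)


def biomarker_link_problem(biomarker):
    """Return None if the record satisfies the linking rule, else a message."""
    if not biomarker.get("submitter_donor_id"):
        return "submitter_donor_id is required"
    has_event = any(biomarker.get(field) for field in LINK_FIELDS)
    if has_event or biomarker.get("test_date"):
        return None
    return "link to a clinical event or provide test_date"


# Linked to a specimen: passes.
print(biomarker_link_problem(
    {"submitter_donor_id": "DONOR_1", "submitter_specimen_id": "SPECIMEN_1"}
))
# Neither a clinical event nor a test_date: reported as a problem.
print(biomarker_link_problem({"submitter_donor_id": "DONOR_1"}))
```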

### References
[Clinical Data Model](https://www.marathonofhopecancercentres.ca/docs/default-source/policies-and-guidelines/mohccn-clinical-data-model_v1_endorsed6oct-2022.pdf?Status=Master&sfvrsn=7f6bd159_7)

[ER Diagram](https://www.marathonofhopecancercentres.ca/docs/default-source/policies-and-guidelines/mohccn_data_standard_er_diagram_endorsed6oct22.pdf?Status=Master&sfvrsn=dd57a75e_5)
17 changes: 17 additions & 0 deletions chord_metadata_service/mohpackets/docs/README.MD
@@ -2,6 +2,8 @@

This folder contains the schema and documentation for **MoH models**

## Katsu API Documentation

To view the API documentation, simply open [openapi.md](openapi.md) or [Redoc](https://redocly.github.io/redoc/?url=https://raw.githubusercontent.com/CanDIG/katsu/develop/chord_metadata_service/mohpackets/docs/schema.yml).

To generate the `schema.yml` file, run the following command:
@@ -17,3 +19,18 @@ widdershins ./chord_metadata_service/mohpackets/docs/schema.yml -o ./chord_metad
```

This will create the openapi.md file with the updated documentation.

## Katsu MoH Data Model Documentation

To regenerate the `er_diagram.md` file, run the following steps from the command line in the current directory.

First, update the model classes:
```bash
pip install pylint
pyreverse -o mmd ./models.py
```

Then update the markdown file:
```bash
python make_er_diagram.py
```
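
The two steps above can also be chained from Python so that a failure in the first step stops the second. This is a minimal sketch, not part of the repository: the helper names `regeneration_commands` and `regenerate` are hypothetical, the commands are copied from the steps above, and it assumes `pylint` is already installed and that the script is run from the same directory.

```python
import subprocess
import sys


def regeneration_commands():
    """Return the two commands from the steps above as argument lists."""
    return [
        ["pyreverse", "-o", "mmd", "./models.py"],
        # sys.executable reuses the interpreter that is running this script.
        [sys.executable, "make_er_diagram.py"],
    ]


def regenerate(run=subprocess.run):
    """Run each command in order; check=True raises if a command fails."""
    for command in regeneration_commands():
        run(command, check=True)
```

Calling `regenerate()` runs both steps. The `run` parameter exists only so the command sequence can be inspected or tested without invoking the real tools.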
264 changes: 264 additions & 0 deletions chord_metadata_service/mohpackets/docs/classes.mmd
@@ -0,0 +1,264 @@
classDiagram
class AutoDateTimeField {
pre_save(model_instance, add)
}
class Biomarker {
ca125 : PositiveSmallIntegerField
cea : PositiveSmallIntegerField
er_percent_positive : FloatField
er_status : CharField
her2_ihc_status : CharField
her2_ish_status : CharField
hpv_ihc_status : CharField
hpv_pcr_status : CharField
hpv_strain : ArrayField
id : UUIDField
pr_percent_positive : FloatField
pr_status : CharField
program_id : ForeignKey
psa_level : PositiveSmallIntegerField
submitter_donor_id : ForeignKey
submitter_follow_up_id : CharField
submitter_primary_diagnosis_id : CharField
submitter_specimen_id : CharField
submitter_treatment_id : CharField
test_date : CharField
}
class Chemotherapy {
actual_cumulative_drug_dose : PositiveSmallIntegerField
chemotherapy_drug_dose_units : CharField
drug_name : CharField
drug_reference_database : CharField
drug_reference_identifier : CharField
id : UUIDField
prescribed_cumulative_drug_dose : PositiveSmallIntegerField
program_id : ForeignKey
submitter_donor_id : ForeignKey
submitter_treatment_id : ForeignKey
}
class Comorbidity {
age_at_comorbidity_diagnosis : PositiveSmallIntegerField
comorbidity_treatment : CharField
comorbidity_treatment_status : CharField
comorbidity_type_code : CharField
id : UUIDField
laterality_of_prior_malignancy : CharField
prior_malignancy : CharField
program_id : ForeignKey
submitter_donor_id : ForeignKey
}
class Donor {
cause_of_death : CharField
date_alive_after_lost_to_followup : CharField
date_of_birth : CharField
date_of_death : CharField
gender : CharField
is_deceased : BooleanField
lost_to_followup_after_clinical_event_identifier : CharField
lost_to_followup_reason : CharField
primary_site : ArrayField
program_id : ForeignKey
sex_at_birth : CharField
submitter_donor_id : CharField
}
class Exposure {
id : UUIDField
pack_years_smoked : FloatField
program_id : ForeignKey
submitter_donor_id : ForeignKey
tobacco_smoking_status : CharField
tobacco_type : ArrayField
}
class FollowUp {
anatomic_site_progression_or_recurrence : ArrayField
date_of_followup : CharField
date_of_relapse : CharField
disease_status_at_followup : CharField
method_of_progression_status : ArrayField
program_id : ForeignKey
recurrence_m_category : CharField
recurrence_n_category : CharField
recurrence_stage_group : CharField
recurrence_t_category : CharField
recurrence_tumour_staging_system : CharField
relapse_type : CharField
submitter_donor_id : ForeignKey
submitter_follow_up_id : CharField
submitter_primary_diagnosis_id : ForeignKey
submitter_treatment_id : ForeignKey
}
class HormoneTherapy {
actual_cumulative_drug_dose : PositiveSmallIntegerField
drug_name : CharField
drug_reference_database : CharField
drug_reference_identifier : CharField
hormone_drug_dose_units : CharField
id : UUIDField
prescribed_cumulative_drug_dose : PositiveSmallIntegerField
program_id : ForeignKey
submitter_donor_id : ForeignKey
submitter_treatment_id : ForeignKey
}
class Immunotherapy {
actual_cumulative_drug_dose : PositiveSmallIntegerField
drug_name : CharField
drug_reference_database : CharField
drug_reference_identifier : CharField
id : UUIDField
immunotherapy_drug_dose_units : CharField
immunotherapy_type : CharField
prescribed_cumulative_drug_dose : PositiveSmallIntegerField
program_id : ForeignKey
submitter_donor_id : ForeignKey
submitter_treatment_id : ForeignKey
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class Meta {
ordering : list
}
class PrimaryDiagnosis {
basis_of_diagnosis : CharField
cancer_type_code : CharField
clinical_m_category : CharField
clinical_n_category : CharField
clinical_stage_group : CharField
clinical_t_category : CharField
clinical_tumour_staging_system : CharField
date_of_diagnosis : CharField
laterality : CharField
lymph_nodes_examined_method : CharField
lymph_nodes_examined_status : CharField
number_lymph_nodes_positive : PositiveSmallIntegerField
program_id : ForeignKey
submitter_donor_id : ForeignKey
submitter_primary_diagnosis_id : CharField
}
class Program {
created : DateTimeField
metadata : JSONField
program_id : CharField
updated
}
class Radiation {
anatomical_site_irradiated : CharField
id : UUIDField
program_id : ForeignKey
radiation_boost : BooleanField
radiation_therapy_dosage : PositiveSmallIntegerField
radiation_therapy_fractions : PositiveSmallIntegerField
radiation_therapy_modality : CharField
radiation_therapy_type : CharField
reference_radiation_treatment_id : CharField
submitter_donor_id : ForeignKey
submitter_treatment_id : ForeignKey
}
class SampleRegistration {
program_id : ForeignKey
sample_type : CharField
specimen_tissue_source : CharField
specimen_type : CharField
submitter_donor_id : ForeignKey
submitter_sample_id : CharField
submitter_specimen_id : ForeignKey
tumour_normal_designation : CharField
}
class Specimen {
pathological_m_category : CharField
pathological_n_category : CharField
pathological_stage_group : CharField
pathological_t_category : CharField
pathological_tumour_staging_system : CharField
percent_tumour_cells_measurement_method : CharField
percent_tumour_cells_range : CharField
program_id : ForeignKey
reference_pathology_confirmed_diagnosis : CharField
reference_pathology_confirmed_tumour_presence : CharField
specimen_anatomic_location : CharField
specimen_collection_date : CharField
specimen_laterality : CharField
specimen_processing : CharField
specimen_storage : CharField
submitter_donor_id : ForeignKey
submitter_primary_diagnosis_id : ForeignKey
submitter_specimen_id : CharField
tumour_grade : CharField
tumour_grading_system : CharField
tumour_histological_type : CharField
}
class Surgery {
greatest_dimension_tumour : PositiveSmallIntegerField
id : UUIDField
lymphovascular_invasion : CharField
margin_types_involved : ArrayField
margin_types_not_assessed : ArrayField
margin_types_not_involved : ArrayField
perineural_invasion : CharField
program_id : ForeignKey
residual_tumour_classification : CharField
submitter_donor_id : ForeignKey
submitter_specimen_id : CharField
submitter_treatment_id : ForeignKey
surgery_location : CharField
surgery_site : CharField
surgery_type : CharField
tumour_focality : CharField
tumour_length : PositiveSmallIntegerField
tumour_width : PositiveSmallIntegerField
}
class Treatment {
days_per_cycle : PositiveSmallIntegerField
is_primary_treatment : CharField
line_of_treatment : IntegerField
number_of_cycles : PositiveSmallIntegerField
program_id : ForeignKey
response_to_treatment : CharField
response_to_treatment_criteria_method : CharField
status_of_treatment : CharField
submitter_donor_id : ForeignKey
submitter_primary_diagnosis_id : ForeignKey
submitter_treatment_id : CharField
treatment_end_date : CharField
treatment_intent : CharField
treatment_setting : CharField
treatment_start_date : CharField
treatment_type : ArrayField
}
AutoDateTimeField --* Program : updated