updated populate fixing-investigation command #2500
base: develop
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #2500 +/- ##
===========================================
+ Coverage 69.20% 69.46% +0.25%
===========================================
Files 211 211
Lines 11944 11970 +26
Branches 1208 1213 +5
===========================================
+ Hits 8266 8315 +49
+ Misses 3302 3273 -29
- Partials 376 382 +6
☔ View full report in Codecov by Sentry.
@DraKen0009 do follow up and close this
@sainak updated, can you please run the workflows once
@DraKen0009 can you add a test case that validates the groups are being linked correctly. You can run the command using django.core.management.call_command
So basically, after running the command, I need to loop over the data in the JSON and compare it with the data present in the model (db)?
just check a few records
@sainak can you please check the test cases and let me know if they require any changes
lgtm; found a tiny typo
📝 Walkthrough
The changes in this pull request introduce a new test class.
…to fixing-investigation-loading
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (8)
data/investigation_groups.json (1)

Lines 29-80: Inconsistent capitalization in group names

Some observations about the naming conventions:
- "Cerebro Spinal Fluid analysis" - "analysis" is not capitalized
- "Gastric Analysis" - "Analysis" is capitalized
- "Semen analysis" - "analysis" is not capitalized
- "Stool Examination" - "Examination" is capitalized

Perhaps we should settle on one style? I'm leaning towards capitalizing all words in the group names.

- "name": "Cerebro Spinal Fluid analysis"
+ "name": "Cerebro Spinal Fluid Analysis"
- "name": "Semen analysis"
+ "name": "Semen Analysis"

care/facility/tests/test_management_commands.py (2)
Lines 21-25: Just counting entries? How... minimalist.

While the count validation is a good start, it might be nice to add some basic data integrity checks. Consider adding validation for essential fields:

def test_data_integrity(self):
    for investigation in self.investigations:
        db_investigation = PatientInvestigation.objects.get(name=investigation["name"])
        self.assertEqual(db_investigation.unit, investigation.get("unit", ""))
        self.assertEqual(db_investigation.ideal_value, investigation.get("ideal_value", ""))
Lines 27-48: Testing just 10 entries? That's... brave.

While the dictionary optimization for group lookups is commendable, the sample size seems a bit small for a proper validation. Consider either:
- Testing all entries (preferred for critical data)
- Using a more representative random sample:

- # taking first and last 5 data to test it out
- test_investigation_data = self.investigations[:5] + self.investigations[-5:]
+ import random
+ # Testing ~20% of entries with a minimum of 10
+ sample_size = max(10, len(self.investigations) // 5)
+ test_investigation_data = random.sample(self.investigations, sample_size)

Also, that informal comment could be more... professional.

- # taking first and last 5 data to test it out
+ # Sampling subset of investigations for relationship validation

care/users/management/commands/populate_investigations.py (3)
Lines 17-21: Perhaps we could add some basic error handling?

While I'm sure you've tested this thoroughly, it might be nice to handle scenarios where these files don't exist, instead of letting it crash and burn. Just a thought!

- with Path("data/investigations.json").open() as investigations_data:
-     investigations = json.load(investigations_data)
+ try:
+     with Path("data/investigations.json").open() as investigations_data:
+         investigations = json.load(investigations_data)
+ except (FileNotFoundError, json.JSONDecodeError) as e:
+     raise CommandError(f"Failed to load investigations.json: {e}")

Lines 51-58: These try-except blocks look awfully repetitive...

We could consolidate these similar try-except blocks into a helper function. You know, DRY and all that...

+ def safe_float_conversion(value):
+     try:
+         return float(value)
+     except (ValueError, TypeError, KeyError):
+         return None
+
- try:
-     data["min_value"] = float(investigation["min"])
- except (ValueError, TypeError, KeyError):
-     data["min_value"] = None
- try:
-     data["max_value"] = float(investigation["max"])
- except (ValueError, TypeError, KeyError):
-     data["max_value"] = None
+ data["min_value"] = safe_float_conversion(investigation.get("min"))
+ data["max_value"] = safe_float_conversion(investigation.get("max"))
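As a standalone sketch, the consolidated helper suggested above behaves like this (safe_float_conversion is the reviewer's proposed name, not an existing function in the command):

```python
def safe_float_conversion(value):
    """Return value coerced to float, or None if it cannot be converted."""
    try:
        return float(value)
    except (ValueError, TypeError):
        return None


# Typical inputs seen in the investigations fixture:
print(safe_float_conversion("4.5"))  # 4.5
print(safe_float_conversion(None))   # None
print(safe_float_conversion("N/A"))  # None
```

Paired with `.get("min")` / `.get("max")`, the KeyError branch of the original becomes unnecessary, since `.get` returns None for missing keys.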
Lines 102-105: The success message could be a tiny bit more informative...

Maybe we could tell users how many records were actually created/updated? Just a thought!

  if kwargs.get("verbosity", 1) > 0:
      self.stdout.write(
-         self.style.SUCCESS("Successfully populated investigation data")
+         self.style.SUCCESS(
+             f"Successfully populated investigation data: "
+             f"{len(bulk_create_data)} created, {len(bulk_update_data)} updated"
+         )
      )

data/investigations.json (2)
Lines 2277-2287: Fix multiline ideal value formatting

The Troponin ideal value contains Windows-style line endings (\r\n). For consistency, use Unix-style line endings.

- "ideal": " Troponin I (cTnI) :0-0.08 ng/ml\r\n Ttroponin T (cTnT):0-0.01 ng/ml",
+ "ideal": "Troponin I (cTnI): 0-0.08 ng/ml\nTroponin T (cTnT): 0-0.01 ng/ml",
Lines 2743-2753: Fix trailing newlines in test name

The porphobilinogen test name contains multiple trailing newlines, which could cause display issues.

- "name": "porphobilinogen\n\n",
+ "name": "porphobilinogen",
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- care/facility/tests/test_management_commands.py (1 hunks)
- care/users/management/commands/populate_investigations.py (2 hunks)
- data/investigation_groups.json (1 hunks)
- data/investigations.json (127 hunks)
🔇 Additional comments (5)
data/investigation_groups.json (1)

Lines 29-80: Verify investigation group relationships

Let's ensure these new groups are properly linked to their respective investigations in the investigations.json file.

#!/bin/bash
# Check if all new group IDs (8-20) are referenced in investigations.json
echo "Checking references for new investigation groups..."
for id in {8..20}; do
  echo "Group $id:"
  rg -c "\"group\": *\"$id\"" data/investigations.json
done
care/facility/tests/test_management_commands.py (1)

Lines 1-8: Well-organized imports, I suppose.

The imports are clean and properly organized, which is... refreshing to see.
data/investigations.json (3)
Lines 708-710: Validate extreme range changes

Several tests have significantly increased max ranges:
- GGT max increased to 300 U/L
- ALT max increased to 1500 U/L
- ALP max increased to 1200 U/L

While these could be valid for extreme clinical cases, it would be nice if someone could verify these ranges with medical professionals. Consider adding a panic_range field to flag critically high values that require immediate attention.

Also applies to: 1736-1738, 1760-1762
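If a panic_range field were adopted, a fixture entry might take a shape like the following (purely hypothetical: neither the field nor the threshold value below exists in the current data; only the ALT max of 1500 comes from the review above):

```
{
  "name": "ALT",
  "unit": "U/L",
  "min": 0,
  "max": 1500,
  "panic_range": {"high": 1000}
}
```

The populate command would then also need a corresponding model field and parsing logic before this key could be consumed.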
Lines 1612-1614: Verify pediatric reference ranges

The ideal ranges for several tests (HBA1C, Haemoglobin, PCV/HCT) don't specify pediatric reference ranges. It would be really great if we could make these more comprehensive. Consider adding age-specific ranges similar to how ALP ranges are specified.

Also applies to: 1623-1625, 1635-1637
Lines 29-30: Standardize unit notation for consistency

The unit notation has been changed from cell/cumm to cell/µl across multiple tests. While this is more accurate scientifically, ensure that all downstream systems and reports can properly handle the Unicode µ character. It would be slightly more robust to use cell/uL instead.

Also applies to: 42-43, 55-56, 68-69, 81-82
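A minimal sketch of one way to guard against the two micro-symbol code points (MICRO SIGN U+00B5 and GREEK SMALL LETTER MU U+03BC) when comparing unit strings — the function name is illustrative, not part of the codebase:

```python
def normalize_unit(unit):
    """Map both micro-symbol code points to a plain ASCII 'u'."""
    if unit is None:
        return None
    return unit.replace("\u00b5", "u").replace("\u03bc", "u")


print(normalize_unit("cell/\u03bcl"))  # cell/ul
print(normalize_unit("cell/\u00b5l"))  # cell/ul
```

Normalizing on both read and comparison paths would make the choice of µ variant in the fixture file a non-issue for downstream systems.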
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (8)
data/investigation_groups.json (3)

Line range hint 1-28: Ah, the creative approach to capitalization continues...

I notice we're being quite... flexible with our capitalization standards:
- "Biochemistry test" (id: 2)
- "Kidney Function test" (id: 7)
- "Liver Function Test" (id: 6)

Perhaps we could align these to be consistently capitalized? Unless there's some secret medical convention I'm not aware of? 😊

  {
    "id": "2",
-   "name": "Biochemistry test"
+   "name": "Biochemistry Test"
  },
  {
    "id": "7",
-   "name": "Kidney Function test"
+   "name": "Kidney Function Test"
  }
Lines 70-73: Consistency in terminology would be lovely

I see we have some entries with lowercase "analysis" while others use proper test naming conventions:
- "Semen analysis"
- "Stool Examination"

Shall we make them all follow the same pattern?

  {
    "id": "18",
-   "name": "Semen analysis"
+   "name": "Semen Analysis"
  }

Also applies to: 74-77
Lines 39-41: Abbreviation without context

"ABG" is used without its full form. While medical professionals might know this stands for Arterial Blood Gas, it might be helpful to include the full form for clarity.

  {
    "id": "10",
-   "name": "ABG"
+   "name": "ABG (Arterial Blood Gas)"
  }

care/users/management/commands/populate_investigations.py (3)
Lines 10-12: Perhaps we could make this function a tiny bit more robust?

While the function works, it might be nice to add some basic error handling for those special moments when files don't exist or contain invalid JSON.

  def load_json(file_path):
-     with Path(file_path).open() as json_file:
-         return json.load(json_file)
+     try:
+         with Path(file_path).open() as json_file:
+             return json.load(json_file)
+     except (FileNotFoundError, json.JSONDecodeError) as e:
+         raise CommandError(f"Failed to load {file_path}: {str(e)}")
Line range hint 84-124: Maybe add some progress reporting for those special long-running imports?

While the success message at the end is nice, it might be helpful to know what's happening during those potentially long-running bulk operations.

  def handle(self, *args, **kwargs):
+     verbosity = kwargs.get("verbosity", 1)
      investigation_groups = load_json("data/investigation_groups.json")
+     if verbosity > 0:
+         self.stdout.write(f"Loaded {len(investigation_groups)} investigation groups")
      investigations = load_json("data/investigations.json")
+     if verbosity > 0:
+         self.stdout.write(f"Loaded {len(investigations)} investigations")
Line range hint 1-124: A dash of logging would make production issues so much easier to debug...

Consider adding proper logging throughout the command for better observability in production. This would help track data inconsistencies and performance issues. Key areas to add logging:
- Number of records created/updated
- Time taken for bulk operations
- Any data validation issues
- Skipped records due to validation failures
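The timing and record-count points above could be covered by a small wrapper around each bulk operation — a sketch with illustrative names, not code from the command:

```python
import logging
import time

logger = logging.getLogger("populate_investigations")


def timed_bulk(label, operation, records):
    """Run operation(records), logging the record count and elapsed time."""
    start = time.monotonic()
    result = operation(records)
    elapsed = time.monotonic() - start
    logger.info("%s: %d records in %.2fs", label, len(records), elapsed)
    return result


# Usage sketch inside handle(), assuming the command's existing variables:
# timed_bulk("bulk_create", PatientInvestigation.objects.bulk_create, bulk_create_data)
```

This keeps the command body readable while making every bulk step observable from logs.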
data/investigations.json (2)
Lines 29-29: Standardize unicode character usage

The unit changes from cell/cumm to cell/µl use different unicode representations for the micro symbol: some use \u03bcl, others might use the literal µ character. Consider standardizing on the unicode escape sequence \u03bc consistently across all measurements.

Also applies to: 42-42, 55-55, 68-68, 81-81
Line range hint 1-11: Verify Blood Group categorization

The Blood Group test has an empty category_id array, which seems incorrect, as it's a fundamental hematology test. Consider adding appropriate categories:

  "name": "Blood Group",
  "type": "Choice",
  "choices": "A-,A+,B+,B-,O+,O-,AB-,AB+",
  "unit": null,
  "ideal": null,
  "min": null,
  "max": null,
- "category_id": []
+ "category_id": [1, 4]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- care/users/management/commands/populate_investigations.py (2 hunks)
- data/investigation_groups.json (1 hunks)
- data/investigations.json (128 hunks)
🔇 Additional comments (2)
care/users/management/commands/populate_investigations.py (2)

Lines 35-51: Well, this actually looks quite nice!

Clean separation of concerns and proper error handling in parse_float. I approve.

Lines 59-81: Would be lovely to have some validation here...

The function efficiently handles bulk operations, but it might be nice to validate that category_ids actually exist in investigation_group_dict before proceeding.
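The validation asked for above could be sketched as a pre-flight check over the parsed JSON. Everything here is illustrative: the function name, the "category_id" key shape, and the string-keyed group dict are assumptions about the data rather than code from the command:

```python
def find_unknown_group_ids(investigations, investigation_group_dict):
    """Return (investigation name, group id) pairs referencing a missing group."""
    unknown = []
    for entry in investigations:
        for group_id in entry.get("category_id", []):
            if str(group_id) not in investigation_group_dict:
                unknown.append((entry.get("name"), group_id))
    return unknown


groups = {"1": "Haematology", "4": "Serology"}
data = [{"name": "Blood Group", "category_id": [1, 99]}]
print(find_unknown_group_ids(data, groups))  # [('Blood Group', 99)]
```

Running such a check before the bulk operations would turn a silent bad link into an explicit error the command can report.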
@sainak @rithviknishad Can you please re-check? I followed some of the CodeRabbit reviews and reformatted the code to pass linting.
Waiting on an ops confirmation.
Proposed Changes
Associated Issue
populate investigations command #2499

Merge Checklist
Only PRs with test cases included and passing lint and test pipelines will be reviewed
@ohcnetwork/care-backend-maintainers @ohcnetwork/care-backend-admins
Summary by CodeRabbit
New Features
Improvements
Bug Fixes