Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Implement cascading deletion for obsolete TrackedEntities[DHIS2-15066] #196

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
194 changes: 193 additions & 1 deletion releases/2.42/migration-notes.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ To help you navigate the document, here's a detailed table of contents.
- [Inconsistent data](#inconsistent-data)
- [Tracker](#tracker)
- [Null Organisation Unit](#null-organisation-unit)
- [Null Tracked Entity Type](#null-tracked-entity-type)
---
## Inconsistent-data

Expand Down Expand Up @@ -272,4 +273,195 @@ BEGIN

END;
$$;
```
```

### Null Tracked Entity Type

The `TrackedEntity`, previously known as `TrackedEntityInstance`, is required to have a specific type, such as person, place, equipment, or area. However, this constraint was not enforced at the database level, leading to inconsistent data. To ensure data integrity moving forward, we need to enforce this requirement by making the `trackedentitytypeid` column in the `trackedentity` table non-nullable.

#### Checking for Null Values

To check for any NULL values in this column, you can use the following SQL script. If it returns a value greater than 0, it indicates the presence of inconsistent data in the system.

##### For 2.41 Instances:

```sql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we shouldn't show entities that we can fix in the migration, so we should filter out any tracked entity that has an enrollment with a program that has a defined trackedEntityType

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we should change this query to not show the entities that will be fixed by the migration

SELECT COUNT(1)
FROM trackedentity
WHERE trackedentitytypeid IS NULL;
```
##### For <= 2.40 Instances:

```sql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

SELECT COUNT(1)
FROM trackedentityinstance
WHERE trackedentitytypeid IS NULL;
```

#### Fixing Null Values
zubaira marked this conversation as resolved.
Show resolved Hide resolved

Starting from version v42, NULL values are no longer allowed in the trackedentitytypeid column. The migration attempted to address the invalid data, but it was unsuccessful. There are two options going forward.
- Change the `NULL` value to a valid trackedentitytypeid. ([Assign trackedentitytyeid to tracked entity](#assign-tracked-entity-type))
- Completely remove invalid trackedentity record. ([Delete trackedentities](#deleting-invalid-trackedenetities))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Completely remove invalid trackedentity record. ([Delete trackedentities](#deleting-invalid-trackedenetities))
- Completely remove invalid trackedentity record. ([Delete trackedentities](#deleting-invalid-tracked-entities))


##### Assign tracked entity type
To assign valid trackedentitytypeid. It can be fetched from program via enrollment.
zubaira marked this conversation as resolved.
Show resolved Hide resolved

```sql
UPDATE trackedentity SET trackedentitytypeid = (
SELECT program.trackedentitytypeid FROM program
JOIN enrollment ON program.programid = enrollment.programid
WHERE enrollment.trackedentityid = trackedentity.trackedentityid)
WHERE trackedentitytypeid IS NULL AND EXISTS (
SELECT 1 FROM enrollment JOIN program ON enrollment.programid = program.programid
WHERE enrollment.trackedentityid = trackedentity.trackedentityid
);
```


##### Deleting invalid trackedenetities
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
##### Deleting invalid trackedenetities
##### Deleting invalid tracked entities

The following script can be used to remove all trackedentities with null in `trackedentitytypeid` column.
zubaira marked this conversation as resolved.
Show resolved Hide resolved
```plsql
zubaira marked this conversation as resolved.
Show resolved Hide resolved
DO $$
DECLARE
invalid_count INT; -- Variable to store the count of invalid TrackedEntities
deleted_count INT := 0; -- Variable to keep track of deleted TrackedEntity count
BEGIN
CREATE TEMP TABLE te AS (
SELECT trackedentityid
FROM trackedentity
WHERE trackedentitytypeid IS NULL
);

CREATE TEMP TABLE enrollment_ids AS (
SELECT enrollmentid
FROM enrollment
WHERE trackedentityid IN (SELECT trackedentityid FROM te)
);

CREATE TEMP TABLE event_ids AS (
SELECT eventid
FROM event
WHERE enrollmentid IN (SELECT enrollmentid FROM enrollment_ids)
);

CREATE TEMP TABLE te_pm AS (
SELECT id
FROM programmessage
WHERE trackedentityid IN (SELECT trackedentityid FROM te)
);

CREATE TEMP TABLE pi_pm AS (
SELECT id
FROM programmessage
WHERE enrollmentid IN (SELECT enrollmentid FROM enrollment_ids)
);

CREATE TEMP TABLE event_pm AS (
SELECT id
FROM programmessage
WHERE eventid IN (SELECT eventid FROM event_ids)
);

SELECT COUNT(trackedentityid)
INTO invalid_count
FROM trackedentity
WHERE trackedentitytypeid IS NULL;

RAISE NOTICE 'Number of invalid TrackedEntities (trackedentitytypeid IS NULL): %', invalid_count;

-- If there are any invalid TrackedEntities, proceed with deletion
IF invalid_count > 0 THEN

DELETE FROM programmessage_deliverychannels
WHERE programmessagedeliverychannelsid IN (SELECT id FROM te_pm);

DELETE FROM programmessage_emailaddresses
WHERE programmessageemailaddressid IN (SELECT id FROM te_pm);

DELETE FROM programmessage_phonenumbers
WHERE programmessagephonenumberid IN (SELECT id FROM te_pm);

DELETE FROM programmessage_deliverychannels
WHERE programmessagedeliverychannelsid IN (SELECT id FROM pi_pm);

DELETE FROM programmessage_emailaddresses
WHERE programmessageemailaddressid IN (SELECT id FROM pi_pm);

DELETE FROM programmessage_phonenumbers
WHERE programmessagephonenumberid IN (SELECT id FROM pi_pm);

DELETE FROM programmessage_deliverychannels
WHERE programmessagedeliverychannelsid IN (SELECT id FROM event_pm);

DELETE FROM programmessage_emailaddresses
WHERE programmessageemailaddressid IN (SELECT id FROM event_pm);

DELETE FROM programmessage_phonenumbers
WHERE programmessagephonenumberid IN (SELECT id FROM event_pm);

DELETE FROM event_notes
WHERE eventid IN (SELECT eventid FROM event_ids);

DELETE FROM enrollment_notes
WHERE enrollmentid IN (SELECT enrollmentid FROM enrollment_ids);

DELETE FROM note
WHERE noteid NOT IN (
SELECT noteid FROM event_notes
UNION ALL
SELECT noteid FROM enrollment_notes
);

DELETE FROM trackedentitydatavalueaudit
WHERE eventid IN (SELECT eventid FROM event_ids);

DELETE FROM programmessage
WHERE eventid IN (SELECT eventid FROM event_ids);

DELETE FROM programmessage
WHERE enrollmentid IN (SELECT enrollmentid FROM enrollment_ids);

DELETE FROM event
WHERE enrollmentid IN (SELECT enrollmentid FROM enrollment_ids);

DELETE FROM programmessage
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

DELETE FROM relationshipitem
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

DELETE FROM trackedentityattributevalue
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

DELETE FROM trackedentityattributevalueaudit
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

DELETE FROM trackedentityprogramowner
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

DELETE FROM programtempowner
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

DELETE FROM programtempownershipaudit
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

DELETE FROM programownershiphistory
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

DELETE FROM enrollment
WHERE trackedentityid IN (SELECT trackedentityid FROM te);

WITH deleted AS (DELETE FROM trackedentity WHERE trackedentitytypeid IS NULL RETURNING *) SELECT COUNT(*) INTO deleted_count FROM deleted;

RAISE NOTICE 'Total number of TrackedEntities deleted: %', deleted_count;

ELSE
RAISE NOTICE 'No invalid TrackedEntities found for deletion.';
END IF;

DROP TABLE IF EXISTS te, enrollment_ids, event_ids, te_pm, pi_pm, event_pm;

END;
$$;
```