Count mismatch between hadm_id is not NaN and disposition=ADMITTED #1573

rddelosreyes · 2023-06-22T07:01:58Z

I've recently been replicating the preprocessing steps applied to MIMIC-ED in this paper: [Benchmarking emergency department prediction models with machine learning and public electronic health records](https://www.nature.com/articles/s41597-022-01782-9). The number of hospitalizations they report does not match mine (I'm aware that they use v1 while I use v2.2). Their count is 208,976 in v1, which corresponds to 203,016 in v2.2, whereas mine is only 158,010 in v2.2. Their code labels a stay as a hospitalization when hadm_id is not NaN, whereas I use disposition=ADMITTED. I then checked the dispositions of the edstays rows where hadm_id is not NaN and found the following:

ADMITTED: 157,626
HOME: 36,497
TRANSFER: 4,696
OTHER: 2,548
ELOPED: 1,182
LEFT AGAINST MEDICAL ADVICE: 343
LEFT WITHOUT BEING SEEN: 110
EXPIRED: 14
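
For anyone who wants to reproduce a breakdown like the one above, here is a minimal pandas sketch. It assumes the edstays table has been loaded into a DataFrame with `hadm_id` and `disposition` columns; the function name `disposition_breakdown` and the toy rows are made up for illustration and are not real MIMIC data.

```python
import pandas as pd


def disposition_breakdown(edstays: pd.DataFrame) -> pd.Series:
    """Count dispositions among ED stays that have a non-null hadm_id."""
    linked = edstays[edstays["hadm_id"].notna()]
    return linked["disposition"].value_counts()


# Toy rows for illustration only (NOT real MIMIC data).
toy = pd.DataFrame(
    {
        "stay_id": [1, 2, 3, 4, 5],
        "hadm_id": [100.0, 101.0, None, 102.0, None],
        "disposition": ["ADMITTED", "HOME", "HOME", "ADMITTED", "ADMITTED"],
    }
)

breakdown = disposition_breakdown(toy)
print(breakdown)
```

On the real table, the sum of this breakdown should equal the number of rows with a non-null hadm_id (the counts listed above add up to 203,016).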

May I kindly ask why this is the case and which labeling scheme should be used? I expected the count of hadm_id is not NaN to equal the count of disposition=ADMITTED. I checked whether duplicated hadm_id values could explain the gap, but found only 575 duplicates, which does not account for the ~45k difference in counts (and I would not expect hadm_id to have duplicates at all).
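
A duplicate check of this kind could be sketched as follows. This is not the exact code I ran; it assumes an edstays DataFrame with a `hadm_id` column, and the helper name `duplicated_hadm_rows` and the toy rows are hypothetical. It reports both the number of extra rows and the number of distinct hadm_id values shared by more than one stay, since "duplicates" can be counted either way.

```python
import pandas as pd


def duplicated_hadm_rows(edstays: pd.DataFrame) -> pd.DataFrame:
    """Return every ED stay whose non-null hadm_id appears on more than one row."""
    linked = edstays.dropna(subset=["hadm_id"])
    return linked[linked.duplicated(subset="hadm_id", keep=False)]


# Toy rows for illustration only (NOT real MIMIC data).
toy = pd.DataFrame(
    {
        "stay_id": [1, 2, 3, 4],
        "hadm_id": [100.0, 100.0, 101.0, None],
        "disposition": ["HOME", "ADMITTED", "ADMITTED", "HOME"],
    }
)

dups = duplicated_hadm_rows(toy)

# Rows beyond the first occurrence of each hadm_id.
n_extra_rows = int(toy["hadm_id"].dropna().duplicated().sum())

# Distinct hadm_id values that are shared by several stays.
n_shared_ids = int(dups["hadm_id"].nunique())

print(dups)
print(n_extra_rows, n_shared_ids)
```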
