Count mismatch between hadm_id is not NaN and disposition=ADMITTED #1573
rddelosreyes asked this question in MIMIC-IV-ED
I've recently been replicating the preprocessing steps applied to MIMIC-ED in this paper: [Benchmarking emergency department prediction models with machine learning and public electronic health records](https://www.nature.com/articles/s41597-022-01782-9). The number of hospitalizations they report does not match mine (I'm aware that they use v1 and I use v2.2). Their count is 208,976 in v1, which becomes 203,016 in v2.2, but mine is only 158,010 in v2.2. Their code labels a stay as a hospitalization when hadm_id is not NaN, whereas I use disposition=ADMITTED. I then checked the dispositions of the edstays rows where hadm_id is not NaN and found the following:
May I kindly ask why this is the case and which labeling scheme should be used? I expected the number of rows where hadm_id is not NaN to equal the number of rows with disposition=ADMITTED. I also checked whether duplicated hadm_id values could explain the gap, but found only 575 duplicates, which does not account for the difference of about 45k (203,016 vs. 158,010). I would also not expect hadm_id to have duplicates at all.
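The comparison described above can be reproduced with a short script. This is only a minimal sketch: it assumes the edstays table has columns named stay_id, hadm_id and disposition, that a missing hadm_id appears as an empty CSV field, and it runs on a few made-up rows rather than real MIMIC data. The function name `summarize_labels` is hypothetical.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; real data would come from edstays.csv.
SAMPLE_CSV = """stay_id,hadm_id,disposition
1,100,ADMITTED
2,,HOME
3,101,HOME
4,102,ADMITTED
5,102,TRANSFER
6,,ADMITTED
"""


def summarize_labels(rows):
    """Compare the two admission labels on an iterable of edstays-like dicts."""
    by_disposition = Counter()  # dispositions among stays that have an hadm_id
    hadm_counts = Counter()     # number of stays sharing each hadm_id
    n_admitted = 0
    for row in rows:
        hadm_id = (row.get("hadm_id") or "").strip()
        disposition = (row.get("disposition") or "").strip()
        if disposition == "ADMITTED":
            n_admitted += 1
        if hadm_id:  # an empty CSV field plays the role of NaN here
            by_disposition[disposition] += 1
            hadm_counts[hadm_id] += 1
    n_with_hadm = sum(hadm_counts.values())
    return {
        "n_with_hadm_id": n_with_hadm,
        "n_admitted": n_admitted,
        # rows beyond the first occurrence of each hadm_id
        "n_extra_rows_from_duplicate_hadm_id": n_with_hadm - len(hadm_counts),
        "disposition_given_hadm_id": dict(by_disposition),
    }


summary = summarize_labels(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(summary)
```

On the real table, replacing the sample with `csv.DictReader(open("edstays.csv", newline=""))` should give the same breakdown (the file name is an assumption). With pandas, `edstays.loc[edstays["hadm_id"].notna(), "disposition"].value_counts()` would produce the equivalent disposition counts.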