-
-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix measure bottlenecks #337
Conversation
Above commits may seem strange but I had to fix these for the new RCLL tests |
R/PredictionDataSurv.R
Outdated
ok = inherits(distr, c("VectorDistribution", "Matdist", "Arrdist")) && | ||
length(keep) > 1 # edge case: Arrdist(1xYxZ) and keep = FALSE | ||
if (ok) { | ||
pdata$distr = distr[keep] # we can subset row/samples like this | ||
} else { | ||
pdata$distr = base::switch(keep, distr) # one distribution only | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bblodfon can you tell me what's happening here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to filter Distribution
objects (like the prediction object from surv.parametric
) since I noticed we don't do any filtering on those and I needed on the tests to compare RCLL old and new code kind-of. The above is the result of testing some edge cases.
In general you can subset with distr[keep]
when you have a Matdist
, Arrdist
or an VectorDistribution
if you have two or more observations. The problem starts when 1 observation is left. distr[keep]
with a VectorDistribution
leaves a Distribution
(e.g. WeibullAFT
-something in surv.parametric
), a WeightedDiscrete()
for Matdist
, and an Arrdist
of dim 1x...
(it doesn't degenerate to a Matdist
). And that's all fine, we still use pdata$distr = distr[keep]
to do this. But then if you redo some more filtering (on that same specific row index or another to go to 0 observations - a very edge case :), keep
will be either TRUE
or FALSE
, and most of the above are not subsettable or don't work as expected:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for asking this - based on the above, we might want to make an Arrdist
to a Matdist
when only one observation is left by design in distr6
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RaphaelS1 Do you have an opinion on this?
distr = prediction$distr | ||
if (inherits(distr, c("Matdist", "Arrdist"))) { | ||
si = diag(distr$survival(true_times)) | ||
} else { # VectorDistribution or single Distribution, e.g. WeightDisc() | ||
si = as.numeric(distr$survival(data = matrix(true_times, nrow = 1L))) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is from using lrn('surv.parametric')
and filtering to one observation, p$distr
is a WeibullAFT()
which is not a VectorDistribution
, so diag(distr$survival(true_times))
returned 0 and later NaN
in the score
Depends on xoopR/distr6#296