Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 102 #113

Merged
merged 9 commits into from
Nov 1, 2021
Merged

Issue 102 #113

merged 9 commits into from
Nov 1, 2021

Conversation

assignUser
Copy link
Collaborator

Closes #102
Closes #105

This is ready but I do want to merge the touchstone changes back into this from main to check how/if the performance changes.
I added some tests and looked over everything again and it looks good to me. Let me know what you think.

@assignUser assignUser requested a review from kgoldfeld October 31, 2021 14:02
@assignUser assignUser self-assigned this Oct 31, 2021
@assignUser assignUser mentioned this pull request Oct 31, 2021
@kgoldfeld
Copy link
Owner

Should we be worried that all these checks failed?

@assignUser
Copy link
Collaborator Author

Should we be worried that all these checks failed?

I'll have a look after #112

@kgoldfeld
Copy link
Owner

I built and installed from issue-102 branch to see how things are working. R crashes when I run this simple code

b_0 <- 0
b_1 <- c(-3, 3)
b_2 <- c(-1, 1)
b_12 <- matrix(c(-1, -1, -1, 3), nrow = 2)

d1 <- defData(varname = "a", formula = ".5;.5", variance = "1;2", dist = "categorical")
d1 <- defData(d1, varname = "b", formula = ".5;.5", variance = "1;2", dist = "categorical")

set.seed(1)
dx <- genData(3, d1)

But now, I can't build and install anything from within RStudio - getting this message:

==> Rcpp::compileAttributes()

* Updated R/RcppExports.R

==> R CMD INSTALL --no-multiarch --with-keep.source simstudy

* installing to library/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library* installing *source* packagesimstudy...
** using staged installation
make: Nothing to be done for `all'.
** libs
installing to /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/00LOCK-simstudy/00new/simstudy/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location

 *** caught segfault ***
address 0x0, cause 'invalid permissions'
An irrecoverable exception occurred. R is aborting now ...
sh: line 1: 17735 Segmentation fault: 11  R_TESTS= '/Library/Frameworks/R.framework/Resources/bin/R' --no-save --no-restore --no-echo 2>&1 < '/var/folders/jr/19fz7v9976gbf62sf66f7zz40000gq/T//RtmpudpQAP/file453012e04ccd'
ERROR: loading failed
* removing ‘/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/simstudy’
* restoring previous ‘/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/simstudy’

Exited with status 1.

This might take me a little while to figure out.

@assignUser
Copy link
Collaborator Author

assignUser commented Oct 31, 2021

@kgoldfeld try updating RCPP I have recently switched over to linux so all my packages are completely uptodate and RCPP automatically added this: https://github.com/kgoldfeld/simstudy/pull/113/files#diff-13dc68d7d985baff5c1cac41e1471ed9fe4db63a2bea80c2cdf20fd795541062R8-R12

That is the most likely cause I think...

@assignUser
Copy link
Collaborator Author

I can run the code posted above without issue... but the error with the vignette is caused by it not being to able to run a mixture dist correctly, this is a data.table issue I dont fully understand, maybe you can have a look and let me know what the issue is?

defblk <- defData(varname = "blksize", 
   formula = "..sizes[1] | .5 + ..sizes[2] | .5", dist = "mixture")
sizes <- c(2, 4)
genData(1000, defblk)

gives this error

Error in `[.data.table`(dtSim, , `:=`(newVar, eval(parse(text = as.character(formula2parse)))),  : 
  Supplied 1000 items to be assigned to group 1 of size 1 in column 'newVar'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.

@kgoldfeld
Copy link
Owner

Just wanted to let you know that I am able to build and install again. And the vectorization does not crash my computer any more - it seems to work. I'll check out the other issue this afternoon.

@kgoldfeld
Copy link
Owner

That mixture error is weird - it works just fine under the current CRAN version as well as the development branch (main). Something you did to enable the vectorization is wrecking havoc. I guess you know that already? And just trying to understand what is going wrong? I'll look into it more.

@assignUser
Copy link
Collaborator Author

Yes it is the keyby argument in the data.table formula. I have to read up on data table more but what is happening, I think is that keyby = id makes each row a separate group but the formula for the mixture gives a n-length vector when evaluatef.

Usually we would just use an if to not use keyby for those cases.... the issue is that I don now how we could easily detect the difference in the formulas...

@assignUser
Copy link
Collaborator Author

So... I have what I would loath to call a solution... but well here it is:

res <- with(e, {
      expr <- parse(text = as.character(formula2parse))
      tryCatch(
        expr = dtSim[, newVar := eval(expr), keyby = def_id],
        error = function(err) {
          if (grepl("RHS length must either be 1", gettext(err), fixed = T)) {
            dtSim[, newVar := eval(expr)]
          } else {
            stop(gettext(err))
          }
        }
      )
      copy(dtSim$newVar)
    })

It is not pretty but it works xD in view of simstudy 1.0 (or the "overhaul ) which will change the whole system I think this solution is adequate for now.

I have pushed the changes, let me know if you think this is acceptable for now.

p.s. I (or rather devtools) removed the knitted vignettes from the /vignettes folder as is expected by cran(https://blog.r-hub.io/2020/06/03/vignettes/#overview-of-vignettes-states)

@assignUser
Copy link
Collaborator Author

assignUser commented Oct 31, 2021

rebuilding worked on my end so I don't know why the gha checks fail, could you check out this branch and test if you can build the vignettes? nevermind I was looking at the wrong check result

@assignUser
Copy link
Collaborator Author

@kgoldfeld my "solution" passes all checks :D give it a try and if everything is ok please merge #112 so I can merge it into this branch too and we can check out the performance implications.

@kgoldfeld kgoldfeld marked this pull request as ready for review October 31, 2021 23:31
@kgoldfeld kgoldfeld merged commit 7d872c9 into main Nov 1, 2021
@assignUser assignUser deleted the issue-102 branch November 3, 2021 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Working with vectorized variables in formulas vector/matrix reference within double dot notation
2 participants