-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
138 lines (110 loc) · 7.38 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
fig.align = "center",
fig.width = 9,
fig.height = 6
)
```
```{r srr-tags, eval = FALSE, echo = FALSE}
#' @srrstats {G1.1, G1.2}
```
# dynamite: Bayesian Modeling and Causal Inference for Multivariate Longitudinal Data <a href="https://docs.ropensci.org/dynamite/"><img src="man/figures/logo.png" align="right" height="139"/></a>
<!-- badges: start -->
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![R-CMD-check](https://github.com/ropensci/dynamite/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/dynamite/actions)
[![Codecov test coverage](https://codecov.io/gh/ropensci/dynamite/branch/main/graph/badge.svg)](https://app.codecov.io/gh/ropensci/dynamite?branch=main)
[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/554_status.svg)](https://github.com/ropensci/software-review/issues/554)
[![dynamite status badge](https://ropensci.r-universe.dev/badges/dynamite)](https://ropensci.r-universe.dev)
[![dynamite CRAN badge](https://www.r-pkg.org/badges/version/dynamite)](https://cran.r-project.org/package=dynamite)
<!-- badges: end -->
The `dynamite` [R](https://www.r-project.org/) package provides an easy-to-use interface for Bayesian inference of complex panel (time series) data comprising of multiple measurements per multiple individuals measured in time via dynamic multivariate panel models (DMPM). The main features distinguishing the package and the underlying methodology from many other approaches are:
* Support for regular time-invariant effects, group-level random effects, and time-varying effects modeled via Bayesian P-splines.
* Joint modeling of multiple measurements per individual (multiple channels) based directly on the assumed data-generating process. Individual channels can be univariate or multivariate.
* Support for various distributions: Currently Gaussian, Multivariate Gaussian, Student t, Categorical, Ordered, Multinomial, Poisson, Bernoulli, Binomial, Negative Binomial, Gamma, Exponential, and Beta distributions are available, and these can be combined arbitrarily in multichannel models.
* Allows evaluating realistic long-term counterfactual predictions that take into account the dynamic structure of the model by efficient posterior predictive distribution simulation.
* Transparent quantification of parameter and predictive uncertainty due to a fully Bayesian approach.
* Various visualization methods including a method for drawing and producing a TikZ code of the directed acyclic graph (DAG) of the model structure.
* User-friendly and efficient R interface with state-of-the-art estimation via Stan. Both `rstan` and `cmdstanr` backends are supported, with both parallel chains and within-chain parallelization.
The `dynamite` package is developed with the support of the Research Council of Finland grant 331817 ([PREDLIFE](https://sites.utu.fi/predlife/en/)). For further information on DMPMs and the `dynamite` package, see the related papers:
* Helske J. and Tikka S. (2024). Estimating Causal Effects from Panel Data with Dynamic Multivariate Panel Models. *Advances in Life Course Research*, 60, 100617. ([Journal version](https://doi.org/10.1016/j.alcr.2024.100617), [SocArXiv](https://osf.io/preprints/socarxiv/mdwu5/) preprint)
* Tikka S. and Helske J. (2024). `dynamite`: An R Package for Dynamic Multivariate Panel Models. ([arXiv](https://arxiv.org/abs/2302.01607) preprint)
## Installation
You can install the most recent stable version of `dynamite` from [CRAN](https://cran.r-project.org/package=dynamite) or the development version from [R-universe](https://r-universe.dev/search) by running one the following lines:
```{r, eval = FALSE}
install.packages("dynamite")
install.packages("dynamite", repos = "https://ropensci.r-universe.dev")
```
## Example
A single-channel model with time-invariant effect of `z`, time-varying effect of `x`, lagged value of the response variable `y` and a group-specific random intercepts:
```{r, echo = FALSE}
library("dynamite")
ggplot2::theme_set(ggplot2::theme_bw())
```
```{r, eval = FALSE}
set.seed(1)
library("dynamite")
gaussian_example_fit <- dynamite(
obs(y ~ -1 + z + varying(~ x + lag(y)) + random(~1), family = "gaussian") +
splines(df = 20),
data = gaussian_example, time = "time", group = "id",
iter = 2000, chains = 2, cores = 2, refresh = 0
)
```
```{r, echo = FALSE}
set.seed(1)
library("dynamite")
gaussian_example_fit <- update(
gaussian_example_fit,
iter = 2000, warmup = 1000, thin = 1,
chains = 2, cores = 2, refresh = 0
)
```
Summary of the model:
```{r}
print(gaussian_example_fit)
```
Posterior estimates of time-varying effects:
```{r, fig.width = 9, fig.height = 4}
plot(gaussian_example_fit, types = c("alpha", "delta"), scales = "free")
```
And group-specific intercepts (for first 10 groups):
```{r, fig.width = 9, fig.height = 4}
plot(gaussian_example_fit, types = "nu", groups = 1:10)
```
Traceplots and density plots for time-invariant parameters:
```{r, fig.width = 9, fig.height = 4}
plot(gaussian_example_fit, plot_type = "trace", types = "beta")
```
Posterior predictive samples for the first 4 groups (using the samples based on the posterior distribution of the model parameters and observed data on the first time point):
```{r, warning=FALSE, fig.width = 9, fig.height = 4}
library("ggplot2")
pred <- predict(gaussian_example_fit, n_draws = 100)
pred |>
dplyr::filter(id < 5) |>
ggplot(aes(time, y_new, group = .draw)) +
geom_line(alpha = 0.25) +
# observed values
geom_line(aes(y = y), colour = "tomato") +
facet_wrap(~id) +
theme_bw()
```
Visualizing the model structure as a DAG (a snapshot at time `t`):
```{r, fig.width = 4, fig.height = 4}
plot(gaussian_example_fit, plot_type = "dag", show_covariates = TRUE)
```
For more examples, see the package vignettes and the [blog post about dynamite](https://ropensci.org/blog/2023/01/31/dynamite-r-package/).
## Related packages
- The `dynamite` package uses Stan via [`rstan`](https://CRAN.R-project.org/package=rstan) and [`cmdstanr`](https://mc-stan.org/cmdstanr/) (see also https://mc-stan.org), which is a probabilistic programming language for general Bayesian modelling.
- The [`brms`](https://CRAN.R-project.org/package=brms) package also uses Stan, and can be used to fit various complex multilevel models.
- Regression modeling with time-varying coefficients based on kernel smoothing and least squares estimation is available in package [`tvReg`](https://CRAN.R-project.org/package=tvReg). The [`tvem`](https://CRAN.R-project.org/package=tvem) package provides similar functionality for gaussian, binomial and poisson responses with [`mgcv`](https://CRAN.R-project.org/package=mgcv) backend.
- [`plm`](https://CRAN.R-project.org/package=plm) contains various methods to estimate linear models for panel data, e.g., fixed effect models.
- [`lavaan`](https://CRAN.R-project.org/package=lavaan) provides tools for structural equation modeling, and as such can be used to model various panel data models as well.
## Contributing
Contributions are very welcome, see [CONTRIBUTING.md](https://github.com/ropensci/dynamite/blob/main/.github/CONTRIBUTING.md) for general guidelines.