- Obtaining the model parameter dimensions via `get_parameter_dims()` no longer requires a compiled Stan model. This leads to a significant performance improvement when applied to `dynamiteformula` objects.
- Model fitting using the `cmdstanr` backend no longer relies on `rstan::read_stan_csv()` to construct the fit object. Instead, the resulting `CmdStanMCMC` object is used directly. This should provide a substantial performance improvement in some instances. For `dynamice()`, samples from different imputed datasets are combined using `cmdstanr::as_cmdstan_fit()` instead.
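As a rough illustration of the first item, the parameter dimensions can be queried directly from a model formula, without compiling or fitting anything. The sketch below uses the `gaussian_example` data shipped with the package and assumes that the `dynamiteformula` method takes `data`, `time`, and `group` arguments in the same way as `get_priors()`; the help page of `get_parameter_dims()` has the exact signature.

```r
library(dynamite)

# Model formula only; no Stan model is compiled or sampled here.
f <- obs(y ~ -1 + z + varying(~ x + lag(y)), family = "gaussian") +
  splines(df = 20)

# Assumed call signature (data, time, group), mirroring get_priors().
dims <- get_parameter_dims(
  f,
  data = gaussian_example,
  time = "time",
  group = "id"
)

# Inspect the returned dimensions.
str(dims)
```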
- Restored and updated the main package vignette. The vignette now also contains a real data example and information on multiple imputation.
- The package data `gaussian_simulation_fit` has been removed to accommodate CRAN package size requirements. The code to generate the data is still available in the `data_raw` directory.
- The main package vignette has been temporarily removed as it contained out-of-date information. Please see the arXiv preprint for up-to-date information instead: https://arxiv.org/abs/2302.01607
- The `type` argument of `coef()` and `plot()` has been replaced by `types` accepting multiple types simultaneously, similar to `as.data.table()` and `as.data.frame()`.
- The functions `plot_betas()`, `plot_deltas()`, `plot_nus()`, `plot_lambdas()` and `plot_psis()` have been deprecated and are now provided via the default plot method by selecting the appropriate `types`.
- A new argument `plot_type` has been added to control what type of plot will be drawn by the `plot()` method. The default value `"default"` draws the posterior means and posterior intervals of all parameters. The old functionality of drawing posterior densities and traceplots is provided by the option `"trace"`.
- The `plot()` method has gained the argument `n_params` to limit the number of parameters drawn at once (per parameter type).
- Both time-varying and time-invariant parameters can now be plotted simultaneously.
- Fixed an issue with `predict()` and `fitted()` for multinomial responses.
- Priors of the cutpoint parameters of the `cumulative` family are now customizable.
- Both `factor` and `ordered factor` responses are now supported for `categorical` and `cumulative` families. In addition, `ordered factor` columns of `data` are no longer converted to `factor` columns.
- Arguments that have different names but the same functionality between `rstan` and `cmdstanr` can now be used interchangeably for either backend, such as `iter` and `iter_samples`.
- The latent factor component was reparametrized for additional robustness. User-visible changes are related to priors: instead of priors on the standard deviations `sigma_lambda` and `tau_psi`, a prior is now defined on `zeta`, the sum of these, as well as on `kappa`, which is the proportion of `zeta` attributable to `sigma_lambda`.
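The plotting changes above can be tried on the example fit shipped with the package. This is a minimal sketch, assuming the `gaussian_example_fit` object contains parameters of types `alpha`, `beta`, and `delta`; the particular type selections are illustrative only.

```r
library(dynamite)

# Example model fit shipped with the package.
fit <- gaussian_example_fit

# Posterior summaries for several parameter types at once.
est <- coef(fit, types = c("alpha", "beta", "delta"))

# Default plot: posterior means and intervals, at most two
# parameters per parameter type.
p_default <- plot(fit, types = c("beta", "delta"), n_params = 2)

# Posterior densities and traceplots (the previous plot() behavior).
p_trace <- plot(fit, plot_type = "trace", types = "beta")
```

Selecting `types = "beta"` in `plot()` is the replacement for the deprecated `plot_betas()`, and similarly for the other deprecated helpers.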
- Estimation of dynamic multivariate panel models with multiple imputation is now available via the function `dynamice()`, which uses the `mice` package.
- The `predict` and `fitted` functions no longer permute the posterior samples when all samples are used, i.e., when `n_draws = NULL` (default). This also corrects the standard error estimates of `loo()`, which were not correct earlier due to the mixing of chains.
- Added an argument `thin` for `loo()`, `predict()` and `fitted()` methods.
- Print method now only prints the run time for the fastest and the slowest chain instead of all chains.
- A new exported function `hmc_diagnostics()` is now available.
- Added a vignette on `get_code()` and `get_data()` functions and how they can be used to modify the generated Stan code and perform variational Bayes inference.
- Contemporaneous dependencies are now allowed between different components of multivariate distributions, e.g., `obs(c(y, x) ~ x | 1, family = "mvgaussian")`.
- Ordered probit and logit regressions are now available via `obs(., family = "cumulative", link = "probit")` and `obs(., family = "cumulative", link = "logit")`, respectively.
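The channel definitions mentioned above can be written out as follows. This is only a sketch: `y` and `x` are placeholder variable names, and the formulas are constructed without any data, so nothing is fitted here.

```r
library(dynamite)

# Ordered probit channel; `y` and `x` are placeholder variable names.
f_probit <- obs(y ~ x, family = "cumulative", link = "probit")

# Ordered logit channel with the same placeholder variables.
f_logit <- obs(y ~ x, family = "cumulative", link = "logit")

# Bivariate gaussian channel where y depends on x contemporaneously
# (formula taken from the entry above).
f_mv <- obs(c(y, x) ~ x | 1, family = "mvgaussian")

# HMC diagnostics for the example fit shipped with the package.
hmc_diagnostics(gaussian_example_fit)
```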
- The package now depends on `data.table` version 1.15.0 or higher and the `ggforce` package.
- Added a `plot` method for `dynamiteformula` objects. This method draws a directed acyclic graph (DAG) of the model structure as a snapshot in time with timepoints from the past and the future equal to the highest-order lag dependency in the model as a `ggplot` object. Alternatively, setting the argument `tikz = TRUE` returns the DAG as a `character` string in TikZ format. See the documentation for more details.
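A small sketch of the DAG plot for a model formula follows. The two-channel model and its variable names are made up for illustration; only the `tikz` argument is taken from the entry above.

```r
library(dynamite)

# A hypothetical two-channel model with first-order lags.
f <- obs(y ~ x + lag(y), family = "gaussian") +
  obs(x ~ lag(x) + lag(y), family = "gaussian")

# DAG of the model structure as a ggplot object.
dag <- plot(f)

# The same DAG as a character string of TikZ code.
dag_tikz <- plot(f, tikz = TRUE)
cat(dag_tikz)
```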
- The formula interface now prohibits additional invalid `fixed()`, `varying()`, and `random()` definitions in `obs()`.
- Fixed an error in Stan code generation if an offset term was included in the model formula.
- Fixed an issue when using `character` type `group` variables.
- Added an option to input custom model code for `dynamite`, which can be used to tweak some aspects of the model (no checks on the compatibility with the post-processing are made).
- Changed the default optimization level for the `cmdstanr` backend to `O0`, as `O1` is not necessarily stable in all cases.
- Added a new argument `full_diagnostics` to the `print()` method which can be used to control the computation of the ESS and Rhat values. By default, these are now computed only for the time- and group-invariant parameters (which are also printed).
- The `print()` method now also warns about possible divergences, treedepth saturation, and low E-BFMI.
- Fixed an error related to `predict()` code generation.
- Made several performance improvements to data parsing.
- `dynamite()` will now retain the original column order of `data` in all circumstances.
- Added a note to the priors vignette regarding default priors for `tau` parameters.
- Fixed the `mcmc_diagnostics()` function so that HMC diagnostics are also checked for models run with the `cmdstanr` backend.
- Fixed the construction of latent factors for categorical responses.
- The `get_data()` method for `dynamitefit` objects now correctly uses the previously defined priors instead of the default ones.
- Fixed a bug in indexing of random effect terms.
- Limited the number of parallel threads used by the `data.table` package to 1 in examples, tests, and vignettes for CRAN.
- The example of the `lfo()` method now uses a single chain and core to avoid a compatibility issue with CRAN.
- Fixed `plot_nus()` for categorical responses.
- Fixed an issue which caused an error in the error message of `predict()` and `fitted()` methods when `newdata` contained duplicate time points within a group.
- Fixed an issue (#72) which caused an NA ELPD value in `lfo()` in case of missing data.
- Fixed an issue with `formula.dynamitefit()` with models defined using `lags()` with a vector `k` argument with more than one value.
- Fixed an issue in the `lfo()` method which resulted in wrong ELPD estimates in the panel data setting.
- Fixed an issue in the `lfo()` method which, in case of lagged responses, caused the ELPD computations to skip the last time points.
- Added further checks and fixes for backwards compatibility with Stan.
- Fixed code generation for intercept-only categorical model.
- Fixed code generation in the transformed data block to be backwards compatible with Stan.
- Fixed an issue in `dynamite()` data parsing that caused substantial memory usage in some instances.
- Fixed an issue with Stan code generation for categorical responses.
- Fixed an issue with `formula.dynamitefit()` with models that had multinomial channels.
- Fixed an issue with `formula.dynamitefit()` when the `df` argument of `splines()` was `NULL`.
- Formulas with `trials()` and `offset()` terms are now properly parsed when using `lags()`.
- Removed experimental shrinkage feature.
- `dynamite()` now supports parallel computation via the reduce-sum functionality of Stan.
- Fixed an issue in `predict()` that resulted in redundant `NAs produced` warnings.
- Fixed an issue with `formula.dynamitefit()` with models that had multivariate channels.
- Fixed a partial argument name issue in the internal `update()` method used by `lfo()`.
- Fixed the regularization of the default priors so that they match with the priors vignette.
- Fixed an issue with the `update()` method for model fit objects without a group variable.
- Fixed an issue with the `update()` method in `lfo()`.
- Fixed an issue with `"tau"` and `"tau_alpha"` type parameters with the `as_draws()` method for categorical responses.
- Fixed an issue with Stan code generation for models with time-varying covariates for categorical responses.
- Fixed an issue with `formula.dynamitefit()` when the model contained a `splines` component.
- Fixed an incorrect URL in the main vignette.
- `"dynamitefit"` objects no longer contain the data used for Stan sampling by default. This data can still be retrieved via `get_data()`.
- Added a new package data `gaussian_simulation_fit` that includes the model fit of the `dynamite_simulation` vignette for the example with time-varying effects.
- The package data `latent_factor_example` and `latent_factor_example_fit` have been removed to accommodate CRAN package size requirements. The code to generate these data is still available in the `data_raw` directory.
- Fixed an issue with `formula.dynamitefit()` when the model formula contained a `lags` component or a `lfactor` component.
- Added support for Student's t-distribution via `"student"` family in `obs()`.
- Added support for the multinomial distribution via `"multinomial"` family in `obs()`. A `trials()` term is now mandatory for multinomial channels.
- The generated Stan code now automatically switches between the array keyword syntax and the deprecated syntax based on the backend Stan version (see https://mc-stan.org/docs/reference-manual/brackets-array-syntax.html for details).
- The presence of variables used in `trials()` and `offset()` is now properly checked in the data.
- The model components `trials()` and `offset()` now function correctly in `predict()` when they contain response variables of the model.
- Fixed the calculation of the number of observations in `nobs()` for models that have multivariate channels.
- Fixed an issue in `predict()` with models that contained multivariate channels with random effects.
- Scenarios that have zero non-missing observations at specific time indices are now handled properly in the Stan code generation.
- The names of additional arguments passed to `rstan::sampling()` and the `sample()` method of the `cmdstanr` Stan model via `...` in the call to `dynamite` are now checked and unrecognized arguments will be ignored.
- Added a new function `get_parameter_dims()` that returns the parameter dimensions of the Stan model for `"dynamitefit"` and `"dynamiteformula"` objects.
- Group-level random effects are now supported also for categorical and multinomial channels.
- Added a new vignette that describes how the package can be used to simulate data from a dynamic multivariate panel model.
- Added a new vignette that describes how the default priors of the model parameters are defined.
- Removed argument `noncentered_lambda` from `lfactor()` as this did not work as intended.
- Added next observation carried backward imputation scheme for fixed predictors in predict as option `"nocb"`.
- Changed naming of `omega` parameters; they now also include the channel name.
- Fixed an issue related to channels with latent factors that did not have any other predictors.
- Improved efficiency of sum-to-zero constraints based on a post by @aaronjg on the Stan forums.
- Fixed several issues related to Stan code generation for the multivariate gaussian distribution.
- The package no longer uses `gregexec()` internally, which made it dependent on R version 4.1.0 or higher.
- Corrected R version dependency to 3.6.0 or higher based on the package dependencies.
- Added support for the multivariate gaussian distribution via `"mvgaussian"` family in `obs()`. See the documentation of the `dynamiteformula()` function for details on how to define multivariate channels.
- Latent factors were previously not used in predict by mistake; this is now fixed. However, due to identifiability constraints no new group levels are allowed with models using latent factors.
- Response variable names of the channels are now processed to avoid invalid variable names in the generated Stan code. Note that these variable names should be used when defining priors and when using methods of the `"dynamitefit"` class. You can use the functions `get_priors()` and `get_parameter_names()` to see the names that are available, as before.
- Optimized prediction code by removing redundant expressions and using better indexing.
- The argument `verbose_stan` is now ignored when `backend = "cmdstanr"`.
- The `stanc_options` argument for defining compiler options when using `cmdstanr` can now be controlled via `dynamite()`.
- Optimized column binding of `"data.table"` objects in `predict()` leading to faster computation.
- The `update()` method now checks if the `backend` has changed from the original model fit.
- The `update()` method now properly recompiles the model (if necessary) in cases where `update()` is used for an already updated `"dynamitefit"` object.
- Fixed a bug in the default prior definitions of intercept for families using log-link which led to `-Inf` prior mean if all observations at the first time point were zero.
- Fixed some issues in the code generation of latent factor components.
- `plot_deltas()` and other plotting functions now throw an error if the user tries to plot parameters of an incorrect type with them.
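The `"student"` family and the `"nocb"` imputation option can be sketched as below. The argument name `impute` is an assumption not stated in the entries above (see the help page of the `predict()` method), the variable names in the formula are placeholders, and the example fit shipped with the package is used only to keep the code self-contained.

```r
library(dynamite)

# Channel with Student's t-distributed response; `y` and `x` are placeholders.
f_student <- obs(y ~ x, family = "student")

# Predictions from the example fit with next observation carried backward
# imputation for fixed predictors (argument name `impute` is assumed).
# A small number of draws keeps the example fast.
pred <- predict(gaussian_example_fit, impute = "nocb", n_draws = 2)
head(pred)
```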
- `dynamite()` now supports general group-level random effects. New `random()` works analogously with `varying()` inside `obs()`, and the new optional `random_spec()` component can be used to define whether the random effects should be correlated or not and whether to use noncentered parameterization.
- The package no longer depends on the `bayesplot` package. Instead, `ggplot2` and `patchwork` packages are used for the `plot` method.
- Argument order of the `dynamite()` function has been changed: `time` now precedes `group` and `backend` now precedes `verbose`. This change is also reflected in the `get_data()`, `get_priors()`, and `get_code()` functions.
- Vectorized priors and various indexing variables are now passed as data to Stan instead of being hard-coded in the generated model code.
- The package now supports contemporaneous dependencies between channels such that the dependency structure is acyclic. For example, having `y ~ x` and `x ~ z` simultaneously is valid, but adding `z ~ y` to these would result in a cycle.
- The output of `mcmc_diagnostics()` is now clearer.
- The default value of the `summary` argument was changed to `FALSE` in `as.data.frame()` and `as.data.table()` methods, whereas it is now hard-coded to `TRUE` in the `summary()` method. The column ordering of the output of these methods was also changed so that the estimate columns are placed before the extra columns such as `time`.
- The standard deviation of the default priors for spline coefficient standard deviations is now scaled based on the data analogously with regression coefficients.
- Added argument `parameters` to `as.data.frame()` and similar methods as well as for the plotting functions.
- Added functions `get_parameter_types()` and `get_parameter_names()` for extracting model parameter types and names respectively.
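The parameter helpers and the `parameters` and `summary` arguments can be combined roughly as follows. The sketch relies on the `gaussian_example_fit` object shipped with the package and assumes it contains at least one `beta` parameter.

```r
library(dynamite)

# Example model fit shipped with the package.
fit <- gaussian_example_fit

# Parameter types present in the model, and the names of the "beta" parameters.
get_parameter_types(fit)
beta_names <- get_parameter_names(fit, types = "beta")

# Individual posterior draws for the selected parameters
# (summary = FALSE is the default).
draws <- as.data.frame(fit, parameters = beta_names)

# Posterior summaries instead of individual draws.
summ <- as.data.frame(fit, parameters = beta_names, summary = TRUE)
```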
- Fixed a name clash issue in Stan code generation.
- The package no longer depends on the development version of the `data.table` package.
- Removed the Grunfeld example from vignette due to CRAN file size restrictions.
- `multichannel_example` and the corresponding fit were modified: the standard deviation parameter of the Gaussian channel used in the data generation was decreased in order to make the example in the vignette more interesting.
- The latent factor model was also modified by removing the `random()` component in order to reduce the size of the model fit object.
- Fixed the name extraction of the supplied data.
- `plot_deltas()` no longer unnecessarily warns about missing values.
- Increased the version number to 1.0.0 to reflect the fact that the package is now fully functional and has successfully passed the rOpenSci review.
- `get_priors()`, `get_code()`, and `get_data()` now support the case without a `group` argument, as per issue #48.
- Fixed some typos and other issues in the vignette raised by @nicholasjclark during the rOpenSci review process.
- Added an example on simulating from the prior predictive distribution to the documentation of `predict()`.
- Declarations now occur before statements in the generated Stan code.
- Added support for `cmdstanr` via argument `backend` in `dynamite`.
- Added a link to the contributing guidelines to README.
- The package no longer depends on the development version of `rstan`.
- Dropped R version dependency from 4.1.0 to 3.5.0.
- Moved `dplyr` and `tidyr` to 'Suggests'.
- `categorical_logit()` is now used instead of `categorical_logit_glm()` on older `rstan` and `cmdstanr` versions.
- Random intercepts with `random()` now also support centered parametrization.
- Added more comments to the generated Stan code.
- Fixed the output of `formula.dynamitefit()` so that it is now compatible with the `update()` method. Also added the required `"call"` object to the `"dynamitefit"` object.
- Added `loo()` and `lfo()` methods for the dynamite models which can be used for approximate leave-one-out and leave-future-out cross validation.
- Cleaned up NAMESPACE.
- The `env` argument of `data.table()` is now used to avoid possible variable name conflicts.
- Breaking change: The shrinkage parameter which was previously named as `lambda` is now `xi` in order to free `lambda` for factor loadings parameter as is customary in factor analysis.
- Added support for correlated latent dynamic factors (modeled as splines).
- `get_code()` applied to fitted model now correctly returns only the model code and not the `stanmodel` object.
- Fixed the `.draw` column of the `as.data.frame()` output.
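Approximate leave-one-out cross-validation can be run on an existing fit along these lines. The sketch calls the generic through the `loo` package namespace, which is an assumption about how the method is registered, and uses the example fit shipped with the package.

```r
library(dynamite)

# Approximate leave-one-out cross-validation for the example fit
# shipped with the package; dispatches to the dynamitefit method.
loo_fit <- loo::loo(gaussian_example_fit)
print(loo_fit)
```

Leave-future-out cross-validation via `lfo()` refits the model repeatedly and therefore needs a working Stan backend, so it is not shown here.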
- Improved the memory usage of `predict()` and `fitted()` by separating the simulated values from the predictors that are independent of the posterior draws.
- Added support for summarized predictions via a new argument `funs`; this can further significantly reduce memory usage when individual-level predictions are not of interest.
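A sketch of summarized predictions with `funs` is given below. It assumes that `funs` is a named list keyed by response variable, with each element a named list of summary functions, and that the result then separates simulated summaries from the observed data; both points should be checked against the help page of the `predict()` method.

```r
library(dynamite)

# Summarize predictions of the response `y` over individuals with mean()
# for each posterior draw and time point, instead of storing
# individual-level predictions. The structure of `funs` is assumed.
pred <- predict(
  gaussian_example_fit,
  funs = list(y = list(mean = mean)),
  n_draws = 2
)

# Inspect the top-level structure of the result.
str(pred, max.level = 1)
```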
- First version of `dynamite`