Robert Kubinec and Michal Ovádek 2024-11-22
Note: This is a beta release of idealstan v1.0.
While most features have been implemented and are stable, there may be
bugs that have not been sorted out. To report bugs with the package,
please file an issue on the Github
page.
At present, idealstan is only available on Github as one
of its main dependencies, cmdstanr is also not on CRAN. To
use the package, cmdstanr must be first be set up with a
local installation of cmdstan, which is used for
estimation. To see how to install cmdstanr, see this guide. Note that the
cmdstanr default installation location should be used when
installing cmdstan.
To install this package, type the command
remotes::install_github('saudiwin/idealstan') at the R
console prompt (you first must have the remotes package
installed from CRAN for this to work). The best way to learn how the
package works is to look at the package vignettes, which can be accessed
from the package
website, especially Introduction
to Idealstan.
If you use this package, please cite the following:
Kubinec, Robert. “Generalized Ideal Point Models for Robust Measurement with Dirty Data in the Social Sciences”. SocArchiv (2024). doi:10.31219/osf.io/8j2bt.
The paper is available from this link.
This package implements IRT (item response theory) ideal point models, which are models designed for situations in which actors make strategic choices that correlate with a unidimensional scale, such as the left-right axis in American politics. Compared to traditional IRT, ideal point models use a similar parameterization (the 2-Pl variant) but without the strong assumption that all items load in the same direction (i.e., higher ability). For more information, I refer you to my paper about IRT and ideal point models, documenting many of the features in the package.
The goal of idealstan is to offer both standard
IRT/ideal point models and additional models for missing data,
time-varying ideal points and diverse responses, such as binary,
ordinal, count, continuous and positive-continuous outcomes. In
addition, idealstan uses the Stan estimation engine to
offer full Bayesian inference (with some options for approximate
inference) for all models so that every model is estimated with
uncertainty. Models can also have mixed outcomes, such as discrete and
continuous responses.
The approach to handling missing data in this package is to model directly strategic censoring in observations. While this kind of missing data pattern can be found in many situations in which data is not missing at random, this particular version was developed to account for legislatures in which legislators (persons) are strategically absent for votes on bills (items). This approach to missing data can be usefully applied to many contexts in which a missing outcome is a function of the person’s ideal point (i.e., people will tend to be present in the data when the item is far away or very close to their ideal point).
The package also includes ordinal ideal point models to handle
situations in which a ranked outcome is polarizing, such as a legislator
who can vote yes, no or to abstain. Because idealstan uses
Bayesian inference, it can model any kind of ordinal data even if there
aren’t an even distribution of ordinal categories for each item.
The package also has extensive plotting functions via
ggplot2 for model parameters, particularly the legislator
(person) ideal points (ability parameters).