Comparing Estimators
Use this table to narrow the choice of estimator before opening a method-specific page. The Evidence status column states the scope of the current public documentation.
| Estimator | Use when | Data / transition requirement | Reward target | State scale | Avoid when | Evidence status |
|---|---|---|---|---|---|---|
| | You need the reference structural DDC likelihood and counterfactual policy analysis. | Discrete panel data; transitions known or estimated first. | Finite parametric structural reward. | Small or moderate tabular state-action spaces. | Repeated exact Bellman solves are too expensive or transition modeling is the main bottleneck. | Synthetic tabular simulation. |
| | You want a faster Hotz-Miller or NPL tabular structural estimate. | Discrete panel data; transitions known or estimated first; strong empirical action support. | Finite parametric structural reward. | Small or moderate tabular state-action spaces. | Many states have weak or one-action support, or you need the direct nested fixed-point likelihood. | Synthetic tabular simulation with support conditions. |
| | You want a constrained-optimization check on the DDC likelihood. | Discrete panel data; transitions known or estimated first. | Finite parametric structural reward. | Moderate tabular state-action spaces. | The Bellman constraint is too large for a constrained optimizer or fast repeated runs are central. | Synthetic constrained-likelihood simulation. |
| | You want maximum-likelihood-grade structural estimates without nesting any fixed point in the search. | Discrete panel data; transitions known or estimated first; most states visited. | Finite parametric structural reward. | Small to large tabular state-action spaces (one up-front factorization). | State coverage is very thin everywhere, or you need the exact finite-sample MLE benchmark. | Synthetic tabular simulation. |
| | The value object is too large, smooth, or encoded for repeated exact dynamic programming. | Discrete panel data; transitions known or estimated before estimation. | Finite parametric structural reward with neural value approximation. | Larger, encoded, smooth, or multidimensional states. | The reward itself must be an unrestricted neural function or transition estimation is the main problem. | Synthetic low- and high-dimensional structural DDC simulations. |
| | Transition-density modeling is hard but the reward has known finite features. | Panel trajectories with current and next state-action information; transition environment still needed for post-fit counterfactuals. | Finite linear structural reward. | Encoded or higher-dimensional discrete states. | State space is small enough for tabular likelihood methods, support is sparse, or the target is a neural reward map. | Encoded-state finite-theta hard case with locally robust standard errors. |
| | Demonstrations should be explained by maximum causal entropy feature matching. | Demonstrations from a discrete dynamic decision problem; transitions known or supplied. | Supplied finite reward features. | Tabular state-action spaces. | You need likelihood-based structural standard errors or reward features are unknown. | Synthetic supplied-feature simulations. |
| | You need nonlinear reward-map recovery under the MCE objective. | Demonstrations and supplied state encodings; transitions known or supplied; a reward anchor available. | Neural reward map, evaluated through the recovered reward matrix. | Encoded discrete states. | You need identified finite structural parameters or raw spatial inputs outside the current scope. | Synthetic anchored neural reward-map simulations. |
| | Adversarial recovery under the original state-only AIRL assumptions is the research object. | Demonstrations from a discrete dynamic decision problem; transitions available for validation or post-fit evaluation. | State-only reward with shaping term. | Discrete dynamic decision settings. | Reward is action-dependent, an absorbing-state normalization is central, or structural action-dependent recovery is required. | Synthetic state-only AIRL simulation. |
| | Latent segments have different dynamic preferences and segment-specific counterfactuals matter. | Repeated user trajectories; credible anchor action and absorbing-state normalization. | Segment-specific action-dependent reward. | Encoded discrete dynamic choice settings. | Segment membership is weakly identified, no credible reward anchor exists, or a homogeneous estimator is enough. | Synthetic serialized-content simulation. |
| | The study question is state-marginal matching under an f-divergence. | Demonstrations plus transitions for policy evaluation. | State-only reward. | Discrete state-marginal settings. | You need generic action-dependent structural DDC reward recovery. | Synthetic state-marginal simulation. |
| | You want neural Q and continuation modeling with anchor-moment reward projection. | Dynamic discrete choices; known transitions; credible anchor action with known rewards. | Projected structural reward from neural Q/continuation objects. | High-dimensional encoded state features. | No credible anchor action exists or you need tabular structural estimation. | Preview: projected reward diagnostics. |
| | Inverse soft-Q learning or imitation diagnostics are the estimator of interest. | Demonstrations plus transitions for inverse Bellman reward calculation; strong expert support needed for structural interpretation. | Bellman-implied reward from learned soft Q-values. | Tabular or neural Q diagnostics depending on configuration. | You need the package’s strongest structural counterfactual evidence. | Preview: imitation and Q diagnostics. |

The tabular structural estimators are the usual starting point for small dynamic discrete choice problems. Approximate structural estimators are for larger state representations while keeping a finite reward target. IRL estimators are for reward recovery from demonstrations, with scope determined by the reward form, support, anchors, and transition information.
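
As a rough illustration of the narrowing logic in the paragraph above, the sketch below encodes the three estimator families as a small helper. The names `Family` and `suggest_family`, and the two boolean inputs, are hypothetical and introduced only for this example; they are not part of any package API. The function picks a family only, and the table rows still decide the specific estimator.

```python
from enum import Enum


class Family(Enum):
    """Hypothetical labels for the three estimator families named above."""
    TABULAR_STRUCTURAL = "tabular structural"
    APPROXIMATE_STRUCTURAL = "approximate structural"
    IRL = "IRL"


def suggest_family(reward_recovery_from_demonstrations: bool,
                   small_tabular_state_space: bool) -> Family:
    """Return a starting family under the simplified rules of the summary.

    Assumption: two yes/no questions are enough for a first cut. Support,
    anchors, and transition information still need to be checked against
    the table before settling on an estimator.
    """
    if reward_recovery_from_demonstrations:
        # IRL estimators target reward recovery from demonstrations.
        return Family.IRL
    if small_tabular_state_space:
        # Usual starting point for small dynamic discrete choice problems.
        return Family.TABULAR_STRUCTURAL
    # Larger state representations while keeping a finite reward target.
    return Family.APPROXIMATE_STRUCTURAL


if __name__ == "__main__":
    print(suggest_family(False, True).value)
    print(suggest_family(False, False).value)
    print(suggest_family(True, False).value)
```

The ordering of the checks is a design choice for this sketch: the demonstration question is asked first because the IRL rows differ in reward target, not only in state scale.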