Comparing Estimators

Use this table to narrow the estimator choice before opening a method-specific page. The Evidence status column states the scope of the current public documentation for each estimator.

| Estimator | Use when | Data / transition requirement | Reward target | State scale | Avoid when | Evidence status |
| --- | --- | --- | --- | --- | --- | --- |
| NFXP | You need the reference structural DDC likelihood and counterfactual policy analysis. | Discrete panel data; transitions known or estimated first. | Finite parametric structural reward. | Small or moderate tabular state-action spaces. | Repeated exact Bellman solves are too expensive or transition modeling is the main bottleneck. | Synthetic tabular simulation. |
| CCP | You want a faster Hotz-Miller or NPL tabular structural estimate. | Discrete panel data; transitions known or estimated first; strong empirical action support. | Finite parametric structural reward. | Small or moderate tabular state-action spaces. | Many states have weak or one-action support, or you need the direct nested fixed-point likelihood. | Synthetic tabular simulation with support conditions. |
| MPEC | You want a constrained-optimization check on the DDC likelihood. | Discrete panel data; transitions known or estimated first. | Finite parametric structural reward. | Moderate tabular state-action spaces. | The Bellman constraint is too large for a constrained optimizer or fast repeated runs are central. | Synthetic constrained-likelihood simulation. |
| UFXP | You want maximum-likelihood-grade structural estimates without nesting any fixed point in the search. | Discrete panel data; transitions known or estimated first; most states visited. | Finite parametric structural reward. | Small to large tabular state-action spaces (one up-front factorization). | State coverage is very thin everywhere, or you need the exact finite-sample MLE benchmark. | Synthetic tabular simulation. |
| NNES | The value object is too large, smooth, or encoded for repeated exact dynamic programming. | Discrete panel data; transitions known or estimated before estimation. | Finite parametric structural reward with neural value approximation. | Larger, encoded, smooth, or multidimensional states. | The reward itself must be an unrestricted neural function or transition estimation is the main problem. | Synthetic low- and high-dimensional structural DDC simulations. |
| TD-CCP | Transition-density modeling is hard but the reward has known finite features. | Panel trajectories with current and next state-action information; transition environment still needed for post-fit counterfactuals. | Finite linear structural reward. | Encoded or higher-dimensional discrete states. | State space is small enough for tabular likelihood methods, support is sparse, or the target is a neural reward map. | Encoded-state finite-theta hard case with locally robust standard errors. |
| MCE-IRL | Demonstrations should be explained by maximum causal entropy feature matching. | Demonstrations from a discrete dynamic decision problem; transitions known or supplied. | Supplied finite reward features. | Tabular state-action spaces. | You need likelihood-based structural standard errors or reward features are unknown. | Synthetic supplied-feature simulations. |
| Deep MCE-IRL | You need nonlinear reward-map recovery under the MCE objective. | Demonstrations and supplied state encodings; transitions known or supplied; a reward anchor available. | Neural reward map, evaluated through the recovered reward matrix. | Encoded discrete states. | You need identified finite structural parameters or raw spatial inputs outside the current scope. | Synthetic anchored neural reward-map simulations. |
| AIRL | Adversarial recovery under the original state-only AIRL assumptions is the research object. | Demonstrations from a discrete dynamic decision problem; transitions available for validation or post-fit evaluation. | State-only reward with shaping term. | Discrete dynamic decision settings. | Reward is action-dependent, an absorbing-state normalization is central, or structural action-dependent recovery is required. | Synthetic state-only AIRL simulation. |
| AIRL-Het | Latent segments have different dynamic preferences and segment-specific counterfactuals matter. | Repeated user trajectories; credible anchor action and absorbing-state normalization. | Segment-specific action-dependent reward. | Encoded discrete dynamic choice settings. | Segment membership is weakly identified, no credible reward anchor exists, or a homogeneous estimator is enough. | Synthetic serialized-content simulation. |
| f-IRL | The study question is state-marginal matching under an f-divergence. | Demonstrations plus transitions for policy evaluation. | State-only reward. | Discrete state-marginal settings. | You need generic action-dependent structural DDC reward recovery. | Synthetic state-marginal simulation. |
| GLADIUS | You want neural Q and continuation modeling with anchor-moment reward projection. | Dynamic discrete choices; known transitions; credible anchor action with known rewards. | Projected structural reward from neural Q/continuation objects. | High-dimensional encoded state features. | No credible anchor action exists or you need tabular structural estimation. | Preview: projected reward diagnostics. |
| IQ-Learn | Inverse soft-Q learning or imitation diagnostics are the estimator of interest. | Demonstrations plus transitions for inverse Bellman reward calculation; strong expert support needed for structural interpretation. | Bellman-implied reward from learned soft Q-values. | Tabular or neural Q diagnostics depending on configuration. | You need the package’s strongest structural counterfactual evidence. | Preview: imitation and Q diagnostics. |

The tabular structural estimators are the usual starting point for small dynamic discrete choice (DDC) problems. Approximate structural estimators handle larger state representations while keeping a finite reward target. IRL estimators recover rewards from demonstrations, and their scope is determined by the reward form, support, anchors, and transition information.
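As a rough illustration of how the table can be used to narrow the choice, the sketch below copies the Reward target and State scale cells into a plain Python dictionary and filters them by keyword. The names `ESTIMATORS` and `shortlist` are hypothetical and are not part of any package API, and simple keyword matching is only a stand-in for the judgment the other columns ask for.

```python
# Hypothetical lookup, not a package API. Each value is
# (Reward target cell, State scale cell), copied from the comparison table.
ESTIMATORS = {
    "NFXP": ("Finite parametric structural reward.",
             "Small or moderate tabular state-action spaces."),
    "CCP": ("Finite parametric structural reward.",
            "Small or moderate tabular state-action spaces."),
    "MPEC": ("Finite parametric structural reward.",
             "Moderate tabular state-action spaces."),
    "UFXP": ("Finite parametric structural reward.",
             "Small to large tabular state-action spaces (one up-front factorization)."),
    "NNES": ("Finite parametric structural reward with neural value approximation.",
             "Larger, encoded, smooth, or multidimensional states."),
    "TD-CCP": ("Finite linear structural reward.",
               "Encoded or higher-dimensional discrete states."),
    "MCE-IRL": ("Supplied finite reward features.",
                "Tabular state-action spaces."),
    "Deep MCE-IRL": ("Neural reward map, evaluated through the recovered reward matrix.",
                     "Encoded discrete states."),
    "AIRL": ("State-only reward with shaping term.",
             "Discrete dynamic decision settings."),
    "AIRL-Het": ("Segment-specific action-dependent reward.",
                 "Encoded discrete dynamic choice settings."),
    "f-IRL": ("State-only reward.",
              "Discrete state-marginal settings."),
    "GLADIUS": ("Projected structural reward from neural Q/continuation objects.",
                "High-dimensional encoded state features."),
    "IQ-Learn": ("Bellman-implied reward from learned soft Q-values.",
                 "Tabular or neural Q diagnostics depending on configuration."),
}


def shortlist(reward_keyword: str, scale_keyword: str = "") -> list[str]:
    """Return estimator names whose cells contain both keywords (case-insensitive).

    An empty keyword matches every row, so either filter can be skipped.
    """
    reward_kw = reward_keyword.lower()
    scale_kw = scale_keyword.lower()
    return [
        name
        for name, (reward, scale) in ESTIMATORS.items()
        if reward_kw in reward.lower() and scale_kw in scale.lower()
    ]


if __name__ == "__main__":
    print(shortlist("finite parametric", "tabular"))
    print(shortlist("state-only"))
```

Here `shortlist("finite parametric", "tabular")` returns the four rows that pair a finite parametric reward with a tabular state scale (NFXP, CCP, MPEC, UFXP), and `shortlist("state-only")` returns AIRL and f-IRL. The Avoid when and Evidence status columns still need to be read by hand before settling on a method.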