# Comparing Estimators

Use this table to narrow the estimator choice before opening a method-specific
page. The Evidence status column records what the current public documentation
covers for each estimator.

| Estimator | Use when | Data / transition requirement | Reward target | State scale | Avoid when | Evidence status |
| --- | --- | --- | --- | --- | --- | --- |
| [NFXP](nfxp.md) | You need the reference structural DDC likelihood and counterfactual policy analysis. | Discrete panel data; transitions known or estimated first. | Finite parametric structural reward. | Small or moderate tabular state-action spaces. | Repeated exact Bellman solves are too expensive or transition modeling is the main bottleneck. | Synthetic tabular simulation. |
| [CCP](ccp.md) | You want a faster Hotz-Miller or NPL tabular structural estimate. | Discrete panel data; transitions known or estimated first; strong empirical action support. | Finite parametric structural reward. | Small or moderate tabular state-action spaces. | Many states have weak or one-action support, or you need the direct nested fixed-point likelihood. | Synthetic tabular simulation with support conditions. |
| [MPEC](mpec.md) | You want a constrained-optimization check on the DDC likelihood. | Discrete panel data; transitions known or estimated first. | Finite parametric structural reward. | Moderate tabular state-action spaces. | The Bellman constraint is too large for a constrained optimizer or fast repeated runs are central. | Synthetic constrained-likelihood simulation. |
| [UFXP](ufxp.md) | You want maximum-likelihood-grade structural estimates without nesting any fixed point in the search. | Discrete panel data; transitions known or estimated first; most states visited. | Finite parametric structural reward. | Small to large tabular state-action spaces (one up-front factorization). | State coverage is very thin everywhere, or you need the exact finite-sample MLE benchmark. | Synthetic tabular simulation. |
| [NNES](nnes.md) | The value object is too large, smooth, or encoded for repeated exact dynamic programming. | Discrete panel data; transitions known or estimated first. | Finite parametric structural reward with neural value approximation. | Larger, encoded, smooth, or multidimensional states. | The reward itself must be an unrestricted neural function or transition estimation is the main problem. | Synthetic low- and high-dimensional structural DDC simulations. |
| [TD-CCP](tdccp.md) | Transition-density modeling is hard but the reward has known finite features. | Panel trajectories with current and next state-action information; transition environment still needed for post-fit counterfactuals. | Finite linear structural reward. | Encoded or higher-dimensional discrete states. | State space is small enough for tabular likelihood methods, support is sparse, or the target is a neural reward map. | Encoded-state finite-theta hard case with locally robust standard errors. |
| [MCE-IRL](mce_irl.md) | Demonstrations should be explained by maximum causal entropy feature matching. | Demonstrations from a discrete dynamic decision problem; transitions known or supplied. | Supplied finite reward features. | Tabular state-action spaces. | You need likelihood-based structural standard errors or reward features are unknown. | Synthetic supplied-feature simulations. |
| [Deep MCE-IRL](deep_mce_irl.md) | You need nonlinear reward-map recovery under the MCE objective. | Demonstrations and supplied state encodings; transitions known or supplied; a reward anchor available. | Neural reward map, evaluated through the recovered reward matrix. | Encoded discrete states. | You need identified finite structural parameters or raw spatial inputs outside the current scope. | Synthetic anchored neural reward-map simulations. |
| [AIRL](airl.md) | Adversarial recovery under the original state-only AIRL assumptions is the research object. | Demonstrations from a discrete dynamic decision problem; transitions available for validation or post-fit evaluation. | State-only reward with shaping term. | Discrete dynamic decision settings. | Reward is action-dependent, an absorbing-state normalization is central, or structural action-dependent recovery is required. | Synthetic state-only AIRL simulation. |
| [AIRL-Het](airl_het.md) | Latent segments have different dynamic preferences and segment-specific counterfactuals matter. | Repeated user trajectories; credible anchor action and absorbing-state normalization. | Segment-specific action-dependent reward. | Encoded discrete dynamic choice settings. | Segment membership is weakly identified, no credible reward anchor exists, or a homogeneous estimator is enough. | Synthetic serialized-content simulation. |
| [f-IRL](f_irl.md) | The study question is state-marginal matching under an f-divergence. | Demonstrations plus transitions for policy evaluation. | State-only reward. | Discrete state-marginal settings. | You need generic action-dependent structural DDC reward recovery. | Synthetic state-marginal simulation. |
| [GLADIUS](gladius.md) | You want neural Q and continuation modeling with anchor-moment reward projection. | Dynamic discrete choices; known transitions; credible anchor action with known rewards. | Projected structural reward from neural Q/continuation objects. | High-dimensional encoded state features. | No credible anchor action exists or you need tabular structural estimation. | Preview: projected reward diagnostics. |
| [IQ-Learn](iq_learn.md) | Inverse soft-Q learning or imitation diagnostics are the estimator of interest. | Demonstrations plus transitions for inverse Bellman reward calculation; strong expert support needed for structural interpretation. | Bellman-implied reward from learned soft Q-values. | Tabular or neural Q diagnostics depending on configuration. | You need the package's strongest structural counterfactual evidence. | Preview: imitation and Q diagnostics. |

The tabular structural estimators are the usual starting point for small
dynamic discrete choice problems. The approximate structural estimators handle
larger state representations while keeping a finite reward target. The IRL
estimators recover rewards from demonstrations; their scope depends on the
reward form, support, anchors, and transition information.
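
As a rough illustration of using the table as a first-pass filter, the sketch
below copies the Reward target and Evidence status cells into a plain Python
dictionary and searches them by keyword. The names `TABLE` and `shortlist` are
hypothetical helpers for this example only; they are not part of any package
API, and a keyword match is no substitute for reading the method-specific page.

```python
# Hypothetical helper: the cells are copied verbatim from the comparison table.
# Each entry maps estimator name -> (reward target, evidence status).
TABLE = {
    "NFXP": ("Finite parametric structural reward.",
             "Synthetic tabular simulation."),
    "CCP": ("Finite parametric structural reward.",
            "Synthetic tabular simulation with support conditions."),
    "MPEC": ("Finite parametric structural reward.",
             "Synthetic constrained-likelihood simulation."),
    "UFXP": ("Finite parametric structural reward.",
             "Synthetic tabular simulation."),
    "NNES": ("Finite parametric structural reward with neural value approximation.",
             "Synthetic low- and high-dimensional structural DDC simulations."),
    "TD-CCP": ("Finite linear structural reward.",
               "Encoded-state finite-theta hard case with locally robust standard errors."),
    "MCE-IRL": ("Supplied finite reward features.",
                "Synthetic supplied-feature simulations."),
    "Deep MCE-IRL": ("Neural reward map, evaluated through the recovered reward matrix.",
                     "Synthetic anchored neural reward-map simulations."),
    "AIRL": ("State-only reward with shaping term.",
             "Synthetic state-only AIRL simulation."),
    "AIRL-Het": ("Segment-specific action-dependent reward.",
                 "Synthetic serialized-content simulation."),
    "f-IRL": ("State-only reward.",
              "Synthetic state-marginal simulation."),
    "GLADIUS": ("Projected structural reward from neural Q/continuation objects.",
                "Preview: projected reward diagnostics."),
    "IQ-Learn": ("Bellman-implied reward from learned soft Q-values.",
                 "Preview: imitation and Q diagnostics."),
}


def shortlist(keyword, include_preview=True):
    """Return estimator names whose reward-target cell contains `keyword`.

    Matching is a case-insensitive substring test. Rows whose evidence status
    starts with "Preview" are dropped when `include_preview` is False.
    """
    needle = keyword.lower()
    return [
        name
        for name, (reward, evidence) in TABLE.items()
        if needle in reward.lower()
        and (include_preview or not evidence.startswith("Preview"))
    ]


if __name__ == "__main__":
    print(shortlist("state-only"))
    print(shortlist("neural", include_preview=False))
```

For example, `shortlist("finite parametric")` returns the five rows whose
reward target is a finite parametric structural reward. The Use when and Avoid
when columns still decide among them.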