# MCE-IRL

**Reference PDF:** `papers/econirl_package/primers/mce_irl/mce_irl.pdf`.

MCE-IRL learns reward parameters by matching expert feature expectations under
the maximum causal entropy policy. It is part of the 12-estimator known-truth
validation suite.
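The feature-matching idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the `MCEIRLEstimator` implementation: it assumes an infinite-horizon discounted problem, a reward linear in the features, and a discounted state occupancy. The function names (`soft_policy`, `feature_expectations`, `feature_matching_gradient`) and array layouts are hypothetical and chosen only for this example.

```python
import numpy as np


def soft_policy(reward, transitions, gamma=0.95, tol=1e-10, max_iter=10_000):
    """Soft value iteration (assumed discounted setting).

    reward: (S, A) array; transitions: (A, S, S) array with
    transitions[a, s, t] = P(t | s, a). Returns pi[s, a].
    """
    v = np.zeros(reward.shape[0])
    for _ in range(max_iter):
        # Q(s, a) = r(s, a) + gamma * E[V(s') | s, a]
        q = reward + gamma * np.einsum("ast,t->sa", transitions, v)
        # Numerically stable log-sum-exp over actions (the "soft max")
        q_max = q.max(axis=1)
        v_new = q_max + np.log(np.exp(q - q_max[:, None]).sum(axis=1))
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    q = reward + gamma * np.einsum("ast,t->sa", transitions, v)
    unnormalized = np.exp(q - q.max(axis=1, keepdims=True))
    return unnormalized / unnormalized.sum(axis=1, keepdims=True)


def feature_expectations(policy, transitions, features, initial_dist, gamma=0.95):
    """Expected features under the discounted occupancy; features: (S, A, K)."""
    # State-to-state kernel induced by the policy
    p_pi = np.einsum("sa,ast->st", policy, transitions)
    n_states = p_pi.shape[0]
    # Solve d = (1 - gamma) * rho0 + gamma * P_pi^T d for the occupancy d
    occupancy = (1.0 - gamma) * np.linalg.solve(
        np.eye(n_states) - gamma * p_pi.T, initial_dist
    )
    return np.einsum("s,sa,sak->k", occupancy, policy, features)


def feature_matching_gradient(theta, expert_features, transitions, features,
                              initial_dist, gamma=0.95):
    """Expert minus model feature expectations at reward parameters theta."""
    policy = soft_policy(features @ theta, transitions, gamma)
    model_features = feature_expectations(
        policy, transitions, features, initial_dist, gamma
    )
    return expert_features - model_features
```

Under these assumptions, the returned difference vanishes exactly when the model's feature expectations match the expert's, which is the condition the estimator is fitted to satisfy. A gradient-based optimizer would step `theta` along this direction until the residual is small.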

## Validation Status

**Pass.** The gated artifact is generated by
`papers/econirl_package/primers/mce_irl/mce_irl_run.py` from the shared
known-truth harness. It validates the low-level `MCEIRLEstimator` directly with
known transitions and known action-dependent reward features.

The primary cell is `mce_low_high_reward`: 25 states, 3 actions, 8
action-dependent reward features, 3,000 individuals, and 100 periods. It passes
10/10 gates, including feature residual, occupancy moment residual, normalized
reward RMSE, policy total variation (TV), normalized value RMSE, normalized Q
RMSE, and Type A/B/C counterfactual regret.
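As a rough illustration of the policy gate, the total variation distance between two tabular policies can be computed per state and then aggregated. The sketch below is an assumption about the general shape of such a metric, not the harness's code: the function name `policy_tv` and the choice of aggregation (a plain mean over states, or an optional weighted mean) are hypothetical.

```python
import numpy as np


def policy_tv(policy_a, policy_b, state_weights=None):
    """Total variation between two (S, A) policies, aggregated over states.

    Per state, TV = 0.5 * sum_a |pi_a(a | s) - pi_b(a | s)|.
    The aggregation rule is an assumption made for this sketch.
    """
    a = np.asarray(policy_a, dtype=float)
    b = np.asarray(policy_b, dtype=float)
    per_state = 0.5 * np.abs(a - b).sum(axis=1)
    if state_weights is None:
        return float(per_state.mean())
    w = np.asarray(state_weights, dtype=float)
    # Weighted mean, e.g. by an empirical state distribution
    return float(per_state @ (w / w.sum()))
```

The value lies in [0, 1], with 0 meaning the two policies choose actions with identical probabilities in every state.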

Reward, value, and Q metrics use the standard IRL location-and-scale
normalization before RMSE is computed. Policy and counterfactual gates are not
normalized. Raw parameter cosine is not used as an MCE-IRL validation gate.
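To make the normalization concrete, here is a minimal sketch of a location-and-scale normalized RMSE. It assumes that "location and scale" means subtracting the mean and dividing by the standard deviation of each array; the shared harness may use a different convention, and the names `location_scale_normalize` and `normalized_rmse` are hypothetical.

```python
import numpy as np


def location_scale_normalize(x, eps=1e-12):
    """Center to zero mean and rescale to unit standard deviation (assumed convention)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    scale = centered.std()
    # Guard against division by zero for constant arrays
    return centered / max(scale, eps)


def normalized_rmse(estimate, truth):
    """RMSE between the normalized estimate and the normalized truth."""
    diff = location_scale_normalize(estimate) - location_scale_normalize(truth)
    return float(np.sqrt(np.mean(diff ** 2)))
```

With this convention, an estimate that equals the truth up to a positive affine transformation scores zero, which is why the metric is suited to rewards that are only identified up to location and scale. A sign-flipped estimate is still penalized.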

## Usage Scope

Use MCE-IRL when transitions are known or supplied and the reward features are
explicit. For multi-action structural recovery, pass a `RewardSpec` to `fit()`
or pass a `feature_matrix` at construction time. The wrapper no longer silently
treats `feature_matrix=None` as a validated structural default for multi-action
models.

For the neural reward variant, see `docs/estimators/deep_mce_irl.md`. Its
primary validation artifact is the anchored recovered reward matrix; projected
finite parameters are diagnostic only unless the supplied feature basis is
well-conditioned.

## Artifacts

- PDF source: `papers/econirl_package/primers/mce_irl/mce_irl.tex`
- Result generator: `papers/econirl_package/primers/mce_irl/mce_irl_run.py`
- Shared DGP harness: `experiments/known_truth.py`
- Results: `papers/econirl_package/primers/mce_irl/mce_irl_results.json`