# Under the Hood

## Model

The data are state, action, next-state triples $(s, a, s')$ from a stationary infinite-horizon dynamic discrete choice model with linear flow utility $u_\theta(s, a) = \varphi(s, a)^\top \theta$, discount factor $\beta$, known transition kernels $F_a(s' \mid s)$, and i.i.d. logit shocks. The integrated value function satisfies the soft Bellman fixed point

$$
V_\theta(s) = \log \sum_a \exp\!\bigl(u_\theta(s,a) + \beta \sum_{s'} F_a(s'\mid s)\,V_\theta(s')\bigr).
$$

MPEC treats the value vector $V$ as an explicit optimization variable alongside $\theta$ and enforces the Bellman constraint directly. The full model derivation, standard-error identity, and consistency argument are on the [MPEC overview](../mpec.md).

## Pseudocode

```text
initialize theta; solve V = T_theta(V) for the starting value vector
define the constraint c(theta, V) = V - T_theta(V)

while the constrained optimizer has not stopped:
    compute Q(s, a) from theta, V, transitions, and beta
    compute log pi(a | s) by the log-softmax rule
    evaluate the conditional log likelihood
    evaluate c(theta, V) and its Jacobian via JAX
    update theta and V with SLSQP

return theta, V, policy, standard errors, and constraint diagnostics
```

The joint variable is $x = (\theta, V)$ and the equality constraint has one row per state. Objective gradients and constraint Jacobians come from JAX, not finite differences. The value vector is initialized at the Bellman fixed point of the starting $\theta$, so the optimizer begins feasible and the SQP steps stay near the constraint surface. No Bellman fixed point is solved inside the objective; the constraint carries it.

## Implementation Notes

Standard errors use the implicit-score identity stated on the overview page: at the constrained optimum, per-observation score contributions are computed from $\partial V / \partial \theta$ after convergence, and the robust covariance is their outer product.

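
The implicit-score computation can be illustrated with a small JAX sketch. It is not the `econirl` implementation: the toy arrays `phi` and `F`, the helper names, and the observations are all hypothetical, and it assumes the standard implicit-function result $\partial V / \partial \theta = (I - \partial T / \partial V)^{-1}\, \partial T / \partial \theta$ at the fixed point $V = T_\theta(V)$. Inverting the outer-product matrix in the last line is one common convention and is likewise an assumption about the final form.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

jax.config.update("jax_enable_x64", True)  # double precision for the linear solve

# Hypothetical toy problem: 3 states, 2 actions, 2 utility features.
S, A, K, beta = 3, 2, 2, 0.9
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
phi = jax.random.normal(k1, (S, A, K))                          # phi[s, a, :]
F = jax.nn.softmax(jax.random.normal(k2, (A, S, S)), axis=-1)   # F[a, s, s']

def q_values(theta, V):
    # Q(s, a) = phi(s, a)' theta + beta * sum_s' F_a(s' | s) V(s')
    return phi @ theta + beta * jnp.einsum("ast,t->sa", F, V)

def bellman(theta, V):
    # Soft Bellman operator T_theta(V)
    return logsumexp(q_values(theta, V), axis=1)

def log_pi(theta, V):
    return jax.nn.log_softmax(q_values(theta, V), axis=1)

def solve_V(theta, iters=500):
    # Plain value iteration; T_theta is a beta-contraction.
    V = jnp.zeros(S)
    for _ in range(iters):
        V = bellman(theta, V)
    return V

def dV_dtheta(theta, V):
    # Implicit function theorem applied to V - T_theta(V) = 0
    T_V = jax.jacobian(bellman, argnums=1)(theta, V)        # (S, S)
    T_theta = jax.jacobian(bellman, argnums=0)(theta, V)    # (S, K)
    return jnp.linalg.solve(jnp.eye(S) - T_V, T_theta)      # (S, K)

def score_table(theta, V):
    # Total derivative of log pi(a | s) with respect to theta, shape (S, A, K)
    direct = jax.jacobian(log_pi, argnums=0)(theta, V)      # (S, A, K)
    via_V = jax.jacobian(log_pi, argnums=1)(theta, V)       # (S, A, S)
    return direct + via_V @ dV_dtheta(theta, V)

theta0 = jnp.array([0.5, -0.3])          # stands in for the converged estimate
V0 = solve_V(theta0)

# Hypothetical observed (state, action) pairs
states = jnp.array([0, 1, 2, 0, 1, 2])
actions = jnp.array([0, 1, 0, 1, 0, 1])

G = score_table(theta0, V0)[states, actions]   # per-observation scores, (N, K)
opg = G.T @ G                                  # outer product of the scores
cov = jnp.linalg.inv(opg)                      # assumed covariance convention
```

The linear solve replaces differentiation through the fixed-point iterations, which is why the scores can be computed once, after the optimizer has converged.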
The fitted summary exposes `metadata["final_constraint_violation"]`; gate on it alongside the convergence flag, since a high likelihood with a violated constraint is not a solution.

The estimator lives in `econirl.estimation.mpec`. Use `MPECConfig(solver="sqp")` for the recommended SLSQP path. The `augmented_lagrangian` solver is retained for comparison but is less reliable at high discount factors.
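
For orientation, the constrained problem from the pseudocode can be sketched end to end with JAX derivatives and SciPy's SLSQP. This is a minimal illustration under stated assumptions, not the code in `econirl.estimation.mpec`: the toy model, the simulated data, and every helper name are hypothetical, and `scipy.optimize.minimize` stands in for whatever SLSQP wrapper the library uses.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp
from scipy.optimize import minimize

jax.config.update("jax_enable_x64", True)  # SLSQP works in double precision

# Hypothetical toy problem: 4 states, 2 actions, 2 utility features.
S, A, K, beta = 4, 2, 2, 0.9
rng = np.random.default_rng(0)
phi = jnp.asarray(rng.normal(size=(S, A, K)))         # phi[s, a, :]
F = rng.random(size=(A, S, S))
F = jnp.asarray(F / F.sum(axis=-1, keepdims=True))    # F[a, s, s'], rows sum to 1

def q_values(theta, V):
    return phi @ theta + beta * jnp.einsum("ast,t->sa", F, V)

def bellman(theta, V):
    return logsumexp(q_values(theta, V), axis=1)

def solve_V(theta, iters=500):
    V = jnp.zeros(S)
    for _ in range(iters):
        V = bellman(theta, V)
    return V

# Simulate (state, action) data from the model at a hypothetical true theta.
theta_true = jnp.array([1.0, -0.5])
pi_true = np.asarray(jax.nn.softmax(q_values(theta_true, solve_V(theta_true)), axis=1))
states = rng.integers(0, S, size=2000)
actions = np.array([rng.choice(A, p=pi_true[s]) for s in states])

def split(x):
    # Joint variable x = (theta, V)
    return x[:K], x[K:]

def neg_loglik(x):
    theta, V = split(x)
    log_pi = jax.nn.log_softmax(q_values(theta, V), axis=1)
    return -jnp.mean(log_pi[states, actions])

def constraint(x):
    # c(theta, V) = V - T_theta(V), one row per state
    theta, V = split(x)
    return V - bellman(theta, V)

obj_grad = jax.jit(jax.grad(neg_loglik))
con_jac = jax.jit(jax.jacobian(constraint))

# Start feasible: V is the Bellman fixed point of the starting theta.
theta_start = jnp.zeros(K)
x0 = np.concatenate([np.array(theta_start), np.array(solve_V(theta_start))])

res = minimize(
    fun=lambda x: float(neg_loglik(jnp.asarray(x))),
    x0=x0,
    jac=lambda x: np.array(obj_grad(jnp.asarray(x)), dtype=float),
    method="SLSQP",
    constraints=[{
        "type": "eq",
        "fun": lambda x: np.array(constraint(jnp.asarray(x)), dtype=float),
        "jac": lambda x: np.array(con_jac(jnp.asarray(x)), dtype=float),
    }],
    options={"ftol": 1e-10, "maxiter": 200},
)

theta_hat, V_hat = split(res.x)
violation = float(jnp.max(jnp.abs(constraint(jnp.asarray(res.x)))))
print("converged:", res.success, "theta_hat:", theta_hat, "max |c|:", violation)
```

The last two lines mirror the gating advice above: the sketch reports the optimizer's convergence flag and the maximum absolute Bellman residual together, so that a fit with a violated constraint is not mistaken for a solution.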