Simulation Study

UFXP runs on the same low-dimensional, action-dependent synthetic cell as the other structural estimators. The cell has 21 states, 3 actions, a known linear reward, and known transitions. Exact oracle objects are available for the policy, the value function, the Q-function, and the Type A, Type B, and Type C counterfactuals, so every recovery claim is checked against the truth.

The full results are generated by ufxp_run.py, which writes the machine-readable results file ufxp_results.json. To regenerate them:

cd /path/to/econirl
PYTHONPATH=src:. python validation/estimators/ufxp/run.py

Design

Quantity	Value
States	21
Actions	3
Individuals	2,000
Periods per individual	80
Observations	160,000
Discount factor	0.95
Weighting	optimal

Fit Summary

Quantity	Value
Converged	True
Log-likelihood	-174875.7871
Estimation time	0.64 seconds

Parameter Recovery

Parameter	Truth	Estimate	Std. error	Error (estimate - truth)
action_0_intercept	0.1000	0.0851	0.0295	-0.0149
action_0_progress	0.5000	0.5269	0.0360	+0.0269
action_1_intercept	0.0000	-0.0112	0.0367	-0.0112
action_1_progress	-0.2000	-0.2020	0.0525	-0.0020

Recovery Metrics

Metric	Value
Parameter RMSE	0.0164
Parameter cosine similarity	0.9991
Reward RMSE	0.0083
Value RMSE	0.0170
Q RMSE	0.0193
Policy TV	0.0050
Policy KL	0.000070
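
The parameter-level metrics can be recomputed by hand from the Parameter Recovery table. The sketch below is not taken from the result generator. It uses only the rounded truth and estimate columns shown above, and it assumes that the relative RMSE reported under Numerical Checks is the parameter RMSE divided by the root-mean-square of the true parameter vector. That normalization is an assumption, chosen because it agrees with the reported figure up to rounding of the tabulated inputs.

```python
import math

# Truth and estimate columns copied from the Parameter Recovery table
# (rounded to four decimals, so results match the report only to rounding).
truth = [0.1000, 0.5000, 0.0000, -0.2000]
estimate = [0.0851, 0.5269, -0.0112, -0.2020]

# Signed error per parameter: estimate minus truth.
errors = [e - t for e, t in zip(estimate, truth)]

# Root-mean-square error over the four parameters.
rmse = math.sqrt(sum(d * d for d in errors) / len(errors))

# Cosine similarity between the estimated and true parameter vectors.
dot = sum(e * t for e, t in zip(estimate, truth))
norm_truth = math.sqrt(sum(t * t for t in truth))
norm_estimate = math.sqrt(sum(e * e for e in estimate))
cosine = dot / (norm_truth * norm_estimate)

# ASSUMPTION: relative RMSE = RMSE / RMS(truth). The report does not state
# its normalization; this one is used here only for illustration.
truth_rms = norm_truth / math.sqrt(len(truth))
relative_rmse = rmse / truth_rms

print(f"parameter RMSE:          {rmse:.4f}")
print(f"parameter cosine:        {cosine:.4f}")
print(f"parameter relative RMSE: {relative_rmse:.4f}")
```

With the rounded inputs, the RMSE and cosine similarity agree with the reported 0.0164 and 0.9991, and the relative RMSE comes out close to the reported 0.0598.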

Numerical Checks

Check	Value	Threshold	Status
converged	true	is true	pass
standard_errors_finite	true	is true	pass
parameter_cosine	0.9991	at least 0.98	pass
parameter_relative_rmse	0.0598	at most 0.15	pass
policy_tv	0.0050	at most 0.03	pass
value_rmse	0.0170	at most 0.10	pass
q_rmse	0.0193	at most 0.10	pass
type_a_regret	0.00016	at most 0.05	pass
type_b_regret	0.00030	at most 0.05	pass
type_c_regret	0.00008	at most 0.05	pass
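
The pass/fail logic in the table above amounts to three kinds of rule: a flag that must be true, a lower bound, and an upper bound. A minimal sketch of that evaluation follows. The rule names (is_true, at_least, at_most), the tuple layout, and the passes helper are hypothetical and are not taken from the result generator or from ufxp_results.json. The values and thresholds are copied from the table.

```python
# (check name, observed value, rule kind, threshold) copied from the
# Numerical Checks table. The rule-kind labels are hypothetical.
checks = [
    ("converged", True, "is_true", None),
    ("standard_errors_finite", True, "is_true", None),
    ("parameter_cosine", 0.9991, "at_least", 0.98),
    ("parameter_relative_rmse", 0.0598, "at_most", 0.15),
    ("policy_tv", 0.0050, "at_most", 0.03),
    ("value_rmse", 0.0170, "at_most", 0.10),
    ("q_rmse", 0.0193, "at_most", 0.10),
    ("type_a_regret", 0.00016, "at_most", 0.05),
    ("type_b_regret", 0.00030, "at_most", 0.05),
    ("type_c_regret", 0.00008, "at_most", 0.05),
]


def passes(value, kind, threshold):
    """Return True if the observed value satisfies the rule."""
    if kind == "is_true":
        return value is True
    if kind == "at_least":
        return value >= threshold
    if kind == "at_most":
        return value <= threshold
    raise ValueError(f"unknown rule kind: {kind}")


# Evaluate every check and summarize.
statuses = {name: passes(value, kind, thr) for name, value, kind, thr in checks}
all_pass = all(statuses.values())

for name, ok in statuses.items():
    print(f"{name:26s} {'pass' if ok else 'fail'}")
print("all checks pass:", all_pass)
```

Every check clears its threshold by a wide margin. The largest regret (Type B, 0.00030) is more than two orders of magnitude below its 0.05 bound.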

UFXP also appears on every page of the simulation studies, where it is compared against the full structural and IRL rosters on the bus engine, gridworld, and abstract MDP benchmarks.