Simulation Study

UFXP runs on the same low-dimensional, action-dependent synthetic cell as the other structural estimators. The cell has 21 states, 3 actions, a known linear reward, and known transitions. It also ships exact oracle objects for the policy, the value function, the Q function, and the Type A, Type B, and Type C counterfactuals, so every recovery claim is checked against the truth.
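As a rough illustration of how exact oracle objects can be obtained when the reward and transitions are known, the sketch below solves a 21-state, 3-action problem by soft (logit) value iteration. It is not the project's generator. The logit choice model, the `progress` feature, the deterministic `next_state` rule, and the normalization of action 2 to zero reward are all assumptions made for illustration; only the state and action counts, the discount factor, and the four true coefficients are taken from this page.

```python
import math

N_STATES, N_ACTIONS, BETA = 21, 3, 0.95

# True coefficients (intercept, progress slope) for actions 0 and 1.
# Assumption: action 2 is the normalized baseline with zero reward.
THETA = {0: (0.10, 0.50), 1: (0.00, -0.20), 2: (0.00, 0.00)}


def reward(s, a):
    """Linear reward; `progress` is a hypothetical feature scaled to [0, 1]."""
    progress = s / (N_STATES - 1)
    intercept, slope = THETA[a]
    return intercept + slope * progress


def next_state(s, a):
    """Hypothetical deterministic transitions: advance, stay, or reset."""
    if a == 0:
        return min(s + 1, N_STATES - 1)
    if a == 1:
        return s
    return 0


def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def q_from_v(V):
    return [[reward(s, a) + BETA * V[next_state(s, a)] for a in range(N_ACTIONS)]
            for s in range(N_STATES)]


def solve(tol=1e-10, max_iter=10_000):
    """Iterate the soft Bellman operator until the sup-norm change is below tol."""
    V = [0.0] * N_STATES
    for _ in range(max_iter):
        V_new = [logsumexp(row) for row in q_from_v(V)]
        gap = max(abs(x - y) for x, y in zip(V, V_new))
        V = V_new
        if gap < tol:
            break
    Q = q_from_v(V)
    # Logit choice probabilities, normalized row by row.
    policy = [[math.exp(q - logsumexp(row)) for q in row] for row in Q]
    return V, Q, policy


V, Q, policy = solve()
print(f"V[0] = {V[0]:.4f}, policy at state 0 = {[round(p, 4) for p in policy[0]]}")
```

Because the soft Bellman operator is a contraction with modulus equal to the discount factor, the iteration converges from any starting point, which is what makes exact reference objects cheap to compute on a cell this small.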

The full result generator is ufxp_run.py, which writes the machine-readable results file ufxp_results.json. To regenerate the results, run:

cd /path/to/econirl
PYTHONPATH=src:. python validation/estimators/ufxp/run.py

Design

| Quantity | Value |
| --- | --- |
| States | 21 |
| Actions | 3 |
| Individuals | 2,000 |
| Periods per individual | 80 |
| Observations | 160,000 |
| Discount factor | 0.95 |
| Weighting | optimal |

Fit Summary

| Quantity | Value |
| --- | --- |
| Converged | True |
| Log-likelihood | -174875.7871 |
| Estimation time | 0.64 seconds |

Parameter Recovery

| Parameter | Truth | Estimate | Std. error | Error |
| --- | --- | --- | --- | --- |
| action_0_intercept | 0.1000 | 0.0851 | 0.0295 | -0.0149 |
| action_0_progress | 0.5000 | 0.5269 | 0.0360 | +0.0269 |
| action_1_intercept | 0.0000 | -0.0112 | 0.0367 | -0.0112 |
| action_1_progress | -0.2000 | -0.2020 | 0.0525 | -0.0020 |

Recovery Metrics

| Metric | Value |
| --- | --- |
| Parameter RMSE | 0.0164 |
| Parameter cosine similarity | 0.9991 |
| Reward RMSE | 0.0083 |
| Value RMSE | 0.0170 |
| Q RMSE | 0.0193 |
| Policy TV | 0.0050 |
| Policy KL | 0.000070 |
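The parameter-level metrics can be recomputed from the Truth and Estimate columns of the Parameter Recovery table. The sketch below assumes the usual definitions: RMSE as the root mean squared error, cosine similarity between the two parameter vectors, and relative RMSE as the error norm divided by the truth norm. These definitions are assumptions rather than the project's documented formulas, and because the inputs are rounded to four decimals the results can differ from the reported figures in the last digit.

```python
import math

# Values copied from the Parameter Recovery table (rounded to four decimals).
truth = [0.1000, 0.5000, 0.0000, -0.2000]
estimate = [0.0851, 0.5269, -0.0112, -0.2020]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


errors = [e - t for e, t in zip(estimate, truth)]

# Root mean squared error across the four parameters.
rmse = math.sqrt(sum(d * d for d in errors) / len(errors))

# Cosine similarity between the estimated and true parameter vectors.
dot = sum(e * t for e, t in zip(estimate, truth))
cosine = dot / (norm(estimate) * norm(truth))

# Assumed definition: error norm relative to the truth norm.
relative_rmse = norm(errors) / norm(truth)

print(f"rmse={rmse:.4f} cosine={cosine:.4f} relative_rmse={relative_rmse:.4f}")
```

Under these assumed definitions the recomputed values agree with the reported Parameter RMSE, Parameter cosine similarity, and parameter_relative_rmse to within rounding.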

Numerical Checks

| Check | Value | Threshold | Status |
| --- | --- | --- | --- |
| converged | true | is true | pass |
| standard_errors_finite | true | is true | pass |
| parameter_cosine | 0.9991 | at least 0.98 | pass |
| parameter_relative_rmse | 0.0598 | at most 0.15 | pass |
| policy_tv | 0.0050 | at most 0.03 | pass |
| value_rmse | 0.0170 | at most 0.10 | pass |
| q_rmse | 0.0193 | at most 0.10 | pass |
| type_a_regret | 0.00016 | at most 0.05 | pass |
| type_b_regret | 0.00030 | at most 0.05 | pass |
| type_c_regret | 0.00008 | at most 0.05 | pass |
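The pass/fail logic in the Numerical Checks table amounts to comparing each value against its threshold rule. A minimal sketch of that comparison follows, using the values and thresholds reported above. The `CHECKS` list, the `COMPARATORS` mapping, and the `evaluate` helper are hypothetical names introduced for illustration and are not taken from the project's code or from the layout of ufxp_results.json.

```python
import operator

# Map each threshold rule in the table to a comparison function.
COMPARATORS = {
    "is true": lambda value, _threshold: value is True,
    "at least": operator.ge,
    "at most": operator.le,
}

# (check name, reported value, rule, threshold), copied from the table.
CHECKS = [
    ("converged", True, "is true", None),
    ("standard_errors_finite", True, "is true", None),
    ("parameter_cosine", 0.9991, "at least", 0.98),
    ("parameter_relative_rmse", 0.0598, "at most", 0.15),
    ("policy_tv", 0.0050, "at most", 0.03),
    ("value_rmse", 0.0170, "at most", 0.10),
    ("q_rmse", 0.0193, "at most", 0.10),
    ("type_a_regret", 0.00016, "at most", 0.05),
    ("type_b_regret", 0.00030, "at most", 0.05),
    ("type_c_regret", 0.00008, "at most", 0.05),
]


def evaluate(checks):
    """Return a {name: 'pass' | 'fail'} mapping for the given checks."""
    return {
        name: "pass" if COMPARATORS[rule](value, threshold) else "fail"
        for name, value, rule, threshold in checks
    }


statuses = evaluate(CHECKS)
for name, status in statuses.items():
    print(f"{name}: {status}")
```

Keeping the rule as data rather than hard-coding each comparison makes it straightforward to tighten a threshold or add a check without touching the evaluation logic.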

UFXP also appears on every page of the simulation studies, where it is compared against the full structural and IRL rosters on the bus engine, gridworld, and abstract MDP benchmarks.