Using your own data
EconIRL fits a panel of decisions. Each row is one observed choice: who decided, the state they were in, and the action they took. This page takes a panel from raw data to a fitted model and one counterfactual.
The examples use the bundled bus-engine dataset as a stand-in for your own panel. Replace it with your DataFrame and the rest stays the same.
What your data needs
One row per observed decision. Three columns name the pieces.
| Column | What it holds |
|---|---|
| state | The state when the decision was made, as an integer label. |
| action | The discrete action that was chosen, as an integer label. |
| id | The individual, unit, or trajectory the row belongs to. |
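Raw data often records continuous measurements or text labels rather than integer labels. The sketch below shows one way to convert them with pandas. The raw column names (odometer, event), the values, and the 5,000-mile bin width are all made up for illustration and are not part of econirl.

```python
import pandas as pd

# Hypothetical raw records: names and values are illustrative only.
raw = pd.DataFrame({
    "bus_id":   [1, 1, 1, 2, 2],
    "odometer": [1200.0, 6400.0, 11800.0, 300.0, 5100.0],
    "event":    ["keep", "keep", "replace", "keep", "keep"],
})

BIN_WIDTH = 5000  # assumed bin width in miles

panel = pd.DataFrame({
    "bus_id": raw["bus_id"],
    # Integer state label: the mileage bin the odometer reading falls in.
    "mileage_bin": (raw["odometer"] // BIN_WIDTH).astype(int),
    # Integer action label: 1 for replace, 0 for keep.
    "replaced": (raw["event"] == "replace").astype(int),
})
print(panel)
```

The bin width sets the number of states, so it is worth choosing with the model's n_states in mind.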
A next_state column is optional. When it is missing, the tabular estimators
read transitions from the order of states within each id.
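Reading transitions from row order can be pictured with a pandas shift within each id. This is only a sketch of the idea, not econirl's internal code. The toy panel and the period column used for sorting are assumptions.

```python
import pandas as pd

# Toy panel for illustration; "period" is assumed to give the time order.
panel = pd.DataFrame({
    "bus_id":      [1, 1, 1, 2, 2],
    "period":      [0, 1, 2, 0, 1],
    "mileage_bin": [0, 1, 2, 0, 1],
    "replaced":    [0, 0, 1, 0, 0],
})

# Order rows in time within each individual, then look one row ahead.
panel = panel.sort_values(["bus_id", "period"])
panel["next_state"] = panel.groupby("bus_id")["mileage_bin"].shift(-1)

# The last row of each individual has no successor, so it yields no transition.
transitions = panel.dropna(subset=["next_state"]).astype({"next_state": int})
print(transitions)
```

Because the successor is taken from the next row, the rows of each individual must be in time order before fitting.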
Two ways to call econirl
Most users hand econirl a DataFrame and name the columns.
from econirl.datasets import load_rust_bus
from econirl import NFXP
df = load_rust_bus() # your DataFrame goes here
model = NFXP(n_states=90, discount=0.9999, utility="linear_cost")
model.fit(df, state="mileage_bin", action="replaced", id="bus_id")
If you are replicating a paper or running a simulation study, you can build the model from transition and feature arrays instead. The estimator pages under Estimators show that path where it applies.
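For orientation, a transition array can be estimated from observed (state, action, next state) triples by counting. The sketch below uses NumPy only. The axis order (action, state, next state) and the toy data are assumptions, so check the estimator pages for the layout econirl expects.

```python
import numpy as np

n_states, n_actions = 3, 2

# Hypothetical observed triples, one entry per decision.
s  = np.array([0, 1, 0, 1, 2, 2])   # state
a  = np.array([0, 0, 0, 1, 1, 0])   # action
s2 = np.array([1, 2, 0, 0, 0, 2])   # next state

# Count how often each (action, state) pair led to each next state.
counts = np.zeros((n_actions, n_states, n_states))
np.add.at(counts, (a, s, s2), 1)

# Normalize each row to probabilities; rows never observed stay all zero.
row_totals = counts.sum(axis=2, keepdims=True)
P = np.divide(counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0)
print(P)
```

Rows that stay at zero mark (action, state) pairs the data never visits, which would need smoothing or a parametric transition model before use.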
Check the data before you fit
First, confirm the panel is well formed.
from econirl.preprocessing import check_panel_structure
report = check_panel_structure(df, id_col="bus_id", period_col="period",
state_col="mileage_bin", action_col="replaced")
print(report.valid, report.n_individuals, report.n_observations)
print(report.warnings)
True 90 9410
['Unbalanced panel: 80-120 periods per individual']
The panel is usable. The warning notes that individuals are observed for different numbers of periods, which these estimators allow.
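The same two facts can be inspected by hand with pandas. This sketch does not reproduce check_panel_structure; it only shows, on a made-up panel, how to count periods per individual and look for repeated (id, period) pairs.

```python
import pandas as pd

# Toy panel for illustration; individual 1 has three periods, individual 2 has two.
panel = pd.DataFrame({
    "bus_id":      [1, 1, 1, 2, 2],
    "period":      [0, 1, 2, 0, 1],
    "mileage_bin": [0, 1, 2, 0, 1],
    "replaced":    [0, 0, 1, 0, 0],
})

# Number of distinct periods observed for each individual.
periods = panel.groupby("bus_id")["period"].nunique()
balanced = bool(periods.min() == periods.max())

# A repeated (id, period) pair usually signals a merge or export problem.
duplicates = int(panel.duplicated(subset=["bus_id", "period"]).sum())

print("periods per individual:", periods.min(), "to", periods.max())
print("balanced:", balanced, "| duplicate rows:", duplicates)
```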
Next, look for states where only one action ever appears.
counts = df.groupby("mileage_bin")["replaced"].nunique()
print((counts < 2).sum(), "of", df["mileage_bin"].nunique(), "states show one action")
14 of 52 states show one action
NFXP still fits these states from the likelihood. The choice-probability estimators, CCP and UFXP, read action frequencies directly, so they need states with some variation.
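To see which states are affected rather than just how many, a cross-tabulation of states against actions helps. A minimal sketch on a made-up panel, using pandas only:

```python
import pandas as pd

# Toy panel for illustration: state 0 only ever shows action 0.
panel = pd.DataFrame({
    "mileage_bin": [0, 0, 1, 1, 2, 2, 2],
    "replaced":    [0, 0, 0, 1, 0, 1, 0],
})

# Rows are states, columns are actions, cells are observation counts.
table = pd.crosstab(panel["mileage_bin"], panel["replaced"])

# States where fewer than two distinct actions were ever observed.
single_action = table.index[(table > 0).sum(axis=1) < 2].tolist()

print(table)
print("states with one action:", single_action)
```

The counts also show how thin the variation is in the remaining states, which matters for estimators that use action frequencies.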
Last, check that your reward features can be recovered from choices. A feature
that never changes with the action cancels out of every choice comparison, so
its parameter is not identified. feature_diagnostics reports this.
import numpy as np
from econirl.preprocessing import feature_diagnostics
S, A = 20, 3
rng = np.random.default_rng(0)
phi = np.zeros((S, A, 2))
phi[:, :, 0] = rng.normal(size=(S, A)) # varies with the action
phi[:, :, 1] = rng.normal(size=(S, 1)) # same for every action in a state
print({k: round(v, 2) if isinstance(v, float) else v
for k, v in feature_diagnostics(phi).items()})
{'num_features': 2, 'feature_rank': 2, 'condition_number': 1.29,
'contrast_rank': 1, 'contrast_condition_number': 1.0}
The design has full rank, but contrast_rank is 1, not 2. The second feature is
constant across actions within each state, so it cancels out of every choice
comparison and its parameter cannot be recovered. When contrast_rank equals
num_features, every parameter is identified from behavior.
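The two ranks can be reproduced with NumPy to see where the difference comes from. This is one plausible way to compute them, offered as a sketch; it is not necessarily how feature_diagnostics is implemented. The contrast here subtracts the first action's features from the others within each state.

```python
import numpy as np

S, A = 20, 3
rng = np.random.default_rng(0)
phi = np.zeros((S, A, 2))
phi[:, :, 0] = rng.normal(size=(S, A))  # varies with the action
phi[:, :, 1] = rng.normal(size=(S, 1))  # same for every action in a state

K = phi.shape[-1]

# Rank of the stacked design: one row per (state, action) pair.
feature_rank = np.linalg.matrix_rank(phi.reshape(-1, K))

# Contrast each action against action 0 within the same state.
# A feature that is constant across actions becomes a column of zeros.
contrasts = phi[:, 1:, :] - phi[:, :1, :]
contrast_rank = np.linalg.matrix_rank(contrasts.reshape(-1, K))

print("feature_rank:", feature_rank, "contrast_rank:", contrast_rank)
```

Only the contrasts enter choice comparisons, so the zero column is what removes the second parameter from the likelihood.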
Fit and read the outputs
model.fit(df, state="mileage_bin", action="replaced", id="bus_id")
print(model.params_)
{'theta_c': 0.0010028828858836278, 'RC': 3.0722093435989524}
A fitted estimator exposes a common set of attributes.
| Attribute | What it holds |
|---|---|
| params_ | Estimated reward or cost parameters. |
| | Estimated choice probabilities, one row per state. |
| | Estimated value function, when the estimator computes one. |
| | A report of the estimates with standard errors. |
One counterfactual
Change a parameter and read the new policy.
cf = model.counterfactual(RC=4.0)
print(cf.params)
print("replace probability at state 50:", float(np.asarray(cf.policy)[50, 1]))
{'theta_c': 0.0010028828858836278, 'RC': 4.0}
replace probability at state 50: 0.05519477716656161
A higher replacement cost lowers the chance of replacing at a given mileage.
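The direction of this effect can be checked on a toy replacement problem solved by value iteration. The sketch below is not econirl's solver. It assumes a 30-state mileage chain, a made-up transition rule, logit choice shocks, illustrative cost parameters, and a discount factor of 0.95 (far lower than the 0.9999 used above) so that plain value iteration converges quickly.

```python
import numpy as np

def replace_prob(rc, theta_c=0.05, n_states=30, beta=0.95, tol=1e-10):
    """Replacement probability by state in a toy bus-engine model (all values assumed)."""
    s = np.arange(n_states)

    # Keep: mileage moves up 0, 1, or 2 bins, capped at the last bin.
    step_probs = np.array([0.3, 0.6, 0.1])
    P_keep = np.zeros((n_states, n_states))
    for i in range(n_states):
        for k, p in enumerate(step_probs):
            P_keep[i, min(i + k, n_states - 1)] += p

    # Replace: mileage resets to bin 0, then evolves like a new engine.
    P_replace = np.tile(P_keep[0], (n_states, 1))

    u_keep = -theta_c * s               # operating cost grows with mileage
    u_replace = np.full(n_states, -rc)  # one-off replacement cost

    V = np.zeros(n_states)
    while True:
        v_keep = u_keep + beta * P_keep @ V
        v_replace = u_replace + beta * P_replace @ V
        V_new = np.logaddexp(v_keep, v_replace)  # log-sum-exp over the two actions
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new

    # Logit choice probability of replacing in each state.
    return 1.0 / (1.0 + np.exp(v_keep - v_replace))

p_low = replace_prob(rc=3.0)
p_high = replace_prob(rc=4.0)
print("state 20, RC=3:", p_low[20], "| RC=4:", p_high[20])
```

Raising the cost changes the whole value function, not just the current payoff, which is why the counterfactual re-solves the model instead of adjusting the fitted probabilities directly.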
Next steps
Pick an estimator for your problem in the estimator map.
Read the overview for the ideas behind these models.
Look up any class in the API reference.