Using your own data

EconIRL fits a panel of decisions. Each row is one observed choice: who decided, the state they were in, and the action they took. This page takes a panel from raw data to a fitted model and one counterfactual.

The examples use the bundled bus-engine dataset as a stand-in for your own panel. Swap in your own DataFrame and column names, and the rest of the code stays the same.

What your data needs

One row per observed decision. Three columns name the pieces.

Column    What it holds
state     The state when the decision was made, as an integer label.
action    The discrete action that was chosen, as an integer label.
id        The individual, unit, or trajectory the row belongs to.

A next_state column is optional. When it is absent, the tabular estimators read transitions from the order of states within each id, so each id's rows should appear in time order.
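As a rough illustration of what "the order of states within each id" means, the pandas-only sketch below sorts a small made-up log and derives next_state from the following row of the same id. The column names and values are hypothetical, and econirl's own handling may differ in detail.

```python
import pandas as pd

# Hypothetical raw log: one row per decision, not yet sorted.
raw = pd.DataFrame({
    "id":     [2, 1, 1, 2, 1],
    "period": [0, 1, 0, 1, 2],
    "state":  [5, 1, 0, 6, 2],
    "action": [0, 0, 0, 1, 0],
})

# Put each individual's rows in time order so "next row" means "next period".
panel = raw.sort_values(["id", "period"]).reset_index(drop=True)

# The next state is the following row's state within the same id.
# The last row of each id has no successor and stays missing.
panel["next_state"] = panel.groupby("id")["state"].shift(-1).astype("Int64")
print(panel)
```

The nullable Int64 dtype keeps the labels as integers while still allowing a missing value on each id's final row.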

Two ways to call econirl

Most users hand econirl a DataFrame and name the columns.

from econirl.datasets import load_rust_bus
from econirl import NFXP

df = load_rust_bus()                        # your DataFrame goes here
model = NFXP(n_states=90, discount=0.9999, utility="linear_cost")
model.fit(df, state="mileage_bin", action="replaced", id="bus_id")

If you are replicating a paper or running a simulation study, you can build the model from transition and feature arrays instead. The estimator pages under Estimators show that path where it applies.
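For readers who want a feel for what a transition array looks like, here is a minimal sketch that counts observed transitions in a toy panel and normalizes them into probabilities. The (action, state, next state) layout is an assumption made for illustration; check the estimator pages for the shape econirl actually expects.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel, already in time order within each id.
panel = pd.DataFrame({
    "id":     [1, 1, 1, 1, 2, 2, 2],
    "state":  [0, 1, 2, 0, 0, 1, 1],
    "action": [0, 0, 1, 0, 0, 0, 0],
})
n_states, n_actions = 3, 2

# Pair each row with the next state of the same id; drop each id's last row.
nxt = panel.groupby("id")["state"].shift(-1)
obs = panel[nxt.notna()].assign(next_state=nxt.dropna().astype(int))

# Count transitions; assumed layout is counts[action, state, next_state].
counts = np.zeros((n_actions, n_states, n_states))
np.add.at(counts,
          (obs["action"].to_numpy(), obs["state"].to_numpy(),
           obs["next_state"].to_numpy()),
          1)

# Normalize each (action, state) row; unvisited rows stay all zero.
row_sums = counts.sum(axis=2, keepdims=True)
P = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
print(P)
```

Rows that sum to zero mark state-action pairs that never occur in the data, which is worth knowing before relying on an empirical transition estimate.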

Check the data before you fit

First, confirm the panel is well formed.

from econirl.preprocessing import check_panel_structure

report = check_panel_structure(df, id_col="bus_id", period_col="period",
                               state_col="mileage_bin", action_col="replaced")
print(report.valid, report.n_individuals, report.n_observations)
print(report.warnings)
True 90 9410
['Unbalanced panel: 80-120 periods per individual']

The panel is usable. The warning notes that individuals are observed for different numbers of periods, which these estimators allow.
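If you want to see what lies behind a warning like this, the pandas-only sketch below runs a few comparable checks on a made-up panel: whether every individual has the same number of rows, whether any individual skips a period, and whether any (id, period) pair is duplicated. It is not the implementation of check_panel_structure, and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical panel: id 1 has periods 0-2, id 2 is missing period 1.
panel = pd.DataFrame({
    "id":     [1, 1, 1, 2, 2],
    "period": [0, 1, 2, 0, 2],
})

periods = panel.groupby("id")["period"].agg(["count", "min", "max"])

# Balanced means every individual contributes the same number of rows.
balanced = periods["count"].nunique() == 1

# A gap means fewer rows than the span between first and last period.
has_gap = periods["count"] < (periods["max"] - periods["min"] + 1)

# The same individual should not appear twice in one period.
has_duplicates = bool(panel.duplicated(["id", "period"]).any())

print(periods)
print("balanced:", balanced)
print("ids with gaps:", has_gap[has_gap].index.tolist())
print("duplicates:", has_duplicates)
```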

Next, look for states where only one action ever appears.

counts = df.groupby("mileage_bin")["replaced"].nunique()
print((counts < 2).sum(), "of", df["mileage_bin"].nunique(), "states show one action")
14 of 52 states show one action

NFXP still fits these states from the likelihood. The choice-probability estimators, CCP and UFXP, read action frequencies directly, so they need states with some variation.
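To see why single-action states are a problem for frequency-based estimators, consider this small sketch. It tabulates empirical choice frequencies on made-up data and shows that the log odds become infinite wherever one action never appears. It is a generic illustration, not the CCP or UFXP code.

```python
import numpy as np
import pandas as pd

# Hypothetical data: state 0 only ever shows action 0.
panel = pd.DataFrame({
    "state":  [0, 0, 0, 1, 1, 2, 2, 2],
    "action": [0, 0, 0, 0, 1, 0, 1, 1],
})

# Empirical choice probabilities: one row per state, one column per action.
freq = pd.crosstab(panel["state"], panel["action"], normalize="index")

# States where fewer than two actions have positive frequency.
one_action = (freq > 0).sum(axis=1) < 2

# Log odds of action 1 versus action 0; a zero frequency gives -inf.
with np.errstate(divide="ignore"):
    log_odds = np.log(freq[1]) - np.log(freq[0])

print(freq)
print("single-action states:", one_action[one_action].index.tolist())
print(log_odds)
```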

Last, check that your reward features can be recovered from choices. A feature that never changes with the action cancels out of every choice comparison, so its parameter is not identified. feature_diagnostics reports this.

import numpy as np
from econirl.preprocessing import feature_diagnostics

S, A = 20, 3
rng = np.random.default_rng(0)
phi = np.zeros((S, A, 2))
phi[:, :, 0] = rng.normal(size=(S, A))   # varies with the action
phi[:, :, 1] = rng.normal(size=(S, 1))   # same for every action in a state

print({k: round(v, 2) if isinstance(v, float) else v
       for k, v in feature_diagnostics(phi).items()})
{'num_features': 2, 'feature_rank': 2, 'condition_number': 1.29,
 'contrast_rank': 1, 'contrast_condition_number': 1.0}

The design has full rank, but contrast_rank is 1, not 2. The second feature is constant across actions, so its parameter cannot be recovered. When contrast_rank equals num_features, every parameter is identified from behavior.
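One way to think about the contrast rank is as the rank of the feature differences between actions. The sketch below computes that with plain NumPy on the same phi, using action 0 as the baseline. The helper contrast_rank is hypothetical and only illustrates the idea; feature_diagnostics may compute its numbers differently.

```python
import numpy as np

def contrast_rank(phi):
    """Rank of feature differences relative to action 0; phi is (S, A, K)."""
    # Subtract action 0's features from every other action's features.
    contrasts = phi[:, 1:, :] - phi[:, :1, :]
    # Stack all (state, action) contrasts into rows and take the rank.
    return int(np.linalg.matrix_rank(contrasts.reshape(-1, phi.shape[-1])))

S, A = 20, 3
rng = np.random.default_rng(0)
phi = np.zeros((S, A, 2))
phi[:, :, 0] = rng.normal(size=(S, A))   # varies with the action
phi[:, :, 1] = rng.normal(size=(S, 1))   # same for every action in a state

full_rank = int(np.linalg.matrix_rank(phi.reshape(-1, 2)))
print(full_rank, contrast_rank(phi))
```

The second feature differences are exactly zero, so the stacked contrast matrix has a zero column and loses one rank.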

Fit and read the outputs

model.fit(df, state="mileage_bin", action="replaced", id="bus_id")
print(model.params_)
{'theta_c': 0.0010028828858836278, 'RC': 3.0722093435989524}

A fitted estimator exposes a common set of attributes.

Attribute    What it holds
params_      Estimated reward or cost parameters.
policy_      Estimated choice probabilities, one row per state.
value_       Estimated value function, when the estimator computes one.
summary()    A report of the estimates with standard errors.

One counterfactual

Change a parameter and read the new policy.

cf = model.counterfactual(RC=4.0)
print(cf.params)
print("replace probability at state 50:", float(np.asarray(cf.policy)[50, 1]))
{'theta_c': 0.0010028828858836278, 'RC': 4.0}
replace probability at state 50: 0.05519477716656161

A higher replacement cost lowers the chance of replacing at a given mileage.
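The direction of this effect can be reproduced in a toy model. The sketch below solves a small bus-engine style problem by value iteration with logit choice probabilities and compares two replacement costs. All numbers here (30 states, discount 0.95, the mileage step probabilities, theta_c = 0.05) are made up for illustration, and the function replace_probability is hypothetical; it is not how model.counterfactual is implemented.

```python
import numpy as np

def replace_probability(rc, theta_c=0.05, n_states=30, beta=0.95, n_iter=2000):
    """Logit replacement probabilities for a toy bus-engine model."""
    s = np.arange(n_states)
    # Mileage moves up by 0, 1, or 2 bins each period (illustrative values).
    step_probs = np.array([0.3, 0.6, 0.1])
    P_keep = np.zeros((n_states, n_states))
    for i in s:
        for k, p in enumerate(step_probs):
            P_keep[i, min(i + k, n_states - 1)] += p

    ev = np.zeros(n_states)  # integrated value function V(s)
    for _ in range(n_iter):
        # Keep: pay the mileage cost, continue from the current bin.
        v_keep = -theta_c * s + beta * (P_keep @ ev)
        # Replace: pay rc, then transition as if starting from bin 0.
        v_replace = -rc + beta * (P_keep[0] @ ev)
        # Log-sum-exp over the two actions (type-I extreme value shocks).
        ev = np.logaddexp(v_keep, v_replace)

    # Logit probability of choosing replacement in each state.
    return 1.0 / (1.0 + np.exp(v_keep - v_replace))

base = replace_probability(rc=3.0)
higher = replace_probability(rc=4.0)
print("state 20, RC=3:", base[20])
print("state 20, RC=4:", higher[20])
```

In this toy setup the replacement probability falls in every state when RC rises, and it rises with mileage for a fixed RC, which matches the reading of the counterfactual above.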

Next steps