# Using your own data

EconIRL fits a panel of decisions. Each row is one observed choice: who decided, the state they were in, and the action they took. This page takes a panel from raw data to a fitted model and one counterfactual.

The examples use the bundled bus-engine dataset as a stand-in for your own panel. Replace it with your DataFrame and the rest stays the same.

## What your data needs

One row per observed decision. Three columns name the pieces.

| Column | What it holds |
| --- | --- |
| `state` | The state when the decision was made, as an integer label. |
| `action` | The discrete action that was chosen, as an integer label. |
| `id` | The individual, unit, or trajectory the row belongs to. |

A `next_state` column is optional. When it is missing, the tabular estimators read transitions from the order of states within each `id`.

## Two ways to call econirl

Most users hand econirl a DataFrame and name the columns.

```python
from econirl.datasets import load_rust_bus
from econirl import NFXP

df = load_rust_bus()  # your DataFrame goes here
model = NFXP(n_states=90, discount=0.9999, utility="linear_cost")
model.fit(df, state="mileage_bin", action="replaced", id="bus_id")
```

If you are replicating a paper or running a simulation study, you can build the model from transition and feature arrays instead. The estimator pages under [Estimators](../estimators.md) show that path where it applies.

## Check the data before you fit

First, confirm the panel is well formed.

```python
from econirl.preprocessing import check_panel_structure

report = check_panel_structure(
    df,
    id_col="bus_id",
    period_col="period",
    state_col="mileage_bin",
    action_col="replaced",
)
print(report.valid, report.n_individuals, report.n_observations)
print(report.warnings)
```

```text
True 90 9410
['Unbalanced panel: 80-120 periods per individual']
```

The panel is usable. The warning notes that individuals are observed for different lengths, which these estimators allow.
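To see what the unbalanced-panel warning refers to, you can count rows per individual yourself. The sketch below uses plain pandas on a small made-up panel. The `toy` DataFrame and its values are illustrative assumptions, and this is not a description of how `check_panel_structure` works internally.

```python
import pandas as pd

# Toy panel (made-up data): three units observed for 2, 3, and 4 periods.
toy = pd.DataFrame({
    "bus_id":      [1, 1, 2, 2, 2, 3, 3, 3, 3],
    "period":      [0, 1, 0, 1, 2, 0, 1, 2, 3],
    "mileage_bin": [0, 1, 0, 1, 2, 0, 0, 1, 2],
    "replaced":    [0, 0, 0, 0, 1, 0, 0, 0, 0],
})

# Number of observed rows for each individual.
periods_per_id = toy.groupby("bus_id").size()

# A panel is balanced when every individual has the same number of rows.
is_balanced = periods_per_id.nunique() == 1

print(periods_per_id.min(), periods_per_id.max(), is_balanced)
```

On your own data, swap `toy` for your DataFrame and `bus_id` for your id column. A wide gap between the minimum and maximum is worth a look, even though an unbalanced panel is allowed.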
Next, look for states where only one action ever appears.

```python
counts = df.groupby("mileage_bin")["replaced"].nunique()
print((counts < 2).sum(), "of", df["mileage_bin"].nunique(), "states show one action")
```

```text
14 of 52 states show one action
```

NFXP still fits these states from the likelihood. The choice-probability estimators, CCP and UFXP, read action frequencies directly, so they need states with some variation.

Last, check that your reward features can be recovered from choices. A feature that never changes with the action cancels out of every choice comparison, so its parameter is not identified. `feature_diagnostics` reports this.

```python
import numpy as np
from econirl.preprocessing import feature_diagnostics

S, A = 20, 3
rng = np.random.default_rng(0)
phi = np.zeros((S, A, 2))
phi[:, :, 0] = rng.normal(size=(S, A))  # varies with the action
phi[:, :, 1] = rng.normal(size=(S, 1))  # same for every action in a state
print({k: round(v, 2) if isinstance(v, float) else v
       for k, v in feature_diagnostics(phi).items()})
```

```text
{'num_features': 2, 'feature_rank': 2, 'condition_number': 1.29, 'contrast_rank': 1, 'contrast_condition_number': 1.0}
```

The design has full rank, but `contrast_rank` is 1, not 2. The second feature is constant across actions, so one parameter cannot be recovered. When `contrast_rank` equals `num_features`, every parameter is identified from behavior.

## Fit and read the outputs

```python
model.fit(df, state="mileage_bin", action="replaced", id="bus_id")
print(model.params_)
```

```text
{'theta_c': 0.0010028828858836278, 'RC': 3.0722093435989524}
```

A fitted estimator exposes a common set of attributes.

| Attribute | What it holds |
| --- | --- |
| `params_` | Estimated reward or cost parameters. |
| `policy_` | Estimated choice probabilities, one row per state. |
| `value_` | Estimated value function, when the estimator computes one. |
| `summary()` | A report of the estimates with standard errors. |

## One counterfactual

Change a parameter and read the new policy.

```python
cf = model.counterfactual(RC=4.0)
print(cf.params)
print("replace probability at state 50:", float(np.asarray(cf.policy)[50, 1]))
```

```text
{'theta_c': 0.0010028828858836278, 'RC': 4.0}
replace probability at state 50: 0.05519477716656161
```

A higher replacement cost lowers the chance of replacing at a given mileage.

## Next steps

- Pick an estimator for your problem in the [estimator map](../estimators.md).
- Read the [overview](overview.md) for the ideas behind these models.
- Look up any class in the [API reference](../api/index.rst).