econirl.NeuralUFXP
- class econirl.NeuralUFXP(n_states=None, n_actions=None, discount=0.95, scale=1.0, num_projections=64, reward_hidden_dim=64, reward_num_layers=2, max_epochs=2000, lr=0.01, gradient_clip=10.0, ccp_min_count=1, ccp_smoothing=1e-06, seed=0, verbose=False)[source]
Bases: NeuralEstimatorMixin
Neural-utility UFXP estimator (Oguz and Bray 2026).
Trains a neural utility u_w(s, a) by minimizing the UFXP random-projection objective, reusing the linear estimator’s precomputed dual so no Bellman equation is solved during training.
- Parameters:
n_states (int, optional) – Size of the state space. Inferred from the data if None.
n_actions (int, optional) – Size of the action space. Inferred from the data if None.
discount (float, default=0.95) – Discount factor beta.
scale (float, default=1.0) – Logit scale sigma.
num_projections (int, default=64) – Number of random projections m.
reward_hidden_dim (int, default=64) – Hidden width of the utility network.
reward_num_layers (int, default=2) – Hidden depth of the utility network.
max_epochs (int, default=2000) – Adam steps over the projection objective.
lr (float, default=1e-2) – Adam learning rate.
gradient_clip (float, default=10.0) – Global-norm gradient clip (<=0 disables).
ccp_min_count (int, default=1) – Minimum visits for a state’s first-order conditions to be scored.
ccp_smoothing (float, default=1e-6) – Additive smoothing for the frequency CCPs.
seed (int, default=0) – Seed for the projections and the network initialization.
verbose (bool, default=False) – Print the objective during training.
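The two CCP parameters above can be illustrated with a minimal sketch. It assumes that frequency CCPs are per-state action shares with ccp_smoothing added to every count, and that ccp_min_count masks out states with too few visits. The helper frequency_ccps is hypothetical and not part of econirl; the library's exact formula may differ.

```python
import numpy as np

def frequency_ccps(states, actions, n_states, n_actions,
                   ccp_smoothing=1e-6, ccp_min_count=1):
    """Hypothetical helper: smoothed frequency CCPs plus a visit mask."""
    counts = np.zeros((n_states, n_actions))
    # Tally how often each action was chosen in each state.
    np.add.at(counts, (states, actions), 1.0)
    visits = counts.sum(axis=1)
    # Additive smoothing keeps every probability strictly positive.
    smoothed = counts + ccp_smoothing
    ccps = smoothed / smoothed.sum(axis=1, keepdims=True)
    # Assumed rule: only states with enough visits are scored.
    scored = visits >= ccp_min_count
    return ccps, scored

states = np.array([0, 0, 0, 1, 1, 2])
actions = np.array([0, 1, 1, 0, 0, 1])
ccps, scored = frequency_ccps(states, actions, n_states=4, n_actions=2)
```

In this toy panel state 3 is never visited, so its smoothed CCPs are uniform and it is excluded by the mask.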
- Variables:
policy (numpy.ndarray) – Estimated choice probabilities, shape (n_states, n_actions).
value (numpy.ndarray) – Estimated value function, shape (n_states,).
reward (numpy.ndarray) – Learned utility u_w(s, a), shape (n_states, n_actions).
params (dict) – The learned utility projected onto the features. The objective constrains the choice-relevant utility, not the utility level, so this is a best-effort linear summary of a partially identified function; a low projection_r2_ flags that the utility is not linear in the features.
se (dict) – Projection pseudo standard errors (not the efficient UFXP variance).
coef (numpy.ndarray) – Projected coefficients in array form.
projection_r2 (float) – R-squared of the feature projection.
converged (bool) – Whether the objective decreased to a finite value.
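To make the relationship between reward, coef, and projection_r2 concrete, here is a minimal sketch of projecting a utility table onto the features by ordinary least squares. It assumes a plain unweighted regression without an intercept. The helper project_utility is hypothetical, and the library's actual projection (weighting, normalization, handling of the unidentified utility level) may differ.

```python
import numpy as np

def project_utility(reward, features):
    """Hypothetical helper: OLS of reward[s, a] on features[s, a, :]."""
    n_states, n_actions, k = features.shape
    X = features.reshape(n_states * n_actions, k)
    y = reward.reshape(n_states * n_actions)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    # Centered R-squared: share of the utility's variation the features explain.
    centered = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    return coef, r2

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 2, 3))
true_coef = np.array([1.0, -0.5, 2.0])

# A utility that is exactly linear in the features is recovered exactly.
linear_reward = features @ true_coef
coef, r2 = project_utility(linear_reward, features)

# A nonlinear utility leaves residual variation, so the R-squared drops.
nonlinear_reward = np.sin(3.0 * linear_reward)
_, r2_nonlinear = project_utility(nonlinear_reward, features)
```

Under these assumptions, a low R-squared signals that the linear coefficients summarize the learned utility poorly, which is the caveat attached to params above.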
- __init__(n_states=None, n_actions=None, discount=0.95, scale=1.0, num_projections=64, reward_hidden_dim=64, reward_num_layers=2, max_epochs=2000, lr=0.01, gradient_clip=10.0, ccp_min_count=1, ccp_smoothing=1e-06, seed=0, verbose=False)[source]
- fit(data, state=None, action=None, id=None, features=None, transitions=None)[source]
Fit the neural utility to data.
- Parameters:
data (pandas.DataFrame or Panel or TrajectoryPanel) – Panel of observed choices.
state (str, optional) – Name of the state column (required when data is a DataFrame).
action (str, optional) – Name of the action column (required when data is a DataFrame).
id (str, optional) – Name of the id column (required when data is a DataFrame).
features (numpy.ndarray) – Reward features phi(s, a) of shape (n_states, n_actions, K). The utility network maps each feature vector to a scalar utility, so the features set the inputs the network can combine.
transitions (numpy.ndarray) – Transition matrices P(s'|s,a) of shape (n_actions, n_states, n_states).
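The array layouts that fit expects are easy to mix up, in particular the action-first layout of transitions versus the state-first layout of features. The sketch below builds toy inputs for a hypothetical 3-state, 2-action problem and checks their shapes. The dynamics and feature definitions are illustrative assumptions, and the commented call at the end uses only argument names from the signature above, with hypothetical column names.

```python
import numpy as np

n_states, n_actions, n_features = 3, 2, 2

# transitions[a, s, s2] = P(s2 | s, a): action index first, rows sum to one.
transitions = np.zeros((n_actions, n_states, n_states))
for s in range(n_states):
    # Toy dynamics: action 0 moves one state up (capped at the last state).
    transitions[0, s, min(s + 1, n_states - 1)] = 1.0
    # Toy dynamics: action 1 resets to state 0.
    transitions[1, s, 0] = 1.0

# features[s, a, :] = phi(s, a): state and action first, feature index last.
features = np.zeros((n_states, n_actions, n_features))
features[:, 0, 0] = -np.arange(n_states)  # toy running cost under action 0
features[:, 1, 1] = -1.0                  # toy fixed cost under action 1

# Sanity checks worth running before fitting.
assert transitions.shape == (n_actions, n_states, n_states)
assert features.shape == (n_states, n_actions, n_features)
assert np.allclose(transitions.sum(axis=2), 1.0)

# With a DataFrame `df` holding hypothetical columns "state", "action", "id":
# from econirl import NeuralUFXP
# est = NeuralUFXP(discount=0.95)
# est.fit(df, state="state", action="action", id="id",
#         features=features, transitions=transitions)
```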
- Return type: