econirl.RewardSpec
- class econirl.RewardSpec(features, names, n_actions=None)[source]
Bases: object
Unified feature specification for structural estimation and IRL.
Stores features as a (S, A, K) array and provides compute, gradient, and hessian methods compatible with the BaseUtilityFunction protocol.
- Parameters:
features (jnp.ndarray) – Either (S, A, K) for action-dependent features, or (S, K) for state-only features (broadcast to all actions).
names (list[str]) – Human-readable name for each feature/parameter dimension.
n_actions (int, optional) – Required when features is (S, K) to specify the number of actions for broadcasting. Ignored when features is (S, A, K).
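The broadcasting of state-only features described above can be pictured with plain NumPy. This is a minimal sketch rather than econirl's implementation: it assumes the (S, K) array is simply repeated along a new action axis, and `broadcast_state_features` is a hypothetical helper name. NumPy stands in for `jnp` so the snippet runs without econirl or JAX installed.

```python
import numpy as np

def broadcast_state_features(state_features, n_actions):
    """Hypothetical helper: expand (S, K) features to (S, A, K)."""
    state_features = np.asarray(state_features)
    if state_features.ndim != 2:
        raise ValueError("expected state-only features of shape (S, K)")
    S, K = state_features.shape
    # Insert an action axis, then repeat the same features for every action.
    expanded = np.broadcast_to(state_features[:, None, :], (S, n_actions, K))
    return expanded.copy()  # copy so the result is a writable array

state_features = np.arange(6.0).reshape(3, 2)  # S=3 states, K=2 features
features = broadcast_state_features(state_features, n_actions=4)
```

Every action slice `features[:, a, :]` then holds the same values as `state_features`.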
- classmethod state_dependent(state_features, names, n_actions)[source]
Create from state-only features (S, K), broadcast to all actions.
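- Parameters:
state_features – State-only features of shape (S, K).
names – Human-readable name for each feature dimension.
n_actions – Number of actions to broadcast the features to.
- Return type:
RewardSpec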
- classmethod state_action_dependent(features, names)[source]
Create from action-dependent features (S, A, K).
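- Parameters:
features – Action-dependent features of shape (S, A, K).
names – Human-readable name for each feature dimension.
- Return type:
RewardSpec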
- property feature_matrix: Array
Feature array of shape (S, A, K).
- compute(parameters)[source]
Compute reward matrix R(s, a) = sum_k params[k] * features[s, a, k].
- Parameters:
parameters (jnp.ndarray) – Shape (K,).
- Returns:
Shape (S, A).
- Return type:
jnp.ndarray
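The sum over k in the formula above is a contraction of the last feature axis with the parameter vector. A minimal NumPy sketch of that formula follows; `linear_reward` is an illustrative name, not part of econirl, and NumPy stands in for `jnp`.

```python
import numpy as np

def linear_reward(features, parameters):
    """R[s, a] = sum_k parameters[k] * features[s, a, k]."""
    return np.einsum("sak,k->sa", features, parameters)

features = np.arange(12.0).reshape(2, 3, 2)  # S=2, A=3, K=2
parameters = np.array([1.0, -0.5])
R = linear_reward(features, parameters)      # shape (S, A) = (2, 3)
```

For example, `features[0, 0]` is `[0, 1]`, so `R[0, 0]` is `1.0 * 0 - 0.5 * 1 = -0.5`.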
- compute_gradient(parameters)[source]
Gradient of reward w.r.t. parameters.
For a linear specification, the gradient is the feature matrix itself, independent of the parameter values.
- Parameters:
parameters (jnp.ndarray) – Shape (K,). Unused but kept for protocol compatibility.
- Returns:
Shape (S, A, K).
- Return type:
jnp.ndarray
- compute_hessian(parameters)[source]
Hessian of reward w.r.t. parameters.
For a linear specification, the Hessian is identically zero.
- Parameters:
parameters (jnp.ndarray) – Shape (K,). Unused.
- Returns:
Shape (S, A, K, K) of zeros.
- Return type:
jnp.ndarray
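The gradient and Hessian statements above can be checked numerically. The sketch below is an illustration in NumPy, not econirl code: it rebuilds the linear reward by hand, compares a forward-difference gradient with the feature array, and confirms the gradient does not move when the parameters change. All names (`reward`, `fd_gradient`, `theta_a`, `theta_b`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, K = 4, 3, 2
features = rng.normal(size=(S, A, K))

def reward(params):
    return np.einsum("sak,k->sa", features, params)

def fd_gradient(params, eps=1e-6):
    """Forward-difference gradient of R w.r.t. params, shape (S, A, K)."""
    base = reward(params)
    cols = [(reward(params + eps * np.eye(K)[k]) - base) / eps for k in range(K)]
    return np.stack(cols, axis=-1)

theta_a = rng.normal(size=K)
theta_b = rng.normal(size=K)

# The gradient equals the feature array at any parameter value...
grad_is_features = np.allclose(fd_gradient(theta_a), features, atol=1e-5)
# ...so it does not change with the parameters, and the Hessian is all zeros.
grad_is_constant = np.allclose(fd_gradient(theta_a), fd_gradient(theta_b), atol=1e-5)
hessian = np.zeros((S, A, K, K))
```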
- get_initial_parameters()[source]
Return zeros of shape (K,) as a starting point.
- Return type:
Array
- get_parameter_bounds()[source]
Return (None, None) indicating unbounded parameters.
- Return type:
tuple[Array | None, Array | None]
- validate_parameters(parameters)[source]
Check that parameters have shape (K,).
- Raises:
ValueError – If shape does not match.
- Parameters:
parameters (Array)
- Return type:
None
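A shape check of this kind might look like the following. This is a sketch under assumptions: `check_parameter_shape` is a hypothetical stand-in, and the error message wording is invented for illustration rather than taken from econirl.

```python
import numpy as np

def check_parameter_shape(parameters, n_features):
    """Hypothetical stand-in: require parameters of shape (K,)."""
    shape = np.shape(parameters)
    if shape != (n_features,):
        raise ValueError(
            f"expected parameters of shape ({n_features},), got {shape}"
        )

check_parameter_shape(np.zeros(3), n_features=3)  # passes silently

try:
    check_parameter_shape(np.zeros((3, 1)), n_features=3)  # wrong rank
    raised = False
except ValueError:
    raised = True
```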
- subset_states(indices)[source]
Return a new RewardSpec containing only the specified states.
- Parameters:
indices (jnp.ndarray) – 1-D integer array of state indices to keep.
- Return type:
RewardSpec
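On the underlying array, keeping a subset of states amounts to indexing the first axis. The sketch below assumes that is all the operation does to the features (names and the action axis untouched), which the documentation does not spell out. It uses NumPy in place of `jnp`.

```python
import numpy as np

features = np.arange(24.0).reshape(4, 3, 2)  # S=4, A=3, K=2
indices = np.array([0, 2])                   # states to keep

# Integer-array indexing on the first axis selects whole (A, K) slices.
subset = features[indices]                   # shape (2, 3, 2)
```

Row `i` of `subset` corresponds to state `indices[i]` of the original array, so state numbering is compacted in the result.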
- to_linear_utility()[source]
Convert to a LinearUtility with the same (S, A, K) feature matrix.
- Returns:
Equivalent LinearUtility instance.
- Return type:
LinearUtility
- to_action_dependent_reward()[source]
Convert to an ActionDependentReward with the same (S, A, K) features.
- Returns:
Equivalent ActionDependentReward instance.
- Return type:
ActionDependentReward
- to_linear_reward()[source]
Convert to a LinearReward with state-only (S, K) features.
This only works when features are truly state-only (identical across all actions). If features differ across actions, a ValueError is raised.
- Returns:
Equivalent LinearReward instance.
- Return type:
LinearReward
- Raises:
ValueError – If features vary across actions.
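The state-only condition can be illustrated by comparing every action slice with the first one. This is a hedged sketch in NumPy: `collapse_to_state_features` is a hypothetical helper, and it uses exact equality, whereas the library may use a tolerance or a different test.

```python
import numpy as np

def collapse_to_state_features(features):
    """Hypothetical helper: (S, A, K) -> (S, K) if features ignore the action."""
    features = np.asarray(features)
    first_action = features[:, 0, :]
    # Compare every action slice against the first one.
    expected = np.broadcast_to(first_action[:, None, :], features.shape)
    if not np.array_equal(features, expected):
        raise ValueError("features vary across actions")
    return first_action

state_only = np.repeat(np.arange(6.0).reshape(3, 1, 2), 4, axis=1)  # (3, 4, 2)
collapsed = collapse_to_state_features(state_only)                  # (3, 2)

action_dependent = state_only.copy()
action_dependent[0, 1, 0] += 1.0  # make one action differ
try:
    collapse_to_state_features(action_dependent)
    rejected = False
except ValueError:
    rejected = True
```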