econirl.RewardSpec

class econirl.RewardSpec(features, names, n_actions=None)[source]

Bases: object

Unified feature specification for structural estimation and IRL.

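Stores features as an (S, A, K) array, where S is the number of states, A the number of actions, and K the number of features, and provides compute, compute_gradient, and compute_hessian methods compatible with the BaseUtilityFunction protocol.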

Parameters:
  • features (jnp.ndarray) – Either (S, A, K) for action-dependent features, or (S, K) for state-only features (broadcast to all actions).

  • names (list[str]) – Human-readable name for each feature/parameter dimension.

  • n_actions (int, optional) – Required when features is (S, K) to specify the number of actions for broadcasting. Ignored when features is (S, A, K).

__init__(features, names, n_actions=None)[source]
Parameters:
  • features (Array)

  • names (list[str])

  • n_actions (int | None)

classmethod state_dependent(state_features, names, n_actions)[source]

Create from state-only features (S, K), broadcast to all actions.

Parameters:
  • state_features (jnp.ndarray) – Shape (S, K).

  • names (list[str]) – One name per feature.

  • n_actions (int) – Number of actions to broadcast to.

Return type:

RewardSpec
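The broadcast described above can be pictured with plain JAX. This is a minimal sketch of the shape semantics only, not the library's implementation; the toy sizes and variable names are assumptions chosen for illustration.

```python
import jax.numpy as jnp

# Hypothetical toy sizes: S=3 states, K=2 features, A=4 actions.
state_features = jnp.array([[1.0, 0.0],
                            [0.0, 1.0],
                            [1.0, 1.0]])  # shape (S, K)
n_actions = 4

# Insert an action axis and repeat the state features along it.
S, K = state_features.shape
features = jnp.broadcast_to(state_features[:, None, :], (S, n_actions, K))

print(features.shape)  # (3, 4, 2)
```

Every action slice features[:, a, :] is then identical to state_features.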

classmethod state_action_dependent(features, names)[source]

Create from action-dependent features (S, A, K).

Parameters:
  • features (jnp.ndarray) – Shape (S, A, K).

  • names (list[str]) – One name per feature.

Return type:

RewardSpec

property feature_matrix: Array

Feature array of shape (S, A, K).

property parameter_names: list[str]

Human-readable names for each parameter.

property num_parameters: int

Number of parameters (K).

property num_states: int

Number of states (S).

property num_actions: int

Number of actions (A).

property is_state_only: bool

Whether the spec was constructed from state-only features.

compute(parameters)[source]

Compute the reward matrix R(s, a) = sum_k parameters[k] * features[s, a, k].

Parameters:

parameters (jnp.ndarray) – Shape (K,).

Returns:

Shape (S, A).

Return type:

jnp.ndarray
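The sum over k in the formula above is a tensor contraction over the last axis of the feature array. The sketch below reproduces it with jnp.einsum on hypothetical toy data; it illustrates the formula and is not taken from the library source.

```python
import jax.numpy as jnp

# Hypothetical toy features: S=2 states, A=2 actions, K=2 features.
features = jnp.array([[[1.0, 0.0], [0.0, 1.0]],
                      [[1.0, 1.0], [2.0, 0.0]]])  # shape (S, A, K)
parameters = jnp.array([0.5, -1.0])               # shape (K,)

# R[s, a] = sum_k parameters[k] * features[s, a, k]
reward = jnp.einsum("sak,k->sa", features, parameters)

print(reward.shape)  # (2, 2)
```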

compute_gradient(parameters)[source]

Gradient of reward w.r.t. parameters.

For a linear specification, the gradient is the feature matrix itself, independent of the parameter values.

Parameters:

parameters (jnp.ndarray) – Shape (K,). Unused but kept for protocol compatibility.

Returns:

Shape (S, A, K).

Return type:

jnp.ndarray
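The claim that the gradient equals the feature array can be checked numerically with automatic differentiation. In this sketch, reward_fn is a hypothetical stand-in for the linear reward, and the toy features are assumptions.

```python
import jax
import jax.numpy as jnp

# Hypothetical toy features: S=2 states, A=2 actions, K=2 features.
features = jnp.array([[[1.0, 0.0], [0.0, 1.0]],
                      [[1.0, 1.0], [2.0, 0.0]]])  # shape (S, A, K)

def reward_fn(parameters):
    # Linear reward: R[s, a] = sum_k parameters[k] * features[s, a, k]
    return jnp.einsum("sak,k->sa", features, parameters)

# The Jacobian of an (S, A) output w.r.t. a (K,) input has shape (S, A, K).
jac_a = jax.jacobian(reward_fn)(jnp.array([0.5, -1.0]))
jac_b = jax.jacobian(reward_fn)(jnp.zeros(2))

print(jac_a.shape)  # (2, 2, 2)
```

Both Jacobians match features regardless of where they are evaluated, which is why the parameters argument can go unused.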

compute_hessian(parameters)[source]

Hessian of reward w.r.t. parameters.

For a linear specification, the Hessian is identically zero.

Parameters:

parameters (jnp.ndarray) – Shape (K,). Unused.

Returns:

Shape (S, A, K, K) of zeros.

Return type:

jnp.ndarray
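As a sanity check on the shape and the zero value, the second derivative of a linear reward can be computed with jax.hessian. The function reward_fn and the toy sizes below are assumptions for illustration only.

```python
import jax
import jax.numpy as jnp

# Hypothetical toy features: S=3 states, A=2 actions, K=2 features.
features = jnp.arange(12.0).reshape(3, 2, 2)  # shape (S, A, K)

def reward_fn(parameters):
    # Linear reward: R[s, a] = sum_k parameters[k] * features[s, a, k]
    return jnp.einsum("sak,k->sa", features, parameters)

# Second derivative of an (S, A) output w.r.t. a (K,) input: (S, A, K, K).
hess = jax.hessian(reward_fn)(jnp.array([0.5, -1.0]))

print(hess.shape)  # (3, 2, 2, 2)
```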

get_initial_parameters()[source]

Return an array of zeros of shape (K,) to use as a starting point.

Return type:

Array

get_parameter_bounds()[source]

Return (None, None), indicating that the parameters are unbounded.

Return type:

tuple[Array | None, Array | None]

validate_parameters(parameters)[source]

Check that parameters have shape (K,).

Raises:

ValueError – If shape does not match.

Parameters:

parameters (Array)

Return type:

None
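A shape check of this kind might look like the following sketch. The helper check_parameter_shape is hypothetical and only mirrors the documented behavior (a ValueError on a shape mismatch); the library's actual check and error message may differ.

```python
import jax.numpy as jnp

def check_parameter_shape(parameters, num_parameters):
    """Hypothetical stand-in: raise ValueError unless shape is (K,)."""
    expected = (num_parameters,)
    if parameters.shape != expected:
        raise ValueError(
            f"expected parameters of shape {expected}, got {parameters.shape}"
        )

# A (K,) vector passes silently.
check_parameter_shape(jnp.zeros(3), num_parameters=3)

# A (K, 1) column vector is rejected.
try:
    check_parameter_shape(jnp.zeros((3, 1)), num_parameters=3)
    raised = False
except ValueError:
    raised = True
```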

subset_states(indices)[source]

Return a new RewardSpec containing only the specified states.

Parameters:

indices (jnp.ndarray) – 1-D integer array of state indices to keep.

Return type:

RewardSpec
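In terms of the underlying (S, A, K) array, keeping a subset of states corresponds to integer indexing along the first axis. The sketch below shows that operation on hypothetical toy data; how the library builds the new RewardSpec from the result is not shown here.

```python
import jax.numpy as jnp

# Hypothetical toy features: S=4 states, A=3 actions, K=2 features.
features = jnp.arange(24.0).reshape(4, 3, 2)

# Keep states 0 and 2; the action and feature axes are untouched.
indices = jnp.array([0, 2])
subset = features[indices]

print(subset.shape)  # (2, 3, 2)
```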

to_linear_utility()[source]

Convert to a LinearUtility with the same (S, A, K) feature matrix.

Returns:

Equivalent LinearUtility instance.

Return type:

LinearUtility

to_action_dependent_reward()[source]

Convert to an ActionDependentReward with the same (S, A, K) features.

Returns:

Equivalent ActionDependentReward instance.

Return type:

ActionDependentReward

to_linear_reward()[source]

Convert to a LinearReward with state-only (S, K) features.

This works only when the features are truly state-only, that is, identical across all actions. Otherwise a ValueError is raised.

Returns:

Equivalent LinearReward instance.

Return type:

LinearReward

Raises:

ValueError – If features vary across actions.
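The condition "identical across all actions" can be illustrated by comparing every action slice with the first one. The helper collapse_if_state_only below is hypothetical, and it uses exact equality; whether the library compares exactly or within a tolerance is not stated here.

```python
import jax.numpy as jnp

def collapse_if_state_only(features):
    """Hypothetical helper: return (S, K) features or raise ValueError."""
    first = features[:, 0, :]  # features of action 0, shape (S, K)
    if not bool(jnp.all(features == first[:, None, :])):
        raise ValueError("features vary across actions")
    return first

# State-only features broadcast to A=3 actions collapse back to (S, K).
state_features = jnp.array([[1.0, 0.0], [0.0, 1.0]])
broadcast = jnp.broadcast_to(state_features[:, None, :], (2, 3, 2))
collapsed = collapse_if_state_only(broadcast)

# Features that differ in one action are rejected.
varying = jnp.array(broadcast).at[0, 1, 0].set(5.0)
try:
    collapse_if_state_only(varying)
    raised = False
except ValueError:
    raised = True
```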