Skip to content

Policies API

Policies receive a sequence of observation mappings and return a batched numpy action array.

Policy Protocol

python
class Policy(Protocol):
    def reset(self, episode_ids: Sequence[str] | None = None) -> None:
        ...

    def act(
        self,
        observations: Sequence[Observation],
        *,
        action_spec: ActionSpec | None = None,
        policy_kwargs: Mapping[str, Any] | None = None,
        episode_ids: Sequence[str] | None = None,
    ) -> np.ndarray:
        ...

action_spec is evaluator-side validation metadata. Policy adapters should not require underlying model code to consume it directly.

LocalPolicy

python
from praxis_eval import LocalPolicy

policy = LocalPolicy(my_policy)

my_policy may be:

  • an object with act(observations, *, policy_kwargs=None, episode_ids=None);
  • a callable with the same signature.

If my_policy defines reset(...), LocalPolicy forwards resets.

RemotePolicy

python
from praxis_eval import RemotePolicy

policy = RemotePolicy("127.0.0.1:50051", timeout=30.0)

RemotePolicy calls a praxis-remote policy server. Install it with:

bash
pip install "praxis-eval[remote]"

Remote mode is optional. It is useful when the simulator runtime and policy runtime need different dependency stacks.

Action Normalization

normalize_batched_action(...) enforces batch size, finite values, and ActionSpec validation when an action spec is supplied. For a single observation, unbatched action arrays are accepted if the array shape matches the single-action spec.

Released under the Apache-2.0 License.