jaxdem.rl.environments.single_roller#

Environment where a single agent rolls towards a target on the floor.

Functions

frictional_wall_force(pos, state, system)

Normal, frictional, and restitution forces for a sphere on a \(z = 0\) plane.

Classes

SingleRoller3D(state, system, env_params)

Single-agent 3D navigation via torque-controlled rolling.

class jaxdem.rl.environments.single_roller.SingleRoller3D(state: State, system: System, env_params: dict[str, Any])#

Bases: Environment

Single-agent 3D navigation via torque-controlled rolling.

The agent is a sphere resting on a \(z = 0\) floor under gravity. Actions are 3-D torque vectors; translational motion arises from frictional contact with the floor (see frictional_wall_force()). A viscous drag force -friction * vel and a fixed angular damping torque -0.05 * ang_vel are applied each step.
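The two damping terms described above can be sketched in plain JAX (function and variable names here are illustrative, not the library's internals):

```python
import jax.numpy as jnp

def damping_terms(vel, ang_vel, friction=0.2):
    """Illustrative per-step damping: a viscous drag on the linear
    velocity and a fixed 0.05 coefficient on the angular velocity."""
    drag_force = -friction * vel       # -friction * vel
    damping_torque = -0.05 * ang_vel   # fixed angular damping
    return drag_force, damping_torque

f, t = damping_terms(jnp.array([1.0, 0.0, 0.0]),
                     jnp.array([0.0, 2.0, 0.0]))
```

Both terms oppose the current motion, so an uncontrolled agent gradually coasts to rest.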

The reward uses exponential potential-based shaping:

\[\mathrm{rew} = e^{-2\,d} - e^{-2\,d^{\mathrm{prev}}}\]
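A quick numeric check of the shaping term (plain JAX, names illustrative): the reward is positive when the agent moves closer to the objective and negative by the same magnitude when it moves away, so cycling back and forth earns nothing on net.

```python
import jax.numpy as jnp

def shaped_reward(d, d_prev):
    """Exponential potential-based shaping: exp(-2 d) - exp(-2 d_prev)."""
    return jnp.exp(-2.0 * d) - jnp.exp(-2.0 * d_prev)

# Moving from distance 1.0 to 0.5 is rewarded; the reverse move
# is penalised by exactly the same amount.
closer = shaped_reward(0.5, 1.0)
farther = shaped_reward(1.0, 0.5)
```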

Notes

The observation vector per agent is:

Feature                        Size
-----------------------------  ----
Unit direction to objective    2
Clamped displacement (x, y)    2
Velocity (x, y)                2
Angular velocity               3
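The 9 features concatenate in the order listed above. A minimal sketch, assuming the displacement is clamped to the box size (the actual clamp bound is internal to the library):

```python
import jax.numpy as jnp

def make_observation(pos, obj, vel, ang_vel, box_size):
    """Assemble the 9-feature observation for one agent (illustrative)."""
    disp = (obj - pos)[:2]                          # x-y displacement
    unit = disp / (jnp.linalg.norm(disp) + 1e-12)   # unit direction, 2
    clamped = jnp.clip(disp, -box_size, box_size)   # clamped displacement, 2
    return jnp.concatenate([unit, clamped, vel[:2], ang_vel])  # 2+2+2+3 = 9

obs = make_observation(
    pos=jnp.zeros(3), obj=jnp.array([1.0, 1.0, 0.0]),
    vel=jnp.zeros(3), ang_vel=jnp.zeros(3), box_size=2.0,
)
```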

classmethod Create(min_box_size: float = 2.0, max_box_size: float = 2.0, max_steps: int = 1000, friction: float = 0.2, work_weight: float = 0.0) SingleRoller3D[source]#

Create a single-agent roller environment.

Parameters:
  • min_box_size (float) – Lower bound for the random square domain side length.

  • max_box_size (float) – Upper bound for the random square domain side length.

  • max_steps (int) – Episode length in physics steps.

  • friction (float) – Viscous drag coefficient applied as -friction * vel.

  • work_weight (float) – Penalty coefficient for large actions.

Returns:

A freshly constructed environment (call reset() before use).

Return type:

SingleRoller3D

static reset(env: SingleRoller3D, key: ArrayLike) Environment[source]#

Randomly place the agent and objective on the floor.

Parameters:
  • env (Environment) – Current environment instance.

  • key (ArrayLike) – JAX PRNG key.

Returns:

Freshly initialised environment.

Return type:

Environment
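A rough sketch of the random placement this performs, assuming uniform sampling inside the box with the sphere resting on the floor (the actual distribution, margins, and state layout are internal to the library):

```python
import jax
import jax.numpy as jnp

def random_floor_positions(key, box_size, radius):
    """Sample agent and objective x-y positions uniformly inside the box;
    the agent rests on the z = 0 floor, so its z equals its radius."""
    k_agent, k_obj = jax.random.split(key)
    agent_xy = jax.random.uniform(k_agent, (2,),
                                  minval=radius, maxval=box_size - radius)
    obj_xy = jax.random.uniform(k_obj, (2,),
                                minval=radius, maxval=box_size - radius)
    agent = jnp.concatenate([agent_xy, jnp.array([radius])])
    objective = jnp.concatenate([obj_xy, jnp.array([0.0])])
    return agent, objective

agent, objective = random_floor_positions(jax.random.PRNGKey(0), 2.0, 0.1)
```

Splitting the key keeps the two draws independent while the whole reset stays deterministic for a given PRNG key.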

static step(env: SingleRoller3D, action: Array) Environment[source]#

Apply a torque action, advance physics by one step.

Parameters:
  • env (Environment) – Current environment.

  • action (jax.Array) – 3-D torque vector per agent.

Returns:

Updated environment after one physics step.

Return type:

Environment

static observation(env: SingleRoller3D) Array[source]#

Per-agent observation vector.

Contents per agent:

  • Unit displacement to objective projected to x-y (shape (2,)).

  • Clamped displacement to objective projected to x-y (shape (2,)).

  • Velocity projected to x-y (shape (2,)).

  • Angular velocity (shape (3,)).

Returns:

Shape (N, 9).

Return type:

jax.Array

static reward(env: SingleRoller3D) Array[source]#

Returns a vector of per-agent rewards.

Exponential potential-based shaping:

\[\mathrm{rew}_i = e^{-2 \cdot d_i} - e^{-2 \cdot d_i^{\mathrm{prev}}}\]
Returns:

Shape (N,).

Return type:

jax.Array

static done(env: SingleRoller3D) Array[source]#

True when step_count exceeds max_steps.

property action_space_size: int[source]#

Per-agent flattened action dimensionality (3-D torque).

property action_space_shape: tuple[int][source]#

Per-agent action tensor shape.

property observation_space_size: int[source]#

Per-agent flattened observation dimensionality (9).