jaxdem.rl.environments.swarm_stacking_3d

Multi-agent 3-D swarm stacking environment with periodic boundaries.

Classes

SwarmStacking3D(state, system, env_params, ...)

Multi-agent 3-D stacking environment with periodic boundaries.

class jaxdem.rl.environments.swarm_stacking_3d.SwarmStacking3D(state: State, system: System, env_params: dict[str, Any], n_lidar_rays: int, n_lidar_elevation: int)

Bases: Environment

Multi-agent 3-D stacking environment with periodic boundaries.

Agents must stack on top of one another to reach as high as possible.

Reward

\[R_i = w_{\mathrm{climb}} \left(0.8\, z_i + 0.2\, \bar{z}_t\right) + w_{\mathrm{cohesion}} \sum \text{lidar} - w_w\,\|\tau_i\|^2 - w_{\mathrm{vel}}\,\|v_i\|^2 - \bar{r}_i\]

where \(\bar{z}_t\) is the average height of the swarm.
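To make the structure of the formula concrete, here is a minimal pure-Python sketch of it. It is not the library's implementation. It assumes that \(\tau_i\) is a per-agent torque vector, \(v_i\) a per-agent velocity vector, \(\bar{r}_i\) a per-agent baseline supplied by the caller, and that the weights correspond to the `climb_weight`, `cohesion_weight`, `work_weight` and `velocity_weight` defaults of `Create`. The function name `stacking_reward` is hypothetical.

```python
def stacking_reward(z, lidar, torque, vel, r_bar,
                    w_climb=20.0, w_cohesion=0.05, w_work=0.0, w_vel=2.0):
    """Per-agent reward following the formula above (illustrative only).

    z      : list of agent heights z_i
    lidar  : list of per-agent LiDAR reading lists
    torque : list of per-agent vectors tau_i (assumed to be torques)
    vel    : list of per-agent vectors v_i (assumed to be velocities)
    r_bar  : list of per-agent baselines r_bar_i (assumed meaning)
    """
    z_mean = sum(z) / len(z)  # swarm-average height, z_bar_t
    rewards = []
    for z_i, lidar_i, tau_i, v_i, rb_i in zip(z, lidar, torque, vel, r_bar):
        climb = w_climb * (0.8 * z_i + 0.2 * z_mean)
        cohesion = w_cohesion * sum(lidar_i)
        work = w_work * sum(c * c for c in tau_i)   # w_w * ||tau_i||^2
        speed = w_vel * sum(c * c for c in v_i)     # w_vel * ||v_i||^2
        rewards.append(climb + cohesion - work - speed - rb_i)
    return rewards
```

With these default weights the climb term dominates: a height gain of 1 contributes 16 to an agent's own reward and, spread over the swarm average, a further 20 · 0.2 / N to every agent's reward.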

Boundary Conditions:

- Periodic in X and Y.
- Frictional floor at Z=0.
- Effectively unbounded Z (large box size).
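The boundary conditions above can be sketched as follows: positions wrap in X and Y only, and displacements between agents use the nearest periodic image in those two axes while Z is left untouched. This is an illustration under the assumption that the periodic box starts at the origin. The helper names `wrap_xy` and `min_image_xy` are hypothetical and not part of jaxdem.

```python
def wrap_xy(pos, box):
    """Wrap x and y into [0, box) and leave z unchanged."""
    x, y, z = pos
    bx, by = box
    return (x % bx, y % by, z)


def min_image_xy(p, q, box):
    """Displacement q - p using the nearest periodic image in x and y only."""
    d = [q[0] - p[0], q[1] - p[1], q[2] - p[2]]
    for k in (0, 1):  # only the periodic axes
        d[k] -= box[k] * round(d[k] / box[k])
    return tuple(d)
```

For example, in a unit box two agents at x = 0.1 and x = 0.9 are separated by -0.2 across the boundary rather than by 0.8 through the interior.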

n_lidar_rays: int

Number of azimuthal bins for the 3-D LiDAR sensor.

n_lidar_elevation: int

Number of elevation bins for the 3-D LiDAR sensor.
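As an illustration of how azimuthal and elevation bins can partition the sphere around an agent, the sketch below maps a relative position vector to a pair of bin indices. The binning convention (azimuth measured from +X in [0, 2π), elevation measured from the XY plane in [-π/2, π/2], uniform bin widths) is an assumption made for this example and may differ from the sensor's actual layout. The function name `lidar_bin` is hypothetical.

```python
import math


def lidar_bin(dx, dy, dz, n_azimuth=8, n_elevation=8):
    """Return (azimuth_bin, elevation_bin) for a relative vector (dx, dy, dz)."""
    azimuth = math.atan2(dy, dx) % (2.0 * math.pi)    # angle in [0, 2*pi)
    elevation = math.atan2(dz, math.hypot(dx, dy))    # angle in [-pi/2, pi/2]
    az_bin = int(azimuth / (2.0 * math.pi) * n_azimuth)
    el_bin = int((elevation + math.pi / 2.0) / math.pi * n_elevation)
    # Clamp the upper edges (azimuth rounding to 2*pi, elevation == +pi/2).
    return min(az_bin, n_azimuth - 1), min(el_bin, n_elevation - 1)
```

Under this convention a neighbour straight ahead along +X falls in bin (0, 4) and one directly overhead falls in the top elevation bin, (0, 7).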

classmethod Create(N: int = 16, min_box_size: float = 0.5, max_box_size: float = 0.5, box_padding: float = 0.0, max_steps: int = 5760, friction: float = 0.2, ang_damping: float = 0.07, climb_weight: float = 20.0, cohesion_weight: float = 0.05, work_weight: float = 0.0, velocity_weight: float = 2.0, alpha_r_bar: float = 0.07, lidar_range: float = 0.5, n_lidar_rays: int = 8, n_lidar_elevation: int = 8, magnet_strength: float = 40.0, magnet_range: float = 0.12) -> SwarmStacking3D

Create a swarm stacking 3-D environment.

static reset(env: SwarmStacking3D, key: ArrayLike) -> Environment
static step(env: SwarmStacking3D, action: Array) -> Environment
static observation(env: SwarmStacking3D) -> Array
static reward(env: SwarmStacking3D) -> Array
static done(env: SwarmStacking3D) -> Array
property action_space_size: int

Flattened action size per agent. Actions passed to step() have shape (A, action_space_size).

property action_space_shape: tuple[int]

Original per-agent action shape (useful for reshaping inside the environment).

property observation_space_size: int

Flattened observation size per agent. observation() returns shape (A, observation_space_size).