Safety Gymnasium Environment#
Safety Gymnasium Interface#
Documentation
- class omnisafe.envs.safety_gymnasium_env.SafetyGymnasiumEnv(env_id, num_envs=1, device=DEVICE_CPU, **kwargs)[source]#
Safety Gymnasium Environment.
- Parameters:
env_id (str) – Environment id.
num_envs (int, optional) – Number of environments. Defaults to 1.
device (torch.device, optional) – Device to store the data. Defaults to torch.device('cpu').
- Keyword Arguments:
render_mode (str, optional) – The render mode, one of ‘human’, ‘rgb_array’, or ‘rgb_array_list’. Defaults to ‘rgb_array’.
camera_name (str, optional) – The camera name.
camera_id (int, optional) – The camera id.
width (int, optional) – The width of the rendered image. Defaults to 256.
height (int, optional) – The height of the rendered image. Defaults to 256.
- Variables:
need_auto_reset_wrapper (bool) – Whether to use auto reset wrapper.
need_time_limit_wrapper (bool) – Whether to use time limit wrapper.
Initialize an instance of SafetyGymnasiumEnv.
- property max_episode_steps: int#
The max steps per episode.
- render()[source]#
Compute the render frames as specified by render_mode during the initialization of the environment.
- Returns:
The render frames – we recommend returning np.ndarray frames, which can be assembled into a video with moviepy.
- Return type:
Any
- reset(seed=None, options=None)[source]#
Reset the environment.
- Parameters:
seed (int, optional) – The random seed. Defaults to None.
options (dict[str, Any], optional) – The options for the environment. Defaults to None.
- Returns:
observation – Agent’s observation of the current environment.
info – Some information logged by the environment.
- Return type:
tuple[torch.Tensor, dict[str, Any]]
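The seeded reset contract can be illustrated with a dependency-free stub (a hypothetical class, not the real SafetyGymnasiumEnv; plain floats stand in for torch.Tensor): passing the same seed to reset should reproduce the same initial observation.

```python
import random

class SeededEnvStub:
    """Toy env with the documented reset(seed, options) signature.

    Hypothetical stand-in for illustration only; the real environment
    returns torch.Tensor observations.
    """

    def __init__(self):
        self._rng = random.Random()

    def reset(self, seed=None, options=None):
        # Reseeding makes the initial observation reproducible.
        if seed is not None:
            self._rng.seed(seed)
        obs = [self._rng.random() for _ in range(3)]
        info = {}
        return obs, info

env = SeededEnvStub()
obs_a, _ = env.reset(seed=42)
obs_b, _ = env.reset(seed=42)
assert obs_a == obs_b  # same seed, same initial observation
```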
- set_seed(seed)[source]#
Set the seed for the environment.
- Parameters:
seed (int) – Seed to set.
- Return type:
None
- step(action)[source]#
Step the environment.
Note
OmniSafe uses an auto reset wrapper to reset the environment when an episode terminates. The returned obs is therefore the first observation of the next episode, and the true final observation is stored under the final_observation key of info.
- Parameters:
action (torch.Tensor) – Action to take.
- Returns:
observation – The agent’s observation of the current environment.
reward – The amount of reward returned after previous action.
cost – The amount of cost returned after previous action.
terminated – Whether the episode has ended.
truncated – Whether the episode has been truncated due to a time limit.
info – Some information logged by the environment.
- Return type:
tuple[Tensor, Tensor, Tensor, Tensor, Tensor, dict[str, Any]]
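The auto-reset behaviour described in the note above can be demonstrated with a minimal stub that follows the same step contract (a hypothetical class for illustration, using plain floats in place of torch.Tensor to keep the sketch self-contained):

```python
class AutoResetStub:
    """Toy env mimicking the documented auto-reset contract.

    Hypothetical stand-in: episodes last `horizon` steps, and
    observations are plain floats rather than torch.Tensor.
    """

    def __init__(self, horizon=3):
        self._horizon = horizon
        self._t = 0

    def reset(self, seed=None, options=None):
        self._t = 0
        return 0.0, {}

    def step(self, action):
        self._t += 1
        obs = float(self._t)
        reward, cost = 1.0, 0.0
        terminated = self._t >= self._horizon
        truncated = False
        info = {}
        if terminated:
            # Stash the true last observation, then auto-reset so the
            # returned obs is the first observation of the next episode.
            info['final_observation'] = obs
            obs, _ = self.reset()
        return obs, reward, cost, terminated, truncated, info

env = AutoResetStub(horizon=2)
env.reset()
env.step(0)                                   # mid-episode step
obs, _, _, terminated, _, info = env.step(0)  # terminal step
assert terminated and info['final_observation'] == 2.0
assert obs == 0.0  # first observation of the next episode
```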
Safety Gymnasium World Model#
Documentation
- class omnisafe.envs.safety_gymnasium_modelbased.SafetyGymnasiumModelBased(env_id, num_envs=1, device='cpu', use_lidar=False, **kwargs)[source]#
Safety Gymnasium environment for Model-based algorithms.
- Variables:
_support_envs (list[str]) – List of supported environments.
need_auto_reset_wrapper (bool) – Whether to use auto reset wrapper.
need_time_limit_wrapper (bool) – Whether to use time limit wrapper.
Initialize the environment.
- Parameters:
env_id (str) – Environment id.
num_envs (int, optional) – Number of environments. Defaults to 1.
device (torch.device, optional) – Device to store the data. Defaults to ‘cpu’.
use_lidar (bool, optional) – Whether to use lidar observation. Defaults to False.
- Keyword Arguments:
render_mode (str, optional) – The render mode, one of human, rgb_array, or rgb_array_list. Defaults to rgb_array.
camera_name (str, optional) – The camera name.
camera_id (int, optional) – The camera id.
width (int, optional) – The width of the rendered image. Defaults to 256.
height (int, optional) – The height of the rendered image. Defaults to 256.
- _dist_xy(pos1, pos2)[source]#
Return the distance from the robot to an XY position.
- Parameters:
pos1 (np.ndarray | list[np.ndarray]) – The first position.
pos2 (np.ndarray | list[np.ndarray]) – The second position.
- Returns:
distance – The distance between the two positions.
- Return type:
float
- _ego_xy(robot_matrix, robot_pos, pos)[source]#
Return the egocentric XY vector to a position from the robot.
- Parameters:
robot_matrix (list[list[float]]) – 3x3 rotation matrix.
robot_pos (np.ndarray) – 2D robot position.
pos (np.ndarray) – 2D position.
- Returns:
2D_egocentric_vector – The 2D egocentric vector.
- Return type:
ndarray
- _get_coordinate_sensor()[source]#
Return the coordinate observation and sensor observation.
The z-axis coordinate of every pose is ignored. The returned observation coordinates are all expressed in the robot's frame.
- Returns:
coordinate_obs – The coordinate observation.
- Return type:
dict
[str
,Any
]
- _get_flat_coordinate(coordinate_obs)[source]#
Get the flattened obs.
- Parameters:
coordinate_obs (dict[str, Any]) – The dict of coordinate and sensor observations.
- Returns:
flat_obs – The flattened observation.
- Return type:
ndarray
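One plausible sketch of the flattening step, assuming the dict values are array-like (the key ordering here is an assumption; the real method's layout may differ):

```python
import numpy as np

def get_flat_coordinate(coordinate_obs):
    """Flatten a dict of per-entity observations into one 1-D vector.

    Hypothetical helper for illustration; keys are sorted so the
    layout is deterministic across calls, which may differ from the
    real method's ordering.
    """
    parts = [np.ravel(np.asarray(coordinate_obs[key]))
             for key in sorted(coordinate_obs)]
    return np.concatenate(parts)

flat = get_flat_coordinate({'robot': [[0.0, 1.0]], 'goal': [2.0, 3.0]})
# sorted keys: 'goal' then 'robot' -> [2.0, 3.0, 0.0, 1.0]
```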
- _obs_lidar_pseudo(robot_matrix, robot_pos, positions)[source]#
Return a robot-centric lidar observation of a list of positions.
Lidar is a set of bins around the robot (divided evenly in a circle). The detection directions are exclusive and exhaustive for a full 360 view. Each bin reads 0 if there are no objects in that direction; otherwise it reads the fraction of lidar_max_dist covered towards the robot (if there are multiple objects in a bin, the closest one is used).
E.g. if the object is 90% of lidar_max_dist away, the bin will read 0.1, and if the object is 10% of lidar_max_dist away, the bin will read 0.9. (The reading can be thought of as “closeness” or inverse distance)
- This encoding has some desirable properties:
bins read 0 when empty
bins smoothly increase as objects get close
maximum reading is 1.0 (where the object overlaps the robot)
close objects occlude far objects
constant size observation with variable numbers of objects
- Parameters:
robot_matrix (list[list[float]]) – 3x3 rotation matrix.
robot_pos (np.ndarray) – 2D robot position.
positions (list[np.ndarray]) – 2D positions.
- Returns:
lidar_observation – The lidar observation.
- Return type:
ndarray
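The binning scheme described above can be sketched in numpy. This is a minimal sketch under assumptions (16 bins, a lidar_max_dist of 3.0, and a 3x3 world-from-robot rotation matrix), not the library's actual implementation:

```python
import numpy as np

def obs_lidar_pseudo(robot_matrix, robot_pos, positions,
                     num_bins=16, lidar_max_dist=3.0):
    """Pseudo-lidar: one 'closeness' reading per angular bin.

    Hypothetical sketch of the documented encoding; bin count and max
    distance are assumed defaults.
    """
    obs = np.zeros(num_bins)
    rot = np.asarray(robot_matrix)[:2, :2]  # 2D rotation part
    for pos in positions:
        # Egocentric vector: world offset rotated into the robot frame.
        ego = rot.T @ (np.asarray(pos)[:2] - np.asarray(robot_pos)[:2])
        dist = np.linalg.norm(ego)
        angle = np.arctan2(ego[1], ego[0]) % (2 * np.pi)
        bin_idx = int(angle / (2 * np.pi) * num_bins) % num_bins
        # Reading is 1 when the object overlaps the robot, 0 beyond max.
        closeness = max(0.0, 1.0 - dist / lidar_max_dist)
        # Closest object in a bin dominates (occlusion).
        obs[bin_idx] = max(obs[bin_idx], closeness)
    return obs
```

This reproduces the listed properties: empty bins read 0, readings grow smoothly to 1.0 as an object approaches, the per-bin max implements occlusion, and the output size is fixed regardless of the number of objects.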
- get_cost_from_obs_tensor(obs, is_binary=True)[source]#
Get batch cost from batch observation.
- Parameters:
obs (torch.Tensor) – Batch observation.
is_binary (bool, optional) – Whether to use binary cost. Defaults to True.
- Returns:
cost – Batch cost.
- Return type:
Tensor
- get_lidar_from_coordinate(obs)[source]#
Get lidar observation.
- Parameters:
obs (np.ndarray) – The observation.
- Returns:
lidar_obs – The lidar observation.
- Return type:
Tensor
- property max_episode_steps: int#
The max steps per episode.
- render()[source]#
Render the environment.
- Returns:
The rendered frames – we recommend returning `np.ndarray` frames, which can be assembled into a video with moviepy.
- Return type:
Any
- reset(seed=None, options=None)[source]#
Reset the environment.
- Parameters:
seed (int, optional) – The random seed. Defaults to None.
options (dict[str, Any], optional) – The options for the environment. Defaults to None.
- Returns:
observation – The initial observation of the space.
info – Some information logged by the environment.
- Return type:
tuple[torch.Tensor, dict[str, Any]]
- set_seed(seed)[source]#
Set the seed for the environment.
- Parameters:
seed (int) – Seed to set.
- Return type:
None
- step(action)[source]#
Step the environment.
Note
OmniSafe uses an auto reset wrapper to reset the environment when an episode terminates. The returned obs is therefore the first observation of the next episode, and the true final observation is stored under the final_observation key of info.
- Parameters:
action (torch.Tensor) – Action to take.
- Returns:
observation – The agent’s observation of the current environment.
reward – The amount of reward returned after previous action.
cost – The amount of cost returned after previous action.
terminated – Whether the episode has ended.
truncated – Whether the episode has been truncated due to a time limit.
info – Some information logged by the environment.
- Return type:
tuple[Tensor, Tensor, Tensor, Tensor, Tensor, dict[str, Any]]
- property task: str#
The name of the task.