MuJoCo Environment
MujocoEnv Interface
Documentation
- class omnisafe.envs.mujoco_env.MujocoEnv(env_id, num_envs=1, device='cpu', **kwargs)
Gymnasium MuJoCo environment.
- Variables:
need_auto_reset_wrapper (bool) – Whether to use auto reset wrapper.
need_time_limit_wrapper (bool) – Whether to use time limit wrapper.
Initialize the environment.
- Parameters:
env_id (str) – Environment id.
num_envs (int, optional) – Number of environments. Defaults to 1.
device (torch.device, optional) – Device to store the data. Defaults to ‘cpu’.
- Keyword Arguments:
render_mode (str, optional) – The render mode, one of 'human', 'rgb_array', or 'rgb_array_list'. Defaults to 'rgb_array'.
camera_name (str, optional) – The camera name.
camera_id (int, optional) – The camera id.
width (int, optional) – The width of the rendered image. Defaults to 256.
height (int, optional) – The height of the rendered image. Defaults to 256.
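The keyword arguments above can be collected before constructing the environment. The following is a minimal sketch, not part of OmniSafe: build_env_kwargs and make_env are hypothetical helper names, and the validation only mirrors the render modes and defaults listed in this section. The OmniSafe import is deferred so that the kwargs helper can be used without the library installed.

```python
# Render modes as listed in the Keyword Arguments above.
VALID_RENDER_MODES = ("human", "rgb_array", "rgb_array_list")


def build_env_kwargs(render_mode="rgb_array", camera_name=None,
                     camera_id=None, width=256, height=256):
    """Hypothetical helper: assemble the documented keyword arguments."""
    if render_mode not in VALID_RENDER_MODES:
        raise ValueError(f"unsupported render_mode: {render_mode!r}")
    kwargs = {"render_mode": render_mode, "width": width, "height": height}
    # Camera options have no documented default, so pass them only when set.
    if camera_name is not None:
        kwargs["camera_name"] = camera_name
    if camera_id is not None:
        kwargs["camera_id"] = camera_id
    return kwargs


def make_env(env_id, num_envs=1, device="cpu", **render_options):
    """Hypothetical helper: construct MujocoEnv with validated kwargs."""
    # Imported lazily; requires OmniSafe to be installed.
    from omnisafe.envs.mujoco_env import MujocoEnv

    return MujocoEnv(env_id, num_envs=num_envs, device=device,
                     **build_env_kwargs(**render_options))
```

The env_id is left to the caller, since this section does not list the supported ids.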
- property max_episode_steps: int
The max steps per episode.
- reset(seed=None, options=None)
Reset the environment.
- Parameters:
seed (int, optional) – The random seed. Defaults to None.
options (dict[str, Any], optional) – The options for the environment. Defaults to None.
- Returns:
observation – Agent’s observation of the current environment.
info – Auxiliary diagnostic information (helpful for debugging, and sometimes learning).
- Return type:
tuple[torch.Tensor, dict]
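As a rough illustration of the seed parameter, the sketch below resets an environment twice with the same seed and compares the two observations. reset_is_reproducible and SeededStubEnv are hypothetical names introduced here; the stub only imitates the documented reset signature and return shape. Whether a real MujocoEnv returns identical observations for identical seeds is an assumption about the underlying simulator, not something this section states.

```python
import random


def reset_is_reproducible(env, seed):
    """Reset twice with the same seed and compare the observations."""
    obs_a, _ = env.reset(seed=seed)
    obs_b, _ = env.reset(seed=seed)
    equal = obs_a == obs_b
    # Tensor-like results compare elementwise and expose .all();
    # plain Python values compare to a single bool.
    return bool(equal.all()) if hasattr(equal, "all") else bool(equal)


class SeededStubEnv:
    """Hypothetical stand-in that mimics reset(seed, options) -> (obs, info)."""

    def reset(self, seed=None, options=None):
        rng = random.Random(seed)
        return [rng.random() for _ in range(3)], {}
```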
- set_seed(seed)
Set the seed for the environment.
- Parameters:
seed (int) – Seed to set.
- Return type:
None
- step(action)
Step the environment.
Note
OmniSafe uses an auto-reset wrapper to reset the environment when the episode is terminated, so the returned obs is the first observation of the next episode. The true final observation of the finished episode is stored under the final_observation key of info.
- Parameters:
action (torch.Tensor) – Action to take.
- Returns:
observation – Agent’s observation of the current environment.
reward – Amount of reward returned after previous action.
cost – Amount of cost returned after previous action.
terminated – Whether the episode has ended.
truncated – Whether the episode has been truncated due to a time limit.
info – Auxiliary diagnostic information (helpful for debugging, and sometimes learning).
- Return type:
tuple[Tensor, Tensor, Tensor, Tensor, Tensor, dict[str, Any]]
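The auto-reset behaviour described in the note affects how a rollout loop should read the last observation of an episode. The sketch below is one way to handle it, assuming a single environment (num_envs=1) so that reward, cost, terminated and truncated each hold one value. rollout and StubEnv are hypothetical names; StubEnv is a stand-in that imitates the documented six-value step return and the final_observation key, not an OmniSafe class.

```python
class StubEnv:
    """Hypothetical stand-in that mimics the documented reset/step contract."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        obs, info = self.t, {}
        truncated = self.t >= self.horizon
        if truncated:
            # Imitate auto reset: keep the true last observation in info
            # and return the first observation of the next episode.
            info["final_observation"] = obs
            obs, _ = self.reset()
        return obs, 1.0, 0.5, False, truncated, info


def rollout(env, policy, num_steps):
    """Collect per-episode return, cost and true final observation."""
    obs, _ = env.reset()
    ep_ret, ep_cost, episodes = 0.0, 0.0, []
    for _ in range(num_steps):
        obs, reward, cost, terminated, truncated, info = env.step(policy(obs))
        ep_ret += float(reward)
        ep_cost += float(cost)
        if bool(terminated) or bool(truncated):
            # obs already belongs to the next episode at this point.
            final_obs = info.get("final_observation", obs)
            episodes.append({"return": ep_ret, "cost": ep_cost,
                             "final_observation": final_obs})
            ep_ret, ep_cost = 0.0, 0.0
    return episodes
```

For example, rollout(StubEnv(horizon=3), lambda obs: 0, 7) records two finished episodes and leaves the seventh step as part of an unfinished third one. With several parallel environments the scalar conversions would need to be replaced by per-environment bookkeeping.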