OmniSafe Config

Config(**kwargs)                                  Config class for storing hyperparameters.
ModelConfig(**kwargs)                             Model config.
get_default_kwargs_yaml(algo, env_id, algo_type)  Get the default kwargs from the YAML file.
check_all_configs(configs, algo_type)             Check all configs.
__check_algo_configs(configs, algo_type)          Check algorithm configs.
__check_logger_configs(configs)                   Check logger configs.

Config

OmniSafe stores all configurations in YAML files, which are located in omnisafe/configs. Each configuration file is divided into several parts.

Taking PPOLag as an example, the configuration file contains the following parts:

Config         Description
train_cfgs     Training configurations.
algo_cfgs      Algorithm configurations.
logger_cfgs    Logger configurations.
model_cfgs     Model configurations.
lagrange_cfgs  Lagrange configurations.

Specifically, train_cfgs contains the following entries:

Config           Description                                 Value
device           Device to use.                              cuda or cpu
torch_threads    Number of threads to use.                   16
vector_env_nums  Number of vectorized environments.          1
parallel         Number of parallel agents, similar to A3C.  1
total_steps      Total number of training steps.             1000000

The other configurations follow the same pattern as train_cfgs. Refer to the files in omnisafe/configs for more details.
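
To make the layout concrete, the sketch below mirrors the parts listed above as a plain Python dictionary. It is only an illustration of the structure, not the contents of the actual PPOLag file: the name `ppolag_cfgs` is hypothetical, only train_cfgs is filled in (with the values from the table), and the other sections are left empty because their keys are not listed on this page.

```python
# Hypothetical in-memory mirror of the configuration layout described above.
# Only train_cfgs is populated, using the values from the table; the
# remaining sections are placeholders because their keys are not listed here.
ppolag_cfgs = {
    "train_cfgs": {
        "device": "cpu",           # the table allows "cuda" or "cpu"
        "torch_threads": 16,       # number of threads to use
        "vector_env_nums": 1,      # number of vectorized environments
        "parallel": 1,             # number of parallel agents
        "total_steps": 1_000_000,  # total number of training steps
    },
    "algo_cfgs": {},
    "logger_cfgs": {},
    "model_cfgs": {},
    "lagrange_cfgs": {},
}

# Print each section name with the number of keys it currently holds.
for section, values in ppolag_cfgs.items():
    print(f"{section}: {len(values)} keys")
```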

Documentation

class omnisafe.utils.config.Config(**kwargs)

Config class for storing hyperparameters.

OmniSafe uses the Config class to store all hyperparameters. Hyperparameters are stored in a YAML file and loaded into a Config object. They are then checked for validity and passed to the algorithm class.

Variables:
  • seed (int) – Random seed.

  • device (str) – Device to use for training.

  • device_id (int) – Device id to use for training.

  • wrapper_type (str) – Wrapper type.

  • epochs (int) – Number of epochs.

  • steps_per_epoch (int) – Number of steps per epoch.

  • actor_iters (int) – Number of actor iterations.

  • critic_iters (int) – Number of critic iterations.

  • check_freq (int) – Frequency of checking.

  • save_freq (int) – Frequency of saving.

  • entropy_coef (float) – Entropy coefficient.

  • max_ep_len (int) – Maximum episode length.

  • num_mini_batches (int) – Number of mini batches.

  • actor_lr (float) – Actor learning rate.

  • critic_lr (float) – Critic learning rate.

  • log_dir (str) – Log directory.

  • target_kl (float) – Target KL divergence.

  • batch_size (int) – Batch size.

  • use_cost (bool) – Whether to use cost.

  • cost_gamma (float) – Cost gamma.

  • linear_lr_decay (bool) – Whether to use linear learning rate decay.

  • exploration_noise_anneal (bool) – Whether to use exploration noise anneal.

  • penalty_param (float) – Penalty parameter.

  • kl_early_stop (bool) – Whether to use KL early stop.

  • use_max_grad_norm (bool) – Whether to use max gradient norm.

  • max_grad_norm (float) – Max gradient norm.

  • use_critic_norm (bool) – Whether to use critic norm.

  • critic_norm_coeff (float) – Critic norm coefficient.

  • model_cfgs (ModelConfig) – Model config.

  • buffer_cfgs (Config) – Buffer config.

  • gamma (float) – Discount factor.

  • lam (float) – Lambda.

  • lam_c (float) – Lambda for cost.

  • adv_eastimator (AdvatageEstimator) – Advantage estimator.

  • standardized_rew_adv (bool) – Whether to use standardized reward advantage.

  • standardized_cost_adv (bool) – Whether to use standardized cost advantage.

  • env_cfgs (Config) – Environment config.

  • num_envs (int) – Number of environments.

  • async_env (bool) – Whether to use asynchronous environments.

  • env_name (str) – Environment name.

  • env_kwargs (dict) – Environment keyword arguments.

  • normalize_obs (bool) – Whether to normalize observation.

  • normalize_rew (bool) – Whether to normalize reward.

  • normalize_cost (bool) – Whether to normalize cost.

  • max_len (int) – Maximum length.

  • num_threads (int) – Number of threads.

Keyword Arguments:

kwargs (Any) – Keyword arguments used to set the attributes.

Initialize an instance of Config.

static dict2config(config_dict)

Convert dictionary to Config.

Parameters:

config_dict (dict[str, Any]) – The dictionary to be converted.

Returns:

The algorithm config.

Return type:

Config
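
The idea behind a dictionary-to-config conversion can be sketched with the standard library alone. The example below is not OmniSafe's implementation: the helper `dict_to_namespace` is hypothetical and uses `types.SimpleNamespace` as a stand-in for Config, only to show how nested dictionaries can become nested attribute-style objects.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


def dict_to_namespace(config_dict: dict[str, Any]) -> SimpleNamespace:
    """Recursively convert a nested dict into attribute-style objects.

    Hypothetical stand-in for a dict-to-config conversion; SimpleNamespace
    plays the role of the config object here.
    """
    namespace = SimpleNamespace()
    for key, value in config_dict.items():
        # Nested dictionaries become nested namespaces.
        if isinstance(value, dict):
            value = dict_to_namespace(value)
        setattr(namespace, key, value)
    return namespace


cfgs = dict_to_namespace({"train_cfgs": {"device": "cpu", "total_steps": 1000000}})
print(cfgs.train_cfgs.device)
```

After the conversion, nested values are reachable with dotted access (`cfgs.train_cfgs.total_steps`) instead of chained key lookups.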

recurisve_update(update_args)

Recursively update args.

Parameters:

update_args (dict[str, Any]) – Args to be updated.

Return type:

None
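
A recursive update differs from a flat `dict.update` in that nested sections are merged key by key rather than replaced wholesale. The sketch below shows that behavior on plain dictionaries; the function `recursive_update` is a hypothetical illustration, and how the real method treats keys that are missing from the existing config is not specified on this page, so this sketch simply adds them.

```python
from __future__ import annotations

from typing import Any


def recursive_update(base: dict[str, Any], update_args: dict[str, Any]) -> None:
    """Merge update_args into base in place, descending into nested dicts.

    Hypothetical sketch on plain dicts, not the Config method itself.
    """
    for key, value in update_args.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            # Both sides are dicts: merge instead of overwriting the section.
            recursive_update(base[key], value)
        else:
            # Leaf value (or new key): overwrite or add it.
            base[key] = value


defaults = {"train_cfgs": {"device": "cpu", "total_steps": 1000000}}
recursive_update(defaults, {"train_cfgs": {"device": "cuda"}})
print(defaults)
```

Only `device` changes; `total_steps` keeps its default because the update did not mention it.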

todict()

Convert Config to dictionary.

Returns:

The dictionary of Config.

Return type:

dict[str, Any]

tojson()

Convert Config to json string.

Returns:

The json string of Config.

Return type:

str
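
The two conversions above compose naturally: a nested config object is first flattened back into dictionaries, and that dictionary is then serialized to JSON. The sketch below illustrates this with `types.SimpleNamespace` standing in for Config; the helper `namespace_to_dict` is hypothetical, and the indentation used for the JSON string is an arbitrary choice, not necessarily what OmniSafe emits.

```python
from __future__ import annotations

import json
from types import SimpleNamespace
from typing import Any


def namespace_to_dict(namespace: SimpleNamespace) -> dict[str, Any]:
    """Recursively convert nested namespaces back into plain dicts."""
    result: dict[str, Any] = {}
    for key, value in vars(namespace).items():
        if isinstance(value, SimpleNamespace):
            value = namespace_to_dict(value)
        result[key] = value
    return result


# SimpleNamespace stands in for a Config object with one nested section.
cfgs = SimpleNamespace(seed=0, train_cfgs=SimpleNamespace(device="cpu"))

as_dict = namespace_to_dict(cfgs)
as_json = json.dumps(as_dict, indent=4)
print(as_json)
```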

Model Config

Documentation

class omnisafe.utils.config.ModelConfig(**kwargs)

Model config.

Initialize an instance of ModelConfig.

Common Method

Documentation

omnisafe.utils.config.get_default_kwargs_yaml(algo, env_id, algo_type)

Get the default kwargs from the YAML file.

Note

This function searches for the YAML file by algorithm name and environment name. Make sure that a newly implemented algorithm or environment has the same name as its YAML file.

Parameters:
  • algo (str) – The algorithm name.

  • env_id (str) – The environment name.

  • algo_type (str) – The algorithm type.

Returns:

The default kwargs.

Return type:

Config
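
The note above implies that the file name is derived from the algorithm name. A minimal sketch of such a lookup follows. The directory layout `<config_root>/<algo_type>/<algo>.yaml`, the helper `default_yaml_path`, and the example value `"on-policy"` for algo_type are all assumptions made for illustration; they are not confirmed by this page. How env_id selects environment-specific values is not described here and is left out.

```python
from pathlib import Path


def default_yaml_path(config_root: Path, algo: str, algo_type: str) -> Path:
    """Build the path of a default YAML file from the algorithm name.

    Assumed layout (hypothetical): <config_root>/<algo_type>/<algo>.yaml
    """
    return config_root / algo_type / f"{algo}.yaml"


# "on-policy" is an assumed example value for algo_type.
path = default_yaml_path(Path("omnisafe/configs"), "PPOLag", "on-policy")
print(path.as_posix())
```

Under this assumed layout, a new algorithm named `MyAlgo` would need a file called `MyAlgo.yaml`; a mismatch between the two names would make the lookup fail.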

Documentation

omnisafe.utils.config.check_all_configs(configs, algo_type)

Check all configs.

This function is used to check the configs.

Parameters:
  • configs (Config) – The configs to be checked.

  • algo_type (str) – The algorithm type.

Return type:

None

Documentation

omnisafe.utils.config.__check_algo_configs(configs, algo_type)

Check algorithm configs.

This function is used to check the algorithm configs.

Note

  • update_iters must be greater than 0 and must be int.

  • steps_per_epoch must be greater than 0 and must be int.

  • batch_size must be greater than 0 and must be int.

  • target_kl must be greater than 0 and must be float.

  • entropy_coeff must be in [0, 1] and must be float.

  • gamma must be in [0, 1] and must be float.

  • cost_gamma must be in [0, 1] and must be float.

  • lam must be in [0, 1] and must be float.

  • lam_c must be in [0, 1] and must be float.

  • clip must be greater than 0 and must be float.

  • penalty_coeff must be greater than 0 and must be float.

  • reward_normalize must be bool.

  • cost_normalize must be bool.

  • obs_normalize must be bool.

  • kl_early_stop must be bool.

  • use_max_grad_norm must be bool.

  • use_cost must be bool.

  • max_grad_norm must be greater than 0 and must be float.

  • adv_estimation_method must be in [gae, v-trace, gae-rtg, plain].

  • standardized_rew_adv must be bool.

  • standardized_cost_adv must be bool.

Parameters:
  • configs (Config) – The configs to be checked.

  • algo_type (str) – The algorithm type.

Return type:

None
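
The constraints in the note above translate directly into type and range assertions. The sketch below checks a subset of them on a plain dictionary. It is an illustration under stated assumptions, not the library's own check: the function `check_algo_values` is hypothetical, and the values in `example_cfgs` are made up for the demonstration.

```python
from typing import Any, Mapping


def check_algo_values(cfgs: Mapping[str, Any]) -> None:
    """Assert a subset of the constraints listed in the note above."""
    for name in ("update_iters", "steps_per_epoch", "batch_size"):
        value = cfgs[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        assert isinstance(value, int) and not isinstance(value, bool), f"{name} must be int"
        assert value > 0, f"{name} must be greater than 0"
    for name in ("gamma", "cost_gamma", "lam", "lam_c"):
        value = cfgs[name]
        assert isinstance(value, float), f"{name} must be float"
        assert 0.0 <= value <= 1.0, f"{name} must be in [0, 1]"
    for name in ("kl_early_stop", "use_max_grad_norm", "use_cost"):
        assert isinstance(cfgs[name], bool), f"{name} must be bool"
    methods = ("gae", "v-trace", "gae-rtg", "plain")
    assert cfgs["adv_estimation_method"] in methods, f"must be one of {methods}"


# Illustrative values only; they are not taken from any OmniSafe YAML file.
example_cfgs = {
    "update_iters": 40,
    "steps_per_epoch": 20000,
    "batch_size": 64,
    "gamma": 0.99,
    "cost_gamma": 0.99,
    "lam": 0.95,
    "lam_c": 0.95,
    "kl_early_stop": True,
    "use_max_grad_norm": True,
    "use_cost": True,
    "adv_estimation_method": "gae",
}
check_algo_values(example_cfgs)  # passes silently when every check holds
```

Passing `gamma=1.5` or `update_iters=True` to this sketch raises an `AssertionError` that names the offending key.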

Documentation

omnisafe.utils.config.__check_logger_configs(configs)

Check logger configs.

Parameters:
  • configs (Config) – The configs to be checked.

  • algo_type (str) – The algorithm type.

Return type:

None