Model-based Algorithms

CAPPETS

Documentation

class omnisafe.algorithms.model_based.CAPPETS(env_id, cfgs)

The Conservative and Adaptive Penalty (CAP) algorithm implementation based on PETS.

References

Title: Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning
Authors: Yecheng Jason Ma, Andrew Shen, Osbert Bastani, Dinesh Jayaraman.
URL: CAP

Initialize an instance of the algorithm.

_init_log()

Initialize the logger.

Things to log	Description
Plan/feasible_num	The number of feasible plans.
Plan/episode_costs_max	The maximum planning cost.
Plan/episode_costs_mean	The mean planning cost.
Plan/episode_costs_min	The minimum planning cost.
Metrics/LagrangeMultiplier	The Lagrange multiplier.
Plan/var_penalty_max	The maximum planning penalty.
Plan/var_penalty_mean	The mean planning penalty.
Plan/var_penalty_min	The minimum planning penalty.

Return type: None
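The `Plan/*` cost keys in the table above can be pictured as summary statistics over the predicted costs of the candidate plans. The following is a minimal sketch, not OmniSafe's implementation: `plan_cost_stats` is a hypothetical helper, and treating a plan as feasible when its predicted cost is at most `cost_limit` is an assumption.

```python
from statistics import mean


def plan_cost_stats(episode_costs, cost_limit):
    """Summarize the predicted costs of candidate plans (hypothetical helper)."""
    if not episode_costs:
        raise ValueError("episode_costs must not be empty")
    return {
        # Assumption: a plan counts as feasible when its cost is within the limit.
        "Plan/feasible_num": sum(cost <= cost_limit for cost in episode_costs),
        "Plan/episode_costs_max": max(episode_costs),
        "Plan/episode_costs_mean": mean(episode_costs),
        "Plan/episode_costs_min": min(episode_costs),
    }


# Four candidate plans with illustrative predicted costs.
stats = plan_cost_stats([0.0, 2.0, 4.0, 6.0], cost_limit=3.0)
```

In a real run these values would be handed to the logger under the same key names; the dictionary here only illustrates what each key measures.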

_init_model()

Initialize the dynamics model and the planner.

CAP uses the following models:

dynamics model: to predict the next state and the cost.
Lagrange multiplier: to trade off between the cost and the reward.
planner: to generate the action.

Return type: None
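To make the role of the multiplier concrete, here is a minimal sketch of a conservative, adaptive penalty in the spirit of CAP. It is an illustration under stated assumptions rather than the library's code: the function names, the learning rate `lr`, and the exact update rule are hypothetical, and the uncertainty term stands in for whatever the dynamics model reports (for example, ensemble variance).

```python
def penalized_cost(predicted_cost, uncertainty, kappa):
    """Inflate the predicted cost by a multiplier-weighted uncertainty penalty."""
    return predicted_cost + kappa * uncertainty


def update_multiplier(kappa, episode_cost, cost_limit, lr=0.1):
    """Raise kappa after a constraint violation, lower it otherwise.

    Assumption: a simple gradient-style step, clipped so kappa stays non-negative.
    """
    return max(0.0, kappa + lr * (episode_cost - cost_limit))


kappa = 1.0
# The observed episode cost (30) exceeds the limit (25), so the penalty grows.
kappa = update_multiplier(kappa, episode_cost=30.0, cost_limit=25.0)
conservative = penalized_cost(predicted_cost=2.0, uncertainty=0.5, kappa=kappa)
```

A larger multiplier makes the planner more cautious about uncertain plans; a multiplier near zero reduces the planner to trusting the model's cost prediction as is.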

_save_model()

Save the model.

Return type: None

_update_epoch()

Update function per epoch.

Return type: None

CCEPETS

Documentation

class omnisafe.algorithms.model_based.CCEPETS(env_id, cfgs)

The Constrained Cross-Entropy (CCE) algorithm implementation based on PETS.

References

Title: Constrained Cross-Entropy Method for Safe Reinforcement Learning
Authors: Min Wen, Ufuk Topcu.
URL: CCE

Initialize an instance of the algorithm.

_init_log()

Initialize the logger keys for the CCE algorithm.

Things to log	Description
Plan/feasible_num	The number of feasible plans.
Plan/episode_costs_max	The maximum planning cost.
Plan/episode_costs_mean	The mean planning cost.
Plan/episode_costs_min	The minimum planning cost.

Return type: None

_init_model()

Initialize the dynamics model and the planner.

CCEPETS uses the following models:

dynamics model: to predict the next state and the cost.
planner: to generate the action.

Return type: None
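The planner's handling of constraints can be sketched through the elite-selection step of a constrained cross-entropy method. This is a minimal illustration assuming the commonly described rule (rank by cost when too few samples are feasible, otherwise rank the feasible samples by reward); `select_elites` and its parameters are hypothetical names, not part of the OmniSafe API.

```python
def select_elites(rewards, costs, cost_limit, num_elites):
    """Return the indices of the elite samples for one constrained-CEM iteration."""
    indices = range(len(rewards))
    feasible = [i for i in indices if costs[i] <= cost_limit]
    if len(feasible) < num_elites:
        # Too few feasible samples: move toward feasibility by ranking on cost.
        ranked = sorted(indices, key=lambda i: costs[i])
    else:
        # Enough feasible samples: keep only those and rank them on reward.
        ranked = sorted(feasible, key=lambda i: rewards[i], reverse=True)
    return ranked[:num_elites]


rewards = [5.0, 9.0, 1.0, 7.0]
costs = [0.0, 4.0, 1.0, 2.0]
elites_loose = select_elites(rewards, costs, cost_limit=2.5, num_elites=2)
elites_tight = select_elites(rewards, costs, cost_limit=0.5, num_elites=2)
```

The sampling distribution would then be refit to the elite action sequences; the number of entries in `feasible` corresponds to what `Plan/feasible_num` reports.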

RCEPETS

Documentation

class omnisafe.algorithms.model_based.RCEPETS(env_id, cfgs)

The Robust Cross Entropy (RCE) algorithm implementation based on PETS.

References

Title: Constrained Model-based Reinforcement Learning with Robust Cross-Entropy Method
Authors: Zuxin Liu, Hongyi Zhou, Baiming Chen, Sicheng Zhong, Martial Hebert, Ding Zhao.
URL: RCE

Initialize an instance of the algorithm.

_init_log()

Initialize the logger.

Things to log	Description
Plan/feasible_num	The number of feasible plans.
Plan/episode_costs_max	The maximum planning cost.
Plan/episode_costs_mean	The mean planning cost.
Plan/episode_costs_min	The minimum planning cost.
Metrics/LagrangeMultiplier	The Lagrange multiplier.

Return type: None

_init_model()

Initialize the dynamics model and the planner.

RCEPETS uses the following models:

dynamics model: to predict the next state and the cost.
planner: to generate the action.

Return type: None
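As a rough picture of a dynamics model that predicts both the next state and the cost, the sketch below uses a toy ensemble of one-dimensional linear models. Everything here is hypothetical (`ToyEnsemble`, the gains, the cost rule), and aggregating the ensemble's costs by taking the worst case is one conservative choice assumed for illustration, not something the source confirms about RCEPETS.

```python
class ToyEnsemble:
    """A toy ensemble: each member predicts next_state = state + gain * action."""

    def __init__(self, gains):
        self.gains = gains

    def predict(self, state, action):
        next_states = [state + gain * action for gain in self.gains]
        # Hypothetical cost rule: a cost of 1 whenever |state| leaves the unit interval.
        costs = [1.0 if abs(s) > 1.0 else 0.0 for s in next_states]
        return next_states, costs


def worst_case_cost(costs):
    """Conservative aggregation: trust the most pessimistic ensemble member."""
    return max(costs)


model = ToyEnsemble(gains=[0.4, 0.5, 0.6])
next_states, costs = model.predict(state=0.5, action=1.0)
robust_cost = worst_case_cost(costs)
mean_cost = sum(costs) / len(costs)
```

Here only one of the three members predicts a violation, so the mean cost understates the risk while the worst-case cost flags it.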

Safe LOOP

Documentation

class omnisafe.algorithms.model_based.SafeLOOP(env_id, cfgs)

The Safe Learning Off-Policy with Online Planning (SafeLOOP) algorithm.

References

Title: Learning Off-Policy with Online Planning
Authors: Harshit Sikchi, Wenxuan Zhou, David Held.
URL: SafeLOOP

Initialize an instance of the algorithm.

_init_log()

Initialize the logger keys for the algorithm.

Things to log	Description
Plan/feasible_num	The number of feasible plans.
Plan/episode_costs_max	The maximum planning cost.
Plan/episode_costs_mean	The mean planning cost.
Plan/episode_costs_min	The minimum planning cost.

Return type: None

_init_model()

Initialize the dynamics model and the planner.

SafeLOOP uses the following models:

dynamics model: to predict the next state and the cost.
planner: to generate the action.

Return type: None
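How a planner turns a dynamics model into an action can be sketched with receding-horizon planning: simulate candidate action sequences through the model, score them, and execute only the first action of the best one. The block below is a generic random-shooting sketch under stated assumptions, not SafeLOOP's actual planner; `plan_first_action`, the toy dynamics, and the reward and cost functions are all hypothetical.

```python
import random


def plan_first_action(state, dynamics, reward_fn, cost_fn,
                      horizon=5, num_candidates=64, cost_limit=0.0, seed=0):
    """Return the first action of the best sampled plan (random shooting)."""
    rng = random.Random(seed)
    best_action, best_key = None, None
    for _ in range(num_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, total_reward, total_cost = state, 0.0, 0.0
        for a in actions:
            s = dynamics(s, a)  # roll the model forward one step
            total_reward += reward_fn(s)
            total_cost += cost_fn(s)
        feasible = total_cost <= cost_limit
        # Feasible plans always beat infeasible ones; ties break on reward,
        # or on lower cost when no plan is feasible.
        key = (feasible, total_reward if feasible else -total_cost)
        if best_key is None or key > best_key:
            best_key, best_action = key, actions[0]
    return best_action


# Toy 1-D task: move toward s = 1.0 while incurring a cost above s = 0.5.
action = plan_first_action(
    state=0.0,
    dynamics=lambda s, a: s + 0.1 * a,
    reward_fn=lambda s: -abs(s - 1.0),
    cost_fn=lambda s: 1.0 if s > 0.5 else 0.0,
)
```

Calling the function again at the next state, rather than executing the whole sequence, is what makes the scheme receding-horizon.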