BaseAgent

class grid2op.Agent.AgentWithConverter(action_space, action_space_converter=None, **kwargs_converter)

Compared to a regular BaseAgent, these types of agents are able to deal with different representations of grid2op.BaseAction.BaseAction and grid2op.BaseObservation.BaseObservation.

As any other agent, an AgentWithConverter implements the BaseAgent.act() method, but it works slightly differently. It receives in this method an observation, as an object (ie an instance of grid2op.BaseObservation). This object can then be converted to any other object with the method AgentWithConverter.convert_obs().

Then, this transformed_observation is passed to the method AgentWithConverter.my_act(), which must be defined for each agent. This function outputs an encoded_act, which can be whatever you want it to be.

Finally, the encoded_act is decoded into a proper action, an object of class grid2op.BaseAction.BaseAction, thanks to the method AgentWithConverter.convert_act().

This allows, for example, to represent actions as integers, which makes it easier to train standard discrete control algorithms, such as those used to solve Atari games.
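The observation → convert_obs → my_act → convert_act → action flow can be sketched without grid2op. In the sketch below, DummyObs and the dictionary returned by convert_act are hypothetical stand-ins for grid2op objects, used only to illustrate the control flow:

```python
import numpy as np

# "DummyObs" is a hypothetical stand-in for a grid2op observation.
class DummyObs:
    def __init__(self):
        self.load_p = np.array([1.0, 2.0])
        self.rho = np.array([0.5, 0.7])

class SketchAgentWithConverter:
    def convert_obs(self, observation):
        # observation object -> flat vector a model can consume
        return np.concatenate((observation.load_p, observation.rho))

    def my_act(self, transformed_observation, reward, done=False):
        # toy "policy": pick the index of the largest feature as the encoded action
        return int(np.argmax(transformed_observation))

    def convert_act(self, encoded_act):
        # integer id -> a dictionary-like action (hypothetical representation)
        return {"set_line_status": encoded_act}

    def act(self, observation, reward, done=False):
        transformed = self.convert_obs(observation)
        encoded = self.my_act(transformed, reward, done)
        return self.convert_act(encoded)

agent = SketchAgentWithConverter()
action = agent.act(DummyObs(), reward=0.0, done=False)
```

The three overridable hooks (convert_obs, my_act, convert_act) are the only pieces a concrete agent needs to supply; act() just chains them.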
NB It is possible either to define AgentWithConverter.convert_obs() and AgentWithConverter.convert_act(), or to define a grid2op.Converters.Converter and feed it to the action_space_converter parameter used to initialise the class. The second option is preferred, as AgentWithConverter.action_space will then directly be this converter. Such an agent will really behave as if the actions are encoded the way it wants.
Examples

For example, imagine a BaseAgent that uses a neural network to take its decisions.

Suppose also that, after some feature engineering, it is best for the neural network to use only the load active values (grid2op.BaseObservation.BaseObservation.load_p) and the sum of the relative flows (grid2op.BaseObservation.BaseObservation.rho) with the active flows (grid2op.BaseObservation.BaseObservation.p_or) [NB such an agent would not make sense a priori, but who knows].

Suppose that this neural network can be accessed through a class AwesomeNN (not available…) that can predict some actions. It can be loaded with the “load” method and make predictions with the “predict” method.

For the sake of the example, we will suppose that this agent only predicts powerline statuses (so 0 or 1) that are represented as a vector. So we need to take extra care to convert this vector from a numpy array to a valid action.

This is done below:
import numpy as np
import grid2op
import AwesomeNN  # this does not exist!

# create a simple environment
env = grid2op.make()

# define the class above
class AgentCustomObservation(AgentWithConverter):
    def __init__(self, action_space, path):
        AgentWithConverter.__init__(self, action_space)
        self.my_neural_network = AwesomeNN()
        self.my_neural_network.load(path)

    def convert_obs(self, observation):
        # convert the observation
        return np.concatenate((observation.load_p,
                               observation.rho + observation.p_or))

    def convert_act(self, encoded_act):
        # convert back the action, output from the NN "self.my_neural_network",
        # to a valid action
        act = self.action_space({"set_status": encoded_act})
        return act

    def my_act(self, transformed_observation, reward, done=False):
        act_predicted = self.my_neural_network.predict(transformed_observation)
        return act_predicted

# make the agent that behaves as expected
my_agent = AgentCustomObservation(action_space=env.action_space, path=".")
# this agent is perfectly working :-) You can use it as any other agent.
action_space_converter

    The converter that is used to represent the BaseAgent action space. Might be set to None if not initialized.

    Type:
        grid2op.Converters.Converter

init_action_space

    The initial action space. This corresponds to the action space of the grid2op.Environment.Environment.

action_space

    If a converter is used, then this action space is this converter. The agent will behave as if the action space is directly encoded the way it wants.

    Type:
        grid2op.Converters.ActionSpace
act(observation, reward, done=False)

    Standard method of an AgentWithConverter. There is no need to overload this function.

    Parameters:
        observation (grid2op.Observation.Observation) – The current observation of the grid2op.Environment.Environment
        reward (float) – The current reward. This is the reward obtained by the previous action
        done (bool) – Whether the episode has ended or not. Used to maintain gym compatibility

    Returns:
        res – The action chosen by the bot / controller / agent.
convert_act(encoded_act)

    This function converts an “encoded action”, which can be of any type, into a valid action that can be ingested by the environment.

    Parameters:
        encoded_act (object) – Anything that represents an action.

    Returns:
        act – A valid action, represented as a class instance, that corresponds to the encoded action given as input.

    Return type:
        grid2op.BaseAction.BaseAction
convert_obs(observation)

    This function converts the observation, which is an object of class grid2op.BaseObservation.BaseObservation, into a representation understandable by the BaseAgent.

    For example, an agent could want to look only at the relative flows grid2op.BaseObservation.BaseObservation.rho to take its decision. This is possible by overloading this method.

    This method can also be used to scale the observation such that each component has, for example, mean 0 and variance 1.

    Parameters:
        observation (grid2op.Observation.Observation) – Initial observation received by the agent in the BaseAgent.act() method.

    Returns:
        res – Anything that will be used by the BaseAgent to take decisions.

    Return type:
        object
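Scaling inside convert_obs can be sketched as follows. This is a minimal, self-contained illustration; the per-component mean and standard deviation values are hypothetical and would normally be estimated from recorded episodes:

```python
import numpy as np

# Hypothetical per-component statistics, e.g. estimated from past episodes.
OBS_MEAN = np.array([10.0, 0.5, 0.5])
OBS_STD = np.array([5.0, 0.25, 0.25])

def convert_obs(observation_vector):
    # scale each component to (approximately) mean 0 and variance 1
    return (observation_vector - OBS_MEAN) / OBS_STD

scaled = convert_obs(np.array([15.0, 0.5, 0.25]))
```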
abstractmethod my_act(transformed_observation, reward, done=False)

    This method must be overridden if this class is used: it is an “abstract” method. It is the method to implement for anyone who wants to make an agent that handles different kinds of actions and observations.

    Parameters:
        transformed_observation (object) – Anything that will be used to create an action. This is the result of the call to AgentWithConverter.convert_obs(). This is likely a numpy array.
        reward (float) – The current reward. This is the reward obtained by the previous action
        done (bool) – Whether the episode has ended or not. Used to maintain gym compatibility

    Returns:
        res – A representation of an action in any possible format. This action will then be ingested and formatted into a valid action with the AgentWithConverter.convert_act() method.

    Return type:
        object
class grid2op.Agent.BaseAgent(action_space)

This class represents the base class of a BaseAgent. All bots / controllers / agents used in the Grid2Op simulator should derive from this class.

To work properly, it is advised to create a BaseAgent after the grid2op.Environment has been created, and to reuse the grid2op.Environment.Environment.action_space to build the BaseAgent.

action_space

    It represents the action space, ie a tool that can serve to create valid actions. Note that a valid action can be illegal or ambiguous, and so lead to a “game over” or to an error. But at least it will have a proper size.

abstractmethod act(observation, reward, done=False)

    This is the main method of a BaseAgent. Given the current observation and the current reward (ie the reward that the environment sent to the agent after the previous action was implemented), it returns the next action to take.

    Parameters:
        observation (grid2op.Observation.Observation) – The current observation of the grid2op.Environment.Environment
        reward (float) – The current reward. This is the reward obtained by the previous action
        done (bool) – Whether the episode has ended or not. Used to maintain gym compatibility

    Returns:
        res – The action chosen by the bot / controller / agent.
class grid2op.Agent.DoNothingAgent(action_space)

This is the most basic BaseAgent. It is purely passive and does absolutely nothing.

act(observation, reward, done=False)

    As better explained in the documentation of grid2op.BaseAction.update() or grid2op.BaseAction.ActionSpace.__call__(), the preferred way to make an object of type action is to call grid2op.BaseAction.ActionSpace.__call__() with the dictionary representing the action. In this case, the action is “do nothing” and it is represented by the empty dictionary.

    Parameters:
        observation (grid2op.Observation.Observation) – The current observation of the grid2op.Environment.Environment
        reward (float) – The current reward. This is the reward obtained by the previous action
        done (bool) – Whether the episode has ended or not. Used to maintain gym compatibility

    Returns:
        res – The action chosen by the bot / controller / agent.
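The do-nothing pattern boils down to calling the action space with an empty dictionary. In the sketch below, the callable action_space is a hypothetical stand-in for grid2op.BaseAction.ActionSpace.__call__():

```python
# Hypothetical stand-in for an action space: calling it with a dict
# builds an action; an empty dict means "do nothing".
def action_space(act_dict):
    return {"description": "do nothing" if not act_dict else "modify grid",
            "content": dict(act_dict)}

class DoNothingSketch:
    def act(self, observation, reward, done=False):
        # the empty dict is the do-nothing action
        return action_space({})

agent = DoNothingSketch()
res = agent.act(observation=None, reward=0.0)
```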
class grid2op.Agent.GreedyAgent(action_space, action_space_converter=None)

This is a class of “greedy BaseAgent”. Greedy agents all execute the same kind of algorithm to take an action:

    They simulate (grid2op.BaseObservation.simulate()) all actions in a given set
    They take the action that maximises the simulated reward among all these actions

To ease the creation of such agents, we created this abstract class (objects of this class cannot be created). Two examples of such greedy agents are provided with PowerLineSwitch and TopologyGreedy.

abstractmethod _get_tested_action(observation)

    Returns the list of all the candidate actions.

    From this list, the one that achieves the best “simulated reward” is used.

    Parameters:
        observation (grid2op.Observation.Observation) – The current observation of the grid2op.Environment.Environment

    Returns:
        res – A list of all candidate grid2op.BaseAction.BaseAction

    Return type:
        list

act(observation, reward, done=False)

    By definition, all “greedy” agents act the same way. The only thing that can differentiate multiple such agents is the set of actions that are tested.

    These actions are defined in the method _get_tested_action(). This act() method implements the greedy logic: take the action that maximises the instantaneous reward on the simulated action.

    Parameters:
        observation (grid2op.Observation.Observation) – The current observation of the grid2op.Environment.Environment
        reward (float) – The current reward. This is the reward obtained by the previous action
        done (bool) – Whether the episode has ended or not. Used to maintain gym compatibility

    Returns:
        res – The action chosen by the bot / controller / agent.
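The greedy logic can be sketched independently of grid2op. Here, simulate is a hypothetical stand-in for grid2op.BaseObservation.simulate(), and the string action names are placeholders for real actions:

```python
# Hypothetical stand-in for observation.simulate(action): returns a
# simulated reward for each candidate action.
def simulate(action):
    simulated_rewards = {"do_nothing": 0.1,
                         "disconnect_line_0": 0.7,
                         "disconnect_line_1": 0.3}
    return simulated_rewards[action]

def greedy_act(tested_actions):
    # simulate every candidate and keep the one with the best reward
    best_action, best_reward = None, float("-inf")
    for action in tested_actions:
        reward = simulate(action)
        if reward > best_reward:
            best_action, best_reward = action, reward
    return best_action

chosen = greedy_act(["do_nothing", "disconnect_line_0", "disconnect_line_1"])
```

Concrete subclasses only change what _get_tested_action() returns; the selection loop itself never varies.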
class grid2op.Agent.MLAgent(action_space, action_space_converter=<class 'grid2op.Converter.ToVect.ToVect'>, **kwargs_converter)

This agent handles only vectors. Its “my_act” function returns the “do nothing” action, so it needs to be overridden.

In this class, “my_act” is expected to return a vector that can be directly converted into a valid action.

convert_from_vect(act)

    Helper to convert an action, represented as a numpy array, into a grid2op.BaseAction instance.

    Parameters:
        act (numpy.ndarray) – An action represented as a vector, to be cast into a grid2op.BaseAction.BaseAction instance.

    Returns:
        res – The act parameter converted into a proper grid2op.BaseAction.BaseAction object.

my_act(transformed_observation, reward, done=False)

    This method must be overridden if this class is used: it is an “abstract” method. It is the method to implement for anyone who wants to make an agent that handles different kinds of actions and observations.

    Parameters:
        transformed_observation (object) – Anything that will be used to create an action. This is the result of the call to AgentWithConverter.convert_obs(). This is likely a numpy array.
        reward (float) – The current reward. This is the reward obtained by the previous action
        done (bool) – Whether the episode has ended or not. Used to maintain gym compatibility

    Returns:
        res – A representation of an action in any possible format. This action will then be ingested and formatted into a valid action with the AgentWithConverter.convert_act() method.

    Return type:
        object
class grid2op.Agent.OneChangeThenNothing(action_space, action_space_converter=None)

This is a specific kind of BaseAgent. It performs one BaseAction (possibly non-empty) at the first time step and then does nothing.

This class is an abstract class and cannot be instantiated (ie no object of this class can be created). It must be overridden and the method OneChangeThenNothing._get_dict_act() must be defined. Basically, it must know what action to do.

abstractmethod _get_dict_act()

    Function that needs to be overridden to indicate which action to perform.

    Returns:
        res – A dictionary that can be converted into a valid grid2op.BaseAction.BaseAction. See the help of grid2op.BaseAction.ActionSpace.__call__() for more information.

    Return type:
        dict
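The “one change then nothing” pattern can be sketched without grid2op. The base class below and the action dictionaries it returns are hypothetical stand-ins for grid2op objects:

```python
# Sketch of the "one change then nothing" pattern. The base class and
# the action dictionaries are hypothetical stand-ins for grid2op objects.
class OneChangeThenNothingSketch:
    def __init__(self):
        self.has_acted = False

    def _get_dict_act(self):
        # must be overridden: which single action to perform
        raise NotImplementedError

    def act(self, observation, reward, done=False):
        if self.has_acted:
            return {}  # empty dict = do nothing from now on
        self.has_acted = True
        return self._get_dict_act()

class DisconnectFirstLine(OneChangeThenNothingSketch):
    def _get_dict_act(self):
        # hypothetical action dictionary: disconnect powerline 0
        return {"set_line_status": [(0, -1)]}

agent = DisconnectFirstLine()
first = agent.act(None, 0.0)   # the one change
second = agent.act(None, 0.0)  # do nothing afterwards
```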
act(observation, reward, done=False)

    This is the main method of a BaseAgent. Given the current observation and the current reward (ie the reward that the environment sent to the agent after the previous action was implemented), it returns the next action to take.

    Parameters:
        observation (grid2op.Observation.Observation) – The current observation of the grid2op.Environment.Environment
        reward (float) – The current reward. This is the reward obtained by the previous action
        done (bool) – Whether the episode has ended or not. Used to maintain gym compatibility

    Returns:
        res – The action chosen by the bot / controller / agent.
class
grid2op.Agent.
PowerLineSwitch
(action_space)[source]¶ This is a
GreedyAgent
example, which will attempt to disconnect powerlines.It will choose among:
doing nothing
disconnecting one powerline
which action that will maximize the reward. All powerlines are tested.
-
_get_tested_action
(observation)[source]¶ Returns the list of all the candidate actions.
From this list, the one that achieve the best “simulated reward” is used.
- Parameters
observation (
grid2op.Observation.Observation
) – The current observation of thegrid2op.Environment.Environment
- Returns
res – A list of all candidate
grid2op.BaseAction.BaseAction
- Return type
list
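Building this candidate set (do nothing, plus one disconnection per powerline) can be sketched as follows; the dictionary action representation is a hypothetical stand-in, not the real grid2op encoding:

```python
# Hypothetical candidate enumeration for a grid with n_line powerlines:
# the do-nothing action plus one action disconnecting each line.
def get_tested_actions(n_line):
    candidates = [{}]  # empty dict = do nothing
    for line_id in range(n_line):
        candidates.append({"set_line_status": [(line_id, -1)]})
    return candidates

actions = get_tested_actions(3)
```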
class grid2op.Agent.RandomAgent(action_space, action_space_converter=<class 'grid2op.Converter.IdToAct.IdToAct'>, **kwargs_converter)

This agent acts randomly on the powergrid. It uses grid2op.Converters.IdToAct to compute all the possible actions available for the environment, and then chooses a random one among all of them.

my_act(transformed_observation, reward, done=False)

    This method must be overridden if this class is used: it is an “abstract” method. It is the method to implement for anyone who wants to make an agent that handles different kinds of actions and observations.

    Parameters:
        transformed_observation (object) – Anything that will be used to create an action. This is the result of the call to AgentWithConverter.convert_obs(). This is likely a numpy array.
        reward (float) – The current reward. This is the reward obtained by the previous action
        done (bool) – Whether the episode has ended or not. Used to maintain gym compatibility

    Returns:
        res – A representation of an action in any possible format. This action will then be ingested and formatted into a valid action with the AgentWithConverter.convert_act() method.

    Return type:
        object
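The random-choice logic can be sketched as follows; the integer action ids are a hypothetical stand-in for what an IdToAct-style converter would enumerate:

```python
import random

# Hypothetical list of all encoded actions, as an IdToAct-style converter
# would enumerate them: each action is identified by an integer id.
ALL_ACTION_IDS = list(range(10))

def my_act(transformed_observation, reward, done=False):
    # ignore the observation entirely and pick a random action id
    return random.choice(ALL_ACTION_IDS)

encoded = my_act(None, 0.0)
```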
class grid2op.Agent.TopologyGreedy(action_space, action_space_converter=None)

This is a GreedyAgent example, which will attempt to reconfigure the substations’ connectivity.

It will choose among:

    doing nothing
    changing the topology of one substation

_get_tested_action(observation)

    Returns the list of all the candidate actions.

    From this list, the one that achieves the best “simulated reward” is used.

    Parameters:
        observation (grid2op.Observation.Observation) – The current observation of the grid2op.Environment.Environment

    Returns:
        res – A list of all candidate grid2op.BaseAction.BaseAction

    Return type:
        list