Implementation Tips

The Main CogsAgent.cs Script

  • CogsAgent.cs is the base script that your own script extends, so your script inherits its functionality. This means that many of the values and functions defined in CogsAgent.cs are accessible to you
  • Make sure you read through the script and all the descriptions so you know everything you have access to
  • In general, variables and functions marked protected can be accessed directly in your script (by just typing out their names). Variables marked private, on the other hand, are not directly accessible
      • Some private variables can be accessed via getter functions, which are all placed under GETTERS in CogsAgent. For example, moveSpeed cannot be accessed directly, but its value can be obtained using GetMoveSpeed(). This means moveSpeed cannot be modified, but its value can still be used in any comparison or calculation you want
  • Public and protected functions can be called from your script, except for MonoBehaviour functions (such as Start() and FixedUpdate()) and OnCollisionEnter() and OnTriggerEnter(), which already have function definitions in your script
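
The access rules above can be sketched with plain C# stand-ins. BaseAgent below is a hypothetical stand-in written for illustration, not the real CogsAgent, and the field values are made up; only the pattern (private field, public getter, protected field) mirrors what the tips describe.

    using System;

    // Hypothetical stand-in for CogsAgent: same access pattern, not the real class.
    public class BaseAgent
    {
        private float moveSpeed = 5f;   // private: hidden from subclasses
        protected int team = 1;         // protected: visible to subclasses

        // Getter that exposes a read-only view of the private field.
        public float GetMoveSpeed() { return moveSpeed; }
    }

    // Hypothetical stand-in for your own agent script.
    public class MyAgent : BaseAgent
    {
        public void Report()
        {
            // moveSpeed = 10f;  // would not compile: moveSpeed is private to BaseAgent
            float halfSpeed = GetMoveSpeed() / 2f;  // reading through the getter is allowed
            Console.WriteLine("team " + team + ", half speed " + halfSpeed);
        }
    }

    public static class Demo
    {
        public static void Main() { new MyAgent().Report(); }
    }

The subclass can read team by name and can use the value of moveSpeed through the getter, but it has no way to assign a new value to moveSpeed.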


ML Agents Components

  • ML Agents components that specify properties of your neural network are attached to your robot prefab (under Resources/MyGroupName/[MyGroupName prefab]). These values can be adjusted directly by selecting the prefab and editing them in the Inspector window. The two components of interest are:
    • Behavior Parameters: discussed in lab 5. More information can be found here
    • Ray Perception Sensor 3D: a way to add raycasts as observations more easily. Here is a brief tutorial of what the component is and what the modifiable values are
  • To modify network inputs and outputs, look at the agent functions provided in your script
    • CollectObservations(): more values can be added here as input to the network. Hint: look at variables you have access to from CogsAgent and think about what information would be useful to add here
    • Heuristic(): for directly modifying network output with keyboard input. This function might need to be modified if changes are made to the output array size and values
    • OnActionReceived(): modify what to do with the network output. If you wish to change what action each output corresponds to, this function will need to be modified
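
As a minimal sketch of adding network inputs, the fragment below extends CollectObservations() with two extra observations. It assumes GetMoveSpeed() returns a float and that your script already overrides CollectObservations(VectorSensor); which values are worth observing is a design choice, and these two are only illustrative.

    using Unity.MLAgents.Sensors;

    // Fragment: goes inside your agent class (the one that extends CogsAgent).
    public override void CollectObservations(VectorSensor sensor)
    {
        // ... keep the observations already provided in your script ...

        // Extra input (assumption: GetMoveSpeed() returns a float): adds 1 value.
        sensor.AddObservation(GetMoveSpeed());

        // Extra input: the robot's position. A Vector3 adds 3 values.
        sensor.AddObservation(transform.localPosition);
    }

Each added value increases the size of the observation vector, so the vector observation Space Size in the Behavior Parameters component has to be raised to match (by 4 for the two calls above).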

Adding Rewards

In this setup, rewards can be added in two ways:

  • Using AddReward() or SetReward() inside code written in your script. For example, you can call AddReward() in OnCollisionEnter() to add a certain amount of reward when the robot collides with a specific object
    • Note that AddReward() adds the specified amount to the reward accumulated in the current step (and therefore to your cumulative reward), while SetReward() overwrites the current step's reward with the value you specify
  • Assigning reward values to rewardDict, with initial values specified in the helper AssignBasicRewards()
    • Because the code for the robot’s basic functionalities is hidden in CogsAgent, rewardDict is used to store key-value pairs that specify rewards for some of the robot’s basic actions
      • “frozen”: when the robot is frozen i.e. it is hit by a laser
      • “shooting-laser”: when the robot’s laser is being shot
      • “hit-enemy”: when the robot’s laser hits the enemy
      • “dropped-one-target”: applied once for each target dropped by the robot when it is hit by a laser e.g. if the specified value is -1f and the robot is carrying 3 targets when it is hit, a reward of -3f will be applied
      • “dropped-targets”: one-time reward applied when the robot drops its targets after being hit by a laser e.g. if the specified value is -1f and the robot is hit while carrying at least 1 target, a reward of -1f will be applied regardless of the actual number of targets that were carried
    • If you wish to do a one-time assignment for these rewards, simply adjust them in AssignBasicRewards() to the desired rewards for each action
    • If, instead, you want to adjust these values during runtime, use the same format as in AssignBasicRewards() to change the desired reward values
      • Note: once changed, these values are not reset. For instance, if “frozen” is assigned a reward of -1f in AssignBasicRewards() and is later changed to -2f at runtime, then -2f remains the reward value for “frozen” for the remainder of the training, carrying over to subsequent episodes
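
The two approaches might look roughly like the fragments below. This is a sketch under assumptions: the "Wall" tag and all numeric values are hypothetical, and rewardDict is assumed to be indexable by string like a Dictionary<string, float>; check AssignBasicRewards() in your script for the actual format.

    // Fragments for your agent class; they are not a complete script.

    // 1) Event-based reward, placed inside the OnCollisionEnter() already defined
    //    in your script (collision is the Collision argument Unity passes in).
    if (collision.gameObject.CompareTag("Wall"))   // "Wall" is a hypothetical tag
    {
        AddReward(-0.1f);   // small penalty added on top of the reward collected so far
    }

    // 2) Rewards for built-in actions, assuming dictionary-style assignment.
    rewardDict["frozen"] = -1f;      // applied when the robot is hit by a laser
    rewardDict["hit-enemy"] = 1f;    // applied when the robot's laser hits the enemy

Assignments of the second kind made outside AssignBasicRewards() stay in effect for later episodes, as noted above, so change them at runtime only when that persistence is intended.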

Counting enemy’s captured targets

There is an easy way of finding this information by looking at each of the targets and seeing where it is. You can copy this function into your agent’s script. Note that it can be easily changed to determine what your enemy is carrying too…

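    private int GetEnemyCaptured(){
        int enemyCaptured = 0;
        int enemyCarry = 0;   // unused here; kept for extending the function to count carried targets
        int available = 0;    // unused here; kept for counting targets that are still free
        foreach(GameObject target in targets){

            // NOTE: the component type was missing from the original listing.
            // Target is an assumed class name; replace it with the script attached to the targets.
            int inBase = target.GetComponent<Target>().GetInBase();
            int carried = target.GetComponent<Target>().GetCarried();

            // If a target is in some base (inBase != 0) and that base is not yours,
            // it must be in the enemy base.
            if (inBase != team && inBase != 0){
                enemyCaptured += 1;
            }
        }

        return enemyCaptured;
    }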