| 1 | +# Use MAES Optimization to Find the Best Solution |
| 2 | + |
| 3 | +Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is a type of evolutionary algorithm, and MAES (Matrix Adaptation Evolution Strategy) is a simplified variant of it. See the [Wikipedia page](https://en.wikipedia.org/wiki/CMA-ES) for details. |
| 4 | + |
| 5 | +To use our code, you only need to understand the overall process: |
| 6 | +1. Initialize the optimizer with an initial guess, and get the first generation of children from the optimizer. |
| 7 | +2. Tell the optimizer how good each child in this generation is by assigning a value to each of them. |
| 8 | +3. The optimizer updates its covariance matrix based on how good those children are, and generates the next generation of children. |
| 9 | +4. Repeat steps 2 and 3 until you are satisfied with the result, or until you give up. |
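The loop described in the steps above can be sketched in plain C#. This is only an illustration of the ask/evaluate/tell cycle, not the `MAES` or `LMMAES` implementation: `SimpleES`, `Ask`, `Tell`, and `AskTellDemo` are hypothetical names, and the stand-in optimizer uses a fixed isotropic step size instead of adapting a matrix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the optimizer. It samples children around a mean with a
// fixed isotropic step size; real MAES additionally adapts a matrix and the step size.
public class SimpleES
{
    private readonly Random rng;
    private readonly double stepSize;
    private double[] mean;

    public SimpleES(double[] initialMean, double stepSize, int seed)
    {
        mean = (double[])initialMean.Clone();
        this.stepSize = stepSize;
        rng = new Random(seed);
    }

    public double[] Mean { get { return mean; } }

    // Steps 1 and 3: produce one generation of children around the current mean.
    public List<double[]> Ask(int populationSize)
    {
        var children = new List<double[]>();
        for (int i = 0; i < populationSize; i++)
            children.Add(mean.Select(m => m + stepSize * NextGaussian()).ToArray());
        return children;
    }

    // Steps 2 and 3: take one value per child (lower is better in this sketch)
    // and move the mean to the average of the better half.
    public void Tell(List<double[]> children, List<float> values)
    {
        int keep = Math.Max(1, children.Count / 2);
        var best = Enumerable.Range(0, children.Count)
            .OrderBy(i => values[i]).Take(keep).Select(i => children[i]).ToList();
        mean = Enumerable.Range(0, mean.Length)
            .Select(d => best.Average(c => c[d])).ToArray();
    }

    // Standard normal sample via the Box-Muller transform.
    private double NextGaussian()
    {
        double u1 = 1.0 - rng.NextDouble(); // in (0, 1], so Log is defined
        double u2 = rng.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }
}

public static class AskTellDemo
{
    // Value of a child: squared distance to the point (3, -2). Lower is better.
    public static float Score(double[] x)
    {
        return (float)((x[0] - 3.0) * (x[0] - 3.0) + (x[1] + 2.0) * (x[1] + 2.0));
    }

    public static double[] Run(int generations)
    {
        var es = new SimpleES(new double[] { 0.0, 0.0 }, 0.3, 0);
        for (int g = 0; g < generations; g++)      // step 4: repeat
        {
            List<double[]> children = es.Ask(20);  // get a generation
            List<float> values = children.Select(c => Score(c)).ToList(); // step 2
            es.Tell(children, values);             // step 3
        }
        return es.Mean;
    }
}
```

Calling `AskTellDemo.Run(200)` should return a mean close to (3, -2). The helper scripts described below wrap this kind of loop so that you only supply the evaluation.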
| 10 | + |
| 11 | +We provide helper scripts in Unity so that you can use MAES optimization without much coding, whether or not you are using Unity's ML-Agents. |
| 12 | + |
| 13 | +If you want to use the low-level optimizers directly, check the `LMMAES` and `MAES` classes and the `IMAES` interface. |
| 14 | + |
| 15 | +## Use ESOptimizer.cs |
| 16 | +Example scene: `UnityTensorflow/Examples/IntelligentPool/BilliardMAESOnly-OneShot-UseMAESDirectly`. |
| 17 | + |
| 18 | +`ESOptimizer.cs` is a helper script that you can attach to a GameObject. Here are the steps: |
| 19 | +1. Attach `ESOptimizer.cs` to any GameObject, and set the parameters in the inspector as needed (see [ESOptimizer parameters](#esoptimizer-parameters) for their meaning). |
| 20 | +2. Implement the `IESOptimizable` interface for the AI agent you want to optimize. |
| 21 | +```csharp |
| 22 | +public interface IESOptimizable { |
| 23 | + |
| 24 | + /// <summary> |
| 25 | + /// Evaluate a batch of params. |
| 26 | + /// </summary> |
| 27 | + /// <param name="param">Each item in the list is a set of parameters.</param> |
| 28 | + /// <returns>List of values of each parameter set in the input</returns> |
| 29 | + List<float> Evaluate(List<double[]> param); |
| 30 | + |
| 31 | + /// <summary> |
| 32 | + /// Return the dimension of the parameters |
| 33 | + /// </summary> |
| 34 | + /// <returns>dimension of the parameters</returns> |
| 35 | + int GetParamDimension(); |
| 36 | +} |
| 37 | +``` |
| 38 | +Note that the `Evaluate` method above should be a batch operation. Each item in the input list is one child and you need to return the values of all children in the input list. |
| 39 | + |
| 40 | +3. Call one of the following two methods, depending on your needs: |
| 41 | +```csharp |
| 42 | + /// <summary> |
| 43 | + /// Start optimizing asynchronously. It does not actually run in another thread; it runs in Update() in each frame of your game. |
| 44 | + /// This way the optimization will not block your game. |
| 45 | + /// </summary> |
| 46 | + /// <param name="optimizeTarget">Target to optimize</param> |
| 47 | + /// <param name="onReady">Action to call when optimization is ready. The input is the best solution found.</param> |
| 48 | + /// <param name="initialMean">initial mean guess.</param> |
| 49 | + public void StartOptimizingAsync(IESOptimizable optimizeTarget, Action<double[]> onReady = null, double[] initialMean = null) |
| 50 | +``` |
| 51 | + or |
| 52 | + |
| 53 | +```csharp |
| 54 | + /// <summary> |
| 55 | + /// Optimize and return the solution immediately. |
| 56 | + /// </summary> |
| 57 | + /// <param name="optimizeTarget">Target to optimize</param> |
| 58 | + /// <param name="initialMean">initial mean guess.</param> |
| 59 | + /// <returns>The best solution found</returns> |
| 60 | + public double[] Optimize(IESOptimizable optimizeTarget, double[] initialMean = null) |
| 61 | +``` |
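As an illustration of step 2, here is a minimal sketch of an `IESOptimizable` implementation. The interface is repeated only so that the sketch compiles on its own; in the project, use the existing definition. `DistanceTarget` and its goal point are hypothetical, and returning a squared distance assumes the optimizer's `mode` is set to minimize.

```csharp
using System.Collections.Generic;

// Repeated from the interface shown above so this sketch is self-contained.
public interface IESOptimizable
{
    List<float> Evaluate(List<double[]> param);
    int GetParamDimension();
}

// Hypothetical target: find the 2D point closest to (3, -2).
public class DistanceTarget : IESOptimizable
{
    private static readonly double[] goal = { 3.0, -2.0 };

    // Batch evaluation: one returned value per child, in the same order as the input.
    public List<float> Evaluate(List<double[]> param)
    {
        var values = new List<float>(param.Count);
        foreach (double[] child in param)
        {
            double sum = 0.0;
            for (int i = 0; i < goal.Length; i++)
            {
                double d = child[i] - goal[i];
                sum += d * d;
            }
            values.Add((float)sum); // squared distance to the goal
        }
        return values;
    }

    // Each child is an array with this many entries.
    public int GetParamDimension()
    {
        return goal.Length;
    }
}
```

In a real agent, `Evaluate` would typically run a simulation for each child (for example, one shot in the billiard example) and return the resulting score.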
| 62 | + |
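The two calls in step 3 might be used as follows. This is a sketch under assumptions: `OptimizeOnStart`, its fields, and its placeholder scoring are hypothetical; only the `Optimize` and `StartOptimizingAsync` signatures come from the text above. If `ESOptimizer` and `IESOptimizable` live in a namespace in your copy of the project, add the matching `using` directive.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical component that optimizes itself with an ESOptimizer on start.
public class OptimizeOnStart : MonoBehaviour, IESOptimizable
{
    public ESOptimizer optimizer;   // assign in the inspector
    public bool runAsync = true;    // true: spread the work over frames

    void Start()
    {
        if (runAsync)
        {
            // Returns immediately; OnReady is called once the optimizer finishes.
            optimizer.StartOptimizingAsync(this, OnReady);
        }
        else
        {
            // Blocks until the optimizer stops, then returns the best solution.
            OnReady(optimizer.Optimize(this));
        }
    }

    void OnReady(double[] best)
    {
        Debug.Log("Found a solution with " + best.Length + " parameters");
    }

    // Placeholder scoring (sum of squares); replace with your own evaluation.
    public List<float> Evaluate(List<double[]> param)
    {
        var values = new List<float>(param.Count);
        foreach (double[] child in param)
        {
            double sum = 0.0;
            foreach (double x in child) sum += x * x;
            values.Add((float)sum);
        }
        return values;
    }

    public int GetParamDimension()
    {
        return 2;
    }
}
```

The asynchronous call suits evaluations that are slow enough to cause visible stalls, while the blocking call is simpler when a short pause is acceptable.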
| 63 | + |
| 64 | +## Use DecisionMAES for ML-Agents |
| 65 | +Example scenes: under `UnityTensorflow/Examples/IntelligentPool/BilliardSLAndMAES-xxxx`. |
| 66 | + |
| 67 | +We also have a `DecisionMAES` class which implements [AgentDependentDecision](AgentDependentDeicision.md) using MAES. If your agent has implemented `IESOptimizable`, you can just attach `DecisionMAES.cs` to your agent and use it for [PPO](Training-PPO.md) or [Supervised Learning](Training-SL.md). |
| 68 | + |
| 69 | +## TrainerMAES.cs |
| 70 | +This is deprecated, but you can still use it. Use `AgentES` as the base class instead of `Agent`, and use `TrainerMAES` as the Trainer for the `CoreBrainInternalTrainable`. |
| 71 | + |
| 72 | +Example scene: `UnityTensorflow/Examples/IntelligentPool/BilliardMAESOnly-OneShot-UseTrainer`. |
| 73 | + |
| 74 | +## ESOptimizer parameters |
| 75 | +The parameters that you can change in the inspector of `ESOptimizer.cs`: |
| 76 | +- `iterationPerUpdate`: When using asynchronous optimization, the number of generations per frame. Adjust this depending on how fast your evaluation is. |
| 77 | +- `populationSize`: Number of children in each generation. |
| 78 | +- `optimizerType`: MAES or LMMAES (Limited Memory MAES). Use MAES for a small parameter dimension and LMMAES for a larger one. |
| 79 | +- `initialStepSize`: Initial step size, which sets the initial variance of the children. |
| 80 | +- `mode`: Whether to maximize or minimize the value. |
| 81 | +- `maxIteration`: The optimizer stops automatically when it reaches the maximum number of iterations. |
| 82 | +- `targetValue`: The optimizer stops automatically when the best solution reaches the target value. |
| 83 | +- `evalutaionBatchSize`: The maximum batch size when evaluating. A larger batch might speed up evaluation if your `IESOptimizable`'s `Evaluate(List<double[]> param)` method performs better on batches. |