
A1. Base Experiment

Ezequiel Torres Feyuk edited this page Aug 12, 2017 · 39 revisions

The following appendix explains in detail the logic behind the base experiment. The base experiment is the one deployed when a new experiment is created; it is also known as the Toy Problem (toy_problem). Before explaining the experiment itself, a brief description is given of the interface that must be implemented to create an MLC experiment:

Experiment Format

Code Snippet N°1 shows the Evaluation file of an experiment. It has three functions which are explained in detail:

# Helper method
def individual_data(indiv):
    # TODO: Implement experiment evaluation
    ...
    return x, y, b


def cost(indiv):
    # Evaluate individual
    x, y, b = individual_data(indiv)

    # TODO: Calculate cost as a function of y and b
    ...
    return cost


def show_best(index, generation, indiv, cost, block=True):
    # Evaluate individual
    x, y, b = individual_data(indiv)

    # TODO: Create a graphic with Matplotlib or Qt showing
    # how good the individual is
   
Code Snippet N°1: Experiment Interface
  • individual_data (optional): Method responsible for evaluating the Individual received as a parameter. It is recommended to separate the evaluation of the Individual from the Cost calculation, since the evaluation is also used when generating graphs. Since this function is not called directly by MLC, its input and output can be modified by the user if necessary.

  • cost (mandatory): Method that receives an Individual and returns the Cost associated with it. This function is called explicitly by MLC, so its input and output parameters must not be changed. To calculate the Cost, the Individual must be evaluated first; for this, we recommend using the function individual_data.

  • show_best (mandatory): Method that receives the best Individual (lowest cost) of a generation and creates a graph showing the quality of the solution found. This function is called explicitly by MLC, so its input and output parameters must not be changed. It is executed when the Best Individual button in the Results Tab window is clicked. It is the user's responsibility to create a graph that shows how good the function found by MLC is. If the user does not want to implement this feature, the function can simply be left unimplemented.
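
As a minimal sketch of how these three functions fit together, the following example wires them up around a hypothetical MockIndividual class. Note that the real MLC Individual exposes get_formal()/get_tree() rather than an evaluate() method, and the target function and domain here are purely illustrative:

```python
import numpy as np


class MockIndividual:
    """Hypothetical stand-in for an MLC Individual; evaluates x**2."""
    def evaluate(self, x):
        return x ** 2


def individual_data(indiv):
    x = np.linspace(-1.0, 1.0, num=11)   # arbitrary comparison domain
    y = x ** 2 + 1                       # illustrative target function
    b = indiv.evaluate(x)                # evaluation of the Individual
    return x, y, b


def cost(indiv):
    x, y, b = individual_data(indiv)
    # Sum of squared point-by-point differences, as in the toy problem
    return float(np.sum((b - y) ** 2))
```

Here cost(MockIndividual()) returns 11.0, since b - y equals -1 at each of the 11 sample points.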

Toy Problem

Code Snippet N°2 shows the code of the base experiment. The goal of the experiment is to find a well-known mathematical function (the test function) in order to test the effectiveness of MLC as a pattern matching tool. The simplest way to fulfill this goal is to find the Individual that best matches the test function, comparing the values of both functions point by point.

Lines 14 and 15 define the test function, a hyperbolic tangent that takes a third-degree polynomial as its argument. This function was chosen because MLC cannot generate it exactly: by default, polynomials are not available among the functions used by MLC (polynomials can be supported by adding the power operation to the regular operators implemented by MLC via the property opsetrange). Taking this into consideration, Individuals created with this experiment will only be approximations of the actual function.
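
The test function can be inspected on its own; nothing here is specific to MLC. Because tanh saturates quickly, the function sits on two flat plateaus near -1 and +1, with all the interesting variation concentrated in a narrow band near the center of the domain:

```python
import numpy as np

SAMPLES = 201
x = np.linspace(-10.0, 10.0, num=SAMPLES)
y = np.tanh(x**3 - x**2 - 1)

# tanh saturates: the output is bounded in [-1, 1] and the transition
# between the two plateaus happens in a narrow band around x ~ 1.5
print(y[0], y[-1])   # close to -1.0 and 1.0
```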

Line 19 shows how a noise component is added to the original test function. The idea behind adding noise to the signal is to emulate a real scenario, which will be full of undesired stimuli. As a result, the noise component will tend to dissuade MLC from discovering Individuals near desirable regions of the solution space.
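
The noise term itself is easy to examine in isolation. The sketch below reproduces only the uniform component of line 19 (the configurable artificialnoise offset read from the EVALUATOR section is left out, i.e. assumed to be 0), with a fixed seed so the illustration is reproducible:

```python
import numpy as np
import random as rd

SAMPLES = 201
rd.seed(42)  # fixed seed, for reproducibility of this illustration

# Uniform noise in [-0.5, 0.5), one sample per domain point
noise_term = np.array([rd.random() - 0.5 for _ in range(SAMPLES)])
```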

In line 30 the Individual is evaluated. MLC represents Individuals as trees containing mathematical functions in their non-leaf nodes and constants or sensors in their leaf nodes. The function get_tree() of Individual lets the user access this tree (class LispTreeExpr) in order to replace sensors with valid values. In the toy_problem only one sensor is configured. The value of the sensor is overridden by passing a list of values to the method calculate_expression. Since we just want to compare two functions, the sensor represents the domain over which the functions are compared. Note that the number of points and the range used in the example are entirely arbitrary. The result of the calculation is then stored in the variable b (the name of this variable was chosen to follow the nomenclature defined in the section MLC Foundations).
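
Conceptually, calculate_expression evaluates the Individual's expression over the whole domain at once, thanks to NumPy broadcasting. The sketch below mimics that behavior by eval-uating a hypothetical formal string (the real LispTreeExpr walks its tree instead of calling eval); it also shows why the float check at lines 35-36 of Code Snippet N°2 is needed:

```python
import numpy as np

x = np.linspace(-10.0, 10.0, num=201)

# Hypothetical formal expression: sensor S0 already replaced by 'x'
formal = 'np.tanh((x * 2.0) - 1.0)'
b = eval(formal)  # broadcast over the whole domain, shape (201,)

# An expression without the sensor collapses to a plain float ...
b_const = eval('(1.0 + 2.5)')
# ... so it must be broadcast manually, as the toy problem does
if isinstance(b_const, float):
    b_const = np.repeat(b_const, len(x))
```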

As previously explained, the function individual_data has no restrictions on the number of input/output parameters. In this example, the function returns the domain of the function to be evaluated (x), the function to be matched (y), the function to be matched with noise (y_with_noise) and the evaluated Individual (b). These values are the ones used by the function cost to calculate the cost of the Individual (J).

1  import numpy as np
2  import MLC.Log.log as lg
3  import matplotlib.pyplot as plt
4  import random as rd
5  import sys
6  import time
7 
8  from MLC.mlc_parameters.mlc_parameters import Config
9  from PyQt5.QtCore import Qt
10
11
12  def individual_data(indiv):
13      SAMPLES = 201
14      x = np.linspace(-10.0, 10.0, num=SAMPLES)
15      y = np.tanh(x**3 - x**2 - 1)
16
17      config = Config.get_instance()
18      noise = config.getint('EVALUATOR', 'artificialnoise')
19      y_with_noise = y + [rd.random() - 0.5 for _ in xrange(SAMPLES)] + noise * 500
20
21      if isinstance(indiv.get_formal(), str):
22          formal = indiv.get_formal().replace('S0', 'x')
23      else:
24          # toy problem support for multiple controls
25          formal = indiv.get_formal()[0].replace('S0', 'x')
26
27      # Calculate J like the sum of the square difference of the
28      # functions in every point
29      lg.logger_.debug('[POP][TOY_PROBLEM] Individual Formal: ' + formal)
30      b = indiv.get_tree().calculate_expression([x])
31
32      # If the expression doesn't have the term 'x',
33      # the eval returns a value (float) instead of an array.
34      # In that case transform it to an array
35      if type(b) == float:
36          b = np.repeat(b, SAMPLES)
37
38      return x, y, y_with_noise, b
39
40
41  def cost(indiv):
42      x, y, y_with_noise, b = individual_data(indiv)
43
44      # Deactivate the numpy warnings, because this sum could raise an overflow
45      # Runtime warning from time to time
46      np.seterr(all='ignore')
47      cost_value = float(np.sum((b - y_with_noise)**2))
48      np.seterr(all='warn')
49
50      return cost_value
51
52
53  def show_best(index, generation, indiv, cost, block=True):
54      x, y, y_with_noise, b = individual_data(indiv)
55      quadratic_error = np.sqrt((y_with_noise - b)**2 / (1 + np.absolute(x**2)))
56
57      fig = plt.figure()
58      # Put figure window on top of all other windows
59      fig.canvas.manager.window.setWindowModality(Qt.ApplicationModal)
60
61      plt.suptitle("Generation N#{0} - Individual N#{1}\n"
62                   "Cost: {2}\n Formal: {3}".format(generation,
63                                                    index,
64                                                    cost,
65                                                    indiv.get_formal()))
66      plt.subplot(2, 1, 1)
67      plt.plot(x, y, x, y_with_noise, '*', x, b)
68
69      plt.subplot(2, 1, 2)
70      plt.plot(x, quadratic_error, '*r')
71      plt.yscale('log')
72      plt.show(block=block)
   
Code Snippet N°2: Base Experiment (toy_problem)

To conclude, the base experiment provides an implementation of the show_best function. The function generates two graphs that compare the best Individual found by MLC with the original function (y). The graph is plotted with the matplotlib library and is shown in Figure N°1. The two subplots are described below:

  • The first subplot compares the evaluated Individual with the original function, both without and with the noise component, plotting the curves in the same canvas.
  • The second subplot shows the normalized point-by-point error between b and y_with_noise on a logarithmic scale. It is interesting to note that the main differences between the original function and the one obtained are in the central part of the curve. With this information the user can take actions to let MLC create better functions. A good approach could be to increase the number of samples used between -1 and 1, giving MLC more information about the changes in the original curve.
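
The suggestion in the last point can be sketched as follows: build the domain from three pieces, with a much finer step inside [-1, 1]. The piece sizes below are arbitrary, and the toy problem's individual_data would need to be modified accordingly:

```python
import numpy as np

# Coarse sampling on the flat plateaus, fine sampling in the center
x = np.unique(np.concatenate([
    np.linspace(-10.0, -1.0, num=46),   # step 0.2
    np.linspace(-1.0, 1.0, num=201),    # step 0.01
    np.linspace(1.0, 10.0, num=46),     # step 0.2
]))
y = np.tanh(x**3 - x**2 - 1)
```

np.unique sorts the result and drops the duplicated endpoints at -1 and 1, leaving a single strictly increasing domain.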
Figure N°1: Graphs created by the function show_best
