tutorial

Tutorial

This section describes gro using a variety of examples that highlight its features. Studying these examples should teach you most of what you need to know to write your own gro programs. Later chapters on the precise definition of the gro language will make more sense after you write a few simple programs.

Growth

Letting gro grow

Probably the simplest gro program is the following:

include gro

set ( "dt", 0.1 ); // fast and inaccurate

program p() := {

  skip(); 

};

ecoli ( [ x := 0, y := 0 ], program p() );

This program is found in examples/growth.gro of the distribution. You can load it by choosing growth.gro from the Examples menu with the right mouse button. You might also want to open the program with an editor of some kind of editor. The program works as follows.

Line 1 includes the gro libary of functions (discussed later). For now, just put this at the top of all of your code.

On line 3 we set the size of the time step, dt, for the simulation¹. If you set dt very large, you will not get accurate simulations. If you set dt too small, you will have to wait a long time for your results. Try running gro with different values for dt.

Lines 5-9 declare a simple program p(). On line 11, the program is associated with a simulated E. coli cell initialized at the position (0,0) in the simulated environment. By default, gro defines 10 pixels in the window to be 1 μm.

Loading this program with the gro simulator places a single simulated cell in the middle of the gro window. When you run the program, by clicking Start from the sidebar or the menu, you will see that cell start to grow and divide. The simulation will run until you stop it, or until you get to 1000 cells, at which point gro stops simulating.

Here is what is happening under the hood:

A cell is assumed to be 1 μm in diameter and, initially, 2 μm long², so that it its volume is V = 1.57 fL (that's femtoliters).
The cell grows so that by default it doubles in size, to 3.14 fL, in 20 minutes. Using dV/dt = k V and solving for k gives the default rate of 0.034 fL / min. In the simulation, the volume is updated at each step according to V' = V + δ k V where δ is approximately dt (to model random fluctuations in the rate of growth, the actual update chooses δ randomly from a gamma distribution with mean dt).
The cell grows until it is approximately doubled in volume, at which point it divides. By default, the volume at which the cell divides is a random variable with mean 3.14 fL and variance 0.005 fL².
When the cell divides, it produces two cells of approximately the same size, but with a bit of randomness. One cell might get 51% or the old volume, the other 49%.
One of the cells is chosen arbitrarily to be the parent and the other the daughter. The program running on the parent is copied and the copy is associated with the daughter cell. The values of the variables in each program are adjusted as well - exactly how is described later.

Note: The simulation does not proceed in real time, or even with a consistent real time step. Because the number of cells increases exponentially, gro has to do exponentially more work as simulated time goes by, and so the simulation gets slower and slower. If you were to save a snapshot of the simulation periodically and then play the snapshots back, only then would you at least see a consistent real time step.

Controlling growth rate and division volume

If you don't like the default parameters, you can redefine them. Here is a variation on the program above that makes lots of little cells with a greater variation of division size.

program p() := {

  set ( "ecoli_division_size_mean", 2.0 );     // fL
  set ( "ecoli_division_size_variance", 0.02 ); // fL^2

};

Note that the set statements are only called when the first cell is initialized, which will make more sense after you see a few examples of programs that actually do something.

To determine the division time of a cell, gro uses does approximately the following. When a minimum volume is reached, it starts counting to n, with a probability of λdt of advancing the counter in each dt minutes. The minimum volume, n, and λ are chosen so that the mean and variance of the volume at division time match the desired parameters.

You might not like that model. That's okay, because you can code up your own. Here is an example of a simple division model.

program p() := {

  set ( "ecoli_division_size_mean", 1000 );

  rate 1 & volume > 3.14 : { 
    divide()
  }

};

Line 3 sets the mean division volume to be so huge that it will never be reached by the simulator. Lines 5-6 describe a guarded command. The part before the colon is the guard. The part after is the command. It states: when the volume of the cell is greater than 3.14 fL then there is probability of 1 dt that the cell will divide in the next simulation iteration.

Here is a more precise description:

The symbol rate is defined (in include/standard.gro) to be a function that takes an argument k and returns true if a uniformly generated random number is less than kdt, where dt is duration of a simulated timestep (0.05 minutes by default). This is the standard meaning of "rate" in stochastic chemical kinetics: an event occurs at rate k if the probability that it occurs in the next dt seconds is kdt.
The symbol volume is a variable defined by gro and set to the current volume just before the program p() is called at each simulated time step. You can try to set it, but the next time gro runs through p(), volume will have been redefined to be the (possibly new) current volume of the cell. A few other variables are made available in this way, and will be discussed later.
The function divide() tells gro to divide the cell and include the randomness associated with division (see the next section).

Chemostat Mode

Try adding

chemostat ( true );

to the top the growth.gro example. You will see that a simple chemostat arrangement shows up, reminiscent of the flow chamber used by the Hasty group with their quorum sensing circuits.

You might also notice some jitter in the simulation. This is because we set dt to 0.1. That's fine for micro-colony growth where there are fewer constraints on movement. If you dial dt down to 0.01, then the chemostat mode will look a bit better - although it will run pretty slowly.

Notes

¹ Because gro includes both stochastic low copy number reactions and continuous diffusion of signals (in later examples), the simulation is unfortunately not done using the SSA. Instead, the simulation is a vanilla Euler integration scheme with a time step of dt.

² This is on the large side for an E. coli, but believable accordining to bionumbers.

Molecules

Dilution

Suppose you start with a single cell with 1000 copies of GFP in it and also assume that, for whatever reason, the GFP is neither being produced nor degraded. As the cell grows, the GFP becomes more dilute. When it divides, some number of the GFP molecules end up in the parent, the rest in the daughter. After enough divisions, there will 0 or 1 copies of GFP in each cell.

To model this in gro, we can use the following code, which can be found in examples/dilution:

include gro

set ( "dt", 0.1 );

program dilute(m) := {

    gfp := m;

};

ecoli ( [ x := 0, y := 0 ], program dilute(1000) );

In line 5, the amount of GFP is initialized to m, which is an argument to the program. In line 9, the program is called with 1000 for its argument.

When this program is run, the cell starts out green. As it grows and divides, the amount of green per cell decreases exponentially until you can't see much green at all. This is because gro automatically distributes all numerical quantities in the cell between the parent and the daughter upon division.

To see how much GFP is in each cell, you can do something like the following:

program report() := {

  needs gfp;

  selected : { message ( 1, tostring(id) <> ": " <> tostring (gfp) ) }

};

program p(m) := dilute(m) + report() sharing gfp;

ecoli ( [ x := 0, y := 0 ], program p(1000) );

Here, we have defined a new program, report, that sends a message to the gro window whenever the cell is selected by the mouse. By default, selected cells are outlined in red. In particular, the variable selected is true whenever the cell is selected. When a selected cell divides, only the parent remains selected. The message function takes two arguments. The first is a position index (0, 1 or 2) for where to write the message. The second is a string. The <> operator stands for string concatenation. The variables id and gfp are numeric, so the tostring function is used to convert them to strings.

In line 11, we compose the report program with the dilute program to get a new program p(m). Note that we need to share the variable gfp and the report() program has to declare that it needs gfp to be initialized somewhere else. This is one way gro deals with parallelism - by using shared variables.

Try selecting the first cell and then running the program. You should see the number of GFP molecules in the cell printed in the window.

Production, dilution and degradation

Now let's add protein production and aim for 100 GFP molecules per cell at steady state. The cells are dividing every 20 minutes. Since the growth rate is 0.0345674 / min, producing protein at 100 times that amount should do it, assuming that the GFP doesn't degrade too fast (below we set it to 0.001). The gro code is as follows:

program make_gfp ( k1, k2, m ) := {

    // initialize gfp copy number
    gfp := m;

    // produce gfp
    rate ( k1 ) : { gfp := gfp + 1 }

    // degrade gfp
    rate ( k2 * gfp ) : { gfp := gfp - 1 }

};

// compute production rate needed for 100 copies of gfp per cell
alpha := - log ( 0.5 ) / 20.0; // dilution rate
k1 := 100 * alpha;             // production rate

// make a new cell
ecoli ( [ x := 0, y := 0 ], program make_gfp ( k1, 0.001, 0 ) );

The average volume of a cell in this simulation is about 2.36 fL so that the concentration of gfp should approach [GFP]^* = 42.4 proteins / fL. The plot below shows that this is indeed the case, on average - although there is quite a bit of noise.

By the way, to make the figure above, you can do something like the following. First, define an output function that writes data to standard output (i.e. the terminal) every delta seconds:

program output(delta) := {

    needs gfp;
    p := [ t := 0, s := 0 ];

    true : {
        p.t := p.t + dt,
        p.s := p.s + dt
    }

    p.s >= delta : {
        print ( id, ", ", p.t, ", ", gfp / volume, "\n" ),
        p.s := 0
    }
};

program p() := make_gfp ( k1, 0.001, 0 ) +  output(50) sharing gfp;

ecoli ( [ x := 0, y := 0 ], program p() );

This program raises a interesting issue. To keep track of time, the program needs a couple of variables that it continuously increments and checks. The problem is that gro will chop numerical variables roughly in half when the cell divides. The solution is to put the time variables, s and t, in a record, as in line 4. Since gro doesn't chop records in half, s and t are safe and sound. When the cell divides, the record p and its contents are simply copied to the daughter cell.

To store the data that this program generates, you can copy it from the sub-window at the bottom of the gro GUI and save it somewhere. Then you will have a comma separated value (csv) file with the data in it that you can deal with however you want. I loaded it up in Mathematica and wrote a bit of code to display it. To print directly to a file you can use fopen and fprint. To open a file and get a file pointer, do

fp := fopen ( "path" );

near the top of your code, and outside of any gro programs. Then replace the print statement above with

fprint ( fp, id, ", ", p.t, ", ", gfp / volume, "\n" )

Dogma

Building on the previous example, we now introduce a bit more reality into the gene expression model by adding messenger RNA. We can then encode the central dogma of biology: That DNA makes RNA and RNA makes protein.

By the way, when you start gro without any arguments, it opens the program "examples/gfp.gro" by default. This program is a model of the genetic construct synthetic biologists first build in their introductory classes. Here is the program.

include gro

set ( "dt", 0.01 );

alpha_r := 69.4 / 2.35;         // mRNA / min / fL 
beta_r := - log ( 0.5 ) / 3.69; // 1/min
alpha_p := 3.0;                 // protein/min/fL/RNA
beta_p := 0.01;                 // 1/min

program gfp() := {

    mRNA := 0;
    gfp := 0; 

    rate ( alpha_r * volume ) : { mRNA := mRNA + 1 };
    rate ( beta_r * mRNA ) :    { mRNA := mRNA - 1 };
    rate ( alpha_p * mRNA ) :   { gfp := gfp + 1 };
    rate ( beta_p * gfp ) :     { gfp := gfp - 1 };

};

set ( "gfp_saturation_max", 1000 );
set ( "gfp_saturation_min", 800 );

ecoli ( [ x := 0, y := 0 ], program gfp() );

Here is an explanation of the choices of rates. Note that this model is not intended to be super accurate. I am sure you can do better. But hopefully it is in the ballpark. More importantly, you should be able to see that you can use gro to encode this process at whatever level of detail you like.

mRNA Production: RNA is transcribed in an E. coli at about 85 nt / second¹. The gfp gene with non-coding parts, is about 800 nt long, so that's 69.4 mRNA / min. I assume the numbers are for an average sized cell (2.35 fL), so we want 2.35 alpha_r = 69.4.
mRNA Degradation: The half life of an mRNA is reported to be 3.69 minutes in one paper² we found. Solving for the rate gives beta_r.
Protein Production: According to another paper² we found, it takes about 80 seconds to translate lacZ, which is about four times as big as GFP. So, if it takes 20 seconds for GFP, that's 3 per minute per mRNA.
Protein Degradation: In our experience, GFP sticks around for a long time. So we choose a small number for the degradation rate.

Notice that in this simulation we set dt=0.01 and that the simulation runs more slowly. How do you decide where to put dt? In this program, there is a rate of 69.4 mRNA / min on average that should be produced. If dt were 0.1 min, then the guard on line 15 would only get called 10 times a minute, and so could not actually produce enough mRNA each minute. With dt = 0.01 the guard gets called 100 times a minute and the rate function ensures that it comes up true about 69.4 percent of the time. An even smaller dt will do an even better job.

Another thing to notice that on lines 22-23 we set the way that the green channel of cells are rendered. By running the simulation a couple of times, we see that we get about 900 gfp molecules per fL. To see the variation better, line 22 maps 1000 gfp / fL and above to the maximum intensity in the green channel, and maps 800 gfp / fL and below to the minimum. This is akin to tweaking the gamma value on your microscope software.

Notes

¹ Bremer, H., Dennis, P. P. (1996)

² Bernstein JA, Lin PH, Cohen SN, Lin-Chao S. (2004)

³ Solomovici J, Lesnik T, Reiss C. (1997)

Signals

When you declare a signal, you specify its diffusion rate and its degradation rate. Then gro sets up a finite element model of the environment with a (hard coded for now) 160x160 grid with each element of the grid 5 pixels = 0.5 μm wide. At each time step, gro numerically integrates the rate of change of concentration of the signal molecule in each grid point. For now, the units are not well-defined -- although you can associate units with numbers you choose.

A Concentration bandpass filter

As an example, the code below shows how to make something like a concentration bandpass filter¹. This example is in examples/bandpass.gro with a few more bells and whistles.

include gro

set ( "dt", 0.1 );

ahl := signal ( 1, 0.01 ); // args = diffusion rate and degradation rate

fun f a . 0.1 < a & a < 0.6;

program sensor() := {

  rfp := 0.0;

  f ( get_signal ( ahl ) ) : { rfp := rfp + 1 };
  rate ( 0.01 * rfp ) : { rfp := rfp - 1 }

};

set ( "rfp_saturation_max", 50 );
set ( "rfp_saturation_min", 0 );

ecoli ( [ x:= 0, y:= 0 ], program sensor() );

program main() := {

  // hold the concentration at (x,y)=(0,0) at 10
  true : { set_signal ( ahl, 0, 0, 10 ) }

};

In line 5, we declare a new signal with diffusion rate 1 and degradation rate 0.01. The return value of the signal function is an integer that you should use when referring to the same signal later. You can declare as many signals as you want to. In line 7, we declare a function that returns true if its argument is between 0.1 and 0.6 -- this is the band of concentration we want. In the program on line 13, we produce rfp if get_signal(ahl) returns a concentration in the band that we defined.

This program also introduces another feature of gro, which is the main program. If you define a program called main, gro will call it once every time it updates. This can be very useful for controlling the simulation (starting, stopping, saving or analyzing data, etc). In this case, we use the main program to hold the value of the signal at 10 (concentration units). If we call set_signal just once, the signal concentration there would quickly decrease due to diffusion and degradation.

The wave

We define an oscillator in the leader program. It basically consists of a timer that increments each time through the program. When it reaches a pre-set limit, it emits a bunch of signal, and then resets the counter. Next, we define the follower to have two modes. In mode 0, the cells are waiting for the signal level to go above some threshold. When it does, the cells emit a signal, and go into mode 1, where they sit in a refractory mode for a while. If they didn't have this second mode, they would respond to their own pulse, which would make a poor wave. The program final looks like this:

include gro

set ( "dt", 0.075 );

ahl := signal ( 1, 1 );

program leader() := {
 
  p := [ t := 2.4 ];
  set ( "ecoli_growth_rate", 0.00 );

  true : { p.t := p.t + dt }

  p.t > 10 : { 
    emit_signal ( ahl, 100 ), 
    p.t := 0 
  }

};

program follower() := { 

  p := [ mode := 0, t := 0 ];
  set ( "ecoli_growth_rate", 0.04 );

  p.mode = 0 & get_signal ( ahl ) > 0.01 : { 
    emit_signal ( ahl, 100 ),
    p.mode := 1,
    p.t := 0
  }

  p.mode = 1 : { p.t := p.t + dt }

  p.mode = 1 & p.t > 9 : { p.mode := 0 }

};

ecoli ( [ x:= 0, y:= 0 ], program leader() );
ecoli ( [ x:= 0, y:= 10 ], program follower() );

A few other details: We set the growth rate of the leader to 0.0 so there is not more than one cell starting waves. Also note the we called the ecoli function twice. In fact, you can call it as many times as you want, with the same program or different programs each time.

Notes

¹ Basu et al. made something like this, but at the macro level with lawns of bacteria: Basu, 2005.

Evolution

To encode evolution in gro we need to be able to model mutation. To do this, we use the daughter variable. This variable is true in the daughter cell only in the time step immediately after the cell has divided. At that point, we can take a parameter from the parent cell and "mutate" it to pass on to the daughter cell. This same parameter will then get mutated again and again as the cells divide, allowing the population to explore the space of possible values.

Here is a gro program that demonstrates this idea. The parameter that is mutated is the production rate k of the enzyme.

include gro

chemostat ( true );
set ( "dt", 0.075 );

nutrient := 1;
kinit := 0.25;
dk := 0.05;

fun cost e n . 0.2 * e * n / ( 50.0 + n );
fun benefit e n . 0.002 * e / ( 1.0 - 0.01 * e );
fun fitness e n . cost e n - benefit e n;

program evolver() := {

  p := [ k := kinit ];
  E := 25;
  t := 0;

  rate ( p.k * volume ) : { E := E + 1 }
  rate ( 0.05 * E ) : { E := E - 1 }

  true : { set ( "ecoli_growth_rate", 0.001 + fitness E nutrient ), t := t + dt }

  daughter : {
   p.k := p.k + dk * ( rand ( 1000 ) - 500 ) / 1000.0
  }

};

ecoli ( [ x := 0, y := 0 ], program evolver() );

Lines 9-11 define the fitness function, using the same functional form as in the Alon paper¹, but with pretty much arbitrary values chosen so that the resulting growth rate falls within a reasonable range for gro. (Of course, you can figure out the optimal value using some calculus, but hey: why not do it 1000 times more slowly with a graphical simulation tool?).

Nick reminds me that the parent and the daughter would both possibly have mutated after division. So one might want to substitute the just_divided for daughter. This variable is true for each cell in the timesetp just after division. -ek

Line 22 is where the growth rate is adjusted based on the amount of enzyme and nutrient. Thus, the growth rate fluctuates considerably, which may or may not be realistic.

Besides the daughter variable, the only new featured introduced in this example is the rand function, which returns a random integer between 0 and its argument minus one.

Notes

¹Dekel, E. and Alon, U., 2005.

Global Control

Suppose your cells are programmed to respond to a small molecule inducer, such as IPTG or Arabinose. In many flow chambers you can control the concentration of inducers by switching media sources. You can simulate this in grow using global variables and a main program as follows.

include gro

iptg := 0;

program p() := {

  gfp := 0;

  rate ( 1 + 10 * iptg / ( 1 + iptg ) ) : { gfp := gfp + 1 };
  rate ( 0.001 * gfp ) : { gfp := gfp - 1 };

};

program main() := {

  t := 0;

  true : { t := t + dt };

  t > 50 : {
    t := 0,
    iptg := 1.0 - iptg,
    clear_messages(1),
    message ( 1, "IPTG at " <> tostring(iptg) <> "uM/L" )
  };

};


ecoli ( [], program p() );

Line 3 initializes a global variable, iptg. This variable is accessible to programs, changable by programs, etc. On line 9, the rate of GFP production is a function of the global variable iptg.

To change the levels of iptg, lines 14 to 27 define a main() program. If you define a main() program, gro will call the main program once every cycle, after updating the programs associated with any cells. The main() program is not supposed to be associated with any cell, but rather should be used to control the simulation in general. To communicate between main() and other programs, the easiest thing to do is to use global variables, as is done above.

Simulation Control

You can also use a main function to start and stop simulations. For example, suppose you want to run 50 simulations of growth, each 150 simulated minutes long, and count the number of cells at the end of the each simulation (which will be random). You could do something like the following:

include gro

num_cells := 1;

program p() := {

  daughter : { num_cells := num_cells + 1 }

};

program main() := {

  mode := 0;
  t := 0;
  n := 0;

  mode = 0 & n < 50 : {
    ecoli ( [], program p() ),
    num_cells := 1,
    mode := 1
  }

  mode = 0 & n = 50 : {
    exit()
  }

  mode = 1 & t > 150 : {
    print ( "simulation ended with ", num_cells, " cells\n" ),
    reset(),
    t := 0,
    n := n + 1,
    mode := 0
  }

  true : { t := t + dt }

}

This program shows the use of the reset() function, which erases all cells and resets time. Alse note the use of exit(), which completely exits gro. And finally, note how the global variable num_cells is shared among all the cells running p() and also with main. This is an easy way for you to gather information from the programs as they are running.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

tutorial

Tutorial

Growth

Letting gro grow

Controlling growth rate and division volume

Chemostat Mode

Notes

Molecules

Dilution

Production, dilution and degradation

Dogma

Notes

Signals

A Concentration bandpass filter

The wave

Notes

Evolution

Notes

Global Control

Simulation Control

Uh oh!

Uh oh!

Clone this wiki locally