14.6.3. Running Multiple Seed Values

Warning

This tutorial has not yet been updated for CellBlender 1.0 RC. Therefore, some things might not work exactly as described.

In MCell, diffusion (amongst other things) happen stochastically. However, the results are replicable as long as one provides the same seed value. Given this stochastic nature, you can expect the data generated from a simulation to look noisy, especially if the number of reacting molecules is small. We can overcome this problem by running many different simulations, each with a different seed value, and then averaging the results of all the simulations.

We’ll begin by creating a directory called seed. Inside it, create an MDL called seed.mdl with this text:

iterations = 1000
time_step = 5e-6
ITERATIONS = iterations
TIME_STEP = time_step

DEFINE_MOLECULES {
    vol1 {DIFFUSION_CONSTANT_3D = 1E-6}
    vol2 {DIFFUSION_CONSTANT_3D = 1E-6}
    vol3 {DIFFUSION_CONSTANT_3D = 1E-6}
}

DEFINE_REACTIONS {
    vol1 + vol2 -> vol1 + vol3 [1E7]
    vol1 + vol3 -> vol2 + vol3 [1E6]
}

INSTANTIATE World OBJECT {
    Cube BOX {CORNERS = [-0.1,-0.1,-0.1],[0.1,0.1,0.1]}
    vol1_rel RELEASE_SITE {
        SHAPE = World.Cube
        MOLECULE = vol1
        NUMBER_TO_RELEASE = 100
    }
    vol2_rel RELEASE_SITE {
        SHAPE = World.Cube
        MOLECULE = vol2
        NUMBER_TO_RELEASE = 100
    }
}

sprintf(seed,"%04g",SEED)

REACTION_DATA_OUTPUT {
    STEP=time_step
    {COUNT[vol1,WORLD]}=> "./react_data/vol1." & seed & ".dat"
    {COUNT[vol2,WORLD]}=> "./react_data/vol2." & seed & ".dat"
    {COUNT[vol3,WORLD]}=> "./react_data/vol3." & seed & ".dat"
}

All the syntax should be familiar except for the line sprintf(seed,”%04g”,SEED). This assigns the value of the SEED to the variable seed. The %04g formats it so that there are four leading zeros.

Next, create the following python script called run_seeds.py:

#!/usr/bin/env python

import os, sys

if len(sys.argv) >= 2:
    mdl = sys.argv[1]
else:
    mdl = raw_input('What mdl do you want to run?\n')

for i in range(1, 21):
    os.system("mcell -seed %i %s" % (i, mdl))

This file is similar to the scaling.py file that we created in the checkpointing section. This will run MCell twenty different times, each time incrementing the seed value by one. Save and run this file. You should now have sixty files in the react_data directory (20 for each molecule). Now we will begin the process of averaging these results. Create a python script called avg_seeds.py with the following text in it:

#!/usr/bin/env python

import numpy as np
import matplotlib.pyplot as plt
import os

startOfFileToAverage = "vol1"      # beginning of filenames to average
                                   # over

mol_counts = None
files = os.listdir('react_data')   # build a list of reaction data file names
files.sort()                       # sort that list alphabetically

for f in files:                    # iterate over the list of file names
    if f.startswith(startOfFileToAverage):
        rxn_data = np.genfromtxt("./react_data/%s" % f, dtype=float)
        rxn_data = rxn_data[:, 1]  # take the second column
        plt.plot(rxn_data, '0.5')  # plot the results as a gray line
        if mol_counts is None:
            mol_counts = rxn_data
        else:
            # built up 2d array of molecule counts (one col/seed)
            mol_counts = np.column_stack((mol_counts, rxn_data))
    else:
        pass

mol_counts = mol_counts.mean(axis=1)  # take the mean of the rows
plt.plot(mol_counts, 'r')             # plot the results as a red line
plt.show()                            # show the plot

This script will load (and plot) each of the twenty vol1.####.dat files into a two dimensional array, take the mean of the rows, and plot the results.

Run the first script by typing the following commands:

python run_seeds.py seed.mdl

Finally, run the second script by typing:

python avg_seeds.py