Plotting samples¶
To gain qualitative intuition about a dataset, it is common to visualize
trajectories from a few samples. tunacell provides a
matplotlib-based framework to visualize timeseries as well as the underlying
colony/lineage structures arising from dividing cells.
Note
In order for the colour-code to work properly, matplotlib must be updated to a version >=2.
In this document we will describe how to use the set of tools defined in
tunacell.plotting.samples.
We already saw in the 10 minute tutorial a simple plot of length vs. time in a colony from our numerical simulations. Here we will review the basics of plotting small samples in a few test cases.
Note
If you cloned the tunacell repository, there are two quick ways to run
the following tutorial.
You may run the script plotting-samples.py with the following command:
python plotting-samples.py -i --seed 951
The seed is used to select the same samples as the ones printed below.
Alternatively it can be run from the root folder using the Makefile:
make plotting-demo
If you execute one of the commands above, there is no need to run the commands below. Follow the command line explanations and cross-reference them with the following commands to understand how it works. If you did not execute the commands above, you can run the commands below sequentially.
Setting up samples and observables¶
For plotting demonstration, we will create a numerically simulated experiment, where the dynamics is sampled on a time interval short enough for the colonies to be of reasonable size. Call from a terminal:
tunasimu -l simushort --stop 120 --seed 167389
In a Python script/shell, we load data with the usual:
import numpy as np  # needed for seeding the random sample selection
from tunacell import Experiment, Parser, Observable, FilterSet
from tunacell.filters.cells import FilterCellIDparity
from tunacell.plotting.samples import SamplePlot
exp = Experiment('~/tmptunacell/simushort')
parser = Parser(exp)
np.random.seed(seed=951) # fix the seed to match samples/plots below (optional)
parser.add_sample(10)
# define a condition
even = FilterCellIDparity('even')
condition = FilterSet(filtercell=even)
# define observable
length = Observable(name='length', raw='exp_ou_int')
ou = Observable(name='growth-rate', raw='ou')
We have defined two observables and one condition used as a toy example.
With these preliminary lines, we are ready to plot timeseries. The main object
to call is SamplePlot, which accepts the following parameters:
- samples, an iterable over Colony or Lineage instances,
- the Parser instance used to parse data,
- the list of conditions (optional).
We already saw how to define instances of the class Observable.
Samples can be chosen explicitly, or drawn at random from the experiment. We will
review below the different cases with concrete examples from our settings.
We have 10 samples in our parser, which have been chosen randomly.
Remember that they can also be specified on purpose with the container and
cell identifiers. Once stored in the parser object, they can be addressed by
their index in the table; to check the table of samples, call:
print(parser)
If you used the default settings, you should observe:
index container cell
------- ------------- ------
0 container_015 3
1 container_087 14
2 container_002 6
3 container_012 12
4 container_096 15
5 container_040 8
6 container_088 14
7 container_007 1
8 container_042 2
9 container_013 5
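The table above is reproducible only because the random seed was fixed before drawing samples. The sketch below illustrates the idea with Python's standard random module; it is not tunacell's sampling code, and the helper draw_samples and the container labels are hypothetical names used for illustration.

```python
import random


def draw_samples(candidates, size, seed):
    """Draw `size` distinct items from `candidates`, reproducibly."""
    rng = random.Random(seed)  # local generator, global state is untouched
    return rng.sample(candidates, size)


# hypothetical container labels, for illustration only
containers = ["container_{:03d}".format(i) for i in range(1, 101)]

first = draw_samples(containers, 10, seed=951)
second = draw_samples(containers, 10, seed=951)

# the same seed always yields the same selection, in the same order
print(first == second)
```

Changing the seed, or omitting it, generally selects a different set of samples, in which case the plots will differ from those described in this document.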
How to plot a colony sample¶
We start from the basic example initiated in the 10 minute tutorial:
colony = parser.get_colony(0) # any index between 0 and 9 would do
and we call our plotting environment:
colplt = SamplePlot([colony, ], parser=parser, conditions=[condition, ])
The first argument is the list of sample(s) to be plotted (here a single
colony), followed by the parser and the conditions given as keyword arguments.
Conditions must be given as a list of FilterSet instances (the list can be
left empty). The Observable to plot is passed when calling make_plot.
Using default settings¶
We start with the default settings and will inspect the role of each parameter:
colplt.make_plot(length)
The figure is stored as the fig attribute of colplt:
colplt.fig.show() # in non-interactive mode, colplt.fig in interactive mode
This kind of plot should be produced:
The default settings for a colony plot display:
- one lineage per row (it comes from keyword parameter superimpose='none'),
- cell identifiers on top of each cell (report_cids=True),
- container and colony root identifiers when they change,
- vertical lines to follow divisions (report_divisions=True).
Data points are represented by plain markers (show_markers=True)
and with underlying, transparent connecting lines for visual help
(show_lines=True).
The title of the plot is made from the Observable.as_latex_string() method.
Visualization of a given condition¶
The first feature we explore is to visualize whether samples satisfy a given
condition. To do so, use the report_condition keyword parameter:
colplt.make_plot(length, report_condition=repr(condition))
Conditions are labeled according to their representation, which is why we used
the repr() call.
Now the fig attribute should store the resulting plot.
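To make the toy condition concrete, here is a minimal sketch of a parity test on cell identifiers. It assumes, from its name only, that FilterCellIDparity('even') keeps cells whose integer identifier is even; the function is_even and the list of identifiers are hypothetical and are not part of tunacell.

```python
def is_even(cell_id):
    """Toy condition: True when the integer cell identifier is even."""
    return cell_id % 2 == 0


# hypothetical cell identifiers along one lineage, for illustration only
lineage_ids = [1, 2, 5, 10, 21]

# one boolean per cell, which a plot could report next to each cell
labels = {cid: is_even(cid) for cid in lineage_ids}
print(labels)
```

A plot that reports the condition then only needs such a per-cell boolean to decide how each cell is marked.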
Colouring options¶
Colour can be changed for distinct cells, lineages, colonies, or containers (given in order of priority), or not changed at all.
Changing cell colour¶
colplt.make_plot(length, report_condition=repr(condition), change_cell_color=True)
Changing lineage colour¶
colplt.make_plot(length, report_condition=repr(condition), change_lineage_color=True)
Superimposition options¶
The default setting is not to superimpose lineages. It is possible to change
this behaviour by changing the superimpose keyword parameter. Some
keywords are reserved:
- 'none': do not superimpose timeseries,
- 'all': superimpose all timeseries into a single row plot,
- 'colony': superimpose all timeseries from the same colony, thereby making as many rows as there are different colonies in the list of samples,
- 'container': idem with container level,
and when an integer is given, each row will be filled with at most that number of lineages.
For example, if we superimpose at most 3 lineages:
colplt.make_plot(length, report_condition=repr(condition), change_lineage_color=True,
superimpose=3)
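The effect of the superimpose options on the row layout can be sketched in plain Python. This is an illustration of the grouping rules listed above, not tunacell's implementation: the function group_rows and the dictionary keys are hypothetical, and the integer case is assumed here to be simple chunking in order of appearance.

```python
def group_rows(lineages, superimpose="none"):
    """Group lineages into plot rows following the `superimpose` rule.

    Each lineage is a dict with 'container' and 'colony' keys.
    """
    if superimpose == "none":
        return [[lin] for lin in lineages]      # one lineage per row
    if superimpose == "all":
        return [list(lineages)]                 # a single row
    if superimpose in ("colony", "container"):
        rows = {}                               # keeps first-seen order
        for lin in lineages:
            if superimpose == "container":
                key = lin["container"]
            else:
                key = (lin["container"], lin["colony"])
            rows.setdefault(key, []).append(lin)
        return list(rows.values())
    if isinstance(superimpose, int) and superimpose > 0:
        # at most `superimpose` lineages per row
        return [lineages[i:i + superimpose]
                for i in range(0, len(lineages), superimpose)]
    raise ValueError("unknown superimpose option: {!r}".format(superimpose))


# hypothetical samples: 3 + 2 lineages in one container, 2 in another
samples = (
    [{"container": "container_01", "colony": 1}] * 3
    + [{"container": "container_01", "colony": 2}] * 2
    + [{"container": "container_02", "colony": 1}] * 2
)

for option in ("none", "all", "colony", "container", 3):
    print(option, [len(row) for row in group_rows(samples, option)])
```

With these seven hypothetical lineages, 'colony' gives rows of 3, 2 and 2 lineages, 'container' gives rows of 5 and 2, and the integer 3 gives rows of 3, 3 and 1.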
Plotting few colonies¶
So far our sample was a unique colony. It is possible to plot multiple colonies in the same plot, given as an iterable over colonies:
splt = SamplePlot(parser.iter_colonies(mode='samples', size=2),
parser=parser, conditions=[condition, ])
splt.make_plot(length, report_condition=repr(condition), change_colony_color=True)
Here we iterated over colonies from the samples defined in parser.samples.
Now we will switch to the other observable, ou, which is the instantaneous
growth rate:
splt.make_plot(ou, report_condition=repr(condition), change_colony_color=True,
superimpose=2)
We can also iterate over unselected samples: iteration goes through container files:
splt = SamplePlot(parser.iter_colonies(size=5), parser=parser,
conditions=[condition, ])
splt.make_plot(ou, report_condition=repr(condition), change_colony_color=True,
superimpose=2)
To get an idea of the divergence of growth rate, it is better to plot all timeseries in a single row plot. We hide markers and set the transparency to better distinguish individual timeseries:
splt.make_plot(ou, change_colony_color=True, superimpose='all', show_markers=False,
alpha=.6)
Plotting few lineages¶
Instead of a colony, or an iterable over colonies, one can use a lineage or an iterable over lineages as argument of the plotting environment:
splt = SamplePlot(parser.iter_lineages(size=10), parser=parser,
conditions=[condition, ])
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True,
superimpose='all', alpha=.6)
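As a reminder of how lineages relate to colonies, a lineage can be seen as one path from the colony root to a cell that has no recorded daughters. The sketch below enumerates such paths in a toy division tree; it is an illustration only, the function name and the tree are hypothetical, and tunacell may build or sample lineages differently.

```python
def enumerate_lineages(tree, root):
    """Return all root-to-leaf paths of a division tree.

    `tree` maps a cell identifier to the list of its daughter identifiers.
    """
    daughters = tree.get(root, [])
    if not daughters:
        return [[root]]                 # a leaf ends the lineage
    paths = []
    for daughter in daughters:
        for tail in enumerate_lineages(tree, daughter):
            paths.append([root] + tail)
    return paths


# hypothetical colony: cell 1 divides into 2 and 3, which divide again
colony_tree = {1: [2, 3], 2: [4, 5], 3: [6, 7]}
paths = enumerate_lineages(colony_tree, 1)
print(paths)
```

In this toy colony the four leaves give four lineages, and each lineage shares its earliest cells with the others, which is why superimposed lineages from one colony overlap at early times.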
Adding reference values¶
One can add expectation values for the mean, and for the variance, to be plotted as a line for the mean and +/- standard deviations.
From the numerical simulation metadata, it is possible to compute the mean value and the variance of the process:
md = parser.experiment.metadata
# ou expectation values
ref_mean = float(md.target)
ref_var = float(md.noise)/(2 * float(md.spring))
and then to plot it to check how our timeseries compare to these theoretical values:
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True,
superimpose='all', alpha=.5, show_markers=False,
ref_mean=ref_mean, ref_var=ref_var)
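The arithmetic behind the reference band can be checked by hand. The sketch below uses the same formula as above for the variance, noise/(2 * spring), and derives the interval of one standard deviation around the mean. The numerical values are hypothetical, chosen only to make the arithmetic easy to follow, and reference_band is an illustrative helper, not a tunacell function.

```python
import math


def reference_band(ref_mean, ref_var, n_std=1.0):
    """Return (lower, upper) bounds at n_std standard deviations."""
    std = math.sqrt(ref_var)
    return ref_mean - n_std * std, ref_mean + n_std * std


# hypothetical simulation parameters, for illustration only
target, spring, noise = 2.0, 0.5, 4.0

ref_mean = target
ref_var = noise / (2 * spring)   # same formula as in the text above
lower, upper = reference_band(ref_mean, ref_var)
print(ref_var, lower, upper)
```

Note that the plotted band is expressed in standard deviations, so the variance must be square-rooted before it is compared with the spread of the timeseries.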
Adding information from computed statistics¶
We will review the computation of the statistics in the next document, but we
will assume it has been performed for our observable ou.
The data_statistics option is used to display results of statistics, which
is useful when no theoretical values exist (most of the time):
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True,
superimpose='all', alpha=.5, show_markers=False,
data_statistics=True)