Filters
Outliers may escape the quality-control tools used during segmentation and tracking, so further filtering may be needed when analysing their output data, as tunacell does. For example, filamentous cells may have been reported in the data, but one may want to exclude them from a given analysis.
tunacell provides a set of user-customisable filters that let the user define precisely the statistical ensemble of samples over which the analysis is performed.
In addition to removing outliers, filters are also used for conditional analysis: they divide the statistical ensemble into sub-populations of samples that satisfy certain rules.
A series of filters has already been defined for each of the following types: cell, lineage, colony, and container. In addition, the boolean operations AND, OR, NOT can be used within each type. Filters of different types are then combined in FilterSet instances: one is used to define the statistical ensemble (remove outliers), and optionally, others may be used to create sub-populations of samples for comparative analyses.
How individual filters work
Filters are instances of the FilterGeneral class. A given filter class is instantiated, possibly with parameters that define how the filter works. The instantiated object is then callable on the object to be filtered. It returns either True (the object is valid) or False (the object is rejected).
Four main subclasses are derived from FilterGeneral, one for each structure that tunacell recognizes: FilterCell for Cell objects, FilterTree for Colony objects, FilterLineage for Lineage objects, and FilterContainer for Container objects.
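The calling convention described above can be pictured with a small stand-in. This is only a sketch and not tunacell's implementation: SketchFilter and SketchMaxLength are hypothetical names, plain strings play the role of the objects to be filtered, and the idea that calling the instance delegates to a func() method is an assumption based on the guidelines given later on this page.

```python
class SketchFilter:
    """Hypothetical stand-in for a filter base class (not tunacell code)."""

    def func(self, obj):
        # Subclasses implement the actual boolean test here.
        raise NotImplementedError

    def __call__(self, obj):
        # Calling the instance runs the test and coerces the result to bool.
        return bool(self.func(obj))


class SketchMaxLength(SketchFilter):
    """Hypothetical filter: accept strings no longer than `max_length`."""

    def __init__(self, max_length=3):
        self.max_length = max_length

    def func(self, obj):
        return len(obj) <= self.max_length


keep_short = SketchMaxLength(max_length=3)  # instantiate with a parameter
samples = ['ab', 'abcd', 'xyz']

# The instance is called on each object; only valid objects are kept.
valid = [s for s in samples if keep_short(s)]
print(valid)  # ['ab', 'xyz']
```

Because the instance itself is callable, it can be passed anywhere a predicate is expected, for instance to the built-in filter().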
Example: testing the parity of cell identifier
The filter FilterCellIDparity has been designed for illustration: it tests whether the cell identifier is even (or odd).
First we set up the filter by instantiating its class with the appropriate parameter:
>>> from tunacell.filters.cells import FilterCellIDparity
>>> filter_even = FilterCellIDparity(parity='even')
For this filter class, there is only one keyword parameter, parity, which we have set to 'even': the filter accepts cells with an even identifier and rejects cells with an odd identifier.
Next, we can print the string representation:
>>> print(str(filter_even))
CELL, Cell identifier is even
The first uppercase word in the message indicates the type of object the filter acts upon. The rest of the message is a label that has been defined in the class definition.
We create two Cell instances, one with an even identifier and one with an odd identifier:
>>> from tunacell.base.cell import Cell
>>> mygoodcell = Cell(identifier=12)
>>> mybadcell = Cell(identifier=47)
Then we can perform the test over both objects:
>>> print(filter_even(mygoodcell))
True
>>> print(filter_even(mybadcell))
False
We also mention another feature implemented in the representation of such filters:
>>> print(repr(filter_even))
FilterCellIDparity(parity='even', )
This representation is the string one would type to re-instantiate the filter. It is used by tunacell when a data analysis is exported to text files: when tunacell reads these exported files back, it is able to load the objects defined in the exported session. Hence there is no need to remember the precise parameters adopted for a particular analysis: if it has been exported, it can be loaded later on.
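The round trip from object to text and back can be sketched as follows. This is an illustration under stated assumptions, not a description of how tunacell parses its exported files: SketchParityFilter is a hypothetical stand-in class, and eval() is used here only because it is the simplest way to turn such a string back into an object.

```python
class SketchParityFilter:
    """Hypothetical stand-in (not tunacell code) with a round-trippable repr."""

    def __init__(self, parity='even'):
        self.parity = parity

    def __repr__(self):
        # Emit valid Python source that rebuilds an equivalent object.
        return "{}(parity={!r})".format(type(self).__name__, self.parity)

    def __eq__(self, other):
        # Two filters are considered equal when their parameters match.
        return (isinstance(other, SketchParityFilter)
                and self.parity == other.parity)


original = SketchParityFilter(parity='odd')
text = repr(original)  # the string that would be written to a text file
print(text)            # SketchParityFilter(parity='odd')

# Rebuild the object from its textual form. eval() executes arbitrary code,
# so this is only acceptable for files you produced yourself.
restored = eval(text, {'SketchParityFilter': SketchParityFilter})
print(restored == original)  # True
```

The design point is that every constructor parameter appears in the representation, so nothing has to be remembered separately from the exported file.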
Creating a new filter
A few filters are already defined in the following modules:
- tunacell.filters.cells for filters acting on cells,
- tunacell.filters.lineages for filters acting on lineages,
- tunacell.filters.trees for filters acting on colonies,
- tunacell.filters.containers for filters acting on containers.
Within each type, filters can be combined with boolean operations (see below), which lets the user explore a range of filters. However, users may need to define their own filter(s), and are encouraged to do so following these general guidelines:
- define a label attribute (a human-readable message, which was 'Cell identifier is even' in our previous example),
- define the func() method that performs the boolean testing.
From the module tunacell.filters.cells we copied below the class definition of the filter used in our previous example:
class FilterCellIDparity(FilterCell):
    """Test whether identifier is odd or even"""

    def __init__(self, parity='even'):
        self.parity = parity
        self.label = 'Cell identifier is {}'.format(parity)
        return

    def func(self, cell):
        # test if even
        try:
            even = int(cell.identifier) % 2 == 0
            if self.parity == 'even':
                return even
            elif self.parity == 'odd':
                return not even
            else:
                raise ValueError("Parity must be 'even' or 'odd'")
        except ValueError as ve:
            print(ve)
            return False
Although this filter may well be useless in actual analyses, it shows how to define a filter class. Also have a look at filters defined in the above-mentioned modules.
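As a further illustration of the two guidelines (a label attribute and a func() method), here is a minimal sketch of a user-defined filter. It is written against hypothetical stand-ins so that it runs without tunacell: SketchCell, SketchFilterCell and SketchFilterIdentifierBelow are not part of the library, and the 'CELL, ...' string format is only imitated from the example output shown earlier. In real use one would presumably subclass FilterCell instead of the stand-in base.

```python
class SketchCell:
    """Hypothetical stand-in for a cell; only `identifier` is modelled."""

    def __init__(self, identifier):
        self.identifier = identifier


class SketchFilterCell:
    """Hypothetical stand-in for the cell filter base class."""

    def __call__(self, cell):
        return bool(self.func(cell))

    def __str__(self):
        # Imitates the 'CELL, <label>' message format shown earlier.
        return 'CELL, {}'.format(self.label)


class SketchFilterIdentifierBelow(SketchFilterCell):
    """Accept cells whose integer identifier is strictly below `bound`."""

    def __init__(self, bound=100):
        # Validate the parameter once, at construction time.
        if not isinstance(bound, int):
            raise TypeError('bound must be an integer')
        self.bound = bound
        # Guideline 1: a human-readable label.
        self.label = 'Cell identifier is below {}'.format(bound)

    def func(self, cell):
        # Guideline 2: the boolean test itself.
        try:
            return int(cell.identifier) < self.bound
        except (TypeError, ValueError):
            # Identifiers that cannot be read as integers are rejected.
            return False


flt = SketchFilterIdentifierBelow(bound=20)
print(str(flt))             # CELL, Cell identifier is below 20
print(flt(SketchCell(12)))  # True
print(flt(SketchCell(47)))  # False
```

Checking the parameter in __init__ is a design choice of this sketch: an invalid parameter is reported once, when the filter is built, rather than on every call to func().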
How to combine individual filters together with boolean operations
Filters already implemented are “atomic” filters, i.e. they perform a single test. Several atomic filters of the same type (type refers to the object type to which the filter is applied: cell, lineage, colony, container) can be combined using Boolean filter types.
There are three of them, defined in tunacell.filters.main: FilterAND, FilterOR, and FilterNOT. The first two accept any number of filters, which are combined with AND/OR logic respectively; the third accepts one filter as argument.
With these boolean operations, complex combinations of atomic filters can be created.
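The logic of such combinations can be sketched with stand-in combinators. The classes SketchAND, SketchOR and SketchNOT below are hypothetical and only mirror the behaviour described above (any number of filters for AND/OR, exactly one for NOT); they are not tunacell's classes, and plain integers are used in place of cells.

```python
class SketchAND:
    """Hypothetical stand-in: valid only if every sub-filter accepts."""

    def __init__(self, *filters):
        self.filters = filters

    def __call__(self, obj):
        return all(f(obj) for f in self.filters)


class SketchOR:
    """Hypothetical stand-in: valid if at least one sub-filter accepts."""

    def __init__(self, *filters):
        self.filters = filters

    def __call__(self, obj):
        return any(f(obj) for f in self.filters)


class SketchNOT:
    """Hypothetical stand-in: inverts the decision of a single filter."""

    def __init__(self, single_filter):
        self.single_filter = single_filter

    def __call__(self, obj):
        return not self.single_filter(obj)


def is_even(n):
    return n % 2 == 0


def is_small(n):
    return n < 10


# "even AND NOT small": combinators nest because each one is itself callable.
even_and_large = SketchAND(is_even, SketchNOT(is_small))
even_or_small = SketchOR(is_even, is_small)

numbers = [2, 7, 12, 47, 50]
print([n for n in numbers if even_and_large(n)])  # [12, 50]
print([n for n in numbers if even_or_small(n)])   # [2, 7, 12, 50]
```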
How to define a FilterSet instance
So far we saw how to use filters for each type of structure independently: cell, lineage, colony, and container.
The FilterSet registers filters to be applied on each of these types. It is used to define the statistical ensemble of valid samples, or to define a condition (rules to define a sub-population from the statistical ensemble).
Explicitly, if we would like to use filter_even from our example above as the only filter to define the statistical ensemble, we would define:
from tunacell.filters.main import FilterSet
fset = FilterSet(filtercell=filter_even)
(the other keyword parameters are filterlineage, filtertree, and filtercontainer)
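The two roles of a filter set (defining the statistical ensemble, then defining a condition within it) can be sketched as follows. Everything here is a stand-in written for illustration: SketchFilterSet is hypothetical, its attribute names and its "accept everything" defaults are assumptions rather than documented tunacell behaviour, and integers stand for cell identifiers. Only the four keyword names are taken from the text above.

```python
def accept_all(obj):
    """Assumed default for this sketch: a filter that rejects nothing."""
    return True


class SketchFilterSet:
    """Hypothetical stand-in: one filter slot per structure type."""

    def __init__(self, filtercell=accept_all, filterlineage=accept_all,
                 filtertree=accept_all, filtercontainer=accept_all):
        self.cell_filter = filtercell
        self.lineage_filter = filterlineage
        self.tree_filter = filtertree
        self.container_filter = filtercontainer


def is_even(identifier):
    return identifier % 2 == 0


def is_large(identifier):
    return identifier >= 20


ensemble = SketchFilterSet(filtercell=is_even)    # statistical ensemble
condition = SketchFilterSet(filtercell=is_large)  # sub-population rule

identifiers = [12, 47, 30, 8]

# First remove outliers, then select the sub-population among valid samples.
valid = [i for i in identifiers if ensemble.cell_filter(i)]
subpopulation = [i for i in valid if condition.cell_filter(i)]

print(valid)          # [12, 30, 8]
print(subpopulation)  # [30]
```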