Nitime: an overview¶
Nitime can be used in order to represent, manipulate and analyze data in time-series from experimental data. The main intention of the library is to serve as a platform for analyzing data collected in neuroscientific experiments, ranging from single-cell recordings to fMRI. However, the object-oriented interface may match other kinds of time-series.
In the Tutorial, we provide examples of usage of the objects in the library and of some basic analysis.
Here, we will provide a brief overview of the guiding principles underlying the structure and implementation of the library and the programming interface provided by library.
We will survey the library their attributes and central functions and some possible example use-cases.
Design Principles¶
The main principle of the implementation of this library is a separation between representation of time-series and the analysis of time-series. Thus, the implementation is divided into three main elements:
- Base classes for representation of time and data: These include objects representing time (including support for the representation and conversion between time-units) and objects that serve as containers for data: representation of the time-series to be analyzed. These base classes will be surveyed in more detail in the Base classes
- Algorithms for analysis of time-series: A library containing implementations of algorithms for various analysis methods is provided. Importantly, this library is intentionally agnostic to the existence of the library base-classes. Thus, users can choose to use these algorithms directly, instead of relying on the base-classes provided by the library
- Analyzer interfaces: These objects provide an interface between the algorithm library and the time-series objects. Each one of these objects calls an algorithm from the algorithms These objects rely on the details of the implementation of the time-series objects. The input to these classes is usually a time-series object and a set of parameters, which guide the analysis. Some of the analyzer objects implement a thin interface (or ‘facade’) to algorithms provided by scipy.signal.
This principle is important, because it allows use of the analysis algorithms at two different levels. The algorithms are more general-purpose, but provide less support for the unique properties of time-series. The analyzer objects, on the other hand, provide a richer interface, but may be less flexible in their usage, because they assume use of the base-classes of the library.
This structure also makes development of new algorithms and adoption of analysis code from other sources easier, because no specialized design properties are required in order to include an algorithm or set of algorithms in the algorithm library. However, once algorithms are adopted into the library, it requires that additional development of the analyzer object specific for this set of algorithms be implemented as well.
Another important principle of the implementation is lazy initialization. Most attributes of both time-series and analysis objects are provided on a need-to-know basis. That is, initializing a time-series object, or an analyzer object does not trigger any intensive computations. Instead the computation of the attributes of analyzer objects is delayed until the moment the user calls these attributes. In addition, once a computation is triggered it is stored as an attribute of the object, which assures that accessing the results of an analysis will trigger the computation only on the first time the analysis resut is required. Thereafter, the result of the analysis is stored for further use of this result.
Base classes¶
The library has several sets of classes, used for the representation of time and of time-series, in addition to classes used for analysis.
The first kind of classes is used in order to represent time and inherits from
np.ndarray
, see Time. Another are data containers, used
to represent different kinds of time-series data, see
Time-series A third important kind are analyzer objects. These
objects can be used in order to apply a particular analysis to time-series
objects, see analyzer_objects
Time¶
Experimental data is usually represented with regard to relative time. That is, the time relative to the beginning of the measurement. This is in contrast to many other kinds of data, which are represented with regard to absolute time, (one example of this kind of time is calendaric time, which includes a reference to some common point, such as 0 CE, or Jan. 1st 1970). An example of data which benefits from representation with absolute time is the representation of financial time-series, which can be compared against each other, using the common reference and for which the concept of the work-week applies.
However, because most often the absolute calendar time of the occurrence of events in an experiment is of no importance, we can disregard it. Rather, the comparison of the time progression of data in different experiments conducted in different calendar times (different days, different times in the same day) is more common.
The underlying representation of time in nitime
is in arrays of dtype
int64
. This allows the representation to be immune to rounding errors
arising from representation of time with floating point numbers (see
[Goldberg1991]). However, it restricts the smallest time-interval that can be
represented. In nitime
, the smallest discrete time-points are of size
base_unit
, and this unit is picoseconds. Thus, all underlying
representations of time are made in this unit. Since for most practical uses,
this representation is far too small, this might have resulted, in most cases
in representations of time too long to be useful. In order to make the
time-objects more manageable, time objects in nitime
carry a
time_unit
and a _conversion_factor
, which can be used
as a convenience, in order to convert between the representation of time in the
base unit and the appearance of time in the relevant time-unit.
The first set of base classes is a set of representations of time itself. All
these classes inherit from np.ndarray
. As mentioned above, the dtype of
these classes is int64
and the underlying representation is always at
the base unit. In addition to the methods inherited from np.ndarray
,
these time representations have an at()
method which . The result of this indexing
will be to return the time-point in the the respective TimeSeries
which is most appropriate (see Time-series access for details). They
have an index_at()
method, which returns the integer index of this time
in the underlying array. Finally, they will all have a during()
method,
which will allow indexing into these objects with an
Interval object. This will return the appropriate times corresponding to
an Interval object and index_during()
, which will return the array
of integers corresponding to the indices of these time-points in the array.
For the time being, there are two types of Time classes: TimeArray and UniformTime.
TimeArray
¶
This class has less restrictions on it: it is made of an 1-d array, which contains time-points that are not necessarily ordered. It can also contain several copies of the same time-point. This class can be used in order to represent sparsely occurring events, measured at some unspecified sampling rate and possibly collected from several different channels, where the data is sampled in order of channel and not in order of time. As in the case of the np.ndarray
. This representation of time carries, in addition to the array itself an attribute time_unit
, which is the unit in which we would like to present the time-points (recall that the underlying representation is always in the base-unit).
UniformTime
¶
This class contains ordered uniformly sampled time-points. This class has an explicit representation of t_0
, sampling_rate
and sampling_interval
. Thus, each element in this array can be used in order to represent the entire time interval \(t\), such that: \(t_i\leq t < t + \delta t\), where \(t_i\) is the nominal value held by that element of the array, and \(\delta t\) is the value of sampling_interval
.
This object contains additional attributes that are not shared by the other
time objects. In particular, an object of UniformTime
, UT, will have
the following:
UT.t_0
: the first time-point in the series.UT.sampling_rate
: the sampling rate of the series (this is an instance of .UT.sampling_interval
: the value of \(\delta t\), mentioned above.UT.duration
: the total time of the series.
Obviously, UT.sampling_rate
and UT.sampling_interval
are redundant, but can both be useful.
Frequency
¶
The UT.sampling_rate
of UniformTime
is an object of the Frequency
class. This is a representation of the frequency in Hz. It is derived from a combination of the sampling_interval
and the time_unit
.
Time-series¶
These are data container classes for representing different kinds of time-series data types.
In implementing these objects, we follow the following principles:
- The time-series data representations do not inherit from
np.ndarray
. Instead, one of their attributes is adata
attribute, which is anp.ndarray
. This principle should allow for a clean and compact implementation, which doesn’t carry all manner of unwanted properties into a bloated object with obscure and unknown behaviors. We have previously decided to make time the last dimension in this object, but recently we have been considering making this a user choice (in order to enable indexing into the data by time in a straight-forward manner (using expressions such asTI.data[i]
. - In tandem, one of their attributes is one of the Time base
classes described above. This is the
time
attribute of the time-series object. Therefore, forTimeSeries
it is implemented in the object with adesc.setattr_on_read()
decoration, so that it is only generated if it is needed.
TimeSeries
¶
This represents time-series of data collected continuously and regularly. Can be used in order to represent typical physiological data measurements, such as measurements of BOLD responses, or of membrane-potential. The representation of time here is UniformTime.
XXX Write more about the different attributes of this class.
Epochs
¶
This class represents intervals of time, or epochs. Each instance of this class contains several attributes:
E.start
: This is an object of classTimeArray
, which represents a collection of starting times of epochsE.stop
: This is an object of classTimeArray
which represents a collection of end points of the epochs.E.duration
: This is an object of classTimeArray
which represents the durations of the epochs.E.offset
: This attribute represents the offset of the epochE.time_unit
: This is
Events
¶
This is an object which represents a collection of events. For example, this
can represent discrete button presses occurring during an experiment. This
object contains a TimeArray as its representation of time. This means
that the events recorded in the data
array can be organized
according to any organizing principle you would want, not necessarily according
to their organization or order in time. For example, if events are read from
different devices, the order of the events in the data array can be arbitrarily
chosen to be the order of the devices from which data is read.
Analyzers¶
These objects implement a particular analysis, or family of analyses. Typically, the initialization of this kind of object can happen with a time-series object provided as input, as well as a set of parameter values setting. However, for most analyzer objects, the inputs can be provided upong calling the object, or by assignment to the already generated object.
Sometimes, a user may wish to revert the computation, change some of the
analysis parameters and recompute one or more of the results of the
analysis. In order to do that, the analyzer objects implement a reset
attribute, which reverts the computation of analysis attributes and allows to
change parameters in the analyzer and recompute the analysis results. This
structure keeps the cost of computation of quantities derived from the analysis
rather low.
[Goldberg1991] | Goldberg D (1991). What every computer scientist should know about floating-point arithmetic. ACM computing surveys 23: 5-48 |