.. _biap9: ################################ BIAP9 - The Coordinate Image API ################################ :Author: Chris Markiewicz :Status: Draft :Type: Standards :Created: 2021-09-16 ********** Background ********** Surface data is generally kept separate from geometric metadata =============================================================== In contrast to volumetric data, whose geometry can be fully encoded in the shape of a data array and a 4x4 affine matrix, data sampled to a surface require the location of each sample to be explicitly represented by a coordinate. In practice, the most common approach is to have a geometry file and a data file. A geometry file consists of a vertex coordinate array and a triangle array describing the adjacency of vertices, while a data file is an n-dimensional array with one axis corresponding to vertex. Keeping these files separate is a pragmatic optimization to avoid costly reproductions of geometric data, but presents an administrative burden to direct consumers of the data. Terminology =========== For the purposes of this BIAP, the following terms are used: * Coordinate - a triplet of floating point values in RAS+ space * Vertex - an index into a table of coordinates * Triangle (or face) - a triplet of adjacent vertices (A-B-C); the normal vector for the face is ($\overline{AB}\times\overline{AC}$) * Topology - vertex adjacency data, independent of vertex coordinates, typically in the form of a list of triangles * Geometry - topology + a specific set of coordinates for a surface * Parcel - a subset of vertices; can be the full topology. Special cases include: * Patch - a connected parcel * Decimated mesh - a parcel that has a desired density of vertices * Parcel sequence - an ordered set of parcels * Data array - an n-dimensional array with one axis corresponding to the vertices (typical) OR faces (more rare) in a patch sequence Currently supported surface formats =================================== * FreeSurfer * Geometry (e.g. ``lh.pial``): :py:func:`~nibabel.freesurfer.io.read_geometry` / :py:func:`~nibabel.freesurfer.io.write_geometry` * Data * Morphometry: :py:func:`~nibabel.freesurfer.io.read_morph_data` / :py:func:`~nibabel.freesurfer.io.write_morph_data` * Labels: :py:func:`~nibabel.freesurfer.io.read_label` * MGH: :py:class:`~nibabel.freesurfer.mghformat.MGHImage` * GIFTI: :py:class:`~nibabel.gifti.gifti.GiftiImage` * Every image contains a collection of data arrays, which may be coordinates, topology, or data (further subdivided by type and intent) * CIFTI-2: :py:class:`~nibabel.cifti2.cifti2.Cifti2Image` * Pure data array, with image header containing flexible axes * The ``BrainModelAxis`` is a subspace sequence including patches for each hemisphere (cortex without the medial wall) and subcortical structures defined by indices into three-dimensional array and an affine matrix * Geometry referred to by an associated ``wb.spec`` file (no current implementation in NiBabel) * Possible to have one with no geometric information, e.g., parcels x time Other relevant formats ====================== * MNE's STC (source time course) format. Contains: * Subject name (resolvable with a FreeSurfer ``SUBJECTS_DIR``) * Index arrays into left and right hemisphere surfaces (subspace sequence) * Data, one of: * ndarray of shape ``(n_verts, n_times)`` * tuple of ndarrays of shapes ``(n_verts, n_sensors)`` and ``(n_sensors, n_times)`` * Time start * Time step ***************************************** Desiderata for an API supporting surfaces ***************************************** The following are provisional guiding principles: 1. A surface image (data array) should carry a reference to geometric metadata that is easily transferred to a new image. 2. Partial images (data only or geometry only) should be possible. Absence of components should have a well-defined signature, such as a property that is ``None`` or a specific ``Exception`` is raised. 3. All arrays (coordinates, triangles, data arrays) should be proxied to avoid excess memory consumption 4. Selecting among coordinates (e.g., gray/white boundary, inflated surface) for a single topology should be possible. 5. Combining multiple brain structures (canonically, left and right hemispheres) in memory should be easy; serializing to file may be format-specific. 6. Splitting a data array into independent patches that can be separately operated on and serialized should be possible. Prominent use cases =================== We consider the following use cases for working with surface data. A good API will make retrieving the components needed for each use case straightforward, as well as storing the results in new images. * Arithmetic/modeling - per-vertex mathematical operations * Smoothing - topology/geometry-respecting smoothing * Plotting - paint the data array as a texture on a surface * Decimation - subsampling a topology (possibly a subset, possibly with interpolated vertex locations) * Resampling to a geometrically-aligned surface * Downsampling by decimating, smoothing, resampling * Inter-subject resampling by using ``?h.sphere.reg`` * Interpolation of per-vertex and per-face data arrays When possible, we prefer to expose NumPy ``ndarray``\s and allow use of numpy, scipy, scikit-learn. In some cases, it may make sense for NiBabel to provide methods. ******** Proposal ******** A ``CoordinateImage`` is an N-dimensional array, where one axis corresponds to a sequence of points in one or more parcels. .. code-block:: python class CoordinateImage: """ Attributes ---------- header : a file-specific header coordaxis : ``CoordinateAxis`` dataobj : array-like """ class CoordinateAxis: """ Attributes ---------- parcels : list of ``Parcel`` objects """ def load_structures(self, mapping): """ Associate parcels to ``Pointset`` structures """ def __getitem__(self, slicer): """ Return a sub-sampled CoordinateAxis containing structures matching the indices provided. """ def get_indices(self, parcel, indices=None): """ Return the indices in the full axis that correspond to the requested parcel. If indices are provided, further subsample the requested parcel. """ class Parcel: """ Attributes ---------- name : str structure : ``Pointset`` indices : object that selects a subset of coordinates in structure """ To describe coordinate geometry, the following structures are proposed: .. code-block:: python class Pointset: @property def n_coords(self): """ Number of coordinates """ def get_coords(self, name=None): """ Nx3 array of coordinates in RAS+ space """ class TriangularMesh(Pointset): @property def n_triangles(self): """ Number of faces """ def get_triangles(self, name=None): """ Mx3 array of indices into coordinate table """ def get_mesh(self, name=None): return self.get_coords(name=name), self.get_triangles(name=name) def get_names(self): """ List of surface names that can be passed to ``get_{coords,triangles,mesh}`` """ def decimate(self, *, n_coords=None, ratio=None): """ Return a TriangularMesh with a smaller number of vertices that preserves the geometry of the original """ # To be overridden when a format provides optimization opportunities class NdGrid(Pointset): """ Attributes ---------- shape : 3-tuple number of coordinates in each dimension of grid """ def get_affine(self, name=None): """ 4x4 array """ The ``NdGrid`` class allows raveled volumetric data to be treated the same as triangular mesh or other coordinate data. Finally, a structure for containing a collection of related geometric files is defined: .. code-block:: python class GeometryCollection: """ Attributes ---------- structures : dict Mapping from structure names to ``Pointset`` """ @classmethod def from_spec(klass, pathlike): """ Load a collection of geometries from a specification. """ The canonical example of a geometry collection is a left hemisphere mesh, right hemisphere mesh. Here we present common use cases: Modeling ======== .. code-block:: python from nilearn.glm.first_level import make_first_level_design_matrix, run_glm bold = CoordinateImage.from_filename("/data/func/hemi-L_bold.func.gii") dm = make_first_level_design_matrix(...) labels, results = run_glm(bold.get_fdata(), dm) betas = CoordinateImage(results["betas"], bold.coordaxis, bold.header) betas.to_filename("/data/stats/hemi-L_betas.mgz") In this case, no reference to the surface structure is needed, as the operations occur on a per-vertex basis. The coordinate axis and header are preserved to ensure that any metadata is not lost. Here we assume that ``CoordinateImage`` is able to make the appropriate translations between formats (GIFTI, MGH). This is not guaranteed in the final API. Smoothing ========= .. code-block:: python bold = CoordinateImage.from_filename("/data/func/hemi-L_bold.func.gii") bold.coordaxis.load_structures({"lh": "/data/anat/hemi-L_midthickness.surf.gii"}) # Not implementing networkx weighted graph here, so assume we have a function # that retrieves a graph for each structure graphs = get_graphs(bold.coordaxis) distances = distance_matrix(graphs['lh']) # n_coords x n_coords matrix weights = normalize(gaussian(distances, sigma)) # Wildly inefficient smoothing algorithm smoothed = CoordinateImage(weights @ bold.get_fdata(), bold.coordaxis, bold.header) smoothed.to_filename(f"/data/func/hemi-L_smooth-{sigma}_bold.func.gii") Plotting ======== Nilearn currently provides a `plot_surf `_ function. With the proposed API, we could interface as follows: .. code-block:: python def plot_surf_img(img, surface="inflated"): from nilearn.plotting import plot_surf coords, triangles = img.coordaxis.parcels[0].get_mesh(name=surface) data = img.get_fdata() return plot_surf((triangles, coords), data) tstats = CoordinateImage.from_filename("/data/stats/hemi-L_contrast-taskVsBase_tstat.mgz") # Assume a GeometryCollection that reads a FreeSurfer subject directory fs_subject = FreeSurferSubject.from_spec("/data/subjects/fsaverage5") tstats.coordaxis.load_structures(fs_subject.get_structure("lh")) plot_surf_img(tstats) Subsampling CIFTI-2 =================== .. code-block:: python img = nb.load("sub-01_task-rest_bold.dtseries.nii") # Assume CIFTI CoordinateImage parcel = nb.load("sub-fsLR_hemi-L_label-DLPFC_mask.label.gii") # GiftiImage structure = parcel.meta.metadata['AnatomicalStructurePrimary'] # "CortexLeft" vtx_idcs = np.where(parcel.agg_data())[0] dlpfc_idcs = img.coordaxis.get_indices(parcel=structure, indices=vtx_idcs) # Subsampled coordinate axes will override any duplicate information from header dlpfc_img = CoordinateImage(img.dataobj[dlpfc_idcs], img.coordaxis[dlpfc_idcs], img.header) # Now load geometry so we can plot wbspec = CaretSpec("fsLR.wb.spec") dlpfc_img.coordaxis.load_structures(wbspec) ...