Working with Arrays

This section covers objects that have array data associated with them. It explains how to handle that data: how to create a new object, how to access related data, and which options are available when querying data.

Data is stored in Files

We use the HDF5 file format to store all array data. Several performance tests showed that this is the optimal way of managing large data arrays. In practice this means that for every object that has array data associated with it (like AnalogSignal or SpikeTrain), the data is stored in a related HDF5 file. To access the data through the API you essentially access the associated file. There are several format options, explained in the sections below.

Access data

Accessing the data requires two HTTP requests:
  • first, you get the object in JSON, which contains a reference to the datafile,
  • second, you access the datafile directly using the reference from the previous response (a detailed example is given in the next section).

This two-step routine avoids the inefficiency of transferring data directly inside one full JSON response body assembled for several objects, which would increase the amount of data transferred and lead to high server and client load. With the large data arrays that are typical in neuroscience, efficient data access is crucial.
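
The following minimal sketch shows this two-step pattern in Python using the requests library. The base URL and the authenticated session are assumptions; the endpoint and the location of the datafile reference follow the example later in this section.

import requests

BASE_URL = "https://example.org"  # hypothetical server address

session = requests.Session()  # assumed to be already authenticated

# Step 1: get the object in JSON; it contains a reference to the datafile.
obj = session.get(BASE_URL + "/electrophysiology/analogsignal/11/").json()
data_link = obj["selected"][0]["fields"]["signal"]["data"]

# Step 2: access the datafile directly using the reference from step 1.
datafile = session.get(BASE_URL + data_link)
with open("signal.h5", "wb") as f:
    f.write(datafile.content)  # by default the data comes back as an HDF5 file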

Partial data requests

All data-related objects that represent electrophysiological signals support partial data requests. This means you may request only a slice of the data array and avoid downloading the whole signal, which is very practical when accessing large data arrays. Partial requests work for both single and multiple object requests. Use the following GET request parameters to configure a partial data response:

  • [start_time] - start time of the required range (expressed in the same time unit as the t_start of the signal)
  • [end_time] - end time of the required range (expressed in the same time unit as the t_start of the signal)
  • [duration] - duration of the required range (expressed in the same time unit as the t_start of the signal)
  • [start_index] - start index of the required range (the index of the first datapoint)
  • [end_index] - end index of the required range (the index of the last datapoint)
  • [samples_count] - number of datapoints in the required range
  • [downsample] - target number of datapoints. This parameter indicates that downsampling is required; downsampling is applied on top of the range selected with the other parameters (if specified).

Note. Reasonable combinations of these parameters (like ‘start_time’ and ‘duration’, or ‘start_index’ and ‘end_time’) return a correct response. Redundant parameters are ignored; meaningless combinations may result in a 400 Bad Request.
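
As an illustration of equivalent combinations, the following two requests select the same 100 ms range, since start_time=50 together with duration=100 implies end_time=150 (all values in the time unit of the signal’s t_start):

Request: GET /electrophysiology/analogsignal/11/?start_time=50&duration=100

Request: GET /electrophysiology/analogsignal/11/?start_time=50&end_time=150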

For example, to get a 100 ms time window of the AnalogSignal with ID = 11, starting at 50 ms and downsampled to 20 datapoints, first request the object:

Request: GET /electrophysiology/analogsignal/11/?start_time=50&duration=100&downsample=20

The response contains the object together with a link to the selected data slice:

HTTP SUCCESS (200)

{
...
"selected": [
    {
    "fields": {
        "id": 11,
        "name": "LFP FIX Signal-10, size: 2000",
        "signal": {
            "units": "mv",
            "data": "/datafiles/2177?start_index=1000&end_index=3001&downsample=20"
        },
        "t_start": {
            "units": "ms",
            "data": 50
        },
        "sampling_rate": {
            "units": "hz",
            "data": 20000
        },
        "guid": "00ec575f2f20eae37907d1927afaabe09bab2b39",
        ...
    },
    "model": "neo_api.analogsignal",
    "permalink": "/electrophysiology/analogsignal/11"
    }
],
...
}

and then request an appropriate data slice using the link in the response (in this case “/datafiles/2177?start_index=1000&end_index=3001&downsample=20”).
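
As a worked check of the index range in that link (assuming the original signal starts at 0 ms): at a sampling rate of 20000 Hz there are 20 datapoints per millisecond, so a start time of 50 ms corresponds to start_index = 50 × 20 = 1000, and the requested 100 ms window ending at 150 ms corresponds to the region around index 3000, consistent with the returned link.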

Notice that the “t_start” data field in the response has a data value of 50, indicating the start of the requested part of the array rather than of the full original array.

Data in different formats

You may request data in different formats, depending on which is better suited to your client application. At the moment we support the following formats:
  • HDF5
  • JSON

HDF5

By default you get the data back as an HDF5 file. Inside the file, the array is stored at the root of the HDF5 hierarchy (normally there should not be any other object in the file, so simply accessing the first object returns the correct array); the array is named after its global ID. To summarize:
  • it is an HDF5 file
  • an array is in the root (“/”)
  • the dimensions of the array are compatible with the object specification
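
A minimal sketch for reading such a file with h5py, assuming it has already been downloaded as signal.h5 (for instance with the snippet shown earlier):

import h5py

with h5py.File("signal.h5", "r") as f:
    # the array is the only object in the root of the hierarchy,
    # and its name equals the global ID of the object
    name = list(f.keys())[0]
    array = f[name][:]  # read the whole array into memory

print(array.shape)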

JSON

To request a different format, provide the “format” parameter in the GET request. At the moment there is only one alternative format - JSON - so just provide “format=json” in the GET request to get the data back in the response body as a list of values (be aware the response is a simple string).
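
For example, requesting the datafile from the earlier example with this parameter might look as follows in Python; the base URL is an assumption, and the snippet assumes the returned string is a JSON-encoded list of values:

import json
import requests

BASE_URL = "https://example.org"  # hypothetical server address

# request the datafile contents as JSON instead of the default HDF5
response = requests.get(BASE_URL + "/datafiles/2177?format=json")

# the body is a plain string; assuming it is a JSON-encoded list of values,
# it can be parsed like this
values = json.loads(response.text)
print(len(values))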
