datarray package

Submodules

datarray.datarray module

class datarray.datarray.AxesManager(arr, axes)

Bases: object

Class to manage the logic of the datarray.axes object.

>>> A = DataArray(np.random.randn(200, 4, 10),
...               axes=('date', ('stocks', ('aapl', 'ibm', 'goog', 'msft')), 'metric'))
>>> isinstance(A.axes, AxesManager)
True

At a basic level, AxesManager acts like a sequence of axes:

>>> A.axes 
(Axis(name='date', index=0, labels=None), ..., Axis(name='metric', index=2, labels=None))
>>> A.axes[0]
Axis(name='date', index=0, labels=None)
>>> len(A.axes)
3
>>> A.axes[4]
Traceback (most recent call last):
    ...
IndexError: Requested axis 4 out of bounds

Each axis is accessible as a named attribute:

>>> A.axes.stocks
Axis(name='stocks', index=1, labels=('aapl', 'ibm', 'goog', 'msft'))

An axis can be indexed by integers or by labels:

>>> np.all(A.axes.stocks['aapl':'goog'] == A.axes.stocks[0:2])
True
>>> np.all(A.axes.stocks[0:2] == A[:,0:2,:])
True

Axes can also be accessed numerically:

>>> A.axes[1] is A.axes.stocks
True

Calling the AxesManager with string arguments will return an AxisIndexer object which can be used to restrict slices to specified axes:

>>> Ai = A.axes('stocks', 'date')
>>> np.all(Ai['aapl':'goog', 100] == A[100, 0:2])
True

Mixing axis names and integers when calling AxesManager is not yet supported; the following example is therefore commented out:

# >>> np.all(A.axes(1, 'date')['aapl':'goog', 100:200] == A[100:200, 0:2])  # True

class datarray.datarray.Axis(name, index, parent_arr, labels=None)

Bases: object

Object to access a given axis of an array.

at(label)

Return data at a given label.

>>> narr = DataArray(np.random.standard_normal((4,5)), axes=['a', ('b', 'abcde')])
>>> arr = narr.axes.b['c']
>>> arr.axes
(Axis(name='a', index=0, labels=None),)

drop(labels)

Drop the given labels from an axis, keeping the rest.

Example

>>> darr = DataArray(np.random.standard_normal((4,5)),
...                  axes=['a', ('b', ['a','b','c','d','e'])])
>>> arr1 = darr.axes.b.keep(['c','d'])
>>> arr2 = darr.axes.b.drop(['a','b','e'])
>>> np.all(arr1 == arr2)
True

keep(labels)

Keep only certain labels of an axis.

>>> narr = DataArray(np.random.standard_normal((4,5)),
...                  axes=['a', ('b', 'abcde')])
>>> arr = narr.axes.b.keep('cd')
>>> [a.labels for a in arr.axes]
[None, 'cd']
>>> arr.axes.a.at('label')
Traceback (most recent call last):
    ...
ValueError: axis must have labels to extract data at a given label

make_slice(key)

Make a slicing tuple into the parent array such that this Axis is cut up in the requested manner.

Parameters

key (a slice object, single label-like item, or None) – The .start and .stop attributes of a slice object may be of arbitrary type, in which case they are looked up as labels. The .step attribute must be None or an integer.

Returns

keys

Return type

parent_arr.ndim-length tuple for slicing
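As a rough illustration of the idea (not the library's implementation), the sketch below builds an ndim-length slicing tuple in plain Python. The names make_slice_sketch and label_to_position are hypothetical, the sketch assumes that integer bounds are positions and anything else is a label, and the None case is left out.

```python
def label_to_position(labels, value):
    """Translate one slice bound into an integer position (or None)."""
    if value is None or isinstance(value, int):
        return value  # assumption: ints are already positions
    return labels.index(value)  # raises ValueError for an unknown label


def make_slice_sketch(axis_index, ndim, key, labels=()):
    """Hypothetical helper: apply `key` to one axis, take every other axis whole."""
    if isinstance(key, slice):
        key = slice(label_to_position(labels, key.start),
                    label_to_position(labels, key.stop),
                    key.step)
    else:
        key = label_to_position(labels, key)
    full = [slice(None)] * ndim  # slice(None) selects an entire axis
    full[axis_index] = key
    return tuple(full)


stocks = ("aapl", "ibm", "goog", "msft")
keys = make_slice_sketch(1, 3, slice("aapl", "goog"), labels=stocks)
single = make_slice_sketch(1, 3, "goog", labels=stocks)
print(keys)
```

The resulting tuple can be used directly to index a 3-D array: only axis 1 is restricted, and the other two axes are taken whole.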

set_name(name)

class datarray.datarray.AxisIndexer(arr, *args)

Bases: object

An object which holds a reference to a DataArray and a list of axes and allows slicing by those axes.
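The reordering that such an indexer has to perform can be sketched in plain Python. This is only an illustration under assumptions: expand_key is a hypothetical helper, and the real class may resolve labels and validate its input differently.

```python
def expand_key(all_names, chosen_names, key):
    """Hypothetical helper: map a key given in `chosen_names` order
    onto a full positional key in `all_names` order."""
    if not isinstance(key, tuple):
        key = (key,)
    if len(key) > len(chosen_names):
        raise IndexError("too many indices for the chosen axes")
    full = [slice(None)] * len(all_names)  # untouched axes are taken whole
    for name, item in zip(chosen_names, key):
        full[all_names.index(name)] = item
    return tuple(full)


names = ("date", "stocks", "metric")
# Key written in ('stocks', 'date') order, as in A.axes('stocks', 'date')[0:2, 100]
full_key = expand_key(names, ("stocks", "date"), (slice(0, 2), 100))
print(full_key)
```

Here the key written in ('stocks', 'date') order is rearranged so that 100 lands on the 'date' axis and 0:2 on the 'stocks' axis, which mirrors the A[100, 0:2] equivalence shown for AxesManager.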

class datarray.datarray.DataArray

Bases: numpy.ndarray

property T

Same as self.transpose(), except that self is returned if self.ndim < 2.

Examples

>>> x = np.array([[1.,2.],[3.,4.]])
>>> x
array([[ 1.,  2.],
       [ 3.,  4.]])
>>> x.T
array([[ 1.,  3.],
       [ 2.,  4.]])
>>> x = np.array([1.,2.,3.,4.])
>>> x
array([ 1.,  2.,  3.,  4.])
>>> x.T
array([ 1.,  2.,  3.,  4.])

all(axis=None, out=None, keepdims=False)

Returns True if all elements evaluate to True.

Refer to numpy.all for full documentation.

See also

numpy.all()

equivalent function

any(axis=None, out=None, keepdims=False)

Returns True if any of the elements of a evaluate to True.

Refer to numpy.any for full documentation.

See also

numpy.any()

equivalent function

argmax(axis=None, out=None)

Return indices of the maximum values along the given axis.

Refer to numpy.argmax for full documentation.

See also

numpy.argmax()

equivalent function

argmin(axis=None, out=None)

Return indices of the minimum values along the given axis of a.

Refer to numpy.argmin for detailed documentation.

See also

numpy.argmin()

equivalent function

argsort(axis=-1, kind='quicksort', order=None)

Returns the indices that would sort this array.

Refer to numpy.argsort for full documentation.

See also

numpy.argsort()

equivalent function

cumprod(axis=None, dtype=None, out=None)

Return the cumulative product of the elements along the given axis.

Refer to numpy.cumprod for full documentation.

See also

numpy.cumprod()

equivalent function

cumsum(axis=None, dtype=None, out=None)

Return the cumulative sum of the elements along the given axis.

Refer to numpy.cumsum for full documentation.

See also

numpy.cumsum()

equivalent function

diagonal(offset=0, axis1=0, axis2=1)

Return specified diagonals. In NumPy 1.9 the returned array is a read-only view instead of a copy as in previous NumPy versions. In a future version the read-only restriction will be removed.

Refer to numpy.diagonal() for full documentation.

See also

numpy.diagonal()

equivalent function

flatten(order='C')

Return a copy of the array collapsed into one dimension.

Parameters

order ({'C', 'F', 'A', 'K'}, optional) – ‘C’ means to flatten in row-major (C-style) order. ‘F’ means to flatten in column-major (Fortran-style) order. ‘A’ means to flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise. ‘K’ means to flatten a in the order the elements occur in memory. The default is ‘C’.

Returns

y – A copy of the input array, flattened to one dimension.

Return type

ndarray

See also

ravel()

Return a flattened array.

flat()

A 1-D flat iterator over the array.

Examples

>>> a = np.array([[1,2], [3,4]])
>>> a.flatten()
array([1, 2, 3, 4])
>>> a.flatten('F')
array([1, 3, 2, 4])

index_by(*args)

max(axis=None, out=None, keepdims=False)

Return the maximum along a given axis.

Refer to numpy.amax for full documentation.

See also

numpy.amax()

equivalent function

mean(axis=None, dtype=None, out=None, keepdims=False)

Returns the average of the array elements along given axis.

Refer to numpy.mean for full documentation.

See also

numpy.mean()

equivalent function

min(axis=None, out=None, keepdims=False)

Return the minimum along a given axis.

Refer to numpy.amin for full documentation.

See also

numpy.amin()

equivalent function

property names

Returns a tuple with all the axis names.

prod(axis=None, dtype=None, out=None, keepdims=False)

Return the product of the array elements over the given axis.

Refer to numpy.prod for full documentation.

See also

numpy.prod()

equivalent function

ptp(axis=None, out=None, keepdims=False)

Peak to peak (maximum - minimum) value along a given axis.

Refer to numpy.ptp for full documentation.

See also

numpy.ptp()

equivalent function

ravel([order])

Return a flattened array.

Refer to numpy.ravel for full documentation.

See also

numpy.ravel()

equivalent function

ndarray.flat()

A flat iterator on the array.

repeat(repeats, axis=None)

Repeat elements of an array.

Refer to numpy.repeat for full documentation.

See also

numpy.repeat()

equivalent function

reshape(shape, order='C')

Returns an array containing the same data with a new shape.

Refer to numpy.reshape for full documentation.

See also

numpy.reshape()

equivalent function

Notes

Unlike the free function numpy.reshape, this method on ndarray allows the elements of the shape parameter to be passed in as separate arguments. For example, a.reshape(10, 11) is equivalent to a.reshape((10, 11)).

set_name(i, name)

sort(axis=-1, kind='quicksort', order=None)

Sort an array, in-place.

Parameters
  • axis (int, optional) – Axis along which to sort. Default is -1, which means sort along the last axis.

  • kind ({'quicksort', 'mergesort', 'heapsort', 'stable'}, optional) – Sorting algorithm. Default is ‘quicksort’.

  • order (str or list of str, optional) – When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

See also

numpy.sort()

Return a sorted copy of an array.

argsort()

Indirect sort.

lexsort()

Indirect stable sort on multiple keys.

searchsorted()

Find elements in sorted array.

partition()

Partial sort.

Notes

See numpy.sort for notes on the different sorting algorithms.

Examples

>>> a = np.array([[1,4], [3,1]])
>>> a.sort(axis=1)
>>> a
array([[1, 4],
       [1, 3]])
>>> a.sort(axis=0)
>>> a
array([[1, 3],
       [1, 4]])

Use the order keyword to specify a field to use when sorting a structured array:

>>> a = np.array([('a', 2), ('c', 1)], dtype=[('x', 'S1'), ('y', int)])
>>> a.sort(order='y')
>>> a
array([('c', 1), ('a', 2)],
      dtype=[('x', '|S1'), ('y', '<i4')])

squeeze(axis=None)

Remove single-dimensional entries from the shape of a.

Refer to numpy.squeeze for full documentation.

See also

numpy.squeeze()

equivalent function

std(axis=None, dtype=None, out=None, ddof=0, keepdims=False)

Returns the standard deviation of the array elements along given axis.

Refer to numpy.std for full documentation.

See also

numpy.std()

equivalent function

sum(axis=None, dtype=None, out=None, keepdims=False)

Return the sum of the array elements over the given axis.

Refer to numpy.sum for full documentation.

See also

numpy.sum()

equivalent function

swapaxes(axis1, axis2)

Return a view of the array with axis1 and axis2 interchanged.

Refer to numpy.swapaxes for full documentation.

See also

numpy.swapaxes()

equivalent function

transpose(*axes)

Returns a view of the array with axes transposed.

For a 1-D array, this has no effect. (To change between column and row vectors, first cast the 1-D array into a matrix object.) For a 2-D array, this is the usual matrix transpose. For an n-D array, if axes are given, their order indicates how the axes are permuted (see Examples). If axes are not provided and a.shape = (i[0], i[1], ... i[n-2], i[n-1]), then a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0]).

Parameters

axes (None, tuple of ints, or n ints) –

  • None or no argument: reverses the order of the axes.

  • tuple of ints: i in the j-th place in the tuple means a’s i-th axis becomes a.transpose()’s j-th axis.

  • n ints: same as an n-tuple of the same ints (this form is intended simply as a “convenience” alternative to the tuple form)

Returns

out – View of a, with axes suitably permuted.

Return type

ndarray

See also

ndarray.T()

Array property returning the array transposed.

Examples

>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> a.transpose()
array([[1, 3],
       [2, 4]])
>>> a.transpose((1, 0))
array([[1, 3],
       [2, 4]])
>>> a.transpose(1, 0)
array([[1, 3],
       [2, 4]])
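
The examples above are all 2-D. For the n-D case, the rule for how the shape is permuted can be sketched without NumPy. The helper name transposed_shape is hypothetical and only models the shape bookkeeping, not the data movement.

```python
def transposed_shape(shape, axes=None):
    """Hypothetical helper: shape of a.transpose(*axes) for an array of `shape`."""
    ndim = len(shape)
    if axes is None:
        axes = tuple(reversed(range(ndim)))  # no argument: reverse the axes
    if sorted(axes) != list(range(ndim)):
        raise ValueError("axes must be a permutation of range(ndim)")
    # i in the j-th place: the input's i-th axis becomes the output's j-th axis
    return tuple(shape[i] for i in axes)


reversed_shape = transposed_shape((200, 4, 10))
permuted_shape = transposed_shape((200, 4, 10), (1, 0, 2))
print(reversed_shape, permuted_shape)
```

With the shape (200, 4, 10), passing no axes gives (10, 4, 200), while the permutation (1, 0, 2) swaps only the first two axes and gives (4, 200, 10).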

var(axis=None, dtype=None, out=None, ddof=0, keepdims=False)

Returns the variance of the array elements, along given axis.

Refer to numpy.var for full documentation.

See also

numpy.var()

equivalent function

class datarray.datarray.KeyStruct(**kw)

Bases: object

A slightly enhanced version of a struct-like class with named key access.

Examples

>>> a = KeyStruct()
>>> a.x = 1
>>> a['x']
1
>>> a['y'] = 2
>>> a.y
2
>>> a[3] = 3
Traceback (most recent call last):
  ...
TypeError: hasattr(): attribute name must be string
>>> b = KeyStruct(x=1, y=2)
>>> b.x
1
>>> b['y']
2
>>> b['y'] = 4
Traceback (most recent call last):
  ...
AttributeError: KeyStruct already has atribute 'y'

exception datarray.datarray.NamedAxisError

Bases: Exception

datarray.datarray.is_numpy_scalar(arr)

datarray.datarray.names2namedict(names)

Make a name map out of any name input.

datarray.print_grid module

Functions for pretty-printing tabular data, such as a DataArray, as a grid.

class datarray.print_grid.BoolFormatter(data=None)

Bases: datarray.print_grid.GridDataFormatter

The BoolFormatter prints ‘True’ and ‘False’ if there is room, and otherwise prints ‘T’ and ‘-’ (‘T’ and ‘F’ are too visually similar).
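
A minimal sketch of this width-dependent choice follows. The function format_bool_sketch is hypothetical, and the threshold of 5 columns (the length of ‘False’) and the right-justification are assumptions rather than documented behavior.

```python
def format_bool_sketch(value, width=5):
    """Hypothetical helper: long form if it fits, otherwise 'T' or '-'."""
    if width >= len("False"):  # assumption: both long forms must fit
        text = "True" if value else "False"
    else:
        text = "T" if value else "-"
    return text.rjust(width)  # assumption: cells are right-justified


wide = [format_bool_sketch(v) for v in (True, False)]
narrow = [format_bool_sketch(v, width=1) for v in (True, False)]
print(wide, narrow)
```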

format(value, width=5)

Formats a given value to a fixed width.

max_width()
standard_width()

class datarray.print_grid.ComplexFormatter(data)

Bases: datarray.print_grid.GridDataFormatter

A ComplexFormatter uses two FloatFormatters side by side. This can make its min_width fairly large.

format(value, width=None)

Formats a given value to a fixed width.

max_width()
min_width()
standard_width()

class datarray.print_grid.FloatFormatter(data, sign=False, strip_zeros=True)

Bases: datarray.print_grid.GridDataFormatter

Formats floating point numbers either in standard or exponential notation, whichever fits better and represents the numbers better in the given amount of space.
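
One way to picture "whichever fits better" is to generate both notations at every precision that fits and keep the most accurate one. The sketch below does exactly that; it is not the algorithm FloatFormatter uses, and format_float_sketch and its '#' overflow marker are hypothetical.

```python
def format_float_sketch(value, width):
    """Hypothetical helper: pick fixed or exponential notation for `width` columns."""
    candidates = []
    for precision in range(width, -1, -1):
        for spec in ("f", "e"):
            text = format(value, f".{precision}{spec}")
            if len(text) <= width:
                candidates.append(text)
    if not candidates:
        return "#" * width  # assumption: flag values that cannot fit at all
    # Prefer the text that parses back closest to the value, then the shortest.
    best = min(candidates, key=lambda t: (abs(float(t) - value), len(t)))
    return best.rjust(width)


small = format_float_sketch(0.000012345, 8)
plain = format_float_sketch(3.14159, 8)
huge = format_float_sketch(1.5e20, 8)
print(small, plain, huge)
```

In 8 columns a very small or very large value ends up in exponential notation because fixed notation would lose most of its significant digits, while a value such as 3.14159 stays in fixed notation.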

format(value, width=None)

Formats a given value to a fixed width.

format_all(values, width=None)

Formats an array of values to a fixed width, returning a string array.

max_width()
min_width()
standard_width()

class datarray.print_grid.GridDataFormatter(data=None)

Bases: object

A GridDataFormatter takes an ndarray of objects and represents them as equal-length strings. It is flexible about what string length to use, and can make suggestions about the string length based on the data it will be asked to render.

Each GridDataFormatter instance specifies:

  • min_width, the smallest acceptable width

  • standard_width, a reasonable width when putting many items on the screen

  • max_width, the width it prefers if space is not limited

This top-level class specifies reasonable defaults for a formatter, and subclasses refine it for particular data types.
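
The three width hints can be illustrated with a small stand-in class. WidthHints is hypothetical, and its concrete numbers (a minimum of 1, a standard width capped at 9, truncation and right-justification in format) are assumptions chosen for the example, not the defaults of GridDataFormatter.

```python
class WidthHints:
    """Hypothetical stand-in for a formatter that suggests column widths."""

    def __init__(self, data=None):
        self.data = [] if data is None else list(data)

    def min_width(self):
        return 1  # assumption: one character is the smallest usable cell

    def max_width(self):
        # Widest item seen in the data the formatter was primed on.
        return max((len(str(v)) for v in self.data), default=1)

    def standard_width(self):
        return min(9, self.max_width())  # assumption: cap ordinary cells at 9

    def format(self, value, width=None):
        if width is None:
            width = self.standard_width()
        return str(value)[:width].rjust(width)  # truncate, then pad


hints = WidthHints(["a", "bcd", "efghijklmnop"])
print(hints.min_width(), hints.standard_width(), hints.max_width())
print(repr(hints.format("bcd", 5)))
```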

format(value, width=None)

Formats a given value to a fixed width.

format_all(values, width=None)

Formats an array of values to a fixed width, returning a string array.

max_width()
min_width()
standard_width()

class datarray.print_grid.IntFormatter(data, sign=False, strip_zeros=True)

Bases: datarray.print_grid.FloatFormatter

The IntFormatter tries to just print all the digits of the ints, but falls back on being an exponential FloatFormatter if there isn’t room.
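
The fallback can be sketched as follows. format_int_sketch is a hypothetical helper that uses Python's built-in exponential formatting; the real class delegates to FloatFormatter, whose exact output may differ.

```python
def format_int_sketch(value, width):
    """Hypothetical helper: all digits if they fit, else exponential notation."""
    digits = str(value)
    if len(digits) <= width:
        return digits.rjust(width)
    # Not enough room: try exponential notation, most precise first.
    for precision in range(width, -1, -1):
        text = format(float(value), f".{precision}e")
        if len(text) <= width:
            return text.rjust(width)
    return "#" * width  # assumption: flag values that cannot fit at all


fits = format_int_sketch(42, 6)
too_long = format_int_sketch(123456789, 6)
print(repr(fits), repr(too_long))
```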

standard_width()

class datarray.print_grid.StrFormatter(data=None)

Bases: datarray.print_grid.GridDataFormatter

A StrFormatter’s behavior is almost entirely defined by the default. When it must truncate strings, it insists on showing at least 3 characters.
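
A short sketch of that truncation rule, assuming left-justified padding and plain truncation without an ellipsis (neither is stated above); format_str_sketch is a hypothetical name.

```python
MIN_STR_WIDTH = 3  # from the description: at least 3 characters are shown


def format_str_sketch(value, width):
    """Hypothetical helper: truncate or pad a string, never below 3 characters."""
    width = max(width, MIN_STR_WIDTH)
    return str(value)[:width].ljust(width)


truncated = format_str_sketch("datarray", 1)
padded = format_str_sketch("ab", 5)
print(repr(truncated), repr(padded))
```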

min_width()

datarray.print_grid.array_to_string(arr, width=75, height=10)

Get a 2-D text representation of a NumPy array.

datarray.print_grid.datarray_to_string(arr, width=75, height=10)

Get a 2-D text representation of a datarray.

datarray.print_grid.get_formatter(arr)

Get a formatter for this array’s data type, and prime it on this array.

datarray.print_grid.grid_layout(arr, width=75, height=10)

Given a 2-D non-empty array, turn it into a list of lists of strings to be joined.

This uses plain lists instead of a string array, because certain formatting tricks might want to join columns, resulting in a ragged-shaped array.
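
To see why plain lists are convenient here, consider the sketch below. Both helpers are hypothetical and much simpler than grid_layout itself: the first produces a list of lists of fixed-width strings, the second joins them into text.

```python
def grid_layout_sketch(rows, cell_width=6):
    """Hypothetical helper: turn a 2-D nested list into rows of fixed-width cells."""
    return [[str(value)[:cell_width].rjust(cell_width) for value in row]
            for row in rows]


def join_layout(layout):
    """Join each row's cells with a space, and the rows with newlines."""
    return "\n".join(" ".join(cells) for cells in layout)


layout = grid_layout_sketch([[1, 22], [333, 4444]], cell_width=4)
layout[0] = ["".join(layout[0])]  # merging columns makes this row ragged
text = join_layout(layout)
print(text)
```

After the merge the first row holds one cell and the second row holds two, which a rectangular string array could not represent but a list of lists handles without any special casing.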

datarray.print_grid.labeled_layout(arr, width=75, height=10, row_label_width=9)

Given a 2-D non-empty array that may have labeled axes, rows, or columns, render the array as strings to be joined and attach the axes in visually appropriate places.

Returns a list of lists of strings to be joined.

datarray.print_grid.layout_to_string(layout)

datarray.version module

datarray version information

Module contents

Arrays with rich geometric semantics.