forked from scalanlp/breeze
-
Notifications
You must be signed in to change notification settings - Fork 50
Tensors
dlwh edited this page Feb 16, 2013
·
1 revision
Breeze has a fairly large hierarchy of vector and matrix types. At the base of the hierarchy is breeze.linalg.Tensor[K, V]
, which is more or less like a scala.collection.mutable.Map[K, V]
that is geared towards working with numeric value types. The key type K
is the index type. For instance, Vectors have a key type of Int
, while Matrices have a key type of (Int, Int)
. (Matrices also have a more efficient access apply(r: Int, c: Int)
.)
TODO
The built-in Tensor
hierarchy looks something like this:
- Int-keyed
Tensor
s:-
Vector
- DenseVector: a "normal" array-backed Vector.
-
SparseVector: a sparse vector backed by binary search. O(log numNonZero) access to elements with optimizations for in-order traversal. Increasing the number of nonzero elements in the SparseVector (say, via
update
) is expensive, as O(numNonZero). -
HashVector: a sparse vector backed by a quadratic-probing open address hash array. O(1) access, but typically less memory efficient than
SparseVector
. No in-order traversal.
- Matrix
-
DenseMatrix: a "normal" array-backed
Matrix
. Column-major unless isTranspose is true, in which case it is row-major. (This is consistent with standard BLAS/LAPACK assumptions.) - CSCMatrix: a still-under-development Compressed Sparse Columns Matrix. Each column of the matrix is analogous to a SparseVector: access to an element is O(log numNonZeroInColumn).
-
DenseMatrix: a "normal" array-backed
-
- Arbitrarily keyed
Tensor
s:
Some terminology you might see frequently in Breeze:
- active means that the method only operates on key/value pairs (or just keys or just values) that have storage associated with them. For instance, for a SparseVector, active means only those elements that are actually stored in the array (typically with non-zero value). For DenseVectors, active will look at all elements.
- A slice is a view of an underlying tensor. Slices can reuse storage.
- A UFunc is a universal function, much like numpy's. The basic idea is that it's a function that can be applied to all values of any type that exposes its values via an implicit. See UFunc for more information. (TODO)
- A URFunc is a universal reduction function. The basic idea is that it's a function that can be used to "reduce" all values into a single result value. See URFunc for more information. (TODO)