Skip to content
dlwh edited this page Feb 16, 2013 · 1 revision

Breeze has a fairly large hierarchy of vector and matrix types. At the base of the hierarchy is breeze.linalg.Tensor[K, V], which is more or less like a scala.collection.mutable.Map[K, V] that is geared towards working with numeric value types. The key type K is the index type. For instance, Vectors have a key type of Int, while Matrices have a key type of (Int, Int). (Matrices also have a more efficient access apply(r: Int, c: Int).)

TODO

Kinds of Tensors

The built-in Tensor hierarchy looks something like this:

  • Int-keyed Tensors:
    • Vector
      • DenseVector: a "normal" array-backed Vector.
      • SparseVector: a sparse vector backed by binary search. O(log numNonZero) access to elements with optimizations for in-order traversal. Increasing the number of nonzero elements in the SparseVector (say, via update) is expensive, as O(numNonZero).
      • HashVector: a sparse vector backed by a quadratic-probing open address hash array. O(1) access, but typically less memory efficient than SparseVector. No in-order traversal.
    • Matrix
      • DenseMatrix: a "normal" array-backed Matrix. Column-major unless isTranspose is true, in which case it is row-major. (This is consistent with standard BLAS/LAPACK assumptions.)
      • CSCMatrix: a still-under-development Compressed Sparse Columns Matrix. Each column of the matrix is analogous to a SparseVector: access to an element is O(log numNonZeroInColumn).
  • Arbitrarily keyed Tensors:
    • Counter: basically a wrapped Map[K, V] with nice defaults.
    • Counter2: the "matrix" version of a Counter.

Terminology

Some terminology you might see frequently in Breeze:

  • active means that the method only operates on key/value pairs (or just keys or just values) that have storage associated with them. For instance, for a SparseVector, active means only those elements that are actually stored in the array (typically with non-zero value). For DenseVectors, active will look at all elements.
  • A slice is a view of an underlying tensor. Slices can reuse storage.
  • A UFunc is a universal function, much like numpy's. The basic idea is that it's a function that can be applied to all values of any type that exposes its values via an implicit. See UFunc for more information. (TODO)
  • A URFunc is a universal reduction function. The basic idea is that it's a function that can be used to "reduce" all values into a single result value. See URFunc for more information. (TODO)
Clone this wiki locally