============
Introduction
============

tensap (Tensor Approximation Package) is a Python package for the approximation of functions and tensors, available on GitHub at https://github.com/anthony-nouy/tensap, or through its GitHub page https://anthony-nouy.github.io/tensap/.

To install from PyPI, run::

    pip install tensap

Alternatively, you can install tensap directly from GitHub by running::

    pip install git+git://github.com/anthony-nouy/tensap@master

The package tensap features low-rank tensors (including canonical, tensor-train and tree-based tensor formats, or tree tensor networks), sparse tensors, polynomials, and allows the plug-in of other approximation tools. It provides different approximation methods based on interpolation, least-squares projection or statistical learning. The package is shipped with tutorials showing its main applications, and documentation is also available.

At minimum, tensap requires the packages numpy and scipy. The packages tensorflow and sklearn are required for some applications.

**FullTensor**
==============

A **FullTensor** **X** represents an order :math:`d` tensor :math:`X \in {\mathbb{R}}^{N_1 \times \cdots \times N_d}`, or multidimensional array of size :math:`N_1\times \ldots \times N_d`. The entries of :math:`X` are :math:`X_{i_1, \ldots, i_d}`, with :math:`(i_1,\ldots,i_d)` a tuple of indices, where :math:`i_\nu \in \{0,\ldots,N_\nu-1\}` is related to the :math:`\nu`-th mode of the tensor. We present in this section how to create a **FullTensor** using tensap, and several possible operations with such an object. For an introduction to tensor calculus, we refer to the monograph [Hackbusch2019]_.

For examples of use, see the tutorial file ``tutorials\tensor_algebra\tutorial_FullTensor.py``.

Creating a **FullTensor**
-------------------------

Provided with an array **data** of shape **[N\_1, ..., N\_d]**, the command **X = tensap.FullTensor(data)** returns a tensor :math:`X \in {\mathbb{R}}^{N_1 \times \cdots \times N_d}`, with order **X.order = d** and shape **X.shape = (N\_1, ..., N\_d)**. The number of entries of **X** is given by :math:`{\texttt{\detokenize{X.size = X.storage()}}} = \prod_{i=1}^d N_i`. The number of nonzero entries of **X** is given by **X.sparse\_storage()**.

It is also possible to generate a **FullTensor** with entries:

- equal to 0 with **tensap.FullTensor.zeros([N\_1, ..., N\_d])**,
- equal to 1 with **tensap.FullTensor.ones([N\_1, ..., N\_d])**,
- drawn randomly according to the uniform distribution on :math:`[0, 1]` with **tensap.FullTensor.rand([N\_1, ..., N\_d])**,
- drawn randomly according to the standard gaussian distribution with **tensap.FullTensor.randn([N\_1, ..., N\_d])**,
- different from 0 only on the diagonal, provided in **diag_data**, with **tensap.FullTensor.diag(diag_data, d)** (generating a tensor of order :math:`d` and shape :math:`[N, \ldots, N]`, with :math:`N` the length of **diag_data**),
- generated using a provided **generator** with **tensap.FullTensor.create(generator, [N\_1, ..., N\_d])**.

Accessing the entries of a **FullTensor**
-----------------------------------------

The entries of a tensor **X** can be accessed with the method **eval\_at\_indices**: **X.eval\_at\_indices(ind)** returns the entries of :math:`X` indexed by the list **ind** containing the indices to access in each dimension.
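As an illustration, here is a minimal sketch using only the commands listed above (outputs are indicative and may differ slightly between tensap versions)::

    import numpy as np
    import tensap

    # FullTensor built from a numpy array of shape (2, 3, 4)
    data = np.random.rand(2, 3, 4)
    X = tensap.FullTensor(data)
    print(X.order)             # 3
    print(X.shape)             # (2, 3, 4)
    print(X.storage())         # 24 entries

    # Tensors generated directly by tensap
    Z = tensap.FullTensor.zeros([2, 3, 4])
    U = tensap.FullTensor.rand([2, 3, 4])
    print(Z.sparse_storage())  # 0 nonzero entries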
Extracting diagonal entries.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For a tensor :math:`X \in {\mathbb{R}}^{N \times \cdots \times N}`, the command **X.eval\_diag()** returns the diagonal entries :math:`X_{i, \ldots, i}`, :math:`i = 1, \ldots, N`, of the tensor. The method **eval\_diag** can also be used to evaluate the diagonal in a subset of dimensions **dims** of the tensor with **X.eval\_diag(dims)**.

Extracting a sub-tensor.
^^^^^^^^^^^^^^^^^^^^^^^^

A sub-tensor can be extracted from **X** with the method **sub\_tensor**: for an order-3 **FullTensor** **X** of size :math:`N_1\times N_2 \times N_3`, **X.sub\_tensor([0, 1], ':', 2)** returns a sub-tensor of size :math:`2\times N_2 \times 1` containing the entries :math:`X_{i_1,i_2,i_3}` with :math:`i_1\in \{0,1\}`, :math:`0\le i_2 \le N_2-1` and :math:`i_3=2`.

Permuting the modes of a **FullTensor**
---------------------------------------

The methods **transpose** and **itranspose** permute the dimensions of a tensor **X**, given a permutation **dims** of :math:`\{1, \ldots, d\}`. They are such that **X = X.transpose(dims).itranspose(dims)**.

Reshaping a **FullTensor**.
---------------------------

The command **X.reshape(shape)** reshapes a **FullTensor** using a column-major order (e.g. used in Fortran, Matlab, R). It relies on numpy's reshape function with Fortran-like index ordering (argument **order='F'**). For a tuple :math:`(i_1,\ldots,i_d)`, we define

.. math:: \overline{i_1,\ldots,i_d} = i_1 + N_1(i_2-1) + N_1N_2(i_3-1) + \ldots + N_1\ldots N_{d-1}(i_d-1) .

A tensor :math:`X` is identified with a vector :math:`\mathrm{vec}(X)` whose entries are :math:`\mathrm{vec}(X)_{\overline{i_1,\ldots,i_d}}`. This vector can be obtained with the command **X.reshape(N)** with **N=numpy.prod(X.shape)**.

:math:`\alpha`-Matricization.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For :math:`\alpha \subset \{1,\ldots,d\}` and its complementary subset :math:`\alpha^c` in :math:`\{1,\ldots,d\}`, an :math:`\alpha`-matricization of a tensor :math:`X` is a matrix :math:`M` of size :math:`(\prod_{i \in \alpha} N_i) \times (\prod_{i \in \alpha^c} N_i)`, such that :math:`X_{i_1,\ldots,i_d} = M_{\overline{i_\alpha},\overline{i_{\alpha^c}}}` with :math:`i_\alpha = (i_\nu)_{\nu\in \alpha}`. It can be obtained with **X.matricize(alpha)**, which returns a **FullTensor** of order :math:`2`. The matricization relies on the method **reshape**.

Orthogonalization.
^^^^^^^^^^^^^^^^^^

It is possible to obtain a representation of a tensor :math:`X` such that its :math:`\alpha`-matricization is an orthogonal matrix (i.e. a matrix with orthonormal columns) using the method **X.orth(alpha)**.

Norms and singular-values
-------------------------

Computing the Frobenius norm of a **FullTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The command **X.norm()** returns the Frobenius norm :math:`\Vert X \Vert_F` of :math:`X`, defined by

.. math:: \Vert X \Vert_F^2 = \sum_{i_1=1}^{N_1} \cdots \sum_{i_d=1}^{N_d} X_{i_1, \ldots, i_d}^2.
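The following minimal sketch illustrates these reshaping and norm operations (it assumes, as in the **sub_tensor** and **tensordot** examples of this section, that dimensions are numbered from 0 in the Python interface)::

    import tensap

    X = tensap.FullTensor.rand([2, 3, 4])

    print(X.norm())           # Frobenius norm of X

    # Reshape into a 6 x 4 tensor (column-major ordering)
    Z = X.reshape([6, 4])

    # {0}-matricization: an order-2 tensor of shape (2, 12)
    M = X.matricize([0])
    print(M.shape)

    # Permute the modes and undo the permutation
    perm = [2, 0, 1]
    Y = X.transpose(perm).itranspose(perm)   # same entries as X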
Computing the :math:`\alpha`-singular values and :math:`\alpha`-principal components of a **FullTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For a subset :math:`\alpha \subset \{1, \ldots, d\}` and its complementary subset :math:`\alpha^c`, the :math:`\alpha`-matricization :math:`M` of :math:`X` admits a singular value decomposition

.. math:: M_{i_\alpha,i_{\alpha^c}} = \sum_{k} \sigma^k v^k_{i_\alpha} w^k_{i_{\alpha^c}}

where the :math:`\sigma^k` are the singular values of :math:`M` and the :math:`v^k` the corresponding left singular vectors, or principal components of :math:`M`. They are respectively called the :math:`\alpha`-singular values and :math:`\alpha`-principal components of :math:`X`. The :math:`\alpha`-singular values are obtained with **X.singular\_values()**. The :math:`\alpha`-principal components (and :math:`\alpha`-singular values) are obtained with **X.alpha\_principal\_components(alpha)**, which is equivalent to **X.matricize(alpha).principal\_components()**.

Operations with **FullTensor**
------------------------------

Outer product.
^^^^^^^^^^^^^^

The outer product :math:`X \circ Y` of two tensors :math:`X\in {\mathbb{R}}^{N_1 \times \cdots \times N_d}` and :math:`Y \in {\mathbb{R}}^{\hat N_1 \times \cdots \times \hat N_{\hat d}}` is a tensor :math:`Z \in {\mathbb{R}}^{N_1 \times \ldots \times N_d \times \hat N_1 \times \cdots \times \hat N_{\hat d}}` of order :math:`d + \hat d` with entries

.. math:: {Z}_{i_1,\ldots,i_d,j_1,\ldots,j_{\hat d}} = X_{i_1, \ldots, i_d} Y_{j_1, \ldots, j_{\hat d}}.

It is provided by **X.tensordot(Y, 0)**, similarly to numpy's tensordot function.

Kronecker product.
^^^^^^^^^^^^^^^^^^

The Kronecker product :math:`X\otimes Y` of two tensors :math:`X` and :math:`Y` of the same order :math:`d=\hat d` is a tensor :math:`Z` of size :math:`N_1\hat N_1 \times \ldots \times N_d \hat N_d` with entries

.. math:: Z_{\overline{i_1j_1},\ldots,\overline{i_dj_d}} = X_{i_1,\ldots,i_d} Y_{j_1,\ldots,j_d}.

It is given by the command **kron**, which is similar to numpy's kron function, but for arbitrary tensors.

Hadamard product.
^^^^^^^^^^^^^^^^^

The Hadamard (elementwise) product :math:`X \circledast Y` of two tensors :math:`X` and :math:`Y` of the same order and size is obtained through the method **\_\_mul\_\_** (i.e. with **X \* Y**), which returns a tensor :math:`Z` with entries

.. math:: Z_{i_1,\ldots,i_d} = X_{i_1,\ldots,i_d} Y_{i_1,\ldots,i_d}.

Contracted product.
^^^^^^^^^^^^^^^^^^^

For :math:`I\subset \{1,\ldots,d\}` and :math:`J \subset \{1,\ldots,\hat d\}` with :math:`\#I = \#J`, **Z = X.tensordot(Y, I, J)** performs the mode :math:`(I,J)`-contracted product of :math:`X` and :math:`Y`, which is a tensor :math:`Z` of order :math:`d + \hat d - \#I - \#J` with entries

.. math:: {Z}_{(i_\nu)_{\nu \notin I}, (j_\mu)_{\mu \notin J}} = \sum_{\substack{i_\nu=1 \\ \nu \in I}}^{N_\nu} \sum_{\substack{j_\mu=1 \\ \mu \in J}}^{\hat N_\mu} \prod_{\nu \in I} \prod_{\mu \in J} \delta_{i_\nu, j_\mu} X_{i_1, \ldots, i_d} Y_{j_1, \ldots, j_{\hat d}}

with :math:`\delta_{i,j}` the Kronecker delta, that is a contraction of tensors :math:`X` and :math:`Y` along dimensions :math:`I` of :math:`X` and :math:`J` of :math:`Y`. For example, for order-\ :math:`4` tensors :math:`X` and :math:`Y`, **Z = X.tensordot(Y, [0,1], [1,2])** returns a tensor :math:`Z` of order :math:`4` such that

.. math:: Z_{i_3,i_4,j_1,j_4} = \sum_{i_1,i_2} X_{i_1,i_2,i_3,i_4} Y_{j_1,i_1,i_2,j_4} .
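A minimal sketch of these operations, using the commands described above (the shapes indicated in the comments follow from the definitions of this section)::

    import tensap

    X = tensap.FullTensor.rand([2, 3, 4, 5])
    Y = tensap.FullTensor.rand([6, 2, 3, 7])

    # Outer product: order-8 tensor of shape (2, 3, 4, 5, 6, 2, 3, 7)
    Z_outer = X.tensordot(Y, 0)

    # Contraction of dimensions (0, 1) of X with dimensions (1, 2) of Y:
    # order-4 tensor of shape (4, 5, 6, 7)
    Z = X.tensordot(Y, [0, 1], [1, 2])

    A = tensap.FullTensor.rand([2, 3])
    B = tensap.FullTensor.rand([2, 3])
    H = A * B        # Hadamard product, shape (2, 3)
    K = A.kron(B)    # Kronecker product, shape (4, 9)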
The method **tensordot\_eval\_diag** provides the diagonal (or entries with equal pairs of indices) of the result of the method **tensordot**, but at a cost lower than when using **X.tensordot(Y, I, J).eval\_diag()**. For example, for order-\ :math:`4` tensors :math:`X` and :math:`Y`, **X.tensordot\_eval\_diag(Y,[0,1],[1,2],[2,3],[0,3])** returns the diagonal of :math:`Z`, i.e. an order-one tensor :math:`M` with entries

.. math:: M_k = Z_{k,k,k,k} = \sum_{i_1,i_2} X_{i_1,i_2,k,k} Y_{k,i_1,i_2,k} .

**X.tensordot\_eval\_diag(Y,[0,1],[1,2],[2,3],[0,3],diag = True)** returns a tensor :math:`M` of order :math:`2` with entries

.. math:: M_{k_1,k_2} = Z_{k_1,k_2,k_1,k_2} = \sum_{i_1,i_2} X_{i_1,i_2,k_1,k_2} Y_{k_1,i_1,i_2,k_2} .

**X.tensordot\_eval\_diag(Y,[0,1],[1,2],[2],[0])** returns a partial diagonal of :math:`Z`, i.e. a tensor :math:`M` of order :math:`3` with entries

.. math:: M_{k,i_4,j_4} = Z_{k,i_4,k,j_4} = \sum_{i_1,i_2} X_{i_1,i_2,k,i_4} Y_{k,i_1,i_2,j_4} .

Dot product.
^^^^^^^^^^^^

The dot product of two tensors :math:`X` and :math:`Y` with same shape :math:`[N_1, \ldots, N_d]`, defined by

.. math:: ( X, Y ) = \sum_{\substack{i_\nu = 1 \\ \nu = 1, \ldots, d}}^{N_\nu} X_{i_1, \ldots, i_d} Y_{i_1, \ldots, i_d},

can be obtained with **X.dot(Y)**. It is equivalent to **X.tensordot(Y, range(X.order), range(Y.order))**.

Contractions with matrices or vectors
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Given a tensor :math:`X` and a list of matrices :math:`M = [M^1, ..., M^d]`, the command **Z = X.tensor\_matrix\_product(M)** returns an order-\ :math:`d` tensor :math:`Z` whose entries are

.. math:: Z_{i_1, \ldots, i_d} = \sum_{\substack{k_\nu = 1 \\ \nu = 1, \ldots, d}}^{N_\nu} X_{k_1, \ldots, k_d} \prod_{\nu = 1}^d M^\nu_{i_\nu, k_\nu} .

The same method exists for vectors instead of matrices: **tensor\_vector\_product**. Similarly to **tensordot\_eval\_diag**, the method **tensor\_matrix\_product\_eval\_diag** evaluates the diagonal of the result of **tensor\_matrix\_product**, with a lower cost.

Tensor formats
==============

Here we present tensor formats available in tensap, which are structured formats of tensors in :math:`\mathbb{R}^{N_1\times \ldots \times N_d}`. For a detailed description of the methods, see the description of the corresponding methods for **FullTensor** in Section [sec:FullTensor]. For an introduction to tensor formats, we refer to the monograph [Hackbusch2019]_ and the survey [Nouy2017book]_.

**CanonicalTensor**
-------------------

The entries of an order-\ :math:`d` tensor :math:`X \in {\mathbb{R}}^{N_1 \times \cdots \times N_d}` in canonical format can be written

.. math:: \label{eq:CanonicalTensor} X_{i_1, \ldots, i_d} = \sum_{k=1}^r C_k U^1_{i_1, k} \cdots U^d_{i_d, k},

with :math:`r` the canonical rank, and where the :math:`U^\nu = (U^\nu_{i_\nu, k})_{1\le i_\nu \le N_\nu , 1\le k \le r}` are order-two tensors.

Creating a **CanonicalTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a canonical tensor in tensap, one can use the command **tensap.CanonicalTensor(C, U)**, where **C** contains the :math:`(C_k)_{k=1}^r`, and **U** is a list containing the :math:`U^\nu`, :math:`1\le \nu\le d`. The storage complexity of such a tensor, obtained with **X.storage()**, is equal to :math:`r(1 + N_1 + \cdots + N_d)`.

It is also possible to generate a **CanonicalTensor** with entries

- equal to 0 with **tensap.CanonicalTensor.zeros(r, [N\_1, ..., N\_d])**,
- equal to 1 with **tensap.CanonicalTensor.ones(r, [N\_1, ..., N\_d])**,
- drawn randomly according to the uniform distribution on :math:`[0, 1]` with **tensap.CanonicalTensor.rand(r, [N\_1, ..., N\_d])**,
- drawn randomly according to the standard gaussian distribution with **tensap.CanonicalTensor.randn(r, [N\_1, ..., N\_d])**,
- generated using a provided **generator** with **tensap.CanonicalTensor.create** **(generator, r, [N\_1, ..., N\_d])**.

Converting a **CanonicalTensor** to a **FullTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **CanonicalTensor** **X** can be converted to a **FullTensor** (introduced in Section [sec:FullTensor]) with the command **X.full()**.
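As a minimal sketch of the constructor described above (the exact types accepted for **C** and **U** may vary between versions; plain numpy arrays are assumed here)::

    import numpy as np
    import tensap

    # Rank-3 canonical tensor in R^{10 x 11 x 12}
    r = 3
    shape = [10, 11, 12]
    C = np.ones(r)                               # coefficients (C_k)
    U = [np.random.rand(n, r) for n in shape]    # factors U^nu
    X = tensap.CanonicalTensor(C, U)

    print(X.storage())    # r * (1 + N_1 + ... + N_d) = 102 by the formula above

    Y = tensap.CanonicalTensor.rand(r, shape)    # random canonical tensor
    Xf = X.full()                                # conversion to a FullTensor
    print(Xf.norm())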
Converting a **CanonicalTensor** to a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **CanonicalTensor** **X** can be converted to a **TreeBasedTensor** (introduced in Section [sec:TreeBasedTensor]) with the command **X.tree\_based\_tensor(tree, is\_active\_node)**, with **tree** a **DimensionTree** object, and **is\_active\_node** a list or array of booleans indicating if each node of the tree is active.

Accessing the diagonal of a **CanonicalTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For a canonical tensor :math:`X \in {\mathbb{R}}^{N \times \ldots \times N}`, the command **X.eval\_diag()** returns the diagonal :math:`X_{i, \ldots, i}`, :math:`i = 1, \ldots, N`, of the tensor. The method **eval\_diag** can also be used to evaluate the diagonal in a subset of dimensions **dims** of the tensor with **X.eval\_diag(dims)**, which returns a **CanonicalTensor**.

Computing the Frobenius norm of a **CanonicalTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The command **X.norm()** returns the Frobenius norm of :math:`X`. The Frobenius norm of :math:`X` is equal to the Frobenius norm of its core :math:`C` if **X.is\_orth** is **True**.

Computing the derivative of **CanonicalTensor** with respect to one of its parameters.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Given an order-\ :math:`d` canonical tensor :math:`X` in :math:`{\mathbb{R}}^{N \times \cdots \times N}`, the command **X.parameter\_gradient\_eval\_diag(k)**, for :math:`1 \leq k \leq d`, returns the derivative

.. math:: \left.\frac{\partial X_{i_1, \ldots, i_d}}{\partial U^k}\right|_{i_1=\cdots=i_d=i}, \; i = 1, \ldots, N.

The derivative of :math:`X` with respect to its core :math:`C`, which writes

.. math:: \left.\frac{\partial X_{i_1, \ldots, i_d}}{\partial C}\right|_{i_1=\cdots=i_d=i}, \; i = 1, \ldots, N,

is obtained with **X.parameter\_gradient\_eval\_diag(d+1)**. The method **parameter\_gradient\_eval\_diag** is used in the statistical learning algorithms presented in Section [sec:TensorLearning].

Performing operations with **CanonicalTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some operations between tensors are implemented for **CanonicalTensor** (see Section [sec:FullTensor] for a detailed description of these operations): the Kronecker product with **kron**, the contraction with matrices with **tensor\_matrix\_product**, the evaluation of the diagonal of a contraction with matrices with **tensor\_matrix\_product\_eval\_diag**, the dot product with **dot**.

Given a tensor :math:`X` and a list of matrices :math:`M = [M^1, ..., M^d]`, the command **Z = X.tensor\_matrix\_product(M)** returns an order-\ :math:`d` tensor :math:`Z` whose entries are

.. math:: Z_{i_1, \ldots, i_d} = \sum_{k=1}^r \sum_{\substack{k_\nu = 1 \\ \nu = 1, \ldots, d}}^{N_\nu} C_k U^1_{k_1, k} \cdots U^d_{k_d, k} \prod_{\nu = 1}^d M^\nu_{i_\nu, k_\nu} .

The method **tensor\_matrix\_product\_eval\_diag** evaluates the diagonal of the result of **tensor\_matrix\_product**. The dot product of two canonical tensors :math:`X` and :math:`Y` with same shape :math:`[N_1, \ldots, N_d]` can be obtained with **X.dot(Y)**.

**DiagonalTensor**
------------------

A diagonal tensor :math:`X \in {\mathbb{R}}^{N_1 \times \cdots \times N_d}` is a tensor whose entries :math:`X_{i_1, \ldots, i_d}` are non-zero only if :math:`i_1 = \cdots = i_d`.

Creating a **DiagonalTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a diagonal tensor in tensap, one can use the command **tensap.DiagonalTensor(D, d)**, where **D** (of length :math:`r`) contains the diagonal of the tensor, and **d** is the order of the tensor. The result is an order-\ :math:`d` tensor in :math:`\mathbb{R}^{r\times \ldots \times r} = \mathbb{R}^{r^d}`. The sparse storage complexity of such a tensor, obtained with **X.sparse\_storage()**, is equal to **r = len(D)**. Its storage complexity, not taking into account the fact that only the diagonal is non-zero, is equal to :math:`r^d` and obtained with **X.storage()**.

It is also possible to generate a **DiagonalTensor** with entries

- equal to 0 with **tensap.DiagonalTensor.zeros(r, d)**,
- equal to 1 with **tensap.DiagonalTensor.ones(r, d)**,
- drawn randomly according to the uniform distribution on :math:`[0, 1]` with **tensap.DiagonalTensor.rand(r, d)**,
- drawn randomly according to the standard gaussian distribution with **tensap.DiagonalTensor.randn(r, d)**,
- generated using a provided **generator** with **tensap.DiagonalTensor.create** **(generator, r, d)**.

Converting a **DiagonalTensor** to a **FullTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **DiagonalTensor** **X** can be converted to a **FullTensor** (introduced in Section [sec:FullTensor]) with the command **X.full()**.

Converting a **DiagonalTensor** to a **SparseTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **DiagonalTensor** **X** can be converted to a **SparseTensor** (introduced in Section [sec:SparseTensor]) with the command **X.sparse()**.

Converting a **DiagonalTensor** to a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **DiagonalTensor** **X** can be converted to a **TreeBasedTensor** (introduced in Section [sec:TreeBasedTensor]) with the command **X.tree\_based\_tensor(tree, is\_active\_node)**, with **tree** a **DimensionTree** object, and **is\_active\_node** a list or array of booleans indicating if each node of the tree is active.

Accessing the entries of a **DiagonalTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The entries of the tensor **X** can be accessed with the method **eval\_at\_indices**: **X.eval\_at\_indices(ind)** returns the entries of :math:`X` indexed by the list **ind** containing the indices to access in each dimension. A sub-tensor can be extracted from **X** with the method **sub\_tensor**.

For a tensor :math:`X \in {\mathbb{R}}^{N \times \ldots \times N}`, the command **X.eval\_diag()** returns the diagonal :math:`X_{i, \ldots, i}`, :math:`i = 1, \ldots, N`, of the tensor. The method **eval\_diag** can also be used to evaluate the diagonal in some dimensions **dims** of the tensor with **X.eval\_diag(dims)**.

Computing the Frobenius norm of a **DiagonalTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The command **X.norm()** returns the Frobenius norm of :math:`X`.
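As a minimal sketch (the diagonal is assumed here to be accepted as a plain list or numpy array)::

    import tensap

    # Order-3 diagonal tensor with diagonal (1, 2, 3), an element of R^{3 x 3 x 3}
    X = tensap.DiagonalTensor([1, 2, 3], 3)

    print(X.sparse_storage())   # 3: only the diagonal is stored
    print(X.storage())          # 27 = 3**3
    print(X.norm())             # sqrt(1 + 4 + 9)

    Xf = X.full()               # conversion to a FullTensor
    Xs = X.sparse()             # conversion to a SparseTensor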
Performing operations with **DiagonalTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some operations between tensors are implemented for **DiagonalTensor** (see Section [sec:FullTensor] for a detailed description of these operations): the outer product with **tensordot**, the evaluation of the diagonal (or subtensors) of an outer product with **tensordot\_eval\_diag**, the Kronecker product with **kron**, the contraction with matrices or vectors with **tensor\_matrix\_product** or **tensor\_vector\_product** respectively, the evaluation of the diagonal of a contraction with matrices with **tensor\_matrix\_product\_eval\_diag**, the dot product with **dot**.

**SparseTensor**
----------------

A sparse tensor :math:`X \in {\mathbb{R}}^{N_1 \times \cdots \times N_d}` is a tensor whose entries :math:`X_{i_1, \ldots, i_d}` are non-zero only for :math:`(i_1, \ldots, i_d) \in I`, with :math:`I` a set of multi-indices.

Creating a **SparseTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a sparse tensor :math:`{\texttt{\detokenize{X}}}` in tensap, one can use the command **tensap.SparseTensor(D, I, [N\_1, ..., N\_d])**, where **D** contains the non-zero entries of :math:`X`, **I** is a **tensap.MultiIndices** containing the indices of its non-zero entries, and where :math:`N_1, \ldots, N_d` is its shape. The sparse storage complexity of such a tensor, obtained with **X.sparse\_storage()**, is equal to :math:`\text{card}(I)`. Its storage complexity, not taking into account the sparsity, is equal to :math:`N_1 \cdots N_d` and can be accessed with **X.storage()**.

Converting a **SparseTensor** to a **FullTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **SparseTensor** **X** can be converted to a **FullTensor** (introduced in Section [sec:FullTensor]) with the command **X.full()**.

Converting a **FullTensor** to a **SparseTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **FullTensor** **X** can be converted to a **SparseTensor** (introduced in Section [sec:SparseTensor]) with the command **X.sparse()**.

Accessing the entries of a **SparseTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The entries of the tensor **X** can be accessed with the method **eval\_at\_indices**: **X.eval\_at\_indices(ind)** returns the entries of :math:`X` indexed by the list **ind** containing the indices to access in each dimension. A sub-tensor can be extracted from **X** with the method **sub\_tensor**.

For a tensor :math:`X \in {\mathbb{R}}^{N \times \cdots \times N}`, the command **X.eval\_diag()** returns the diagonal :math:`X_{i, \ldots, i}`, :math:`i = 1, \ldots, N`, of the tensor. The method **eval\_diag** can also be used to evaluate the diagonal in some dimensions **dims** of the tensor with **X.eval\_diag(dims)**.

Reshaping a **SparseTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The method **reshape** reshapes a **SparseTensor** using the Fortran-like index order of numpy's reshape function. The methods **transpose** and **itranspose** permute the dimensions of a tensor **X**, given a permutation **dims** of :math:`\{1, \ldots, d\}`. They are such that **X = X.transpose(dims).itranspose(dims)**.

Computing the Frobenius norm of a **SparseTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The command **X.norm()** returns the Frobenius norm of :math:`X`.
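As a minimal sketch of the constructor described above (numpy arrays are assumed for the values and for the array of multi-indices)::

    import numpy as np
    import tensap

    # Multi-indices of the non-zero entries of a tensor in R^{4 x 5 x 6}
    I = tensap.MultiIndices(np.array([[0, 0, 0],
                                      [1, 2, 3],
                                      [3, 4, 5]]))
    D = np.array([1.0, -2.0, 0.5])   # corresponding non-zero values
    X = tensap.SparseTensor(D, I, [4, 5, 6])

    print(X.sparse_storage())   # 3 = card(I)
    print(X.storage())          # 120 = 4 * 5 * 6
    Xf = X.full()               # conversion to a FullTensor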
Performing operations with **SparseTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some operations between tensors are implemented for **SparseTensor** (see Section [sec:FullTensor] for a detailed description of these operations): the Kronecker product with **kron**, the contraction with matrices or vectors with **tensor\_matrix\_product** or **tensor\_vector\_product** respectively, the evaluation of the diagonal of a contraction with matrices with **tensor\_matrix\_product\_eval\_diag**, the dot product with **dot**.

**TreeBasedTensor** and **DimensionTree**
-----------------------------------------

We present in this section the **DimensionTree** and **TreeBasedTensor** objects. For examples of use, see the tutorial files ``tutorials\tensor_algebra\tutorial_DimensionTree.py`` and ``tutorials\tensor_algebra\tutorial_TreeBasedTensor.py``.

**DimensionTree**
^^^^^^^^^^^^^^^^^

A dimension tree :math:`T` is a collection of non-empty subsets of :math:`D = \{1, \ldots, d\}` which is such that (i) all nodes :math:`\alpha \in T` are non-empty subsets of :math:`D`, (ii) :math:`D` is the root of :math:`T`, (iii) every node :math:`\alpha \in T` with :math:`\#\alpha \ge 2` has at least two children and the set of children of :math:`\alpha`, denoted by :math:`S(\alpha)`, is a non-trivial partition of :math:`\alpha`, and (iv) every node :math:`\alpha` with :math:`\#\alpha = 1` has no child and is called a leaf (see for example Figure [fig:treeExamples]).

We let :math:`\operatorname{depth}(T) = \max_{\alpha \in T} \operatorname{level}(\alpha)` be the depth of :math:`T`, and :math:`{\mathcal{L}}(T)` be the set of leaves of :math:`T`, which are such that :math:`S(\alpha) = \emptyset` for all :math:`\alpha \in {\mathcal{L}}(T)`.

Figure [fig:treeExamples] shows examples of dimension trees over :math:`D = \{1,2,3,4\}`: a trivial tree (Figure [fig:trivialTree]), a linear tree (Figure [fig:linearTree]) and a balanced tree (Figure [fig:balancedTree]).

Creating a **DimensionTree**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **DimensionTree** is characterized by its adjacency matrix and the dimension associated with each leaf node: **T = tensap.DimensionTree(dims, adjacency\_matrix)**. The adjacency matrix of a dimension tree :math:`T` can be accessed with **T.adjacency\_matrix**. The dimension associated with each leaf node can be accessed with **T.dim2ind**.
Denoting by **order** the number of leaf nodes, it is possible to create

- a trivial tree with **tensap.DimensionTree.trivial(order)** (Figure [fig:trivialTree]),
- a linear tree with **tensap.DimensionTree.linear(order)** (Figure [fig:linearTree]),
- a balanced tree with **tensap.DimensionTree.balanced(order)** (Figure [fig:balancedTree]),
- a random tree with **tensap.DimensionTree.random(order, arity)**, with **arity** the arity of the tree, equal to the maximum number of children per node (randomly selected in an interval if provided).

Finally, a dimension tree can be created by extracting a sub-tree from an existing tree :math:`T` with **T.sub\_dimension\_tree(root)** where **root** is the node in :math:`T` that will become the root node of the extracted tree.

Displaying a **DimensionTree**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **DimensionTree** can be displayed with the command **T.plot()**. The dimension associated with each leaf node can be plotted on the tree with **T.plot\_dims()**. Finally, the tree can be plotted with some quantity displayed at each node with **T.plot\_with\_labels\_at\_nodes(labels)**.

Accessing properties of the tree.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The number of nodes of a dimension tree :math:`T` is given by **T.nb\_nodes**. The parent of :math:`\alpha`, denoted by :math:`P(\alpha)`, can be obtained with **T.parent(alpha)**, and its ascendants :math:`A(\alpha)` and descendants :math:`D(\alpha)` by **T.ascendants(alpha)** and **T.descendants(alpha)**, respectively. The children of :math:`\alpha` are given by **T.children(alpha)**. The command **T.child\_number(alpha)** returns :math:`i^\gamma_\alpha`, for :math:`\alpha \in T \setminus \{D\}` and :math:`\gamma = P(\alpha)`, which is such that :math:`\alpha` is the :math:`i^\gamma_\alpha`-th child of :math:`\gamma`. For instance, in the tree of Figure [fig:linearTree], the node :math:`\alpha = \{3,4\}` is the second child of :math:`\gamma = \{2,3,4\}`.

The level of a node :math:`\alpha` is denoted by :math:`\operatorname{level}(\alpha)`. The levels are defined such that :math:`\operatorname{level}(D) = 0` and :math:`\operatorname{level}(\beta) = \operatorname{level}(\alpha) + 1` for :math:`\beta \in S(\alpha)`. The nodes of :math:`T` with level :math:`l` are returned by **T.nodes\_with\_level(l)**. The leaf nodes :math:`\alpha \in {\mathcal{L}}(T)` are such that **T.is\_leaf[alpha-1]** is **True**.
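As a minimal sketch using the constructors and attributes described above::

    import tensap

    # Balanced and linear dimension trees over D = {1, ..., 4}
    T = tensap.DimensionTree.balanced(4)
    L = tensap.DimensionTree.linear(4)

    print(T.nb_nodes)               # number of nodes of the tree
    print(T.nodes_with_level(1))    # nodes at level 1
    print(T.dim2ind)                # leaf node associated with each dimension

    T.plot()        # displays the tree
    T.plot_dims()   # displays the tree with the dimensions at the leaves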
**TreeBasedTensor**
~~~~~~~~~~~~~~~~~~~

Given a dimension tree :math:`T`, a **TreeBasedTensor** **X** is a tensor in *tree-based format* (see [Falco2018]_, [Hackbusch2019]_). It represents an order :math:`d` tensor :math:`X \in {\mathbb{R}}^{N_1 \times \cdots \times N_d}` in the set of tensors with :math:`\alpha`-ranks bounded by some integer :math:`r_\alpha`, :math:`\alpha \in T`. Such a tensor admits a representation

.. math:: X_{i_1, \ldots, i_d} = \sum_{\substack{1 \leq k_\beta \leq r_\beta \\ \beta \in T\setminus \{D\}}} \prod_{\alpha \in T\setminus {\mathcal{L}}(T)} C^\alpha_{(k_\beta)_{\beta \in S(\alpha)},k_\alpha} \prod_{\alpha \in {\mathcal{L}}(T)} C^{\alpha}_{i_\alpha,k_\alpha},

with :math:`C^\alpha`, :math:`\alpha \in T`, some tensors that parameterize the representation of :math:`X`. When :math:`T` is a binary tree, the corresponding format is the so-called hierarchical Tucker (HT) format. The particular case of a linear binary tree is the tensor train Tucker format.

The *Tucker format* corresponds to a trivial tree :math:`T=\{\{1\},\ldots,\{d\},\{1,\ldots,d\}\}` and admits the representation

.. math:: X_{i_1, \ldots, i_d} = \sum_{k_1=1}^{r_1} \ldots \sum_{k_d=1}^{r_d} C^{1,\ldots,d}_{k_1,\ldots,k_d} C^{1}_{i_1,k_1} \ldots C^{d}_{i_d,k_d}.

A *degenerate tree-based format* is defined as the set of tensors with :math:`\alpha`-ranks bounded by some integer :math:`r_\alpha`, for all :math:`\alpha` in a subset :math:`A` of :math:`T`. The set :math:`A` corresponds to active nodes, which should contain all interior nodes :math:`T\setminus{\mathcal{L}}(T)`. A **TreeBasedTensor** **X** with active nodes :math:`A` admits a representation

.. math:: X_{i_1, \ldots, i_d} = \sum_{\substack{1 \leq k_\beta \leq r_\beta \\ \beta \in A \setminus \{D\}}} \prod_{\alpha \in A\setminus {\mathcal{L}}(T)} C^\alpha_{(k_\beta)_{\beta \in S(\alpha)},k_\alpha} \prod_{\alpha \in {\mathcal{L}}(T) \cap A} C^{\alpha}_{i_\alpha,k_\alpha},

with :math:`C^\alpha`, :math:`\alpha \in A`, some tensors that parameterize the representation of :math:`X`.

The *tensor train format* is a degenerate tree-based format with a linear tree :math:`T` and all leaf nodes inactive except the first one, which means :math:`A = \{\{1\},\{1,2\}, \ldots, \{1,\ldots,d\}\}`. A tensor :math:`X` in tensor train format admits a representation

.. math:: X_{i_1,\ldots,i_d} = \sum_{k_1=1}^{r_1} \ldots \sum_{k_{d-1}=1}^{r_{d-1}} C^1_{1,i_1,k_1} C^2_{k_1,i_2,k_2} \ldots C^{d-1}_{k_{d-2},i_{d-1},k_{d-1}} C^d_{k_{d-1},i_d,1}

with tensor :math:`C^\nu` and rank :math:`r_\nu` associated with the node :math:`\alpha = \{1,\ldots,\nu\}`.

For a more detailed presentation of tree-based formats (possibly degenerate) and more examples, see [Nouy2017book]_.

If the rank :math:`r_D` associated with the root node is different from :math:`1`, a **TreeBasedTensor** **X** represents a tensor of order :math:`d+1` with entries :math:`X_{i_1,\ldots,i_d,k_D}`, :math:`1\le k_D \le r_D`. It can be used to define vector-valued functional tensors.

Creating a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **TreeBasedTensor** is created with the command **X = tensap.TreeBasedTensor(C, T)**, with **C** the list of **FullTensor** objects representing the :math:`C^\alpha`, :math:`\alpha \in T`, and **T** a **DimensionTree**. If some entries of the list **C** corresponding to leaf nodes are empty, it creates a degenerate tensor format, with :math:`T\setminus A` corresponding to the empty entries of **C**.

It is possible to create a **TreeBasedTensor** in tensor-train format with the command **tensap.TreeBasedTensor.tensor\_train(C)**, with **C** a list containing the tensors :math:`C^1,\ldots,C^d`.

Given a **DimensionTree** **T**, it is also possible to generate a **TreeBasedTensor** with entries

- equal to 0 with **tensap.TreeBasedTensor.zeros(T, r, s, I)**,
- equal to 1 with **tensap.TreeBasedTensor.ones(T, r, s, I)**,
- drawn randomly according to the uniform distribution on :math:`[0, 1]` with **tensap.TreeBasedTensor.rand(T, r, s, I)**,
- drawn randomly according to the standard gaussian distribution with **tensap.TreeBasedTensor.randn(T, r, s, I)**,
- generated using a provided **generator** with **tensap.TreeBasedTensor.create** **(generator, T, r, s, I)**,

where **r** is a list containing the :math:`\alpha`-ranks, :math:`\alpha \in T`, or **'random'**, **s** is a list containing the sizes :math:`N_1, \ldots, N_d`, or **'random'**, and **I** is a list of booleans indicating if the node :math:`\alpha` is active, :math:`\alpha \in T`, or **'random'**.
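As a minimal sketch of these constructors (the **'random'** arguments follow the description above; the tensor-train cores are given here as **FullTensor** objects, although plain arrays may also be accepted depending on the version)::

    import tensap

    T = tensap.DimensionTree.balanced(4)

    # Random tree-based tensor with random ranks, sizes and active nodes
    X = tensap.TreeBasedTensor.rand(T, 'random', 'random', 'random')
    print(X.storage())

    # Tensor-train format from order-3 cores C^1, ..., C^d of shapes
    # (1, N_1, r_1), (r_1, N_2, r_2), (r_2, N_3, 1)
    cores = [tensap.FullTensor.rand([1, 10, 4]),
             tensap.FullTensor.rand([4, 10, 3]),
             tensap.FullTensor.rand([3, 10, 1])]
    Y = tensap.TreeBasedTensor.tensor_train(cores)
    Yf = Y.full()   # conversion to a FullTensor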
Storage complexity.
^^^^^^^^^^^^^^^^^^^^

The storage complexity of **X** is given by :math:`{\texttt{\detokenize{X.size = X.storage()}}}`, which is the number of entries in the tensors :math:`C^\alpha`, :math:`\alpha\in A`.

The storage complexity of **X** taking into account the sparsity in the :math:`C^\alpha`, :math:`\alpha \in T`, is given by **X.sparse\_storage()**. It returns the number of non-zero entries in the tensors :math:`C^\alpha`, :math:`\alpha\in A`.

The storage complexity of **X** taking into account the sparsity only in the leaf nodes is given by **X.sparse\_leaves\_storage()**.

Displaying a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A graphical representation of a **TreeBasedTensor** **X** can be obtained with the command **X.plot()**. Labels can be added to the nodes of the tree, as well as a title, with **X.plot(labels, title)**.

Converting a **TreeBasedTensor** to a **FullTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **TreeBasedTensor** **X** can be converted to a **FullTensor** (introduced in Section [sec:FullTensor]) with the command **X.full()**.

Converting a **FullTensor** to a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **FullTensor** **X** can be converted to a **TreeBasedTensor** with the command **X.tree\_based\_tensor()**. The associated dimension tree is a trivial tree with active nodes.

Accessing the entries of a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The entries of the tensor **X** can be accessed with the method **eval\_at\_indices**: **X.eval\_at\_indices(ind)** returns the entries of :math:`X` indexed by the list **ind** containing the indices to access in each dimension. A sub-tensor can be extracted from **X** with the method **sub\_tensor** (see the corresponding method of **FullTensor** in Section [sec:FullTensor]).

For a tensor :math:`X \in {\mathbb{R}}^{N \times \cdots \times N}`, the command **X.eval\_diag()** returns the diagonal :math:`X_{i, \ldots, i}`, :math:`i = 1, \ldots, N`, of the tensor. The method **eval\_diag** can also be used to evaluate the diagonal in some dimensions **dims** of the tensor with **X.eval\_diag(dims)**.

Obtaining an orthonormal representation of a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The command **X.orth()** returns a representation of **X** where all the core tensors except the root core represent orthonormal bases of principal subspaces. The command **X.orth\_at\_node(alpha)** returns a representation of **X** where all the core tensors except the one of node :math:`\alpha` represent orthonormal bases of principal subspaces. The core tensor :math:`C^\alpha` of the node :math:`\alpha` is such that the tensor writes

.. math:: X_{i_\alpha, i_{\alpha^c}} = \sum_{k} \sum_{l} C^\alpha_{k,l} u_l(i_\alpha) w_k(i_{\alpha^c}),

where the :math:`u_l` and the :math:`w_k` are orthonormal tensors. This orthonormality of the representation can be checked by computing the Gram matrices of the bases of minimal subspaces associated with the nodes of the tree with **X.gramians()**.

Modifying the tree structure of a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is possible to modify the tree of a **TreeBasedTensor** **X** by permuting two of its nodes :math:`\alpha` and :math:`\beta` given a relative tolerance **tol** with **X.permute\_nodes([alpha, beta], tol)**. The leaves of the tree can also be permuted with the command **X.permute\_leaves(perm, tol)**, where **perm** is a permutation of :math:`(1, \ldots, d)`.
The method **optimize\_dimension\_tree** tries random permutations of nodes to minimize the storage complexity of a tree-based tensor :math:`X`: **X.optimize\_dimension\_tree(tol, n)** tries :math:`n` random permutations and returns a **TreeBasedTensor** **Y** which is such that **Y.storage()** is less than or equal to **X.storage()**. The nodes to permute are drawn according to probability measures favoring high decreases of the ranks while maintaining a permutation cost as low as possible (see [Grelier2019]_). The similar method **optimize\_leaves\_permutations** focuses on the permutation of the leaf nodes to try to reduce the storage complexity of a **TreeBasedTensor**.

Computing the Frobenius norm of a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The command **X.norm()** returns the Frobenius norm of :math:`X`.

Computing the :math:`\alpha`-singular values of a **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For all :math:`\alpha \in T`, the :math:`\alpha`-singular values of :math:`X` can be obtained with **X.singular\_values()**. The method **rank** uses the method **singular\_values** to compute the :math:`\alpha`-ranks, :math:`\alpha \in T`, of a **TreeBasedTensor**.

Computing the derivative of **TreeBasedTensor** with respect to one of its parameters.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For an order-\ :math:`d` tree-based tensor **X** in :math:`{\mathbb{R}}^{N \times \cdots \times N}`, **X.parameter\_gradient\_eval\_diag(alpha)**, for :math:`\alpha \in T`, returns the derivative

.. math:: \left.\frac{\partial X_{i_1, \ldots, i_d}}{\partial C^\alpha}\right|_{i_1=\cdots=i_d=i}, \; i = 1, \ldots, N.

The method **parameter\_gradient\_eval\_diag** is used in the statistical learning algorithms presented in Section [sec:TensorLearning].

Performing operations with **TreeBasedTensor**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some operations between tensors are implemented for **TreeBasedTensor** (see Section [sec:FullTensor] for a detailed description of these operations): the Kronecker product with **kron** (**X.kron(Y)**), the contraction with matrices or vectors with **tensor\_matrix\_product** (**Z = X.tensor\_matrix\_product(M)**) or **tensor\_vector\_product** respectively, the evaluation of the diagonal of a contraction with matrices with **tensor\_matrix\_product\_eval\_diag**, and the dot product with **dot** (**X.dot(Y)**).

Tensor truncation with **Truncator**
------------------------------------

The object **Truncator** embeds several methods of truncation of tensors in different formats. Given a tolerance **tol** and a maximum rank or tuple of ranks **r**, a **Truncator** object can be created with **t = tensap.Truncator(tol, r)**. The thresholding type (**'hard'** or **'soft'**) can also be specified as a third argument. For examples of use, see the tutorial file ``tutorials\tensor_algebra\tutorial_tensor_truncation.py``.

Truncation.
^^^^^^^^^^^

The generic method **truncate** calls one of the methods presented below, based on the type and order of its input, to obtain a truncation of the provided tensor satisfying the relative precision and maximal rank requirements. For an order :math:`2` tensor, the method **svd** is called. For a tensor of order greater than :math:`2`, the method **hosvd** is called for a **FullTensor**, and **hsvd** for a **TreeBasedTensor**.
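As a minimal sketch of the generic truncation (constructor arguments as described above; argument conventions may differ slightly between versions)::

    import tensap

    # Truncator with relative precision 1e-8 and maximum rank 5
    t = tensap.Truncator(1e-8, 5)

    X = tensap.FullTensor.rand([10, 10, 10])
    Xr = t.truncate(X)      # calls hosvd for a FullTensor of order > 2

    T = tensap.DimensionTree.balanced(3)
    Y = tensap.TreeBasedTensor.rand(T, 'random', 'random', 'random')
    Yr = t.truncate(Y)      # calls hsvd for a TreeBasedTensor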
Truncated singular value decomposition.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The method **svd** computes the truncated singular value decomposition of an order :math:`2` tensor. The input tensor can be a **numpy.ndarray**, a **tensorflow.Tensor**, a **FullTensor** or a **CanonicalTensor**, in which case the method **trunc\_svd** is called, or a **TreeBasedTensor**, in which case the method **hsvd** is called. The method **trunc\_svd** computes the truncated singular value decomposition of a matrix, with a given relative precision in Schatten :math:`p`-norm (with a specified value for :math:`p`) and given maximal rank. The returned truncation is a **CanonicalTensor**.

Truncated higher-order singular value decomposition.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A truncated higher-order singular value decomposition of a **numpy.ndarray**, a **FullTensor** or a **TreeBasedTensor** can be computed with the method **hosvd**. The output is either a **CanonicalTensor** for an order :math:`2` tensor, or a **TreeBasedTensor** with a trivial tree for a tensor of order greater than :math:`2`.

Truncation in tree-based tensor format.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The method **hsvd** computes, given a **TreeBasedTensor** or a **FullTensor** with a tree and a set of active nodes, a truncation in tree-based tensor format.

Truncation in tensor train format.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The method **ttsvd**, given a **FullTensor**, calls the method **hsvd** with a linear tree and all the leaf nodes inactive except the first one, resulting in a truncation in tensor-train format.

Measures, bases and functions
=============================

**RandomVariable**
------------------

A random variable :math:`X` can be created by calling its name: for instance, **X = tensap.UniformRandomVariable(a, b)** creates a random variable with a uniform distribution on the interval :math:`[a, b]`. The random variables currently implemented in tensap are:

- **tensap.DiscreteRandomVariable(v, p)**: a random variable with discrete values :math:`v` and associated probabilities :math:`p`,
- **tensap.UniformRandomVariable(a, b)**: a uniform random variable on :math:`[a, b]`,
- **tensap.NormalRandomVariable(m, s)**: a normal random variable with mean :math:`m` and standard deviation :math:`s`,
- **tensap.EmpiricalRandomVariable(S)**: a random variable created from a sample :math:`S` using kernel density estimation with Scott’s rule of thumb to determine the bandwidth.

A new random variable can easily be implemented in tensap by making its class inherit from **RandomVariable** and implementing the few methods necessary for its creation.

Once a random variable :math:`X` is created, one can for instance generate :math:`n` random numbers according to its distribution with **X.random(n)**, create the orthonormal polynomials associated with its measure with **X.orthonormal\_polynomials()** (as presented in Section [sec:polynomials]), or evaluate its probability density function (**X.pdf(x)**), cumulative distribution function (**X.cdf(x)**) or inverse cumulative distribution function (**X.icdf(x)**).
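As a minimal sketch of these commands (scalar or array arguments are assumed to be accepted by **pdf** and **cdf**)::

    import tensap

    X = tensap.UniformRandomVariable(0, 1)     # uniform random variable on [0, 1]

    x = X.random(5)       # 5 samples drawn from the distribution of X
    p = X.pdf(0.3)        # probability density function at 0.3
    F = X.cdf(0.3)        # cumulative distribution function at 0.3

    # Orthonormal polynomials with respect to the measure of X
    P = X.orthonormal_polynomials()

    Y = tensap.NormalRandomVariable(0, 1)      # standard normal random variable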
**RandomVector**
----------------

A random vector :math:`X` is defined in tensap by a list of **RandomVariable** objects and a **Copula**, characterizing the dependencies between the random variables. Currently, only the independent copula **IndependentCopula** is implemented. Given a list of **RandomVariable** **random\_variables** and a **Copula** **C**, a random vector can be created with **X = tensap.RandomVector(random\_variables, copula=C)**.

Once a random vector :math:`X` is created, one can for instance generate :math:`n` random numbers according to its distribution with **X.random(n)**, create the orthonormal polynomials associated with its measure with **X.orthonormal\_polynomials()** (as presented in Section [sec:polynomials]), or evaluate its probability density function (**X.pdf(x)**) or cumulative distribution function (**X.cdf(x)**).

**Polynomials**
---------------

Families of univariate polynomials :math:`(p_i)_{i\ge 0}` are represented in tensap with classes inheriting from **UnivariatePolynomials**. The :math:`i`-th polynomial :math:`p_i` represented by a **UnivariatePolynomials** object **P** can be evaluated with **P.polyval(x, i)**, as well as its first order derivative (**P.d\_polyval(x, i)**) and its :math:`n`-th order derivative (**P.dn\_polyval(x, n, i)**). Given a measure :math:`\mu`, the moments :math:`\int p_{i_1}(x)...p_{i_k}(x) d\mu(x)` for :math:`(i_1,...,i_k) \in {\mathbb{N}}^{k}` can be obtained with **P.moment(I, mu)**, with :math:`{\texttt{\detokenize{I}}}` an :math:`n`-by-:math:`k` array representing :math:`n` tuples :math:`(i_1,...,i_k)`. **P.moment(I, X)** with :math:`{\texttt{\detokenize{X}}}` a random variable considers for :math:`\mu` the probability distribution of :math:`X`.

**CanonicalPolynomials**.
^^^^^^^^^^^^^^^^^^^^^^^^^

The family of canonical polynomials is implemented in the class **CanonicalPolynomials**. It is such that its :math:`i`-th polynomial is :math:`p_i(x) = x^i`.

**OrthonormalPolynomials**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Orthonormal polynomials are families of polynomials :math:`(p_i)_{i \geq 0}` that satisfy

.. math:: \int p_i(x) p_j(x) d\mu(x)= \delta_{ij}

with :math:`\delta_{ij}` the Kronecker delta, and with :math:`\mu` some measure. In tensap, the orthonormal polynomials :math:`p_i`, :math:`i \geq 0`, are defined using the three-term recurrence relation

.. math::

   \begin{aligned}
   &\tilde p_{-1}(x) = 0, \quad \tilde p_0(x) = 1, \\
   &\tilde p_{i+1}(x) = (x - a_i)\tilde p_{i}(x) - b_i \tilde p_{i-1}(x), \quad i \geq 0,\\
   &p_i(x) = \frac{\tilde p_i(x)}{n_i}, \quad i \geq 0
   \end{aligned}

with :math:`a_i` and :math:`b_i` the recurrence coefficients, and :math:`n_i` the norm of :math:`\tilde p_i`, defined by

.. math:: a_i = \frac{\int \tilde p_i(x) x \tilde p_i(x) d\mu(x)}{\int \tilde p_i(x) \tilde p_i(x) d\mu(x)}, \quad b_i = \frac{\int \tilde p_i(x) \tilde p_i(x) d\mu(x)}{\int \tilde p_{i-1}(x) \tilde p_{i-1}(x) d\mu(x)}, \quad n_i = \sqrt{\int\tilde p_i(x) \tilde p_i(x) d\mu(x)}.

Implementing a new family of orthonormal polynomials in tensap is easy: one only needs to create a class with a method providing the recurrence coefficients :math:`a_i`, :math:`b_i` and the norms :math:`n_i`, for all :math:`i \geq 0`.

The following families are currently implemented in tensap (an example follows the list):

- **DiscretePolynomials**: discrete polynomials orthonormal with respect to the measure of a **DiscreteRandomVariable**;
- **LegendrePolynomials**: polynomials defined on :math:`[-1,1]` and orthonormal with respect to the uniform measure on :math:`[-1,1]` with density :math:`\frac{1}{2}\mathbf{1}_{[-1,1]}(x)`;
- **HermitePolynomials**: polynomials defined on :math:`{\mathbb{R}}` and orthonormal with respect to the standard gaussian measure with density :math:`\exp(-x^2/2)/\sqrt{2\pi}`;
- **EmpiricalPolynomials**: polynomials orthonormal with respect to the measure of an **EmpiricalRandomVariable**.
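A minimal sketch, with the argument order of **polyval** as described above (a default constructor without arguments is assumed for **LegendrePolynomials**)::

    import numpy as np
    import tensap

    # Legendre polynomials, orthonormal w.r.t. the uniform measure on [-1, 1]
    P = tensap.LegendrePolynomials()

    x = np.linspace(-1, 1, 5)
    v = P.polyval(x, 3)       # evaluation of the polynomial of degree 3
    dv = P.d_polyval(x, 3)    # its first-order derivative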
If **mu** is a **LebesgueMeasure** on :math:`[-1,1]`, **mu.orthonormal\_polynomials()** returns a **LegendrePolynomials** with suitably normalized coefficients. If **mu** is a **LebesgueMeasure** on :math:`[a,b]` different from :math:`[-1,1]`, **mu.orthonormal\_polynomials()** returns a **ShiftedOrthonormalPolynomials**.

If :math:`{\texttt{\detokenize{X}}}` is a **DiscreteRandomVariable**, a **UniformRandomVariable**, a **NormalRandomVariable**, or an **EmpiricalRandomVariable**, the corresponding family of orthonormal polynomials can be created with the command **X.orthonormal\_polynomials()**. If :math:`{\texttt{\detokenize{X}}}` does not correspond to a default measure but can be obtained as the push-forward measure of a default measure by an affine transformation (e.g. a uniform measure on :math:`[a,b] \neq [-1,1]`, or a gaussian measure with mean :math:`a` and standard deviation :math:`\sigma` with :math:`(a,\sigma)\neq (0,1)`), the returned object is a **ShiftedOrthonormalPolynomials**.

**FunctionalBasis**
-------------------

Bases of functions can be implemented in tensap by inheriting from **FunctionalBasis**. The basis functions of a **FunctionalBasis** object **H** can be evaluated with **H.eval(x)**, as well as their :math:`i`-th order derivative with **H.eval\_derivative(i, x)**.

We present below some specific bases implemented in tensap. New bases can easily be implemented by making their class inherit from **FunctionalBasis**.

**PolynomialFunctionalBasis**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The command **tensap.PolynomialFunctionalBasis(basis, indices)**, with **basis** a **UnivariatePolynomials** and **indices** a list, returns the basis of polynomials :math:`(p_i)_{i \in I}` with :math:`I` given by **indices**.

**UserDefinedFunctionalBasis**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Given a list of functions **fun**, each taking as input :math:`d` variables, and a **Measure** **mu**, the command **tensap.UserDefinedFunctionalBasis(fun, mu, d)** returns a basis whose functions are the ones given in **fun**, with a domain equipped with the measure **mu**.

**FullTensorProductFunctionalBasis**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **FullTensorProductFunctionalBasis** object represents a basis of multivariate functions :math:`\{\phi^1_{i_1}(x_1) \cdots \phi^d_{i_d}(x_d)\}_{i_1 \in I^1, \ldots, i_d \in I^d}`. It is obtained with the command **tensap.FullTensorProductFunctionalBasis(bases)**, where **bases** is a list of **FunctionalBasis** or a **FunctionalBases**, containing the different bases :math:`\{\phi^\nu_{i_\nu}\}_{i_\nu \in I^\nu}`, :math:`\nu = 1, \ldots, d`.

**SparseTensorProductFunctionalBasis**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **SparseTensorProductFunctionalBasis** object represents a basis of multivariate functions :math:`\{\phi^1_{i_1}(x_1) \cdots \phi^d_{i_d}(x_d)\}_{(i_1, \ldots, i_d) \in \Lambda}`, with :math:`\Lambda \subset I^1 \times \cdots \times I^d` a set of multi-indices. It is obtained with the command **tensap.SparseTensorProductFunctionalBasis(bases, indices)**, where **bases** is a list of **FunctionalBasis** or a **FunctionalBases**, containing the different bases :math:`\{\phi^\nu_{i_\nu}\}_{i_\nu \in I^\nu}`, :math:`\nu = 1, \ldots, d`, and **indices** is a **MultiIndices** representing the set of multi-indices :math:`\Lambda`.
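A minimal sketch combining the commands described above (the layout of the evaluation points, one point per row, is assumed)::

    import numpy as np
    import tensap

    # Univariate basis: Legendre polynomials of degree at most 3
    basis = tensap.PolynomialFunctionalBasis(tensap.LegendrePolynomials(),
                                             list(range(4)))

    # Full tensor product basis of bivariate functions
    H = tensap.FullTensorProductFunctionalBasis([basis, basis])

    x = np.random.uniform(-1, 1, (10, 2))
    Hx = H.eval(x)   # evaluation of the basis functions at 10 points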
**FunctionalBases**
-------------------

The command **tensap.FunctionalBases(bases)**, with **bases** a list of **FunctionalBasis**, returns an object representing a collection of bases. To obtain a collection of :math:`d` identical bases, one can use **tensap.FunctionalBases.duplicate(basis, d)**. Similarly to **FunctionalBasis**, the basis functions of a **FunctionalBases** object **H** can be evaluated with **H.eval(x)**, as well as their :math:`i`-th order derivative with **H.eval\_derivative(i, x)**.

**FunctionalBasisArray**
------------------------

Given a basis of functions :math:`\{\phi_{i}\}_{i \in I}`, a **FunctionalBasisArray** object represents a function :math:`f` that writes

.. math:: f(x) = \sum_{i \in I} a_i \phi_i(x),

with some coefficients :math:`a_i`, :math:`i \in I`, and can be created with the command **f = tensap.FunctionalBasisArray(a, basis, shape)**, with **shape** the output shape of :math:`f`. A **FunctionalBasisArray** is a **Function**. It can be evaluated with the command **f.eval(x)**, and one can obtain its derivatives with **f.eval\_derivative(n, x)**.

**FunctionalTensor**
--------------------

Given :math:`d` bases of functions :math:`\{\phi^\nu_{i_\nu}\}_{i_\nu \in I^\nu}`, :math:`\nu = 1, \ldots, d`, and a tensor :math:`a \in {\mathbb{R}}^{I^1 \times \cdots \times I^d}`, a **FunctionalTensor** object represents a function :math:`f` that writes

.. math:: f(x) = \sum_{i_1 \in I^1} \cdots \sum_{i_d \in I^d} a_{i_1, \ldots, i_d} \phi^1_{i_1}(x_1) \cdots \phi^d_{i_d}(x_d).

The tensor :math:`a` can be in different tensor formats (**FullTensor**, **TreeBasedTensor**, ...). A **FunctionalTensor** is a **Function**. It can be evaluated with the command **f.eval(x)**, and one can obtain its derivatives with **f.eval\_derivative(n, x)**.

**Tensorizer** and **TensorizedFunction**
-----------------------------------------

For an introduction to tensorization of functions, see [Ali2020I]_, [Ali2020II]_, [Ali2021III]_.

We consider functions defined on the interval :math:`I = [0,1)`. For a given :math:`b \in \{2,3,\ldots\}` and :math:`d\in {\mathbb{N}}`, an element :math:`x \in I` can be identified with the tuple :math:`(i_1,\ldots,i_d,y)`, such that

.. math:: \label{eq:tensorization} x = t_{b,d}(i_1,\ldots,i_d,y) = \sum_{k=1}^d i_kb^{-k} + b^{-d}y

with :math:`i_k \in I_b = \{0,\ldots,b-1\}`, :math:`k = 1,\ldots,d`, and :math:`y = b^d x - \lfloor b^d x \rfloor \in [0,1)`. The tuple :math:`(i_1,\ldots,i_d)` is the representation in base :math:`b` of :math:`\lfloor b^d x \rfloor`. This defines a bijective map :math:`t_{b,d}` from :math:`\{0,\ldots,b-1\}^{d} \times [0,1)` to :math:`[0,1)`. Such a mapping is represented in tensap by the object **Tensorizer**: **t = tensap.Tensorizer(b, d)**. For a given :math:`x` in :math:`[0,1)`, one obtains the corresponding tuple :math:`(i_1, ..., i_d,y)` with the command **t.map(x)**. For a given tuple :math:`(i_1, ..., i_d,y)`, one obtains the corresponding :math:`x` with **t.inverse\_map([i\_1, ..., i\_d,y])**. This identification is generalized to functions of :math:`D` variables with **t = tensap.Tensorizer(b, d, D)**.

The map :math:`t_{b,d}` allows the definition of a tensorization map :math:`T_{b,d}`, which associates to a univariate function :math:`F` defined on :math:`[0,1)` the multivariate function :math:`f = F \circ t_{b,d}` defined on :math:`I_b^d \times I`, such that

.. math:: f(i_1,\ldots,i_d,y) = F(t_{b,d}(i_1,\ldots,i_d,y)).

Such a function is represented in tensap by a **TensorizedFunction**, and can be created with **f = tensap.TensorizedFunction(fun, t)**, with **fun** a **function** or **Function** and **t** a **Tensorizer**. The **TensorizedFunction** **f** is a function of :math:`d+1` variables that can be evaluated with **f.eval(x)**, with **x** a list or **numpy.ndarray** with :math:`d+1` columns. See the tutorial file ``tutorials\functions\tutorial_TensorizedFunction.py``.
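A minimal sketch of these objects; it assumes that **fun** is the univariate function :math:`F` to be tensorized and that **map** returns a tuple directly usable by **inverse_map**, as suggested by the description above::

    import numpy as np
    import tensap

    # Tensorization of [0, 1) in base b = 2 with resolution d = 4
    t = tensap.Tensorizer(2, 4)

    x = 0.37
    i = t.map(x)                 # tuple (i_1, ..., i_d, y) identified with x
    x_back = t.inverse_map(i)    # recovers x

    # Tensorized version of a univariate function F
    F = lambda z: np.sin(2 * np.pi * z)
    f = tensap.TensorizedFunction(F, t)

    # Evaluation at one point (i_1, ..., i_4, y), given as a row of d+1 columns
    print(f.eval(np.array([[0, 1, 0, 1, 0.5]])))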
Tools
=====

**MultiIndices**
----------------

A multi-index is a tuple :math:`(i_1,\ldots,i_d) \in {\mathbb{N}}_0^d`. A set :math:`I \subset {\mathbb{N}}_0^d` of multi-indices is represented with an object **MultiIndices**. To create a multi-index set :math:`I`, we use the command **tensap.MultiIndices(I)** with **I** a numpy array of size :math:`\#I \times d`. A product set :math:`I = I_1 \times \ldots \times I_d` can be obtained with **tensap.MultiIndices.product\_set([I1,...,Id])**.

The set of multi-indices

.. math:: I = \{i \in {\mathbb{N}}_0^d : \Vert i \Vert_{\ell^p} \le m\}

can be obtained with **tensap.MultiIndices.with\_bounded\_norm(d, p, m)**. The set of multi-indices

.. math:: I = \{i \in {\mathbb{N}}_0^d : i_\nu \le m_\nu , 1\le \nu \le d\}

can be obtained with **tensap.MultiIndices.bounded\_by(d, m)**. If :math:`m` is of length :math:`1`, it uses :math:`m_\nu = m` for all :math:`\nu`. The margin and reduced margin of a multi-index set :math:`I` can also be computed; for these and other operations on **MultiIndices**, see the tutorial file ``tutorials\tools\tutorial_MultiIndices.py``.
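A minimal sketch of the constructors described above (the elements of the lists passed to **product_set** are assumed to be arrays of admissible indices in each dimension)::

    import numpy as np
    import tensap

    # Multi-index set given explicitly, one multi-index per row
    I = tensap.MultiIndices(np.array([[0, 0], [1, 0], [0, 1]]))

    # Product set {0, 1, 2} x {0, 1}
    J = tensap.MultiIndices.product_set([np.arange(3), np.arange(2)])

    # Multi-indices of dimension 2 with l^1-norm bounded by 3
    K = tensap.MultiIndices.with_bounded_norm(2, 1, 3)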
  If **len(max\_rank) == 1**, the same value is used for all :math:`\alpha`; setting **max\_rank = inf** prescribes the precision.

Furthermore, a **FunctionalTensorPrincipalComponentAnalysis** is parameterized by the attributes:

- **bases**: the functional bases used for the projection of the function;
- **grid**: the **FullTensorGrid** used for the projection of the function on the functional bases;
- **projection\_type**: the type of projection, the default being ’interpolation’.

Both objects implement four main methods:

- **hopca**: returns the set of :math:`\{\nu\}`-principal components of an order :math:`d` tensor, for all :math:`\nu \in \{1,\ldots,d\}`;
- **tucker\_approximation**: returns an approximation of a tensor of order :math:`d` or a function of :math:`d` variables in Tucker format;
- **tree\_based\_approximation**: provided with a tree and a list of active nodes, returns an approximation of a tensor of order :math:`d` or a function of :math:`d` variables in tree-based tensor format;
- **tt\_approximation**: returns an approximation of a tensor of order :math:`d` or a function of :math:`d` variables in tensor-train format.

**LossFunction**
----------------

| In tensap, a loss function is an object inheriting from **LossFunction**. Given a function **fun** and a sample (as a list) used to evaluate the loss function, a **LossFunction** object **l** can be evaluated with **l.eval(fun, sample)**. The risk associated with **fun** can be evaluated using the sample with **l.risk\_estimation(fun, sample)**. Finally, the test error and relative test error (if defined) can be evaluated with **l.test\_error(fun, sample)** and **l.relative\_test\_error(fun, sample)**, respectively.
| Currently, three loss functions are implemented in tensap:

- **SquareLossFunction**: :math:`\ell(g, (x, y)) = (y - g(x))^2`, used for least-squares regression in supervised learning, to construct an approximation of a random variable :math:`Y` as a function of a random vector :math:`X` (a predictive model);
- **DensityL2LossFunction**: :math:`\ell(g, x) = \|g\|^2 - 2g(x)`, used for least-squares density estimation, to approximate the distribution of a random variable :math:`X` from samples of :math:`X`;
- **CustomLossFunction**: any loss function defined by the user. If the loss is defined using tensorflow operations, then the empirical risk can be minimized using tensorflow’s automatic differentiation capabilities with a **LinearModelLearningCustomLoss** object, presented in the next section.

**LinearModelLearning**
-----------------------

Objects inheriting from **LinearModelLearning** implement the empirical risk minimization associated with a linear model that writes

.. math:: g(x) = \sum_{i \in I} a_i \phi_i(x),

with :math:`\{\phi_i\}_{i \in I}` a given basis (or a set of features) and :math:`(a_i)_{i \in I}` some coefficients, and with a loss function, introduced in the previous section. In order to perform empirical risk minimization, a **LinearModelLearning** object **s** must be provided with a training sample in **s.training\_sample**. In supervised learning, for the approximation of a random variable :math:`Y` as a function of :math:`X`, the training sample is a list **[x, y]**, where **y** represents :math:`n` samples :math:`\{y_k\}_{k=1}^n` of :math:`Y` and **x** the :math:`n` corresponding samples :math:`\{x_k\}_{k=1}^n` of :math:`X`.
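For instance, with a training sample **[x, y]** of this form, the loss functions of the previous section can be evaluated directly. Below is a minimal sketch (the data and the model **g** are purely illustrative, and it is assumed that a plain Python callable is accepted for **fun**; otherwise, it should be wrapped as a tensap **Function**)::

    import numpy as np
    import tensap

    # Illustrative supervised-learning sample: n = 100 samples of X and of Y
    x = np.random.rand(100, 1)
    y = x[:, 0]**2 + 1e-2*np.random.randn(100)

    loss = tensap.SquareLossFunction()

    def g(z):
        # A simple candidate predictive model
        return z[:, 0]**2

    # Empirical risk of g on the sample [x, y] (mean of the square loss)
    risk = loss.risk_estimation(g, [x, y])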
In density estimation, the training sample is an array **x** containing samples :math:`\{x_k\}_{k=1}^n` from the distribution to estimate. One must also provide a basis (in **s.basis**) or evaluations of the basis on the training set (in **s.basis\_eval**, in which case the :math:`x_k` are not mandatory in **s.training\_sample**). The latter option allows one to provide features :math:`\phi_i(x_k)` associated with the samples :math:`x_k` without providing the feature maps :math:`\phi_i`.

| One can also provide the **LinearModelLearning** object **s** with a test sample in **s.test\_data** to compute a test error.
| Currently, three different **LinearModelLearning** objects are implemented in tensap:

- **LinearModelLearningSquareLoss**, to minimize the risk associated with a **SquareLossFunction**;
- **LinearModelLearningDensityL2**, to minimize the risk associated with a **DensityL2LossFunction**;
- **LinearModelLearningCustomLoss**, to minimize the risk associated with a **CustomLossFunction**.

**LinearModelLearningSquareLoss**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **LinearModelLearningSquareLoss** object **s** implements three ways of solving the empirical risk minimization associated with a **SquareLossFunction**:

- by default, **s.solve()** solves the ordinary least-squares problem

  .. math:: \min_{(a_i)_{i \in I}} \frac 1 n \sum_{k=1}^n (y_k - \sum_{i \in I} a_i \phi_i(x_k))^2;

- with the attribute **s.regularization = True**, **s.solve()** solves the regularized problem

  .. math:: \min_{(a_i)_{i \in I}} \frac 1 n \sum_{k=1}^n (y_k - \sum_{i \in I} a_i \phi_i(x_k))^2 + \lambda \|a\|_p

  with :math:`\lambda` a regularization hyper-parameter, selected with a cross-validation estimate of the error, and :math:`p` specified by **s.regularization\_type**, which can be **’l0’** (:math:`p = 0`), **’l1’** (:math:`p = 1`) or **’l2’** (:math:`p = 2`);

- let us suppose that we have a collection of candidate sparsity patterns :math:`K_\lambda`, :math:`\lambda \in \Lambda`, for the parameter :math:`a`: with the attribute **s.basis\_adaptation = True**, **s.solve()** solves, for all :math:`\lambda\in \Lambda`, the problem

  .. math:: \min_{(a_i)_{i \in I}} \frac 1 n \sum_{k=1}^n (y_k - \sum_{i \in I} a_i \phi_i(x_k))^2 \quad \text{subject to } \mathrm{support}(a) \subset K_\lambda,

  where :math:`\mathrm{support}(a) = \{ i \in I : a_i \neq 0 \}`, and selects the optimal sparsity pattern using a cross-validation estimate of the error.

**LinearModelLearningDensityL2**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **LinearModelLearningDensityL2** object **s** implements three ways of solving the empirical risk minimization associated with a **DensityL2LossFunction**:

- by default, **s.solve()** solves the minimization problem

  .. math:: \min_{(a_i)_{i \in I}} \| \sum_{i \in I} a_i \phi_i \|_{L^2}^2 - \frac 2 n \sum_{k=1}^n \sum_{i \in I} a_i \phi_i(x_k);

- with the attribute **s.regularization = True**, **s.solve()** solves the constrained problem

  .. math:: \min_{(a_i)_{i \in I}} \|\sum_{i \in I} a_i \phi_i \|_{L^2}^2 - \frac 2 n \sum_{k=1}^n \sum_{i \in I} a_i \phi_i(x_k) \quad \text{subject to } \mathrm{support}(a) \subset K_\lambda,

  with :math:`K_\lambda`, :math:`\lambda \in \Lambda`, a sequence of index sets obtained by introducing the coefficients of the solution of the unregularized problem in decreasing order of magnitude.
  The optimal sparsity pattern is determined using a cross-validation estimate of the error;

- let us suppose that we have a collection of candidate sparsity patterns :math:`K_\lambda`, :math:`\lambda \in \Lambda`, for the parameter :math:`a`: with the attribute **s.basis\_adaptation = True**, **s.solve()** solves, for all :math:`\lambda\in \Lambda`, the problem

  .. math:: \min_{(a_i)_{i \in I}} \|\sum_{i \in I} a_i \phi_i \|_{L^2}^2 - \frac 2 n \sum_{k=1}^n \sum_{i \in I} a_i \phi_i(x_k) \quad \text{subject to } \mathrm{support}(a) \subset K_\lambda,

  and selects the optimal sparsity pattern using a cross-validation estimate of the error.

**LinearModelLearningCustomLoss**.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A **LinearModelLearningCustomLoss** object **s** implements a way of solving the empirical risk minimization associated with a **CustomLossFunction** using tensorflow’s automatic differentiation capabilities. By default, the optimizer used is keras’ Adam algorithm, which is a “stochastic gradient descent method that is based on adaptive estimation of first-order and second-order moments” (per tensorflow’s documentation). The algorithm requires a starting point, provided in **s.initial\_guess**, and several options can be set:

- **s.options[’max\_iter’]** sets the maximum number of iterations used in the optimization algorithm,
- **s.options[’stagnation’]** sets the stopping tolerance on the stagnation between two iterates,
- for the Adam algorithm (and other minimization algorithms provided by tensorflow/keras), the learning rate can be provided in **s.optimizer.learning\_rate**.

**TensorLearning**
------------------

The package tensap implements algorithms to perform statistical learning with canonical and tree-based tensor formats. See [Grelier2018]_, [Grelier2019]_, [Michel2020]_ for a detailed presentation of the algorithms and the related theory. For examples, see the tutorial files ``tutorials\approximation\tutorial_tensor_learning_CanonicalTensorLearning.py``, ``tutorials\approximation\tutorial_tensor_learning_TreeBasedTensorLearning.py``, ``tutorials\approximation\tutorial_tensor_learning_TreeBasedTensorDensityLearning.py`` and ``tutorials\approximation\tutorial_tensor_learning_tensorized_function_learning.py``.

These algorithms are implemented in the core object **TensorLearning**, common to all tensor formats, so that implementing such a learning algorithm for a new tensor format is simple. Currently implemented in tensap are **CanonicalTensorLearning**, for learning in canonical tensor format, and **TreeBasedTensorLearning**, for learning in tree-based tensor format. Two algorithms are proposed: the standard one, which minimizes an empirical risk over the set of tensors in a given format using an alternating minimization over the parameters of the tensors, and the adaptive one, which returns a sequence of empirical risk minimizers with adapted rank (for the canonical and tree-based tensor formats) and adapted tree (for the tree-based tensor format).

In order to perform empirical risk minimization, a **TensorLearning** object **s** must be provided with a training sample in **s.training\_sample**. In supervised learning, for the approximation of a random variable :math:`Y` as a function of :math:`X`, the training sample is a list **[x, y]**, where **y** represents :math:`n` samples :math:`\{y_k\}_{k=1}^n` of :math:`Y` and **x** the :math:`n` corresponding samples :math:`\{x_k = (x_{k,1},\ldots,x_{k,d})\}_{k=1}^n` of :math:`X`.
In density estimation, the training sample is an array **x** containing samples :math:`\{x_k = (x_{k,1},\ldots,x_{k,d})\}_{k=1}^n` from the distribution to estimate. One must also provide bases (in **s.bases**) or evaluations of the bases on the training set (in **s.bases\_eval**, in which case the :math:`x_k` are not mandatory in **s.training\_sample**). The latter option allows one to provide features :math:`\phi^\nu_i(x_{k,\nu})`, :math:`1\le \nu\le d`, associated with the samples :math:`x_k = (x_{k,1},\ldots,x_{k,d})`, without providing the feature maps :math:`\phi^\nu_i`. One can also provide the **TensorLearning** object **s** with a test sample in **s.test\_data** to compute a test error.

Rank adaptation.
^^^^^^^^^^^^^^^^

(See [Grelier2018]_) The rank adaptation is enabled by setting **s.rank\_adaptation** to **True**. For tensors in canonical format, the algorithm returns a sequence of rank-\ :math:`r` approximations, with :math:`r = 1, \ldots, r_{\text{max}}`, :math:`r_{\text{max}}` being given by **s.rank\_adaptation\_options[’max\_iterations’]**. For tensors in tree-based format, the algorithm returns a sequence of tensors with non-decreasing tree-based rank, obtained by increasing, at each iteration, the ranks associated with a subset of nodes of the tree :math:`T`. The number of nodes in this subset is controlled by the parameter **s.rank\_adaptation\_options[’theta’]** in :math:`[0, 1]`: the larger it is, the more ranks are increased at each iteration. Its default value is :math:`0.8`.

Tree adaptation.
^^^^^^^^^^^^^^^^

(See [Grelier2018]_) For tree-based tensor formats, the tree can be adapted at each iteration, using the algorithm mentioned in the section on **TreeBasedTensor**, by setting **s.tree\_adaptation** to **True**. The tolerance for the tree adaptation is provided by **s.tree\_adaptation\_options[’tolerance’]** and the maximum number of trees tried by **s.tree\_adaptation\_options[’max\_iterations’]**.

Model selection.
^^^^^^^^^^^^^^^^

(See [Michel2020]_) At the end of the adaptive procedure, a model can be selected by setting **s.model\_selection** to **True**, using either a test error (specified by **s.model\_selection\_options[’type’] = ’test\_error’**) or a cross-validation estimate of the error (specified by **s.model\_selection\_options[’type’] = ’cv\_error’**).

Example: character classification in tree-based tensor format.
--------------------------------------------------------------

We present below a part of the tutorial file ``tutorials\approximation\tutorial_tensor_learning_digits_recognition.py`` shipped with the package tensap. Its aim is to create a classifier in tree-based tensor format, able to recognize handwritten digits from :math:`0` to :math:`9`. The output of the algorithm is displayed below the Python script, as well as in the figures at the end of this section, which show the confusion matrix on the test sample and a visual comparison on some test samples. Using a training sample of size :math:`1617`, the algorithm returns a classifier achieving :math:`98.89\%` of correct classification on a test sample of size :math:`180`.
::

    from sklearn import datasets, metrics
    import random
    import numpy as np
    import tensorflow as tf
    import time
    import matplotlib.pyplot as plt
    import tensap

    # %% Data import and preparation
    DIGITS = datasets.load_digits()
    DATA = DIGITS.images.reshape((len(DIGITS.images), -1))
    DATA = DATA / np.max(DATA)  # Scaling of the data

    # %% Patch reshape of the data: the patches are consecutive entries of the data
    PS = [4, 4]  # Patch size
    DATA = np.array([np.concatenate(
        [np.ravel(np.reshape(DATA[k, :], [8]*2)[PS[0]*i:PS[0]*i+PS[0],
                                                PS[1]*j:PS[1]*j+PS[1]])
         for i in range(int(8/PS[0])) for j in range(int(8/PS[1]))])
        for k in range(DATA.shape[0])])
    DIM = int(int(DATA.shape[1]/np.prod(PS)))

    # %% Probability measure
    print('Dimension %i' % DIM)
    X = tensap.RandomVector(tensap.DiscreteRandomVariable(np.unique(DATA)), DIM)

    # %% Training and test samples
    P_TRAIN = 0.9  # Proportion of the sample used for the training
    N = DATA.shape[0]
    TRAIN = random.sample(range(N), int(np.round(P_TRAIN*N)))
    TEST = np.setdiff1d(range(N), TRAIN)
    X_TRAIN = DATA[TRAIN, :]
    X_TEST = DATA[TEST, :]
    Y_TRAIN = DIGITS.target[TRAIN]
    Y_TEST = DIGITS.target[TEST]

    # One hot encoding (vector-valued function)
    Y_TRAIN = tf.one_hot(Y_TRAIN.astype(int), 10, dtype=tf.float64)
    Y_TEST = tf.one_hot(Y_TEST.astype(int), 10, dtype=tf.float64)

    # %% Approximation bases: 1, cos and sin for each pixel of the patch
    FUN = [lambda x: np.ones((np.shape(x)[0], 1))]
    for i in range(np.prod(PS)):
        FUN.append(lambda x, j=i: np.cos(np.pi / 2*x[:, j]))
        FUN.append(lambda x, j=i: np.sin(np.pi / 2*x[:, j]))

    BASES = [tensap.UserDefinedFunctionalBasis(FUN, X.random_variables[0],
                                               np.prod(PS)) for _ in range(DIM)]
    BASES = tensap.FunctionalBases(BASES)

    # %% Loss function: cross-entropy custom loss function
    LOSS = tensap.CustomLossFunction(
        lambda y_true, y_pred: tf.nn.sigmoid_cross_entropy_with_logits(
            logits=y_pred, labels=y_true))


    def error_function(y_pred, sample):
        '''
        Return the error associated with a set of predictions using a sample,
        equal to the number of misclassifications divided by the number of
        samples.

        Parameters
        ----------
        y_pred : numpy.ndarray
            The predictions.
        sample : list
            The sample used to compute the error. sample[0] contains the
            inputs, and sample[1] the corresponding outputs.

        Returns
        -------
        int
            The error.
        '''
        try:
            y_pred = y_pred(sample[0])
        except Exception:
            pass
        return np.count_nonzero(np.argmax(y_pred, 1) - np.argmax(sample[1], 1)) \
            / sample[1].numpy().shape[0]


    LOSS.error_function = error_function

    # %% Learning in tree-based tensor format
    TREE = tensap.DimensionTree.balanced(DIM)
    IS_ACTIVE_NODE = np.full(TREE.nb_nodes, True)
    SOLVER = tensap.TreeBasedTensorLearning(TREE, IS_ACTIVE_NODE, LOSS)

    SOLVER.tolerance['on_stagnation'] = 1e-10
    SOLVER.initialization_type = 'random'

    SOLVER.bases = BASES
    SOLVER.training_data = [X_TRAIN, Y_TRAIN]

    SOLVER.test_error = True
    SOLVER.test_data = [X_TEST, Y_TEST]

    SOLVER.rank_adaptation = True
    SOLVER.rank_adaptation_options['max_iterations'] = 15
    SOLVER.model_selection = True
    SOLVER.display = True

    SOLVER.alternating_minimization_parameters['display'] = False
    SOLVER.alternating_minimization_parameters['max_iterations'] = 10
    SOLVER.alternating_minimization_parameters['stagnation'] = 1e-10

    # Options dedicated to the LinearModelCustomLoss object
    SOLVER.linear_model_learning.options['max_iterations'] = 10
    SOLVER.linear_model_learning.options['stagnation'] = 1e-10
    SOLVER.linear_model_learning.optimizer.learning_rate = 1e3

    SOLVER.rank_adaptation_options['early_stopping'] = True
    SOLVER.rank_adaptation_options['early_stopping_factor'] = 10

    T0 = time.time()
    F, OUTPUT = SOLVER.solve()
    T1 = time.time()
    print(T1-T0)

    # %% Display of the results
    F_X_TEST = np.argmax(F(X_TEST), 1)
    Y_TEST_NP = np.argmax(Y_TEST.numpy(), 1)
    print('\nAccuracy = %2.5e\n' % (1 - np.count_nonzero(F_X_TEST - Y_TEST_NP) /
                                    Y_TEST_NP.shape[0]))

    IMAGES_AND_PREDICTIONS = list(zip(DIGITS.images[TEST], F_X_TEST))
    for i in np.arange(1, 19):
        plt.subplot(3, 6, i)
        plt.imshow(IMAGES_AND_PREDICTIONS[i][0], cmap=plt.cm.gray_r,
                   interpolation='nearest')
        plt.axis('off')
        plt.title('Pred.: %i' % IMAGES_AND_PREDICTIONS[i][1])

    print('Classification report:\n%s\n' %
          (metrics.classification_report(Y_TEST_NP, F_X_TEST)))

    MATRIX = metrics.confusion_matrix(Y_TEST_NP, F_X_TEST)
    plt.matshow(MATRIX)
    plt.title('Confusion Matrix')
    plt.show()
    print('Confusion matrix:\n%s' % MATRIX)

::

    Dimension 4
    The implemented learning algorithms are designed for orthonormal bases.
    These algorithms work with non-orthonormal bases, but without some
    guarantees on their results.
    Rank adaptation, iteration 0:
        Enriched nodes: []
        Ranks = [10, 1, 1, 1, 1, 1, 1]
        Storage complexity = 144
        Test error = 9.38889e-01
    Rank adaptation, iteration 1:
        Enriched nodes: [2, 4, 3, 5, 6, 7]
        Ranks = [10, 2, 2, 2, 2, 2, 2]
        Storage complexity = 320
        Test error = 8.44444e-01
    Rank adaptation, iteration 2:
        Enriched nodes: [2, 3, 4, 5, 7]
        Ranks = [10, 3, 3, 3, 3, 2, 3]
        Storage complexity = 498
        Test error = 7.00000e-01
    Rank adaptation, iteration 3:
        Enriched nodes: [2, 3, 4, 6, 7]
        Ranks = [10, 4, 4, 4, 3, 3, 4]
        Storage complexity = 718
        Test error = 5.61111e-01
    Rank adaptation, iteration 4:
        Enriched nodes: [2, 5, 3]
        Ranks = [10, 5, 5, 4, 4, 3, 4]
        Storage complexity = 885
        Test error = 1.22222e-01
    Rank adaptation, iteration 5:
        Enriched nodes: [2, 3, 4, 5, 6, 7]
        Ranks = [10, 6, 6, 5, 5, 4, 5]
        Storage complexity = 1257
        Test error = 5.55556e-02
    Rank adaptation, iteration 6:
        Enriched nodes: [2, 3, 5, 6]
        Ranks = [10, 7, 7, 5, 6, 5, 5]
        Storage complexity = 1568
        Test error = 2.22222e-02
    Rank adaptation, iteration 7:
        Enriched nodes: [2, 3]
        Ranks = [10, 8, 8, 5, 6, 5, 5]
        Storage complexity = 1773
        Test error = 3.33333e-02
    Rank adaptation, iteration 8:
        Enriched nodes: [3, 7, 2, 6]
        Ranks = [10, 9, 9, 5, 6, 6, 6]
        Storage complexity = 2163
        Test error = 2.22222e-02
    Rank adaptation, iteration 9:
        Enriched nodes: [4, 5]
        Ranks = [10, 9, 9, 6, 7, 6, 6]
        Storage complexity = 2337
        Test error = 2.22222e-02
    Rank adaptation, iteration 10:
        Enriched nodes: [3, 4, 2, 6]
        Ranks = [10, 10, 10, 7, 7, 7, 6]
        Storage complexity = 2801
        Test error = 1.66667e-02
    Rank adaptation, iteration 11:
        Enriched nodes: [2, 3, 5]
        Ranks = [10, 11, 11, 7, 8, 7, 6]
        Storage complexity = 3212
        Test error = 2.22222e-02
    Rank adaptation, iteration 12:
        Enriched nodes: [2, 4, 6, 7, 3]
        Ranks = [10, 12, 12, 8, 8, 8, 7]
        Storage complexity = 3903
        Test error = 1.66667e-02
    Rank adaptation, iteration 13:
        Enriched nodes: [2, 3, 4, 6, 7]
        Ranks = [10, 13, 13, 9, 8, 9, 8]
        Storage complexity = 4684
        Test error = 1.66667e-02
    Rank adaptation, iteration 14:
        Enriched nodes: [5]
        Ranks = [10, 13, 13, 9, 9, 9, 8]
        Storage complexity = 4834
        Test error = 1.11111e-02

    Model selection using the test error: model #14 selected
    Ranks = [10, 13, 13, 9, 9, 9, 8], test error = 1.11111e-02

    615.6790609359741

    Accuracy = 9.88889e-01

    Classification report:
                  precision    recall  f1-score   support

               0       1.00      1.00      1.00        23
               1       1.00      0.96      0.98        23
               2       1.00      1.00      1.00        19
               3       1.00      1.00      1.00        18
               4       1.00      1.00      1.00        22
               5       0.93      1.00      0.96        13
               6       1.00      0.94      0.97        17
               7       1.00      1.00      1.00        15
               8       0.95      1.00      0.97        18
               9       1.00      1.00      1.00        12

        accuracy                           0.99       180
       macro avg       0.99      0.99      0.99       180
    weighted avg       0.99      0.99      0.99       180

    Confusion matrix:
    [[23  0  0  0  0  0  0  0  0  0]
     [ 0 22  0  0  0  0  0  0  1  0]
     [ 0  0 19  0  0  0  0  0  0  0]
     [ 0  0  0 18  0  0  0  0  0  0]
     [ 0  0  0  0 22  0  0  0  0  0]
     [ 0  0  0  0  0 13  0  0  0  0]
     [ 0  0  0  0  0  1 16  0  0  0]
     [ 0  0  0  0  0  0  0 15  0  0]
     [ 0  0  0  0  0  0  0  0 18  0]
     [ 0  0  0  0  0  0  0  0  0 12]]

.. |Obtained results for the classification tutorial 1.| image:: confusion_matrix
.. |Obtained results for the classification tutorial 2.| image:: test_sample_comparison