Parameter space matrix representations

All parameter space matrix representations inherit from nngeometry.object.pspace.PMatAbstract. This abstract class defines the methods that are available for all representations (with some exceptions!). nngeometry.object.pspace.PMatAbstract cannot be instantiated; you have to choose one of the concrete representations below instead.

class nngeometry.object.pspace.PMatAbstract(layer_collection, generator, data=None, examples=None, **kwargs)

A \(d \times d\) matrix in parameter space. This abstract class defines common methods used in concrete representations.

Parameters:
  • generator (nngeometry.generator.jacobian.Jacobian) – The generator

  • data – if None, it requires examples to be different from None, and it uses the generator to populate the matrix data

  • examples – if data is None, it uses these examples to populate the matrix using the generator. examples is either a Dataloader, or a single mini-batch of (inputs, targets) from a Dataloader

Note

Exactly one of data or examples has to be different from None: they cannot both be None, and they cannot both be set at the same time.

abstractmethod get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

mapTMmap(pfmap, reduction='sum')

Performs batched vTMv for each PVector v in a PFMap

Parameters:
  • pfmap (object.map.PFMap) – the PFMap to multiply

  • reduction (str) – one of [“sum”, “diag”]

mmap(pfmap)

Performs multiplication between PMat and PFMap

Parameters:

pfmap (object.map.PFMap) – the PFMap to multiply

size(dim=None)

Size of the matrix as a tuple, regardless of the actual size in memory.

Parameters:

dim (int or None) – dimension

>>> M.size()
(1254, 1254)
>>> M.size(0)
1254

solve(x, regul, solve='default', **kwargs)

Solves Fx = b in x

Parameters:
  • regul (float) – regularization, depending on the type of solve (e.g. Tikhonov damping, or high-pass filter)

  • b (PVector or PFMap) – the right-hand side \(b\); this is the argument named x in the signature above

  • solve – solve implementation, this is dependent on the PMat representation
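As an illustration of what a regularized solve computes, here is a minimal sketch in plain Python. It assumes that Tikhonov damping means solving \((M + \lambda I) x = b\) with \(\lambda\) given by regul; the helper name solve_damped_2x2 is hypothetical and is not part of NNGeometry, whose actual solvers depend on the representation.

```python
def solve_damped_2x2(M, b, regul):
    """Solve (M + regul * I) x = b for a 2x2 matrix M using Cramer's rule.

    Assumption: regul is added to the diagonal (Tikhonov damping).
    """
    a11, a12 = M[0][0] + regul, M[0][1]
    a21, a22 = M[1][0], M[1][1] + regul
    det = a11 * a22 - a12 * a21
    return [
        (a22 * b[0] - a12 * b[1]) / det,
        (a11 * b[1] - a21 * b[0]) / det,
    ]


M = [[2.0, 1.0],
     [1.0, 3.0]]
b = [3.0, 1.0]
x = solve_damped_2x2(M, b, regul=1.0)

# Check the solution by multiplying back with the damped matrix.
residual = [
    (M[0][0] + 1.0) * x[0] + M[0][1] * x[1] - b[0],
    M[1][0] * x[0] + (M[1][1] + 1.0) * x[1] - b[1],
]
```

A larger regul pulls the solution towards \(b / \lambda\), which is why damping stabilizes the solve when M is close to singular.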

abstractmethod vTMv(v)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

v (object.vector.PVector) – vector \(v\)
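The quadratic form itself can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic \(v^\top M v\) on nested lists, not NNGeometry's implementation; the function names are hypothetical. The second function shows what the same quantity becomes when only the diagonal of M is kept.

```python
def quadratic_form(M, v):
    """Return v^T M v for a square matrix M (list of rows) and a vector v."""
    Mv = [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]
    return sum(v_i * mv_i for v_i, mv_i in zip(v, Mv))


def quadratic_form_diag(diag, v):
    """Return v^T D v where D is the diagonal matrix with entries diag."""
    return sum(d_i * v_i * v_i for d_i, v_i in zip(diag, v))


M = [[2.0, 1.0],
     [1.0, 3.0]]
v = [1.0, -1.0]

q_full = quadratic_form(M, v)
q_diag = quadratic_form_diag([M[0][0], M[1][1]], v)
```

The two values differ because the diagonal version drops the off-diagonal interaction between the two coordinates of v.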

Concrete representations

NNGeometry makes it easy to switch between representations. Each representation comes with a trade-off between accuracy and memory/computational cost. When testing a new algorithm, we recommend starting on a small network with the most accurate representation that fits in memory (typically nngeometry.object.pspace.PMatDense), then switching to a larger-scale experiment with a lower-memory representation.
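To give a feel for the memory side of this trade-off, the following sketch counts how many scalars each kind of representation would store for a stack of fully connected layers. The counts follow the usual textbook definitions (a full \(d \times d\) matrix, one block per layer, two Kronecker factors per layer with the bias folded into the input factor, one value per parameter). These are assumptions made for illustration; the actual storage used by NNGeometry may differ.

```python
def storage_costs(layers):
    """Count stored scalars per representation for fully connected layers.

    layers is a list of (n_in, n_out) pairs; each layer is assumed to have a bias.
    """
    per_layer = [n_out * (n_in + 1) for n_in, n_out in layers]
    d = sum(per_layer)  # total number of parameters
    return {
        "dense": d * d,
        "blockdiag": sum(p * p for p in per_layer),
        # One (n_in + 1) x (n_in + 1) factor and one n_out x n_out factor per layer.
        "kfac": sum((n_in + 1) ** 2 + n_out ** 2 for n_in, n_out in layers),
        "diag": d,
    }


costs = storage_costs([(784, 100), (100, 10)])
```

For this two-layer example the counts are roughly 6.3e9 (dense), 6.2e9 (block-diagonal), 6.4e5 (Kronecker-factored) and 8.0e4 (diagonal).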

class nngeometry.object.pspace.PMatBlockDiag(layer_collection, generator, data=None, examples=None, **kwargs)

get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

mm(other)

Matrix-matrix product where other is another instance of PMatBlockDiag

Parameters:

other (nngeometry.object.PMatBlockDiag) – Other FIM matrix

Returns:

The matrix-matrix product

Return type:

nngeometry.object.PMatBlockDiag
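The product of two block-diagonal matrices with the same block structure is again block-diagonal, and each block of the result is the product of the corresponding blocks. A minimal sketch in plain Python, with blocks stored as nested lists (the function names are hypothetical and this is not NNGeometry's code):

```python
def matmul(X, Y):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def blockdiag_mm(blocks_a, blocks_b):
    """Multiply two block-diagonal matrices block by block."""
    return [matmul(a, b) for a, b in zip(blocks_a, blocks_b)]


blocks_a = [[[1, 2], [3, 4]], [[2]]]
blocks_b = [[[0, 1], [1, 0]], [[5]]]
product = blockdiag_mm(blocks_a, blocks_b)
```

Only the blocks are ever multiplied, so the cost scales with the sum of the cubed block sizes rather than with the cube of the total parameter count.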

vTMv(vector)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

vector (object.vector.PVector) – vector \(v\)

class nngeometry.object.pspace.PMatDense(layer_collection, generator, data=None, examples=None, **kwargs)

get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

mm(other)

Matrix-matrix product where other is another instance of PMatDense

Parameters:

other (nngeometry.object.PMatDense) – Other FIM matrix

Returns:

The matrix-matrix product

Return type:

nngeometry.object.PMatDense

solvePFMap(x, regul=1e-08, solve='solve')

Solves J = AX in X

solvePVec(x, regul=1e-08, solve='solve')

Solves v = Ax in x

vTMv(v)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

v (object.vector.PVector) – vector \(v\)

class nngeometry.object.pspace.PMatDiag(layer_collection, generator, data=None, examples=None, **kwargs)

get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

mm(other)

Matrix-matrix product where other is another instance of PMatDiag

Parameters:

other (nngeometry.object.PMatDiag) – Other FIM matrix

Returns:

The matrix-matrix product

Return type:

nngeometry.object.PMatDiag

solvePVec(x, regul=1e-08, solve='default')

Solves v = Ax in x
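For a diagonal matrix the solve reduces to an elementwise division. The sketch below assumes that regul is added to every diagonal entry before dividing (Tikhonov damping); this is an assumption about the meaning of regul, and solve_diag is a hypothetical helper, not the library's implementation.

```python
def solve_diag(diag, v, regul=1e-8):
    """Solve (D + regul * I) x = v where D is diagonal with entries diag."""
    return [v_i / (d_i + regul) for d_i, v_i in zip(diag, v)]


diag = [1.0, 3.0]
v = [4.0, 8.0]
x = solve_diag(diag, v, regul=1.0)
```

With regul=1.0 the damped diagonal is [2.0, 4.0], so both coordinates of the solution equal 2.0.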

vTMv(v)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

v (object.vector.PVector) – vector \(v\)

class nngeometry.object.pspace.PMatEKFAC(layer_collection, generator, data=None, examples=None, eigendecomposition=None, **kwargs)

EKFAC representation from George, Laurent et al., Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis, NeurIPS 2018

get_KFE(layer_id, layer, split_weight_bias=True)

Returns a dict indexed by layers, of dense eigenvectors constructed from Kronecker-factored eigenvectors

Parameters:
  • split_weight_bias (bool) – if True then the parameters are ordered in the same way as in the dense or blockdiag representation, but it involves more operations. Otherwise the coefficients corresponding to the bias are mixed between coefficients of the weight matrix

get_diag(v)

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

mapTMmap(pfmap, reduction='sum')

Performs batched vTMv for each PVector v in a PFMap

Parameters:
  • pfmap (object.map.PFMap) – the PFMap to multiply

  • reduction (str) – one of [“sum”, “diag”]

to_torch(split_weight_bias=True)

Parameters:
  • split_weight_bias (bool) – if True then the parameters are ordered in the same way as in the dense or blockdiag representation, but it involves more operations. Otherwise the coefficients corresponding to the bias are mixed between coefficients of the weight matrix

update_diag(examples)

Will update the diagonal in the KFE (aka the approximate eigenvalues) using current values of the model’s parameters

vTMv(vector)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

vector (object.vector.PVector) – vector \(v\)

class nngeometry.object.pspace.PMatEKFACBlockDiag(layer_collection, generator, data=None, examples=None, **kwargs)

A mixed representation where layers that support EKFAC use EKFAC, and other layers use a block-diagonal matrix

class nngeometry.object.pspace.PMatEye(layer_collection, scaling=tensor(1.), **kwargs)

get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

vTMv(v)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

v (object.vector.PVector) – vector \(v\)

class nngeometry.object.pspace.PMatImplicit(layer_collection, generator, data=None, examples=None, **kwargs)

PMatImplicit is a very special representation: no element of the matrix is ever computed. Instead, various linear algebra operations are performed implicitly using efficient tricks.

The computations are done exactly, meaning that there is no approximation involved. This is useful for networks too big to fit in memory.
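One way to see how an exact result can be obtained without forming the matrix: if the matrix has the form \(M = \frac{1}{n} J^\top J\) for an \(n \times d\) Jacobian \(J\), then \(v^\top M v = \frac{1}{n} \lVert J v \rVert^2\), which needs only a Jacobian-vector product. The sketch below checks this identity in plain Python. The form of M is an assumption made for the example, and nothing here is claimed to be the trick NNGeometry actually uses.

```python
def quadratic_form_implicit(J, v):
    """Compute v^T (J^T J / n) v as ||J v||^2 / n, without building J^T J."""
    n = len(J)
    Jv = [sum(j_k * v_k for j_k, v_k in zip(row, v)) for row in J]
    return sum(y * y for y in Jv) / n


def quadratic_form_explicit(J, v):
    """Same quantity, computed by first building the d x d matrix J^T J / n."""
    n, d = len(J), len(v)
    M = [[sum(J[k][i] * J[k][j] for k in range(n)) / n for j in range(d)]
         for i in range(d)]
    return sum(v[i] * M[i][j] * v[j] for i in range(d) for j in range(d))


J = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0]]
v = [1.0, 2.0, -1.0]

q_implicit = quadratic_form_implicit(J, v)
q_explicit = quadratic_form_explicit(J, v)
```

The implicit version stores \(n\) numbers for \(Jv\) instead of \(d^2\) numbers for M, which is what makes it attractive when \(d\) is very large.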

get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

mmap(pfmap)

Performs multiplication between PMat and PFMap

Parameters:

pfmap (object.map.PFMap) – the PFMap to multiply

vTMv(v)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

v (object.vector.PVector) – vector \(v\)

class nngeometry.object.pspace.PMatKFAC(layer_collection, generator, data=None, examples=None, **kwargs)

get_diag(split_weight_bias=True)

Parameters:
  • split_weight_bias (bool) – if True then the parameters are ordered in the same way as in the dense or blockdiag representation, but it involves more operations. Otherwise the coefficients corresponding to the bias are mixed between coefficients of the weight matrix

mm(other)

Matrix-matrix product where other is another instance of PMatKFAC

Parameters:

other (nngeometry.object.PMatKFAC) – Other FIM matrix

Returns:

The matrix-matrix product

Return type:

nngeometry.object.PMatKFAC

to_torch(split_weight_bias=True)

Parameters:
  • split_weight_bias (bool) – if True then the parameters are ordered in the same way as in the dense or blockdiag representation, but it involves more operations. Otherwise the coefficients corresponding to the bias are mixed between coefficients of the weight matrix

vTMv(vector)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

vector (object.vector.PVector) – vector \(v\)
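A Kronecker-factored block can evaluate the quadratic form without ever building the Kronecker product. The sketch below assumes a single layer whose block is \(A \otimes G\) and a column-stacking \(\mathrm{vec}\); under that convention \(\mathrm{vec}(V)^\top (A \otimes G)\, \mathrm{vec}(V) = \mathrm{tr}(V^\top G V A^\top)\). The ordering convention and all function names are assumptions chosen for the example and may not match NNGeometry's internal layout.

```python
def matmul(X, Y):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def kron(A, G):
    """Kronecker product of two square matrices A and G."""
    p = len(G)
    n = len(A) * p
    return [[A[r // p][c // p] * G[r % p][c % p] for c in range(n)] for r in range(n)]


def vec(V):
    """Stack the columns of V into one flat list (column-major order)."""
    return [V[k][j] for j in range(len(V[0])) for k in range(len(V))]


def quad_explicit(A, G, V):
    """vec(V)^T (A kron G) vec(V), building the full Kronecker product."""
    K, x = kron(A, G), vec(V)
    return sum(x[r] * K[r][c] * x[c] for r in range(len(x)) for c in range(len(x)))


def quad_factored(A, G, V):
    """Same value as trace(V^T G V A^T), using only the small factors."""
    T = matmul(matmul(transpose(V), matmul(G, V)), transpose(A))
    return sum(T[i][i] for i in range(len(T)))


A = [[2, 0], [0, 1]]   # input-side factor (2 inputs)
G = [[1, 1], [1, 2]]   # output-side factor (2 outputs)
V = [[1, 0], [2, 1]]   # rows index outputs, columns index inputs

q_explicit = quad_explicit(A, G, V)
q_factored = quad_factored(A, G, V)
```

Both functions return the same number, but the factored version only multiplies matrices whose sizes are the layer's input and output dimensions.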

class nngeometry.object.pspace.PMatLowRank(layer_collection, generator, data=None, examples=None, **kwargs)

get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

vTMv(v)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

v (object.vector.PVector) – vector \(v\)

class nngeometry.object.pspace.PMatMixed(layer_collection, generator, layer_collection_each, layer_map, sub_pmats)

get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

mapTMmap(pfmap, reduction='sum')

Performs batched vTMv for each PVector v in a PFMap

Parameters:
  • pfmap (object.map.PFMap) – the PFMap to multiply

  • reduction (str) – one of [“sum”, “diag”]

vTMv(v)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

v (object.vector.PVector) – vector \(v\)

class nngeometry.object.pspace.PMatQuasiDiag(layer_collection, generator, data=None, examples=None, **kwargs)

Quasidiagonal approximation as described in Ollivier, Riemannian metrics for neural networks I: feedforward networks, Information and Inference: A Journal of the IMA, 2015

get_diag()

Computes and returns the diagonal elements of this matrix.

Returns:

a PyTorch Tensor

vTMv(vs)

Computes the quadratic form defined by M in v, namely the product \(v^\top M v\)

Parameters:

vs (object.vector.PVector) – vector \(v\)

nngeometry.object.pspace.bdot(A, B)

Batched dot product
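The description above is terse, so here is one plausible reading, sketched in plain Python: A and B hold one vector per row, and the result holds one dot product per pair of matching rows. This interpretation is an assumption, and bdot_rows is a hypothetical stand-in rather than the library function.

```python
def bdot_rows(A, B):
    """Return [dot(A[i], B[i]) for each i] for two equally shaped batches."""
    return [sum(a * b for a, b in zip(row_a, row_b)) for row_a, row_b in zip(A, B)]


A = [[1.0, 2.0],
     [3.0, 4.0]]
B = [[5.0, 6.0],
     [7.0, 8.0]]
dots = bdot_rows(A, B)
```

Here the first entry is 1*5 + 2*6 and the second is 3*7 + 4*8.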