The GRANOLA Project:
Determinantal Point Processes on Graphs for Randomized Numerical Linear Algebra
The project in a few words
Graphs are ubiquitous tools to represent networks, be they networks modeling data from neuroscience, sociology, molecular biology, chemistry, etc. A cornerstone of graph analysis is the Laplacian matrix L, which encodes the graph's structure. From a linear algebra point of view, the analysis of L offers fundamental insights into key properties of the graph: the speed at which information or a disease diffuses over a network, the vulnerability of a network to targeted or random attacks, the redundancy of certain parts of the network, and the network's organization into more or less independent modules are all examples of characteristics one may extract from the Laplacian matrix.
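As a toy illustration (the graph below is hypothetical, chosen only for the example), the combinatorial Laplacian L = D − A can be built in a few lines of NumPy, and its zero eigenvalues already reveal one structural property: their multiplicity equals the number of connected components.

```python
import numpy as np

# Toy undirected graph on 4 nodes: a triangle (0, 1, 2) plus a pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # combinatorial Laplacian

# L is symmetric positive semi-definite, and its rows sum to zero.
# The multiplicity of the eigenvalue 0 counts the connected components.
eigvals = np.linalg.eigvalsh(L)
print(np.isclose(eigvals[0], 0.0))  # True: the graph is connected
```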
Unfortunately, from a computational standpoint, this analysis is often too expensive in practice. For instance, computing the pseudo-inverse or the SVD of L requires worst-case cubic time in the size of the graph n, and rapidly becomes burdensome as n increases. Among the strategies developed in the state of the art to overcome this challenge, the class of stochastic methods called RandNLA (Randomized Numerical Linear Algebra) has attracted great interest over the last decade. The gist of these approaches is to compute an approximation of the solution in three steps: randomly sample and/or project the matrix L, solve the algebra problem in this reduced dimension, and extrapolate the solution back to the original dimension. Beyond its simplicity, a key reason for this approach's success is that, provided a few conditions are met, the approximation error is very well controlled theoretically. In practice, this means that by allowing a small error on the result, RandNLA methods can reduce computation costs by several orders of magnitude in some cases.
A paradigm on which the first step of RandNLA methods relies is iid (independent and identically distributed) sampling, which makes it possible to take advantage of powerful concentration theorems and thus obtain precise theoretical control over the approximation error. However, iid sampling is not optimal, in the sense that it may produce very redundant samples: an ideal sampling scheme should instead incorporate some form of repulsion, in order to obtain samples that are as representative as possible of the matrix at hand. Determinantal point processes (DPPs) are such a class of repulsive processes, with the added benefit of being tractable (for instance, the marginal probabilities at any order are known in closed form), which makes them natural candidates to improve the current performance of RandNLA algorithms.
Sampling from these DPPs, however, is necessarily more expensive than iid sampling: in general, it requires cubic time in the size of the set to sample from. Thankfully, some specific DPPs are much more efficient to sample. This is the case, for instance, of DPPs defined over graphs, such as random spanning forests, which sample edges and nodes of a graph in time quasi-linear in the number of edges. This type of acceleration is regularly observed in linear algebra problems involving graphs: a classical example is the linear system Mx = b, whose worst-case complexity is cubic in the size of M, but for which a good approximation may be obtained in time quasi-linear in the number of non-zero entries of M, provided that M is the Laplacian matrix of a graph.
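The random-spanning-forest sampler mentioned above can be sketched with Wilson's algorithm: loop-erased random walks that are absorbed into a root with probability q/(q + deg(i)) at each step. The sketch below is a minimal unweighted-graph version under that assumption; the function name and interface are made up for illustration.

```python
import numpy as np

def random_spanning_forest(adj, q, rng):
    """Sample a q-rooted random spanning forest via Wilson's algorithm.

    adj: list of neighbor lists of an unweighted undirected graph.
    q:   absorption parameter (> 0).
    Returns (next_node, roots): next_node[i] is the parent of i in the
    forest, or None if i is a root.
    """
    n = len(adj)
    in_forest = [False] * n
    next_node = [None] * n
    roots = set()
    for start in range(n):
        # Random walk from `start` until it hits the current forest;
        # overwriting next_node on revisits performs the loop erasure.
        u = start
        while not in_forest[u]:
            if rng.random() < q / (q + len(adj[u])):
                next_node[u] = None          # absorbed: u becomes a root
                in_forest[u] = True
                roots.add(u)
            else:
                next_node[u] = adj[u][rng.integers(len(adj[u]))]
                u = next_node[u]
        # Retrace from `start`, freezing the loop-erased path.
        u = start
        while not in_forest[u]:
            in_forest[u] = True
            u = next_node[u]
    return next_node, roots

# Usage on a toy 4-cycle:
adj = [[1, 3], [0, 2], [1, 3], [2, 0]]
nxt, roots = random_spanning_forest(adj, q=1.0, rng=np.random.default_rng(0))
```

The expected runtime is governed by random-walk hitting times, which is what makes this sampler fast in practice, in line with the quasi-linear behavior described above.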
In a nutshell, this project aims at developing non-iid RandNLA algorithms specifically designed for Laplacian matrices, taking advantage of the tractable repulsion enabled by DPPs and the efficiency of graph-based algorithms. The originality of our proposal is to leverage the representative power of DPPs for RandNLA in the specific yet promising context of Laplacian matrices, and to implement these DPPs with very efficient graph-based sampling algorithms.
Team members. This project is a collaboration between several researchers: Pierre-Olivier Amblard (DR CNRS at Gipsa-lab, Grenoble, France), Luca Avena (assistant professor at Leiden University, Netherlands), Simon Barthelmé (CR CNRS at Gipsa-lab, Grenoble, France), Alexandre Gaudillière (CR CNRS at I2M, Marseille, France), Hugo Jaquard (PhD student 2021-2024, Gipsa-lab, Grenoble, France), Andreas Loukas (research scientist at EPFL, Lausanne, Switzerland), Yusuf Yigit Pilavci (PhD student 2019-2022, Gipsa-lab, Grenoble, France), and myself.
Support. This project is partially supported by the French Agence Nationale de la Recherche (ANR JCJC ANR-21-CE48-0009).
Timeline. This project officially started on the 1st of September 2021, for a period of 3 years, though the collaborations and ideas underlying it stem from several years of prior research activities.
Associated activity
October 2021. Start of Hugo Jaquard's PhD at Gipsa-lab.
Since October 2021. Work on the project!
Associated publications
[6] 
Simon Barthelmé, Nicolas Tremblay, Pierre-Olivier Amblard.
A Faster Sampler for Discrete Determinantal Point Processes.
In AISTATS, 2023. [PDF] [Code]

[5] 
Hugo Jaquard, Michaël Fanuel, Pierre-Olivier Amblard, Rémi Bardenet, Simon Barthelmé, Nicolas Tremblay.
Smoothing Complex-Valued Signals on Graphs with Monte-Carlo.
In ICASSP, 2023. [PDF] [Code]

[4] 
Nicolas Tremblay, Simon Barthelmé, Konstantin Usevich, Pierre-Olivier Amblard.
Extended L-ensembles: a new representation for Determinantal Point Processes.
In Annals of Applied Probability, 2023. [PDF]

[3] 
Simon Barthelmé, Nicolas Tremblay, Konstantin Usevich, Pierre-Olivier Amblard.
Determinantal Point Processes in the Flat Limit.
In Bernoulli, 2023. [PDF]

[2] 
Yusuf Y. Pilavci, Pierre-Olivier Amblard, Simon Barthelmé, Nicolas Tremblay.
Variance Reduction for Inverse Trace Estimation via Random Spanning Forests.
In GRETSI, 2022. [PDF]

[1] 
Yusuf Y. Pilavci, Pierre-Olivier Amblard, Simon Barthelmé, Nicolas Tremblay.
Variance Reduction in Stochastic Methods for Large-scale Regularised Least-square Problems.
In EUSIPCO, 2022. [PDF]

Last update of this page: 31st of March 2023.