RMT4GRAPH aims to develop an original framework for the theoretical analysis of large dimensional random graphs by means of advanced random matrix methods. This analysis in turn enables the performance study and the improvement of a wide range of signal processing and machine learning methods currently encountered in the big data paradigm.
Together with Hafiz Tiomoko Ali, whose PhD is oriented towards the theoretical analysis of large dimensional random graphs, we have devised a random matrix framework for the study of the leading eigenvectors in (degree-corrected) stochastic block models, so far for dense networks. This constitutes a first step towards understanding the performance of commonly used community detection schemes based on spectral methods.
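The role played by the leading eigenvectors in community detection can be illustrated with a minimal sketch. It assumes a plain two-class dense stochastic block model without degree correction; the function names, the graph size and the connection probabilities (0.5 within a community, 0.3 across) are illustrative choices and are not taken from the project's publications.

```python
import numpy as np

def sample_sbm(n, p_in, p_out, rng):
    """Sample a symmetric adjacency matrix from a two-class dense SBM (assumed model)."""
    labels = np.repeat([0, 1], n // 2)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Draw the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels

def spectral_communities(adj):
    """Split nodes by the sign of the eigenvector of the second largest eigenvalue."""
    _, eigvecs = np.linalg.eigh(adj)  # eigenvalues are returned in ascending order
    second = eigvecs[:, -2]
    return (second > 0).astype(int)

def accuracy_up_to_swap(pred, labels):
    """Two-class accuracy, invariant to swapping the two community names."""
    acc = np.mean(pred == labels)
    return max(acc, 1.0 - acc)

rng = np.random.default_rng(0)
adj, labels = sample_sbm(400, p_in=0.5, p_out=0.3, rng=rng)
pred = spectral_communities(adj)
acc = accuracy_up_to_swap(pred, labels)
print(f"fraction of correctly classified nodes: {acc:.3f}")
```

In this toy setting the dominant eigenvector mostly reflects the average degree, while the second one carries the community structure; heterogeneous degrees, as in the degree-corrected model, would call for a normalization of the adjacency matrix that this sketch does not attempt.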
Together with Florent Benaych-Georges for the most technical part of the work and PhD student Evgeny Kusmenko, we have studied the performance achieved by kernel spectral clustering methods on large dimensional mixtures of Gaussian input vectors. The close relationship between the choice of the kernel function and the discriminability of the data through their means and covariances was brought to light. These works have laid the basis for further investigations. On spectral clustering, a novel subspace clustering method for zero-mean data was proposed (collaborative work with Abla Kammoun), with application to user clustering in 5G massive MIMO wireless communications. Investigations are also ongoing on the performance analysis and improvement of standard machine learning methods for large dimensional data: semi-supervised learning (internship work of Xiaoyi Mai) and SVM (internship work of Zhenyu Liao), both based on kernels; key findings on induced biases and corresponding improvements were made. A close match between theory and practice was demonstrated.
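As an illustration of the kernel spectral clustering setting, here is a minimal sketch under stated assumptions: a two-class Gaussian mixture with identity covariances and opposite means of norm 3, a Gaussian kernel applied to squared distances scaled by the dimension, and a symmetric degree normalization of the kernel matrix. All names and numerical values are hypothetical choices made for the example, not the exact setup analyzed in the project.

```python
import numpy as np

def gaussian_mixture(n, p, mean_norm, rng):
    """Two balanced classes N(+mu, I) and N(-mu, I) in dimension p (assumed model)."""
    labels = np.repeat([0, 1], n // 2)
    mu = np.zeros(p)
    mu[0] = mean_norm
    signs = np.where(labels == 0, 1.0, -1.0)
    x = rng.standard_normal((n, p)) + signs[:, None] * mu
    return x, labels

def kernel_spectral_clustering(x):
    """Cluster by the sign of the second leading eigenvector of D^{-1/2} K D^{-1/2}."""
    n, p = x.shape
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    # Kernel f(t) = exp(-t / 2) evaluated at t = ||x_i - x_j||^2 / p.
    kernel = np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * p))
    deg = kernel.sum(axis=1)
    normalized = kernel / np.sqrt(np.outer(deg, deg))
    _, eigvecs = np.linalg.eigh(normalized)  # ascending eigenvalues
    return (eigvecs[:, -2] > 0).astype(int)

rng = np.random.default_rng(1)
x, labels = gaussian_mixture(n=400, p=100, mean_norm=3.0, rng=rng)
pred = kernel_spectral_clustering(x)
raw = np.mean(pred == labels)
acc = max(raw, 1.0 - raw)  # cluster names are arbitrary
print(f"clustering accuracy: {acc:.3f}")
```

Replacing the exponential by another kernel function, or the difference in means by a difference in covariances, is a simple way to observe empirically how the kernel choice interacts with what makes the classes distinguishable.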
In the course of a short (6-month) internship, Harry Sevi (now a PhD student working on signal processing on graphs) derived the fundamental equations governing the performance of a linear echo-state neural network. This work resulted in first publications (at ICML and SSP, along with a submission to JMLR). During a second 6-month internship, Cosme Louart (student from ENS Paris) undertook the difficult study of non-linear neural networks, and most particularly of one-layer extreme learning machines (a simple case serving as a preliminary study). Unexpectedly simple results were obtained, which provide simple expressions for the performance of networks with large layers trained on large and numerous data.
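For readers unfamiliar with extreme learning machines, the following minimal sketch shows the kind of network in question: a single hidden layer with random, untrained weights followed by a ridge-regression output layer. The ReLU activation, the synthetic regression task, the layer size and the regularization value are assumptions made for illustration only; the function names are hypothetical.

```python
import numpy as np

def elm_fit(x_train, y_train, n_hidden, ridge, rng):
    """Draw random hidden weights, then fit only the output layer by ridge regression."""
    n_samples, p = x_train.shape
    weights = rng.standard_normal((p, n_hidden)) / np.sqrt(p)  # fixed, never trained
    hidden = np.maximum(x_train @ weights, 0.0)                # ReLU features (assumed)
    gram = hidden.T @ hidden / n_samples + ridge * np.eye(n_hidden)
    beta = np.linalg.solve(gram, hidden.T @ y_train / n_samples)
    return weights, beta

def elm_predict(x, weights, beta):
    """Apply the random hidden layer and the learned output weights."""
    return np.maximum(x @ weights, 0.0) @ beta

rng = np.random.default_rng(2)
p = 10
direction = np.ones(p) / np.sqrt(p)
x_train = rng.standard_normal((1000, p))
x_test = rng.standard_normal((500, p))
y_train = np.sin(x_train @ direction)  # synthetic target, chosen for illustration
y_test = np.sin(x_test @ direction)

weights, beta = elm_fit(x_train, y_train, n_hidden=300, ridge=1e-2, rng=rng)
test_mse = np.mean((elm_predict(x_test, weights, beta) - y_test) ** 2)
baseline_mse = np.var(y_test)  # error of the best constant predictor on the test set
print(f"test MSE: {test_mse:.4f}, constant-predictor MSE: {baseline_mse:.4f}")
```

Since only the output weights are learned, the training and test errors depend on the data through the Gram matrix of the random features, which is the type of object that random matrix methods are suited to analyze when both the layer size and the number of samples are large.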
During a 6-month stay, Liusha Yang (PhD student at HKUST) derived a novel spiked random matrix based method for portfolio optimization in finance. Together with Matthew McKay (professor at HKUST), a (not too) sparse PCA method based on random matrix theory was also devised. The method is restricted to data whose sparsity is block-wise structured, but makes it possible to identify in practice the correct number of blocks to be selected.
In this two-day joint event co-organized by the ANR DIONISOS and the ANR RMT4GRAPH, lectures on advances of random matrix theory in signal processing and machine learning were offered to a large audience, mostly composed of researchers in France. The summer school consisted of four 3-hour courses as follows: (i) Jamal Najim: Introduction to large random matrix theory, (ii) Philippe Loubaton: Large random matrices for array processing, (iii) Abla Kammoun: Robust estimation in large systems, (iv) Romain Couillet: Random matrices and machine learning.
A special session at the 2016 Statistical Signal Processing Workshop (SSP'16), Palma de Mallorca (Spain), was organized by the PI within the scope of the ANR RMT4GRAPH. The session was composed of 7 poster presentations (ranging from theoretical random matrix theory to applications to machine learning and array processing).
The early results of the ANR-RMT4GRAPH project were presented to a large audience at the European Signal Processing Conference (EUSIPCO) in September 2016, as part of the STATOS Thematic Workshop on Machine Learning and BigData. The PI was invited there as a distinguished keynote speaker.
A special issue on random matrices and their applications to signal processing and machine learning was edited by the PI. The special issue contains 6 articles ranging from an introduction to the basic notions of random matrices to advanced applications in robust statistics and machine learning.
As a follow-up to some of the key publications above, several invited talks were given by the PI and co-authors (at ENS Paris twice, at ENS Lyon twice, at the University of Orsay, etc.). The results of RMT4GRAPH were also disseminated through talks given at various GdR and local meetings.