- 2018-2021: **RMT4TL** Project funding a PhD student (Malik TIOMOKO) on random matrices for transfer learning.
- 2018-2021: **PhyML** Project funding a PhD student (Lorenzo DALL'AMICO) on statistical physics tools for machine learning.
- 2018-2021: **CEA-ConcentratedNN** Project funding a PhD student (Cosme LOUART) on concentration of measure theory for neural networks.
- 2017-2020: **CEA-RMT4NN** Project funding a PhD student (Mohamed SEDDIK) on random matrix methods for kernel and neural networks analysis.
- 2016-2019: **RMTinML** Project funding a PhD student (Xiaoyi MAI) on the development of random matrix methods for supervised and semi-supervised learning methods.
- 2016-2019: **DeepRMT** Project funding a PhD student (Zhenyu LIAO) on random matrices and neural networks.
- 2013-2017: **ERC MORE** [Link] ERC fund hosted by Mérouane Debbah with Mérouane Debbah and myself as co-leaders.
- 2013-2016: **ANR DIONISOS** [Link] ANR project on array processing applications in the random matrix regime.
- 2015-2016: **HUAWEI RMTin5G** Bilateral project on the development of machine learning methods for 5G, based on recent advances in random matrix theory.
- 2015-2018: **ANR-RMT4GRAPH** RMT4GRAPH aims at developing an original framework for the theoretical analysis of large dimensional random graphs, by means of advanced random matrix methods. This analysis allows in turn for the performance study as well as the improvement of a wide scope of signal processing and machine learning methods currently met within the big data paradigm.

- Documents
  - Presentation of the research program [slides]
  - Mid-project presentation [slides]
  - Deliverable at t0 + 1 year [document]
- Important dates
  - Beginning of the project: February 2nd, 2015.
  - End of the project: before September 30th, 2018.
  - PhD student: from September 1st, 2015 to August 31st, 2018.
- Taskforce
  - Principal Investigator
  - Students
    - Hafiz Tiomoko Ali (PhD student under RMT4GRAPH grant, Sep. 2015-2018), on community detection and neural networks.
    - Xiaoyi Mai (intern, under ERC-MORE grant, 2016), on large dimensional semi-supervised learning performance.
    - Zhenyu Liao (intern, under ERC-MORE grant, 2016), on large dimensional support vector machines performance.
    - Cosme Louart (intern, under ERC-MORE grant, 2016), on neural networks and random matrices (extreme learning machines).
    - Evgeny Kusmenko (PhD student under ERC-MORE grant, Jan. 2015-Dec. 2015), on spectral clustering methods.
    - Liusha Yang (PhD student, visiting from HKUST, 2015), on financial applications of robust estimation.
    - Aymeric Thibault (intern, under ERC-MORE grant, 2015), on eigenvectors of sample covariance matrices, applied to clustering.
    - Harry Sevi (intern, under ERC-MORE grant, 2015), on recurrent (echo-state) neural networks.
  - Collaborators
    - Florent Benaych-Georges (professor at Université Paris Descartes), on kernel random matrices.
    - Gilles Wainrib (assistant professor at ENS Paris), on neural networks.
    - Matthew R. McKay (professor at Hong Kong UST), on sparse PCA and applied robust estimation.
    - Abla Kammoun (research scientist at KAUST), on subspace clustering and robust estimation.
- Contributions
  - Graphs
  - Kernel methods: spectral clustering, semi-supervised learning and SVM
  - Neural networks
  - Miscellanea: applied spiked models, sparse PCA, etc.
- List of Publications
  - R. Couillet, F. Benaych-Georges, **"Kernel Spectral Clustering of Large Dimensional Data"**, Electronic Journal of Statistics, vol. 10, no. 1, pp. 1393-1454, 2016. [preprint]
  - F. Benaych-Georges, R. Couillet, **"Spectral Analysis of the Gram Matrix of Mixture Models"** (in press), ESAIM: Probability and Statistics, 2016. [preprint]
  - R. Couillet, G. Wainrib, H. Sevi, H. Tiomoko Ali, **"The asymptotic performance of linear echo state neural networks"**, Journal of Machine Learning Research, vol. 17, no. 178, pp. 1-35, 2016. [preprint]
  - H. Tiomoko Ali, R. Couillet, **"Random Matrix Improved Community Detection in Heterogeneous Networks"**, Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 2016. [article]
  - R. Couillet, A. Kammoun, **"Random Matrix Improved Subspace Clustering"**, Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 2016. [article]
  - H. Tiomoko Ali, R. Couillet, **"Performance analysis of spectral community detection in realistic graph models"**, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'16), Shanghai, China, 2016. [article]
  - R. Couillet, F. Benaych-Georges, **"Understanding Big Data Spectral Clustering"**, IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP'15), Cancun, Mexico, 2015. [article]
  - L. Yang, R. Couillet, M. R. McKay, **"Minimum Variance Portfolio Optimization in the Spiked Covariance Model"**, IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Cancun, Mexico, 2015. [article]
  - R. Couillet, G. Wainrib, H. Sevi, H. Tiomoko Ali, **"A Random Matrix Approach to Recurrent Neural Networks"**, International Conference on Machine Learning (ICML), New York, USA, 2016. [article]
  - R. Couillet, G. Wainrib, H. Sevi, H. Tiomoko Ali, **"Training performance of echo state neural networks"**, IEEE Statistical Signal Processing Workshop (SSP), Palma de Majorca, Spain, 2016. [article]
  - E. Kusmenko, R. Couillet, **"Spectral clustering for high dimensional mixture models"**, Report. [article]
- Dissemination
  - Summer School on **"Large Random Matrices and High Dimensional Statistical Signal Processing"** (Telecom ParisTech, June 7-8, 2016) [Link]
  - SSP'16 Special Session **"Random matrices in signal processing and machine learning"** [Link]
  - Distinguished keynote speaker at EUSIPCO 2016
  - Special Issue on Random Matrices in "Revue du Traitement du Signal"
  - Invited talks and contributions to local events

Along with Hafiz Tiomoko Ali, whose PhD is oriented towards the theoretical analysis of large dimensional random graphs, we have devised a random matrix framework for the study of the leading eigenvectors in (degree-corrected) stochastic block models, so far restricted to dense networks. This is a first step towards understanding the performance of the spectral methods commonly used for community detection.
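As a rough illustration of the kind of spectral community detection scheme discussed above, the following Python sketch samples a dense two-community stochastic block model and splits the nodes by the sign of a leading eigenvector. It is a minimal sketch under stated assumptions, not the framework developed in the project: the plain (not degree-corrected) block model, the use of the modularity matrix, the function names and the values of `n`, `p_in` and `p_out` are all illustrative choices.

```python
import numpy as np

def sample_dense_sbm(n, p_in, p_out, rng):
    """Sample a symmetric adjacency matrix with two equal-size communities.

    Assumed toy model: edges appear independently with probability p_in
    inside a community and p_out across communities (n must be even).
    """
    labels = np.repeat([0, 1], n // 2)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Draw the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels

def spectral_communities(adj):
    """Split nodes by the sign of the leading eigenvector of the modularity matrix."""
    deg = adj.sum(axis=1)
    modularity = adj - np.outer(deg, deg) / deg.sum()
    _, eigvecs = np.linalg.eigh(modularity)
    leading = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order
    return (leading > 0).astype(int)

rng = np.random.default_rng(0)
adj, labels = sample_dense_sbm(400, 0.30, 0.10, rng)
pred = spectral_communities(adj)
# Communities are only defined up to a relabelling of the two groups.
accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
print(f"fraction of correctly classified nodes: {accuracy:.3f}")
```

In this dense setting the community structure produces an eigenvalue well separated from the rest of the spectrum, which is why a single eigenvector is enough for the toy example.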

Along with Florent Benaych-Georges for the most technical part of the work and PhD student Evgeny Kusmenko, we have studied the performance achieved by kernel spectral clustering methods on large dimensional mixtures of Gaussian input vectors. The close relationship between the choice of the kernel function and the discriminability of the data through their means and covariances was brought to light. These works have laid the basis for further investigations. On spectral clustering, a novel subspace clustering method for zero-mean data was proposed (collaborative work with Abla Kammoun), with application to user clustering in 5G massive MIMO wireless communications. Investigations are also ongoing into the performance analysis and improvement of standard machine learning methods for large dimensional data: semi-supervised learning (internship work of Xiaoyi Mai) and SVM (internship work of Zhenyu Liao), both based on kernels; key findings on induced biases and corresponding improvements were made. A close match between theory and practice was demonstrated.
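To make the setting concrete, here is a minimal Python sketch of kernel spectral clustering on a two-class Gaussian mixture in a large dimensional regime (dimension `p` comparable to the number of samples `n`). Everything specific in it is an assumption made for illustration: the Gaussian kernel with distances scaled by `p`, the centering of the kernel matrix, the classes differing only through their means, and the chosen sizes. It is not the exact procedure analyzed in the publications listed on this page.

```python
import numpy as np

def gaussian_kernel_matrix(data):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 p)) for data of shape (n, p)."""
    _, p = data.shape
    sq_norms = np.sum(data**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * data @ data.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * p))

def kernel_spectral_clustering(data):
    """Two-class split from the leading eigenvector of the centered kernel matrix."""
    n = data.shape[0]
    kernel = gaussian_kernel_matrix(data)
    centering = np.eye(n) - np.ones((n, n)) / n
    centered = centering @ kernel @ centering
    _, eigvecs = np.linalg.eigh(centered)
    return (eigvecs[:, -1] > 0).astype(int)

rng = np.random.default_rng(0)
n, p = 400, 100
labels = np.repeat([0, 1], n // 2)
mean = np.zeros(p)
mean[0] = 3.0  # hypothetical class mean; the two classes sit at +mean and -mean
signs = np.where(labels == 0, 1.0, -1.0)
data = rng.standard_normal((n, p)) + signs[:, None] * mean

pred = kernel_spectral_clustering(data)
# Cluster labels are only defined up to a swap of the two classes.
accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
print(f"fraction of correctly clustered points: {accuracy:.3f}")
```

Changing the kernel function or making the classes differ through their covariances rather than their means is a simple way to experiment with the interplay mentioned above.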

In the course of a short (6-month) internship, Harry Sevi (now a PhD student working on signal processing on graphs) derived the fundamental equations governing the performance of a linear echo-state neural network. This work resulted in first publications (at ICML and SSP, along with a submission to JMLR). During a second 6-month internship, Cosme Louart (student from ENS Paris) undertook the difficult study of non-linear neural networks, and most particularly of single-layer extreme learning machines (a simple case taken as a preliminary study). Unexpectedly simple results were obtained, which provide simple expressions for the performance of networks with large layers trained on large and numerous data.
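For readers unfamiliar with extreme learning machines, the sketch below shows the basic construction in Python: a hidden layer with random, untrained weights followed by output weights fitted by ridge regression. It only illustrates the model class, not the performance analysis carried out during the internship; the ReLU activation, the synthetic regression task, the function names and all sizes are assumptions chosen for the example.

```python
import numpy as np

def fit_elm(inputs, targets, n_hidden, ridge, rng):
    """Fit a single-hidden-layer extreme learning machine.

    The hidden weights are drawn at random and kept fixed; only the
    output weights are learned, by ridge regression on the hidden features.
    """
    n_samples, dim = inputs.shape
    hidden_weights = rng.standard_normal((dim, n_hidden)) / np.sqrt(dim)
    features = np.maximum(inputs @ hidden_weights, 0.0)  # ReLU (assumed activation)
    gram = features.T @ features / n_samples + ridge * np.eye(n_hidden)
    output_weights = np.linalg.solve(gram, features.T @ targets / n_samples)
    return hidden_weights, output_weights

def predict_elm(inputs, hidden_weights, output_weights):
    """Apply the fixed random layer, then the learned linear read-out."""
    return np.maximum(inputs @ hidden_weights, 0.0) @ output_weights

rng = np.random.default_rng(0)
dim, n_train, n_test = 20, 500, 500
direction = rng.standard_normal(dim) / np.sqrt(dim)  # hypothetical ground truth
train_inputs = rng.standard_normal((n_train, dim))
test_inputs = rng.standard_normal((n_test, dim))
train_targets = np.tanh(train_inputs @ direction)
test_targets = np.tanh(test_inputs @ direction)

hidden_weights, output_weights = fit_elm(
    train_inputs, train_targets, n_hidden=200, ridge=1e-2, rng=rng
)
train_mse = np.mean(
    (predict_elm(train_inputs, hidden_weights, output_weights) - train_targets) ** 2
)
test_mse = np.mean(
    (predict_elm(test_inputs, hidden_weights, output_weights) - test_targets) ** 2
)
print(f"training MSE: {train_mse:.4f}, test MSE: {test_mse:.4f}")
```

Because only the read-out is trained, fitting reduces to solving one linear system, which is what makes this model a convenient first case study for non-linear networks.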

During a 6-month stay, Liusha Yang (PhD student at HKUST) derived a novel spiked random matrix based method for portfolio optimization in finance. Along with Matthew McKay (professor at HKUST), a (not too) sparse PCA method based on random matrix theory was also devised. The method is restricted to data whose sparsity has a block-wise structure, but in practice it identifies the correct number of blocks to select.

This two-day summer school, co-organized by the ANR DIONISOS and the ANR RMT4GRAPH, offered lectures on advances of random matrix theory in signal processing and machine learning to a large audience, mostly composed of researchers in France. It consisted of four 3-hour courses: (i) Jamal Najim: *Introduction to large random matrix theory*, (ii) Philippe Loubaton: *Large random matrices for array processing*, (iii) Abla Kammoun: *Robust estimation in large systems*, (iv) Romain Couillet: *Random matrices and machine learning*.

A special session at the 2016 Statistical Signal Processing Workshop (SSP'16), Palma de Majorca (Spain), was organized by the PI within the scope of the ANR RMT4GRAPH. The session comprised 7 poster presentations (ranging from theoretical random matrix theory to applications to machine learning and array processing).

The early results of the ANR-RMT4GRAPH project were presented to a large audience at the European Signal Processing Conference (EUSIPCO) in September 2016, as part of the STATOS Thematic Workshop on Machine Learning and BigData. The PI was invited there as a distinguished keynote speaker. [Link]

A special issue on random matrices and their applications to signal processing and machine learning was edited by the PI. The special issue contains 6 articles, ranging from an introduction to the basic notions of random matrices to advanced applications in robust statistics and machine learning.

As a follow-up to some of the key publications above, several invited talks were given by the PI and co-authors (at ENS Paris twice, at ENS Lyon twice, at the University of Orsay, etc.). The results of RMT4GRAPH were also disseminated through talks at various GdR and local meetings.