Approches profondes pour la détection de communautés dans le Big Data (Deep Approaches for Community Detection in Big Data)

BEKKAIR, Abdelfateh; BELLAOUAR, Slimane (supervisor); OULED-NAOUI, Slimane (co-supervisor)

URI: https://dspace.univ-ghardaia.edu.dz/xmlui/handle/123456789/10411

Date: 2025-12-01

Abstract:

Network analysis is an indispensable tool for deciphering the intricate relationships and organizational principles inherent in complex systems. Within this field, this thesis addresses the fundamental challenge of community detection in complex networks, focusing on attributed graphs where both topological structure and node features are available. Effectively integrating this rich information to identify cohesive communities remains a critical task in understanding complex systems. This work contributes to the field by offering both a systematic understanding of existing methodologies and novel algorithmic contributions. We first establish a comprehensive taxonomy that categorizes community detection techniques across classical, traditional machine learning, and deep learning paradigms, providing a structured state-of-the-art review. Building on this, rigorous comparative studies of prominent graph neural network models, including CNN-based and Graph Autoencoder (GAE)-based approaches, identify current performance limitations. Our methodological contributions include AA-LPA, a classical heuristic algorithm that enhances the Label Propagation Algorithm (LPA). By leveraging the Adamic-Adar index and incorporating deterministic mechanisms for node prioritization and tie resolution, AA-LPA significantly improves stability and robustness against randomness in non-attributed graphs. The primary contribution is G2ACO, a novel deep graph autoencoder for attributed networks. G2ACO integrates a K-means clustering objective with reconstruction, with the aim of maximizing inter-community distances and minimizing intra-community distances. A key innovation is its optimization strategy, which decouples K-means centroid updates from gradient propagation to handle non-differentiability, ensuring stable training and robust performance in learning clustering-oriented node representations via multi-head attention.
Empirical evaluations on various benchmark datasets demonstrate that our methods provide effective solutions. AA-LPA is tested on classical social network datasets, where it significantly enhances stability compared to traditional LPA. G2ACO is rigorously evaluated on both citation network and social network datasets, consistently outperforming state-of-the-art baselines. This thesis contributes valuable insights and tools for network analysis, providing a comprehensive solution for community detection in complex and attributed graphs.
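
To make the idea behind AA-LPA more concrete, the following is a minimal sketch of a deterministic label propagation pass weighted by the Adamic-Adar index. It is not the thesis implementation: the abstract does not specify the exact rules, so the function names, the edge weight `1 + AA(u, v)`, the degree-based node ordering, and the smallest-label tie-break are all illustrative assumptions.

```python
import math
from collections import defaultdict


def adamic_adar(adj, u, v):
    """Adamic-Adar index: sum of 1 / log(deg(w)) over common neighbours w."""
    return sum(
        1.0 / math.log(len(adj[w]))
        for w in adj[u] & adj[v]
        if len(adj[w]) > 1  # guard against log(1) = 0
    )


def aa_label_propagation(edges, max_iter=100):
    """Deterministic label propagation on an undirected, non-attributed graph.

    Assumptions (not taken from the thesis): each neighbour votes with
    weight 1 + Adamic-Adar(u, v); nodes are visited by decreasing degree,
    then by node id; score ties are resolved by the smallest label.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Every node starts in its own community.
    labels = {node: node for node in adj}

    # Fixed visiting order replaces the random shuffling of classical LPA.
    order = sorted(adj, key=lambda n: (-len(adj[n]), n))

    for _ in range(max_iter):
        changed = False
        for u in order:
            scores = defaultdict(float)
            for v in adj[u]:
                scores[labels[v]] += 1.0 + adamic_adar(adj, u, v)
            # Highest score wins; equal scores fall back to the smallest label.
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[u]:
                labels[u] = best
                changed = True
        if not changed:
            break
    return labels


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    print(aa_label_propagation(toy_edges))
```

Because the visiting order and the tie-break are fixed, repeated runs on the same edge list return the same partition, which is the kind of stability the abstract attributes to AA-LPA.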