Louvain Community Detection

Overview

Louvain algorithm detects communities by optimizing modularity — the fraction of edges within communities minus expected fraction if edges were random. A greedy, hierarchical algorithm that runs in O(n log n) for sparse graphs. Produces a hierarchy of communities at multiple resolutions.

When to Use

Trigger conditions:

Discovering natural groupings in social, organizational, or interaction networks
Segmenting users/customers by behavioral similarity
Analyzing modular structure of complex networks

When NOT to use:

For overlapping communities (use DEMON or BigCLAM)
When communities are pre-defined and you're classifying nodes (use label propagation)

Algorithm

IRON LAW: Modularity Has a RESOLUTION LIMIT
Louvain optimizes modularity, which has a known resolution limit
(Fortunato & Barthélemy, 2007): it cannot detect communities smaller
than √(2E) where E = total edges. In large networks, small but real
communities may be merged. Use multi-resolution methods or Leiden
algorithm (improved Louvain) for better results.

Phase 1: Input Validation

Build undirected weighted graph from interaction data. Edge weights represent interaction strength (frequency, duration, volume). Gate: Graph loaded, no isolated nodes (or decide how to handle them).

Phase 2: Core Algorithm

Phase 1 — Local moves:

Assign each node to its own community
For each node, compute modularity gain of moving to each neighbor's community
Move node to community with maximum positive gain
Repeat until no beneficial moves remain

Phase 2 — Aggregation: 5. Build new graph where nodes = communities, edges = sum of inter-community edges 6. Repeat Phase 1 on the aggregated graph 7. Continue until modularity stops improving

Phase 3: Verification

Check: modularity Q > 0 (non-trivial partitioning), community sizes are reasonable (not one giant + many singletons), manual inspection of sample communities. Gate: Modularity positive, community sizes follow power-law-like distribution.

Phase 4: Output

Return community assignments with modularity score.

Output Format

{
  "communities": [{"id": 0, "size": 45, "top_members": ["Alice", "Bob"], "internal_density": 0.35}],
  "summary": {"num_communities": 12, "modularity": 0.65, "largest": 120, "smallest": 5},
  "metadata": {"algorithm": "louvain", "nodes": 500, "edges": 2000}
}

Examples

Sample I/O

Input: Email network of 200 employees, weighted by email frequency Expected: Communities roughly corresponding to departments/teams, modularity ~0.5-0.7.

Edge Cases

Input	Expected	Why
Complete graph	One community or random split	No modular structure
Disconnected components	Each component = community	Natural separation
Weighted vs unweighted	Different communities	Weights change modularity calculation

Gotchas

Non-deterministic: Node processing order affects results. Run multiple times and select the partition with highest modularity, or use Leiden algorithm (more stable).
Resolution parameter: Standard Louvain uses γ=1 in modularity. Varying γ reveals communities at different scales. γ>1 finds smaller communities; γ<1 finds larger ones.
Leiden > Louvain: Louvain can produce badly connected communities (communities where removing one node disconnects them). Leiden algorithm fixes this guarantee.
Temporal stability: In dynamic networks, community assignments can change drastically between snapshots even when the network changes minimally. Use temporal smoothing.
Interpretation: Community detection finds structure, but interpreting WHY nodes cluster requires domain knowledge. Don't over-interpret automatically detected communities.

References

For Leiden algorithm (improved Louvain), see references/leiden.md
For multi-resolution community detection, see references/multi-resolution.md

algo-net-community