Concept AI Published 30 May 2026

Structural plasticity: when the network rewrites its own architecture

Creating and removing neurons and connections at runtime, instead of freezing the topology before training.

Most neural networks have a shape fixed in advance: only the weights change. Structural plasticity flips that rule by turning topology into a living variable. A concept shared between the SOAG and AEIF programs.

  • #plasticite-structurelle
  • #topologie
  • #neuroevolution
  • #auto-organisation
  • #elagage
  • #apprentissage-local
  • #sparse-training

A human brain does not merely tune the strength of its connections: it grows new ones, destroys others, and lets neurons be born and die throughout life. Its shape is in motion. A classical artificial neural network, by contrast, receives its architecture once and for all before training and never touches it again. This piece explores what we gain, and what we still fail to understand, when we make the shape of the network itself something that learns.

Two plasticities not to confuse

The word plasticity covers two very different mechanisms, which must be kept apart to avoid going in circles.

  • Synaptic plasticity: the strength (the weight) of an already existing connection changes. This is what training with backpropagation does: it adjusts numbers on a frozen graph.
  • Structural plasticity: the graph itself changes. Neurons and connections are created and removed. Topology stops being scenery and becomes a variable.

The first answers the question “how much”. The second answers “who talks to whom”. This whole piece is about the second.
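The distinction can be made concrete with a toy data structure. This is a minimal sketch, assuming a network stored as a dictionary from (source, target) pairs to weights; the representation and the variable names are illustrative choices, not something taken from a particular library.

```python
# Toy network: a dict mapping (source, target) pairs to weights.
# The keys are the topology, the values are the synaptic strengths.
weights = {(0, 1): 0.5, (1, 2): -0.3}
topology_before = set(weights)

# Synaptic plasticity: a weight changes, the set of edges does not.
weights[(0, 1)] += 0.1
synaptic_changed_topology = set(weights) != topology_before

# Structural plasticity: the set of edges itself changes.
weights[(0, 2)] = 0.01   # a new connection appears
del weights[(1, 2)]      # an existing connection disappears
structural_changed_topology = set(weights) != topology_before

print(synaptic_changed_topology, structural_changed_topology)  # False True
```

In this representation, "how much" lives in the values and "who talks to whom" lives in the keys.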

What the brain does

Structural plasticity is not an engineer’s oddity: it is the normal regime of living systems.

  • Synaptogenesis: the formation of new synapses, intense during development and never fully stopped.
  • Synaptic pruning: human adolescence eliminates a massive share of the synapses formed in childhood. The brain sculpts itself by removing as much as by adding.
  • Adult neurogenesis: new neurons still appear in adulthood, notably in the hippocampus.
  • Critical periods: time windows where topology is especially malleable, then settles.

The lesson is sharp: a good architecture is not laid down in advance, it is found through growth and destruction guided by activity (Kandel et al., 2013).

Reshape the topology yourself

Before theory, the feel of it. Here is a small network whose structure you can change live. Add and remove neurons and connections, and watch the shape reorganize.

[Interactive demo: change the network structure live. Everything is added or removed at random, with no criterion at all. Starting state: 6 neurons, 3 synapses.]

You quickly notice one thing: adding and removing at random builds nothing useful. Hold on to that impression: it is the heart of the piece.

Why make the shape learn

Three gains motivate turning topology into a variable:

  • Continual learning: a network whose shape can evolve does not need to be frozen after training. It keeps adapting as its world changes.
  • Efficiency: if structure is built only where activity justifies it, the network allocates resources only where they are useful, instead of computing everything everywhere all the time.
  • Self-organization: shape becomes a result rather than a choice imposed in advance. You no longer draw the architecture, you define the rules that make it emerge.

The central trap: mutating is not learning

Here is the most important experiment of the piece. Two networks start from the same point and undergo structural mutations. The only difference: on the left, every random mutation is kept; on the right, only the mutations that improve a performance measure are kept.
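The experiment can be approximated in a few lines. This is a sketch under strong assumptions: the "performance measure" is a stand-in (agreement with a hidden target set of edges), a mutation toggles one random connection, and `score` and `run` are hypothetical names. It is not the code behind the original demo.

```python
import random
from itertools import combinations

def score(edges, target, n_pairs):
    """Stand-in performance: share of node pairs whose edge status matches a hidden target."""
    return 1 - len(edges ^ target) / n_pairs

def run(steps, select, seed=0, n_nodes=8):
    rng = random.Random(seed)
    pairs = list(combinations(range(n_nodes), 2))
    target = set(rng.sample(pairs, len(pairs) // 3))  # assumed "useful shape"
    edges = set()
    history = [score(edges, target, len(pairs))]
    for _ in range(steps):
        candidate = edges ^ {rng.choice(pairs)}       # mutation: toggle one connection
        # Without selection every mutation is kept; with it, only improvements are.
        if not select or score(candidate, target, len(pairs)) > history[-1]:
            edges = candidate
        history.append(score(edges, target, len(pairs)))
    return history

random_walk = run(90, select=False)
with_selection = run(90, select=True)
print(f"random: {random_walk[-1]:.2f}  selection: {with_selection[-1]:.2f}")
```

Because both runs share a seed, they face the same target and the same sequence of proposed mutations; the only thing that differs is whether a criterion filters them.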

[Interactive demo: performance over 90 steps, “Random mutation” versus “Mutation + selection”. Both start from the same point. On the left we keep every random mutation; on the right we keep only those that improve. Selection climbs, randomness wanders.]

Randomness alone wanders forever. As soon as a criterion sorts the mutations, performance climbs. That is the whole difference between shaking a structure and making it learn. This criterion can take two forms:

  • a global selection pressure, as in an evolutionary algorithm;
  • a local rule, like biological STDP (spike-timing-dependent plasticity), where a connection strengthens when its presynaptic neuron fires just before the postsynaptic one, and weakens when the order is reversed.

Remember this line, because it is the thread running through the piece: structural plasticity is worth only as much as the criterion that steers it.
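To give the local-rule option some shape, here is a sketch of the common pair-based form of STDP. The amplitudes and the time constant are illustrative assumptions, and `stdp_delta` is a hypothetical name. Note that STDP itself changes a weight; it becomes a structural criterion only once something else, for instance a pruning threshold, acts on the weights it produces.

```python
import math

def stdp_delta(dt_ms, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Pair-based STDP weight change for dt = t_post - t_pre, in milliseconds.

    a_plus, a_minus and tau_ms are illustrative values, not taken from the text.
    """
    if dt_ms > 0:    # pre fired before post: the synapse may have helped cause the spike
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:    # post fired before pre: the synapse cannot have caused it
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0

for dt in (5.0, 50.0, -5.0):
    print(dt, round(stdp_delta(dt), 5))
```

The rule only needs the spike times of the two neurons at either end of the connection, which is what makes it local.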

The other side: pruning

Growing is only half the story. The other half is removing. A dense network holds many nearly useless connections; removing the weakest ones lightens computation without breaking the function.

[Interactive demo: slide to remove the weakest connections first, starting from 25 active connections. The network keeps its scaffolding well after losing half its links.]

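Removing the weakest links is magnitude pruning, the simplest form of the idea. Recent research pushes it further under the name dynamic sparse training, where connections are dropped and regrown while the network trains. A related result, the lottery ticket hypothesis, holds that a well-chosen subnetwork can reach the full network’s performance with a fraction of its connections.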
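Removing the weakest connections first can be sketched with NumPy. This assumes the connections are stored as a dense weight matrix and that "weakest" means smallest absolute weight; `prune_weakest` is a hypothetical helper, and the random weights are placeholders rather than a trained network.

```python
import numpy as np

def prune_weakest(weights, fraction):
    """Zero out the given fraction of connections, smallest magnitude first."""
    flat = np.abs(weights).ravel()
    n_remove = int(fraction * flat.size)
    mask = np.ones(flat.size, dtype=bool)
    mask[np.argsort(flat)[:n_remove]] = False   # positions of the weakest links
    mask = mask.reshape(weights.shape)
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(5, 5))                     # 25 connections, placeholder values
pruned, mask = prune_weakest(w, 0.5)
print(int(mask.sum()), "of", mask.size, "connections kept")
```

The sketch only shows the removal step. Whether the function survives the pruning has to be checked separately by re-evaluating the network on its task.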

What the research says

Structural plasticity driven by local rules is not speculative. Several families of work make it concrete:

| Approach | Key idea | Topology | Learning |
|---|---|---|---|
| NEAT (Stanley & Miikkulainen, 2002) | evolve weights + structure by genetic algorithm | emergent | global selection |
| SET / RigL (Mocanu et al., 2018; Evci et al., 2020) | start sparse, drop then regrow by usefulness | dynamic | gradient + criterion |
| Barland & Gil (2024) | node splitting and merging, no global error | self-organized | local rules |
| SMGrNN (Chen et al., 2025) | insertion and pruning from local statistics | online | local + gradient |
| MorphSNN (Liu et al., 2026) | reorganization on the millisecond scale | spiking | event-driven |

The common thread: topology can become a locally governed variable, not a template imposed from the outside. Notably, Barland and Gil obtain generalization, and even the grokking phenomenon, from purely local rules, with no global error function at all.
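The "drop then regrow" idea from the SET / RigL row can be sketched as a single topology update. This is a simplified illustration, not either paper's implementation: it drops the weakest active links and regrows the same number at random positions, which is closer to SET, whereas RigL chooses where to regrow from gradient information, not shown here. The function name and all parameters are assumptions.

```python
import numpy as np

def drop_and_regrow(weights, mask, drop_fraction, rng):
    """One update: drop the weakest active links, regrow as many at random."""
    flat_w = weights.ravel().copy()
    flat_m = mask.ravel().copy()
    active = np.flatnonzero(flat_m)
    inactive = np.flatnonzero(~flat_m)           # candidates for regrowth
    n_swap = min(int(drop_fraction * active.size), inactive.size)
    weakest = active[np.argsort(np.abs(flat_w[active]))[:n_swap]]
    grown = rng.choice(inactive, size=n_swap, replace=False)
    flat_m[weakest] = False
    flat_w[weakest] = 0.0
    flat_m[grown] = True
    flat_w[grown] = rng.normal(scale=0.01, size=n_swap)   # small fresh weights
    return flat_w.reshape(weights.shape), flat_m.reshape(mask.shape)

rng = np.random.default_rng(0)
mask = (np.arange(36) % 3 == 0).reshape(6, 6)    # 12 active links out of 36
weights = rng.normal(size=(6, 6)) * mask
new_weights, new_mask = drop_and_regrow(weights, mask, 0.3, rng)
print(int(mask.sum()), int(new_mask.sum()), int((mask != new_mask).sum()))
```

The number of connections stays constant while their placement moves, so sparsity is held fixed and only the shape is searched.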

A minimal formalization

Let us state things cleanly. A network is a graph $G_t = (V_t, E_t)$ evolving over time: $V_t$ is the set of neurons, $E_t$ the set of connections. Structural plasticity is four operations on this graph: add or remove a neuron, add or remove a connection.
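The four operations translate almost directly into a data structure. This is a minimal sketch, assuming neurons are plain identifiers and connections are directed pairs with a weight; the `Network` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Network:
    """G_t = (V_t, E_t): neurons are ids, connections map (i, j) to a weight."""
    neurons: set = field(default_factory=set)
    edges: dict = field(default_factory=dict)

    def add_neuron(self, i):
        self.neurons.add(i)

    def remove_neuron(self, i):
        self.neurons.discard(i)
        # A removed neuron takes its incoming and outgoing connections with it.
        self.edges = {e: w for e, w in self.edges.items() if i not in e}

    def add_edge(self, i, j, weight=0.0):
        # Only connect neurons that currently exist.
        if i in self.neurons and j in self.neurons:
            self.edges[(i, j)] = weight

    def remove_edge(self, i, j):
        self.edges.pop((i, j), None)

g = Network()
for n in range(3):
    g.add_neuron(n)
g.add_edge(0, 1, 0.4)
g.add_edge(1, 2, -0.2)
g.remove_neuron(1)
print(g.neurons, g.edges)  # {0, 2} {}
```

One design choice is visible here: removing a neuron also removes the edges that touch it, so the graph never holds a connection to a unit that no longer exists.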

A local growth rule can rely on the activity correlation between two neurons:

$$\Delta_{ij} = \langle a_i \, a_j \rangle_t$$

We create the connection $(i, j)$ when $\Delta_{ij}$ exceeds a threshold: two units that fire together gain from talking to each other. A pruning rule instead removes a connection whose weight stays negligible:

$$\text{remove } e \quad \text{if} \quad |w_e| < \theta$$
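The growth rule can be written out on recorded activity. This is a sketch assuming activity is stored as a (time, neurons) array and that the time average is a plain mean of the product $a_i a_j$, with no centering; `grow_by_correlation` and the threshold value are illustrative.

```python
import numpy as np

def grow_by_correlation(activity, threshold):
    """Return pairs (i, j), i < j, whose mean co-activity exceeds the threshold.

    activity has shape (time, neurons); delta[i, j] is the time average of a_i * a_j.
    """
    delta = activity.T @ activity / activity.shape[0]
    i_idx, j_idx = np.triu_indices(delta.shape[0], k=1)   # skip self-pairs
    keep = delta[i_idx, j_idx] > threshold
    return [(int(i), int(j)) for i, j in zip(i_idx[keep], j_idx[keep])]

# Three binary units over 6 time steps: 0 and 1 fire together, 2 fires alone.
a = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [1, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)
print(grow_by_correlation(a, threshold=0.4))  # [(0, 1)]
```

Units 0 and 1 are co-active on 3 of 6 steps, so $\Delta_{01} = 0.5$ passes the threshold, while unit 2 never fires with either of them and gains no connection.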

The whole thing reads as a joint optimization of function and form, under a sparsity pressure:

$$\min_{G,\; w} \; \mathcal{L}(w;\, G) \; + \; \lambda \, |E|$$

The term $\lambda \, |E|$ penalizes each connection: it is what pushes the network to stay sparse, hence efficient. Growth, pruning and sparsity are just three faces of the same question: which shape deserves to exist?
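A small numerical example shows how the penalty arbitrates. This is a sketch under stated assumptions: the loss is the mean squared error of a linear model, the graph is encoded as a boolean mask over four weights, and the values of $\lambda$ and $\theta$ are arbitrary. None of it comes from the programs described in this piece.

```python
import numpy as np

def objective(w, mask, X, y, lam):
    """L(w; G) + lambda * |E| for a linear model; mask encodes which edges exist."""
    loss = np.mean((X @ (w * mask) - y) ** 2)
    return loss + lam * mask.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
w = np.array([1.5, -2.0, 0.001, 0.0005])   # two useful weights, two negligible ones
y = X @ w

theta, lam = 0.01, 0.01                    # illustrative values
dense = np.ones(4, dtype=bool)             # every connection present
sparse = np.abs(w) >= theta                # pruning rule: drop e if |w_e| < theta

print(objective(w, dense, X, y, lam), objective(w, sparse, X, y, lam))
```

The dense graph fits the data exactly but pays for four connections. The pruned graph gives up a very small amount of accuracy and pays for two, so its objective is lower.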

Place in my research programs

This concept is the exact hinge between two of my programs.

  • AEIF provides the substrate: a network where every structural mutation is an immutable, traced, replayable event. But its mutations stay random: it knows how to mutate, not yet how to choose.
  • SOAG looks for the missing engine: the local rules that decide what to grow and what to let die, with no central control.

The “mutating is not learning” lab above sums up their relationship: AEIF is the left panel, SOAG aims for the right one.

Limits and open questions

It would be dishonest to present all this as solved.

  • Stability versus plasticity: a topology that changes constantly can diverge or forget everything. Finding the right balance is an open problem.
  • Scaling: current demonstrations stay on small problems, far from large models.
  • Incomplete theory: we lack a general framework saying which local rule produces which capability.
  • Measurement: comparing networks of different topologies is delicate, since we no longer compare only numbers but shapes.

Test your understanding

Quiz
  1. What is the difference between synaptic and structural plasticity?
  2. Why is mutating a topology at random not enough to learn?
  3. What does pruning a network show?
  4. In my programs, what role does AEIF play for structural plasticity?

Sources