The Boring Rules

How a 1987 Logic Paper Predicted Rust's Borrow Checker

Before I wrote Rust, I spent years writing C++. When I finally started, the borrow checker fought me on everything. I'm not stupid, I know when a reference is valid. Just let me write my code in peace.

So apparently, I did not know when a reference was valid. I just thought I did, because every language I'd ever used was happily letting me be wrong about it in silence.

Not lying maliciously though. Lying the way my parents lied to me about where my pet bunny went. A comforting lie.

Stuff can be copied freely and discarded freely. LIES!!!

C lets you copy a pointer to a heap allocation into six different variables and never once asks who actually owns that memory. C++ bolted on shared_ptr and unique_ptr, which are basically post-it notes that say "please remember I'm lying to you here." Java and Go hand the whole problem to a garbage collector, which I've always thought of as hiring someone full time to manage the consequences of a bad abstraction rather than fixing it.

I'm being a bit unfair though, garbage collectors work. But if you've ever watched a GC pause hit a p99 latency-sensitive system in production, like, say, the kind of thing that serves real-time auctions across regions, you start to wonder if "works" is the bar we should be clearing.

Point is, resources are not values. You cannot copy a mutex. You cannot copy a file descriptor. You cannot copy a database connection. You can copy the handle to these things, sure, and then you have two handles and one resource and a race condition. This is not a niche concern. This is the main thing systems programming is about.

And yet every mainstream language defaulted to letting you copy and discard things freely. The entire field collectively decided that the answer to "what if someone copies a resource handle?" was either "segfault at runtime" (C), "ref counting overhead" (C++), or "just GC it" (everyone else). Nobody asked if the default itself was wrong.

#enter: a french logician

So in 1987 this guy Jean-Yves Girard published a paper about something called linear logic. He was doing proof theory, which is basically math about math. Specifically about the structure of proofs themselves. Nothing to do with programming. He just wanted to mess with the foundations and see what happens.

OK so logic has these things called structural rules. They're not about what your statements say, they're about what you're allowed to do with them. Two of them matter here and honestly nobody pays attention to them because they seem obvious:

Contraction: if you know A, you can use it twice. That's copy.
Weakening: if you know A, you can just not use it. That's drop.

For math? Obviously fine. If 2+2=4 I can use that fact seventeen times in a proof or I can use it zero times. Mathematical truths don't get consumed. They just sit there being true.

Girard's move was asking what happens if you just... take those rules out. Not because they're wrong. Just to see.

And the whole thing splits open.

I studied statistical physics and this gave me the exact same feeling as learning about symmetry breaking for the first time. You have a system, it looks uniform, boring, nothing to see. You change one parameter and suddenly there's all this hidden structure that was there the entire time, you just couldn't see it because the symmetry was hiding it. Same thing here. "And" in classical logic? Turns out that's actually two different operations wearing a trenchcoat. One means "I have both of these, right now, simultaneously." The other means "I can choose to get one or the other but not both." "Or" splits the same way. The boring structural rules were smoothing over all of this.

With contraction (copy) and weakening (drop) gone, each classical connective splits in two:

"And" splits into:
  Tensor (⊗): I have both, from separate resources
  With (&): I can choose one; the consumer picks

"Or" splits into:
  Plus (⊕): I give you one; the producer picks
  Par (⅋): both obligations, shared resources

He called it linear logic. In linear logic every hypothesis gets used exactly once. Not zero. Not twice. Once.

I could not let go of this when I first read it. Because... that's just the borrow checker? Like, that's literally it?

#that's literally it

Move semantics. You use a value, it's gone, you can't use it again. That's contraction being absent. That's Girard's thing.

Destructors and drop. Rust doesn't let you just forget about something you own. Drop runs. You cannot silently walk away from a resource. That's weakening being absent. Also Girard's thing.

And here's the part that actually made me go wait what. Girard knew you'd sometimes want the structural rules back. Like, integers are fine to copy. Booleans are fine to copy. So he added this thing he called the "of course" modality, written !A. It means: this specific thing? You can copy it and discard it freely. I'm explicitly marking it as safe for that.

That's Copy. That's Clone. That's literally the Copy trait in Rust. You opt in to having the structural rules for specific types where it actually makes sense. Integers get it because they're pure values. TcpStream doesn't get it because come on.
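Here's a minimal sketch of both defaults side by side (the function names are mine, for illustration): a moved String is gone after one use, while a Copy type like i32 quietly keeps the structural rules.

```rust
// Sketch: move semantics (contraction removed) next to the Copy opt-in (!A).
fn takes_string(s: String) -> usize {
    s.len() // s is consumed here
}

fn demo() -> (usize, i32, i32) {
    let s = String::from("linear");
    let n = takes_string(s);
    // takes_string(s);  // ERROR: use of moved value `s` -- no contraction
    let x: i32 = 7;      // i32 is Copy: the structural rules are restored
    let y = x;           // a copy, not a move
    (n, x, y)            // x is still usable after being "used"
}
```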

Twenty-eight years between the paper and Rust 1.0. Different continent, different field, different motivation. Same answer.

#the learning curve is just deprogramming

I don't think the borrow checker is hard. I think habits are hard to break.

Every language you used before Rust rewarded you for treating resources like values. Copy the reference, let it go out of scope, whatever, it's fine. Those habits worked because every language was covering for you. Rust stops covering. That's the whole learning curve. There's no deep concept to learn. There's years of muscle memory to unlearn.

I've seen this click for people. It's always the same. Weeks of frustration, then one random afternoon it makes sense, and then they genuinely cannot understand what was ever hard about it. Because nothing was hard about it. They were just trying to do things that don't make sense and every previous language had been too polite to mention it.

Girard figured this out with pen and paper in the 80s because he couldn't stop poking at the foundations of logic. He wasn't trying to help anyone write safer code. He just wanted to know what the boring rules were hiding.

Turns out they were hiding everything.



Part 2: What's Actually Inside Linear Logic

The Boring Rules, continued

Part 1 covered the intuition: borrow checking is linear logic, Girard worked it out in 1987, and most programming languages quietly paper over this. Now for the actual system.

#sequent calculus, fast

Before we get into linear logic's rules we need to know what a sequent is. A sequent looks like this:

A₁, A₂, ..., Aₙ  ⊢  B₁, B₂, ..., Bₘ

Left side: things you have (hypotheses). Right side: things you're trying to prove (conclusions). The turnstile ⊢ means "from the stuff on the left, I can get to the stuff on the right."

In classical sequent calculus you get structural rules for free on both sides:

Contraction:    Γ, A, A ⊢ Δ          Γ ⊢ Δ, A, A
              ────────────          ────────────────
                Γ, A ⊢ Δ              Γ ⊢ Δ, A

Weakening:      Γ ⊢ Δ                  Γ ⊢ Δ
              ──────────            ──────────────
              Γ, A ⊢ Δ              Γ ⊢ Δ, A

Exchange:     Γ, A, B, Δ ⊢ Σ
              ─────────────────
              Γ, B, A, Δ ⊢ Σ

Contraction: if you have two copies of A, collapse them into one. That's "I can use something multiple times."
Weakening: if you don't have A, you can introduce it anyway and ignore it. That's "I can throw things away."
Exchange: order doesn't matter. We keep this one. Linear logic is not ordered logic (that's a different, even more restrictive thing).

Girard's move: delete contraction and weakening. Keep exchange. See what happens.

#the one-sided presentation

Girard preferred to present linear logic using one-sided sequents:

⊢ Γ

Everything is on the right side. No turnstile splitting left from right. This looks weird at first but it simplifies things enormously. If you want "A implies B" you write ⊢ A⊥, B, where A⊥ is the linear negation of A. More on negation later.

Here are the rules. This is the core of it.

#identity

Axiom:          ─────────
                ⊢ A⊥, A

Cut:          ⊢ Γ, A      ⊢ Δ, A⊥
              ──────────────────────
                    ⊢ Γ, Δ

Axiom creates a link between A and its dual. Cut composes two proofs: if one proof produces A and another consumes A, you can plug them together and eliminate the intermediate formula. This is exactly function composition in programming. Producing a value of type A in one place and consuming it in another.

#multiplicatives

This is where it gets interesting. In classical logic, "and" is one thing. In linear logic it splits into two: tensor (⊗) and par (⅋).

Tensor (⊗):     ⊢ Γ, A      ⊢ Δ, B
                ──────────────────────
                    ⊢ Γ, Δ, A ⊗ B

Par (⅋):          ⊢ Γ, A, B
                  ──────────────
                  ⊢ Γ, A ⅋ B

Look at the tensor rule carefully. The context SPLITS. Γ goes with A, Δ goes with B, and they're disjoint. You can't use a resource from Γ to prove B or vice versa. This is resource separation. In Rust terms, if you're constructing a pair (a, b), the stuff that went into making a and the stuff that went into making b must be completely disjoint sets of resources.

Par keeps the context together. Same Γ feeds both A and B. Par is the dual of tensor, which means A ⅋ B is the same as (A⊥ ⊗ B⊥)⊥. If tensor is about producing two things from separate resources, par is about handling two obligations with shared resources.
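A small sketch of the split in Rust terms (function names invented for illustration): building a pair consumes two disjoint owned values, and reusing one value for both slots would require an explicit Clone.

```rust
// Sketch: tensor introduction as pair construction.
// The contexts split: whatever went into `a` is unavailable to `b`.
fn tensor_intro(a: String, b: String) -> (String, String) {
    (a, b)
}

fn demo_split() -> (String, String) {
    let left = String::from("A");
    let right = String::from("B");
    // tensor_intro(left.clone(), left) would compile only with that
    // explicit Clone -- you cannot silently feed one owned resource
    // into both slots.
    tensor_intro(left, right)
}
```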

Units:

One (1):        ─────
                ⊢ 1

Bottom (⊥):     ⊢ Γ
                ──────
                ⊢ Γ, ⊥

One is the unit for tensor (A ⊗ 1 ≡ A). Bottom is the unit for par.


#additives

The second pair. With (&) and Plus (⊕).

With (&):       ⊢ Γ, A      ⊢ Γ, B
                ──────────────────────
                    ⊢ Γ, A & B

Plus (⊕):        ⊢ Γ, A                    ⊢ Γ, B
                ──────────────          ──────────────
                ⊢ Γ, A ⊕ B             ⊢ Γ, A ⊕ B

Now look at the with rule. The context is SHARED. Same Γ appears in both premises. That's the key difference from tensor. Tensor splits the context, with shares it. This means: to provide A & B, you must be able to provide A from Γ AND provide B from the same Γ. But the consumer only gets to pick one.

Plus gives the producer the choice. You provide either A or B (you pick which), and the consumer has to handle both cases.

In Rust:

// With (&): I can give you either, same resources back me up for both
// This is like a trait with multiple methods - caller picks which to call
trait Resource {
    fn as_reader(&self) -> Reader;    // caller can ask for this
    fn as_writer(&self) -> Writer;    // or this, but not both
}

// Plus (⊕): I give you one, you handle both cases
// This is just Result/enum
fn connect() -> Result<Connection, Error>
// You get one. You match on it.

The multiplicative/additive distinction is one of linear logic's biggest contributions. Classical logic can't tell these apart:

             Context splits (multiplicative)      Context shared (additive)
Positive     A ⊗ B (both, separate resources)     A ⊕ B (one of, producer picks)
Negative     A ⅋ B (both, shared obligation)      A & B (one of, consumer picks)

When you have contraction and weakening, the difference between "context splits" and "context shares" is meaningless because you can always copy context. Removing the structural rules makes this distinction real.

#exponentials (the escape hatch)

Promotion:      ⊢ ?Γ, A
                ──────────
                ⊢ ?Γ, !A

Dereliction:    ⊢ Γ, A
                ──────────
                ⊢ Γ, ?A

Contraction:    ⊢ Γ, ?A, ?A
                ──────────────
                  ⊢ Γ, ?A

Weakening:        ⊢ Γ
                ──────────
                ⊢ Γ, ?A

Here they are. Contraction and weakening return, but ONLY for formulas marked with ?. The ! modality (read: "of course") is the magic marker. !A means "as many copies of A as you want." ?A is its dual (read: "why not").

Promotion says: if you can produce A from a context that's entirely made of ?-marked things (things that can be freely copied), then you can promote A to !A, meaning it too can be freely copied.

This is the formal version of what I said in Part 1. Rust's Copy trait is the ! modality. You opt in. Integers are !Int because contraction and weakening are fine for them. TcpStream is bare linear TcpStream because no.
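The promotion rule has a direct Rust echo, sketched below with a made-up Point type: you may derive Copy only when every field is itself Copy, just as !A is only derivable from an all-?-marked context.

```rust
// Sketch: promotion. Point is built entirely from Copy ("!") parts,
// so it is allowed to opt in to Copy itself.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Point {
    x: i32, // Copy
    y: i32, // Copy
}
// A struct with a String field could NOT derive Copy:
// the premise of the promotion rule would fail.

fn promote_demo() -> (Point, Point) {
    let p = Point { x: 1, y: 2 };
    let q = p; // a copy; p stays usable
    (p, q)
}
```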

Important subtlety: Rust is technically affine, not linear. Affine logic has weakening but not contraction. You CAN drop things in Rust (values get dropped at end of scope). You can NOT copy them without Copy/Clone. So Rust's actual logic is:

Linear logic    = no contraction, no weakening
Affine logic    = no contraction, YES weakening  ← Rust lives here
Relevant logic  = YES contraction, no weakening
Classical logic = YES contraction, YES weakening

The Drop trait is Rust making weakening explicit and controlled rather than eliminating it entirely. Technically this means the Rust-linear-logic mapping is approximate, not exact. But the core insight, that the structural rules should be opt-in rather than free, is the same.
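You can watch weakening-with-Drop happen (the Token type below is invented for this sketch): binding a value and never reading it is legal, but its destructor still runs.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Sketch: affine weakening, made observable. Dropping a value unused
// is allowed, but Drop still runs.
struct Token(Rc<Cell<bool>>);

impl Drop for Token {
    fn drop(&mut self) {
        self.0.set(true); // record that the resource was released
    }
}

fn weaken() -> bool {
    let flag = Rc::new(Cell::new(false));
    {
        let _t = Token(flag.clone());
        // _t is never read: weakening. Legal in (affine) Rust.
    } // Drop runs here -- the discard is controlled, not silent
    flag.get()
}
```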

#proof nets

OK now the part I actually wanted to write about.

Sequent calculus proofs have a problem: bureaucracy. The same underlying proof can be written in multiple ways depending on what order you apply rules. Watch:

Proving ⊢ A⊥, B⊥, A ⊗ B (which says: "if I consume A and B, I can produce A ⊗ B"):

Proof 1:
                ──────────    ──────────
                ⊢ A⊥, A      ⊢ B⊥, B
                ─────────────────────────
                  ⊢ A⊥, B⊥, A ⊗ B

Proof 2:
                ──────────    ──────────
                ⊢ B⊥, B      ⊢ A⊥, A
                ─────────────────────────
                  ⊢ B⊥, A⊥, A ⊗ B
                ─────────────────────────  (exchange)
                  ⊢ A⊥, B⊥, A ⊗ B

These are "different proofs" in sequent calculus but they mean the same thing. The only difference is which axiom we wrote first. This is bureaucratic noise.

Proof nets eliminate this. Instead of a tree of inference rules, a proof net is a graph. Formulas are nodes, and you draw links between them.

Here's the proof net for ⊢ A⊥, B⊥, A ⊗ B:

    A⊥ ─────── A
                 ╲
                  ⊗ ─── A ⊗ B
                 ╱
    B⊥ ─────── B

The two axiom links connect A⊥ to A and B⊥ to B. The tensor node joins A and B into A ⊗ B. That's it. There's no ordering ambiguity. Both sequent calculus proofs above map to this single proof net.

A slightly more involved example. Let's prove the commutativity of tensor:

⊢ (A ⊗ B)⊥, B ⊗ A

Since (A ⊗ B)⊥ = A⊥ ⅋ B⊥, this is really ⊢ A⊥ ⅋ B⊥, B ⊗ A.

    A⊥ ─────────────────── A
     ╲                      ╲
      ⅋ ─── A⊥ ⅋ B⊥        ⊗ ─── B ⊗ A
     ╱                      ╱
    B⊥ ─────────────────── B

The axiom links cross: A⊥ on the left connects to A on the right, B⊥ on the left connects to B on the right. The par node groups A⊥ and B⊥. The tensor node groups B and A (note: reversed order). The crossing of the axiom links is what "does the swap." No exchange rule needed. The geometry handles it.


#correctness: Danos-Regnier

Not every graph you can draw is a valid proof net. You need a correctness criterion. The standard one is from Danos and Regnier (1989).

The idea: at every par node (⅋), you have two subtrees coming in. A "switching" is a choice, for each par node, of which subtree to keep (left or right). The Danos-Regnier criterion says:

A proof structure is a proof net if and only if, for every switching, the resulting graph is connected and acyclic (i.e., a tree).

For our commutativity example, the only par node is the A⊥ ⅋ B⊥ node. We have two switchings:

Switching 1 (keep left):     Switching 2 (keep right):
    A⊥ ─── A                     B⊥ ─── B
             ╲                             ╲
              ⊗                             ⊗
             ╱                             ╱
    B⊥ ─── B                     A⊥ ─── A

Both are connected? Yes. Both are acyclic? Yes. So this is a valid proof net.

If you tried to draw a graph where the axiom links didn't cross (connecting A⊥ to B and B⊥ to A), one of the switchings would produce a disconnected graph, and the criterion would reject it. The geometry enforces logical validity.
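A hypothetical encoding, just to show how mechanical the check is: a switching passes iff its graph is a tree (connected and acyclic), which union-find can verify edge by edge.

```rust
// Sketch (hypothetical encoding): the Danos-Regnier test on one switching.
// Union-find with path compression.
fn find(parent: &mut Vec<usize>, x: usize) -> usize {
    let p = parent[x];
    if p == x {
        return x;
    }
    let root = find(parent, p);
    parent[x] = root; // path compression
    root
}

fn is_tree(n: usize, edges: &[(usize, usize)]) -> bool {
    // A tree on n nodes has exactly n - 1 edges.
    if edges.len() + 1 != n {
        return false;
    }
    let mut parent: Vec<usize> = (0..n).collect();
    for &(a, b) in edges {
        let (ra, rb) = (find(&mut parent, a), find(&mut parent, b));
        if ra == rb {
            return false; // cycle: this switching fails the criterion
        }
        parent[ra] = rb;
    }
    true // n - 1 edges and no cycle implies connected
}
```

For the first switching above, number the nodes A⊥, B⊥, ⊗, A, B as 0 through 4; the kept edges (0,3), (1,4), (3,2), (4,2) form a tree and the check accepts them.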

This is what I meant in Part 1 when I said proof nets eliminate bureaucracy. The sequent calculus has multiple proof trees for the same proof. The proof net is unique. And the Danos-Regnier criterion is a purely geometric/topological condition. You're checking correctness of a logical proof by checking properties of a graph. That's wild if you think about it.

If you've worked with SSA-based IRs in compilers, this should remind you of something. A program can be written many ways but the dataflow graph is canonical. Independent operations can be reordered freely. The graph captures actual dependencies, not syntactic order. Proof nets are doing the same thing for proofs. Girard was building a compiler IR for logic in 1987.

#coherence spaces

Now the semantics. Syntax tells you the rules. Semantics tells you what the rules mean, what mathematical universe the logic lives in. For classical logic the standard semantics is Boolean: propositions are true or false, connectives are truth table operations.

Linear logic needed something more refined. Girard built coherence spaces.

Definition. A coherence space X is a pair (|X|, ⌢) where |X| is a set of tokens, called the web, and ⌢ is a reflexive, symmetric binary relation on |X|, called coherence.

We write a ⌣ b (strict coherence) when a ⌢ b and a ≠ b. We write a # b (incoherence) when ¬(a ⌢ b).

Definition. A clique of X is a subset c ⊆ |X| such that for all a, b ∈ c, a ⌢ b. (Every pair of tokens in the set is coherent.)

The cliques of X, ordered by inclusion, form the denotation of X. They're the "states" or "points" of the space.

Definition. The dual X⊥ has the same web |X| but flipped coherence:

a ⌢X⊥ b   iff   a = b  or  a #X b

Tokens that were coherent (compatible) in X become incoherent (incompatible) in X⊥, and vice versa. Cliques of the dual are exactly the anti-cliques of the original space (sets of pairwise incoherent tokens, plus singletons).

Negation in linear logic is involutive: (X⊥)⊥ = X. Check it: flipping the coherence twice gives you back the original. This is a property classical negation has but intuitionistic negation doesn't. Linear logic is symmetric in a way that intuitionistic logic is not.

Operations on coherence spaces:

Tensor product X ⊗ Y:

|X ⊗ Y| = |X| × |Y|
(a, b) ⌢ (a', b')  iff  a ⌢X a'  and  b ⌢Y b'

Both components must be pairwise coherent. Cliques of X ⊗ Y are sets of pairs where projecting to either component gives a clique in that component.

Par X ⅋ Y:

|X ⅋ Y| = |X| × |Y|
(a, b) ⌢ (a', b')  iff  (a,b) = (a',b')  or  a ⌣X a'  or  b ⌣Y b'

Note the switch from ⌢ to ⌣ (strict coherence, meaning coherent AND distinct). This trips people up. Tensor is naturally stated with the reflexive relation ⌢. Par needs the strict relation ⌣. Two pairs are par-coherent if they're identical, OR they differ in a way that's strictly coherent in at least one component. Using ⌢ here (as I originally did and then caught) would be too permissive. It would make unequal pairs coherent whenever they share a component value, which breaks the duality. You can verify: X ⅋ Y should equal (X⊥ ⊗ Y⊥)⊥, and it only does with the ⌣ formulation.
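That duality can be brute-force checked on a tiny web. The encoding below is mine, not Girard's: coherence relations as boolean functions on tokens, with par compared against dualize-tensor-dualize on every pair.

```rust
// Sketch: finite coherence relations as boolean functions.
fn dual<T: PartialEq + Copy>(coh: impl Fn(T, T) -> bool) -> impl Fn(T, T) -> bool {
    move |a, b| a == b || !coh(a, b)
}

fn tensor<A: Copy, B: Copy>(
    cx: impl Fn(A, A) -> bool,
    cy: impl Fn(B, B) -> bool,
) -> impl Fn((A, B), (A, B)) -> bool {
    move |(a, b), (a2, b2)| cx(a, a2) && cy(b, b2)
}

fn par<A: PartialEq + Copy, B: PartialEq + Copy>(
    cx: impl Fn(A, A) -> bool,
    cy: impl Fn(B, B) -> bool,
) -> impl Fn((A, B), (A, B)) -> bool {
    // equal pairs, or strict coherence in at least one component
    move |(a, b), (a2, b2)| {
        (a, b) == (a2, b2) || (cx(a, a2) && a != a2) || (cy(b, b2) && b != b2)
    }
}

fn duality_holds() -> bool {
    // X: both tokens coherent. Y: distinct tokens incoherent.
    let cx = |_a: u8, _b: u8| true;
    let cy = |a: u8, b: u8| a == b;
    let lhs = par(cx, cy);
    let rhs = dual(tensor(dual(cx), dual(cy))); // (X⊥ ⊗ Y⊥)⊥
    let toks = [(0u8, 0u8), (0, 1), (1, 0), (1, 1)];
    toks.iter().all(|&p| toks.iter().all(|&q| lhs(p, q) == rhs(p, q)))
}
```

Swapping the ⌣ for ⌢ in `par` (dropping the `a != a2` / `b != b2` guards) makes the check fail, which is exactly the too-permissive mistake described above.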

The linear function space X ⊸ Y is defined as X⊥ ⅋ Y. A linear function from X to Y is a clique of X⊥ ⅋ Y. Unwinding the definitions, this gives you a relation between tokens of X and tokens of Y satisfying specific conditions.

Linear maps between coherence spaces (the morphisms in the category) are exactly the cliques of X ⊸ Y. They satisfy a stronger condition than just continuity. There's actually a three-level hierarchy here:

  1. Scott-continuous maps: monotone and preserve directed unions (directed joins)
  2. Stable maps (Berry): Scott-continuous and additionally satisfy a pullback condition: f(x ∩ x') = f(x) ∩ f(x') when x ∪ x' is a clique
  3. Linear maps: stable and additionally preserve ALL compatible unions, not just directed ones. If x ∪ y is a clique, then f(x ∪ y) = f(x) ∪ f(y). Equivalently: f(x) = ⋃_{a ∈ x} f({a}).

The distinctive property of linear maps is this additivity: the output on a clique is completely determined by the outputs on individual tokens. Every linear map is stable, every stable map is Scott-continuous, and neither converse holds. This hierarchy is central to the denotational semantics of linear logic and it's easy to confuse the levels.
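The additivity condition is easy to test on a finite sketch (this set-based encoding is mine): a map is linear-style additive only if its output on a set equals the union of its outputs on singletons.

```rust
use std::collections::BTreeSet;

// Sketch (my own finite encoding): the additivity test for linear maps.
// f is linear only if f(x) equals the union of f({a}) over a in x.
fn is_additive(
    f: impl Fn(&BTreeSet<u32>) -> BTreeSet<u32>,
    x: &BTreeSet<u32>,
) -> bool {
    let pointwise: BTreeSet<u32> = x
        .iter()
        .flat_map(|&a| f(&BTreeSet::from([a])))
        .collect();
    f(x) == pointwise
}
```

A token-wise map like {a} ↦ {a + 1} passes; a map that only emits a token once it sees two inputs together fails, because its behavior on the whole set isn't determined by its behavior on singletons.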

#the physics

OK here's the part I've been wanting to write since Part 1.

If you've done statistical mechanics, the coherence space definition should be setting off alarm bells. Let me line them up.

Coherence spaces                     Statistical mechanics
Web (set of tokens)                  Set of microstates
Coherence relation                   Compatibility between microstates
Clique (pairwise coherent set)       Valid configuration
Maximal clique                       Ground state / equilibrium configuration
Dual space X⊥                        Conjugate thermodynamic description

In stat mech, a system has microstates. Not all microstates can coexist (energy constraints, exclusion principles, geometric constraints). The valid macrostates are the compatible collections of microstates. The partition function sums over these configurations weighted by energy.

A coherence space is this same structure. Tokens are like microstates. The coherence relation says which tokens can coexist. Cliques are the valid configurations. The tensor product X ⊗ Y looks like the combined state space of two independent systems, you take the product of the microstate sets and require both components to be independently valid. This is exactly how you compose non-interacting systems in stat mech.

The duality is suggestive too. In thermodynamics, conjugate variables come in pairs: temperature/entropy, pressure/volume, chemical potential/particle number. Legendre transforms swap between conjugate descriptions of the same system. In coherence spaces, taking the dual X⊥ flips the coherence relation, giving you a complementary view of the same underlying structure. I don't think this is a formal equivalence. But the pattern is too structural to be a coincidence.

Here's what I think is actually happening. Linear logic is the logic of things that are conserved. Physics is the science of things that are conserved. They converge on the same mathematical structures because the constraint is the same: you can't duplicate energy, you can't discard entropy, you can't copy a quantum state (no-cloning theorem), you can't ignore a resource (second law says entropy must be accounted for).

The no-cloning theorem is maybe the cleanest example. In quantum mechanics, there's no operation that takes an arbitrary state |ψ⟩ and produces |ψ⟩ ⊗ |ψ⟩. You cannot copy a quantum state. From the linear logic perspective this is just: contraction doesn't hold. That's it. That's the default in linear logic. Quantum mechanics isn't being weird here. Classical programming languages were being weird by allowing copies of things that shouldn't be copied. Physics never made that mistake.

This isn't just an informal analogy either. Samson Abramsky proved formally (arXiv:0910.2401) that in a compact closed category, a natural diagonal map (which is what contraction is, categorically) forces every endomorphism to be a scalar multiple of the identity. The no-cloning theorem and the absence of contraction in linear logic are the same result in different categories. The nLab puts it directly: when quantum physics is axiomatized by linear type theory, the content of "no-cloning" is the very linearity of the logic.

The coherence space semantics also connects to something called the geometry of interaction program, which is Girard's later work on giving a dynamic, execution-based semantics to linear logic using operator algebras. There are direct connections to von Neumann algebras, which are also the mathematical framework for quantum mechanics. The rabbit hole goes deep and I'm not qualified to go all the way down. But the surface-level connections are already striking enough that I wanted to put them in writing, because I haven't seen anyone lay this out clearly.

#Curry-Howard: proofs are programs

Last section. The one that ties everything together.

The Curry-Howard correspondence says: types are propositions, programs are proofs. A function of type A → B is a proof that A implies B. If the program typechecks, the proof is valid.

For classical logic, Curry-Howard gives you programs with continuations and control operators (callcc and friends). For intuitionistic logic, you get the simply typed lambda calculus. Clean and well-studied.

For linear logic, you get a linear lambda calculus where every variable must be used exactly once. Here are the typing rules:

Variable:
                ───────────
                x : A ⊢ x : A

Linear application (cut):
                Γ ⊢ e₁ : A ⊸ B      Δ ⊢ e₂ : A
                ─────────────────────────────────────
                        Γ, Δ ⊢ e₁ e₂ : B

Linear abstraction:
                Γ, x : A ⊢ e : B
                ──────────────────────
                Γ ⊢ λx. e : A ⊸ B

Tensor introduction:
                Γ ⊢ e₁ : A      Δ ⊢ e₂ : B
                ─────────────────────────────────
                  Γ, Δ ⊢ (e₁, e₂) : A ⊗ B

Tensor elimination:
                Γ ⊢ e₁ : A ⊗ B      Δ, x : A, y : B ⊢ e₂ : C
                ──────────────────────────────────────────────────
                        Γ, Δ ⊢ let (x, y) = e₁ in e₂ : C

With introduction:
                Γ ⊢ e₁ : A      Γ ⊢ e₂ : B
                ─────────────────────────────────
                    Γ ⊢ ⟨e₁, e₂⟩ : A & B

Plus introduction:
                Γ ⊢ e : A                    Γ ⊢ e : B
                ────────────────          ────────────────
                Γ ⊢ inl(e) : A ⊕ B       Γ ⊢ inr(e) : A ⊕ B

! introduction (promotion):
                !Γ ⊢ e : A
                ───────────────
                !Γ ⊢ promote(e) : !A

Stare at the tensor introduction rule. Context splits: Γ and Δ are disjoint. The resources that go into e₁ and the resources that go into e₂ cannot overlap. Compare it directly to the sequent calculus tensor rule from earlier. Same rule. The program term (e₁, e₂) is literally the proof tree, with code instead of formulas.

Now look at with introduction. Same context Γ in both premises. Both e₁ and e₂ have to be producible from the same resources. But the consumer only picks one. This is why I mapped it to a trait with multiple methods earlier. The implementor has to be able to provide both, using the same internal state. The caller picks one.

The linear function space A ⊸ B is the type of functions that consume their argument exactly once. In Rust, this is FnOnce:

// A ⊸ B in linear logic = FnOnce(A) -> B in Rust
// The function consumes its input. Can only be called once.

fn consume_file(mut f: File) -> io::Result<Summary> {
    // f is moved in: the caller gave up ownership.
    let mut contents = String::new();
    f.read_to_string(&mut contents)?;  // needs std::io::Read in scope
    Ok(summarize(contents))            // f dropped here; the caller can't reuse it
}

Compare to non-linear function space A → B, which would be Fn:

// !A ⊸ B ≈ Fn(&A) -> B
// Input is under !, so it can be used multiple times.
// In Rust, this means borrowing, not consuming.

fn inspect_file(f: &File) -> io::Result<Metadata> {
    // f is borrowed. Can be used repeatedly.
    f.metadata()
}

The & in Rust is not just "reference." From the linear logic perspective, &T is something like making the value available under a controlled form of !, where you can use it multiple times but only for reading. &mut T gives you linear (single exclusive) access. The actual mapping has subtleties (Rust's lifetimes add constraints that pure linear logic doesn't have), but the core idea is that Rust's reference system is a pragmatic implementation of linear logic's exponential modalities.
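The two function spaces can be written as Rust bounds directly (helper names invented for the sketch): FnOnce for the linear arrow, Fn over a borrow for the !-style arrow.

```rust
// Sketch: A ⊸ B as an FnOnce bound (argument consumed, one call),
// !A ⊸ B as an Fn bound over a borrow (repeatable use).
fn call_once<F: FnOnce(String) -> usize>(f: F, s: String) -> usize {
    f(s) // f and s are both consumed; neither can be used again
}

fn call_many<F: Fn(&str) -> usize>(f: F, s: &str) -> usize {
    f(s) + f(s) // borrowed input, so f may run as often as we like
}
```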

Here's the thing that I keep coming back to. When rustc accepts your program, it has checked that your code uses its resources correctly within the rules it enforces. The type system is acting as a proof checker for something that looks a lot like a fragment of linear (well, affine) logic. Your program is something that looks a lot like a proof.

I should be precise about what I'm claiming here. The FnOnce ↔ A ⊸ B and &T ↔ !A correspondences are well-motivated structural analogies, not formally established isomorphisms for Rust's full type system. Rust's borrowing, lifetimes, unsafe, and Drop semantics go well beyond textbook affine type systems. The Oxide paper (Weiss et al.) provides a formal model of a Rust-like language but frames it as type safety (progress/preservation), not Curry-Howard. Nobody has published a precise Curry-Howard isomorphism for all of Rust. The analogies capture something real about the structure, but claiming "Rust programs ARE proofs in linear logic" is stronger than what's been formally shown. I think it's true in spirit. Just know I'm extrapolating from the formal results, not citing them.

So with that caveat: every safe Rust program is, at minimum, doing something very close to theorem proving about resource safety. You're doing linear logic every time you compile. You just didn't know it.

#wrapping up

Linear logic's connectives tell you that "and" and "or" hide real distinctions that matter when you can't freely copy things. Proof nets show that once you strip away syntactic noise, proofs have a clean geometric structure. Coherence spaces give linear logic a semantics that looks suspiciously like statistical mechanics. And Curry-Howard tells you that Rust programs are doing something that looks very much like proofs in that logic, even if the full formal isomorphism hasn't been published yet.

Girard wrote in the opening line of the 1987 paper that linear logic is "a logic behind logic." Not "the." He was careful about that. It's one possible foundational decomposition, not a claim of uniqueness. I still think he undersold it.


thanks for reading. i write about stuff i like. Part 1 is here.