Binding ciphertext to its context
Associated data, domain separation, authenticated headers and versioned formats: why a perfectly valid ciphertext can still be a vulnerability if replayed outside its context.
In the previous chapter, you saw that the nonce must be unique for the guarantees of AEAD to hold. But there is another attack vector, a subtler one: a ciphertext can be perfectly authentic, with a valid tag and a unique nonce, and still serve as a weapon in the hands of an adversary who replays it in the wrong context. The problem is no longer the algorithm. It is the absence of a binding between the ciphertext and the situation in which it is meant to be used.
Imagine a software update system. An update package is encrypted and authenticated for component A, version 2. An adversary intercepts this package and replays it as an update for component B. The tag is valid. The nonce is unique. Not a single bit has been touched. And yet, code intended for one component executes in another.
By the end of this chapter, you will be able to define associated data and its role in binding a ciphertext to its context, explain domain separation and why the same ciphertext must not cross multiple roles, understand why a header transmitted in plaintext must be covered by the tag, and describe how a versioned format enables cryptographic migration without breaking existing data.
Associated data (AAD)
An AEAD seal takes four inputs: the key, the nonce, the plaintext, and the associated data (sometimes abbreviated AAD). Associated data is data that an AEAD algorithm authenticates but does not encrypt, typically metadata such as a header, an identifier, or a usage context. This parameter is mentioned in module 1; this chapter examines it in detail.
Associated data is information that travels in plaintext with the encrypted message but is bound to the ciphertext by the authentication tag. It is not part of the ciphertext: anyone observing the transmission can read it. However, it is part of the tag computation: if the associated data differs between seal and open, verification fails. The open returns an error, without decrypting, without additional information.
The central intuition: AAD binds a ciphertext to its context of use.
AAD can contain a user identifier, a session identifier, a version number, a destination identifier, a creation timestamp, or any combination of these elements. The constraint is: the value used during seal must be identical to the value presented during open. Any divergence invalidates the tag.
Returning to the update package example: if the seal includes in the AAD the identifier of the target component and the expected version, the open fails as soon as the package is presented to a different component or for a different version. The ciphertext cannot be replayed outside its declared context: without the key, the adversary cannot compute a valid tag for any other context.
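The update package scenario can be sketched in a few lines. The sketch below is an assumption-laden illustration, not a production cipher: it builds a toy encrypt-then-MAC construction from the Python standard library (HMAC-SHA256 as both keystream generator and tag) purely to show how the AAD enters the tag computation. The names `seal`, `open_`, `OpenError` and the AAD string format are hypothetical. In real code, a vetted AEAD such as AES-GCM or ChaCha20-Poly1305 from a maintained library plays this role.

```python
import hashlib
import hmac
import os
import struct

NONCE_LEN = 16  # fixed length keeps the nonce/ciphertext boundary unambiguous
TAG_LEN = 32    # HMAC-SHA256 output size


class OpenError(Exception):
    """Raised when authentication fails; carries no further detail."""


def _subkeys(key):
    # Separate encryption and MAC keys derived from the single input key.
    enc = hmac.new(key, b"toy-aead/enc", hashlib.sha256).digest()
    mac = hmac.new(key, b"toy-aead/mac", hashlib.sha256).digest()
    return enc, mac


def _keystream(enc_key, nonce, length):
    out, counter = b"", 0
    while len(out) < length:
        block = nonce + struct.pack(">Q", counter)
        out += hmac.new(enc_key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _tag(mac_key, nonce, aad, ciphertext):
    # The AAD is length-prefixed, then covered together with nonce and ciphertext.
    msg = struct.pack(">Q", len(aad)) + aad + nonce + ciphertext
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def seal(key, nonce, plaintext, aad):
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be 16 bytes")
    enc_key, mac_key = _subkeys(key)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return ciphertext + _tag(mac_key, nonce, aad, ciphertext)


def open_(key, nonce, sealed, aad):
    if len(nonce) != NONCE_LEN or len(sealed) < TAG_LEN:
        raise OpenError("authentication failed")
    enc_key, mac_key = _subkeys(key)
    ciphertext, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    # Verify first; nothing is decrypted if the tag does not match.
    if not hmac.compare_digest(tag, _tag(mac_key, nonce, aad, ciphertext)):
        raise OpenError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


key = os.urandom(32)
nonce = os.urandom(NONCE_LEN)
plaintext = b"update payload for component A"
sealed = seal(key, nonce, plaintext, b"component=A;version=2")

# Same context: the open succeeds.
recovered = open_(key, nonce, sealed, b"component=A;version=2")

# Replay toward component B: same bytes, different AAD, the open fails.
try:
    open_(key, nonce, sealed, b"component=B;version=2")
    replay_accepted = True
except OpenError:
    replay_accepted = False
```

The receiver never trusts an AAD supplied by the sender. Component B builds the AAD from its own identity and expected version, which is what makes the replayed package fail.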
Domain separation
Domain separation is the principle that a key, or a cryptographic value, must serve only one well-defined purpose: each role or usage gets its own key or context, so that a ciphertext valid in one domain cannot be replayed in another. Two different operations in the same system must not share the same key, even if they use the same algorithm.
Why? Because a ciphertext produced for one purpose can be replayed into another if nothing distinguishes the two contexts cryptographically. Here are two representative cases.
Session tokens and API tokens. A service generates tokens encrypted with AES-GCM. If the same key is used to encrypt both session tokens and long-lived API tokens, an adversary who obtains an API token can try to present it as a session token. The tag is valid, the nonce is unique. Without domain separation, the system has no way to distinguish the two.
Encryption for sender and for recipient. In a messaging protocol, messages from Alice to Bob and messages from Bob to Alice must not be encrypted under the same key: each direction needs its own. Otherwise, a message from Alice to Bob can be reflected back to Alice as if it came from Bob.
Separation is implemented in two ways, often combined.
- Key per purpose: derive a distinct key for each role from a master key, with an explicit label (for example a key derivation function with a context of “session-token” for one and “api-token” for the other).
- Context in the AAD: systematically include a domain identifier in the associated data (a string such as "purpose=session" or "direction=client-to-server"). Two ciphertexts produced in different domains will have different AAD values, hence different tags, even with the same key.
Both approaches reinforce each other. The key per purpose forbids replay at the level of cryptographic material. The context in the AAD forbids replay even if two purposes accidentally share a key.
The principle to retain: an encrypted object must explicitly declare its domain of use, and the system must make it impossible to present it in a different domain.
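The "key per purpose" approach can be sketched with HKDF (RFC 5869), written here directly on top of the standard library's HMAC-SHA256. This is a minimal sketch under assumptions: the function name `hkdf_sha256` and the fixed demo master key are hypothetical, the labels are the ones used as examples in this chapter, and a production system would normally call the HKDF provided by a vetted cryptographic library.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(master_key, info, length=32, salt=b""):
    """Derive `length` bytes from `master_key`, bound to the label `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long")
    # Extract step: an absent salt is replaced by HASH_LEN zero bytes.
    prk = hmac.new(salt or bytes(HASH_LEN), master_key, hashlib.sha256).digest()
    # Expand step: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Demo value only; a real master key comes from a secure random source.
master_key = bytes(range(32))

session_key = hkdf_sha256(master_key, b"session-token")
api_key = hkdf_sha256(master_key, b"api-token")

# The same idea applies to message direction.
c2s_key = hkdf_sha256(master_key, b"direction=client-to-server")
s2c_key = hkdf_sha256(master_key, b"direction=server-to-client")
```

Because each label yields an unrelated key, a token sealed under `session_key` simply fails to open under `api_key`. Adding the same label to the AAD keeps the second line of defense in place if two purposes ever end up sharing a key by mistake.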
Plaintext headers, but authenticated
In many protocols, a header travels in plaintext before the ciphertext. This header can contain a version identifier, an algorithm identifier, a recipient identifier, a message length, or a combination of these fields. It must be readable before decryption, notably to route the request to the correct key or algorithm.
The problem is this: if the header is not covered by the tag, it is malleable, meaning an adversary can modify it without invalidating the authentication. Malleability is usually defined for ciphertext: it is the property of an encryption scheme in which modifying the ciphertext produces a predictable, exploitable change in the corresponding plaintext. Unauthenticated cipher modes (stream, CTR, CBC without MAC) are malleable, and an AEAD algorithm eliminates the property by causing any modified ciphertext to fail decryption. An unauthenticated header has the same weakness. The consequences are concrete.
- Recipient redirection. If the recipient identifier in the header is not authenticated, an adversary can replace Bob’s identifier with Alice’s. The message is decrypted by Alice, who receives content intended for Bob. The tag is valid, because it does not cover the header.
- Algorithm downgrade. If the algorithm identifier in the header is not authenticated, an adversary can replace it with a weaker algorithm. The recipient attempts to decrypt with that algorithm, believing it is what the sender chose.
- Version confusion. If the version number is not covered by the tag, an adversary can make a message from one version pass as another, hoping the other version’s parser handles the bytes differently.
The solution is direct: place the entire header in the AAD. The header remains readable in plaintext (AAD is not encrypted), but any modification invalidates the tag. The sender includes the header in the seal. The recipient reconstructs the header from the received bytes and passes it to the open. If a single field was modified in transit, the open fails.
This is the same mechanism as in module 1: the authentication tag covers everything that must be protected against modification, whether it is encrypted or not.
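To isolate the effect of covering the header, the sketch below uses HMAC-SHA256 from the standard library as a stand-in for the AEAD tag and treats the ciphertext as opaque bytes. The header layout (version, algorithm identifier, length-prefixed recipient), the function names and the all-zero demo key are assumptions made for illustration. With a real AEAD, the bytes returned by `encode_header` would simply be passed as the associated data to both seal and open.

```python
import hashlib
import hmac
import struct


def encode_header(version, alg_id, recipient):
    # Fixed-width fields plus a length-prefixed recipient: one unambiguous encoding.
    return struct.pack(">BBH", version, alg_id, len(recipient)) + recipient


def compute_tag(mac_key, header, ciphertext):
    # The header is length-prefixed so the header/ciphertext boundary cannot shift.
    framed = struct.pack(">I", len(header)) + header + ciphertext
    return hmac.new(mac_key, framed, hashlib.sha256).digest()


def verify(mac_key, header, ciphertext, tag):
    return hmac.compare_digest(tag, compute_tag(mac_key, header, ciphertext))


mac_key = bytes(32)                # demo value only, never a real key
ciphertext = b"\x8f\x1c\x42\x07"   # opaque placeholder bytes

# Sender: the tag covers the header as well as the ciphertext.
header = encode_header(1, 1, b"bob")
tag = compute_tag(mac_key, header, ciphertext)

# Recipient: rebuilds the check from the bytes actually received.
untouched_ok = verify(mac_key, header, ciphertext, tag)

# Adversary swaps the recipient field in transit; the ciphertext is unchanged.
forged_header = encode_header(1, 1, b"alice")
forged_ok = verify(mac_key, forged_header, ciphertext, tag)
```

The recipient field stays readable by anyone on the path, which is what routing needs, yet the redirected message is rejected before any plaintext is released.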
Versioned formats and cryptographic agility
A system deployed in production cannot change algorithms in a single restart. Data exists, encrypted with the current algorithm. Older clients must continue to work. Migration must be progressive.
Cryptographic agility (crypto-agility) is the ability of a system to migrate from one algorithm to another without rewriting the entire codebase and without invalidating existing data. It relies on a versioned message format: old records remain decodable while new ones are encrypted with the current algorithm, which is essential for preparing a post-quantum migration.
The principle is simple: the first byte (or the first few bytes) of the encoded message indicates which primitive was used to encrypt it. The decryptor reads this field first, chooses the appropriate decryption path, then proceeds.
[version: 1 byte | nonce: N bytes | ciphertext + tag: M bytes]
Version 1 means AES-256-GCM. Version 2 means XChaCha20-Poly1305. Version 3 could, tomorrow, mean a quantum-resistant algorithm. The versions coexist: data encrypted in version 1 remains decryptable as long as the version 1 key is retained.
This mechanism is already present in common protocols. TLS encodes the cipher suite identifier in the handshake. GPG key formats include an algorithm byte. JWT formats include an alg field in the header.
But it introduces a critical constraint: the version byte must itself be covered by the tag. Otherwise, an adversary can modify the version to force the decryptor to use a weaker algorithm or a different key. The version byte is in plaintext (so the decryptor knows what to do), but it must appear in the AAD of both seal and open.
Structure of an authenticated message
The structure above illustrates the general principle. The header is readable without a key, which allows the recipient to select the correct key and algorithm before attempting decryption. But the header is cryptographically bound to the ciphertext via the tag: it is impossible to modify the version, the nonce, or the recipient identifier without invalidating the authentication.
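The reading side of a versioned format can be sketched as a small parser that dispatches on the version byte. The sketch assumes the layout given earlier (one version byte, then the nonce, then ciphertext and tag) and the version assignments from this chapter; the 12-byte and 24-byte nonce lengths and the 16-byte tag are the standard sizes for AES-GCM and XChaCha20-Poly1305. The names `FORMATS`, `ParsedMessage` and `parse_message` are hypothetical, and the decryption call itself is left out.

```python
from typing import NamedTuple

TAG_LEN = 16  # both primitives listed below use a 16-byte tag

# version byte -> (primitive name, nonce length in bytes)
FORMATS = {
    1: ("AES-256-GCM", 12),
    2: ("XChaCha20-Poly1305", 24),
}


class ParsedMessage(NamedTuple):
    version: int
    primitive: str
    nonce: bytes
    sealed: bytes  # ciphertext followed by the tag
    aad: bytes     # the version byte, to be passed as AAD to the open


def parse_message(blob):
    if not blob:
        raise ValueError("empty message")
    version = blob[0]
    if version not in FORMATS:
        # Unknown versions are rejected outright, never mapped to a default.
        raise ValueError("unsupported version")
    primitive, nonce_len = FORMATS[version]
    if len(blob) < 1 + nonce_len + TAG_LEN:
        raise ValueError("truncated message")
    nonce = blob[1:1 + nonce_len]
    sealed = blob[1 + nonce_len:]
    return ParsedMessage(version, primitive, nonce, sealed, blob[:1])


# A placeholder version 2 message: 24-byte nonce, 4 ciphertext bytes, 16-byte tag.
blob = bytes([2]) + bytes(24) + bytes(4 + TAG_LEN)
parsed = parse_message(blob)
```

The parser only selects the decryption path. Trust in the version byte comes later, when `parsed.aad` is handed to the open of the selected primitive: if an adversary rewrote the byte in transit, the tag no longer matches and the open fails.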
The binding principle
Everything above converges on a single principle:
An authentic ciphertext outside its context remains a vulnerability. Context is part of the message.
This principle has immediate practical consequences.
- Each ciphertext must explicitly declare who it is for, for what purpose, in which domain, with which version.
- These metadata must not merely accompany the ciphertext: they must be bound to it by the tag, so that a replay in the wrong context is detected cryptographically, not merely by an application-level check.
- The application-level check (verifying an identifier against a database, validating token expiry) remains necessary, but it does not replace cryptographic binding. Both layers are complementary.
Quiz
1. What is associated data (AAD) in an AEAD scheme?
2. A service encrypts session tokens and API tokens with the same AES-GCM key, without domain separation. What risk does this create?
3. A header containing the recipient identifier travels in plaintext before the ciphertext, but is not included in the AAD. What can an adversary do?
4. Why must the version byte in a versioned message format be included in the AAD?
Key takeaways
- Associated data is transmitted in plaintext but covered by the authentication tag. Modifying it causes the open to fail. It binds the ciphertext to its context of use.
- An empty AAD is cryptographically valid, but means the ciphertext is not bound to any context and can be freely replayed.
- Domain separation consists of deriving distinct keys per purpose or including a domain identifier in the AAD, so that a ciphertext produced in one domain is not valid in another.
- Every plaintext header field must appear in the AAD: version, algorithm, recipient identifier, message direction. Without this, the header is malleable: it can be altered without detection.
- A versioned format enables primitive migration without invalidating existing data. The version byte must itself be covered by the tag.
- An authentic ciphertext outside its context remains a vulnerability. Context is part of the message.