Encapsulating Security Protocol (ESP)

Finjan Team | Blog, data security

Messages, documents, and files sent via the internet are transmitted in the form of data packets using one or more transfer mechanisms or protocols such as TCP/IP. But how can we ensure that the information received is the authentic material which the originator of the message claims to have sent? That its confidentiality has been preserved along the way? And that it retains the integrity and authenticity of the source material?

One way to ensure all that is through Encapsulating Security Payload or ESP.

Encapsulating Security Payload – Nothing supernatural or psychic about this ESP

Encapsulating Security Payload (or ESP) is a network layer security protocol, part of the IPSec suite, designed to function with both IPv4 and IPv6. It takes the form of a header inserted after the Internet Protocol or IP header, before an upper layer protocol like TCP, UDP, or ICMP, and before any other IPSec headers that have already been put in place.

ESP gives protection to upper layer protocols, with a Signed area indicating where a protected data packet has been signed for integrity, and an Encrypted area which indicates the information that’s protected with confidentiality. Unless a data packet is being tunneled, ESP protects only the IP data payload (hence the name), and not the IP header.

ESP may be used to ensure confidentiality, data origin authentication, connectionless integrity, limited traffic flow confidentiality, and an anti-replay service (a form of partial sequence integrity which guards against an attacker capturing valid packets and re-sending them later).

The set of services provided by ESP depends on the options selected when a Security Association (or SA) was established, and also on the location of the service’s deployment within the network configuration.

The Encapsulating Security Payload Header

In IPv4 and IPv6, the ESP header is designed to provide a range of security services. The ESP protocol may be applied in isolation, in combination with an Authentication Header (AH), or in a nested manner. Security services may be provided between a pair of communicating hosts, a pair of communicating security gateways, or between a host and a security gateway.

In practice, the ESP header is placed after the IP header and before the next layer protocol header when used in transport mode (see below), or before an encapsulated IP header in tunnel mode. The ESP header itself consists of two parts: a Security Parameters Index, and a sequence number.

Security Parameters Index

The Security Parameters Index (SPI) is an arbitrary 32-bit number which, in conjunction with the destination IP address and the security protocol (ESP), uniquely identifies the Security Association for a protected datagram (packet). The SPI is usually selected by the destination system when an SA is established. Values in the range from 1 to 255 are reserved for future use, and the value 0 is reserved for local, implementation-specific use and is not sent on the wire. The SPI is a mandatory field of the ESP header.
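As a rough illustration of how the SPI helps identify a Security Association, the sketch below reads the first 32 bits of an ESP packet and uses them, with the destination address and the ESP protocol number (50), as a lookup key. The `SecurityAssociation` type, the `sa_database` table, and its entries are hypothetical and exist only for this example.

```python
import struct
from typing import NamedTuple, Optional

ESP_PROTOCOL = 50  # IANA-assigned IP protocol number for ESP


class SecurityAssociation(NamedTuple):
    name: str          # hypothetical label, for illustration only
    anti_replay: bool  # whether the receiver enforces replay checks


# Hypothetical SA table keyed by (destination address, protocol, SPI).
sa_database = {
    ("192.0.2.10", ESP_PROTOCOL, 0x00001F40): SecurityAssociation("site-a", True),
    ("192.0.2.10", ESP_PROTOCOL, 0x00001F41): SecurityAssociation("site-b", False),
}


def read_spi(esp_packet: bytes) -> int:
    """Return the SPI: the first 32 bits of the ESP header, in network byte order."""
    (spi,) = struct.unpack_from("!I", esp_packet, 0)
    return spi


def lookup_sa(dst_addr: str, esp_packet: bytes) -> Optional[SecurityAssociation]:
    """Find the SA for an inbound ESP packet, or None if nothing matches."""
    return sa_database.get((dst_addr, ESP_PROTOCOL, read_spi(esp_packet)))
```

A packet whose SPI does not match any entry for its destination would simply find no SA in this sketch; how a real implementation handles that case is outside its scope.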

Sequence Number

A 32-bit sequence number is also mandatory, and always present – even if a receiver elects not to enable the anti-replay service for a given Security Association. The sender is obliged to always transmit this field of the ESP header, whose processing (or not) is left to the discretion of the recipient.

When an SA is set up, the counters at both the sender’s and receiver’s ends are initialized to zero. The sender increments its counter for each packet, so the first packet sent using a given SA carries the value 1. Because the sequence number is intended as a mechanism for anti-replay protection, it increases in single steps and, while anti-replay is enabled, is never allowed to cycle: a new SA must be established before the counter would wrap around.

The receiver checks this field to verify that a packet for a Security Association bearing this number has not been received already. The packet is rejected if one has been received.
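The sender and receiver behavior described in this section can be sketched as follows. This is a minimal illustration, not a complete implementation: the `Sender` and `ReplayWindow` class names are made up for this example, and the 64-packet sliding window is an assumed size that lets slightly out-of-order packets through while still rejecting duplicates. A real receiver would also update its window only after the packet's integrity check has passed.

```python
import struct


class Sender:
    """Builds the 8-byte ESP header (SPI, then sequence number) for each packet."""

    def __init__(self, spi: int):
        self.spi = spi
        self.counter = 0  # initialized to zero when the SA is set up

    def next_header(self) -> bytes:
        if self.counter == 0xFFFFFFFF:
            # The sequence number must not cycle; a new SA is needed instead.
            raise OverflowError("sequence number space exhausted")
        self.counter += 1  # first packet on the SA carries the value 1
        return struct.pack("!II", self.spi, self.counter)


class ReplayWindow:
    """Receiver-side duplicate detection using a sliding bitmap (assumed size 64)."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = 0  # highest sequence number accepted so far
        self.bitmap = 0   # bit i set means (highest - i) was already seen

    def check_and_update(self, seq: int) -> bool:
        """Return True to accept the packet, False to reject it."""
        if seq == 0:
            return False  # 0 is never sent on an SA
        if seq > self.highest:
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False  # too old to track
        if self.bitmap & (1 << offset):
            return False  # already received: replay
        self.bitmap |= 1 << offset
        return True
```

For example, after accepting sequence numbers 1 and 3, the window in this sketch still accepts a late-arriving 2 but rejects a second copy of any of the three.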

The Payload

This is a variable-length data field containing the information described by the Next header field. This field is mandatory, and its length must be an integral number of bytes.

Any algorithm that needs explicit, per-packet synchronization data to encrypt the payload (an Initialization Vector, for example) may carry that data in the Payload field. The RFC specifying how the algorithm is used with ESP must indicate the length of this data, any structure it has, and where it is located.

The ESP Trailer

As its name suggests, the ESP trailer comes after the data payload, and consists of three fields: padding, padding length, and the next header.


Padding

Padding of 0 to 255 bytes serves two purposes. It fills the plaintext out to a multiple of the block size when the encryption algorithm or block cipher requires this. Independently of any such requirement, it also ensures that the resulting ciphertext terminates on a 4 byte boundary, so that the Padding Length and Next Header fields are right-aligned within a 32-bit word. The 4 byte boundary condition is necessary to ensure the correct positioning of the Authentication data field, if present.

Padding Length

This is an 8-bit figure which specifies the length of the Padding field in bytes, from 0 to 255, where 0 means that no padding is present. The recipient uses it to determine how many padding bytes to strip from the decrypted data before passing the payload on.

Next Header

This field indicates the nature of the payload (e.g., TCP or UDP). The Next Header is an IPv4 or IPv6 protocol number describing the format of the Payload data field.
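The arithmetic behind the three trailer fields can be sketched as below. The function names are hypothetical, and the `block_size` parameter is an assumption standing in for whatever alignment the chosen cipher needs (4 bytes when only the 32-bit alignment applies). The padding bytes are filled with the sequence 1, 2, 3, ..., which is the default ESP uses when the encryption algorithm does not specify its own padding content.

```python
def build_esp_trailer(payload_len: int, next_header: int, block_size: int = 4) -> bytes:
    """Return Padding + Padding Length + Next Header for a payload of the given length."""
    if block_size % 4 != 0 or not 4 <= block_size <= 256:
        raise ValueError("block_size must be a multiple of 4 between 4 and 256")
    # Payload + padding + 2 trailer bytes must end on the alignment boundary.
    pad_len = (-(payload_len + 2)) % block_size
    padding = bytes(range(1, pad_len + 1))  # default fill: 1, 2, 3, ...
    return padding + bytes([pad_len, next_header])


def strip_esp_trailer(plaintext: bytes):
    """Split decrypted data into (payload, next_header) using the Padding Length byte."""
    if len(plaintext) < 2:
        raise ValueError("too short to hold an ESP trailer")
    pad_len, next_header = plaintext[-2], plaintext[-1]
    if pad_len > len(plaintext) - 2:
        raise ValueError("Padding Length exceeds available data")
    payload = plaintext[: len(plaintext) - 2 - pad_len]
    return payload, next_header
```

With a 5-byte payload and Next Header 6 (TCP), one padding byte is enough here: 5 + 1 + 2 = 8 bytes, which ends on a 4 byte boundary.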

The ESP Authentication Trailer

The ESP Authentication Trailer contains the Authentication Data field, which holds the Integrity Check Value (ICV): a message authentication code for verifying both the sender’s identity and the message’s integrity. The ICV is calculated over the ESP header, the payload data, and the ESP trailer. The field is optional, and is present only if the authentication service was selected for the SA.
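To make the ICV calculation concrete, here is a minimal sketch that assumes HMAC-SHA-1-96 as the authentication algorithm (HMAC-SHA-1 truncated to 96 bits, one of the algorithms defined for use with ESP). The function names and the key handling are illustrative only. When encryption is also in use, the payload and trailer passed in would be the ciphertext, since the ICV is computed after encryption.

```python
import hashlib
import hmac

ICV_LEN = 12  # HMAC-SHA-1-96 keeps the first 96 bits of the digest


def compute_icv(auth_key: bytes, esp_header: bytes, payload: bytes, esp_trailer: bytes) -> bytes:
    """Compute the ICV over the ESP header, payload data, and ESP trailer."""
    mac = hmac.new(auth_key, esp_header + payload + esp_trailer, hashlib.sha1)
    return mac.digest()[:ICV_LEN]


def verify_icv(auth_key: bytes, esp_packet: bytes) -> bool:
    """Check the trailing ICV of a full ESP packet (header through authentication data)."""
    if len(esp_packet) < 8 + ICV_LEN:
        return False  # cannot hold an 8-byte ESP header plus an ICV
    body, received = esp_packet[:-ICV_LEN], esp_packet[-ICV_LEN:]
    expected = hmac.new(auth_key, body, hashlib.sha1).digest()[:ICV_LEN]
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, received)
```

A receiver following this sketch would call `verify_icv` first and discard the packet on failure, before doing any decryption.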

Transport mode

When used in transport mode, the ESP header follows the IP header of the original IP datagram. If the datagram carries an IPSec header, then the ESP header goes before this. The ESP trailer and optional authentication data are inserted after the payload.

Transport mode doesn’t authenticate or encrypt the IP header, which can potentially expose the addressing information to attackers while the packet is in transit. But although it doesn’t provide as much security protection as tunnel mode, hosts typically use ESP in transport mode, as this requires less processing power.

Tunnel mode

Encapsulation or protective coverage occurs more extensively in tunnel mode, which creates and uses a new IP header as the outermost IP header of a datagram. This is followed by the ESP header, then the original datagram (which includes both the IP header and the original payload). As in transport mode, the ESP trailer and optional authentication data are appended to the payload.

In tunnel mode, ESP completely protects the original datagram, which now forms the payload data for the newly formed ESP data packet. Again, though, ESP does not protect the new IP header. A security gateway is required to use ESP in tunnel mode when it protects traffic passing through it on behalf of other hosts.
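The difference between the two modes comes down to the order of fields in the resulting packet. The sketch below lists that order for each mode and shows which fields fall inside the encrypted region. The function names and the field labels are informal descriptions chosen for this example, not protocol identifiers.

```python
def esp_layout(mode: str, authenticated: bool = True) -> list:
    """Return the on-the-wire order of fields for the given ESP mode."""
    if mode == "transport":
        fields = ["IP header", "ESP header", "upper-layer header and data", "ESP trailer"]
    elif mode == "tunnel":
        fields = ["new IP header", "ESP header", "original IP header",
                  "original payload", "ESP trailer"]
    else:
        raise ValueError("mode must be 'transport' or 'tunnel'")
    if authenticated:
        fields.append("ESP authentication data")  # optional ICV
    return fields


def encrypted_fields(mode: str) -> list:
    """Fields covered by encryption: everything after the ESP header through the ESP trailer."""
    fields = esp_layout(mode, authenticated=False)
    return fields[fields.index("ESP header") + 1:]
```

Comparing `encrypted_fields("transport")` with `encrypted_fields("tunnel")` shows the key point: only tunnel mode places the original IP header inside the protected region, while the outermost IP header stays unprotected in both modes.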

Protection Under ESP

The Encapsulating Security Payload (ESP) protocol ensures data confidentiality, and also optionally provides data origin authentication, data integrity checking, and replay protection. ESP provides encryption, with both communicating parties using a shared key for encrypting and decrypting the data they exchange.

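When encryption and authentication are used together under ESP, the receiver verifies the authentication data before decrypting. Replayed or forged packets can therefore be detected and rejected without spending processor time on decryption, which can reduce a system’s vulnerability to denial-of-service (DoS) attacks.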
