
Insecure Deserialization: Trusting a Blob of Bytes Too Much

Turning saved data back into live objects sounds harmless. When the data comes from an attacker, it can lead to remote code execution. Here is how insecure deserialization works and how to avoid it.

HolyGhost · 8 min read

Imagine you handed someone a flat-pack wardrobe in a box, along with a sheet of instructions, and asked them to build it in your bedroom. You trust the box because you packed it yourself. Now imagine a stranger swaps the instruction sheet for one of their own. It still looks like assembly steps, but buried in the middle is a line that says "and while you are here, unlock the front door and let me in". The person building the wardrobe follows every step faithfully, because following the instructions is their whole job. That is insecure deserialization. The dangerous part is not the box of parts; it is that rebuilding something from instructions can quietly do far more than assemble furniture.

Insecure deserialization is one of the less intuitive vulnerabilities, which is exactly why it catches people out. It hides inside a routine, everyday operation (turning stored data back into live objects), and when that data is attacker-controlled it can escalate all the way to running code on the server. Here is the idea, kept plain.

Scope

This is a defensive explainer of a well-known vulnerability class, for use on systems you own or are authorised to test.

Serialization, in one minute

To understand the vulnerability you first need one concept, and it is genuinely simple. Programs work with objects in memory. An object is a structured bundle of related information, and sometimes behaviour, held together as one thing. A "user" object might hold a name, an email address, and a list of permissions all in one place.

Objects live in the computer's memory, but memory is temporary and local. The moment you want to save an object to disk, send it across a network, or tuck it into a browser cookie, you cannot ship the live object itself. You have to flatten it into something portable. That flattening is called serialization: turning a structured object into a plain stream of bytes or text. Later, when the program needs the object back, it reverses the process. That is deserialization: rebuilding a live object from the flattened stream.

object in memory  --serialize-->  bytes  --deserialize-->  object in memory
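As a minimal sketch of that round trip, here is a "user" object flattened to bytes and rebuilt. The User class, its field names, and the serialize and deserialize helpers are hypothetical, chosen only to mirror the example above, and JSON is used as the flattened form purely for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class User:
    # Hypothetical fields, mirroring the "user" object described earlier.
    name: str
    email: str
    permissions: list[str]


def serialize(user: User) -> bytes:
    """Flatten the in-memory object into portable bytes."""
    return json.dumps(asdict(user)).encode("utf-8")


def deserialize(blob: bytes) -> User:
    """Rebuild a User from bytes. The code, not the data, picks the type."""
    fields = json.loads(blob.decode("utf-8"))
    return User(
        name=fields["name"],
        email=fields["email"],
        permissions=list(fields["permissions"]),
    )


original = User("Ada", "ada@example.com", ["read", "write"])
blob = serialize(original)      # object in memory -> bytes
restored = deserialize(blob)    # bytes -> a new, equal object in memory
```

Note that in this sketch the receiving code decides which type to build. The bytes only supply values, which is the property the rest of this article is concerned with.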

This happens constantly and invisibly all over modern software: session data that remembers you are logged in, caches that store computed results, message queues that pass work between services, and API payloads that carry data between systems. It is useful and completely normal. The trouble starts only when the bytes being deserialized came from someone you do not trust.

Why rebuilding an object can be dangerous

Here is the part that surprises people. Deserializing is not always a passive "read some data into a box" step. Depending on the language and the library doing the work, reconstructing an object can run code as part of the rebuilding process. To recreate the object faithfully, the deserializer may set fields, call special setup methods, and, most importantly, decide which types of object to create based on what the incoming data tells it to create.

That last point is the crack in the wall. If the data itself gets to say "build me an object of this type, with these values", then an attacker who controls the data gets to influence what objects come to life inside your program, and what methods run as they do.

An attacker who can supply the serialized data can therefore craft input that, when deserialized, brings dangerous objects into existence, or triggers a chain of the application's own existing code that ends somewhere harmful. That chain has a name: a gadget chain. The attacker does not usually smuggle in brand-new code. Instead they link together pieces of code that are already present in the application and its libraries, arranging for them to call one another in a sequence that finishes by doing something the attacker wants, such as running a system command.

1. The app deserializes data from a cookie, upload, or message.
2. The attacker supplies crafted serialized data instead.
3. Rebuilding it instantiates objects and calls methods the attacker chose.
4. A chain of those calls leads to code execution, or other abuse.
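The steps above can be seen in miniature with Python's pickle, using a deliberately harmless payload. This is a sketch for intuition only: the Harmless class is hypothetical, and the callable it names is the built-in sum, chosen because it has no side effects. The point is that the byte stream, not the loading code, decides which function gets called.

```python
import pickle


class Harmless:
    """Hypothetical stand-in for attacker-shaped data."""

    def __reduce__(self):
        # __reduce__ tells pickle how to "rebuild" this object.
        # Here it records "call sum([1, 2, 3])" in the byte stream.
        return (sum, ([1, 2, 3],))


blob = pickle.dumps(Harmless())  # bytes shaped by whoever wrote them
result = pickle.loads(blob)      # loading calls the recorded callable

# result is the integer 6, not a Harmless instance: the loader ran
# a function that the data named.
```

Nothing in the loading line mentions sum. A real attack follows the same shape with a callable that does damage, which is why the loader itself has to be treated as the dangerous component.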

Why gadget chains feel like magic

It seems impossible that harmless building-block code lying around in libraries could be strung into an attack. But large applications include enormous amounts of code, and among all those methods there are almost always some that, in the right order with the right inputs, add up to something powerful. The attacker is not writing an exploit so much as discovering a path through code that already exists.

The results range from tampering with data and causing a denial of service, where the application is knocked over or hung, all the way to full remote code execution, meaning the attacker runs their own commands on your server from afar. That top end is why this sits among the more serious web vulnerabilities. It affects many ecosystems, including Java object streams, Python's pickle, PHP's unserialize, and .NET formatters such as BinaryFormatter.

Pickle and friends are not for untrusted data

Some serialization formats are explicitly documented as unsafe for untrusted input, Python's pickle being the classic example. Its own documentation warns plainly that you should never unpickle data from a source you do not trust. If you find code deserializing untrusted data with one of these, treat it as a red flag, not an edge case.

How it shows up in the real world

The reason this vulnerability keeps appearing is that the dangerous data often does not look dangerous, and it arrives through channels that feel routine. A common pattern is an application that stores session state in a cookie. To save server memory, it serializes the session object, hands the bytes to the browser as a cookie, and deserializes them again on the next request. From the outside, a cookie is just a string. But if that string is a serialized object and the server rebuilds it without suspicion, the browser, which the user fully controls, has become a direct pipe into the deserializer.

The same shape appears with uploaded files that carry serialized state, with hidden form fields, with data pulled off a message queue, and with tokens passed between services. Anywhere serialized bytes cross a trust boundary and get rebuilt on the other side, the question to ask is the same: could an attacker have shaped these bytes?

The tell in code review

When reading code, treat any call that turns external bytes back into rich language objects as a place to stop and think, especially names built around unserialize, readObject, pickle.loads, or a .NET formatter deserializing a stream. The vulnerability is rarely loud. It looks like ordinary plumbing, which is exactly why it survives so long unnoticed.
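For a Python codebase, that review habit can be partly automated. The sketch below walks a file's syntax tree and reports calls on a small watchlist. The RISKY_CALLS set and the find_risky_calls helper are hypothetical and far from complete: it only catches the module.function spelling, so it will miss names brought in with "from pickle import loads" or aliased imports.

```python
import ast

# Hypothetical watchlist of (module, function) pairs that rebuild rich objects.
RISKY_CALLS = {
    ("pickle", "loads"),
    ("pickle", "load"),
    ("marshal", "loads"),
    ("shelve", "open"),
}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, call name) for each watchlisted call in source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        target = node.func
        if isinstance(target.value, ast.Name):
            pair = (target.value.id, target.attr)
            if pair in RISKY_CALLS:
                findings.append((node.lineno, f"{pair[0]}.{pair[1]}"))
    return sorted(findings)


sample = (
    "import pickle\n"
    "\n"
    "def load_session(cookie):\n"
    "    return pickle.loads(cookie)\n"
)
report = find_risky_calls(sample)  # [(4, 'pickle.loads')]
```

A hit is a prompt to ask where the bytes came from, not proof of a vulnerability. The same call on data the server wrote and stored itself may be fine.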

Avoiding it

The theme echoes the other injection-style flaws: do not let untrusted input drive a powerful operation.

  1. Do not deserialize untrusted data with rich, code-capable formats. This is the real fix, and it is worth stating first because everything else is a fallback. If input comes from a user, do not feed it to a deserializer that can instantiate arbitrary objects.
  2. Prefer simple data formats. Exchange data as plain JSON (a lightweight text format that represents only data such as numbers, strings, and lists) and parse it into known structures, rather than exchanging serialized language objects. The principle is a good one to keep in your head: data should be data, not a recipe for building objects.
  3. Verify integrity. Where you genuinely must round-trip serialized data through a client, attach a cryptographic signature, a tamper-evident seal the server creates with a secret key. On the way back in, the server checks the seal and refuses anything it did not produce itself, so an attacker cannot swap in their own payload undetected.
  4. Constrain types. If a rich deserializer is truly unavoidable, lock it down to an allowlist of expected, safe types, a short list of exactly what it is permitted to build, rather than letting it construct anything the data names.
  5. Keep libraries patched. Gadget chains are often discovered inside popular libraries. Staying current on updates removes known chains as maintainers close them off, which shrinks the toolkit an attacker has to work with.
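For item 4, Python's pickle documentation describes restricting what a loader may look up by overriding Unpickler.find_class. The sketch below follows that pattern. The ALLOWED_GLOBALS set, the AllowlistUnpickler class, and the restricted_loads helper are hypothetical names, and the allowlist contents are an assumption about what one particular application expects. This narrows the risk rather than removing it, so it remains a fallback to item 1.

```python
import io
import pickle

# Hypothetical allowlist: the only (module, name) pairs this app expects.
ALLOWED_GLOBALS = {("builtins", "set"), ("builtins", "frozenset")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the byte stream names a global to look up.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def restricted_loads(blob: bytes):
    """Like pickle.loads, but refuses any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(blob)).load()


# Plain containers and allowlisted types load normally.
safe = restricted_loads(pickle.dumps({"roles": {"reader", "editor"}}))

# A stream that names any other global is refused before it is used.
try:
    restricted_loads(pickle.dumps(sum))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Every entry added to the allowlist needs the same scrutiny as the loader itself, because an allowed type with powerful behaviour reopens the door.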

A simple mental model

Ask one question of any deserialization in your system: if an attacker wrote these exact bytes, what is the worst that could happen? If the honest answer is "they could build objects and run methods of their choosing", you are relying on nobody ever tampering with the input. That is not a security control; it is a hope.
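One way to replace that hope with a check, for data that must travel through a client, is to send plain JSON sealed with an HMAC. This is a minimal sketch using Python's standard library. SECRET_KEY, sign_session, verify_session, and the cookie layout are all hypothetical, and a real deployment would also need key management and expiry, which are left out here.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical placeholder: a real key is random and never leaves the server.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def sign_session(session: dict) -> str:
    """Encode the session as JSON and append an HMAC-SHA256 tag."""
    body = json.dumps(session, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(body)
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode('ascii')}.{tag}"


def verify_session(cookie: str) -> dict:
    """Check the tag first; parse the payload only if the tag matches."""
    payload, sep, tag = cookie.rpartition(".")
    expected = hmac.new(
        SECRET_KEY, payload.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    if not sep or not hmac.compare_digest(
        tag.encode("utf-8"), expected.encode("utf-8")
    ):
        raise ValueError("session cookie failed integrity check")
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = sign_session({"user": "ada", "role": "reader"})
session = verify_session(cookie)
```

The signature gives integrity, not secrecy: the client can still read the payload, it just cannot change it without the server noticing. Because the payload is JSON, even a leaked key would let an attacker forge data but not choose which objects get built.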

The takeaway

Insecure deserialization happens when an application rebuilds objects from data it does not trust, and the act of rebuilding can run code or trigger harmful gadget chains, sometimes reaching remote code execution. The dependable answer is to never deserialize untrusted input with a format that can construct arbitrary objects. Use plain data formats, verify integrity when data must travel, constrain the types a deserializer may build, and keep the powerful deserializers well away from anything an attacker can touch. It is the same lesson as SQL injection and command injection in a subtler disguise: untrusted input should never become executable behaviour.