
Shellshock: The Bash Bug Hiding in an Environment Variable

For years, Bash would run code smuggled into the end of an environment variable. Shellshock (CVE-2014-6271) turned that into remote code execution across a huge slice of the internet.

HolyGhost · 7 min read

In September 2014, a researcher looked closely at a feature that had been sitting in one of the most widely used programs on earth for roughly twenty-five years, and found that it had been quietly willing to run strangers' commands the whole time. The program was Bash, the command-line shell that lives on almost every Linux and Unix machine and on the Macs of that era. The feature looked innocent. The consequence was that an attacker could run commands on a server simply by shaping a piece of data the right way and sending it along. Shellshock is a wonderful and terrifying example of how a decades-old feature, tucked away in a tool almost nobody thinks about, can turn out to be a gaping hole. This is how it worked.

Credit and scope

Shellshock was discovered by Stéphane Chazelas in 2014 and reported responsibly. This is a defensive explainer of the flaw, not original research.

First, what is a shell and what is Bash?

A shell is the program that reads the commands you type at a terminal and carries them out. When you type a command such as ls to list the files in a folder, the shell is what understands that and makes it happen. Bash, short for the Bourne Again Shell, is the most common shell on Linux and Unix systems. It is everywhere, from web servers to routers to the scripts that glue big systems together.

Crucially, Bash does not only run when a human types at it. Other programs launch Bash behind the scenes all the time to run small scripts and helper commands. That habit, programs quietly calling Bash for us, is what turned a quirky parsing bug into a global emergency.

Environment variables, in plain terms

To understand the flaw you need one more idea: environment variables. These are named values that the operating system hands to a program when it starts, a bit like a note pinned to the program saying "by the way, here are some settings you might want." A program might receive an environment variable telling it the current user's name, or where to find temporary files. They are ordinary, everywhere, and usually harmless.

Bash has a feature that lets you pass not just simple values but whole functions to child programs through these environment variables. To support that, when Bash started up it would look at incoming environment variables, spot any whose value began with the characters () { and so looked like a function definition, and evaluate them so the function would be ready to use.
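To make the mechanism concrete, here is a minimal sketch of a parent process handing an environment variable to a child. The variable name GREETING is hypothetical, and the child is another Python interpreter rather than Bash so the example runs anywhere. The point is only that the child receives named strings it did not choose, which is the same channel Bash's function export relied on.

```python
import os
import subprocess
import sys

# Copy the current environment and add one hypothetical variable.
child_env = dict(os.environ)
child_env["GREETING"] = "hello from the parent"

# Launch a child process (another Python interpreter, to stay portable)
# and have it print the value it inherited.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['GREETING'])"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)

received = result.stdout.strip()
print(received)
```

The child never asked where GREETING came from. It simply trusted whatever the parent put there.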

The feature that went wrong

Here is the bug. Bash did not stop reading at the end of the function definition. If someone tacked extra code on after the function, Bash cheerfully ran that too.

An environment variable set to:
   () { :; };  echo VULNERABLE
 
Bash sees the function definition () { :; }
...then keeps going and runs   echo VULNERABLE

The first part, () { :; }, defines a harmless function whose body does nothing. The part after it, echo VULNERABLE, is a separate command that was never meant to run. But Bash's parser kept going past the function and executed whatever followed. That trailing command ran automatically, just because Bash imported the variable at startup. On its own that might sound obscure and academic. The real danger lived in all the places where an attacker could control the content of an environment variable.
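The difference between the flawed behaviour and a stricter one can be sketched with a toy model. This is not Bash's real parser: the function names are hypothetical, the "parsing" is plain brace matching, and nothing is actually executed. It only shows where the line between definition and trailing text sits, and what happens when an importer ignores that line.

```python
def split_function_value(value):
    """Split an environment value into (definition, trailing_text).

    Returns None if the value does not look like an exported function.
    Toy model: brace matching only, no real shell grammar.
    """
    if not value.startswith("() {"):
        return None
    depth = 0
    for index, char in enumerate(value):
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                definition = value[: index + 1]
                trailing = value[index + 1 :].strip(" ;\t")
                return definition, trailing
    return None  # unbalanced braces: not a complete definition


def flawed_import(value):
    """Mimic the bug: everything in the value is handed on for evaluation."""
    parts = split_function_value(value)
    if parts is None:
        return []
    definition, trailing = parts
    evaluated = [definition]
    if trailing:
        evaluated.append(trailing)  # the bug: trailing text treated as a command
    return evaluated


def strict_import(value):
    """One illustrative repair: refuse any value with text after the function."""
    parts = split_function_value(value)
    if parts is None or parts[1]:
        return []
    return [parts[0]]


payload = "() { :; }; echo VULNERABLE"
print(flawed_import(payload))
print(strict_import(payload))
```

With the payload above, the flawed importer hands both the definition and echo VULNERABLE onward, while the strict one hands on nothing at all.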

The pattern to remember

The problem is not the function feature by itself. It is that data crossing a trust boundary, the environment variable, was fed straight into a powerful interpreter, Bash, which treated part of that data as commands. Any time outside data reaches something that can execute commands, you have to draw a very firm line between what is data and what is instruction.
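As a general illustration of keeping data and instructions apart, the sketch below passes an untrusted string to a child process as a single argument rather than splicing it into a shell command line. The filename is made up, and the child is a Python one-liner standing in for any helper program. This shows the principle only. It would not have stopped Shellshock by itself, because there the hostile text travelled in the environment rather than on the command line.

```python
import shlex
import subprocess
import sys

# Hypothetical untrusted input containing a shell metacharacter.
untrusted = "report.txt; echo INJECTED"

# Argument list: the string arrives in the child as one argv entry.
# No shell is involved, so the semicolon is just a character.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
    capture_output=True,
    text=True,
    check=True,
)
echoed = result.stdout.strip()
print(echoed)

# If a shell truly cannot be avoided, quote the value so the shell
# reads it as a single word instead of as syntax.
quoted = shlex.quote(untrusted)
print(quoted)
```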

Why it was everywhere

Plenty of systems copy untrusted input into environment variables and then invoke Bash. The classic path was older-style web servers running CGI scripts. CGI, the Common Gateway Interface, is a simple and once-popular way for a web server to run a program to build a page. It works by placing details of the incoming request into environment variables and then launching a script, which was very often written in Bash or called out to it.

1. A web request arrives with a crafted header, e.g. User-Agent.
2. The web server puts that header into an environment variable.
3. It launches a CGI script, which uses Bash.
4. Bash imports the variable and runs the attacker's trailing command.
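Step 2 of that sequence can be sketched in a few lines. CGI names request headers by upper-casing them, replacing hyphens with underscores, and adding an HTTP_ prefix, so User-Agent becomes HTTP_USER_AGENT. The function name and the example host below are hypothetical, and the mapping is simplified (a real server treats a few headers, such as Content-Type, differently).

```python
def cgi_environment(headers):
    """Map request headers to CGI-style environment variable names.

    Simplified: every header gets the HTTP_ prefix, which is not true
    of a real server for headers such as Content-Type.
    """
    env = {}
    for name, value in headers.items():
        key = "HTTP_" + name.upper().replace("-", "_")
        env[key] = value  # the value is copied verbatim, attacker text included
    return env


request_headers = {
    "Host": "example.test",
    "User-Agent": "() { :; }; echo VULNERABLE",
}

env = cgi_environment(request_headers)
print(env["HTTP_USER_AGENT"])
```

Nothing in this step is a bug in its own right. The server is doing what CGI asks of it, which is exactly why the poisoned value reaches Bash untouched.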

So an attacker could get code execution on a server just by sending an ordinary-looking HTTP request with a poisoned header. No login, no clever memory tricks, just a string in the right field. The User-Agent header, which your browser sends to identify itself, was a favourite carrier because servers log and process it as a matter of course.

The same pattern showed up well beyond web servers. Some DHCP clients, which are the programs that request an address when your machine joins a network, would place server-supplied values into environment variables. Certain SSH configurations and some mail-processing systems did similar things. Because Bash was on almost every Linux and Unix system, and on OS X (as macOS was then called), the exposed surface was enormous and the exploitation was trivial.

Trivial to exploit, huge to reach

Unlike bugs that need careful memory manipulation, Shellshock was as simple as putting a string in the right place. Combined with how many internet-facing systems funnelled input into Bash, it was scanned and exploited en masse within days of disclosure.
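Because the payload has such a recognisable shape, defenders can look for it in request headers or logs. The sketch below is a rough heuristic, assuming headers are available as a dictionary. The function name is hypothetical, and a pattern match like this can miss variations or flag harmless text, so it complements patching rather than replacing it.

```python
import re

# The tell-tale opening of a function-style value: "()" followed by "{".
SIGNATURE = re.compile(r"\(\)\s*\{")


def suspicious_headers(headers):
    """Return the names of headers whose value contains the signature."""
    return [name for name, value in headers.items() if SIGNATURE.search(value)]


sample = {
    "User-Agent": "() { :; }; echo VULNERABLE",
    "Accept": "*/*",
}
print(suspicious_headers(sample))
```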

What an attacker could do next

Getting a single command to run is only the opening move. Once an attacker can run one command through Shellshock, they can usually run more. A common next step was to fetch and launch a small program that gave the attacker an interactive foothold on the machine, often called a reverse shell because the victim server calls back out to the attacker. From there they could read files, harvest credentials, plant malware, or use the box as a stepping stone deeper into the network. This is why a bug that "just runs echo" in a demonstration was treated as a five-alarm fire in reality.

Fixing it

  • Patch Bash. The fix corrects the parser so it no longer runs anything after a function definition. Several follow-up CVEs, starting with CVE-2014-7169, tightened the fix as researchers found edge cases in the first attempts, so a fully updated Bash is the answer, not the very first patch alone.
  • Retire CGI-style designs. Passing untrusted input into environment variables and then shelling out is a risky pattern in general. Modern application architectures avoid handing raw request data to a shell in the first place.
  • Defence in depth. Running web server processes with least privilege, meaning only the access they truly need, and using egress filtering to control what a server can connect out to, both limit what a successful exploit can achieve next.
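For the first item in the list above, a quick local check is possible: start Bash with the demonstration value in its environment and see whether the trailing command runs. The sketch below assumes Bash is on the PATH, and the function and variable names are hypothetical. It probes only the original CVE-2014-6271 behaviour, so a clean result here says nothing about the follow-up CVEs. Checking the installed package version against your vendor's advisory remains the reliable route.

```python
import os
import shutil
import subprocess


def bash_runs_trailing_code(bash_path=None):
    """Return True if the local Bash runs code placed after a function value.

    Returns None when no Bash binary can be found.
    """
    bash_path = bash_path or shutil.which("bash")
    if bash_path is None:
        return None
    probe_env = {
        "PATH": os.environ.get("PATH", ""),
        # The classic harmless demonstration value.
        "probe": "() { :; }; echo VULNERABLE",
    }
    result = subprocess.run(
        [bash_path, "-c", "echo test"],
        env=probe_env,
        capture_output=True,
        text=True,
        timeout=10,
    )
    # A patched Bash prints only "test"; an affected one prints the marker too.
    return "VULNERABLE" in result.stdout


status = bash_runs_trailing_code()
print(status)
```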

The first fix is rarely the last

Shellshock's initial patch did not fully close the hole, and additional CVEs followed as the parser was hardened further. This is normal for a bug in old and subtle code. When a serious flaw is disclosed, watch for follow-up fixes and make sure you are on a genuinely current version rather than the first emergency release.

If you like seeing how one small oversight becomes an internet-wide event, Shellshock has good company. Heartbleed leaked server memory because a length was trusted without checking, and Log4Shell turned a logging feature into remote code execution because a log message was treated as an instruction. The common thread across all three is a trust boundary that was allowed to slip.

The takeaway

Shellshock came from Bash trusting the content of an environment variable a little too far, running code that had been smuggled in after a function definition. It became a global incident because so many systems feed untrusted input into environment variables and then call Bash. The immediate fix was to patch Bash, but the wider lesson echoes the others. Data from outside should never be handed to a powerful interpreter without a very clear boundary, because the moment that boundary slips, data quietly becomes commands.