Directory Traversal: Escaping the Folder You Were Meant to Stay In

By adding ../ to a file path, an attacker can climb out of the intended directory and read files the application never meant to expose. Here is how path traversal works and how to block it.

HolyGhost · 9 min read

Imagine a hotel where the front desk fetches things from a store room for guests. You ask for "towel", and a porter walks into the store room, picks up a towel, and brings it to you. Simple and safe, as long as the porter only ever looks inside that one room. Now imagine you ask for "the box two rooms down the corridor, then up the stairs, in the manager's office, the one marked payroll". If the porter follows those directions literally, without ever stopping to ask whether they should be leaving the store room at all, you have just walked them out of the space they were meant to stay in and into somewhere they should never go. That, in a nutshell, is directory traversal.

Directory traversal, also called path traversal, is a wonderfully simple flaw. If an application uses your input to decide which file to open, and does not check it carefully, you can walk right out of the folder it intended and into the rest of the file system. A file system is just the way a computer organises files into folders, also called directories, arranged in a tree. All it takes is a few ../ sequences. Here is how it works.

Scope

Path traversal is a long-documented vulnerability class. This is an explainer for defenders and learners. Try the techniques described here only on systems you own or are authorised to test.

The core idea

Many applications serve files based on a name in the request, for example a document viewer that shows a PDF, or an image loader that displays a picture. The application keeps those files in one folder and expects you to name a file inside it:

The app opens:   /var/www/files/<user input>
You request:     report.pdf   ->  /var/www/files/report.pdf

To understand the attack you need one small piece of file system knowledge. In a path, a single dot . means "this folder", and two dots .. mean "the folder one level up", the parent directory. This is genuinely useful. It is how you move around a folder tree without typing out the full address every time. But it is also a loaded weapon if untrusted input can use it. If the application drops your input straight into the path without checking, you can use ../ over and over to climb above the intended folder, one level at a time, until you reach the top of the file system and can then descend into anywhere you like:

You request:  ../../../../etc/passwd
The app opens: /var/www/files/../../../../etc/passwd
Which resolves to:  /etc/passwd

Each ../ cancels out one folder in the path. Stack enough of them and you climb all the way up to the root of the file system, the very top of the tree, from which every other folder branches. Here the base folder /var/www/files sits three levels below the root, so three ../ are enough and the fourth is spare. From there /etc/passwd, a Linux file listing user accounts, is easy to reach. You have escaped the files directory entirely and read a system file the application never meant to hand out. It is worth throwing in a few extra ../ for good measure, because going up from the root simply stays at the root, so overshooting costs nothing.
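As a rough illustration, the resolution above can be reproduced with Python's standard posixpath module, which normalises Unix-style paths as plain strings without touching the disk. The build_path helper is hypothetical and stands in for an application that glues input onto its base folder. It is a sketch of the flaw, not a pattern to copy.

```python
import posixpath

BASE = "/var/www/files/"  # the base folder from the example above


def build_path(user_input: str) -> str:
    """Hypothetical vulnerable helper: append the input to the base, unchecked."""
    return BASE + user_input


def where_it_lands(user_input: str) -> str:
    """Collapse the . and .. segments lexically to show the final destination."""
    return posixpath.normpath(build_path(user_input))


print(where_it_lands("report.pdf"))               # /var/www/files/report.pdf
print(where_it_lands("../../../../etc/passwd"))   # /etc/passwd
print(where_it_lands("../" * 10 + "etc/passwd"))  # /etc/passwd, overshooting is harmless
```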

Windows behaves the same way, with the backslash as its native separator (it generally accepts forward slashes as well), so an attacker will try both styles depending on the target:

Linux style:    ../../../../etc/passwd
Windows style:  ..\..\..\..\windows\win.ini
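The same effect can be sketched for Windows-style paths with Python's ntpath module, which applies Windows path rules as pure string handling on any platform. The base folder C:\inetpub\files is an assumption chosen for illustration and does not come from the article.

```python
import ntpath

WINDOWS_BASE = "C:\\inetpub\\files\\"  # hypothetical base folder, for illustration only


def where_it_lands_windows(user_input: str) -> str:
    """Append the input to the base unchecked, then normalise with Windows rules."""
    return ntpath.normpath(WINDOWS_BASE + user_input)


# Both separator styles end up at the same file under Windows path rules.
print(where_it_lands_windows(r"..\..\..\..\windows\win.ini"))  # C:\windows\win.ini
print(where_it_lands_windows("../../../../windows/win.ini"))   # C:\windows\win.ini
```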

What it exposes

Traversal is primarily a read vulnerability, meaning its usual prize is the ability to read files that were never meant to leave the server. That may sound modest next to something like remote code execution, but what it reads can be devastating:

  • Configuration files containing database passwords, application programming interface keys, and other secrets, the kind of file that quietly holds the keys to everything else.
  • Source code, revealing exactly how the application works and, helpfully for an attacker, where else it is weak.
  • System files like credential stores and user account listings.
  • Application logs and other users' private data.

The reason this is so dangerous is that secrets beget secrets. Read one configuration file with a database password in it, and now you can reach the database. Read the source code, and you learn where the other doors are and how their locks work. A pure read flaw can unravel a whole system one file at a time.

In some situations it goes further. If an attacker can traverse to a location they can also write to, or influence a file that later gets executed by the server, path traversal can escalate towards code execution, where the attacker gets the machine to run their own instructions. There is a related twist during file uploads, sometimes called path traversal on write, where a crafted file name containing ../ sequences places the uploaded file somewhere it should never land, perhaps overwriting a script the server will later run. But even as a pure read, leaking secrets and keys is often enough to compromise everything else.
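The write variant can be sketched the same way. The upload folder and file names below are hypothetical, and naive_upload_target stands in for code that trusts the file name supplied with an upload. Nothing is written to disk here. The sketch only shows where the file would end up.

```python
import posixpath

UPLOAD_DIR = "/var/www/uploads"  # hypothetical upload folder


def naive_upload_target(filename: str) -> str:
    """Hypothetical vulnerable helper: trust the client-supplied file name."""
    # normpath shows the location the operating system would actually write to.
    return posixpath.normpath(posixpath.join(UPLOAD_DIR, filename))


print(naive_upload_target("photo.jpg"))      # /var/www/uploads/photo.jpg
print(naive_upload_target("../app/run.py"))  # /var/www/app/run.py, outside the upload folder
```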

Encoding tricks defeat naive filters

Blocking the literal string "../" is not enough. Attackers use URL encoded forms like %2e%2e%2f, double encoding, and other representations to slip past simple filters. This is why the fix is not to search for bad strings, but to resolve the path and verify where it actually lands.
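A minimal sketch of that failure, assuming a hypothetical filter called naive_filter_allows that only looks for the literal string and a server that URL-decodes the input afterwards. Decoding is done here with urllib.parse.unquote from the Python standard library.

```python
from urllib.parse import unquote


def naive_filter_allows(raw: str) -> bool:
    """Hypothetical blocklist: accept anything without a literal "../"."""
    return "../" not in raw


single = "%2e%2e%2f%2e%2e%2fetc/passwd"
double = "%252e%252e%252fetc/passwd"

print(naive_filter_allows(single))  # True, the filter sees no "../"
print(unquote(single))              # ../../etc/passwd
print(unquote(double))              # %2e%2e%2fetc/passwd (one decode is not enough)
print(unquote(unquote(double)))     # ../etc/passwd (a second decode reveals it)
```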

Why blocklisting fails, in detail

It is tempting to reach for a quick fix: just find any ../ in the input and strip it out or reject it. This feels reasonable and it fails in practice, for reasons worth understanding because they recur across the whole family of injection flaws.

The problem is that the same path can be written in a dizzying number of ways. By the time the file is finally opened, after the web server or framework has decoded the input and the operating system has resolved the dots, they all point to the same place. A filter that only recognises one spelling misses the others. Attackers exploit this with encoding, which means representing characters in an alternative form that the web server decodes back to the original before use:

Plain:              ../
URL encoded:        %2e%2e%2f          (%2e is a dot, %2f is a slash)
Double encoded:     %252e%252e%252f    (encode the % as well, decoded twice)
Backslash variant:  ..\  or  ..%5c
Mixed and nested:   ....//             (a naive strip of "../" leaves "../" behind)

That last one is sneaky. If a filter removes the literal ../ once and does not repeat the pass, the input ....// has its inner ../ removed and collapses down to a fresh ../, right past the guard. Chasing every encoding and every trick is exactly the same losing treadmill as blocklisting metacharacters in command injection. You will always miss one.
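The nesting trick is easy to reproduce. The strip_once function below is a hypothetical filter that removes every literal ../ in a single pass and never looks at its own output again.

```python
def strip_once(raw: str) -> str:
    """Hypothetical naive filter: delete each literal "../" in one pass."""
    return raw.replace("../", "")


print(strip_once("../../etc/passwd"))        # etc/passwd, the plain form is neutralised
print(strip_once("....//....//etc/passwd"))  # ../../etc/passwd, the nested form survives
```

Repeating the pass until nothing changes would close this particular hole, but it does nothing about the encoded and backslash variants, which is the treadmill described above.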

Blocking it

The reliable defences work on the resolved path, not on spotting bad input. The key word is resolve. To resolve, or canonicalise, a path means to work out the single true location it actually points to once any encoded characters have been decoded and all the dots and slashes have been applied, which is the location the operating system would actually open. Once you have that final, honest answer, checking it is easy:

  1. Resolve, then check the boundary. Turn the requested path into its full canonical form, the real destination, then confirm it still sits inside the intended base directory. If it does not, reject the request outright. This single check defeats ../, every encoding trick, mixed styles, and everything in between, because it does not care how the path was written. It only cares where it ends up.

  2. Avoid using user input as a path at all. Where you can, do not let the user name a file directly. Instead, map their request to a file through an allowlist of permitted names, or through an internal identifier like a number that you look up in a table. The user says "give me document 42", and your code decides which file that means. The user never supplies a raw path, so there is no path to poison.

  3. Sandbox and least privilege. Run the application with access only to the directory it actually needs. Least privilege means granting each component just the access required to do its job and nothing more. If the process simply cannot see the rest of the file system, then even a successful traversal has nowhere to go. This is your safety net for when the checks above are somehow bypassed.
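The second defence in the list above might look something like this. The DOCUMENTS table, its entries, and the path_for_document function are all hypothetical. In a real application the table would more likely live in a database.

```python
from pathlib import Path

BASE_DIR = Path("/var/www/files")  # the base folder from the earlier example

# Hypothetical lookup table: internal identifier -> file name chosen by the server.
DOCUMENTS = {
    42: "report.pdf",
    43: "invoice.pdf",
}


def path_for_document(doc_id: int) -> Path:
    """Map an identifier to a file. The caller never supplies any part of the path."""
    try:
        name = DOCUMENTS[doc_id]
    except KeyError:
        raise LookupError(f"unknown document id: {doc_id!r}") from None
    return BASE_DIR / name


print(path_for_document(42))  # the report.pdf entry under the base folder
```

Anything that is not a known identifier, including a traversal string, simply fails the lookup and never reaches the file system.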

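Safe pattern:
  base = "/var/www/files/"               # keep the trailing slash so /var/www/files-old cannot match
  full = canonicalise(base + input)      # resolve to the real, final location
  if not full.startsWith(base): reject   # is it still inside the allowed folder?
  open(full)                             # open the resolved path, never the raw input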
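One way the safe pattern could be written in Python is sketched below, using pathlib from the standard library (Path.is_relative_to needs Python 3.9 or later). The function name resolve_inside_base and the choice of PermissionError are assumptions made for this example, and the sketch expects the input to have been URL-decoded already.

```python
from pathlib import Path

BASE_DIR = Path("/var/www/files")  # the base folder from the pattern above


def resolve_inside_base(user_input: str, base: Path = BASE_DIR) -> Path:
    """Resolve the requested path, then confirm it is still inside the base folder."""
    base_real = base.resolve()
    # Joining with an absolute input such as "/etc/passwd" discards the base
    # entirely, which is one more reason the boundary check below is needed.
    candidate = (base_real / user_input).resolve()
    if not candidate.is_relative_to(base_real):
        raise PermissionError(f"path escapes the base folder: {user_input!r}")
    return candidate


for requested in ("report.pdf", "../../../../etc/passwd", "/etc/passwd"):
    try:
        print("allowed:", resolve_inside_base(requested))
    except PermissionError as err:
        print("rejected:", err)
```

Comparing whole path components with is_relative_to avoids the sibling-folder mistake that a plain string prefix check can make when the base has no trailing separator.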

Canonicalise first, always check second

The order matters. You must resolve the path to its final form before you compare it against the allowed folder. Checking the raw input first and resolving later leaves a gap where an encoded or dotted path passes the check but resolves somewhere else. Resolve, then check, then open. Never open a path you have not confirmed lands inside the boundary.
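The difference the order makes can be shown with two hypothetical functions that run the same two steps in opposite orders. posixpath.normpath is used here as a simple stand-in for canonicalisation. It is purely lexical and does not follow symbolic links, so real code would resolve against the file system instead.

```python
import posixpath

BASE = "/var/www/files/"


def check_then_resolve(user_input: str) -> str:
    """Wrong order: the check runs on raw text, which always begins with BASE."""
    raw = BASE + user_input
    if not raw.startswith(BASE):
        raise PermissionError("outside base")
    return posixpath.normpath(raw)


def resolve_then_check(user_input: str) -> str:
    """Right order: resolve first, then test where the path actually lands."""
    full = posixpath.normpath(BASE + user_input)
    if not full.startswith(BASE):
        raise PermissionError("outside base")
    return full


print(check_then_resolve("../../../etc/passwd"))  # /etc/passwd, the check was useless
print(resolve_then_check("report.pdf"))           # /var/www/files/report.pdf
```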

Watch for symbolic links

Even a path whose text sits neatly inside your base folder can betray you if that folder contains a symbolic link, a special file that acts as a signpost pointing to another location elsewhere on the disk. A canonical path check that follows links will catch this, which is another reason to resolve to the true final location rather than trusting the text of the path.
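A small experiment shows the gap between the text of a path and its true location. It builds a throwaway folder tree containing a symbolic link, then compares a purely textual check with one based on os.path.realpath, which follows links. All names are made up for the demonstration, and creating symbolic links may need extra privileges on Windows.

```python
import os
import tempfile


def lexically_inside(base: str, candidate: str) -> bool:
    """Textual check only: tidy the dots, never consult the file system."""
    base = os.path.normpath(base)
    return os.path.normpath(candidate).startswith(base + os.sep)


def really_inside(base: str, candidate: str) -> bool:
    """Canonical check: follow symbolic links before comparing locations."""
    base_real = os.path.realpath(base)
    candidate_real = os.path.realpath(candidate)
    return os.path.commonpath([base_real, candidate_real]) == base_real


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as root:
        base = os.path.join(root, "files")
        outside = os.path.join(root, "secrets")
        os.mkdir(base)
        os.mkdir(outside)
        # A link inside the base folder that points outside it.
        os.symlink(outside, os.path.join(base, "shortcut"))
        candidate = os.path.join(base, "shortcut", "key.txt")
        return lexically_inside(base, candidate), really_inside(base, candidate)


print(demo())  # (True, False): the text looks safe, the real location is not
```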

The takeaway

Directory traversal lets an attacker use ../ to climb out of the folder an application meant to confine them to, reading configuration, secrets, and source code, and sometimes writing files or reaching code execution. Filtering for bad strings does not work, because the same path can be written countless ways through encoding and nesting, and you will always miss one. The dependable fix is to resolve the path to its true final location and confirm it still lives inside the intended directory, ideally avoiding raw user supplied paths altogether by mapping requests through allowlists or internal identifiers. Back it up with least privilege so a slip cannot reach the wider system. It is the file system member of the same family as command injection: untrusted input steering an operation it should never control.