By Dmitry Petrakov · June 29, 2026 · ~13 min read
"Build a PDF upload form" sounds like a 30-minute task. An LLM will write the handler, you add accept=".pdf" on the frontend and a multipart parser on the backend, and you have a working upload. Ship it.
The problem is that a working upload and a secure upload are two very different things. The gap between them is a handful of vulnerabilities, each of which can turn your server into an entry point for an attacker.
With the spread of LLM tools, the barrier to entry in software development has dropped dramatically. That is great: more people can build products. But along with the barrier to entry for development, the barrier to entry for vulnerabilities has dropped too. When an LLM generates file-upload code, it solves the functional problem: accept a file, save it, process it. Security? "I'll add that later." And "later" usually comes after an incident.
I decided not to wait for an incident and worked it out upfront: what attacks exist around file uploads, what happens when you forget about them, and how we defended against them in a real project.
Disclaimer. This is not a universal security guide, nor a claim that "I know the right way." It is an engineering case study: how I designed upload protection for a specific service and what trade-offs I made. The example is a browser extension for converting PDF to Markdown. A PDF converter by itself might not warrant this level of protection, but the approaches are universal and apply to systems where the stakes are higher: medical documents, financial reports, legal scans, UGC platforms. I'm demonstrating the principles with real code, and you decide which ones apply to your case.
TL;DR – this article breaks down 7 attacks on file upload and how I mitigated them in a Go backend:
- File type spoofing → magic bytes %PDF, never trust the extension
- Disk exhaustion → MaxBytesReader + per-device slot limits
- Path traversal → fixed filename {UUID}/input.pdf, no user input in paths
- SSRF → DNS resolution before the request, redirect blocking, private-IP denylist
- Replay attack → nonce + timestamp + ECDSA signature on every request
- Device spoofing → cryptographic identity via WebCrypto (ECDSA P-256)
- Application-level abuse → rate limit + signature + slots
A summary table with statuses is at the end of the article.
Upload architecture: what happens when you send a file
Before talking about attacks, here is the file's journey through the system:
Two upload channels:
- Direct upload – the user selects a file or drags and drops it
- URL upload – the user provides a link to a PDF
Each channel carries its own set of threats, and each requires its own defenses. But before the specific attacks, we need the foundation the whole security model rests on.
All code snippets below are simplified excerpts from the real project, highlighting the key checks. The actual code differs in error handling and additional edge cases.
Foundation: anonymous device identity
Why so complex? If your backend uses conventional authentication (JWT, sessions, OAuth), you can skip this section; your auth layer already solves the identity problem. We describe this approach for a specific case: a product with no logins or accounts, where you still need to control load per device. If your project has authentication, use it: it's simpler and more reliable.
In most applications, file upload is protected by login: you have an account, the server knows who you are. Our browser extension has no registration and no accounts. So how do we distinguish a legitimate user from an attacker, and how do we rate-limit when there is no login?
We solved this through cryptographic device identity – in effect, each browser profile becomes an anonymous yet verifiable "account."
Important: this is not user tracking or covert fingerprinting. The device_id is tied to a key pair in the IndexedDB of a specific browser profile. We don't know who the person is, only that this same browser profile has made requests before. Incognito window = new device. Extension uninstalled = keys lost forever. At the auth-model level, device_id provides no cryptographic linkage between devices. The operator still has indirect signals (IP, ASN) that could theoretically be used to hypothesize device correlation, but that is not part of the auth protocol and is not used for identification.
How it works
On first launch, the extension generates an ECDSA P-256 key pair via the WebCrypto API:
const keyPair = await crypto.subtle.generateKey(
{ name: 'ECDSA', namedCurve: 'P-256' },
false, // extractable = false – the key cannot be exported!
['sign', 'verify']
);
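The critical detail is extractable: false. The private key is stored in the browser's IndexedDB, but its raw material cannot be exported through the WebCrypto API: exportKey and wrapKey fail for it, whether they are called from page JavaScript, the DevTools console or extension code. You can only ask the browser to sign data with this key. This is an API-level guarantee; it says nothing about malware that can read the browser profile on disk.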
The extension then sends the public key to the server (POST /register) and receives a device_id and device_token. From that point on, every API request is signed with the private key:
| Header | Purpose |
|---|---|
| Authorization: Bearer <token> | Device identification |
| X-Timestamp | Request time (±5 min window) |
| X-Nonce | Unique request ID (anti-replay) |
| X-Body-SHA256 | Body hash – data integrity |
| X-Signature | ECDSA signature of all the above |
The signature covers METHOD + PATH + TIMESTAMP + NONCE + BODY_HASH. Forging it without the private key is computationally infeasible.
Signature ≠ encryption. An ECDSA signature ensures the integrity and authenticity of a request, but not confidentiality. Without TLS, the file contents, token and metadata travel in plaintext. Request signing complements HTTPS; it does not replace it.
The server verifies the signature using the public key bound to the device_id. If it doesn't match, the request is rejected, regardless of whether the token is valid.
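A minimal sketch of what server-side verification could look like in Go. It is not the project's code: the newline-joined canonical string, the SHA-256 digest, the /jobs path used in the example and the helper names are all assumptions. One detail worth knowing is that WebCrypto emits ECDSA signatures as the raw r||s concatenation (64 bytes for P-256), not ASN.1 DER, so the server has to split the bytes itself.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"math/big"
	"strings"
)

// canonicalString joins the signed fields. The field order follows the
// article; the newline separator is an assumption made for this sketch.
func canonicalString(method, path, timestamp, nonce, bodyHash string) string {
	return strings.Join([]string{method, path, timestamp, nonce, bodyHash}, "\n")
}

// verifyP1363 checks a raw r||s signature (the format WebCrypto produces)
// against the SHA-256 digest of msg.
func verifyP1363(pub *ecdsa.PublicKey, msg string, sig []byte) bool {
	if len(sig) != 64 {
		return false
	}
	r := new(big.Int).SetBytes(sig[:32])
	s := new(big.Int).SetBytes(sig[32:])
	digest := sha256.Sum256([]byte(msg))
	return ecdsa.Verify(pub, digest[:], r, s)
}

// newDemoKey and signP1363 stand in for the browser side in this sketch.
func newDemoKey() (*ecdsa.PrivateKey, error) {
	return ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
}

func signP1363(priv *ecdsa.PrivateKey, msg string) ([]byte, error) {
	digest := sha256.Sum256([]byte(msg))
	r, s, err := ecdsa.Sign(rand.Reader, priv, digest[:])
	if err != nil {
		return nil, err
	}
	sig := make([]byte, 64)
	r.FillBytes(sig[:32]) // left-pad each half to 32 bytes
	s.FillBytes(sig[32:])
	return sig, nil
}
```

Because the nonce and body hash are part of the signed string, changing either one invalidates the signature even when the token itself is valid.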
Why this matters for file upload
This model gives us something you normally don't get without accounts:
- Per-device control – we can limit concurrent tasks, files per day and requests per minute for each device (browser profile)
- Cost of a "new account" – an attacker can't just swap cookies or tokens; they must generate a new key pair and register, which is limited to 5 attempts per IP per hour
- Protection against token theft – even if the device_token leaks, it's useless without the private key (which can't be exported)
- Request integrity – the body (including the file) is covered by the signature via a SHA-256 hash; tampering in transit is impossible
Essentially, each browser profile gets its own unforgeable "passport." And every limit we discuss next – slots, rate limits, size restrictions – works precisely because we can reliably identify the device.
Is this protection absolute? No. A motivated attacker could write an emulator that reproduces the whole chain: key generation, registration, request signing, with no browser at all. But that is the point: we significantly raise the barrier to entry. Without cryptographic identity, attacking the API is one curl command. With our model, the attacker must implement ECDSA P-256, form the canonical string correctly, sign every request, and manage nonces and timestamps – all for a limit of 3 active slots and 5 registrations per hour per IP. The cost of the attack grows by orders of magnitude while the payoff stays the same.
Attack 1: file type spoofing
The attack. An attacker renames a malicious file (script, executable, HTML with XSS) to report.pdf and uploads it. If the server trusts the extension, it saves the file and, under certain conditions, may execute it or serve it to other users.
How we defended – three layers of file-type validation.
Layer 1, frontend (extension):
const ALLOWED_TYPES = ['application/pdf'];
const ALLOWED_EXTENSIONS = ['.pdf'];
function validateFile(file) {
const isPdf = ALLOWED_TYPES.includes(file.type) ||
ALLOWED_EXTENSIONS.some(ext =>
file.name.toLowerCase().endsWith(ext));
if (!isPdf) return { valid: false, error: 'PDF only' };
// ...
}
We check both the MIME type and the extension. But frontend validation is a UX filter, not a security measure – anyone can send a request directly, bypassing the extension.
Layer 2, backend request Content-Type: the server routes by Content-Type (multipart/form-data for files, application/json for URLs). Not content validation, but the first server-side barrier.
Layer 3, backend magic bytes (file signature):
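const pdfMagicBytes = "%PDF"

header4 := make([]byte, 4)
// io.ReadFull returns an error for files shorter than 4 bytes,
// where a plain Read could silently return a partial header
if _, err := io.ReadFull(file, header4); err != nil || string(header4) != pdfMagicBytes {
	respondError(w, http.StatusBadRequest,
		models.ErrCodeValidationError,
		"File is not a valid PDF")
	return // stop here, otherwise the handler keeps processing the rejected file
}
file.Seek(0, io.SeekStart) // Reset position for further processing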
This is the key check. Every file format starts with a specific byte sequence, its "magic bytes." PDF always starts with %PDF. Even if an attacker renames an .exe to .pdf, the first bytes give it away.
Important: we validate the actual file contents, not what the browser wrote in the header. Headers are easy to forge; bytes are not (unless the attacker crafts a file that is simultaneously a valid PDF and something malicious – such polyglots exist, more below).
Attack 2: disk exhaustion (file-size DoS)
The attack. An attacker uploads a gigabyte-sized file (or thousands of files in a row) to exhaust the server's disk or memory.
How we defended – three layers of size limits:
The key element is http.MaxBytesReader at the middleware level:
r.Body = http.MaxBytesReader(w, r.Body, h.cfg.Storage.MaxFileSize + 1024*1024)
This wraps r.Body and stops reading as soon as the limit is exceeded. The server won't read 10 GB just to say "file too large"; with a 10 MB file limit, it stops at the 11th megabyte.
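To make the behavior concrete, here is a hedged sketch of a handler that turns an exceeded limit into a 413 response. The handler name, the /upload path and the probe helper are hypothetical, and http.MaxBytesError requires Go 1.19 or newer.

```go
package main

import (
	"bytes"
	"errors"
	"io"
	"net/http"
	"net/http/httptest"
)

// limitedUpload caps the request body at limit bytes and answers 413 once
// the cap is exceeded. io.Copy to io.Discard stands in for real processing.
func limitedUpload(limit int64) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		_, err := io.Copy(io.Discard, r.Body)
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "file too large", http.StatusRequestEntityTooLarge)
			return
		}
		if err != nil {
			http.Error(w, "read error", http.StatusBadRequest)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}
}

// probe runs the handler in-process and returns the status code (demo helper).
func probe(limit int64, bodySize int) int {
	body := bytes.NewReader(make([]byte, bodySize))
	req := httptest.NewRequest(http.MethodPost, "/upload", body)
	rec := httptest.NewRecorder()
	limitedUpload(limit)(rec, req)
	return rec.Code
}
```

A body of exactly limit bytes passes; one extra byte triggers the error, so the handler never buffers more than the cap.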
And here the device-identity model kicks in at full strength. Since each browser profile is cryptographically bound to a device_id, we can limit load per device:
const maxJobsPerDevice = 3
Three active slots per device (a slot is occupied by tasks in queued, processing, ready and error states, until auto-cleanup or manual deletion). Bypassing this by swapping cookies or tokens is impossible. To get new slots, the attacker needs a new browser profile, new keys and registration (limited to 5 attempts per IP per hour).
In the current service this fixed constant has become tier-driven: the free and anonymous tier is still 3 active slots, while paid tiers raise it through the entitlements layer. The principle is unchanged – the cap is per verified device, not per cookie.
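The slot idea can be sketched with an in-memory counter keyed by device ID. This is an illustration only: the type and method names are hypothetical, and a real service would more likely count active jobs in its database or Redis so that the limit survives restarts and works across instances.

```go
package main

import "sync"

// slotLimiter tracks active jobs per device (in-memory sketch).
type slotLimiter struct {
	mu     sync.Mutex
	max    int
	active map[string]int
}

func newSlotLimiter(max int) *slotLimiter {
	return &slotLimiter{max: max, active: make(map[string]int)}
}

// acquire reserves a slot and reports whether one was available.
func (l *slotLimiter) acquire(deviceID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active[deviceID] >= l.max {
		return false
	}
	l.active[deviceID]++
	return true
}

// release frees a slot when a job is cleaned up or deleted.
func (l *slotLimiter) release(deviceID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active[deviceID] > 0 {
		l.active[deviceID]--
	}
	if l.active[deviceID] == 0 {
		delete(l.active, deviceID) // keep the map from growing forever
	}
}
```

The check and the increment happen under one lock, so two concurrent uploads from the same device cannot both take the last slot.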
Attack 3: path traversal
The attack. An attacker sends a file named ../../../etc/passwd or ..\..\windows\system32\config.txt. If the server uses the filename for storage without sanitization, the file lands not in the uploads folder but at an arbitrary path.
Frontend – filename sanitization:
function sanitizeFileName(name) {
return name
.replace(/[/\\?%*:|"<>]/g, '-') // Remove dangerous characters
.replace(/^\.+/, '') // Remove leading dots
.replace(/\.+$/, '') // Remove trailing dots
.substring(0, 255) // Limit length
.trim();
}
But the real defense is on the backend, where the filename is completely ignored:
func (h *JobsHandler) saveFile(jobID uuid.UUID, file io.Reader) error {
jobDir := filepath.Join(h.cfg.Storage.SharedDataPath, jobID.String())
os.MkdirAll(jobDir, 0755)
filePath := filepath.Join(jobDir, "input.pdf") // Fixed filename!
// ...
}
The file is always saved as {UUID}/input.pdf. No user input in the path at all. The UUID is generated by the server; predicting it is infeasible. This robustly closes path traversal at the storage-path level.
But the filename doesn't end there. The original name lives on as metadata: in the database, the UI, and the Content-Disposition header on download. The storage path is fully closed (the user-provided name isn't involved), but at the output level the name still needs proper escaping to prevent XSS or header injection. Frontend sanitization is the first barrier; the main responsibility is correct escaping when serving the result.
Attack 4: SSRF (server-side request forgery)
The attack. When uploading via URL, the attacker submits not a link to a PDF but an internal address: http://169.254.169.254/latest/meta-data/ (the AWS instance metadata endpoint; GCP, Azure and other clouds serve metadata from the same link-local IP under different paths), http://localhost:6379/ (Redis), or http://192.168.1.1/admin. The server, trusting the URL, makes the request on its own behalf, and the attacker reaches internal infrastructure.
SSRF has its own category in the OWASP Top 10 (A10:2021) and is a recurring finding in applications that fetch user-supplied URLs.
Step 1 – IP address validation. Before making the request, resolve DNS and verify that every resulting IP is public:
func validatePublicURL(rawURL string) error {
	u, err := url.Parse(rawURL)
	if err != nil {
		return fmt.Errorf("invalid URL: %w", err)
	}
	// ... scheme and host validation
	ips, err := net.LookupIP(u.Hostname())
	if err != nil {
		return fmt.Errorf("DNS lookup failed: %w", err)
	}
	for _, ip := range ips {
		if isPrivateOrReservedIP(ip) {
			return fmt.Errorf("URL resolves to private/reserved IP")
		}
	}
	return nil
}
func isPrivateOrReservedIP(ip net.IP) bool {
	if ip.IsUnspecified() { return true }      // 0.0.0.0, :: (can reach local services)
	if ip.IsLoopback() { return true }         // 127.0.0.0/8, ::1
	if ip.IsLinkLocalUnicast() { return true } // 169.254.0.0/16 (cloud metadata), fe80::/10
	if ip.IsMulticast() { return true }
	if ip4 := ip.To4(); ip4 != nil {
		// 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
		if ip4[0] == 10 { return true }
		if ip4[0] == 172 && ip4[1] >= 16 && ip4[1] <= 31 { return true }
		if ip4[0] == 192 && ip4[1] == 168 { return true }
		return false // IPv4 address: the IPv6-only check below does not apply
	}
	// IPv6 ULA (fc00::/7); the length guard avoids a panic on an empty net.IP
	if len(ip) == net.IPv6len && (ip[0] == 0xfc || ip[0] == 0xfd) { return true }
	return false
}
Caveat: this is a denylist – we enumerate "what to block." OWASP recommends a positive allowlist for SSRF (only allow known-safe destinations). A denylist is simpler but easier to bypass: miss a range (e.g. 100.64.0.0/10 Carrier-Grade NAT, or 198.18.0.0/15 benchmark) and the request goes through. For our use case the denylist covers the main risks, but for critical infrastructure move to an allowlist or dial-time validation.
Step 2 – blocking redirects:
httpClient = &http.Client{
CheckRedirect: func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse // Do not follow redirects
},
}
A classic trick: http://safe-looking-url.com → 302 → http://169.254.169.254/. We don't follow redirects at all. If a URL returns 3xx, the request is rejected:
if resp.StatusCode >= 300 && resp.StatusCode < 400 {
respondError(w, http.StatusBadRequest, ...,
"URL redirects are not allowed for security reasons")
}
A note on DNS rebinding: a more sophisticated attack where a name first resolves to a public IP (passes validation) then to a private IP (when the HTTP client actually connects). Full protection requires pinning the resolved address or a specialized client. Here this vector is partially mitigated by redirect blocking and the single-shot nature of the request.
Attack 5: replay attack
The attack. An attacker intercepts a legitimate upload request and sends it again, repeatedly: dozens of identical tasks, quota exhaustion, server load.
This is where the signature system from the "Foundation" section pays off. Every request already carries a timestamp, nonce and ECDSA signature, and these are exactly what make a replay futile.
Server-side checks:
- Time window – requests older than 5 minutes are rejected
- Nonce uniqueness – every nonce is stored in Redis with a 5-minute TTL; a duplicate nonce → 409 Replay Detected
- Body integrity – the SHA-256 hash of the body is compared with the declared value; tampering while keeping the signature valid is impossible
- Cryptographic signature – covers method, path, timestamp, nonce and body hash; without the private key a valid signature can't be created
// Body integrity check
actualHash := sha256.Sum256(bodyBytes)
if !strings.EqualFold(hex.EncodeToString(actualHash[:]), bodySHA256Header) {
respondError(w, http.StatusUnauthorized, ..., "Body hash mismatch")
}
So even if an attacker intercepts a request, they can't replay it (nonce already used), modify it (signature won't match), or stretch it in time (timestamp expired).
Attack 6: device spoofing
The attack. An attacker tries to impersonate another device to use its quotas, access its results, or bypass rate limits.
As described in "Foundation", device identity is based on asymmetric cryptography, not simple tokens. That makes impersonating an existing device infeasible without its private key:
- The private key can't be stolen – extractable: false means even a malicious extension can't read the key from IndexedDB; it can only be used for signing
- A token without the key is useless – the device_token grants the right to send a request, but without an ECDSA signature the server rejects it
- Creating a "new device" is expensive – a new browser profile + registration, limited to 5 attempts per IP per hour
- The key is tied to the browser profile – switching tabs, restarting, updating the extension keeps the key; only deleting the profile or the extension loses it
Attack 7: mass abuse via upload
The attack. Mass-submitting upload requests to overload the server – saturating network, disk or CPU (PDF parsing is resource-intensive). This is not network-level DDoS (handled by CDN/WAF and infrastructure like Cloudflare), but application-level abuse of the upload business logic.
How we mitigate – defense in depth at the application layer:
| Layer | Mechanism | Limit |
|---|---|---|
| Registration | Rate limit on /register | 5 per hour per IP |
| Requests | Signature on every request | No key, no access |
| Concurrency | Per-device slots | 3 active slots |
| Size | MaxBytesReader | 11 MB per request |
| Request body | Multipart part size | 10 MB per file |
To mass-abuse uploads, an attacker would need to create many devices (rate-limited registration), generate a key pair per device and sign every request, and live with 3 active slots and 10 MB per file each. This raises the cost of mass abuse at the application level, but it is not a substitute for network-level DDoS mitigation.
On IP rate-limiting and CGNAT. Yes, thousands of users can sit behind one mobile carrier IP. That is why the IP limit is supplementary, not primary, and only on the registration endpoint. Working requests are rate-limited by device_id, not by IP. Legitimate traffic from a single mobile IP goes through; the limit only triggers on an anomalous burst of registrations, the signature of automated farming.
What else exists: attacks where protection is incomplete
Honesty is the best policy. Here are the vectors we're aware of where our defenses are still imperfect.
PDF bomb (decompression bomb)
A PDF can contain compressed streams that expand to gigabytes. A 5 MB file can become 10 GB during parsing.
What already protects us: the conversion engine (MinerU by default, Docling as an opt-in) runs in a separate Docker container with a memory limit and a timeout.
The separate conversion service, the 15-minute timeout and the memory limit reduce the blast radius: even if a bomb starts decompressing, the consequences stay inside one container and don't touch the API, the database or other workers. Exact OOM behavior depends on the runtime (Docker Desktop, Swarm and Kubernetes apply deploy.resources differently), but the isolation principle holds everywhere. The timeout is a second line of defense: if memory doesn't run out but decompression drags on, the worker terminates processing and marks the task failed.
What this doesn't cover: there is no preventive check – we don't analyze the compression ratio before processing. A bomb still starts decompressing, just in isolation. For full protection, add an input heuristic: an anomalously high compression ratio (file size vs. number of streams/pages) is a reason to reject before processing.
Malicious PDF content
PDF is a full-fledged container that can hold embedded JavaScript, forms with auto-submit, links to external resources, and embedded files of other formats.
We don't execute or render PDFs – we only extract text and structure, which substantially reduces the risk. And the engine's container isolation means that even if the parser is vulnerable to a crafted PDF, the consequences are confined to the container.
What this doesn't cover: no antivirus scanning (ClamAV) and no CDR (Content Disarm & Reconstruct – rebuilding the PDF while stripping JavaScript, forms and embedded files). Where output is served to third parties, both the incoming PDF and the output Markdown should be checked for malicious links/scripts. A separate concern is timely updates to the parsers themselves (MinerU, Docling and their dependencies): a PDF parser is its own attack surface, and CVEs appear regularly.
Storage and access to uploaded files
What the OWASP File Upload Cheat Sheet lists as mandatory, and what we've implemented but not yet discussed:
- Files are stored outside the webroot – uploaded PDFs live in shared storage between the API and worker, not in a publicly served directory. Nginx does not serve them.
- The original PDF is not exposed externally – only the conversion result (Markdown) is available through the API, and the original is deleted after processing.
- The application does not execute uploaded files – even if a non-PDF somehow got in, the server won't interpret it; the upload storage is not a public serving or execution point.
Polyglot files
A polyglot is a file that is simultaneously a valid PDF and, say, a valid ZIP or HTML. Our magic-bytes check passes (%PDF at the start), but other software may interpret the file differently.
Current status: only the first 4 bytes are checked; no deeper PDF-structure analysis. Solution: validate the PDF structure (xref table, trailer) or use specialized libraries for deep validation.
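As a first step beyond the 4-byte check, a shallow structural test can require the trailer markers that PDF readers conventionally look for near the end of the file. The function name is hypothetical and the heuristic is deliberately weak: it rejects files that merely start with %PDF, but a carefully built polyglot can still satisfy it, so it does not replace a real parser.

```go
package main

import "bytes"

// looksLikePDF checks for a "%PDF-" header at offset 0 and for "startxref"
// followed by "%%EOF" within the last 1024 bytes of the file.
func looksLikePDF(data []byte) bool {
	if !bytes.HasPrefix(data, []byte("%PDF-")) {
		return false
	}
	tail := data
	if len(tail) > 1024 {
		tail = tail[len(tail)-1024:]
	}
	eof := bytes.LastIndex(tail, []byte("%%EOF"))
	if eof < 0 {
		return false
	}
	// startxref must appear before the final %%EOF marker
	return bytes.Contains(tail[:eof], []byte("startxref"))
}
```

For large uploads the same idea works with two short reads (head and tail) on an io.ReadSeeker instead of loading the whole file into memory.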
DNS rebinding
As noted under SSRF, there is a time gap (TOCTOU) between the IP check and the actual request. An attacker could theoretically bypass validatePublicURL via a controlled DNS server: return a public IP on the first resolve, then a private IP on the second.
What already protects us: even with successful rebinding, the attacker only gets blind SSRF – the server can "touch" the internal address, but the response isn't returned: it won't pass the magic-bytes check (%PDF) and is rejected with a generic "URL does not point to a valid PDF file". No response data reaches the user.
What this doesn't cover: blind SSRF still lets you "reach" internal services with a GET. For critical infrastructure that can be undesirable – cloud metadata APIs may return IAM tokens via GET without auth. Full protection means moving IP validation to the TCP-connection level (dial-time validation) to close the TOCTOU gap.
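Dial-time validation can be sketched with the Control hook of net.Dialer, which runs after DNS resolution and immediately before the socket connects, so the address it sees is the one actually used. This is a minimal illustration with hypothetical names and timeouts, not the project's implementation. It assumes no HTTP proxy is configured, since with a proxy the dialed address would be the proxy's.

```go
package main

import (
	"errors"
	"net"
	"net/http"
	"net/netip"
	"syscall"
	"time"
)

var errBlockedDestination = errors.New("destination address is not public")

// denyPrivate is called with the already-resolved "ip:port" for every
// connection attempt, which closes the gap between check and use.
func denyPrivate(network, address string, _ syscall.RawConn) error {
	ap, err := netip.ParseAddrPort(address)
	if err != nil {
		return err
	}
	ip := ap.Addr().Unmap()
	if ip.IsUnspecified() || ip.IsLoopback() || ip.IsPrivate() ||
		ip.IsLinkLocalUnicast() || ip.IsMulticast() {
		return errBlockedDestination
	}
	return nil
}

// newFetchClient returns a client whose every connection passes denyPrivate.
func newFetchClient() *http.Client {
	dialer := &net.Dialer{Timeout: 10 * time.Second, Control: denyPrivate}
	return &http.Client{
		Timeout:   30 * time.Second,
		Transport: &http.Transport{DialContext: dialer.DialContext},
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse // keep redirects disabled
		},
	}
}
```

Because the check runs per connection, a DNS answer that changes between validation and connect no longer matters: the private address is refused at the moment of dialing.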
Summary: attacks and protection status
| Attack | Severity | Status | How we're protected |
|---|---|---|---|
| File type spoofing | High | Protected | MIME + extension + magic bytes |
| Disk exhaustion | High | Protected | MaxBytesReader + multipart limit + slots |
| Path traversal | Critical | Protected | Fixed filename input.pdf + UUID directories |
| SSRF | Critical | Risk reduced | IP denylist + redirect blocking (not allowlist, no dial-time validation) |
| Replay attack | Medium | Protected | Nonce + timestamp + ECDSA signature |
| Device spoofing | High | Protected | Asymmetric cryptography (P-256) |
| Application-level abuse | High | Risk reduced | Rate limit + signature + slots (not a substitute for network DDoS mitigation) |
| PDF bomb | Medium | Partial | Container isolation + OOM + 15 min timeout, no preventive analysis |
| Malicious PDF | Medium | Partial | No rendering + container isolation, no antivirus |
| Polyglot files | Low | Partial | Magic bytes only, no deep analysis |
| DNS rebinding | Low | Partial | Blind SSRF only – magic bytes block data leakage, no dial-time validation |
The approach scales
Everything above was shown on a PDF-to-Markdown converter, but the same set of principles applied to another UGC project with image uploads. There, instead of magic bytes %PDF – image.DecodeConfig() in Go (effectively the same magic-bytes idea via the standard library); instead of container isolation against PDF bombs – a pixel limit on input (reject before decoding into memory); and instead of a fixed input.pdf – server-generated UUIDs at every path level. Plus re-encoding: every uploaded image is decoded into a pixel buffer and re-encoded, destroying EXIF, GPS and embedded scripts and neutralizing polyglots. The specifics change (PDF vs. images, extension vs. web app), the foundation does not.
Principles worth taking with you
- Security is not "later." If you use an LLM to generate file-upload code, review the result against this checklist. "I'll add validation later" is technical debt that bites first.
- Never trust the frontend. Any client-side validation is UX, not security. accept=".pdf" filters accidents, not attacks. Real protection is on the server.
- Validate content, not metadata. Extension and Content-Type are "what the file calls itself." Magic bytes are "what the file actually is."
- Don't use user-provided filenames in storage paths. UUID + fixed name closes the storage-path side of path traversal. The name as metadata (DB, UI, Content-Disposition) still needs separate escaping.
- When uploading by URL, validate the IP before the request. SSRF can't be filtered after the request is sent. Resolve DNS, check ranges, block redirects.
- Make the attack economically unattractive. Absolute protection doesn't exist, but you can make the cost of an attack vastly exceed its payoff.
- Be honest about the gaps. Security is a process, not a state. Knowing your weak spots and planning to close them beats believing you're invulnerable.
This runs on a real product. The same backend powers pdf2md.dev: convert a PDF to clean, LLM-ready Markdown from the web app, the REST API or hosted MCP, with files auto-deleted after processing. The privacy notice states retention and training policy in plain language.