Why Cloudflare Workers Is the Right Runtime for API Key Security

Most API key proxies run on containerised servers - a Node.js process shared across thousands of concurrent requests, with a centralised secrets manager and a single point of failure. Cloudflare Workers' architecture is different in every dimension that matters for security. This article explains why.

11 min read · May 2026 · KeyVault Edge Team

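V8 isolates: no shared memory between Workers

The fundamental property that makes Cloudflare Workers suitable for handling decrypted API keys is the V8 isolate model. Each Worker runs in its own V8 isolate - a lightweight, sandboxed JavaScript execution context with a private heap. Isolates do not share memory with each other, so code in one Worker cannot read another Worker's variables. A warm isolate may serve more than one request for the same Worker, so per-request confidentiality also depends on keeping the key in request-scoped variables rather than in global scope.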

This is categorically different from how Node.js servers handle concurrency. A Node.js process runs a single event loop and shares heap memory across all concurrent requests. If a decrypted API key is stored in a variable, that variable exists in the shared heap - accessible (in principle) to any code running in the same process, including third-party dependencies or a malicious module introduced through supply-chain compromise.

Memory model comparison

Node.js process

Single heap, shared across requests
Concurrent requests can observe each other's memory (via bugs/malicious deps)
Key persists in heap until garbage collected
Supply-chain attack can read any variable

Cloudflare Worker isolate

Private heap per isolate
No cross-isolate memory access by design
Request-scoped key becomes unreachable when the handler returns
Isolate boundaries enforced at V8 engine level

For API key proxying, this matters because the decrypted key needs to live in Worker memory only for the duration of a single request - from the point of HSM unwrap to the moment the upstream response begins streaming. Provided the key is held in a request-scoped variable and never written to global scope, nothing references it once the handler returns, and its memory is reclaimed by garbage collection or when the isolate is evicted. The key is never written to disk and is never visible to another isolate.
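One way to keep a decrypted key confined to a single request is to hold it only in a buffer local to the handler and overwrite that buffer before returning. The sketch below assumes hypothetical `unwrap` and `send` callbacks standing in for the HSM unwrap/decrypt steps and the upstream call; it is an illustration under those assumptions, not KeyVault Edge's actual code.

```typescript
// Hypothetical stand-in for the HSM unwrap + decrypt steps: returns raw key bytes.
type Unwrap = (tokenId: string) => Uint8Array;
// Hypothetical stand-in for the upstream call: returns the upstream status code.
type Send = (authorization: string) => number;

// Handles one request. The key lives only in `keyBytes`, a handler-local buffer;
// nothing is written to module (global) scope, so a later request served by the
// same warm isolate cannot see it.
function proxyOnce(tokenId: string, unwrap: Unwrap, send: Send): number {
  const keyBytes = unwrap(tokenId);
  try {
    // Build the Authorization header value and hand it to the upstream call.
    return send("Bearer " + new TextDecoder().decode(keyBytes));
  } finally {
    // Overwrite the buffer whether the upstream call succeeded or threw.
    keyBytes.fill(0);
  }
}
```

Note the limit of this technique: only the byte buffer can be overwritten. The header string built from it is an immutable JavaScript string and stays in the heap until it is garbage collected, which is why keeping it out of global scope matters.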

Sub-5ms cold starts vs Lambda latency

Lambda functions and containerised proxies suffer from cold starts - the latency penalty when a new container must be initialised before a request can be served. Cold starts for AWS Lambda range from 50ms to several seconds depending on runtime and memory size. For an API key proxy that adds itself to every outbound API call, that latency tax compounds with every request.

Cloudflare Workers use a different model: isolates are created in under 5 milliseconds and are pre-warmed across Cloudflare's global network. The Worker itself typically adds 0–2ms per request. Routing through KeyVault Edge adds the HSM unwrap call (~2–5ms) on top of that, 2–7ms combined, which fits inside a total budget of 5–10ms before the upstream API call begins.

Runtime                 Cold start p50    Warm request overhead
AWS Lambda (Node.js)    200–800ms         0–5ms
GCP Cloud Run           500–3000ms        0–10ms
Fly.io (container)      50–300ms          0–3ms
Cloudflare Workers      <5ms              0–2ms

Approximate figures. Cold start times vary with memory, package size, and PoP location.

The latency advantage is not just about developer experience - it affects the security architecture. A proxy with high cold-start latency creates pressure to cache decrypted keys across requests (to amortise the startup cost). Caching a decrypted key is a security regression. Workers' sub-5ms cold starts remove that incentive - every request can safely do a fresh HSM unwrap without meaningfully affecting end-to-end latency.

300+ PoPs: edge execution near every user

Cloudflare operates more than 300 Points of Presence globally. A request from a developer in Singapore, São Paulo, or Warsaw hits a Cloudflare edge node within 20–50ms in almost all cases. The Worker executes at that nearby PoP - not in us-east-1.

For an API key proxy, geographic distribution matters in two ways:

  • Latency: The overhead a proxy adds depends on the client-to-proxy hop, not the client-to-origin hop. If the proxy runs near the user rather than in a single data centre, that hop shrinks from 100–200ms to 5–20ms in most regions.
  • DDoS resilience: Cloudflare's network has more than 200Tbps of capacity for absorbing DDoS traffic. An API key proxy running on Cloudflare inherits that resilience - a volumetric attack against the proxy endpoint does not reach the HSM or the origin.

Centralised proxies (a single region VM or container cluster) create a geography-dependent latency problem and a single-region availability risk. A proxy that is unavailable means API calls fail. The only architecturally sound choice for a proxy that sits in the critical path of every API call is one that is globally distributed by default.
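To confirm which PoP actually served a request, a Worker can read the `colo` field (a three-letter airport code) that Cloudflare attaches to incoming requests under `request.cf`. A minimal sketch, using a pared-down `RequestLike` type in place of the full Workers request type so the snippet stands alone; the `servingPop` name is illustrative.

```typescript
// Pared-down shape of the parts of a Workers request used here (an assumption
// for this sketch; the real request object carries many more fields).
interface RequestLike {
  cf?: { colo?: string };
}

// Returns the code of the PoP that handled the request, or "unknown" when the
// field is absent (for example in a local test harness).
function servingPop(request: RequestLike): string {
  return request.cf?.colo ?? "unknown";
}
```

Logging this value alongside request timing is one way to check that requests from a given region are being served nearby rather than from a distant data centre.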

Stateless by default - why that matters for secrets

Workers have no durable in-process storage. Request-scoped variables do not survive the request, and there is no long-lived server process in which state builds up: global-scope values may linger while an isolate stays warm, but isolates are evicted without notice and nothing in them can be relied on to persist.

This is a security property, not just a technical constraint. A long-running server process accumulates secrets over time - tokens, decrypted keys, session data - in its heap. A Worker that keeps secrets out of global scope accumulates nothing: everything a request decrypts goes out of scope when that request ends, and the isolate's heap is discarded on eviction.

This also limits what a breach can expose: if a Worker were ever compromised, the attacker would gain access only to data belonging to the requests in flight at that moment. There is no historical accumulation of secrets to exfiltrate.

What Workers can't do - and why that's fine

Workers have constraints that would be significant for general-purpose server applications but are non-issues for an API key proxy:

CONSTRAINT

CPU time limit (typically 50ms per request)

WHY IT DOESN'T MATTER

API key decryption and header injection take under 2ms. The limit is irrelevant for proxy workloads.

CONSTRAINT

Restricted network access (outbound fetch() and TCP connect() only)

WHY IT DOESN'T MATTER

The proxy needs only fetch(), which is sufficient for HTTP/HTTPS forwarding to any upstream API provider.

CONSTRAINT

No filesystem access

WHY IT DOESN'T MATTER

Not needed. Keys are stored in KV and HSM, not on disk.

CONSTRAINT

128MB memory limit per isolate

WHY IT DOESN'T MATTER

An API key proxy doing AES-GCM decryption and header injection uses under 10MB. Well within the limit.

The constraints that make Workers challenging for complex server applications - no long-running processes, no arbitrary I/O, no large in-memory state - are precisely the properties that make Workers excellent for a security proxy. The architecture forces the exact runtime behaviour you want when handling live API keys.

The full KeyVault Edge architecture on Workers

Here is how KeyVault Edge uses Workers specifically for the proxy flow:

  1. Request arrives at the closest Cloudflare PoP (≤50ms for most of the world). TLS 1.3 terminates at the edge.
  2. Worker reads the Authorization header containing the host-bound token. Validates token format and extracts the token ID.
  3. Worker validates the request origin against the token's host-binding. If the origin domain or IP does not match, the request is rejected immediately - no HSM call, no key access.
  4. Worker calls the HSM unwrap endpoint to decrypt the Data Encryption Key (DEK) for this token. One network hop, ~2–5ms.
  5. DEK decrypts the AES-256-GCM ciphertext stored in Cloudflare KV. The real API key is now in the isolate's local heap - isolated, non-shared memory.
  6. Worker constructs the outbound request with the real API key injected into the Authorization header. The sanitized token is replaced.
  7. Response streams back from the upstream API provider through the Worker to the client. As soon as the response begins streaming, the key variable is explicitly zeroed.
  8. Request scope ends. Nothing references the key any longer, and its memory is reclaimed along with the rest of the isolate's heap.
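Steps 2, 3 and 6 above can be sketched as three small pure functions. The `kve_` token prefix, the `TokenRecord` shape and every function name here are hypothetical: the article does not specify the token format or how host-bindings are stored, so treat this as an illustration of the checks rather than the production logic.

```typescript
// Hypothetical record looked up by token ID; only the bound host is modelled.
interface TokenRecord {
  id: string;
  boundHost: string;
}

// Step 2: validate the header format and extract the token ID.
// Assumes a "Bearer kve_<alphanumeric id>" format, which is an invention for this sketch.
function extractTokenId(authorization: string | undefined): string | null {
  const match = /^Bearer kve_([A-Za-z0-9]+)$/.exec(authorization ?? "");
  return match?.[1] ?? null;
}

// Step 3: compare the request's Origin against the token's host-binding.
// Any origin that fails to parse is rejected.
function originMatches(record: TokenRecord, origin: string): boolean {
  try {
    return new URL(origin).hostname === record.boundHost;
  } catch {
    return false;
  }
}

// Step 6: copy the incoming headers, dropping the client's Authorization header
// (whatever its casing) and injecting the real API key in its place.
function buildUpstreamHeaders(
  incoming: Record<string, string>,
  apiKey: string,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(incoming)) {
    if (name.toLowerCase() !== "authorization") out[name] = value;
  }
  out["authorization"] = `Bearer ${apiKey}`;
  return out;
}
```

Running the origin check before any HSM call keeps the rejection path cheap and means a mismatched request never causes key material to be decrypted.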

Total added latency for the proxy: HSM unwrap (2–5ms) + Worker overhead (0–2ms) = 2–7ms, inside the 5–10ms p50 budget. The architecture achieves sub-40ms end-to-end overhead in all major geographies.
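The latency arithmetic can be checked by adding the component ranges end to end. A small sketch: the `RangeMs` type and `addRanges` helper are illustrative names, and the KV read in step 5 is left out because no figure is given for it.

```typescript
// A latency range in milliseconds.
interface RangeMs {
  min: number;
  max: number;
}

// Sums sequential latency components: minimums add up, and so do maximums.
function addRanges(...parts: RangeMs[]): RangeMs {
  return parts.reduce(
    (acc, p) => ({ min: acc.min + p.min, max: acc.max + p.max }),
    { min: 0, max: 0 },
  );
}

const hsmUnwrap: RangeMs = { min: 2, max: 5 };      // figure quoted in the article
const workerOverhead: RangeMs = { min: 0, max: 2 }; // figure quoted in the article

const total = addRanges(hsmUnwrap, workerOverhead);
console.log(`${total.min}-${total.max}ms`); // prints "2-7ms"
```

The two quoted components therefore account for 2–7ms; the rest of a 5–10ms budget is headroom for anything not itemised, such as the KV read.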

See the architecture in production

KeyVault Edge is built on exactly the architecture described above. The full encryption flow, memory handling policy, and employee access model are documented publicly.