A webhook is one party's way of telling yours that something happened — Stripe says "this subscription got cancelled," GitHub says "this PR was merged," Cloudflare says "this build finished." Conceptually it's the simplest integration there is: they POST JSON to a URL you give them, you 2xx-acknowledge, and you act on the event.
It's also the place where production apps quietly lose data, and the failure modes are subtle:
- You don't verify the signature → anyone on the internet who guesses your URL can POST forged events and you'll process them as real.
- You handle the same event twice → you charge a customer twice, send a duplicate email, run a job again.
- You take too long → the sender times out and retries while your first attempt is still running.
- The event is malformed or unhandleable → it disappears and nobody knows.
This chapter is the four fixes. By the end you'll have a Stripe / GitHub / Cloudflare-compatible webhook receiver that verifies the sender, never double-processes, responds in milliseconds, and never silently drops a bad payload.
The Webhook Contract
Figure 1 — Every webhook sender enforces some variant of these rules: "we POST you JSON + a signature, we expect a 2xx within (typically) 30 seconds, and if we don't get one we retry — sometimes for days." Lose track of any of those expectations and you have a bug.
Fix #1: Verify the Signature (HMAC)
The single non-negotiable rule of webhooks: anyone on the public internet can POST to your URL. If you don't verify the request actually came from the sender you trust, you're processing forged events.
Every reputable webhook sender ships a shared secret and signs each request. Three near-identical patterns:
| Sender | Header | What's signed |
|---|---|---|
| Stripe | Stripe-Signature: t=…,v1=… | HMAC‑SHA256 of timestamp.raw_body (the timestamp, a literal ".", then the raw body) |
| GitHub | X-Hub-Signature-256: sha256=… | HMAC‑SHA256 of raw_body |
| Cloudflare | X-Signature-Ed25519 (or custom secret) | varies — check the integration's docs |
A production-shaped Stripe verifier in a Cloudflare Worker — every line matters:
async function verifyStripeSig(rawBody, header, secret) {
  if (!header) return false;
  // Parse "t=1717,v1=abc..."
  const parts = Object.fromEntries(header.split(",").map(p => p.split("=")));
  const t = parts.t;
  const v1 = parts.v1;
  if (!t || !v1) return false;
  // Optional: reject signatures older than ~5 minutes (replay defence).
  const ageSec = Math.floor(Date.now() / 1000) - parseInt(t, 10);
  if (ageSec > 300 || ageSec < -300) return false;
  // Compute HMAC over EXACTLY the bytes Stripe signed: "timestamp.rawBody".
  const enc = new TextEncoder();
  const key = await crypto.subtle.importKey(
    "raw", enc.encode(secret),
    { name: "HMAC", hash: "SHA-256" }, false, ["sign"]
  );
  const sigBytes = new Uint8Array(
    await crypto.subtle.sign("HMAC", key, enc.encode(`${t}.${rawBody}`))
  );
  const expected = [...sigBytes].map(b => b.toString(16).padStart(2, "0")).join("");
  // CONSTANT-TIME COMPARE: never use === for HMAC outputs.
  if (expected.length !== v1.length) return false;
  let diff = 0;
  for (let i = 0; i < expected.length; i++) {
    diff |= expected.charCodeAt(i) ^ v1.charCodeAt(i);
  }
  return diff === 0;
}
The Raw-Body Trap
There's one bug everyone writes the first time they handle a webhook. They do this:
// 🚫 BROKEN — re-serialising changes the byte stream.
const parsed = await req.json();
const ok = await verifyStripeSig(JSON.stringify(parsed), sig, secret);
That fails. A JSON.parse → JSON.stringify round trip can reorder keys, change whitespace, and drop trailing newlines, and any of those changes alters the bytes the sender signed. The HMAC was computed over the exact bytes the sender sent, not the round-tripped version.
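To make the byte drift concrete, here is a minimal sketch with a made-up payload (the event id and amount are illustrative, not real Stripe data). It only shows that the round-tripped string differs from the original; any HMAC over the two would differ as well.

```javascript
// Illustrative payload: the sender's exact bytes, with spaces after the
// colons, a float written as 10.0, and a trailing newline.
const rawBody = '{"id": "evt_123", "amount": 10.0}\n';

// Round-trip through the parser, as the broken handler does.
const roundTripped = JSON.stringify(JSON.parse(rawBody));

// Print both escaped so the whitespace differences are visible.
console.log(JSON.stringify(rawBody));
console.log(JSON.stringify(roundTripped));
console.log(rawBody === roundTripped); // false: spacing, "10.0" and the newline are gone
```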
The right shape:
// ✅ CORRECT — verify against the original bytes, parse after.
const rawBody = await req.text();
const sig = req.headers.get("Stripe-Signature");
if (!(await verifyStripeSig(rawBody, sig, env.STRIPE_WEBHOOK_SECRET))) {
  return new Response("bad signature", { status: 400 });
}
const event = JSON.parse(rawBody); // only NOW is it safe to parse
req.text() first, JSON.parse after. Memorise it.
Fix #2: Idempotency — Don't Process the Same Event Twice
Webhook senders retry. Stripe retries failed webhooks for up to 3 days with exponential backoff. GitHub does not automatically redeliver failed deliveries, but the same delivery can be redelivered manually or through its API. Cloudflare retries depend on the integration.
That means: even if your handler is perfect, the same event will sometimes arrive twice — once because of network flakiness, once because Stripe didn't get your 2xx in time. If you charge a card or send an email twice you have a real bug.
The fix is an idempotency key, and every reputable sender already gives you one in the event payload (event.id). Two patterns:
Pattern A: KV-backed idempotency (cheap, simple)
const eventId = event.id;
const seen = await env.KV.get(`webhook:stripe:${eventId}`);
if (seen) {
  // Already processed; just ack and stop.
  return new Response("duplicate, already handled", { status: 200 });
}
await processStripeEvent(event, env);
// Record so the *next* duplicate is short-circuited. 7-day window covers
// Stripe's longest retry horizon.
await env.KV.put(`webhook:stripe:${eventId}`, "1", { expirationTtl: 60 * 60 * 24 * 7 });
return new Response("ok", { status: 200 });
This has one race condition: two concurrent retries can both pass the get, and because KV is eventually consistent a very recent put may not be visible yet. Stripe's retry policy spaces events seconds-to-minutes apart, so in practice it's fine for low-rate webhook flows.
Pattern B: Database UNIQUE constraint (rock-solid, slightly more work)
If you can't tolerate the race, use a uniqueness constraint in D1:
CREATE TABLE webhook_log (
  event_id TEXT PRIMARY KEY,                      -- e.g. "evt_1Q7..."
  source   TEXT NOT NULL,                         -- "stripe" / "github" / …
  received INTEGER NOT NULL DEFAULT (unixepoch())
);
try {
  await env.DB.prepare(`
    INSERT INTO webhook_log (event_id, source) VALUES (?, ?)
  `).bind(event.id, "stripe").run();
} catch (e) {
  if (e.message.includes("UNIQUE")) {
    return new Response("duplicate", { status: 200 }); // already handled
  }
  throw e;
}
await processStripeEvent(event, env);
D1 serialises the INSERT through its single-writer model, so even concurrent duplicates can't both succeed. Use this when you genuinely cannot afford a double-process.
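The catch block above depends on the wording of the error message. One possible alternative, sketched here, lets SQLite skip the duplicate row and then inspects the affected-row count. It assumes the webhook_log table above, and that the D1 result object reports the row count as meta.changes; claimEvent is a hypothetical helper name.

```javascript
// Returns true if this call recorded the event first, false if it was a duplicate.
// `db` is assumed to be a D1 binding such as env.DB.
async function claimEvent(db, eventId, source) {
  const result = await db
    .prepare(
      "INSERT INTO webhook_log (event_id, source) VALUES (?, ?) " +
      "ON CONFLICT (event_id) DO NOTHING"
    )
    .bind(eventId, source)
    .run();
  // Assumption: one changed row means the insert happened, zero means it was skipped.
  return result.meta.changes === 1;
}
```

The handler would then read: if (!(await claimEvent(env.DB, event.id, "stripe"))) return new Response("duplicate", { status: 200 }); followed by the call to processStripeEvent.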
Fix #3: Respond Fast — Move Work Off the Hot Path
Every webhook sender has a timeout. Stripe gives you 30 seconds; if you don't 2xx by then it counts as a failure and starts retrying. If your handler does expensive work inline — calling an LLM, transcoding a video, sending 50 emails — you'll routinely time out and trigger spurious retries.
Two ways to fix this on Cloudflare:
ctx.waitUntil — fire-and-forget background work
The simplest pattern: respond 2xx immediately, do the work after.
export default {
  async fetch(req, env, ctx) {
    const rawBody = await req.text();
    if (!(await verifyStripeSig(rawBody, req.headers.get("Stripe-Signature"), env.STRIPE_WEBHOOK_SECRET))) {
      return new Response("bad signature", { status: 400 });
    }
    const event = JSON.parse(rawBody);
    // Idempotency check (Fix #2) - synchronous and cheap.
    const seen = await env.KV.get(`webhook:stripe:${event.id}`);
    if (seen) return new Response("dup", { status: 200 });
    // Schedule the real work to run after the response is sent.
    ctx.waitUntil(processStripeEventAndMark(event, env));
    return new Response("ok", { status: 200 });
  },
};
ctx.waitUntil lets the response leave the Worker immediately while the handler keeps running in the background (up to the standard Worker CPU budget). Good for work that takes 1–30 seconds.
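The handler above calls processStripeEventAndMark without defining it. One plausible shape, assuming it simply combines the processing step with the KV write from Pattern A, is sketched below; the processStripeEvent body is a placeholder for your application logic.

```javascript
// Placeholder for the application's real handler (not defined in this chapter).
async function processStripeEvent(event, env) {
  if (!event.id) throw new Error("malformed event");
  console.log("handling", event.type, event.id);
}

// Run the handler, then record the event id so later duplicates are short-circuited.
// If the handler throws, the key is never written. The 200 has already been sent
// by then, so the sender will not retry: log or dead-letter the failure (Fix #4).
async function processStripeEventAndMark(event, env) {
  await processStripeEvent(event, env);
  await env.KV.put(`webhook:stripe:${event.id}`, "1", {
    expirationTtl: 60 * 60 * 24 * 7, // same 7-day window as Pattern A
  });
}
```

Marking after processing means a crash mid-handler leaves the event unmarked; marking before processing trades that for the risk of skipping an event that never finished. Which is preferable depends on whether a repeat or a miss is worse for the event in question.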
Workers Queues — for genuinely long-running fan-out
When the work takes longer than a Worker's budget — or you need real retries / batching — push the event to a Cloudflare Queue (we cover Queues end-to-end in Chapter 4). The receiver becomes a one-liner:
// Await the enqueue so a failed send surfaces as a non-2xx and the sender retries.
await env.WEBHOOK_QUEUE.send({ source: "stripe", event });
return new Response("ok", { status: 200 });
A consumer Worker on the queue then takes its time, with built-in retries and dead-letter handling.
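For orientation, a consumer for those messages might look roughly like the sketch below. It assumes the message body has the { source, event } shape sent above; processStripeEvent is again a placeholder, and the object is assigned to a constant here only so the snippet runs standalone.

```javascript
// Placeholder for the application's real handler (not defined in this chapter).
async function processStripeEvent(event, env) {
  if (!event || !event.id) throw new Error("malformed event");
}

const consumer = {
  // Called by the Queues runtime with a batch of messages.
  async queue(batch, env) {
    for (const msg of batch.messages) {
      try {
        await processStripeEvent(msg.body.event, env);
        msg.ack();   // done: do not redeliver this message
      } catch (err) {
        console.error("webhook job failed", msg.id, err);
        msg.retry(); // redeliver later, up to the queue's retry limit
      }
    }
  },
};
```

In a real Worker module the consumer object would be the default export. Acknowledging per message keeps one bad event from forcing redelivery of the whole batch.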
Fix #4: Dead-Letter — Handle the Inevitable Bad Payload
Some events you genuinely cannot process — malformed JSON, a customer ID that doesn't exist, a code path you haven't written yet. The wrong move is to 500 and trigger retries forever. The right move is to acknowledge with 2xx + record for human review.
try {
  await processStripeEvent(event, env);
} catch (err) {
  // Don't trigger sender retries for code bugs.
  await env.DB.prepare(`
    INSERT INTO webhook_dead_letter (event_id, source, payload, error, at)
    VALUES (?, 'stripe', ?, ?, unixepoch())
  `).bind(event.id, JSON.stringify(event), String(err)).run();
  // Still 2xx: the sender did its job; the problem is ours.
}
return new Response("ok", { status: 200 });
That gives you a queryable backlog of broken events you can replay once you've fixed the bug, without poisoning the sender's retry queue.
Workers Queues also have a built-in dead-letter queue (DLQ) — set dead_letter_queue in wrangler.toml and messages that fail their retry budget land there automatically.
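Replaying the table-based backlog could look something like the sketch below. It assumes the webhook_dead_letter columns used in the INSERT above and a D1 binding at env.DB; replayDeadLetters is a hypothetical helper, and processStripeEvent is a placeholder for the fixed handler.

```javascript
// Placeholder for the application's (now fixed) handler.
async function processStripeEvent(event, env) {
  if (!event.id) throw new Error("malformed event");
}

// Re-run dead-lettered events. Rows that now succeed are deleted;
// rows that still fail stay in the table for another look.
async function replayDeadLetters(env, limit = 50) {
  const { results } = await env.DB
    .prepare(
      "SELECT event_id, payload FROM webhook_dead_letter " +
      "WHERE source = 'stripe' ORDER BY at LIMIT ?"
    )
    .bind(limit)
    .all();
  let replayed = 0;
  for (const row of results) {
    try {
      await processStripeEvent(JSON.parse(row.payload), env);
      await env.DB
        .prepare("DELETE FROM webhook_dead_letter WHERE event_id = ?")
        .bind(row.event_id)
        .run();
      replayed++;
    } catch (err) {
      console.error("still failing", row.event_id, err);
    }
  }
  return replayed;
}
```

A function like this could sit behind an authenticated admin route or a scheduled handler; either way the handler it calls still needs to be idempotent.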
Putting It Together — The Production Webhook Worker
The full pattern, in under 30 lines:
export default {
  async fetch(req, env, ctx) {
    if (req.method !== "POST") return new Response("Method not allowed", { status: 405 });
    // 1. RAW body, then verify signature (Fix #1, raw-body trap).
    const rawBody = await req.text();
    const sig = req.headers.get("Stripe-Signature");
    if (!(await verifyStripeSig(rawBody, sig, env.STRIPE_WEBHOOK_SECRET))) {
      return new Response("bad signature", { status: 400 });
    }
    const event = JSON.parse(rawBody);
    // 2. Idempotency (Fix #2).
    const key = `webhook:stripe:${event.id}`;
    if (await env.KV.get(key)) return new Response("dup", { status: 200 });
    await env.KV.put(key, "1", { expirationTtl: 60 * 60 * 24 * 7 });
    // 3. Respond fast; do work in background (Fix #3).
    ctx.waitUntil((async () => {
      try {
        await processStripeEvent(event, env);
      } catch (err) {
        // 4. Dead-letter on unrecoverable failure (Fix #4).
        await env.DB.prepare(
          "INSERT INTO webhook_dead_letter (event_id, source, payload, error, at) VALUES (?, 'stripe', ?, ?, unixepoch())"
        ).bind(event.id, rawBody, String(err)).run();
      }
    })());
    return new Response("ok", { status: 200 });
  },
};
That's a webhook handler you can actually ship: signed, deduplicated, fast, and unable to silently lose data.
Sending Webhooks (the Other Direction)
If you're the sender — your app POSTs to a customer's URL on some event — the same rules invert:
- Sign your payloads. HMAC the body with a per-customer secret, send the signature in a header. Document it.
- Retry on non-2xx with exponential backoff (e.g. 5s, 30s, 5m, 30m, 6h, give up). Cloudflare Queues does this for you if you use it as the outbound queue.
- Include an event.id in the payload. It's how the receiver does their idempotency.
- Time out aggressively (5–10 seconds). A receiver that can't 2xx fast doesn't get unbounded retries from you.
Mental Model — Three Sentences
- Verify the signature against the raw body before parsing JSON: the HMAC was computed over the exact bytes the sender sent, and round-tripping through JSON.parse + JSON.stringify silently breaks it.
- Dedupe by event.id (KV for cheap, D1 UNIQUE for race-proof), 2xx within milliseconds by moving real work into ctx.waitUntil or a Cloudflare Queue, and dead-letter unrecoverable events to a table instead of letting the sender retry forever.
- Webhook senders retry, so bake idempotency in from day one and the same event arriving twice becomes a non-event rather than a duplicate charge or duplicate email.
Try It Yourself (15 Minutes)
- Install the Stripe CLI and run stripe listen --forward-to http://localhost:8787/webhook. Hit your local Worker with stripe trigger checkout.session.completed.
- Implement the verifier in this chapter. Confirm the trigger returns 200 from your Worker (logs ok in the CLI). Now intentionally break the secret in wrangler.toml; confirm 400.
- Add the idempotency KV check. Replay the same event by resending it with stripe events resend <event_id> (each stripe trigger run creates a new event with a new id); confirm the second delivery returns 200 "dup" without running the handler.
- Move the handler body into ctx.waitUntil and add a setTimeout(..., 5000) simulating real work. Confirm the response is sub-second and the work still happens.
- Force the handler to throw on one event; confirm it lands in your webhook_dead_letter table and the sender DOES NOT keep retrying (because you still 2xx'd).
Where This Lands in the Series
You can now accept events from the outside world safely. The other half of "production async work" is generating your own events on a schedule — daily digest emails, hourly recomputed leaderboards, periodic data cleanup — plus running expensive jobs without holding up a user request.
Next chapter: Cron Triggers + Workers Queues — Workers Cron for scheduled work, Workers Queues for async fan-out, the retry / batching / dead-letter semantics you get for free, and when each beats the other.