The Modern Delivery Pipeline: From Commit to Production with Agents in the Loop

Delivery · Chapter 1 of the Modern Delivery Pipeline · 29 min · June 7, 2026 · Intermediate

Most tutorials teach you to write code. Almost none teach you the machine that carries that code from the moment you save a file to the moment a stranger uses it in production — and how that machine changed in 2026, when AI agents stopped being a chat window on the side and moved inside the loop: opening pull requests, reviewing diffs, fixing their own CI failures.

This is the map chapter for a whole series about that machine — the delivery pipeline. By the end you'll have the shape of the entire thing in your head: the canonical path from commit to production, exactly where agents are allowed to act (and the gates where a human must stay), the handful of principles every good pipeline obeys, the single common shape hiding inside five wildly different release targets — Cloudflare web, iOS, Android, Mac, Windows — and a concrete reference architecture for a real indie setup: one always-on Mac mini and a few developer MacBook Pros. Later chapters zoom into each box. This one draws the whole board.

The Shape of a Modern Delivery Pipeline

Strip away tools and brand names and every modern pipeline is the same five-act play. A change is authored, proposed, verified, shipped, and operated — and it only moves forward when each act passes.

Figure 1 — The five acts every change passes through. The tools differ wildly between a Cloudflare Worker and a Windows installer, but the acts never change. Most of this series is just "what happens inside each box, per target."

Two properties make this a pipeline and not just a checklist:

  1. It's gated. A change can't skip an act. No merge without a green check; no deploy without a merge. Each gate is a place where "no" is allowed and cheap.
  2. It's one-directional and reproducible. The same change, run twice, produces the same result — because the artifact built in act ④ is the exact artifact that ships, not a fresh rebuild that might differ.

Everything else in this chapter hangs off that diagram.
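
The gating property is easy to see in a few lines of code. The sketch below is a minimal illustration, not any real CI system: `run_pipeline` and the gate functions are hypothetical names, and each gate is simply a function that returns True or False. A change advances act by act and stops at the first gate that says no.

```python
from typing import Callable

# The five acts from the text, in order. The gate functions passed in are
# hypothetical stand-ins for real checks (review approved, CI green, ...).
ACTS = ["author", "propose", "verify", "ship", "operate"]


def run_pipeline(gates: dict[str, Callable[[], bool]]) -> list[str]:
    """Run each act in order and stop at the first gate that fails.

    Returns the list of acts that passed. An act with no gate registered
    is treated as failing, so an act can never be skipped silently.
    """
    passed: list[str] = []
    for act in ACTS:
        gate = gates.get(act)
        if gate is None or not gate():
            break  # "no" is allowed and cheap: nothing after this act runs
        passed.append(act)
    return passed


if __name__ == "__main__":
    all_green = {act: (lambda: True) for act in ACTS}
    print(run_pipeline(all_green))  # every act passes

    red_ci = dict(all_green, verify=lambda: False)
    print(run_pipeline(red_ci))     # stops before verify; ship never runs
```

The design choice worth noticing is the default: a missing gate counts as a failure, which is the code-level version of "a change can't skip an act."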

Where the Agents Actually Sit

Here's what's genuinely new, and where most teams are still finding their footing. In 2026 an AI agent can do real work in acts ① and ③ — it can write the change and review it. The discipline that keeps this safe is one sentence:

Agents propose; humans dispose. An agent may author a branch, open a PR, review a diff, and fix its own failing checks — but the irreversible, outward-facing gates (approve, merge, deploy, submit-for-review) stay with a human.

This isn't fear of the technology; it's the same principle that governs the SimpleAppShipper app itself, whose Safeguard system forbids agents from submitting an app for App Store review automatically. The rule generalises: let agents do the reversible work at full speed, and keep a human on every gate that's hard to undo.
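
One way to make "agents propose; humans dispose" mechanical is a deny-by-default permission check. The sketch below is an assumption-laden illustration, not the Safeguard system mentioned above: the action names and the `is_allowed` function are hypothetical labels chosen to mirror the lists in the rule.

```python
# Reversible work an agent may do on its own (names are illustrative).
AGENT_ALLOWED = {"author_branch", "open_pr", "review_diff", "fix_failing_checks"}

# Irreversible, outward-facing gates that stay with a human.
HUMAN_ONLY = {"approve", "merge", "deploy", "submit_for_review"}


def is_allowed(actor: str, action: str) -> bool:
    """Return True if `actor` ("agent" or "human") may perform `action`."""
    if action not in AGENT_ALLOWED | HUMAN_ONLY:
        return False  # unknown actions are denied by default
    if actor == "human":
        return True   # humans may do both the reversible and the gated work
    if actor == "agent":
        return action in AGENT_ALLOWED
    return False      # unknown actors are denied too


if __name__ == "__main__":
    for action in ("open_pr", "merge"):
        print(action, is_allowed("agent", action))
```

Keeping the two sets explicit means adding a new agent capability is a deliberate, reviewable change to one line rather than an accident of configuration.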

Figure 2 — The agent-assisted PR loop. Agents drive the fast, reversible inner cycle (write → review → fix → re-review). The human owns the slow, irreversible outer gate (approve + merge). The edit→re-review loop can run several times before a human ever looks — so the human reviews a diff that's already clean.

The payoff is leverage, not replacement: the human's scarce attention is spent on intent and risk ("is this the right change, and what could it break?") instead of mechanical nits an agent already caught. Ch 2 of this series makes this loop concrete — author-agent and reviewer-agent prompts, the guardrails (least privilege, no secrets, no force-push to main), and how to keep the bar up while the speed goes up. (For the human half of reviewing, the web series' Pull Requests and Code Review chapters are the prerequisite.)

The Non-Negotiable Principles

Tools change every year; these don't. Every reliable pipeline — web or native, solo or 500-engineer — obeys this short list. If you only remember one section of this chapter, make it this one.

| Principle | What it means | Why it matters |
| --- | --- | --- |
| Trunk-based, short-lived branches | One long-lived branch (main); feature branches live hours-to-days, not weeks. | Small diffs are reviewable (by humans and agents) and rarely conflict. Long branches rot. |
| The PR is the unit of change | Nothing reaches main except through a reviewed, checked pull request. | Every change gets a gate, a record, and a revert handle. No "just pushed a hotfix to prod." |
| Branch protection + required checks | main physically refuses un-reviewed or red-CI merges. (git Ch 3) | Makes the rules mechanical instead of a thing people remember to do. |
| One check script, many callers | Local, CI, and the agent all run the same npm run check. (web Ch 25) | "Green on my machine" and "green in CI" can never disagree about what they checked. |
| Build once, deploy many | Produce one immutable artifact; promote that same artifact through environments. | What you tested in staging is byte-for-byte what ships. No "rebuild for prod" surprises. |
| Environments + promotion | dev → preview/staging → production, each a gate, each with its own secrets. | Catch problems on a copy before real users meet them. |
| Least-privilege secrets | Each runner gets only the secrets it needs; nothing sensitive in git or in PR logs. | A leaked CI token shouldn't be able to sign a release and drain a bucket. |
| Progressive delivery | Ship to a slice first (preview URL, TestFlight, a % rollout), then widen. | Blast radius of a bad release is a few users, not all of them. |
| Observability + fast rollback | You can see a release is bad, and undo it in one step. | Mean-time-to-recovery beats mean-time-between-failures. Things break; recover fast. |

Notice these are mostly about making "no" cheap and "undo" fast — not about preventing all mistakes. A pipeline's job isn't perfection; it's to make the cost of any single mistake small.
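
"Build once, deploy many" is the principle that benefits most from a concrete shape. The sketch below assumes a three-environment chain (dev, staging, production) and identifies an artifact by its SHA-256 digest; the `Promoter` class and its method names are hypothetical. An environment accepts an artifact only if the previous environment is already running the byte-identical build.

```python
import hashlib


def digest(artifact: bytes) -> str:
    """Content hash that identifies one exact build output."""
    return hashlib.sha256(artifact).hexdigest()


class Promoter:
    """Tracks which artifact digest each environment is running."""

    ORDER = ["dev", "staging", "production"]

    def __init__(self) -> None:
        self.deployed: dict[str, str] = {}

    def deploy(self, env: str, artifact: bytes) -> None:
        """Deploy to `env`, but only the artifact the previous env already runs."""
        idx = self.ORDER.index(env)  # raises ValueError for unknown environments
        if idx > 0:
            previous = self.ORDER[idx - 1]
            if self.deployed.get(previous) != digest(artifact):
                raise ValueError(f"{env}: artifact was not verified in {previous}")
        self.deployed[env] = digest(artifact)


if __name__ == "__main__":
    promoter = Promoter()
    build = b"app-v1"  # stands in for the real build output
    for env in Promoter.ORDER:
        promoter.deploy(env, build)
    print(promoter.deployed["production"] == digest(build))
```

A fresh rebuild, even from the same commit, would produce a different digest if anything in the toolchain drifted, and this check would refuse to promote it.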

One Pipeline, Five Targets

Here's the idea that makes cross-platform shipping tractable: iOS, Android, Mac, Windows, and a Cloudflare web app look completely different up close, but they're the same six stages underneath. Source becomes a build, the build gets signed so the platform trusts it, it's packaged into the format that platform installs, distributed through that platform's channel, and finally updated on machines that already have it.

Figure 3 — The universal release spine. Learn these six stages once and every platform becomes "the same thing, different nouns." The table below fills in the nouns.

| Stage | 🌐 Cloudflare web | 🍎 iOS | 🤖 Android | 💻 Mac (direct) | 🪟 Windows |
| --- | --- | --- | --- | --- | --- |
| Build | OpenNext / wrangler | xcodebuild | Gradle → AAB | xcodebuild | MSBuild / dotnet |
| Sign | —¹ | Cert + provisioning profile | Upload keystore | Developer ID + notarize | Authenticode (EV/OV) cert |
| Package | Worker bundle | .ipa | .aab / .apk | .dmg / .pkg | MSIX / MSI / EXE |
| Distribute | Workers/Pages deploy | TestFlight → App Store | Play Console tracks | R2 / website download | winget / MS Store / site |
| Update | Instant (edge) | App Store auto-update | Play auto-update | Sparkle appcast | winget upgrade / MSIX |

¹ The web "signature" is TLS + Cloudflare's own platform trust — you don't code-sign a Worker the way you sign a desktop binary.
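
The "same thing, different nouns" idea can also be written as data. The sketch below copies the nouns from the table into a dictionary and prints a release plan per target; the keys, `TARGETS`, and `release_plan` are hypothetical names, and `None` marks the one cell where the table shows no separate step.

```python
from typing import Optional

# The five rows of the table, in order.
STAGES = ["build", "sign", "package", "distribute", "update"]

# One list per target, aligned with STAGES. None = no separate step (web signing).
TARGETS: dict[str, list[Optional[str]]] = {
    "cloudflare-web": ["OpenNext / wrangler", None, "Worker bundle",
                       "Workers/Pages deploy", "Instant (edge)"],
    "ios": ["xcodebuild", "Cert + provisioning profile", ".ipa",
            "TestFlight → App Store", "App Store auto-update"],
    "android": ["Gradle → AAB", "Upload keystore", ".aab / .apk",
                "Play Console tracks", "Play auto-update"],
    "mac-direct": ["xcodebuild", "Developer ID + notarize", ".dmg / .pkg",
                   "R2 / website download", "Sparkle appcast"],
    "windows": ["MSBuild / dotnet", "Authenticode (EV/OV) cert", "MSIX / MSI / EXE",
                "winget / MS Store / site", "winget upgrade / MSIX"],
}


def release_plan(target: str) -> list[str]:
    """Pair each stage with the target's noun, skipping stages that don't apply."""
    nouns = TARGETS[target]
    return [f"{stage}: {noun}" for stage, noun in zip(STAGES, nouns) if noun is not None]


if __name__ == "__main__":
    for name in TARGETS:
        print(name, release_plan(name))
```

Every target runs through the same loop; only the strings differ, which is the whole point of the spine.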

Two things fall out of this table immediately:

Your Own Build Cluster: One Mac mini + Several MacBook Pros

Now the concrete part — and the reason a real fleet beats renting everything. Suppose you have what a lot of indie shops actually have: a few developer MacBook Pros and one Mac mini you can leave running. That's not a compromise; with the right roles it's a genuinely good delivery architecture. The trick is giving each machine the job it's shaped for.

Figure 4 — A real indie fleet. Laptops are interchangeable, low-trust authoring machines. The Mac mini is the one trusted, always-on machine that holds the signing identity and does the irreversible build/sign/deploy work. GitHub is the control plane that ties them together. Nothing sensitive lives on a laptop.

The role split is the whole design:

| Machine | Role | Holds secrets? | Trust level |
| --- | --- | --- | --- |
| MacBook Pros | Authoring: humans + agents write code, run ci:local, push branches, open PRs. Interchangeable and replaceable. | No (ideally) — they push code, not releases. | Low — a lost laptop shouldn't be able to ship a signed release. |
| Mac mini | The hub: self-hosted CI runner for macOS/iOS builds, notarization, deploy orchestration, and any always-on service. Builds the signed artifacts. | Yes — signing identity in its Keychain, deploy tokens scoped to it. | High — the one machine you harden, because it's the one that can ship. |
| GitHub (cloud) | Control plane: the repo, PRs, branch protection, and the Actions orchestration that dispatches jobs to the mini. | Scoped tokens only. | Medium — public dashboard, private compute. |

Why this beats both extremes:

This isn't hypothetical for this project. SimpleAppShipper already runs exactly the seed of it: a real .github/workflows/macos-build.yml that targets runs-on: [self-hosted, macos] — the mini — and is deliberately constrained (workflow_dispatch only, owner-gated, read-only token, no secrets) because the runner is a personal machine. The CLAUDE.md also describes a planned live SwiftUI preview service that proxies render requests to a Mac mini — the same machine wearing its "always-on service host" hat (the SVC box in Figure 4).

Job routing: how work finds the right machine

The mechanism that sends macOS work to the mini and leaves everything else in the cloud is runner labels. A job declares what kind of machine it needs; GitHub routes it there:

jobs:
  web-check:
    runs-on: ubuntu-latest          # cheap Linux in the cloud — lint, typecheck, web tests
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run check  # the same check from web Ch 25
 
  mac-release:
    runs-on: [self-hosted, macos]    # ← only the mini matches this label
    needs: web-check                  # don't even build the app if the checks are red
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/build-notarize-deploy.sh

Linux-shaped work (lint, typecheck, web tests, the Cloudflare build) stays on free cloud runners. Only the genuinely Mac-bound work — Apple builds, notarization — lands on the mini. That's the cheapest and fastest split: the cloud parallelism is free, and the expensive Apple step runs on hardware you've already paid for.
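
The routing rule itself is small enough to model. The sketch below is a simplified picture of label matching and `needs`, not GitHub's actual scheduler: `Runner`, `Job`, `eligible_runners`, and `can_start` are hypothetical names, and a runner is treated as eligible only when it carries every label the job requests.

```python
from dataclasses import dataclass, field


@dataclass
class Runner:
    name: str
    labels: set[str]


@dataclass
class Job:
    name: str
    runs_on: set[str]
    needs: list[str] = field(default_factory=list)


def eligible_runners(job: Job, runners: list[Runner]) -> list[str]:
    """A runner is eligible only if it carries every label the job asks for."""
    return [r.name for r in runners if job.runs_on <= r.labels]


def can_start(job: Job, results: dict[str, bool]) -> bool:
    """Mirror of `needs`: start only after every dependency has succeeded."""
    return all(results.get(dep) is True for dep in job.needs)


if __name__ == "__main__":
    runners = [
        Runner("cloud-linux", {"ubuntu-latest"}),
        Runner("mac-mini", {"self-hosted", "macos"}),
    ]
    web = Job("web-check", {"ubuntu-latest"})
    mac = Job("mac-release", {"self-hosted", "macos"}, needs=["web-check"])

    print(eligible_runners(mac, runners))          # only the mini matches
    print(can_start(mac, {"web-check": False}))    # red checks block the build
```

Because matching is a subset test, giving the mini extra labels never widens what lands on it; only jobs that name its labels can reach it.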

What It Looks Like End-to-End

Let's trace one real change all the way through, to make the diagram concrete. Say you're adding a new metadata field to both the website and the Mac app.

Figure 5 — One change, two targets, the same pipeline. The web half deploys from a free cloud runner in seconds; the Mac half is built, signed, and notarized on the mini and published to R2. Both halves passed the identical checks before merge. A human approved once, at the only gate that mattered.

Notice what the human did in that whole sequence: approved once. Everything else — writing, local checks, CI, the review's first pass, the build, the notarize, the deploy — ran without them. That's the modern pipeline's promise: human judgment at the gates, automation everywhere else.

What simpleappshipper.com Actually Does Today

Keeping with the honesty the web series insists on: this project runs the seeds of the architecture above, not the full thing — and the gap is the point of the series.

That gap — from "works, by hand" to "the reference architecture" — is exactly what Chapters 2–7 build, one box of Figure 1 at a time.

Mental Model — Three Sentences

  1. Every delivery pipeline, on every platform, is the same five gated acts (author, propose, verify, ship, operate) and the same six-stage release spine (source, build, sign, package, distribute, update); learn those once and every target is "the same thing, different nouns."
  2. Agents now do real work in authoring and review, but the rule that keeps it safe is "agents propose, humans dispose" — agents drive the fast reversible inner loop, humans own every irreversible gate (approve, merge, deploy, submit).
  3. A real indie fleet gives each machine the job it's shaped for: interchangeable, low-trust MacBook Pros author code; one always-on, hardened Mac mini holds the signing identity and does the build/sign/deploy; and the cloud is the control plane that ties them together — which is cheaper, faster, and safer than renting everything or shipping from a laptop.

Try It Yourself (15 Minutes)

  1. Draw your own Figure 1. For a project you ship, write the five acts and name the actual tool in each box. The empty boxes are your pipeline's gaps.
  2. Find your gates. In your repo's Settings → Branches, look at whether main requires a PR, a review, and green checks before merge. If it doesn't, that's the highest-leverage thing to turn on (git Ch 3).
  3. Name your trusted machine. Of all the computers that touch your releases, which one should hold your signing identity and deploy tokens? Everything else should not. If the answer is "my laptop," that's the architecture to evolve.
  4. Spot the 10x. If you build anything for Apple platforms in the cloud, check whether you're paying the macOS runner multiplier — and whether a Mac you already own could do it instead (web Ch 24).
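
For step 4, the arithmetic is worth doing once with your own numbers. The sketch below treats the multipliers as assumptions (the 10x macOS figure is the one this chapter refers to, and Linux is taken as the 1x baseline), and the monthly minutes are invented for illustration. Check your provider's current pricing before drawing conclusions.

```python
# Billing multipliers per runner OS. Both values are assumptions for this
# sketch; the macOS figure is the "10x" the text mentions.
MULTIPLIER = {"linux": 1, "macos": 10}


def billed_minutes(jobs: list[tuple[str, int]]) -> int:
    """Sum wall-clock minutes, weighted by each job's OS multiplier."""
    return sum(minutes * MULTIPLIER[os_name] for os_name, minutes in jobs)


# Hypothetical month: 300 minutes of Linux checks, 120 minutes of macOS builds.
all_cloud = billed_minutes([("linux", 300), ("macos", 120)])

# Same month with the macOS builds moved to a Mac you already own.
linux_only = billed_minutes([("linux", 300)])

if __name__ == "__main__":
    print(all_cloud, linux_only)  # 1500 300
```

In this made-up month the macOS builds are under a third of the wall-clock time but four fifths of the billed minutes, which is why they are the first thing to move. The comparison ignores the cost of running and maintaining your own machine.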

Where This Lands in the Series

This was the map. Each remaining chapter is one region of it in full detail:

| Ch | Region of the map | What it covers |
| --- | --- | --- |
| 1 | The whole board (you are here) | The five acts, agents-in-the-loop, the principles, the five targets, the fleet. |
| 2 | Act ③ — Verify | The agent-assisted pull request: author/reviewer agents, guardrails, keeping the bar up while speed goes up. |
| 3 | Act ③/④ — CI | One pipeline, five targets: matrix builds, runner selection, caching, what's shared vs platform-specific. |
| 4 | The fleet | Architecting the Mac mini + MacBook Pro cluster: hardening, Tailscale networking, job labels, keychain isolation, bursting to cloud. |
| 5 | Act ④ — Ship (web) | Cloudflare preview deploys per PR, staging→prod promotion, and instant rollback. |
| 6 | Act ④ — Ship (apps) | The iOS, Android, Mac, and Windows release pipelines side by side — signing, packaging, store/track distribution, auto-update. |
| 7 | Act ⑤ — Operate | Environments, secrets, observability, and the rollback safety net that turns "it broke" into a non-event. |

Next up, Ch 2 zooms into the busiest box on the board — the pull request — and shows exactly how to put agents on both sides of a review without ever letting the quality bar drop. That's where "agents propose, humans dispose" stops being a slogan and becomes a set of prompts, permissions, and protected branches.
