
How Git Actually Works — Commits, Branches, and the Three Trees

Git + GitHub · Chapter 1 of the Git & GitHub Pro Series · 26 min · May 28, 2026 · Beginner

You've been typing git add, git commit, git push for a while now. Maybe you've watched "Auto-sync" commits land on your repo, seen pull requests merge with a "#5" next to them, and nodded along without really knowing what any of it means. That's normal — almost nobody is taught git; they absorb three commands and hope.

This series fixes that, and it starts here, with the one idea that makes everything else click:

Git is not a backup tool or a file-syncer. It's a graph of snapshots. Every command you run is just adding nodes to that graph or moving labels around on it.

Get this chapter and the scary stuff later — rebase, squash, reflog, "detached HEAD" — turns into obvious consequences of a simple model. Let's build it.

A Commit Is a Snapshot, Not a Diff

The single most common misconception: people think git stores changes (diffs) — "this commit added 3 lines, that one deleted 5." It doesn't. Git stores complete snapshots.

Every time you commit, git takes a photograph of your entire project — every file, exactly as it is — and saves it. A commit is that snapshot plus a little metadata:

Part of a commit | What it holds
Snapshot | The full state of every tracked file at that moment
Parent | A pointer to the commit that came before it
Author + date | Who made it and when
Message | The text you wrote (-m "...")
SHA | A unique 40-character ID (you usually see the first 7)

(Git is smart about storage — identical files aren't duplicated between snapshots — but conceptually every commit is a full snapshot. Diffs are computed on demand by comparing two snapshots, not stored.)
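
To make the snapshot idea concrete, here is a minimal Python sketch. It is a toy model, not git's actual storage format, and the Commit class and diff function are hypothetical names invented for illustration. The point is only that each commit carries a full snapshot, and the diff is worked out afterwards by comparing two of them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    snapshot: dict               # path -> full file content at this commit
    parent: Optional["Commit"]   # the commit that came before (None for the root)
    message: str


def diff(old: Commit, new: Commit) -> dict:
    """Compute changes on demand by comparing two full snapshots."""
    changes = {}
    for path in sorted(set(old.snapshot) | set(new.snapshot)):
        before = old.snapshot.get(path)
        after = new.snapshot.get(path)
        if before == after:
            continue                      # unchanged file: nothing to report
        if before is None:
            changes[path] = "added"
        elif after is None:
            changes[path] = "deleted"
        else:
            changes[path] = "modified"
    return changes


first = Commit({"README.md": "hello\n", "app.js": "v1\n"}, None, "initial")
second = Commit(
    {"README.md": "hello\n", "app.js": "v2\n", "LICENSE": "MIT\n"},
    first,
    "update app",
)

print(diff(first, second))
```

Notice that README.md appears in both snapshots even though it never changed. Nothing in either commit says "modified app.js"; that fact only exists once you compare the two.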

The SHA: Why Commits Can't Be Quietly Changed

That 8367b9b9 you see next to a commit is its abbreviated SHA: a hash of the commit's entire contents (snapshot + parent + author + dates + message). This has a profound consequence:

Change anything about a commit — one character of the message, one byte of a file, the parent it points to — and you get a completely different SHA. It's a different commit.

This is why git is trustworthy: a commit's ID is a fingerprint of its content. If two people have a commit with the same SHA, it is byte-for-byte the same commit. And it's why "rewriting history" (Ch 2) always produces new commits with new SHAs rather than editing existing ones — the old SHA could never match the new content.
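
You can check the fingerprint claim yourself. The sketch below hashes objects the way git's default SHA-1 object format does: the object type, a space, the body length, a NUL byte, then the body. The author identity, timestamp, and parent ID are made-up placeholders, and a repository created with the SHA-256 object format would use a different hash function.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>', a NUL byte, then the body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parent: str, message: str) -> bytes:
    # Placeholder identity and timestamp, for illustration only.
    who = "Ada Example <ada@example.com> 1700000000 +0000"
    return (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {who}\n"
        f"committer {who}\n"
        f"\n{message}\n"
    ).encode()


tree = git_object_id("tree", b"")   # ID of an empty tree, used as a stand-in snapshot
parent = "0" * 40                   # hypothetical parent ID

a = git_object_id("commit", commit_body(tree, parent, "Fix typo"))
b = git_object_id("commit", commit_body(tree, parent, "Fix typo!"))

print(a[:7], b[:7], a != b)
```

The two messages differ by a single character, yet the resulting IDs are unrelated. Because the parent's ID is part of the hashed body, changing an old commit would also change the ID of every commit built on top of it.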

History Is a Graph (the DAG)

Each commit points to its parent — the commit before it. Follow those pointers backward and you walk the entire history. That structure has a name: a DAG (Directed Acyclic Graph) — "directed" because pointers go one way (child → parent), "acyclic" because you can never loop back to where you started.

Most of the time it looks like a simple line:


Figure 1 — Commits point backward at their parent. C's parent is B; B's parent is A. To see history, git just follows the arrows. (Arrows point at the parent — that's the direction git actually stores.)

But a straight line is only the common case, not the definition. The moment you branch or merge, that line forks and rejoins — which is the whole reason git needs a graph, not a list. Each of the three letters is a real guarantee, so it's worth unpacking them:

Letter | Means | Why it holds in git
Directed | Every edge has a direction | A commit points at its parent, never forward. A commit can't even know its own children: they didn't exist when it was created and hashed.
Acyclic | You can never loop back to the start | Following parents only ever moves toward the past, ending at the first commit (the root, which has no parent). A cycle is impossible: a commit's SHA includes its parent's SHA, so it physically cannot point at a descendant whose hash doesn't exist yet.
Graph | Nodes joined by edges, richer than a line | Commits are nodes; parent links are edges. It's more than a list because a commit can have several children, and a merge can have several parents.

Why a Graph, and Not Just a Line

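A straight line would mean every commit has at most one parent and at most one child. Git is richer in two specific ways, and those two are what bend the line into a graph: a commit can have several children (a fork), and a merge commit can have several parents (a rejoin).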


Figure 1b — A real DAG. Commit B forks into two children (D and E); the merge commit M rejoins them by having two parents. A plain line can express neither move — which is exactly why git's history is a graph. (Ch 2 covers how you deliberately create and reshape this.)

A subtle point worth nailing: it's a graph, not a tree. A tree forbids two branches from rejoining — but git allows it, every time you merge. That's the difference between git's DAG and the simpler structures you may have met:

Structure | Parents per node | Children per node | Can two paths rejoin?
Linked list | 1 | 1 | No
Tree | 1 | many | No
Git's DAG | 1 or more | many | Yes, at a merge
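
If it helps to see that shape as data, here is a small Python sketch of a fork-and-merge history like the one in Figure 1b. The root commit A is an assumption added so B has a parent, and the dictionary layout is just an illustration, not how git stores anything. Only parent links are recorded; children have to be derived.

```python
from collections import defaultdict

# child -> list of parents (the only direction git stores)
parents = {
    "A": [],          # root commit: no parent
    "B": ["A"],
    "D": ["B"],       # first child of B
    "E": ["B"],       # second child of B (the fork)
    "M": ["D", "E"],  # merge commit: two parents (the rejoin)
}

# Children are never stored; derive them by inverting the parent links.
children = defaultdict(list)
for child, parent_ids in parents.items():
    for parent_id in parent_ids:
        children[parent_id].append(child)

print(children["B"])   # the two commits that fork from B
print(parents["M"])    # the two commits that M joins
```

A linked list or a tree could not hold the entry for M, because both allow only one parent per node.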

Reachability: What git log Actually Walks

The DAG hands you one more idea for free, and it pays off all the way in Chapter 4: reachability. Starting from any label — a branch, a tag, or HEAD — the commits you can reach by walking parent pointers are that label's history. git log is literally that walk: start at HEAD, follow parents, print each node until you hit the root.
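
The walk itself is only a few lines. Below is a rough Python sketch of it, using a hypothetical fork-and-merge history and made-up label names. One caveat: real git log sorts its output by commit date by default, so treat the printed order here as an artifact of this sketch, not as git's ordering.

```python
def reachable(start, parents):
    """Return every commit reachable from `start` by following parent links."""
    seen, stack, order = set(), [start], []
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue                  # already visited via another path
        seen.add(commit)
        order.append(commit)
        stack.extend(parents[commit])
    return order


# Hypothetical history: B forks into D and E, and M merges them.
parents = {"A": [], "B": ["A"], "D": ["B"], "E": ["B"], "M": ["D", "E"]}
refs = {"main": "M", "old-tag": "D"}   # hypothetical labels

print(reachable(refs["main"], parents))
print(reachable(refs["old-tag"], parents))
```

Starting from main you reach all five commits. Starting from old-tag you reach only D, B, and A: commit E is simply not part of that label's history.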

That single idea explains several everyday things at once:

So hold the full picture now: history is not a flat list but a graph of snapshots — mostly linear, but forking at branches and rejoining at merges. Reshaping that graph on purpose is the whole subject of Ch 2; next, though, we meet the labels that point into the graph — because that's what a branch really is.

HEAD and Branches Are Just Labels

Here's the part that unlocks everything. A branch is not a copy of your code. A branch is just a sticky note with a commit's name on it. That's it. main is a label that points at one specific commit — the newest one on that line.

And HEAD is a special label meaning "you are here right now." Usually HEAD points at a branch (which points at a commit), so HEAD is really saying "the branch you're currently on."


Figure 2 — main is a label on commit C. HEAD points at main, meaning "you're on the main branch." When you commit, git creates a new snapshot and slides the main label forward to it.

This is why branching in git is instant and free: creating a branch just writes a new sticky note pointing at the current commit. No files are copied. On disk, a branch is little more than the 40-character SHA it points to. Once you internalise "branches are movable labels," git checkout, git reset, and "fast-forward" all become obvious: at heart, they're just moving labels around (checkout also updates your working files to match).
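
Here is the sticky-note model as a few lines of Python. It is a deliberately tiny sketch: refs, head, create_branch, and commit are illustrative names, and single letters stand in for real SHAs.

```python
refs = {"main": "C"}                        # branch name -> commit ID (the sticky note)
head = "main"                               # HEAD usually names a branch, not a commit
parents = {"A": None, "B": "A", "C": "B"}   # a simple linear history


def create_branch(name):
    refs[name] = refs[head]       # one new label on the current commit; nothing is copied


def commit(new_id):
    parents[new_id] = refs[head]  # the new commit points back at the current tip
    refs[head] = new_id           # slide the current branch's label forward


create_branch("experiment")
commit("D")                       # HEAD is still on main, so only main moves
print(refs)
```

Creating the branch added exactly one dictionary entry. The commit moved main forward and left experiment where it was, still pointing at C.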

The Three Trees: Where add and commit Move Things

Now, the part that confuses everyone about git add. Why two steps to save (add, then commit)? Because git has three places your files live, and the two commands move files between them:

"Tree" | What it is | Plain English
Working directory | The actual files on disk you edit | "My messy desk right now"
Staging area (the "index") | A holding pen for the next commit | "The stuff I've decided goes in the next snapshot"
Repository (.git) | The committed history (the DAG) | "The permanent photo album"

Figure 3 — git add moves changes from your working directory into the staging area; git commit writes everything staged into a new commit in the repository. The staging area exists so you can commit some of your changes and not others.

The staging area feels like bureaucracy until the first time you've changed five files but only want two of them in this commit. git add file1.js file2.js stages just those; git commit snapshots only what's staged. The other three stay in your working directory for a later commit. That selectivity is the entire reason add and commit are separate.
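
The same selectivity can be modelled in a few lines of Python. This is a toy with three files instead of five, and working, index, and history are illustrative stand-ins for the three trees, not git's real data structures.

```python
# All three files have been edited on disk...
working = {"file1.js": "new", "file2.js": "new", "file3.js": "new"}
# ...but the staging area still matches the last commit.
index = {"file1.js": "old", "file2.js": "old", "file3.js": "old"}
history = [dict(index)]           # each entry is a full snapshot


def add(*paths):
    for path in paths:
        index[path] = working[path]   # copy the working version into the staging area


def commit():
    history.append(dict(index))       # snapshot the whole index, not the working directory


add("file1.js", "file2.js")
commit()
print(history[-1])
```

The new snapshot holds the new file1.js and file2.js alongside the old file3.js. The edit to file3.js is still sitting in the working directory, ready for a later commit.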

Push, Fetch, Pull: Syncing Two Graphs

Everything above is local — it all happens in the .git folder on your machine, no internet required. GitHub enters only when you want to share that graph or back it up.

A remote is just another copy of the repository living somewhere else. origin is the conventional name for "the GitHub copy." Three commands move commits between your local graph and the remote graph:

Command | What it does
git push | Send your local commits up to the remote (GitHub). Moves the remote's branch label forward.
git fetch | Download commits from the remote into your local repo, but don't touch your working files yet.
git pull | fetch + merge in one step: download and integrate the remote's changes into your current branch.

So when you see those Auto-sync 2026-05-28 commits appear on your GitHub repo, here's the full story in model terms: a background process ran git add (working dir → staging), git commit (staging → a new snapshot node, main label slides forward), and git push (that new node is copied up to origin, and GitHub's main label slides forward to match). Three label-and-snapshot operations. Nothing magic.


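Figure 4: Two copies of the same graph. push sends your new commits up; fetch/pull bring the remote's new commits down. They're in sync when both main labels point at the same SHA. One caveat: when git status says "up to date with origin/main," it is comparing your branch with origin/main, your local record of where the remote's main was at the last fetch or push. It does not contact GitHub, so that record can be stale until you fetch again.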
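
As a rough sketch of what a push amounts to in this model, the Python below copies any commits the remote lacks and then moves two labels. The local and remote dictionaries and the copy_missing and push helpers are hypothetical; real git push also refuses non-fast-forward updates unless forced, which this toy ignores.

```python
def copy_missing(src, dst, tip):
    """Copy `tip` and any ancestors that `dst` lacks from `src` into `dst`."""
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in dst:
            dst[commit] = src[commit]
            stack.extend(src[commit])   # keep walking toward the root


local = {
    "commits": {"A": [], "B": ["A"], "C": ["B"]},
    "refs": {"main": "C", "origin/main": "B"},   # origin/main: last known remote position
}
remote = {
    "commits": {"A": [], "B": ["A"]},
    "refs": {"main": "B"},
}


def push(branch):
    tip = local["refs"][branch]
    copy_missing(local["commits"], remote["commits"], tip)
    remote["refs"][branch] = tip               # the remote's label slides forward
    local["refs"][f"origin/{branch}"] = tip    # update the local record of the remote


push("main")
print(remote["refs"]["main"] == local["refs"]["main"])
```

Only commit C travels, because the remote already had A and B. A fetch is the same idea in the other direction, except that it updates origin/main and leaves your own main alone.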

Mental Model — Three Sentences

  1. A commit is a full snapshot of your project plus a pointer to its parent, identified by a SHA that changes if anything in it changes — so history is an immutable graph of snapshots, not a list of diffs.
  2. Branches and HEAD are just movable labels pointing at commits; "switching branches" and "moving forward" are git sliding labels, not copying files.
  3. add moves changes into the staging area, commit snapshots the staging area into the repository, and push/pull sync your local graph with GitHub's copy.

Try It Yourself (10 Minutes)

Run these in any git repo and watch the model:

  1. git log --oneline --graph --all — see the DAG as ASCII art. Each line is a commit; the left rail shows the parent links.
  2. git cat-file -p HEAD — print the raw commit HEAD points at. You'll literally see parent <sha>, author, and the tree (snapshot) pointer. This is a commit with the lid off.
  3. git status — read it as "differences between the three trees": Changes not staged = working dir vs staging; Changes to be committed = staging vs last commit.
  4. Edit two files. git add only one. Run git status and see one staged, one not. Commit. Notice only the staged change made it into the snapshot.
  5. git branch experiment then git log --oneline --graph --all again — see a second label appear on the same commit. You created a branch and copied zero files.

Where This Lands in the Series

You now have the model: snapshots, SHAs, a DAG, movable labels, three trees, two synced graphs. That's the foundation everything else stands on.

Next chapter uses it to demystify the question every team argues about: when you bring one branch into another, should you merge, rebase, or squash? Each is just a different way of rearranging the graph you now understand — including the squash-merge that produces those tidy "PR #N" commits on your repo.
