Inside Git: How It Works and the Role of the .git Folder

"A visual guide to Git’s internal data model and object database"


When we use Git every day, it feels like a set of simple commands:

git add
git commit
git log

But under the hood, Git is doing something surprisingly elegant and mathematical. Instead of “saving files”, Git builds a content-addressed database of objects and connects them in a graph.

To really understand Git (and not just memorize commands), we need to look inside the hidden .git folder.

How Git Works Internally:-

At its core, Git is:

  • A key–value database

  • Where the key is a hash

  • And the value is some piece of project data

Every important thing in Git (a file's contents, a directory listing, a commit) is stored as an object and identified by a hash. By default this is a 40-character SHA-1 hash like:

e83c5163316f89bfbde7d9ab23ca2e25604af290

This hash is calculated from the content itself.
If the content changes even slightly, the hash changes completely.

This gives Git two superpowers:

  1. Integrity – corruption or tampering becomes detectable, because the content no longer matches its hash.

  2. Deduplication – identical content is stored only once.
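Both properties fall out of using the hash as the key. Here is a minimal sketch of a content-addressed store in Python. `ContentStore` is a hypothetical toy class, not Git's implementation, and it hashes the raw bytes directly, whereas real Git also hashes a small header.

```python
import hashlib


class ContentStore:
    """Toy content-addressed key-value store: key = SHA-1 of the value."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        self.objects[key] = data  # same content -> same key -> one entry
        return key

    def get(self, key: str) -> bytes:
        data = self.objects[key]
        # Integrity check: the stored bytes must still hash to their key
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError(f"object {key} is corrupted")
        return data


store = ContentStore()
a = store.put(b"hello world\n")
b = store.put(b"hello world\n")   # duplicate content
c = store.put(b"hello world!\n")  # one character added

print(a == b)              # True: identical content, identical key
print(a == c)              # False: a small change gives a different key
print(len(store.objects))  # 2: the duplicate was stored only once
```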

Git does not track “file changes” directly.
It stores snapshots of your project, efficiently built from these objects.

Understanding the .git Folder:-

When you run:

git init

Git creates a hidden directory called .git.
This is the entire repository database.

High-level structure

Important parts:

  • objects/
    Stores all Git objects (blobs, trees, commits, and annotated tags).

  • refs/
    Pointers to commits (branches, tags).

  • HEAD
    Tells Git which branch/commit you’re currently on.

  • index
    The staging area (what you added with git add).

  • config
    Repository-specific settings.

If you delete .git, your project becomes a normal folder again.
All history lives only inside this directory.
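To see how HEAD and refs/ fit together, here is a rough Python sketch that follows HEAD to a commit hash. `resolve_head` is a hypothetical helper written for this article. It is simplified on purpose: it ignores packed-refs and linked worktrees, and the demo runs against a hand-made stand-in directory rather than a real repository.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_head(git_dir: Path) -> Optional[str]:
    """Return the commit hash HEAD points to, or None if none is found."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic HEAD, e.g. "ref: refs/heads/main": follow it to the branch file
        ref_file = git_dir / head[len("ref: "):]
        if not ref_file.exists():
            return None  # no commit yet, or the ref lives only in packed-refs
        return ref_file.read_text().strip()
    return head  # detached HEAD: the file holds a commit hash directly


# Demo on a hand-made stand-in for .git (not a real repository)
git_dir = Path(tempfile.mkdtemp()) / ".git"
(git_dir / "refs" / "heads").mkdir(parents=True)
(git_dir / "HEAD").write_text("ref: refs/heads/main\n")
print(resolve_head(git_dir))  # None: the branch has no commit yet

example_hash = "e83c5163316f89bfbde7d9ab23ca2e25604af290"
(git_dir / "refs" / "heads" / "main").write_text(example_hash + "\n")
print(resolve_head(git_dir))  # the hash stored in refs/heads/main
```

A branch, in this loose-ref form, is just a small text file containing one commit hash.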

Git Objects: Blob, Tree, Commit:-

Git builds almost everything from just three object types. (A fourth type, the tag object, exists only for annotated tags.)

Relationship between them

1. Blob (file data)

  • Stores raw file content

  • No filename, no metadata

  • Just bytes

Example: contents of app.js

2. Tree (folder structure)

  • Represents a directory

  • Maps names (plus a file mode for each entry) to blobs or other trees

Example:
A tree might say:

app.js  -> blob A
utils/  -> tree B
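The tree above can be sketched in Python. Each entry is serialized as a mode, a space, the name, a NUL byte, and the 20-byte binary hash. The file contents and the `helpers.js` name are made up for illustration, and `hash_object` and `tree_body` are hypothetical helper names, not Git commands.

```python
import hashlib


def hash_object(obj_type: str, body: bytes) -> str:
    """Object ID = SHA-1 over '<type> <size>' + NUL + body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex_hash) entries into a tree object's body."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git orders directory names as if they ended with "/"
        return (name + "/" if mode == "40000" else name).encode()

    out = b""
    for mode, name, hex_hash in sorted(entries, key=sort_key):
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_hash)
    return out


# Made-up file contents, just to have something to hash
blob_a = hash_object("blob", b"console.log('hi');\n")
blob_h = hash_object("blob", b"export const id = (x) => x;\n")

tree_b = hash_object("tree", tree_body([("100644", "helpers.js", blob_h)]))
root_body = tree_body([
    ("100644", "app.js", blob_a),  # app.js -> blob A
    ("40000", "utils", tree_b),    # utils/ -> tree B
])
print(hash_object("tree", root_body))  # ID of the root tree
```

Note that the filename lives in the tree entry, not in the blob, which is why renaming a file does not create a new blob.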

3. Commit (snapshot + metadata)

A commit stores:

  • A pointer to one tree (the project snapshot)

  • Author, committer, timestamps, message

  • Parent commit(s)

Commits link to previous commits, forming a history chain (actually a graph).
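A commit object is plain text, which makes it easy to sketch. The Python below builds commit bodies in the layout Git uses (tree line, parent lines, author, committer, blank line, message) and hashes them. The author identity and timestamp are placeholders, and `commit_body` is a hypothetical helper name.

```python
import hashlib


def hash_object(obj_type: str, body: bytes) -> str:
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, parents, author, committer, message) -> bytes:
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # zero, one, or several parents
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # hash of a tree with no entries
who = "A U Thor <author@example.com> 1700000000 +0000"   # placeholder identity

first = hash_object("commit", commit_body(EMPTY_TREE, [], who, who, "initial commit"))
second = hash_object("commit", commit_body(EMPTY_TREE, [first], who, who, "second commit"))
merge = hash_object("commit", commit_body(EMPTY_TREE, [second, first], who, who, "merge"))

print(commit_body(EMPTY_TREE, [first], who, who, "second commit").decode())
```

Because a merge commit lists more than one parent, the history is a graph rather than a straight line.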

How Git Tracks Changes:-

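Conceptually, Git doesn’t store “diffs”.
Each commit points to a full snapshot (tree), but unchanged files reuse existing blobs. (Packfiles may later delta-compress objects to save disk space, but that is a storage optimization; the object model is still snapshots.)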

So if only one file changes:

  • New blob for that file

  • New trees for folders on the path

  • New commit pointing to the new root tree

  • Everything else is reused

This makes commits cheap and fast.
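The reuse is easy to check with a small simulation. In this Python sketch an in-memory dict stands in for .git/objects, and two snapshots of a made-up project are written into it. `put` and `write_tree` are hypothetical helpers, and the entry sorting is simplified compared to Git's exact ordering rules.

```python
import hashlib

store = {}  # hash -> (type, body); stands in for .git/objects


def put(obj_type: str, body: bytes) -> str:
    raw = f"{obj_type} {len(body)}".encode() + b"\x00" + body
    key = hashlib.sha1(raw).hexdigest()
    store[key] = (obj_type, body)  # writing an existing object changes nothing
    return key


def write_tree(directory: dict) -> str:
    """directory maps name -> bytes (a file) or dict (a subdirectory)."""
    body = b""
    for name in sorted(directory):  # simplified ordering
        value = directory[name]
        if isinstance(value, dict):
            mode, child = "40000", write_tree(value)
        else:
            mode, child = "100644", put("blob", value)
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(child)
    return put("tree", body)


v1 = {
    "README.md": b"# demo\n",
    "app.js": b"main();\n",
    "utils": {"math.js": b"add();\n", "text.js": b"trim();\n"},
}
root1 = write_tree(v1)
before = set(store)

# Second snapshot: only utils/math.js changes
v2 = {**v1, "utils": {**v1["utils"], "math.js": b"add(a, b);\n"}}
root2 = write_tree(v2)
new = set(store) - before

print(len(before))                         # 6: four blobs + two trees
print(sorted(store[k][0] for k in new))    # ['blob', 'tree', 'tree']
```

The second snapshot adds one blob (the changed file) and two trees (utils/ and the root). The other three blobs are shared between both snapshots.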

What Happens During git add :-

Running git add file.txt does the following:

  1. Read the file content

  2. Create a blob object

  3. Store it in .git/objects/

  4. Record its hash in the index (staging area)

It does not create a commit yet.

Think of the index as:

“Here is exactly what the next snapshot should contain.”
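The four steps can be sketched in Python. This is an approximation under stated assumptions: `add` is a hypothetical function, and the index is modelled as a plain dict of path to hash, while Git's real index is a binary file that also records file modes and stat data. The loose-object layout (zlib-compressed, under objects/ with the first two hex characters as the directory name) matches what Git writes.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def add(work_tree: Path, git_dir: Path, index: dict, path: str) -> str:
    # 1. Read the file content
    content = (work_tree / path).read_bytes()
    # 2. Create a blob object: header + raw bytes, identified by its SHA-1
    raw = f"blob {len(content)}".encode() + b"\x00" + content
    sha = hashlib.sha1(raw).hexdigest()
    # 3. Store it as a loose object: objects/<first 2 hex chars>/<other 38>
    obj_path = git_dir / "objects" / sha[:2] / sha[2:]
    if not obj_path.exists():
        obj_path.parent.mkdir(parents=True, exist_ok=True)
        obj_path.write_bytes(zlib.compress(raw))
    # 4. Record the hash in the (simplified) index
    index[path] = sha
    return sha


# Demo in a throwaway directory
work_tree = Path(tempfile.mkdtemp())
git_dir = work_tree / ".git"
index = {}
(work_tree / "file.txt").write_text("hello world\n")

sha = add(work_tree, git_dir, index, "file.txt")
print(sha)    # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(index)  # {'file.txt': '3b18e512dba79e4c8300dd08aeb37f8e728b8dad'}
```

No commit exists at this point. The blob is already in the object database, and the index only remembers which blob belongs to which path.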

What Happens During git commit :-

Internal flow:

Steps:

  1. Git reads the index

  2. Builds tree objects to represent folders

  3. Creates a commit object pointing to the top tree

  4. Updates the current branch reference to this new commit

No files are copied around.
Only small objects and hashes are written.
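Here is a rough Python sketch of those four steps. It rests on several assumptions: dicts stand in for .git/objects and the refs, the index is a flat mapping of path to blob hash, the author line is a placeholder, and entry sorting is simplified. `build_tree` and `commit` are hypothetical names.

```python
import hashlib

objects = {}                        # stand-in for .git/objects
refs = {"HEAD": "refs/heads/main"}  # HEAD names the current branch


def put(obj_type: str, body: bytes) -> str:
    raw = f"{obj_type} {len(body)}".encode() + b"\x00" + body
    sha = hashlib.sha1(raw).hexdigest()
    objects[sha] = (obj_type, body)
    return sha


def build_tree(index: dict, prefix: str = "") -> str:
    """Turn the flat index paths under `prefix` into nested tree objects."""
    entries = {}  # name -> (mode, hash)
    for path, blob in index.items():
        if not path.startswith(prefix):
            continue
        name, _, rest = path[len(prefix):].partition("/")
        if rest:  # the path continues, so `name` is a subdirectory
            if name not in entries:
                entries[name] = ("40000", build_tree(index, prefix + name + "/"))
        else:
            entries[name] = ("100644", blob)
    body = b"".join(
        f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(sha)
        for name, (mode, sha) in sorted(entries.items())  # simplified ordering
    )
    return put("tree", body)


def commit(index: dict, message: str,
           who: str = "A U Thor <author@example.com> 1700000000 +0000") -> str:
    tree = build_tree(index)        # steps 1-2: read the index, build trees
    branch = refs["HEAD"]
    parent = refs.get(branch)       # None for the very first commit
    lines = [f"tree {tree}"]
    if parent:
        lines.append(f"parent {parent}")
    lines += [f"author {who}", f"committer {who}"]
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    sha = put("commit", body)       # step 3: create the commit object
    refs[branch] = sha              # step 4: move the branch to the new commit
    return sha


index = {
    "app.js": put("blob", b"main();\n"),
    "utils/math.js": put("blob", b"add();\n"),
}
first = commit(index, "initial")
index["app.js"] = put("blob", b"main(1);\n")
second = commit(index, "update app.js")

print(refs["refs/heads/main"] == second)  # True: the branch moved forward
print(objects[second][1].decode())        # commit text, including the parent line
```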

How Git Uses Hashes to Ensure Integrity:-

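Each object's ID is computed from a short header followed by its content:

hash = SHA1(type + " " + size + "\0" + content)

The object itself is then stored, zlib-compressed, under that hash.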

Because the hash depends on the content:

  • Changing content ⇒ new hash

  • Changing a tree changes its hash

  • That changes the commit hash

So every commit hash indirectly depends on the entire project state and, through its parent hashes, on all the history before it.

If any historical data is modified, the hashes no longer match, and Git can detect the mismatch when it verifies objects (for example with git fsck).

This is why Git history is effectively tamper-evident.
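The check itself is simple enough to sketch in Python: recompute the hash of the stored bytes and compare it with the object's ID. `encode` and `verify` are hypothetical helper names. The header layout (type, space, size, NUL byte) and the zlib compression follow Git's loose-object format.

```python
import hashlib
import zlib


def encode(obj_type: str, content: bytes):
    """Return (object_id, compressed_bytes) the way a loose object is stored."""
    raw = f"{obj_type} {len(content)}".encode() + b"\x00" + content
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)


def verify(object_id: str, stored: bytes) -> bool:
    """Recompute the hash of a stored object and compare it with its ID."""
    return hashlib.sha1(zlib.decompress(stored)).hexdigest() == object_id


oid, stored = encode("blob", b"hello world\n")
print(oid)                  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(verify(oid, stored))  # True

# Simulate tampering: change the content but keep the old ID
tampered = zlib.compress(zlib.decompress(stored).replace(b"hello", b"HELLO"))
print(verify(oid, tampered))  # False: the content no longer matches its ID
```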

Building the Right Mental Model:-

Instead of thinking:

“Git saves file versions”

Think:

“Git stores content objects and connects them with hashes.”

  • Blobs = file contents

  • Trees = directory structure

  • Commits = snapshots with metadata, each pointing to a tree and to its parent commit(s)

  • Branches = movable labels pointing to commits

git add → put content into the object database and stage it
git commit → create a new snapshot that points to that content

Everything lives inside .git, and everything is connected by hashes.