Core Concepts: Changelog, Changeset, and Tracking Tables
Before writing a single migration, you need the mental model. Understanding how Liquibase thinks about changelogs, changesets, and identity prevents the most common mistakes — ones that are painful to fix after deployment.
The Changelog
The changelog is the file Liquibase reads. It contains an ordered list of changesets. Think of it as your database’s Git history — a sequential record of every change ever made.
# db/changelog/db.changelog-master.yaml
databaseChangeLog:
  - changeSet:
      id: "20240101-001"
      author: abhay
      changes:
        - createTable:
            tableName: users
            columns:
              - column:
                  name: id
                  type: BIGINT
                  autoIncrement: true
                  constraints:
                    primaryKey: true
                    nullable: false
  - changeSet:
      id: "20240101-002"
      author: abhay
      changes:
        - addColumn:
            tableName: users
            columns:
              - column:
                  name: email
                  type: VARCHAR(255)
                  constraints:
                    nullable: false
                    unique: true
The changelog format (YAML, XML, JSON, or SQL) doesn’t matter for understanding the concept: it’s always a list of changesets in order.
The Changeset
The changeset is the atomic unit of change. One changeset = one logical database operation. It has three required pieces of identity:
The Three-Part Key
changeSet:
  id: "20240101-001"   # Unique within this file; the format is YOUR convention
  author: abhay        # Who wrote this: YOUR name
  # filename is added automatically from the file path
ID + AUTHOR + FILENAME is the composite key that Liquibase stores in DATABASECHANGELOG. This is how Liquibase knows whether a changeset has already been applied.
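The most important rule: These three values must never change after the changeset has been applied to any environment. Changing the ID, the author, or the file path does not trip the checksum. Instead, Liquibase no longer finds a matching row in DATABASECHANGELOG, treats the changeset as one it has never seen, and tries to run it again, which typically fails because the objects already exist.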
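If a file genuinely has to move, the logicalFilePath attribute is one way to keep the stored identity stable. A minimal sketch, assuming the changeset was originally applied from db/changelog/users.yaml and the file now lives at a hypothetical db/changelog/accounts/users.yaml:

```yaml
# db/changelog/accounts/users.yaml  (hypothetical new location)
databaseChangeLog:
  - changeSet:
      id: "20240101-001"        # unchanged
      author: abhay             # unchanged
      # Pins the FILENAME part of the key to the path already recorded
      # in DATABASECHANGELOG, even though the file itself has moved
      logicalFilePath: db/changelog/users.yaml
      changes:
        # The body must stay exactly as deployed; the checksum is still compared
        - createTable:
            tableName: users
            columns:
              - column:
                  name: id
                  type: BIGINT
                  autoIncrement: true
                  constraints:
                    primaryKey: true
                    nullable: false
```

logicalFilePath can also be set once on databaseChangeLog so that it covers every changeset in the file.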
Changeset Attributes
changeSet:
  id: "20240101-003"
  author: abhay
  # Optional but important:
  comment: "Add user status for account lifecycle management"
  # true = run on every Liquibase execution, even if already applied (dangerous)
  runAlways: false
  # true = re-run when the changeset content changes (for stored procs/views)
  runOnChange: false
  # Stop the update if this changeset errors (default: true)
  failOnError: true
  # Wrap the changeset in a transaction (default: true); set to false for
  # statements that cannot run inside a transaction
  runInTransaction: true
  # Only run on MySQL (skip on H2 in tests)
  dbms: mysql
  # Only run in specific environments
  context: "prod or staging"
  # Feature/ticket grouping
  labels: "feature-users, sprint-1"
  # Override the filename part of the three-part key
  logicalFilePath: db/changelog/users.yaml
  changes:
    - createTable:
        tableName: user_sessions
        # columns omitted here
The Changeset Identity and Checksums
When Liquibase applies a changeset, it stores:
- The three-part key (ID, AUTHOR, FILENAME), used for lookup
- The MD5SUM, a checksum of the changeset’s content
On every subsequent run, Liquibase re-calculates the checksum and compares it to the stored value. If they differ, Liquibase throws:
Validation Failed:
1 changesets check sum
db/changelog/users.yaml::20240101-001::abhay was: 8:abc123... but is now 8:def456...
This is the checksum protection — it ensures changesets are immutable once deployed.
The Golden Rule
Never modify a changeset that has been applied to any environment.
If you need to correct a mistake:
- Create a new changeset that undoes or modifies what the first changeset did
- Do not edit the original
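Exception: on a development-only database you can run liquibase clear-checksums, which blanks the stored checksums so that they are recomputed on the next update. liquibase changelog-sync goes further: it marks every pending changeset as applied without executing it. Never use either as a shortcut on shared or production environments.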
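To make the rule concrete, here is a sketch of a fix-forward. The ids, the nickname column, and the scenario itself are hypothetical: a column was added by mistake and has already been deployed, so the repair is a second changeset rather than an edit to the first.

```yaml
databaseChangeLog:
  # Already deployed: left exactly as it was written
  - changeSet:
      id: "20240101-004"          # hypothetical
      author: abhay
      changes:
        - addColumn:
            tableName: users
            columns:
              - column:
                  name: nickname   # hypothetical column added by mistake
                  type: VARCHAR(50)
  # The fix: a new changeset that undoes the mistake
  - changeSet:
      id: "20240102-001"          # hypothetical
      author: abhay
      comment: "Remove users.nickname added by 20240101-004"
      changes:
        - dropColumn:
            tableName: users
            columnName: nickname
```

Both changesets stay in the changelog permanently. A fresh database replays the mistake and then the fix, and ends up in the same state as every other environment.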
One Change Per Changeset
Each changeset should contain exactly one logical change. This is one of the most important best practices:
# ❌ Bad: two unrelated changes in one changeset
- changeSet:
    id: "20240101-001"
    author: abhay
    changes:
      - createTable:
          tableName: users
      - createTable:
          tableName: products

# ✓ Good: one change per changeset
- changeSet:
    id: "20240101-001"
    author: abhay
    changes:
      - createTable:
          tableName: users
- changeSet:
    id: "20240101-002"
    author: abhay
    changes:
      - createTable:
          tableName: products
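Why: Liquibase records success per changeset. With separate changesets, a failure in the second createTable leaves the first recorded as applied, and the next run retries only the second. If they’re combined, the retry re-runs the whole changeset and fails immediately because users already exists; MySQL commits DDL implicitly, so the first table is not rolled back.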
The Master Changelog Pattern
For real projects, don’t put all changesets in one file. Use a master changelog that references other files:
# db/changelog/db.changelog-master.yaml
databaseChangeLog:
  - include:
      file: db/changelog/v1.0/users.yaml
      relativeToChangelogFile: false
  - include:
      file: db/changelog/v1.0/products.yaml
      relativeToChangelogFile: false
  - include:
      file: db/changelog/v1.1/orders.yaml
      relativeToChangelogFile: false
Or include all files in a directory alphabetically:
databaseChangeLog:
  - includeAll:
      path: db/changelog/v1.0/
      relativeToChangelogFile: false
In this pattern the master changelog is an index: by convention it contains only include or includeAll entries and no changesets of its own.
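Each included file is itself a complete changelog with its own databaseChangeLog root. A sketch of what db/changelog/v1.0/users.yaml might contain, reusing the email changeset from earlier (the file contents are an assumption for illustration):

```yaml
# db/changelog/v1.0/users.yaml
databaseChangeLog:
  - changeSet:
      id: "20240101-002"
      author: abhay
      changes:
        - addColumn:
            tableName: users
            columns:
              - column:
                  name: email
                  type: VARCHAR(255)
                  constraints:
                    nullable: false
                    unique: true
```

The FILENAME recorded for this changeset is the included file's path (db/changelog/v1.0/users.yaml), not the master's. This is why reorganizing directories later changes changeset identity.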
Change Types vs Raw SQL
Liquibase gives you two approaches:
Platform-Agnostic Change Types (Recommended)
changes:
  - createTable:
      tableName: users
      columns:
        - column:
            name: id
            type: BIGINT
            autoIncrement: true
            constraints:
              primaryKey: true
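Liquibase generates the correct DDL for the target database, which is MySQL in this series. The changelog is largely portable: in most cases you can swap MySQL for PostgreSQL without rewriting it, although database-specific types and options still need checking. Many change types also have automatic rollback; createTable, for example, rolls back with DROP TABLE.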
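Not every change type can be reversed automatically. dropTable, for example, leaves Liquibase nothing to infer the old definition from. The sketch below supplies the rollback by hand while still using change types; the id, table name, and column are hypothetical.

```yaml
databaseChangeLog:
  - changeSet:
      id: "20240201-001"                       # hypothetical
      author: abhay
      changes:
        - dropTable:
            tableName: user_sessions_archive   # hypothetical table
      # rollback sits beside changes and may itself use change types
      rollback:
        - createTable:
            tableName: user_sessions_archive
            columns:
              - column:
                  name: id
                  type: BIGINT
                  constraints:
                    primaryKey: true
                    nullable: false
```

A rollback like this restores the table's structure only. Any rows that were in the dropped table are gone.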
Raw SQL
changes:
  - sql:
      sql: |
        CREATE TABLE users (
          id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
          email VARCHAR(255) NOT NULL UNIQUE
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# rollback is a sibling of changes, not part of the sql change
rollback:
  - sql:
      sql: DROP TABLE IF EXISTS users;
Full control. Database-specific. No automatic rollback — you must write it yourself.
When to use each: Use change types for standard DDL (tables, columns, indexes, foreign keys). Use raw SQL for complex business logic, bulk data migrations, or MySQL-specific syntax that change types don’t support.
How Execution Works
liquibase update
│
├─ Connect to MySQL
├─ Acquire DATABASECHANGELOGLOCK (set LOCKED=1)
├─ Read DATABASECHANGELOG — which changesets have been applied?
├─ Read changelog file — what changesets exist?
├─ For each changeset not in DATABASECHANGELOG:
│ ├─ Execute the change(s)
│ ├─ If success: INSERT INTO DATABASECHANGELOG (this changeset)
│ └─ If failure: record error, stop (by default)
└─ Release DATABASECHANGELOGLOCK (set LOCKED=0)
Key insight: Liquibase records each changeset to DATABASECHANGELOG after it executes successfully. A failed changeset is not recorded — it will be attempted again on the next run.
Changeset Execution Modes
| Setting | Default? | Effect |
|---|---|---|
| runAlways: false | Yes | Only runs once (normal behavior) |
| runAlways: true | No | Runs on every liquibase update, ignores checksum |
| runOnChange: true | No | Runs again when content changes |
| failOnError: true | Yes | Stops all processing if this changeset fails |
| failOnError: false | No | Logs error and continues to next changeset |
runOnChange: true is the right setting for stored procedures and views — they’re replaced wholesale every time their content changes.
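A minimal sketch of that pattern for a view. The id, view name, and query are hypothetical; createView with replaceIfExists is the change type assumed here.

```yaml
databaseChangeLog:
  - changeSet:
      id: "view-user-emails"       # hypothetical id
      author: abhay
      runOnChange: true            # re-run whenever the definition below is edited
      changes:
        - createView:
            viewName: v_user_emails        # hypothetical view
            replaceIfExists: true          # a re-run replaces the existing view
            selectQuery: SELECT id, email FROM users
```

Editing this changeset later is expected. With runOnChange set, a changed checksum triggers a re-run instead of a validation error.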
What You’ve Learned
- The changelog is an ordered list of changesets — Liquibase’s equivalent of Git history
- Every changeset has a three-part identity, ID + AUTHOR + FILENAME, which must never change after deployment
- Liquibase stores an MD5SUM checksum for every applied changeset; editing an applied changeset breaks validation
- One logical change per changeset enables precise rollback and clean retry on failure
- Platform-agnostic change types generate MySQL DDL, many with automatic rollback; raw SQL gives full control
- The master changelog is an index of include/includeAll entries and contains no changesets itself
- Changeset execution is recorded in DATABASECHANGELOG only on success, so failed changesets will be retried
Next: Article 3 — Changelog Formats: XML, YAML, JSON, and SQL — which format to choose and why.