Part 2 of 18

Core Concepts: Changelog, Changeset, and Tracking Tables

Before writing a single migration, you need the mental model. Understanding how Liquibase thinks about changelogs, changesets, and identity prevents the most common mistakes — ones that are painful to fix after deployment.

The Changelog

The changelog is the file Liquibase reads. It contains an ordered list of changesets. Think of it as your database’s Git history — a sequential record of every change ever made.

# db/changelog/db.changelog-master.yaml
databaseChangeLog:
  - changeSet:
      id: "20240101-001"
      author: abhay
      changes:
        - createTable:
            tableName: users
            columns:
              - column:
                  name: id
                  type: BIGINT
                  autoIncrement: true
                  constraints:
                    primaryKey: true
                    nullable: false

  - changeSet:
      id: "20240101-002"
      author: abhay
      changes:
        - addColumn:
            tableName: users
            columns:
              - column:
                  name: email
                  type: VARCHAR(255)
                  constraints:
                    nullable: false
                    unique: true

The changelog format (YAML, XML, SQL) doesn’t matter for understanding the concept — it’s always a list of changesets in order.

The Changeset

The changeset is the atomic unit of change. One changeset = one logical database operation. It has three required pieces of identity:

The Three-Part Key

changeSet:
  id: "20240101-001"      # Unique within this file — YOUR convention
  author: abhay           # Who wrote this — YOUR name
  # filename is added automatically from the file path

ID + AUTHOR + FILENAME is the composite key that Liquibase stores in DATABASECHANGELOG. This is how Liquibase knows whether a changeset has already been applied.

The most important rule: These three values must never change after the changeset has been applied to any environment. If you change the ID or author, or move the file, Liquibase no longer finds a matching row in DATABASECHANGELOG, treats the changeset as new, and tries to run it again.
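If a file really has to move after its changesets were applied, the logicalFilePath attribute (listed under Changeset Attributes) can pin the FILENAME part of the key to the old path. A minimal sketch, assuming the changeset was first applied from db/changelog/users.yaml and the file is being moved to a hypothetical db/changelog/v1.0/users.yaml:

```yaml
# New physical location (hypothetical): db/changelog/v1.0/users.yaml
databaseChangeLog:
  - changeSet:
      id: "20240101-001"
      author: abhay
      # Keep the path that is already stored in DATABASECHANGELOG.FILENAME
      logicalFilePath: db/changelog/users.yaml
      changes:
        - createTable:
            tableName: users
            columns:
              - column:
                  name: id
                  type: BIGINT
                  autoIncrement: true
                  constraints:
                    primaryKey: true
                    nullable: false
```

Check the FILENAME value actually stored in your database before relying on this, since the override has to match it exactly.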

Changeset Attributes

changeSet:
  id: "20240101-003"
  author: abhay
  
  # Optional but important:
  comment: "Add user status for account lifecycle management"
  
  # Run on every Liquibase execution regardless of checksum (dangerous)
  runAlways: false
  
  # Re-run when the changeset content changes (for stored procs/views)
  runOnChange: false
  
  # Fail if this changeset errors (default: true)
  failOnError: true
  
  # Wrap the changeset in a transaction (default: true); set to false for statements that cannot run inside one
  runInTransaction: true
  
  # Only run on MySQL (skip on H2 in tests)
  dbms: mysql
  
  # Only run in specific environments
  context: "prod or staging"
  
  # Feature/ticket grouping
  labels: "feature-users, sprint-1"
  
  # Override the filename part of the three-part key
  logicalFilePath: db/changelog/users.yaml
  
  changes:
    - createTable:
        tableName: user_sessions

The Changeset Identity and Checksums

When Liquibase applies a changeset, it stores:

  1. The three-part key (ID, AUTHOR, FILENAME) — for lookup
  2. The MD5SUM — a checksum of the changeset’s content

On every subsequent run, Liquibase re-calculates the checksum and compares it to the stored value. If they differ, Liquibase throws:

Validation Failed:
  1 changesets check sum
  db/changelog/users.yaml::20240101-001::abhay was: 8:abc123... but is now 8:def456...

This is the checksum protection — it ensures changesets are immutable once deployed.

The Golden Rule

Never modify a changeset that has been applied to any environment.

If you need to correct a mistake:

  • Create a new changeset that undoes or modifies what the first changeset did
  • Do not edit the original

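Exception: in a development-only database you can run liquibase clear-checksums, which clears the stored checksums so they are recomputed on the next run. liquibase changelog-sync goes further and marks any pending changesets as already applied. Never do either on shared or production environments.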
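As a sketch of the corrective-changeset approach, assume a hypothetical changeset 20240101-004 added a legacy_flag column by mistake and has already been deployed. The original stays exactly as it was, and a new changeset with a new id reverses it:

```yaml
databaseChangeLog:
  # Already applied: leave untouched, even though the column was a mistake
  - changeSet:
      id: "20240101-004"
      author: abhay
      changes:
        - addColumn:
            tableName: users
            columns:
              - column:
                  name: legacy_flag   # hypothetical column
                  type: BOOLEAN

  # Forward fix: a new changeset that undoes the earlier one
  - changeSet:
      id: "20240102-001"
      author: abhay
      comment: "Remove legacy_flag added by mistake in 20240101-004"
      changes:
        - dropColumn:
            tableName: users
            columnName: legacy_flag
```

Both changesets remain in the changelog, so every environment replays the same history and ends in the same state.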

One Change Per Changeset

Each changeset should contain exactly one logical change. This is the most important best practice:

# ❌ Bad — two unrelated changes in one changeset
changeSet:
  id: "20240101-001"
  author: abhay
  changes:
    - createTable:
        tableName: users
    - createTable:
        tableName: products

# ✓ Good — one change per changeset
changeSet:
  id: "20240101-001"
  author: abhay
  changes:
    - createTable:
        tableName: users

changeSet:
  id: "20240101-002"
  author: abhay
  changes:
    - createTable:
        tableName: products

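Why: If the second createTable fails, you can fix it and retry from that changeset. If the two are combined, the retry can get stuck: on databases that auto-commit DDL (MySQL among them), the first table already exists, so re-running the whole changeset fails on the first createTable.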

The Master Changelog Pattern

For real projects, don’t put all changesets in one file. Use a master changelog that references other files:

# db/changelog/db.changelog-master.yaml
databaseChangeLog:
  - include:
      file: db/changelog/v1.0/users.yaml
      relativeToChangelogFile: false

  - include:
      file: db/changelog/v1.0/products.yaml
      relativeToChangelogFile: false

  - include:
      file: db/changelog/v1.1/orders.yaml
      relativeToChangelogFile: false

Or include all files in a directory alphabetically:

databaseChangeLog:
  - includeAll:
      path: db/changelog/v1.0/
      relativeToChangelogFile: false

The master changelog is an index — it contains only include or includeAll tags, never actual changesets.
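The examples above resolve paths from the classpath root. If you would rather keep paths short, the same index can be written relative to the master file. A minimal sketch, assuming the included files sit in a v1.0 directory next to the master changelog:

```yaml
# db/changelog/db.changelog-master.yaml
databaseChangeLog:
  # Paths are resolved against this file's directory (db/changelog/)
  - include:
      file: v1.0/users.yaml
      relativeToChangelogFile: true

  - include:
      file: v1.0/products.yaml
      relativeToChangelogFile: true
```

Pick one style before the first deployment. FILENAME is part of every changeset's identity, so switching later is worth testing against a copy of the database first.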

Change Types vs Raw SQL

Liquibase gives you two approaches: platform-agnostic change types and raw SQL.

Change Types

changes:
  - createTable:
      tableName: users
      columns:
        - column:
            name: id
            type: BIGINT
            autoIncrement: true
            constraints:
              primaryKey: true

Liquibase generates the DDL for the target database (MySQL here), so the same changelog keeps working if you swap MySQL for PostgreSQL. Many change types also have automatic rollback: createTable, for example, automatically generates DROP TABLE for rollback.

Raw SQL

changes:
  - sql:
      sql: |
        CREATE TABLE users (
          id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
          email VARCHAR(255) NOT NULL UNIQUE
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
rollback:
  - sql:
      sql: DROP TABLE IF EXISTS users;

Full control. Database-specific. No automatic rollback — you must write it yourself.

When to use each: Use change types for standard DDL (tables, columns, indexes, foreign keys). Use raw SQL for complex business logic, bulk data migrations, or MySQL-specific syntax that change types don’t support.
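For the MySQL-specific case, one possible shape is a raw SQL changeset guarded by dbms, so it is skipped on other databases. This sketch assumes the users.email column from the earlier examples. The changeset id and index name are hypothetical:

```yaml
databaseChangeLog:
  - changeSet:
      id: "20240101-005"
      author: abhay
      dbms: mysql            # skipped on other databases such as H2
      changes:
        - sql:
            sql: CREATE FULLTEXT INDEX idx_users_email_ft ON users (email);
      # Raw SQL has no automatic rollback, so it is written by hand
      rollback:
        - sql:
            sql: DROP INDEX idx_users_email_ft ON users;
```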

How Execution Works

liquibase update
   │
   ├─ Connect to MySQL
   ├─ Acquire DATABASECHANGELOGLOCK (set LOCKED=1)
   ├─ Read DATABASECHANGELOG — which changesets have been applied?
   ├─ Read changelog file — what changesets exist?
   ├─ For each changeset not in DATABASECHANGELOG:
   │     ├─ Execute the change(s)
   │     ├─ If success: INSERT INTO DATABASECHANGELOG (this changeset)
   │     └─ If failure: record error, stop (by default)
   └─ Release DATABASECHANGELOGLOCK (set LOCKED=0)

Key insight: Liquibase records each changeset to DATABASECHANGELOG after it executes successfully. A failed changeset is not recorded — it will be attempted again on the next run.
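Because a failed changeset is retried, it helps if the retry is safe when part of the work already happened. One option is a precondition that marks the changeset as run when its table already exists. A hedged sketch, reusing the products table from the earlier example with a hypothetical minimal column list:

```yaml
databaseChangeLog:
  - changeSet:
      id: "20240101-002"
      author: abhay
      preConditions:
        # If the check fails (the table exists), record the changeset as run and move on
        - onFail: MARK_RAN
        - not:
            - tableExists:
                tableName: products
      changes:
        - createTable:
            tableName: products
            columns:
              - column:
                  name: id
                  type: BIGINT
                  constraints:
                    primaryKey: true
                    nullable: false
```

Use this sparingly: MARK_RAN records the changeset without checking that the existing table matches the definition.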

Changeset Execution Modes

Attribute            Default  Effect
runAlways: false     Off      Only runs once (normal behavior)
runAlways: true      -        Runs on every liquibase update, ignores checksum
runOnChange: true    -        Runs again when content changes
failOnError: true    On       Stops all processing if this changeset fails
failOnError: false   -        Logs error and continues to next changeset

runOnChange: true is the right setting for stored procedures and views — they’re replaced wholesale every time their content changes.
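A view changeset along these lines might look like the following sketch. The view name, the changeset id, and the users.status column are assumptions made for illustration:

```yaml
databaseChangeLog:
  - changeSet:
      id: "view-active-users"
      author: abhay
      runOnChange: true       # re-run whenever the definition below is edited
      changes:
        - createView:
            viewName: active_users
            replaceIfExists: true
            selectQuery: "SELECT id, email FROM users WHERE status = 'ACTIVE'"
```

This is the one place where editing an applied changeset is expected: the new checksum triggers a re-run instead of a validation failure.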

What You’ve Learned

  • The changelog is an ordered list of changesets — Liquibase’s equivalent of Git history
  • Every changeset has a three-part identity: ID + AUTHOR + FILENAME — never change after deployment
  • Liquibase stores an MD5SUM checksum for every applied changeset; editing an applied changeset breaks validation
  • One logical change per changeset — enables precise rollback and clean retry on failure
  • Platform-agnostic change types generate MySQL DDL with automatic rollback; raw SQL gives full control
  • The master changelog is an index of include/includeAll — it contains no changesets itself
  • Changeset execution records to DATABASECHANGELOG only on success — failed changesets will be retried

Next: Article 3 — Changelog Formats: XML, YAML, JSON, and SQL — which format to choose and why.