Introduction to JPA, Hibernate, and Spring Data JPA

Part 1 of 28

May 04, 2026 Abhay 6 min read

Introduction

Every Spring Boot application that touches a relational database eventually encounters three terms used almost interchangeably: JPA, Hibernate, and Spring Data JPA. They are related but distinct, and understanding the difference is essential before writing a single line of mapping code.

This article explains what each one is, how they fit together, and why this stack is the dominant approach to Java database access.

The Problem: Object-Relational Impedance Mismatch

Java applications work with objects: classes, inheritance, collections, references. Relational databases work with tables, rows, columns, and foreign keys. These two worlds don’t map neatly onto each other — this tension is called the object-relational impedance mismatch.

Before JPA, Java developers wrote JDBC code by hand:

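// Without JPA: raw JDBC (uses java.sql.Connection and java.sql.PreparedStatement)
String sql = "INSERT INTO customers (name, email) VALUES (?, ?)";
// try-with-resources closes the statement even if executeUpdate() throws
try (PreparedStatement stmt = connection.prepareStatement(sql)) {
    stmt.setString(1, customer.getName());
    stmt.setString(2, customer.getEmail());
    stmt.executeUpdate();
}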

This works, but it means:

Manual mapping between Java objects and SQL rows in every method
Manual transaction management
No object graph navigation — every join requires explicit SQL
No change tracking — you must know what changed and write the UPDATE

JPA was designed to take most of this manual work off your hands.

What Is JPA?

JPA (Jakarta Persistence API) — formerly Java Persistence API — is a specification. It defines a standard set of interfaces, annotations, and behaviour for Java ORM (Object-Relational Mapping).

JPA is defined in the jakarta.persistence package and specifies:

How to annotate a Java class to make it a persistent entity (@Entity, @Id, @Column)
How to map relationships between entities (@OneToMany, @ManyToOne, etc.)
How to query entities using JPQL (Java Persistence Query Language)
How the persistence context (the entity cache) works
How transactions interact with persistence

Crucially, JPA is just an API — a set of interfaces and rules. It ships no implementation.

What Is Hibernate?

Hibernate is the most widely used JPA implementation. It is the engine that actually executes JPA contracts.

When you add JPA to a Spring Boot project, Hibernate is pulled in automatically as the default provider. Hibernate:

Can generate SQL DDL (the schema) from your @Entity mappings
Generates SELECT, INSERT, UPDATE, and DELETE SQL from your code
Implements the persistence context (first-level cache)
Manages lazy loading through proxy objects
Obtains JDBC connections from the configured DataSource, which in Spring Boot is a HikariCP connection pool by default
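
To make the lazy-loading item concrete, here is a minimal sketch of the idea using a JDK dynamic proxy. It is not Hibernate's implementation: Hibernate generates subclasses of your entity classes rather than interface proxies, and the names CustomerView, LoadedCustomer, and LazyProxies are hypothetical, chosen only for this illustration. The sketch assumes Java 17 and uses nothing outside the JDK.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical contract: a JDK dynamic proxy can only implement interfaces.
interface CustomerView {
    String getName();
}

// Stands in for a row that has actually been read from the database.
record LoadedCustomer(String name) implements CustomerView {
    @Override
    public String getName() {
        return name;
    }
}

final class LazyProxies {
    private LazyProxies() {}

    // Returns a stand-in that runs the loader only when a method is first called on it.
    static CustomerView lazy(Supplier<CustomerView> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private CustomerView target; // stays null until first use

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (target == null) {
                    target = loader.get(); // a real provider would issue the SELECT here
                }
                return method.invoke(target, args);
            }
        };
        return (CustomerView) Proxy.newProxyInstance(
                CustomerView.class.getClassLoader(),
                new Class<?>[] {CustomerView.class},
                handler);
    }
}

final class LazyProxyDemo {
    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        CustomerView customer = LazyProxies.lazy(() -> {
            loads.incrementAndGet();
            return new LoadedCustomer("Ada");
        });
        System.out.println(loads.get());        // 0: nothing loaded yet
        System.out.println(customer.getName()); // Ada: first call triggers the load
        System.out.println(loads.get());        // 1: loaded exactly once
    }
}
```

The point of the sketch is the timing: creating the proxy costs nothing, and the load happens only when the object is actually used.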

Hibernate also provides extensions beyond the JPA spec, such as @BatchSize, @DynamicUpdate, @NaturalId, and fine-grained second-level cache configuration. These are Hibernate-specific and not portable to other JPA providers.

JPA vs. Hibernate at a glance

Aspect	JPA	Hibernate
Type	Specification (API)	Implementation
Package	jakarta.persistence	org.hibernate
Portable	Yes, across JPA providers	No, Hibernate-specific
Extensions	None	Many

What Is Spring Data JPA?

Spring Data JPA is a Spring module that sits on top of JPA/Hibernate and adds another layer of abstraction.

It provides:

Repository interfaces — you declare an interface extending JpaRepository, and Spring generates the implementation at startup
Derived query methods — method names like findByEmailAndStatus are automatically converted to JPQL queries
@Query annotation — for custom JPQL or native SQL
Pagination and sorting built in
Auditing (@CreatedDate, @LastModifiedBy)
Specifications for dynamic queries
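
As a rough illustration of derived query methods, the sketch below turns a method name into a JPQL string. It is a toy, not Spring Data's parser: the class DerivedQuerySketch and its toJpql method are hypothetical, it handles only findBy with And/Or and equality comparisons, and the real parser supports many more keywords (Between, Like, OrderBy, and others).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DerivedQuerySketch {
    private static final String PREFIX = "findBy";
    // Matches "And"/"Or" only when an upper-case letter follows (the start of the next property).
    private static final Pattern CONNECTOR = Pattern.compile("(And|Or)(?=[A-Z])");

    private DerivedQuerySketch() {}

    // Builds a JPQL string with positional parameters (?1, ?2, ...) from a method name.
    static String toJpql(String entityName, String methodName) {
        if (!methodName.startsWith(PREFIX) || methodName.length() == PREFIX.length()) {
            throw new IllegalArgumentException("Unsupported method name: " + methodName);
        }
        String criteria = methodName.substring(PREFIX.length());
        StringBuilder jpql = new StringBuilder("SELECT e FROM " + entityName + " e WHERE ");
        Matcher matcher = CONNECTOR.matcher(criteria);
        int start = 0;
        int paramIndex = 1;
        while (matcher.find()) {
            appendCondition(jpql, criteria.substring(start, matcher.start()), paramIndex++);
            jpql.append(' ').append(matcher.group(1).toUpperCase()).append(' ');
            start = matcher.end();
        }
        appendCondition(jpql, criteria.substring(start), paramIndex);
        return jpql.toString();
    }

    private static void appendCondition(StringBuilder jpql, String property, int paramIndex) {
        if (property.isEmpty()) {
            throw new IllegalArgumentException("Missing property name");
        }
        // "Email" becomes "email": bean property names start with a lower-case letter.
        String field = Character.toLowerCase(property.charAt(0)) + property.substring(1);
        jpql.append("e.").append(field).append(" = ?").append(paramIndex);
    }
}

final class DerivedQueryDemo {
    public static void main(String[] args) {
        // Prints: SELECT e FROM Customer e WHERE e.email = ?1 AND e.status = ?2
        System.out.println(DerivedQuerySketch.toJpql("Customer", "findByEmailAndStatus"));
    }
}
```

The method name is the whole query definition: each property in the name becomes one condition, and the method's arguments are bound to the parameters in order.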

Without Spring Data JPA, even with Hibernate you would write DAO classes by hand:

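// Without Spring Data JPA: a hand-written DAO on top of EntityManager
import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import java.util.List;
import java.util.Optional;
import org.springframework.stereotype.Repository;

@Repository
public class CustomerDao {
    @PersistenceContext
    private EntityManager em; // injected by the container

    // em.find returns null when no row matches, so wrap the result in Optional
    public Optional<Customer> findById(Long id) {
        return Optional.ofNullable(em.find(Customer.class, id));
    }

    public List<Customer> findByEmail(String email) {
        return em.createQuery(
            "SELECT c FROM Customer c WHERE c.email = :email", Customer.class)
            .setParameter("email", email)
            .getResultList();
    }
}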

With Spring Data JPA, this collapses to:

// With Spring Data JPA: no hand-written implementation
import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;

public interface CustomerRepository extends JpaRepository<Customer, Long> {
    List<Customer> findByEmail(String email);
}

Spring creates the implementation when the application starts: you declare what you want, and Spring works out how.
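
How can an interface with no class behind it do anything? A minimal sketch of the underlying idea is a proxy that implements the interface at runtime. Everything here is an assumption made for illustration: Person, PersonRepository, and RepositoryProxySketch are hypothetical names, and an in-memory list stands in for the database. Spring Data's actual mechanism is considerably more involved.

```java
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical domain type and repository interface, used only in this sketch.
record Person(long id, String name, String email) {}

interface PersonRepository {
    List<Person> findByEmail(String email);
    long count();
}

final class RepositoryProxySketch {
    private RepositoryProxySketch() {}

    // Builds an implementation of PersonRepository at runtime, backed by an in-memory list.
    static PersonRepository create(List<Person> table) {
        return (PersonRepository) Proxy.newProxyInstance(
                PersonRepository.class.getClassLoader(),
                new Class<?>[] {PersonRepository.class},
                (proxy, method, args) -> {
                    // Dispatch on the method name, as a stand-in for real query derivation.
                    if (method.getName().equals("findByEmail")) {
                        return table.stream()
                                .filter(p -> p.email().equals(args[0]))
                                .collect(Collectors.toList());
                    }
                    if (method.getName().equals("count")) {
                        return (long) table.size();
                    }
                    throw new UnsupportedOperationException(method.getName());
                });
    }
}

final class RepositoryProxyDemo {
    public static void main(String[] args) {
        PersonRepository repo = RepositoryProxySketch.create(List.of(
                new Person(1L, "Ada", "ada@example.com"),
                new Person(2L, "Bob", "bob@example.com")));
        System.out.println(repo.count());                              // 2
        System.out.println(repo.findByEmail("ada@example.com").size()); // 1
    }
}
```

No class in this sketch declares "implements PersonRepository", yet the caller gets a working object. That is the sense in which a repository interface needs no hand-written implementation.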

How the Three Layers Fit Together

Your Application Code
        │
        ▼
Spring Data JPA          ← Repository interfaces, query derivation, @Query, auditing
        │
        ▼
JPA API (jakarta.persistence)  ← EntityManager, @Entity, JPQL spec
        │
        ▼
Hibernate                ← SQL generation, dirty checking, caching, proxies
        │
        ▼
JDBC
        │
        ▼
MySQL 8.x

Each layer uses the one below it. Spring Data JPA calls JPA’s EntityManager. The EntityManager is implemented by Hibernate. Hibernate uses JDBC to talk to MySQL.

Key Concepts You Will Master in This Series

Entity

An entity is a Java class mapped to a database table. Each instance represents a row.

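import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;
import jakarta.persistence.Table;

@Entity
@Table(name = "customers")
public class Customer {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY) // the database assigns the key
    private Long id;

    private String name;
    private String email;

    // Getters and setters omitted for brevity
}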

EntityManager

The EntityManager is the JPA interface for interacting with the persistence context. Spring Data JPA uses it internally — you rarely call it directly.

Key operations:

em.persist(entity): make a new entity managed (schedules an INSERT)
em.find(Customer.class, 1L): load by primary key
em.merge(entity): copy the state of a detached entity onto a managed instance and return that instance
em.remove(entity): schedule a DELETE
em.flush(): write pending changes to the database

Persistence Context

The persistence context is an in-memory cache of entities managed by the current EntityManager. Every entity you load or persist is tracked here. When the transaction commits, Hibernate flushes all changes to the database automatically — you do not call UPDATE manually.

This is covered in depth in Article 3.
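
Until then, the two behaviours described above (one instance per row, and automatic detection of changes) can be sketched with a toy model. This is not how Hibernate is implemented. Account and ToyPersistenceContext are hypothetical names, the loader function stands in for a SELECT, and flush() returns descriptions of the statements it would run instead of touching a database.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical mutable entity; no JPA annotations, so the sketch runs on a plain JDK.
final class Account {
    final long id;
    String email;

    Account(long id, String email) {
        this.id = id;
        this.email = email;
    }
}

final class ToyPersistenceContext {
    private final Map<Long, Account> managed = new LinkedHashMap<>();  // one instance per id
    private final Map<Long, String> snapshots = new LinkedHashMap<>(); // email as originally loaded
    private final Function<Long, Account> loader;                       // stands in for a SELECT

    ToyPersistenceContext(Function<Long, Account> loader) {
        this.loader = loader;
    }

    // Returns the already-managed instance if there is one; otherwise loads and tracks it.
    Account find(long id) {
        Account existing = managed.get(id);
        if (existing != null) {
            return existing;
        }
        Account loaded = loader.apply(id);
        if (loaded != null) {
            managed.put(id, loaded);
            snapshots.put(id, loaded.email);
        }
        return loaded;
    }

    // Dirty checking: compare each managed entity with its snapshot, report an UPDATE if it changed.
    List<String> flush() {
        List<String> statements = new ArrayList<>();
        for (Account account : managed.values()) {
            if (!Objects.equals(account.email, snapshots.get(account.id))) {
                statements.add("UPDATE accounts SET email = ? WHERE id = ? ["
                        + account.email + ", " + account.id + "]");
                snapshots.put(account.id, account.email); // the new state becomes the baseline
            }
        }
        return statements;
    }
}

final class PersistenceContextDemo {
    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext(id -> new Account(id, "old@example.com"));
        Account account = ctx.find(1L);
        System.out.println(account == ctx.find(1L)); // true: same instance both times
        account.email = "new@example.com";           // a plain setter-style change, no SQL written
        System.out.println(ctx.flush());             // one UPDATE for the changed entity
        System.out.println(ctx.flush());             // []: nothing changed since the last flush
    }
}
```

Note that the calling code never asks for an UPDATE. It changes a field, and the context works out what to write by comparing current state against what was loaded.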

JPQL

JPQL (Java Persistence Query Language) is a query language that operates on entity objects, not tables:

// JPQL — works on the Customer entity, not the customers table
"SELECT c FROM Customer c WHERE c.email = :email"

// SQL — works on the table
"SELECT * FROM customers WHERE email = ?"

JPQL is portable across databases. Hibernate translates it to database-specific SQL at runtime.
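
One place where databases differ is row limiting, which JPA exposes through setMaxResults on a query rather than in the query text. The sketch below shows why a translation layer is needed: the same logical query ends with a different clause depending on the target. ToyDialect and ToyQueryRenderer are hypothetical names for this illustration, and the SQL text is hand-written here, not output captured from Hibernate.

```java
// Hypothetical dialects for illustration; a real provider models far more than row limiting.
enum ToyDialect {
    MYSQL {
        @Override
        String limit(String sql, int maxRows) {
            return sql + " LIMIT " + maxRows;
        }
    },
    SQL_2008 {
        @Override
        String limit(String sql, int maxRows) {
            return sql + " FETCH FIRST " + maxRows + " ROWS ONLY";
        }
    };

    // Appends this dialect's row-limiting clause to a SELECT statement.
    abstract String limit(String sql, int maxRows);
}

final class ToyQueryRenderer {
    private ToyQueryRenderer() {}

    // One logical query ("first N customers with this email"), rendered per dialect.
    static String firstByEmail(ToyDialect dialect, int maxRows) {
        String select = "SELECT c.id, c.name, c.email FROM customers c WHERE c.email = ?";
        return dialect.limit(select, maxRows);
    }
}

final class DialectDemo {
    public static void main(String[] args) {
        for (ToyDialect dialect : ToyDialect.values()) {
            System.out.println(dialect + ": " + ToyQueryRenderer.firstByEmail(dialect, 5));
        }
    }
}
```

The application code stays the same for both targets; only the dialect chosen at configuration time changes the SQL that is sent.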

The Domain Used in This Series

Every article builds on an e-commerce domain with these entities:

Category (has many Products)
    │
    └── Product (has many Tags, Reviews; belongs to Category)
            │
            ├── Tag (many-to-many with Product)
            └── Review (belongs to Product and Customer)

Customer (has many Orders, Addresses)
    │
    └── Order (has many OrderItems; belongs to Customer)
            │
            └── OrderItem (belongs to Order and Product)

You will build this schema progressively from Article 2 onward, adding each concept as a new layer.

Quick Comparison: JDBC vs. JPA vs. Spring Data JPA

Aspect	JDBC	JPA/Hibernate	Spring Data JPA
SQL	Hand-written	Generated	Generated or @Query
Mapping	Manual ResultSet	Automatic via @Entity	Automatic
Change detection	Manual UPDATE	Dirty checking	Dirty checking
Transactions	Manual	@Transactional	@Transactional
Boilerplate	High	Medium	Low
Control	Maximum	High	Medium
Learning curve	Low	Medium	Low once JPA understood

There is no universally correct choice — large, complex queries often benefit from native SQL even inside a Spring Data JPA project. But for CRUD operations, relationships, and standard queries, Spring Data JPA dramatically reduces the code you write and maintain.

Key Takeaways

JPA is the specification — the contract that defines how Java ORM must behave
Hibernate is the most popular JPA implementation — the engine that generates and executes SQL
Spring Data JPA sits on top — it generates repository implementations and eliminates DAO boilerplate
The persistence context is the foundational concept that makes JPA work — entities inside it are tracked and automatically synchronized to the database
JPQL queries objects, not tables — this is the query language used throughout JPA

What’s Next

Article 2 walks through setting up a Spring Boot 3.3 project with Spring Data JPA, MySQL 8.x, HikariCP, and Flyway — the full production-grade foundation used throughout this series.