Introduction to Spring Data JPA

Most Spring Boot applications need to read and write data to a relational database. Spring Data JPA makes this dramatically simpler by generating the boilerplate data access code for you. This article explains what it is, how it fits together, and how to get started.

The Stack: JPA, Hibernate, and Spring Data JPA

These three things work together — understanding the layering matters:

Your Code (Repository interfaces, @Entity classes)
           │
           ▼
  Spring Data JPA
  (generates repository implementations, adds convenience methods)
           │
           ▼
  JPA (Jakarta Persistence API)
  (standard specification: @Entity, @Id, EntityManager, JPQL)
           │
           ▼
  Hibernate (JPA implementation)
  (translates JPA operations to SQL, manages sessions)
           │
           ▼
  JDBC (Java Database Connectivity)
  (sends SQL to the database)
           │
           ▼
  PostgreSQL / MySQL / H2 / etc.

JPA is a specification: it defines the API (annotations, EntityManager, JPQL) but contains no working code of its own. Hibernate is an implementation of that spec, and the one Spring Boot pulls in by default. Spring Data JPA sits on top and generates repetitive repository code so you don’t have to write it.

Dependencies

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>

<!-- PostgreSQL driver -->
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <scope>runtime</scope>
</dependency>

<!-- For development: H2 in-memory database -->
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>

Configuration

# application-dev.yml (H2 in-memory for development)
spring:
  datasource:
    url: jdbc:h2:mem:orderdb;DB_CLOSE_DELAY=-1;MODE=PostgreSQL
    driver-class-name: org.h2.Driver
    username: sa
    password:
  h2:
    console:
      enabled: true   # access at http://localhost:8080/h2-console
  jpa:
    hibernate:
      ddl-auto: create-drop   # recreate schema on each restart
    show-sql: true
    properties:
      hibernate:
        format_sql: true

# application-prod.yml (PostgreSQL)
spring:
  datasource:
    url: ${DB_URL}
    username: ${DB_USERNAME}
    password: ${DB_PASSWORD}
    hikari:
      maximum-pool-size: 20
      minimum-idle: 5
  jpa:
    hibernate:
      ddl-auto: validate  # validate schema but don't change it (Flyway manages schema)
    show-sql: false

ddl-auto options

Value        Behavior                                             Use for
create       Drop and create schema on startup                    -
create-drop  Create on startup, drop on shutdown                  Unit tests
validate     Validate schema matches entities, error if not       Production
update       Update schema to match entities (dangerous in prod)  Never in production
none         Do nothing                                           Production (Flyway manages schema)

Your First Entity

@Entity
@Table(name = "orders")
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.UUID)
    private UUID id;

    @Column(name = "customer_id", nullable = false)
    private UUID customerId;

    @Column(name = "order_number", unique = true, nullable = false, length = 20)
    private String orderNumber;

    @Enumerated(EnumType.STRING)
    @Column(nullable = false)
    private OrderStatus status;

    @Column(name = "total_amount", precision = 12, scale = 2)
    private BigDecimal totalAmount;

    @Column(name = "created_at", nullable = false, updatable = false)
    private Instant createdAt;

    @Column(name = "updated_at")
    private Instant updatedAt;

    @PrePersist
    protected void onCreate() {
        createdAt = Instant.now();
        updatedAt = Instant.now();
    }

    @PreUpdate
    protected void onUpdate() {
        updatedAt = Instant.now();
    }

    // getters, setters, constructors
}

Your First Repository

The magic of Spring Data JPA: define an interface, get a full implementation for free.

public interface OrderRepository extends JpaRepository<Order, UUID> {
    // That's it. Spring generates the implementation.
}

JpaRepository<Order, UUID> gives you:

// CRUD operations (CrudRepository / ListCrudRepository)
Order save(Order entity)
List<Order> saveAll(Iterable<Order> entities)
Optional<Order> findById(UUID id)
boolean existsById(UUID id)
List<Order> findAll()
long count()
void deleteById(UUID id)

// Sorting and paging (PagingAndSortingRepository / ListPagingAndSortingRepository)
List<Order> findAll(Sort sort)
Page<Order> findAll(Pageable pageable)

// JPA-specific (JpaRepository)
void flush()
Order saveAndFlush(Order entity)

Use it in a service:

@Service
@RequiredArgsConstructor
public class OrderService {

    private final OrderRepository orderRepository;

    public Order findById(UUID id) {
        return orderRepository.findById(id)
            .orElseThrow(() -> new OrderNotFoundException(id));
    }

    public Order save(Order order) {
        return orderRepository.save(order);
    }

    public void delete(UUID id) {
        orderRepository.deleteById(id);
    }

    public List<Order> findAll() {
        return orderRepository.findAll();
    }
}
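
How can an interface with no implementing class do anything at all? Spring Data creates a proxy object for the interface at startup and routes each call to a shared implementation. The sketch below imitates that idea with nothing but the JDK's java.lang.reflect.Proxy. MiniRepository and MiniRepositoryFactory are hypothetical names invented for this illustration, and an in-memory map stands in for the database, so treat it as a picture of the mechanism rather than Spring's actual code.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a Spring Data repository interface.
interface MiniRepository<T, ID> {
    T save(ID id, T entity);
    Optional<T> findById(ID id);
    long count();
}

final class MiniRepositoryFactory {

    // Builds an implementation of the interface at runtime; no class is written by hand.
    @SuppressWarnings("unchecked")
    static <T, ID> MiniRepository<T, ID> create() {
        // Assumption: an in-memory map replaces the real database.
        Map<ID, T> store = new LinkedHashMap<>();

        // Every call on the proxy arrives here with the invoked method and its arguments.
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "save":
                    store.put((ID) args[0], (T) args[1]);
                    return args[1];
                case "findById":
                    return Optional.ofNullable(store.get(args[0]));
                case "count":
                    return (long) store.size();
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };

        return (MiniRepository<T, ID>) Proxy.newProxyInstance(
                MiniRepository.class.getClassLoader(),
                new Class<?>[] {MiniRepository.class},
                handler);
    }
}
```

Calling MiniRepositoryFactory.create() hands back an object that satisfies the interface even though no class implementing it exists in the source tree, which is roughly what happens when Spring injects OrderRepository into OrderService.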

Derived Query Methods

Spring Data JPA generates queries from method names. The method name IS the query:

public interface OrderRepository extends JpaRepository<Order, UUID> {

    // SELECT * FROM orders WHERE customer_id = ?
    List<Order> findByCustomerId(UUID customerId);

    // SELECT * FROM orders WHERE status = ?
    List<Order> findByStatus(OrderStatus status);

    // SELECT * FROM orders WHERE customer_id = ? AND status = ?
    List<Order> findByCustomerIdAndStatus(UUID customerId, OrderStatus status);

    // SELECT * FROM orders WHERE total_amount > ?
    List<Order> findByTotalAmountGreaterThan(BigDecimal amount);

    // SELECT * FROM orders WHERE created_at BETWEEN ? AND ?
    List<Order> findByCreatedAtBetween(Instant from, Instant to);

    // SELECT COUNT(*) FROM orders WHERE status = ?
    long countByStatus(OrderStatus status);

    // SELECT * FROM orders WHERE customer_id = ? ORDER BY created_at DESC
    List<Order> findByCustomerIdOrderByCreatedAtDesc(UUID customerId);

    // LIMIT 10
    List<Order> findTop10ByStatusOrderByCreatedAtDesc(OrderStatus status);

    // EXISTS
    boolean existsByOrderNumber(String orderNumber);

    // DELETE (call it inside a transaction; matching entities are loaded, then removed one by one)
    void deleteByStatus(OrderStatus status);
}

Keywords you can use: And, Or, Is, Not, Between, LessThan, GreaterThan, Like, Containing, StartingWith, EndingWith, OrderBy, Top, First, Distinct, Count, Exists, Delete.
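
To make "the method name is the query" less mysterious, here is a toy translator that turns a findBy... method name into a JPQL-like string. It is a deliberately small sketch: DerivedQuerySketch is a hypothetical class, it only understands And, Or, GreaterThan and LessThan, and the real parser in Spring Data handles far more keywords and validates property names against the entity.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DerivedQuerySketch {

    // "And"/"Or" only count as connectors when a new capitalized property follows,
    // so the "Or" in "OrderNumber" is left alone.
    private static final Pattern CONNECTOR = Pattern.compile("(And|Or)(?=[A-Z])");

    // Translates e.g. "findByCustomerIdAndStatus" into a JPQL-like string.
    static String toJpql(String methodName, String entityName) {
        if (!methodName.startsWith("findBy")) {
            throw new IllegalArgumentException("This sketch only supports findBy... methods");
        }
        String criteria = methodName.substring("findBy".length());
        StringBuilder jpql = new StringBuilder("SELECT e FROM " + entityName + " e WHERE ");

        Matcher connector = CONNECTOR.matcher(criteria);
        int start = 0;
        int param = 1;
        while (connector.find()) {
            // Everything before the connector is one property expression.
            jpql.append(predicate(criteria.substring(start, connector.start()), param++));
            jpql.append(' ').append(connector.group(1).toUpperCase()).append(' ');
            start = connector.end();
        }
        jpql.append(predicate(criteria.substring(start), param));
        return jpql.toString();
    }

    // Turns "TotalAmountGreaterThan" into "e.totalAmount > ?1".
    private static String predicate(String part, int param) {
        String operator = "=";
        if (part.endsWith("GreaterThan")) {
            operator = ">";
            part = part.substring(0, part.length() - "GreaterThan".length());
        } else if (part.endsWith("LessThan")) {
            operator = "<";
            part = part.substring(0, part.length() - "LessThan".length());
        }
        String property = Character.toLowerCase(part.charAt(0)) + part.substring(1);
        return "e." + property + " " + operator + " ?" + param;
    }
}
```

The useful takeaway is that the translation is purely mechanical. A typo in a property name therefore cannot be "guessed around": Spring Data rejects a method name it cannot map to entity properties when the application starts, not when the query first runs.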

The EntityManager (Under the Hood)

Spring Data repositories use EntityManager internally. You can access it directly when needed:

@Repository
public class CustomOrderRepository {

    @PersistenceContext
    private EntityManager em;

    public List<Order> findLargeRecentOrders(BigDecimal minAmount, int limit) {
        return em.createQuery("""
                SELECT o FROM Order o
                WHERE o.totalAmount >= :minAmount
                ORDER BY o.createdAt DESC
                """, Order.class)
            .setParameter("minAmount", minAmount)
            .setMaxResults(limit)
            .getResultList();
    }
}

The Persistence Context

The EntityManager maintains a persistence context — a first-level cache of all entities it has loaded in the current transaction. This has important implications:

@Transactional
public void updateOrder(UUID id, OrderStatus newStatus) {
    Order order = orderRepository.findById(id).orElseThrow();

    // Hibernate tracks this entity
    order.setStatus(newStatus);

    // No explicit save needed: Hibernate detects the change and issues the
    // UPDATE when the persistence context is flushed, normally at commit (dirty checking)
}

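Dirty checking: at flush time (by default just before the transaction commits, and before queries that could be affected), Hibernate compares each managed entity's current state with the snapshot it took when the entity was loaded. If anything changed, it generates the UPDATE SQL automatically. This only works for entities loaded inside the transaction; outside one, the entity is detached and changes are not tracked.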
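
The snapshot-and-compare idea can be sketched in plain Java. This is not Hibernate's implementation: TrackedOrder and MiniPersistenceContext are hypothetical classes, only a single status field is tracked, and the "SQL" is just a string returned for inspection.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical mutable entity; stands in for the article's Order.
final class TrackedOrder {
    final String id;
    String status;

    TrackedOrder(String id, String status) {
        this.id = id;
        this.status = status;
    }
}

// Toy persistence context: caches loaded entities and diffs them at flush time.
final class MiniPersistenceContext {
    private final Map<String, TrackedOrder> managed = new HashMap<>();
    private final Map<String, String> loadedStatus = new HashMap<>();

    // Assumption: the caller passes in the value "read from the database".
    TrackedOrder load(String id, String statusFromDb) {
        TrackedOrder cached = managed.get(id);
        if (cached != null) {
            return cached; // first-level cache: same id, same instance
        }
        TrackedOrder order = new TrackedOrder(id, statusFromDb);
        managed.put(id, order);
        loadedStatus.put(id, statusFromDb); // snapshot used for later comparison
        return order;
    }

    // Compares current state with the snapshot and describes the needed updates.
    List<String> flush() {
        List<String> statements = new ArrayList<>();
        for (TrackedOrder order : managed.values()) {
            if (!Objects.equals(order.status, loadedStatus.get(order.id))) {
                statements.add("UPDATE orders SET status = ? WHERE id = ? ["
                        + order.status + ", " + order.id + "]");
                loadedStatus.put(order.id, order.status); // snapshot is now current
            }
        }
        return statements;
    }
}
```

Loading an order, changing its status and calling flush() yields one update statement; a second flush() yields none, because the snapshot was refreshed. That is also why an untouched entity costs no UPDATE at commit.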

Auditing with @CreatedDate and @LastModifiedDate

Instead of @PrePersist/@PreUpdate, use Spring Data’s auditing support:

// Enable auditing
@SpringBootApplication
@EnableJpaAuditing
public class OrderServiceApplication { ... }

// In your entity
@Entity
@EntityListeners(AuditingEntityListener.class)
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.UUID)
    private UUID id;

    @CreatedDate
    @Column(updatable = false)
    private Instant createdAt;

    @LastModifiedDate
    private Instant updatedAt;

    @CreatedBy
    @Column(updatable = false)
    private String createdBy;

    @LastModifiedBy
    private String lastModifiedBy;
}

For @CreatedBy / @LastModifiedBy, provide an AuditorAware bean:

@Component
public class SecurityAuditorAware implements AuditorAware<String> {

    @Override
    public Optional<String> getCurrentAuditor() {
        return Optional.ofNullable(SecurityContextHolder.getContext())
            .map(SecurityContext::getAuthentication)
            .filter(Authentication::isAuthenticated)
            .map(Authentication::getName);
    }
}
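
What the auditing listener does on each save can be pictured as a small stamping routine: the "created" fields are written once, the "modified" fields every time. The sketch below is a plain-Java approximation under stated assumptions. AuditedRecord and AuditStamper are hypothetical names, a Supplier plays the role of AuditorAware, and a null createdAt is used to detect a new record, which is a simplification of how Spring Data decides whether an entity is new.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical record holding the four audit fields from the entity above.
final class AuditedRecord {
    Instant createdAt;
    String createdBy;
    Instant updatedAt;
    String lastModifiedBy;
}

final class AuditStamper {
    private final Clock clock;
    private final Supplier<Optional<String>> currentAuditor; // plays the AuditorAware role

    AuditStamper(Clock clock, Supplier<Optional<String>> currentAuditor) {
        this.clock = clock;
        this.currentAuditor = currentAuditor;
    }

    // Convenience for demos: a stamper frozen at one instant with one known user.
    static AuditStamper fixed(String isoInstant, String user) {
        return new AuditStamper(
                Clock.fixed(Instant.parse(isoInstant), ZoneOffset.UTC),
                () -> Optional.ofNullable(user));
    }

    // Called once per save: creation fields only the first time, modification fields always.
    void stamp(AuditedRecord record) {
        Instant now = clock.instant();
        String auditor = currentAuditor.get().orElse(null);
        if (record.createdAt == null) { // assumption: null createdAt means "new"
            record.createdAt = now;
            record.createdBy = auditor;
        }
        record.updatedAt = now;
        record.lastModifiedBy = auditor;
    }
}
```

Stamping the same record first as "alice" and later as "bob" leaves createdBy as alice and lastModifiedBy as bob, which mirrors why the entity marks createdAt and createdBy as updatable = false.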

A Complete Data Layer

Here’s the full picture for the order-service:

com.devopsmonk.order
├── domain
│   ├── Order.java          @Entity
│   ├── OrderItem.java      @Entity
│   └── OrderStatus.java    enum
├── repository
│   ├── OrderRepository.java          extends JpaRepository
│   └── OrderItemRepository.java      extends JpaRepository
└── service
    └── OrderService.java    uses repositories

@Entity
@Table(name = "orders")
@EntityListeners(AuditingEntityListener.class)
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.UUID)
    private UUID id;

    @Column(name = "customer_id", nullable = false)
    private UUID customerId;

    @OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<OrderItem> items = new ArrayList<>();

    @Enumerated(EnumType.STRING)
    @Column(nullable = false)
    private OrderStatus status = OrderStatus.PENDING;

    @Column(name = "total_amount", precision = 12, scale = 2)
    private BigDecimal totalAmount;

    @CreatedDate
    @Column(updatable = false)
    private Instant createdAt;

    @LastModifiedDate
    private Instant updatedAt;

    // business methods
    public void addItem(OrderItem item) {
        items.add(item);
        item.setOrder(this);
        recalculateTotal();
    }

    public void confirm() {
        if (status != OrderStatus.PENDING) {
            throw new IllegalStateException("Can only confirm pending orders");
        }
        status = OrderStatus.CONFIRMED;
    }

    private void recalculateTotal() {
        totalAmount = items.stream()
            .map(item -> item.getUnitPrice().multiply(BigDecimal.valueOf(item.getQuantity())))
            .reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
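
One benefit of keeping business methods such as addItem() and confirm() on the entity is that they need no database to exercise. The sketch below strips the JPA annotations away and checks the same rules in plain Java. PlainOrder, PlainOrderItem and PlainOrderStatus are hypothetical, annotation-free copies made for this illustration, and OrderItem's shape (a unit price and a quantity) is assumed from the getters used in recalculateTotal().

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

enum PlainOrderStatus { PENDING, CONFIRMED }

// Assumed shape of an order item: a unit price and a quantity.
final class PlainOrderItem {
    private final BigDecimal unitPrice;
    private final int quantity;

    PlainOrderItem(String unitPrice, int quantity) {
        this.unitPrice = new BigDecimal(unitPrice);
        this.quantity = quantity;
    }

    BigDecimal getUnitPrice() { return unitPrice; }
    int getQuantity() { return quantity; }
}

// Same business rules as the entity above, minus persistence concerns.
final class PlainOrder {
    private final List<PlainOrderItem> items = new ArrayList<>();
    private PlainOrderStatus status = PlainOrderStatus.PENDING;
    private BigDecimal totalAmount = BigDecimal.ZERO;

    void addItem(PlainOrderItem item) {
        items.add(item);
        recalculateTotal();
    }

    void confirm() {
        if (status != PlainOrderStatus.PENDING) {
            throw new IllegalStateException("Can only confirm pending orders");
        }
        status = PlainOrderStatus.CONFIRMED;
    }

    BigDecimal getTotalAmount() { return totalAmount; }
    PlainOrderStatus getStatus() { return status; }

    // Sum of unitPrice * quantity over all items.
    private void recalculateTotal() {
        totalAmount = items.stream()
            .map(item -> item.getUnitPrice().multiply(BigDecimal.valueOf(item.getQuantity())))
            .reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
```

Adding two items at 19.99 and one at 5.00 gives a total of 44.98, and a second confirm() call throws IllegalStateException. The same assertions work against the real entity in an ordinary unit test, with no Spring context or database involved.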

What You’ve Learned

  • Spring Data JPA sits on top of JPA (spec) → Hibernate (implementation) → JDBC → database
  • Define an interface extending JpaRepository<Entity, ID> — Spring generates the implementation
  • Derived query methods generate SQL from method names — findByCustomerIdAndStatus(...) etc.
  • ddl-auto: validate in production, create-drop in tests; use Flyway for schema management
  • Hibernate dirty checking: modify an entity in a @Transactional method and it’s saved automatically
  • @EnableJpaAuditing with @CreatedDate/@LastModifiedDate handles timestamps automatically

Next: Article 16, JPA Entity Mapping: @Entity, @Column, @Id strategies, embedded objects, and inheritance.