Part 12 of 16

Flight Recorder and JVM Monitoring (JEP 328)

What Is Java Flight Recorder?

Java Flight Recorder (JFR) is a low-overhead profiling and diagnostics framework built into the JVM and designed to be left running in production. It was a commercial feature of Oracle JDK until JEP 328 open-sourced it as part of OpenJDK in Java 11.

JFR collects data about JVM internals and application behaviour (method profiling, allocation, GC pauses, thread states, I/O, lock contention) with a typical overhead of around 1% with the default settings and around 2% with the more detailed profile settings.

This sets it apart from traditional profilers such as JProfiler and YourKit, whose instrumentation-based modes can add substantial overhead (figures of 10–50% are often quoted, depending on mode and workload) and are therefore rarely left attached in production. JFR is designed to run continuously.


Architecture

JFR distinguishes three kinds of event:

| Event type | Description | Example |
| --- | --- | --- |
| Instant | Occurs at a single point in time | Thread start/stop, exception |
| Duration | Has a start time and an end time | GC pause, monitor wait |
| Sampled | Emitted periodically | Method profiling samples (every 20 ms with default settings, 10 ms with profile) |

Events are written first to thread-local buffers and then to global buffers in native memory (outside the Java heap), and are periodically flushed to a .jfr file on disk.
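The three kinds of event can be reproduced with small application-defined events. The following is a minimal sketch, not JFR's internal implementation: the event names (demo.CacheMiss, demo.Query, demo.QueueDepth), the field values, and the record() helper are all hypothetical. It records one event of each kind and reads them back with the jdk.jfr.consumer API.

```java
import jdk.jfr.Event;
import jdk.jfr.FlightRecorder;
import jdk.jfr.Name;
import jdk.jfr.Period;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class EventKindsDemo {

    // Instant: commit() without begin(), so the event carries no duration.
    @Name("demo.CacheMiss")
    static class CacheMiss extends Event { String key; }

    // Duration: begin() marks the start, commit() marks the end.
    @Name("demo.Query")
    static class Query extends Event { String sql; }

    // Sampled: JFR invokes a registered hook once per period.
    @Name("demo.QueueDepth")
    @Period("100 ms")
    static class QueueDepth extends Event { int depth; }

    static List<RecordedEvent> record() throws Exception {
        Runnable hook = () -> {
            QueueDepth sample = new QueueDepth();
            sample.depth = 3; // placeholder value for this sketch
            sample.commit();
        };
        FlightRecorder.addPeriodicEvent(QueueDepth.class, hook);
        Path file = Files.createTempFile("event-kinds", ".jfr");
        try (Recording r = new Recording()) {
            r.enable(CacheMiss.class);
            r.enable(Query.class);
            r.enable(QueueDepth.class);
            r.start();

            CacheMiss miss = new CacheMiss();
            miss.key = "user:42";
            miss.commit();

            Query query = new Query();
            query.sql = "SELECT 1";
            query.begin();
            Thread.sleep(20); // stands in for real work
            query.commit();

            Thread.sleep(300); // give the periodic hook time to fire
            r.stop();
            r.dump(file);
            return RecordingFile.readAllEvents(file);
        } finally {
            FlightRecorder.removePeriodicEvent(hook);
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        for (RecordedEvent e : record()) {
            String name = e.getEventType().getName();
            if (name.startsWith("demo.")) {
                System.out.println(name + " duration=" + e.getDuration());
            }
        }
    }
}
```

The instant event should report a zero duration, while the duration event reports roughly the time spent between begin() and commit().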


Enabling JFR

Command-line (start recording from launch)

# The option list must stay one shell word: no spaces after the commas
java -XX:StartFlightRecording=disk=true,filename=/var/log/app/recording.jfr,duration=60s,settings=profile \
  -jar myapp.jar

Key parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| disk | Write events to disk continuously while recording | true (JDK 11+) |
| filename | Output file path | (auto-generated) |
| duration | Recording duration; omit for continuous | no limit |
| settings | Event profile: default (low overhead) or profile (more detail) | default |
| maxsize | Max size of data kept on disk before the oldest data is discarded | 250 MB if no limit is specified |
| maxage | Max age of retained data | no limit |

Always-on continuous recording

For production use, enable a continuous recording that retains only the most recent data (here, at most 6 hours or 100 MB, whichever limit is reached first):

java \
  -XX:StartFlightRecording=disk=true,\
maxsize=100m,\
maxage=6h,\
settings=default \
  -jar myapp.jar

When an incident occurs, dump the recent recording:

jcmd <pid> JFR.dump filename=/tmp/incident.jfr
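The same pattern can be driven from inside the application with the jdk.jfr.Recording API, for example to dump a recording from an error handler. This is a sketch under assumptions: the class name, the recording name "continuous", and the dump path are hypothetical, and the limits simply mirror the flags shown above.

```java
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

public class ContinuousRecording {

    // Starts a recording equivalent to
    // -XX:StartFlightRecording=disk=true,maxsize=100m,maxage=6h,settings=default
    static Recording startContinuous() throws Exception {
        Configuration config = Configuration.getConfiguration("default");
        Recording recording = new Recording(config);
        recording.setName("continuous");           // hypothetical name
        recording.setToDisk(true);
        recording.setMaxAge(Duration.ofHours(6));
        recording.setMaxSize(100L * 1024 * 1024);  // 100 MB
        recording.start();
        return recording;
    }

    // Writes a snapshot of the data collected so far; the recording keeps running.
    static Path dumpIncident(Recording recording, Path target) throws IOException {
        recording.dump(target);
        return target;
    }

    public static void main(String[] args) throws Exception {
        try (Recording recording = startContinuous()) {
            Thread.sleep(1000); // stands in for the application running
            Path dump = dumpIncident(recording, Files.createTempFile("incident", ".jfr"));
            System.out.println("Wrote " + Files.size(dump) + " bytes to " + dump);
        }
    }
}
```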

Controlling JFR with jcmd

jcmd is the JDK's diagnostic command tool. It sends commands to a running JVM process on the same machine:

# Find the JVM PID
jps -l
# 12345 com.example.myapp.Main

# List active JFR recordings
jcmd 12345 JFR.check

# Start a new recording
jcmd 12345 JFR.start name=myrecording settings=profile maxage=1h

# Dump current recording to file
jcmd 12345 JFR.dump filename=/tmp/dump.jfr

# Stop recording
jcmd 12345 JFR.stop name=myrecording filename=/tmp/final.jfr
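Where jcmd is not an option (for example, when management goes through JMX), the same start, check, stop and dump steps are available through FlightRecorderMXBean in the jdk.management.jfr module. The sketch below runs against the local JVM; the class name, the recordFor helper and the output path are hypothetical.

```java
import jdk.management.jfr.FlightRecorderMXBean;
import jdk.management.jfr.RecordingInfo;

import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class JfrViaJmx {

    static Path recordFor(long millis, Path target) throws Exception {
        FlightRecorderMXBean jfr =
                ManagementFactory.getPlatformMXBean(FlightRecorderMXBean.class);

        // Roughly: jcmd <pid> JFR.start name=myrecording settings=profile
        long id = jfr.newRecording();
        jfr.setPredefinedConfiguration(id, "profile");
        jfr.setRecordingOptions(id, Map.of("name", "myrecording"));
        jfr.startRecording(id);

        // Roughly: jcmd <pid> JFR.check
        for (RecordingInfo info : jfr.getRecordings()) {
            System.out.println(info.getId() + " " + info.getName() + " " + info.getState());
        }

        Thread.sleep(millis); // stands in for the period being observed

        // Roughly: jcmd <pid> JFR.stop name=myrecording filename=<target>
        jfr.stopRecording(id);
        jfr.copyTo(id, target.toString());
        jfr.closeRecording(id);
        return target;
    }

    public static void main(String[] args) throws Exception {
        Path out = recordFor(500, Files.createTempFile("jmx-recording", ".jfr"));
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```

When the bean is obtained over a remote JMX connection instead, copyTo writes the file on the machine where the target JVM runs.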

JFR Configuration Files (settings)

JFR reads its event settings from XML files with a .jfc extension. The JDK ships two built-in profiles:

  • default — low overhead (~1%), suitable for always-on production monitoring
  • profile — higher detail (~2%), for investigating a known issue

Locate the built-in profiles:

ls $JAVA_HOME/lib/jfr/
# default.jfc  profile.jfc

Customising an event profile

# Copy the default profile and edit it
cp $JAVA_HOME/lib/jfr/default.jfc my-profile.jfc

# Use the custom profile
java -XX:StartFlightRecording=settings=my-profile.jfc -jar myapp.jar

Example: enable object allocation profiling (disabled in default, enabled in profile):

<!-- my-profile.jfc — enable allocation events -->
<event name="jdk.ObjectAllocationInNewTLAB">
  <setting name="enabled">true</setting>
</event>
<event name="jdk.ObjectAllocationOutsideTLAB">
  <setting name="enabled">true</setting>
</event>
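The same override can be applied in code rather than by editing a .jfc file. In the settings map used by the jdk.jfr API, each key has the form eventName#settingName. This is a sketch that assumes the two allocation events named above exist in the running JDK; the class and method names are hypothetical.

```java
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

import java.util.HashMap;
import java.util.Map;

public class AllocationSettings {

    // Starts from the built-in "default" profile and switches the allocation events on.
    static Map<String, String> withAllocationEvents() throws Exception {
        Map<String, String> settings =
                new HashMap<>(Configuration.getConfiguration("default").getSettings());
        settings.put("jdk.ObjectAllocationInNewTLAB#enabled", "true");
        settings.put("jdk.ObjectAllocationOutsideTLAB#enabled", "true");
        return settings;
    }

    public static void main(String[] args) throws Exception {
        try (Recording recording = new Recording(withAllocationEvents())) {
            recording.start();
            System.out.println("In-TLAB allocation events enabled: "
                    + recording.getSettings().get("jdk.ObjectAllocationInNewTLAB#enabled"));
        }
    }
}
```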

Key JFR Events

CPU and method profiling

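| Event | Description |
| --- | --- |
| jdk.ExecutionSample | Stack trace samples of threads running Java code (every 20 ms with default settings, 10 ms with profile) |
| jdk.NativeMethodSample | Stack trace samples of threads running native code |
| jdk.CPULoad | JVM and system CPU usage |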

Memory and allocation

| Event | Description |
| --- | --- |
| jdk.ObjectAllocationInNewTLAB | Object allocated in a new TLAB |
| jdk.ObjectAllocationOutsideTLAB | Large object allocation outside TLAB |
| jdk.GarbageCollection | GC pause details |
| jdk.G1HeapSummary | G1 heap summary after each GC |
| jdk.OldObjectSample | Objects alive on the heap for a long time (memory leak candidates) |

Thread and synchronisation

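| Event | Description |
| --- | --- |
| jdk.ThreadStart, jdk.ThreadEnd | Thread lifecycle |
| jdk.JavaMonitorWait | Time spent waiting in Object.wait() on a monitor |
| jdk.JavaMonitorEnter | Time spent blocked entering a contended synchronized block or method (lock contention) |
| jdk.ThreadPark | Time spent parked in LockSupport.park() |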

I/O

| Event | Description |
| --- | --- |
| jdk.SocketRead, jdk.SocketWrite | Network I/O timing |
| jdk.FileRead, jdk.FileWrite | File I/O timing |
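A quick way to see which of these events a recording actually contains is to read the file with the jdk.jfr.consumer API and count events by type name. This is a minimal sketch; the class name and the command-line handling are hypothetical.

```java
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

public class EventCounts {

    // Streams through the file one event at a time, so large recordings
    // do not need to fit in memory.
    static Map<String, Long> countByType(Path recording) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        try (RecordingFile file = new RecordingFile(recording)) {
            while (file.hasMoreEvents()) {
                RecordedEvent event = file.readEvent();
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EventCounts.java <recording.jfr>");
            return;
        }
        countByType(Path.of(args[0]))
                .forEach((name, count) -> System.out.println(count + "\t" + name));
    }
}
```

Running it against a dump, for example java EventCounts.java /tmp/dump.jfr, prints one line per event type such as jdk.GarbageCollection or jdk.ExecutionSample.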

Analysing Recordings with JDK Mission Control

JDK Mission Control (JMC) is the GUI tool for analysing .jfr recordings. Download it separately from https://jdk.java.net/jmc/

# Open a recording
jmc recording.jfr

Key views in JMC:

  • Automated Analysis — JMC flags common issues (GC pressure, lock contention, allocation hot spots)
  • Method Profiling — flame graph of CPU time by method
  • Memory — heap usage, allocation rate, old object candidates
  • Threads — thread state timeline, lock contention
  • GC — GC pause timeline, heap after GC
  • I/O — network and file I/O latency distribution

Writing Custom JFR Events

You can create application-level events that appear in JFR recordings alongside JVM events. This is useful for correlating application-level actions (order processing, cache misses) with JVM behaviour.

import jdk.jfr.*;

@Name("com.example.OrderProcessed")
@Label("Order Processed")
@Category("Business Events")
@Description("Fired when an order is successfully processed")
public class OrderProcessedEvent extends jdk.jfr.Event {

    @Label("Order ID")
    public long orderId;

    @Label("Customer ID")
    public String customerId;

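    // A monetary amount in cents, so no @DataAmount annotation (that is for bytes or bits)
    @Label("Amount (cents)")
    public long amountCents;
}

// OrderService.java (separate file)
public class OrderService {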

    public void processOrder(Order order) {
        var event = new OrderProcessedEvent();
        event.begin();  // starts timing

        try {
            // ... business logic ...
            event.orderId     = order.getId();
            event.customerId  = order.getCustomerId();
            event.amountCents = order.getAmountCents();
        } finally {
            event.commit();  // records the event with duration
        }
    }
}

Custom events appear in JMC under your chosen category and can be correlated with GC pauses, CPU spikes, or lock contention in the same timeline.
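When populating an event's fields is itself expensive, the work can be skipped unless the event will actually be written. The sketch below uses Event.shouldCommit() for that; the event name demo.OrderProcessed, the process helper and the expensiveSummary method are hypothetical stand-ins for application code.

```java
import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;

public class GuardedCommit {

    @Name("demo.OrderProcessed")
    @Label("Order Processed")
    @Category("Business Events")
    static class OrderProcessed extends Event {
        @Label("Order ID") long orderId;
        @Label("Summary") String summary;
    }

    // Returns true if the event was written to a recording.
    static boolean process(long orderId) {
        OrderProcessed event = new OrderProcessed();
        event.begin();
        // ... business logic would run here ...
        event.end();
        // False when no recording is running, the event is disabled,
        // or the duration is below the event's threshold.
        if (event.shouldCommit()) {
            event.orderId = orderId;
            event.summary = expensiveSummary(orderId); // only computed when needed
            event.commit();
            return true;
        }
        return false;
    }

    // Stand-in for a costly computation.
    static String expensiveSummary(long orderId) {
        return "order-" + orderId;
    }

    public static void main(String[] args) {
        System.out.println("without recording: " + process(1));
        try (Recording recording = new Recording()) {
            recording.enable(OrderProcessed.class);
            recording.start();
            System.out.println("with recording:    " + process(2));
        }
    }
}
```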

Threshold-based events

Only record the event if it took longer than a threshold:

var event = new OrderProcessedEvent();
event.begin();
processInternal(order);
event.commit();  // event is silently dropped if duration < event's threshold setting

Configure the threshold in the settings file:

<event name="com.example.OrderProcessed">
  <setting name="enabled">true</setting>
  <setting name="threshold">10 ms</setting>  <!-- only record if > 10ms -->
</event>
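A default threshold can also be declared on the event class itself with the @Threshold annotation; a settings file can still override it. The following sketch uses a hypothetical event demo.SlowCall and a hypothetical timed helper, records one fast and one slow call, and then reads back which of them survived.

```java
import jdk.jfr.Event;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.Threshold;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ThresholdDemo {

    @Name("demo.SlowCall")
    @Threshold("10 ms") // default threshold; a .jfc setting can override it
    static class SlowCall extends Event { String label; }

    static void timed(String label, long sleepMillis) throws InterruptedException {
        SlowCall event = new SlowCall();
        event.label = label;
        event.begin();
        Thread.sleep(sleepMillis); // stands in for real work
        event.commit();            // dropped if the duration is under the threshold
    }

    static List<String> recordedLabels() throws Exception {
        Path file = Files.createTempFile("threshold", ".jfr");
        try (Recording recording = new Recording()) {
            recording.enable(SlowCall.class);
            recording.start();
            timed("fast", 0);
            timed("slow", 50);
            recording.stop();
            recording.dump(file);

            List<String> labels = new ArrayList<>();
            for (RecordedEvent e : RecordingFile.readAllEvents(file)) {
                if (e.getEventType().getName().equals("demo.SlowCall")) {
                    labels.add(e.getString("label"));
                }
            }
            return labels;
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(recordedLabels());
    }
}
```

Only the call that slept longer than the 10 ms threshold should appear in the output.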

Production JFR Checklist

# Recommended production JVM flags for JFR
java \
  -XX:StartFlightRecording=disk=true,\
maxsize=100m,\
maxage=6h,\
settings=default \
  -XX:FlightRecorderOptions=stackdepth=256 \
  -jar myapp.jar
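
| Concern | Recommendation |
| --- | --- |
| Overhead | Use settings=default (~1% overhead) |
| Retention | maxage=6h keeps the last 6 hours of events |
| File size | maxsize=100m prevents unbounded growth |
| Dump on incident | Run jcmd <pid> JFR.dump as soon as an issue is observed |
| Stack depth | stackdepth=256 raises the stack trace limit from the default of 64 frames, so deep stacks are less likely to be truncated |
| JVM flags | No unlock flag is needed on OpenJDK 11+ (Oracle JDK 8 required -XX:+UnlockCommercialFeatures); a recording must still be started with -XX:StartFlightRecording or jcmd |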

What’s Next

Next: Security: TLS 1.3, ChaCha20-Poly1305, and Curve25519 (JEP 329, 332, 324)