
Functional Chaos

Functional Chaos injects real Java failures into the application journey at targeted places in the code. Unlike Performance Chaos, which degrades the infrastructure, Functional Chaos degrades the code itself through four anomalies: a NullPointerException, a StackOverflowError, an OutOfMemoryError, and silent data corruption.

The pedagogical goal is for students to learn to identify the faulty method through APM (Tempo, Pyroscope) and to understand the exception type without access to the source code.

Service and endpoint

  • Class: FunctionalChaosService.java
  • Controller: FunctionalChaosController.java
  • Admin endpoint: POST /api/admin/chaos/functional with body {"level": 0-4}
  • Public endpoints: GET /api/chaos/public/functional/{status,logs,anomalies}

Levels

Level  Label         Active anomalies  Cumulative count
0      Disabled      none              0
1      Junior        F1                1
2      Intermediate  F1, F2            2
3      Expert        F1, F2, F3        3
4      Master        F1, F2, F3, F4    4

As with all other families, levels are cumulative: going from level 2 to level 3 adds F3 without disabling F1 and F2.
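The cumulative rule amounts to a threshold check per anomaly. The sketch below is a minimal illustration under that assumption, not the PerfShop implementation: the `Anomaly` enum and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each anomaly records the lowest level that enables it.
enum Anomaly {
    F1(1), F2(2), F3(3), F4(4);

    final int requiredLevel;

    Anomaly(int requiredLevel) {
        this.requiredLevel = requiredLevel;
    }

    // An anomaly is active as soon as the current level reaches its threshold,
    // which is what makes the levels cumulative.
    boolean isActiveAt(int level) {
        return level >= requiredLevel;
    }

    // Lists every anomaly active at the given level (level 0 yields an empty list).
    static List<Anomaly> activeAt(int level) {
        List<Anomaly> active = new ArrayList<>();
        for (Anomaly anomaly : values()) {
            if (anomaly.isActiveAt(level)) {
                active.add(anomaly);
            }
        }
        return active;
    }
}
```

With this shape, `Anomaly.activeAt(3)` contains F1, F2 and F3, matching the Expert row of the table.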

F1 — Payment NullPointerException

  • Required level: 1+ (Junior)
  • Injection method: applyF1PaymentNpe(String paymentMethod)
  • Metric: chaos_functional_f1_npe
  • Counter: cntF1

Simulated anomaly

NullPointerException raised in the payment service (processPaymentPublic) — every order fails with HTTP 500. The exact exception message is:

Payment gateway configuration is null — service 'PaymentGatewayConfig'
not initialized (injected fault)

Observable symptoms

  • Every order returns HTTP 500 on POST /api/orders
  • Tempo: span OrderService.createOrder → child OrderService.processPaymentPublic marked ERROR with exception.type = java.lang.NullPointerException
  • Loki logs: [FunctionalChaos][F1] NullPointerException injected into processPaymentPublic — method={paymentMethod}
  • Activity log: F1_NPE entry with severity ERROR

Associated pedagogy

Diagnosis of a non-trivial root cause from the stack trace shown in Tempo. Students must recognize that the NPE comes from an injected dependency that is null, not from the business code: a classic case of missing external configuration.
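A level-gated injection of this kind could look like the sketch below. It is an assumption-laden illustration, not the actual FunctionalChaosService: the class name, the `setLevel` and `f1Count` helpers, and the choice to join the documented message into a single line are all hypothetical. Only the method name, the counter name and the message text come from this page.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a level-gated fault injector.
class F1InjectionSketch {
    private volatile int level = 0;                    // current chaos level (0-4)
    private final AtomicLong cntF1 = new AtomicLong(); // mirrors the documented cntF1 counter

    void setLevel(int level) {
        this.level = level;
    }

    long f1Count() {
        return cntF1.get();
    }

    // Throws the documented NullPointerException when level >= 1; otherwise does nothing.
    void applyF1PaymentNpe(String paymentMethod) {
        if (level < 1) {
            return;
        }
        cntF1.incrementAndGet();
        // The Unicode escape in the string encodes the dash of the documented message.
        throw new NullPointerException(
            "Payment gateway configuration is null \u2014 service 'PaymentGatewayConfig' "
            + "not initialized (injected fault)");
    }
}
```

Because the exception is thrown inside the injection method, the top frame of the stack trace points at the injector rather than at business code, which is the clue students are expected to find.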

F2 — Order calculation StackOverflowError

  • Required level: 2+ (Intermediate)
  • Injection method: applyF2CalculationStackOverflow(String orderRef)
  • Metric: chaos_functional_f2_stackoverflow

Simulated anomaly

StackOverflowError raised in the order total calculation (calculateOrderTotal) through pure infinite recursion:

private void triggerRecursion(int depth) {
    // No base case: each call adds a frame until the thread stack is exhausted
    triggerRecursion(depth + 1);
}

Observable symptoms

  • HTTP 500 on POST /api/orders (like F1, but a different cause)
  • Tempo: ERROR span with a repetitive stack trace — the same triggerRecursion frame appears 50+ times in the truncated stack
  • Loki logs: [FunctionalChaos][F2] StackOverflowError injected into calculateOrderTotal — ref={orderRef}
  • Activity log: F2_STACKOVERFLOW severity ERROR

Associated pedagogy

Differential diagnosis versus F1: both produce HTTP 500 on /api/orders, but the signature in Tempo is radically different (NPE = short stack with a clear cause; SOE = ultra-long stack with recursion). Students must be able to read a stack trace to distinguish the two error families.
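The recursion signature can also be checked programmatically. The sketch below reproduces the infinite recursion locally and counts how many captured frames carry the same method name; the class and helper names are hypothetical, and the number of frames the JVM keeps in a stack trace depends on its configuration.

```java
// Hypothetical sketch: reproduces the recursion and counts repeated frames,
// the same pattern a student would spot in the stack trace shown in Tempo.
class RecursionSignatureSketch {
    private static void triggerRecursion(int depth) {
        triggerRecursion(depth + 1);
    }

    // Counts how many frames of the captured stack trace belong to methodName.
    static int countFrames(Throwable error, String methodName) {
        int count = 0;
        for (StackTraceElement frame : error.getStackTrace()) {
            if (frame.getMethodName().equals(methodName)) {
                count++;
            }
        }
        return count;
    }

    // Triggers the overflow and returns the number of triggerRecursion frames captured.
    static int recursionFrameCount() {
        try {
            triggerRecursion(0);
            return 0; // never reached in practice: the call above always overflows
        } catch (StackOverflowError e) {
            return countFrames(e, "triggerRecursion");
        }
    }
}
```

An NPE like F1 would yield a count of zero or one for any given method, whereas the overflow yields the same frame many times over.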

F3 — Catalog OutOfMemoryError

  • Required level: 3+ (Expert)
  • Injection method: applyF3CatalogOom()
  • Metric: chaos_functional_f3_oom

Simulated anomaly

Massive heap allocation in getAllProducts() until an OutOfMemoryError is triggered:

java.util.ArrayList<byte[]> leak = new java.util.ArrayList<>();
while (true) {
    // Every chunk stays referenced by the list, so the GC can never reclaim it
    leak.add(new byte[8 * 1024 * 1024]); // 8 MB per iteration
}
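As a rough order of magnitude, the number of iterations before the error is bounded by the heap size divided by the 8 MB chunk size. The sketch below computes that upper bound; the class and method names are hypothetical, and the real count is lower because the application already occupies part of the heap.

```java
// Hypothetical sketch: upper bound on loop iterations before the heap is exhausted.
class OomBudgetSketch {
    static final long CHUNK_BYTES = 8L * 1024 * 1024; // 8 MB, the chunk size of the allocation loop

    // Number of whole 8 MB chunks that fit in a heap of the given size.
    static long maxIterations(long maxHeapBytes) {
        return maxHeapBytes / CHUNK_BYTES;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory(); // maximum heap this JVM will attempt to use
        System.out.println("At most " + maxIterations(maxHeap) + " iterations before OutOfMemoryError");
    }
}
```

For a 512 MB heap the bound is 64 iterations, so the error is reached almost immediately once the catalog endpoint is called.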

Observable symptoms

  • The product catalog (GET /api/products) becomes unreachable
  • Automatic heap dump if -XX:+HeapDumpOnOutOfMemoryError is active. PerfShop also exposes on-demand heap dumps through /actuator/heapdump; this endpoint is excluded from chaos so that it remains functional
  • Tempo: ERROR span with exception.type = java.lang.OutOfMemoryError
  • JVM metrics: jvm_memory_used_bytes{area="heap"} saturated, jvm_gc_pause_seconds spikes just before the crash

Associated pedagogy

Generating a heap dump and analyzing it in Eclipse MAT: students identify the dominant byte[] through the Dominator Tree and trace the reference chain back to applyF3CatalogOom. This is the flagship scenario of the PerfShop "heap dump" journey.

F4 — Silent product data corruption

  • Required level: 4 (Master)
  • Injection method: applyF4DataCorruption(Product product)
  • Metric: chaos_functional_f4_corruption

Simulated anomaly

This is the trickiest anomaly in functional chaos. No exception is raised. The method returns a corrupted Product object:

Field                                 Applied transformation
price                                 Multiplied by 1.5 (rounded HALF_UP, scale 2)
stock                                 Forced to 0
description                           Truncated to 30 characters + suffix [ERROR: truncated data]
id, name, category, imageUrl, dates   Unchanged (keeps visual identity)
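The transformations in the table can be sketched as follows. This is an illustration only: `ProductSketch` is a hypothetical, reduced stand-in for the real Product class, and two details are assumptions not stated on this page, namely that the suffix is separated from the truncated text by a space and that descriptions shorter than 30 characters are kept whole.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical, reduced stand-in for the Product entity.
final class ProductSketch {
    final long id;
    final BigDecimal price;
    final int stock;
    final String description;

    ProductSketch(long id, BigDecimal price, int stock, String description) {
        this.id = id;
        this.price = price;
        this.stock = stock;
        this.description = description;
    }
}

final class F4CorruptionSketch {
    // Assumption: a single space separates the truncated text from the suffix.
    static final String SUFFIX = " [ERROR: truncated data]";

    // Returns a corrupted copy; the id is left untouched, as in the table.
    static ProductSketch corrupt(ProductSketch p) {
        BigDecimal price = p.price
            .multiply(new BigDecimal("1.5"))
            .setScale(2, RoundingMode.HALF_UP);
        int cut = Math.min(30, p.description.length());
        String description = p.description.substring(0, cut) + SUFFIX;
        return new ProductSketch(p.id, price, 0, description);
    }
}
```

For example, a price of 19.99 becomes 29.985 after multiplication and 29.99 after HALF_UP rounding to two decimals.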

Observable symptoms

  • HTTP 200 on GET /api/products/{id} — no error
  • Tempo green — no exception propagated
  • No error metric moves
  • Only way to diagnose: manual payload inspection — compare returned prices to expected prices, check that descriptions are not truncated, and that stock is not systematically zero
  • Loki logs (the only hint): [FunctionalChaos][F4] Silent product corruption id={id} name='{name}' — price {old} -> {new}, stock {old} -> 0

Associated pedagogy

This is the anomaly that demonstrates the limits of automatic observability. APM, monitoring, alerting: everything is green. Diagnosing it requires either an assertion test suite (Robot Framework, Cucumber) or a manual database analysis. Ideal for discussing the functional coverage of automated tests.
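An assertion-style check of the kind such a test suite would run can be sketched in a few lines. The class and method names below are hypothetical, and the expected price is assumed to come from a trusted reference such as a fixture; a zero stock alone is only a hint, since a product can legitimately be out of stock.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: compares a returned product payload against a trusted reference.
final class PayloadCheckSketch {
    static final String SUFFIX = "[ERROR: truncated data]";

    // Returns human-readable findings; an empty list means nothing suspicious was seen.
    static List<String> findings(BigDecimal expectedPrice, BigDecimal actualPrice,
                                 int actualStock, String actualDescription) {
        List<String> findings = new ArrayList<>();
        // compareTo ignores scale, so 19.99 and 19.990 are considered equal.
        if (expectedPrice.compareTo(actualPrice) != 0) {
            findings.add("price " + actualPrice + " differs from expected " + expectedPrice);
        }
        if (actualStock == 0) {
            findings.add("stock is zero (suspicious if it happens for every product)");
        }
        if (actualDescription.endsWith(SUFFIX)) {
            findings.add("description carries the truncation suffix");
        }
        return findings;
    }
}
```

Run against every product returned by GET /api/products/{id}, a check like this turns the manual payload inspection into a repeatable test.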

API — public endpoints

Endpoint                                             Description
GET /api/chaos/public/functional/status              Current level + counters F1–F4
GET /api/chaos/public/functional/logs                Activity log (last 200 entries)
GET /api/chaos/public/functional/anomalies?level=N   Pedagogical catalog for level N

All these endpoints are available without authentication — they feed the instructor's real-time monitoring dashboard.
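A dashboard client could build the catalog request as sketched below with the JDK HTTP client. The class and method names are hypothetical, and the base URL is taken from the curl examples on this page.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: builds the unauthenticated request for the level-N catalog.
final class PublicChaosApiSketch {
    // Base URL as used in the curl examples on this page.
    static final String BASE = "https://perfshop-api.perfshop.io";

    static HttpRequest anomaliesRequest(int level) {
        if (level < 0 || level > 4) {
            throw new IllegalArgumentException("level must be between 0 and 4");
        }
        URI uri = URI.create(BASE + "/api/chaos/public/functional/anomalies?level=" + level);
        return HttpRequest.newBuilder(uri).GET().build();
    }
}
```

Sending it with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` returns the body as a string. The response schema is not described here, so the sketch does not attempt to parse it.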

API — admin endpoints

# Enable Expert level
curl -X POST https://perfshop-api.perfshop.io/api/admin/chaos/functional \
  -H "X-Admin-Token: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"level": 3}'

# Reset (all anomalies disabled + counters set to 0)
curl -X POST https://perfshop-api.perfshop.io/api/admin/chaos/functional/reset \
  -H "X-Admin-Token: $TOKEN"

# Clear the activity log
curl -X POST https://perfshop-api.perfshop.io/api/admin/chaos/functional/logs/clear \
  -H "X-Admin-Token: $TOKEN"

Student activation

POST /api/chaos/student/functional body {"level": N} — requires student mode and a valid license for level > 0. Without a license, it returns HTTP 402 LICENSE_REQUIRED.
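A student-side client needs to tell a successful activation apart from the license error. The sketch below shows one way to do it; the class and method names are hypothetical, the Content-Type header is assumed by analogy with the admin curl example, and any session or authentication headers that student mode may require are not shown because they are not documented here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: builds the student activation request and interprets the status code.
final class StudentActivationSketch {
    static HttpRequest activationRequest(String baseUrl, int level) {
        String body = "{\"level\": " + level + "}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/chaos/student/functional"))
            .header("Content-Type", "application/json") // assumption: same as the admin endpoint
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    // Maps the HTTP status to a short outcome label.
    static String describe(int statusCode) {
        if (statusCode == 402) {
            return "LICENSE_REQUIRED";
        }
        if (statusCode >= 200 && statusCode < 300) {
            return "OK";
        }
        return "UNEXPECTED_" + statusCode;
    }
}
```

Treating 402 as its own outcome lets the client show a license prompt instead of a generic error.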