Functional Chaos
Functional Chaos injects real Java exceptions and errors, plus one silent data corruption, into the application journey at targeted places in the code. Unlike Performance Chaos, which degrades the infrastructure, this family degrades the code itself: NullPointerException (NPE), StackOverflowError, OutOfMemoryError, and silent data corruption.
The pedagogical goal is for students to learn to identify the faulty method through APM (Tempo, Pyroscope) and to understand the exception type without access to the source code.
Service and endpoint
Class: FunctionalChaosService.java
Controller: FunctionalChaosController.java
Admin endpoint: POST /api/admin/chaos/functional body {"level": 0-4}
Public endpoint: GET /api/chaos/public/functional/{status,logs,anomalies}
Levels
| Level | Label | Active anomalies | Cumulative count |
|---|---|---|---|
| 0 | Disabled | none | 0 |
| 1 | Junior | F1 | 1 |
| 2 | Intermediate | F1, F2 | 2 |
| 3 | Expert | F1, F2, F3 | 3 |
| 4 | Master | F1, F2, F3, F4 | 4 |
As with all other families, levels are cumulative: going from level 2 to level 3 adds F3 without disabling F1 and F2.
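The cumulative rule can be sketched in a few lines of Java. This is only an illustration of the table above: the class name `FunctionalChaosLevels`, the `Anomaly` enum and the `activeAnomalies` method are hypothetical and are not taken from `FunctionalChaosService.java`.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical model of the cumulative level table; not PerfShop's actual code. */
class FunctionalChaosLevels {

    /** The four anomalies, each tagged with the first level that enables it. */
    enum Anomaly {
        F1(1), F2(2), F3(3), F4(4);

        final int requiredLevel;

        Anomaly(int requiredLevel) {
            this.requiredLevel = requiredLevel;
        }
    }

    /** Returns every anomaly whose required level is at or below the given level. */
    static Set<Anomaly> activeAnomalies(int level) {
        if (level < 0 || level > 4) {
            throw new IllegalArgumentException("level must be between 0 and 4, got " + level);
        }
        Set<Anomaly> active = EnumSet.noneOf(Anomaly.class);
        for (Anomaly anomaly : Anomaly.values()) {
            if (level >= anomaly.requiredLevel) {
                active.add(anomaly);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        // Prints one line per level, mirroring the "Active anomalies" column.
        for (int level = 0; level <= 4; level++) {
            System.out.println(level + " -> " + activeAnomalies(level));
        }
    }
}
```

Because the check is `level >= requiredLevel`, raising the level can only add anomalies, which matches the behavior described for the move from level 2 to level 3.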
F1: Payment NullPointerException
Required level: 1+ (Junior)
Injection method: applyF1PaymentNpe(String paymentMethod)
Metric: chaos_functional_f1_npe
Counter: cntF1
Simulated anomaly
A NullPointerException is raised in the payment service (processPaymentPublic), so every order fails with HTTP 500. The exact exception message is:
Payment gateway configuration is null — service 'PaymentGatewayConfig'
not initialized (injected fault)
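A level-gated injection of this kind could look roughly like the sketch below. Only the method name `applyF1PaymentNpe`, the counter name `cntF1` and the message text come from this page; the class name, the `setLevel` and `injectedCount` helpers, and the assumption that the two message lines are joined by a single space are illustrative guesses, not the real implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a level-gated NPE injection; not PerfShop's actual code. */
class F1InjectionSketch {

    // \u2014 is the dash used in the documented message. Joining the two
    // documented lines with a single space is an assumption.
    static final String MESSAGE =
            "Payment gateway configuration is null \u2014 service 'PaymentGatewayConfig' "
            + "not initialized (injected fault)";

    private volatile int level = 0;
    private final AtomicLong cntF1 = new AtomicLong();

    void setLevel(int level) {
        this.level = level;
    }

    long injectedCount() {
        return cntF1.get();
    }

    /** Does nothing below level 1; otherwise counts the injection and throws. */
    void applyF1PaymentNpe(String paymentMethod) {
        if (level < 1) {
            return;
        }
        cntF1.incrementAndGet();
        throw new NullPointerException(MESSAGE);
    }

    public static void main(String[] args) {
        F1InjectionSketch chaos = new F1InjectionSketch();
        chaos.setLevel(1);
        try {
            chaos.applyF1PaymentNpe("CARD"); // "CARD" is a made-up payment method
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```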
Observable symptoms
- Every order returns HTTP 500 on POST /api/orders
- Tempo: span OrderService.createOrder → child OrderService.processPaymentPublic marked ERROR with exception.type = java.lang.NullPointerException
- Loki logs: [FunctionalChaos][F1] NullPointerException injected into processPaymentPublic — method={paymentMethod}
- Activity log: F1_NPE entry with severity ERROR
Associated pedagogy
Diagnosing a non-trivial root cause from the stack trace in Tempo. Students must identify that the NPE comes from an injected dependency that is null, not from the business code: a classic case of missing external configuration.
F2: Order calculation StackOverflowError
Required level: 2+ (Intermediate)
Injection method: applyF2CalculationStackOverflow(String orderRef)
Metric: chaos_functional_f2_stackoverflow
Simulated anomaly
A StackOverflowError is raised in the order total calculation (calculateOrderTotal) through pure infinite recursion.
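As an illustration of what "pure infinite recursion" means here, the sketch below recurses with no base case until the JVM throws. The frame name `triggerRecursion` is the one reported in the Tempo stack trace; its signature, the enclosing class and the `overflows` helper are assumptions made for this example.

```java
/** Hypothetical sketch of unbounded recursion; not PerfShop's actual code. */
class F2RecursionSketch {

    /** Recurses with no base case, so the thread stack eventually overflows. */
    static long triggerRecursion(long depth) {
        return 1 + triggerRecursion(depth + 1);
    }

    /** Returns true when the unbounded recursion ended in a StackOverflowError. */
    static boolean overflows() {
        try {
            triggerRecursion(0);
            return false; // only reached if the recursion somehow returned
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("StackOverflowError raised: " + overflows());
    }
}
```

The sketch catches the error only so that it can run standalone; for the anomaly to surface as HTTP 500, the error has to propagate out of the request handling instead.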
Observable symptoms
- HTTP 500 on POST /api/orders (like F1, but a different cause)
- Tempo: ERROR span with a repetitive stack trace: the same triggerRecursion frame appears 50+ times in the truncated stack
- Loki logs: [FunctionalChaos][F2] StackOverflowError injected into calculateOrderTotal — ref={orderRef}
- Activity log: F2_STACKOVERFLOW entry with severity ERROR
Associated pedagogy
Differential diagnosis versus F1: both produce HTTP 500 on /api/orders, but the signature in Tempo is radically different (NPE = short stack with a clear cause; SOE = ultra-long stack with recursion). Students must be able to read a stack trace to distinguish the two error families.
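The difference between the two signatures can also be checked mechanically. The sketch below counts how often the most frequent frame appears in a stack trace and uses the "50+ repetitions" figure quoted in the F2 symptoms as a threshold. The class and method names are hypothetical, and the threshold is a teaching heuristic rather than a rule used by PerfShop or Tempo.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that tells a recursive stack trace from a linear one. */
class StackSignature {

    /** Returns how many times the most frequent class.method frame occurs. */
    static int maxFrameRepeats(Throwable t) {
        Map<String, Integer> counts = new HashMap<>();
        int max = 0;
        for (StackTraceElement frame : t.getStackTrace()) {
            String key = frame.getClassName() + "." + frame.getMethodName();
            int seen = counts.merge(key, 1, Integer::sum);
            max = Math.max(max, seen);
        }
        return max;
    }

    /** Labels the trace "recursive" when one frame repeats at least 50 times. */
    static String classify(Throwable t) {
        return maxFrameRepeats(t) >= 50 ? "recursive" : "linear";
    }

    private static int recurse(int depth) {
        return 1 + recurse(depth + 1);
    }

    /** Provokes and returns a StackOverflowError to use as sample input. */
    static StackOverflowError provokeStackOverflow() {
        try {
            recurse(0);
        } catch (StackOverflowError e) {
            return e;
        }
        throw new IllegalStateException("unbounded recursion returned");
    }

    public static void main(String[] args) {
        System.out.println("NPE: " + classify(new NullPointerException("sample")));
        System.out.println("SOE: " + classify(provokeStackOverflow()));
    }
}
```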
F3: Catalog OutOfMemoryError
Required level: 3+ (Expert)
Injection method: applyF3CatalogOom()
Metric: chaos_functional_f3_oom
Simulated anomaly
Massive heap allocation in getAllProducts() until an OutOfMemoryError is triggered:
java.util.ArrayList<byte[]> leak = new java.util.ArrayList<>();
while (true) {
    leak.add(new byte[8 * 1024 * 1024]); // 8 MB per iteration, never released
}
Observable symptoms
- The product catalog (GET /api/products) becomes unreachable
- Automatic heap dump if -XX:+HeapDumpOnOutOfMemoryError is active (PerfShop also exposes heap dumps through /actuator/heapdump; this endpoint is excluded from chaos to remain functional)
- Tempo: ERROR span with exception.type = java.lang.OutOfMemoryError
- JVM metrics: jvm_memory_used_bytes{area="heap"} saturated, jvm_gc_pause_seconds explodes just before the crash
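To see the heap signal from inside the JVM, the standard `java.lang.management` API can be queried directly. The sketch below is a bounded variant of the allocation loop: it retains a handful of small arrays and reads heap usage before and after. It assumes that the heap metric above is ultimately derived from the same JVM memory beans; the class `HeapProbe` and its methods are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical probe that reads heap usage through the standard MXBean. */
class HeapProbe {

    /** Returns the number of heap bytes currently in use. */
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Allocates and retains a fixed number of arrays (bounded, unlike F3). */
    static List<byte[]> retain(int chunks, int chunkBytes) {
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; i < chunks; i++) {
            retained.add(new byte[chunkBytes]);
        }
        return retained;
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        List<byte[]> retained = retain(4, 1024 * 1024); // 4 x 1 MiB, kept reachable
        long after = usedHeapBytes();
        System.out.println("chunks retained: " + retained.size());
        System.out.println("heap used before: " + before + " bytes, after: " + after + " bytes");
    }
}
```

The exact numbers depend on the garbage collector and on what else the JVM is doing, so the before/after values are indicative only.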
Associated pedagogy
Generating a heap dump and analyzing it with Eclipse MAT: students identify the dominant byte[] allocations through the Dominator Tree and trace the reference chain back to applyF3CatalogOom. This is the flagship scenario of the PerfShop "heap dump" journey.
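Outside of the automatic dump and the actuator endpoint, a HotSpot JVM can also write a dump on demand through `HotSpotDiagnosticMXBean`. The sketch below is a generic illustration, assuming a HotSpot-based JDK; the class name, the temporary directory and the file naming are hypothetical and unrelated to how PerfShop produces its dumps.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical on-demand heap dump helper for HotSpot-based JVMs. */
class HeapDumpSketch {

    /** Writes a .hprof file into a fresh temporary directory and returns its path. */
    static Path dumpHeapToTempDir() {
        try {
            Path directory = Files.createTempDirectory("heapdump-sketch");
            // The target file must not exist yet and must end with .hprof.
            Path target = directory.resolve("sketch-" + System.nanoTime() + ".hprof");
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(target.toString(), true); // true = only reachable objects
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dump = dumpHeapToTempDir();
        System.out.println("heap dump written to " + dump);
    }
}
```

The resulting file can be opened in Eclipse MAT in the same way as a dump produced by the JVM flag.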
F4: Silent product data corruption
Required level: 4 (Master)
Injection method: applyF4DataCorruption(Product product)
Metric: chaos_functional_f4_corruption
Simulated anomaly
This is the trickiest anomaly in functional chaos. No exception is raised. The method returns a corrupted Product object:
| Field | Applied transformation |
|---|---|
| price | Multiplied by 1.5 (rounded HALF_UP, scale 2) |
| stock | Forced to 0 |
| description | Truncated to 30 characters + suffix [ERROR: truncated data] |
| id, name, category, imageUrl, dates | Unchanged (keeps visual identity) |
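The transformations in the table can be sketched as follows. The `Product` class here is a trimmed-down stand-in with only a few fields, and two details are assumptions: the suffix is appended directly after the truncated text with no separator, and descriptions shorter than 30 characters are kept whole but still receive the suffix. The real `applyF4DataCorruption` may differ on both points.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical sketch of the F4 field transformations; not PerfShop's actual code. */
class F4CorruptionSketch {

    static final String SUFFIX = "[ERROR: truncated data]";

    /** Trimmed-down, immutable stand-in for the real Product entity. */
    static final class Product {
        final long id;
        final String name;
        final BigDecimal price;
        final int stock;
        final String description;

        Product(long id, String name, BigDecimal price, int stock, String description) {
            this.id = id;
            this.name = name;
            this.price = price;
            this.stock = stock;
            this.description = description;
        }
    }

    /** Returns a corrupted copy: price x 1.5, stock 0, description truncated. */
    static Product applyF4DataCorruption(Product product) {
        BigDecimal corruptedPrice = product.price
                .multiply(new BigDecimal("1.5"))
                .setScale(2, RoundingMode.HALF_UP);
        String original = product.description == null ? "" : product.description;
        String truncated = original.substring(0, Math.min(30, original.length())) + SUFFIX;
        // id and name are copied unchanged so the product still looks familiar.
        return new Product(product.id, product.name, corruptedPrice, 0, truncated);
    }

    public static void main(String[] args) {
        Product mug = new Product(7L, "Mug", new BigDecimal("19.99"), 12,
                "A sturdy ceramic mug for hot and cold drinks");
        Product corrupted = applyF4DataCorruption(mug);
        System.out.println(corrupted.price + " / " + corrupted.stock + " / " + corrupted.description);
    }
}
```

Using `BigDecimal` with an explicit scale and rounding mode keeps the corrupted price looking like a normal two-decimal amount, which is part of what makes the anomaly hard to spot.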
Observable symptoms
- HTTP 200 on GET /api/products/{id}: no error
- Tempo green: no exception propagated
- No error metric moves
- Diagnosis relies on manual payload inspection: compare returned prices to expected prices, check that descriptions are not truncated and that stock is not systematically zero
- Loki logs (the only hint in the telemetry): [FunctionalChaos][F4] Silent product corruption id={id} name='{name}' — price {old} -> {new}, stock {old} -> 0
Associated pedagogy
This is the anomaly that demonstrates the limits of automatic observability. APM, monitoring, alerting: everything is green. Diagnosing it requires either an assertion test suite (Robot Framework, Cucumber) or a manual database analysis. Ideal for discussing the functional coverage of automated tests.
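The kind of assertion such a test suite would carry can be sketched in plain Java, independent of any test framework. The example compares a returned product payload against a trusted reference and lists every field that differs. The `Snapshot` type, the field selection and the sample values are hypothetical.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical functional check comparing a payload to a trusted reference. */
class ProductPayloadCheck {

    /** The fields of a product payload that the check cares about. */
    static final class Snapshot {
        final BigDecimal price;
        final int stock;
        final String description;

        Snapshot(BigDecimal price, int stock, String description) {
            this.price = price;
            this.stock = stock;
            this.description = description;
        }
    }

    /** Returns one message per field whose value differs from the reference. */
    static List<String> findDiscrepancies(Snapshot expected, Snapshot actual) {
        List<String> issues = new ArrayList<>();
        // compareTo ignores scale, so 19.9 and 19.90 count as equal.
        if (expected.price.compareTo(actual.price) != 0) {
            issues.add("price: expected " + expected.price + ", got " + actual.price);
        }
        if (expected.stock != actual.stock) {
            issues.add("stock: expected " + expected.stock + ", got " + actual.stock);
        }
        if (!expected.description.equals(actual.description)) {
            issues.add("description: expected '" + expected.description
                    + "', got '" + actual.description + "'");
        }
        return issues;
    }

    public static void main(String[] args) {
        Snapshot reference = new Snapshot(new BigDecimal("19.99"), 12, "A sturdy ceramic mug");
        Snapshot returned = new Snapshot(new BigDecimal("29.99"), 0,
                "A sturdy ceramic mug[ERROR: truncated data]");
        findDiscrepancies(reference, returned).forEach(System.out::println);
    }
}
```

A check like this only works if the reference values come from a source the chaos cannot touch, such as a fixture file or a direct database query.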
API: public endpoints
| Endpoint | Description |
|---|---|
| GET /api/chaos/public/functional/status | Current level + counters F1–F4 |
| GET /api/chaos/public/functional/logs | Activity log (last 200 entries) |
| GET /api/chaos/public/functional/anomalies?level=N | Pedagogical catalog for level N |
All these endpoints are available without authentication; they feed the instructor's real-time monitoring dashboard.
API: admin endpoints
# Enable Expert level
curl -X POST https://perfshop-api.perfshop.io/api/admin/chaos/functional \
-H "X-Admin-Token: $TOKEN" \
-H "Content-Type: application/json" \
-d '{"level": 3}'
# Reset (all anomalies disabled + counters set to 0)
curl -X POST https://perfshop-api.perfshop.io/api/admin/chaos/functional/reset \
-H "X-Admin-Token: $TOKEN"
# Clear the activity log
curl -X POST https://perfshop-api.perfshop.io/api/admin/chaos/functional/logs/clear \
-H "X-Admin-Token: $TOKEN"
Student activation
POST /api/chaos/student/functional with body {"level": N}: requires student mode and a valid license for level > 0. Without a license, it returns HTTP 402 LICENSE_REQUIRED.
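From a Java client, the student activation call could be prepared as in the sketch below, using the JDK's built-in `java.net.http` API (Java 11+). Several points are assumptions: the host is the one shown in the admin curl examples, no authentication header is added because none is documented for this endpoint, and any 2xx status is treated as success. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical client-side sketch for the student activation endpoint. */
class StudentActivationSketch {

    // Assumption: same host as in the admin curl examples.
    static final String BASE_URL = "https://perfshop-api.perfshop.io";

    /** Builds (but does not send) the POST request for the given level. */
    static HttpRequest buildRequest(int level) {
        String body = "{\"level\": " + level + "}";
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/api/chaos/student/functional"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** Turns an HTTP status code into a short human-readable outcome. */
    static String describe(int statusCode) {
        if (statusCode == 402) {
            return "LICENSE_REQUIRED: a valid license is needed for level > 0";
        }
        if (statusCode >= 200 && statusCode < 300) {
            return "level accepted"; // assumption: any 2xx means success
        }
        return "unexpected status " + statusCode;
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(2);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(describe(402));
    }
}
```

To actually send the request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and feed `response.statusCode()` to `describe`.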