Loki and Promtail¶
Loki is one of PerfShop's two log sinks (the other being OpenSearch — see opensearch.md). It runs in single-node mode, indexes only on labels (not full-text), and retains logs for 7 days. Promtail is the agent that collects logs from the Docker socket and pushes them to Loki.
Source of truth
This page is derived from loki/loki-config.yml and promtail/promtail-config.yml, and from the bind mounts declared for the perfshop-promtail service in the compose files.
Pipeline architecture¶
flowchart LR
subgraph sources["Log sources"]
direction TB
SOCK["/var/run/docker.sock<br/>(Docker Engine API)"]
JLOG["./jmeter/logs/jmeter.log<br/>(bind mount RO)"]
RFLOG["./test-runner/logs/*.log<br/>(bind mount RO)"]
end
PT["perfshop-promtail<br/>(grafana/promtail:latest)"]
LOKI[("perfshop-loki<br/>(grafana/loki:latest)<br/>retention 168h")]
GRAF["Grafana<br/>(Loki datasource)"]
SOCK -->|docker SD| PT
JLOG -->|file tail| PT
RFLOG -->|file tail| PT
PT -->|push API<br/>http://perfshop-loki:3100/loki/api/v1/push| LOKI
LOKI --> GRAF
Loki — configuration¶
Mode and storage¶
| Parameter | Value | Effect |
|---|---|---|
| auth_enabled | false | No authentication; Loki is only accessible from the internal Docker network |
| target | all | All modules (distributor, ingester, querier, query-frontend) run in the same process |
Storage and schema¶
common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h
| Parameter | Value | Effect |
|---|---|---|
| Storage backend | filesystem | Data stored in the named volume loki-data (/loki internally) |
| Index schema | v13 (TSDB) | Modern format, more efficient than the historical BoltDB schemas |
| Index period | 24h | A new index per day, prefix index_ |
| Replication factor | 1 | Single-node mode, no replication |
| Ring KV store | inmemory | No Consul or etcd |
Limits and retention¶
limits_config:
  retention_period: 168h        # 7 days
  ingestion_rate_mb: 4          # 4 MB/s per tenant
  ingestion_burst_size_mb: 8    # tolerated burst
  max_query_series: 500         # cap on the number of series returned by a query
compactor:
  working_directory: /loki/compactor
  retention_enabled: true
  retention_delete_delay: 2h
  compaction_interval: 10m
  delete_request_store: filesystem
| Parameter | Value | Effect |
|---|---|---|
| retention_period | 168h (7 days) | Automatic deletion by the compactor |
| ingestion_rate_mb | 4 MB/s | Ingestion rate limit |
| ingestion_burst_size_mb | 8 MB | Tolerance for spikes |
| max_query_series | 500 | Cap to avoid runaway queries |
| retention_enabled (compactor) | true | Enables automatic purge |
| compaction_interval | 10m | The compactor runs every 10 minutes |
| retention_delete_delay | 2h | Grace period before actual deletion |
Query cache¶
100 MB embedded cache to speed up repeated queries (useful for Grafana, which re-queries the same time ranges on each refresh).
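As an illustration, an embedded results cache of this size is typically declared as follows in a Loki configuration. This is a sketch: it assumes the cache sits under query_range.results_cache, and the exact keys used in loki/loki-config.yml may differ.

```yaml
# Sketch only: placement under query_range is an assumption
query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100   # the 100 MB mentioned above
```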
Ports¶
| Internal port | Usage |
|---|---|
| 3100 | HTTP listen — push, query, admin API (http_listen_port: 3100) |
| 9096 | gRPC listen — internal communication, unused in single-node (grpc_listen_port: 9096) |
The default host port is 19100 (variable LOKI_HTTP_PORT). The container internal port remains 3100 — only the host mapping changes.
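To make the port split concrete, here is a sketch of the two places involved. The server block uses the two values from the table above. The compose fragment assumes the service is keyed perfshop-loki and that LOKI_HTTP_PORT is applied with a shell-style default; neither detail is confirmed by this page.

```yaml
# loki/loki-config.yml: ports inside the container
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
---
# compose file (assumed layout): only the host side of the mapping is variable
services:
  perfshop-loki:
    ports:
      - "${LOKI_HTTP_PORT:-19100}:3100"
```

With this layout, overriding LOKI_HTTP_PORT changes where Loki is reachable from the host, while Promtail and Grafana keep using port 3100 on the Docker network.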
Promtail — configuration¶
Target and endpoint¶
Promtail pushes logs to Loki via the standard push API, using internal Docker DNS.
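A minimal sketch of the corresponding Promtail section, assuming the endpoint is declared under the standard clients key and using the push URL shown in the pipeline diagram:

```yaml
# promtail/promtail-config.yml (sketch)
clients:
  # perfshop-loki is resolved by Docker's embedded DNS on the shared network
  - url: http://perfshop-loki:3100/loki/api/v1/push
```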
Three scrape jobs¶
PerfShop declares three distinct Promtail jobs.
flowchart TB
subgraph p["Promtail"]
J1["Job perfshop-containers<br/>(docker SD)"]
J2["Job jmeter-log<br/>(file tail)"]
J3["Job rf-runner-log<br/>(file tail)"]
end
SOCK["/var/run/docker.sock"] --> J1
J1 -.filter.-> KEEP["perfshop-app<br/>perfshop-frontend<br/>perfshop-db<br/>perfshop-jmeter-ui"]
JFILE["/jmeter-logs/jmeter.log"] --> J2
RFFILES["/rf-logs/*.log"] --> J3
J1 --> LOKI[("Loki")]
J2 --> LOKI
J3 --> LOKI
Job 1 — perfshop-containers (Docker SD)¶
- job_name: perfshop-containers
  docker_sd_configs:
    - host: unix:///var/run/docker.sock
      refresh_interval: 5s
      filters:
        - name: name
          values:
            - perfshop-app
            - perfshop-frontend
            - perfshop-db
            - perfshop-jmeter-ui
Promtail queries the Docker API via the bind-mounted Unix socket every 5 seconds. The name filter restricts discovery to containers whose name matches one of the listed values, so only four containers are collected via Docker SD: the backend, the frontend, the MySQL database, and perfshop-jmeter-ui.
The other services (Grafana, Tempo, Squash TM, Forgejo, etc.) are not shipped to Loki; their logs only go to OpenSearch via Vector. This is intentional: Loki is sized for the "hot" logs used in teaching demos, while OpenSearch is the exhaustive sink for full-text search.
Relabeling:
relabel_configs:
  - source_labels: [__meta_docker_container_name]
    regex: /(.*)
    target_label: container
  - source_labels: [__meta_docker_container_name]
    regex: /(.*)
    target_label: job
  - source_labels: [container]
    regex: "perfshop-app|perfshop-frontend|perfshop-db|perfshop-jmeter-ui"
    action: keep
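Docker prefixes container names with / (/perfshop-app). The first two rules strip that prefix and expose the labels container=perfshop-app and job=perfshop-app directly in Loki. The final keep rule drops any discovered container whose stripped name is not exactly one of the four expected values: relabeling regexes are fully anchored, so a name that merely contains perfshop-app is not kept.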
Pipeline stages:
pipeline_stages:
  - docker: {}
  - match:
      selector: '{container="perfshop-app"}'
      stages:
        - regex:
            expression: '(?P<level>ERROR|WARN|INFO|DEBUG)'
        - labels:
            level:
  - match:
      selector: '{container="perfshop-app"}'
      stages:
        - multiline:
            firstline: '^\d{4}-\d{2}-\d{2}'
            max_wait_time: 3s
Three stages:
- docker: {} parses the JSON format of Docker logs and extracts time, stream, attrs, and log (the actual text).
- Level extraction for perfshop-app: an ERROR|WARN|INFO|DEBUG regex extracts the level and sets it as a Loki label. This is the label that enables the query {container="perfshop-app"} | level="ERROR".
- Multiline for Java stack traces: all lines that do not start with a timestamp (^\d{4}-\d{2}-\d{2}) are appended to the previous event. A Java stack trace therefore remains a single Loki event, which makes reading much more natural in Grafana.
Job 2 — jmeter-log (file tail)¶
- job_name: jmeter-log
  static_configs:
    - targets: [localhost]
      labels:
        job: perfshop-jmeter
        container: perfshop-jmeter
        __path__: /jmeter-logs/jmeter.log
Why a separate job? The perfshop-jmeter container runs tail -f /dev/null (idle). It writes nothing to standard output, so docker logs perfshop-jmeter is empty and the Docker SD of job 1 captures nothing. During a JMeter run, the engine writes to /jmeter-logs/jmeter.log inside the container, a path backed by the bind mount ./jmeter/logs:/jmeter-logs. The same host directory is mounted read-only into perfshop-promtail, so Promtail tails the file directly.
The pipeline applies the same level + multiline parsing as job 1, adapted to the JMeter format.
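The stages for this job could look like the sketch below. It is an assumption modeled on job 1, not a copy of promtail/promtail-config.yml: it relies on JMeter's default log layout, where each event starts with a date, and it places multiline first so that the level is read from the assembled event. The shipped file may order or tune the stages differently.

```yaml
# Hypothetical pipeline for the jmeter-log job
pipeline_stages:
  # Group continuation lines (stack traces) with the dated line that precedes them
  - multiline:
      firstline: '^\d{4}-\d{2}-\d{2}'
      max_wait_time: 3s
  # Extract the log level and promote it to a Loki label
  - regex:
      expression: '(?P<level>ERROR|WARN|INFO|DEBUG)'
  - labels:
      level:
```

No docker stage is needed in this sketch, since the file contains raw JMeter lines rather than Docker's JSON wrapper.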
Job 3 — rf-runner-log (file tail)¶
- job_name: rf-runner-log
  static_configs:
    - targets: [localhost]
      labels:
        job: perfshop-test-runner
        container: perfshop-test-runner
        __path__: /rf-logs/*.log
Same mechanism for Robot Framework and pytest, which write to /rf-logs/ (bind mount ./test-runner/logs). The *.log glob collects all log files produced by the runs.
The pipeline adds PASS and FAIL to the extracted levels, in addition to ERROR|WARN|INFO|DEBUG, because Robot Framework uses these tags for test results.
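A sketch of what the extended level extraction might look like, assuming it reuses the regex and labels stages of job 1 with two extra alternatives (the exact expression in promtail/promtail-config.yml may differ):

```yaml
# Hypothetical level extraction for the rf-runner-log job
pipeline_stages:
  - regex:
      # PASS and FAIL are treated as levels alongside the usual four
      expression: '(?P<level>ERROR|WARN|INFO|DEBUG|PASS|FAIL)'
  - labels:
      level:
```

A selector such as {container="perfshop-test-runner", level="FAIL"} would then isolate failed test results.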
Promtail container bind mounts¶
volumes:
  - ./promtail/promtail-config.yml:/etc/promtail/config.yml:ro
  - /var/run/docker.sock:/var/run/docker.sock:ro
  - ./jmeter/logs:/jmeter-logs:ro
  - ./test-runner/logs:/rf-logs:ro
All mounts are read-only on the Promtail side. The Docker socket is mounted to enable SD discovery; the two log directories are the same ones written to by the JMeter and Test Runner containers respectively.
LogQL examples¶
All the queries below are extracted from the Grafana dashboards actually shipped in grafana/dashboards/{eleves,formateurs}/dashboard-logs-*.json.
All backend logs¶
The | logfmt parser is used by the Instructor Logs dashboard. Spring Boot emits logs in key=value format when the logstash-logback-encoder encoder is active in Logback, which makes level, logger_name, message, etc. available as queryable fields.
ERROR-level logs only¶
{container="perfshop-app"} != "[BusinessChaos]" != "[BackendChaos]" != "[SecurityChaos]" != "[ChaosInterceptor]" != "[FrontendChaos]" != "[ChaosScripting]" |= "ERROR"
This is the query used by the Student Logs dashboard (Backend errors only panel). The != filters exclude the chaos engine's internal logs so as not to spoil the exercise for students; the final |= keeps only lines containing the string ERROR.
Logs from a specific chaos family¶
Used by the Instructor Logs dashboard. Each chaos family prefixes its logs with a tag in brackets, which makes filtering trivial.
Log volume by level (timeseries)¶
Combines count_over_time, which counts the log lines in each range interval (rate is its per-second counterpart), with a logfmt parser that dynamically extracts the level label. The sum by (level) aggregation lets ERROR, WARN, and INFO be overlaid in the same panel.
Nginx logs with HTTP 4xx or 5xx errors¶
count_over_time({container="perfshop-frontend"} |= " 4" [1m])
count_over_time({container="perfshop-frontend"} |= " 5" [1m])
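Simple approach: nginx writes the status code right after the quoted request line (HTTP/1.1" 404), so |= " 4" matches 4xx codes and |= " 5" matches 5xx codes. The filter is a plain substring match, so it can also catch other fields that begin with 4 or 5 after a space, such as a response size. No dedicated parser is used; this is sufficient for teaching purposes.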
MySQL logs with note exclusion¶
MySQL 8 logs a huge number of [note] lines at startup and during normal operations. The != excludes them to keep only [error] and [warning].
Volumes¶
| Volume | Mount | Content |
|---|---|---|
| loki-data (named volume) | /loki | Chunks, index, compactor working dir, WAL |
| ./loki/loki-config.yml (bind mount) | /etc/loki/local-config.yaml | Loki configuration (read-only) |
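In compose terms, the two mounts above would be declared roughly as follows. This is a sketch under assumptions: the service key perfshop-loki and the top-level volume declaration are inferred from the names used on this page, not copied from the compose files.

```yaml
services:
  perfshop-loki:
    image: grafana/loki:latest
    volumes:
      # Named volume: chunks, index, compactor working dir, WAL
      - loki-data:/loki
      # The image reads /etc/loki/local-config.yaml by default
      - ./loki/loki-config.yml:/etc/loki/local-config.yaml:ro

volumes:
  loki-data:
```

Mounting the file over the image's default configuration path avoids having to override the container command.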
Ports¶
| Service | Host port | Container port | Env variable |
|---|---|---|---|
| perfshop-loki | 19100 | 3100 | LOKI_HTTP_PORT |
| perfshop-promtail | (none) | (internal only) | (none) |
Promtail does not expose any port to the host: it pushes to Loki and does not need to be reachable from outside. Its internal HTTP port 9080 is only for Promtail's internal metrics (not scraped by PerfShop).
To go further¶
- Overview: global observability flow, Loki vs OpenSearch comparison
- Grafana: Loki datasource and tracesToLogsV2 correlation
- Shipped dashboards: Student and Instructor log panels
- OpenSearch and Vector: the other log sink (full-text)