OpenSearch and Vector¶

OpenSearch is PerfShop's second log sink, running in parallel with Loki. Loki indexes only labels and stores log content as unindexed text (a lightweight model, queried by filtering), whereas OpenSearch indexes every field, with full-text search on text fields, and supports rich aggregations and facets. Vector is the collection and transformation agent between the Docker logs and OpenSearch.

Source of truth

This page is derived from vector/vector.toml, opensearch/opensearch.yml, opensearch-seed/seed.py, and the perfshop-opensearch, perfshop-opensearch-dashboards, perfshop-vector, and perfshop-opensearch-seed blocks of the compose files.

Why two log sinks?¶

This is a legitimate question — collecting the same logs twice does not seem natural. The answer comes down to three points:

Pedagogical demonstration: students must be able to concretely compare the "label-based index" model (Loki) and the "full-text index" model (OpenSearch / Elasticsearch). Having both in parallel makes it possible to illustrate live, on the same real logs, the strengths and limitations of each approach.
Different use cases: Loki is unbeatable for fast filtering by container and level during a lab. OpenSearch is unbeatable for exploratory full-text search ("find all exceptions where the word connection appears, regardless of the service").
Grafana / OpenSearch Dashboards coupling: Loki is natively integrated into Grafana; OpenSearch has its own UI (OpenSearch Dashboards, a Kibana fork). Two UIs, two paradigms — the student sees both worlds.

Pinned versions¶

PerfShop pins OpenSearch and OpenSearch Dashboards to 2.13.0, and Vector to 0.38.0-alpine. The three versions are chosen together: Vector 0.38 provides a stable VRL syntax for the transform, and the 2.13 releases support the APIs used by the Python seed (_index_template on OpenSearch; saved_objects and its _import endpoint on OpenSearch Dashboards).

Architecture¶

flowchart LR
  SOCK["/var/run/docker.sock"]

  VEC["perfshop-vector<br/>(timberio/vector:0.38.0-alpine)"]

  OS["perfshop-opensearch<br/>(2.13.0)<br/>full-text indexing"]

  OSD["perfshop-opensearch-dashboards<br/>(2.13.0)<br/>Kibana-compatible UI"]

  SEED["perfshop-opensearch-seed<br/>(python:3.11-slim)<br/>one-shot"]

  SOCK -->|"docker_logs source"| VEC
  VEC -->|"VRL transform<br/>JSON parse +<br/>service_family routing"| VEC
  VEC -->|"elasticsearch sink<br/>bulk.index = perfshop-{family}"| OS

  OS --> OSD

  SEED -.->|"index templates +<br/>index patterns +<br/>dashboard import"| OS
  SEED -.->|"GET /api/status"| OSD

Vector — collection and transformation¶

Vector is the most technically interesting component of this stack. It works as a declarative TOML pipeline: sources → transforms → sinks.

Source — `docker_logs`¶

[sources.docker_logs]
type = "docker_logs"
docker_host = "unix:///var/run/docker.sock"
include_containers = [
  "perfshop-app",
  "perfshop-frontend",
  "perfshop-db",
  "perfshop-monitoring",
  "perfshop-chaos-admin",
  "perfshop-admin",
  "perfshop-jmeter",
  "perfshop-jmeter-ui",
  "perfshop-loki",
  "perfshop-promtail",
  "perfshop-tempo",
  "perfshop-pyroscope",
  "perfshop-prometheus",
  "perfshop-grafana",
  "perfshop-testmgmt",
  "perfshop-squash-db",
  "perfshop-selenium",
  "perfshop-test-runner",
  "perfshop-orchestrator",
  "perfshop-forgejo",
  "perfshop-scripts-ui",
  "perfshop-welcome",
  "perfshop-docs",
]

Vector reads logs through the Docker socket (bind-mounted into the container), exactly like Promtail. But where Promtail covers only 4 containers, Vector collects logs from the 23 containers listed above, spanning the application, observability and QA services. The one-shot services (*-seed) are excluded because they emit only a few lines at startup.

Pedagogical games hub container

The pedagogical games hub is included in the Vector sources because it is technically an nginx container like any other. No information about its URL, its port, or its Docker service name appears in the user documentation — only the technical log collection is mentioned here.

Transform — VRL (Vector Remap Language)¶

[transforms.enrich]
type = "remap"
inputs = ["docker_logs"]
source = '''
.container = replace(string!(.container_name), "/", "")

if exists(.timestamp) {
  ."@timestamp" = .timestamp
} else if exists(.time) {
  ."@timestamp" = .time
} else {
  ."@timestamp" = now()
}

parsed, err = parse_json(.message)
if err == null {
  if exists(parsed.level)        { .level        = string!(parsed.level) }
  if exists(parsed.logger_name)  { .logger       = string!(parsed.logger_name) }
  if exists(parsed.message)      { .msg          = string!(parsed.message) }
  if exists(parsed.chaos_family) { .chaos_family = string!(parsed.chaos_family) }
  if exists(parsed.chaos_level)  { .chaos_level  = string!(parsed.chaos_level) }
  if exists(parsed.scenario_id)  { .scenario_id  = string!(parsed.scenario_id) }
} else {
  .msg   = string!(.message)
  .level = "INFO"
}

del(.label)
del(.labels)
del(.host)
del(.source_type)

c = .container

.service_family = if c == "perfshop-app" {
  "spring"
} else if c == "perfshop-frontend" || c == "perfshop-admin" || c == "perfshop-chaos-admin" || c == "perfshop-monitoring" || c == "perfshop-scripts-ui" || c == "perfshop-welcome" || c == "perfshop-docs" {
  "nginx"
} else if c == "perfshop-db" || c == "perfshop-squash-db" {
  "mysql"
} else if c == "perfshop-jmeter" || c == "perfshop-jmeter-ui" {
  "jmeter"
} else if c == "perfshop-testmgmt" || c == "perfshop-orchestrator" || c == "perfshop-selenium" || c == "perfshop-test-runner" {
  "qa"
} else if c == "perfshop-forgejo" {
  "forgejo"
} else {
  "observability"
}
'''

This short VRL program does five things to each event:

1. Container name cleanup¶

Docker adds a / prefix to container names (/perfshop-app). The replace strips this prefix to expose a clean container=perfshop-app field.

2. Timestamp mapping¶

OpenSearch Dashboards requires an @timestamp field for the time-based index pattern. Vector maps timestamp → @timestamp (or time → @timestamp, or now() as a last resort).
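
Steps 1 and 2 can be mirrored in a few lines of Python to make the fallback order explicit. This is only an illustrative sketch, not part of PerfShop: the function name `normalize` is hypothetical, and the event is modeled as a plain dict rather than a Vector event.

```python
from datetime import datetime, timezone


def normalize(event: dict) -> dict:
    """Mirror steps 1 and 2 of the VRL transform on a dict-shaped event."""
    # Step 1: drop the "/" that Docker puts in front of container names.
    event["container"] = event["container_name"].replace("/", "")

    # Step 2: the first available time field wins; current time is the last resort.
    if "timestamp" in event:
        event["@timestamp"] = event["timestamp"]
    elif "time" in event:
        event["@timestamp"] = event["time"]
    else:
        event["@timestamp"] = datetime.now(timezone.utc).isoformat()
    return event
```

Like the VRL `replace`, `str.replace` removes every `/` in the name, not just the leading one.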

3. Conditional JSON parsing¶

Spring Boot with logstash-logback-encoder produces logs in JSON format:

{"@timestamp":"...","level":"ERROR","logger_name":"com.perfshop.controller.AuthController","message":"Login failed","chaos_family":"security","chaos_level":2,"scenario_id":"S6"}

Vector tries to parse the message field as JSON. If it succeeds, it extracts six specific fields (level, logger, msg, chaos_family, chaos_level, scenario_id) and promotes them as top-level fields indexed by OpenSearch. If parsing fails (raw-text nginx, MySQL, etc. logs), the raw message is placed in msg and level is forced to INFO.
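
The parse-or-fallback logic can be sketched in Python as follows. The names `PROMOTED` and `promote_fields` are hypothetical, and the sketch is deliberately more lenient than the VRL: `str()` accepts any value, whereas VRL's `string!` rejects values that are not already strings.

```python
import json

# JSON key in the log line -> top-level field set by the transform.
PROMOTED = {
    "level": "level",
    "logger_name": "logger",
    "message": "msg",
    "chaos_family": "chaos_family",
    "chaos_level": "chaos_level",
    "scenario_id": "scenario_id",
}


def promote_fields(event: dict) -> dict:
    """Promote known JSON keys to top-level fields, or fall back to raw text."""
    try:
        parsed = json.loads(event["message"])
    except (json.JSONDecodeError, TypeError):
        # Not JSON (nginx, MySQL, ...): keep the raw line and force the level.
        event["msg"] = str(event["message"])
        event["level"] = "INFO"
        return event

    if isinstance(parsed, dict):
        for source_key, target_field in PROMOTED.items():
            if source_key in parsed:
                # Assumption: str() stands in for VRL's stricter string!.
                event[target_field] = str(parsed[source_key])
    return event
```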

This is where OpenSearch's added value over Loki becomes visible: the chaos_family, chaos_level, and scenario_id fields are indexed as keyword, which makes aggregations such as "how many events with scenario_id=S6 in the last hour?" cheap. LogQL can answer the same question only by parsing every matching log line at query time.
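
As an illustration, the "scenario S6 in the last hour" question could be expressed with the standard search DSL. The sketch below only builds the request body; the function name and the extra breakdown by chaos_family are assumptions added for the example.

```python
import json


def scenario_count_query(scenario_id: str, since: str = "now-1h") -> dict:
    """Build a search body that counts events for one scenario."""
    return {
        "size": 0,                 # no documents needed, only counts
        "track_total_hits": True,  # exact total instead of a capped estimate
        "query": {
            "bool": {
                "filter": [
                    {"term": {"scenario_id": scenario_id}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        # Hypothetical extra: break the count down by chaos family.
        "aggs": {"by_family": {"terms": {"field": "chaos_family"}}},
    }


print(json.dumps(scenario_count_query("S6"), indent=2))
```

Sent as `POST /perfshop-*/_search`, the total would be read from `hits.total.value` in the response.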

4. Removal of noisy Docker fields¶

del(.label)
del(.labels)
del(.host)
del(.source_type)

Docker Compose labels contain dots (com.docker.compose.project), which OpenSearch mappings interpret as object nesting, so the label objects are dropped before indexing. The host and source_type fields are removed as well, as noise.
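
To see why dotted label names are a problem, the sketch below expands dotted keys into nested objects, which is roughly how a dotted field name is read as an object path. The function is hypothetical and the label values are made up; the point is that one label can require `project` to be a string while another requires it to be an object.

```python
def expand_dotted(flat: dict) -> dict:
    """Expand dotted keys into nested dicts, failing on leaf/object conflicts."""
    root: dict = {}
    for key, value in flat.items():
        *parents, leaf = key.split(".")
        node = root
        for part in parents:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"'{part}' in '{key}' already holds a plain value")
            node = child
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"'{key}' already holds an object")
        node[leaf] = value
    return root


labels = {
    "com.docker.compose.project": "perfshop",                    # made-up value
    "com.docker.compose.project.working_dir": "/srv/perfshop",   # made-up value
}
try:
    expand_dotted(labels)
except ValueError as exc:
    print(f"mapping conflict: {exc}")
```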

5. Routing by service family¶

Each container is mapped to a family (spring, nginx, mysql, jmeter, qa, forgejo, observability). The family becomes the suffix of the target OpenSearch index. Several containers can share a family: the seven containers mapped to nginx, for example, all land in the same index.
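
The if/else chain amounts to a lookup table with a default. A Python equivalent, with hypothetical names, makes the mapping easy to check at a glance:

```python
# Container -> family, transcribed from the VRL chain above.
FAMILY_BY_CONTAINER = {
    "perfshop-app": "spring",
    **dict.fromkeys(
        [
            "perfshop-frontend", "perfshop-admin", "perfshop-chaos-admin",
            "perfshop-monitoring", "perfshop-scripts-ui", "perfshop-welcome",
            "perfshop-docs",
        ],
        "nginx",
    ),
    **dict.fromkeys(["perfshop-db", "perfshop-squash-db"], "mysql"),
    **dict.fromkeys(["perfshop-jmeter", "perfshop-jmeter-ui"], "jmeter"),
    **dict.fromkeys(
        [
            "perfshop-testmgmt", "perfshop-orchestrator",
            "perfshop-selenium", "perfshop-test-runner",
        ],
        "qa",
    ),
    "perfshop-forgejo": "forgejo",
}


def service_family(container: str) -> str:
    """Anything not listed falls into 'observability', like the final else."""
    return FAMILY_BY_CONTAINER.get(container, "observability")


def target_index(container: str) -> str:
    """Index name that the sink template would resolve to."""
    return f"perfshop-{service_family(container)}"
```

For example, `target_index("perfshop-loki")` falls through to the default and yields `perfshop-observability`.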

Sink — `elasticsearch` (ES compatibility)¶

[sinks.opensearch]
type = "elasticsearch"
inputs = ["enrich"]
endpoints = ["http://perfshop-opensearch:9200"]
mode = "bulk"
suppress_type_name = true

bulk.index = "perfshop-{{ service_family }}"

compression = "gzip"
request.retry_attempts = 10
healthcheck.enabled = true

Vector has no dedicated opensearch sink. It uses the standard elasticsearch sink, which works with the OpenSearch REST API because OpenSearch is an Elasticsearch fork.

Parameter	Effect
`mode = "bulk"`	Bulk insert to reduce the number of HTTP requests
`suppress_type_name = true`	Omits the `_type` field from bulk requests (mapping types were deprecated in Elasticsearch 7 and removed in 8)
`bulk.index = "perfshop-{{ service_family }}"`	Templating: the target index is computed dynamically from the `service_family` field set by the transform — a Spring container goes to `perfshop-spring`, an nginx goes to `perfshop-nginx`, etc.
`compression = "gzip"`	gzip network compression
`request.retry_attempts = 10`	10 retries before giving up
`healthcheck.enabled = true`	Verifies at startup that the OpenSearch endpoint responds
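
To make the bulk mode and the index templating concrete, here is a rough Python sketch of what a gzip-compressed bulk payload looks like. It illustrates the `_bulk` NDJSON format only; the exact bytes Vector sends may differ, and `bulk_body` is a hypothetical helper.

```python
import gzip
import json


def bulk_body(events: list[dict]) -> bytes:
    """Build a gzip-compressed _bulk payload: one action line per document."""
    lines = []
    for event in events:
        # Same rule as bulk.index = "perfshop-{{ service_family }}".
        index = f"perfshop-{event['service_family']}"
        # No "_type" in the action line, as with suppress_type_name = true.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    # The bulk API requires a trailing newline after the last line.
    return gzip.compress(("\n".join(lines) + "\n").encode("utf-8"))


payload = bulk_body([
    {"service_family": "spring", "level": "ERROR", "msg": "Login failed"},
    {"service_family": "nginx", "level": "INFO", "msg": "GET / 200"},
])
print(len(payload), "bytes compressed")
```

Such a body would be posted to `/_bulk` with `Content-Type: application/x-ndjson` and `Content-Encoding: gzip`.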

OpenSearch — configuration¶

# opensearch.yml
network.host: 0.0.0.0
plugins.security.disabled: true
bootstrap.memory_lock: false

And on the environment variables side in compose:

environment:
  - cluster.name=perfshop-logs
  - node.name=perfshop-opensearch-node1
  - discovery.type=single-node
  - OPENSEARCH_JAVA_OPTS=${OPENSEARCH_JAVA_OPTS:--Xms512m -Xmx512m}
  - DISABLE_SECURITY_PLUGIN=true
  - DISABLE_PERFORMANCE_ANALYZER_AGENT_CLI=true
ulimits:
  memlock:
    soft: -1
    hard: -1
  nofile:
    soft: 65536
    hard: 65536

Parameter	Effect
`cluster.name=perfshop-logs`	Cluster name (a single node)
`discovery.type=single-node`	Disables the multi-node bootstrap (otherwise OpenSearch refuses to start in single-node without explicit config)
`OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m`	512 MB JVM heap (configurable via env)
`DISABLE_SECURITY_PLUGIN=true`	Security plugin disabled — no TLS, no auth, freely accessible from the internal Docker network. This is intentional in the pedagogical context; in real production it would have to be enabled.
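`ulimits memlock=-1`	Removes the limit on locked memory (-1 = unlimited), so the JVM would be allowed to lock its heap
`bootstrap.memory_lock: false`	Memory locking not requested on the config side, so the heap is not actually locked despite the permissive ulimit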
`nofile=65536`	High open-file cap (OpenSearch consumes many of them for Lucene segments)

Healthcheck:

healthcheck:
  test: ["CMD-SHELL", "curl -sf http://localhost:9200/_cluster/health | grep -qE '\"status\":\"(green|yellow)\"'"]
  interval: 15s
  timeout: 10s
  retries: 12
  start_period: 60s

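The healthcheck passes when the cluster status is yellow or green. On a single node, replica shards cannot be assigned, so any index that requests replicas keeps the cluster at yellow; green is reached only when every index has zero replicas, as with the number_of_replicas: 0 set by the seed's index templates.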
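
The `grep` in the healthcheck matches the raw response text. A slightly stricter variant, sketched here in Python with a hypothetical function name, parses the JSON and checks the `status` field explicitly:

```python
import json


def cluster_is_ready(body: str) -> bool:
    """Return True when a _cluster/health response reports green or yellow."""
    try:
        status = json.loads(body).get("status")
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or JSON that is not an object: treat as not ready.
        return False
    return status in ("green", "yellow")
```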

OpenSearch Dashboards¶

environment:
  - OPENSEARCH_HOSTS=["http://perfshop-opensearch:9200"]
  - DISABLE_SECURITY_DASHBOARDS_PLUGIN=true
volumes:
  - ./opensearch/dashboards.yml:/usr/share/opensearch-dashboards/config/opensearch_dashboards.yml

OpenSearch Dashboards is a Kibana fork — the UI is familiar to anyone who has used the ELK stack. Minimal configuration: points to the OpenSearch cluster and disables its own security plugin.

Default host port: 5601 (variable OPENSEARCH_HTTP_PORT).

The `opensearch-seed/seed.py` seed¶

The perfshop-opensearch-seed service (one-shot, restart: "no") performs three steps at first startup.

sequenceDiagram
  autonumber
  participant S as opensearch-seed
  participant OS as perfshop-opensearch
  participant OSD as perfshop-opensearch-dashboards

  S->>OS: GET /_cluster/health<br/>(loop 5s × 60)
  OS-->>S: {"status":"yellow"} ✓

  loop for each family (7)
    S->>OS: PUT /_index_template/perfshop-{family}-template<br/>{ "index_patterns":["perfshop-{family}*"],<br/>  "template":{"mappings":{...}} }
    OS-->>S: 200 OK
  end

  S->>OSD: GET /api/status<br/>(loop 5s × 90)
  OSD-->>S: 200 OK

  loop for each pattern (8)
    S->>OSD: POST /api/saved_objects/index-pattern/{pid}<br/>{"attributes":{"title":"perfshop-...*","timeFieldName":"@timestamp"}}
    OSD-->>S: 200 or 409 (already existing)
  end

  S->>OSD: POST /api/opensearch-dashboards/settings<br/>{"changes":{"defaultIndex":"perfshop-all"}}
  OSD-->>S: 200 OK

  S->>OSD: POST /api/saved_objects/_import?overwrite=true<br/>file: perfshop-all-logs.ndjson
  OSD-->>S: 200 OK + successCount
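
The two polling loops in the diagram (5 s × 60 for OpenSearch, 5 s × 90 for Dashboards) follow a common bounded-retry pattern. The sketch below is an assumption about how such a loop can be written, not a copy of seed.py; `wait_for` and its parameters are hypothetical, and the probe is injected so the loop can be exercised without a network.

```python
import time
from typing import Callable


def wait_for(
    probe: Callable[[], bool],
    attempts: int,
    delay: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except OSError:
            # Connection refused or similar while the service is still starting.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

With the figures from the diagram, the OpenSearch wait would be `wait_for(probe, attempts=60, delay=5)`, which gives up after roughly five minutes.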

Step 1 — 7 index templates¶

For each family (spring, nginx, mysql, jmeter, qa, forgejo, observability), the seed creates a template that:

Matches the perfshop-{family}* indices
Sets number_of_shards: 1, number_of_replicas: 0, index.refresh_interval: 5s
Defines an explicit mapping on the fields: @timestamp (date), ts (date), container (keyword), service_family (keyword), level (keyword), logger (keyword), msg (text + sub-field raw keyword), message (text), stream (keyword), chaos_family (keyword), chaos_level (keyword), scenario_id (keyword), host (keyword)

The mapping keeps aggregations on chaos_family, scenario_id, etc. efficient: keyword fields store doc_values by default, the on-disk columnar structure that aggregations read.
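
A template body matching the description above could be built as follows. This is a reconstruction from the field list on this page, not the actual seed.py code; the helper name is hypothetical and the real script may add or order settings differently.

```python
FAMILIES = ["spring", "nginx", "mysql", "jmeter", "qa", "forgejo", "observability"]

KEYWORD_FIELDS = [
    "container", "service_family", "level", "logger", "stream",
    "chaos_family", "chaos_level", "scenario_id", "host",
]


def index_template(family: str) -> dict:
    """Body for PUT /_index_template/perfshop-{family}-template."""
    properties = {
        "@timestamp": {"type": "date"},
        "ts": {"type": "date"},
        # Full-text field with an exact-match sub-field, msg.raw.
        "msg": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
        "message": {"type": "text"},
    }
    properties.update({name: {"type": "keyword"} for name in KEYWORD_FIELDS})
    return {
        "index_patterns": [f"perfshop-{family}*"],
        "template": {
            "settings": {
                "index": {
                    "number_of_shards": 1,
                    "number_of_replicas": 0,
                    "refresh_interval": "5s",
                }
            },
            "mappings": {"properties": properties},
        },
    }
```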

Step 2 — 8 index patterns in Dashboards¶

The seed creates 8 index patterns in OpenSearch Dashboards:

Pattern	Target
`perfshop-all`	`perfshop-*`
`perfshop-spring`	`perfshop-spring*`
`perfshop-nginx`	`perfshop-nginx*`
`perfshop-mysql`	`perfshop-mysql*`
`perfshop-jmeter`	`perfshop-jmeter*`
`perfshop-qa`	`perfshop-qa*`
`perfshop-forgejo`	`perfshop-forgejo*`
`perfshop-observability`	`perfshop-observability*`

And sets perfshop-all as the default index pattern (Discover view).
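
The eight saved objects follow a regular naming scheme, so they can be generated rather than listed by hand. The sketch below is an assumption about the payload shape, based on the request shown in the sequence diagram; `index_pattern_payloads` is a hypothetical name.

```python
FAMILIES = ["spring", "nginx", "mysql", "jmeter", "qa", "forgejo", "observability"]


def index_pattern_payloads() -> dict[str, dict]:
    """Map each pattern id to the body POSTed to /api/saved_objects/index-pattern/{id}."""
    titles = {"perfshop-all": "perfshop-*"}
    titles.update({f"perfshop-{f}": f"perfshop-{f}*" for f in FAMILIES})
    return {
        pattern_id: {"attributes": {"title": title, "timeFieldName": "@timestamp"}}
        for pattern_id, title in titles.items()
    }


for pattern_id, body in index_pattern_payloads().items():
    print(pattern_id, "->", body["attributes"]["title"])
```

Write requests to the OpenSearch Dashboards API also need the `osd-xsrf: true` header.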

Step 3 — Import of the `PerfShop — All Logs` dashboard¶

ndjson_path = "/app/dashboards/perfshop-all-logs.ndjson"

The seed imports a pre-built NDJSON dashboard (opensearch/dashboards/perfshop-all-logs.ndjson) via the POST /api/saved_objects/_import?overwrite=true API. If the file does not exist, the step is silently skipped.
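
The import endpoint expects the NDJSON file as a multipart upload in a form field named `file`. The following sketch shows one way to prepare that request with the standard library, including the skip when the file is absent. It is not taken from seed.py; both function names are hypothetical.

```python
import uuid
from pathlib import Path
from typing import Optional, Tuple


def multipart_file(filename: str, content: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data under the form field 'file'."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/x-ndjson\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_import_request(ndjson_path: str) -> Optional[Tuple[bytes, str]]:
    """Return (body, content_type) for the import call, or None to skip it."""
    path = Path(ndjson_path)
    if not path.is_file():
        return None  # missing file: skip the step without raising
    return multipart_file(path.name, path.read_bytes())
```

The returned body would be POSTed to `/api/saved_objects/_import?overwrite=true` with the returned `Content-Type` and the `osd-xsrf: true` header.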

Volumes¶

Volume	Mount	Content
`opensearch-data` (named volume)	`/usr/share/opensearch/data`	Data indexed by OpenSearch (Lucene segments, translog)
`./opensearch/opensearch.yml` (bind mount)	`/usr/share/opensearch/config/opensearch.yml`	OpenSearch config (read-only)
`./opensearch/dashboards.yml` (bind mount)	`/usr/share/opensearch-dashboards/config/opensearch_dashboards.yml`	OpenSearch Dashboards config
`./vector/vector.toml` (bind mount)	`/etc/vector/vector.toml`	Vector pipeline (read-only)
`/var/run/docker.sock` (bind mount)	`/var/run/docker.sock`	Docker socket for Vector's `docker_logs` source
`./opensearch-seed/seed.py` (bind mount)	`/app/seed.py`	Seed Python script (read-only)
`./opensearch/dashboards` (bind mount)	`/app/dashboards`	Pre-built NDJSON dashboards

Ports¶

Service	Host port	Container port	Env variable
`perfshop-opensearch`	9201	9200	`OPENSEARCH_API_PORT`
`perfshop-opensearch-dashboards`	5601	5601	`OPENSEARCH_HTTP_PORT`
`perfshop-vector`	(none)	(internal only)	—

To go further¶

Overview — Loki vs OpenSearch comparison
Loki — the other log sink (label-based index model)
Docker Compose — details of the perfshop-opensearch* and perfshop-vector services