
Real-time student monitoring

perfshop-monitoring is a standalone Node.js component that exposes a real-time HTML dashboard. It plays two distinct but complementary roles:

  1. Docker collector: it queries the Docker socket and exposes the CPU, memory, network, and I/O metrics of each PerfShop container in Prometheus format
  2. HTML dashboard — it serves a web page that displays these metrics live along with those of the Spring Boot backend, in a deliberately simpler and more pedagogical version than Grafana

Sources

monitoring/src/server.js, monitoring/public/index.html, monitoring/package.json, monitoring/Dockerfile

Why a separate HTML dashboard?

Grafana is an excellent observability platform but its interface can overwhelm beginner students. perfshop-monitoring offers a simplified view:

  • A single screen, no navigation
  • Basic gauges and charts (no PromQL to write)
  • Focused on PerfShop container resources rather than the host
  • Automatic update every few seconds, without interaction

The full Grafana dashboard remains available for more advanced students and instructors — see Grafana.

Stack

| Component | Version | Role |
| --- | --- | --- |
| Node.js | 20 Alpine | Runtime |
| Express | 4.18 | HTTP server |
| node-fetch | 3.3 | HTTP calls to the backend |
| Docker socket | /var/run/docker.sock | Direct reading of container stats |

The container mounts the Docker socket read-only so that it can query the daemon's /containers/json and /containers/{name}/stats API endpoints.
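As an illustration of what querying the daemon over the socket can look like, here is a minimal sketch using the socketPath option of Node's built-in http module. The helper names dockerRequestOptions and dockerGet are hypothetical and are not taken from server.js.

```javascript
const http = require('node:http');

// Hypothetical helper: build request options that target the Docker daemon
// through its Unix socket instead of a TCP host and port.
function dockerRequestOptions(path, socketPath = '/var/run/docker.sock') {
  return { socketPath, path, method: 'GET', headers: { Host: 'localhost' } };
}

// Hypothetical helper: GET a Docker Engine API path and parse the JSON body.
function dockerGet(path, socketPath) {
  return new Promise((resolve, reject) => {
    const req = http.request(dockerRequestOptions(path, socketPath), (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        try { resolve(JSON.parse(body)); } catch (err) { reject(err); }
      });
    });
    req.on('error', reject);
    req.end();
  });
}
```

With such a helper, dockerGet('/containers/json') would list running containers, and dockerGet('/containers/perfshop-app/stats?stream=false') would return a single stats snapshot rather than a stream.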

Configuration via environment variables

| Variable | Default | Role |
| --- | --- | --- |
| PORT | 3001 | HTTP listen port |
| APP_METRICS_URL | http://perfshop-app:8080/actuator/prometheus | Backend Prometheus endpoint |
| DOCKER_SOCKET | /var/run/docker.sock | Docker socket path |
| HEAPDUMP_URL | http://perfshop-app:9090/actuator/heapdump | Heap dump endpoint (internal management port) |
| PERFSHOP_API_INTERNAL | http://perfshop-app:8080 | Internal backend for the /api/admin/login proxy |
| PUBLIC_API_URL | http://localhost:8080 | Injected into window.__CONFIG__ |
| PUBLIC_MONITORING_URL | http://localhost:3001 | Injected into window.__CONFIG__ |
| PUBLIC_GRAFANA_URL | http://localhost:3002 | Injected into window.__CONFIG__ |
| PUBLIC_CHAOS_URL | http://localhost:3003 | Injected into window.__CONFIG__ |
| PERFSHOP_LANG | fr | Language injected into window.__CONFIG__.LANG |

The four PUBLIC_* variables and PERFSHOP_LANG are grouped in a PUBLIC_CONFIG object injected into each served HTML page via an inline <script> (routes / and /config.js). This lets client JavaScript call the right endpoints for the current deployment configuration.
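A minimal sketch of how such a configuration object could be assembled and served as /config.js, using the defaults listed in the table above. Only the LANG key is confirmed by this page; the other key names, and the helpers buildPublicConfig and renderConfigJs, are assumptions made for illustration.

```javascript
// Hypothetical sketch: collect the public settings from the environment,
// falling back to the documented defaults.
// Key names other than LANG are assumed, not taken from server.js.
function buildPublicConfig(env) {
  return {
    API_URL: env.PUBLIC_API_URL || 'http://localhost:8080',
    MONITORING_URL: env.PUBLIC_MONITORING_URL || 'http://localhost:3001',
    GRAFANA_URL: env.PUBLIC_GRAFANA_URL || 'http://localhost:3002',
    CHAOS_URL: env.PUBLIC_CHAOS_URL || 'http://localhost:3003',
    LANG: env.PERFSHOP_LANG || 'fr',
  };
}

// Hypothetical sketch: the JavaScript body a /config.js route could return.
function renderConfigJs(config) {
  return `window.__CONFIG__ = ${JSON.stringify(config)};`;
}
```

In a server, the object would typically be built once at startup from process.env and reused for every response.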

Monitored containers

The CONTAINERS_TO_WATCH array is hard-coded in server.js:

const CONTAINERS_TO_WATCH = [
  'perfshop-frontend', 'perfshop-app', 'perfshop-db', 'perfshop-monitoring'
];

These are the four containers at the core of PerfShop. The resolveContainerNames() function handles Docker Compose naming variations: it accepts project prefixes (perfshop-perfshop-app) and either separator (- or _), so it works on both a NAS and Docker Desktop. It runs at startup and then every 60 seconds, so the mapping stays current when containers restart.
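The matching rules described above could be sketched as follows. This is not the actual resolveContainerNames() implementation: the helpers matchesWatched and resolveNames are hypothetical, and the optional numeric suffix (for example -1) is an assumption based on how Docker Compose usually names containers.

```javascript
// Hypothetical matcher: does a real container name correspond to a watched name?
function matchesWatched(actualName, wanted) {
  const name = actualName.replace(/^\//, ''); // Docker API names start with "/"
  if (name === wanted) return true;
  // Escape regex metacharacters, then treat "-" and "_" as interchangeable.
  const escaped = wanted
    .replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
    .replace(/[-_]/g, '[-_]');
  // Optional project prefix, optional replica suffix (assumed Compose style).
  const pattern = new RegExp(`^(?:[a-z0-9]+[-_])?${escaped}(?:[-_]\\d+)?$`, 'i');
  return pattern.test(name);
}

// Hypothetical resolver: map each watched name to the first matching real name.
function resolveNames(runningNames, watched) {
  const resolved = {};
  for (const wanted of watched) {
    const hit = runningNames.find((n) => matchesWatched(n, wanted));
    if (hit) resolved[wanted] = hit.replace(/^\//, '');
  }
  return resolved;
}
```

A periodic refresh would then amount to calling resolveNames with the names returned by /containers/json inside a setInterval with a 60000 ms period.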

Exposed endpoints

/metrics — Prometheus format

Returns Prometheus-format metrics as plain text, which Prometheus scrapes and stores. Two families of metrics are exposed:

Docker metrics (per monitored container):

| Metric | Type | Labels |
| --- | --- | --- |
| docker_container_cpu_percent | gauge | container |
| docker_container_mem_usage_bytes | gauge | container |
| docker_container_mem_limit_bytes | gauge | container |
| docker_container_mem_percent | gauge | container |
| docker_container_net_rx_bytes | counter | container |
| docker_container_net_tx_bytes | counter | container |
| docker_container_io_read_bytes | counter | container |
| docker_container_io_write_bytes | counter | container |
| docker_container_pids | gauge | container |

The memory measurement excludes the disk cache (memory.stats.cache or inactive_file), so the figure reflects memory actually in use rather than the Linux page cache.
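A minimal sketch of that subtraction, applied to the memory_stats object of a Docker stats payload. The helper names are hypothetical, and the precedence between inactive_file and cache is an assumption; server.js may choose differently.

```javascript
// Hypothetical helper: memory in use, excluding the page cache.
function usedMemoryBytes(memoryStats) {
  const stats = memoryStats.stats || {};
  // Assumed precedence: prefer inactive_file when present, else cache.
  const cache = stats.inactive_file ?? stats.cache ?? 0;
  return Math.max(0, memoryStats.usage - cache);
}

// Hypothetical helper: used memory as a percentage of the container limit.
function memoryPercent(memoryStats) {
  if (!memoryStats.limit) return 0;
  return (usedMemoryBytes(memoryStats) / memoryStats.limit) * 100;
}
```

For a container reporting a usage of 500 bytes, of which 100 bytes are cache, and a limit of 1000 bytes, this yields 400 bytes used, or 40 %.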

Browser client metrics (Frontend Chaos) — only if received in the previous 10 seconds:

| Metric | Role |
| --- | --- |
| perfshop_client_fps | Frames per second measured in the browser |
| perfshop_client_heap_used_mb | Used JS heap in MB |
| perfshop_client_long_tasks_per_sec | Number of long tasks (>50 ms) per second |
| perfshop_client_fetch_req_per_sec | Fetch requests launched per second |
| perfshop_client_dom_node_count | Number of DOM nodes |
| perfshop_client_cpu_worker_active | 1 if a CPU Web Worker is running |
| perfshop_client_last_received_timestamp | Unix timestamp of the last push |

These metrics are pushed by chaos-agent.js from browsers via POST /api/chaos/client-metrics every 2 seconds. perfshop-monitoring acts as a bridge between browsers and Prometheus (Prometheus does not scrape browsers directly).
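The bridge role can be sketched as a function that renders the last pushed values in the Prometheus text format, but only while they are fresh. The shape of the stored object ({ receivedAt, values }) and the function name renderClientMetrics are assumptions for illustration.

```javascript
const FRESHNESS_MS = 10_000; // window stated above: 10 seconds

// Hypothetical renderer: `latest` is assumed to look like
// { receivedAt: <ms since epoch>, values: { perfshop_client_fps: 60, ... } }.
function renderClientMetrics(latest, nowMs) {
  // Expose nothing when no push arrived within the freshness window.
  if (!latest || nowMs - latest.receivedAt > FRESHNESS_MS) return '';
  const lines = [];
  for (const [name, value] of Object.entries(latest.values)) {
    lines.push(`# TYPE ${name} gauge`);
    lines.push(`${name} ${value}`);
  }
  lines.push('# TYPE perfshop_client_last_received_timestamp gauge');
  lines.push(
    `perfshop_client_last_received_timestamp ${Math.floor(latest.receivedAt / 1000)}`
  );
  return lines.join('\n') + '\n';
}
```

Omitting stale series, rather than repeating the last known value, keeps a closed browser tab from looking like a live client in Prometheus.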

/api/docker/all

Returns the cached Docker stats as JSON for the HTML dashboard. A 5-second TTL cache avoids overloading the Docker daemon.
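A TTL cache of this kind can be sketched in a few lines. The function createTtlCache is hypothetical; the injectable clock exists only to make the behavior easy to check.

```javascript
// Hypothetical TTL cache: calls `load` at most once per `ttlMs` window.
// `load` may return a plain value or a promise; the result is cached as-is.
function createTtlCache(load, ttlMs = 5000, now = Date.now) {
  let value;
  let expiresAt = -Infinity; // forces a load on first access
  return function get() {
    const t = now();
    if (t >= expiresAt) {
      value = load();
      expiresAt = t + ttlMs;
    }
    return value;
  };
}
```

Wrapping the function that collects stats for all containers in such a cache means that several dashboards polling at once trigger at most one round of Docker API calls per 5-second window.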

/api/docker/stats?container=NAME

Returns the stats of a specific container as JSON.

/api/prometheus-raw

Transparent proxy to the Spring Boot backend (/actuator/prometheus). The client dashboard uses this endpoint with its own parsePrometheus() parser rather than going directly to Spring Boot, for two reasons: to work around cross-origin issues and to benefit from the 5-second timeout that avoids blocking the dashboard if the backend is under chaos.
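The timeout part of this proxy can be sketched with an AbortController, which both the global fetch of Node.js 20 and node-fetch 3 accept through the signal option. The helper fetchTextWithTimeout is hypothetical and only illustrates the technique.

```javascript
// Hypothetical helper: fetch a URL as text, aborting after `timeoutMs`.
async function fetchTextWithTimeout(url, timeoutMs = 5000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    return { status: res.status, text: await res.text() };
  } finally {
    clearTimeout(timer); // avoid keeping the event loop alive
  }
}
```

A route handler could call this with APP_METRICS_URL and answer with an error status when the promise rejects, so that a backend under chaos delays the dashboard by at most 5 seconds.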

/api/chaos/client-metrics

  • POST: receives the browser metrics pushed by chaos-agent.js. Each field is validated by type (typeof x === 'number') to prevent injections.
  • GET: returns the last received metrics with a stale flag if more than 10 seconds have elapsed without a push.
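Both behaviors can be sketched as two small pure functions. The field names passed in the example are hypothetical, as are the helper names; the extra Number.isFinite check goes slightly beyond the typeof test described above and is an addition of this sketch.

```javascript
// Hypothetical validator: keep only allowed fields whose value is a number.
function pickNumericFields(body, allowed) {
  const clean = {};
  for (const key of allowed) {
    const v = body?.[key];
    // typeof check as described above; isFinite also rejects NaN and Infinity.
    if (typeof v === 'number' && Number.isFinite(v)) clean[key] = v;
  }
  return clean;
}

// Hypothetical GET payload: last values plus a stale flag after 10 seconds.
// `latest` is assumed to look like { receivedAt: <ms>, values: {...} }.
function withStaleFlag(latest, nowMs, maxAgeMs = 10_000) {
  if (!latest) return { stale: true };
  return { ...latest.values, stale: nowMs - latest.receivedAt > maxAgeMs };
}
```

Rejecting non-numeric values matters because these values end up in the Prometheus text output, where a string containing a newline could otherwise forge extra metric lines.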

/api/heapdump

Proxy to the backend's /actuator/heapdump endpoint (internal management port 9090, not publicly exposed). It generates a timestamped file name and sends the .hprof binary to the client. The timeout is 60 seconds, since a heap dump can take ~30 seconds on a loaded JVM. The dashboard has a widget (heapdump-widget.html) that triggers this download.
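One possible way to build the timestamped file name and the matching download header is sketched below. The exact naming scheme used by server.js is not documented here, so the format and both helper names are assumptions.

```javascript
// Hypothetical naming scheme: ISO timestamp with ":" and "." replaced,
// since those characters are awkward in file names.
function heapDumpFileName(date = new Date()) {
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  return `heapdump-${stamp}.hprof`;
}

// Hypothetical helper: header value that makes the browser download the file.
function attachmentHeader(fileName) {
  return `attachment; filename="${fileName}"`;
}
```

In an Express handler, the header value would be set under Content-Disposition before the binary body is streamed to the client.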

/api/admin/login

Proxy to the backend's /api/admin/login. This alias works around cross-origin restrictions and allows the dashboard to log in as admin without the browser having to directly call perfshop-app:8080. The backend returns 402 if a license is missing, and the proxy propagates this code faithfully.

HTML dashboard

The dashboard is served on /. index.html is read on each request and the <script>window.__CONFIG__ = {...}</script> script is injected just before </head> — this avoids having to rebuild the image when a public URL changes.
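The injection step can be sketched as a pure string transformation. The function injectConfig is hypothetical; escaping the "<" character is a precaution added in this sketch, not something the source describes.

```javascript
// Hypothetical helper: insert the config script just before </head>.
function injectConfig(html, config) {
  // Escape "<" so a value containing "</script>" cannot close the tag early.
  const json = JSON.stringify(config).replace(/</g, '\\u003c');
  const tag = `<script>window.__CONFIG__ = ${json};</script>`;
  const idx = html.indexOf('</head>');
  // Fall back to prepending when the page has no </head>.
  return idx === -1 ? tag + html : html.slice(0, idx) + tag + html.slice(idx);
}
```

Because the transformation runs on every request, changing a public URL only requires restarting the container with a new environment variable.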

The dashboard displays:

  • Four container cards (frontend, backend, db, monitoring) with CPU / memory / network
  • Real-time JVM charts (heap, threads, Tomcat / Hikari pools)
  • A Business Chaos panel (counters of detected anomalies)
  • A Security Chaos panel (attack counters)
  • A heap dump download zone
  • An admin login sidebar

The parsePrometheus() parser (client-side) extracts metrics like jvm_memory_used_bytes{area="heap"}, process_cpu_usage, jvm_threads_live_threads, tomcat_threads_busy_threads, hikaricp_connections_active etc. to feed the charts.
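To give an idea of what such a parser does, here is a simplified sketch of reading the Prometheus text format. It is not the project's parsePrometheus(): the name parsePrometheusText and the returned shape are assumptions, and timestamps and escape sequences inside label values are not decoded.

```javascript
// Hypothetical simplified parser for the Prometheus text exposition format.
function parsePrometheusText(text) {
  const samples = [];
  for (const raw of text.split('\n')) {
    const line = raw.trim();
    if (!line || line.startsWith('#')) continue; // skip HELP/TYPE comments
    // name, optional {labels}, then the sample value
    const m = line.match(/^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/);
    if (!m) continue;
    const labels = {};
    const labelRe = /([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"/g;
    for (const [, key, value] of (m[2] || '').matchAll(labelRe)) {
      labels[key] = value;
    }
    samples.push({ name: m[1], labels, value: Number(m[3]) });
  }
  return samples;
}
```

A chart fetcher could then, for example, sum every jvm_memory_used_bytes sample whose area label is heap to obtain the total heap usage.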

JavaScript architecture — duplication eliminated

The dashboard is split into three layers across four files: pure utilities, a shared core, and a student or admin wrapper.

| File | Role | Lines |
| --- | --- | --- |
| public/js/monitoring-charts.js | Pure utilities: mkCfg, mkDS, addPt, parsePrometheus, histogramQuantile, diffBuckets, formatBytes, switchTab | ~140 |
| public/js/monitoring-core.js | Shared student/admin logic: construction of common Chart.js instances, fetchers (fetchBackendMetrics, fetchAppCpu, fetchClientMetrics, fetchDockerStats, fetchGeneral), triggerHeapDump, bindGlobalPause, openGrafana. Exposed via window.PerfShopMonitoring. | ~530 |
| public/js/monitoring-app.js | Student wrapper: wires monitoring-core.js onto index.html elements and defines polling. Only contains the student-specific backend banner logic. | ~60 |
| public/admin/js/admin-app.js | Admin wrapper: wires monitoring-core.js plus admin-only code: beChart13 (chaos intensities), Backend/Frontend Chaos panels, Scripting, Business Chaos, Security Chaos, Functional Chaos, token detail modal, login/logout. | ~660 |

Before this refactor, monitoring-app.js and admin-app.js shared ~278 lines verbatim (50% of the student file was strictly identical to admin functions). A Chart.js label bug or a silent catch(e){} had to be fixed twice. monitoring-core.js unifies this foundation: single source of truth, single fix to apply.

monitoring-core.js API contract:

const M = window.PerfShopMonitoring;

const isPaused = M.bindGlobalPause();      // #global-pause-btn → boolean getter
const charts   = M.createSharedCharts({ beChart13: /* admin-only optional */ });
const state    = M.createSharedState();    // EMA counters, previous buckets, etc.

M.fetchBackendMetrics(charts, state, isPaused);   // returns { text, tomcatBusy, dbPending }
M.fetchAppCpu(charts, isPaused, { verboseMissing: true });
M.fetchClientMetrics(charts, isPaused, { updateBanner: true });  // student only
M.fetchDockerStats(charts, state, isPaused);
M.fetchGeneral(charts, isPaused);
M.triggerHeapDump();
M.openGrafana(selectElement, GRAFANA_URL);

Option flags (verboseMissing, updateBanner) carry the small behavioral differences between student and admin without multiplying functions. Script load order is strict:

<script src="/config.js"></script>                              <!-- window.__CONFIG__ -->
<script src="https://.../chart.umd.min.js"></script>            <!-- Chart.js global -->
<script src="/js/i18n.js"></script>                             <!-- _t, loadI18n, I18N -->
<script src="/js/monitoring-charts.js"></script>                <!-- mkDS, mkCfg, parsePrometheus -->
<script src="/js/monitoring-core.js"></script>                  <!-- window.PerfShopMonitoring -->
<script src="/js/monitoring-app.js"></script>                   <!-- OR /admin/js/admin-app.js -->

Internationalization

The monitoring component has its own public/i18n/ folder with fr.json and en.json. The language is propagated via window.__CONFIG__.LANG (value of PERFSHOP_LANG). Dictionary loading follows the same pattern as chaos-admin: async loader, _t(), declarative applyI18n().

public/js/i18n.js exposes five declarative HTML attributes:

| Attribute | Effect |
| --- | --- |
| data-i18n="key" | el.textContent = _t('key') (or document.title if the element is <title>) |
| data-i18n-html="key" | el.innerHTML = _t('key'), for trusted static content only, no user input |
| data-i18n-placeholder="key" | el.placeholder = _t('key') |
| data-i18n-title="key" | el.title = _t('key') (tooltip) |
| data-i18n-label="key" | el.label = _t('key'), for <optgroup> |

Adding a new language only requires dropping a public/i18n/<code>.json file containing the same keys as fr.json, then setting PERFSHOP_LANG=<code> in the container environment. No code change is required. The loader applies a detection cascade: window.__CONFIG__.LANG → navigator.language → fr. If the requested file is missing, the loader falls back to fr.json and logs a console warning.
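The cascade and the fallback can be sketched as follows. The helper names, the /i18n/<code>.json URL, and the reduction of a browser locale such as en-US to en are assumptions of this sketch, not details confirmed by i18n.js.

```javascript
// Hypothetical cascade: explicit config first, then browser language, then fr.
function detectLang(config, browserLanguage) {
  const fromConfig = config && config.LANG;
  if (fromConfig) return fromConfig.toLowerCase();
  // Assumption: "en-US" is reduced to its primary subtag "en".
  if (browserLanguage) return browserLanguage.split('-')[0].toLowerCase();
  return 'fr';
}

// Hypothetical loader: fall back to fr.json when the dictionary is missing.
// The /i18n/ URL prefix is assumed from the public/i18n/ folder.
async function loadDictionary(lang, fetchImpl = fetch) {
  const res = await fetchImpl(`/i18n/${lang}.json`);
  if (res.ok) return res.json();
  console.warn(`i18n: ${lang}.json not found, falling back to fr.json`);
  return (await fetchImpl('/i18n/fr.json')).json();
}
```

In the browser, the entry point would be something like loadDictionary(detectLang(window.__CONFIG__, navigator.language)).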

See Ancillary tools.

Differences with Grafana

| Criterion | perfshop-monitoring | Grafana |
| --- | --- | --- |
| Target audience | Beginner students, quick demonstration | Advanced students, instructors, QA |
| Interaction | Read-only, no config | Customizable dashboards, PromQL |
| Coverage | PerfShop containers + JVM + Chaos | The entire observability stack |
| Data source | Docker socket + /actuator/prometheus, read directly | Prometheus (history) + Loki + Tempo |
| Persistence | None (real-time only) | Prometheus storage (15 days by default) |

One does not replace the other: the HTML dashboard is an entry point to observability, while Grafana is the tool for analyzing incidents.