Engineering

Monitoring Quiz API Performance with Prometheus and Grafana

Instrument your quiz API with Prometheus metrics, build Grafana dashboards, and set up alerts that catch problems before users notice.

Bobby Iliev · 2026-04-08 · 8 min read

You Cannot Fix What You Cannot See

Your quiz API might be running fine right now, but when 500 students hit it during an exam, will you know about the latency spike before they start complaining? Monitoring with Prometheus and Grafana gives you visibility into request latency, error rates, database performance, and quiz-specific metrics like completion rates and scoring distributions.

This guide walks you through instrumenting a Node.js quiz API, defining custom metrics, building dashboards, and creating alerts.

Prerequisites

  • Node.js quiz API (Express or Fastify)
  • Docker for running Prometheus and Grafana locally
  • Basic understanding of HTTP metrics

Setting Up prom-client

Install the Prometheus client library:

npm install prom-client

Create a metrics module at src/metrics.ts:

```typescript
import {
  Registry,
  Counter,
  Histogram,
  Gauge,
  collectDefaultMetrics,
} from "prom-client";

export const registry = new Registry();

// Collect Node.js runtime metrics (memory, CPU, event loop)
collectDefaultMetrics({ register: registry });

// HTTP request metrics
export const httpRequestDuration = new Histogram({
  name: "http_request_duration_seconds",
  help: "Duration of HTTP requests in seconds",
  labelNames: ["method", "route", "status_code"],
  buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
  registers: [registry],
});

export const httpRequestTotal = new Counter({
  name: "http_requests_total",
  help: "Total number of HTTP requests",
  labelNames: ["method", "route", "status_code"],
  registers: [registry],
});

// Quiz-specific metrics
export const quizCompletionDuration = new Histogram({
  name: "quiz_completion_duration_seconds",
  help: "Time taken to complete a quiz",
  labelNames: ["quiz_id", "difficulty"],
  buckets: [30, 60, 120, 300, 600, 900, 1800],
  registers: [registry],
});

export const quizScore = new Histogram({
  name: "quiz_score_percentage",
  help: "Distribution of quiz scores as percentages",
  labelNames: ["quiz_id", "difficulty"],
  buckets: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
  registers: [registry],
});

export const quizSubmissions = new Counter({
  name: "quiz_submissions_total",
  help: "Total quiz submissions",
  labelNames: ["quiz_id", "difficulty", "passed"],
  registers: [registry],
});

export const activeQuizSessions = new Gauge({
  name: "active_quiz_sessions",
  help: "Number of currently active quiz sessions",
  registers: [registry],
});

// Database metrics
export const dbQueryDuration = new Histogram({
  name: "db_query_duration_seconds",
  help: "Duration of database queries",
  labelNames: ["operation", "table"],
  buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1],
  registers: [registry],
});

export const dbConnectionPool = new Gauge({
  name: "db_connection_pool_size",
  help: "Current database connection pool size",
  labelNames: ["state"],
  registers: [registry],
});
```
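Prometheus histograms are cumulative: each observation increments every bucket whose upper bound (`le`) is at least the observed value, plus the implicit `+Inf` bucket. This is a minimal sketch of that counting rule (not prom-client's internals), using the latency buckets defined above:

```typescript
// Cumulative bucket counting, as Prometheus histograms do it.
// Each observation increments every bucket with le >= value, plus +Inf.
const buckets = [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5];

function observeAll(observations: number[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const b of [...buckets, Infinity]) counts.set(`le=${b}`, 0);
  for (const v of observations) {
    for (const b of [...buckets, Infinity]) {
      if (v <= b) counts.set(`le=${b}`, (counts.get(`le=${b}`) ?? 0) + 1);
    }
  }
  return counts;
}

// Three requests taking 30ms, 120ms, and 800ms
const counts = observeAll([0.03, 0.12, 0.8]);
console.log(counts.get("le=0.05")); // 1 (only the 30ms request)
console.log(counts.get("le=0.25")); // 2 (30ms and 120ms)
console.log(counts.get("le=Infinity")); // 3 (every observation)
```

The cumulative layout is what lets PromQL's `histogram_quantile` estimate percentiles later without storing raw observations.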

Instrumenting Express

Add middleware to capture HTTP metrics:

```typescript
import express from "express";
import { registry, httpRequestDuration, httpRequestTotal } from "./metrics";

const app = express();

// Metrics endpoint for Prometheus to scrape
app.get("/metrics", async (req, res) => {
  res.setHeader("Content-Type", registry.contentType);
  res.send(await registry.metrics());
});

// Request duration middleware
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer();

  res.on("finish", () => {
    const route = req.route?.path || req.path;
    const labels = {
      method: req.method,
      route: normalizeRoute(route),
      status_code: res.statusCode.toString(),
    };

    end(labels);
    httpRequestTotal.inc(labels);
  });

  next();
});

// Normalize route paths to avoid high cardinality
function normalizeRoute(path: string): string {
  return path
    .replace(/\/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/g, "/:id")
    .replace(/\/\d+/g, "/:id")
    .replace(/\/cuid_[a-z0-9]+/g, "/:id");
}
```

High cardinality is the most common Prometheus mistake: using raw paths containing IDs as label values creates a new time series for every unique quiz ID, which bloats memory and slows queries. The `normalizeRoute` function collapses these into generic patterns like `/api/v1/quizzes/:id`.
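To see what those regexes actually do, here is the same function in isolation with a couple of example paths (the sample IDs are made up):

```typescript
// Same normalization as in the middleware, shown standalone.
function normalizeRoute(path: string): string {
  return path
    .replace(/\/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/g, "/:id")
    .replace(/\/\d+/g, "/:id")
    .replace(/\/cuid_[a-z0-9]+/g, "/:id");
}

// UUIDs and numeric IDs both collapse to a single /:id series
console.log(normalizeRoute("/api/v1/quizzes/3f2b8c1a-9d4e-4a6b-8c1d-2e3f4a5b6c7d/submit"));
// -> /api/v1/quizzes/:id/submit
console.log(normalizeRoute("/api/v1/quizzes/42/submit"));
// -> /api/v1/quizzes/:id/submit
```

Both calls produce one label value, so a million distinct quiz URLs still cost only one time series per method and status code.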

Instrumenting Quiz Logic

Add metrics to your quiz submission handler:

```typescript
import {
  quizCompletionDuration,
  quizScore,
  quizSubmissions,
  activeQuizSessions,
} from "./metrics";

app.post("/api/v1/quizzes/:id/submit", async (req, res) => {
  const { id: quizId } = req.params;
  const { answers, startedAt } = req.body;

  try {
    const quiz = await getQuiz(quizId);
    const result = calculateScore(quiz, answers);

    // Record completion time
    if (startedAt) {
      const durationSeconds = (Date.now() - new Date(startedAt).getTime()) / 1000;
      quizCompletionDuration.observe(
        { quiz_id: quizId, difficulty: quiz.difficulty },
        durationSeconds
      );
    }

    // Record score distribution
    const percentage = (result.score / result.total) * 100;
    quizScore.observe(
      { quiz_id: quizId, difficulty: quiz.difficulty },
      percentage
    );

    // Count submissions
    const passed = percentage >= 70;
    quizSubmissions.inc({
      quiz_id: quizId,
      difficulty: quiz.difficulty,
      passed: passed.toString(),
    });

    // Decrement active sessions
    activeQuizSessions.dec();

    res.json(result);
  } catch (err) {
    res.status(500).json({ error: "Submission failed" });
  }
});

// Track when quizzes start
app.post("/api/v1/quizzes/:id/start", async (req, res) => {
  activeQuizSessions.inc();
  // ... start logic
});
```

One caveat: `quiz_id` is safe as a label value only while the number of quizzes stays small and bounded. With thousands of quizzes it causes the same cardinality explosion as raw URL paths.
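After a few submissions, the `/metrics` endpoint exposes these counters in the Prometheus text exposition format. The label values and numbers below are illustrative:

```
# HELP quiz_submissions_total Total quiz submissions
# TYPE quiz_submissions_total counter
quiz_submissions_total{quiz_id="js-basics",difficulty="easy",passed="true"} 42
quiz_submissions_total{quiz_id="js-basics",difficulty="easy",passed="false"} 9

# HELP active_quiz_sessions Number of currently active quiz sessions
# TYPE active_quiz_sessions gauge
active_quiz_sessions 7
```

Prometheus scrapes this text on its interval and turns each labeled line into a separate time series.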

Database Query Instrumentation

Wrap your database client to capture query metrics:

```typescript
import { Pool } from "pg";
import { dbQueryDuration, dbConnectionPool } from "./metrics";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Monitor connection pool
setInterval(() => {
  dbConnectionPool.set({ state: "total" }, pool.totalCount);
  dbConnectionPool.set({ state: "idle" }, pool.idleCount);
  dbConnectionPool.set({ state: "waiting" }, pool.waitingCount);
}, 5000);

// Instrumented query function
export async function query(
  text: string,
  params?: unknown[]
): Promise<any> {
  const operation = text.trim().split(" ")[0].toUpperCase();
  const table = extractTableName(text);

  const end = dbQueryDuration.startTimer({ operation, table });

  try {
    return await pool.query(text, params);
  } finally {
    // Record the duration whether the query succeeded or threw
    end();
  }
}

function extractTableName(sql: string): string {
  const match = sql.match(/(?:FROM|INTO|UPDATE|JOIN)\s+(\w+)/i);
  return match?.[1] ?? "unknown";
}
```
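The `extractTableName` helper is a deliberate heuristic: it grabs the first identifier after `FROM`, `INTO`, `UPDATE`, or `JOIN`, which is enough for simple single-table queries (multi-join queries report only the first table). Standalone, with example queries:

```typescript
// Same heuristic as above: first identifier after FROM/INTO/UPDATE/JOIN.
function extractTableName(sql: string): string {
  const match = sql.match(/(?:FROM|INTO|UPDATE|JOIN)\s+(\w+)/i);
  return match?.[1] ?? "unknown";
}

console.log(extractTableName("SELECT * FROM quizzes WHERE id = $1"));       // quizzes
console.log(extractTableName("INSERT INTO submissions (quiz_id) VALUES ($1)")); // submissions
console.log(extractTableName("BEGIN"));                                     // unknown
```

Because `table` becomes a label value, the fallback to `"unknown"` also protects you from unparseable SQL creating odd label values.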

Prometheus Configuration

Create prometheus.yml:

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alert_rules.yml"

scrape_configs:
  - job_name: "quiz-api"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["host.docker.internal:3000"]
        labels:
          environment: "production"
```

Alerting Rules

Create alert_rules.yml:

```yaml
groups:
  - name: quiz-api-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{status_code=~"5.."}[5m]))
          /
          sum(rate(http_requests_total[5m]))
          > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"
          description: "More than 5% of requests are returning 5xx errors"

      - alert: HighLatency
        expr: |
          histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
          > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High API latency"
          description: "95th percentile latency is above 2 seconds"

      - alert: DatabaseSlowQueries
        expr: |
          histogram_quantile(0.99, rate(db_query_duration_seconds_bucket[5m]))
          > 1
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "Slow database queries"
          description: "99th percentile query duration is above 1 second"

      - alert: LowQuizPassRate
        expr: |
          sum(rate(quiz_submissions_total{passed="true"}[1h]))
          /
          sum(rate(quiz_submissions_total[1h]))
          < 0.2
        for: 30m
        labels:
          severity: info
        annotations:
          summary: "Low quiz pass rate"
          description: "Less than 20% of submissions are passing - questions may be too difficult"
```
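The `HighErrorRate` expression divides the 5xx request rate by the total request rate. PromQL's `rate()` is, roughly, the counter's increase over the window divided by the window length in seconds. A sketch of that arithmetic, with made-up sample values:

```typescript
// rate() over a window is approximately (counterEnd - counterStart) / windowSeconds.
function rate(counterStart: number, counterEnd: number, windowSeconds: number): number {
  return (counterEnd - counterStart) / windowSeconds;
}

const windowSec = 300; // the [5m] window
const errorRate = rate(120, 150, windowSec);    // 5xx counter grew by 30
const totalRate = rate(9_000, 9_450, windowSec); // total counter grew by 450

const errorRatio = errorRate / totalRate;
console.log(errorRatio); // ≈ 0.067, above the 0.05 threshold
```

Since the ratio stays above 0.05, the alert would fire once the condition has held for the `for: 2m` grace period, which filters out momentary blips.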

Docker Compose Setup

Run Prometheus and Grafana locally:

```yaml
# docker-compose.monitoring.yml
services:
  prometheus:
    image: prom/prometheus:v2.53.0
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert_rules.yml:/etc/prometheus/alert_rules.yml
    extra_hosts:
      - "host.docker.internal:host-gateway"

  grafana:
    image: grafana/grafana:11.1.0
    ports:
      - "3001:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - grafana-data:/var/lib/grafana

volumes:
  grafana-data:
```

Start the stack:

docker compose -f docker-compose.monitoring.yml up -d

Grafana Dashboard

After connecting Prometheus as a data source in Grafana, create panels with these PromQL queries:

Request rate:

sum(rate(http_requests_total[5m])) by (route)

95th percentile latency:

histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, route))

Error rate:

sum(rate(http_requests_total{status_code=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

Quiz score distribution:

histogram_quantile(0.5, sum(rate(quiz_score_percentage_bucket[1h])) by (le, quiz_id))

Active sessions:

active_quiz_sessions
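The `histogram_quantile` calls above estimate percentiles from cumulative bucket counts by assuming observations are spread evenly within each bucket. A simplified sketch of that linear interpolation (Prometheus's real implementation handles more edge cases, such as empty and `+Inf` buckets):

```typescript
interface Bucket {
  le: number;    // bucket upper bound
  count: number; // cumulative count of observations <= le
}

// Simplified histogram_quantile: find the bucket containing the target
// rank, then interpolate linearly between its bounds.
function histogramQuantile(q: number, buckets: Bucket[]): number {
  const total = buckets[buckets.length - 1].count;
  const rank = q * total;
  for (let i = 0; i < buckets.length; i++) {
    if (buckets[i].count >= rank) {
      const lower = i === 0 ? 0 : buckets[i - 1].le;
      const prevCount = i === 0 ? 0 : buckets[i - 1].count;
      const fraction = (rank - prevCount) / (buckets[i].count - prevCount);
      return lower + (buckets[i].le - lower) * fraction;
    }
  }
  return buckets[buckets.length - 1].le;
}

// 100 requests: 60 finished under 0.1s, 90 under 0.25s, all under 0.5s
const latencyBuckets: Bucket[] = [
  { le: 0.1, count: 60 },
  { le: 0.25, count: 90 },
  { le: 0.5, count: 100 },
];
console.log(histogramQuantile(0.95, latencyBuckets));
// -> 0.375 (interpolated within the 0.25-0.5s bucket)
```

This is also why bucket boundaries matter: the estimate can never be more precise than the bucket the quantile lands in, so pick buckets that bracket your latency targets.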

Summary

Prometheus and Grafana give you complete visibility into your quiz API. The combination of standard HTTP metrics, database performance tracking, and quiz-specific metrics like score distributions and pass rates lets you understand both technical and product health.

Key points:

  • Normalize route paths to avoid high-cardinality label problems
  • Instrument both the HTTP layer and the business logic layer
  • Set alerts on error rates, latency, and slow database queries
  • Track quiz-specific metrics like pass rates to catch content problems
  • Use histograms with meaningful bucket boundaries for latency and scores