Bamboo v0.4 Observability & Resilience Preparation

This planning note captures the design targets for the v0.4 release: a Prometheus export surface, resilience middleware, and operational hooks that keep OpenSwoole workers healthy. It establishes contracts before code lands so the middleware pipeline (introduced in v0.3) and the runtime can absorb the changes with minimal refactoring.

/metrics HTTP endpoint

  • The route will respond on GET /metrics and return text/plain; version=0.0.4 (the Prometheus text exposition media type). The handler will be registered through the router like any other controller so it benefits from request logging, authentication, or rate limiting middleware when configured.
  • Bamboo\Core\ResponseEmitter will emit metric payloads verbatim without JSON encoding. The handler returns a PSR-7 response with the correct Content-Type header and buffered body so the existing emitter can stream it through OpenSwoole without changes.
  • The endpoint will expose process-level counters (requests, errors, in-flight connections), timers/histograms for request latency, and gauges for worker state. Instrumentation points will be added in the HTTP kernel and middleware pipeline to increment/update metrics before the response is emitted.

Prometheus text format contract

  • Responses conform to the Prometheus 0.0.4 text exposition format, including # HELP/# TYPE preamble lines, snake_case metric names with namespace prefixes (e.g., bamboo_http_requests_total), and UTF-8 encoded label values.
  • Histograms will include _count, _sum, and cumulative _bucket lines, each labeled with its upper bound (le) and ending with the le="+Inf" bucket. Latency buckets will default to [0.05, 0.1, 0.25, 0.5, 1, 2.5, 5] seconds, configurable through etc/metrics.php.
  • To avoid per-worker duplication, samples will be aggregated across workers in a shared storage backend, so a scrape returns the same totals regardless of which worker serves it rather than that worker's local counts.
  • Error conditions must yield HTTP 503 responses with an explanatory body so Prometheus scrape failures are obvious. The collector should degrade gracefully if storage backends are unavailable: it returns 503 (or, at minimum, an empty body) instead of letting an unhandled exception escape the handler.
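To make the histogram contract concrete, here is a minimal sketch that renders one histogram by hand in the text exposition format. The helper renderHistogram, the metric name bamboo_http_request_duration_seconds, and the shape of the etc/metrics.php array are assumptions for illustration; only the bucket values come from this note, and the real exporter is expected to delegate rendering to the client library.

```php
<?php
declare(strict_types=1);

/**
 * Render one histogram in the Prometheus text exposition format.
 * Hypothetical helper for illustration; not part of Bamboo or the client library.
 *
 * @param array<int, int|float> $bounds       upper bucket bounds in seconds, ascending
 * @param float[]               $observations raw latency observations in seconds
 */
function renderHistogram(string $name, string $help, array $bounds, array $observations): string
{
    $lines = ["# HELP {$name} {$help}", "# TYPE {$name} histogram"];

    foreach ($bounds as $bound) {
        // Buckets are cumulative: each one counts every observation <= its bound.
        $count = count(array_filter($observations, fn (float $v): bool => $v <= $bound));
        $lines[] = sprintf('%s_bucket{le="%s"} %d', $name, $bound, $count);
    }

    // The +Inf bucket is mandatory and always equals the total observation count.
    $lines[] = sprintf('%s_bucket{le="+Inf"} %d', $name, count($observations));
    $lines[] = sprintf('%s_sum %s', $name, array_sum($observations));
    $lines[] = sprintf('%s_count %d', $name, count($observations));

    return implode("\n", $lines) . "\n";
}

// Assumed shape of etc/metrics.php; only the bucket values are taken from this note.
$config = ['buckets' => [0.05, 0.1, 0.25, 0.5, 1, 2.5, 5]];

echo renderHistogram(
    'bamboo_http_request_duration_seconds', // hypothetical metric name
    'HTTP request latency in seconds.',
    $config['buckets'],
    [0.03, 0.2, 0.2, 3.0]
);
```

Because buckets are cumulative, the 0.25 bucket in this run counts three observations (0.03, 0.2, 0.2) even though only two of them fall between 0.1 and 0.25.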

Metrics collection plumbing

  • Add the Composer dependency promphp/prometheus_client_php:^2.7 to obtain the CollectorRegistry, counters, gauges, and histograms. The project already requires APCu; the Prometheus client can use its APC adapter for local development and Redis storage in clustered deployments.
  • A new service provider (or optional bamboo/metrics-prometheus module) will register the collector registry within the container, expose a middleware to observe request timing, and provide helper services for custom instrumentation.
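  • OpenSwoole workers need a shared collector to avoid per-worker caches. The client library's Prometheus\Storage\InMemory adapter keeps samples in per-process PHP arrays, so it cannot aggregate across workers on its own; use the Redis adapter, or a custom Prometheus\Storage\Adapter implementation backed by Swoole\Table, to aggregate counts. The bootstrap sequence will initialize the storage so forked workers inherit the connection/table descriptor.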
  • The /metrics handler will pull samples from the registry and write the body using Prometheus\RenderTextFormat. Because the ResponseEmitter streams raw text, no special casing is required beyond setting the Content-Type header.
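The handler behavior described above, including the 503 fallback from the format contract, could look roughly like the following. This is a dependency-free sketch: PlainResponse stands in for a PSR-7 response, and the $renderSamples callable stands in for "collect samples from the registry and render them as text". Both names are hypothetical, and the code assumes PHP 8.1 or later for readonly properties.

```php
<?php
declare(strict_types=1);

/** Minimal stand-in for a PSR-7 response so the sketch has no dependencies. */
final class PlainResponse
{
    /** @param array<string, string> $headers */
    public function __construct(
        public readonly int $status,
        public readonly array $headers,
        public readonly string $body,
    ) {
    }
}

/**
 * Hypothetical /metrics handler. $renderSamples is assumed to return the
 * rendered exposition text and to throw when the storage backend is down.
 */
function metricsHandler(callable $renderSamples): PlainResponse
{
    try {
        return new PlainResponse(
            200,
            ['Content-Type' => 'text/plain; version=0.0.4'],
            $renderSamples(),
        );
    } catch (Throwable $e) {
        // Storage is unreachable: fail the scrape visibly with an explanatory body.
        return new PlainResponse(
            503,
            ['Content-Type' => 'text/plain; charset=utf-8'],
            "metrics unavailable: {$e->getMessage()}\n",
        );
    }
}

$ok = metricsHandler(fn (): string => "bamboo_http_requests_total 42\n");
$failed = metricsHandler(function (): string {
    throw new RuntimeException('redis connection refused');
});

printf("%d %s\n", $ok->status, $ok->headers['Content-Type']);
printf("%d %s", $failed->status, $failed->body);
```

Keeping the rendering step behind a callable means the same handler works whether the samples come from Redis, a table-backed adapter, or a test double.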

Timeout middleware strategy

  • Implement a TimeoutMiddleware that wraps the downstream handler in OpenSwoole\Coroutine::withTimeout() (or falls back to Swoole\Timer::after() when coroutines are unavailable). When the timeout elapses, it aborts the request, increments a bamboo_http_timeouts_total counter, and returns a 504 Gateway Timeout response.
  • Configuration for default/global timeouts will live in etc/middleware.php under a timeouts group so routes can opt in or override with alias syntax defined in the v0.3 middleware document.
  • The middleware will emit timing data around the downstream call so latencies are recorded even when the timeout trips. Metrics instrumentation should use the shared collector registered in the container.
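The contract above (count the timeout, answer 504, and still record latency) can be sketched without committing to a specific OpenSwoole primitive. In this sketch the $runWithDeadline closure is an assumed abstraction over whatever mechanism ends up enforcing the deadline, and RequestTimedOut, TimeoutMiddlewareSketch, and the array-shaped responses are hypothetical names, not Bamboo APIs.

```php
<?php
declare(strict_types=1);

/** Thrown by the deadline runner when the budget is exhausted (hypothetical). */
final class RequestTimedOut extends RuntimeException
{
}

final class TimeoutMiddlewareSketch
{
    /** Stand-in for the bamboo_http_timeouts_total counter. */
    public int $timeouts = 0;

    /** @var float[] observed latencies in seconds (stand-in for a histogram) */
    public array $latencies = [];

    /**
     * @param Closure $runWithDeadline assumed signature: (callable $next, float $seconds): array,
     *                                 throwing RequestTimedOut when the deadline passes
     */
    public function __construct(private Closure $runWithDeadline, private float $seconds)
    {
    }

    /** @return array{status: int, body: string} */
    public function handle(callable $next): array
    {
        $startedAt = hrtime(true);
        try {
            return ($this->runWithDeadline)($next, $this->seconds);
        } catch (RequestTimedOut) {
            $this->timeouts++;
            return ['status' => 504, 'body' => 'Gateway Timeout'];
        } finally {
            // Runs on both paths, so latency is recorded even when the timeout trips.
            $this->latencies[] = (hrtime(true) - $startedAt) / 1e9;
        }
    }
}

// Demo runner: calls the handler directly. A real runner would enforce $seconds itself.
$runner = fn (callable $next, float $seconds): array => $next();

$middleware = new TimeoutMiddlewareSketch($runner, 2.0);
$fast = $middleware->handle(fn (): array => ['status' => 200, 'body' => 'ok']);
$slow = $middleware->handle(function (): array {
    throw new RequestTimedOut('simulated deadline'); // simulates the runner aborting the call
});

printf("%d %d timeouts=%d samples=%d\n", $fast['status'], $slow['status'],
    $middleware->timeouts, count($middleware->latencies));
```

Putting the latency observation in a finally block is the design choice that matters here: timed-out requests would otherwise disappear from the latency histogram and make the service look faster than it is.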

Circuit breaker middleware strategy

  • Introduce a CircuitBreakerMiddleware that monitors upstream failures using a rolling window stored in Swoole\Table or APCu. When thresholds are exceeded (configurable via etc/middleware.php), the middleware short-circuits requests and returns 503 Service Unavailable with a Retry-After header.
  • Middleware ordering matters: circuit breakers must run before expensive work, so they will be placed early in the global stack (before authentication or route handlers) but after logging so that short-circuited requests are still recorded.
  • State transitions (closed → open → half-open) will publish gauges/counters to Prometheus so operators can correlate breaker activity with upstream outages.
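The closed → open → half-open transitions can be illustrated with a small state machine. This is a simplified sketch under stated assumptions: it counts consecutive failures instead of a rolling window, keeps state in plain properties rather than Swoole\Table or APCu, and takes the current time as an argument so the example is deterministic. CircuitBreakerSketch and its method names are hypothetical.

```php
<?php
declare(strict_types=1);

/** Minimal circuit breaker state machine; not the planned middleware itself. */
final class CircuitBreakerSketch
{
    private string $state = 'closed';
    private int $failures = 0;
    private float $openedAt = 0.0;

    public function __construct(
        private int $failureThreshold,
        private float $cooldownSeconds,
    ) {
    }

    /** Returns true if the request may proceed; false means short-circuit with 503. */
    public function allow(float $now): bool
    {
        if ($this->state === 'open' && $now - $this->openedAt >= $this->cooldownSeconds) {
            // Cooldown elapsed: admit trial requests until one succeeds or fails.
            $this->state = 'half-open';
        }

        return $this->state !== 'open';
    }

    public function recordSuccess(): void
    {
        $this->failures = 0;
        $this->state = 'closed';
    }

    public function recordFailure(float $now): void
    {
        $this->failures++;
        // A failed trial in half-open reopens immediately; otherwise wait for the threshold.
        if ($this->state === 'half-open' || $this->failures >= $this->failureThreshold) {
            $this->state = 'open';
            $this->openedAt = $now;
        }
    }

    public function state(): string
    {
        return $this->state;
    }

    /** Whole seconds a client should wait, suitable for a Retry-After header. */
    public function retryAfter(float $now): int
    {
        return max(0, (int) ceil($this->openedAt + $this->cooldownSeconds - $now));
    }
}

$breaker = new CircuitBreakerSketch(failureThreshold: 3, cooldownSeconds: 10.0);
$t = 100.0; // injected clock value
for ($i = 0; $i < 3; $i++) {
    $breaker->recordFailure($t);
}

echo $breaker->state(), ' retry-after=', $breaker->retryAfter($t + 4.0), "\n";
var_dump($breaker->allow($t + 4.0));  // still inside the cooldown
var_dump($breaker->allow($t + 10.0)); // cooldown elapsed, breaker is half-open
$breaker->recordSuccess();
echo $breaker->state(), "\n";
```

Each assignment to the state property is a natural place to update the state gauge and transition counter that operators will watch in Prometheus.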

Graceful shutdown & health checks

  • OpenSwoole exposes on('Shutdown') and on('WorkerExit') hooks. Register listeners that flush pending metrics, mark workers as unhealthy in the readiness registry, and close Redis connections. The HTTP server bootstrap (bootstrap/server.php) will wire these callbacks during application boot.
  • Implement lightweight /healthz (liveness) and /readyz (readiness) endpoints. The liveness check returns 200 as long as the worker loop is running; readiness reports 200 only when critical dependencies (Redis, database, Prometheus storage) are reachable.
  • A shared HealthRegistry service will track worker readiness state. It uses a Swoole\Table or cache entry updated on start/stop events. The Prometheus exporter will expose bamboo_worker_ready gauges sourced from this registry.
  • Graceful shutdown will tie into the CLI command http.serve so signals sent to the managed OpenSwoole server first mark readiness as false, stop accepting new connections, wait for in-flight requests to complete (observed via the shared collector), and finally exit.
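The readiness bookkeeping and the first step of graceful shutdown might be sketched as follows. A plain array stands in for Swoole\Table or a cache entry, the probe callables stand in for real Redis/database/storage checks, and returning 503 when not ready is an assumption (this note only specifies when readiness returns 200). HealthRegistrySketch and readiness() are hypothetical names.

```php
<?php
declare(strict_types=1);

/** In-process stand-in for the shared HealthRegistry service. */
final class HealthRegistrySketch
{
    /** @var array<int, bool> worker id => ready flag */
    private array $workers = [];

    public function markReady(int $workerId): void
    {
        $this->workers[$workerId] = true;
    }

    public function markNotReady(int $workerId): void
    {
        $this->workers[$workerId] = false;
    }

    public function isReady(int $workerId): bool
    {
        return $this->workers[$workerId] ?? false; // unknown workers are not ready
    }

    /** Value for a bamboo_worker_ready gauge: 1 when ready, 0 otherwise. */
    public function gaugeValue(int $workerId): int
    {
        return (int) $this->isReady($workerId);
    }
}

/**
 * /readyz logic: 200 only when the worker is marked ready and every probe passes.
 *
 * @param array<string, callable> $probes dependency name => check returning bool
 * @return array{status: int, failing: string[]}
 */
function readiness(HealthRegistrySketch $registry, int $workerId, array $probes): array
{
    $failing = [];
    foreach ($probes as $name => $probe) {
        try {
            if (!$probe()) {
                $failing[] = $name;
            }
        } catch (Throwable) {
            $failing[] = $name; // a probe that throws counts as unreachable
        }
    }

    $ready = $registry->isReady($workerId) && $failing === [];

    return ['status' => $ready ? 200 : 503, 'failing' => $failing];
}

$registry = new HealthRegistrySketch();
$registry->markReady(0);
$probes = ['redis' => fn (): bool => true, 'database' => fn (): bool => true];

$before = readiness($registry, 0, $probes);
// First step of graceful shutdown: flip readiness so orchestrators stop routing traffic.
$registry->markNotReady(0);
$after = readiness($registry, 0, $probes);

printf("%d -> %d, gauge=%d\n", $before['status'], $after['status'], $registry->gaugeValue(0));
```

Flipping readiness before the server stops accepting connections gives load balancers time to drain the worker while in-flight requests finish.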

Integration summary

  • Dependencies – Composer require promphp/prometheus_client_php and enable Redis or Swoole Table storage. Optional extension: ext-apcu for local storage.
  • ResponseEmitter – No functional changes; handlers must return PSR responses with the correct headers so the emitter can stream metrics text and health payloads unchanged.
  • Middleware pipeline – Timeout and circuit breaker middleware will be registered via etc/middleware.php using the v0.3 alias/group conventions, ensuring deterministic ordering with other global middleware. Instrumentation hooks in each middleware feed the shared collector.
  • Operational hooks – Bootstrap wiring listens for OpenSwoole worker events to maintain readiness state and flush metrics before exit, while new health endpoints give orchestrators clear liveness and readiness signals.