Bamboo v0.4 Observability & Resilience Preparation
This planning note captures the design targets for the v0.4 release: a Prometheus export surface, resilience middleware, and operational hooks that keep OpenSwoole workers healthy. It establishes contracts before code lands so the middleware pipeline (introduced in v0.3) and the runtime can absorb the changes with minimal refactoring.
/metrics HTTP endpoint
- The route will respond on `GET /metrics` and return `text/plain; version=0.0.4` (the Prometheus text exposition media type). The handler will be registered through the router like any other controller so it benefits from request logging, authentication, or rate limiting middleware when configured.
- `Bamboo\Core\ResponseEmitter` will emit metric payloads verbatim without JSON encoding. The handler returns a PSR-7 response with the correct Content-Type header and a buffered body so the existing emitter can stream it through OpenSwoole without changes.
- The endpoint will expose process-level counters (requests, errors), gauges for in-flight connections and worker state, and histograms for request latency. Instrumentation points will be added in the HTTP kernel and middleware pipeline to update metrics before the response is emitted.
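The instrumentation points described above could look roughly like the following sketch, which wraps a handler and updates a request counter, an error counter, and an in-flight gauge. `InMemoryStats` and `instrument()` are hypothetical names, the metric names other than `bamboo_http_requests_total` are assumptions, and a plain array stands in for the shared collector; none of this is Bamboo's actual kernel API.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the shared collector; a real build would
// delegate to the Prometheus registry instead of a local array.
final class InMemoryStats
{
    /** @var array<string, int> */
    public array $values = [
        'bamboo_http_requests_total' => 0,
        'bamboo_http_errors_total' => 0, // assumed name
        'bamboo_http_in_flight' => 0,    // assumed name
    ];

    public function add(string $name, int $delta): void
    {
        $this->values[$name] += $delta;
    }
}

// Wraps a handler so metrics are updated before the result is handed back
// to the caller. The handler is any callable returning an HTTP status code.
function instrument(InMemoryStats $stats, callable $handler): callable
{
    return function (...$args) use ($stats, $handler) {
        $stats->add('bamboo_http_requests_total', 1);
        $stats->add('bamboo_http_in_flight', 1);
        try {
            $status = $handler(...$args);
            if ($status >= 500) {
                $stats->add('bamboo_http_errors_total', 1);
            }
            return $status;
        } catch (\Throwable $e) {
            $stats->add('bamboo_http_errors_total', 1);
            throw $e;
        } finally {
            // The gauge drops back even when the handler throws.
            $stats->add('bamboo_http_in_flight', -1);
        }
    };
}

$stats = new InMemoryStats();
$ok = instrument($stats, fn (): int => 200);
$failing = instrument($stats, fn (): int => 500);
$ok();
$failing();
```

Keeping the decrement in `finally` is the point of the sketch: an in-flight gauge that is not decremented on the error path drifts upward over time.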
Prometheus text format contract
- Responses conform to the Prometheus 0.0.4 text exposition format, including `# HELP`/`# TYPE` preamble lines, snake_case metric names with namespace prefixes (e.g., `bamboo_http_requests_total`), and UTF-8 encoded labels.
- Histograms will include `_count`, `_sum`, and cumulative `_bucket` lines (one per upper bound plus `+Inf`). Latency buckets will default to `[0.05, 0.1, 0.25, 0.5, 1, 2.5, 5]` seconds, configurable through `etc/metrics.php`.
- To avoid per-worker duplication, samples will be aggregated across workers using a shared storage backend rather than per-request formatting.
- Error conditions must yield HTTP `503` responses with an explanatory body so Prometheus scrape failures are obvious. The collector should degrade gracefully if storage backends are unavailable (returning `503` or emitting an empty body).
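To make the histogram part of the contract concrete, here is a minimal hand-written sketch of the text output for one unlabeled histogram using the default buckets. `renderHistogram()` and the metric name `bamboo_http_request_duration_seconds` are assumptions for illustration only; the plan is for the client library's renderer to produce this output, and the sketch skips label handling and escaping.

```php
<?php
declare(strict_types=1);

// Default latency buckets (seconds) from the contract above.
const DEFAULT_BUCKETS = [0.05, 0.1, 0.25, 0.5, 1, 2.5, 5];

/**
 * Renders one unlabeled histogram in Prometheus text format 0.0.4.
 *
 * @param float[]               $observations raw latency samples in seconds
 * @param array<int, int|float> $buckets      ascending upper bounds
 */
function renderHistogram(
    string $name,
    string $help,
    array $observations,
    array $buckets = DEFAULT_BUCKETS
): string {
    $lines = ["# HELP {$name} {$help}", "# TYPE {$name} histogram"];
    foreach ($buckets as $upperBound) {
        // Buckets are cumulative: each counts every sample <= its bound.
        $count = count(array_filter($observations, fn ($v) => $v <= $upperBound));
        $lines[] = sprintf('%s_bucket{le="%s"} %d', $name, $upperBound, $count);
    }
    // The +Inf bucket always equals the total number of observations.
    $lines[] = sprintf('%s_bucket{le="+Inf"} %d', $name, count($observations));
    $lines[] = sprintf('%s_sum %s', $name, array_sum($observations));
    $lines[] = sprintf('%s_count %d', $name, count($observations));

    return implode("\n", $lines) . "\n";
}

echo renderHistogram(
    'bamboo_http_request_duration_seconds',
    'Request latency in seconds.',
    [0.03, 0.2, 0.2, 3.0]
);
```

With these four samples, the `le="0.25"` bucket reports 3 and the `+Inf` bucket reports 4, which shows the cumulative counting that Prometheus expects.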
Metrics collection plumbing
- Add the Composer dependency `promphp/prometheus_client_php:^2.7` to obtain the `CollectorRegistry`, counters, gauges, and histograms. The project already requires APCu; the Prometheus client can use its `APC` adapter for local development and `Redis` storage in clustered deployments.
- A new service provider (or optional `bamboo/metrics-prometheus` module) will register the collector registry within the container, expose a middleware to observe request timing, and provide helper services for custom instrumentation.
- OpenSwoole workers need a shared collector to avoid per-worker caches. The bundled `Prometheus\Storage\InMemory` adapter keeps samples in a per-process array, so aggregation will rely on a `Prometheus\Storage\Adapter` implementation backed by `Swoole\Table`, or on the Redis adapter. The bootstrap sequence will initialize the storage so forked workers inherit the connection/table descriptor.
- The `/metrics` handler will pull samples from the registry and write the body using `Prometheus\RenderTextFormat`. Because the `ResponseEmitter` streams raw text, no special casing is required beyond setting the Content-Type header.
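The handler's happy path and its degraded path can be sketched as below. The renderer is passed in as a callable so the example stays free of dependencies, and the returned array is a hypothetical stand-in for a PSR-7 response; `metricsResponse()` and the 503 body text are assumptions, not existing Bamboo code.

```php
<?php
declare(strict_types=1);

// Media type for the Prometheus text exposition format.
const METRICS_CONTENT_TYPE = 'text/plain; version=0.0.4';

/**
 * Builds the /metrics response. $render is any callable that returns the
 * already formatted text body (for example a thin wrapper around the
 * client library's renderer) and throws when storage is unreachable.
 *
 * @return array{status: int, headers: array<string, string>, body: string}
 */
function metricsResponse(callable $render): array
{
    try {
        return [
            'status' => 200,
            'headers' => ['Content-Type' => METRICS_CONTENT_TYPE],
            'body' => $render(), // emitted verbatim, never JSON encoded
        ];
    } catch (\Throwable $e) {
        // Degrade to 503 so a failed scrape is visible to Prometheus.
        return [
            'status' => 503,
            'headers' => ['Content-Type' => 'text/plain; charset=utf-8'],
            'body' => "metrics storage unavailable\n",
        ];
    }
}

$healthy = metricsResponse(fn (): string => "bamboo_http_requests_total 42\n");
$degraded = metricsResponse(function (): string {
    throw new RuntimeException('redis down');
});
```

The 503 body deliberately omits the exception message; whether to expose backend details on a scrape endpoint is a policy choice this note leaves open.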
Timeout middleware strategy
- Implement a `TimeoutMiddleware` that wraps the downstream handler in `OpenSwoole\Coroutine::withTimeout()` (or falls back to `Swoole\Timer::after()` when coroutines are unavailable). When the timeout elapses, it aborts the request, increments a `bamboo_http_timeouts_total` counter, and returns a `504` response.
- Configuration for default/global timeouts will live in `etc/middleware.php` under a `timeouts` group so routes can opt in or override with the alias syntax defined in the v0.3 middleware document.
- The middleware will emit timing data around the downstream call so latencies are recorded even when the timeout trips. Metrics instrumentation should use the shared collector registered in the container.
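The control flow of the middleware, independent of how the deadline is enforced, might look like this sketch. The timeout mechanism is injected as a callable so the example runs without OpenSwoole; `TimeoutMiddlewareSketch`, `HandlerTimedOut`, and the public counters are hypothetical stand-ins for the real middleware and the shared collector.

```php
<?php
declare(strict_types=1);

// Hypothetical exception raised by whichever timeout mechanism is used.
final class HandlerTimedOut extends RuntimeException
{
}

final class TimeoutMiddlewareSketch
{
    /** Stands in for the bamboo_http_timeouts_total counter. */
    public int $timeouts = 0;

    /** @var float[] Stands in for a latency histogram. */
    public array $durations = [];

    /** @var callable(callable, float): int */
    private $runWithTimeout;
    private float $seconds;

    public function __construct(callable $runWithTimeout, float $seconds)
    {
        $this->runWithTimeout = $runWithTimeout;
        $this->seconds = $seconds;
    }

    /** Returns the HTTP status produced downstream, or 504 on timeout. */
    public function handle(callable $next): int
    {
        $start = microtime(true);
        try {
            return ($this->runWithTimeout)($next, $this->seconds);
        } catch (HandlerTimedOut $e) {
            $this->timeouts++;
            return 504;
        } finally {
            // Runs on both paths, so timed-out requests are still measured.
            $this->durations[] = microtime(true) - $start;
        }
    }
}

// Fake runners for illustration; a real one would enforce the deadline
// with the coroutine or timer facilities of the runtime.
$passThrough = fn (callable $next, float $seconds): int => $next();
$alwaysLate = function (callable $next, float $seconds): int {
    throw new HandlerTimedOut("exceeded {$seconds}s");
};

$fast = new TimeoutMiddlewareSketch($passThrough, 2.0);
$slow = new TimeoutMiddlewareSketch($alwaysLate, 2.0);
$fastStatus = $fast->handle(fn (): int => 200);
$slowStatus = $slow->handle(fn (): int => 200);
```

Injecting the runner also keeps the middleware testable: the coroutine-based and timer-based strategies become two implementations of the same callable.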
Circuit breaker middleware strategy
- Introduce a `CircuitBreakerMiddleware` that monitors upstream failures using a rolling window stored in `Swoole\Table` or APCu. When thresholds are exceeded (configurable via `etc/middleware.php`), the middleware short-circuits requests and returns `503 Service Unavailable` with a retry-after hint.
- Middleware ordering matters: circuit breakers must run before expensive work so they will be placed early in the global stack (before authentication or route handlers) but after logging so failures are still recorded.
- State transitions (closed → open → half-open) will publish gauges/counters to Prometheus so operators can correlate breaker activity with upstream outages.
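The closed → open → half-open cycle can be sketched as a small state machine. This is an illustration under assumptions: `CircuitBreakerSketch`, its constructor parameters, and the single-probe half-open policy are hypothetical, state lives in object properties rather than `Swoole\Table` or APCu, and timestamps are passed in explicitly so the behaviour is deterministic.

```php
<?php
declare(strict_types=1);

final class CircuitBreakerSketch
{
    public const CLOSED = 'closed';
    public const OPEN = 'open';
    public const HALF_OPEN = 'half_open';

    private string $state = self::CLOSED;
    /** @var float[] timestamps of recent failures */
    private array $failures = [];
    private float $openedAt = 0.0;
    private int $threshold;
    private float $window;
    private float $cooldown;

    public function __construct(int $threshold, float $windowSeconds, float $cooldownSeconds)
    {
        $this->threshold = $threshold;
        $this->window = $windowSeconds;
        $this->cooldown = $cooldownSeconds;
    }

    /** State at time $now; an open breaker turns half-open after the cooldown. */
    public function state(float $now): string
    {
        if ($this->state === self::OPEN && $now - $this->openedAt >= $this->cooldown) {
            $this->state = self::HALF_OPEN;
        }
        return $this->state;
    }

    /** False means the middleware should short-circuit with 503. */
    public function allows(float $now): bool
    {
        return $this->state($now) !== self::OPEN;
    }

    public function recordSuccess(float $now): void
    {
        if ($this->state($now) === self::HALF_OPEN) {
            $this->state = self::CLOSED; // probe succeeded
            $this->failures = [];
        }
    }

    public function recordFailure(float $now): void
    {
        $state = $this->state($now);
        if ($state === self::OPEN) {
            return; // already short-circuiting
        }
        if ($state === self::HALF_OPEN) {
            $this->trip($now); // probe failed: reopen immediately
            return;
        }
        // Keep only failures inside the rolling window, then add this one.
        $this->failures = array_values(array_filter(
            $this->failures,
            fn (float $t): bool => $now - $t < $this->window
        ));
        $this->failures[] = $now;
        if (count($this->failures) >= $this->threshold) {
            $this->trip($now);
        }
    }

    private function trip(float $now): void
    {
        $this->state = self::OPEN;
        $this->openedAt = $now;
        $this->failures = [];
    }
}

$breaker = new CircuitBreakerSketch(3, 10.0, 30.0);
$breaker->recordFailure(0.0);
$breaker->recordFailure(1.0);
$breaker->recordFailure(2.0);        // third failure within 10 s trips it
$blocked = !$breaker->allows(5.0);   // still inside the 30 s cooldown
$probeState = $breaker->state(32.0); // cooldown elapsed: half-open
$breaker->recordSuccess(33.0);       // probe succeeded: closed again
```

Each assignment to `$this->state` is a natural place to update the gauges and counters mentioned above.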
Graceful shutdown & health checks
- OpenSwoole exposes `on('Shutdown')` and `on('WorkerExit')` hooks. Register listeners that flush pending metrics, mark workers as unhealthy in the readiness registry, and close Redis connections. The HTTP server bootstrap (`bootstrap/server.php`) will wire these callbacks during application boot.
- Implement lightweight `/healthz` (liveness) and `/readyz` (readiness) endpoints. The liveness check returns `200` as long as the worker loop is running; readiness reports `200` only when critical dependencies (Redis, database, Prometheus storage) are reachable.
- A shared `HealthRegistry` service will track worker readiness state. It uses a `Swoole\Table` or cache entry updated on start/stop events. The Prometheus exporter will expose `bamboo_worker_ready` gauges sourced from this registry.
- Graceful shutdown will tie into the CLI command `http.serve` so signals sent to the managed OpenSwoole server first mark readiness as false, stop accepting new connections, wait for in-flight requests to complete (observed via the shared collector), and finally exit.
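A minimal sketch of the registry and the readiness decision follows. `HealthRegistrySketch`, `readyzStatus()`, and the choice of `503` for the not-ready case are assumptions; a plain array stands in for the `Swoole\Table` or cache entry that would share state between workers, and the dependency probes are simple callables.

```php
<?php
declare(strict_types=1);

// Hypothetical registry; the array replaces the shared table or cache entry.
final class HealthRegistrySketch
{
    /** @var array<int, bool> worker id => ready flag */
    private array $workers = [];

    public function markReady(int $workerId): void
    {
        $this->workers[$workerId] = true;
    }

    public function markUnhealthy(int $workerId): void
    {
        $this->workers[$workerId] = false;
    }

    public function isReady(int $workerId): bool
    {
        return $this->workers[$workerId] ?? false;
    }

    /**
     * Values for a bamboo_worker_ready gauge, keyed by worker id.
     *
     * @return array<int, int>
     */
    public function gaugeValues(): array
    {
        return array_map(fn (bool $ready): int => $ready ? 1 : 0, $this->workers);
    }
}

/**
 * Readiness: 200 only if the worker is marked ready and every dependency
 * probe returns true; 503 otherwise (the 503 is an assumption).
 *
 * @param array<string, callable(): bool> $probes
 */
function readyzStatus(HealthRegistrySketch $registry, int $workerId, array $probes): int
{
    if (!$registry->isReady($workerId)) {
        return 503;
    }
    foreach ($probes as $probe) {
        if (!$probe()) {
            return 503;
        }
    }
    return 200;
}

$registry = new HealthRegistrySketch();
$registry->markReady(0);
$probes = [
    'redis' => fn (): bool => true,
    'database' => fn (): bool => true,
];
$before = readyzStatus($registry, 0, $probes);
$registry->markUnhealthy(0); // e.g. called from a shutdown listener
$after = readyzStatus($registry, 0, $probes);
```

Marking the worker unhealthy before anything else happens during shutdown is what lets an orchestrator stop routing traffic while in-flight requests drain.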
Integration summary
- Dependencies – Composer require `promphp/prometheus_client_php` and enable Redis or Swoole Table storage. Optional extension: `ext-apcu` for local storage.
- ResponseEmitter – No functional changes; handlers must return PSR-7 responses with the correct headers so the emitter can stream metrics text and health payloads unchanged.
- Middleware pipeline – Timeout and circuit breaker middleware will be registered via `etc/middleware.php` using the v0.3 alias/group conventions, ensuring deterministic ordering with other global middleware. Instrumentation hooks in each middleware feed the shared collector.
- Operational hooks – Bootstrap wiring listens for OpenSwoole worker events to maintain readiness state and flush metrics before exit, while new health endpoints give orchestrators clear liveness and readiness signals.