Observability Stack

TL;DR — Logz.io is the single platform for logs, errors, and alerts. Correlation IDs (injected via an internal shared NestJS package) let you trace a request across microservices. Alerts are delivered to Slack.

Overview

Observability at Securitize is concentrated in one tool (Logz.io), combined with a shared-library pattern that propagates correlation IDs across service calls. The goal is that every production error can be traced from its symptom back to the originating request.

Components

Logz.io

  • What it covers: monitoring, error/log management, alert configuration.
  • URL: see databases-and-services.md.
  • Alert delivery: certain error types (e.g., failing events) are configured to notify Slack channels.

Correlation IDs

  • Propagated via an internal NestJS shared package (part of nestjs-shared — see shared-libraries.md).
  • Every inbound request either reuses the correlation ID supplied by its caller or has a new one generated on entry.
  • All outbound service calls forward the correlation ID in headers.
  • Logz.io queries can filter by correlation ID to reconstruct full request traces across microservices.
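
The propagation rules above can be sketched in a few lines of TypeScript. This is an illustration only, not the code in nestjs-shared: the header name `x-correlation-id`, the log field `correlationId`, and every function name below are assumptions made for the example. It uses only Node.js built-ins, so it shows the pattern without depending on NestJS.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Assumed header name; the shared package may use a different one.
const CORRELATION_HEADER = "x-correlation-id";

// Node.js lowercases incoming header names, so lookups use the lowercase key.
type IncomingHeaders = Record<string, string | string[] | undefined>;

// Holds the correlation ID for the lifetime of one request's async call chain.
const correlationStore = new AsyncLocalStorage<string>();

// Reuse the caller's ID when one is present, otherwise generate a new one.
function resolveCorrelationId(headers: IncomingHeaders): string {
  const incoming = headers[CORRELATION_HEADER];
  const value = Array.isArray(incoming) ? incoming[0] : incoming;
  if (value !== undefined && value.trim() !== "") {
    return value;
  }
  return randomUUID();
}

// Run a request handler with the correlation ID bound to its async context.
function runWithCorrelationId<T>(headers: IncomingHeaders, handler: () => T): T {
  return correlationStore.run(resolveCorrelationId(headers), handler);
}

// Headers to attach to an outbound service call made inside the handler.
function outboundHeaders(extra: Record<string, string> = {}): Record<string, string> {
  const id = correlationStore.getStore();
  return id !== undefined ? { ...extra, [CORRELATION_HEADER]: id } : { ...extra };
}

// Structured log line carrying the ID so that log queries can filter on it.
function logLine(message: string): string {
  return JSON.stringify({ message, correlationId: correlationStore.getStore() ?? null });
}
```

In a real service the `runWithCorrelationId` call would sit in a middleware or interceptor, so handlers and HTTP clients never pass the ID around by hand.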

Slack alerts

  • Certain high-signal errors (e.g., failing events, pipeline failures) post to Slack.
  • Channel routing is configured per alert in Logz.io.

Usage patterns

Debugging a user-reported issue

  1. Ask the user for (or find in logs) a correlation ID or approximate timestamp.
  2. In Logz.io, filter by correlation ID or time window.
  3. Follow the trace across services: the log entries that share the correlation ID show the chain from entry point to failure.
  4. Cross-reference with the service deploy (ECR commit hash — see jenkins-k8s-jobs.md) to confirm which code is running.
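
Steps 2 and 3 amount to filtering log entries on one correlation ID and reading them in time order. The sketch below applies that logic to a set of exported log records, as a way to make the procedure concrete. The record shape, the field names, and the service names in the sample are all hypothetical; the actual log schema in Logz.io may differ.

```typescript
// Hypothetical shape of an exported log record; real field names may differ.
interface LogRecord {
  timestamp: string; // ISO 8601
  service: string;
  level: "info" | "warn" | "error";
  message: string;
  correlationId?: string;
}

// Keep only the records for one request, ordered oldest first.
function reconstructTrace(records: LogRecord[], correlationId: string): LogRecord[] {
  return records
    .filter((r) => r.correlationId === correlationId)
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
}

// Services in the order they first appear in the trace.
function serviceChain(trace: LogRecord[]): string[] {
  const names = trace.map((r) => r.service);
  return names.filter((name, index) => names.indexOf(name) === index);
}

// The earliest error in the trace, or undefined if the trace has none.
function firstError(trace: LogRecord[]): LogRecord | undefined {
  return trace.find((r) => r.level === "error");
}

// Sample data with made-up service names, deliberately out of order.
const sampleRecords: LogRecord[] = [
  { timestamp: "2024-05-01T10:00:02.000Z", service: "ledger-service", level: "error", message: "write failed", correlationId: "abc-123" },
  { timestamp: "2024-05-01T10:00:00.000Z", service: "api-gateway", level: "info", message: "request received", correlationId: "abc-123" },
  { timestamp: "2024-05-01T10:00:01.000Z", service: "investor-service", level: "info", message: "calling ledger", correlationId: "abc-123" },
  { timestamp: "2024-05-01T10:00:01.500Z", service: "api-gateway", level: "info", message: "unrelated request", correlationId: "zzz-999" },
];

const trace = reconstructTrace(sampleRecords, "abc-123");
console.log(serviceChain(trace).join(" -> "));
console.log(firstError(trace)?.message);
```

Records without a correlation ID never match the filter, so logs from services that do not emit one will be missing from a trace built this way.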

Current state notes

  • Correlation ID emission varies across services; some legacy Express services do not use the shared package.
  • Alert configuration lives inside Logz.io (not backed by IaC).
  • Formal SLO/SLI documentation is not yet in place.

Tags

#observability #logging #monitoring #logz-io #correlation-id #alerts #slack