Streaming

Kafka Clickstream Telemetry

Next.js → Kafka → ClickHouse, with real-time dashboards and latency metrics.

Kafka · ClickHouse · Next.js · TypeScript · ClickPipes

Overview

A production-shaped clickstream pipeline for my portfolio. Every meaningful interaction emits an event, lands in Kafka, and becomes queryable in ClickHouse within seconds.

Problem

  • I wanted a portfolio that proves data engineering skills, not just screenshots.
  • Most analytics tools are black boxes; I wanted a transparent pipeline built on real streaming concepts.

Solution

  • Browser emits events (page_view, button_click, project_open) to a `/api/track` endpoint.
  • API publishes JSON messages to a Kafka topic partitioned by `session_id`.
  • ClickHouse ingests from Kafka (ClickPipes recommended) into `events_raw` and materializes rollups.
  • Telemetry page queries ClickHouse to display freshness, latency P50/P95, and top clicks.
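
The first two steps can be sketched as a pair of pure functions: one validates the payload that `/api/track` receives, and one turns it into a Kafka message keyed by `session_id`. This is a minimal sketch, not the project's actual handler. The `client_ts` field and the names `TrackEvent`, `parseEvent`, and `toKafkaMessage` are assumptions made for illustration; only the event types, `event_id`, and `session_id` come from this page.

```typescript
// Hypothetical event shape. Only event_id, session_id, and the three
// event types are taken from the page; client_ts is an assumption.
type EventType = "page_view" | "button_click" | "project_open";

interface TrackEvent {
  event_id: string;   // unique per event, reused unchanged on retries
  session_id: string; // becomes the Kafka message key
  event_type: EventType;
  client_ts: number;  // epoch milliseconds when the interaction happened
}

interface KafkaMessage {
  key: string;
  value: string;
}

const EVENT_TYPES: string[] = ["page_view", "button_click", "project_open"];

// Returns a validated event, or null when the payload is malformed.
function parseEvent(payload: unknown): TrackEvent | null {
  if (typeof payload !== "object" || payload === null) return null;
  const p = payload as Record<string, unknown>;
  if (typeof p.event_id !== "string" || p.event_id === "") return null;
  if (typeof p.session_id !== "string" || p.session_id === "") return null;
  if (typeof p.event_type !== "string") return null;
  if (EVENT_TYPES.indexOf(p.event_type) === -1) return null;
  if (typeof p.client_ts !== "number" || !isFinite(p.client_ts)) return null;
  return {
    event_id: p.event_id,
    session_id: p.session_id,
    event_type: p.event_type as EventType,
    client_ts: p.client_ts,
  };
}

// Keying by session_id means every event of a session shares one key,
// so a key-based partitioner sends them to the same partition.
function toKafkaMessage(event: TrackEvent): KafkaMessage {
  return { key: event.session_id, value: JSON.stringify(event) };
}
```

The `{ key, value }` pair is the shape a typical Kafka producer client expects for each message; payloads for which `parseEvent` returns null would be counted as invalid events rather than published.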

Architecture

  1. Next.js UI captures events
  2. `/api/track` publishes to Kafka topic `portfolio.events.v1`
  3. ClickHouse ingests Kafka → `events_raw`
  4. Materialized rollups → `events_daily`
  5. Telemetry dashboard queries ClickHouse
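
To make step 4 concrete, here is an in-memory sketch of what a daily rollup computes: events per day and distinct sessions per day. In the pipeline this aggregation happens inside ClickHouse, not in application code. The `RawEvent` and `DailyRollup` shapes, the `event_ts` field, and the choice of UTC day boundaries are all assumptions made for illustration.

```typescript
// Hypothetical row shapes; the real events_raw / events_daily schemas
// are not shown on this page.
interface RawEvent {
  event_id: string;
  session_id: string;
  event_ts: number; // epoch milliseconds
}

interface DailyRollup {
  day: string;      // YYYY-MM-DD, UTC (an assumption)
  events: number;   // total events that day
  sessions: number; // distinct session_id values that day
}

function rollupDaily(events: RawEvent[]): DailyRollup[] {
  const byDay = new Map<string, { events: number; sessions: Set<string> }>();
  for (const e of events) {
    // toISOString() is always UTC, so the first 10 chars are the UTC date.
    const day = new Date(e.event_ts).toISOString().slice(0, 10);
    let bucket = byDay.get(day);
    if (bucket === undefined) {
      bucket = { events: 0, sessions: new Set<string>() };
      byDay.set(day, bucket);
    }
    bucket.events += 1;
    bucket.sessions.add(e.session_id);
  }
  const rows: DailyRollup[] = [];
  byDay.forEach((bucket, day) => {
    rows.push({ day, events: bucket.events, sessions: bucket.sessions.size });
  });
  // Sort chronologically; ISO dates sort correctly as plain strings.
  rows.sort((a, b) => (a.day < b.day ? -1 : a.day > b.day ? 1 : 0));
  return rows;
}
```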

Metrics

  • End-to-end latency: P50/P95 (click → queryable).
  • Freshness: last_event_at, events/day, sessions/day.
  • Reliability: API 2xx rate and invalid-event drop rate (more metrics can be added).
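
As an illustration of the latency metric, the sketch below computes P50 and P95 over click-to-queryable latencies using the nearest-rank method. The dashboard itself queries ClickHouse, which may use a different (often approximate) quantile estimator, so treat this only as a definition of what the numbers mean. The `TimedEvent` shape and both timestamp field names are assumptions.

```typescript
// Hypothetical pair of timestamps per event, both in epoch milliseconds.
interface TimedEvent {
  client_ts: number;    // when the click happened in the browser
  queryable_ts: number; // when the row became visible to queries
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

function latencySummary(events: TimedEvent[]): { p50: number; p95: number } {
  const latencies = events.map((e) => e.queryable_ts - e.client_ts);
  return { p50: percentile(latencies, 50), p95: percentile(latencies, 95) };
}
```

With ten samples of 100 ms, 200 ms, up to 1000 ms, nearest-rank gives a P50 of 500 ms and a P95 of 1000 ms. P95 is reported alongside P50 because a few slow events can hide behind a healthy median.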

Highlights

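  • Partition key: `session_id` (keeps each session's events in order within one partition).
  • Idempotency: unique `event_id` per event, so retried sends can be recognized as duplicates.
  • Low ops: managed Kafka plus ClickPipes removes the need for a custom consumer service.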
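
The idempotency point can be sketched as follows: if a retry resends an event with the same `event_id`, a downstream step can keep the first copy and drop the rest. Where and how this pipeline actually deduplicates is not specified here; the in-memory `dedupeByEventId` helper below is hypothetical and only demonstrates the idea.

```typescript
// Any record carrying an event_id can be deduplicated this way.
interface IdentifiedEvent {
  event_id: string;
}

// Keeps the first occurrence of each event_id and preserves input order.
function dedupeByEventId<T extends IdentifiedEvent>(events: T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const e of events) {
    if (seen.has(e.event_id)) continue; // a retry of an event already kept
    seen.add(e.event_id);
    unique.push(e);
  }
  return unique;
}
```

This only works when the client generates `event_id` once per interaction and reuses it on every retry; generating a fresh ID per send attempt would make each retry look like a new event.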