Streaming
Kafka Clickstream Telemetry
Next.js → Kafka → ClickHouse, with real-time dashboards + latency metrics.
Kafka, ClickHouse, Next.js, TypeScript, ClickPipes
Overview
A production-shaped clickstream pipeline for my portfolio. Every meaningful interaction emits an event that lands in Kafka and becomes queryable in ClickHouse within seconds.
Problem
- I wanted a portfolio that proves data engineering skills, not just screenshots.
- Most analytics tools are black boxes; I wanted a transparent pipeline built on real streaming concepts.
Solution
- Browser emits events (page_view, button_click, project_open) to a `/api/track` endpoint.
- API publishes JSON messages to a Kafka topic partitioned by `session_id`.
- ClickHouse ingests from Kafka (ClickPipes recommended) into `events_raw` and materializes rollups into `events_daily`.
- Telemetry page queries ClickHouse to display freshness, latency P50/P95, and top clicks.
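The first two steps can be sketched as a validate-then-key function pair. This is a minimal sketch, not the actual `/api/track` handler: the `name` and `ts` field names and the helpers `parseTrackEvent` and `toKafkaMessage` are assumptions for illustration; only the event names, `session_id`, and `event_id` come from the description above.

```typescript
// Hypothetical event shape; field names other than session_id/event_id are assumed.
type EventName = "page_view" | "button_click" | "project_open";

interface TrackEvent {
  event_id: string;   // unique per event
  session_id: string; // used as the Kafka message key
  name: EventName;
  ts: string;         // ISO-8601 client timestamp (assumed field)
}

const EVENT_NAMES: readonly string[] = ["page_view", "button_click", "project_open"];

function isNonEmptyString(v: unknown): v is string {
  return typeof v === "string" && v.length > 0;
}

// Returns a typed event, or null when the payload should be dropped as invalid.
function parseTrackEvent(input: unknown): TrackEvent | null {
  if (typeof input !== "object" || input === null) return null;
  const { event_id, session_id, name, ts } = input as Record<string, unknown>;
  if (!isNonEmptyString(event_id) || !isNonEmptyString(session_id)) return null;
  if (!isNonEmptyString(name) || !EVENT_NAMES.includes(name)) return null;
  if (!isNonEmptyString(ts) || Number.isNaN(Date.parse(ts))) return null;
  return { event_id, session_id, name: name as EventName, ts };
}

// Keying the message by session_id is what routes a session to a single partition.
function toKafkaMessage(e: TrackEvent): { key: string; value: string } {
  return { key: e.session_id, value: JSON.stringify(e) };
}
```

The `{ key, value }` pair is the shape most Kafka producer clients accept per message; the producer call itself is omitted because the client library is not specified here.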
Architecture
- Next.js UI captures events
- `/api/track` publishes to Kafka topic `portfolio.events.v1`
- ClickHouse ingests Kafka → `events_raw`
- Materialized rollups → `events_daily`
- Telemetry dashboard queries ClickHouse
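To make the rollup step concrete, here is an in-memory sketch of what a daily rollup computes: events per day and distinct sessions per day. It only illustrates the aggregation; in the pipeline this work happens inside ClickHouse. The `RawEvent` and `DailyRow` shapes and the `rollupDaily` helper are assumptions, not the real table schemas.

```typescript
// Assumed minimal row shapes; the real events_raw / events_daily columns may differ.
interface RawEvent { session_id: string; ts: string; } // ts: ISO-8601 timestamp
interface DailyRow { day: string; events: number; sessions: number; }

// Groups raw events by UTC calendar day and counts events and distinct sessions.
function rollupDaily(rows: RawEvent[]): DailyRow[] {
  const byDay = new Map<string, { events: number; sessions: Set<string> }>();
  for (const row of rows) {
    const day = new Date(row.ts).toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
    let bucket = byDay.get(day);
    if (!bucket) {
      bucket = { events: 0, sessions: new Set<string>() };
      byDay.set(day, bucket);
    }
    bucket.events += 1;
    bucket.sessions.add(row.session_id);
  }
  return Array.from(byDay.entries())
    .map(([day, b]) => ({ day, events: b.events, sessions: b.sessions.size }))
    .sort((a, b) => a.day.localeCompare(b.day));
}
```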
Metrics
- End-to-end latency: P50/P95 (click → queryable).
- Freshness: last_event_at, events/day, sessions/day.
- Reliability: API 2xx rate and invalid-event drop rate (extendable).
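As an illustration of the latency metric, the sketch below computes P50 and P95 with the nearest-rank method over a list of click-to-queryable latencies in milliseconds. The `percentile` helper and the sample values are made up for the example; the dashboard itself would more likely rely on ClickHouse's built-in quantile functions, which may use a different (approximate) method.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Made-up latencies in milliseconds (click → queryable).
const latenciesMs = [120, 80, 200, 150, 950, 110, 90, 130, 170, 140];
const p50 = percentile(latenciesMs, 50); // 130
const p95 = percentile(latenciesMs, 95); // 950
```

With only ten samples a single slow event dominates P95, which is why the two percentiles are reported together.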
Highlights
- Partition key: session_id (all events for a session go to one partition, which keeps them in order).
- Idempotency: each event carries a unique event_id, so retries are safe.
- Low ops: managed Kafka plus ClickPipes removes the need for a custom consumer service.
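The first two highlights can be sketched in a few lines. This is a hedged illustration, not the pipeline's code: `partitionFor` uses an FNV-1a hash as a stand-in (Kafka clients use their own partitioner, so real partition numbers will differ), and `dedupeByEventId` shows the idea of dropping retried duplicates in memory rather than how ClickHouse would do it. Both helper names are hypothetical.

```typescript
// Stand-in partitioner: FNV-1a 32-bit hash of the key, modulo the partition count.
// The property that matters: the same key always maps to the same partition.
function partitionFor(key: string, numPartitions: number): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % numPartitions; // >>> 0 forces an unsigned 32-bit value
}

// Keeps the first occurrence of each event_id, so a retried send does not double-count.
function dedupeByEventId<T extends { event_id: string }>(events: T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const e of events) {
    if (seen.has(e.event_id)) continue;
    seen.add(e.event_id);
    unique.push(e);
  }
  return unique;
}
```

For example, `dedupeByEventId([{ event_id: "a" }, { event_id: "a" }, { event_id: "b" }])` keeps two events, and `partitionFor("session-1", 6)` returns the same partition on every call.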
