Analytics for Favicons: Using ClickHouse for High-Volume Icon Telemetry
Build a ClickHouse pipeline to analyze favicon loads, cache hit rates, device differences and A/B tests at scale — actionable setup & SQL examples.
Stop wondering whether your favicons are working — measure them. Fast.
Favicons are tiny, but their telemetry is high-volume and surprisingly informative. If you run an icon service, CDN, or a site-builder, you face these recurring problems: noisy event streams, inconsistent cache hit rates across devices, and fragile A/B experiments that don’t scale. This guide shows how to build a ClickHouse-powered analytics pipeline that ingests billions of favicon events per day and produces reliable metrics for cache hit rate, device variation, performance, and A/B tests — without blowing up your cost or query latency.
Why ClickHouse in 2026?
ClickHouse has continued to lead OLAP adoption for real-time, high-cardinality telemetry. Large investments and rapid feature development through late 2024–2025 — including ClickHouse’s large growth round reported in late 2025 — made it a first-class choice for event analytics at scale. For favicon telemetry you need:
- High ingest throughput (Kafka/HTTP streams)
- Low-latency OLAP queries for dashboards and experiments
- Cost-effective storage with TTL and compression
- Clustered, replicated storage for availability
Key claim: ClickHouse gives you millisecond query response for aggregated telemetry and can sustain billions of small events a day when paired with Kafka/Redpanda and proper schema design.
Architecture overview: From browser request to dashboard
At a high level, the pipeline is:
- Client or edge emits favicon telemetry (loads, cache headers, status)
- Events are batched to Kafka / Redpanda or pushed to an HTTP ingest gateway
- ClickHouse reads from Kafka or HTTP and writes into MergeTree tables
- Materialized views pre-aggregate per-minute and per-variant metrics
- Grafana (or Superset) queries ClickHouse for dashboards, alerts, and AB analysis
Recommended tech choices
- Event bus: Kafka or Redpanda for durability and backpressure
- Ingest buffer: ClickHouse Kafka engine + Materialized Views, or ClickHouse HTTP API with a buffering proxy
- Tables: ReplicatedReplacingMergeTree / AggregatingMergeTree for dedup and rollups
- Query/visualization: Grafana ClickHouse plugin or ClickHouse dashboards
Designing the event schema
Keep events compact and consistent. Use typed JSON or binary (Protobuf/Avro) when possible. Below is a compact JSON schema tailored for favicon telemetry:
{
  "event_id": "uuid",                      // client-generated or edge-assigned
  "ts": "2026-01-17T12:34:56.789Z",
  "domain": "example.com",
  "page_url": "/docs/favicon-intro",
  "favicon_id": "hash-or-id",
  "cache_status": "HIT|MISS|STALE",
  "response_ms": 23,                       // time to serve icon
  "status_code": 200,
  "user_agent_key": "ios-17-safari",       // pre-parsed on edge
  "device_type": "mobile|desktop|tablet",
  "experiment_id": "exp_2026_new_shape",   // optional
  "variant": "A|B",
  "sample_rate": 1,                        // client-side sampling fraction
  "is_prefetch": false
}
Design tips:
- Pre-parse user agent and map to a few device buckets at the edge to reduce cardinality.
- Include sample_rate for correctness when clients sample events.
- Use hashed identifiers for PII compliance (never send raw user emails).
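A minimal edge-side sketch of these tips in Python. The `DEVICE_BUCKETS` map, the `build_event` helper, and the truncated hash are illustrative assumptions, not a prescribed format:

```python
import hashlib
import json
import random
import uuid
from datetime import datetime, timezone

# Illustrative UA-substring -> bucket map; a real edge would use a proper parser
# but still collapse to a handful of buckets to keep cardinality low.
DEVICE_BUCKETS = [("Mobile", "mobile"), ("Tablet", "tablet")]

def device_bucket(user_agent: str) -> str:
    """Collapse a raw user agent into one of a few low-cardinality buckets."""
    for needle, bucket in DEVICE_BUCKETS:
        if needle in user_agent:
            return bucket
    return "desktop"

def build_event(domain: str, favicon_id: str, user_agent: str,
                cache_status: str, response_ms: int,
                sample_rate: float = 1.0):
    """Return a compact telemetry event, or None if sampled out client-side."""
    if random.random() >= sample_rate:
        return None  # dropped; sample_rate lets the server rescale counts later
    return {
        "event_id": str(uuid.uuid4()),
        "ts": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "domain": domain,
        # hash the identifier at the edge so no raw value leaves the client
        "favicon_id": hashlib.sha256(favicon_id.encode()).hexdigest()[:16],
        "cache_status": cache_status,
        "response_ms": response_ms,
        "device_type": device_bucket(user_agent),
        "sample_rate": sample_rate,
    }
```

The point of the sketch is that everything high-cardinality (user agent, raw identifiers) is collapsed or hashed before the event ever leaves the edge.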
ClickHouse table design
For high-volume favicon events, a two-layer storage model works well:
- Raw events into a high-ingest MergeTree (for reprocessing and debugging)
- Pre-aggregated tables (per-minute/per-domain/per-variant) for dashboards and experiments
Raw events table
CREATE TABLE favicons.events_raw (
    event_id UUID,
    ts DateTime64(3, 'UTC'),
    domain String,
    page_url String,
    favicon_id String,
    cache_status LowCardinality(String),
    response_ms UInt32,
    status_code UInt16,
    device_type LowCardinality(String),
    experiment_id LowCardinality(String),
    variant LowCardinality(String),
    sample_rate Float32
) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/favicons.events_raw', '{replica}')
PARTITION BY toYYYYMM(ts)
ORDER BY (domain, favicon_id, ts, event_id)
TTL toDateTime(ts) + INTERVAL 90 DAY
SETTINGS index_granularity = 8192;
Notes:
- Partition by month to bound merge sizes and keep TTL deletes cheap.
- Use LowCardinality for string columns with few distinct values (cache_status, device_type, variant).
- ReplacingMergeTree deduplicates rows that share the same sorting key during background merges, which is why event_id is part of the ORDER BY above; dedup is eventual, so query with FINAL when exact counts matter.
Aggregated, per-minute table (for dashboards)
Create the target table before the materialized view that feeds it. Plain numeric columns are not summed when AggregatingMergeTree merges rows, so the counters are declared as SimpleAggregateFunction; we also store the sum of response times rather than an average, because averages cannot be merged correctly across parts.
CREATE TABLE favicons.events_minute_mv (
    minute DateTime,
    domain String,
    favicon_id String,
    device_type LowCardinality(String),
    experiment_id LowCardinality(String),
    variant LowCardinality(String),
    impressions SimpleAggregateFunction(sum, UInt64),
    hits SimpleAggregateFunction(sum, UInt64),
    response_ms_sum SimpleAggregateFunction(sum, UInt64)
) ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(minute)
ORDER BY (domain, favicon_id, device_type, experiment_id, variant, minute);
CREATE MATERIALIZED VIEW favicons.events_minute
TO favicons.events_minute_mv
AS
SELECT
    toStartOfMinute(ts) AS minute,
    domain,
    favicon_id,
    device_type,
    experiment_id,
    variant,
    count() AS impressions,
    countIf(cache_status = 'HIT') AS hits,
    sum(response_ms) AS response_ms_sum
FROM favicons.events_raw
GROUP BY minute, domain, favicon_id, device_type, experiment_id, variant;
AggregatingMergeTree collapses rows with the same sorting key by combining their aggregate values, which keeps large rollups compact; derive averages at query time as response_ms_sum / impressions.
Ingestion patterns
1) Kafka + ClickHouse (recommended for scale)
Create a Kafka engine topic and a materialized view that inserts into events_raw:
CREATE TABLE favicons.kafka_events (
    event_id String,
    ts String,
    domain String,
    page_url String,
    favicon_id String,
    cache_status String,
    response_ms UInt32,
    status_code UInt16,
    device_type String,
    experiment_id String,
    variant String,
    sample_rate Float32
) ENGINE = Kafka(
    'kafka:9092',
    'favicon-events',
    'favicon-group',
    'JSONEachRow'
);
CREATE MATERIALIZED VIEW favicons.kafka_to_raw
TO favicons.events_raw
AS
SELECT
    toUUID(event_id) AS event_id,
    parseDateTime64BestEffort(ts) AS ts,
    domain,
    page_url,
    favicon_id,
    cache_status,
    response_ms,
    status_code,
    device_type,
    experiment_id,
    variant,
    sample_rate
FROM favicons.kafka_events;
The Kafka engine consumes messages in batches, so writes into events_raw arrive pre-batched; add a ClickHouse Buffer table in front of events_raw only if you also accept small direct inserts (e.g. over HTTP) that need short-lived write batching.
2) HTTP bulk ingestion (simpler, smaller scale)
Use a gateway (Nginx + buffering) that forwards compressed JSON batches to ClickHouse HTTP insert endpoint. Keep batch payloads around 50–500KB for efficiency.
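To make the batching concrete, here is a minimal Python sketch of the payload-building step behind such a gateway. The URL, query string, and `build_insert_payload` name are illustrative assumptions, not part of any ClickHouse client library; ClickHouse's HTTP interface does accept gzip bodies when Content-Encoding is set.

```python
import gzip
import json

# Hypothetical insert endpoint: the INSERT query is passed URL-encoded in the
# query string, and the request body carries the JSONEachRow rows.
CLICKHOUSE_URL = ("http://clickhouse:8123/"
                  "?query=INSERT%20INTO%20favicons.events_raw%20FORMAT%20JSONEachRow")

def build_insert_payload(events: list, max_bytes: int = 500_000) -> bytes:
    """Serialize a batch as gzip-compressed JSONEachRow, enforcing a size cap."""
    body = "".join(json.dumps(e, separators=(",", ":")) + "\n" for e in events)
    raw = body.encode("utf-8")
    if len(raw) > max_bytes:
        raise ValueError(f"batch too large: {len(raw)} bytes")
    return gzip.compress(raw)

# POST the returned bytes to CLICKHOUSE_URL with header Content-Encoding: gzip.
```

The size cap mirrors the 50–500KB guidance above: reject oversized batches at the gateway rather than letting them fan out into expensive inserts.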
Key queries: Cache hit rate, device breakdown, A/B tests
Cache hit rate (real-time, per-minute)
SELECT
    minute,
    domain,
    favicon_id,
    sum(hits) / sum(impressions) AS cache_hit_rate
FROM favicons.events_minute_mv
WHERE minute >= now() - INTERVAL 24 HOUR
GROUP BY minute, domain, favicon_id
ORDER BY minute ASC
LIMIT 1000;
Device variation: hit rate and latency by device
SELECT
    device_type,
    sum(hits) / sum(impressions) AS cache_hit_rate,
    sum(response_ms_sum) / sum(impressions) AS weighted_avg_response_ms
FROM favicons.events_minute_mv
WHERE domain = 'example.com' AND minute >= now() - INTERVAL 12 HOUR
GROUP BY device_type
ORDER BY cache_hit_rate DESC;
A/B experiment results: difference in cache hits and response times
Compute aggregated metrics per variant and use a simple Z-test for difference in proportions (cache hits). This example computes pooled proportion and z-score; adapt for your exact statistical needs.
SELECT
    sumIf(hits, variant = 'A') / sumIf(impressions, variant = 'A') AS p1,
    sumIf(hits, variant = 'B') / sumIf(impressions, variant = 'B') AS p2,
    sum(hits) / sum(impressions) AS p_pooled,
    (p1 - p2) / sqrt(p_pooled * (1 - p_pooled)
        * (1 / sumIf(impressions, variant = 'A')
         + 1 / sumIf(impressions, variant = 'B'))) AS z_score
FROM favicons.events_minute_mv
WHERE experiment_id = 'exp_2026_new_shape'
  AND variant IN ('A', 'B')
  AND minute >= now() - INTERVAL 7 DAY;
This runs in a single pass: ClickHouse allows reusing the p1, p2, and p_pooled aliases inside the z_score expression, and the pooled proportion is simply the overall hit rate because the WHERE clause restricts rows to the two variants.
Use the z_score to compute p-values client-side or with ClickHouse math functions. For large n, z-test approximations are valid. For small-sample tests, use bootstrap sampling or Bayesian methods.
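If you prefer to finish the arithmetic client-side, the same pooled two-proportion z-test can be sketched in Python; `two_proportion_z` is a hypothetical helper, and the p-value uses the normal approximation, so it is only trustworthy for large n:

```python
import math

def two_proportion_z(hits_a: int, n_a: int, hits_b: int, n_b: int):
    """Two-sided z-test for a difference in proportions (e.g. cache hit rates).

    Mirrors the pooled-proportion formula in the SQL above.
    """
    p1, p2 = hits_a / n_a, hits_b / n_b
    p_pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
    z = (p1 - p2) / se
    # Two-sided p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)) under the normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, 520 hits out of 1,000 impressions versus 480 out of 1,000 gives z roughly 1.79 and a two-sided p-value around 0.07, i.e. suggestive but not significant at the conventional 0.05 level.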
Operational best practices
Sampling and cardinality control
- Apply client-side sampling for optional debug fields; always include sample_rate so you can scale counts back up.
- Pre-map user agents and domain names to low-cardinality tags to avoid cardinality explosion.
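The rescaling rule can be made concrete with a tiny estimator; `estimated_impressions` is a hypothetical helper that weights each event by the inverse of its sample_rate (a Horvitz-Thompson-style estimate):

```python
def estimated_impressions(events: list) -> float:
    """Scale sampled events back to an unbiased population estimate.

    An event observed at sample_rate r stands in for roughly 1/r real events,
    which is why sample_rate must ride along with every event.
    """
    return sum(1.0 / e["sample_rate"] for e in events)
```

The same weighting works in SQL (sum of 1 / sample_rate instead of count()), so sampled and unsampled domains stay comparable on one dashboard.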
Retention and cost
- Store raw events for a bounded period (30–90 days) and keep aggregated rollups (per-minute or per-hour) for longer retention.
- Use TTLs in MergeTree definitions to automatically expire raw data.
Backfills and schema changes
Use the raw events table for reprocessing when event schema evolves. Materialized views simplify streaming ETL, but keep a reliable backfill path (batch re-insert into events_raw and run INSERT ... SELECT into aggregates).
Monitoring and alerts
- Track ingestion lag and Kafka consumer group offsets.
- Create alerts for sudden drops in cache hit rate or increases in avg_response_ms across domains or devices.
Privacy, consent, and compliance
In 2026, privacy-first telemetry patterns are expected. Avoid storing PII; prefer hashed identifiers and edge-side enrichment. Respect user consent: gate sampling and event emission behind consent checks. Use differential retention for regions with strict retention laws.
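A minimal sketch of consent-gated, keyed hashing at the edge; the `SALT` constant and the helper name are placeholders, and a real deployment would pull the key from a secret store and rotate it:

```python
import hashlib
import hmac

# Placeholder key: in production, load from a secret manager, never hard-code.
SALT = b"rotate-me-per-deployment"

def hashed_identifier(raw_id: str, consent: bool):
    """Return a keyed hash of an identifier, or None when consent is absent."""
    if not consent:
        return None  # emit no identifier at all without consent
    return hmac.new(SALT, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Keyed hashing (HMAC) rather than a bare hash makes offline dictionary attacks against common identifiers much harder, while staying deterministic so the same user maps to the same tag across events.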
Scaling tips and failure modes
Scaling write throughput
- Horizontally scale ClickHouse shards and use the Distributed engine for query routing.
- Use Kafka partitions keyed by domain or favicon_id to keep ordering where needed.
- Batch small events before inserting; tiny single-event inserts are expensive.
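The batching advice can be sketched as a small buffer that flushes on size or age; `EventBatcher` and its thresholds are illustrative, and the flush callback would hand the batch to Kafka or the HTTP gateway:

```python
import time

class EventBatcher:
    """Tiny batching buffer: flush when the batch is full or too old."""

    def __init__(self, flush, max_events: int = 1000, max_age_s: float = 1.0):
        self.flush_fn = flush
        self.max_events = max_events
        self.max_age_s = max_age_s
        self.buf = []
        self.started = time.monotonic()

    def add(self, event: dict) -> None:
        if not self.buf:
            self.started = time.monotonic()  # age counts from first event
        self.buf.append(event)
        if (len(self.buf) >= self.max_events
                or time.monotonic() - self.started >= self.max_age_s):
            self.flush()

    def flush(self) -> None:
        if self.buf:
            self.flush_fn(self.buf)
            self.buf = []
```

The age bound keeps latency predictable on quiet domains; the size bound keeps inserts cheap on busy ones.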
Handling spikes
Buffer with Kafka retention and provision extra ClickHouse capacity (or autoscale ClickHouse Cloud) for predictable traffic spikes caused by product launches or large sites.
Drive down query latency
- Pre-aggregate per-minute and materialize commonly used groupings.
- Use appropriate ORDER BY keys to support common query patterns (e.g., domain, minute).
- Enable index_granularity tuning when range scans over time are common.
Example case study (anonymized)
Company: an icon CDN serving 400M users monthly. Problem: inconsistent favicon cache behavior across mobile browsers and poor experiment fidelity. Solution implemented:
- Instrumented edge to emit compact event JSON with sample_rate and device_type.
- Piped events to Redpanda and into ClickHouse using the Kafka engine.
- Built per-minute AggregatingMergeTree rollups and surfaced them in Grafana.
Results (after 8 weeks):
- Detected a 12% lower cache hit rate on Android WebViews and rolled out a cache-control change that recovered 9%.
- Ran a controlled favicon shape experiment (A/B), and within 72 hours got statistically significant cache hit improvement for variant B — saved ~15% bandwidth across 10 top domains.
- Average dashboard query p99 latency: 180ms for top-level aggregations.
Advanced strategies and 2026 trends
Look ahead — these are patterns we recommend adopting in 2026:
- Edge-native pre-aggregation: compute per-minute key counts at the edge and send summarized events to reduce ingestion pressure.
- Client-driven sampling with adaptive rates: increase sampling for low-traffic domains to preserve experiment power and reduce cost for high-volume domains.
- Hybrid cloud-edge analytics: use ClickHouse Cloud for burst capacity and on-prem clusters for long-term raw retention.
- Integrate model-driven anomaly detection (lightweight ML) that runs over ClickHouse aggregates for early detection of cache regressions.
These patterns reflect industry moves toward decentralized telemetry and operational OLAP seen across 2024–2026 as ClickHouse and streaming platforms matured.
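The edge-native pre-aggregation pattern can be sketched in a few lines; field names follow the event schema earlier in this guide, and the minute key assumes ISO-8601 timestamps truncated to the minute:

```python
from collections import Counter

def preaggregate(events: list) -> list:
    """Collapse raw events into per-minute (domain, device, cache_status)
    counts before shipping them upstream: the edge-native rollup pattern."""
    counts = Counter(
        (e["ts"][:16], e["domain"], e["device_type"], e["cache_status"])
        for e in events
    )
    return [
        {"minute": m, "domain": d, "device_type": dev,
         "cache_status": cs, "count": n}
        for (m, d, dev, cs), n in counts.items()
    ]
```

Shipping these summaries instead of raw events can cut ingest volume by orders of magnitude on hot domains, at the cost of losing per-event fields like response_ms unless you also summarize those.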
Checklist: Quick deployment plan
- Define compact event schema and sampling rules.
- Stand up Redpanda/Kafka and create topic per environment.
- Deploy ClickHouse cluster (3+ replicas) or use ClickHouse Cloud.
- Create Kafka engine table + materialized view into events_raw.
- Build per-minute AggregatingMergeTree rollups and dashboard queries.
- Run backfill to validate counts vs historical logs.
- Instrument A/B assignment and expose experiment ids in events.
- Monitor ingestion lag, p99 query times, and cost by domain.
Actionable takeaways
- Always include sample_rate in events so you can scale sampling without biasing counts.
- Pre-aggregate per-minute to keep dashboards fast and cheap.
- Use Kafka + ClickHouse Kafka engine for durable, high-throughput ingestion.
- Partition and TTL raw events; keep aggregates longer for trend analysis.
- Design experiments to track both cache hit rate and user-perceived latency — both matter.
Further reading and resources (2026)
ClickHouse’s rapid growth and enterprise adoption in 2024–2025 (including major funding rounds and platform investments) have made it the de facto OLAP engine for telemetry at scale. If you’re evaluating providers, compare managed ClickHouse offerings for autoscaling and backup options to estimate TCO for multi-billion-event workloads.
Final notes
Favicons may be small, but their telemetry is a high-signal source for performance, cache strategy, and visual experiments. With ClickHouse you can reliably ingest, analyze, and act on that signal at scale. The patterns in this guide will help you build a resilient, fast, and cost-aware analytics pipeline for favicon telemetry in 2026 and beyond.
Ready to implement? Start with the checklist above and spin up a small proof-of-concept: one Kafka topic, one ClickHouse node, and the materialized view pipeline. If you want a template of the full SQL + Grafana dashboard used by production teams, contact our engineering docs team or download the repo linked from this article.
Note: This article references industry developments through 2025 and early 2026. For help building a production pipeline or evaluating architectures for your scale, reach out — we’ve built pipelines for icon CDNs and platform builders handling hundreds of millions of favicon events daily.
Call to action
Get a ready-to-deploy ClickHouse favicon analytics kit: schema, ingestion scripts (Kafka + HTTP), Grafana dashboards, and AB analysis queries. Request the kit or a free architecture review — optimize your favicon pipeline to save bandwidth, cut latency, and run confident experiments.