Analytics for Favicons: Using ClickHouse for High-Volume Icon Telemetry
Build a ClickHouse pipeline to analyze favicon loads, cache hit rates, device differences and A/B tests at scale — actionable setup & SQL examples.
Stop wondering whether your favicons are working — measure them. Fast.
Favicons are tiny, but their telemetry is high-volume and surprisingly informative. If you run an icon service, CDN, or a site-builder, you face these recurring problems: noisy event streams, inconsistent cache hit rates across devices, and fragile A/B experiments that don’t scale. This guide shows how to build a ClickHouse-powered analytics pipeline that ingests billions of favicon events per day and produces reliable metrics for cache hit rate, device variation, performance, and A/B tests — without blowing up your cost or query latency.
Why ClickHouse in 2026?
ClickHouse has continued to lead OLAP adoption for real-time, high-cardinality telemetry. Large investments and rapid feature development through late 2024–2025 — including ClickHouse’s large growth round reported in late 2025 — made it a first-class choice for event analytics at scale. For favicon telemetry you need:
- High ingest throughput (Kafka/HTTP streams)
- Low-latency OLAP queries for dashboards and experiments
- Cost-effective storage with TTL and compression
- Clustered, replicated storage for availability
Key claim: ClickHouse gives you millisecond query response for aggregated telemetry and can sustain billions of small events a day when paired with Kafka/Redpanda and proper schema design.
Architecture overview: From browser request to dashboard
At a high level, the pipeline is:
- Client or edge emits favicon telemetry (loads, cache headers, status)
- Events are batched to Kafka / Redpanda or pushed to an HTTP ingest gateway
- ClickHouse reads from Kafka or HTTP and writes into MergeTree tables
- Materialized views pre-aggregate per-minute and per-variant metrics
- Grafana (or Superset) queries ClickHouse for dashboards, alerts, and AB analysis
Recommended tech choices
- Event bus: Kafka or Redpanda for durability and backpressure
- Ingest buffer: ClickHouse Kafka engine + Materialized Views, or ClickHouse HTTP API with a buffering proxy
- Tables: ReplicatedReplacingMergeTree / AggregatingMergeTree for dedup and rollups
- Query/visualization: Grafana ClickHouse plugin or ClickHouse dashboards
Designing the event schema
Keep events compact and consistent. Use typed JSON or binary (Protobuf/Avro) when possible. Below is a compact JSON schema tailored for favicon telemetry:
{
  "event_id": "uuid",                      // client-generated or edge-assigned
  "ts": "2026-01-17T12:34:56.789Z",
  "domain": "example.com",
  "page_url": "/docs/favicon-intro",
  "favicon_id": "hash-or-id",
  "cache_status": "HIT|MISS|STALE",
  "response_ms": 23,                       // time to serve icon
  "status_code": 200,
  "user_agent_key": "ios-17-safari",       // pre-parsed on edge
  "device_type": "mobile|desktop|tablet",
  "experiment_id": "exp_2026_new_shape",   // optional
  "variant": "A|B",
  "sample_rate": 1,                        // client-side sampling fraction
  "is_prefetch": false
}
Design tips:
- Pre-parse user agent and map to a few device buckets at the edge to reduce cardinality.
- Include sample_rate for correctness when clients sample events.
- Use hashed identifiers for PII compliance (never send raw user emails).
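A minimal edge-side sketch of these tips in Python. The `DEVICE_BUCKETS` map, the `build_event` helper, and the truncated hash are illustrative assumptions, not a prescribed format:

```python
import hashlib
import json
import random
import uuid
from datetime import datetime, timezone

# Illustrative UA-substring -> bucket map; a real edge would use a proper parser
# but still collapse to a handful of buckets to keep cardinality low.
DEVICE_BUCKETS = [("Mobile", "mobile"), ("Tablet", "tablet")]

def device_bucket(user_agent: str) -> str:
    """Collapse a raw user agent into one of a few low-cardinality buckets."""
    for needle, bucket in DEVICE_BUCKETS:
        if needle in user_agent:
            return bucket
    return "desktop"

def build_event(domain: str, favicon_id: str, user_agent: str,
                cache_status: str, response_ms: int,
                sample_rate: float = 1.0):
    """Return a compact telemetry event, or None if sampled out client-side."""
    if random.random() >= sample_rate:
        return None  # dropped; sample_rate lets the server rescale counts later
    return {
        "event_id": str(uuid.uuid4()),
        "ts": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "domain": domain,
        # hash the identifier at the edge so no raw value leaves the client
        "favicon_id": hashlib.sha256(favicon_id.encode()).hexdigest()[:16],
        "cache_status": cache_status,
        "response_ms": response_ms,
        "device_type": device_bucket(user_agent),
        "sample_rate": sample_rate,
    }
```

The point of the sketch is that everything high-cardinality (user agent, raw identifiers) is collapsed or hashed before the event ever leaves the edge.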
ClickHouse table design
For high-volume favicon events, a two-layer storage model works well:
- Raw events into a high-ingest MergeTree (for reprocessing and debugging)
- Pre-aggregated tables (per-minute/per-domain/per-variant) for dashboards and experiments
Raw events table
CREATE TABLE favicons.events_raw (
    event_id UUID,
    ts DateTime64(3, 'UTC'),
    domain String,
    page_url String,
    favicon_id String,
    cache_status LowCardinality(String),
    response_ms UInt32,
    status_code UInt16,
    device_type LowCardinality(String),
    experiment_id LowCardinality(String),
    variant LowCardinality(String),
    sample_rate Float32
) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/favicons.events_raw', '{replica}')
PARTITION BY toYYYYMM(ts)
ORDER BY (domain, favicon_id, ts, event_id)
TTL toDateTime(ts) + INTERVAL 90 DAY
SETTINGS index_granularity = 8192;
Notes:
- Partition by month to bound merge sizes and keep TTL deletes cheap.
- Use LowCardinality for string columns with few distinct values (cache_status, device_type, variant).
- ReplacingMergeTree deduplicates rows that share the same sorting key during background merges, which is why event_id is part of the ORDER BY above; dedup is eventual, so query with FINAL when exact counts matter.
Aggregated, per-minute table (for dashboards)
Create the target table before the materialized view that feeds it. Plain numeric columns are not summed when AggregatingMergeTree merges rows, so the counters are declared as SimpleAggregateFunction; we also store the sum of response times rather than an average, because averages cannot be merged correctly across parts.
CREATE TABLE favicons.events_minute_mv (
    minute DateTime,
    domain String,
    favicon_id String,
    device_type LowCardinality(String),
    experiment_id LowCardinality(String),
    variant LowCardinality(String),
    impressions SimpleAggregateFunction(sum, UInt64),
    hits SimpleAggregateFunction(sum, UInt64),
    response_ms_sum SimpleAggregateFunction(sum, UInt64)
) ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(minute)
ORDER BY (domain, favicon_id, device_type, experiment_id, variant, minute);
CREATE MATERIALIZED VIEW favicons.events_minute
TO favicons.events_minute_mv
AS
SELECT
    toStartOfMinute(ts) AS minute,
    domain,
    favicon_id,
    device_type,
    experiment_id,
    variant,
    count() AS impressions,
    countIf(cache_status = 'HIT') AS hits,
    sum(response_ms) AS response_ms_sum
FROM favicons.events_raw
GROUP BY minute, domain, favicon_id, device_type, experiment_id, variant;
AggregatingMergeTree collapses rows with the same sorting key by combining their aggregate values, which keeps large rollups compact; derive averages at query time as response_ms_sum / impressions.
Ingestion patterns
1) Kafka + ClickHouse (recommended for scale)
Create a Kafka engine topic and a materialized view that inserts into events_raw:
CREATE TABLE favicons.kafka_events (
    event_id String,
    ts String,
    domain String,
    page_url String,
    favicon_id String,
    cache_status String,
    response_ms UInt32,
    status_code UInt16,
    device_type String,
    experiment_id String,
    variant String,
    sample_rate Float32
) ENGINE = Kafka(
    'kafka:9092',
    'favicon-events',
    'favicon-group',
    'JSONEachRow'
);
CREATE MATERIALIZED VIEW favicons.kafka_to_raw
TO favicons.events_raw
AS
SELECT
    toUUID(event_id) AS event_id,
    parseDateTime64BestEffort(ts) AS ts,
    domain,
    page_url,
    favicon_id,
    cache_status,
    response_ms,
    status_code,
    device_type,
    experiment_id,
    variant,
    sample_rate
FROM favicons.kafka_events;
The Kafka engine consumes messages in batches, so writes into events_raw arrive pre-batched; add a ClickHouse Buffer table in front of events_raw only if you also accept small direct inserts (e.g. over HTTP) that need short-lived write batching.
2) HTTP bulk ingestion (simpler, smaller scale)
Use a gateway (Nginx + buffering) that forwards compressed JSON batches to ClickHouse HTTP insert endpoint. Keep batch payloads around 50–500KB for efficiency.
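To make the batching concrete, here is a minimal Python sketch of the payload-building step behind such a gateway. The URL, query string, and `build_insert_payload` name are illustrative assumptions, not part of any ClickHouse client library; ClickHouse's HTTP interface does accept gzip bodies when Content-Encoding is set.

```python
import gzip
import json

# Hypothetical insert endpoint: the INSERT query is passed URL-encoded in the
# query string, and the request body carries the JSONEachRow rows.
CLICKHOUSE_URL = ("http://clickhouse:8123/"
                  "?query=INSERT%20INTO%20favicons.events_raw%20FORMAT%20JSONEachRow")

def build_insert_payload(events: list, max_bytes: int = 500_000) -> bytes:
    """Serialize a batch as gzip-compressed JSONEachRow, enforcing a size cap."""
    body = "".join(json.dumps(e, separators=(",", ":")) + "\n" for e in events)
    raw = body.encode("utf-8")
    if len(raw) > max_bytes:
        raise ValueError(f"batch too large: {len(raw)} bytes")
    return gzip.compress(raw)

# POST the returned bytes to CLICKHOUSE_URL with header Content-Encoding: gzip.
```

The size cap mirrors the 50–500KB guidance above: reject oversized batches at the gateway rather than letting them fan out into expensive inserts.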
Key queries: Cache hit rate, device breakdown, A/B tests
Cache hit rate (real-time, per-minute)
SELECT
    minute,
    domain,
    favicon_id,
    sum(hits) / sum(impressions) AS cache_hit_rate
FROM favicons.events_minute_mv
WHERE minute >= now() - INTERVAL 24 HOUR
GROUP BY minute, domain, favicon_id
ORDER BY minute ASC
LIMIT 1000;
Device variation: hit rate and latency by device
SELECT
    device_type,
    sum(hits) / sum(impressions) AS cache_hit_rate,
    sum(response_ms_sum) / sum(impressions) AS weighted_avg_response_ms
FROM favicons.events_minute_mv
WHERE domain = 'example.com' AND minute >= now() - INTERVAL 12 HOUR
GROUP BY device_type
ORDER BY cache_hit_rate DESC;
A/B experiment results: difference in cache hits and response times
Compute aggregated metrics per variant and use a simple Z-test for difference in proportions (cache hits). This example computes pooled proportion and z-score; adapt for your exact statistical needs.
SELECT
    sumIf(hits, variant = 'A') / sumIf(impressions, variant = 'A') AS p1,
    sumIf(hits, variant = 'B') / sumIf(impressions, variant = 'B') AS p2,
    sum(hits) / sum(impressions) AS p_pooled,
    (p1 - p2) / sqrt(p_pooled * (1 - p_pooled)
        * (1 / sumIf(impressions, variant = 'A')
         + 1 / sumIf(impressions, variant = 'B'))) AS z_score
FROM favicons.events_minute_mv
WHERE experiment_id = 'exp_2026_new_shape'
  AND variant IN ('A', 'B')
  AND minute >= now() - INTERVAL 7 DAY;
This runs in a single pass: ClickHouse allows reusing the p1, p2, and p_pooled aliases inside the z_score expression, and the pooled proportion is simply the overall hit rate because the WHERE clause restricts rows to the two variants.
Use the z_score to compute p-values client-side or with ClickHouse math functions. For large n, z-test approximations are valid. For small-sample tests, use bootstrap sampling or Bayesian methods.
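If you prefer to finish the arithmetic client-side, the same pooled two-proportion z-test can be sketched in Python; `two_proportion_z` is a hypothetical helper, and the p-value uses the normal approximation, so it is only trustworthy for large n:

```python
import math

def two_proportion_z(hits_a: int, n_a: int, hits_b: int, n_b: int):
    """Two-sided z-test for a difference in proportions (e.g. cache hit rates).

    Mirrors the pooled-proportion formula in the SQL above.
    """
    p1, p2 = hits_a / n_a, hits_b / n_b
    p_pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
    z = (p1 - p2) / se
    # Two-sided p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)) under the normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, 520 hits out of 1,000 impressions versus 480 out of 1,000 gives z roughly 1.79 and a two-sided p-value around 0.07, i.e. suggestive but not significant at the conventional 0.05 level.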
Operational best practices
Sampling and cardinality control
- Apply client-side sampling for optional debug fields; always include sample_rate so you can scale counts back up.
- Pre-map user agents and domain names to low-cardinality tags to avoid cardinality explosion.
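The rescaling rule can be made concrete with a tiny estimator; `estimated_impressions` is a hypothetical helper that weights each event by the inverse of its sample_rate (a Horvitz-Thompson-style estimate):

```python
def estimated_impressions(events: list) -> float:
    """Scale sampled events back to an unbiased population estimate.

    An event observed at sample_rate r stands in for roughly 1/r real events,
    which is why sample_rate must ride along with every event.
    """
    return sum(1.0 / e["sample_rate"] for e in events)
```

The same weighting works in SQL (sum of 1 / sample_rate instead of count()), so sampled and unsampled domains stay comparable on one dashboard.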
Retention and cost
- Store raw events for a bounded period (30–90 days) and keep aggregated rollups (per-minute or per-hour) for longer retention.
- Use TTLs in MergeTree definitions to automatically expire raw data.
Backfills and schema changes
Use the raw events table for reprocessing when event schema evolves. Materialized views simplify streaming ETL, but keep a reliable backfill path (batch re-insert into events_raw and run INSERT ... SELECT into aggregates).
Monitoring and alerts
- Track ingestion lag and Kafka consumer group offsets.
- Create alerts for sudden drops in cache hit rate or increases in avg_response_ms across domains or devices.
Privacy, consent, and compliance
In 2026, privacy-first telemetry patterns are expected. Avoid storing PII; prefer hashed identifiers and edge-side enrichment. Respect user consent: gate sampling and event emission behind consent checks. Use differential retention for regions with strict retention laws.
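A minimal sketch of consent-gated, keyed hashing at the edge; the `SALT` constant and the helper name are placeholders, and a real deployment would pull the key from a secret store and rotate it:

```python
import hashlib
import hmac

# Placeholder key: in production, load from a secret manager, never hard-code.
SALT = b"rotate-me-per-deployment"

def hashed_identifier(raw_id: str, consent: bool):
    """Return a keyed hash of an identifier, or None when consent is absent."""
    if not consent:
        return None  # emit no identifier at all without consent
    return hmac.new(SALT, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Keyed hashing (HMAC) rather than a bare hash makes offline dictionary attacks against common identifiers much harder, while staying deterministic so the same user maps to the same tag across events.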
Scaling tips and failure modes
Scaling write throughput
- Horizontally scale ClickHouse shards and use the Distributed engine for query routing.
- Use Kafka partitions keyed by domain or favicon_id to keep ordering where needed.
- Batch small events before inserting; tiny single-event inserts are expensive.
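The batching advice can be sketched as a small buffer that flushes on size or age; `EventBatcher` and its thresholds are illustrative, and the flush callback would hand the batch to Kafka or the HTTP gateway:

```python
import time

class EventBatcher:
    """Tiny batching buffer: flush when the batch is full or too old."""

    def __init__(self, flush, max_events: int = 1000, max_age_s: float = 1.0):
        self.flush_fn = flush
        self.max_events = max_events
        self.max_age_s = max_age_s
        self.buf = []
        self.started = time.monotonic()

    def add(self, event: dict) -> None:
        if not self.buf:
            self.started = time.monotonic()  # age counts from first event
        self.buf.append(event)
        if (len(self.buf) >= self.max_events
                or time.monotonic() - self.started >= self.max_age_s):
            self.flush()

    def flush(self) -> None:
        if self.buf:
            self.flush_fn(self.buf)
            self.buf = []
```

The age bound keeps latency predictable on quiet domains; the size bound keeps inserts cheap on busy ones.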
Handling spikes
Buffer with Kafka retention and provision extra ClickHouse capacity (or autoscale ClickHouse Cloud) for predictable traffic spikes caused by product launches or large sites.
Drive down query latency
- Pre-aggregate per-minute and materialize commonly used groupings.
- Use appropriate ORDER BY keys to support common query patterns (e.g., domain, minute).
- Enable index_granularity tuning when range scans over time are common.
Example case study (anonymized)
Company: an icon CDN serving 400M users monthly. Problem: inconsistent favicon cache behavior across mobile browsers and poor experiment fidelity. Solution implemented:
- Instrumented edge to emit compact event JSON with sample_rate and device_type.
- Piped events to Redpanda and into ClickHouse using the Kafka engine.
- Built per-minute AggregatingMergeTree rollups and surfaced them in Grafana.
Results (after 8 weeks):
- Detected a 12% lower cache hit rate on Android WebViews and rolled out a cache-control change that recovered 9%.
- Ran a controlled favicon shape experiment (A/B), and within 72 hours got statistically significant cache hit improvement for variant B — saved ~15% bandwidth across 10 top domains.
- Average dashboard query p99 latency: 180ms for top-level aggregations.
Advanced strategies and 2026 trends
Look ahead — these are patterns we recommend adopting in 2026:
- Edge-native pre-aggregation: compute per-minute key counts at the edge and send summarized events to reduce ingestion pressure.
- Client-driven sampling with adaptive rates: increase sampling for low-traffic domains to preserve experiment power and reduce cost for high-volume domains.
- Hybrid cloud-edge analytics: use ClickHouse Cloud for burst capacity and on-prem clusters for long-term raw retention.
- Integrate model-driven anomaly detection (lightweight ML) that runs over ClickHouse aggregates for early detection of cache regressions.
These patterns reflect industry moves toward decentralized telemetry and operational OLAP seen across 2024–2026 as ClickHouse and streaming platforms matured.
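The edge-native pre-aggregation pattern can be sketched in a few lines; field names follow the event schema earlier in this guide, and the minute key assumes ISO-8601 timestamps truncated to the minute:

```python
from collections import Counter

def preaggregate(events: list) -> list:
    """Collapse raw events into per-minute (domain, device, cache_status)
    counts before shipping them upstream: the edge-native rollup pattern."""
    counts = Counter(
        (e["ts"][:16], e["domain"], e["device_type"], e["cache_status"])
        for e in events
    )
    return [
        {"minute": m, "domain": d, "device_type": dev,
         "cache_status": cs, "count": n}
        for (m, d, dev, cs), n in counts.items()
    ]
```

Shipping these summaries instead of raw events can cut ingest volume by orders of magnitude on hot domains, at the cost of losing per-event fields like response_ms unless you also summarize those.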
Checklist: Quick deployment plan
- Define compact event schema and sampling rules.
- Stand up Redpanda/Kafka and create topic per environment.
- Deploy ClickHouse cluster (3+ replicas) or use ClickHouse Cloud.
- Create Kafka engine table + materialized view into events_raw.
- Build per-minute AggregatingMergeTree rollups and dashboard queries.
- Run backfill to validate counts vs historical logs.
- Instrument A/B assignment and expose experiment ids in events.
- Monitor ingestion lag, p99 query times, and cost by domain.
Actionable takeaways
- Always include sample_rate in events so you can scale sampling without biasing counts.
- Pre-aggregate per-minute to keep dashboards fast and cheap.
- Use Kafka + ClickHouse Kafka engine for durable, high-throughput ingestion.
- Partition and TTL raw events; keep aggregates longer for trend analysis.
- Design experiments to track both cache hit rate and user-perceived latency — both matter.
Further reading and resources (2026)
ClickHouse’s rapid growth and enterprise adoption in 2024–2025 (including major funding rounds and platform investments) have made it the de facto OLAP engine for telemetry at scale. If you’re evaluating providers, compare managed ClickHouse offerings for autoscaling and backup options to estimate TCO for multi-billion-event workloads.
Final notes
Favicons may be small, but their telemetry is a high-signal source for performance, cache strategy, and visual experiments. With ClickHouse you can reliably ingest, analyze, and act on that signal at scale. The patterns in this guide will help you build a resilient, fast, and cost-aware analytics pipeline for favicon telemetry in 2026 and beyond.
Ready to implement? Start with the checklist above and spin up a small proof-of-concept: one Kafka topic, one ClickHouse node, and the materialized view pipeline. If you want a template of the full SQL + Grafana dashboard used by production teams, contact our engineering docs team or download the repo linked from this article.
Note: This article references industry developments through 2025 and early 2026. For help building a production pipeline or evaluating architectures for your scale, reach out — we’ve built pipelines for icon CDNs and platform builders handling hundreds of millions of favicon events daily.
Call to action
Get a ready-to-deploy ClickHouse favicon analytics kit: schema, ingestion scripts (Kafka + HTTP), Grafana dashboards, and AB analysis queries. Request the kit or a free architecture review — optimize your favicon pipeline to save bandwidth, cut latency, and run confident experiments.