Redis HyperLogLog is a probabilistic data structure that estimates the number of unique elements in a set with a standard error of 0.81%. Each key uses at most about 12KB of memory regardless of cardinality, which makes it well suited to counting unique items at scale.

Use Cases

  • Unique visitors: Count unique IP addresses or user IDs
  • Unique page views: Track distinct viewers per page
  • Unique search queries: Count different search terms
  • Ad impressions: Measure unique users who saw ads
  • Network traffic: Count unique source IPs
  • Sensor data: Track unique device IDs

Key Commands

Basic Operations

# Add elements
redis> PFADD unique:visitors "user:1" "user:2" "user:3"
(integer) 1

# Add duplicate (doesn't change count)
redis> PFADD unique:visitors "user:1"
(integer) 0

# Get cardinality estimate
redis> PFCOUNT unique:visitors
(integer) 3

# Add more elements
redis> PFADD unique:visitors "user:4" "user:5"
(integer) 1
redis> PFCOUNT unique:visitors
(integer) 5

Merging HyperLogLogs

# Create multiple HyperLogLogs
redis> PFADD visitors:page1 "user:1" "user:2" "user:3"
(integer) 1
redis> PFADD visitors:page2 "user:2" "user:3" "user:4"
(integer) 1

# Merge into new HyperLogLog
redis> PFMERGE visitors:all visitors:page1 visitors:page2
OK

# Get merged count (deduplicates)
redis> PFCOUNT visitors:all
(integer) 4

# Count multiple HyperLogLogs
redis> PFCOUNT visitors:page1 visitors:page2
(integer) 4

Time Complexity

Command                   Time Complexity   Description
PFADD                     O(1)              Add elements
PFCOUNT (single key)      O(1)              Get cardinality of one key
PFCOUNT (multiple keys)   O(N)              N = number of keys
PFMERGE                   O(N)              N = number of keys

Memory Usage

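In dense encoding, a HyperLogLog stores 16,384 six-bit registers, which is 12,288 bytes (12KB) plus a small header. Keys holding only a few elements start in a smaller sparse encoding (see Sparse vs Dense Encoding below).

# A key large enough to use dense encoding
redis> PFADD hll "element1" "element2" ... "element100000"
(integer) 1
redis> MEMORY USAGE hll
(integer) 12344  # ~12KB

# Add up to a million elements - still ~12KB
redis> PFADD hll "element100001" "element100002" ... "element1000000"
redis> MEMORY USAGE hll
(integer) 12344  # Still ~12KB!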

Memory Comparison

# Track 1 million unique IDs

# Set: ~15-20 MB
redis> SADD unique_set user:1 user:2 ... user:1000000
redis> MEMORY USAGE unique_set
(integer) ~15000000  # ~15MB

# Bitmap: 125 KB (for sequential IDs 0-1M)
redis> SETBIT unique_bitmap 1000000 1
redis> MEMORY USAGE unique_bitmap
(integer) 125032  # ~125KB

# HyperLogLog: 12 KB (any IDs)
redis> PFADD unique_hll user:1 user:2 ... user:1000000
redis> MEMORY USAGE unique_hll
(integer) 12344  # ~12KB
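
The bitmap and HyperLogLog figures above follow from simple arithmetic. A minimal Python sketch, assuming the 16,384 six-bit registers described under Algorithm Details; the function names are illustrative, and the results exclude Redis's per-key overhead, which is why MEMORY USAGE reports slightly more:

```python
HLL_REGISTERS = 16_384       # registers in a dense HyperLogLog
HLL_BITS_PER_REGISTER = 6    # bits stored per register


def hll_dense_bytes() -> int:
    """Register payload of a dense HyperLogLog, excluding header and key overhead."""
    return HLL_REGISTERS * HLL_BITS_PER_REGISTER // 8


def bitmap_bytes(highest_id: int) -> int:
    """Bytes needed for a bitmap whose highest set bit offset is highest_id."""
    return highest_id // 8 + 1


print(hll_dense_bytes())         # 12288
print(bitmap_bytes(1_000_000))   # 125001
```

The bitmap size grows with the highest ID that is set, while the HyperLogLog payload stays constant no matter how many elements are added.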

Accuracy

Standard error: ±0.81%
# Add 10,000 unique elements
redis> PFADD test element:1 element:2 ... element:10000

# Estimate (typically 9,919 to 10,081)
redis> PFCOUNT test
(integer) 10023  # Within 0.81% of 10,000

# Add 1 million elements
redis> PFADD large element:1 ... element:1000000
redis> PFCOUNT large
(integer) 1001234  # Typically within ~8,100 of actual
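
The relative error does not grow with cardinality: whether counting 100 or 1 billion items, the standard error stays at about 0.81%. This is a statistical measure, not a hard bound, so an individual estimate can occasionally be off by more.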
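
The 0.81% figure matches the usual HyperLogLog standard-error formula, 1.04 / sqrt(m), for m = 16,384 registers. A short Python sketch that reproduces the ranges quoted in the example above (the function names are illustrative):

```python
import math

HLL_REGISTERS = 16_384  # number of registers, m


def standard_error(registers: int = HLL_REGISTERS) -> float:
    """Relative standard error of a HyperLogLog with the given register count."""
    return 1.04 / math.sqrt(registers)


def one_sigma_range(true_count: int) -> tuple[int, int]:
    """Range covering one standard error either side of the true count."""
    delta = true_count * standard_error()
    return round(true_count - delta), round(true_count + delta)


print(f"{standard_error():.4%}")    # 0.8125%
print(one_sigma_range(10_000))      # (9919, 10081)
print(one_sigma_range(1_000_000))   # (991875, 1008125)
```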

Patterns and Examples

Daily Unique Visitors

# Track unique visitors per day
redis> PFADD visitors:2024-03-03 "ip:192.168.1.1"
(integer) 1
redis> PFADD visitors:2024-03-03 "ip:192.168.1.2"
(integer) 1
redis> PFADD visitors:2024-03-03 "ip:192.168.1.1"
(integer) 0  # Duplicate

# Get daily uniques
redis> PFCOUNT visitors:2024-03-03
(integer) 2

# Set expiration (2592000 seconds = 30 days)
redis> EXPIRE visitors:2024-03-03 2592000
(integer) 1

# Weekly uniques (merge days)
redis> PFMERGE visitors:week:10 visitors:2024-03-03 visitors:2024-03-04 visitors:2024-03-05 visitors:2024-03-06 visitors:2024-03-07 visitors:2024-03-08 visitors:2024-03-09
OK
redis> PFCOUNT visitors:week:10
(integer) 8923
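
In application code, the daily keys and the weekly merge are usually generated rather than typed by hand. A minimal sketch, assuming a redis-py style client that exposes pfadd, pfmerge, and pfcount; the helper names (daily_key, record_visit, weekly_uniques) are hypothetical:

```python
from datetime import date, timedelta


def daily_key(day: date) -> str:
    """Key name for one day's visitors, e.g. visitors:2024-03-03."""
    return f"visitors:{day.isoformat()}"


def record_visit(client, day: date, visitor_id: str) -> None:
    """Add one visitor to that day's HyperLogLog."""
    client.pfadd(daily_key(day), visitor_id)


def weekly_uniques(client, first_day: date, dest_key: str) -> int:
    """Merge seven consecutive daily keys into dest_key and return its estimate."""
    keys = [daily_key(first_day + timedelta(days=i)) for i in range(7)]
    client.pfmerge(dest_key, *keys)
    return client.pfcount(dest_key)
```

With redis-py this might be called as weekly_uniques(redis.Redis(), date(2024, 3, 3), "visitors:week:10"). Daily keys that do not exist are treated as empty by PFMERGE, so a week with missing days still merges.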

Page View Analytics

# Track unique viewers per page
redis> PFADD page:home:unique "user:123"
(integer) 1
redis> PFADD page:about:unique "user:123"
(integer) 1
redis> PFADD page:contact:unique "user:456"
(integer) 1

# Get unique viewers per page
redis> PFCOUNT page:home:unique
(integer) 1234

# Total unique visitors across all pages
redis> PFCOUNT page:home:unique page:about:unique page:contact:unique
(integer) 2891

Search Query Analytics

# Track unique search queries per day
redis> PFADD searches:2024-03-03 "redis tutorial"
(integer) 1
redis> PFADD searches:2024-03-03 "redis commands"
(integer) 1
redis> PFADD searches:2024-03-03 "redis tutorial"
(integer) 0  # Duplicate

# Count unique queries
redis> PFCOUNT searches:2024-03-03
(integer) 2

# Monthly unique queries
redis> PFMERGE searches:2024-03 
  searches:2024-03-01 
  searches:2024-03-02 
  ...
  searches:2024-03-31
OK
redis> PFCOUNT searches:2024-03
(integer) 45678

A/B Testing

# Track unique users per experiment variant
redis> PFADD experiment:checkout:variantA "user:123"
(integer) 1
redis> PFADD experiment:checkout:variantB "user:456"
(integer) 1

# Get experiment reach
redis> PFCOUNT experiment:checkout:variantA
(integer) 5234
redis> PFCOUNT experiment:checkout:variantB
(integer) 5189

# Total experiment participants
redis> PFCOUNT experiment:checkout:variantA experiment:checkout:variantB
(integer) 10423
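
The three counts above can also be combined to estimate how many users landed in both variants, using inclusion-exclusion. A small Python sketch with a hypothetical helper name; because every input is itself an estimate, a small result is likely to be dominated by estimation error and should be read as a rough indication only:

```python
def estimated_overlap(count_a: int, count_b: int, count_union: int) -> int:
    """Inclusion-exclusion estimate of |A ∩ B|, clamped at zero."""
    return max(0, count_a + count_b - count_union)


# Values taken from the PFCOUNT results shown above
print(estimated_overlap(5234, 5189, 10423))  # 0
```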

Ad Campaign Tracking

# Track unique users who saw ads
redis> PFADD campaign:spring2024:impressions "user:123"
(integer) 1
redis> PFADD campaign:spring2024:clicks "user:123"
(integer) 1

# Get metrics
redis> PFCOUNT campaign:spring2024:impressions
(integer) 123456
redis> PFCOUNT campaign:spring2024:clicks
(integer) 4567

# CTR calculation: 4567/123456 = 3.7%
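
Because both counts are estimates, the resulting CTR carries some uncertainty as well. A rough Python sketch, assuming the errors of the two HyperLogLogs are independent (a simplification) and using first-order error propagation for a ratio; the function name is illustrative:

```python
import math

HLL_STD_ERR = 0.0081  # relative standard error of a single PFCOUNT estimate


def ctr_with_error(clicks: int, impressions: int) -> tuple[float, float]:
    """Return (ctr, approximate absolute standard error of the ctr)."""
    ctr = clicks / impressions
    # Relative errors of a ratio add in quadrature when they are independent
    relative_error = math.sqrt(2) * HLL_STD_ERR
    return ctr, ctr * relative_error


ctr, err = ctr_with_error(4567, 123456)
print(f"{ctr:.2%} ± {err:.2%}")  # 3.70% ± 0.04%
```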

Algorithm Details

HyperLogLog uses:
  • 16,384 registers (buckets)
  • 6 bits per register (16,384 × 6 bits = 12,288 bytes in dense encoding)
  • MurmurHash64A for hashing
  • Sparse/Dense encoding for memory optimization
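
To make the register idea concrete, here is a toy HyperLogLog in Python. It is a sketch for illustration, not Redis's implementation: it uses hashlib.blake2b as a stand-in for MurmurHash64A, keeps registers in a plain list instead of a sparse or packed dense encoding, and uses the classic estimator with linear counting for small cardinalities, which may differ from the estimator Redis uses. The class and method names are hypothetical.

```python
import hashlib
import math

P = 14          # hash bits used to pick a register
M = 1 << P      # 16,384 registers
Q = 64 - P      # remaining hash bits used to compute the rank


class ToyHyperLogLog:
    def __init__(self) -> None:
        self.registers = [0] * M

    def add(self, element: str) -> None:
        # Stand-in 64-bit hash (Redis uses MurmurHash64A)
        digest = hashlib.blake2b(element.encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        index = h & (M - 1)   # low P bits select the register
        rest = h >> P         # remaining Q bits
        # Rank = 1-based position of the lowest set bit, or Q + 1 if none is set
        rank = Q + 1 if rest == 0 else (rest & -rest).bit_length()
        if rank > self.registers[index]:
            self.registers[index] = rank

    def merge(self, other: "ToyHyperLogLog") -> None:
        # The union of two HyperLogLogs is the register-wise maximum
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / M)
        raw = alpha * M * M / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * M and zeros:
            # Linear counting is more accurate while many registers are empty
            return round(M * math.log(M / zeros))
        return round(raw)


hll = ToyHyperLogLog()
for i in range(10_000):
    hll.add(f"user:{i}")
print(hll.count())  # close to 10,000; the exact value depends on the hash
```

Each register needs only 6 bits because the largest rank this scheme can produce is Q + 1 = 51, which is below 64.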

Sparse vs Dense Encoding

# Small cardinality: sparse encoding (smaller)
redis> PFADD small "element1" "element2"
(integer) 1
redis> MEMORY USAGE small
(integer) 856  # Less than 12KB

# Large cardinality: automatic conversion to dense
redis> PFADD small element:1 element:2 ... element:10000
redis> MEMORY USAGE small
(integer) 12344  # Now 12KB
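
Redis starts each HyperLogLog in sparse encoding (variable size) and converts it to dense encoding (12KB) automatically once the sparse representation would exceed the hll-sparse-max-bytes threshold (3000 bytes by default).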

SIMD Optimizations

HyperLogLog uses hardware acceleration where the CPU supports it:
  • AVX2 instructions on modern Intel/AMD CPUs
  • NEON instructions on ARM processors
  • Significantly faster PFCOUNT operations

Best Practices

  1. Use for counting only - HyperLogLog can’t retrieve individual elements
  2. Merge related HLLs - combine daily data into weekly/monthly
  3. Set expiration on time-based HLLs to save memory
  4. Accept approximation - perfect accuracy requires sets (more memory)
  5. Use a consistent element format - "user:123" and "123" are counted as different elements

HyperLogLog can only estimate cardinality. You cannot retrieve individual elements, test membership, or get exact counts. Use sets if you need these features.

For the best trade-off between memory and accuracy, HyperLogLog is hard to beat. At no more than about 12KB per key, you can track billions of unique items with a standard error below 1%.

When to Use HyperLogLog

Requirement          HyperLogLog         Set            Bitmap
Count unique items   ✅ Best             ✅             ✅
Exact count needed   ❌ (~0.81% error)   ✅             ✅
Retrieve elements    ❌                  ✅             ✅ (scan set bits)
Test membership      ❌                  ✅             ✅
Memory (1M items)    12 KB               15-20 MB       125 KB*
Memory (1B items)    12 KB               15-20 GB       125 MB*
Best for             Massive scale       Exact counts   Sequential IDs

*Bitmap memory depends on highest ID, not count

Next Steps

Probabilistic Structures

Bloom filters, Count-Min Sketch, and more

Bitmaps

For exact counting with dense IDs