Scaling Image Throughput with an NGINX-Based Cache

With over 600 million images to index for downstream services like search and personalization, we faced a fundamental bottleneck in how our system accessed and served images. This post dives into the thinking behind building a high-throughput, sharded image caching layer on top of NGINX that unlocked 5x the read throughput while keeping things simple and resilient.


🚧 Problem

We needed to solve several pressing issues:

  • Throughput limitations in our existing image storage infrastructure
  • Latency-sensitive workflows, like embedding extraction and indexing
  • Massive scale: 600M+ images, frequently accessed by downstream services

Traditional storage approaches introduced unacceptable bottlenecks when reading images at this scale for workloads like embedding extraction and indexing.


📋 Technical Requirements

To meet these demands, our caching layer needed:

  • Configurable and pluggable cache logic
  • High throughput for concurrent image reads
  • Scalability via sharding across nodes
  • Fault tolerance and even data distribution

⚙️ Implementation: The Image Cache

We built a distributed caching layer on top of NGINX, known for its:

  • High-speed I/O handling
  • Built-in caching module
  • Simplicity in defining key-value semantics (via URL routes)
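That last point is what makes NGINX such a natural fit: the URL itself becomes the cache key. Here's a minimal sketch of the idea; the paths, zone names, sizes, and the origin upstream are illustrative placeholders, not our production config:

```nginx
# Cache zone: entry metadata in shared memory, image bodies on local disk.
proxy_cache_path /var/cache/nginx/images levels=1:2 keys_zone=images:100m
                 max_size=200g inactive=30d use_temp_path=off;

# Placeholder origin: the backing image store.
upstream image_origin {
    server origin.internal:8080;
}

server {
    listen 80;

    location /images/ {
        proxy_cache       images;
        # The URL is the key: one image ID maps to one cache entry.
        proxy_cache_key   $uri;
        proxy_cache_valid 200 30d;
        proxy_pass        http://image_origin;
    }
}
```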

Key Design Points:

  • Local file system storage → Simplicity and performance, but I/O intensive
  • Multi-node architecture with consistent hashing → Ensured even data distribution and fault tolerance
  • NGINX as both reverse proxy and cache manager → No extra runtime components; lightweight
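Consistent hashing is also available out of the box in NGINX's upstream module, so the routing tier stays config-only. A sketch of the sharding layer, assuming hypothetical cache-node hostnames:

```nginx
# Routing tier: shard image requests across cache nodes.
upstream image_cache_shards {
    # Ketama consistent hashing: when a node joins or leaves,
    # only ~1/N of keys move to a different node.
    hash $request_uri consistent;

    server cache-node-1.internal;
    server cache-node-2.internal;
    server cache-node-3.internal;
}

server {
    listen 80;

    location /images/ {
        proxy_pass http://image_cache_shards;
    }
}
```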

Note:

We’ve published a detailed step-by-step guide on the setup:

TECH.ZEALOT: Creating an NGINX Image Cache


✅ Outcome

  • 5x improvement in read throughput
  • Even data distribution via consistent hashing
  • Automatic failure handling: because consistent hashing remaps only about 1/N of keys when a node joins or leaves, nodes could be added or removed without rebalancing all data

🔍 Challenges & Improvements

While the system works well in production, there are still some limitations:

  • Heavy disk I/O due to local file system access
  • Replication is not built-in and needs orchestration
  • Cache introspection is limited—it’s hard to query which images are present
  • Hot-key pressure can overload individual nodes → Mitigated by adding nodes and salting the hash key with a per-request random seed (sketched below)
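One way to realize the randomized-salt mitigation is NGINX's split_clients module: each request draws a random salt (here derived from the per-request $request_id), so any single image ID fans out across a handful of nodes instead of hammering one. This is a sketch rather than our exact production config, and note the trade-off: it replicates every key, not just hot ones, so you give up some cache capacity for headroom:

```nginx
# Per-request random salt: $request_id is random per request, so
# split_clients effectively picks one of four salts at random.
split_clients $request_id $shard_salt {
    25% "0";
    25% "1";
    25% "2";
    *   "3";
}

upstream image_cache_shards {
    # Appending the salt to the hash key replicates each image
    # across up to four nodes, spreading hot-key load at the cost
    # of some cache duplication.
    hash $request_uri$shard_salt consistent;

    server cache-node-1.internal;
    server cache-node-2.internal;
    server cache-node-3.internal;
    server cache-node-4.internal;
}
```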

🧠 Other Considerations

  • Memory-layer caching (e.g., Redis, Memcached) for ultra-hot images
  • Hybrid edge-caching using CDNs or object storage with smart invalidation
  • Cache analytics dashboard for visibility and tuning
