Saturday, March 7, 2026

System Design : Tiny URL

https://www.youtube.com/watch?v=xFeWVugaouk&list=PLjTveVh7FakLGZ36GbWAk_DMf_0xBZpGv&index=2

https://docs.google.com/presentation/d/1kAAxyxuY8KX4Oo9aDo1ric7-AxnhPdPI0ceFasTM4j8/edit?slide=id.g33d3b1923dc_0_61#slide=id.g33d3b1923dc_0_61

Requirements : Functional

  • createShortURL(String longURL) : returns string shortURL
    • expiration time
    • purge time 
  • fetchShortURL(String shortURL) : redirects user to corresponding longURL

Requirements : Non Functional

  • secure
  • data privacy / compliance 
  • call volume :
    • launch : 1 million users ; track DAU , WAU , MAU
    • URL creation : 600 million per month ≈ 228 per second
    • URL retrieval : 10 billion per month ≈ 3805 per second
    • spike : 5x
  • low latency 
  • High availability 
  • Uptime : 99% 
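The per-second figures in the call-volume requirement can be reproduced with quick arithmetic; the slides appear to assume an average month of 365/12 ≈ 30.4 days:

```python
SECONDS_PER_MONTH = 365 * 86400 // 12   # ~30.4-day month = 2,628,000 seconds

creations_per_month = 600_000_000
retrievals_per_month = 10_000_000_000

creation_qps = creations_per_month / SECONDS_PER_MONTH    # ≈ 228
retrieval_qps = retrievals_per_month / SECONDS_PER_MONTH  # ≈ 3805

# Capacity planning for the stated 5x spike on top of steady state:
peak_retrieval_qps = retrieval_qps * 5                    # ≈ 19,026

print(round(creation_qps), round(retrieval_qps), round(peak_retrieval_qps))
```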

Data Volume

How many characters should we use for our URL suffix?
  • 600 million creations/mo = 7.2 billion creations/y = 720 billion creations/century
  • If each URL can contain a-z and 0-9, we have 36 choices per character
  • If using 8 characters, we have 36^8 ≈ 2.8 trillion possibilities
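A quick check that an 8-character base-36 suffix covers a century of creations with room to spare:

```python
choices = 36   # a-z and 0-9
length = 8

keyspace = choices ** length            # 36^8 = 2,821,109,907,456 (~2.8 trillion)
century_of_creations = 720_000_000_000

print(keyspace)                         # 2821109907456
print(keyspace / century_of_creations)  # ~3.9x headroom
```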
Back of Envelope Calculation

Number Of Database Nodes
3805 database reads per second
Simple reads, can probably be handled on a single node.

228 writes per second
Can most likely be handled on a single node, depending on write complexity

74 bytes * 720 billion ≈ 53 TB (≈ 48 TiB) of data
Can be handled on a beefy node, but likely worth horizontally scaling
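The storage estimate works out as follows; 74 bytes per row is the slides' figure, and the decimal/binary unit difference explains seeing either ~53 TB or ~48 TiB:

```python
BYTES_PER_ROW = 74                 # short URL + long URL + metadata (slide assumption)
rows_per_century = 720_000_000_000

total_bytes = BYTES_PER_ROW * rows_per_century
print(total_bytes / 1e12)          # ≈ 53.3 TB  (decimal terabytes)
print(total_bytes / 2**40)         # ≈ 48.5 TiB (binary tebibytes)
```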

Block diagram

Create a database index for reads, indexed on the shortURL value
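A minimal sketch of the table and index, using SQLite for illustration (the schema and column names are assumptions; the slides don't give a schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE urls (
        short_url  TEXT NOT NULL,
        long_url   TEXT NOT NULL,
        expires_at INTEGER          -- epoch seconds; NULL = never expires
    )
""")
# Unique index on short_url: makes fetch lookups a tree search instead of a
# full scan, and rejects duplicate short codes at the database layer.
conn.execute("CREATE UNIQUE INDEX idx_urls_short ON urls (short_url)")

conn.execute("INSERT INTO urls VALUES (?, ?, ?)",
             ("abc12345", "https://example.com/some/long/path", None))
row = conn.execute("SELECT long_url FROM urls WHERE short_url = ?",
                   ("abc12345",)).fetchone()
print(row[0])  # https://example.com/some/long/path
```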

Replication

If the leader fails, Postgres can promote one of the replicas to be the new leader.

We may lose some data if a change of leader happens : writes acknowledged by the old leader but not yet replicated to the new one are lost.

Since we do not know in advance which short URLs will be hot, an LRU eviction policy will maximize the cache hit rate

One option is Citus

Citus is an open-source extension that transforms PostgreSQL into a distributed database, allowing it to scale horizontally across multiple commodity servers

Key Features and Concepts
  • As an Extension, not a Fork: Citus is implemented as an extension to standard PostgreSQL. This means it can stay current with the latest PostgreSQL features, tooling, and ecosystem, including support for recent versions like Postgres 18.
  • Horizontal Scaling: It enables PostgreSQL to use more CPU cores, memory, and storage than a single machine by distributing data and queries across a cluster of "nodes" in a shared-nothing architecture.
  • Sharding and Distribution: Citus shards (splits) large tables into smaller pieces distributed across "worker" nodes, while "reference" tables (smaller, frequently joined tables) are replicated across all nodes for maximum performance.
  • Distributed Query Engine: It includes a query planner and executor that parallelizes SQL queries across the worker nodes, resulting in dramatically improved response times for data-intensive applications.
  • Compatibility: Users can interact with the distributed database using standard SQL and existing PostgreSQL tools.
  • Deployment Options: Citus is available as a 100% open-source download or as a managed cloud service through Azure Cosmos DB for PostgreSQL.
Primary Use Cases
Citus is designed for performance-critical applications that have outgrown a single Postgres instance. Common use cases include:
  • Multi-tenant SaaS applications: Sharding data by tenant to isolate workloads and improve performance.
  • Real-time analytics and dashboards: Providing low-latency queries on large datasets (terabytes or more).
  • Time-series and IoT data platforms: Efficiently managing and analyzing large volumes of time-stamped data
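The effect of sharding the URL table can be sketched as hash-based routing on the distribution column (Citus handles this internally; the node names and hash choice below are illustrative only):

```python
import hashlib

WORKER_NODES = ["worker-0", "worker-1", "worker-2", "worker-3"]

def shard_for(short_url: str) -> str:
    """Route a row to a worker node by hashing the distribution column."""
    digest = hashlib.sha256(short_url.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(WORKER_NODES)
    return WORKER_NODES[index]

# Every read and write for the same short URL lands on the same node,
# so single-key lookups never fan out across the cluster.
print(shard_for("abc12345") == shard_for("abc12345"))  # True
```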
 
LSM-based databases
Designed for high write throughput, LSM engines use a log-structured merge-tree: writes go first to an in-memory structure (the MemTable) and are then flushed to disk as sorted, immutable files (SSTables). Common examples include RocksDB, Apache Cassandra, LevelDB, InfluxDB, and ScyllaDB, which are frequently used in NoSQL and time-series applications.

Because SSTables are immutable, reads do not need to acquire locks on them, which is a performance benefit.

B-trees
Many popular database management systems use B-trees (or the related B+ trees) as their primary indexing method due to their efficiency in handling data stored on disk.

When searching
  • LSM : a read may need to check the MemTable and then multiple SSTables on disk, so reads can be slower
  • B-tree : a search always follows the tree to a single location on disk (a single search)
Replication

If we use single-leader replication, we risk losing data when a failover happens

Example : Cassandra & ScyllaDB

Do we need strong consistency ?

No ; if the leader goes down, we can tolerate some inconsistent reads

Multi-leader replication enables low-latency writes in geographically diverse regions, but it comes at the cost of write conflicts when two users try to claim the same short URL at the same time

We can avoid these conflicts by sharding the keyspace so that each short URL has exactly one responsible leader
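One way sharded leaders avoid write conflicts is to partition the ID keyspace so that each region's leader allocates from a disjoint range (the range size and region names below are illustrative assumptions):

```python
# Each regional leader owns a disjoint slice of the counter keyspace,
# so two regions can never mint the same short URL.
RANGE_SIZE = 1_000_000_000_000
REGION_OFFSETS = {"us-east": 0, "eu-west": 1, "ap-south": 2}

class RegionalAllocator:
    def __init__(self, region: str):
        base = REGION_OFFSETS[region] * RANGE_SIZE
        self._next = base
        self._limit = base + RANGE_SIZE

    def next_id(self) -> int:
        if self._next >= self._limit:
            raise RuntimeError("range exhausted; lease a new block")
        n = self._next
        self._next += 1
        return n

us = RegionalAllocator("us-east")
eu = RegionalAllocator("eu-west")
print(us.next_id())  # 0
print(eu.next_id())  # 1000000000000 -- never collides with us-east
```

The allocated integer would then be base-36 encoded into the short URL suffix, so no cross-region coordination is needed on the write path.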
