# How to Design a URL Shortening Service Like Bit.ly: A Complete System Design Guide

> Design a URL shortening service like Bit.ly. Learn about MD5 hashing, Base62 encoding, database choices, caching, and API design for a complete system.

- Repository: [donnemartin/system-design-primer](https://github.com/donnemartin/system-design-primer)
- Tags: guide
- Published: 2026-02-24

---

**Design a URL shortening service by using MD5 hashing combined with Base62 encoding to generate unique 7-character keys, storing mappings in a SQL/NoSQL database with Redis caching, and exposing REST APIs for creation and resolution with analytics and expiration handling.**

The `donnemartin/system-design-primer` repository includes a worked solution for designing Pastebin.com (or Bit.ly), and the same blueprint applies to URL shorteners. This guide walks through that architecture, referencing the design in [`solutions/system_design/pastebin/README.md`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/pastebin/README.md) and the accompanying Python code.

## Use Cases and Constraints

Before writing code, define the functional and non-functional requirements. According to the primer's **Step 1** section in [`solutions/system_design/pastebin/README.md`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/pastebin/README.md), a production-grade URL shortener must support:

- **Create**: Generate unique short links for arbitrary URLs
- **Read**: Resolve short links to original URLs with minimal latency
- **Analytics**: Track hit counts and usage patterns
- **Expiration**: Automatic cleanup of stale entries after a TTL

Target constraints typically include **high availability**, **read-heavy traffic** (approximately 100 M reads vs. 10 M writes per month), and the capacity to store 360 M links over three years without collisions.
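The monthly volumes above translate into per-second rates with a quick back-of-envelope calculation. The sketch below assumes roughly 2.5 million seconds per month, a rounding chosen for mental arithmetic; the function name `estimate_load` is illustrative only.

```python
# Back-of-envelope load estimate from the stated monthly volumes.
SECONDS_PER_MONTH = 2_500_000  # assumption: ~30 days, rounded for mental math


def estimate_load(writes_per_month, reads_per_month, months):
    """Return average writes/sec, reads/sec, and total links stored."""
    return {
        "writes_per_sec": writes_per_month / SECONDS_PER_MONTH,
        "reads_per_sec": reads_per_month / SECONDS_PER_MONTH,
        "total_links": writes_per_month * months,
    }


load = estimate_load(10_000_000, 100_000_000, months=36)
print(load)
```

Under these assumptions the averages come out to about 4 writes and 40 reads per second, with 360 M links accumulated over 36 months. Peak traffic will be higher than the average, so these figures are a floor for capacity planning rather than a target.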

## High-Level Architecture Components

The system architecture follows a layered approach detailed in **Step 2** of the primer. Each component handles a specific responsibility:

- **Client**: Sends HTTP requests to create or retrieve shortlinks
- **Web Server (Reverse Proxy)**: Terminates TLS and load-balances traffic to API servers
- **Write API**: Generates unique short keys, stores mappings in the database, and persists large payloads to object storage when necessary
- **Read API**: Looks up short keys, fetches original URLs, and handles redirects
- **SQL/NoSQL Store**: Persistent hash table mapping `shortlink → original_url` with secondary indexes on `created_at`
- **Object Store** (e.g., S3): Holds large payloads for paste contents (optional for pure URL shortening)
- **Cache** (Redis/Memcached): Serves hot lookups to maintain sub-millisecond latency
- **Analytics Processor**: Runs MapReduce jobs over web-server logs to aggregate hit counts
- **Expiration Service**: Scans the database for expired entries and deletes or marks them

## Core Implementation Details

### Short-Link Generation Strategy

The primer specifies a two-step generation process to ensure uniqueness and sufficient key space. The primer hashes the user's IP address plus a timestamp; the variant shown here also mixes in the URL, applying **MD5 hashing** to `client_ip + timestamp + url` to produce a uniformly distributed 128-bit value, then applying **Base62 encoding** to create URL-safe strings.

Base62 uses the 62 characters in `[a-zA-Z0-9]`; the order of the alphabet is a free choice as long as the encoder uses it consistently. Keeping the first 7 characters gives 62⁷ ≈ 3.5 × 10¹² possible keys, far more than the projected 360 M links.

```python
import hashlib
import string

# Digits first, then lowercase and uppercase letters: 62 symbols in total.
BASE62 = string.digits + string.ascii_letters

def base62_encode(num, alphabet=BASE62):
    """Encode a non-negative integer as a Base62 string."""
    if num == 0:
        return alphabet[0]
    arr = []
    base = len(alphabet)
    while num:
        num, rem = divmod(num, base)
        arr.append(alphabet[rem])
    arr.reverse()
    return ''.join(arr)

def generate_shortlink(url, client_ip, timestamp):
    # Step 1: MD5 hash of client_ip + timestamp + url
    md5 = hashlib.md5(f"{client_ip}{timestamp}{url}".encode()).hexdigest()
    # Step 2: Convert hex to int, encode to Base62, keep the first 7 chars
    return base62_encode(int(md5, 16))[:7]
```
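The key-space figures can be checked with a few lines of arithmetic. This is a rough sanity check rather than part of the primer; the constant names are illustrative.

```python
import math

ALPHABET_SIZE = 62
KEY_LENGTH = 7
PROJECTED_LINKS = 360_000_000

# Total number of distinct 7-character Base62 keys.
key_space = ALPHABET_SIZE ** KEY_LENGTH

# How many times larger the key space is than the projected link count.
headroom = key_space / PROJECTED_LINKS

# Base62 characters needed to represent any 128-bit MD5 digest.
md5_base62_len = math.ceil(128 / math.log2(ALPHABET_SIZE))

# Largest possible leading digit of a full-length (22-character) encoding.
max_leading_digit = (2**128 - 1) // ALPHABET_SIZE ** (md5_base62_len - 1)

print(key_space, round(headroom), md5_base62_len, max_leading_digit)
```

The key space is roughly 9,800 times the projected link count. One caveat follows from the last value: a full-length encoding of a 128-bit digest can only start with a digit value of 7 or less, so the first character of a truncated key is not uniformly distributed and the usable space is somewhat smaller than 62⁷. Taking the last 7 characters instead of the first is one way to avoid that skew.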

### Database Schema and Persistence

The storage layer requires fast lookups by shortlink and efficient expiration scanning. The primer recommends:

1. **Relational Database**: Use `shortlink` as the primary key with a secondary index on `created_at` to support expiration queries
2. **Object Store**: Offload large paste contents to S3 when payloads exceed typical database row limits (relevant for Pastebin-style services)
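A minimal sketch of the relational layout follows, using SQLite as a stand-in for a production database. The table name `links`, the `expiration` column (assumed to hold a lifetime in minutes), and the Unix-timestamp representation of `created_at` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE links (
        shortlink     TEXT PRIMARY KEY,   -- 7-character Base62 key
        original_url  TEXT NOT NULL,
        expiration    INTEGER NOT NULL,   -- lifetime in minutes (assumed unit)
        created_at    INTEGER NOT NULL    -- Unix timestamp in seconds
    );
    CREATE INDEX idx_links_created_at ON links (created_at);
    """
)

conn.execute(
    "INSERT INTO links VALUES (?, ?, ?, ?)",
    ("abc1234", "https://example.com/very/long/path", 1440, 1_700_000_000),
)

# Point lookup by primary key, the hot path for the Read API.
row = conn.execute(
    "SELECT original_url FROM links WHERE shortlink = ?", ("abc1234",)
).fetchone()
print(row[0])
```

Making `shortlink` the primary key also means the database rejects a duplicate key on insert, which gives the Write API a reliable collision signal.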

Write scaling starts with a single master handling ~4 writes/sec, with sharding or federation added as traffic grows.

### API Design

The REST API exposes two primary endpoints, adapted here from the paste API in [`solutions/system_design/pastebin/README.md`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/pastebin/README.md):

**Create Endpoint**

```bash
curl -X POST -H "Content-Type: application/json" \
     -d '{"original_url":"https://example.com/very/long/path","expiration":1440}' \
     https://short.ly/api/v1/shorten
```

Returns: `{ "shortlink": "abc1234" }`

**Resolve Endpoint**

```bash
curl -L https://short.ly/abc1234
```

The Read API handles `GET /{shortlink}` by returning an **HTTP 301 redirect** to the stored URL after performing a cache lookup.
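The redirect behavior can be sketched as a small framework-independent function. The name `resolve`, the `lookup` callable, and the choice of 404 for unknown keys are assumptions for illustration; a real handler would sit behind whatever web framework the service uses.

```python
from http import HTTPStatus


def resolve(shortlink, lookup):
    """Return (status_code, headers) for GET /{shortlink}.

    `lookup` is any callable that maps a shortlink to a URL or None.
    """
    url = lookup(shortlink)
    if url is None:
        # Assumption: unknown or expired keys answer with 404.
        return HTTPStatus.NOT_FOUND.value, {}
    return HTTPStatus.MOVED_PERMANENTLY.value, {"Location": url}


links = {"abc1234": "https://example.com/very/long/path"}
print(resolve("abc1234", links.get))
print(resolve("missing", links.get))
```

Passing the lookup in as a callable keeps the redirect logic separate from where the mapping lives, so the same function works whether the lookup hits a cache, a database, or both.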

### Caching and Read Scaling

To handle 100M+ monthly reads, place **Redis** or **Memcached** in front of the database:

- Cache misses fall back to the master-slave replica set
- Hot keys remain in memory for sub-millisecond resolution
- Write-through or write-around strategies, combined with evicting keys when links expire, limit cache staleness
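The read path described in this list can be sketched with plain dictionaries standing in for Redis/Memcached and the database. The class name `CachedResolver` and the `db_reads` counter are hypothetical and exist only to make the behavior observable.

```python
class CachedResolver:
    """Cache-aside reads with write-through updates."""

    def __init__(self, db, cache=None):
        self.db = db                                  # stand-in for the SQL/NoSQL store
        self.cache = {} if cache is None else cache   # stand-in for Redis/Memcached
        self.db_reads = 0                             # counts cache misses that hit the db

    def get(self, shortlink):
        url = self.cache.get(shortlink)
        if url is not None:
            return url                    # cache hit: no database access
        self.db_reads += 1
        url = self.db.get(shortlink)      # cache miss: fall back to the database
        if url is not None:
            self.cache[shortlink] = url   # populate the cache for later reads
        return url

    def put(self, shortlink, url):
        """Write-through: update the database and the cache together."""
        self.db[shortlink] = url
        self.cache[shortlink] = url


resolver = CachedResolver(db={"abc1234": "https://example.com/very/long/path"})
resolver.get("abc1234")   # miss, reads the database
resolver.get("abc1234")   # hit, served from the cache
print(resolver.db_reads)
```

With a real cache, entries would also carry a TTL no longer than the link's remaining lifetime so that expired links do not keep resolving from memory.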

### Analytics and Expiration

**Analytics** processing runs MapReduce jobs over web-server logs to produce monthly hit counts, as shown in the primer's reference implementations.
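The map and reduce steps can be illustrated in plain Python without a MapReduce framework. The log format below (an ISO timestamp and a request path separated by a tab) is an assumption made for this sketch, as are the function names.

```python
from collections import Counter


def mapper(line):
    """Map one log line to ((year_month, shortlink), 1)."""
    timestamp, path = line.rstrip("\n").split("\t")
    year_month = timestamp[:7]            # e.g. "2016-01"
    return (year_month, path.lstrip("/")), 1


def reduce_counts(pairs):
    """Sum the counts for each (year_month, shortlink) key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


log_lines = [
    "2016-01-03T10:15:00\t/abc1234",
    "2016-01-09T18:02:11\t/abc1234",
    "2016-02-01T00:00:01\t/abc1234",
    "2016-01-20T07:45:30\t/xyz9876",
]

hits = reduce_counts(mapper(line) for line in log_lines)
for (year_month, shortlink), count in sorted(hits.items()):
    print(year_month, shortlink, count)
```

In a distributed job the mapper output would be shuffled by key across workers before reduction; the `Counter` here plays the role of both the shuffle and the reducer on a single machine.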

**Expiration cleanup** requires a periodic background job that finds rows whose `created_at` plus expiration length lies in the past and removes those stale rows from both the database and the cache.
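One possible shape for the cleanup job is sketched below, again with SQLite and a dictionary standing in for the real database and cache. The table layout, the minutes unit for `expiration`, and the function name `purge_expired` are assumptions for illustration.

```python
import sqlite3


def purge_expired(conn, cache, now):
    """Delete rows whose lifetime has elapsed and evict them from the cache."""
    expired = [
        row[0]
        for row in conn.execute(
            "SELECT shortlink FROM links "
            "WHERE created_at + expiration * 60 <= ?",
            (now,),
        )
    ]
    conn.executemany(
        "DELETE FROM links WHERE shortlink = ?", [(s,) for s in expired]
    )
    conn.commit()
    for shortlink in expired:
        cache.pop(shortlink, None)   # ignore keys that were never cached
    return expired


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE links (shortlink TEXT PRIMARY KEY, original_url TEXT NOT NULL, "
    "expiration INTEGER NOT NULL, created_at INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?, ?)",
    [
        ("old0001", "https://example.com/a", 60, 1_000),     # expires at 4_600
        ("new0001", "https://example.com/b", 1440, 1_000),   # expires at 87_400
    ],
)
cache = {"old0001": "https://example.com/a", "new0001": "https://example.com/b"}

removed = purge_expired(conn, cache, now=5_000)
print(removed)
```

Because the expiry time is computed per row, an index on `created_at` alone may not help this query much when lifetimes vary. Storing a precomputed expiry timestamp in its own indexed column is one alternative worth benchmarking.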

## Scaling Considerations

The `system-design-primer` suggests iterative scaling: benchmark → profile → address bottlenecks. Each component can expand independently:

1. **Load Balancer**: Add layer-7 load balancers to distribute traffic across API servers
2. **CDN**: Cache redirect responses at edge locations for popular links
3. **Read Replicas**: Scale database reads with additional replicas
4. **Sharding**: Partition the key space when the single master becomes a bottleneck

The design in [`solutions/system_design/pastebin/README.md`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/pastebin/README.md) suggests that horizontal scaling requires no API changes, only infrastructure adjustments.

## Summary

- **Generate keys** using MD5 hashing and Base62 encoding to create 7-character unique identifiers with 62⁷ possible combinations
- **Store mappings** in a SQL database with `shortlink` as primary key and secondary indexes on `created_at` for expiration queries
- **Expose APIs** via POST for creation and GET with 301 redirects for resolution
- **Add Redis caching** to handle read-heavy traffic and maintain low latency
- **Process analytics** using MapReduce over logs and clean expired entries with background jobs
- **Scale iteratively** through load balancers, read replicas, and sharding without changing the public interface

## Frequently Asked Questions

### How do you prevent collisions when generating short links?

The primer combines **MD5 hashing** over client IP and timestamp entropy with **Base62 encoding**. With 62⁷ ≈ 3.5 trillion possible 7-character keys against ~360 M links, the chance that any single new key is already taken stays at about 0.01% even when the table is full. Across all 360 M writes, however, the birthday bound n²/(2N) predicts on the order of 18,000 colliding pairs even with perfectly uniform keys, so collisions are rare but expected. The Write API should therefore check whether a key already exists and re-hash with a counter when it does.
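The check-and-retry approach can be sketched as follows, together with a birthday-bound estimate of how often it would trigger. The dictionary `store` stands in for the database, and the names `candidate_key`, `create_shortlink`, and `expected_collisions` are hypothetical. Appending the attempt counter to the hash input is one possible way to re-hash, not necessarily the primer's.

```python
import hashlib
import string

BASE62 = string.digits + string.ascii_letters


def base62_encode(num):
    """Encode a non-negative integer as a Base62 string."""
    if num == 0:
        return BASE62[0]
    chars = []
    while num:
        num, rem = divmod(num, 62)
        chars.append(BASE62[rem])
    return "".join(reversed(chars))


def candidate_key(url, client_ip, timestamp, attempt):
    """MD5 + Base62 key; `attempt` is appended so a retry hashes differently."""
    data = f"{client_ip}{timestamp}{url}{attempt}".encode()
    digest = hashlib.md5(data).hexdigest()
    return base62_encode(int(digest, 16))[:7]


def create_shortlink(store, url, client_ip, timestamp, max_attempts=5):
    """Store url under a free key, re-hashing with a counter on collision."""
    for attempt in range(max_attempts):
        key = candidate_key(url, client_ip, timestamp, attempt)
        if key not in store:
            store[key] = url
            return key
    raise RuntimeError("no free key found")


def expected_collisions(n_links, key_space):
    """Birthday-bound approximation n(n-1)/(2N), assuming uniform keys."""
    return n_links * (n_links - 1) / (2 * key_space)


store = {}
first = create_shortlink(store, "https://example.com", "203.0.113.7", 1_700_000_000)
# Identical inputs collide on attempt 0, so the counter forces a new key.
second = create_shortlink(store, "https://example.com", "203.0.113.7", 1_700_000_000)
print(first, second, round(expected_collisions(360_000_000, 62**7)))
```

Under the uniform-key assumption the estimate comes out to roughly 18,400 colliding pairs over the full 360 M links, which is small relative to the total but large enough that the existence check is not optional. In a real database the check and the insert should be a single atomic operation, for example by relying on the primary-key constraint, to avoid a race between concurrent writers.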

### What database should I use for a URL shortening service?

Either **SQL** (PostgreSQL, MySQL) or **NoSQL** (DynamoDB, Cassandra) works depending on your team's expertise. The primer's solution in [`solutions/system_design/pastebin/README.md`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/pastebin/README.md) suggests SQL with `shortlink` as the primary key because the data model is a simple key-value mapping. Use secondary indexes on `created_at` columns to support efficient expiration scanning.

### How do you handle analytics for billions of redirects?

Process web-server logs asynchronously using **MapReduce** jobs. The primer references this approach in [`solutions/system_design/pastebin/README.md`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/pastebin/README.md) and provides related examples in [`solutions/system_design/web_crawler/web_crawler_snippets.py`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/web_crawler/web_crawler_snippets.py). Aggregate hit counts periodically rather than updating the database on every read to avoid write amplification.

### What is the difference between designing Bit.ly and Pastebin?

While both use the same short-link generation mechanics, **Pastebin** stores large text payloads and therefore needs an **Object Store** (S3) for content bodies, whereas **Bit.ly** stores only URL mappings in the database. The accompanying [`solutions/system_design/pastebin/pastebin.py`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/pastebin/pastebin.py) implements the MapReduce hit-count job rather than storage logic, so the paste-specific parts of the design (a per-paste object-store path and content handling) are described in the README and can be omitted by a pure URL shortener.