# Horizontal vs Vertical Scaling: Pros, Cons, and Architectural Trade-offs

> Understand the pros, cons, and architectural trade-offs of horizontal versus vertical scaling, and learn which approach best fits your system design.

- Repository: [donnemartin/system-design-primer](https://github.com/donnemartin/system-design-primer)
- Tags: deep-dive
- Published: 2026-02-24

---

**Vertical scaling upgrades existing server resources while horizontal scaling distributes load across multiple commodity machines, with the latter offering superior availability but requiring stateless application design and externalized session management.**

The choice between horizontal and vertical scaling determines how engineering teams handle growth in traffic and data volume. The donnemartin/system-design-primer repository offers guidance on choosing between the two based on workload characteristics, cost constraints, and operational maturity. Understanding the trade-offs it documents helps architects evolve from single-instance deployments to distributed systems without unnecessary complexity.


## Understanding Vertical Scaling (Scale-Up)

### Definition and Implementation

**Vertical scaling** adds more CPU, RAM, or storage to a single existing machine. In cloud infrastructure, this means moving an EC2 instance to a larger instance type or an RDS database to a larger DB instance class.


### Advantages of Vertical Scaling

- **Simplicity**: Upgrading a single instance requires minimal architectural changes and no distributed systems expertise.
- **Immediate capacity**: Existing applications benefit from increased resources without code modifications or data model redesigns.
- **Validation suitability**: Ideal for proving core functionality before investing in distributed architecture complexity.


### Limitations and Risks

According to [`README.md`](https://github.com/donnemartin/system-design-primer/blob/main/README.md) (lines 703-706), vertical scaling presents significant constraints that limit its applicability at scale:

- **Cost inefficiency**: High-end specialized hardware costs disproportionately more than commodity alternatives for the capacity it adds.
- **Single point of failure**: One powerful machine is a critical vulnerability; a hardware failure takes down the entire service.
- **Harder hiring**: It is easier to hire engineers who work on commodity hardware than specialists in high-end enterprise systems.


## Understanding Horizontal Scaling (Scale-Out)

### Definition and Architecture

**Horizontal scaling** adds more commodity machines behind a load balancer to distribute incoming requests. As documented in [`README.md`](https://github.com/donnemartin/system-design-primer/blob/main/README.md) (lines 703-706), this approach treats infrastructure as a pool of interchangeable resources rather than monolithic assets.


### Operational Benefits

- **Availability improvements**: Losing one node does not compromise the entire service, enabling fault-tolerant architectures.
- **Cost efficiency**: Commodity hardware provides better price-to-performance ratios than premium single-box solutions.
- **Elastic capacity**: Systems can add or remove nodes dynamically based on real-time traffic patterns.
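
The availability benefit can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes node failures are independent, that any single surviving node can carry the full load, and that the load balancer itself never fails. The function name `pool_availability` and the 99% per-node figure are hypothetical and do not come from the primer.

```python
def pool_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one of `nodes` independent nodes is up.

    Assumes independent failures and that one node can serve all traffic.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The pool is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    # Hypothetical per-node availability of 99%.
    for n in (1, 2, 3):
        print(f"{n} node(s): {pool_availability(0.99, n):.6f}")
```

Under these assumptions each added node multiplies the probability of a total outage by the per-node failure rate. Real systems see correlated failures (shared racks, bad deploys), so treat the result as an upper bound.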


### Implementation Challenges

The transition to horizontal scaling introduces operational complexity documented in [`README.md`](https://github.com/donnemartin/system-design-primer/blob/main/README.md) (lines 708-713):

- **Stateless requirement**: Application servers must not store local session data or user-specific state between requests.
- **Externalized sessions**: Session state must live in a centralized data store, such as a database (SQL, NoSQL) or a persistent cache (Redis, Memcached), that every node can reach.
- **Downstream pressure**: Databases and internal services face increased concurrent connection counts from multiple application nodes.
- **Operational overhead**: Managing multiple instances requires orchestration, monitoring, and automated deployment pipelines.
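
The stateless requirement and externalized sessions described above can be sketched as follows. This is a minimal illustration, not the primer's implementation: `SharedSessionStore` is a hypothetical in-memory stand-in for an external store such as Redis, and `AppNode` is a hypothetical application server. In a real deployment the store would be a separate network service.

```python
from typing import Dict, Optional


class SharedSessionStore:
    """Hypothetical in-memory stand-in for an external store such as Redis."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, str]] = {}

    def load(self, session_id: str) -> Dict[str, str]:
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def save(self, session_id: str, session: Dict[str, str]) -> None:
        self._data[session_id] = dict(session)


class AppNode:
    """A stateless application server: no per-user data is kept on the node."""

    def __init__(self, name: str, store: SharedSessionStore) -> None:
        self.name = name
        self._store = store

    def login(self, session_id: str, user: str) -> None:
        session = self._store.load(session_id)
        session["user"] = user
        self._store.save(session_id, session)

    def whoami(self, session_id: str) -> Optional[str]:
        return self._store.load(session_id).get("user")


if __name__ == "__main__":
    store = SharedSessionStore()
    node_a = AppNode("node-a", store)
    node_b = AppNode("node-b", store)

    node_a.login("session-1", "alice")   # first request lands on node A
    print(node_b.whoami("session-1"))    # follow-up request lands on node B
```

Because neither node holds the session itself, the load balancer can route each request to any node, and a node can be removed without logging users out.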


## Load Balancer Critical Considerations

### The Hidden Bottleneck

Load balancers themselves can become architectural vulnerabilities. As noted in [`README.md`](https://github.com/donnemartin/system-design-primer/blob/main/README.md) (lines 714-718), if provisioned incorrectly, the load balancer transforms from a traffic distribution mechanism into a **single point of failure** or a throughput bottleneck that limits system growth.


### Mitigation Strategies

Engineers must implement health checks, configure redundant load balancer instances, and monitor connection counts to prevent the scaling infrastructure from becoming the limiting factor under high load.
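
How health checks keep traffic away from failed nodes can be illustrated with a small round-robin selector. This is a simplified sketch under stated assumptions, not how NGINX or any managed load balancer is implemented: `RoundRobinBalancer` and the `is_healthy` callback are hypothetical names, and a real balancer would probe backends in the background rather than on every request.

```python
from typing import Callable, List


class RoundRobinBalancer:
    """Hypothetical round-robin selector that skips unhealthy backends."""

    def __init__(self, backends: List[str], is_healthy: Callable[[str], bool]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._is_healthy = is_healthy
        self._next = 0  # index of the next backend to try

    def pick(self) -> str:
        """Return the next healthy backend, trying each backend at most once."""
        for _ in range(len(self._backends)):
            backend = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if self._is_healthy(backend):
                return backend
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    down = {"10.0.1.11"}  # pretend this node failed its health check
    balancer = RoundRobinBalancer(
        ["10.0.1.10", "10.0.1.11", "10.0.1.12"],
        is_healthy=lambda backend: backend not in down,
    )
    print([balancer.pick() for _ in range(4)])
```

The selector itself is still a single component, which is why the redundancy advice above applies to the load balancer as well as to the nodes behind it.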


## Practical Implementation Examples

### NGINX Layer 7 Load Balancing

With NGINX, horizontal scaling starts with an `upstream` pool that distributes requests across multiple web servers:

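```nginx
# Required top-level block; the defaults are fine for this sketch.
events {}

http {
    # Pool of application servers; round-robin is the default method.
    upstream backend {
        server 10.0.1.10;   # Web server 1
        server 10.0.1.11;   # Web server 2
        # Add more servers as you scale out
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
            # Pass the original host and client IP through to the backends.
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
```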


### AWS Elastic Load Balancer Configuration

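Cloud-native horizontal scaling uses managed load balancers to distribute traffic across Availability Zones. The JSON below follows the shape of a Classic Load Balancer definition, with one HTTP listener and two zones: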

```json
{
  "LoadBalancerName": "my-app-elb",
  "Listeners": [
    {
      "Protocol": "HTTP",
      "LoadBalancerPort": 80,
      "InstanceProtocol": "HTTP",
      "InstancePort": 80
    }
  ],
  "AvailabilityZones": ["us-east-1a", "us-east-1b"]
}
```


### Vertical Scaling via AWS CLI

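To scale an existing Amazon RDS database vertically, change its DB instance class. The `--apply-immediately` flag applies the change right away instead of waiting for the next maintenance window, and resizing typically interrupts the database briefly: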

```bash
aws rds modify-db-instance \
    --db-instance-identifier mydb \
    --db-instance-class db.m5.large \
    --apply-immediately
```


## Migration Strategy: From Vertical to Horizontal

### The Evolution Path

The [`solutions/system_design/scaling_aws/README.md`](https://github.com/donnemartin/system-design-primer/blob/main/solutions/system_design/scaling_aws/README.md) (lines 84-99) recommends a phased approach for system evolution:

1. **Start vertical**: Validate core functionality with a single powerful instance to minimize initial complexity.
2. **Prepare infrastructure**: Implement **stateless services**, **distributed caches**, and **replicated databases** before traffic demands force architectural changes.
3. **Transition early**: Move to horizontal scaling before resource exhaustion creates emergency scenarios.


## Summary

- **Vertical scaling** offers simplicity but creates single points of failure and hard scaling ceilings due to hardware limitations.
- **Horizontal scaling** provides availability and cost benefits but requires stateless architecture and externalized session management.
- **Load balancers** require redundancy planning and health checking to avoid becoming bottlenecks or critical failure points.
- **Migration strategy** should begin with vertical validation, then transition to horizontal scaling while implementing distributed data stores and caches.


## Frequently Asked Questions

### Is horizontal scaling always better than vertical scaling?

No. Horizontal scaling introduces significant operational complexity regarding state management, downstream connections, and distributed debugging. Vertical scaling remains optimal for small workloads, proof-of-concept validation, and systems requiring strong consistency without distributed transaction overhead.


### What makes vertical scaling difficult for hiring?

The donnemartin/system-design-primer documentation notes that it is easier to hire engineers who work on commodity hardware than those who specialize in high-end enterprise systems. The commodity hardware used in horizontal scaling matches widely available Linux and cloud engineering skills, which makes recruitment easier.


### How do load balancers impact horizontal scaling architectures?

Load balancers distribute traffic across horizontally scaled nodes. Without redundancy and health checks, however, a load balancer becomes a single point of failure, and an under-provisioned one becomes a throughput bottleneck that limits growth under high concurrent load, as documented in [`README.md`](https://github.com/donnemartin/system-design-primer/blob/main/README.md) (lines 714-718).


### When should I migrate from vertical to horizontal scaling?

Migrate when the cost of further vertical upgrades exceeds the cost of equivalent horizontal infrastructure, or when availability requirements demand eliminating single points of failure. The primer suggests validating core functionality on a single instance first, then putting **stateless services**, **distributed caches**, and **replicated databases** in place before traffic growth forces emergency architectural changes.