AkiVaMu Just tiny things come to mind...

System design

Steps to design a system

Taking example to design Pastebin.com

Step 1: Requirement & scope

  • Features, Use cases
    • What does the system do? Inputs and outputs?
    • Who is going to use it?
    • How are they going to use it?
  • Constraints
    • How many users?
    • Data size?
    • Latency per request?
    • Throughput (requests per duration)?
    • Read/Write ratio
  • Calculate/estimate: size, performance…

Step 2: Create a high level design

Design high level architecture

  • Modules, components
  • Interfaces, communications
  • Which parts to be scaled

Step 3: Design core components

Base on each use case/feature:

  • Flow diagram
  • Sequence diagram
  • Input output

Step 4: Scale the design

  • Identify bottleneck
  • Scale strategy:
    • Vertical vs horizontal
    • Load balancer
    • Caching
    • Distributed data store

Principles

  • Performance vs scalability
    • Performance: the system is slow for a single user
    • Scalability: the system is fast for a single user, but slow under heavy load
  • Latency vs throughput
    • Latency: processing time per request
    • Throughput: number of requests per time
    • Best: maximal throughput with acceptable latency
  • Availability vs consistency
    • Consistency: always get same result, or same error
    • Availability: always get result, but not guarantee valid or consistence result
      • Fail-over: main node (serving request) and backup nodes (in case main node fails)
      • Replication: multiple, cloned nodes (all serving request)
    • Partition Tolerance: trade off

Optimize independent unit-of-work

Mean reducing the processing time of one single, no-dependencies task

  • Optimize algorithm
  • Increase processing ability (CPU, mem upgrade)
  • Caching

Optimize task with external dependencies (whole system)

For a system with input and output:

Standards, documentation

TODO

References

Scalability for dummies

  • Clone (horizontal scale)
    • Load balancer
    • Consistency among clones
      • Stateless request
      • Same code base
      • External, centralized data store/cache
  • Database
  • Cache
    • What to cache
    • Where to store cache entry: Persistence vs Memcache
    • When to invalidate cache entry
  • Asynchronism
    • Message queue
    • Worker pool