Giter VIP home page Giter VIP logo

system-design-notebook's Introduction

A curated collection of resources and exercises to help you learn about system design

Topics

Topics Explained

Requirements

Usually a system design process starts with understanding the system's purpose and one way to understand system's purpose or goal, is to clearly define a list of requirements.
These requirements allow us not only to understand how the system will be used and how it works, but also set clear boundaries which will make sure our design is focused on the right aspects of the systems. We usually distinguish between functional and non-functional requirements.

Functional Requirements

Functional requirements are used to specify an expected function or a behaviour of the system. Simply put, something the system should be able to do.
For example, for a video streaming service a requirement might be to upload a video or comment on a video. For instant messaging application, a functional requirement will be, to be able send and receive messages.

Non-Functional Requirements

Non-functional requirements focus on how the system performs, especially in general and not focusing on specific functions.
While such requirements might affect user's experience they shouldn't affect specific functionality or features the system supports.

For example, if a system is a type of a service, a non-functional requirement might be "zero downtime" or "No loss of data".

Basic Architecture

Client

A client refers to a software or hardware accessing a resource or a service that is served by a server. While in some cases the server and the client might be on the same system/host, in most cases they will be on separate systems.

Examples for clients:

  • A Web browser that is used by a user to access a certain web page
  • A mobile phone that is used by the user to read emails

Server

A server, similarly to a client, can be a software or hardware, but as opposed to a client, its role is to serve the client. It can be by providing a certain resource to the client or let it use a service that is running on the server. Few examples:

  • A system that stores files and allow the user to access or download them
  • A system that runs a service which allows users to listen to music

Scalability

Wikipedia: "Scalability is the property of a system to handle a growing amount of work by adding resources to the system"

In simpler words, scalability is about answering the question whether a system or an architecture are able to scale in a way that meets the new workloads and demand.
More practically, answer questions like:

  • if a system runs a database, does it able to handle more queries?
  • if a system runs a service that stream videos to million users. Will it able to stream them the same way if the amount of users would triple itself?

Also, scaling can be performed on different components. For example, in most cloud environments scaling is supported in case of:

  • Compute hosts
  • Virtual network functions
  • VMs/Instances
  • Containers

There are different ways to scale.

Vertical Scaling

Adding additional resources to the existing system/component/unit. If we have a server, a vertical scaling might be done in one or more of the following ways:

  • Adding more RAM to the server
  • Adding more storage/disks
  • Adding CPUs

Horizontal Scaling

Adding more systems/units/components but at the same time, make them work together so it would seems to the client as if there is one system it interacts with.
Few examples:

  • Instead of one web server, having two web servers with one load balancer balancing the traffic between them
  • Instead of one database server, having two databases

Scalability Factor

When you double the resources of your system (or design) you might expect your system to be able to handle double the workload as well, right? But this is not necessarily what will happen. Scalability factor is the term used to describe the workload your system is able to handle as a result of scaling your resources.

Linear Scalability

Linear Scalability happens when the workloads your system is able to handle scale accordingly to the scale in resources. The scalability factor remains constant as you scale.
For example, you triple the resources of your system -> the system is able to handle triple the workloads. In reality, it's actually not the case most of the time.

Sub-Linear Scalability

A more realistic outcome of scaling systems would be that some resources or component may not scale as expected (or as other resources and components). So doubling the resources will actually lead to an improvement of only x1.5 in workloads handling. In this case the scalability factor will be lower than 1.0

Supra-Linear Scalability

This is the optimal outcome. You triple the workloads handling by "only" doubling your resources for example. In other words, the ratio in performance change is bigger than the ratio in scaling changes (e.g. adding more CPUs). A scalability factor in this case, is bigger than 1.0

Negative Scalability

It may sound crazy, but in some cases, scaling your system might actually lead to worse results and that's exactly what negative scalability is all about. Scalability factor is below 0.

Networking

Public IP

Wikipedia: "A public IP address, in common parlance, is a globally routable unicast IP address, meaning that the address is not an address reserved for use in private networks"

From system design perspective, when you have a resources or a component, you would like everyone to be able to access to, whether for direct communication (like a web server) or as a gateway for other components (like a load balancer), you should use a public IP

Private IP

Whenever you don't want users to be able to globally interact with a certain component or resource, you should use a private IP address. Few examples:

  • Web servers that only the load balancer should communicate with them directly
  • Internal servers that users outside the organization should access

Private IPs, as opposed to public IPs, don't have to be unique and each separate network, can use the same addresses.

Latency

The time it takes to perform a certain task/action

Throughput

The number of tasks/actions per unit of time

DNS

Wikipedia: "Most prominently, it translates more readily memorized domain names to the numerical IP addresses needed for locating and identifying computer services and devices with the underlying network protocols."

In other words, the most common use can of a DNS would be a address translation. It can be from a hostname to IP address and vice versa - from an IP address to a hostname. In addition, a DNS can be used for load balancing, using the round robin technique.

CDN

Cloudflare: "A content delivery network (CDN) refers to a geographically distributed group of servers which work together to provide fast delivery of Internet content."

In other words, a content delivery network allows you to quickly transfer content by having servers with the content around the world or certain area. The client then, access these servers instead of the main server where the data originates from.

Exercises

Note: The names of the exercises are quotes from movies (sometimes little bit modified). If you can guess from which movie, please submit it to movies.md file in this way: [QUOTE] [MOVIE] [YOUR NAME]
Another note: Exercises may repeat themselves in different variations to practice and emphasize different concepts.

"Elementary, my dear Watson"

You have a website running on a single server. It's mostly running fine because only two users access it on weekly basis :'(
It suddenly becomes super popular and many users try to access it, but they are experiencing issues due to high load of the server. Two questions: * What term/pattern in system design is referring to the issue you are experiencing? * How can you deal with it (even if partially) WITHOUT adding more servers or changing the architecture?


  • Scalability. Your web server doesn't scale based on demand (= the additional users accessing your website) hence they are experiencing issues.
  • Apply vertical scaling which means, adding more resources to your server - more CPU, more RAM. This way, your architecture doesn't change, but your website is able to serve more users.
Will 'vertical scaling' solve your scale issues permanently? Is it the optimal solution?

It might solve your issue for limited time, but you can't solely rely on it. Vertical scaling has limitations. You can't keep adding RAM, storage and CPU endlessly. Eventually you'll hit some physical limit where for example, you simply don't have anymore space in your server box and you bought the best components you could.

Assuming you now can extend the architecture, what would you change?

"Perfectly balanced, as all things should be"

You have the following simple architecture of a server handling requests from a client. What are the drawbacks of this design and how to improve it?


  • Limitations:

    • Load - at some point it's possible the server will not be able to handle more requests and it will fail or cause delays
    • Single point of failure - if the server goes down, nothing will be able to handle the requests
  • How to improve:

  • Further limitations:

    • Load was handled as well as the server being a single point of failure, but now the load balancer is a single point of failure.
Is there a way to improve the above design without adding an actual load balancer instance?

Yes, one could use DNS load balancing.
Bonus question: which algorithm a DNS load balancer will use?

What are the drawbacks of round robin algorithm in load balancing?
  • A simple round robin algorithm knows nothing about the load and the spec of each server it forwards the requests to. It is possible, that multiple heavy workloads requests will get to the same server while other servers will got only lightweight requests which will result in one server doing most of the work, maybe even crashing at some point because it unable to handle all the heavy workloads requests by its own.
  • Each request from the client creates a whole new session. This might be a problem for certain scenarios where you would like to perform multiple operations where the server has to know about the result of operation so basically, being sort of aware of the history it has with the client. In round robin, first request might hit server X, while second request might hit server Y and ask to continue processing the data that was processed on server X already.

"For all my actions both public and private"

The following is an architecture of a load balancer serving and three web servers. Assuming, we would like to have a secured architecture, that makes sense, where would you set a public IP and where would you set a private IP?


It makes sense to hide the web servers behind the load balancers instead of giving users direct access to them, so each one of them will have a private IP assigned to it. The load balancer should have a public IP since, we except anyone who would like to access a certain web page/resource, to go through the load balancer hence, it should be accessible to users.

What load balancing techniques are there?
  • Round Robin
  • Weighted Round Robin
  • Least Connection
  • Weighted Least Connection
  • Resource Based
  • Fixed Weighting
  • Weighted Response Time
  • Source IP Hash
  • URL Hash

"Keep calm, all I want is your cash"

The following is a simple architecture of a client making requests to web server which in turn, retrieves the data from a datastore. What are the drawbacks of this design and how to improve it?


  • Limitations:

    • Time - retrieving the data from the datastore every time a request is made from the client, might take a while
    • Single point of failure - if the datastore is down (or even slow) it wouldn't be possible to handle the requests
    • Load - the datastore getting all the requests can result in high load on the datastore which might result in a downtime
  • How to improve:

Are you able to explain what is Cache and in what cases you would use it?

Why to use cache?

  • Save time - Accessing a remote datastore, and in general making network calls, takes time
  • Reduce load - Instead of the datastore handling all the requests, we can take some of its load and reduce by accessing the cache
  • Avoid repetitive tasks - Instead of querying the data and processing it every time, do it once and store the result in the cache
Why not storing everything in the cache?

For multiple reasons:

  1. The hardware on which we store the cache is in some cases much more expensive
  2. More data in the cache, the bigger it gets and longer the put/get actions will take

"In a galaxy far, far away..."

The following is a system design of a remote database and three applications servers


  • Limitations:

    • Latency. Every query made to the remote database will hit latency, even if small.
    • In case the remote database crashes, the app will stop working
  • How to improve:

    * Replicate each database to the local app server. This has several advantages. First, we are not bound to latency anymore. Secondly, a fai
  • Further limitations:

    • We are bound now to bandwidth
    • If the remote database isn't accessible for a long period of time, we'll have an outdated database and each app has the potential to work against a different DB

"A bit on the slow side"

The following is an improvement of the previous system design


  • Limitations:

    • Queries to database might be slow, even on the server itself where the app is running
    • Once the remote database isn't available, the local databases will not by in sync
  • How to improve:

Questions

This is a more "simplified" version of exercises section. No drawings, no evolving exercises, no strange exercises names, just straightforward questions, each in its own category.

Your website usually serves on average a dozen of users and has good CPU and RAM utilization. It suddenly becomes very popular and many users try to access your web server but they are experiencing issues, and CPU, RAM utilization seems to be on 100%. How would you describe the issue?

Scalability issue. The web server doesn't scales :'(
In order to avoid such issues, the web server has to scale based on the usage. More users -> More CPU, RAM utilization -> Add more resources (= scale up). An When there are less users accessing the website, scale down.

Scalability

Explain Scalability
Explain "Vertical Scaling" and give an example where it can help to solve an issue

Vertical scaling is the act of adding

For example, you have a website which serve a class of 20 students. Suddenly, you are teaching multiple classes and your website has to service 40 students. In order to be able to do that, you might have to apply "vertical scaling" and add resources like RAM and CPU to the server running your website.

Why we can't usually rely solely on "vertical scaling" to solve scaling issues?

Because you can't keep upgrading forever a certain server. At some point, you'll hit limitations of buying the best components you could and not having additional space for more components. Maybe the best RAM you could buy is 10TB, but you actually need 19TB RAM to serve all the users.

What is "Horizontal Scaling"?
Once we perform "Horizonal Scaling", by for example adding multiple web servers instead of having one server, how do we handle client acess to these servers?

Using a load balancer

Load Balancer

Tell me everything you know about Load Balancers
What load balancing techniques are there?
  • Round Robin
  • Weighted Round Robin
  • Least Connection
  • Weighted Least Connection
  • Resource Based
  • Fixed Weighting
  • Weighted Response Time
  • Source IP Hash
  • URL Hash
Do you neccesrialy need a dedicated load balancer instance to perform load balancing? (using the round robin technique for example)
No, you can use a DNS server.
What is an application load balancer?
At what layers a load balancer can operate?
What is DNS load balancing?
What are the drawbacks of round robin algorithm in load balancing?
  • A simple round robin algorithm knows nothing about the load and the spec of each server it forwards the requests to. It is possible, that multiple heavy workloads requests will get to the same server while other servers will got only lightweight requests which will result in one server doing most of the work, maybe even crashing at some point because it unable to handle all the heavy workloads requests by its own.
  • Each request from the client creates a whole new session. This might be a problem for certain scenarios where you would like to perform multiple operations where the server has to know about the result of operation so basically, being sort of aware of the history it has with the client. In round robin, first request might hit server X, while second request might hit server Y and ask to continue processing the data that was processed on server X already.
What are sticky sessions? What are their pros and cons?
Explain each of the following load balancing techniques
  • Round Robin
  • Weighted Round Robin
  • Least Connection
  • Weighted Least Connection
  • Resource Based
  • Fixed Weighting
  • Weighted Response Time
  • Source IP Hash
  • URL Hash

Which load balancing technique would you use for e-commerce website? Why?

One that supports sticky sessions so users returning to the website, would have their data loaded, in case the server don't use shared storage.

Cache

Tell me everything you know about Cache
True or False? While caching can reduce load time, it's increasing the load on the server

False. If your server doesn't have to execute the request since the result is already in the cache, then it's actually the opposite - there is less load on the server in addition to reduced load times.

Networking

What is a public IP? In which scenarios, one should use a public IP?
What is a private IP? In which scenarios, one should use a private IP?
What is latency?
What is latency of L1 cache reference vs. L2 cache reference?

L1 cache reference latency is 0.5 nanosecond L2 cache reference latency is 7 nanosecond

So basically the latency of L2 cache reference is 14x L1 cache reference.

DNS

What is a DNS?
Can you use DNS for load balancing?

Storage

What is RAID?

DNS

What is CDN?

Misc

How operating systems able to run tasks simultaneously? for example, open a web browser while starting a game

The CPU have multiple cores. Each task is executed by a different core.
Also, it might only appear to run simultaneously. If every process is getting CPU allocation every nanosecond, the user might think that both processes are running simultaneously.

What is a VPS?

From wikipedia: "A virtual private server (VPS) is a virtual machine sold as a service by an Internet hosting service."

True or False? VPS is basically a shared server where each user is allocated with a portion of the server OS

False. You get a private VM that no one else can or should use.

Resources

There many great resources out there to learn about system design in different ways - videos, blogs, etc. I've gathered some of them here for you

By Topic

System Design Introduction

Articles

Scalability

Videos

Repositories

  • awesome-scalability - "An updated and organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems"

By Resources Type

Videos

System Design

  • Gaurav Sen - Excellent series of videos on system design topics
  • System Design Interview - How to get through system design interviews. Covering both architecture and code

Scalability

Repositories

Scalability

  • awesome-scalability - "An updated and organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems"

System Design

  • system-design-primer - "Learn how to design large-scale systems. Prep for the system design interview."
Books
Articles

Introduction

System Design Process

How to perform system design???

  1. Define which quality attributes are important for your system - scalability, efficiency, reliability, etc.

System Design Interview Tips

If you are here because you have an system design interview, here a couple of suggestions

What to ask yourself when you see a design and asked to give an opinion on it

Note: You might want to ask yourself these questions also after you've done performing a system design

  • Does it scale if I add more users?
  • Is there a single point of failure in the design?

What you might want to ask when you need to perform a system design

  • What are the requirements?

    • How the system is used?
    • How much users are expected to access the system?
    • How often the users access the system?
    • Where the users access the system from?
  • Are there any constraints?

Credits

The icon in the banner made by Freepik from www.flaticon.com

Contributions

If you would like to contribute to the project, please read the contribution guidelines

License

License: CC BY-NC-ND 3.0

system-design-notebook's People

Contributors

bregman-arie avatar

Stargazers

Mindula Dilthushan Manamperi avatar SNOmad1 GAMMA avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.