What Happens When I Type “holbertonschool.com” into My Browser and Press Enter?

basic web-infrastructure design

Today we’ll be responding to a common software engineering interview question: “What happens when I type holbertonschool.com (or any other address) into my browser and press enter?” This question is usually asked to gauge whether you understand web-infrastructure — how websites are hosted and how they serve content to users.

This question can be summarized in minutes or explained thoroughly for hours. That’s why I recommend that, when you receive this question, you respond by asking the interviewer clarifying questions like: “In what level of detail should I explain the workflow?” and “Is there any one area (networks, security, servers, etc.) that you’d like me to focus on more than others?” Because this question is so open-ended, it’s important that you discover your interviewer’s intent in asking it. Are they testing your communication skills by asking you to summarize a complex process in a few minutes? Are they testing how knowledgeable you are about web-infrastructure by asking you to dive deep into technical descriptions? Or are they hiring you for a more specific role in web-infrastructure (ex: security engineer), and want to see how much you know about HTTPS (Hyper Text Transfer Protocol Secure), SSL (Secure Sockets Layer), and configuring firewalls?

This article will be a deep dive. It will explain each step in the process from start to finish and analyze where each step occurs in the Open Systems Interconnection (OSI) model. Let’s start with the Domain Name System (DNS) process.

OSI model

The user types in holbertonschool.com and hits enter. Currently we are on Layer 7 of the OSI model — the application layer (Chrome, Firefox, Skype, or any other application the user interacts with makes up the application layer). The OSI model has 7 layers and, from the perspective of a user, runs top to bottom (application layer to physical layer).

After hitting enter, the user’s browser checks its cache for an IP address that corresponds to that domain. The browser cannot locate the IP address, so it asks the user’s operating system to check its cache. The user’s operating system cannot locate the IP address either, so it asks the DNS resolver to resolve that particular domain (find the corresponding IP address). The resolver is typically hosted on a server provided by your Internet Service Provider (ISP).

First, the resolver checks its own cache for the corresponding IP address. It’s unable to locate the domain in its cache, but it does know the location of the Root Server. The Root Server cannot resolve our domain, but it does know where the corresponding Top Level Domain (TLD) server is. The TLD is the “.com”, “.org”, “.net”, etc. portion of a domain (there are roughly 1,000 different TLDs at this point). In this example, we’re a “.com”, so we get directed to the TLD server for “.com”. The resolver saves the address of the corresponding TLD server so it doesn’t have to ask the Root Server again next time.

The TLD server cannot resolve our domain, but it does know the location of the corresponding authoritative nameserver for that domain. But what is an “authoritative nameserver”? And how did the TLD server know which nameserver corresponds to our domain? When a domain is purchased, that name is registered with a domain registrar. The domain registrar tells the corresponding TLD servers the locations of the nameservers that can provide answers to DNS queries. The authoritative nameservers are the last stop in the DNS process. Either the nameserver will provide us with a valid IP address for holbertonschool.com, or it will return an NXDOMAIN error — notifying us that it was unable to resolve the domain we provided (couldn’t find an IP address). Note that this is a DNS-level error, not an HTTP 404: a 404 can only come from a web-server after the domain has already been resolved. The resolver will then cache the IP address provided by the nameserver so it can resolve this domain much more quickly the next time around.
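The resolution chain above can be sketched in a few lines of Python. This is a toy model, not real DNS: the server “databases” and the IP address below are hypothetical stand-ins for the root, TLD, and authoritative nameservers, and no network traffic is involved.

```python
# Toy model of the DNS resolution chain: root -> TLD -> authoritative.
# All data here is made up for illustration; real resolvers speak the DNS
# protocol over the network.

ROOT_SERVER = {"com": "tld-com-server"}                  # root knows TLD servers
TLD_SERVERS = {"tld-com-server": {"holbertonschool.com": "ns1.example-dns.net"}}
AUTHORITATIVE = {"ns1.example-dns.net": {"holbertonschool.com": "52.84.16.13"}}

resolver_cache = {}  # the resolver caches answers so repeat lookups are fast


def resolve(domain):
    """Walk root -> TLD -> authoritative nameserver, caching the answer."""
    if domain in resolver_cache:
        return resolver_cache[domain]        # cache hit: skip the whole chain
    tld = domain.rsplit(".", 1)[-1]          # e.g. "com"
    tld_server = ROOT_SERVER.get(tld)        # root points us at the TLD server
    if tld_server is None:
        return "NXDOMAIN"
    nameserver = TLD_SERVERS[tld_server].get(domain)  # TLD points at nameserver
    if nameserver is None:
        return "NXDOMAIN"
    ip = AUTHORITATIVE[nameserver].get(domain, "NXDOMAIN")
    if ip != "NXDOMAIN":
        resolver_cache[domain] = ip          # cache for next time
    return ip


print(resolve("holbertonschool.com"))  # first lookup walks the full chain
print(resolve("holbertonschool.com"))  # second lookup is served from cache
```

The second call never touches the root, TLD, or authoritative layers — which is exactly why cached lookups are so much faster.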

Let’s assume we were returned a valid IP address. The user now makes a request to this IP address using the HTTPS and TCP/IP protocols. Hold on, what are all these protocols? Let’s start with HTTPS. HTTPS is a client-server protocol sitting on the application layer of the OSI model. It is the process (set of rules) that runs whenever a user requests a web-page from a browser. HTTPS is HTTP encrypted with Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL), and the requests are sent via the Transmission Control Protocol (TCP) / Internet Protocol (IP). It’s important to encrypt all HTTP connections with TLS because otherwise, if hackers intercepted packets (information) you were sending over the network, all the information inside the packets would be readable in plain text. So, if you were sending your credit card information or social security number, it would be easily readable. HTTPS requests can ask for content from a website (videos, sound, images, etc.), or they can post content to servers (ex: text submitted in an HTML form on a website). Results sent from a web-server back to the user are known as the “HTTPS response.”

The TCP/IP protocol suite is how information is sent across the internet (sending packets to IP addresses). IP is a Network Layer protocol (operating on Layer 3 of the OSI model) charged with routing packets through different routers and across the internet based on IP address. TCP is a Transport Layer protocol (operating on Layer 4 of the OSI model) charged with dividing the requested resource (file, image, etc.) into chunks of an efficient size for routing, ensuring that all packets reach their intended destination (by re-transmitting packets if no acknowledgement is received from the packet receiver), and reassembling the packets into the original file once they reach their destination.
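TCP’s divide-and-reassemble job can be illustrated with a small sketch. This is a simplification under stated assumptions — real TCP segments carry full headers and the protocol also handles acknowledgements and congestion control — but the sequence-number mechanics are the same idea.

```python
# Toy sketch of TCP segmentation: split a resource into numbered chunks,
# deliver them possibly out of order, and reassemble by sequence number.
import random


def segment(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [(i, data[i:i + size]) for i in range(0, len(data), size)]


def reassemble(segments):
    """Sort segments by sequence number and rejoin them into the original."""
    return b"".join(chunk for _, chunk in sorted(segments))


page = b"<html><body>ice hockey article</body></html>"
packets = segment(page, 8)   # break the page into 8-byte chunks
random.shuffle(packets)      # packets may arrive in any order
assert reassemble(packets) == page
```

Because every chunk carries its own sequence number, the receiver can rebuild the file no matter what order the network delivers the packets in.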

To see this in action, let’s imagine that the user just clicked a link on a Wikipedia page. An HTTPS GET request is sent to the IP address (which is either cached somewhere or returned by the DNS resolver) and now the user is waiting for the HTTPS response…

This is the perfect place to talk about web-infrastructure. We’ll follow-up with our HTTPS response after we examine how Wikipedia handles our HTTPS request.

Before our HTTPS request can enter Wikipedia’s network, it will have to go through their firewall. A firewall is a network security system that controls incoming and outgoing network traffic based on a set of pre-determined security rules. Typically, a firewall represents a barrier between a trusted network (in this case Wikipedia) and an untrusted network (in this case the internet). A firewall is an example of a “gateway” device. There are two common types of firewalls: network-based and host-based. Host-based firewalls run on the host’s machine and control network traffic in and out of that machine. Network-based firewalls filter traffic between two or more networks and run on network hardware. In this case, let’s assume Wikipedia has a network-based firewall running on their load balancer. Let’s also assume the firewall is configured with a basic packet filtering defense. It will check the IP address of our sender and either accept or deny our request. If the firewall denies us, it can either deny us silently, by simply discarding the packet and refusing us entry to the network, or it can generate a message notifying us that we have been denied by the network’s firewall. Packet filtering based on the sender’s IP address is a very simple firewall configuration, and there are tons of different approaches to configuring firewalls.
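The packet-filtering rule just described can be sketched as a tiny function. The blocklist and IP addresses are hypothetical (drawn from the documentation-only TEST-NET ranges), and a real firewall would match on much more than the source address.

```python
# Minimal sketch of IP-based packet filtering, including the "silent drop"
# vs. "explicit rejection" behaviours described above.
BLOCKED_IPS = {"203.0.113.7"}  # hypothetical blocklist


def filter_packet(sender_ip: str, silent: bool = True):
    """Return the firewall's decision for an incoming packet."""
    if sender_ip in BLOCKED_IPS:
        # Silent mode discards the packet with no reply; otherwise the
        # firewall sends back an explicit denial message.
        return None if silent else "denied by firewall"
    return "accepted"


print(filter_packet("198.51.100.23"))              # accepted
print(filter_packet("203.0.113.7"))                # None: silently dropped
print(filter_packet("203.0.113.7", silent=False))  # explicit denial
```

Silent drops are common in practice because an explicit rejection confirms to an attacker that a firewall exists at that address.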

For the purposes of this article of course, we’ll be granted access to the network by Wikipedia’s firewall. Now, our HTTPS request will get directed to Wikipedia’s load balancer.

A load balancer is a piece of software that distributes network traffic across a cluster of web-servers. It’s usually the first place an HTTPS request will be directed once it enters a network. Load balancers allow for scalability by reducing the number of requests each web-server must handle. They also remove a single point of failure: if one web-server goes down, the load balancer can direct all requests to the other web-servers. There are two common ways to load balance network traffic to web-server clusters: Layer 4 and Layer 7. Layer 4, or “Transport Layer,” load balancing balances load based on the destination IP address and ports recorded in the packet header. Layer 7, or “Application Layer,” load balancing forwards user requests to different back-end servers based on the contents of the user’s request (what the user is looking for on the website). Once our request is directed to a particular cluster of back-end servers, the load balancer applies a load balancing algorithm to determine which web-server our HTTPS request should be forwarded to. Common algorithms include:

Round Robin- Forwards the request to the first web-server on the list. Then moves that web-server to the bottom of the list and forwards the next request to the new web-server at the top of the list.

Least Connection Method- Selects the web-server currently handling the fewest number of active connections.

Least Packets Method- Forwards requests to the web-server that’s received the fewest packets over a specified period of time.

Least Response Time Method- Selects the web-server with the fewest active connections and the lowest average response time.

The load balancer will forward our request to one of Wikipedia’s web-servers based on whichever of the aforementioned load balancing algorithms it’s using.
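The Round Robin algorithm from the list above can be sketched with a rotating queue. The server names are hypothetical, and a production load balancer (HAProxy, Nginx, etc.) would also track health checks and connection counts.

```python
# Sketch of Round Robin balancing: take the server at the front of the
# list, then rotate it to the back so the next request goes elsewhere.
from collections import deque


class RoundRobinBalancer:
    def __init__(self, servers):
        self.servers = deque(servers)

    def next_server(self):
        """Return the front server, then rotate it to the back of the queue."""
        server = self.servers[0]
        self.servers.rotate(-1)
        return server


lb = RoundRobinBalancer(["web-01", "web-02", "web-03"])
print([lb.next_server() for _ in range(4)])
# -> ['web-01', 'web-02', 'web-03', 'web-01']
```

Each server receives every third request, so load spreads evenly as long as all requests cost roughly the same to serve — which is exactly the scenario the other algorithms (least connections, least response time) are designed to improve on.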

simple Round Robin load balancing algorithm

One final thing load balancers can do is SSL termination. SSL termination means decrypting the HTTPS request (removing the SSL/TLS encryption) before it’s forwarded along to other servers inside your network. The logic here is that, since the HTTPS request made it past your firewall and is now inside your private network, you can decrease system load (the processing power your servers would otherwise spend decrypting HTTPS requests) by decrypting the requests on a proxy server (in this case your load balancer). However, depending on the nature of your operation this may not be advisable, as terminating SSL will allow anyone with access to your servers to see (and read) all the information moving around your network. Consider the implications before you do this.

Finally, our HTTPS request is forwarded to a web-server. At this point, the user’s computer has successfully established a session with Wikipedia’s network. This means we’ve passed Layer 5 of the OSI model (the Session Layer) and can now begin interacting with Wikipedia’s servers. The web-server’s primary role is to return static content to the client. This “static content” is often HTML, CSS, and JavaScript files. So when we click on a Wikipedia page and pictures, paragraphs, and links populate our screen, that is the web-server doing its job. Another key role of the web-server is to interact with the application server (app-server). The role of the app-server is to process dynamic content. This “dynamic content” is often described as the “business logic” of the application. For instance, if the user is on Nike’s website designing a custom pair of shoes, and they click the “get price estimate” button, the app-server will run whichever algorithm it uses to generate a price estimate (let’s say it’s written in Python) and serve the content back to the web-server. The web-server will then transmit the data (in packets of course) back to the user using the TCP/IP protocol. So the app-server is charged with processing dynamic content — what the app does, not how it looks. Since the app-server sits between the web-server and the database server, it’s also in charge of interacting with the database server. This means that if the user searches for something in a search form and hits enter (ex: “Nike running shoes”), the web-server will forward the request to the app-server, which will forward the request to the database server.
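The web-server / app-server split described above can be sketched as two functions: one serving static files directly, the other running made-up “business logic.” The paths, files, and pricing numbers are all hypothetical; real stacks use software like Nginx in front of Gunicorn or uWSGI.

```python
# Toy sketch of the static/dynamic split: the web-server answers what it
# can from disk and forwards everything else to the app-server.
STATIC_FILES = {"/index.html": "<html>...</html>", "/style.css": "body {}"}


def app_server(path):
    """Pretend business logic: e.g. the custom-shoe price estimate."""
    if path == "/price-estimate":
        base, customization_fee = 90, 25   # made-up pricing inputs
        return f"estimate: ${base + customization_fee}"
    return "404 not found"


def web_server(path):
    """Serve static content directly; forward dynamic requests onward."""
    if path in STATIC_FILES:
        return STATIC_FILES[path]
    return app_server(path)


print(web_server("/index.html"))      # static: served by the web-server
print(web_server("/price-estimate"))  # dynamic: computed by the app-server
```

The division of labor is the point: the web-server never needs to know how prices are calculated, and the app-server never needs to know where the CSS lives.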

The database server’s primary purpose is entering and retrieving information from a database. When configuring database servers, one must consider database topology. In this context, database topology refers to the permissions your database servers have to interact with the database. At least one database server in your system needs to be the primary database server. The primary server has both read and write permissions to the database. However, since you want more than just one database server — in order to avoid a single point of failure (SPOF) — you will want to have a second database server. Here’s where you need to start paying attention. If you were to add a second primary database server, you would risk having multiple servers try to update the same line at the same time. This is known as a database conflict. In order to prevent database conflicts, it would be wise to make your second database server a secondary (replica) database server. This means that your second server can only read from the database. So, if a user simply types something in the search bar (ex: “black running shoes”), your secondary database server will be able to query the database and return the data. However, if a user clicks the “save my password” button, the app-server must direct this request to the primary database server (the only one that can write to the database). This way, we have redundancy in the form of multiple database servers, and we’re avoiding database conflicts. If the application you’re building gets very large, you may need multiple primary database servers because you’re getting so many write requests at the same time (think Facebook). You can have a primary-primary database configuration; however, this set-up requires that you have some sort of conflict prevention / resolution software in the event that the same line is updated at the same time in your database.
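The read/write split above can be sketched as a routing rule in the app-server: writes go to the primary, reads rotate across the replicas. The server names are hypothetical, and real systems (e.g. MySQL replication, ProxySQL) handle failover and replication lag on top of this basic idea.

```python
# Sketch of primary/secondary query routing: only the primary takes
# writes; reads are spread over read-only replicas.
class DatabaseCluster:
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")

    def __init__(self):
        self.primary = "db-primary"                          # read + write
        self.secondaries = ["db-replica-1", "db-replica-2"]  # read-only
        self._next = 0

    def route(self, query: str) -> str:
        """Return the name of the server that should handle this query."""
        verb = query.strip().split()[0].upper()
        if verb in self.WRITE_VERBS:
            return self.primary            # writes must hit the primary
        replica = self.secondaries[self._next % len(self.secondaries)]
        self._next += 1                    # round-robin across replicas
        return replica


cluster = DatabaseCluster()
print(cluster.route("SELECT * FROM shoes WHERE color = 'black'"))  # replica
print(cluster.route("INSERT INTO users (password) VALUES ('...')"))  # primary
```

Because every write funnels through one server, two servers can never update the same row simultaneously — which is exactly the database conflict this topology is designed to prevent.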

Let’s return to our Wikipedia example. We’d left off where our HTTPS request had just reached our web-server. Let’s say that we’d clicked on a Wikipedia page about “ice hockey.” Since we’re just asking to be served a static web-page, the web-server should be able to fulfill our request. The web-server will know where to find the corresponding HTML, CSS, and JavaScript files (because it was configured to), and will return them (as packets) to us via the TCP/IP protocol. The sending of packets to a specific IP address is known as packet routing (it occurs on the Network Layer (Layer 3) of the OSI model). Packet routing is made possible because each packet contains the sender’s IP address, the intended receiver’s IP address, something that tells the network how many packets the requested resource has been broken into, and a unique number identifying this particular packet. In the TCP protocol, the packets in the HTTPS response travel over the internet via the fastest available route. This means all packets could travel via the same route, or no two packets could travel the same route. In the case of requesting a Wikipedia page, each packet will contain a portion of the information in the page (one paragraph, for instance). TCP has a fair amount of error checking during the packet transmission process. For instance, after sending a packet, if the sender doesn’t receive an acknowledgement from the receiver within a given period of time, it will assume the packet was lost and re-transmit it.
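That retransmission behaviour can be sketched with a simulated lossy channel. The channel function below is a stand-in for an unreliable network, and real TCP uses adaptive timeouts rather than a fixed attempt count.

```python
# Toy sketch of TCP retransmission: keep re-sending a packet until the
# receiver acknowledges it (or we give up).
def send_with_retransmit(packet, channel, max_attempts=5):
    """Send until acknowledged; return how many attempts it took."""
    for attempt in range(1, max_attempts + 1):
        acked = channel(packet)
        if acked:
            return attempt
    raise TimeoutError("no acknowledgement received")


# Simulated channel that drops the first two transmissions.
drops = {"remaining": 2}


def lossy_channel(packet):
    if drops["remaining"] > 0:
        drops["remaining"] -= 1
        return False   # packet lost: no ACK comes back
    return True        # packet delivered and acknowledged


print(send_with_retransmit(b"segment-1", lossy_channel))  # -> 3
```

The first two sends are "lost," no acknowledgement arrives, and the sender simply tries again — which is why TCP is described as a reliable protocol even over an unreliable network.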

anatomy of an IPv4 packet

Hopefully you feel more comfortable about what happens after you type a URL like “holbertonschool.com” and press enter. There are a lot of moving pieces and different scenarios (after all, this is system design), so be sure to clarify what your interviewer’s expectations are before you start answering this question.
