What Happens When You Type A URL In Your Browser And Press Enter

Salmen Zouari
22 min read · Apr 8, 2020

If you are in any technical profession, I am sure someone at some point has asked you this question. Whether you are an engineer, developer, marketer, or even in sales, it is always good to have a basic understanding of what is going on behind our browsers and how information is transferred to our computers via the internet.

Here’s the situation: You’ve opened up your laptop, double-clicked on your web browser, typed holbertonschool.com in the address bar and hit the enter key. To us it’s just the press of a button, so let’s see what happens behind it.

First, let’s discuss the web browser.

The Web Browser

A web browser is a program that most people use to view websites on the Internet. Most modern web browsers like Google Chrome and Mozilla Firefox have many features built into them that hide some of the underlying processes involved in connecting to a webpage in order to improve the user experience. We will discuss some of these features in later sections.

When you type holbertonschool.com in the address bar, you are essentially providing the browser with the URL for the website you want to visit. URL stands for Uniform Resource Locator; it is essentially the address of the file (resource) that the browser needs to display the web page.

A complete URL looks something like this:

http://www.holbertonschool.com/index.html
  • http:// tells the browser that we want to access a page using the Hyper Text Transfer Protocol (HTTP). This is a protocol that browsers use to interact with web pages. Other protocols have other purposes; for example, ftp:// (File Transfer Protocol) is a protocol used to transfer files across the Internet. https:// is another protocol we will discuss later, but in short, it's the secured version of http://.
  • www is a subdomain of holbertonschool.com; this part refers to a specific location (server) inside the domain where resources are located.
  • holbertonschool.com is the domain name; it represents the server where all the data for "holbertonschool.com" resides.
  • /index.html is the path to the file that will be displayed by the browser.
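To make the breakdown above concrete, here is a small sketch using Python's standard urllib.parse module. The helper name split_url is made up for this illustration, and the subdomain split is deliberately naive: it assumes a two-label domain such as holbertonschool.com.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return the URL pieces described above (helper name is illustrative)."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    return {
        "scheme": parts.scheme,              # protocol, e.g. "http"
        "subdomain": ".".join(labels[:-2]),  # naive: assumes a two-label domain
        "domain": ".".join(labels[-2:]),
        "path": parts.path or "/",
    }

print(split_url("http://www.holbertonschool.com/index.html"))
```

Real browsers use a list of public suffixes to decide where the domain ends, so treat this only as a way to see the parts side by side.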

One of the user-friendly features of web browsers is that they don’t require you to type the complete URL of every web page you want to see. These days it is assumed that when you type holbertonschool.com you mean http://www.holbertonschool.com, but under the hood, there is some communication between the browser and the website's server to determine which files are required for the home page. It might be /index.html, /index.php, or some other file that is produced dynamically.

The communication between computers on a network is built upon many different protocols. A protocol is a set of rules that both parties must follow in order to function. For example, both parties must speak the same language (syntax), include particular information about themselves, and generally act as expected, according to the protocol.

One of the most relevant protocols the Internet uses is TCP/IP.

TCP/IP

The TCP/IP reference model is a layered model developed by the United States Defense Advanced Research Projects Agency (DARPA, originally ARPA) as part of a research program that grew out of the ARPANET work of the late 1960s and 1970s. Initially, it was developed for defense use only, but it later became widely accepted. The main purpose of this model is to connect two remote machines for the exchange of information. These machines can be operating in different networks or have different architectures.

In the early days, the TCP/IP reference model had four layers, as described below.

These layers are quite similar to the layers of the OSI model. The Application layer in the TCP/IP model has approximately the same functionality as the upper three layers (Application, Presentation, and Session) of the OSI model. Also, the Internet layer acts as the Network layer, and the Network Access layer acts as the lower two layers (Physical and Data-Link) of the OSI model. The TCP/IP network model is named after its two main protocols (TCP and IP) and is widely used in the current Internet architecture. Nowadays, however, we generally use a five-layer TCP/IP model, as shown below.

In the above diagram, the Physical and Data-Link layers together act as the Network Access layer of the earlier TCP/IP model. This five-layer model is the one currently in use, so in this blog we’ll learn about the five-layer TCP/IP reference model. We’ll also see the key features of this model and the functionalities of its five layers.

The key features of the TCP/IP model are as follows:

  1. Supports flexible architecture: We can connect two devices with totally different architecture using the TCP/IP model.
  2. End-node verification: The end-nodes(source and destination) can be verified, and connection can be made for the safe and successful transmission of data.
  3. Dynamic Routing: The TCP/IP model facilitates the dynamic routing of the data packets through the shortest and safest path. Due to dynamic routing, the path taken by a data packet cannot be predicted, and thus it improves data security.

There are also some demerits of using the TCP/IP model. They are as follows:

  1. Replacing a protocol is not easy.
  2. The roles and functionalities of each layer are not documented and specified as precisely as they are in the OSI model.

Following are the five layers of the TCP/IP model:

  1. Physical Layer
  2. Data-Link Layer
  3. Internet Layer
  4. Transport Layer
  5. Application Layer

Now, we will learn about the functionalities of these layers one-by-one in detail.

1. Physical Layer

The Physical Layer is the lowest layer of the TCP/IP model. It deals with data in the form of bits. This layer mainly handles the host-to-host communication in the network. It defines the transmission medium and mode of communication between two devices. The medium can be wired or wireless, and the mode can be simplex, half-duplex, or full-duplex.

It also specifies the line configuration (point-to-point or multipoint), data rate (number of bits sent each second), and topology in the network. There are no specific protocols that are used in this layer. The functionality of the physical layer varies from network to network.

2. Data-Link Layer

The Data-Link Layer is the second layer of the TCP/IP model. It deals with data in the form of data frames. It mainly performs data framing, in which it adds some header information to the data packets so that they can be delivered to the correct destinations. For this, it performs physical addressing of the data packets by adding the source and the destination address to them.

The data-link layer facilitates the delivery of frames within the same network. It also facilitates the flow and error control of the data frames. The flow of the data can be controlled through the data rate. Also, errors in the data transmission can be detected using the checksum bits in the header information, and the faulty data frames can then be retransmitted.
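As a rough illustration of the error detection just described, the sketch below appends a CRC-32 check value to a payload and verifies it on arrival. The function names are invented for this example, and real link technologies differ in exactly how they compute the check value and where in the frame they place it.

```python
import zlib

def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 check value to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def frame_is_intact(frame: bytes) -> bool:
    """Recompute the CRC-32 and compare it with the received check value."""
    payload, received = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received

frame = make_frame(b"hello")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit "in transit"
print(frame_is_intact(frame), frame_is_intact(corrupted))
```

A receiver that sees a mismatch simply discards the frame; whether and how it gets resent depends on the protocol in use.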

3. Internet Layer

The Internet layer of the TCP/IP model is approximately the same as the Network layer of the OSI model. It deals with data in the form of datagrams or data packets. This layer mainly performs the logical addressing of the data packets by adding the IP(Internet Protocol) address to it. The IP addressing can be done either by using the Internet Protocol Version 4(IPv4) or Internet Protocol Version 6(IPv6).

The Internet layer also performs routing of data packets using the IP addresses. Routers operating at this layer send data packets from one network to another. This layer can also fragment large packets and reassemble them at the receiver’s end. In other words, it defines the various protocols for logical transmission of data within the same or different networks. The protocols used in the Internet layer include IP (Internet Protocol), ICMP (Internet Control Message Protocol), IGMP (Internet Group Management Protocol), ARP (Address Resolution Protocol), RARP (Reverse Address Resolution Protocol), etc.
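One way to picture the routing decision is to ask whether the destination sits on the sender's own network. The sketch below uses Python's standard ipaddress module; the function name and the 192.168.1.x addresses are assumptions chosen for illustration.

```python
import ipaddress

def needs_router(src: str, dst: str, prefix_len: int) -> bool:
    """True if dst lies outside src's local network, so a router must forward it."""
    local_net = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
    return ipaddress.ip_address(dst) not in local_net

print(needs_router("192.168.1.10", "192.168.1.20", 24))  # same network
print(needs_router("192.168.1.10", "54.88.73.204", 24))  # different network
```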

4. Transport Layer

The Transport layer is the fourth layer of the TCP/IP model. It deals with data in the form of data segments. It mainly performs segmentation of the data received from the upper layers. It is responsible for transporting data and setting up communication between the application layer and the lower layers. This layer facilitates the end-to-end communication and error-free delivery of the data. It also facilitates flow control by specifying data rates. The transport layer is used for process-to-process communication with the help of the port number of the source and the destination.

The Transport layer provides these services through the following two protocols:

  1. TCP: TCP stands for Transmission Control Protocol. It is a connection-oriented protocol. It performs sequencing and segmentation of data. It also performs flow and error control in data transmission. There is an acknowledgement feature in TCP for the received data. It is a slow but reliable protocol. It is suitable for important and non-real time data items.
  2. UDP: UDP stands for User Datagram Protocol. It is a connection-less protocol. It does not perform flow and error control in data transmission. There is no acknowledgement feature in UDP for the received data. It is a fast but unreliable protocol. It is suitable for real-time data items.
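The sequencing that TCP performs can be sketched without any networking at all. In this simplified model (the function names are made up, and real TCP also handles acknowledgements, retransmission, and much more), the sender labels each chunk with its byte offset and the receiver sorts on that label.

```python
import random

def segment(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs, as a sender would."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]

def reassemble(segments):
    """Put segments back in order by sequence number, as a receiver would."""
    return b"".join(chunk for _, chunk in sorted(segments))

message = b"GET /index.html HTTP/1.1"
segments = segment(message, 5)
random.shuffle(segments)  # segments may arrive out of order
print(reassemble(segments) == message)
```

UDP skips this bookkeeping entirely, which is a large part of why it is faster and why the application has to cope with missing or reordered data itself.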

5. Application Layer

The Application layer in the TCP/IP model is equivalent to the upper three layers (Application, Presentation, and Session) of the OSI model. It deals with the communication of the whole data message. The Application layer provides an interface between the network services and the application programs. It mainly provides services that let end-users work over the network, for example file transfer and web browsing. This layer uses all the higher-level protocols like HTTP, HTTPS, FTP, NFS, DHCP, FMTP, SNMP, SMTP, Telnet, etc.

The application layer helps in setting up and managing the network connections. It also checks for the user’s program authentication and authorization for the data. It also performs some complex operations like data translation, encryption and decryption, and data compression. The application layer synchronizes the data at the sender’s and the receiver’s end. In other words, it is the topmost layer and defines the interface for application programs with transport layer services.

That covers the TCP/IP model. Now we will move on to DNS.

DNS

The Domain Name System (DNS) is the phonebook of the Internet. Humans access information online through domain names, like holbertonschool.com. Web browsers interact through Internet Protocol (IP) addresses. DNS translates domain names to IP addresses so browsers can load Internet resources.

Each device connected to the Internet has a unique IP address which other machines use to find the device. DNS servers eliminate the need for humans to memorize IP addresses such as 54.88.73.204 (in IPv4), or more complex newer alphanumeric IP addresses such as 2400:cb00:2048:1::c692:d7a2 (in IPv6).

Domain name records

Even though decentralized, the information about domain names still needs to be recorded and stored. Domain name records are kept by authoritative DNS servers that are commonly hosted by the domain registrar. In addition to hosting the records, an authoritative DNS server is allowed to create, edit, and delete records for the domains delegated to it. Due to this, most registrars offer their users a way to manage the domain name records without deeper knowledge of the inner workings of DNS servers.

However, it’s important to understand the basics of DNS records. Each domain can have many different types of records that serve different purposes. In essence, DNS records create a set of instructions that allow Internet users to find their destination web site.

Below is a short-list of the most commonly used DNS record types:

  • A record holds the IPv4 address of a domain and is the most important of these records. One domain or sub-domain can have a single IP while one IP can have multiple domains pointing to it.
  • AAAA record is essentially the same as A record but for IPv6 addresses.
  • PTR record finds a domain name in a reverse-lookup when the IP is already known. IP addresses usually have one PTR record each, but multiple PTR records can point to the same domain.
  • CNAME record, or canonical name, forwards a domain or sub-domain to another domain without providing an IP address. These can be used as aliases to domains.
  • MX record is the mail exchange record that directs mail to an email server. It indicates how email should be routed to its destination.
  • TXT record lets a domain administrator store text notes in the record. These are commonly used to gauge the trustworthiness and verify ownership of a domain.
  • NS record indicates the authoritative name servers. A domain often has multiple name servers, primary and secondary, to prevent outages in case of failures.
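To see how a few of these record types fit together, here is a toy zone held in a Python dictionary. Every record below is hypothetical (example.com and the 192.0.2.x range are reserved for documentation), and lookup_a is a name invented for this sketch.

```python
# Hypothetical records for illustration only.
ZONE = {
    ("example.com", "A"): ["192.0.2.10"],
    ("www.example.com", "CNAME"): ["example.com"],
    ("example.com", "MX"): ["mail.example.com"],
    ("example.com", "NS"): ["ns1.example.com", "ns2.example.com"],
}

def lookup_a(name, zone=ZONE, max_hops=5):
    """Resolve a name to IPv4 addresses, following CNAME aliases."""
    for _ in range(max_hops):
        if (name, "A") in zone:
            return zone[(name, "A")]
        alias = zone.get((name, "CNAME"))
        if alias is None:
            return []            # no A record and no alias to follow
        name = alias[0]          # follow the alias and try again
    return []

print(lookup_a("www.example.com"))
```

The hop limit is there so that a badly configured chain of aliases cannot loop forever.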

You can check any existing records using domain name lookup tools such as nslookup or dig.

Domain Name Resolution Steps

Let’s go back to the point where you entered holbertonschool.com in your browser and pressed enter. In the context of DNS, the first thing that will happen is the browser will check its cache to see if it already knows the IP address of holbertonschool.com. A cache is a memory bank that is easy for the browser to access. In other words, if you've visited holbertonschool.com recently, the browser will have the destination IP address in its cache and the resolution process will end.

If you’re using Google Chrome you can view your DNS cache by entering

chrome://net-internals/#dns

You will probably see some familiar URLs and their corresponding IP addresses listed next to them.

Let’s imagine holbertonschool.com is not in your browser's DNS cache.

Next, the browser will check if any information about the IP address of holbertonschool.com exists in the operating system. There is a file called /etc/hosts on Linux and macOS machines (C:\Windows\System32\Drivers\etc\hosts on Windows) that stores local DNS information. These files were used more frequently before there were distributed DNS servers like there are today, but they are still checked before the query heads out to the network.

Let’s assume that the IP address for holbertonschool.com was not found in the hosts file.

Next, the browser will make a query to a remote DNS server. A DNS server is a server dedicated to resolving domain names to IP addresses. They contain a database of domain names and their corresponding IP addresses, along with addresses of other DNS servers.

Most Internet Service Providers (ISPs) have servers dedicated to resolving domain names, so it will start with that one since it’s probably the closest. It will ask that server, “do you know the IP address of holbertonschool.com?"

If it does, great. The resolution process ends. If not, that server (often called a recursive resolver) continues the search on our behalf, starting with a Root DNS Server. Root DNS servers know information about Top Level Domain (TLD) servers. .com is an example of a TLD. Others are .org, .edu, .uk, .cn, and many more.

The resolver asks the Root DNS Server, "Do you know the IP address of holbertonschool.com?" It replies, "No, but I know about the .com TLD Server. Ask it instead."

Since holbertonschool.com ends in .com, the resolver sends its next query to the .com TLD Server.

The .com TLD Server holds information about the Name Server of the "second-level" domain, in our case, holbertonschool, and points the resolver to it.

The query then goes to holbertonschool.com's Name Server. Whoever registered the domain holbertonschool.com probably knows their domain's Name Server. If the site is hosted by GoDaddy for example, the Name Server might be ns1.godaddy.com or ns2.godaddy.com.

Finally, these are the servers that will (hopefully) know the IP address of holbertonschool.com. If not, an error will be sent back to the browser and you will see something like this:

Note: ERR_NAME_NOT_RESOLVED.
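The whole lookup order can be simulated in a few lines. This is only a sketch with invented data: the dictionaries stand in for the browser cache, the hosts file, and the root, TLD, and authoritative servers, and the ISP resolver with its own cache is left out to keep things short.

```python
# All data below is invented for illustration.
BROWSER_CACHE = {}
HOSTS_FILE = {"localhost": "127.0.0.1"}
ROOT = {"com": "com-tld"}                              # root knows TLD servers
TLD = {"com-tld": {"holbertonschool.com": "ns1"}}      # TLD knows name servers
AUTHORITATIVE = {"ns1": {"holbertonschool.com": "54.88.73.204"}}

def resolve(domain):
    """Return (ip, source), following the lookup order described above."""
    if domain in BROWSER_CACHE:
        return BROWSER_CACHE[domain], "browser cache"
    if domain in HOSTS_FILE:
        return HOSTS_FILE[domain], "hosts file"
    tld_server = ROOT.get(domain.rsplit(".", 1)[-1])
    name_server = TLD.get(tld_server, {}).get(domain)
    ip = AUTHORITATIVE.get(name_server, {}).get(domain)
    if ip is None:
        raise LookupError("ERR_NAME_NOT_RESOLVED")
    BROWSER_CACHE[domain] = ip   # remember the answer for next time
    return ip, "authoritative name server"

print(resolve("holbertonschool.com"))
print(resolve("holbertonschool.com"))  # second call is answered from the cache
```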

DNS Caching

The process of resolving a domain name involves a lot of steps. If the resolving process had to be done completely every time you wanted to view a webpage, your experience on the Internet would be slower. As mentioned before, browsers have a caching mechanism so they don’t have to start an entire recursive resolution process again and again. The full process is needed only once in a while, for unfamiliar websites.

Not only do browsers have a cache, but DNS servers have a cache, too. Servers keep track of domains requested and then store them in a memory bank for some time in order to make the recursive resolution process shorter.

When someone registers a domain name, they have the option to configure the DNS records for that domain. One of the configuration settings is called Time to Live (TTL).

A domain’s TTL is the amount of time, in seconds, that its DNS record can exist in a cache.

Let’s say holbertonschool.com, with a TTL of 10 days, decided to launch a new version of their site tomorrow. It would be wise for them to update their DNS records and set their TTL to 0 so that users who have holbertonschool.com in their cache don't see the "old" site on launch day. Later, they can reset their TTL back to 10 days, if they want. Having a longer TTL improves the user experience because returning visitors don't have to go through the whole resolution stage each time they visit the site.
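A cache that respects TTL can be sketched as follows. The class name TTLCache is an invention for this example, and a fake clock is injected so the behavior is easy to follow; a real resolver cache is considerably more involved.

```python
import time

class TTLCache:
    """Tiny DNS-style cache; entries expire ttl seconds after being stored."""
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, name, ip, ttl):
        self._entries[name] = (ip, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: force a fresh lookup
            return None
        return ip

now = [0.0]
cache = TTLCache(clock=lambda: now[0])   # fake clock keeps the example deterministic
cache.put("holbertonschool.com", "54.88.73.204", ttl=60)
print(cache.get("holbertonschool.com"))  # still fresh
now[0] = 61.0
print(cache.get("holbertonschool.com"))  # expired
```

With this model, a TTL of 0 means the entry is already expired the moment it is stored, which matches the launch-day trick described above.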

Quick Recap

So far we have talked about the browser, URLs, TCP/IP protocols, domain names, and DNS. At this point we have resolved holbertonschool.com to the IP address 54.88.73.204 and we are still waiting for the web page to show up.

Let’s see what happens when we press enter:

Note the changes that appear in the browser…

  1. The full URL went from holbertonschool.com to https://www.holbertonschool.com
  2. There is a green padlock icon next to the URL.
  3. The title “Holberton School of Software Engineering in San Francisco” appears in the title of the browser’s window
  4. The Holberton School web page is now being displayed.

Let’s discuss changes 1 and 2; the green padlock and https.

HTTPS, SSL, and TLS

I know, I know… More acronyms.

Remember HTTP, the Hyper Text Transfer Protocol? Well, HTTPS is the secured Hyper Text Transfer Protocol. This protocol works similarly to HTTP (client sends a request for a resource, server responds to request), except that the communication is encrypted so it cannot be read by anyone observing network traffic.

When thinking about security, it’s important to remember that requests and responses usually pass through many other routers on their journey from client to server. The underlying IP protocols guide the path from the client to the server, selecting a route that appears most efficient. It’s very likely that somewhere along the route there is an observer you don’t want seeing your messages.

HTTPS is a requirement for any decent website that asks users to enter their personal information, such as credit card number, address, phone number, etc. Without HTTPS, that information can be seen by network traffic observers.

How does it work?

The HTTPS protocol involves another protocol called TLS (Transport Layer Security). TLS is a "new and improved" version of SSL (Secure Sockets Layer). Mozilla Developer Network notes that SSL is now considered obsolete, but the term SSL is still commonly used when people talk about HTTPS. The main point is that both SSL and TLS are protocols used for security purposes.

The two main things the Transport Layer Security protocol provides are authentication and encryption.

Authentication: ensuring that the client truly is who it claims to be and the server truly is who it claims to be. Authentication is needed because some deceptive attackers dupe clients by posing as the server they are trying to reach.

Encryption: using cryptographic algorithms to jumble up the messages during the transmission between client and server so they cannot be understood by anyone except the two parties.

SSL and TLS use an asymmetric encryption algorithm, which sets two different keys for each party; one for encryption and one for decryption. You encrypt a message with one key, and then decrypt it with another. These keys, known as public and private keys, are mathematically linked so that messages encrypted with public key A can only be decrypted by private key A.
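The link between a public and a private key is easier to believe once you see it with small numbers. The sketch below is textbook RSA with tiny primes chosen purely for illustration; it is not secure, and it is not how TLS libraries are actually used.

```python
# Toy key pair built from tiny primes; real keys use enormously larger numbers.
p, q = 61, 53
n = p * q                      # shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

def encrypt(message: int) -> int:
    """Anyone who knows the public key (e, n) can do this."""
    return pow(message, e, n)

def decrypt(ciphertext: int) -> int:
    """Only the holder of the private exponent d can undo it."""
    return pow(ciphertext, d, n)

secret = 65
ciphertext = encrypt(secret)
print(ciphertext != secret, decrypt(ciphertext) == secret)
```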

The TLS Handshake

The TLS handshake is a protocol between a client (web browser) and a server (holbertonschool.com) to establish trust and then negotiate what key should be used to encrypt and decrypt traffic between them. Here is what happens during the TLS handshake:

  1. Client sends Client Hello message. This packet includes the version of TLS, length of the packet, the ciphers (types of encryption) the client can handle, and a random string of bytes that is used to create a master key.
  2. Server replies with Server Hello message. This packet includes the cipher the server selected to use, a random string of bytes used to create a master key, and information about the server's certificate of trust.
  3. Server sends the Certificate to the client. This includes information about the issuer (for example, VeriSign), company name, terms of use, dates of validity, and the server's public key.
  4. Server sends Certificate Status to the client. Shows whether or not the certificate has been verified successfully. Server also sends Hello Done, marking the end of the introduction between the two.
  5. Client generates a master (or “secret”) key using the random string of bytes, encoding it with the server’s public key.
  6. Client sends its public key, along with the master key, to the server. This is part of the asymmetric encryption key exchange.
  7. Server receives this key and generates a symmetric key to be used during the session. Symmetric keys are a lot faster and more efficient than asymmetric keys. It’s less secure, but by this point, client and server have been linked with asymmetric keys so the secure connection has already been established. The symmetric key is also generated on the client-side.
  8. Client sends change cipher spec to server to announce a change from asymmetric to symmetric encryption, along with Client Finished.
  9. Server sends change cipher spec and Server Finished message (that is now encrypted with the symmetric key)
  10. Server sends client a new session ticket and the transmissions will now be encrypted across the network.
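In practice you rarely drive these steps by hand; a TLS library performs the handshake for you. Here is a hedged sketch using Python's standard ssl and socket modules. The helper names make_context and tls_summary are made up for this example, and the second function needs Internet access, so its call is left commented out.

```python
import socket
import ssl

def make_context() -> ssl.SSLContext:
    """Client-side TLS settings: verify the certificate and the hostname."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context

def tls_summary(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, let the library run the handshake, and report what was negotiated."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with make_context().wrap_socket(sock, server_hostname=host) as tls:
            return {
                "version": tls.version(),       # negotiated protocol version
                "cipher": tls.cipher()[0],      # negotiated cipher suite name
                "issuer": tls.getpeercert().get("issuer"),
            }

# Example call (requires Internet access):
# print(tls_summary("www.holbertonschool.com"))
```

If the certificate cannot be verified or does not match the hostname, wrap_socket raises an error instead of returning a connection, which is the authentication part of TLS doing its job.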

Here is a summarized version of how it works in the context of visiting holbertonschool.com:

  • Your browser has a public key.
  • You make a request to holbertonschool.com.
  • When the request hits holberton’s server, it says, “This is a secure website. Before I send you any resources I need to establish authentication and encryption.”
  • TLS handshake occurs
  • Secure session has been established. Green padlock shows up in the address bar, and https appears at the beginning of the URL.

HTTP

The Hyper Text Transfer Protocol is an application-layer protocol used for transmitting graphics, audio, video, text, and links; collectively known as “hypermedia”. Web pages can be built to display hypermedia using a language called Hypertext Markup Language (HTML).

This protocol is built upon the request/response paradigm. When you enter holbertonschool.com in your browser and press enter, you are making an HTTP request to a server. In turn, the server will send an HTTP response back to you.

HTTP requests and responses have two parts: the header and the body. Headers have information about the request or response formatted in key-value pairs. The body contains a segment of, or the entire, resource requested.

Here is what an HTTP request header might look like:

And an HTTP response header sent back from the server:

The body segment of an HTTP response often contains the resource that the browser displays. If it’s a web page, it will certainly contain some HTML.
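As a rough illustration of the header and body split, the sketch below builds a minimal HTTP/1.1 request as text and parses a response. The sample response is hand-written for this example rather than captured from a real server, and the function names are assumptions.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Assemble a minimal HTTP/1.1 GET request: request line, headers, blank line."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: text/html",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_response(raw: bytes):
    """Split a raw response into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line separates header and body
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return status_code, headers, body

# Hand-written sample response, not captured from a real server.
sample = (b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n"
          b"Content-Length: 14\r\n\r\n<h1>Hello</h1>")
print(build_request("www.holbertonschool.com").decode())
print(parse_response(sample))
```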

HTML

Here is what some HTML code looks like:

HTML uses tags and plain text to create the structure of the webpage and define how it will look. Along with HTML, web pages may also be using Cascading Style Sheets (CSS) and JavaScript to add more style and functionality to the webpage.

The file in the image above is called index.html. index.html is a special name: it is usually the default HTML page that is displayed when you go to a website if no file is specified. In other words, when you visit holbertonschool.com, there is a good chance that the URL of the page that displays is http://www.holbertonschool.com/index.html.

For the sake of the article, holbertonschool.com is a bad example, because their homepage is not index.html. They must have configured their servers to make their home page display a resource named something else.

To get a glimpse at the HTML, CSS and JavaScript used to display holbertonschool.com you can right click in your browser and select "View Page Source".

The image above is just a small part of the entire source code, which is nearly 1500 lines.

The browser reads the HTML, CSS and JavaScript, sends HTTP requests for additional resources needed such as images, videos, or scripts and then renders the web page:
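To get a feel for how a browser discovers those additional resources, here is a small sketch built on Python's standard html.parser module. The page content and the ResourceFinder class are made up for illustration; a real browser tracks many more tags and attributes.

```python
from html.parser import HTMLParser

class ResourceFinder(HTMLParser):
    """Collect URLs of extra resources a browser would need to fetch."""
    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.resources.append(attrs["href"])

# Made-up page for illustration.
page = """<html><head><link rel="stylesheet" href="/style.css">
<script src="/app.js"></script></head>
<body><h1>Hello</h1><img src="/logo.png"></body></html>"""

finder = ResourceFinder()
finder.feed(page)
print(finder.resources)
```

Each URL found this way would trigger another HTTP request before the page can be fully rendered.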

But what about personalized, dynamic content? For example, when I sign into my amazon.com account it displays my name and it even remembers what I put in my shopping cart.

When you sign into Netflix, it remembers your favorite shows and exactly what minute and second of an episode of Stranger Things you were watching.

This is what we call dynamic content. It’s content that is customized for the client.

It’s possible to serve dynamic content with JavaScript, but robust applications, such as Amazon and Netflix, use other components to process requests and serve content dynamically: application servers and database servers.

Application Servers

Application servers are servers dedicated to generating static content for the web server. In other words, they take in some parameters such as name="Lee", day="Monday", item="laptop", process the parameters through a program, then return a static document to give to the web server, which in turn sends it to the client.

In this example, the application code might look something like:

function say_hello(name, day, item) { return "Hello " + name + ", it is " + day + ", would you like to buy a " + item + "?"; }

Depending on the client, this function would generate some static content like “Hello Lee, it is Monday, would you like to buy a laptop?”

This is kind of a silly example, but the idea is that the web server (which serves static content) is connected with the application server (which generates static content dynamically).
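For readers who want to see the same idea end to end, here is a sketch of a tiny application in Python using the WSGI convention, which is one common way for a web server to hand requests to application code. The parameter defaults are invented for this example, and the app is called directly with a fake request instead of being started behind a real server.

```python
from urllib.parse import parse_qs

def say_hello(name, day, item):
    return f"Hello {name}, it is {day}, would you like to buy a {item}?"

def application(environ, start_response):
    """Minimal WSGI app: read parameters from the query string, return generated text."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    body = say_hello(
        params.get("name", ["guest"])[0],
        params.get("day", ["today"])[0],
        params.get("item", ["laptop"])[0],
    ).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Call the app directly with a fake request instead of starting a server.
captured = []
result = application({"QUERY_STRING": "name=Lee&day=Monday&item=laptop"},
                     lambda status, headers: captured.append(status))
print(captured[0], result[0].decode())
```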

An application with millions of users like Netflix needs a place to store all the information about their customers in order to deliver a personalized experience. It needs to know their name, account status, billing address, their viewing history, and a lot more. Where do they keep all this information?

In a database.

Database Servers

Database servers are servers dedicated to storing and organizing data. When a user signs in to Netflix, a query will be sent to one of Netflix’s database servers asking whether “john@doe.com” is an actual user. If so, it will check the password they entered against the password in the database and then act accordingly by giving them access or denying them access to their account.
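The sign-in check can be sketched with Python's built-in sqlite3 module and an in-memory database. The table layout and function names are assumptions made for this example, and one design choice goes beyond what the text says: the sketch stores a salted hash rather than the password itself, which is the usual practice.

```python
import hashlib
import hmac
import os
import sqlite3

def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

db = sqlite3.connect(":memory:")   # in-memory database, just for the example
db.execute("CREATE TABLE users (email TEXT PRIMARY KEY, salt BLOB, password_hash BLOB)")

def add_user(email: str, password: str) -> None:
    salt = os.urandom(16)
    db.execute("INSERT INTO users VALUES (?, ?, ?)",
               (email, salt, hash_password(password, salt)))

def check_login(email: str, password: str) -> bool:
    """Look the user up, then compare password hashes in constant time."""
    row = db.execute("SELECT salt, password_hash FROM users WHERE email = ?",
                     (email,)).fetchone()
    if row is None:
        return False               # no such user
    salt, stored = row
    return hmac.compare_digest(stored, hash_password(password, salt))

add_user("john@doe.com", "correct horse")
print(check_login("john@doe.com", "correct horse"), check_login("john@doe.com", "wrong"))
```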

Imagine all the fields in Netflix’s database. Millions of users, each user with tens or hundreds of fields. That must take up a lot of space. Surely, they cannot fit that all on one computer. And what if it goes down? Will everyone lose access to Netflix?

This brings up some more concepts about web infrastructure: redundancy, availability, and security.

Redundancy

Let’s first address the questions, “What if Netflix’s database server goes down… Will everyone lose access to the site?”

Good web architecture has no single point of failure. Meaning, there is no single thing that will cause everything to break. Netflix has redundant database servers that act as backups in case the current one goes down. They also have redundant web servers, and application servers, and redundancy on basically every level of their architecture.

Availability

You might imagine that the traffic of visitors going to Netflix varies throughout the day. People don’t tend to watch Netflix at work as much as they do when they get home or on the weekend. An application, like Netflix, that performs the same no matter where you are or what time you are using it, is said to be highly available. High availability is something every web application strives for. They do this in many ways, one of which is using a server known as a load balancer.

A load balancer is well-named. It balances the load of traffic across the servers. If an application has a load balancer, it is implied they have multiple web servers behind it. Having multiple web servers can be a sign of redundancy (good), or they could be web servers that are assigned to serve different content. Either way, the load balancer’s job is to take all incoming requests and send them to a server.

Here is a simple diagram of a haproxy load balancer in front of two web servers:

So when we make a request to holbertonschool.com, the first server the request hits is most likely a load balancer. From there it will decide which server to forward the request to.

Load balancers determine the server to send the request to using algorithms such as:

  • Round Robin: if you have 3 servers (A, B and C), the first request goes to A, second to B, third to C, fourth to A, etc.
  • Least Connections: choose the server that currently has the fewest connections.
  • Random: literally random.
  • Least latency: choose the server that is fastest.

There are also weighted versions of these algorithms where the server’s administrator can configure each server with a specific weight so they are more or less likely to be chosen to handle the request.
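Two of the algorithms above are simple enough to sketch in a few lines. The class names are invented for this example, and a production load balancer such as haproxy does far more (health checks, weights, timeouts).

```python
import itertools

class RoundRobin:
    """Hand requests to servers in a repeating cycle."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Pick the server currently handling the fewest connections."""
    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1   # call when a connection finishes

rr = RoundRobin(["A", "B", "C"])
print([rr.pick() for _ in range(4)])

lc = LeastConnections(["A", "B"])
first, second = lc.pick(), lc.pick()
lc.release(first)
print(lc.pick())
```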

Using a load balancer helps maintain the health of the application by not putting the whole load of traffic on only one server.

Well-designed applications will also have redundant load balancers ready in case the main one goes down. Otherwise that would be a single point of failure.

Security

We’ve talked about HTTPS/SSL/TLS, which is great for encrypting traffic and providing authentication between client and server, but good web architecture goes a step further and installs firewalls.

A firewall can be a physical device or a piece of software installed on a server that monitors inbound and outbound traffic. Server administrators can configure a firewall to only accept incoming traffic from certain IP addresses, or certain ports, for example.

By the way, a port is like a door on the interface of a computer on a network. We already know that each computer on a network has an IP address. They also have the ability to utilize ports. As an analogy, you can think of the IP address as a phone number and the port as an extension. Some ports are designated for certain protocols:

  • Port 80: HTTP
  • Port 443: HTTPS
  • Port 53: DNS

So let’s say you don’t want anyone requesting DNS information from your web server. You can configure a firewall on that server to deny any inbound traffic arriving on port 53.

Firewalls add extra security by defining what kind of traffic you want coming in and out of your server. Unwanted intruders might make a request to your server and slide into the file system through port 123. If your server’s purpose is to serve web content, it would be a good idea to allow traffic only on ports 80 and 443. That way any incoming requests for other ports will be denied.
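The allow-list idea can be expressed as a tiny rule check. This is only a model of the decision a firewall makes, with invented names; real firewalls are configured through their own tools and look at much more than the destination port.

```python
ALLOWED_INBOUND_PORTS = {80, 443}   # HTTP and HTTPS only

def allow_inbound(dest_port: int, allowed=ALLOWED_INBOUND_PORTS) -> bool:
    """Default-deny: accept a packet only if its destination port is allow-listed."""
    return dest_port in allowed

for port in (80, 443, 53, 123):
    print(port, "ACCEPT" if allow_inbound(port) else "DROP")
```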

Good web architecture will configure firewalls in front of every server to make it clear what type of traffic they want coming in and going out.

That is everything that happens when you press enter. Hope you learned something new today. That’s it for this blog.

Do share this blog with your friends to spread the knowledge.

Keep Learning :)
