Understanding exactly what happens when you visit a website can go some way to understanding the Internet as a whole. To some - and it may not be flippant to say most - the Internet is a term describing something that we use every day, often without even knowing or thinking about it. It is a shapeless, geographically obscure entity owned by someone somewhere which holds an awful lot of data.
The only thing wrong with that description is that the Internet is owned by a lot of people and supported, in physical infrastructure and in the numerous pieces of software which power it, by a lot of other people. Those two sets of people overlap of course. As soon as some entity provides hosting and has some sort of infrastructure that supports the transmission of data, it falls into both groups. I digress…
When you type an address into your browser, let’s for argument’s sake say www.bbc.co.uk, and click go from your computer in South Africa, the following happens (as a simplistic view of things):
- Your browser checks in its DNS cache for the location of the website
- Failing that, it makes a direct connection to the DNS server in your network config (probably your ISP’s DNS server)
- The location is usually there. Sometimes your DNS settings are overridden to look somewhere else for DNS records, Google’s servers for instance.
- If that fails, this server looks upstream to another DNS server. That server may be in some other country, across some very long cable lying under the sea (awesome site showing these submarine cables: http://submarinecablemap.com/). During a massive outage of the WACS and Seacom undersea cables (http://mybroadband.co.za/news/broadband/152675-massive-south-african-int...), as happened on 21st January, this might mean that the answer simply doesn’t come back, and your browser tells you the site is unavailable.
- Again, let’s assume the location came back to your browser properly. Most likely it came from a few hundred kilometers away, but maybe from the other side of the world.
- With the knowledge of the website’s location, your browser now asks the website host directly for the particular page it wants. That question, in this case, travels all the way up Africa, and the answer travels all the way back down.
- The webpage and all its components come down that undersea cable and over a myriad networks, and into your modem, which passes them to your computer and to your browser.
- Often components come from different places, and sometimes those places are slower at delivering information, or fail completely to do so, which results in odd-looking, incomplete pages. Browsers have traditionally opened only a handful of connections to any one server at a time, so when everything comes from one place many pieces have to queue up, one after the next. Some brilliant web developers put page assets onto different servers to speed things up. Often those servers belong to a content delivery network (CDN), which keeps copies of the assets in many locations so they can be served from somewhere close to you.
- All this information is passed to your browser as raw code (HTML, CSS and JavaScript), and your browser renders the page for you to view.
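For the curious, the first and last parts of that journey can be sketched in a few lines of Python. This is only a rough illustration, not how a real browser does it: `dns_cache`, `lookup` and `fetch_page` are names made up for this example, and the cache is just a dictionary standing in for the browser's DNS cache. The calls to `socket.gethostbyname` and `urllib.request.urlopen` are from the standard library.

```python
import socket
import urllib.request

# Hypothetical in-memory cache standing in for the browser's DNS cache.
dns_cache = {}


def lookup(hostname, cache=dns_cache, resolver=socket.gethostbyname):
    """Return an IP address for hostname, checking the cache first."""
    if hostname in cache:
        # Answered locally, no network traffic needed.
        return cache[hostname]
    # Ask the DNS server from the network config, which may in turn
    # ask other servers upstream. Raises socket.gaierror if no answer
    # comes back.
    address = resolver(hostname)
    cache[hostname] = address
    return address


def fetch_page(url, timeout=10):
    """Request a page and return the raw bytes the server sends back."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

Calling `lookup("www.bbc.co.uk")` twice only goes out to the network the first time, and `fetch_page("http://www.bbc.co.uk/")` returns the raw code before any rendering has happened. Both obviously need a working connection, so if the undersea cable is down they will fail just like your browser does.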
That’s the simple story. I haven’t mentioned things like dynamic page building, where server-side programs generate the code and merge data into it before it is sent as the page. And bear in mind that this is a simplistic view of even this basic process.
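To give a feel for what dynamic page building means, here is a minimal sketch, assuming a made-up page layout and a made-up list of headlines. The names `PAGE` and `build_page` are invented for this example, and real sites use far more elaborate templating, but the idea of pouring data into code before sending it is the same.

```python
from html import escape
from string import Template

# Hypothetical page skeleton; a real site would have far more markup.
PAGE = Template("<html><body><h1>$title</h1><ul>$items</ul></body></html>")


def build_page(title, headlines):
    """Merge data into the template and return the finished HTML."""
    # Escape each value so characters like & and < display as text
    # instead of being read as markup.
    items = "".join(f"<li>{escape(h)}</li>" for h in headlines)
    return PAGE.substitute(title=escape(title), items=items)
```

The server would run something like `build_page("News", ["Cable repaired", "Rain expected"])` for each request and send the resulting string down the line as the page.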
A lot can go wrong, and a lot of things can slow the process down. So next time you open a page and get irritated that it takes 5 seconds to load (I do), think about just how amazing it is that it can do it in that time!