Packet, routers, and reliability | Internet 101 | Computer Science | Khan Academy
Hi, my name is Lynn Root. I am a software engineer here at Spotify, and I'll be the first to admit that I often take for granted the reliability of the internet. The sheer amount of information zooming around the internet is astonishing. How is it possible for every piece of data to be delivered to you reliably?
Say you want to play a song from Spotify. It seems like your computer connects directly to Spotify servers, and Spotify sends you a song on a direct, dedicated line. But actually, that's not how the internet works. If the internet were made of direct, dedicated connections, it would be impossible to keep things working as millions of users join, especially since there is no guarantee that every wire and computer is working all the time.
Instead, data travels on the internet in a much less direct fashion. Many, many years ago, in the early 1970s, my partner Bob Kahn and I began working on the design of what we now call the internet. Bob and I had the responsibility and the opportunity to design the internet's protocols and its architecture, so we persisted in participating in the internet's growth and evolution for all of this time, up to and including the present.
The way information gets transferred from one computer to another is pretty interesting. It need not follow a fixed path. In fact, your path may change in the midst of a computer-to-computer conversation. Information on the internet goes from one computer to another in what we call a packet of information. A packet travels from one place to another on the internet a lot like how you might get from one place to another in a car.
Depending on traffic congestion or road conditions, you might choose or be forced to take a different route to get to the same place each time you travel. Just as you can transport all sorts of stuff inside a car, many kinds of digital information can be sent with IP packets. But there are some limits. What if, for example, you need to move a space shuttle from where it was built to where it will be launched?
The shuttle won't fit in one truck, so it needs to be broken down into pieces and transported using a fleet of trucks. They could all take different routes and might get to the destination at different times. But once all the pieces are there, you can reassemble the pieces into the complete shuttle, and it'll be ready for launch.
On the internet, the details work similarly. If you have a very large image that you want to send to a friend or upload to a website, that image might be made up of tens of millions of bits or ones and zeros—too many to send along in one packet. Since it’s data on a computer, the computer sending the image can quickly break it into hundreds or even thousands of smaller parts called packets.
Unlike cars or trucks, these packets don't have drivers, and they don't choose their route. Each packet has the internet address of where it came from and where it's going. Special computers on the internet, called routers, act like traffic managers to keep the packets moving through the networks smoothly. If one route is congested, individual packets may travel different routes through the internet, and they may arrive at the destination at slightly different times or even out of order.
So let's talk about how this works. As part of the internet protocol, every router keeps track of multiple paths for sending packets, and it chooses the cheapest available path for each piece of data based on the destination IP address for the packet. "Cheapest" in this case doesn’t mean cost, but time and non-technical factors, such as politics and relationships between companies.
Often, the best route for data to travel isn't necessarily the most direct. Having options for paths makes the network fault-tolerant, which means the network can keep sending packets even if something goes horribly wrong. This is the basis for a key principle of the internet: reliability.
Now, what if you want to request some data and not everything is delivered? Say you want to listen to a song. How can you be 100% sure all the data will be delivered so the song plays perfectly?
Introducing your new best friend: TCP, Transmission Control Protocol. TCP manages the sending and receiving of all your data as packets. Think of it like a guaranteed mail service. When you request a song on your device, Spotify sends a song broken up into many packets.
When your packets arrive, TCP does a full inventory and sends back acknowledgments of each packet received. If all packets are there, TCP signs for your delivery, and you're done. If TCP finds some packets are missing, it won't sign; otherwise, your song wouldn’t sound as good, or portions of the song could be missing.
For each missing or incomplete packet, Spotify will resend them. Once TCP verifies the delivery of many packets for that one song request, your song will start to play. What's great about the TCP and router systems is they're scalable. They can work with 8 devices or 8 billion devices.
In fact, because of these principles of fault tolerance and redundancy, the more routers we add, the more reliable the internet becomes. What's also great is we can grow and scale the internet without interrupting service for anybody using it.
The internet is made of hundreds of thousands of networks and billions of computers and devices connected physically. These different systems that make up the internet connect to each other, communicate with each other, and work together because of agreed-upon standards for how data is sent around on the internet.
Computing devices or routers along the internet help all the packets make their way to the destination where they're reassembled, if necessary, in order. This happens billions of times a day, whether you and others are sending an email, visiting a web page, doing a video chat, using a mobile app, or when sensors or devices on the internet talk to each other.