FTP stands for “File Transfer Protocol.” In general terms, it is a way to transfer files over a network. Like HTTP, IMAP, or POP, it is an application protocol built for one particular job. Many file transfer protocols are in use on the internet today, and FTP is one of them.
FTP consists of two parts: a server and a client. An FTP server offers access to a directory and its sub-directories. Users connect to the server with an FTP client, a piece of software that lets you download files from the server as well as upload files to it.
To make this easier to picture, think of an FTP server as a video rental store. Its shelves are filled with hundreds of videotapes and CDs (imagine these as the files on a server), and the store waits for people to take them home or bring them back. In the real case a file is copied rather than rented, so the original stays on the shelf. The customers are the FTP clients: they rent (download) and return (upload). That is how FTP works. To do this, the client needs FTP software, and the FTP server has to publish an address that clients can reach. In addition, clients usually need a username and a password to log in.
FTP runs on top of TCP, and you probably know TCP’s three-way handshake. When a client sends a request to the server, the first thing that happens is the three-way handshake, which establishes the connection to the corresponding port. In the case of FTP, the client connects to port 21.
TCP belongs to the transport layer. The function of the “transport layer” is to establish “port-to-port” communication, while the function of the “network layer” is to provide “host-to-host” communication. Once the host and the port are both determined, two programs can communicate with each other.
There are two modes of communication between the FTP server and the FTP client: active and passive. To avoid confusion, I will explain them step by step.
The first thing to keep in mind is that “active” and “passive” are always relative to the server side, and they describe the data connection, not the command connection. That is to say, in active mode the server is the active party that opens the data connection, and in passive mode the server is the passive party that waits for it. So don’t mix this up with the client side.
Secondly, the ports can be a little confusing. To keep things straight, first remember the two well-known ports: 20 and 21. Both of them belong to the server, just like the modes described in the previous paragraph. Port 21 is for sending commands, and port 20 is for sending data.
Now let’s take a deeper look at the two modes.
Active mode is also called PORT mode, and it is the connection method originally defined by the FTP protocol. The client tells the server which address and port it is listening on, and the server then actively initiates the data connection to the client, which is why it is called active mode.
Passive mode is also called PASV mode. The server tells the client which port it is listening on and then passively waits for the client to open the data connection, which is why it is called passive mode.
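To make the two modes a little more concrete, here is a minimal Python sketch of the address format that the PORT command and the PASV reply use in the FTP specification: four numbers for the IP address and two numbers for the port, where the port is the first number times 256 plus the second. The function names and the sample address are my own, chosen only for illustration.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) from a reply such as
    '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)'."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("not a PASV reply: " + reply)
    numbers = [int(n) for n in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte, then low byte
    return host, port


def build_port_command(host, port):
    """Build the PORT command a client sends in active mode."""
    high, low = divmod(port, 256)
    return "PORT " + ",".join(host.split(".") + [str(high), str(low)])


# Passive mode: the server announces where the client should connect.
print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,10,195,80)"))
# Active mode: the client announces where the server should connect.
print(build_port_command("192.168.1.10", 50000))
```

In both cases the announcement travels over the control connection. The only difference is which side then opens the data connection.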
If you are still confused about the term “port”, here is a brief explanation.
Many programs on the same host need to use the network. For example, you can chat online with friends while browsing the web. When a packet arrives from the internet, how does the host know whether it carries the content of a web page or the content of an online chat?
In other words, we need one more parameter to indicate which program (process) a packet is for. This parameter is called the “port”: a number that identifies each program using the network on that host. Every packet is sent to a specific port on the host, so different programs each get the data they need.
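The idea can be tried out on a single machine. The sketch below is only an illustration under my own assumptions (the labels and function names are made up): it starts two tiny services on the same host, lets the operating system pick a free port for each, and shows that the port alone decides which program answers.

```python
import socket
import threading


def start_service(label):
    """Start a one-shot TCP service on an OS-chosen port that replies with `label`."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    server.listen()
    port = server.getsockname()[1]

    def serve_once():
        conn, _addr = server.accept()
        with conn:
            conn.sendall(label.encode("utf-8"))

    threading.Thread(target=serve_once, daemon=True).start()
    return server, port


def ask(port):
    """Connect to the given port on this host and read the whole reply."""
    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        while True:
            data = client.recv(1024)
            if not data:  # the service closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")


def demo():
    web_server, web_port = start_service("web page")
    chat_server, chat_port = start_service("chat message")
    try:
        return {
            "ports_differ": web_port != chat_port,
            "web": ask(web_port),
            "chat": ask(chat_port),
        }
    finally:
        web_server.close()
        chat_server.close()


if __name__ == "__main__":
    print(demo())
```

The host address is the same in both cases (127.0.0.1), so it is the port number that routes each connection to the right program.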
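Now back to FTP, which uses two ports, 20 and 21. Again, on the server side, port 21 is for commands such as user authentication and opening and closing the connection. Port 20 is for data transmission, but it is not fixed: it depends on whether the FTP mode is active (PORT) or passive (PASV). In active mode the server sends data from port 20, while in passive mode the server picks another port and tells the client to connect to it. Because the two kinds of traffic use separate ports, data connections and control connections do not get mixed up.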
A brief history of FTP
FTP was developed in the early 1970s by Abhay Bhushan while he was a student at MIT. It was initially created to transfer files between hosts on the ARPANET (a precursor to the modern internet), where it ran over the Network Control Program (NCP) before TCP/IP existed.
Someone might be curious about why FTP uses two ports. It seems the protocol could easily have been specified on a single port. Given all the problems that firewalls and NATs have with FTP, it seems that a single port would have been much better.
I find this a very interesting question. FTP was specified before NAT (Network Address Translation), firewalls, and full-duplex Ethernet were the norm.
To put it simply, legacy Ethernet was half-duplex, meaning information could move in only one direction at a time, so collisions could occur on such a network. This is sometimes given as part of the background for FTP’s use of two ports. The following are the main reasons for the two-port design:
Continue sending and receiving control instructions on the control connection while you are transferring data.
Have more than one data connection active at the same time.
The server decides when it’s ready to send you the data.
Let’s suppose Jack is the manager of Jacky and Lucy.
Jack wants two files from Jacky, so Jack connects to Jacky’s port 21 and asks for the files. When Jacky is ready, Jacky opens a connection to Jack from port 20 (it could be another port) and sends the files to Jack over it. Meanwhile, Lucy needs an urgent approval file from Jack, so Lucy connects to Jack on port 21 and asks for the file. Jack opens the data connection from port 20 only when Jack is ready, because right now Jack is busy with Jacky.
The two ports serve different purposes, and, for the sake of simplicity, the designers chose to use two different ports instead of implementing a negotiation protocol.
A brief history of HTTP
In 1989, Tim Berners-Lee, who was working at CERN, wrote a proposal for building a hypertext system over the network. It was built on top of the existing TCP and IP protocols and consisted of four parts: HTML, HTTP, WorldWideWeb (the browser), and a server.
These four parts were completed by the end of 1990. On August 6, 1991, Tim Berners-Lee’s post to a public hypertext newsgroup marked the beginning of the World Wide Web as a publicly available project.
HTTP was very simple in its early stages. That first version was later called HTTP/0.9 and is sometimes referred to as the one-line protocol.
So FTP came earlier than HTTP. The first FTP client application was a command-line program developed before operating systems had graphical user interfaces, and most network tools used a command-line interface at that time. Using FTP from the command line is straightforward.
Open a connection to the server. For example:
Enter your username and password.
You are verified and connected.
Then you can upload and download files using the put and get commands. Other operations work in much the same way as they do on a Linux system.
FTP was a handy tool at that time.
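The same session can also be scripted. Below is a minimal sketch using Python’s standard ftplib module. The host, the credentials, and the file names are placeholders that you would replace with your own, and listing the directory is my guess at one of the Linux-like operations mentioned above.

```python
from ftplib import FTP


def ftp_session(host, user, password, download_name, upload_path):
    """Log in, list the current directory, download one file, upload another.

    All arguments are placeholders supplied by the caller; nothing here
    refers to a real server.
    """
    with FTP(host) as ftp:  # opens the control connection (port 21 by default)
        ftp.login(user=user, passwd=password)
        names = ftp.nlst()  # file names in the current directory, similar to ls
        with open(download_name, "wb") as target:
            # roughly what "get" does in the command-line client
            ftp.retrbinary("RETR " + download_name, target.write)
        with open(upload_path, "rb") as source:
            # roughly what "put" does in the command-line client
            ftp.storbinary("STOR " + upload_path, source)
        return names


# Example call (hypothetical server and account):
# ftp_session("ftp.example.com", "alice", "secret", "report.txt", "notes.txt")
```

ftplib uses passive mode by default, so the client opens the data connection for each transfer.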
HTTP is non-specific and serves files to a large number of users, whereas FTP was developed to “transfer” files between specific hosts. HTTP is more client-server oriented, or broadcast-oriented if you think of the server as a broadcaster, while FTP is more multi-point-to-point.
FTP uses two TCP connections: one for commands (port 21) and one for data (port 20). HTTP uses a single port, which is 80 by default.
FTP requires a username and password for access (except for anonymous login), so that some files are available only to particular people. HTTP content, by contrast, is typically accessible to anyone, although HTTP can also require authentication.
In one respect HTTP is actually more specific: HTTP headers state the content type, while FTP does not. HTTP is more document-oriented, with multiple files making up one HTML document, while FTP is more file-oriented, with less regard for the purpose of the files.
FTP has a stateful control connection which maintains a current working directory, and each transfer requires a secondary connection through which the data are transferred. HTTP is stateless and multiplexes control and data over a single connection from client to server on well-known port numbers, which is simple for firewalls to manage.
Related to all of this, FTP was designed around the command-line interface, while HTTP is mostly used through a GUI such as a browser.
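For contrast with FTP’s two connections, here is a small self-contained sketch of the HTTP side using only Python’s standard library. It starts a throwaway server on the local machine (the handler name, the path, and the message are made up for the demo) and fetches one file. The request, the Content-Type header, and the file body all travel over the same single connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET with a small plain-text body."""

    def do_GET(self):
        body = b"hello from HTTP\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once():
    """Start a local server, fetch one file, and return (status, type, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/hello.txt")
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once())
```

Nothing in the exchange depends on an earlier request, which is what makes HTTP stateless, and the Content-Type header tells the client what kind of file it received.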
There is a heated debate about whether FTP is still relevant, and it is tough for me to draw a conclusion. However, from my point of view as a beginner in networking, I hear about HTTP more than FTP. The reason is that the functions FTP provides, including file upload and download, authentication, and resuming interrupted transfers, can be handled by other protocols such as HTTP and SFTP.
FTP also has downsides: it is weak on security, because usernames, passwords, and file contents are sent unencrypted. So it is not recommended to send sensitive data over FTP, and that is why we have SFTP. In other words, FTP is to SFTP roughly as HTTP is to HTTPS.
For the vast majority of ordinary users, there are very few occasions to download via FTP. FTP can still be found on intranets, such as school networks and library network systems, and it remains a reasonable choice for some closed commercial download services. In practice, though, FTP is like telnet: it is no longer recommended, and SFTP should be the first choice when you need FTP’s functionality.
SFTP is short for “SSH File Transfer Protocol”, and it is also commonly expanded as “Secure File Transfer Protocol”.
As the name suggests, SFTP is a more secure way to transfer files online: it encrypts the data before sending it. From the user’s point of view, SFTP is mostly a client, because on the server side it runs as part of SSH.
SSH, also known as Secure Shell or Secure Socket Shell, is a network protocol that gives administrators a secure way to access a remote computer. SSH establishes a cryptographically secured connection between two parties (client and server), authenticates each side to the other, and passes commands and output back and forth.
SSH consists of client software and server software. The server listens on port 22 by default, and the client connects to it there. The client programs include scp, slogin, and sftp.
SFTP has almost the same commands and functionality as FTP, but it encrypts both the authentication information and the transmitted data, so using SFTP is much more secure than using FTP.
Because SFTP encrypts and decrypts everything it transmits, its transfer efficiency is generally lower than that of plain FTP.
Here is an example of an SFTP client: I use Cyberduck to upload files to my remote server on DigitalOcean.
I can also use ssh to log in to my server, in the same way as I mentioned before.
Today, SFTP is still widely used in the field of encrypted file transmission.
Although we have fewer and fewer opportunities to use FTP, understanding FTP is still very helpful for understanding the network.