Foreword
Introduction
1. Overview of the recommendation
  1.1. The client-server model
  1.2. The communication channels
  1.3. Active and passive modes
  1.4. The commands
  1.5. The answers
2. The API
  2.1. Installation of ROOF
  2.2. Initialization
  2.3. The context
  2.4. Read/Write on the network
  2.5. Command sending
  2.6. Reception of the responses
  2.7. Connection to the server
  2.8. Diagram number 1
  2.9. Diagram number 2
  2.10. Diagram number 3
3. Example based on the API
Conclusion
Resources
About the author
A French version of this article has been published in GNU Linux Magazine France number 103.
A lot of software needs to download or upload files remotely. One of the oldest but still widely used protocols for this is the File Transfer Protocol (FTP). After an overview of the recommendation, this paper focuses on the design of an FTP API written in C to ease the integration of an FTP client module into software.
The FTP protocol specified in the RFC959
recommendation has been designed to:
On the client side, establishing an FTP connection consists of opening
a control channel; a data channel may then be opened. The control
channel carries the commands, the responses to the commands and the
spontaneous messages. The data channel is established only when a
command triggers a data transfer.
The communication on the control channel is bidirectional and complies
with the TELNET protocol, although in practice only a very basic subset
of TELNET is used. The dialog alternates: the client sends a command
and the server answers with one or more messages. Each command has a
predefined list of responses. The server may also send spontaneous
messages to provide various information such as "The system is going
down in 15 minutes" or "The connection delay has expired".
The communication on the data channel is unidirectional. The direction
depends on the type of the command being executed.
A channel is actually a TCP connection as depicted in figure 2.
By default, the client requests the opening of the control channel on the TCP port number 21 on the server machine. This supposes that the server is listening on this port. The default values of the ports for both channels are listed in the file /etc/services:
    $ cat /etc/services | grep ftp
    ftp-data        20/tcp
    ftp             21/tcp
    tftp            69/udp
    sftp            115/tcp
    ftps-data       989/tcp     # FTP over SSL (data)
    [...]
    $
The establishment of the data channel depends on the operating mode:
active or passive.
In active mode, the server establishes the data channel to a TCP port
on the client which is, by default, the port used by the client for the
control connection. The client can specify another port through the
PORT command. In practice, active mode is seldom used because few
clients support it. Moreover, if the machine on which the client runs
is protected by a firewall, it may be impossible for the server to
establish a connection to the client.
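As an illustration of active mode, RFC959 writes the argument of the PORT command as six comma-separated decimal bytes: the four bytes of the IPv4 address followed by the high and low bytes of the port. The helper below is a minimal sketch of that encoding. The name format_port_arg() is hypothetical and is not part of the library described in this article.

```c
#include <stdio.h>

// Hypothetical helper (not part of ROOF): build the "h1,h2,h3,h4,p1,p2"
// argument of the PORT command from a dotted IPv4 address and a TCP port.
// Returns 0 on success, -1 if the input is invalid or the buffer too small.
int format_port_arg(const char *ipv4, unsigned int port,
                    char *out, size_t out_len)
{
  unsigned int h1, h2, h3, h4;
  char extra;
  int n;

  // The trailing %c detects garbage after the fourth byte
  if (port > 65535 ||
      sscanf(ipv4, "%u.%u.%u.%u%c", &h1, &h2, &h3, &h4, &extra) != 4 ||
      h1 > 255 || h2 > 255 || h3 > 255 || h4 > 255) {
    return -1;
  }

  // The port is split into its most and least significant bytes
  n = snprintf(out, out_len, "%u,%u,%u,%u,%u,%u",
               h1, h2, h3, h4, port >> 8, port & 0xff);

  return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}
```

With the address 192.168.1.10 and the port 32855, the resulting argument is "192,168,1,10,128,87".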
In passive mode, the client establishes the data channel: it uses the
PASV command to get from the server a TCP port to which the connection
will be made. This mode simplifies the design of the client.
The commands are lines of ASCII data with the following format:
    command parameter1 parameter2... CRLF
The command is a word of at most 4 characters, in upper or lower case.
Parameters are optional.
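The format above can be sketched with a small helper which assembles a command line and terminates it with CR and LF. The name build_command() is hypothetical and only serves the illustration. The library itself uses a printf()-like routine presented later in this article.

```c
#include <stdio.h>

// Hypothetical helper: write "VERB[ parameter]\r\n" into 'out'.
// Returns the length of the line, or -1 if it does not fit in the buffer.
int build_command(char *out, size_t out_len,
                  const char *verb, const char *param)
{
  int n;

  if (param && param[0]) {
    // Command with a parameter
    n = snprintf(out, out_len, "%s %s\r\n", verb, param);
  } else {
    // Command without parameter
    n = snprintf(out, out_len, "%s\r\n", verb);
  }

  return (n > 0 && (size_t)n < out_len) ? n : -1;
}
```

For example, build_command(buf, sizeof(buf), "CWD", "/tmp") produces the 10-character line "CWD /tmp" followed by CR and LF.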
The frequently used commands are described in table 1 (the parameters
inside brackets are optional and the commands marked with a star are
part of the minimum set of commands that a server should provide to be
considered compliant with the recommendation).
Command name | Parameters | Description |
---|---|---|
USER (*) | user_name | First command sent to the server, to identify the user. |
PASS | password | Send a password to the server if the name specified by the USER command needs one. Note that the password is sent without any encryption, which is frequently considered a security weakness of the FTP protocol. |
CWD | directory | Change working directory (i.e. current directory). |
CDUP | | Shortcut for the CWD command to go to the parent directory. On Unix, this is the same as "cd .." but on other systems this may be something else, hence this command which hides the system specificities. |
QUIT (*) | | Disconnect the current user and close the control channel. |
REIN | | Disconnect the current user but keep the control channel open to accept a subsequent USER command in order to connect another user. |
PORT (*) | TCP_port | The client tells the server the TCP port number on which it waits for a data connection: the server is in active mode. |
PASV | | The opposite of the PORT command. The client asks the server for a TCP port to which it will establish a data connection: the server is in passive mode. |
TYPE (*) | type | Specify the type of the information on the data channel. Among the multiple choices, the most used, and often the only ones supported by the servers, are ASCII (type = A) and BINARY (type = I). |
STRU (*) | structure | This was used in the past by servers which organized their data into pages and records for efficiency reasons. This command is also used for data recovery upon errors. The default structure is FILE (structure = F). |
MODE (*) | mode | Specify the data transfer mode. This command is also used for data recovery upon errors. The default mode is STREAM (mode = S). |
RETR (*) | file | Request the transfer of a file from the server to the client. |
STOR (*) | file | Request the transfer of a file from the client to the server. |
RNFR | file | Request the renaming of a file on the server. This command specifies the source file name and is followed by the RNTO command to specify the destination file name. |
RNTO | file | Cf. RNFR |
ABOR | | Stop the running command. If a data channel is open, it is closed by the server. |
DELE | file | Request the deletion of a file on the server. |
RMD | directory | Request the deletion of a directory on the server. |
MKD | directory | Request the creation of a directory on the server. |
PWD | | Ask the server for the name of the current directory. |
LIST | [directory] or [file] | Ask the server for information about a given file or about all the files of a given directory (name, access rights, creation date, size...). By default, the information concerns the files of the current directory. On Unix, it is normally the result of the "ls -l" command but on other systems, this can be something else. |
NLST | [directory] or [file] | Same as LIST but provides only the names of the files. Compared to LIST, this command has the advantage of being portable as it returns the same result whatever operating system the server is running on. |
SYST | | Ask the server to identify its operating system. |
NOOP (*) | | Trigger no action except an OK answer from the server. This can be used to maintain a minimum traffic with a server which may implement an inactivity timeout on the connection. |
An answer is coded as follows:
Value | Description |
---|---|
1yz | Positive Preliminary reply. |
2yz | Positive Completion reply. |
3yz | Positive Intermediate reply. |
4yz | Transient Negative Completion reply. |
5yz | Permanent Negative Completion reply. |
The text which follows the digits in the responses is optional and,
most of the time, it is for information purposes only, except for some
commands like PASV which need a formatted answer.
Most of the time the answers are single line. Here is, for example, an
answer from the server when it expects a password:
    331 Password required for foo.
In this case, the first digit of the code (3) specifies that the server
accepted the previous command and is waiting for additional
information: the password.
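The classification by first digit can be sketched as follows. The type reply_class_t and the function classify_reply() are hypothetical names chosen for this illustration. They only encode the table above and are not taken from the library.

```c
#include <ctype.h>
#include <stddef.h>

// Classes of replies, numbered after the first digit of the code
typedef enum {
  REPLY_INVALID               = 0,
  REPLY_POSITIVE_PRELIMINARY  = 1,
  REPLY_POSITIVE_COMPLETION   = 2,
  REPLY_POSITIVE_INTERMEDIATE = 3,
  REPLY_TRANSIENT_NEGATIVE    = 4,
  REPLY_PERMANENT_NEGATIVE    = 5
} reply_class_t;

// Hypothetical helper: return the class of a reply line, or REPLY_INVALID
// if the line does not begin with a 3-digit code between 100 and 599.
reply_class_t classify_reply(const char *line)
{
  int i;

  if (NULL == line) {
    return REPLY_INVALID;
  }

  // The loop stops on the terminating NUL of a too short line
  for (i = 0; i < 3; i ++) {
    if (!isdigit((unsigned char)line[i])) {
      return REPLY_INVALID;
    }
  }

  if (line[0] < '1' || line[0] > '5') {
    return REPLY_INVALID;
  }

  return (reply_class_t)(line[0] - '0');
}
```

Applied to "331 Password required for foo.", the function returns REPLY_POSITIVE_INTERMEDIATE.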
The multiline answers are mostly used for connection banners (content
of the file /etc/ftpwelcome).
For example, here is a banner sent by a Linux server after the user
identification commands. The code followed by a minus sign is displayed
on each line but, according to the recommendation, this is not
mandatory. Only the first and last lines are supposed to begin with the
3-digit code:
    230- Linux toto-host 2.6.22-14-generic #1 SMP Tue Dec 18 08:02:57 UTC 2007 i686
    230-
    230- The programs included with the Ubuntu system are free software;
    230- the exact distribution terms for each program are described in the
    230- individual files in /usr/share/doc/*/copyright.
    230-
    230- Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
    230- applicable law.
    230 User foo logged in.
In the preceding example, the first digit of the code (2) specifies that the command has been accepted and that a new command can be sent.
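The rule for multiline answers can be sketched with two small predicates: a reply is multiline when its first line carries a minus sign right after the code, and it ends on the line which begins with the same code followed by a space. The names opens_multiline() and closes_multiline() are hypothetical. The library performs an equivalent test inside its reply reader.

```c
#include <ctype.h>
#include <string.h>

// Hypothetical helper: 1 if 'line' opens a multiline reply ("ddd-..."),
// 0 otherwise. The && operators stop on the NUL of a too short line.
int opens_multiline(const char *line)
{
  return isdigit((unsigned char)line[0]) &&
         isdigit((unsigned char)line[1]) &&
         isdigit((unsigned char)line[2]) &&
         ('-' == line[3]);
}

// Hypothetical helper: 1 if 'line' is the last line of a multiline reply
// opened with the 3-digit 'code' (same code followed by a space).
// 'code' is assumed to hold at least 3 characters.
int closes_multiline(const char *line, const char *code)
{
  return (0 == strncmp(line, code, 3)) && (' ' == line[3]);
}
```

In the banner above, every line but the last one fails the closing test, and "230 User foo logged in." passes it.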
After this overview of the FTP recommendation, it is now possible to describe the API which makes it possible to design FTP clients. In the layered approach of the FTP model, the API is located under the client as shown in figure 3.
The name of this API is ROOF (Remote Operation On Files). It is a shared library available as open source on SourceForge.
Download the .tgz file from sourceforge and uncompress it with the following command:
    $ tar xvfz roofxxx.tgz
The resulting source tree is depicted in figure 4.
The include directory contains the file roof.h in which the public services and data structures are defined. This file must be included by any user of the library.
The lib directory contains the implementation of the libroof.so library which will be dynamically linked with the user program. The files are roof.c for the API and roof_p.h for the internal definitions.
The client directory contains an example of FTP client based on the API: roof.
The fs directory contains the implementation of a file system based on FUSE and ROOF as an alternative to file systems like NFS.
The man directory contains the online manuals of the library and the client. The manual is spread over sections 1 (roof command), 3 (API) and 7 (general description).
The build process uses a script called roof_install.sh based on cmake,
which must be installed on the system. Here we show how to build and
install directly with the cmake command in the default directory
/usr/local (the installation requires superuser rights):
    $ cd roofxxx
    $ cmake .
    -- Check for working C compiler: /usr/bin/gcc
    [...]
    -- Build files have been written to: [...]
    $ make
    [...]
    Linking C executable roof
    [...]
    $ sudo make install
    [...]
    Linking C executable CMakeFiles/CMakeRelink.dir/roof
    Install the project...
    [...]
To check that the installation succeeded, read the online manual of roof (you may need to add /usr/local/man to the MANPATH environment variable):
    $ man 3 roof
    ROOF(3)             Linux Programmer’s Manual             ROOF(3)
    NAME
           roof - API for Remote Operations On Files
    [...]
It is then possible to test the roof executable (if the dynamic linking with libroof.so fails, you may need to add /usr/local/lib to the LD_LIBRARY_PATH environment variable or to call "ldconfig -n /usr/local/lib"):
    $ roof
    roof 1.5
    [...]
    Type 'help' or '?' for the list of available commands.
    ftp> quit
    $
In the following chapters of this article, we describe the main parts of the library to understand in detail the path from the RFC959 recommendation to the implementation. The code snippets are summarized to keep this article as short as possible. The reader is advised to download the sources from SourceForge to get the complete implementation.
To be easy to use and robust, an API should be reentrant so that multiple instances can run concurrently. This precaution is very important when the software which uses it is multithreaded (several threads running in parallel may use the library concurrently). A mutex is created and initialized in the entry point of the library. To be identified by the dynamic linker at load time, the entry point is declared as follows (cf. this page for more details about dynamic libraries entry and exit points):
    void __attribute__ ((constructor)) roof_initialize(void);

    void roof_initialize(void)
    {
      int rc;

      // Creation of the mutex (unlocked)
      rc = pthread_mutex_init(&roof_mtx, NULL);
      [...]
    } // roof_initialize
Two macros are defined to lock and unlock the mutex:
    #define ROOF_LOCK()   (pthread_mutex_lock(&roof_mtx))
    #define ROOF_UNLOCK() (pthread_mutex_unlock(&roof_mtx))
The context is a data structure which identifies an instance. There is one context of type roof_ctx_t per user:
    typedef struct
    {
      void *ctx; // User context
    } roof_ctx_t;
The field ctx points to the user's private data. The library does not do anything with this field: it only makes it possible to associate arbitrary data with the user context. In the internals of the library, the user context is stored in the ctx field of the roof_context_t structure defined in roof_p.h as follows:
    typedef struct
    {
      unsigned int  debug_level;    // Debug level
      char         *iobuf;          // I/O buffer
      unsigned int  l_iobuf;        // I/O buffer size
      void         *ctx;            // User data
      int           busy;           // 1, if context is busy
      int           internal_iobuf; // 1, if I/O buffer allocated internally
      int           ctrl;           // Socket of the control connection
      unsigned int  timeout_ms;     // Timeout with server (ms)
      char          type;           // Type for TYPE command
      char          code;           // Code for TYPE command
    } roof_context_t;
If this structure were defined in the file roof.h, the users of the
library would be tempted to use its fields in their code. This would
make their programs incompatible with future versions of the library
(if the fields are renamed or disappear) and it might cause the library
to fail (if the fields are modified by the user without calling the
services of the API).
In this article, roof_ctx_t will be called "external context" (the one
seen by the user) and roof_context_t will be called "internal context"
(the one seen by the library). To get the latter from the former, the
library uses the ROOF_CTX() macro which computes the address of the
internal context from the address of the external context and the
offset of the ctx field in the roof_context_t structure:
    #define ROOF_CTX(p) ((roof_context_t *) ((char *)(p) - offsetof(roof_context_t, ctx)))
Conversely, the external context is retrieved from the internal context through the ROOF_EXT_CTX() macro which merely returns the address of the ctx field in roof_context_t:
    #define ROOF_EXT_CTX(p) ((roof_ctx_t *)&((p)->ctx))
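The round trip between the two contexts can be checked with a reduced, self-contained sketch. The types ext_ctx_t and int_ctx_t and the macros INT_CTX() and EXT_CTX() are simplified stand-ins for the real definitions, introduced here only for the illustration.

```c
#include <stddef.h>

// Simplified stand-in for the external context
typedef struct {
  void *ctx;                 // User data
} ext_ctx_t;

// Simplified stand-in for the internal context: the user data field
// is deliberately not the first field
typedef struct {
  unsigned int debug_level;
  int          busy;
  void        *ctx;          // Seen by the user as an ext_ctx_t
} int_ctx_t;

// Internal context from external context: step back by the field offset
#define INT_CTX(p) ((int_ctx_t *)((char *)(p) - offsetof(int_ctx_t, ctx)))

// External context from internal context: address of the ctx field
#define EXT_CTX(p) ((ext_ctx_t *)&((p)->ctx))

// Function wrappers around the macros, convenient for testing
ext_ctx_t *to_external(int_ctx_t *in)
{
  return EXT_CTX(in);
}

int_ctx_t *to_internal(ext_ctx_t *ext)
{
  return INT_CTX(ext);
}
```

Converting an internal context to its external view and back yields the original address, whatever the position of the ctx field in the structure.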
The library defines a maximum number of contexts with the ROOF_NB_MAX_CTX constant
which is the dimension of the table of contexts (roof_context[]) in roof.c.
The first thing that the user must do, is to allocate a context by
calling roof_new():
    roof_ctx_t *roof_new(
                 unsigned int  timeout_ms, // Timeout (ms) to interact with the server
                                           // 0 = Infinite wait
                 char         *iobuf,      // I/O buffer (default if NULL)
                 unsigned int  l_iobuf,    // Length of the I/O buffer (default if 0)
                 void         *ctx         // User context
                        )
    {
      unsigned int i;

      ROOF_LOCK();

      // Look for a free context
      for (i = 0; i < ROOF_NB_MAX_CTX; i ++)
      {
        if (0 == roof_context[i].busy)
        {
          roof_context[i].busy = 1;
          break;
        }
      } // End for

      ROOF_UNLOCK();

      if (i >= ROOF_NB_MAX_CTX)
      {
        errno = ENOSPC;
        return NULL;
      }

      roof_context[i].debug_level = 0;

      // If the buffer has been allocated by the user
      if (iobuf && l_iobuf)
      {
        [...]
        roof_context[i].iobuf          = iobuf;
        roof_context[i].l_iobuf        = l_iobuf;
        roof_context[i].internal_iobuf = 0;
      }
      else // Buffer allocated internally
      {
        roof_context[i].iobuf = (char *)malloc(ROOF_IO_BUF_SIZE);
        [...]
        roof_context[i].l_iobuf        = ROOF_IO_BUF_SIZE;
        roof_context[i].internal_iobuf = 1;
      }

      roof_context[i].ctrl       = -1;
      roof_context[i].timeout_ms = timeout_ms;
      roof_context[i].ctx        = ctx;

      return (roof_ctx_t *)&(roof_context[i].ctx);
    } // roof_new
A free context (busy field = 0) is looked up in the table of contexts. The search is done in a critical section protected by the mutex (ROOF_LOCK()/ROOF_UNLOCK()) so that the function is reentrant. The context found is initialized with the parameters passed by the user. The first parameter, timeout_ms, is the maximum time in milliseconds to wait for an answer from the server. It is advisable to set it to a few seconds so that the program does not hang when the server does not answer. The next parameters, iobuf and l_iobuf, are respectively the address and the length of the I/O buffer used to interact with the server. If the caller passes NULL for the address or 0 for the length, the buffer is allocated internally with a default length of ROOF_IO_BUF_SIZE bytes (defined in roof_p.h). The service returns NULL on error or, on success, the address of the external context (that is, the address of the ctx field of the roof_context_t structure in the roof_context[] table). From the user's point of view, this context is an identifier passed to all the subsequent calls to the library in order to identify the instance. When the user no longer needs the context, a call to roof_delete() frees the resources:
    void roof_delete(roof_ctx_t *pContext) // External context
    {
      roof_context_t *pCtx = ROOF_CTX(pContext);

      // Deallocation of the I/O buffer if allocated internally
      if (pCtx->internal_iobuf)
      {
        free(pCtx->iobuf);
      }

      // Close the control socket if opened
      if (pCtx->ctrl >= 0)
      {
        shutdown(pCtx->ctrl, SHUT_RDWR);
        close(pCtx->ctrl);
      }

      ROOF_LOCK();

      // Free the context
      pCtx->busy = 0;

      ROOF_UNLOCK();
    } // roof_delete
This function is the counterpart of roof_new(). The I/O buffer is freed if it has been allocated internally (field internal_iobuf != 0), the control socket is closed properly with a call to shutdown() then close() if it is opened and finally, the field busy is reset to mark the context as being free.
The communication on the network uses the TCP/IP protocol through the
socket library of Linux. This makes it possible to receive and send
data over the network via a file descriptor using the read() and
write() system calls, as if we were reading or writing a file (this is
an application of the famous Unix concept: "everything is a file").
In the library, two functions, roof_read() and roof_write(),
encapsulate the read() and write() system calls. Their main purposes
are to give access to the context, which holds useful information like
the debug level, and to handle the EINTR error code. As signals are
asynchronous, they can be received at any moment, interrupting the
running code to trigger a handler that the user may have defined. Under
Linux, most of the system calls return an error (i.e. the value -1) and
set the global variable errno to EINTR if they are interrupted by a
signal. So, roof_read() and roof_write() check this error code to
restart the system call.
Here is the roof_write() function:
    static int roof_write(
                 roof_context_t *pCtx, // Internal context
                 int             fd,   // Output file descriptor
                 const void     *buf,  // Writing buffer
                 size_t          len   // Number of bytes to write
                         )
    {
      int          rc;
      unsigned int l;

      l = len;
      do
      {
        // Write the remaining bytes from the current position
        rc = write(fd, ((const char *)buf) + (len - l), l);

        if (rc < 0)
        {
          if (EINTR == errno)
          {
            // Reiterate the write()
            rc = 0;
          }
        }

        if (rc > 0)
        {
          assert(l >= (unsigned)rc);
          l -= rc;
        }
      } while (l && (rc >= 0));

      if (-1 == rc)
      {
        int saved_errno = errno;

        ROOF_ERR(pCtx, "Error '%s' (%d) on write()\n", strerror(errno), errno);
        errno = saved_errno;
      }
      else
      {
        rc = len;
      }

      return rc;
    } // roof_write
Here is the roof_read() function:
    static int roof_read(
                 roof_context_t *pCtx, // Internal context
                 int             fd,   // Input descriptor
                 char           *buf,  // Read buffer
                 unsigned int    len   // Maximum number of bytes to read
                        )
    {
      int rc;
      int saved_errno;

      do
      {
        rc = read(fd, buf, len);

        if (-1 == rc)
        {
          if (EINTR == errno)
          {
            // Interrupted by a signal: the loop condition restarts the read()
            continue;
          }

          saved_errno = errno;
          ROOF_ERR(pCtx, "Error '%s' (%d) on read(%d)\n", strerror(errno), errno, fd);
          errno = saved_errno;

          return -1;
        }
      } while (rc < 0);

      return rc;
    } // roof_read
In the preceding examples, the errno variable is saved before the calls
to the macros displaying the error messages and restored afterwards,
because the error value must be preserved when roof_read() and
roof_write() return. Saving it is mandatory because the error macro
calls functions from the C library like printf() which may alter the
value of errno. This is advised by the online manual of errno
(man 3 errno).
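The save and restore idiom can be isolated in a small reporting function. The name report_error() is hypothetical. The library uses its own ROOF_ERR() macro, whose definition is not shown in this article.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: print an error message without altering errno,
// so that the caller still sees the error code of the failed call.
void report_error(const char *what)
{
  // fprintf() and strerror() are allowed to modify errno
  int saved_errno = errno;

  fprintf(stderr, "Error '%s' (%d) on %s\n",
          strerror(saved_errno), saved_errno, what);

  errno = saved_errno;
}
```

A caller can therefore write "report_error("read()"); return -1;" and still let its own caller inspect errno.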
Those functions could simply use the address of the I/O buffer stored
in the internal context instead of receiving the address as parameter.
But in some cases, they are used with a buffer which is not from the
context, or with an address at a given offset in the I/O buffer.
One of the main functions of the client is to send commands to the server. As seen in § 1.4, a command is composed of a keyword optionally followed by parameters and is terminated by the end of line characters (CR and LF). The commands are sent by the client on the control channel. This is the job of the roof_send_cmd() function:
    // The prototype carries the format attribute checked by the compiler
    static int roof_send_cmd(
                 roof_context_t *pCtx,   // Internal context
                 const char     *format, // Command line to send
                 ...
                            ) __attribute__ ((format (printf, 2, 3)));

    static int roof_send_cmd(
                 roof_context_t *pCtx,   // Internal context
                 const char     *format, // Command to send
                 ...
                            )
    {
      int     rc = 0;
      va_list args_list;
      int     sz;

      // Build the command line in the I/O buffer of the context
      va_start(args_list, format);
      sz = vsnprintf(pCtx->iobuf, pCtx->l_iobuf, format, args_list);
      va_end(args_list);
      [...]

      // Send the command on the control channel
      rc = roof_write(pCtx, pCtx->ctrl, pCtx->iobuf, sz);
      [...]

      return 0;
    } // roof_send_cmd
The function is passed a command described with a format and a variable number of arguments like printf(). vsnprintf() is used to format the parameters. This interface makes the function generic enough to send any FTP command (with or without parameters). For example, to send a parameterless command like PASV:
    rc = roof_send_cmd(pCtx, "PASV\r\n");
And to send a command with parameters like TYPE:
    rc = roof_send_cmd(pCtx, "TYPE %c %c\r\n", type, code);
For robustness, we use GCC's format attribute so that the compiler checks that the arguments passed to the function are consistent with the format parameter (cf. GCC's attribute description for more information).
The other main function of the client is to receive the responses from the server. As specified in § 1.5, the answers are formatted and may contain one or multiple lines. This is the work of the roof_get_reply() function:
int roof_get_reply( roof_ctx_t *pContext, // External context const char **reply // Response ) { roof_context_t *pCtx = ROOF_CTX(pContext); fd_set fdset; int rc; struct timeval to; unsigned int i; int first = 1; unsigned int offset = 0; unsigned int lreply; char code[4]; *reply = NULL; lreply = pCtx->l_iobuf; one_more_time: FD_ZERO(&fdset); FD_SET(pCtx->ctrl, &fdset); if (pCtx->timeout_ms) { to.tv_sec = pCtx->timeout_ms / 1000; to.tv_usec = (pCtx->timeout_ms % 1000) * 1000; rc = select(pCtx->ctrl + 1, &fdset, NULL, NULL, &to); } else { rc = select(pCtx->ctrl + 1, &fdset, NULL, NULL, NULL); } switch(rc) { case -1: // Error or signal { if (EINTR == errno) { goto one_more_time; } ROOF_ERR(pCtx, "Error '%s' (%d) on select()\n", strerror(errno), errno); return -1; } break; case 0: // Timeout { ROOF_ERR(pCtx, "Timeout on read\n"); errno = ETIMEDOUT; return -1; } break; case 1 : // Data from the connection { char *p; rc = roof_read_line(pCtx, pCtx->ctrl, pCtx->iobuf + offset, lreply); [...] // If the connection is closed if (0 == rc) { // Overwrite the buffer with a dummy code strcpy(pCtx->iobuf, "600"); lreply = pCtx->l_iobuf - 3; // Add additional information if enough room strncat(pCtx->iobuf, " End of connection", lreply); pCtx->iobuf[pCtx->l_iobuf - 1] = '\0'; *reply = pCtx->iobuf; return 0; } [...] // Look for the end of line p = pCtx->iobuf + offset; i = 0; while (p < (pCtx->iobuf + offset + rc)) { if (('\r' == *p) && ('\n' == *(p + 1))) { // This is the first line if (first) { // We must have 3 digits at the beginning of the line [...] for (i = 0; i < 3; i ++) { [...] code[i] = pCtx->iobuf[i]; } // End for // Is it a multipline answer ? if ('-' == pCtx->iobuf[i]) { first = 0; // Remaining space in the input buffer lreply -= rc; // New beginning of buffer when the response // is multiline offset += rc; [...] // Read the following line goto one_more_time; } // This is a single line response // Terminate the line by overwriting the last |
Although quite big, the function is straightforward. It is based on the
select() system call to wait for data from the server on the control
socket (ctrl field in the internal context). Note the use of the
timeout if one has been specified by the user in the roof_new() call
(cf. § 2.3): this avoids waiting forever for a response which will
never come if the server is out of service. The function is able to
read responses of one or more lines by testing the presence of the
minus sign after the 3-digit code. To read the data lines from the
server, the internal function roof_read_line() is called to get the
input characters up to the appearance of the CR and LF characters.
It is not useful to describe it in detail here.
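The waiting step can be sketched in isolation as follows. The helper wait_readable() is a hypothetical name. It reproduces the convention described above (a timeout of 0 means an infinite wait) and, like the original code, restarts the wait with the full timeout when select() is interrupted by a signal.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

// Hypothetical helper: wait until 'fd' is readable.
// timeout_ms = 0 means infinite wait.
// Returns 1 if data are available, 0 on timeout, -1 on error.
int wait_readable(int fd, unsigned int timeout_ms)
{
  fd_set         fdset;
  struct timeval to;
  int            rc;

  do {
    // select() may modify its arguments: rebuild them on each iteration
    FD_ZERO(&fdset);
    FD_SET(fd, &fdset);
    to.tv_sec  = timeout_ms / 1000;
    to.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &fdset, NULL, NULL, timeout_ms ? &to : NULL);
  } while ((rc < 0) && (EINTR == errno));

  return rc;
}
```

The caller maps the value 0 to a timeout error (the library sets errno to ETIMEDOUT in that case) and reads the data when the value is 1.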
roof_get_reply() is part of the API because it is used not only
internally but also externally, by the user, to get the spontaneous
messages from the server. That is why it is passed an external context
instead of an internal context as parameter.
We will see in the following functions that the responses are handled
by checking the first digit of the 3-digit code because it is
sufficient to know which action to trigger (cf. § 4.2 of RFC959).
The connection to the server consists first of all in establishing the
control channel, as specified at the beginning of § 5.4 of RFC959:
the channel is opened and the server sends a response with a code equal
to 220 to specify that it is ready. If the server is not ready to
receive commands, it sends a response with a code equal to 120 and the
client is supposed to wait until it receives a code equal to 220.
The establishment of the control channel is done by roof_open_ctrl()
which is passed the address of the server (name or IP address) and the
TCP port number. If the server is listening on the standard port
(the usual situation), the caller can pass the constant ROOF_DEF_PORT
instead of the hardcoded value 21. The return code of this service is
the socket descriptor of the connection or -1 if an error occurred. The
ctrl field of the internal context stores the value of this socket.
Note the use of the SO_LINGER socket option to make the TCP protocol
wait for the remote side to get all the pending data at disconnection
time.
    int roof_open_ctrl(
           roof_ctx_t   *pContext, // External context
           const char   *host,     // Server's address (name or IP address)
           unsigned int  port      // Server's port number
                      )
    {
      roof_context_t     *pCtx = ROOF_CTX(pContext);
      int                 rc;
      int                 sd = -1;
      struct sockaddr_in  addr;
      struct hostent     *pHost;
      const char         *code;
      struct linger       opt_linger;
      int                 err_sav;

      [...]

      // Get a TCP socket descriptor
      sd = socket(PF_INET, SOCK_STREAM, 0);
      [...]

      // Set SO_LINGER option to make the caller of close() on the socket
      // wait until the remote part has got all its data
      opt_linger.l_onoff  = 1; // Activate the LINGER
      opt_linger.l_linger = 2; // Persistence time in seconds
      rc = setsockopt(sd, SOL_SOCKET, SO_LINGER, &opt_linger, sizeof(opt_linger));
      [...]

      // Translate the hostname into address
      pHost = gethostbyname(host);
      [...]

      // Populate the address (h_addr_list[0] is already in network byte
      // order; memcpy() avoids reading it through a wider integer type)
      memset(&addr, 0, sizeof(addr));
      addr.sin_family = AF_INET;
      addr.sin_port   = htons(port);
      memcpy(&(addr.sin_addr), pHost->h_addr_list[0], sizeof(addr.sin_addr));

      // Connection to the remote host
      rc = connect(sd, (struct sockaddr *)&addr, sizeof(addr));
      [...]

      // We populate the context right now because roof_get_reply() needs
      // the ctrl field
      pCtx->ctrl = sd;

      // Loop until we receive a 2yz response or timeout
      do
      {
        // Wait for a response from the server
        rc = roof_get_reply(pContext, &code);
        [...]

        if ((code[0] != '1') && (code[0] != '2'))
        {
          ROOF_ERR(pCtx, "Negative reply code '%s'\n", pCtx->iobuf);
          errno = EIO;
          rc = -1;
          goto error;
        }
      } while (code[0] != '2');

      rc = sd;

      goto end;

    error:

      err_sav = errno;
      if (sd >= 0)
      {
        close(sd);
        sd = -1;
      }
      pCtx->ctrl = -1;
      errno = err_sav;

    end:

      return rc;
    } // roof_open_ctrl
The second and last step of the connection is the identification procedure. The RFC959 proposes a diagram reproduced in figure 5 (with some adaptations). The diagram shows that depending on the server, the identification may consist only of a USER command or USER followed by PASS or USER followed by PASS and ACCT. In practice, the procedure consists most of the time of USER followed by PASS.
The identification diagram is run by the roof_login() function. We describe this service to show that the commands are sent by the roof_send_cmd() internal routine, the responses are received by roof_get_reply() (both introduced above) and the diagrams are implemented with a series of switch/case statements to test the values of the responses and goto statements to change state.
int roof_login( roof_ctx_t *pContext, // External context const char *login, // Login name const char *passwd, // Password const char *account // Account information ) { roof_context_t *pCtx = ROOF_CTX(pContext); int rc; const char *code; assert(NULL != pCtx); assert(pCtx->busy); if (!login || !(login[0])) { ROOF_ERR(pCtx, "NULL login parameter\n"); errno = EINVAL; return -1; } rc = roof_send_cmd(pCtx, "USER %s\r\n", login); [...] rc = roof_get_reply(ROOF_EXT_CTX(pCtx), &code); [...] switch(code[0]) { case '1' : // Preliminary positive response case '4' : // Transient negative completion case '5' : // Permanent negative completion { ROOF_ERR(pCtx, "Error '%s'\n", pCtx->iobuf); errno = EIO; return -1; } break; case '3' : // Intermediate positive response { goto send_passwd; } break; case '2' : // Positive completion { goto end; } break; default : // Normally impossible { ROOF_ERR(pCtx, "Unexpected reply code '%s'\n", pCtx->iobuf); errno = EIO; return -1; } break; } // End switch send_passwd: if (!passwd || !(passwd[0])) { ROOF_ERR(pCtx, "Password parameter is required by server\n"); errno = EINVAL; return -1; } rc = roof_send_cmd(pCtx, "PASS %s\r\n", passwd); [...] rc = roof_get_reply(ROOF_EXT_CTX(pCtx), &code); [...] switch(code[0]) { case '1' : // Positive Preliminary reply case '4' : // Transient negative completion case '5' : // Permanent negative completion { ROOF_ERR(pCtx, "Error '%s'\n", pCtx->iobuf); errno = EIO; return -1; } break; case '3' : // Intermediate positive response { goto send_account; } break; case '2' : // Positive termination { goto end; } break; default : // Normally impossible { ROOF_ERR(pCtx, "Unexpected reply code '%s'\n", pCtx->iobuf); errno = EIO; return -1; } break; } // End switch send_account: if (!account || !(account[0])) { ROOF_ERR(pCtx, "Account parameter is required by server\n"); errno = EINVAL; return -1; } rc = roof_send_cmd(pCtx, "ACCT %s\r\n", account); [...] rc = roof_get_reply(ROOF_EXT_CTX(pCtx), &code); [...] 
switch(code[0]) { case '1' : // Positive preliminary reply case '3' : // Intermediate positive response case '4' : // Transient negative completion case '5' : // Permanent negative completion { ROOF_ERR(pCtx, "Error '%s'\n", pCtx->iobuf); errno = EIO; return -1; } break; case '2' : // Positive completion { goto end; } break; default : // Normally impossible { ROOF_ERR(pCtx, "Unexpected reply code '%s'\n", pCtx->iobuf); errno = EIO; return -1; } break; } // End switch end: return 0; } // roof_login |
The diagram in figure 6 is the first one presented in the § 6 of the RFC959 recommendation. It concerns the ABOR, ALLO, DELE, CWD, CDUP, SMNT, HELP, MODE, NOOP, PASV, QUIT, SITE, PORT, SYST, STAT, RMD, MKD, PWD, STRU and TYPE commands.
The diagram in figure 7 is the second one presented in the § 6 of the RFC959 recommendation. It concerns the data transfer APPE, LIST, NLST, REIN, RETR, STOR and STOU commands.
This diagram introduces the roof_open_data() function which opens the data channel in order to transfer the content of the files and directories:
    static int roof_open_data(roof_context_t *pCtx) // Internal context
    {
      int                 rc;
      const char         *code;
      char               *p;
      int                 port_lsb, port_msb, port;
      struct sockaddr_in  addr;
      int                 data;
      int                 err_sav;

      rc = roof_send_cmd(pCtx, "PASV\r\n");
      [...]

      rc = roof_get_reply(ROOF_EXT_CTX(pCtx), &code);
      [...]

      // Make sure the response is OK
      if (code[0] != '2')
      {
        ROOF_ERR(pCtx, "Error '%s'\n", pCtx->iobuf);
        errno = EIO;
        return -1;
      }

      // Parsing of the response from the server to get the port number as
      // well as the server's address
      p = pCtx->iobuf;
      while (*p && (*p != ')'))
      {
        p ++;
      }
      if (*p != ')')
      {
        ROOF_ERR(pCtx, "Expected a terminating ')' in '%s'\n", pCtx->iobuf);
        errno = EIO;
        return -1;
      }
      *p = '\0';

      // Least significant byte of the port: last field before ')'
      while ((p != pCtx->iobuf) && (*p != ','))
      {
        p --;
      }
      // Error if there is no ',' or if it is not followed by a digit
      if ((*p != ',') || (!isdigit((unsigned char)*(p+1))))
      {
        ROOF_ERR(pCtx, "Expected a ',' followed by a digit in '%s'\n", pCtx->iobuf);
        errno = EIO;
        return -1;
      }
      *p = '\0';
      port_lsb = atoi(p+1);

      // Most significant byte of the port: previous field
      while ((p != pCtx->iobuf) && (*p != ','))
      {
        p --;
      }
      if ((*p != ',') || (!isdigit((unsigned char)*(p+1))))
      {
        ROOF_ERR(pCtx, "Expected a ',' followed by a digit in '%s'\n", pCtx->iobuf);
        errno = EIO;
        return -1;
      }
      *p = '\0';
      port_msb = atoi(p+1);

      port = (port_msb << 8) | port_lsb;

      // Convert the address into dotted notation
      p--;
      while ((p != pCtx->iobuf) && (*p != '('))
      {
        if (',' == *p)
        {
          *p = '.';
        }
        else
        {
          if (!isdigit((unsigned char)*p))
          {
            ROOF_ERR(pCtx, "Expected a digit in the server's address'%s'\n", pCtx->iobuf);
            errno = EIO;
            return -1;
          }
        }
        p --;
      }
      if (*p != '(')
      {
        ROOF_ERR(pCtx, "Expected a '(' in '%s'\n", pCtx->iobuf);
        errno = EIO;
        return -1;
      }

      // Populate the server's address
      memset(&addr, 0, sizeof(addr));
      addr.sin_family      = AF_INET;
      addr.sin_port        = htons(port);
      addr.sin_addr.s_addr = inet_addr(p+1);

      // Creation of the socket for the data channel
      data = socket(PF_INET, SOCK_STREAM, 0);
      [...]

      // Connection to the server
      rc = connect(data, (struct sockaddr *)&addr, sizeof(addr));
      [...]

      return data;
    } // roof_open_data
The PASV command is sent to the server to make it switch to passive
mode. The server answers with a reply code starting with "2", such as
"227 Entering Passive Mode (127,0,0,1,128,87)", to give the port number
on which it will wait for the data connection. The numbers inside the
parentheses, separated by commas, are specified in RFC959.
From left to right, they represent the IP address (the first four
fields: 127.0.0.1), then the most significant byte and the least
significant byte of the 16-bit port number (128 * 256 + 87 = 32855).
Once this information is received by the client, a socket is created to
establish the data channel to the server. The function returns the
socket descriptor of the data channel or -1 if an error occurred.
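The decoding of the PASV reply described above can also be illustrated in isolation. The following is only a minimal sketch, not the ROOF implementation: the function name parse_pasv_reply() and its parameters are hypothetical, and it uses sscanf() instead of the backward scan done in roof_open_data().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the ROOF API): extract the dotted IP
 * address and the port number from a reply such as
 * "227 Entering Passive Mode (127,0,0,1,128,87)".
 * Returns 0 on success, -1 if the reply is malformed.
 */
int parse_pasv_reply(const char *reply, char *ip, size_t ip_size, int *port)
{
  unsigned int h1, h2, h3, h4, p1, p2;
  const char *open;
  int end = -1;
  int n;

  assert(reply != NULL && ip != NULL && port != NULL);

  /* The six fields are located after the opening parenthesis */
  open = strchr(reply, '(');
  if (open == NULL)
  {
    return -1;
  }

  /* %n is stored only if the closing parenthesis has been matched */
  n = sscanf(open, "(%u,%u,%u,%u,%u,%u)%n", &h1, &h2, &h3, &h4, &p1, &p2, &end);
  if ((n != 6) || (end < 0))
  {
    return -1;
  }

  /* Each field is a byte */
  if ((h1 > 255) || (h2 > 255) || (h3 > 255) || (h4 > 255) ||
      (p1 > 255) || (p2 > 255))
  {
    return -1;
  }

  /* First four fields: the IP address in dotted notation */
  n = snprintf(ip, ip_size, "%u.%u.%u.%u", h1, h2, h3, h4);
  if ((n < 0) || ((size_t)n >= ip_size))
  {
    return -1;
  }

  /* Last two fields: most and least significant bytes of the port */
  *port = (int)((p1 << 8) | p2);

  return 0;
}
```

With the reply quoted above, this sketch yields the address 127.0.0.1 and the port 32855. A reply without the parentheses or with fewer than six fields is rejected.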
We will not describe the functions which implement the commands based
on diagram number 2 because they follow the same principle as roof_login() seen
above.
For the STOR command, we have added some optimizations to make file uploading efficient, using the sendfile() Linux system call and the TCP_CORK socket option. Loosely speaking, the latter makes the network layer fill the packets as much as possible before sending them to the server, in order to reduce the number of packets necessary to send the file (cf. https://baus.net/on-tcp_cork/ or https://www.techrepublic.com/article/tcp-ip-options-for-high-performance-data-transmission/). The former delegates to the kernel the reading of the local file content and its submission to the network socket. This reduces the context switches between kernel and user space and avoids the intermediate copies from kernel to user space and back (cf. https://developer.ibm.com/articles/j-zerocopy/).
For download commands like RETR, the server is expected to apply the same optimization principles on its side.
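To make the upload optimization more concrete, here is a minimal Linux-only sketch of the combination of TCP_CORK and sendfile(). It is not the ROOF source code: the names upload_corked() and demo_loopback_upload() are hypothetical, and the demo function merely replaces a real FTP data channel with a TCP connection on the loopback interface.

```c
#include <assert.h>
#include <errno.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: send a whole regular file on a connected TCP socket.
 * Returns 0 on success, -1 on error. */
int upload_corked(int sock, int file_fd)
{
  struct stat st;
  off_t offset = 0;
  int on = 1, off = 0;

  assert(sock >= 0 && file_fd >= 0);

  if (fstat(file_fd, &st) != 0)
    return -1;

  /* Cork the socket: partial frames are held back until it is uncorked */
  if (setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof(on)) != 0)
    return -1;

  /* The kernel copies the file to the socket without a user space buffer */
  while (offset < st.st_size)
  {
    ssize_t n = sendfile(sock, file_fd, &offset, (size_t)(st.st_size - offset));
    if ((n < 0) && (errno == EINTR))
      continue;
    if (n <= 0)
      break;
  }

  /* Uncork the socket: the pending data is flushed */
  if (setsockopt(sock, IPPROTO_TCP, TCP_CORK, &off, sizeof(off)) != 0)
    return -1;

  return (offset == st.st_size) ? 0 : -1;
}

/* Demo on the loopback interface: returns 0 if the receiver gets the file */
int demo_loopback_upload(void)
{
  static const char payload[] = "hello ftp";
  const size_t len = sizeof(payload) - 1;
  char path[] = "/tmp/upload_sketch_XXXXXX";
  char buf[32];
  struct sockaddr_in addr;
  socklen_t alen = sizeof(addr);
  size_t got = 0;
  ssize_t n;
  int lsn, cli, srv, fd, up, rc = -1;

  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* Port 0: chosen by the kernel */

  lsn = socket(AF_INET, SOCK_STREAM, 0);
  cli = socket(AF_INET, SOCK_STREAM, 0);
  fd = mkstemp(path);
  if ((lsn < 0) || (cli < 0) || (fd < 0))
    return -1;
  unlink(path);
  if (write(fd, payload, len) != (ssize_t)len)
    return -1;

  if ((bind(lsn, (struct sockaddr *)&addr, sizeof(addr)) != 0) ||
      (listen(lsn, 1) != 0) ||
      (getsockname(lsn, (struct sockaddr *)&addr, &alen) != 0) ||
      (connect(cli, (struct sockaddr *)&addr, sizeof(addr)) != 0) ||
      ((srv = accept(lsn, NULL, NULL)) < 0))
    return -1;

  up = upload_corked(cli, fd);
  close(cli); /* End of file for the receiver */
  if (up == 0)
  {
    while ((got < sizeof(buf)) &&
           ((n = read(srv, buf + got, sizeof(buf) - got)) > 0))
      got += (size_t)n;
    if ((got == len) && (memcmp(buf, payload, len) == 0))
      rc = 0;
  }

  close(srv);
  close(lsn);
  close(fd);
  return rc;
}
```

In a real client, the socket passed to such a helper would be the data channel descriptor, for instance the one returned by roof_open_data(), once the server has accepted the STOR command.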
The diagram in figure 8 is the third one described in § 6 of the RFC959 recommendation. It concerns the renaming commands RNFR and RNTO. This diagram is implemented in the function roof_mv().
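The two-step dialog of this third diagram can be sketched as follows. This is an illustration under stated assumptions, not the code of roof_mv(): the callback type ftp_exchange_fn, the function ftp_rename() and the stub fake_exchange(), which stands in for a real server, are all hypothetical. According to RFC959, RNFR must be answered by an intermediate reply (first digit "3", typically 350) before RNTO is sent, and RNTO must be answered by a completion reply (first digit "2", typically 250).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical transport callback: sends one command line and stores the
 * reply line of the server in 'reply'. Returns 0 on success, -1 on I/O error. */
typedef int (*ftp_exchange_fn)(void *ctx, const char *cmd,
                               char *reply, size_t reply_size);

/* Hypothetical helper: rename 'from' into 'to' with RNFR followed by RNTO.
 * Returns 0 on success, -1 on error. */
int ftp_rename(ftp_exchange_fn exchange, void *ctx,
               const char *from, const char *to)
{
  char cmd[512];
  char reply[512];
  int n;

  assert(exchange != NULL && from != NULL && to != NULL);

  /* First command: an intermediate reply (3yz) is expected */
  n = snprintf(cmd, sizeof(cmd), "RNFR %s\r\n", from);
  if ((n < 0) || ((size_t)n >= sizeof(cmd)))
    return -1;
  if (exchange(ctx, cmd, reply, sizeof(reply)) != 0)
    return -1;
  if (reply[0] != '3')
    return -1;

  /* Second command: a completion reply (2yz) is expected */
  n = snprintf(cmd, sizeof(cmd), "RNTO %s\r\n", to);
  if ((n < 0) || ((size_t)n >= sizeof(cmd)))
    return -1;
  if (exchange(ctx, cmd, reply, sizeof(reply)) != 0)
    return -1;
  if (reply[0] != '2')
    return -1;

  return 0;
}

/* Stub standing in for a real server. 'ctx' points to an int: when it is
 * non-zero, the stub pretends that the source file does not exist. */
int fake_exchange(void *ctx, const char *cmd, char *reply, size_t reply_size)
{
  const int *missing = ctx;
  const char *text;

  if (strncmp(cmd, "RNFR ", 5) == 0)
    text = (missing && *missing) ? "550 File not found\r\n"
                                 : "350 Ready for RNTO\r\n";
  else if (strncmp(cmd, "RNTO ", 5) == 0)
    text = "250 Rename successful\r\n";
  else
    text = "500 Unknown command\r\n";

  snprintf(reply, reply_size, "%s", text);
  return 0;
}
```

The stub only serves to exercise both branches of the diagram: the rename succeeds when the source file exists, and it stops after RNFR when the server answers 550.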
For a complete example, the reader can look at the source code of the roof or ftpfs commands in the client and fs sub-directories of the source tree.
Here we present a tiny program called test_ftp which sets up an FTP
connection to a server, then displays the remote system's type as well as
the name and the content of the working directory:
#include |
The program can be built as follows:
$ gcc test_ftp.c -o test_ftp -lroof
Here is an example of execution of the program with a local connection (localhost), with the login name foo and the password bar:
$ ./test_ftp localhost foo bar
Remote system's type: 215 UNIX Type: L8 (Linux)
Working directory: 257 "/home/foo" is current directory.
Content of working directory:
total 2256
-rw-r--r-- 1 foo foo 60 Jan 16 08:55 file1
-rw------- 1 foo foo 203 Jan 16 08:55 file2
drwx------ 2 foo foo 4096 Mar 11 2007 directory
$
After an overview of the FTP recommendation, this paper presented the
design of an API called ROOF. It is very simple to set up and can
be used in any application that needs file transfer features based on a
standard protocol.
The source package comes with three application examples:
The author is an engineer in computer sciences located in France. He can be contacted here.