Last modification: 03-Apr-2019
Author: R. Koucha
isys: a way to make system() more efficient
Foreword
This article is extracted from a larger study on solutions to optimize the C library's system() service.
1. Introduction
In embedded environments, hardware cost is an important consideration, so memory is often very limited. Memory and CPU time are critical resources that must be used as efficiently as possible, not only for response time and robustness but also to reduce hardware cost. Many applications need to call shell commands to trigger tasks that would be tedious to implement in a language like C. For this purpose, the C library provides the system() service, which takes the command line to run as its parameter:
int system(const char *command);
The "command" parameter may be a simple executable name or a more complex shell command line using output redirections and pipes.
system() hides a call to "/bin/sh -c" to run the command line passed as parameter.
From the Linux system's point of view, in the simplest case, system() triggers at least two pairs of fork()/exec() system calls: one for "sh -c" and another for the command itself, as depicted in Figure 1.
Moreover, fork() duplicates some resources (memory, file descriptors...) of the calling process (the parent) so that the forked process (the child) inherits them. If the calling process has a large memory footprint, or if overall memory usage is high, the system() call may fail for lack of free memory. Even though Linux has benefited from multiple enhancements such as Copy On Write (COW), which make fork() more efficient and less costly, this may still lead to memory overconsumption that triggers Linux defense mechanisms such as the Out Of Memory (OOM) killer.
This paper addresses the problem of system() overuse with an alternative solution called isys, which enhances existing applications safely, that is to say with minimal impact on the existing source code and its behaviour.
2. Isys
Some applications call system() frequently, which means that "sh -c" is run very often. Moreover, the execution and termination of multiple shells by several concurrent applications consumes CPU time and memory. A possible solution is to start a shell once and keep it ready for any application that needs to run commands.
The idea is to start one (or more) background shell(s) at application startup. The "-c" option, which runs one command line and then makes the shell exit, is not used: the shell must stay alive in the background for the whole application lifetime, even after a command has completed. Each time the application needs to run a command, it submits it to the background shell. This saves the CPU time and memory needed to start and stop a shell for every command. Figure 2 depicts the principle.
Without the "-c" option, the shell is interactive. In other words, it needs to be attached to a terminal. Linux provides the pseudo-terminal (PTY) concept for this kind of need. The PTY is set up between the application process (master side) and the background shell process (slave side). The latter believes that it is interacting with an operator through a real terminal, whereas the operator is actually the application process (cf. Figure 3).
As the shell is in interactive mode, it displays a prompt and waits for a command. It reads the command, executes it and displays a new prompt when the command ends, to wait for the next one. At first sight, the application process would need to do some tricky parsing of the shell's output in order to distinguish the command's output from the prompt displayed at the end of the command. Moreover, the application must also get the result of the command (i.e. its exit status). To keep this simple, it is possible to use PDIP (Programmed Dialogs with Interactive Programs). This open source package is fully documented with online manuals, HTML pages and examples. It is an expect-like tool, but much simpler to use than its ancestor, and it provides the ability to drive interactive programs. It comes in two flavors: a command named pdip, which controls interactive programs from a shell script, and a C language API, offered by a shared library called libpdip.so, which controls interactive programs from a C/C++ program. The latter is the one of interest for implementing the current solution.
In the source tree of the PDIP package, the isys sub-directory contains a variant of system() based on the above principle (cf. isys.c, built into a shared library called libisys.so). § 4 presents some details about this library. With libisys.so, the application process calls an API named isystem(), which behaves like system() but hides the PTY and the running background shell described above (cf. Figure 4). The name of this service, isys, stands for "Interactive SYStem()" because it relies on shells running in interactive mode.
The solution described in this chapter saves the fork()/exec() of "sh -c" by keeping at least one background shell running per application process. This pays off when the application runs commands frequently, but it may be costly from a memory point of view if the application's calls to isystem() are rare. It is possible to enhance this implementation to reduce the number of running background shells by sharing them among all the running applications, as proposed by the rsys solution.
3. Performances
In this article, a small test program is used to compare the performance of system() and isystem():
system():
$ tests/system_it 2000 tests/scrip.sh
Running command 'tests/scrip.sh' 2000 times...
Elapsed time: 5 s - 612918826 ns
isystem():
$ tests/isystem_it 2000 tests/scrip.sh
Running command 'tests/scrip.sh' 2000 times...
Elapsed time: 4 s - 209876090 ns
In this run, isystem() completes the 2000 executions in about 4.21 s versus about 5.61 s for system(), i.e. roughly 25% less elapsed time. As a consequence, it is a good alternative to system().
4. Download, build and installation
4.1. Build from the sources
Unpack the source code package:
$ tar xvfz pdip-xxx.tgz
Go into the top level directory of the sources and trigger the build of the DEB packages:
$ cd pdip-xxx
$ ./pdip_install -P DEB
4.2. Installation from the packages
ISYS depends on PDIP, so PDIP must be installed before ISYS. Otherwise, you get the following error:
$ sudo dpkg -i isys_xxx_amd64.deb
Selecting previously unselected package isys.
(Reading database ... 218983 files and directories currently installed.)
Preparing to unpack isys_xxx_amd64.deb ...
Unpacking isys (xxx) ...
dpkg: dependency problems prevent configuration of isys:
isys depends on pdip (>= 2.0.4); however:
Package pdip is not installed.
dpkg: error processing package isys (--install):
dependency problems - leaving unconfigured
Errors were encountered while processing:
isys
First install the PDIP package:
$ sudo dpkg -i pdip_xxx_amd64.deb
Selecting previously unselected package pdip.
(Reading database ... 218988 files and directories currently installed.)
Preparing to unpack pdip_xxx_amd64.deb ...
Unpacking pdip (xxx) ...
Setting up pdip (xxx) ...
Processing triggers for man-db (2.7.5-1)...
Then install the ISYS package:
$ sudo dpkg -i isys_xxx_amd64.deb
(Reading database ... 219040 files and directories currently installed.)
Preparing to unpack isys_xxx_amd64.deb ...
Unpacking isys (xxx) over (xxx) ...
Setting up isys (xxx)
Installing from the packages is the preferred method, because the software can then be removed cleanly with:
$ sudo dpkg -r isys
(Reading database ... 219043 files and directories currently installed.)
Removing isys (xxx)
To display the list of files installed by the package:
$ dpkg -L isys
/.
/usr
/usr/local
/usr/local/include
/usr/local/include/isys.h
/usr/local/lib
/usr/local/lib/libisys.so
/usr/local/share
/usr/local/share/man
/usr/local/share/man/man3
/usr/local/share/man/man3/isys.3.gz
/usr/local/share/man/man3/isystem.3.gz
4.3. Installation from cmake
It is also possible to trigger the installation from cmake:
$ tar xvfz pdip-xxx.tgz
$ cd pdip-xxx
$ cmake .
-- The C compiler identification is GNU 6.2.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Building PDIP version xxx
The user id is 1000
-- Configuring done
-- Generating done
-- Build files have been written to: ...
$ sudo make install
Scanning dependencies of target man
Building pdip_en.1.gz
Building pdip_fr.1.gz
Building pdip_configure.
-- Installing: /usr/local/lib/librsys.so
-- Installing: /usr/local/sbin/rsystemd
-- Set runtime path of "/usr/local/sbin/rsystemd" to ""
4.4. Manual
When the ISYS package is installed, online manuals are available in section 3 (API).
$ man 3 isystem
NAME
isys - Interactive system() service
SYNOPSIS
#include "isys.h"
int isystem(const char *fmt, ...);
int isys_lib_initialize(void);
DESCRIPTION
The ISYS API provides a system(3)-like service based on a remanent background shell to save memory and CPU time in applications where system(3) is heavily used.
isystem() executes the shell command line formatted with fmt. The behaviour of the format is compliant with printf(3). Internally, the command is run by a remanent shell created by the libisys.so library in a child of the current process.
isys_lib_initialize() is to be called in child processes using the ISYS API. By default, ISYS API is deactivated upon fork(2).
The ISYS_TIMEOUT environment variable specifies the maximum time in seconds to wait for data from the shell (by default, it is 10 seconds).
isystem() returns the status of the executed command line (i.e. the last executed command). The returned value is a "wait status" that can be examined using the macros described in waitpid(2) (i.e. WIFEXITED(), WEXITSTATUS(), and so on).
isys_lib_initialize() returns 0 when there are no error or -1 upon error (errno is set).
The service does not support concurrent calls to isystem() by multiple threads. If this behaviour is needed, the application is responsible to manage the mutual exclusion on its side.
The following program receives a shell command as argument and executes it via a call to isystem().
#include <assert.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <isys.h>

int main(int ac, char *av[])
{
  int     status;
  int     i;
  char   *cmdline;
  size_t  len;
  size_t  offset;

  if (ac < 2)
  {
    fprintf(stderr, "Usage: %s cmd params...\n", basename(av[0]));
    return 1;
  }

  // Build the command line by joining the arguments with spaces
  cmdline = (char *)0;
  len     = 1; // Terminating NUL
  offset  = 0;
  for (i = 1; i < ac; i ++)
  {
    len += strlen(av[i]) + 1; // word + space
    cmdline = (char *)realloc(cmdline, len);
    assert(cmdline);
    offset += sprintf(cmdline + offset, "%s ", av[i]);
  } // End for

  printf("Running '%s'...\n", cmdline);

  // The first parameter of isystem() is a printf(3)-like format: pass the
  // command through "%s" so that a '%' in the command is not interpreted
  // as a conversion specification
  status = isystem("%s", cmdline);
  if (status != 0)
  {
    printf("Error from program (0x%x)!\n", status);
    free(cmdline);
    return 1;
  } // End if

  free(cmdline);
  return 0;
} // main
Build the program:
$ gcc tisys.c -o tisys -lisys -lpdip -lpthread
Then, run something like the following:
$ ./tisys echo example
Running 'echo example '...
example
AUTHOR
Rachid Koucha
SEE ALSO
system(3).
4.5. Build facilities
To help build systems locate the ISYS libraries and include files automatically, the ISYS package installs a configuration file named isys.pc for the pkg-config tool.
Moreover, for cmake based packages, a FindIsys.cmake file is provided at the top level of the isys sub-tree to facilitate auto-configuration.
5. About the author
The author is a computer science engineer based in France. He can be contacted here.