Last update: 14-Aug-2022
Author: R. Koucha
Why are shell built-ins also available as programs?
Introduction

For performance reasons, some Linux shell commands are not run as independent executables but as built-in functions of the shell. For instance, the widely used echo command is a built-in:

$ type echo
echo is a shell builtin
But echo is also available as a program:
$ which echo
/usr/bin/echo
$ file /usr/bin/echo
/usr/bin/echo: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=714b557112010bbcd04b0e5e6efc1b106166733c, for GNU/Linux 3.2.0, stripped
Loosely speaking, the program version of a built-in is used when the shell interpreter is not available or not needed. Let's explain this in more detail.

The command as a built-in

When you run a shell script, the interpreter recognizes the built-ins and does not fork/exec: it merely calls the internal function that implements the built-in. Even if you invoke them from a C/C++ executable through system(), the latter launches a shell first and the spawned shell then runs the built-in. Here is an example program which runs "echo message" through the system() library function:

#include <stdlib.h>

int main(void)
{
  system("echo message");

  return 0;
}

Compile it and run it:

$ gcc msg.c -o msg
$ ./msg
message

Running the program under strace with the -f option (which also traces the child processes) shows the processes involved. First, the main program is executed:

$ strace -f ./msg
execve("./msg", ["./msg"], 0x7ffef5c99838 /* 58 vars */) = 0

Then system() creates a child process, which shows up as a clone() system call. The child process 5185 is launched:

clone(child_stack=0x7f7e6d6cbff0, flags=CLONE_VM|CLONE_VFORK|SIGCHLD
strace: Process 5185 attached
 

The child process executes /bin/sh -c "echo message". The latter shell calls the echo built-in to display the message on the screen (write() system call):

[pid  5185] execve("/bin/sh", ["sh", "-c", "echo message"], 0x7ffdd0fafe28 /* 58 vars */ 
[...]
[pid  5185] write(1, "message\n", 8message
)    = 8
[...]
+++ exited with 0 +++
The command as a program

The program version of a built-in is useful when you need it from a C/C++ executable without an intermediate shell, either for performance reasons or because no shell is available on the system (e.g. a tiny embedded environment which does not provide user interaction). This is the case, for instance, when you call it through the execv() function.
Here is an example program which does the same thing as the preceding example but with execv() instead of system():

#include <stdio.h>
#include <unistd.h>

int main(void)
{
  char *av[3];

  av[0] = "/bin/echo";
  av[1] = "message";
  av[2] = NULL;
  execv(av[0], av);

  // execv() returns only if it failed
  perror("execv");

  return 1;
}

Compile and run it to see that we get the same result:

$ gcc msg1.c -o msg1
$ ./msg1
message

Let's run it under strace to get the details. The output is shorter because no child process is created to run an intermediate shell. The actual /bin/echo program is executed instead:

$ strace -f ./msg1
execve("./msg1", ["./msg1"], 0x7fffd5b22ec8 /* 58 vars */) = 0
[...]
execve("/bin/echo", ["/bin/echo", "message"], 0x7fff6562fbf8 /* 58 vars */) = 0
[...]
write(1, "message\n", 8message
)    = 8
[...]
exit_group(0)                           = ?
+++ exited with 0 +++

Of course, if the program is supposed to do additional things, a simple call to execv() is not sufficient, as it replaces the calling program with /bin/echo. A more elaborate program would fork and execute the latter program in the child process, still without running an intermediate shell:

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
  if (fork() == 0) {

    char *av[3];

    av[0] = "/bin/echo";
    av[1] = "message";
    av[2] = NULL;
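    execv(av[0], av);

    // execv() returns only on failure: terminate the child
    _exit(127);
  }

  // Wait for the end of the child process
  wait(NULL);

  // Do additional things before ending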

  return 0;
}

Compile it, then run it under strace to see that the child process executes the /bin/echo program directly, without an intermediate shell:

$ gcc msg2.c -o msg2
$ ./msg2
message
$ strace -f ./msg2
execve("./msg2", ["./msg2"], 0x7ffc11a5e228 /* 58 vars */) = 0
[...]
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 5703 attached
, child_tidptr=0x7f8e0b6e0810) = 5703
[pid  5703] execve("/bin/echo", ["/bin/echo", "message"], 0x7ffe656a9d08 /* 58 vars */ 
[...]
[pid  5703] write(1, "message\n", 8message
)    = 8
[...]
[pid  5703] +++ exited with 0 +++
<... wait4 resumed>NULL, 0, NULL)       = 5703
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5703, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
exit_group(0)                           = ?
+++ exited with 0 +++
About the author

The author is a computer science engineer based in France. He can be contacted here.