Author: R. Koucha
Last update: 09-Apr-2021
Huge pages allocation on the Raspberry Pi 4B
Introduction
1 Default configuration
2 The file system
3 Example
About the author
Introduction

Some architectures, such as the ARM processor of the Raspberry Pi 4B, support multiple huge page sizes. The default huge page size is 2 MB. This article explains how to use the other supported sizes.

1 Default configuration

A previous article explains how to configure the Linux kernel and provides an example program that allocates huge pages of the default size (2 MB).

Once the previous configuration is done, the default huge page size is visible in /proc/meminfo:

$ cat /proc/meminfo
[...]
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
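The same value can be read from a program. The following is a minimal sketch, assuming the /proc/meminfo layout shown above. The function names read_hugepagesize_kb() and default_hugepagesize_kb() are illustrative and do not belong to any library.

```c
#include <stdio.h>

/* Return the default huge page size in kB read from a meminfo-style
 * stream, or -1 if no "Hugepagesize:" line is found. */
long read_hugepagesize_kb(FILE *fp)
{
    char line[256];
    long kb;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Only the "Hugepagesize:" line matches this format */
        if (sscanf(line, "Hugepagesize: %ld kB", &kb) == 1) {
            return kb;
        }
    }
    return -1;
}

/* Convenience wrapper (hypothetical helper) for the real /proc/meminfo. */
long default_hugepagesize_kb(void)
{
    FILE *fp = fopen("/proc/meminfo", "r");
    long kb;

    if (fp == NULL) {
        return -1;
    }
    kb = read_hugepagesize_kb(fp);
    fclose(fp);
    return kb;
}
```

With the configuration shown above, default_hugepagesize_kb() would be expected to return 2048.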

The huge page sizes supported by the architecture are visible in the /sys/kernel/mm/hugepages directory:

$ ls -l /sys/kernel/mm/hugepages
total 0
drwxr-xr-x 2 root root 0 Nov 23 14:58 hugepages-1048576kB
drwxr-xr-x 2 root root 0 Nov 23 14:58 hugepages-2048kB
drwxr-xr-x 2 root root 0 Nov 23 14:58 hugepages-32768kB
drwxr-xr-x 2 root root 0 Nov 23 14:58 hugepages-64kB
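A program can discover the supported sizes by scanning this directory. The sketch below assumes the hugepages-<size>kB naming shown in the listing above. The helpers hugepage_dir_to_kb() and list_hugepage_sizes() are hypothetical names introduced for illustration.

```c
#include <dirent.h>
#include <stdio.h>

/* Extract the size in kB from a directory name such as "hugepages-64kB".
 * Returns -1 if the name does not follow that pattern. */
long hugepage_dir_to_kb(const char *name)
{
    long kb;

    if (sscanf(name, "hugepages-%ldkB", &kb) != 1) {
        return -1;
    }
    return kb;
}

/* Fill sizes_kb[] with at most 'max' huge page sizes found under 'path'
 * (normally /sys/kernel/mm/hugepages). Returns the number of sizes
 * stored, or -1 if the directory cannot be opened. */
int list_hugepage_sizes(const char *path, long *sizes_kb, int max)
{
    DIR *dir = opendir(path);
    struct dirent *ent;
    int count = 0;

    if (dir == NULL) {
        return -1;
    }
    while (count < max && (ent = readdir(dir)) != NULL) {
        long kb = hugepage_dir_to_kb(ent->d_name);

        if (kb > 0) {
            sizes_kb[count++] = kb;
        }
    }
    closedir(dir);
    return count;
}
```

The entries are returned in directory order, which is not necessarily sorted by size.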

2 The file system

Huge pages are accessible through a file system of type hugetlbfs, which is a kind of RAM-based file system (RAMFS). Each mounted instance uses a single huge page size, passed in the mount options. Several file systems with the same huge page size can be mounted. The principle is to create files in these file systems; user processes then map those files into their address spaces.

At kernel startup, Linux mounts an internal hugetlbfs file system for the default huge page size. For any other huge page size, the user must mount a corresponding file system. In the internal file system, the files are created by the kernel itself, as shown in the example of the article referenced in section 1: when mmap() is passed MAP_HUGETLB in its flags, the kernel creates a file named anon_hugepage in the default hugetlbfs file system.
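For reference, the default-size case described above can be sketched as follows. This is a minimal illustration, not the program of the referenced article. The function name map_default_hugepages() is hypothetical, and the call only succeeds if the pool of default-size huge pages has been configured.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map 'len' bytes of anonymous memory backed by huge pages of the
 * default size. 'len' should be a multiple of that size.
 * Returns NULL on failure (errno is set by mmap()). */
void *map_default_hugepages(size_t len)
{
    /* No file descriptor is given: with MAP_HUGETLB the kernel backs
     * the mapping with its internal hugetlbfs file system. */
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    return (addr == MAP_FAILED) ? NULL : addr;
}
```

A caller would release the area with munmap(), passing the same address and length.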

In the Linux source code tree, the hugetlbfs file system is implemented in fs/hugetlbfs/inode.c. The list of options that can be passed to the mount command is:

static const struct fs_parameter_spec hugetlb_param_specs[] = {
	fsparam_u32   ("gid",		Opt_gid),
	fsparam_string("min_size",	Opt_min_size),
	fsparam_u32   ("mode",		Opt_mode),
	fsparam_string("nr_inodes",	Opt_nr_inodes),
	fsparam_string("pagesize",	Opt_pagesize),
	fsparam_string("size",		Opt_size),
	fsparam_u32   ("uid",		Opt_uid),
	{}
};

static const struct fs_parameter_description hugetlb_fs_parameters = {
	.name		= "hugetlbfs",
	.specs		= hugetlb_param_specs,
};

For example, to mount a file system of type hugetlbfs on /mnt/huge, the mount command is:

  mount -t hugetlbfs \
	-o uid=<value>,gid=<value>,mode=<value>,pagesize=<value>,size=<value>,\
	min_size=<value>,nr_inodes=<value> none /mnt/huge

The kernel documentation Documentation/admin-guide/mm/hugetlbpage.rst describes the above options.
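The same mount can be requested from a program with the mount() system call, passing the options as a comma-separated string. The sketch below covers only the pagesize, size and min_size options, with values given in bytes. The helper names format_hugetlbfs_opts() and mount_hugetlbfs() are assumptions made for this illustration.

```c
#include <stdio.h>
#include <sys/mount.h>

/* Build the option string for a hugetlbfs mount. All values are in bytes.
 * Returns 0 on success, -1 if the buffer is too small. */
int format_hugetlbfs_opts(char *buf, size_t len, unsigned long pagesize,
                          unsigned long size, unsigned long min_size)
{
    int n = snprintf(buf, len, "pagesize=%lu,size=%lu,min_size=%lu",
                     pagesize, size, min_size);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Mount a hugetlbfs file system on 'dir' (requires super user rights).
 * Returns 0 on success, -1 on error. */
int mount_hugetlbfs(const char *dir, unsigned long pagesize,
                    unsigned long size, unsigned long min_size)
{
    char opts[128];

    if (format_hugetlbfs_opts(opts, sizeof(opts),
                              pagesize, size, min_size) != 0) {
        return -1;
    }
    return mount("none", dir, "hugetlbfs", 0, opts);
}
```

For example, mount_hugetlbfs("/mnt/huge", 65536, 131072, 65536) would request 64 KB huge pages with a maximum size of two pages and one page reserved at mount time.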

3 Example

The following program creates the /tmp/hpfs directory and mounts on it a hugetlbfs file system using 64 KB huge pages, with a maximum size of 2 huge pages and a minimum (reserved) size of 1 huge page. A file named memfile_01 is created in this file system and extended to the size of 2 huge pages. The file is then mapped into memory with the mmap() system call. The MAP_HUGETLB flag is not needed because the file descriptor of the opened hugetlbfs file is provided. Finally, the program calls pause() to suspend its execution so that some observations can be made.

#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <fcntl.h>


#define ERR(fmt, ...) do {                            \
    fprintf(stderr,                                   \
            "ERROR@%s#%d: "fmt,                       \
             __FUNCTION__, __LINE__, ## __VA_ARGS__); \
                         } while(0)


#define HP_SIZE   (64 * 1024)
#define HPFS_DIR  "/tmp/hpfs"
#define HPFS_SIZE (4 * HP_SIZE)


int main(void)
{
void *addr;
char  cmd[256];
int   status;
int   rc;
char  mount_opts[256];
int   fd;

  rc = mkdir(HPFS_DIR, 0777);
  if (0 != rc && EEXIST != errno) {
    ERR("mkdir(): %m (%d)\n", errno);
    return 1;
  }

  snprintf(mount_opts, sizeof(mount_opts), "pagesize=%d,size=%d,min_size=%d", HP_SIZE, 2*HP_SIZE, HP_SIZE);

  rc = mount("none", HPFS_DIR, "hugetlbfs", 0, mount_opts);
  if (0 != rc) {
    ERR("mount(): %m (%d)\n", errno);
    return 1;
  }

  fd = open(HPFS_DIR"/memfile_01", O_RDWR|O_CREAT, 0777);
  if (fd < 0) {
    ERR("open(%s): %m (%d)\n", "memfile_01", errno);
    return 1;
  }

  rc = ftruncate(fd, 2 * HP_SIZE);
  if (0 != rc) {
    ERR("ftruncate(): %m (%d)\n", errno);
    return 1;
  }

  addr = mmap(NULL, 2 * HP_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
  if (MAP_FAILED == addr) {
    ERR("mmap(): %m (%d)\n", errno);
    return 1;
  }

  // The mapping stays valid after the file descriptor is closed
  rc = close(fd);
  if (0 != rc) {
    ERR("close(%d): %m (%d)\n", fd, errno);
    return 1;
  }

  pause();

  return 0;

} // main

On the board running Raspberry Pi OS, we build and run the program:

$ gcc mount_tlbfs.c -o mount_tlbfs
$ ./mount_tlbfs 
ERROR@main#43: mount(): Operation not permitted (1)  # Super user rights required for the mount operation
$ sudo ./mount_tlbfs 
ERROR@main#43: mount(): Cannot allocate memory (12)  # The huge page pool must be configured
$ cat /sys/kernel/mm/hugepages/hugepages-64kB/nr_hugepages 
0
$ sudo sh -c "echo 8 > /sys/kernel/mm/hugepages/hugepages-64kB/nr_hugepages"
$ cat /sys/kernel/mm/hugepages/hugepages-64kB/nr_hugepages 
8
$ sudo ./mount_tlbfs 
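The pool can also be configured from a program by writing to the same sysfs file. The following sketch assumes the sysfs layout shown in section 1. The three helper names are hypothetical, and writing to nr_hugepages requires super user rights.

```c
#include <stdio.h>

/* Build the sysfs path of a pool counter, for example "nr_hugepages"
 * for the 64 kB pool. Returns 0 on success, -1 if 'buf' is too small. */
int hugepage_pool_path(char *buf, size_t len, unsigned long page_kb,
                       const char *counter)
{
    int n = snprintf(buf, len,
                     "/sys/kernel/mm/hugepages/hugepages-%lukB/%s",
                     page_kb, counter);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the integer stored in a counter file. Returns -1 on error. */
long read_pool_counter(const char *path)
{
    FILE *fp = fopen(path, "r");
    long value = -1;

    if (fp == NULL) {
        return -1;
    }
    if (fscanf(fp, "%ld", &value) != 1) {
        value = -1;
    }
    fclose(fp);
    return value;
}

/* Write a new value into a counter file. Returns 0 on success. */
int write_pool_counter(const char *path, long value)
{
    FILE *fp = fopen(path, "w");
    int ok;

    if (fp == NULL) {
        return -1;
    }
    ok = (fprintf(fp, "%ld\n", value) > 0);
    if (fclose(fp) != 0) {
        ok = 0;   /* the kernel may reject the value when flushing */
    }
    return ok ? 0 : -1;
}
```

Reading the counter back after the write is advisable, since the kernel may allocate fewer huge pages than requested when memory is fragmented.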

In another terminal, we can verify that the file system is mounted:

$ cat /proc/mounts
[...]
none /tmp/hpfs hugetlbfs rw,relatime,pagesize=64K,size=131072,min_size=65536 0 0
$ ls -l /tmp/hpfs/
total 0
-rwxr-xr-x 1 root root 131072 Nov 24 14:36 memfile_01
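The same check can be done from a program by scanning /proc/mounts. The sketch below assumes the six-field line format shown above. The function name find_hugetlbfs_mount() is illustrative only.

```c
#include <stdio.h>
#include <string.h>

/* Look for a hugetlbfs entry mounted on 'mount_point' in a stream with
 * the /proc/mounts format. On success, copy its option string into
 * 'opts' and return 0. Return -1 if not found or if 'opts' is too small. */
int find_hugetlbfs_mount(FILE *fp, const char *mount_point,
                         char *opts, size_t len)
{
    char line[1024];
    char dir[256], type[64], found[512];

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Fields: device, mount point, type, options, dump, pass */
        if (sscanf(line, "%*s %255s %63s %511s", dir, type, found) != 3) {
            continue;
        }
        if (strcmp(type, "hugetlbfs") == 0 &&
            strcmp(dir, mount_point) == 0) {
            if (strlen(found) >= len) {
                return -1;
            }
            strcpy(opts, found);
            return 0;
        }
    }
    return -1;
}
```

A caller would open /proc/mounts with fopen() and pass "/tmp/hpfs" as the mount point. The returned string can then be searched for the pagesize= option.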

The memory map of the process shows the huge page mapping (KernelPageSize and MMUPageSize are 64 kB and the ht flag is set in VmFlags):

$ pidof ./mount_tlbfs
1259
$ cat /proc/1259/smaps
cat: /proc/1259/smaps: Permission denied
$ sudo cat /proc/1259/smaps
2000000000-2000020000 rw-p 00000000 00:2d 23163       /tmp/hpfs/memfile_01
Size:                128 kB
KernelPageSize:       64 kB
MMUPageSize:          64 kB
Rss:                   0 kB
Pss:                   0 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:            0 kB
Anonymous:             0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
FilePmdMapped:         0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:            0
VmFlags: rd wr mr mw me de ht 
[...]

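In the above output, Private_Hugetlb is still 0 kB because the program has not yet accessed the mapping. As soon as the program writes into the huge pages, the lazy allocation mechanism of Linux triggers the actual allocation of the physical huge pages.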
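To observe this, the program could write into each huge page before calling pause(). The helper below is a minimal sketch of such a step. The name touch_pages() is hypothetical and the function is not part of the original program.

```c
#include <stddef.h>

/* Write one byte at the start of each page of the area so that every
 * page is faulted in. Returns the number of pages touched. */
size_t touch_pages(void *addr, size_t len, size_t page_size)
{
    volatile unsigned char *p = addr;
    size_t count = 0;
    size_t off;

    if (page_size == 0) {
        return 0;
    }
    for (off = 0; off < len; off += page_size) {
        p[off] = 1;   /* the write access triggers the page fault */
        count++;
    }
    return count;
}
```

In the example program, a call such as touch_pages(addr, 2 * HP_SIZE, HP_SIZE) placed just before pause() should make Private_Hugetlb rise to 128 kB in /proc/<pid>/smaps, assuming the two huge pages can be allocated from the pool.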

About the author

The author is a computer science engineer based in France.