
Linux Manual Translation Guide: Understanding mmap(2) and Its munmap() Method

最编程 2024-07-24 11:18:22
...

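munmap() removes the mappings for the specified address range (addr, addr+len); any later access to an address in that range is an invalid memory reference. The region is also unmapped automatically when the process terminates. Closing fd, by contrast, does not unmap the region.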

The start address addr must be a multiple of the page size, but len need not be. All pages containing a part of the indicated range are unmapped, and subsequent references to these pages will generate SIGSEGV. It is not an error if the indicated range does not contain any mapped pages.

Note that munmap() does not only remove mappings created by mmap(); it removes any mapping that lies in the given range.

\color{#A00000}{RETURN VALUE}
On success, mmap() returns a pointer to the mapped area. On error, the value MAP_FAILED (that is, (void *) -1) is returned, and errno is set to indicate the error.

On success, munmap() returns 0. On failure, it returns -1, and errno is set to indicate the error (probably to EINVAL).

\color{#A00000}{ERRORS}

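  • EACCES
    A file descriptor refers to a non-regular file. Or a file mapping was requested, but fd is not open for reading. Or MAP_SHARED was requested and PROT_WRITE is set, but fd is not open in read/write (O_RDWR) mode. Or PROT_WRITE is set, but the file is append-only.

  • EAGAIN
    The file has been locked, or too much memory has been locked (see setrlimit(2)).

  • EBADF
    fd is not a valid file descriptor (and MAP_ANONYMOUS was not set).

  • EEXIST
    MAP_FIXED_NOREPLACE was specified in flags, and the range covered by addr and length clashes with an existing mapping.

  • EINVAL
    addr, length, or offset is not acceptable (for example, a value is too large, or not aligned on a page boundary).

  • EINVAL
    length == 0

  • EINVAL
    flags contained none of MAP_PRIVATE, MAP_SHARED, or MAP_SHARED_VALIDATE.

  • ENFILE
    The system-wide limit on the total number of open files has been reached.

  • ENODEV
    The underlying filesystem of the specified file does not support memory mapping.

  • ENOMEM
    No memory is available.

  • ENOMEM
    The process's maximum number of mappings would have been exceeded. This error can also occur for munmap(), when unmapping a region in the middle of an existing mapping, since this results in two smaller mappings on either side of the region being unmapped.

  • ENOMEM
    The process's RLIMIT_DATA limit, described in getrlimit(2), would have been exceeded.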

  • EOVERFLOW
    On 32-bit architecture together with the large file extension (i.e., using 64-bit off_t): the number of pages used for length plus number of pages used for offset would overflow unsigned long (32 bits).

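  • EPERM
    The prot argument asks for PROT_EXEC, but the mapped area belongs to a file on a filesystem that was mounted no-exec.

  • EPERM
    The operation was prevented by a file seal; see fcntl(2).

  • EPERM
    The MAP_HUGETLB flag was specified, but the caller was not privileged (did not have the CAP_IPC_LOCK capability) and is not a member of the sysctl_hugetlb_shm_group group; see the description of /proc/sys/vm/sysctl_hugetlb_shm_group.

  • ETXTBSY
    MAP_DENYWRITE was set, but the object specified by fd is open for writing.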

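Use of a mapped region can result in these signals:

  • SIGSEGV
    Attempt to write into a region mapped as read-only.
  • SIGBUS
    Attempt to access a page of the buffer that lies beyond the end of the mapped file. For an explanation of the treatment of the bytes in the page that corresponds to the end of a mapped file that is not a multiple of the page size, see NOTES.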

\color{#A00000}{VERSIONS}
\color{#A00000}{ATTRIBUTES}

Interface           Attribute        Value
mmap(), munmap()    Thread safety    MT-Safe

\color{#A00000}{CONFORMING TO}

\color{#A00000}{NOTES}
Memory mapped with mmap() is preserved across fork(2), with the same attributes.

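A file is mapped in multiples of the page size. For a file that is not a multiple of the page size, the remaining bytes in the partial page at the end of the mapping are zeroed when mapped, and modifications to that region are not written out to the file. The effect of changing the size of the underlying file after it has been mapped is unspecified; it presumably depends on whether the affected pages have already been loaded into the page cache.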
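
The zero-filled tail can be observed on a freshly created file. This is a sketch under stated assumptions: tail_is_zero_filled() is a hypothetical helper, the scratch file under /tmp is assumed to be creatable, and the file is new, so no earlier mapping has modified its last page.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if every byte after the 3-byte file
   content reads as zero, 0 if not, and -1 if the setup failed. */
int tail_is_zero_filled(void)
{
    char path[] = "/tmp/mmap-tail-XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                       /* the file lives on until close() */

    long page = sysconf(_SC_PAGESIZE);
    int result = -1;
    if (write(fd, "abc", 3) == 3) {
        /* Only 3 bytes are requested, but a whole page is mapped. */
        char *p = mmap(NULL, 3, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            result = (memcmp(p, "abc", 3) == 0);
            for (long i = 3; i < page; i++)
                if (p[i] != 0)
                    result = 0;
            munmap(p, 3);
        }
    }
    close(fd);
    return result;
}
```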

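On some hardware architectures (e.g., i386), PROT_WRITE implies PROT_READ. Whether PROT_READ implies PROT_EXEC is architecture dependent. Portable programs should always set PROT_EXEC if they intend to execute code in the new mapping.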

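The portable way to create a mapping is to specify addr as 0 (NULL), and omit MAP_FIXED from flags. In this case, the system chooses the address for the mapping; the address is chosen so as not to conflict with any existing mapping, and will not be 0. If the MAP_FIXED flag is specified, and addr is 0 (NULL), then the mapped address will be 0 (NULL).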

Certain flags constants are defined only if suitable feature test macros are defined (possibly by default): _DEFAULT_SOURCE with glibc 2.19 or later; or _BSD_SOURCE or _SVID_SOURCE in glibc 2.19 and earlier. (Employing _GNU_SOURCE also suffices, and requiring that macro specifically would have been more logical, since these flags are all Linux-specific.) The relevant flags are: MAP_32BIT, MAP_ANONYMOUS (and the synonym MAP_ANON), MAP_DENYWRITE, MAP_EXECUTABLE, MAP_FILE, MAP_GROWSDOWN, MAP_HUGETLB, MAP_LOCKED, MAP_NONBLOCK, MAP_NORESERVE, MAP_POPULATE, and MAP_STACK.

An application can determine which pages of a mapping are currently resident in the buffer/page cache using mincore(2).

Using MAP_FIXED safely

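The only safe use for MAP_FIXED is where the address range specified by addr and length was previously reserved using another mapping; otherwise, the use of MAP_FIXED is hazardous because it forcibly removes preexisting mappings, making it easy for a multithreaded process to corrupt its own address space.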

For example, suppose that thread A looks through /proc/<pid>/maps in order to locate an unused address range that it can map using MAP_FIXED, while thread B simultaneously acquires part or all of that same address range. When thread A subsequently employs mmap(MAP_FIXED), it will effectively clobber the mapping that thread B created. In this scenario, thread B need not create a mapping directly; simply making a library call that, internally, uses dlopen(3) to load some other shared library, will suffice. The dlopen(3) call will map the library into the process's address space. Furthermore, almost any library call may be implemented in a way that adds memory mappings to the address space, either with this technique, or by simply allocating memory. Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries ⟨http://www.linux-pam.org⟩.

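Since Linux 4.17, a multithreaded program can use the MAP_FIXED_NOREPLACE flag to avoid the hazard described above when attempting to create a mapping at a fixed address that has not been reserved by a preexisting mapping.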

Timestamp changes for file-backed mappings

For file-backed mappings, the st_atime field for the mapped file may be updated at any time between the mmap() and the corresponding unmapping; the first reference to a mapped page will update the field if it has not been already.

The st_ctime and st_mtime fields for a file mapped with PROT_WRITE and MAP_SHARED will be updated after a write to the mapped region, and before a subsequent msync(2) with the MS_SYNC or MS_ASYNC flag, if one occurs.

Huge page (Huge TLB) mappings

For mappings that employ huge pages, the requirements for the arguments of mmap() and munmap() differ somewhat from the requirements for mappings that use the native system page size.

For mmap(), offset must be a multiple of the underlying huge page size. The system automatically aligns length to be a multiple of the underlying huge page size.

For munmap(), addr and length must both be a multiple of the underlying huge page size.
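
Since munmap() does not round length for huge page mappings, a caller has to do the rounding itself. The arithmetic can be sketched as follows, assuming a 2 MiB huge page size (typical on x86-64, but only an assumption here; the real value depends on the system). The macro and function names are hypothetical.

```c
#include <stddef.h>

/* Assumption for illustration: huge pages are 2 MiB. */
#define ASSUMED_HUGE_PAGE_SIZE ((size_t)2 * 1024 * 1024)

/* Rounds length up to the next multiple of the assumed huge page size.
   Overflow for lengths close to SIZE_MAX is not handled. */
size_t round_up_to_huge_page(size_t length)
{
    return (length + ASSUMED_HUGE_PAGE_SIZE - 1)
           / ASSUMED_HUGE_PAGE_SIZE * ASSUMED_HUGE_PAGE_SIZE;
}

/* Returns 1 if value is a multiple of the assumed huge page size. */
int is_huge_page_aligned(size_t value)
{
    return value % ASSUMED_HUGE_PAGE_SIZE == 0;
}
```

A length of 1 rounds up to 2097152, and 2097153 rounds up to 4194304; a value such as 4096 is aligned to the native page size but not to the huge page size.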

C library/kernel differences

This page describes the interface provided by the glibc mmap() wrapper function. Originally, this function invoked a system call of the same name. Since kernel 2.4, that system call has been superseded by mmap2(2), and nowadays the glibc mmap() wrapper function invokes mmap2(2) with a suitably adjusted value for offset.
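
The kind of offset adjustment involved can be sketched as follows. This is not glibc's actual code; it only assumes, as documented in mmap2(2), that the system call takes its offset in 4096-byte units. The helper name and the MMAP2_UNIT macro are hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: mmap2(2) counts the file offset in 4096-byte units. */
#define MMAP2_UNIT 4096u

/* Hypothetical helper: converts a byte offset into mmap2() units.
   Returns 0 and stores the result in *units, or returns -1 with errno
   set to EINVAL (offset not a unit multiple) or EOVERFLOW (result does
   not fit in an unsigned long). */
int offset_to_mmap2_units(uint64_t offset, unsigned long *units)
{
    if (offset % MMAP2_UNIT != 0) {
        errno = EINVAL;
        return -1;
    }
    uint64_t n = offset / MMAP2_UNIT;
    if (n > ULONG_MAX) {                /* possible only with a 32-bit long */
        errno = EOVERFLOW;
        return -1;
    }
    *units = (unsigned long)n;
    return 0;
}
```

For example, a byte offset of 8192 corresponds to 2 units, while an offset of 100 is rejected with EINVAL.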

\color{#A00000}{BUGS}
On Linux, there are no guarantees like those suggested above under MAP_NORESERVE. By default, any process can be killed at any moment when the system runs out of memory.

In kernels before 2.6.7, the MAP_POPULATE flag has effect only if prot is specified as PROT_NONE.

SUSv3 specifies that mmap() should fail if length is 0. However, in kernels before 2.6.12, mmap() succeeded in this case: no mapping was created and the call returned addr. Since kernel 2.6.12, mmap() fails with the error EINVAL for this case.

POSIX specifies that the system shall always zero fill any partial page at the end of the object and that the system will never write any modification of the object beyond its end. On Linux, when you write data to such a partial page after the end of the object, the data stays in the page cache even after the file is closed and unmapped; even though the data is never written to the file itself, subsequent mappings may see the modified content. In some cases, this could be fixed by calling msync(2) before the unmap takes place; however, this doesn't work on tmpfs(5) (for example, when using the POSIX shared memory interface documented in shm_overview(7)).

\color{#A00000}{EXAMPLES}
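The following program prints part of the file specified in its first command-line argument to standard output. The range of bytes to be printed is specified via offset and length values in the second and third command-line arguments. The program creates a memory mapping of the required pages of the file and then uses write(2) to output the desired bytes.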

Program source
       #include <sys/mman.h>
       #include <sys/stat.h>
       #include <fcntl.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>

       #define handle_error(msg) \
           do { perror(msg); exit(EXIT_FAILURE); } while (0)

       int
       main(int argc, char *argv[])
       {
           char *addr;
           int fd;
           struct stat sb;
           off_t offset, pa_offset;
           size_t length;
           ssize_t s;

           if (argc < 3 || argc > 4) {
               fprintf(stderr, "%s file offset [length]\n", argv[0]);
               exit(EXIT_FAILURE);
           }

           fd = open(argv[1], O_RDONLY);
           if (fd == -1)
               handle_error("open");

           if (fstat(fd, &sb) == -1)           /* To obtain file size */
               handle_error("fstat");

           offset = atoi(argv[2]);
           pa_offset = offset & ~(sysconf(_SC_PAGE_SIZE) - 1);
               /* offset for mmap() must be page aligned */

           if (offset >= sb.st_size) {
               fprintf(stderr, "offset is past end of file\n");
               exit(EXIT_FAILURE);
           }

           if (argc == 4) {
               length = atoi(argv[3]);
               if (offset + length > sb.st_size)
                   length = sb.st_size - offset;
                       /* Can't display bytes past end of file */

           } else {    /* No length arg ==> display to end of file */
               length = sb.st_size - offset;
           }

           addr = mmap(NULL, length + offset - pa_offset, PROT_READ,
                       MAP_PRIVATE, fd, pa_offset);
           if (addr == MAP_FAILED)
               handle_error("mmap");

           s = write(STDOUT_FILENO, addr + offset - pa_offset, length);
           if (s != length) {
               if (s == -1)
                   handle_error("write");

               fprintf(stderr, "partial write\n");
               exit(EXIT_FAILURE);
           }

           munmap(addr, length + offset - pa_offset);
           close(fd);

           exit(EXIT_SUCCESS);
       }
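
The offset arithmetic in the program above can be isolated into a small sketch. The map_window type and window_for() function are hypothetical names; page_size is assumed to be a power of two, as page sizes are in practice.

```c
#include <stddef.h>

/* Hypothetical result type: where to start the mapping, how many bytes
   to map, and how many leading bytes of the mapping to skip. */
typedef struct {
    long pa_offset;      /* offset rounded down to a page boundary */
    size_t map_length;   /* length to pass to mmap() and munmap() */
    size_t skip;         /* bytes between pa_offset and offset */
} map_window;

/* Mirrors the pa_offset computation of the example program.
   page_size must be a power of two. */
map_window window_for(long offset, size_t length, long page_size)
{
    map_window w;
    w.pa_offset = offset & ~(page_size - 1);
    w.skip = (size_t)(offset - w.pa_offset);
    w.map_length = length + w.skip;
    return w;
}
```

For offset 5000, length 100, and a 4096-byte page, the mapping starts at file offset 4096, covers 1004 bytes, and the requested data begins 904 bytes into the mapping.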