Supporting CHERI capabilities in GCC and glibc

By Jonathan Corbet
September 26, 2022

The CHERI architecture is the product of a research program to extend common CPU architectures in a way that prevents many types of memory-related bugs (and vulnerabilities). At the 2022 GNU Tools Cauldron, Alex Coplan and Szabolcs Nagy described the work that has been done to bring GCC and the GNU C Library (glibc) to this architecture. CHERI is a fundamentally different approach to how memory is accessed, and supporting it properly is anything but a trivial task.

CHERI

Coplan started by describing CHERI as a research project, based at the University of Cambridge, that has been running for a decade or so. It introduces the concept of "capabilities", a term which has a specific meaning in this context. A capability, Coplan said, is an unforgeable token of authority to access a range of memory. Capabilities can be passed around, and they can be derived from other capabilities, but any such derivation can only narrow the access permissions that the new capability provides; permissions can never be expanded.

Lots of details can be found in this overview of the CHERI architecture. In short, a capability can be thought of as a special type of pointer that occupies 129 bits. The bottom 64 bits are a conventional virtual address, while the upper 64 bits (often referred to as the "provenance") describe the associated access permissions. They include a bitmask of allowed operations (read, write, execute) and the range of memory to which the capability applies. Together, those 128 bits can be used like a pointer to perform the allowed accesses on the allowed range of memory.

The 129th bit is stored elsewhere and managed by the CPU; the capability is only valid if that bit is set. There are CPU instructions to derive a new capability from an old one that will keep the bit set; any direct changes to the capability, instead, will cause the validity bit to be cleared. Other disallowed operations, such as trying to use a capability to write outside of the allowed range of memory, will also invalidate the capability. If the hardware works as designed, a capability provides the right to access a range of memory in certain ways — and nothing more.
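
The derivation rule can be illustrated with a small software model. This is only a sketch: real hardware packs the fields into 128 bits and keeps the validity bit out of band, and the names used here (cap_t, cap_derive(), cap_allows(), the PERM_* flags) are hypothetical rather than part of any CHERI API. The model simply marks a result invalid when a derivation asks for more than the parent grants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a capability; all names here are hypothetical. */
enum { PERM_READ = 1, PERM_WRITE = 2, PERM_EXEC = 4 };

typedef struct {
    uint64_t address;  /* conventional virtual address */
    uint64_t base;     /* lowest address that may be accessed */
    uint64_t limit;    /* one past the highest accessible address */
    unsigned perms;    /* bitmask of PERM_* values */
    bool     tag;      /* validity bit, owned by the "CPU" in this model */
} cap_t;

/* Derive a capability from a parent. Asking for wider bounds or for
 * extra permissions yields a result whose tag is clear. */
cap_t cap_derive(cap_t parent, uint64_t base, uint64_t limit, unsigned perms)
{
    cap_t child = { base, base, limit, perms, false };
    child.tag = parent.tag
        && base <= limit
        && base >= parent.base && limit <= parent.limit
        && (perms & ~parent.perms) == 0;
    return child;
}

/* Would an access of len bytes at cap.address, needing perm, be allowed? */
bool cap_allows(cap_t cap, uint64_t len, unsigned perm)
{
    return cap.tag
        && (cap.perms & perm) == perm
        && cap.address >= cap.base
        && cap.address <= cap.limit
        && len <= cap.limit - cap.address;
}

/* Usage: narrow a root capability, then try (and fail) to widen it again. */
void cap_model_demo(void)
{
    cap_t root = { 0, 0, UINT64_MAX, PERM_READ | PERM_WRITE | PERM_EXEC, true };
    cap_t buf  = cap_derive(root, 0x1000, 0x1100, PERM_READ | PERM_WRITE);

    assert(cap_allows(buf, 0x100, PERM_WRITE));    /* inside the bounds */
    assert(!cap_allows(buf, 0x101, PERM_WRITE));   /* one byte too far */
    assert(!cap_derive(buf, 0x0800, 0x1100, PERM_READ).tag);  /* widening */
}
```

In this model a failed derivation produces an untagged value rather than an error, which mirrors the article's description of the validity bit being cleared.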

When this system boots, the firmware provides the kernel with a capability allowing full access to all of memory. Every other capability used during the life of the system will be ultimately derived from this "root capability". In a well designed system, capabilities may go through several levels of derivation until they refer to narrowly constrained regions of memory.

Morello GCC

CHERI is meant to be implemented as an add-on to an existing architecture; the Morello project is adding CHERI capabilities to the ARMv8-A architecture. Prototype boards implementing this combination exist now. A mature LLVM-based toolchain already exists, but ARM wanted a GCC option as well, so the Morello GCC project is working to provide that option; this project is also porting binutils, GDB, and glibc.

There are two capability models, called "pure-cap" and "hybrid", that have been implemented; the former puts capabilities on all pointers, while the latter only uses them in parts of the system. The hybrid model allows mixing capability-aware code with non-aware code; a kernel port to the Morello architecture has been done using the hybrid mode.

Challenges are not in short supply when attempting a port of this nature. One of the first steps was to remap the intptr_t type (describing an integer that can hold a pointer value). The problem being solved there is, of course, that a normal long cannot hold capabilities; any attempt to use a pointer taken from a long value will trap at run time. Code that uses such types in that way will, thus, break on a CHERI system. Pointer comparison is a bit tricky; two capabilities pointing to the same location may have different access rights and thus different bit patterns, but a comparison should call them equal.
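
As an illustration of the intptr_t point, the portable pattern looks something like the following sketch. The function name is hypothetical; the code is plain standard C and behaves the same on a conventional system, where intptr_t and long usually have the same width. On a CHERI target, according to the talk, intptr_t is the type that can carry the whole capability, while a long holds only an address.

```c
#include <assert.h>
#include <stdint.h>

/* Stash a pointer in an integer type and recover it later.
 * intptr_t is the type meant for this job; replacing it with long
 * happens to work on common 64-bit systems, but (per the talk) the
 * recovered pointer would trap on use on a CHERI system. */
int *roundtrip_via_intptr(int *p)
{
    intptr_t bits = (intptr_t)p;

    /* Sanity check: the integer is at least as wide as the pointer. */
    assert(sizeof bits >= sizeof p);
    return (int *)bits;
}
```

Code that already uses intptr_t or uintptr_t for this kind of round trip should need no source changes; code that uses long or unsigned long is what has to be found and fixed.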

Despite these traps, Coplan said, most code will just work when compiled in the pure-cap mode. Low-level software, instead, can require a lot of changes, and code that plays around with pointers can be problematic. There is a sorting function in GCC, for example, that plays games with pointers, leading to errors from the compiler.

Teaching GCC about capabilities in general is a big challenge, requiring many changes throughout the code. They break two fundamental assumptions made within GCC: that pointers and integers are interchangeable, and that addresses and offsets are essentially the same thing. The compiler must ensure that all pointers have the correct provenance, that pointer comparisons are done correctly, and so on.

Work on GCC started by adding a -mfake-capability option that causes the compiler to always use capabilities in its internal representations, but to generate standard code from the back end as before. This allows the separation of capability concepts from the specifics of the Morello architecture. The developers worked to make everything work with this flag first, and only then set to the task of generating code that would use capabilities on real hardware.

One remaining problem is that the address-range bounds in the capability provenance are stored using a compressed, floating-point representation; otherwise that information would not fit in the allotted space. But, as a result, not all combinations of address, base, and limit can be represented. This problem mostly affects memory allocators, which must be able to operate over large ranges of memory, Coplan said.
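
To see why a floating-point-style encoding limits the representable bounds, consider this deliberately simplified model. It is not the actual Morello encoding: the 14-bit mantissa, the function name round_bounds(), and the rounding rule are all assumptions made for illustration. The idea is that the length must fit in a fixed number of bits after being shifted right by an exponent e, which forces both ends of the region to be multiples of 2^e.

```c
#include <assert.h>
#include <stdint.h>

#define MANTISSA_BITS 14   /* assumed width, for this toy model only */

typedef struct { uint64_t base, top; } bounds_t;

/* Return the smallest bounds, at least as large as the request, that
 * the toy encoding can hold. Small regions are exact; large ones get
 * their base rounded down and their top rounded up. */
bounds_t round_bounds(uint64_t base, uint64_t top)
{
    /* Keep addresses below 2^48 so the rounding cannot overflow. */
    assert(base <= top && top < ((uint64_t)1 << 48));

    bounds_t b = { base, top };
    for (unsigned e = 0; e < 48; e++) {
        uint64_t align = (uint64_t)1 << e;
        b.base = base & ~(align - 1);                /* round down */
        b.top  = (top + align - 1) & ~(align - 1);   /* round up */
        if (((b.top - b.base) >> e) < ((uint64_t)1 << MANTISSA_BITS))
            break;   /* the length now fits in the mantissa */
    }
    return b;
}
```

In this model, a 16-byte object at 0x1000 is represented exactly, while a 1MiB region starting at the odd address 0x1001 comes back as 0x1000 to 0x101080. An allocator handing out large blocks would have to align and pad them so that the rounded bounds do not spill over into a neighboring object.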

Porting glibc

Nagy then took over to talk about the work that has been done to port glibc to the Morello architecture. This work is currently being done in a separate branch, with no plans to submit it upstream until a cleaner patch set can be created. A lot of code changes to the library have been required; CHERI C is a different language, he said. Any code that manipulates pointers may need changes, for example.

More subtly, special tricks have to be done with basic functions like memcpy(). Any capabilities stored in the memory to be copied will not be valid at the destination unless special care is taken. Similar issues exist with pointers stored in shared memory or sent over sockets. syscall() must return an intptr_t type — an ABI change — to be able to successfully return pointers; similar changes are needed for other system calls as well.
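
The memcpy() problem can be sketched with a toy model of tagged memory. Everything here is an assumption made for illustration: the structure, the function names, and the rule that a byte-wise store clears the tag of the 16-byte granule it touches. The point is only that a copy routine has to move whole, aligned granules if capabilities stored in the buffer are to stay valid at the destination.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GRANULE  16   /* one tag per 16 bytes in this model */
#define GRANULES 8

typedef struct {
    unsigned char bytes[GRANULE * GRANULES];
    bool          tags[GRANULES];   /* true: granule holds a valid capability */
} tagged_mem_t;

/* Byte-wise copy: the data arrives, but every destination granule
 * that was written loses its tag. */
void copy_bytes(tagged_mem_t *m, size_t dst, size_t src, size_t len)
{
    assert(dst + len <= sizeof m->bytes && src + len <= sizeof m->bytes);
    if (len == 0)
        return;
    memmove(m->bytes + dst, m->bytes + src, len);
    for (size_t g = dst / GRANULE; g <= (dst + len - 1) / GRANULE; g++)
        m->tags[g] = false;
}

/* Capability-aware copy for non-overlapping regions: aligned, whole
 * granules carry their tags along; anything else falls back to bytes. */
void copy_caps(tagged_mem_t *m, size_t dst, size_t src, size_t len)
{
    if (dst % GRANULE || src % GRANULE || len % GRANULE) {
        copy_bytes(m, dst, src, len);
        return;
    }
    assert(dst + len <= sizeof m->bytes && src + len <= sizeof m->bytes);
    for (size_t off = 0; off < len; off += GRANULE) {
        memmove(m->bytes + dst + off, m->bytes + src + off, GRANULE);
        m->tags[(dst + off) / GRANULE] = m->tags[(src + off) / GRANULE];
    }
}
```

Copying a tagged granule with copy_bytes() leaves identical bytes but a cleared tag at the destination; copy_caps() preserves both, as long as source, destination, and length are all granule-aligned.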

Then there is the issue of capability derivation. When the kernel launches a process, that process is provided with a capability for the entirety of user space. It is possible to make things work with this capability, of course, and that is the first step, but it does not really take advantage of the capability mechanism. So the next step is to start narrowing capabilities in specific situations where access to all of memory is not needed.

A trickier problem is the dynamic linker, which does a lot of manipulation of 64-bit addresses. It is working with ELF binaries, though, so most of its work takes the form of "base plus offset" calculations. If the base value is a properly formed capability, things will work correctly. As the initial problems are overcome, the next phase is to start separating capabilities that allow writing from those that allow execution; that will require a lot more care to be taken in pointer derivation, he said.

Then there is the question of malloc(). Ideally, this function would return a pointer that can access the allocated object (and nothing else). In the first phase, this sort of narrowing wasn't done; the focus was on getting something working. The second phase then narrows the bounds of the returned pointer to just the object in question. That creates a problem for free(), though, which must be able to access metadata stored outside of the object itself; this is handled by maintaining a special capability for use by free(). There are various other challenges to making all of this work; he would really design the malloc() interface differently for CHERI, he said.
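
The malloc()/free() tension can be sketched as follows. This is a toy bump allocator, not glibc's implementation; the names (toy_malloc(), toy_free(), arena_cap) and the 16-byte header are hypothetical. The capability handed to the caller covers only the object, so the allocator has to fall back on a wider capability that it keeps for itself in order to read its own metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE 1024
#define HEADER     16    /* allocator metadata placed before each object */

/* Toy capability: bounds are offsets into the arena. */
typedef struct { size_t base, limit; bool tag; } cap_t;

static unsigned char arena[ARENA_SIZE];
static size_t next_free;
static const cap_t arena_cap = { 0, ARENA_SIZE, true };  /* allocator-private */

bool cap_covers(cap_t c, size_t off, size_t len)
{
    return c.tag && off >= c.base && off <= c.limit && len <= c.limit - off;
}

/* Allocate size bytes; the result is narrowed to exactly the object. */
cap_t toy_malloc(size_t size)
{
    cap_t none = { 0, 0, false };
    size_t room = ARENA_SIZE - next_free;

    if (size == 0 || size > ARENA_SIZE
        || room < HEADER + ((size + 15) & ~(size_t)15))
        return none;

    size_t obj = next_free + HEADER;
    memcpy(arena + next_free, &size, sizeof size);   /* record size in header */
    next_free = obj + ((size + 15) & ~(size_t)15);
    return (cap_t){ obj, obj + size, true };
}

/* Look up the metadata for an allocation and return its recorded size.
 * (A real free() would go on to recycle the block.) */
size_t toy_free(cap_t user)
{
    assert(user.tag && user.base >= HEADER);
    size_t hdr = user.base - HEADER;

    /* The caller's capability cannot reach the header ... */
    assert(!cap_covers(user, hdr, HEADER));
    /* ... so the allocator uses its own, wider capability instead. */
    assert(cap_covers(arena_cap, hdr, HEADER));

    size_t size;
    memcpy(&size, arena + hdr, sizeof size);
    return size;
}
```

In this sketch, toy_malloc(24) returns bounds that are exactly 24 bytes long, and toy_free() can still recover the size from the header because it never relies on the caller's capability to do so.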

Currently the test suites run, he said, but only if stack bounds are not used in GCC. There are a number of missing features, including profiling, support for an executable stack, support for LD_AUDIT, and support for pointers stored in shared memory or sent via a file descriptor. Those latter cases may never be able to work, he concluded.

[Thanks to LWN subscribers for supporting my travel to this event].

Supporting CHERI capabilities in GCC and glibc

Posted Sep 26, 2022 21:24 UTC (Mon) by khim (subscriber, #9252)

Everything new is something well-forgotten. The whole thing is, basically, a duplicate of the E2k approach to security (minus VLIW).

I wonder how different our world would have been if these designs had actually been included in upstream glibc a decade (or more?) ago, when E2k first got its Linux port.

Supporting CHERI capabilities in GCC and glibc

Posted Sep 27, 2022 0:34 UTC (Tue) by wahern (subscriber, #37304)

The IBM i series mainframes have supported tagged 128-bit pointers since at least the 1980s. The system uses a single, flat address space, so it relies on tagged pointers rather than separate virtual address spaces for hardware-enforced program isolation. (Some earlier systems from the 1970s may have used a similar architecture, but not 128 bits wide. The IBM world is so opaque....)

AFAIU, these sorts of environments weren't uncommon; there were perhaps just too many of them. The rise of C and C++, as well as the i386, Unix, and Windows seems to have pushed the industry toward conformity. Perhaps too much conformity. The C standard, at least, was careful to leave open a window for tagged pointers. I think CHERI may have come just in the nick of time as the pressures to close that window were mounting and already making headway. Newer languages like Rust simply adopted and formalized the modern conceit (already baked into GCC and especially LLVM) that addresses and integers were interchangeable.

Supporting CHERI capabilities in GCC and glibc

Posted Sep 27, 2022 8:46 UTC (Tue) by matthias (subscriber, #94967)

I do not think that there is much of a difference with respect to the support of tagged pointers in C(++) and Rust. With safe Rust, we have a language that does not have this problem. You can convert a reference to an integer and you can convert an integer to a raw pointer. But there is no way at all in safe Rust to dereference a raw pointer. Thus in safe Rust, you will never need the capabilities once you have converted a reference to an integer. The problems are in low-level code that does funny things with pointers. This code has to be marked as unsafe. And this kind of code will not work in either Rust or C(++) on CHERI.

There might be a problem with code that assumes that usize and references always have the same size. This depends on which size usize will have on CHERI. As usize is usually used to hold offsets into arrays/slices etc., 64 bits should be sufficient. But right now, usize is defined to be the size of a reference, which needs to be 128 bits on CHERI.

Supporting CHERI capabilities in GCC and glibc

Posted Sep 27, 2022 14:51 UTC (Tue) by jhoblitt (subscriber, #77733)

CHERI is unnecessary if you are able to assume that:

1) malicious intent doesn't exist.
2) memory corruption is impossible.
3) compilers are infallible.

Supporting CHERI capabilities in GCC and glibc

Posted Sep 27, 2022 10:20 UTC (Tue) by khim (subscriber, #9252)

> AFAIU, these sorts of environments weren't uncommon; there were perhaps just too many of them.

Sure, but how many of these had GCC and glibc ports? My point was that E2k not only supported these tagged pointers, it also had native ports of Linux, GCC, and glibc.

I don't think they ever distributed sources for these changes, but they definitely supported not just tagged pointers, but the whole thing.

Supporting CHERI capabilities in GCC and glibc

Posted Sep 26, 2022 22:20 UTC (Mon) by jrtc27 (subscriber, #107748)

Note that much of this is following work we have done in CheriBSD's (https://www.cheribsd.org) CheriABI (https://doi.org/10.1145/3297858.3304042), which has a full pure-capability ("purecap" at Cambridge, but often "pure-cap" still within Arm) kernel and userspace based on FreeBSD. Some OS-specific differences exist, but there's more in common than not given both are POSIX/Unix-like OSes.

We've also had a limited CHERI GDB for quite a few years now (https://github.com/CTSRD-CHERI/gdb), first for CHERI-MIPS then later for CHERI-RISC-V; ours was just rather barebones and Arm went and worked on the internals a bunch to fill in various known issues to make capabilities more of a first-class citizen (and our upcoming GDB 12-based branch is rebased onto that work).

Supporting CHERI capabilities in GCC and glibc

Posted Sep 27, 2022 11:48 UTC (Tue) by ms-tg (subscriber, #89231)

Question: Does the Rust “Strict Provenance Experiment” have a path to generating code for real-world CHERI architectures? Or is that further in the future as these ideas are explored?

https://github.com/rust-lang/rust/issues/95228

Supporting CHERI capabilities in GCC and glibc

Posted Sep 27, 2022 16:39 UTC (Tue) by brooks (guest, #161178)

There is work in progress on a Rust port to CHERI targeting Morello at the University of Kent. One interesting point is that Rust comes with some dynamic checking overheads (the slides below give a very rough number of 10%); some of these may be eliminated by using capabilities. Here's a set of slides from the recent CHERITech 22 event:

https://soft-dev.org/events/cheritech22/slides/Cooksey.pdf

Where is elsewhere

Posted Sep 28, 2022 8:18 UTC (Wed) by eru (subscriber, #2753)

Re "The 129th bit is stored elsewhere and managed by the CPU": How does this work? Is part of the memory dedicated to these 129th bits?

Where is elsewhere

Posted Sep 28, 2022 9:36 UTC (Wed) by matthias (subscriber, #94967)

In CHERI, this is not specified as CHERI is independent of the architecture. In Morello they seem to test two different implementations, as can be seen here[1]:
> How will Morello store CHERI's tag bits?
> Morello supports two different implementations of physical memory tagging, to allow their properties to be compared experimentally. In one configuration, ECC bits are used to hold memory tags. In the other, a tag controller and tag cache are used to hold memory tags (see our ICCD 2017 paper [2] on efficient memory tagging).

[1] https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/ch...
[2] https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/201...

Where is elsewhere

Posted Sep 28, 2022 9:37 UTC (Wed) by farnz (subscriber, #17727)

Yes. The CPU has 1 bit of memory for every 16 bytes of usable memory that's only used for tags, and is inaccessible to code running on a Morello CPU; this means that 1/128th of your RAM is permanently inaccessible just in case its associated 128-bit block is used to store a pointer.

Pointers must be stored 128-bit aligned in Morello, so there's no need for more complex tagging as long as the overhead is acceptable.
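
The 1/128 figure follows directly from one tag bit per 16 bytes (128 bits). As a quick check of the arithmetic, assuming nothing beyond that ratio (the function name is made up for this sketch):

```c
#include <assert.h>
#include <stdint.h>

#define TAG_GRANULE_BYTES 16u   /* one tag bit per 16 bytes of memory */

/* Bytes of tag storage needed to cover ram_bytes of tagged memory. */
uint64_t tag_storage_bytes(uint64_t ram_bytes)
{
    /* Keep the toy arithmetic exact: whole bytes of tags only. */
    assert(ram_bytes % (TAG_GRANULE_BYTES * 8u) == 0);

    uint64_t tag_bits = ram_bytes / TAG_GRANULE_BYTES;
    return tag_bits / 8u;
}
```

For 16GiB of RAM this gives 128MiB of tag storage, which is 1/128th of the total.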