Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreigtstt42srtauks5dpmcom2sett5dirm22iw5eovb2ok2hfwjy4c4",
    "uri": "at://did:plc:ivbknywyskln22er3nkssdhl/app.bsky.feed.post/3mj4wv7uvn452"
  },
  "path": "/t/pre-rfc-btf-relocations/24161#post_1",
  "publishedAt": "2026-04-09T22:21:07.000Z",
  "site": "https://internals.rust-lang.org",
  "tags": [
    "BTF, the BPF Type Format",
    "Aya",
    "libbpf",
    "Compile Once, Run Everywhere (CO-RE) relocations",
    "bpf-linker",
    "@llvm.preserve.array.access.index",
    "@llvm.preserve.struct.access.index",
    "@llvm.preserve.union.access.index"
  ],
  "textContent": "## Summary\n\nAdd a `#[btf_preserve_access]` attribute for aggregate types. On builds targeting BPF targets (`bpfel-unknown-none`, `bpfeb-unknown-none`) that emit BTF (`-C debuginfo=2`), field and array-element accesses through values of attributed types emit BTF relocations, allowing eBPF loaders to relocate those accesses against the target Linux kernel's BTF data at load time.\n\n## Motivation\n\nBTF, the BPF Type Format, is the type metadata format used by the Linux kernel and by eBPF tooling. eBPF loaders such as Aya and libbpf use BTF for Compile Once, Run Everywhere (CO-RE) relocations: the compiled program records which field or array element it intended to access, and the loader rewrites the bytecode to match the layout of the kernel it is about to run on.\n\nclang and GCC are capable of emitting such relocations.\n\nRust can already target eBPF, but it does not currently have a way to emit these BTF access relocations. In practice that means Rust eBPF programs often have to pick one of three inconvenient options:\n\n  * Vendor exact kernel type definitions and rebuild for each supported kernel layout.\n  * Avoid typed field access and manually encode offsets, sacrificing readability and maintainability.\n  * Write a module in C solely for accessing kernel types and use build.rs to link it to the Rust project.\n\n\n\nThe goal of this RFC is to give Rust codegen enough information to emit the same relocation-friendly IR that clang and GCC emit today, without introducing a new source-language model for kernel types. Rust users should be able to define the subset of kernel types they care about, mark those definitions as relocation-aware, then write normal field and indexing code against them.\n\n## Guide-level explanation\n\nSuppose an eBPF program needs to read the `pid` and `tgid` fields from Linux's `task_struct`. 
The exact layout of `task_struct` varies across kernel versions, so using the full type definition or hard-coding byte offsets is not a robust option.\n\nAfter enabling the feature:\n\n\n    #![feature(btf_preserve_access)]\n\n\nthe program can declare only the part of the kernel type it intends to use:\n\n\n    #[btf_preserve_access]\n    #[expect(non_camel_case_types, reason = \"Linux kernel type\")]\n    struct task_struct {\n        pid: i32,\n        tgid: i32,\n    }\n\n\nThe important part is the Rust type's shape and names. For CO-RE to work, the field names and nesting must correspond to the kernel BTF that the loader will relocate against.\n\nOrdinary Rust field accesses then become relocation-aware:\n\n\n    fn process_task(task: &task_struct) {\n        let pid = task.pid;\n        let tgid = task.tgid;\n    }\n\n\nPattern matching over fields is also supported because it lowers to the same field projections:\n\n\n    fn process_task(task: &task_struct) {\n        let task_struct { pid, tgid } = task;\n    }\n\n\nLikewise, fixed-size arrays nested within an attributed type can use indexed accesses. For example, networking programs often work with [`net_device`][net-device], which contains `name[IFNAMSIZ]`:\n\n\n    #[btf_preserve_access]\n    #[expect(non_camel_case_types, reason = \"Linux kernel type\")]\n    struct net_device {\n        name: [u8; 16],\n    }\n\n    fn process_dev(dev: &net_device) {\n        let first_char = dev.name[0];\n    }\n\n\nWhen compiling for a BPF target with BTF emission enabled, rustc records those accesses in the generated IR using BTF-preserving intrinsics. 
The eBPF loader then patches the bytecode so that the program uses the offsets and indices appropriate for the target kernel.\n\nOutside BPF/BTF-producing builds, the attribute has no effect.\n\n## Reference-level explanation\n\n### Syntax\n\nThis RFC introduces an unstable built-in attribute:\n\n\n    #[btf_preserve_access]\n\n\nThe attribute is permitted on `struct` and `union` items. It is guarded by the `btf_preserve_access` feature gate.\n\n### Codegen\n\nLLVM provides dedicated intrinsics for emitting BTF relocations:\n\n  * `@llvm.preserve.array.access.index` for index projections from an array.\n  * `@llvm.preserve.struct.access.index` for field projections from a struct.\n  * `@llvm.preserve.union.access.index` for field projections from a union.\n\n\n\n`IRBuilder` provides methods that language frontends can use for creating the intrinsic calls:\n\n  * `CreatePreserveArrayAccessIndex`\n  * `CreatePreserveStructAccessIndex`\n  * `CreatePreserveUnionAccessIndex`\n\n\n\nThe implementation strategy is to expose corresponding hooks on the codegen backend abstraction:\n\n  * `btf_preserve_array_access_index`\n  * `btf_preserve_struct_access_index`\n  * `btf_preserve_union_access_index`\n\n\n\nThe LLVM backend lowers these directly to the corresponding `IRBuilder` methods.\n\nGCC's BPF backend also supports this relocation model through the `preserve_access_index` type attribute, which GCC documents as being equivalent to wrapping each access in the `__builtin_preserve_access_index` built-in function.\n\nThe binary object keeps the relocations in the `.BTF.ext` ELF section.\n\nBackends that do not provide an equivalent preserve-access mechanism still lower the hooks to ordinary projections.\n\n### Scope of preserved accesses\n\nThe attribute affects projections whose base type is marked with `#[btf_preserve_access]`, as well as projections continuing from a value obtained through such an access. 
In practice this means:\n\n  * `task.pid` preserves access information because `task_struct` is annotated.\n  * `task.values[i]` preserves both the field projection and the array index when `values` is a fixed-size array.\n\n\n\n### MIR and compiler representation\n\nNo new Rust syntax beyond the attribute is needed, but rustc must carry the fact that a projection originates from an attributed aggregate far enough into codegen to choose the preserving intrinsics instead of ordinary GEP-like lowering.\n\nOne implementation strategy is:\n\n  * Parse `#[btf_preserve_access]` as a built-in type attribute.\n  * Record it in MIR and type metadata.\n  * When lowering `PlaceRef` field and index projections in codegen, inspect the base aggregate type and dispatch to the preserving backend hooks.\n\n\n\n### Target and backend interactions\n\nThe attribute is intended for BPF targets, where BTF relocation emission is meaningful. On other targets, rustc accepts the attribute under the feature gate but emits ordinary accesses.\n\nAlternative backends are not required to implement relocation emission for this RFC to be useful. The backend abstraction allows LLVM and GCC to provide the full feature immediately while other backends remain semantically correct. A backend that later gains BPF+BTF support can implement the same hooks without changing the Rust syntax.\n\nIn particular, a GCC backend implementation does not need a Rust-specific relocation design. It can reuse GCC's existing BPF CO-RE support while sharing the same Rust frontend attribute and MIR/codegen mechanism as the LLVM backend.\n\n## Drawbacks\n\nThis is a niche feature aimed at one target family and one ecosystem workflow. 
Yet it adds a BTF-specific concept to MIR and to all of the code that handles field and index projections.\n\n## Rationale and alternatives\n\nThe main alternative is to emit BTF relocations in bpf-linker, which is a bitcode linker used exclusively for BPF targets.\n\nThat approach has the following disadvantages:\n\n  * By link time, the compiler has already lowered field access into offset-based operations. Reconstructing the original typed access path requires bpf-linker to traverse the IR.\n  * It prevents us from supporting ld-style linkers (e.g. binutils ld, lld) for BPF targets.\n\n\n\n## Prior art\n\nclang and GCC support this feature through:\n\n  * `__attribute__((preserve_access_index))`, which can be applied to a type, e.g.\n\n\n\n\n    struct task_struct {\n    \tpid_t pid;\n    \tpid_t tgid;\n    } __attribute__((preserve_access_index));\n\n\n  * The built-in function `__builtin_preserve_access_index`, which can be applied to a single field access, e.g.\n\n\n\n\n    pid_t pid = __builtin_preserve_access_index(task->pid);\n\n\nIn clang, that attribute causes accesses to lower to the same family of LLVM intrinsics mentioned previously.\n\nIn GCC, the entire BPF CO-RE mechanism relies on the `__builtin_preserve_access_index` built-in function. 
`__attribute__((preserve_access_index))` is equivalent to implicitly wrapping all accesses to the type with the built-in.\n\n## Unresolved questions\n\n  * Is `btf_preserve_access` the right long-term spelling for the feature gate and attribute, or should the name align more closely with the `preserve_access_index` name used in clang and GCC?\n  * Apart from providing an equivalent to `__attribute__((preserve_access_index))` (type-wide annotation), should we provide a way to annotate a single field access, as `__builtin_preserve_access_index` does in clang and GCC?\n    * The argument against is that the type attribute is much more widely used, and even the kernel type headers [provided by libbpf][libbpf-headers] and generated by [bpftool][bpftool] apply the type attribute.\n\n\n\n## Future possibilities\n\nIf Cranelift eventually introduces BPF support, the backend hooks introduced here provide a natural reference point for emitting equivalent relocations there as well.",
  "title": "[Pre-RFC] BTF relocations"
}