Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreifmlnq24jkbpzouzfkgdwdzfvbgv4mrobtnjwux6novug2a5oxm6a",
    "uri": "at://did:plc:ivbknywyskln22er3nkssdhl/app.bsky.feed.post/3mncroqpntqg2"
  },
  "path": "/t/pre-rfc-flow-directed-monomorphization-collection-in-mir/24366#post_1",
  "publishedAt": "2026-06-02T13:38:54.000Z",
  "site": "https://internals.rust-lang.org",
  "textContent": "## Summary\n\nThis RFC proposes an experimental MIR-level analysis for **flow-directed monomorphization** , based on the technique described in _The Simple Essence of Monomorphization_. The paper shows that monomorphization can be formulated as a type-flow analysis: collect constraints, solve them, and specialize the finite set of required instantiations. It further shows that higher-rank type polymorphism and existential types are not inherent obstacles to monomorphization; the real obstruction is cyclic type flow that requires infinitely many specializations.\n\nThe proposal has three goals:\n\n  1. make monomorphization collection explicit as a type-flow problem;\n  2. provide compiler infrastructure for future closed-world monomorphization of higher-rank, existential, and opaque-type flows;\n  3. detect infinite monomorphization structurally, before recursive expansion can cause excessive compile time, memory exhaustion, or compiler hangs.\n\n\n\nThis RFC does not propose any immediate surface-language change.\n\n## Motivation\n\nRust already monomorphizes generic functions and types. However, the current mechanism is primarily operational: rustc discovers reachable mono items and recursively expands the required instances.\n\nThis is sufficient for ordinary generics, but it leaves two issues.\n\nFirst, the compiler lacks a uniform representation for type instantiation flow through more advanced forms of abstraction, including:\n\n\n    generic functions\n    generic methods\n    opaque return types\n    async fn futures\n    return-position impl Trait\n    trait-associated opaque types\n    future existential-like encodings\n    future closed-world generic dyn methods\n\n\nSecond, recursive monomorphization is currently treated as an expansion problem. For example:\n\n\n    fn f<T>(x: T) {\n        f(Some(x));\n    }\n\n    fn main() {\n        f(0_i32);\n    }\n\n\nrequires:\n\n\n    f::<i32>\n    f::<Option<i32>>\n    f::<Option<Option<i32>>>\n    ...\n\n\nThe underlying cause is not recursion alone, but **growing type flow** :\n\n\n    i32       ⊑ T\n    Option<T> ⊑ T\n\n\nThis proposal makes that structure explicit. The compiler can reject the growing cycle before generating the infinite sequence.\n\n## Guide-level explanation\n\nThe compiler builds a type-flow graph during MIR monomorphization collection.\n\nA constraint has the form:\n\n\n    τ ⊑ α\n\n\nwhere `τ` is a lifetime-erased, codegen-relevant type expression and `α` is a generic parameter. The constraint means:\n\n\n    type τ flows into parameter α\n\n\nFor example:\n\n\n    fn id<T>(x: T) -> T {\n        x\n    }\n\n    fn main() {\n        id(1_i32);\n        id(String::from(\"x\"));\n    }\n\n\nproduces:\n\n\n    i32    ⊑ T\n    String ⊑ T\n\n\nThe solution is:\n\n\n    T ↦ { i32, String }\n\n\nso the required instances are:\n\n\n    id::<i32>\n    id::<String>\n\n\nForwarding is represented by parameter-to-parameter flow:\n\n\n    fn g<U>(x: U) {\n        h::<U>(x);\n    }\n\n    fn h<T>(x: T) {}\n\n\nproduces:\n\n\n    U ⊑ T\n\n\nRecursive growth is represented by constructor-labeled flow:\n\n\n    fn f<T>(x: T) {\n        f::<Option<T>>(Some(x));\n    }\n\n\nproduces:\n\n\n    Option<T> ⊑ T\n\n\nwhich forms the growing cycle:\n\n\n    T --Option--> T\n\n\nThe cycle is rejected because it denotes infinitely many required instances.\n\n## Reference-level explanation\n\n### Position in the compiler\n\nThe analysis runs during MIR monomorphization collection.\n\nIt operates after type checking and borrow checking, on normalized codegen-relevant types.\n\nRegions are erased before constructing monomorphization keys. Type parameters and const parameters remain specialization dimensions.\n\nFor example:\n\n\n    fn id_ref<'a>(x: &'a i32) -> &'a i32 {\n        x\n    }\n\n\ndoes not produce lifetime-indexed instances. The analysis sees:\n\n\n    &i32\n\n\nnot:\n\n\n    &'a i32\n    &'static i32\n\n\n### Constraint collection\n\nAt each generic instantiation site, rustc emits a constraint from the actual argument to the callee parameter.\n\n\n    fn f<T>(x: T) {}\n\n    fn main() {\n        f::<i32>(0);\n    }\n\n\nemits:\n\n\n    i32 ⊑ T\n\n\nGeneric forwarding emits parameter flow:\n\n\n    fn g<U>(x: U) {\n        f::<U>(x);\n    }\n\n\nemits:\n\n\n    U ⊑ T\n\n\nConstructed instantiation emits structured flow:\n\n\n    fn g<U>(x: U) {\n        f::<Option<U>>(Some(x));\n    }\n\n\nemits:\n\n\n    Option<U> ⊑ T\n\n\n### Solving\n\nThe solver computes a finite mapping:\n\n\n    S(α) = { ρ1, ρ2, ... }\n\n\nwhere each `ρ` is a lifetime-erased concrete type.\n\nA cycle is rejected when returning to the same parameter adds a non-empty type constructor stack.\n\nRejected:\n\n\n    T --Option--> T\n\n\nbecause it denotes:\n\n\n    T = i32\n    T = Option<i32>\n    T = Option<Option<i32>>\n    ...\n\n\nAllowed:\n\n\n    T ⊑ U\n    U ⊑ T\n\n\nbecause it does not itself grow the type.\n\n### Opaque types and `impl Trait`\n\nThe analysis must explicitly represent Rust opaque types.\n\nArgument-position `impl Trait` is treated as an anonymous generic parameter:\n\n\n    fn f(x: impl Clone) {}\n\n\nis modeled like:\n\n\n    fn f<T: Clone>(x: T) {}\n\n\nReturn-position `impl Trait` is treated as an opaque but concrete type:\n\n\n    fn once<T>(x: T) -> impl Iterator<Item = T> {\n        std::iter::once(x)\n    }\n\n\nConceptually, the flow graph records:\n\n\n    T       ⊑ once::T\n    Once<T> ⊑ once::ReturnOpaque<T>\n\n\nAfter calls at `i32` and `String`, the opaque instances are:\n\n\n    once::ReturnOpaque<i32>\n    once::ReturnOpaque<String>\n\n\n`async fn` is handled similarly, since it returns an opaque future type.\n\nThis RFC does not change the semantics of `impl Trait`. It only requires the flow graph to account for opaque types already present in Rust’s type system.\n\n### Higher-rank and existential flow\n\nThe analysis is intended to support future closed-world monomorphization of type flow through higher-rank and existential structures.\n\nFor higher-rank type-polymorphic methods, method-level type parameters are flow targets:\n\n\n    i32  ⊑ A\n    bool ⊑ A\n\n\nA closed-world lowering may then generate:\n\n\n    method_i32\n    method_bool\n\n\nFor existential packaging, construction flows into the hidden type parameter, and unpacking propagates it onward:\n\n\n    i32    ⊑ A\n    String ⊑ A\n    A      ⊑ B\n\n\nRust does not currently expose general existential packages in this form. Existing `impl Trait` is opaque rather than runtime-unpacked existential packaging. Nevertheless, both require tracking hidden concrete types through generic instantiation.\n\n## Proposed implementation\n\n### Stage 1: graph construction\n\nAdd an internal flow graph builder behind:\n\n\n    -Z flow-directed-mono\n\n\nThis stage mirrors existing monomorphization collection and does not affect code generation.\n\n### Stage 2: growing-cycle detection\n\nDetect growing cycles before recursive mono-item expansion.\n\nThis is the primary robustness benefit: infinite monomorphization becomes a structural compiler error, not an operational failure mode.\n\n### Stage 3: diagnostics\n\nUse the graph to report the type-growth path.\n\nExample diagnostic:\n\n\n    error: generic instantiation grows without bound\n     --> src/main.rs:2:5\n      |\n    2 |     f(Some(x));\n      |     ^^^^^^^^^^ recursively instantiates `f` with `Option<T>`\n\n    note: type-flow cycle detected:\n          T --Option--> T\n\n    note: starting from `i32`, this requires:\n          f::<i32>\n          f::<Option<i32>>\n          f::<Option<Option<i32>>>\n          ...\n\n\n### Stage 4: validation\n\nCompare the flow-derived mono item set against rustc’s existing collector under a debug flag.\n\nThe new analysis should initially be used for validation and diagnostics, not as the codegen driver.\n\n### Stage 5: optional integration\n\nAfter validation, rustc may use the solver for selected parts of monomorphization collection.\n\n## Drawbacks\n\nThis adds compiler complexity.\n\nThe analysis must interact with MIR, trait selection, opaque types, associated types, codegen units, incremental compilation, drop glue, closures, generators, async lowering, vtables, and shims.\n\nThe analysis may over-approximate. Over-approximation is acceptable for diagnostics, but not for direct code generation unless validated.\n\n## Rationale and alternatives\n\n### Keep the existing collector unchanged\n\nThis avoids implementation cost but leaves infinite monomorphization as a recursive expansion failure.\n\n### Add ad hoc recursion diagnostics\n\nThis may improve individual errors, but it does not provide a general model for generic, higher-rank, existential, and opaque type flow.\n\n### Use backend-level analysis\n\nLLVM-level analysis is too late. Rust generics have already been monomorphized before LLVM IR generation.\n\n## Prior art\n\nThe primary prior art is _The Simple Essence of Monomorphization_ , which presents monomorphization as flow-directed constraint solving, supports higher-rank and existential polymorphism, and characterizes non-monomorphizable programs as growing cycles in type flow.\n\nRust’s existing MIR monomorphization machinery remains the implementation baseline.\n\n## Unresolved questions\n\n  * What is the graph node identity?\n\n    * `DefId + GenericParamIndex`\n    * `InstanceDef + GenericArgIndex`\n    * `MonoItem + GenericArgIndex`\n  * Should the graph be built per item, per crate, per codegen unit, or per compilation session?\n\n  * How should opaque types be represented?\n\n    * by opaque `DefId`;\n    * by defining use;\n    * by hidden type;\n    * by opaque identity plus captured arguments?\n  * When should hidden types for `impl Trait` be revealed to the solver?\n\n  * How should RPITIT be represented?\n\n    * as associated type projection flow;\n    * as opaque type flow;\n    * or both before and after normalization?\n  * How should const generics appear in the graph?\n\n  * How should compiler-generated items appear?\n\n    * drop glue;\n    * closures;\n    * async futures;\n    * generators;\n    * vtable shims.\n  * Should growing cycles be rejected before the existing recursion limit is reached?\n\n  * Can upstream crates expose compact type-flow metadata?\n\n\n\n\n## Future possibilities\n\nThis infrastructure may support future closed-world experiments with generic methods behind dynamic dispatch.\n\nFor example:\n\n\n    trait Choose {\n        fn choose<T>(&self, a: T, b: T) -> T;\n    }\n\n\ncould, in a closed-world setting, be lowered into:\n\n\n    choose_i32\n    choose_bool\n    choose_String\n\n\nwhen flow analysis proves that these are the only required instantiations.\n\nThis RFC does not propose that language feature. It proposes the MIR-level analysis needed to make such features technically tractable.",
  "title": "[Pre-RFC] Flow-Directed Monomorphization Collection in MIR"
}