"Folding" the Wafer : Huawei's Tau Law and the Far Side of a Closed EUV Door

Jason with his AI analysts, May 26, 2026

In May 2026, Huawei CTO He Tingbo took the IEEE ISCAS keynote stage and presented a paper titled A Time Scaling Theory for Multi-Layer Electronic Systems. A ChinaXiv preprint dropped at the same time. The core claim, in plain terms: on the same SMIC N+3 node — no node shrink — a technique they call LogicFolding lets Kirin 2026 achieve +55% effective transistor density, +13% P-core frequency, and +41% energy efficiency. The 2031 target stretches further still — to "1.4nm equivalent" and 400+ MTr/mm².

Chinese-language commentary split into two reflexes. One said another Chinese engineering breakthrough. The other said another vendor claim from the Mate 60 era — wait for the TechInsights teardown before believing it. After reading the paper end-to-end and triangulating across more than 30 independent sources, I don't think either reflex quite fits. This is harder than a marketing stunt — the math is internally consistent, and it maps cleanly onto imec's CMOS 2.0 academic blueprint — and softer than an EUV substitute, since it does nothing for HBM and nothing for the NVLink-domain scale-up bottleneck. It sits in between. And the in-between space is where it gets interesting.

What this post tries to do is pry the physics open. Then put the answer back into the thread I've been writing for the last few months: how much, exactly, China's compute ceiling changes in a world without EUV.

1. What LogicFolding Actually Is — A Concept That's Often Misread

Two misreads show up most often in Chinese-language coverage. The first calls Tau Law "China's answer to CFET". That's not quite right. CFET stacks PMOS and NMOS vertically inside a single CMOS unit, at the transistor level. It is a monolithic FEOL process, slotted on both the imec and TSMC roadmaps for post-2031. LogicFolding doesn't live there.

The other reads it as "China's version of TSMC SoIC / AMD 3D V-Cache". Also off. SoIC and V-Cache stack across functional blocks — a whole SRAM die sitting on top of a whole logic die. LogicFolding's partition granularity is far finer, sitting at the standard-cell level.

To see why that matters, you have to start with what a standard cell is. It's the smallest reusable Lego brick in a chip: a CMOS inverter is 2 transistors, a two-input NAND gate is 4, a flip-flop is roughly 32. A few hundred cells make an adder (about the complexity of one Lego set). A few thousand make a multiplier. A few billion make a full CPU.

A standard cell rendered in 3D — the smallest reusable building block of a chip

source: Branch Education

What LogicFolding does is take the standard cells inside a single logic block and distribute them vertically across two stacked wafers. The paper itself puts this very cleanly:

"From the circuit designer's perspective, the two tiers behave as a single continuous fabric, with cells distributed across the wafer boundary as if it were an additional metal layer."

To the designer, in other words, the physical boundary between the two wafers behaves like one more metal layer. A wire that used to run along M8 on the top of a die can now hop to the M8 on the other side and keep going.

The physical recipe goes like this: each wafer independently completes its own FEOL and BEOL (both on the same SMIC N+3 node). Then, the way you'd close a book, Wafer A flips face-down onto Wafer B (face-up), so the two top-metal stacks meet in the middle. Between them sit a thin SiCN dielectric layer and Cu pads, bonded directly to each other with no solder or bumps. Bond pad pitch is 1.5μm; overlay accuracy is better than 0.5μm.

Prose only takes you so far. A cross-section does the rest:

The physical anatomy of LogicFolding — why it really is a "folding"

Figure: LogicFolding physical cross-section (two wafers, F2F hybrid bonded)

Layer stack, top to bottom:
• Wafer A (face down): silicon substrate (thinned); FEOL transistors (face down); BEOL metal stack M1 → M8; top metal Mtop (pitch ~720nm); SiCN dielectric + Cu pads (pitch 1.5μm)
• Bond interface
• Wafer B (face up): SiCN dielectric + Cu pads (pitch 1.5μm); top metal Mtop (pitch ~720nm); BEOL metal stack M8 → M1; FEOL transistors (face up); silicon substrate (thinned)
• Backside RDL: power + I/O bumps → PCB / substrate

Critical-path wire (the blue curve):
• Starts from cell A on Wafer A (blue dot), travels down through BEOL
• Jumps to Wafer B via a Cu-Cu hybrid bond pad, then down to cell B
• Goes entirely via hybrid bond pad, NOT via TSV

The real job of TSVs (brown columns):
• Only carry power delivery + backside I/O
• Do NOT carry inter-layer signal routing (that's the hybrid bond pad's job)
• TSV pitch <6μm is sparser than HB pitch 1.5μm: different tasks

Key process parameters (Kirin 2026 claim from paper Sidebar A):
• Hybrid bond pitch: 1.5μm | Overlay accuracy: <0.5μm | TSV CD: <1.5μm | TSV pitch: <6μm
• Top metal pitch: ~720nm → gear ratio (HB / Mtop) ≈ 2, target → 1
• Yield: ~100% with smart redundancy | TSV failure rate: <100ppm | Repair rate: 99.9%

Both wafers use the same SMIC N+3 node (homogeneous process), stacked face-to-face via direct Cu-Cu fusion bond (no solder/bump).

source: synthesized from Huawei paper Sidebar A + SemiAnalysis hybrid bonding process flow

The literal meaning of "folding" is right there in the picture. Two complete wafers — each carrying its own FEOL transistors and the full BEOL metal stack — meet face-down to face-up, the way a book closes shut. The red line in the middle is the bond interface — the spine of the book. The blue curve is one critical-path wire: starting at cell A on Wafer A, going down through BEOL, hopping across a Cu-Cu hybrid bond pad, entering the BEOL of Wafer B, and landing on cell B. That is what the paper's "wafer boundary as if it were an additional metal layer" actually looks like in silicon.

A common confusion worth clearing up while we're here: the brown vertical columns in the figure are TSVs (through-silicon vias). They are not what carries cell-to-cell signal routing — that's the hybrid bond pad's job. TSVs are there to deliver power and pull I/O from the top of the stack out to the PCB underneath. TSV pitch (< 6μm) is far sparser than HB pitch (1.5μm) because the two structures are doing completely different jobs. Some of the early Chinese-language commentary got this role backwards — assuming TSVs were the "main 3D interconnect". They're not.
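To put a number on "far sparser", here is a minimal sketch that converts the two claimed pitches into connection densities. It assumes a uniform square grid (one connection per pitch × pitch tile), which is an illustrative simplification and not something the paper specifies, and it treats the "< 6μm" TSV pitch as exactly 6μm. The function name is hypothetical.

```python
# Connection density on a square grid: one connection per pitch x pitch tile.
# ASSUMPTION: uniform square grid; the paper does not state the pad/TSV layout.

def connections_per_mm2(pitch_um: float) -> float:
    """Connections per mm^2 for a square grid with the given pitch in micrometres."""
    per_mm = 1000.0 / pitch_um  # connections along 1 mm
    return per_mm ** 2

HB_PITCH_UM = 1.5   # hybrid bond pad pitch (paper's claim)
TSV_PITCH_UM = 6.0  # TSV pitch, taking the paper's "< 6 um" bound as the value

hb_density = connections_per_mm2(HB_PITCH_UM)
tsv_density = connections_per_mm2(TSV_PITCH_UM)
density_ratio = hb_density / tsv_density

print(f"Hybrid bond pads: {hb_density:,.0f} per mm^2")
print(f"TSVs at 6 um:     {tsv_density:,.0f} per mm^2")
print(f"Ratio:            {density_ratio:.0f}x")
```

Under these assumptions the hybrid bond layer offers about 16 times as many vertical connections per unit area as the TSV array at the 6μm bound, which is consistent with the two structures doing different jobs.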

Once you put the whole 3D-integration landscape into a single table, it becomes clear where LogicFolding actually sits:

3D Integration Spectrum (coarse to fine): where does Huawei LogicFolding sit?

• L1, Package-level 3D (PoP): whole-package chip stacking | pitch ≥100μm | bump/solder | mobile DRAM-on-SoC (in production)
• L2, Chip-level 3D / chiplet stacking: die-to-die stacking | pitch 10-50μm μbump or sub-10μm HB | HBM, CoWoS (in production)
• L3, Block-level 3D (functional block to block): functional blocks across layers | pitch 1-10μm HB | AMD V-Cache (SRAM-on-logic, 9μm), TSMC SoIC (6μm)
• L4, Cell-level 3D (standard cell granularity) ★: standard cells within a single block distributed across wafers | pitch sub-1μm (strict academic definition) | Huawei LogicFolding (claimed 1.5μm, gear ratio ≈ 2), imec CMOS 2.0 academic blueprint
• L5, Transistor-level 3D (inside a single standard cell): NFET/PFET vertically stacked within one cell | CFET monolithic | imec/TSMC roadmap 2031+
• L6, Sequential 3D / monolithic 3D: upper-layer transistors grown continuously (no bonding) | sub-100nm via | research (imec, CEA-Leti)

Huawei's actual position: LogicFolding sits at the L3 → L4 transition (1.5μm pitch + cell-granularity partitioning), not L5 CFET, nor traditional L3 SoIC.

source: synthesized from imec / Huawei paper / TSMC / AMD

So Huawei's real differentiation isn't 3D IC as such; TSMC, Intel, and AMD all do that. The differentiation is pushing W2W active-logic-on-active-logic to a 1.5μm pitch and claiming to put it inside a production SoC. Those two conditions, together, are what make this a qualified industry-first.

2. The Physics of 1.5μm — Gear Ratio Is the Real Enabling Condition

Why 1.5μm? Why not the 9μm pitches the industry has been shipping for years?

You can't really answer that without first looking at the BEOL metal stack. A modern logic chip carries roughly 10–19 layers of metal interconnect above its transistors. From bottom to top, pitch gets coarser. The lowest layers, M0 and M1, sit directly above the transistors, with a pitch of 30–50nm, handling short-distance routing inside a cell. They are thin, and therefore high-resistance. The middle layers handle block-internal routing. The top layer, Mtop, runs global routing, power, and clock, with a pitch (in Huawei's paper) of about 720nm. It is thick and low-resistance.

F2F hybrid bonding has to happen above Mtop. Each wafer first finishes its FEOL plus the entire BEOL stack, and the only surface available for mating when you flip one of them is the very top of BEOL. M0 and M1 are buried right above the transistors — physically impossible to expose for bonding. So the 1.5μm bond pad lives in the same dimensional neighborhood as the 720nm Mtop, not the 30nm M0.

Which is what introduces the concept the paper treats as central — but which marketing has not been eager to dwell on: gear ratio.

Gear ratio = HB pitch / top metal pitch. The paper's own wording: "needs to be controlled below 3, ideally close to 1". The physical meaning: routing wires per mm² inside a wafer (set by top metal pitch) vs. jump-points per mm² for crossing to the other wafer (set by hybrid bond pitch).

If jump-points are scarce relative to wires (gear ratio >> 1), most wires that "want" to cross to the other wafer can't find a corresponding pad, and cross-wafer routing has to stay coarse-grained. That's exactly why traditional SoIC at 9μm pitch could only do block-level stacking like SRAM-on-logic. When jump-point density approaches wire density (gear ratio → 1), the wafer boundary genuinely becomes one more metal layer — any wire can hop across — and cell-level partitioning becomes physically feasible.

Gear ratio: the real watershed between block-level and cell-level 3D

Gear Ratio: The Real Enabling Condition for LogicFolding
Gear ratio = HB pitch / top metal pitch | Paper: "needs to be controlled below 3, ideally close to 1"

• Gear ratio ≈ 10 (traditional SoIC: 9μm HB / 0.9μm Mtop): top metal tracks dense, hybrid bond pads sparse. Block-level routing only; most wires can't find a pad opposite.
• Gear ratio ≈ 2 ★ (Kirin 2026: 1.5μm HB / 720nm Mtop): top metal tracks medium, hybrid bond pads medium. Cell-level routing feasible; every 2 wires get 1 vertical jump.
• Gear ratio ≈ 1 (ultimate target / imec 250nm demo): top metal tracks dense, hybrid bond pads dense. Wafer boundary = a metal layer; every wire can jump across wafers.

Why is gear ratio the real enabling condition for LogicFolding?
• Top metal tracks are where logic-cell output wires run (pitch determines per-mm² routing resource)
• If bond-pad density is too low (gear ratio >> 1), most wires that "want to jump" can't find a matching pad → block-level only
• Gear ratio → 1 means every top metal wire has a corresponding hybrid bond pad → cross-wafer routing as free as in-wafer
→ This is the real technical watershed between Huawei LogicFolding and traditional SoIC

source: synthesized from Huawei paper Sidebar A + imec demos

Kirin 2026's gear ratio is about 2 (1.5μm / 720nm) — enough for cell-level routing, but not at the limit. The paper points to gear ratio → 1 as the next step. imec, at VLSI 2025, demonstrated a 250nm W2W pitch (research-grade), confirming the asymptote actually exists.
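The gear-ratio arithmetic is simple enough to check by hand, but a short sketch makes the comparison explicit. The pitches are the ones quoted in this section and its figure; the threshold of 3 follows the paper's quoted wording. The function names and regime labels are this sketch's own shorthand, not terms from the paper.

```python
# Gear ratio = hybrid-bond pitch / top-metal pitch, as defined in the text.
# ASSUMPTION: the "< 3" cut-off is used as a hard threshold for illustration only.

def gear_ratio(hb_pitch_um: float, top_metal_pitch_um: float) -> float:
    return hb_pitch_um / top_metal_pitch_um

def regime(ratio: float) -> str:
    # The paper's wording: "needs to be controlled below 3, ideally close to 1".
    return "cell-level routing plausible" if ratio < 3.0 else "block-level only"

soic_ratio = gear_ratio(9.0, 0.9)     # traditional SoIC: 9 um HB / 0.9 um Mtop
kirin_ratio = gear_ratio(1.5, 0.72)   # Kirin 2026 claim: 1.5 um HB / 720 nm Mtop

for name, ratio in [("Traditional SoIC", soic_ratio), ("Kirin 2026 (claimed)", kirin_ratio)]:
    print(f"{name}: gear ratio {ratio:.2f} -> {regime(ratio)}")
```

The claimed Kirin 2026 figures give a ratio just above 2, inside the paper's "below 3" window but still a factor of two away from the stated ideal.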

This, I think, is the single most elegant piece of the paper. 1.5μm isn't an isolated process number. It's the engineering choice required to bring gear ratio low enough to unlock cell-level routing. Miss this point and you end up asking, "Why can't we just compare it to TSMC SoIC at 6μm?" — a question that puts the same ruler against two different partition granularities, and produces a misaligned answer.

3. The "Fat Bond Pad" RC Concern: A Direct Answer

By this point, any reader with engineering instincts is asking a sharp question: a Cu pad on a 1.5μm pitch is about twice the scale of a 720nm-pitch Mtop wire, and dozens of times that of a 30nm-pitch M0 wire. Won't a pad that fat introduce enough parasitic capacitance and resistance to eat the LogicFolding gain alive?

The instinct is reasonable. The conclusion runs the other way, for two reasons.

First, parasitic RC accumulates along the length of a wire. The decisive variable is total wire length — not the size of any single point on it. The Cu pad is indeed fat in lateral dimension, but the longitudinal length it occupies is only about 0.5μm. And what it replaces is a planar long wire on the order of 100μm. A single pad's contribution to capacitance is roughly equivalent to 2.5μm of planar wire — negligible against the 100μm you save.

Second, the paper's headline figures, if they hold up, are themselves direct evidence of RC improvement: clock buffer count −50%, clock skew −25%, wire length −30%, P-core frequency +13%, energy efficiency +41%. Wire length −30% means average interconnect across the whole chip got 30% shorter. Clock buffer −50% means the clock network needs only half as many buffers to maintain signal integrity, which only makes sense if RC has improved on net. If the 1.5μm bond pad were really breaking the RC budget, none of these numbers could move in the right direction.
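Both halves of the RC argument reduce to a few lines of arithmetic. The sketch below uses the numbers quoted above (a pad worth about 2.5μm of wire, a replaced wire of about 100μm, and wire length −30%). The delay estimate is an assumption added here: it applies the textbook first-order distributed-RC (Elmore) model for an unbuffered wire, where delay grows with the square of length. Real designs insert repeaters, which pushes delay closer to linear in length, so the squared figure is an optimistic bound and not a claim from the paper.

```python
# Inputs quoted in the text (vendor-reported figures or the author's estimates).
PAD_EQUIV_WIRE_UM = 2.5    # one bond pad's capacitance, as equivalent planar wire
REPLACED_WIRE_UM = 100.0   # planar wire that the vertical hop replaces
WIRE_LENGTH_SCALE = 0.70   # "wire length -30%"

# Pad overhead relative to the wire it removes.
pad_overhead_fraction = PAD_EQUIV_WIRE_UM / REPLACED_WIRE_UM
net_wire_saved_um = REPLACED_WIRE_UM - PAD_EQUIV_WIRE_UM

# ASSUMPTION: first-order Elmore model for an unbuffered distributed-RC wire,
# delay ~ 0.5 * r * c * L**2. The per-unit-length r and c cancel in a ratio
# between two lengths of the same wire type.
def unbuffered_delay_ratio(length_scale: float) -> float:
    return length_scale ** 2

delay_ratio = unbuffered_delay_ratio(WIRE_LENGTH_SCALE)

print(f"Pad overhead vs. replaced wire: {pad_overhead_fraction:.1%}")
print(f"Net equivalent wire saved:      {net_wire_saved_um:.1f} um")
print(f"Unbuffered delay at 0.7x length: {delay_ratio:.2f}x")
```

On these inputs the pad costs a few percent of the wire it eliminates, and a 30% shorter unbuffered wire would see roughly half the RC delay under the stated model.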

That isn't to say there are no design challenges. Pad-to-pad coupling, pad-to-Mtop crosstalk, the area overhead from TSV keep-out zones, 3D-aware parasitic extraction — each of those is a new signoff dimension the EDA toolchain now has to handle. That is precisely what the paper's "preliminary internal tools have been developed, methodology details will be published in the coming months" actually means. The process is plausible; the supporting EDA toolchain is the unpublished variable. We'll come back to it.

4. Once You Take the +55% Apart, What's Left?

Where does the +55% density gain actually come from?

It is not from any shrink in gate pitch or cell height — both are unchanged, since we're on the same node. It comes from two active tiers stacking on top of each other.

In principle, if both wafers were fully folded, the theoretical gain would be +100%. Subtract keep-out zones and bond-related overhead, and you land around +80%. Reverse-engineering the actual +55% from there implies that about 70% of the die area was folded, while the remaining 30% stayed single-tier. That lines up cleanly with the paper's own wording:

"LogicFolding implementation shipping in Kirin 2026 is deliberately conservative. The hybrid-bonding pitch reached 1.5μm; TSV landing advanced only one step below the top metal; folding was applied selectively along key critical paths rather than across the entire design."

So Kirin 2026 doesn't fold the whole chip. It folds selectively along the critical paths. By the back-of-envelope estimate above, something like 30% of the die area remains single-tier. A controlled, conservative first-generation implementation.
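The reverse-engineering of the folded fraction can be sketched as follows. It assumes the density gain scales linearly with the share of die area that is folded, which is this article's simplification and not a model stated in the paper. The +80% per-folded-area figure is the article's own estimate after keep-out and bond overhead.

```python
# ASSUMPTION: overall gain = folded_fraction * gain_on_folded_area (linear model).
IDEAL_GAIN = 1.00            # two fully folded tiers with zero overhead: +100%
GAIN_AFTER_OVERHEAD = 0.80   # after keep-out zones and bond overhead: about +80%
CLAIMED_GAIN = 0.55          # paper's headline density gain for Kirin 2026

overhead_cost = IDEAL_GAIN - GAIN_AFTER_OVERHEAD
folded_fraction = CLAIMED_GAIN / GAIN_AFTER_OVERHEAD
single_tier_fraction = 1.0 - folded_fraction

print(f"Overhead cost on folded area: {overhead_cost:.0%}")
print(f"Implied folded fraction:      {folded_fraction:.1%}")
print(f"Implied single-tier fraction: {single_tier_fraction:.1%}")
```

The implied split is roughly 69% folded and 31% single-tier, which is where the "about 70%" figure comes from. A different overhead estimate would move the split accordingly.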

There is, however, a subtle baseline issue worth flagging.

The paper's baseline is 155 MTr/mm², without saying which chip. Most likely it's the normalized density of SMIC N+3 under design rules compatible with LogicFolding. But the actual measured effective density of Kirin 9030, derived from the TechInsights teardown of the Mate 80 Pro Max in December 2025, is roughly 125 MTr/mm². The paper's baseline is therefore about 24% above the measured figure, a gap that comes mainly from SRAM, IO, and analog area overhead.

Which means the +55% isn't a direct comparison against the shipping Kirin 9030. It's a like-for-like comparison under specific design rules. Actual product-level effective density of the Kirin 2026 SoC will likely land somewhere in the 190–200 MTr/mm² range — still meaningfully higher than 125, but the "equivalent to TSMC N3" framing needs to be discounted. The paper doesn't say this out loud — and I don't really fault them for it — but it's worth marking when reading vendor claims.
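The baseline arithmetic is worth spelling out, because the same +55% lands on very different absolute numbers depending on the starting point. This sketch applies the claimed gain to both baselines quoted above. Applying the full +55% to the measured product density is an assumption made here for illustration; the variable names are hypothetical.

```python
PAPER_BASELINE = 155.0       # MTr/mm^2, the paper's unnamed baseline
MEASURED_KIRIN_9030 = 125.0  # MTr/mm^2, effective density from the teardown cited above
CLAIMED_GAIN = 0.55          # paper's headline density gain

# Gap between the paper's baseline and the measured product, relative to the product.
baseline_gap = (PAPER_BASELINE - MEASURED_KIRIN_9030) / MEASURED_KIRIN_9030

# ASSUMPTION: the full +55% carries over unchanged to either baseline.
paper_basis_density = PAPER_BASELINE * (1 + CLAIMED_GAIN)
product_basis_density = MEASURED_KIRIN_9030 * (1 + CLAIMED_GAIN)

print(f"Baseline gap:            {baseline_gap:.0%}")
print(f"Paper-basis density:     {paper_basis_density:.2f} MTr/mm^2")
print(f"Product-basis density:   {product_basis_density:.2f} MTr/mm^2")
```

The product-basis result sits inside the 190–200 MTr/mm² range given above, while the paper-basis result is about 240 MTr/mm². The difference between the two is the baseline choice, not the folding technique.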

For a sense of industry context:

Global hybrid bonding pitch landscape — W2W vs D2W families are not directly comparable

Global Hybrid Bonding Pitch Comparison
x-axis = pitch (μm, log scale, 0.25μm to 25μm) | Dot size = production maturity (production, planned / pilot, research demo)

• CIS image sensors: Sony demo 0.4μm (ECTC 2024); Sony production 1.4μm
• 3D NAND: YMTC Xtacking ~1μm
• Logic stacking: TSMC SoIC 2025 6μm; TSMC 2029 4.5μm; TSMC W2W <3μm (roadmap); Intel Foveros Direct 9μm; Intel target 3μm; AMD V-Cache 9μm
• Research demo: imec W2W 250nm (VLSI 2025); imec 400nm (IEDM 2023); imec D2W 2μm (ECTC 2024)
• Huawei: LogicFolding 1.5μm (claimed, Kirin 2026)

Key judgment:
• Huawei 1.5μm in a W2W active-logic-on-active-logic production SoC is an industry-first (if Kirin 2026 ships and validates)
• But Sony CIS has commercialized 1.4μm and imec research is at 250nm; comparing 1.5μm to TSMC SoIC 6μm is W2W vs D2W, not apples-to-apples
• Industry typically takes 2-3 years from lab demo to commercial product; if Huawei jumps straight to mass production, yield is the real risk

source: synthesized from imec / TSMC / Intel / Sony / Huawei

Sony has already commercialized 1.4μm W2W in CIS image sensors. imec research has gone as low as 250nm W2W. Lining up Huawei's 1.5μm directly against TSMC SoIC's 6μm is a W2W-vs-D2W misalignment. The genuinely qualified industry-first should be phrased like this: W2W active-logic-on-active-logic at 1.5μm pitch inside a production SoC — and that's still conditional on Kirin 2026 actually shipping, and the teardown holding up.

5. Where Does It Sit on the Map I've Been Drawing?

The most interesting part of this story shows up when you set it next to the other threads I've been writing over the past few months.

In March 2026, in Small by Design or by Default? What a Memory Formula Reveals About China's AI Ceiling, I worked through a simple formula: Chinese large models cluster in the 230B–1T parameter band not because of aesthetic preference but because of the 8×H200 memory ceiling. Generalized, the takeaway was that China's compute ceiling in the LLM era is constrained by three things: single-node compute density (bounded by process node), scale-up domain interconnect bandwidth (NVL72 and equivalents), and memory bandwidth (HBM and equivalents).

LogicFolding actually moves only the first of these — density. And even there, it's "density at the same node", not "access to a more advanced node". It does not change the NVL72-class scale-up ceiling, and it doesn't substitute for HBM bandwidth. It raises one of the three pillars of the ceiling by 30–50%. The other two are untouched.

This shouldn't be underrated — in a world without EUV, Huawei has used vertical architecture to push single-chip density a step toward the N3 class. That's real engineering progress. But it also shouldn't be overstated. The framework from Small by Design or by Default? mostly survives: Tau Law lifts the hardware ceiling for "small and precise" by one notch, without removing the ceiling itself. China's LLM situation could shift from "clustered at 230B–1T" to "clustered at 350B–1.5T"; the 1T+ territory stays locked in by NVLink-domain and HBM together.

Step one more level back, and there's a more interesting parallel.

In February 2026, in What the Back of a Wafer Tells Us About NVIDIA's Next Fifteen Years, I worked through TSMC A16's backside power delivery (BSPDN). A16 moves the power grid from the front of the wafer to the back, freeing up frontside routing space, letting logic dies grow bigger. That is one form of backside engineering.

Intel 18A's frontside/backside interconnect stack — a canonical physical form of BSPDN

source: SemiAnalysis

Huawei's LogicFolding is also backside engineering, but it's solving the inverse problem. TSMC uses the backside to deliver power, so the front can hold more logic. Huawei bonds two complete wafers face-to-face, so the logic itself can fold in half. Same family of process physics (wafer thinning + ultra-fine-pitch bonding + alignment), two different problem statements, two different solutions.

That parallel is much more interesting than the "Huawei breaks the EUV embargo" narrative. What it really shows is something happening at the global frontier: as horizontal transistor shrink keeps getting more expensive, the next real density increment has to come from the vertical axis. NVIDIA and TSMC's version: use the backside to make room on the front. Huawei's version: fold the front onto itself. Two chapters of the same story — node-level Moore is approaching its marginal-cost cliff, and the next decade belongs to whoever masters the two long-overlooked dimensions: the back of the wafer, and the space between wafers.

Huawei was pushed down this path earlier than it would have arrived otherwise, because EUV was cut off. That is a bit like the financial institutions forced to deleverage during the 2008 financial crisis — they didn't have a choice. But once the dust settled, the ones who went furthest, earliest, ended up structurally healthier.

6. The 2031 Roadmap — Commitment or Direction?

The paper sketches a scaling path out to 2031 — 400+ MTr/mm², 5.0 GHz P-core, "1.4nm equivalent". The path requires 3-layer folding + HB pitch compressed to sub-1μm + TSVs stepping deeper + EDA toolchain co-maturing. Every one of those is unproven.

The hardest piece is 3-layer folding. 2-layer F2F has two BEOL stacks pressed against each other. 3-layer wedges a complete additional wafer in the middle, which means longer electrical distance, more complex power delivery, and compounding thermal density. No one in the industry (not TSMC, not Intel, not Samsung) has built 3-layer hybrid-bonded active logic in production, and none has a clear roadmap for it either. Putting it on a 2031 timeline reads more like directional aspiration than engineering commitment.

Heat is the second inherent challenge. Logic-on-active-logic thermal density is roughly an order of magnitude higher than SRAM-on-logic. Which is why Broadcom and Fujitsu's Monaka have stuck to logic-on-SRAM or logic-on-cache, declining N2-on-N2 logic-on-logic. Huawei choosing a mobile SoC as the first commercial LogicFolding implementation is in part because smartphones tolerate thermal throttling far better than HPC does — letting performance dip under peak load is acceptable. That gives mobile a grace period. But if this path is going to reach AI accelerators (Ascend 990 in 2030 is the paper's stated target), thermal density has to be confronted head-on.

The EDA toolchain is the third, and most hidden, systemic risk. 3D-aware parasitic extraction, 3D placement and routing, 3D thermal co-simulation: every one of those depends on EDA vendor support. The paper's "preliminary internal tools have been developed, methodology details will be published in the coming months" is essentially an admission that this layer is still immature. In May 2025, the US BIS briefly tested using EDA export as a lever against China (retracted in July). The market didn't make much of it at the time, but it's a potential systemic constraint on the LogicFolding roadmap. The more complex LogicFolding gets, the more dependent it is on EDA, and Synopsys and Cadence have been front-loading investment into 3D-IC design platforms for some time (a thread I picked up in Synopsys vs Cadence: The Battle for Semiconductor IP Dominance).

Put together, confidence in the 2031 400+ MTr/mm² target is low. It's a directional claim, not an engineering commitment. None of this takes away from the +55% that Kirin 2026 delivers — but the 2026 plausibility and the 2031 aspiration need to be read separately.

7. The Real Falsifiability Moment: Q3-Q4 2026

Everything I've discussed so far comes from the paper itself: vendor self-reported, independently unverified. Huawei has a track record on both sides of this question. There's the Mate 60 pattern of shipping quietly and letting the world tear the product down after the fact. There's also a history of vendor claims that get marked down once independently measured (Anshel Sag at Moor Insights has publicly questioned this: "None of Huawei's magical chip breakthroughs have actually been scalable"). Both base rates are real. Neither is automatically more credible than the other.

The real falsifiability moment is Q3-Q4 2026 — Kirin 2026 mass production plus an independent TechInsights teardown. That's when several things become visible: whether the 1.5μm bond pitch was actually achieved in process, what the yield curve looks like at production volume, how thermal behavior holds up under realistic smartphone workloads, and how far product-level effective density falls below the paper's claim. The first time this entire thesis gets physically tested from outside.

The second milestone is the EDA methodology paper Huawei has hinted at for the "coming months". If it actually appears, we'll learn how mature the domestic EDA toolchain really is. If it doesn't, that's also a signal.

The third comes around 2027, when Broadcom and Fujitsu's Monaka (TSMC SoIC F2F at N2+N5) reach production. That will be an industry benchmark for commercial logic-on-SRAM, from which the engineering difficulty of the logic-on-logic path can be backed out.

Until then, LogicFolding is plausible but unverified. The appropriate stance is to keep it in the consideration set without betting on its final magnitude.

Closing: The Story of the Third Dimension

If I had to compress the story to one paragraph, it would be this.

For 60 years, Moore's Law has been a horizontal story — shrinking transistors on a two-dimensional plane. That story isn't over (TSMC still has A14 and A10 ahead, and EUV source power still has 1000W and beyond to climb — I worked through that piece in From Photons to Pricing Power: ASML's 1,000-Watt Chain Reaction), but marginal cost is rising and marginal return is thinning.

The story of the next decade lives along the vertical axis. CFET stacks PMOS and NMOS inside the transistor — 2031+. Backside power delivery moves the power grid to the back of the wafer — commercialized in 2025–26. Hybrid bonding stacks entire wafers, now down to 1.5μm pitch. All of these used to live inside the small box labeled "advanced packaging." That box is now the main arena for the next wave of density growth.

In that sense, Huawei's Tau Law isn't an isolated event. It's one chapter of this vertical-axis story — a chapter that got pushed onto China earlier than it would have arrived otherwise, by export control. The problem it solves, the engineering limits it runs into, the unproven parts of the road ahead — all of these rhyme with what global players are facing in their own variants.

The investment implication is concrete: hybrid bonding equipment (BESI as HB die-attach leader at 42% share; EV Group as W2W bonder leader; Applied Materials, which acquired 9% of BESI in 2025), CMP and SiCN processes (TEL, plus domestic Piotech and Naura), and 3D-aware EDA tools — all three lines are pro-cyclical beneficiaries. None of this is a new thesis; LogicFolding just adds one more data point.

The broader question, though, is more interesting. Once the node-level race approaches physical limits, the next growth vector becomes "dimension," not "scale." Horizontal shrink keeps getting more expensive; vertical reorganization — backside, wafer-to-wafer, cell-level partitioning — opens up orders of magnitude of new design freedom. Huawei was forced to walk this path early because EUV was cut off, and may, ironically, end up holding an unintended early-mover position in advanced packaging.

None of this means China's compute ceiling has been broken. That ceiling rests on three pillars — density, interconnect, and memory. LogicFolding lifts the first. The other two are still there. But what it does show is this: when a door closes, you don't have to stand there. Turn around, and see if there's a window on the other wall. Sometimes there really is.

In Q3-Q4 2026, we'll find out just how big that window is.