Corti.com

@sascha.corti.com.ap.brid.gy

Welcome to corti.com, home of a collection of thoughts by Sascha Corti. 🌉 bridged from ⁂ https://corti.com/, follow @ap.brid.gy to interact

6 followers · 0 following · 27 stories

Longform Stories

Clustering Two NVIDIA DGX Sparks to Serve Qwen3-30B-Thinking with Ray + vLLM

TL;DR: We took two NVIDIA DGX Spark units, wired them together over a 200 GbE link, joined them into a single Ray cluster running inside a vLLM container, and serve Qwen/Qwen3-30B-A3B-Thinking-2507-FP…

2d ago·10 min read·1969 words

Borrowing Memory, Not Speed: Clustering a Mac Studio and a DGX Spark with exo

Every local-inference setup eventually hits the same wall: a model you want to run is a few gigabytes too big for the one machine you'd run it on. You have a 128 GB Mac Studio. The model wants 160 GB.…

3d ago·12 min read·2308 words

Why Qwen3.6-35B Runs on a NVIDIA DGX Spark and gpt-oss-120B Fought Me Every Step

A field report from getting a local LLM inference endpoint working on an NVIDIA DGX Spark (GB10 / SM121, 128 GB unified memory) — including every wall I hit with gpt-oss-120B, why a smaller FP8 model …

5d ago·12 min read·2232 words

When the Helpdesk Becomes the Hacker: Technical Analysis of the Meta AI Account Takeover Incident And How to Prevent It

In June 2026, security researchers uncovered one of the most surprising account takeover incidents in recent memory. Attackers did not exploit a memory corruption bug, bypass cryptography, or compromi…

Jun 3·7 min read·1341 words

Microsoft’s New MAI Models: A Technical Analysis

At Build 2026, Microsoft significantly expanded its in-house MAI (Microsoft AI) model family. While much of the public attention focused on Microsoft's ongoing relationship with OpenAI, the more inter…

Jun 3·8 min read·1514 words

Two Sparks, One Cluster: Why Stacking NVIDIA DGX Spark Units Unlocks Local Frontier-Scale Inference

The NVIDIA DGX Spark put a Grace Blackwell superchip on the desk for the price of a high-end workstation. A single unit is already a capable local-inference box — 128 GB of unified memory, FP4 tensor …

Jun 1·12 min read·2299 words

Perplexity Bumblebee: Fast, Read-Only Supply-Chain Exposure Checks for Developer Machines

Modern software supply-chain incidents move fast. A malicious package version is published, copied into lockfiles, installed into developer environments, embedded into project workspaces, or exposed t…

May 31·15 min read·2923 words

Running GPT-OSS-120B on a Single NVIDIA DGX Spark - A Practical Guide

Note on the model name: OpenAI’s open-weight family ships as gpt-oss-20b and gpt-oss-120b. There is no 130B variant — this guide targets gpt-oss-120b, which is the one sized to fit the Spark’s unified…

May 31·8 min read·1590 words

Tiny11: Giving an Old, Unsupported PC a Secure Second Life with a Minimal Windows 11 Installation

When Windows 10 reached end of support, many perfectly usable PCs were pushed into an uncomfortable corner. The hardware still worked. The CPU was still fast enough for web browsing, email, light offi…

May 29·18 min read·3570 words

Install the “Caveman” Skill for GitHub Copilot CLI System-Wide

Large Language Models are incredibly powerful for software engineering, but they also have a habit of being verbose. Long explanations, conversational filler, and repeated context all consume tokens, …

May 28·7 min read·1262 words

What Achieving AGI Could Mean: Beyond Bigger Models and Longer Context Windows

Artificial General Intelligence, or AGI, is one of those terms that is both overused and underdefined. Depending on who you ask, it means human-level intelligence, economically useful autonomy, recurs…

May 27·12 min read·2354 words

From Passwords to Keys: Setting Up GitHub SSH Authentication on macOS (and Never Typing Credentials Again)

If you are still cloning GitHub repositories over HTTPS and repeatedly authenticating with browser logins or tokens, switching to SSH is one of those small infrastructure improvements that pays off ev…

May 21·5 min read·970 words

LLMs Corrupt Your Documents When You Delegate

The uncomfortable gap between “can edit” and “can be trusted.” A lot of current AI enthusiasm is built around delegation. We no longer ask language models only to answer questions. We ask them to mod…

May 12·1 min read·82 words

CopyFail (CVE-2026-31431): Why a Tiny Linux Kernel Bug Became a Massive Infrastructure Threat

A newly disclosed Linux kernel vulnerability dubbed CopyFail (CVE-2026-31431) has quickly become one of the most serious Linux privilege escalation flaws in recent years. The bug allows an unprivilege…

May 11·1 min read·89 words

Graphify: Bringing Knowledge Graphs to AI-Assisted Engineering

AI coding assistants are becoming very good at generating code, explaining APIs, and navigating local repositories. But they still have a structural weakness: most of them reason over code through tex…

Apr 28·1 min read·83 words

Palantir’s 22-Point Manifesto, Decoded

What The Technological Republic says about software, state power, and the future of defense tech. Palantir’s recent X post is worth reading carefully, not because it is subtle, but because it is unus…

Apr 22·1 min read·80 words

EvilTokens: An AI-Driven Device Code Attack Compromising Microsoft Businesses

A new class of identity attacks is rapidly scaling across enterprises: AI-augmented device code phishing, operationalized through phishing-as-a-service (PhaaS) platforms like EvilTokens. Microsoft and…

Apr 9·1 min read·74 words

AI Agent Traps: When the Web Becomes the Attack Surface for Autonomous Agents

Autonomous AI agents are quickly moving beyond chat. They browse the web, read documents, call tools, retrieve knowledge, send messages, and increasingly act on behalf of users and organizations. That…

Apr 7·1 min read·90 words

Working Beyond the Desk: Using the M5 Apple Vision Pro as a High-Brightness External Display that works on the Balcony on a Sunny Day

I recently upgraded from the first-generation Apple Vision Pro to the new Apple Vision Pro M5 because, even though this device and MR/VR in general get a lot of bad press, it has fundamentally changed how…

Apr 2·1 min read·98 words