Corti.com

Welcome to corti.com, home of a collection of thoughts by Sascha Corti.

Longform Stories

Clustering Two NVIDIA DGX Sparks to Serve Qwen3-30B-Thinking with Ray + vLLM

TL;DR We took two NVIDIA DGX Spark units, wired them together over a 200 GbE link, joined them into a single Ray cluster running inside a vLLM container, and serve Qwen/Qwen3-30B-A3B-Thinking-2507-FP…

2d ago·10 min read·1969 words

Borrowing Memory, Not Speed: Clustering a Mac Studio and a DGX Spark with exo

Every local-inference setup eventually hits the same wall: a model you want to run is a few gigabytes too big for the one machine you'd run it on. You have a 128 GB Mac Studio. The model wants 160 GB.…

3d ago·12 min read·2308 words

Why Qwen3.6-35B Runs on an NVIDIA DGX Spark and gpt-oss-120B Fought Me Every Step

A field report from getting a local LLM inference endpoint working on an NVIDIA DGX Spark (GB10 / SM121, 128 GB unified memory) — including every wall I hit with gpt-oss-120B, why a smaller FP8 model …

5d ago·12 min read·2232 words

When the Helpdesk Becomes the Hacker: Technical Analysis of the Meta AI Account Takeover Incident and How to Prevent It

In June 2026, security researchers uncovered one of the most surprising account takeover incidents in recent memory. Attackers did not exploit a memory corruption bug, bypass cryptography, or compromi…

Jun 3·7 min read·1341 words

Microsoft’s New MAI Models: A Technical Analysis

At Build 2026, Microsoft significantly expanded its in-house MAI (Microsoft AI) model family. While much of the public attention focused on Microsoft's ongoing relationship with OpenAI, the more inter…

Jun 3·8 min read·1514 words

Two Sparks, One Cluster: Why Stacking NVIDIA DGX Spark Units Unlocks Local Frontier-Scale Inference

The NVIDIA DGX Spark put a Grace Blackwell superchip on the desk for the price of a high-end workstation. A single unit is already a capable local-inference box — 128 GB of unified memory, FP4 tensor …

Jun 1·12 min read·2299 words

Perplexity Bumblebee: Fast, Read-Only Supply-Chain Exposure Checks for Developer Machines

Modern software supply-chain incidents move fast. A malicious package version is published, copied into lockfiles, installed into developer environments, embedded into project workspaces, or exposed t…

May 31·15 min read·2923 words

Running GPT-OSS-120B on a Single NVIDIA DGX Spark - A Practical Guide

Note on the model name: OpenAI’s open-weight family ships as gpt-oss-20b and gpt-oss-120b. There is no 130B variant — this guide targets gpt-oss-120b, which is the one sized to fit the Spark’s unified…

May 31·8 min read·1590 words

Tiny11: Giving an Old, Unsupported PC a Secure Second Life with a Minimal Windows 11 Installation

When Windows 10 reached end of support, many perfectly usable PCs were pushed into an uncomfortable corner. The hardware still worked. The CPU was still fast enough for web browsing, email, light offi…

May 29·18 min read·3570 words

Install the “Caveman” Skill for GitHub Copilot CLI System-Wide

Large Language Models are incredibly powerful for software engineering, but they also have a habit of being verbose. Long explanations, conversational filler, and repeated context all consume tokens, …

May 28·7 min read·1262 words

What Achieving AGI Could Mean: Beyond Bigger Models and Longer Context Windows

Artificial General Intelligence, or AGI, is one of those terms that is both overused and underdefined. Depending on who you ask, it means human-level intelligence, economically useful autonomy, recurs…

May 27·12 min read·2354 words

From Passwords to Keys: Setting Up GitHub SSH Authentication on macOS (and Never Typing Credentials Again)

If you are still cloning GitHub repositories over HTTPS and repeatedly authenticating with browser logins or tokens, switching to SSH is one of those small infrastructure improvements that pays off ev…

May 21·5 min read·970 words

LLMs Corrupt Your Documents When You Delegate

The uncomfortable gap between “can edit” and “can be trusted.” A lot of current AI enthusiasm is built around delegation. We no longer ask language models only to answer questions. We ask them to mod…

May 12·1 min read·82 words

CopyFail (CVE-2026-31431): Why a Tiny Linux Kernel Bug Became a Massive Infrastructure Threat

A newly disclosed Linux kernel vulnerability dubbed CopyFail (CVE-2026-31431) has quickly become one of the most serious Linux privilege escalation flaws in recent years. The bug allows an unprivilege…

May 11·1 min read·89 words

Graphify: Bringing Knowledge Graphs to AI-Assisted Engineering

AI coding assistants are becoming very good at generating code, explaining APIs, and navigating local repositories. But they still have a structural weakness: most of them reason over code through tex…

Apr 28·1 min read·83 words

Palantir’s 22-Point Manifesto, Decoded

What The Technological Republic says about software, state power, and the future of defense tech. Palantir’s recent X post is worth reading carefully, not because it is subtle, but because it is unus…

Apr 22·1 min read·80 words

EvilTokens: An AI-Driven Device Code Attack Compromising Microsoft Businesses

A new class of identity attacks is rapidly scaling across enterprises: AI-augmented device code phishing, operationalized through phishing-as-a-service (PhaaS) platforms like EvilTokens. Microsoft and…

Apr 9·1 min read·74 words

AI Agent Traps: When the Web Becomes the Attack Surface for Autonomous Agents

Autonomous AI agents are quickly moving beyond chat. They browse the web, read documents, call tools, retrieve knowledge, send messages, and increasingly act on behalf of users and organizations. That…

Apr 7·1 min read·90 words

Working Beyond the Desk: Using the M5 Apple Vision Pro as a High-Brightness External Display That Works on the Balcony on a Sunny Day

I recently upgraded from the first-generation Apple Vision Pro to the new Apple Vision Pro M5 because, even though this device and MR/VR in general get a lot of bad press, it has fundamentally changed how…

Apr 2·1 min read·98 words

Apple Vision Pro in Switzerland: How to Use It Well in an Unsupported Country

Apple Vision Pro is portable by design, and Apple explicitly positions it as a device you can use at home, at work, and while traveling. But there is a practical difference between traveling with Visi…

Mar 27·1 min read·99 words

HVE Core for VS Code: Turning GitHub Copilot into a Structured Engineering System. A Practical Guide

AI-assisted engineering becomes much more valuable when it is constrained by process, standards, and reusable workflows. That is exactly where HVE Core for VS Code stands out. Rather than treating Gi…

Mar 27·1 min read·88 words

Why Running Redis in a Local Docker Container Is a Smart Move for Developers

Modern development is increasingly service-driven. Even small apps often depend on infrastructure components like databases, caches, queues, and session stores. Redis fits naturally into that world be…

Mar 25·1 min read·83 words

AI Is Not Converging. It Is Being Orchestrated.

For the last two years, the dominant question in AI has been deceptively simple: which model will win? That question made sense when the market was still trying to understand whether large language m…

Mar 25·1 min read·88 words

From scanners to reasoning: how LLMs and agent harnesses can improve code security

Better models matter, but better harnesses may matter more. The future of AI-assisted security is evidence, validation, and human-guided judgment. A year ago, a team at Microsoft explored an idea tha…

Mar 9·1 min read·93 words

AI Hijacking via Open-Source Agent Tooling: A Five-Layer Attack Anatomy

The threat landscape for AI-assisted development environments has quietly expanded beyond the attack surfaces that traditional security tooling is designed to cover. While conventional supply chain at…

Mar 6·1 min read·82 words

Building an AI-Powered Birthday Calendar with FastAPI and Vanilla JavaScript

A full-stack self-hosted app with email reminders, AI-based gift suggestions, and zero framework overhead on the frontend. Why Build a Birthday Calendar? I kept forgetting birthdays. Not the big on…

Feb 27·1 min read·91 words

AI-Powered 3D Printing: From Text to STL with Meshy and OpenClaw

How I taught my AI assistant to generate 3D-printable models from simple text descriptions. The Problem: I've been 3D printing for years, but there's always been a gap in my workflow: organic shapes …

Feb 22·1 min read·91 words