Groot on SO101 failing
Since this is in the Physical AI / LeRobot territory, I’d still recommend taking it to the LeRobot Discord for the most reliable answer. But before that, here is what I can piece together from the public docs and related issues. Looks like this is several layers nastier than ordinary version drift:
Short diagnosis
I would not treat this first as an SO-101 robot-control problem.
It looks more like a dependency/runtime matrix problem involving several moving layers:
- NVIDIA Isaac GR00T version line: N1.5 / N1.6 / N1.7 / main
- LeRobot version line: v0.4.x / v0.5.x / current main
- PyTorch version
- TorchCodec version
- FlashAttention / FA2 version
- CUDA version and GPU architecture
- platform architecture: Linux x86_64, Windows, Linux aarch64, Jetson/Thor/Spark, etc.
- FFmpeg / libtorchcodec
- video backend: torchcodec, decord, torchvision_av, pyav
- dataset flavor: standard LeRobot dataset vs GR00T-flavored LeRobot v2 dataset
So the apparent contradiction you saw — “TorchCodec wants a newer PyTorch, but FlashAttention / GR00T wants torch 2.7-ish” — may be real in some environments, but not in all environments.
The first thing I would clarify is which platform and which docs line you are actually following.
The key conflict: GR00T N1.5 docs vs current LeRobot/TorchCodec assumptions
The LeRobot GR00T N1.5 docs currently say that GR00T N1.5 requires Flash Attention internally and that this is still not fully optional:
- LeRobot GR00T N1.5 Policy docs
Those docs give an install path around:
pip install "torch>=2.2.1,<2.8.0" "torchvision>=0.21.0,<0.23.0"
pip install "flash-attn>=2.5.9,<3.0.0" --no-build-isolation
So your concern about FlashAttention pulling you toward the torch 2.7-ish era is reasonable.
But TorchCodec itself is not always a torch>=2.11 dependency. The official TorchCodec compatibility table says, roughly:
| TorchCodec version | Compatible PyTorch line |
|---|---|
| 0.14, 0.13, 0.12 | torch >= 2.11 |
| 0.11 | torch 2.11 |
| 0.10 | torch 2.10 |
| 0.9, 0.8 | torch 2.9 |
| 0.7, 0.6 | torch 2.8 |
| 0.5, 0.4, 0.3 | torch 2.7 |
Source:
- TorchCodec README / compatibility table
So if you are on a Linux x86_64 machine and you want to stay on torch==2.7.x, the relevant question is not “does TorchCodec always require torch >= 2.11?” It is more likely:
Did the resolver install a too-new TorchCodec, such as torchcodec>=0.11 or torchcodec>=0.12, instead of the torch-2.7-compatible torchcodec 0.3–0.5 line?
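To make that question easy to answer mechanically, here is a minimal Python sketch of the lookup. The mapping is copied by hand from the compatibility table quoted in this post, not read from TorchCodec itself, and the helper name `torchcodec_lines_for` is hypothetical. Torch lines newer than 2.11 are not modeled, so re-check the TorchCodec README before relying on it.

```python
# Mapping copied from the compatibility table quoted in this post
# (an assumption: it may lag behind the TorchCodec README).
# Keys are torch "major.minor" lines, values are TorchCodec release lines.
TORCH_TO_TORCHCODEC = {
    "2.11": ["0.14", "0.13", "0.12", "0.11"],
    "2.10": ["0.10"],
    "2.9": ["0.9", "0.8"],
    "2.8": ["0.7", "0.6"],
    "2.7": ["0.5", "0.4", "0.3"],
}


def torchcodec_lines_for(torch_version: str) -> list:
    """Return the TorchCodec lines listed for this torch version, or []."""
    # Drop local build suffixes such as "+cu126", then keep "major.minor".
    parts = torch_version.split("+")[0].split(".")
    if len(parts) < 2:
        return []
    return TORCH_TO_TORCHCODEC.get(f"{parts[0]}.{parts[1]}", [])


print(torchcodec_lines_for("2.7.1+cu126"))  # ['0.5', '0.4', '0.3']
```

If the installed torchcodec version does not start with one of the returned lines, the resolver most likely picked a TorchCodec build for a different torch line.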
That said, the platform matters a lot.
Why the platform matters
Current LeRobot metadata appears to treat TorchCodec differently by platform.
In current LeRobot main, the dataset extra has platform-specific TorchCodec markers. At the time I checked, the rough shape was:
| Platform | LeRobot TorchCodec range | Practical implication |
|---|---|---|
| Linux x86_64 / AMD64 | torchcodec>=0.3.0,<0.12.0 | You may be able to pin TorchCodec to a torch-2.7-compatible line, such as 0.4 or 0.5. |
| Windows | torchcodec>=0.7.0,<0.12.0 | This is already more torch-2.8-ish or newer. Do not assume the Linux x86_64 fix applies. |
| Linux aarch64 / arm64 | torchcodec>=0.11.0,<0.12.0 | This points toward the torch-2.11-era TorchCodec line. It can conflict with GR00T N1.5’s torch <2.8 / FlashAttention instructions. |
| lerobot[groot] | includes flash-attn>=2.5.9,<3.0.0 | GR00T is not a normal “install every extra and go” case. |
Source:
- LeRobot pyproject.toml
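As a rough illustration of how those platform markers play out, the sketch below maps the current machine to the TorchCodec range from the table. The ranges are copied from this post and reflect what I saw in LeRobot main at the time, so treat them as assumptions. The function name `lerobot_torchcodec_range` is hypothetical and not part of LeRobot.

```python
import platform
from typing import Optional


def lerobot_torchcodec_range(system: str, machine: str) -> Optional[str]:
    """Return the TorchCodec specifier this post lists for a platform."""
    system, machine = system.lower(), machine.lower()
    if system == "windows":
        return "torchcodec>=0.7.0,<0.12.0"
    if system == "linux" and machine in ("x86_64", "amd64"):
        return "torchcodec>=0.3.0,<0.12.0"
    if system == "linux" and machine in ("aarch64", "arm64"):
        return "torchcodec>=0.11.0,<0.12.0"
    return None  # platform not covered by the table


# platform.system() / platform.machine() describe the running interpreter.
print(lerobot_torchcodec_range(platform.system(), platform.machine()))
```

Only the Linux x86_64 range overlaps the torch-2.7-compatible TorchCodec 0.3–0.5 line, which is why the same GR00T N1.5 recipe can resolve on one machine and conflict on another.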
One especially relevant detail: current LeRobot metadata appears to keep the groot extra out of the all extra with a comment along the lines that GR00T needs specific FlashAttention installation instructions. That is a signal that GR00T is not yet a fully ordinary dependency-extra path.
So I would be careful with commands like:
pip install "lerobot[all]"
pip install "lerobot[groot,dataset,training]"
pip install -U lerobot
unless you know exactly which LeRobot version/commit and which GR00T line you are targeting.
Possible docs-line drift
There are at least four “current-looking” instruction sets that can easily get mixed:
| Docs / repo line | Likely assumptions |
|---|---|
| General LeRobot installation docs | Newer Python / newer PyTorch assumptions, TorchCodec default video decoding, FFmpeg requirement. |
| LeRobot GR00T N1.5 policy docs | FlashAttention required, torch <2.8, CUDA-enabled device required. |
| Current NVIDIA Isaac-GR00T repository / N1.7 README | uv, FFmpeg, TorchCodec default video backend, platform-specific CUDA/Python matrix, aarch64 caveats, FlashAttention / TensorRT GPU deps. |
| HF/NVIDIA SO-101 GR00T N1.5 blog | Python 3.10, N1.5-era setup, flash-attn==2.7.1.post4, meta/modality.json, and --video-backend torchvision_av. |
Sources:
- LeRobot installation docs
- LeRobot GR00T N1.5 Policy docs
- NVIDIA Isaac-GR00T repository
- Fine-Tuning NVIDIA GR00T N1.5 on SO-101 Arm
I would not mix these blindly. They do not all imply the same Python / torch / TorchCodec / FlashAttention / video-backend stack.
Platform-specific interpretation
Case A: Linux x86_64, non-Blackwell GPU
If you are on a normal desktop/server Linux x86_64 machine, and not on an RTX 50 / Blackwell / SM120 GPU, then one plausible path is:
# Example only; verify against your exact GR00T/LeRobot line.
pip install "torch==2.7.1" "torchvision==0.22.*" "torchaudio==2.7.*" --index-url <your-matching-pytorch-cuda-wheel-index>
pip install "torchcodec>=0.3,<0.6"
pip install "flash-attn>=2.5.9,<3.0.0" --no-build-isolation
The key idea is:
- keep torch in the GR00T/FlashAttention-compatible range;
- keep TorchCodec in the torch-2.7-compatible range;
- do not allow the resolver to pull torchcodec 0.11+ or 0.12+.
I am not claiming the exact commands above will work for every CUDA/Python/GPU combination. The point is the version shape: torch 2.7.x should be paired with TorchCodec 0.3–0.5, not the newer TorchCodec line.
Case B: Linux aarch64 / Jetson / Thor / DGX Spark
This is probably not the same problem.
The current GR00T README treats aarch64 platforms as special. It says that for Thor / Orin / Spark-style aarch64 platforms, TorchCodec is a required video backend and other video backends such as decord / pyav are not supported in the same way.
Source:
- NVIDIA Isaac-GR00T README
There is also a related report where uv run python from the repository root appears to rebuild or replace a platform-specific venv and lose torchcodec and likely flash-attn on aarch64 platforms:
- Isaac-GR00T issue #675: uv run python invalidates platform-specific venv / TorchCodec missing
If you are on Jetson / Thor / Spark / aarch64, I would not start with the generic x86_64 pip recipe. I would first verify the platform-specific GR00T deployment docs and make sure you are using the intended venv, Python version, and platform wheels.
Case C: RTX 50 / Blackwell / SM120
If your GPU is RTX 50-series / Blackwell / compute capability sm_120, the torch-2.7-era path may be a separate problem.
There are reports around newer RTX 50 / SM120 environments where older PyTorch or FlashAttention paths fail with architecture/kernel support errors. This does not mean every torch 2.7 build is automatically unusable, but it does mean you should verify the actual binary being imported:
python - <<'PY'
import torch
print("torch:", torch.__version__)
print("torch cuda:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print("capability:", torch.cuda.get_device_capability(0))
PY
Related examples:
- LeRobot issue #2217: dependency resolver / torch 2.7.1 / RTX 5070 / SM120 concern
- FlashAttention issue #2168: RTX 5090 / SM120 / no kernel image-style failure
- FlashAttention issue #2361: SM120 source build trouble
For Blackwell/SM120, I would not assume that “torch 2.7.1 + FlashAttention 2.x” is automatically the right target.
Case D: Windows
I would treat Windows as a separate branch.
LeRobot’s metadata uses a different TorchCodec minimum on Windows than on Linux x86_64. FlashAttention on Windows is also generally a more fragile/unofficial path than Linux CUDA. I would not apply a Linux x86_64 GR00T/FA2 recipe directly to Windows without checking the exact wheels and support status.
TorchCodec errors are not always just “package not installed”
If the error says something like:
ImportError: torchcodec is not available
or
RuntimeError: Could not load libtorchcodec
then check more than pip show torchcodec.
TorchCodec depends on FFmpeg and binary/runtime compatibility. The error can be caused by:
- wrong TorchCodec version for your PyTorch version;
- missing or incompatible FFmpeg shared libraries;
- LD_LIBRARY_PATH / runtime loader issues;
- CUDA-enabled vs CPU-only wheel differences;
- platform wheel availability;
- venv mismatch.
Useful checks:
python - <<'PY'
import sys
print("python:", sys.version)
try:
    import torch
    print("torch:", torch.__version__)
    print("torch cuda:", torch.version.cuda)
except Exception as e:
    print("torch import failed:", repr(e))
try:
    import torchcodec
    print("torchcodec:", getattr(torchcodec, "__version__", "unknown"))
except Exception as e:
    print("torchcodec import failed:", repr(e))
PY
ffmpeg -version
which ffmpeg
python -m pip show torch torchcodec torchvision torchaudio flash-attn
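Because a missing or mismatched FFmpeg shared library is one of the causes listed above, a complementary check is to ask the system loader whether it can find the FFmpeg libraries at all. This is a minimal sketch using only the Python standard library. The library list is my assumption about what a video decoder typically links against, not TorchCodec's actual loading logic, and `ffmpeg_runtime_report` is a hypothetical helper name.

```python
import ctypes.util
import shutil

# FFmpeg shared libraries a video decoder typically links against.
# Assumption for illustration only; TorchCodec's real loader may differ.
FFMPEG_LIBS = ["avcodec", "avformat", "avutil"]


def ffmpeg_runtime_report() -> dict:
    """Report where the ffmpeg binary and FFmpeg shared libraries are found."""
    report = {"ffmpeg_binary": shutil.which("ffmpeg")}
    for lib in FFMPEG_LIBS:
        # find_library returns None when the library cannot be located.
        report[f"lib{lib}"] = ctypes.util.find_library(lib)
    return report


for key, value in ffmpeg_runtime_report().items():
    print(f"{key}: {value or 'NOT FOUND'}")
```

A "NOT FOUND" here is a hint rather than proof: an FFmpeg installed inside a conda environment or a custom prefix may be invisible to this lookup and still load at runtime, and the reverse can also happen.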
Related reports:
- Isaac-GR00T issue #421: torchcodec is not available
- Isaac-GR00T issue #502: Thor / GR00T N1.6 / Could not load libtorchcodec
- LeRobot issue #964: Could not load libtorchcodec
- LeRobot issue #3199: dataset viewer / Could not load libtorchcodec
FlashAttention errors are not always just “wrong torch version”
FlashAttention is a compiled CUDA extension path. Even if pip install flash-attn succeeds, it can still fail later due to ABI, CUDA, GPU architecture, or wheel/source-build mismatch.
Related examples:
- Isaac-GR00T issue #392: CUDA 11.8 / torch 2.7.1 / flash-attn install trouble
- Isaac-GR00T issue #233: GR00T N1.5 / flash-attn undefined symbol
- FlashAttention issue #1696: PyTorch 2.7.0 / CUDA 12.6 / prebuilt wheel ABI mismatch
- FlashAttention issue #1644: PyTorch 2.7.0 / 2.6.0 binary compatibility issue
So I would collect both install-time and runtime evidence:
python - <<'PY'
try:
    import flash_attn
    print("flash_attn module:", flash_attn)
    print("flash_attn version:", getattr(flash_attn, "__version__", "unknown"))
except Exception as e:
    print("flash_attn import failed:", repr(e))
PY
And if it imports, I would still not assume the actual model path is safe. The failure may appear only when GR00T enters the FlashAttention kernel.
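One way to probe the kernel path without loading GR00T is to run a single tiny attention call. The sketch below assumes the FlashAttention 2 Python API (`flash_attn.flash_attn_func` with tensors shaped batch, seqlen, heads, head_dim in fp16 on CUDA). The function name `flash_attn_smoke_test` and the tensor sizes are mine, chosen only for illustration. It is not the code path GR00T itself uses, so a pass here is encouraging but not conclusive.

```python
def flash_attn_smoke_test() -> str:
    """Run one tiny FlashAttention forward pass and describe the outcome."""
    try:
        import torch
    except Exception as e:
        return f"torch import failed: {e!r}"
    if not torch.cuda.is_available():
        return "skipped: no CUDA device visible"
    try:
        from flash_attn import flash_attn_func
    except Exception as e:
        return f"flash_attn import failed: {e!r}"
    try:
        # Shape is (batch, seqlen, nheads, headdim); FA2 needs fp16/bf16 on CUDA.
        q = torch.randn(1, 16, 2, 64, device="cuda", dtype=torch.float16)
        out = flash_attn_func(q, q, q, causal=False)
        torch.cuda.synchronize()  # surface asynchronous kernel errors here
        return f"ok: output shape {tuple(out.shape)}"
    except Exception as e:
        return f"kernel call failed: {e!r}"


print(flash_attn_smoke_test())
```

A "kernel call failed" result with an import that succeeded is exactly the ABI / GPU-architecture class of failure described above, and the exception text is worth including in any report.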
Video backend and codec also matter
Video decoding in GR00T/LeRobot is not just a side dependency. If the video backend is wrong, the dataset may decode incorrectly, read wrong frames, leak memory, or fail during training/evaluation.
Relevant reports:
- Isaac-GR00T issue #342: wrong video_backend / codec mismatch can get stuck at the first frame
- Isaac-GR00T issue #119: torchvision_av backend memory leak; decord / TorchCodec comparison
- Isaac-GR00T issue #172: torchvision_av timestamp/frame bug
So please include:
video_backend = ?
actual video codec = h264 / h265 / av1 / other?
ffmpeg -version = ?
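If you are not sure what the actual codec is, ffprobe (shipped with FFmpeg) can report it. Here is a small Python wrapper as a sketch. The helper names are hypothetical, and the path in the usage note is a placeholder for one of your own episode videos.

```python
import shutil
import subprocess
from typing import List, Optional


def ffprobe_codec_cmd(video_path: str) -> List[str]:
    """Build an ffprobe command that prints the first video stream's codec."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=codec_name",
        "-of", "default=noprint_wrappers=1:nokey=1",
        video_path,
    ]


def video_codec(video_path: str) -> Optional[str]:
    """Return the codec name, or None if ffprobe is missing or fails."""
    if shutil.which("ffprobe") is None:
        return None
    result = subprocess.run(
        ffprobe_codec_cmd(video_path), capture_output=True, text=True
    )
    return result.stdout.strip() or None
```

Call it as `video_codec("path/to/episode.mp4")` with a real file from your dataset. Note that ffprobe reports H.265 as `hevc`, so that is the name to expect for h265 footage.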
If you are following the SO-101 N1.5 blog, note that it used --video-backend torchvision_av in the fine-tuning command:
- SO-101 GR00T N1.5 blog
That does not automatically mean torchvision_av is the right backend for every dataset/platform.
Dataset flavor can be another separate issue
Even after the install succeeds, the dataset can still fail.
The SO-101 N1.5 blog uses a GR00T-compatible dataset preparation step and copies a meta/modality.json file. The current GR00T data preparation docs also describe GR00T’s dataset format as a LeRobot-compatible flavor with additional modality metadata.
Sources:
- SO-101 GR00T N1.5 blog
- Isaac-GR00T data preparation docs
There is also a related SO-101 dataset mismatch report:
- Isaac-GR00T issue #423: SO-101 dataset format mismatch / KeyError: observation.images.front
So I would not only ask “is this a LeRobot dataset?” I would ask:
Is this standard LeRobot v2, LeRobot v3, or GR00T-flavored LeRobot v2?
Does it contain `meta/modality.json`?
Which `embodiment-tag` are you using?
What are the exact observation/action/video keys?
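Several of those questions can be answered by looking at the dataset's meta directory. The sketch below assumes the usual LeRobot layout, where `meta/info.json` carries a `codebase_version` string and a `features` dictionary. Those two key names are my assumption about the format, so verify them against your own dataset. `describe_dataset` is a hypothetical helper, not a LeRobot or GR00T function.

```python
import json
from pathlib import Path


def describe_dataset(root: str) -> dict:
    """Summarize version, modality metadata and camera keys of a dataset."""
    meta = Path(root) / "meta"
    info_path = meta / "info.json"
    # Assumed layout: meta/info.json with "codebase_version" and "features".
    info = json.loads(info_path.read_text()) if info_path.is_file() else {}
    features = info.get("features", {})
    return {
        "codebase_version": info.get("codebase_version"),
        # GR00T-flavored datasets add this file on top of the LeRobot layout.
        "has_modality_json": (meta / "modality.json").is_file(),
        "video_keys": sorted(
            k for k in features if k.startswith("observation.images.")
        ),
    }
```

Comparing `video_keys` with the keys named in the traceback (for example `observation.images.front` in the issue above) is usually the fastest way to spot a schema mismatch.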
LeRobot version line also matters
GR00T integration in LeRobot is relatively recent and has been moving.
The HF/NVIDIA GR00T-in-LeRobot article describes GR00T integration around LeRobot v0.4.0:
- NVIDIA Isaac GR00T in LeRobot
There are also related reports where older GR00T examples hit newer LeRobot API changes:
- Isaac-GR00T issue #266: eval_lerobot.py broken by LeRobot API refactor / lerobot.common missing
And a report that the SO-101 guide’s eval_lerobot.py path needed dependencies such as draccus and lerobot, with dependency conflicts when installing LeRobot on top of the GR00T environment:
- Isaac-GR00T issue #323: SO-101 guide missing draccus / lerobot; dependency conflict
So I would include the exact LeRobot version/commit in your report.
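A quick way to capture that, and to see whether you are on the older package layout that some GR00T examples expect, is the sketch below. It uses only the standard library. The `lerobot.common` check is based on the API-refactor report linked above, and `lerobot_layout` is a hypothetical helper name.

```python
import importlib.util
from importlib import metadata


def lerobot_layout() -> dict:
    """Report the installed LeRobot version and whether lerobot.common exists."""
    try:
        version = metadata.version("lerobot")
    except metadata.PackageNotFoundError:
        return {"installed": False, "version": None, "has_lerobot_common": False}
    try:
        has_common = importlib.util.find_spec("lerobot.common") is not None
    except Exception:
        # find_spec imports the parent package, which can itself fail.
        has_common = False
    return {"installed": True, "version": version, "has_lerobot_common": has_common}


print(lerobot_layout())
```

For a source checkout, the package version alone may not identify the code, so `git rev-parse HEAD` in the LeRobot directory is still worth adding to the report.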
What I would ask you to post
Before trying another workaround, I would ask for the exact environment matrix.
Platform
- OS:
- CPU architecture: x86_64 / aarch64 / arm64?
- Machine: desktop/server GPU, Jetson Orin, Jetson Thor, DGX Spark, etc.?
- GPU model:
- GPU compute capability:
- CUDA driver:
- CUDA toolkit:
- Python version:
Python packages
- torch:
- torchvision:
- torchaudio:
- torchcodec:
- flash-attn:
- transformers:
- lerobot:
- Isaac-GR00T version/tag/commit:
- ffmpeg version:
Install path
- Are you following:
- LeRobot general installation docs?
- LeRobot GR00T N1.5 docs?
- HF/NVIDIA SO-101 N1.5 blog?
- current Isaac-GR00T README / N1.7?
- another guide?
- Did you use pip, uv, conda, or the platform-specific GR00T install scripts?
- Exact install commands:
GR00T / LeRobot usage
- Are you fine-tuning, evaluating, or deploying to real SO-101?
- Are you using GR00T’s own server/client scripts or LeRobot async inference?
- Is `policy.type=groot` involved?
- Are you using `lerobot[groot]`, `lerobot[dataset]`, `lerobot[training]`, or all extras?
Video / dataset
- video_backend:
- actual video codec:
- dataset format: LeRobot v2 / LeRobot v3 / GR00T LeRobot flavor?
- does the dataset have `meta/modality.json`?
- embodiment tag:
- full traceback:
Useful diagnostic snippet:
python - <<'PY'
import sys, platform
print("python:", sys.version)
print("platform:", platform.platform())
print("machine:", platform.machine())
for name in ["torch", "torchvision", "torchaudio", "torchcodec", "flash_attn", "transformers", "lerobot"]:
    try:
        mod = __import__(name)
        print(f"{name}:", getattr(mod, "__version__", "unknown"))
    except Exception as e:
        print(f"{name}: IMPORT FAILED:", repr(e))
try:
    import torch
    print("torch cuda:", torch.version.cuda)
    print("cuda available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("gpu:", torch.cuda.get_device_name(0))
        print("capability:", torch.cuda.get_device_capability(0))
except Exception as e:
    print("torch cuda check failed:", repr(e))
PY
ffmpeg -version || true
python -m pip freeze | grep -E 'torch|torchvision|torchaudio|torchcodec|flash|lerobot|transformers|decord|av|ffmpeg'
My tentative recovery logic
I would branch the recovery attempt like this.
| Environment | First thing I would suspect | First recovery direction |
|---|---|---|
| Linux x86_64, non-Blackwell | Resolver pulled too-new TorchCodec; FlashAttention wheel/ABI mismatch | Try torch 2.7.x + TorchCodec 0.3–0.5 + compatible FA2, with exact CUDA/Python matching. |
| Linux x86_64, RTX 50 / SM120 | torch 2.7.x / FA2 kernel path may be too old | Verify actual torch CUDA build and GPU capability; do not assume the N1.5 torch <2.8 path works. |
| Windows | Different TorchCodec minimum and fragile FA2 path | Do not reuse Linux recipe directly. |
| aarch64 / Jetson / Thor / Spark | Platform-specific GR00T stack, TorchCodec required, venv/wheel mismatch | Use platform-specific GR00T install/deployment docs; avoid generic x86_64 pip recipe. |
| torchcodec is not available | Could be FFmpeg/shared-lib/PyTorch/TorchCodec mismatch, not just missing pip package | Test import torchcodec, ffmpeg -version, torch/TorchCodec compatibility. |
| dataset/video failure after install | Backend/codec/schema mismatch | Check video_backend, codec, GR00T meta/modality.json, embodiment tag. |
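The environment rows of that table can be written as a small triage function. This is only a sketch of my own branching, with hypothetical names and labels. It treats compute capability (12, 0) as the sm_120 case discussed earlier and does not try to cover every GPU generation.

```python
from typing import Optional, Tuple


def recovery_branch(
    system: str, machine: str, capability: Optional[Tuple[int, int]] = None
) -> str:
    """Pick the environment row of the recovery table for a given machine."""
    system, machine = system.lower(), machine.lower()
    if system == "windows":
        return "windows"
    if machine in ("aarch64", "arm64"):
        return "aarch64"
    if machine in ("x86_64", "amd64"):
        # (12, 0) is the sm_120 capability this post associates with RTX 50.
        if capability is not None and tuple(capability) == (12, 0):
            return "x86_64-sm120"
        return "x86_64"
    return "unknown"


print(recovery_branch("Linux", "x86_64", (8, 6)))  # x86_64
```

In practice the arguments would come from `platform.system()`, `platform.machine()`, and `torch.cuda.get_device_capability(0)`, all of which the diagnostic snippet above already prints.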
Where Discord fits
If this still fails after collecting the matrix above, I would bring that exact report to the LeRobot Discord rather than asking vaguely.
The entry point is:
- LeRobot on Hugging Face
That should make it much easier for maintainers or users with a known-good SO-101 / GR00T setup to identify whether this is:
- a simple TorchCodec pin issue,
- a GR00T N1.5 vs current LeRobot docs-line mismatch,
- an aarch64 platform-stack issue,
- an RTX 50 / Blackwell issue,
- an FFmpeg / libtorchcodec runtime issue,
- or a dataset/video-backend issue.