How can I tell the size of a model before downloading it?

Hugging Face Forums [Unofficial] June 2, 2026
Yes, that was the difference I was pointing out. It matters because for VRAM you need room for the model weights in GB plus some breathing room. That limits the options, because one of the largest, fastest consumer cards that can run AI models has only 24 GB of VRAM. Some of the older architectures go up to, I think, 48 GB. There may be others with more. My research has not been exhaustive, because I'm still looking into this myself.

What this means in practice is that some of the larger models (like the 70B models) need at least two graphics cards. As I understand it, a 70B model is not loaded onto one card as a single block: it is run through an architecture that breaks the model up across cards. I don't completely understand that part yet. I think they also run it as an 8-bit model, if that is the right terminology, which lets the program you use with the 70B model operate it in a VRAM pool shared between two or more cards instead of on just one card.

The smaller models, like the 8B ones (and I think there is a 16B one), can run safely on one 24 GB card, but the larger ones won't. So far, my research for what I'm trying to do has only turned up the 8B, 30B, and 70B sizes. I haven't been looking into it very long.
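The "weights in GB plus some breathing room" rule can be sketched as simple arithmetic. This is only a rough estimate under stated assumptions: weight size is taken as parameter count times bits per weight, the 20% headroom figure is a guess rather than a measured value, and the model is assumed to split evenly across identical cards. The function names `weights_gb` and `cards_needed` are made up for this illustration, and real usage also depends on context length and the runtime you pick.

```python
import math


def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    # 1 billion parameters at 8 bits (1 byte) each is about 1 GB.
    return params_billion * bits_per_weight / 8


def cards_needed(params_billion: float, bits_per_weight: int,
                 card_gb: float = 24.0, overhead: float = 0.2) -> int:
    """Smallest number of identical cards whose pooled VRAM fits the weights
    plus headroom. `overhead` (20% here) is an assumed figure, not a measurement."""
    required_gb = weights_gb(params_billion, bits_per_weight) * (1 + overhead)
    return math.ceil(required_gb / card_gb)


if __name__ == "__main__":
    # Sizes mentioned in the post, at three common precisions.
    for size in (8, 30, 70):
        for bits in (16, 8, 4):
            print(f"{size}B @ {bits}-bit: ~{weights_gb(size, bits):.0f} GB weights, "
                  f"{cards_needed(size, bits)} x 24 GB card(s)")
```

Under these assumptions an 8B model fits on one 24 GB card even at 16-bit, while a 70B model needs a pool of several cards unless it is run at 4-bit, where two cards are enough on paper.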
