{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreigzhabgp7apqpwu44vzhjuzlhvfdkqg6dbtud6b5rzkdvv3psf4h4",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mnv6x362jol2"
  },
  "path": "/t/deepseek-qwen/176657#post_1",
  "publishedAt": "2026-06-09T21:37:14.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "\n\n\nGuys, I don’t even know where to begin. I’m used to working with RTX 3090s and the models that fit this type of setup. Recently, I had the opportunity to work with a server with an H200 and 2TB of RAM, and now I have no idea what to use in this setup. I was thinking of using Deepseek v4 Flash, but in conversations with ChatGPT, he’s been telling me I won’t get good results. Does anyone have any experience with VLLM in a setup like this? Or can anyone tell me what the best options are for a setup of this size? Initially, it will be for some internal applications working with text; we’ll use it mostly for testing to see if it performs well.\n\n\n",
  "title": "Deepseek? Qwen?"
}