{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreieytibldwroqnk3mpndxfuldcf44t5mqd4x3qnyckje4fi6iphhne",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mgbf2gvfvy22"
},
"path": "/t/inquiry-about-dataset-for-ai-driven-cloud-load-balancing-and-auto-scaling-of-instances/169639#post_3",
"publishedAt": "2026-03-04T20:01:33.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "sohamk28:\n\n> I’m currently building a **Smart Load Balancer with Auto-Scaling Instances** and exploring ways to optimize cloud performance using AI-based techniques.\n>\n> I’m looking for a **dataset** that contains:\n>\n> * Server or VM utilization data (CPU, memory, network usage)\n> * Task or request distribution logs\n> * Auto-scaling or workload patterns over time\n> * Any real or simulated cloud performance metrics\n>\n> I’d really appreciate it if anyone could suggest:\n>\n> * Publicly available cloud workload datasets\n> * Google, Alibaba, or Azure cluster traces\n> * Or any datasets that can help in modeling or testing AI-based load balancing algorithms\n>\n> Thanks in advance for your help and suggestions\n>\n> — _Soham Kale_\n\nHi Soham, you can approach this in two ways: use public traces for realism, and synthetic traces for controlled stress testing.\n\nPublic datasets worth checking:\n\n * Google cluster traces (Borg) for job/task scheduling and resource usage patterns\n\n * Alibaba cluster trace for container workloads and utilization over time\n\n * Azure traces and other public workload datasets from academic benchmarking papers\n\n * You can also search the Hub for terms like “cluster trace”, “workload trace”, “autoscaling trace”, “request trace”, “datacenter telemetry”, and “Kubernetes trace”\n\nIf you cannot find a dataset with all signals in one place, a common approach is to fuse:\n\n * a request arrival trace (per service), plus\n\n * a resource utilization trace (per node or pod)\n\nand then derive autoscaling events from policy simulation.\n\nHow I can help you directly:\n\n * Provide a ready-to-use synthetic dataset generator that produces time series for CPU, memory, network, request rate, latency, and error rate, plus autoscaling actions under different policies (HPA-style, predictive, RL-style)\n\n * Include bursty traffic, diurnal seasonality, noisy telemetry, failures, and multi-service interference\n\n * Output formats that plug into training easily, such as Parquet plus a Gym-style environment spec for RL, or a supervised dataset for predicting scale-up and scale-down actions\n\n * Add evaluation scripts for cost, latency, SLO violations, and stability metrics, so you can compare heuristics against learned policies\n",
"title": "Inquiry About Dataset for AI-Driven Cloud Load Balancing and Auto scaling of instances"
}