Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreieytibldwroqnk3mpndxfuldcf44t5mqd4x3qnyckje4fi6iphhne",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mgbf2gvfvy22"
  },
  "path": "/t/inquiry-about-dataset-for-ai-driven-cloud-load-balancing-and-auto-scaling-of-instances/169639#post_3",
  "publishedAt": "2026-03-04T20:01:33.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "sohamk28:\n\n> I’m currently building a **Smart Load Balancer with Auto-Scaling Instances** and exploring ways to optimize cloud performance using AI-based techniques.\n>\n> I’m looking for a **dataset** that contains:\n>\n>   * Server or VM utilization data (CPU, memory, network usage)\n>   * Task or request distribution logs\n>   * Auto-scaling or workload patterns over time\n>   * Any real or simulated cloud performance metrics\n>\n\n>\n> I’d really appreciate it if anyone could suggest:\n>\n>   * Publicly available cloud workload datasets\n>   * Google, Alibaba, or Azure cluster traces\n>   * Or any datasets that can help in modeling or testing AI-based load balancing algorithms\n>\n\n>\n> Thanks in advance for your help and suggestions\n>\n> — _Soham Kale_\n\nHi Soham, you can cover this in two ways: use public traces for realism, and synthetic traces for controlled stress testing.\n\nPublic datasets worth checking:\n\n  * Google cluster traces (Borg) for job/task scheduling and resource usage patterns\n\n  * Alibaba cluster trace for container workloads and utilization over time\n\n  * Azure traces and other public workload datasets from academic benchmarking papers\n\n  * Also search the Hub for “cluster trace”, “workload trace”, “autoscaling trace”, “request trace”, “datacenter telemetry”, and “Kubernetes trace”\n\n\n\n\nIf you cannot find a dataset with all signals in one place, a common approach is to fuse:\n\n  * a request arrival trace (per service) plus\n\n  * a resource utilization trace (per node or pod),\nthen derive autoscaling events from policy simulation.\n\n\n\n\nHow I can help you directly:\n\n  * Provide a ready-to-use synthetic dataset generator that produces time series for CPU, memory, network, request rate, latency, and error rate, plus autoscaling actions under different policies (HPA-style, predictive, RL-style)\n\n  * Include bursty traffic, diurnal seasonality, noisy telemetry, failures, and multi-service interference\n\n  * Output formats that plug into training easily, like Parquet plus a Gym-style environment spec for RL, or a supervised dataset for predicting scale-up and scale-down actions\n\n  * Add evaluation scripts for cost, latency, SLO violations, and stability metrics, so you can compare heuristics vs. learned policies\n\n\n",
  "title": "Inquiry About Dataset for AI-Driven Cloud Load Balancing and Auto scaling of instances"
}