Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidwhymwg3hnsk6usxnaabdky77degpouozc2ccnbf6dedaqhphaxe",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mouf6z67cmu2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreid3yiv7yfaaxtwgkos5ufbtuwgrxypdw54xs2jefi7r6bkjym6jju"
    },
    "mimeType": "image/webp",
    "size": 286716
  },
  "path": "/shresthapandey/how-to-deploy-your-ml-model-to-aws-step-by-step-guide-af9",
  "publishedAt": "2026-06-22T07:21:40.000Z",
  "site": "https://dev.to",
  "tags": [
    "aws",
    "automation",
    "cloud",
    "productivity"
  ],
  "textContent": "I've trained more ML models than I've deployed. There's something comforting about the local loop—`model.fit()`, `model.evaluate()`, hitting 94% accuracy, then staring at the screen wondering, \"Okay, how do I make this actually useful?\"\n\nIf you're stuck there right now, this guide will help.\n\n> Note: I wrote this based on AWS documentation and standard SageMaker patterns. If you try it, drop a comment about what worked (or broke).\n\n##  **What You Need Before Starting**\n\n  * AWS account with SageMaker enabled\n  * A trained model saved as `model.pkl` (or `.joblib`)\n  * `requirements.txt` with your dependencies\n  * Python 3.8+ installed\n  * AWS CLI configured (`aws configure`)\n\n\n\n##  **Step 1: Save Your Model**\n\n\n    import joblib\n    joblib.dump(model, 'model.pkl')\n\n\nCreate a `requirements.txt` file:\n\n\n\n    sklearn==1.2.0\n    pandas==1.5.0\n    numpy==1.23.0`\n\n\nKeep both files in the same folder.\n\n##  **Step 2: Upload to S3**\n\n\n    import boto3\n\n    s3 = boto3.client('s3')\n\n    bucket_name = 'my-unique-ml-bucket-12345'  # Make this unique\n    s3.create_bucket(Bucket=bucket_name, CreateBucketConfiguration={\n        'LocationConstraint': 'us-east-1'\n    })\n\n    s3.upload_file('model.pkl', bucket_name, 'models/model.pkl')\n    s3.upload_file('requirements.txt', bucket_name, 'models/requirements.txt')\n\n    model_s3_path = f's3://{bucket_name}/models/model.pkl'\n\n\n##  **Step 3: Write Your Inference Script**\n\nSave this as `inference.py`:\n\n\n\n    import json\n    import joblib\n    import numpy as np\n    import os\n\n    model = None\n\n    def model_fn(model_dir):\n        return joblib.load(os.path.join(model_dir, 'model.pkl'))\n\n    def input_fn(input_data, content_type):\n        if content_type == 'application/json':\n            data = json.loads(input_data)\n            return np.array(data['features'])\n        raise ValueError(f\"Unsupported content type: {content_type}\")\n\n    def 
predict_fn(input_data, model):\n        return model.predict(input_data)\n\n    def output_fn(prediction, accept):\n        return json.dumps({'predictions': prediction.tolist()})\n\n\nSageMaker calls `model_fn` once when the container starts, then `input_fn`, `predict_fn`, and `output_fn` in that order for each request to your endpoint.\n\n##  **Step 4: Deploy Using Python SDK**\n\nRun this in a Python script:\n\n\n\n    from sagemaker.sklearn.model import SKLearnModel\n    from sagemaker import get_execution_role\n\n    sklearn_model = SKLearnModel(\n        model_data=model_s3_path,\n        # get_execution_role() works inside SageMaker notebooks and Studio;\n        # when running elsewhere, pass your IAM role ARN as a string instead\n        role=get_execution_role(),\n        entry_point='inference.py',\n        framework_version='1.2-1',  # Container version matching scikit-learn 1.2\n        py_version='py3'\n    )\n\n    # The instance type belongs on deploy(), not on the model constructor\n    sklearn_model.deploy(\n        initial_instance_count=1,\n        instance_type='ml.m5.large',\n        endpoint_name='my-model-endpoint'\n    )\n\n\nThis takes 5–10 minutes. You'll see the endpoint status go from `Creating` → `InService`.\n\n##  **Step 5: Test Your Endpoint**\n\n\n    import boto3\n    import json\n\n    runtime = boto3.client('sagemaker-runtime')\n\n    response = runtime.invoke_endpoint(\n        EndpointName='my-model-endpoint',\n        ContentType='application/json',\n        Body=json.dumps({'features': [[5.1, 3.5, 1.4, 0.2]]})\n    )\n\n    result = json.loads(response['Body'].read().decode())\n    print(result)\n\n\nIf you see `{'predictions': [...]}`, it worked.\n\n##  **Step 6: Clean Up**\n\nEndpoints cost money even when idle:\n\n\n\n    aws sagemaker delete-endpoint --endpoint-name my-model-endpoint\n    aws sagemaker delete-endpoint-config --endpoint-config-name my-model-endpoint\n\n\n##  **Common Errors (And Fixes)**\n\n**Error** | **Fix**\n---|---\n`NoCredentialsError` | Run `aws configure` again\n`InvalidRoleException` | IAM role needs S3 + SageMaker permissions\n`ModelError` | Check `inference.py` for missing imports\nEndpoint stuck on `Creating` | Wait 5–10 more minutes\n\nYour IAM role needs:\n\n  * `s3:GetObject`, `s3:PutObject`\n  * `sagemaker:CreateModel`, 
`sagemaker:CreateEndpointConfig`, `sagemaker:CreateEndpoint`\n\n\n\n##  **Cost Breakdown**\n\n**Resource** | **Cost**\n---|---\n`ml.m5.large` | ~$0.20/hour (~$144/month if 24/7; check current pricing for your region)\nS3 storage | ~$0.02/GB/month\n\nDelete the endpoint when you're not using it. I've seen $50 surprises from idle endpoints.\n\n##  **Verify This Before You Trust It**\n\nIf you're following this, check:\n\n  1. **AWS SDK version**: Run `pip show boto3 sagemaker`\n  2. **IAM role permissions**: The biggest blocker is usually missing permissions\n  3. **Region mismatch**: The S3 bucket region must match the SageMaker region\n  4. **`inference.py` imports**: Make sure `joblib` and `numpy` are available in the container (`os` and `json` ship with Python)\n\n\n\nIf something breaks, comment below with the error. I'll update this guide.\n\n##  **Final Thoughts**\n\nDeploying ML feels intimidating until you do it once. SageMaker handles most of the complexity. You just upload your model to S3, point SageMaker at it, and deploy.\n\nI've trained models that sat on my laptop for months because I didn't know how to deploy them. Now I tell people: \"Just run this script, it's not that hard.\"\n\n_If you're building something with this, drop a comment. I love seeing what people deploy._",
  "title": "How to Deploy Your ML Model to AWS (Step-by-Step Guide)"
}