Deployment of IAF with Autoscaling in VM using Docker
Prerequisites
Ensure you have access to the following:
GitHub Repositories
Infrastructure
- Access to Linux VMs
- Docker CLI installed on all VMs
Before Deploying IAF
Before deploying IAF in VMs, you need to have Kafka and LiteLLM set up.
Kafka Setup in VM
- Pull the Docker image for Kafka:
  docker pull bitnamilegacy/kafka:4.0.0-debian-12-r10
- Run Kafka using docker run. The command should start a single-node Kafka broker in KRaft (ZooKeeper-less) mode using the Bitnami Kafka image. It configures the node to act as both controller and broker, sets up internal and external listeners, and defines advertised endpoints for client connectivity. Persistent storage is enabled to retain Kafka data across restarts, and replication settings are configured for a single broker.
- Check the Kafka container status:
  docker ps -a
- After the Kafka container is created, manually create the __consumer_offsets topic; it will not be created automatically.
- Access Kafka at:
  <VM-IP>:9092
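The steps above describe the docker run command without spelling it out. The following is a minimal sketch of one way to write it, assuming the Bitnami image's KAFKA_CFG_* environment variables. The container name kafka, the volume name kafka_data, the listener names, the internal ports 9093 and 9094, and the topic settings (50 partitions, compaction) are assumptions for illustration, not values taken from this guide. The DOCKER variable is a hypothetical override that allows a dry run.

```shell
# Sketch only: names, internal ports, and topic settings are assumptions.
# Set DOCKER=echo to print the commands instead of running them.

start_kafka() {
    vm_ip="$1"   # address that clients outside the container use to reach this VM
    "${DOCKER:-docker}" run -d --name kafka --hostname kafka \
        -p 9092:9092 \
        -v kafka_data:/bitnami/kafka \
        -e KAFKA_CFG_NODE_ID=0 \
        -e KAFKA_CFG_PROCESS_ROLES=controller,broker \
        -e KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093 \
        -e KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER \
        -e KAFKA_CFG_LISTENERS=INTERNAL://:9094,EXTERNAL://:9092,CONTROLLER://:9093 \
        -e "KAFKA_CFG_ADVERTISED_LISTENERS=INTERNAL://kafka:9094,EXTERNAL://${vm_ip}:9092" \
        -e KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT \
        -e KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INTERNAL \
        -e KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
        -e KAFKA_CFG_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1 \
        -e KAFKA_CFG_TRANSACTION_STATE_LOG_MIN_ISR=1 \
        bitnamilegacy/kafka:4.0.0-debian-12-r10
}

create_offsets_topic() {
    # Runs the Kafka CLI inside the container against the internal listener.
    "${DOCKER:-docker}" exec kafka /opt/bitnami/kafka/bin/kafka-topics.sh \
        --bootstrap-server localhost:9094 \
        --create --if-not-exists \
        --topic __consumer_offsets \
        --partitions 50 --replication-factor 1 \
        --config cleanup.policy=compact
}

# Example:
# start_kafka <VM-IP>
# create_offsets_topic
```

Running with DOCKER=echo first prints the full command lines, which makes it easy to review the advertised listener address before anything is started.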
LiteLLM Server Setup
The LiteLLM setup for your VM follows the same process as the Linux VM installation guide.
LiteLLM Proxy Setup — Linux Installation
Building Docker Images
Backend
- Configure all variables in the backend .env file correctly.
- Download the backend code from the GitHub Main branch.
- Copy the Dockerfile into the same folder.
- Update main.py to set CORS origins (if * is not already present):
  - Update origins
  - For testing, update CORS (optional)
- Build the Docker image from that folder (the trailing . is the build context):
  docker build -f <docker filename with path> -t <image name>:<image tag> .
  You will get an image like:
  localhost/<image name>:<image tag>
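As a concrete illustration of the build step, here is a small sketch of a helper that wraps docker build. The image name iaf-backend, the tag 1.0, and the DOCKER dry-run override are hypothetical. Tagging the image with the localhost/ prefix explicitly is an assumption made so that the image name matches the one used in the run commands later in this guide.

```shell
# Sketch only: image name, tag, and paths are placeholders.
# Set DOCKER=echo to print the command instead of running it.

build_image() {
    dockerfile="$1"   # path to the Dockerfile
    image="$2"        # image name and tag, for example localhost/iaf-backend:1.0
    context="$3"      # folder that holds the code and the .env file
    "${DOCKER:-docker}" build -f "$dockerfile" -t "$image" "$context"
}

# Example, run from the folder that contains the backend code:
# build_image ./Dockerfile localhost/iaf-backend:1.0 .
# docker images localhost/iaf-backend
```

The same helper can be reused for the agent worker, tool worker, and frontend images by changing the arguments.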
Agent Worker
- Configure all variables in the backend .env file correctly.
- Download the backend code from the GitHub Main branch.
- Copy the Dockerfile into the same folder.
- In the Dockerfile, set the start command to run run_agent_worker.py; do not run main.py for the agent worker image.
- Update main.py CORS origins if needed.
- Build the Docker image from that folder (the trailing . is the build context):
  docker build -f <docker filename with path> -t <image name>:<image tag> .
  You will get an image like:
  localhost/<image name>:<image tag>
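Because the worker images differ from the backend image only in their start command, it can help to check the Dockerfile before building. The sketch below assumes the Dockerfile starts the service with a single-line CMD or ENTRYPOINT instruction that names the script; check_start_script is a hypothetical helper, not part of IAF.

```shell
# Sketch only: assumes a single-line CMD or ENTRYPOINT that names the script.

check_start_script() {
    dockerfile="$1"   # path to the Dockerfile
    script="$2"       # script the image is expected to run
    if grep -E '^[[:space:]]*(CMD|ENTRYPOINT)[[:space:]]' "$dockerfile" \
        | grep -q -F -- "$script"; then
        echo "ok: $dockerfile starts $script"
    else
        echo "warning: $dockerfile does not start $script" >&2
        return 1
    fi
}

# Examples:
# check_start_script ./Dockerfile run_agent_worker.py     # agent worker image
# check_start_script ./Dockerfile tool_worker/main.py     # tool worker image
```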
Tool Worker
- Configure all variables in the backend .env file correctly.
- Download the backend code from the GitHub Main branch.
- Copy the Dockerfile into the same folder.
- In the Dockerfile, set the start command to run tool_worker/main.py; do not run main.py from the root directory for the tool worker image.
- Update main.py CORS origins if needed.
- Build the Docker image from that folder (the trailing . is the build context):
  docker build -f <docker filename with path> -t <image name>:<image tag> .
  You will get an image like:
  localhost/<image name>:<image tag>
Frontend
- Configure all variables in the frontend .env file correctly (Backend URL, MkDocs, Arize Phoenix, Grafana, etc.).
- Download the frontend code from the GitHub Main branch.
- Copy the Dockerfile into the same folder.
- Build the Docker image from that folder (the trailing . is the build context):
  docker build -f <docker filename with path> -t <image name>:<image tag> .
  You will get an image like:
  localhost/<image name>:<image tag>
Deploying Containers in VMs
VM — Backend and Frontend
Build the backend and frontend images on this VM, then deploy:
# Run backend
docker run -d --name <container name> localhost/<image name>:<image tag> python main.py --host 0.0.0.0 --port <port number>
# Run frontend
docker run -d --name <container name> localhost/<image name>:<image tag> python main.py --host 0.0.0.0 --port <port number>
# Check containers
docker ps -a
Access the services at:
<VM IP>:<Backend port>
<VM IP>:<Frontend port>
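On Docker's default bridge network, a container port is reachable at <VM IP>:<port> only if it is published with -p (or the container uses host networking). The sketch below shows one way to add port publishing to the run commands above. The run_service helper, the container and image names, and the DOCKER dry-run override are hypothetical.

```shell
# Sketch only: names and ports are placeholders.
# Set DOCKER=echo to print the command instead of running it.

run_service() {
    name="$1"    # container name
    image="$2"   # image name and tag
    port="$3"    # port published on the VM and used inside the container
    shift 3      # the remaining arguments are the command run in the container
    "${DOCKER:-docker}" run -d --name "$name" -p "${port}:${port}" "$image" "$@"
}

# Example:
# run_service iaf-backend localhost/iaf-backend:1.0 8000 \
#     python main.py --host 0.0.0.0 --port 8000
```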
VM — Agent Workers
Build the agent worker image on this VM, then deploy. Run at least 5 containers, each on a different port:
docker run -d --name <container name> localhost/<image name>:<image tag> python run_agent_worker.py --host 0.0.0.0 --port <port number>
# Repeat with different port numbers for additional containers
docker ps -a
VM — Tool Workers
Build the tool worker image on this VM, then deploy. Run at least 3 containers, each on a different port:
docker run -d --name <container name> localhost/<image name>:<image tag> python tool_worker/main.py --host 0.0.0.0 --port <port number>
# Repeat with different port numbers for additional containers
docker ps -a
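Repeating the worker command by hand for every port is error-prone, so the agent and tool worker deployments above can be sketched as a loop. The start_workers helper, the container name prefixes, and the image names are hypothetical; the port numbers are taken from the sample architecture later in this guide and are only examples.

```shell
# Sketch only: names are placeholders; ports follow the sample architecture.
# Set DOCKER=echo to print the commands instead of running them.

start_workers() {
    prefix="$1"   # container name prefix, for example agent-worker
    image="$2"    # worker image name and tag
    script="$3"   # script to run inside the container
    shift 3       # the remaining arguments are the ports, one container each
    for port in "$@"; do
        "${DOCKER:-docker}" run -d --name "${prefix}-${port}" "$image" \
            python "$script" --host 0.0.0.0 --port "$port"
    done
}

# Examples:
# start_workers agent-worker localhost/iaf-agent-worker:1.0 run_agent_worker.py \
#     8102 8103 8104 8105 8106
# start_workers tool-worker localhost/iaf-tool-worker:1.0 tool_worker/main.py \
#     8101 8111 8121
```

Naming each container after its port keeps the names unique and makes it obvious in docker ps -a which worker listens where.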
All Containers Running
All containers are now up and running. You can start using the platform by triggering batches.
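Before triggering batches, it may be worth confirming that every expected container is actually in the running state rather than exited. A minimal sketch, assuming the container names are passed in by the caller (the names in the example are hypothetical):

```shell
# Sketch only: container names are placeholders.

check_running() {
    # List the names of all containers that are currently running.
    running="$("${DOCKER:-docker}" ps --filter status=running --format '{{.Names}}')"
    missing=0
    for name in "$@"; do
        # -x matches the whole line so that "kafka" does not match "kafka-ui".
        if ! printf '%s\n' "$running" | grep -q -x -F -- "$name"; then
            echo "not running: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
# check_running kafka iaf-backend iaf-frontend
```

The function returns a non-zero status if any listed container is missing, so it can be used as a gate in a deployment script.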
Notes
- Set the proxies in all VMs to establish connections between them.
- If you are unable to access URLs outside the VM, allow the required ports through the firewall.
- It is not mandatory to host all services on one VM. You can distribute them across multiple VMs:
  - VM 1: Backend, Frontend, Kafka, LiteLLM
  - VM 2: Agent Workers
  - VM 3: Tool Workers
  When using this approach, ensure the same database and Kafka credentials are used across all VMs.
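For the firewall note above, the following sketch opens a list of TCP ports. It assumes the VM uses firewalld; on ufw-based systems the equivalent command is ufw allow <port>/tcp. The open_ports helper and the SUDO and FIREWALL_CMD overrides are hypothetical, and the example ports come from the sample architecture below rather than from a required configuration.

```shell
# Sketch only: assumes firewalld. Ports are examples.
# Set SUDO="" and FIREWALL_CMD=echo to print the arguments instead of applying them.

open_ports() {
    for port in "$@"; do
        # ${SUDO-sudo} expands to "sudo" unless SUDO is set (even to empty).
        ${SUDO-sudo} "${FIREWALL_CMD:-firewall-cmd}" --permanent --add-port="${port}/tcp"
    done
    # Permanent rules take effect only after a reload.
    ${SUDO-sudo} "${FIREWALL_CMD:-firewall-cmd}" --reload
}

# Example for a VM that hosts the backend, frontend, Kafka, LiteLLM, and PostgreSQL:
# open_ports 8000 3000 9092 4000 5432
```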
Sample Multi-VM Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│                             VM 1: Core Services                             │
│                                                                             │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │ IAF Backend  │   │ IAF Frontend │   │ Kafka Server │   │   LiteLLM    │  │
│  │  (FastAPI)   │   │     (UI)     │   │   (Broker)   │   │ (LLM Proxy)  │  │
│  │  Port: 8000  │   │  Port: 3000  │   │  Port: 9092  │   │  Port: 4000  │  │
│  └──────────────┘   └──────────────┘   └──────┬───────┘   └──────────────┘  │
│                                               │                             │
│  ┌──────────────────┐                         │  Kafka Broker               │
│  │  PostgreSQL DB   │                         │  (all workers connect here) │
│  │  Port: 5432      │                         │                             │
│  │  (shared across  │                         │                             │
│  │   all 3 VMs)     │                         │                             │
│  └──────────────────┘                         │                             │
└───────────────────────────────────────────────┼─────────────────────────────┘
                                                │
                  ┌─────────────────────────────┴───────────┐
                  │                                         │
                  ▼                                         ▼
┌───────────────────────────────────┐     ┌───────────────────────────────────┐
│        VM 2: Agent Workers        │     │        VM 3: Tool Workers         │
│                                   │     │                                   │
│  ┌────────────┐   ┌────────────┐  │     │  ┌────────────┐   ┌────────────┐  │
│  │  Worker 1  │   │  Worker 2  │  │     │  │  Worker 1  │   │  Worker 2  │  │
│  │ Port: 8102 │   │ Port: 8103 │  │     │  │ Port: 8101 │   │ Port: 8111 │  │
│  └────────────┘   └────────────┘  │     │  └────────────┘   └────────────┘  │
│                                   │     │                                   │
│  ┌────────────┐   ┌────────────┐  │     │  ┌────────────┐                   │
│  │  Worker 3  │   │  Worker 4  │  │     │  │  Worker 3  │                   │
│  │ Port: 8104 │   │ Port: 8105 │  │     │  │ Port: 8121 │                   │
│  └────────────┘   └────────────┘  │     │  └────────────┘                   │
│                                   │     │                                   │
│  ┌────────────┐                   │     │  Consumer Group:                  │
│  │  Worker 5  │                   │     │  "tool-executor-workers"          │
│  │ Port: 8106 │                   │     │                                   │
│  └────────────┘                   │     └───────────────────────────────────┘
│                                   │
│  Consumer Group:                  │
│  "agent-executor-workers"         │
│                                   │
└───────────────────────────────────┘
All VMs connect to PostgreSQL on VM 1 (Port 5432)
┌────────────────────────────────────────────────┐
│  Shared PostgreSQL Database (hosted on VM 1)   │
│                                                │
│  • Task Registry (status tracking)             │
│  • Chat History                                │
│  • Agent/Tool configuration                    │
│  • Token usage logs                            │
│                                                │
│  VM 1 (Backend)       ──▶ localhost:5432       │
│  VM 2 (Agent Workers) ──▶ VM1_IP:5432          │
│  VM 3 (Tool Workers)  ──▶ VM1_IP:5432          │
└────────────────────────────────────────────────┘
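The connections in this sample architecture can be spot-checked from a worker VM before the workers are started. The sketch below assumes a netcat build that supports the -z (scan only) and -w (timeout) options; check_port and the NC override are hypothetical, and VM1_IP stands for the address of VM 1.

```shell
# Sketch only: assumes nc supports -z and -w. VM1_IP is a placeholder.

check_port() {
    host="$1"   # address of the VM that hosts the service
    port="$2"   # TCP port of the service
    if "${NC:-nc}" -z -w 3 "$host" "$port" 2>/dev/null; then
        echo "reachable: ${host}:${port}"
    else
        echo "unreachable: ${host}:${port}" >&2
        return 1
    fi
}

# Example, run on VM 2 or VM 3:
# check_port "$VM1_IP" 9092   # Kafka
# check_port "$VM1_IP" 5432   # PostgreSQL
# check_port "$VM1_IP" 4000   # LiteLLM
```

An unreachable port usually points back to the proxy and firewall notes above.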