Deployment of IAF with Autoscaling in AKS
Prerequisites
Ensure you have access to the following:
- Infosys GitHub repos
- The Azure resources below, already created, with permission to push to and pull from them:
  - Azure Container Registry (ACR)
  - Azure Kubernetes Service (AKS)
  - Azure Postgres Service
- Docker, Helm, and kubectl installed on your Azure VM
Steps for Login to AKS and ACR from Azure VM
ACR
Log in to ACR using the commands below (the login steps are also shown on your ACR resource in the Azure portal):
az login
az acr login --name <acr name>
AKS
Log in to AKS using the commands below (these commands are also shown on the AKS resource in the Azure portal):
az login --tenant <your tenant id>
az account set --subscription "<subscription name>"
sudo az aks get-credentials --resource-group <resource group name> --name <aks name> --overwrite-existing
Note
A message like the following means you have logged in to AKS successfully:
Merged "<aks name>" as current context in /root/.kube/config
Before deploying IAF, you need Kafka and KEDA set up in AKS and LiteLLM set up on your VM.
Steps for Setting up Kafka in AKS
- Use the command below to pull the Kafka Docker image to your VM:
  docker pull bitnamilegacy/kafka:4.0.0-debian-12-r10
- Create the Kafka configuration file (kafka-values.yaml): a custom Helm values file that configures Kafka for AKS, including KRaft (ZooKeeper-less) mode, replica counts, listener settings, persistent storage, service exposure, and the Kafka image source mirrored to ACR. This file customizes the Bitnami Kafka Helm chart for AKS and applies the simple tuning required for a single-broker deployment.
- Use the command below to install Kafka:
  helm install kafka bitnami/kafka -n kafka -f kafka-values.yaml
- Use the command below to check the Kafka pod status:
  kubectl get pods -n <namespace>
- After the Kafka pod is created, create the __consumer_offsets topic for the Kafka cluster; it is not created automatically.
- You can now access Kafka at either of the following addresses:
  <External-IP>:9092 or <service>.<namespace>.svc.cluster.local:9092
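As a rough illustration of the values file described in the steps above, the sketch below builds a minimal single-node configuration in Python and prints it as JSON, which Helm can read as a values file because JSON is valid YAML. The key names used here (image.*, controller.*, broker.*, listeners.client.protocol, service.type, global.security.allowInsecureImages) follow the Bitnami Kafka chart layout as commonly documented, but they change between chart versions, so check them against the output of helm show values bitnami/kafka before use. The ACR repository name kafka and the 8Gi volume size are assumptions, not values taken from this guide.

```python
import json


def kafka_values(acr_name: str, tag: str = "4.0.0-debian-12-r10") -> dict:
    """Return illustrative Helm values for a single-node KRaft Kafka."""
    return {
        # Needed by recent Bitnami charts when the image is not the default one.
        "global": {"security": {"allowInsecureImages": True}},
        "image": {
            "registry": f"{acr_name}.azurecr.io",
            "repository": "kafka",  # hypothetical repository name in ACR
            "tag": tag,
        },
        # One combined controller/broker node, no dedicated brokers.
        "controller": {
            "replicaCount": 1,
            "persistence": {"enabled": True, "size": "8Gi"},  # size is an assumption
        },
        "broker": {"replicaCount": 0},
        # Plaintext client listener, matching the port 9092 access shown above.
        "listeners": {"client": {"protocol": "PLAINTEXT"}},
        "service": {"type": "LoadBalancer"},
    }


def render(values: dict) -> str:
    """Serialize the values as JSON (a subset of YAML)."""
    return json.dumps(values, indent=2)


if __name__ == "__main__":
    print(render(kafka_values("<acr name>")))
```

Redirecting the output to kafka-values.yaml gives a file that can be passed to helm install with -f. External clients usually need additional external-access settings in the chart, which are left out of this sketch.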
Steps for KEDA Setup in AKS
- Install KEDA by applying the release manifest directly (the release tag and the file name in the URL must refer to the same version):
  kubectl apply -f https://github.com/kedacore/keda/releases/download/v2.14.0/keda-2.14.0.yaml
- Check the KEDA pods using the command below:
  kubectl get pods -n keda
  You should see output similar to:
  keda-operator-xxxxxx
  keda-metrics-apiserver-xxxxxx
- Create a ScaledObject for each of the agent-worker and tool-worker Deployments.
The ScaledObject configuration should include a reference to the target agent worker Deployment, the minimum and maximum replica limits, and polling/cooldown settings. It must define a Kafka-based trigger specifying the bootstrap server, topic, and consumer group. Lag and activation thresholds are configured to control when scaling starts and how aggressively it scales. Optional HPA behavior settings can be used to fine-tune scale-up and scale-down behavior.
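The fields listed above map onto a KEDA ScaledObject roughly as in the following sketch, which builds the manifest as a Python dictionary and prints it as JSON (kubectl apply accepts JSON as well as YAML). The field names follow the keda.sh/v1alpha1 ScaledObject schema and the Kafka scaler metadata; every concrete value (names, topic, consumer group, thresholds, replica limits) is a placeholder assumption and should be replaced with the values for your deployment.

```python
import json


def kafka_scaled_object(name, deployment, namespace, bootstrap_servers,
                        topic, consumer_group, min_replicas=1, max_replicas=5,
                        lag_threshold=10, activation_lag_threshold=0):
    """Build a ScaledObject that scales a Deployment on Kafka consumer lag."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Target Deployment (Deployment is the default kind).
            "scaleTargetRef": {"name": deployment},
            "minReplicaCount": min_replicas,
            "maxReplicaCount": max_replicas,
            "pollingInterval": 30,   # seconds between lag checks
            "cooldownPeriod": 300,   # seconds before scaling back to zero
            "triggers": [{
                "type": "kafka",
                # Scaler metadata values are strings.
                "metadata": {
                    "bootstrapServers": bootstrap_servers,
                    "topic": topic,
                    "consumerGroup": consumer_group,
                    "lagThreshold": str(lag_threshold),
                    "activationLagThreshold": str(activation_lag_threshold),
                },
            }],
            # Optional HPA behavior tuning.
            "advanced": {"horizontalPodAutoscalerConfig": {"behavior": {
                "scaleDown": {"stabilizationWindowSeconds": 300},
            }}},
        },
    }


if __name__ == "__main__":
    manifest = kafka_scaled_object(
        name="agent-worker-scaler",            # hypothetical names throughout
        deployment="agent-worker",
        namespace="default",
        bootstrap_servers="kafka.kafka.svc.cluster.local:9092",
        topic="agent-tasks",
        consumer_group="agent-workers",
    )
    print(json.dumps(manifest, indent=2))
```

A second call with the tool-worker Deployment name, topic, and consumer group would produce the ScaledObject for the tool worker.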
Steps for Setting up LiteLLM Server in VM
The setup process for LiteLLM is described at the URL below; follow it to complete the LiteLLM server setup on your VM.
Steps for Setting up Model Server
The setup process for the model server is described at the URL below; follow it to complete the model server setup on your VM.
Steps for Setting up Knowledge Base Server
The setup process for the knowledge base server is described at the URL below; follow it to complete the knowledge base server setup on your VM.
Before deploying the frontend and backend in AKS, we need to deploy the following services:
- Arize Phoenix
- Elasticsearch
- OpenTelemetry
- Redis
- Grafana
Let us start with the deployment of these services.
Warning
Make sure that you always use the latest version of the images for these services.
Steps for Deploying Arize Phoenix in AKS
Log in to the Azure VM:
- Create a YAML file that deploys Arize Phoenix as a container; you can reference the Arize Phoenix image directly in the YAML file.
- Create the deployment and service using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- Note down the load balancer IP for the container.
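The YAML file mentioned in the first step typically holds a Deployment plus a LoadBalancer Service. The sketch below generates such a pair as JSON, wrapped in a v1 List so that kubectl apply can take both objects from one file. The image arizephoenix/phoenix:latest and port 6006 are assumptions based on the public Phoenix image; the helper name and namespace are hypothetical. The same helper could be reused for Redis, Grafana, Elasticsearch, and OpenTelemetry by changing the image and port.

```python
import json


def deployment_and_service(name, image, port, namespace="default", replicas=1):
    """Return a v1 List holding a Deployment and a LoadBalancer Service."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                }]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # LoadBalancer gives the service an external IP in AKS.
            "type": "LoadBalancer",
            "selector": labels,
            "ports": [{"port": port, "targetPort": port}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    # Image and port are assumptions; adjust to the image you actually use.
    print(json.dumps(
        deployment_and_service("phoenix", "arizephoenix/phoenix:latest", 6006),
        indent=2))
```

Saving the output to a file and passing it to kubectl apply -f creates both objects; the external IP then appears in the kubectl get svc output.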
Steps for Deploying Redis in AKS
Log in to the Azure VM:
- Create a YAML file that deploys Redis as a container; you can reference the Redis image directly in the YAML file.
- Create the deployment and service using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- Note down the load balancer IP for the container.
Steps for Deploying Grafana in AKS
Log in to the Azure VM:
- Create a YAML file that deploys Grafana as a container; you can reference the Grafana image directly in the YAML file.
- Create the deployment and service using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- Note down the load balancer IP for the container.
Steps for Deploying Elastic Search in AKS
Log in to the Azure VM:
- Create a YAML file that deploys Elasticsearch as a container; you can reference the Elasticsearch image directly in the YAML file.
- Create the deployment and service using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- Note down the load balancer IP for the container.
Steps for Deploying OpenTelemetry in AKS
Log in to the Azure VM:
- Create a YAML file that deploys OpenTelemetry as a container; you can reference the OpenTelemetry image directly in the YAML file.
- Create the deployment and service using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- Note down the load balancer IP for the container.
Once you have these IPs, add them to the backend and frontend .env files accordingly.
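Collecting the load balancer IPs by hand is error-prone, so one possible shortcut is to parse the JSON form of the service listing (kubectl get svc -n <namespace> -o json) and turn it into .env lines. The sketch below assumes that JSON has already been captured as a string; the environment variable names and service names in the example are hypothetical and must be matched to the names your .env files and manifests actually use.

```python
import json


def load_balancer_ips(svc_list_json: str) -> dict:
    """Map service name to external address from 'kubectl get svc -o json' output."""
    ips = {}
    for item in json.loads(svc_list_json).get("items", []):
        ingress = item.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        if ingress:
            # Azure reports an IP; some providers report a hostname instead.
            ips[item["metadata"]["name"]] = (
                ingress[0].get("ip") or ingress[0].get("hostname"))
    return ips


def env_lines(ips: dict, mapping: dict) -> list:
    """Build KEY=value lines for services that already have an address."""
    return [f"{var}={ips[svc]}" for var, svc in mapping.items() if svc in ips]


if __name__ == "__main__":
    # Trimmed, made-up sample of the kubectl JSON structure.
    sample = json.dumps({"items": [
        {"metadata": {"name": "redis"},
         "status": {"loadBalancer": {"ingress": [{"ip": "10.0.0.5"}]}}},
        {"metadata": {"name": "pending-svc"},
         "status": {"loadBalancer": {}}},
    ]})
    found = load_balancer_ips(sample)
    # Variable names below are hypothetical.
    for line in env_lines(found, {"REDIS_HOST": "redis", "PHOENIX_HOST": "phoenix"}):
        print(line)
```

Services whose external IP is still pending are skipped, so rerunning the script after the load balancers are provisioned fills in the remaining entries.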
Now we can build the backend, agent worker, tool worker, and frontend images on the VM and deploy them to AKS.
Backend
Log in to the VM:
- Before creating the image, make sure the variables in the backend .env file are configured correctly.
- Once you have confirmed all values are correct, create the backend image by following the steps below.
- Download the backend code from the GitHub main branch.
- Copy the Dockerfile into the same folder.
- Update the details in your .env file.
- Update main.py as shown below (no update is needed if the origins already contain *). Updating the CORS origins is optional and intended for testing:
  # Configure CORS (assumes a FastAPI app; this import is usually already in main.py)
  from fastapi.middleware.cors import CORSMiddleware
  origins = [
      "",  # Add your frontend IP address
      "",  # Add your frontend IP address with its port number
      "http://127.0.0.1",  # Allow 127.0.0.1
      "http://127.0.0.1:3000",  # If your frontend runs on port 3000
      "http://localhost",
      "http://localhost:3000",
  ]
  app.add_middleware(
      CORSMiddleware,
      # allow_origins=origins,
      allow_origins=["*"],
      allow_credentials=True,
      allow_methods=["*"],  # Allows all methods
      allow_headers=["*"],  # Allows all headers
  )
- Build a Docker image for your backend folder using the command below (the trailing dot is the build context, here the current folder):
  docker build -f <docker filename with path> -t <image name>:<image tag> .
- The resulting backend image will be named something like:
  localhost/<image name>:<image tag>
- Log in to JFrog using the command below:
  docker login infyartifactory.jfrog.io
- Tag the image for ACR as shown below:
  docker tag localhost/<image name>:<image tag> <acr name>.azurecr.io/<image name>:<image tag>
- Push the image to ACR using the command below (if you get an ACR authentication error, log in to ACR again using the steps described earlier):
  docker push <acr name>.azurecr.io/<image name>:<image tag>
- Check the images in ACR using the command below:
  az acr repository list --name <acrname> --output table
- Create a backend YAML file.
- Deploy the file using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- You can now access the backend service through Swagger UI in a web browser. The external IP is shown in the output of the kubectl get svc command above.
  <your external ip>:<port number>/docs
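Besides opening the page in a browser, the Swagger UI endpoint can be probed from a script after deployment. The sketch below is a minimal check using only the Python standard library; it assumes the backend is served over plain HTTP and that /docs returns status 200 when the service is healthy. The function name is hypothetical.

```python
import urllib.error
import urllib.request


def docs_reachable(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if <base_url>/docs answers with HTTP 200."""
    url = base_url.rstrip("/") + "/docs"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures, and HTTP errors.
        return False


if __name__ == "__main__":
    # Replace with http://<your external ip>:<port number>
    print(docs_reachable("http://127.0.0.1:8000"))
```

A False result right after kubectl apply is not necessarily an error, since the load balancer and the pod can take a short while to become ready.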
Agent Worker
Log in to the VM:
- Before creating the image, make sure the variables in the backend .env file are configured correctly.
- Once you have confirmed all values are correct, create the agent worker image from the backend code by following the steps below.
- Download the backend code from the GitHub main branch.
- Copy the Dockerfile into the same folder.
- In the Dockerfile, set the run command to start run_agent_worker.py; do not run main.py for the agent worker image.
- Update the details in your .env file.
- Update main.py as shown below (no update is needed if the origins already contain *). Updating the CORS origins is optional and intended for testing:
  # Configure CORS (assumes a FastAPI app; this import is usually already in main.py)
  from fastapi.middleware.cors import CORSMiddleware
  origins = [
      "",  # Add your frontend IP address
      "",  # Add your frontend IP address with its port number
      "http://127.0.0.1",  # Allow 127.0.0.1
      "http://127.0.0.1:3000",  # If your frontend runs on port 3000
      "http://localhost",
      "http://localhost:3000",
  ]
  app.add_middleware(
      CORSMiddleware,
      # allow_origins=origins,
      allow_origins=["*"],
      allow_credentials=True,
      allow_methods=["*"],  # Allows all methods
      allow_headers=["*"],  # Allows all headers
  )
- Build a Docker image for your backend folder using the command below (the trailing dot is the build context, here the current folder):
  docker build -f <docker filename with path> -t <image name>:<image tag> .
- The resulting agent worker image will be named something like:
  localhost/<image name>:<image tag>
- Tag the image for ACR as shown below:
  docker tag localhost/<image name>:<image tag> <acr name>.azurecr.io/<image name>:<image tag>
- Push the image to ACR using the command below (if you get an ACR authentication error, log in to ACR again using the steps described earlier):
  docker push <acr name>.azurecr.io/<image name>:<image tag>
- Check the images in ACR using the command below:
  az acr repository list --name <acrname> --output table
- Create an agent worker YAML file.
- Deploy the file using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- You can now access the backend service through Swagger UI in a web browser. The external IP is shown in the output of the kubectl get svc command above.
  <your external ip>:<port number>/docs
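Because the build, tag, and push steps repeat for the backend, agent worker, tool worker, and frontend images, it can help to generate the three commands from one set of inputs. The sketch below only assembles and prints the commands; it does not run Docker. The function name and the example image name, tag, and Dockerfile name are hypothetical. It tags the image by its short name rather than the localhost/ form shown above, on the assumption that your container tooling resolves both to the same local image.

```python
import shlex


def image_commands(dockerfile, context, image, tag, acr_name):
    """Return the docker build, tag, and push commands as argument lists."""
    local_ref = f"{image}:{tag}"
    remote_ref = f"{acr_name}.azurecr.io/{image}:{tag}"
    return [
        # The final argument is the build context directory.
        ["docker", "build", "-f", dockerfile, "-t", local_ref, context],
        ["docker", "tag", local_ref, remote_ref],
        ["docker", "push", remote_ref],
    ]


if __name__ == "__main__":
    # All values here are placeholders for illustration.
    for cmd in image_commands("Dockerfile.agent", ".", "agent-worker", "1.0", "myacr"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists means they can later be passed to subprocess.run unchanged if you decide to execute them from the script.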
Tool Worker
Log in to the Infosys VM for image creation:
- Before creating the image, make sure the variables in the backend .env file are configured correctly.
- Once you have confirmed all values are correct, create the tool worker image from the backend code by following the steps below.
- Download the backend code from the GitHub main branch.
- Copy the Dockerfile into the same folder.
- In the Dockerfile, set the run command to start tool_worker/main.py; do not run the main.py in the root directory for the tool worker image.
- Update the details in your .env file.
- Update main.py as shown below (no update is needed if the origins already contain *). Updating the CORS origins is optional and intended for testing:
  # Configure CORS (assumes a FastAPI app; this import is usually already in main.py)
  from fastapi.middleware.cors import CORSMiddleware
  origins = [
      "",  # Add your frontend IP address
      "",  # Add your frontend IP address with its port number
      "http://127.0.0.1",  # Allow 127.0.0.1
      "http://127.0.0.1:3000",  # If your frontend runs on port 3000
      "http://localhost",
      "http://localhost:3000",
  ]
  app.add_middleware(
      CORSMiddleware,
      # allow_origins=origins,
      allow_origins=["*"],
      allow_credentials=True,
      allow_methods=["*"],  # Allows all methods
      allow_headers=["*"],  # Allows all headers
  )
- Build a Docker image for your backend folder using the command below (the trailing dot is the build context, here the current folder):
  docker build -f <docker filename with path> -t <image name>:<image tag> .
- The resulting tool worker image will be named something like:
  localhost/<image name>:<image tag>
- Tag the image for ACR as shown below:
  docker tag localhost/<image name>:<image tag> <acr name>.azurecr.io/<image name>:<image tag>
- Push the image to ACR using the command below (if you get an ACR authentication error, log in to ACR again using the steps described earlier):
  docker push <acr name>.azurecr.io/<image name>:<image tag>
- Check the images in ACR using the command below:
  az acr repository list --name <acrname> --output table
- Create a tool worker YAML file.
- Deploy the file using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- You can now access the backend service through Swagger UI in a web browser. The external IP is shown in the output of the kubectl get svc command above.
  <your external ip>:<port number>/docs
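Each image above depends on a correctly filled .env file, and an empty value is easy to miss before a build. The sketch below shows one way to check a .env file for missing or empty keys before building. It handles only simple KEY=value lines, comments, and optional quotes; the key names in the example are hypothetical and should be replaced with the variables your backend actually reads.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict, required: list) -> list:
    """Return required keys that are absent or empty, in the given order."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # Made-up sample; in practice read the text from your .env file.
    sample = 'DB_HOST=10.0.0.4\n# cache\nREDIS_HOST="10.0.0.5"\nKAFKA_BOOTSTRAP=\n'
    env = parse_env(sample)
    print(missing_keys(env, ["DB_HOST", "KAFKA_BOOTSTRAP", "LITELLM_URL"]))
```

Running a check like this before docker build avoids baking an incomplete configuration into the image.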
Frontend
Log in to the VM:
- Before creating the image, make sure the variables in the frontend .env file are configured correctly.
- Once you have confirmed all values are correct, create the frontend image by following the steps below.
- Download the frontend code from the GitHub main branch.
- Copy the Dockerfile into the same folder.
- Update the .env file in the frontend folder with your deployment URLs for the backend, MkDocs, Arize Phoenix, Grafana, and so on.
- Build the Docker image for your frontend folder (the trailing dot is the build context, here the current folder):
  docker build -f <Dockerfile> -t <imagename>:<imagetag> .
- The resulting frontend image will be named something like:
  localhost/<image name>:<image tag>
- Tag the image for ACR as shown below:
  docker tag localhost/<image name>:<image tag> <acr name>.azurecr.io/<image name>:<image tag>
- Push the image to ACR using the command below (if you get an ACR authentication error, log in to ACR again using the steps described earlier):
  docker push <acr name>.azurecr.io/<image name>:<image tag>
- Check the images in ACR using the command below:
  az acr repository list --name <acrname> --output table
- Create a frontend YAML file.
- Deploy the file using the command below:
  kubectl apply -f <filename>.yaml
- Check the deployed pods using the command below:
  kubectl get pods -n <namespace>
- Check the deployed services using the command below:
  kubectl get svc -n <namespace>
- You can now access the frontend service from a web browser using the external IP assigned after deployment. The external IP is shown in the output of the kubectl get svc command above.
Everything is now set up in AKS, so you can start using it.