Overview
This project involves deploying an Ollama service on Kubernetes, setting up Horizontal Pod Autoscaler (HPA), and conducting load testing. The goal is to ensure that the service scales effectively under various load conditions and meets performance requirements.
Implementation
-
Deployment Service Deployment: A Kubernetes deployment has been created for the application service with two replicas. The deployment is defined in deployment.yaml.
-
Service Configuration Service Definition: A LoadBalancer service is set up to expose the service to the external network. The service configuration is in service.yaml.
Horizontal Pod Autoscaler (HPA): An HPA is configured to scale the Ollama deployment based on CPU utilization. The HPA configuration is in hpa.yaml.
CI/CD Pipeline
In this task I have used GitHub actions as my ci/cd tool. The CI/CD pipeline using GitHub Actions automates the process of building, testing, and deploying the service. By adhering to best practices and utilizing automated workflows, you can ensure efficient and reliable deployments.
Kubectl Commands used
kubectl apply -f deployment.yaml -n devops-assignment
kubectl apply -f service.yaml -n devops-assignment
kubectl apply -f hpa.yaml -n devops-assignment
k6 Load test commands used
k6 run <test-script-file_name>
Conclusion
This project demonstrates the deployment and scaling of a service using Kubernetes and load testing to ensure performance and stability. By following the documented best practices and setup instructions, you can reproduce the setup and perform similar testing.