diff --git a/deployments/AWS_EKS/2. Deploy_Dynamo_Cloud.md b/deployments/AWS_EKS/2. Deploy_Dynamo_Cloud.md deleted file mode 100644 index d57aa9b..0000000 --- a/deployments/AWS_EKS/2. Deploy_Dynamo_Cloud.md +++ /dev/null @@ -1,85 +0,0 @@ -# Steps to deploy Dynamo Cloud Kubernetes Platform (Dynamo Deploy) - -## 1. Install Dynamo CLI in a Python Virtual environment - -``` -git clone https://github.com/ai-dynamo/dynamo.git -b v0.3.0 - -python3 -m venv venv -source venv/bin/activate - -pip install ai-dynamo[all] -``` - -## 2. Build Images (api-store & operator) - -Create 2 ECR repositories - -``` -aws configure -aws ecr create-repository --repository-name dynamo-api-store -aws ecr create-repository --repository-name dynamo-operator -``` - -Build and push images - -Change image build engine from `buildkit` to `kaniko` like below. We're seeing some issues currently with the default build engine `buildkit` and `kaniko` works OOTB. - -``` -# imageBuildEngine: kaniko -vim https://github.com/ai-dynamo/dynamo/blob/main/deploy/cloud/helm/dynamo-platform-values.yaml -``` - -Log into your docker registry - -``` -export DOCKER_SERVER= -export IMAGE_TAG=0.3.0 - -aws configure -aws ecr get-login-password | docker login --username AWS --password-stdin / -``` - -Push images - -``` -earthly --push +all-docker --DOCKER_SERVER=$DOCKER_SERVER --IMAGE_TAG=$IMAGE_TAG -``` - -## 3. 
Deploy the Helm Charts for Dynamo Cloud - -Export environment variables for creating a secret - -``` -export DOCKER_USERNAME=AWS -export DOCKER_PASSWORD="$(aws ecr get-login-password --region )" -``` - -Create namespace - -``` -export NAMESPACE=dynamo-cloud - -cd deploy/cloud/helm -kubectl create namespace $NAMESPACE -kubectl config set-context --current --namespace=$NAMESPACE -``` - -Install - -``` -./deploy.sh --crds -``` - -Your pods should all be running like below - -``` -NAME READY STATUS RESTARTS AGE -dynamo-cloud-dynamo-api-store-644cb8b7cf-87p5d 1/1 Running 0 3h35m -dynamo-cloud-dynamo-operator-controller-manager-548676c586plts8 2/2 Running 0 3h35m -dynamo-cloud-etcd-0 1/1 Running 0 3h35m -dynamo-cloud-minio-857cc956c6-l78v7 1/1 Running 0 3h35m -dynamo-cloud-nats-0 2/2 Running 0 3h35m -dynamo-cloud-nats-box-764fdb68f4-jfgnj 1/1 Running 0 3h35m -dynamo-cloud-postaresal-0 1/1 Running 0 3h35m -``` \ No newline at end of file diff --git a/deployments/AWS_EKS/3. Deploy_LLM_Example.md b/deployments/AWS_EKS/3. Deploy_LLM_Example.md deleted file mode 100644 index 8bbae5b..0000000 --- a/deployments/AWS_EKS/3. Deploy_LLM_Example.md +++ /dev/null @@ -1,131 +0,0 @@ -# Steps to deploy LLM example - -## 1. Build and push base image - -Build image - -``` -./container/build.sh -``` - -Create an ECR repository - -``` -aws configure -aws ecr create-repository --repository-name -``` - -Push image - -``` -docker tag dynamo:latest-vllm /:0.3.0 - -aws ecr get-login-password | docker login --username AWS --password-stdin / - -docker push /:0.3.0 -``` - -## 2. Deploy the Helm Chart for Inference Graph - -### a. Open port access to Dyname Cloud - -``` -kubectl port-forward svc/dynamo-store 8080:80 -n dynamo-cloud -``` - -Export necessary environment variables - -``` -export DYNAMO_CLOUD=http://localhost:8080 -export DYNAMO_IMAGE=/:0.3.0 -export DEPLOYMENT_NAME=llm-disagg-router -``` - -### b. 
Build service - -``` -cd examples/llm -DYNAMO_TAG=$(dynamo build graphs.disagg_router:Frontend | grep "Successfully built" | awk '{ print $NF }' | sed 's/\.$//') -``` - -You should output something similar to below - -``` -DYNAMO_TAG=$(dynamo build graphs.disagg_router:Frontend | grep "Successfully built" | awk '{ print $NF }' | sed 's/\.\.$//') -2025-05-06T01:05:55.346Z WARN __init__.vllm_version_matches_substr: Using ai_dynamo_vllm -2025-05-06T01:05:55.348Z INFO __init__.resolve_current_platform_cls_qualname: No platform detected, vLLM is running on UnspecifiedPlatform -2025-05-06T01:05:55.581Z INFO nixl: NIXL is available -``` - -### c. Deploy the Helm Chart - -``` -dynamo deployment create $DYNAMO_TAG -n $DEPLOYMENT_NAME -f ./configs/disagg_router.yaml --no-wait -``` - -You should output something similar to below - -``` -2025-06-03T00:15:08.652Z INFO utils.resolve_service config: Running dynamo serve with config: {'Common': {'model': 'deepseek-ai/DeepSeek-R1-Distill-Llama-8B', 'block-size': 64, 'max_model_len': 16384, 'router': 'kv', 'kv-transfer-config': {'kv_connector':'DynamoN...ector'}, 'Frontend': {'served_model_name':'deepseek-ai/DeepSeek-R1-Distill-Llama-8B', 'endpoint': 'dynamo.Processor.chat/completions', 'port': 8000}, 'Processor': {'common-configs': ['model', 'block-size', 'max-model-len', 'router']}, 'Router': {'min-workers': 1, 'common-configs': ['model', 'block-size', 'router']}, 'VLMWorker': {'max-num-batched-tokens': 16384, 'remote-prefill': True, 'conditional-disagg': True, 'max-local-prefill-length': 10, 'max-prefill-queue-size': 2, 'tensor-parallel-size': 1, 'enable-prefix-caching': True, 'ServiceArgs': {'workers': 1, 'resources': {'gpu': 1}}}, 'common-configs': ['model', 'block-size', 'max-model-len', 'kv-transfer-config']}, 'Planner': {'environment': 'local', 'no-operation': True}} -creating deployment... -Deployment 'llm-disagg-router' created. 
- ------------------------------------ Deployment ----------------------------------- - -Name: llm-disagg-router -Status: pending -Created: 2025-06-03T00:15:09.078452 -URLs: None -``` - -Your pods should all be running like below - -``` -NAME READY STATUS RESTARTS AGE -dynamo-cloud-dynamo-api-store-644cb8b7cf-87p5d 1/1 Running 0 3h35m -dynamo-cloud-dynamo-operator-controller-manager-548676c586plts8 2/2 Running 0 3h35m -dynamo-cloud-etcd-0 1/1 Running 0 3h35m -dynamo-cloud-minio-857cc956c6-l78v7 1/1 Running 0 3h35m -dynamo-cloud-nats-0 2/2 Running 0 3h35m -dynamo-cloud-nats-box-764fdb68f4-jfgnj 1/1 Running 0 3h35m -dynamo-cloud-postgresql-0 1/1 Running 0 3h35m -dynamo-image-builder-d0v3t3ab4mps73ar65j0-djpcx 0/1 Completed 0 3h31m -llm-disagg-router-frontend-785998d847-n2q8t 1/1 Running 0 3h19m -llm-disagg-router-planner-5dc64b9c68-tq69j 1/1 Running 0 3h19m -llm-disagg-router-prefillworker-84565696b4-2sz7q 1/1 Running 0 3h19m -llm-disagg-router-processor-865495c8b-gp929 1/1 Running 0 3h19m -llm-disagg-router-router-767dd97c95-df77k 1/1 Running 0 3h19m -llm-disagg-router-vllmworker-7bbf7f7f77-ks9ff 1/1 Running 0 3h19m -``` - -## 3. Send a request - -Open port access to frontend pod. You can find the frontend pod name from the output of `kubectl get pods` - -``` -kubectl port-forward pod/llm-disagg-router-frontend-785998d847-n2q8t 3000:3000 -``` - -Send a request - -``` -curl localhost:3000/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d '{ - "model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B", - "messages": [ - { - "role": "user", - "content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time, lost to the world for centuries. 
You are an intrepid explorer, known for your unparalleled curiosity and courage, who has stumbled upon an ancient map hinting at ests that Aeloria holds a secret so profound that it has the potential to reshape the very fabric of reality. Your journey will take you through treacherous deserts, enchanted forests, and across perilous mountain ranges. Your Task: Character Background: Develop a detailed background for your character. Describe their motivations for seeking out Aeloria, their skills and weaknesses, and any personal connections to the ancient city or its legends. Are they driven by a quest for knowledge, a search for lost familt clue is hidden." - } - ], - "stream":false, - "max_tokens": 30 - }' -``` - -You should output something similar to below - -``` -{"id":"bce6cbce5-9d8d-476a-b895-5d8906ee54e4","choices":[{"index":0,"message":{"content":"Okay, so I'm trying to help someone develop a character background for their role-playing game set in Eldoria. The city in question is Ael","refusal":null,"tool_calls":null},"role":"assistant","function_call":null,"audio":null}],"finish_reason":"length","logprobs":null}],"created":1746474610,"model":"deepseek-ai/DeepSeek-R1-Distill-Llama-8B","service_tier":null,"system_fingerprint":null,"object":"chat.completion","usage":null}ubuntu@ip-192-168-83-157:~% -``` \ No newline at end of file diff --git a/deployments/AWS_EKS/1. Create_EKS_EFS.md b/deployments/AWS_EKS_vLLM/1. Create_EKS_EFS.md similarity index 100% rename from deployments/AWS_EKS/1. Create_EKS_EFS.md rename to deployments/AWS_EKS_vLLM/1. Create_EKS_EFS.md diff --git a/deployments/AWS_EKS_vLLM/2. Deploy_Dynamo_Cloud.md b/deployments/AWS_EKS_vLLM/2. Deploy_Dynamo_Cloud.md new file mode 100644 index 0000000..fc794d7 --- /dev/null +++ b/deployments/AWS_EKS_vLLM/2. Deploy_Dynamo_Cloud.md @@ -0,0 +1,80 @@ +# Steps to install Dynamo Cloud from Source + +## 1. 
Build Dynamo Base Image + +Create 1 ECR repository + +``` +aws configure +aws ecr create-repository --repository-name +``` + +Build Image + +``` +export NAMESPACE=dynamo-cloud +export DOCKER_SERVER= +export DOCKER_USERNAME=AWS +export DOCKER_PASSWORD="$(aws ecr get-login-password --region )" + +export IMAGE_TAG=0.3.2.1 +./container/build.sh +``` + +Push Image + +``` +docker tag dynamo:latest-vllm /:$IMAGE_TAG + +aws ecr get-login-password | docker login --username AWS --password-stdin / + +docker push /:$IMAGE_TAG +``` + +## 2. Install Dynamo Cloud + +Build and Push Operator Image + +``` +cd deploy/cloud/operator + +vim Earthfile # change ARG IMAGE_SUFFIX= +earthly --push +docker --DOCKER_SERVER=$DOCKER_SERVER --IMAGE_TAG=$IMAGE_TAG +``` + +Create secrets + +``` +kubectl create namespace ${NAMESPACE} +kubectl create secret docker-registry docker-imagepullsecret \ + --docker-server=${DOCKER_SERVER} \ + --docker-username=${DOCKER_USERNAME} \ + --docker-password=${DOCKER_PASSWORD} \ + --namespace=${NAMESPACE} + +export HF_TOKEN= +kubectl create secret generic hf-token-secret \ + --from-literal=HF_TOKEN=${HF_TOKEN} \ + -n ${NAMESPACE} +``` + +Install Dynamo Cloud + +``` +cd dynamo/cloud/helm + +helm repo add bitnami https://charts.bitnami.com/bitnami +vim deploy.sh # Use the correct image name for dynamo-operator +./deploy.sh --crds +``` + +Your pods should be running like below + +``` +ubuntu@ip-192-168-83-157:~/dynamo/components/backends/vllm/deploy$ kubectl get pods -A +NAMESPACE NAME READY STATUS RESTARTS AGE +dynamo-cloud dynamo-platform-dynamo-operator-controller-manager-86795c5f4j4k 2/2 Running 0 4h17m +dynamo-cloud dynamo-platform-etcd-0 1/1 Running 0 4h17m +dynamo-cloud dynamo-platform-nats-0 2/2 Running 0 4h17m +dynamo-cloud dynamo-platform-nats-box-5dbf45c748-bxqj7 1/1 Running 0 4h17m +``` \ No newline at end of file diff --git a/deployments/AWS_EKS_vLLM/3. Deploy_vLLM_Example.md b/deployments/AWS_EKS_vLLM/3. 
Deploy_vLLM_Example.md new file mode 100644 index 0000000..7d5009b --- /dev/null +++ b/deployments/AWS_EKS_vLLM/3. Deploy_vLLM_Example.md @@ -0,0 +1,50 @@ +# Steps to deploy vLLM example + +## 1. Deploy Dynamo Graph + +``` +cd dynamo/components/backends/vllm/deploy + +vim agg_router.yaml # under metadata add namespace: dynamo-cloud and change image to your built base image +kubectl apply -f agg_router.yaml +``` + +Your pods should be running like below + +``` +ubuntu@ip-192-168-83-157:~/dynamo/components/backends/vllm/deploy$ kubectl get pods -A +NAMESPACE NAME READY STATUS RESTARTS AGE +dynamo-cloud dynamo-platform-dynamo-operator-controller-manager-86795c5f4j4k 2/2 Running 0 4h17m +dynamo-cloud dynamo-platform-etcd-0 1/1 Running 0 4h17m +dynamo-cloud dynamo-platform-nats-0 2/2 Running 0 4h17m +dynamo-cloud dynamo-platform-nats-box-5dbf45c748-bxqj7 1/1 Running 0 4h17m +dynamo-cloud vllm-agg-router-frontend-79d599bb9c-fg97p 1/1 Running 0 4m9s +dynamo-cloud vllm-agg-router-vllmdecodeworker-787d575485-hrcjp 1/1 Running 0 4m9s +dynamo-cloud vllm-agg-router-vllmdecodeworker-787d575485-zkwdd 1/1 Running 0 4m9s +``` + +Test the Deployment + +``` +kubectl port-forward deployment/vllm-agg-router-frontend 8080:8000 -n dynamo-cloud + +curl localhost:8080/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{ + "model": "Qwen/Qwen3-0.6B", + "messages": [ + { + "role": "user", + "content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time, lost to the world for centuries. You are an intrepid explorer, known for your unparalleled curiosity and courage, who has stumbled upon an ancient map hinting at ests that Aeloria holds a secret so profound that it has the potential to reshape the very fabric of reality. 
Your journey will take you through treacherous deserts, enchanted forests, and across perilous mountain ranges. Your Task: Character Background: Develop a detailed background for your character. Describe their motivations for seeking out Aeloria, their skills and weaknesses, and any personal connections to the ancient city or its legends. Are they driven by a quest for knowledge, a search for lost familt clue is hidden." + } + ], + "stream": false, + "max_tokens": 30 + }' +``` + +You should see output similar to the following + +``` +{"id":"chatcmpl-bbe52b36-90ed-4479-9872-89e1aa412aa7","choices":[{"index":0,"message":{"content":"\nOkay, so the user wants me to develop a character background for an explorer named someone in Eldoria. The character is part of the","refusal":null,"tool_calls":null,"role":"assistant","function_call":null,"audio":null},"finish_reason":"stop","logprobs":null}],"created":1753417848,"model":"Qwen/Qwen3-0.6B","service_tier":null,"system_fingerprint":null,"object":"chat.completion","usage":{"prompt_tokens":196,"completion_tokens":29,"total_tokens":225,"prompt_tokens_details":null,"completion_tokens_details":null}} +``` \ No newline at end of file diff --git a/deployments/AWS_EKS/README.md b/deployments/AWS_EKS_vLLM/README.md similarity index 62% rename from deployments/AWS_EKS/README.md rename to deployments/AWS_EKS_vLLM/README.md index 2b8fc4e..5291d09 100644 --- a/deployments/AWS_EKS/README.md +++ b/deployments/AWS_EKS_vLLM/README.md @@ -1,6 +1,6 @@ -# Default LLM example on AWS EKS +# Dynamo vLLM example on AWS EKS -This folder contains steps below to create an AWS EKS cluster with EFS to deploy default Dynamo LLM example tested on commit `d849f7eccabdd850e2c7cb5e6103d6f8b39b0a77`. +This folder contains the steps to create an AWS EKS cluster with EFS and deploy the Dynamo vLLM example, tested on commit `30942780de2eb6a2358b96caa9f6978c799aede6`. 1. [Create AWS EKS cluster and EFS](1.%20Create_EKS_EFS.md) 2. 
[Deploy Dynamo Cloud](2.%20Deploy_Dynamo_Cloud.md)