自动扩缩容
概述
自动扩缩容是Kubernetes的核心能力之一,它可以根据工作负载的需求自动调整资源分配。Kubernetes提供了多种自动扩缩容机制:水平Pod自动扩缩器(HPA)、垂直Pod自动扩缩器(VPA)和集群自动扩缩器(Cluster Autoscaler)。这些机制协同工作,确保应用在负载变化时能够获得足够的资源,同时优化资源利用率,降低运营成本。
核心概念
1. 水平Pod自动扩缩器(HPA)
HPA根据观察到的CPU利用率或其他指标自动扩缩Deployment、ReplicaSet或StatefulSet中的Pod数量:
- 基于CPU/内存利用率扩缩
- 支持自定义指标和外部指标
- 声明式配置,自动调整副本数
- 适用于无状态应用
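HPA的副本数计算公式为 desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue);配置多个指标时,取各指标计算结果中的最大值。下面用一段Python示意这一计算(简化版,忽略就绪性修正、指标缺失处理等细节,函数名为示意假设):

```python
import math

def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """按HPA公式计算期望副本数(简化示意)。

    desiredReplicas = ceil(currentReplicas * currentValue / targetValue)
    当实际值与目标值的比值落在容忍区间(默认±10%)内时不扩缩。
    """
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:  # 在容忍范围内,保持现有副本数
        return current_replicas
    return math.ceil(current_replicas * ratio)

# 示例:4个副本,平均CPU利用率90%,目标70% -> ceil(4 * 90/70) = 6
```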
2. 垂直Pod自动扩缩器(VPA)
VPA根据容器的实际资源使用情况自动调整CPU和内存的requests和limits:
- 分析历史资源使用数据
- 自动设置资源请求和限制
- 支持自动更新和推荐模式
- 适用于资源使用模式稳定的应用
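VPA Recommender根据历史资源使用分布给出推荐值,其思路可用下面的Python片段近似示意(真实实现使用按时间衰减的直方图和不同的分位数/安全边际,此处的90分位与15%边际仅为说明性假设):

```python
def vpa_cpu_recommendation(usage_samples_millicores, percentile=0.9, margin_percent=15):
    """取历史CPU使用样本的指定分位值并加安全边际,得到request推荐(单位:millicore)。

    简化示意:真实的VPA Recommender使用按时间衰减的直方图,而非简单分位数。
    """
    ordered = sorted(usage_samples_millicores)
    idx = min(int(len(ordered) * percentile), len(ordered) - 1)
    # 整数运算计算 (100 + margin_percent)% 的安全边际
    return ordered[idx] * (100 + margin_percent) // 100

samples = [120, 150, 180, 200, 210, 230, 250, 260, 300, 400]
```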
3. 集群自动扩缩器(Cluster Autoscaler)
Cluster Autoscaler根据Pod的调度需求和节点利用率自动调整集群节点数量:
- 监控未调度的Pod
- 自动添加或删除节点
- 与云提供商集成
- 优化集群资源利用率
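Cluster Autoscaler的扩容判断可以粗略示意为:汇总未调度Pod的资源请求,估算需要新增的节点数。下面的Python片段仅按CPU一维做简化示意(真实实现会按节点组模板对每个Pod做完整的调度模拟):

```python
import math

def nodes_to_add(pending_pod_cpu_requests_m, node_allocatable_cpu_m):
    """根据未调度Pod的CPU请求估算需新增的节点数(仅按CPU一维简化)。"""
    total = sum(pending_pod_cpu_requests_m)
    return math.ceil(total / node_allocatable_cpu_m)

# 示例:5个各请求500m的Pending Pod,节点可分配1930m -> ceil(2500/1930) = 2
```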
4. 扩缩容策略
扩缩容策略定义了扩缩容的行为:
- 扩缩容速率限制
- 冷却时间
- 稳定窗口
- 行为策略
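以常见的缩容策略为例:Percent与Pods两类策略各给出一个周期内允许的最大变化量,selectPolicy决定取最保守(Min)还是最宽松(Max)的一条。下面的Python片段示意这一选择逻辑(简化假设,仅覆盖缩容方向):

```python
import math

def allowed_scale_down(current_replicas, policies, select_policy="Min"):
    """计算一个周期内允许缩容到的副本数(简化示意)。

    policies形如 [{"type": "Percent", "value": 10}, {"type": "Pods", "value": 2}]
    """
    limits = []
    for p in policies:
        if p["type"] == "Percent":
            # 百分比策略:每周期最多减少 current * value% 个副本(向上取整)
            limits.append(math.ceil(current_replicas * p["value"] / 100))
        else:  # Pods策略:每周期最多减少固定数量的副本
            limits.append(p["value"])
    step = min(limits) if select_policy == "Min" else max(limits)
    return current_replicas - step

# 10个副本,Percent 10%与Pods 2两条策略,selectPolicy=Min:
# 每周期最多减少 min(ceil(10*0.1), 2) = 1 个副本
```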
HPA配置
HPA架构
┌─────────────────────────────────────────────────────────────┐
│ Horizontal Pod Autoscaler │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Metrics │ -> │ Decision │ -> │ Scaling │ │
│ │ Server │ │ Engine │ │ Action │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ ↑ │ │ │
│ │ ↓ ↓ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Metrics │ │ Desired │ │ Replica │ │
│ │ Collection │ │ Replicas │ │ Count │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
基于CPU的HPA
HPA定义
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: webapp-hpa
namespace: default
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
- type: Pods
value: 2
periodSeconds: 60
selectPolicy: Min
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 100
periodSeconds: 15
- type: Pods
value: 4
periodSeconds: 15
selectPolicy: Max
目标Deployment
yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: webapp
namespace: default
spec:
replicas: 2
selector:
matchLabels:
app: webapp
template:
metadata:
labels:
app: webapp
spec:
containers:
- name: webapp
image: nginx:1.25
ports:
- containerPort: 80
resources:
requests:
cpu: 200m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
name: webapp
namespace: default
spec:
selector:
app: webapp
ports:
- port: 80
targetPort: 80
基于内存的HPA
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: webapp-memory-hpa
namespace: default
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
基于自定义指标的HPA
部署Metrics Server
bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl get pods -n kube-system -l k8s-app=metrics-server
kubectl top pods
kubectl top nodes
部署Prometheus Adapter
yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: adapter-config
namespace: monitoring
data:
config.yaml: |
rules:
- seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
resources:
overrides:
kubernetes_namespace: {resource: namespace}
kubernetes_pod_name: {resource: pod}
name:
matches: "^(.*)_total"
as: "${1}_per_second"
metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: custom-metrics-apiserver
namespace: monitoring
spec:
replicas: 1
selector:
matchLabels:
app: custom-metrics-apiserver
template:
metadata:
labels:
app: custom-metrics-apiserver
spec:
serviceAccountName: custom-metrics-apiserver
containers:
- name: custom-metrics-apiserver
image: directxman12/k8s-prometheus-adapter:v0.11.0
args:
- --secure-port=6443
- --tls-cert-file=/var/run/serving-cert/serving.crt
- --tls-private-key-file=/var/run/serving-cert/serving.key
- --logtostderr=true
- --prometheus-url=http://prometheus.monitoring.svc:9090/
- --metrics-relist-interval=30s
- --v=10
- --config=/etc/adapter/config.yaml
ports:
- containerPort: 6443
volumeMounts:
- mountPath: /var/run/serving-cert
name: volume-serving-cert
readOnly: false
- mountPath: /etc/adapter/
name: config
readOnly: false
volumes:
- name: volume-serving-cert
secret:
secretName: adapter-serving-certs
- name: config
configMap:
name: adapter-config
使用自定义指标的HPA
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: webapp-custom-hpa
namespace: default
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
minReplicas: 2
maxReplicas: 20
metrics:
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: 1000
基于外部指标的HPA
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: webapp-external-hpa
namespace: default
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
minReplicas: 2
maxReplicas: 15
metrics:
- type: External
external:
metric:
name: queue_messages_ready
selector:
matchLabels:
queue: "myqueue"
target:
type: AverageValue
averageValue: 30
多指标HPA
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: webapp-multi-hpa
namespace: default
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
minReplicas: 2
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: 1000
- type: External
external:
metric:
name: queue_messages_ready
selector:
matchLabels:
queue: "myqueue"
target:
type: AverageValue
averageValue: 30
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 100
periodSeconds: 15
- type: Pods
value: 4
periodSeconds: 15
selectPolicy: Max
VPA配置
VPA架构
┌─────────────────────────────────────────────────────────────┐
│ Vertical Pod Autoscaler │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Recommender │ -> │ Updater │ -> │ Admission │ │
│ │ │ │ │ │ Controller │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ ↑ │ │ │
│ │ ↓ ↓ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Metrics │ │ Pod │ │ Resource │ │
│ │ Server │ │ Eviction │ │ Update │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
安装VPA
bash
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
kubectl get pods -n kube-system | grep vpa
VPA定义
基础VPA
yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: webapp-vpa
namespace: default
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
updatePolicy:
updateMode: Auto
resourcePolicy:
containerPolicies:
- containerName: webapp
minAllowed:
cpu: 100m
memory: 256Mi
maxAllowed:
cpu: 2000m
memory: 4Gi
controlledResources: ["cpu", "memory"]
controlledValues: RequestsAndLimits
VPA推荐模式
将updateMode设为"Off"时,VPA只计算并输出推荐值,不会修改Pod:
yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: webapp-vpa-recommend
namespace: default
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
updatePolicy:
updateMode: "Off"
禁用指定容器的VPA
在containerPolicies中将mode设为"Off",可以让VPA忽略指定容器:
yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: webapp-vpa-off
namespace: default
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
updatePolicy:
updateMode: Auto
resourcePolicy:
containerPolicies:
- containerName: webapp
mode: "Off"
查看VPA推荐
bash
kubectl get vpa webapp-vpa -o yaml
kubectl describe vpa webapp-vpa
kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation.containerRecommendations}'
VPA与HPA协同
注意:官方不建议让HPA与VPA基于相同的CPU/内存指标作用于同一工作负载,两者会互相干扰;协同使用时,HPA应改用自定义或外部指标。下例仅为结构演示:
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: webapp-hpa
namespace: default
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: webapp-vpa
namespace: default
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
updatePolicy:
updateMode: Auto
resourcePolicy:
containerPolicies:
- containerName: webapp
minAllowed:
cpu: 100m
memory: 256Mi
maxAllowed:
cpu: 1000m
memory: 2Gi
controlledResources: ["cpu", "memory"]
集群自动扩缩
Cluster Autoscaler架构
┌─────────────────────────────────────────────────────────────┐
│ Cluster Autoscaler │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Unschedul │ -> │ Scale │ -> │ Node │ │
│ │ Pods │ │ Decision │ │ Provision │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ ↑ │ │ │
│ │ ↓ ↓ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Pod │ │ Cloud │ │ Node │ │
│ │ Scheduler │ │ Provider │ │ Addition │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
部署Cluster Autoscaler
AWS EKS
yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: cluster-autoscaler
namespace: kube-system
labels:
k8s-addon: cluster-autoscaler.addons.k8s.io
k8s-app: cluster-autoscaler
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT_ID:role/cluster-autoscaler
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cluster-autoscaler
labels:
k8s-addon: cluster-autoscaler.addons.k8s.io
k8s-app: cluster-autoscaler
rules:
- apiGroups: [""]
resources: ["events", "endpoints"]
verbs: ["create", "patch"]
- apiGroups: [""]
resources: ["pods/eviction"]
verbs: ["create"]
- apiGroups: [""]
resources: ["pods/status"]
verbs: ["update"]
- apiGroups: [""]
resources: ["endpoints"]
resourceNames: ["cluster-autoscaler"]
verbs: ["get", "update"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["watch", "list", "get", "update"]
- apiGroups: [""]
resources: ["pods", "services", "replicationcontrollers", "persistentvolumeclaims", "persistentvolumes"]
verbs: ["watch", "list", "get"]
- apiGroups: ["extensions"]
resources: ["replicasets", "daemonsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["policy"]
resources: ["poddisruptionbudgets"]
verbs: ["watch", "list"]
- apiGroups: ["apps"]
resources: ["statefulsets", "replicasets", "daemonsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
verbs: ["watch", "list", "get"]
- apiGroups: ["batch", "extensions"]
resources: ["jobs"]
verbs: ["get", "list", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cluster-autoscaler
labels:
k8s-addon: cluster-autoscaler.addons.k8s.io
k8s-app: cluster-autoscaler
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-autoscaler
subjects:
- kind: ServiceAccount
name: cluster-autoscaler
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: cluster-autoscaler
namespace: kube-system
labels:
app: cluster-autoscaler
spec:
replicas: 1
selector:
matchLabels:
app: cluster-autoscaler
template:
metadata:
labels:
app: cluster-autoscaler
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '8085'
spec:
serviceAccountName: cluster-autoscaler
containers:
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.25.0
name: cluster-autoscaler
resources:
limits:
cpu: 100m
memory: 300Mi
requests:
cpu: 100m
memory: 300Mi
command:
- ./cluster-autoscaler
- --v=4
- --stderrthreshold=info
- --cloud-provider=aws
- --skip-nodes-with-local-storage=false
- --expander=least-waste
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
env:
- name: AWS_REGION
value: us-west-2
volumeMounts:
- name: ssl-certs
mountPath: /etc/ssl/certs/ca-certificates.crt
readOnly: true
volumes:
- name: ssl-certs
hostPath:
path: "/etc/ssl/certs/ca-bundle.crt"
GKE
bash
gcloud container clusters update my-cluster \
--enable-autoscaling \
--min-nodes 1 \
--max-nodes 10 \
--zone us-central1-a
AKS
bash
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 10
节点池配置
yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSMachineTemplate
metadata:
name: my-cluster-md-0
namespace: default
spec:
template:
spec:
instanceType: m5.large
iamInstanceProfile: nodes.cluster-api-provider-aws.sigs.k8s.io
sshKeyName: my-key
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
name: my-cluster-md-0
namespace: default
spec:
clusterName: my-cluster
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
selector:
matchLabels:
cluster.x-k8s.io/cluster-name: my-cluster
template:
metadata:
labels:
cluster.x-k8s.io/cluster-name: my-cluster
spec:
clusterName: my-cluster
bootstrap:
configRef:
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
name: my-cluster-md-0
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSMachineTemplate
name: my-cluster-md-0
version: v1.25.0
Cluster Autoscaler参数
yaml
command:
- ./cluster-autoscaler
- --v=4
- --logtostderr=true
- --cloud-provider=aws
- --skip-nodes-with-local-storage=false
- --skip-nodes-with-system-pods=true
- --expander=least-waste
- --balance-similar-node-groups=true
- --max-node-provision-time=15m
- --max-unready-nodes=100
- --max-unready-percentage=45
- --ok-total-unready-count=3
- --scale-down-unneeded-time=10m
- --scale-down-unready-time=20m
- --scale-down-delay-after-add=10m
- --scale-down-delay-after-delete=10s
- --scale-down-delay-after-failure=3m
- --scale-down-non-empty-candidates-count=30
- --scale-down-candidates-pool-ratio=0.1
- --scale-down-candidates-pool-min-count=50
- --scan-interval=10s
- --max-empty-bulk-delete=10
- --min-replica-count=0
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
实践示例
示例1:Web应用自动扩缩容
Deployment配置
yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: webapp
namespace: production
spec:
replicas: 3
selector:
matchLabels:
app: webapp
template:
metadata:
labels:
app: webapp
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
spec:
containers:
- name: webapp
image: nginx:1.25
ports:
- containerPort: 80
resources:
requests:
cpu: 200m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /ready
port: 80
initialDelaySeconds: 5
periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
name: webapp
namespace: production
spec:
selector:
app: webapp
ports:
- port: 80
targetPort: 80
type: ClusterIP
HPA配置
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: webapp-hpa
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: 1000
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
- type: Pods
value: 2
periodSeconds: 60
selectPolicy: Min
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 100
periodSeconds: 15
- type: Pods
value: 4
periodSeconds: 15
selectPolicy: Max
VPA配置
yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: webapp-vpa
namespace: production
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
updatePolicy:
updateMode: Auto
resourcePolicy:
containerPolicies:
- containerName: webapp
minAllowed:
cpu: 100m
memory: 128Mi
maxAllowed:
cpu: 2000m
memory: 4Gi
controlledResources: ["cpu", "memory"]
controlledValues: RequestsAndLimits
示例2:批处理任务自动扩缩容
Job配置
yaml
apiVersion: batch/v1
kind: Job
metadata:
name: batch-processor
namespace: batch
spec:
parallelism: 3
completions: 100
template:
metadata:
labels:
app: batch-processor
spec:
containers:
- name: processor
image: batch-processor:v1.0.0
resources:
requests:
cpu: 500m
memory: 1Gi
limits:
cpu: 2000m
memory: 4Gi
env:
- name: QUEUE_URL
value: "sqs://my-queue"
restartPolicy: OnFailure
基于队列长度的HPA
注意:HPA不能扩缩Job。下面的HPA以一个长期运行的消费者Deployment为目标(此处假设其名称同为batch-processor),根据队列积压扩缩消费者副本:
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: batch-processor-hpa
namespace: batch
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: batch-processor
minReplicas: 1
maxReplicas: 50
metrics:
- type: External
external:
metric:
name: sqs_queue_messages_visible
selector:
matchLabels:
queue: "my-queue"
target:
type: AverageValue
averageValue: 10
behavior:
scaleDown:
stabilizationWindowSeconds: 600
policies:
- type: Percent
value: 50
periodSeconds: 120
scaleUp:
stabilizationWindowSeconds: 30
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 10
periodSeconds: 30
selectPolicy: Max
示例3:微服务自动扩缩容
微服务Deployment
yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-gateway
namespace: microservices
spec:
replicas: 3
selector:
matchLabels:
app: api-gateway
template:
metadata:
labels:
app: api-gateway
spec:
containers:
- name: api-gateway
image: api-gateway:v1.0.0
ports:
- containerPort: 8080
resources:
requests:
cpu: 200m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: user-service
namespace: microservices
spec:
replicas: 2
selector:
matchLabels:
app: user-service
template:
metadata:
labels:
app: user-service
spec:
containers:
- name: user-service
image: user-service:v1.0.0
ports:
- containerPort: 8080
resources:
requests:
cpu: 300m
memory: 512Mi
limits:
cpu: 1000m
memory: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: order-service
namespace: microservices
spec:
replicas: 2
selector:
matchLabels:
app: order-service
template:
metadata:
labels:
app: order-service
spec:
containers:
- name: order-service
image: order-service:v1.0.0
ports:
- containerPort: 8080
resources:
requests:
cpu: 400m
memory: 512Mi
limits:
cpu: 1500m
memory: 2Gi
微服务HPA配置
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: api-gateway-hpa
namespace: microservices
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: api-gateway
minReplicas: 3
maxReplicas: 30
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: 500
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: user-service-hpa
namespace: microservices
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: user-service
minReplicas: 2
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 75
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: order-service-hpa
namespace: microservices
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: order-service
minReplicas: 2
maxReplicas: 15
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 80
- type: External
external:
metric:
name: kafka_consumer_lag
selector:
matchLabels:
topic: "orders"
group: "order-service"
target:
type: AverageValue
averageValue: 100
kubectl操作命令
HPA管理
bash
kubectl get hpa
kubectl get hpa -n production
kubectl describe hpa webapp-hpa
kubectl get hpa webapp-hpa -o yaml
kubectl apply -f hpa.yaml
kubectl delete hpa webapp-hpa
kubectl autoscale deployment webapp --cpu-percent=70 --min=2 --max=10
kubectl get hpa webapp-hpa -o jsonpath='{.status.currentMetrics}'
kubectl get hpa webapp-hpa -o jsonpath='{.status.desiredReplicas}'
kubectl patch hpa webapp-hpa -p '{"spec":{"maxReplicas":20}}'
kubectl edit hpa webapp-hpa
VPA管理
bash
kubectl get vpa
kubectl describe vpa webapp-vpa
kubectl get vpa webapp-vpa -o yaml
kubectl apply -f vpa.yaml
kubectl delete vpa webapp-vpa
kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation}'
kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'
Cluster Autoscaler管理
bash
kubectl get pods -n kube-system -l app=cluster-autoscaler
kubectl logs -n kube-system -l app=cluster-autoscaler
kubectl logs -n kube-system -l app=cluster-autoscaler --tail=100
kubectl describe deployment cluster-autoscaler -n kube-system
kubectl get nodes
kubectl describe nodes | grep -A 5 "Allocated resources"
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory
监控和调试
bash
kubectl top pods
kubectl top pods -n production
kubectl top nodes
kubectl top pods -l app=webapp
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
kubectl get events --sort-by='.lastTimestamp'
kubectl describe pod <pod-name>
故障排查指南
问题1:HPA无法获取指标
症状
bash
kubectl get hpa webapp-hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
webapp-hpa Deployment/webapp <unknown>/70% 2 10 2 5m
排查步骤
bash
kubectl get pods -n kube-system -l k8s-app=metrics-server
kubectl logs -n kube-system -l k8s-app=metrics-server
kubectl top pods
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"
kubectl describe hpa webapp-hpa
解决方案
- 检查Metrics Server是否正常运行
- 验证Metrics Server是否正确配置
- 确认资源请求已设置
- 检查网络策略限制
问题2:HPA扩缩容不稳定
症状
bash
kubectl get hpa webapp-hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
webapp-hpa Deployment/webapp 85%/70% 2 10 5 10m
webapp-hpa Deployment/webapp 65%/70% 2 10 3 11m
webapp-hpa Deployment/webapp 90%/70% 2 10 6 12m
排查步骤
bash
kubectl describe hpa webapp-hpa
kubectl get events --field-selector involvedObject.name=webapp-hpa
kubectl top pods -l app=webapp
kubectl get pods -l app=webapp -o wide
解决方案
- 调整稳定窗口时间
- 配置扩缩容策略
- 优化指标采集间隔
- 检查应用负载模式
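其中"稳定窗口"的平滑原理是:缩容方向取窗口内所有期望副本数的最大值(扩容方向取最小值),瞬时回落不会立即触发缩容。可用下面的Python片段示意(简化假设:窗口内每次计算出的期望副本数都已被记录):

```python
def stabilized_replicas(window_recommendations, direction="down"):
    """在稳定窗口内平滑期望副本数:缩容取窗口最大值,扩容取窗口最小值。"""
    if direction == "down":
        return max(window_recommendations)
    return min(window_recommendations)

# 最近稳定窗口内的期望副本数为 [6, 3, 5, 4]:
# 缩容方向取 max -> 6,副本数暂不下降,从而避免抖动
```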
问题3:VPA推荐不合理
症状
bash
kubectl describe vpa webapp-vpa
...
Recommendation:
Container Recommendations:
Container Name: webapp
Lower Bound:
Cpu: 25m
Memory: 262144k
Target:
Cpu: 25m
Memory: 262144k
Uncapped Target:
Cpu: 25m
Memory: 262144k
Upper Bound:
Cpu: 25m
Memory: 262144k
排查步骤
bash
kubectl describe vpa webapp-vpa
kubectl top pods -l app=webapp
kubectl get pods -l app=webapp -o jsonpath='{.items[0].spec.containers[0].resources}'
kubectl logs -n kube-system -l app=vpa-recommender
解决方案
- 等待更多历史数据
- 调整VPA最小/最大资源限制
- 检查应用资源使用模式
- 验证VPA配置正确
问题4:Cluster Autoscaler不扩容
症状
bash
kubectl get pods -l app=webapp
NAME READY STATUS RESTARTS AGE
webapp-6d4b5d6f5-abcde 0/1 Pending 0 5m
webapp-6d4b5d6f5-fghij 0/1 Pending 0 5m
排查步骤
bash
kubectl describe pod webapp-6d4b5d6f5-abcde
kubectl logs -n kube-system -l app=cluster-autoscaler
kubectl get nodes
kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml
kubectl get events --field-selector reason=ClusterAutoscaler
解决方案
- 检查节点池配置
- 验证云提供商配额
- 确认节点组标签正确
- 检查资源请求是否合理
问题5:Pod被频繁驱逐
症状
bash
kubectl get events --field-selector reason=Evicted
LAST SEEN TYPE REASON OBJECT MESSAGE
2m Warning Evicted pod/webapp-6d4b5d6f5-abcde Pod was evicted
排查步骤
bash
kubectl describe node <node-name>
kubectl get pods --field-selector=status.phase=Failed
kubectl top nodes
kubectl describe pod <evicted-pod>
kubectl get resourcequota -n production
解决方案
- 调整资源请求和限制
- 配置Pod优先级
- 优化节点资源分配
- 检查资源配额限制
最佳实践
1. HPA最佳实践
资源请求设置
yaml
resources:
requests:
cpu: 200m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
扩缩容策略
yaml
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 100
periodSeconds: 15
2. VPA最佳实践
更新模式选择
yaml
updatePolicy:
updateMode: Auto
资源限制
yaml
resourcePolicy:
containerPolicies:
- containerName: webapp
minAllowed:
cpu: 100m
memory: 128Mi
maxAllowed:
cpu: 2000m
memory: 4Gi
3. Cluster Autoscaler最佳实践
节点池配置
- 设置合理的最小/最大节点数
- 使用多种实例类型
- 配置节点标签和污点
- 优化实例规格选择
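以--expander=least-waste为例,它会选择扩容后资源浪费比例最小的节点组。下面的Python片段按CPU与内存两维做简化示意(节点组的可分配资源数值为假设值):

```python
def least_waste_group(pod_cpu_m, pod_mem_mi, node_groups):
    """在候选节点组中选出资源浪费比例最小的一组(简化示意)。

    node_groups形如 {"m5.large": (1930, 7000)},值为节点可分配的(CPU millicore, 内存Mi)。
    """
    def waste(allocatable):
        cpu, mem = allocatable
        # 一个节点能放下的Pod数受CPU与内存共同限制
        pods = min(cpu // pod_cpu_m, mem // pod_mem_mi)
        if pods == 0:
            return float("inf")  # 放不下任何Pod的节点组直接排除
        used_cpu, used_mem = pods * pod_cpu_m, pods * pod_mem_mi
        # 浪费 = CPU空闲比例 + 内存空闲比例
        return (1 - used_cpu / cpu) + (1 - used_mem / mem)

    return min(node_groups, key=lambda name: waste(node_groups[name]))

# 假设的两个节点组(可分配CPU millicore, 内存Mi)
groups = {"m5.large": (1930, 7000), "c5.xlarge": (3920, 6800)}
```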
扩缩容参数
yaml
- --scale-down-unneeded-time=10m
- --scale-down-delay-after-add=10m
- --max-node-provision-time=15m
- --balance-similar-node-groups=true
4. 监控最佳实践
指标采集
- 部署Metrics Server
- 配置Prometheus监控
- 设置告警规则
- 监控扩缩容事件
告警配置
yaml
groups:
- name: autoscaling.rules
rules:
- alert: HPAUnableToScale
expr: |
kube_horizontalpodautoscaler_status_condition{condition="ScalingLimited", status="true"} == 1
for: 5m
labels:
severity: warning
annotations:
summary: HPA {{ $labels.horizontalpodautoscaler }} is unable to scale
description: HPA {{ $labels.horizontalpodautoscaler }} has been scaling-limited for 5 minutes
- alert: HPAMaxReplicasReached
expr: |
kube_horizontalpodautoscaler_status_current_replicas == kube_horizontalpodautoscaler_spec_max_replicas
for: 10m
labels:
severity: warning
annotations:
summary: HPA {{ $labels.horizontalpodautoscaler }} reached max replicas
description: HPA {{ $labels.horizontalpodautoscaler }} has been running at its configured maxReplicas for 10 minutes
5. 成本优化最佳实践
资源规划
- 合理设置资源请求
- 使用Spot/Preemptible实例
- 配置集群自动扩缩容
- 优化Pod调度策略
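结合下文cost-monitor配置示例中的单价(每核0.05、每GB内存0.01,此处假设按小时计费),可以这样粗略估算资源成本(简化示意):

```python
def hourly_cost(total_cpu_cores, total_mem_gb, cost_per_core=0.05, cost_per_gb=0.01):
    """按核数与内存GB估算每小时资源成本(单价取自cost-monitor配置示例,假设按小时计)。"""
    return total_cpu_cores * cost_per_core + total_mem_gb * cost_per_gb

# 示例:20核 + 64GB -> 20*0.05 + 64*0.01 = 1.64
```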
成本监控
yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: cost-monitor
data:
config.yaml: |
resources:
cpu:
cost_per_core: 0.05
memory:
cost_per_gb: 0.01
thresholds:
warning: 0.8
critical: 0.9
总结
本章详细介绍了Kubernetes自动扩缩容的核心概念和实践方法:
- 基础概念: 掌握了HPA、VPA、Cluster Autoscaler等核心概念
- HPA配置: 学会了基于CPU、内存、自定义指标的HPA配置
- VPA配置: 理解了垂直扩缩容的工作原理和配置方法
- 集群扩缩容: 掌握了Cluster Autoscaler的部署和配置
- 实践示例: 通过完整案例掌握了自动扩缩容的应用
- 故障排查: 掌握了常见问题的诊断和解决方法
自动扩缩容是Kubernetes的核心能力,为应用的弹性伸缩和资源优化提供了强大的支持。
下一步学习
- 服务网格 - 学习Istio服务网格架构
- Helm Charts - 回顾Helm包管理器使用
- Operator模式 - 深入学习Operator开发
- 自定义资源 - 掌握CRD定义和管理