
Autoscaling

Overview

Autoscaling is one of Kubernetes' core capabilities: it adjusts resource allocation automatically in response to workload demand. Kubernetes offers several autoscaling mechanisms: the Horizontal Pod Autoscaler (HPA), the Vertical Pod Autoscaler (VPA), and the Cluster Autoscaler. Working together, they ensure applications get enough resources as load changes, while improving resource utilization and reducing operating cost.

Core Concepts

1. Horizontal Pod Autoscaler (HPA)

The HPA automatically scales the number of Pods in a Deployment, ReplicaSet, or StatefulSet based on observed CPU utilization or other metrics. For each metric it computes desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue):

  • Scales on CPU/memory utilization
  • Supports custom and external metrics
  • Declarative configuration; adjusts the replica count automatically
  • Best suited to stateless applications

2. Vertical Pod Autoscaler (VPA)

The VPA adjusts a container's CPU and memory requests and limits based on its observed resource usage:

  • Analyzes historical resource-usage data
  • Sets resource requests and limits automatically
  • Supports automatic updates as well as a recommendation-only mode
  • Best suited to applications with stable resource-usage patterns

3. Cluster Autoscaler

The Cluster Autoscaler adjusts the number of cluster nodes based on Pod scheduling demand and node utilization:

  • Watches for unschedulable Pods
  • Adds or removes nodes automatically
  • Integrates with cloud providers
  • Optimizes cluster-wide resource utilization

4. Scaling Policies

Scaling policies define how scaling happens:

  • Scaling rate limits
  • Cooldown periods
  • Stabilization windows
  • Behavior policies
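
These knobs map onto the behavior stanza of an autoscaling/v2 HPA; a minimal sketch (the numbers are illustrative, not recommendations):

```yaml
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0     # react to load spikes immediately
    policies:
    - type: Pods
      value: 4            # add at most 4 Pods...
      periodSeconds: 60   # ...per 60-second window
  scaleDown:
    stabilizationWindowSeconds: 300   # require 5 min of low load before shrinking
```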

HPA Configuration

HPA Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Horizontal Pod Autoscaler                 │
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │ Metrics      │ -> │   Decision   │ -> │   Scaling    │  │
│  │ Server       │    │   Engine     │    │   Action     │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│         ↑                   │                   │           │
│         │                   ↓                   ↓           │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Metrics    │    │   Desired    │    │   Replica    │  │
│  │   Collection │    │   Replicas   │    │   Count      │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
└─────────────────────────────────────────────────────────────┘

CPU-Based HPA

HPA manifest

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      - type: Pods
        value: 2
        periodSeconds: 60
      selectPolicy: Min
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max

Target Deployment

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: webapp
  namespace: default
spec:
  selector:
    app: webapp
  ports:
  - port: 80
    targetPort: 80
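
To exercise the HPA, generate artificial load against the webapp Service and watch the replica count follow CPU utilization (sketch using the public busybox image):

```bash
# Throwaway load generator; Ctrl-C (plus --rm) cleans it up:
kubectl run load-generator --rm -it --image=busybox:1.36 --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://webapp; done"

# In a second terminal, watch the HPA react:
kubectl get hpa webapp-hpa --watch
```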

Memory-Based HPA

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-memory-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Custom-Metrics HPA

The Metrics Server only serves resource metrics (CPU and memory); exposing custom metrics to the HPA additionally requires an adapter such as the Prometheus Adapter, deployed below.

Deploying the Metrics Server

bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

kubectl get pods -n kube-system -l k8s-app=metrics-server

kubectl top pods

kubectl top nodes
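
On clusters whose kubelets use self-signed certificates (kind, minikube, many lab setups), the Metrics Server may fail TLS verification against the kubelet. For test clusters only, the check can be disabled with a patch (assumption: the release installed above accepts extra container args this way):

```bash
# Test clusters only -- this disables kubelet certificate verification:
kubectl patch deployment metrics-server -n kube-system --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
```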

Deploying the Prometheus Adapter

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          kubernetes_namespace: {resource: namespace}
          kubernetes_pod_name: {resource: pod}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
    spec:
      serviceAccountName: custom-metrics-apiserver
      containers:
      - name: custom-metrics-apiserver
        image: directxman12/k8s-prometheus-adapter:v0.11.0
        args:
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/serving.crt
        - --tls-private-key-file=/var/run/serving-cert/serving.key
        - --logtostderr=true
        - --prometheus-url=http://prometheus.monitoring.svc:9090/
        - --metrics-relist-interval=30s
        - --v=10
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: false
        - mountPath: /etc/adapter/
          name: config
          readOnly: false
      volumes:
      - name: volume-serving-cert
        secret:
          secretName: adapter-serving-certs
      - name: config
        configMap:
          name: adapter-config
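
The Deployment alone is not enough: the adapter must also be exposed by a Service and registered as the custom.metrics.k8s.io aggregated API. A sketch of the two missing objects (insecureSkipTLSVerify keeps the example short; a production setup should supply a caBundle instead):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  selector:
    app: custom-metrics-apiserver
  ports:
  - port: 443
    targetPort: 6443   # matches --secure-port in the Deployment above
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true   # example only; use caBundle in production
  groupPriorityMinimum: 100
  versionPriority: 100
```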

HPA Using a Custom Metric

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-custom-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
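
Before relying on this HPA, confirm the adapter actually serves the metric (piping through jq is optional, used only for readable output):

```bash
kubectl get --raw \
  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" | jq .
```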

External-Metrics HPA

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-external-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: External
    external:
      metric:
        name: queue_messages_ready
        selector:
          matchLabels:
            queue: "myqueue"
      target:
        type: AverageValue
        averageValue: 30
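
External metrics also come from an adapter (for example the Prometheus Adapter's externalRules, or a cloud-specific adapter); verify that the series is visible through the external-metrics API before creating the HPA:

```bash
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queue_messages_ready" | jq .
```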

Multi-Metric HPA

When several metrics are configured, the HPA computes a desired replica count for each metric independently and scales to the largest of them.

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-multi-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
  - type: External
    external:
      metric:
        name: queue_messages_ready
        selector:
          matchLabels:
            queue: "myqueue"
      target:
        type: AverageValue
        averageValue: 30
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max

VPA Configuration

VPA Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Vertical Pod Autoscaler                    │
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │ Recommender  │ -> │   Updater    │ -> │   Admission  │  │
│  │              │    │              │    │   Controller │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│         ↑                   │                   │           │
│         │                   ↓                   ↓           │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Metrics    │    │   Pod        │    │   Resource   │  │
│  │   Server     │    │   Eviction   │    │   Update     │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
└─────────────────────────────────────────────────────────────┘

Installing the VPA

bash
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

kubectl get pods -n kube-system | grep vpa

VPA Manifests

Basic VPA

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
    - containerName: webapp
      minAllowed:
        cpu: 100m
        memory: 256Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits
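
In Auto mode the VPA Updater applies new requests by evicting Pods and letting the workload recreate them. A PodDisruptionBudget bounds how disruptive that is (sketch; the selector matches the webapp Deployment used throughout this chapter):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: webapp-pdb
  namespace: default
spec:
  minAvailable: 1          # keep at least one replica up during VPA-driven evictions
  selector:
    matchLabels:
      app: webapp
```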

Recommendation-Only Mode

With updateMode set to "Off" the VPA computes recommendations but never applies them (the value should be quoted, since unquoted Off parses as a YAML boolean):

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa-recommend
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Off"

Disabling VPA for a Container

A per-container mode of "Off" excludes that container from VPA entirely:

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa-off
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: webapp
      mode: "Off"

Viewing VPA Recommendations

bash
kubectl get vpa webapp-vpa -o yaml

kubectl describe vpa webapp-vpa

kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation.containerRecommendations}'

Combining VPA and HPA

Caution: do not let an HPA and a VPA act on the same resource metric. In the example below the HPA scales on CPU utilization while the VPA simultaneously rewrites CPU requests, so the two controllers can fight each other; in practice, pair a CPU-based HPA with a recommendation-only VPA, or drive the HPA from custom/external metrics instead.

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
    - containerName: webapp
      minAllowed:
        cpu: 100m
        memory: 256Mi
      maxAllowed:
        cpu: 1000m
        memory: 2Gi
      controlledResources: ["cpu", "memory"]

Cluster Autoscaling

Cluster Autoscaler Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Cluster Autoscaler                         │
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Unschedul  │ -> │   Scale      │ -> │   Node       │  │
│  │   Pods       │    │   Decision   │    │   Provision  │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│         ↑                   │                   │           │
│         │                   ↓                   ↓           │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Pod        │    │   Cloud      │    │   Node       │  │
│  │   Scheduler  │    │   Provider   │    │   Addition   │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
└─────────────────────────────────────────────────────────────┘

Deploying the Cluster Autoscaler

AWS EKS

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT_ID:role/cluster-autoscaler
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
- apiGroups: [""]
  resources: ["events", "endpoints"]
  verbs: ["create", "patch"]
- apiGroups: [""]
  resources: ["pods/eviction"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["pods/status"]
  verbs: ["update"]
- apiGroups: [""]
  resources: ["endpoints"]
  resourceNames: ["cluster-autoscaler"]
  verbs: ["get", "update"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["watch", "list", "get", "update"]
- apiGroups: [""]
  resources: ["pods", "services", "replicationcontrollers", "persistentvolumeclaims", "persistentvolumes"]
  verbs: ["watch", "list", "get"]
- apiGroups: ["extensions"]
  resources: ["replicasets", "daemonsets"]
  verbs: ["watch", "list", "get"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["watch", "list"]
- apiGroups: ["apps"]
  resources: ["statefulsets", "replicasets", "daemonsets"]
  verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
  verbs: ["watch", "list", "get"]
- apiGroups: ["batch", "extensions"]
  resources: ["jobs"]
  verbs: ["get", "list", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
- kind: ServiceAccount
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
      - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.25.0
        name: cluster-autoscaler
        resources:
          limits:
            cpu: 100m
            memory: 300Mi
          requests:
            cpu: 100m
            memory: 300Mi
        command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
        env:
        - name: AWS_REGION
          value: us-west-2
        volumeMounts:
        - name: ssl-certs
          mountPath: /etc/ssl/certs/ca-certificates.crt
          readOnly: true
      volumes:
      - name: ssl-certs
        hostPath:
          path: "/etc/ssl/certs/ca-bundle.crt"
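
The --node-group-auto-discovery flag above only finds Auto Scaling Groups carrying the matching tags. A sketch of tagging an existing ASG with the AWS CLI (my-asg and my-cluster are placeholders):

```bash
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=false" \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/my-cluster,Value=owned,PropagateAtLaunch=false"
```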

GKE

bash
gcloud container clusters update my-cluster \
  --enable-autoscaling \
  --min-nodes 1 \
  --max-nodes 10 \
  --zone us-central1-a

AKS

bash
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 10

Node Pool Configuration

yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSMachineTemplate
metadata:
  name: my-cluster-md-0
  namespace: default
spec:
  template:
    spec:
      instanceType: m5.large
      iamInstanceProfile: nodes.cluster-api-provider-aws.sigs.k8s.io
      sshKeyName: my-key
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: my-cluster-md-0
  namespace: default
spec:
  clusterName: my-cluster
  replicas: 3
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: my-cluster
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: my-cluster
    spec:
      clusterName: my-cluster
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: my-cluster-md-0
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSMachineTemplate
        name: my-cluster-md-0
      version: v1.25.0

Cluster Autoscaler Parameters

yaml
command:
- ./cluster-autoscaler
- --v=4
- --logtostderr=true
- --cloud-provider=aws
- --skip-nodes-with-local-storage=false
- --skip-nodes-with-system-pods=true
- --expander=least-waste
- --balance-similar-node-groups=true
- --max-node-provision-time=15m
- --max-unready-nodes=100
- --max-unready-percentage=45
- --ok-total-unready-count=3
- --scale-down-unneeded-time=10m
- --scale-down-unready-time=20m
- --scale-down-delay-after-add=10m
- --scale-down-delay-after-delete=10s
- --scale-down-delay-after-failure=3m
- --scale-down-non-empty-candidates-count=30
- --scale-down-candidates-pool-ratio=0.1
- --scale-down-candidates-pool-min-count=50
- --scan-interval=10s
- --max-empty-bulk-delete=10
- --min-replica-count=0
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
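
Scale-down also honors per-Pod opt-outs: a Pod annotated as below keeps the Cluster Autoscaler from removing the node it runs on (the Pod name is illustrative; in practice the annotation goes on a workload's Pod template):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
  - name: app
    image: nginx:1.25
```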

Worked Examples

Example 1: Autoscaling a Web Application

Deployment configuration

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - name: webapp
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: webapp
  namespace: production
spec:
  selector:
    app: webapp
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP

HPA configuration

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      - type: Pods
        value: 2
        periodSeconds: 60
      selectPolicy: Min
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max

VPA configuration

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
    - containerName: webapp
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits

Example 2: Autoscaling a Batch Workload

Job configuration

yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-processor
  namespace: batch
spec:
  parallelism: 3
  completions: 100
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      containers:
      - name: processor
        image: batch-processor:v1.0.0
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 2000m
            memory: 4Gi
        env:
        - name: QUEUE_URL
          value: "sqs://my-queue"
      restartPolicy: OnFailure

Queue-Length-Based HPA

Note: an HPA cannot scale a Job, so the HPA below assumes the queue consumers also run as a long-lived Deployment named batch-processor. With target averageValue: 10, for example, 200 visible messages push the desired replica count toward ceil(200 / 10) = 20, bounded by maxReplicas.

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: batch-processor-hpa
  namespace: batch
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: batch-processor
  minReplicas: 1
  maxReplicas: 50
  metrics:
  - type: External
    external:
      metric:
        name: sqs_queue_messages_visible
        selector:
          matchLabels:
            queue: "my-queue"
      target:
        type: AverageValue
        averageValue: 10
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600
      policies:
      - type: Percent
        value: 50
        periodSeconds: 120
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
      - type: Pods
        value: 10
        periodSeconds: 30
      selectPolicy: Max

Example 3: Autoscaling Microservices

Microservice Deployments

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
  namespace: microservices
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-gateway
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      containers:
      - name: api-gateway
        image: api-gateway:v1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
  namespace: microservices
spec:
  replicas: 2
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
      - name: user-service
        image: user-service:v1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 300m
            memory: 512Mi
          limits:
            cpu: 1000m
            memory: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: microservices
spec:
  replicas: 2
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
      - name: order-service
        image: order-service:v1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 400m
            memory: 512Mi
          limits:
            cpu: 1500m
            memory: 2Gi

Microservice HPA configuration

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-gateway-hpa
  namespace: microservices
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway
  minReplicas: 3
  maxReplicas: 30
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 500
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: user-service-hpa
  namespace: microservices
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
  namespace: microservices
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: External
    external:
      metric:
        name: kafka_consumer_lag
        selector:
          matchLabels:
            topic: "orders"
            group: "order-service"
      target:
        type: AverageValue
        averageValue: 100

kubectl Commands

Managing HPAs

bash
kubectl get hpa

kubectl get hpa -n production

kubectl describe hpa webapp-hpa

kubectl get hpa webapp-hpa -o yaml

kubectl apply -f hpa.yaml

kubectl delete hpa webapp-hpa

kubectl autoscale deployment webapp --cpu-percent=70 --min=2 --max=10

kubectl get hpa webapp-hpa -o jsonpath='{.status.currentMetrics}'

kubectl get hpa webapp-hpa -o jsonpath='{.status.desiredReplicas}'

kubectl patch hpa webapp-hpa -p '{"spec":{"maxReplicas":20}}'

kubectl edit hpa webapp-hpa

Managing VPAs

bash
kubectl get vpa

kubectl describe vpa webapp-vpa

kubectl get vpa webapp-vpa -o yaml

kubectl apply -f vpa.yaml

kubectl delete vpa webapp-vpa

kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation}'

kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'

Managing the Cluster Autoscaler

bash
kubectl get pods -n kube-system -l app=cluster-autoscaler

kubectl logs -n kube-system -l app=cluster-autoscaler

kubectl logs -n kube-system -l app=cluster-autoscaler --tail=100

kubectl describe deployment cluster-autoscaler -n kube-system

kubectl get nodes

kubectl describe nodes | grep -A 5 "Allocated resources"

kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory

Monitoring and Debugging

bash
kubectl top pods

kubectl top pods -n production

kubectl top nodes

kubectl top pods -l app=webapp

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"

kubectl get events --sort-by='.lastTimestamp'

kubectl describe pod <pod-name>

Troubleshooting Guide

Problem 1: HPA Cannot Fetch Metrics

Symptoms

bash
kubectl get hpa webapp-hpa
NAME         REFERENCE           TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
webapp-hpa   Deployment/webapp   <unknown>/70%   2         10        2          5m

Diagnostic steps

bash
kubectl get pods -n kube-system -l k8s-app=metrics-server

kubectl logs -n kube-system -l k8s-app=metrics-server

kubectl top pods

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"

kubectl describe hpa webapp-hpa

Resolution

  • Check that the Metrics Server is running
  • Verify the Metrics Server configuration
  • Confirm resource requests are set on the target Pods (Utilization targets are computed against requests)
  • Check for network policies blocking metrics traffic

Problem 2: HPA Scaling Flaps

Symptoms

bash
kubectl get hpa webapp-hpa -w
NAME         REFERENCE           TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
webapp-hpa   Deployment/webapp   85%/70%    2         10        5          10m
webapp-hpa   Deployment/webapp   65%/70%    2         10        3          11m
webapp-hpa   Deployment/webapp   90%/70%    2         10        6          12m

Diagnostic steps

bash
kubectl describe hpa webapp-hpa

kubectl get events --field-selector involvedObject.name=webapp-hpa

kubectl top pods -l app=webapp

kubectl get pods -l app=webapp -o wide

Resolution

  • Lengthen the stabilization windows
  • Configure scale-up/scale-down policies
  • Tune the metrics collection interval
  • Examine the application's load pattern

Problem 3: VPA Recommendations Look Wrong

Symptoms

bash
kubectl describe vpa webapp-vpa
...
Recommendation:
  Container Recommendations:
    Container Name:  webapp
    Lower Bound:
      Cpu:     25m
      Memory:  262144k
    Target:
      Cpu:     25m
      Memory:  262144k
    Uncapped Target:
      Cpu:     25m
      Memory:  262144k
    Upper Bound:
      Cpu:     25m
      Memory:  262144k

Diagnostic steps

bash
kubectl describe vpa webapp-vpa

kubectl top pods -l app=webapp

kubectl get pods -l app=webapp -o jsonpath='{.items[0].spec.containers[0].resources}'

kubectl logs -n kube-system -l app=vpa-recommender

Resolution

  • Wait for more usage history to accumulate
  • Adjust the VPA's minAllowed/maxAllowed bounds
  • Examine the application's resource-usage pattern
  • Verify the VPA configuration

Problem 4: Cluster Autoscaler Does Not Scale Up

Symptoms

bash
kubectl get pods -l app=webapp
NAME                      READY   STATUS    RESTARTS   AGE
webapp-6d4b5d6f5-abcde    0/1     Pending   0          5m
webapp-6d4b5d6f5-fghij    0/1     Pending   0          5m

Diagnostic steps

bash
kubectl describe pod webapp-6d4b5d6f5-abcde

kubectl logs -n kube-system -l app=cluster-autoscaler

kubectl get nodes

kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml

kubectl get events --field-selector reason=ClusterAutoscaler

Resolution

  • Check the node pool configuration
  • Verify cloud-provider quotas
  • Confirm node-group tags and labels are correct
  • Check that Pod resource requests fit the available instance types

Problem 5: Pods Are Frequently Evicted

Symptoms

bash
kubectl get events --field-selector reason=Evicted
LAST SEEN   TYPE      REASON    OBJECT                         MESSAGE
2m          Warning   Evicted   pod/webapp-6d4b5d6f5-abcde     Pod was evicted

Diagnostic steps

bash
kubectl describe node <node-name>

kubectl get pods --field-selector=status.phase=Failed

kubectl top nodes

kubectl describe pod <evicted-pod>

kubectl get resourcequota -n production

Resolution

  • Adjust resource requests and limits
  • Configure Pod priorities
  • Rebalance node resource allocation
  • Check ResourceQuota limits

Best Practices

1. HPA Best Practices

Set resource requests

HPA Utilization targets are percentages of each container's requests, so the target workload must declare them:

yaml
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi

Scaling behavior

yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 10
      periodSeconds: 60
  scaleUp:
    stabilizationWindowSeconds: 60
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15

2. VPA Best Practices

Choose an update mode

yaml
updatePolicy:
  updateMode: Auto

Bound the recommendations

yaml
resourcePolicy:
  containerPolicies:
  - containerName: webapp
    minAllowed:
      cpu: 100m
      memory: 128Mi
    maxAllowed:
      cpu: 2000m
      memory: 4Gi

3. Cluster Autoscaler Best Practices

Node pool configuration

  • Set sensible minimum/maximum node counts
  • Mix several instance types
  • Configure node labels and taints
  • Choose instance sizes that fit typical Pod requests

Scaling parameters

yaml
- --scale-down-unneeded-time=10m
- --scale-down-delay-after-add=10m
- --max-node-provision-time=15m
- --balance-similar-node-groups=true

4. Monitoring Best Practices

Metrics collection

  • Deploy the Metrics Server
  • Set up Prometheus monitoring
  • Define alerting rules
  • Monitor scaling events

Alerting rules

The rules below use the kube_hpa_* series exported by older kube-state-metrics releases; kube-state-metrics v2+ renames them to kube_horizontalpodautoscaler_*:

yaml
groups:
- name: autoscaling.rules
  rules:
  - alert: HPAUnableToScale
    expr: |
      kube_hpa_status_condition{condition="ScalingLimited", status="true"} == 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: HPA {{ $labels.hpa }} is unable to scale
      description: HPA {{ $labels.hpa }} has been unable to scale for 5 minutes
  
  - alert: HPAMaxReplicasReached
    expr: |
      kube_hpa_status_current_replicas == kube_hpa_spec_max_replicas
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: HPA {{ $labels.hpa }} reached max replicas
      description: HPA {{ $labels.hpa }} has reached max replicas {{ $labels.max_replicas }}

5. Cost-Optimization Best Practices

Resource planning

  • Set resource requests realistically
  • Use Spot/Preemptible instances where interruption is tolerable
  • Enable cluster autoscaling
  • Optimize Pod scheduling policies

Cost monitoring

The ConfigMap below is an illustrative input for a hypothetical cost-monitoring tool, not a standard Kubernetes schema:

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cost-monitor
data:
  config.yaml: |
    resources:
      cpu:
        cost_per_core: 0.05
      memory:
        cost_per_gb: 0.01
    thresholds:
      warning: 0.8
      critical: 0.9

Summary

This chapter covered the core concepts and practice of Kubernetes autoscaling:

  1. Fundamentals: the HPA, VPA, and Cluster Autoscaler, and how they fit together
  2. HPA configuration: scaling on CPU, memory, custom, and external metrics
  3. VPA configuration: how vertical scaling works and how to configure it
  4. Cluster autoscaling: deploying and configuring the Cluster Autoscaler
  5. Worked examples: end-to-end autoscaling setups for common workloads
  6. Troubleshooting: diagnosing and resolving common problems

Autoscaling is central to running elastic, cost-efficient workloads on Kubernetes.

Next Steps