
Autoscaling

Overview

Autoscaling is one of Kubernetes' core capabilities: it adjusts resource allocation automatically in response to workload demand. Kubernetes offers several autoscaling mechanisms: the Horizontal Pod Autoscaler (HPA), the Vertical Pod Autoscaler (VPA), and the Cluster Autoscaler. Working together, they ensure applications get enough resources as load changes, while improving resource utilization and reducing operating cost.

Core Concepts

1. Horizontal Pod Autoscaler (HPA)

The HPA automatically scales the number of Pods in a Deployment, ReplicaSet, or StatefulSet based on observed CPU utilization or other metrics. For each metric it computes desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue):

  • Scales on CPU/memory utilization
  • Supports custom and external metrics
  • Declarative configuration; adjusts the replica count automatically
  • Best suited to stateless applications

2. Vertical Pod Autoscaler (VPA)

The VPA adjusts a container's CPU and memory requests and limits based on its observed resource usage:

  • Analyzes historical resource-usage data
  • Sets resource requests and limits automatically
  • Supports automatic updates as well as a recommendation-only mode
  • Best suited to applications with stable resource-usage patterns

3. Cluster Autoscaler

The Cluster Autoscaler adjusts the number of cluster nodes based on Pod scheduling demand and node utilization:

  • Watches for unschedulable Pods
  • Adds or removes nodes automatically
  • Integrates with cloud providers
  • Optimizes cluster-wide resource utilization

4. Scaling Policies

Scaling policies define how scaling happens:

  • Scaling rate limits
  • Cooldown periods
  • Stabilization windows
  • Behavior policies
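
These knobs map onto the behavior stanza of an autoscaling/v2 HPA; a minimal sketch (the numbers are illustrative, not recommendations):

```yaml
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0     # react to load spikes immediately
    policies:
    - type: Pods
      value: 4            # add at most 4 Pods...
      periodSeconds: 60   # ...per 60-second window
  scaleDown:
    stabilizationWindowSeconds: 300   # require 5 min of low load before shrinking
```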

HPA Configuration

HPA Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Horizontal Pod Autoscaler                 │
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │ Metrics      │ -> │   Decision   │ -> │   Scaling    │  │
│  │ Server       │    │   Engine     │    │   Action     │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│         ↑                   │                   │           │
│         │                   ↓                   ↓           │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Metrics    │    │   Desired    │    │   Replica    │  │
│  │   Collection │    │   Replicas   │    │   Count      │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
└─────────────────────────────────────────────────────────────┘

CPU-Based HPA

HPA manifest

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      - type: Pods
        value: 2
        periodSeconds: 60
      selectPolicy: Min
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max

Target Deployment

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: webapp
  namespace: default
spec:
  selector:
    app: webapp
  ports:
  - port: 80
    targetPort: 80
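
To exercise the HPA, generate artificial load against the webapp Service and watch the replica count follow CPU utilization (sketch using the public busybox image):

```bash
# Throwaway load generator; Ctrl-C (plus --rm) cleans it up:
kubectl run load-generator --rm -it --image=busybox:1.36 --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://webapp; done"

# In a second terminal, watch the HPA react:
kubectl get hpa webapp-hpa --watch
```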

Memory-Based HPA

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-memory-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Custom-Metrics HPA

The Metrics Server only serves resource metrics (CPU and memory); exposing custom metrics to the HPA additionally requires an adapter such as the Prometheus Adapter, deployed below.

Deploying the Metrics Server

bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

kubectl get pods -n kube-system -l k8s-app=metrics-server

kubectl top pods

kubectl top nodes
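
On clusters whose kubelets use self-signed certificates (kind, minikube, many lab setups), the Metrics Server may fail TLS verification against the kubelet. For test clusters only, the check can be disabled with a patch (assumption: the release installed above accepts extra container args this way):

```bash
# Test clusters only -- this disables kubelet certificate verification:
kubectl patch deployment metrics-server -n kube-system --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
```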

Deploying the Prometheus Adapter

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          kubernetes_namespace: {resource: namespace}
          kubernetes_pod_name: {resource: pod}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
    spec:
      serviceAccountName: custom-metrics-apiserver
      containers:
      - name: custom-metrics-apiserver
        image: directxman12/k8s-prometheus-adapter:v0.11.0
        args:
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/serving.crt
        - --tls-private-key-file=/var/run/serving-cert/serving.key
        - --logtostderr=true
        - --prometheus-url=http://prometheus.monitoring.svc:9090/
        - --metrics-relist-interval=30s
        - --v=10
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: false
        - mountPath: /etc/adapter/
          name: config
          readOnly: false
      volumes:
      - name: volume-serving-cert
        secret:
          secretName: adapter-serving-certs
      - name: config
        configMap:
          name: adapter-config
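
The Deployment alone is not enough: the adapter must also be exposed by a Service and registered as the custom.metrics.k8s.io aggregated API. A sketch of the two missing objects (insecureSkipTLSVerify keeps the example short; a production setup should supply a caBundle instead):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  selector:
    app: custom-metrics-apiserver
  ports:
  - port: 443
    targetPort: 6443   # matches --secure-port in the Deployment above
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true   # example only; use caBundle in production
  groupPriorityMinimum: 100
  versionPriority: 100
```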

HPA Using a Custom Metric

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-custom-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
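
Before relying on this HPA, confirm the adapter actually serves the metric (piping through jq is optional, used only for readable output):

```bash
kubectl get --raw \
  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" | jq .
```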

External-Metrics HPA

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-external-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: External
    external:
      metric:
        name: queue_messages_ready
        selector:
          matchLabels:
            queue: "myqueue"
      target:
        type: AverageValue
        averageValue: 30
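
External metrics also come from an adapter (for example the Prometheus Adapter's externalRules, or a cloud-specific adapter); verify that the series is visible through the external-metrics API before creating the HPA:

```bash
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queue_messages_ready" | jq .
```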

Multi-Metric HPA

When several metrics are configured, the HPA computes a desired replica count for each metric independently and scales to the largest of them.

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-multi-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
  - type: External
    external:
      metric:
        name: queue_messages_ready
        selector:
          matchLabels:
            queue: "myqueue"
      target:
        type: AverageValue
        averageValue: 30
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max

VPA Configuration

VPA Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Vertical Pod Autoscaler                    │
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │ Recommender  │ -> │   Updater    │ -> │   Admission  │  │
│  │              │    │              │    │   Controller │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│         ↑                   │                   │           │
│         │                   ↓                   ↓           │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Metrics    │    │   Pod        │    │   Resource   │  │
│  │   Server     │    │   Eviction   │    │   Update     │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
└─────────────────────────────────────────────────────────────┘

Installing the VPA

bash
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

kubectl get pods -n kube-system | grep vpa

VPA Manifests

Basic VPA

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
    - containerName: webapp
      minAllowed:
        cpu: 100m
        memory: 256Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits
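
In Auto mode the VPA Updater applies new requests by evicting Pods and letting the workload recreate them. A PodDisruptionBudget bounds how disruptive that is (sketch; the selector matches the webapp Deployment used throughout this chapter):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: webapp-pdb
  namespace: default
spec:
  minAvailable: 1          # keep at least one replica up during VPA-driven evictions
  selector:
    matchLabels:
      app: webapp
```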

Recommendation-Only Mode

With updateMode set to "Off" the VPA computes recommendations but never applies them (the value should be quoted, since unquoted Off parses as a YAML boolean):

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa-recommend
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Off"

Disabling VPA for a Container

A per-container mode of "Off" excludes that container from VPA entirely:

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa-off
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: webapp
      mode: "Off"

Viewing VPA Recommendations

bash
kubectl get vpa webapp-vpa -o yaml

kubectl describe vpa webapp-vpa

kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation.containerRecommendations}'

Combining VPA and HPA

Caution: do not let an HPA and a VPA act on the same resource metric. In the example below the HPA scales on CPU utilization while the VPA simultaneously rewrites CPU requests, so the two controllers can fight each other; in practice, pair a CPU-based HPA with a recommendation-only VPA, or drive the HPA from custom/external metrics instead.

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
    - containerName: webapp
      minAllowed:
        cpu: 100m
        memory: 256Mi
      maxAllowed:
        cpu: 1000m
        memory: 2Gi
      controlledResources: ["cpu", "memory"]

Cluster Autoscaling

Cluster Autoscaler Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Cluster Autoscaler                         │
│                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Unschedul  │ -> │   Scale      │ -> │   Node       │  │
│  │   Pods       │    │   Decision   │    │   Provision  │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│         ↑                   │                   │           │
│         │                   ↓                   ↓           │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Pod        │    │   Cloud      │    │   Node       │  │
│  │   Scheduler  │    │   Provider   │    │   Addition   │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
└─────────────────────────────────────────────────────────────┘

Deploying the Cluster Autoscaler

AWS EKS

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT_ID:role/cluster-autoscaler
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
- apiGroups: [""]
  resources: ["events", "endpoints"]
  verbs: ["create", "patch"]
- apiGroups: [""]
  resources: ["pods/eviction"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["pods/status"]
  verbs: ["update"]
- apiGroups: [""]
  resources: ["endpoints"]
  resourceNames: ["cluster-autoscaler"]
  verbs: ["get", "update"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["watch", "list", "get", "update"]
- apiGroups: [""]
  resources: ["pods", "services", "replicationcontrollers", "persistentvolumeclaims", "persistentvolumes"]
  verbs: ["watch", "list", "get"]
- apiGroups: ["extensions"]
  resources: ["replicasets", "daemonsets"]
  verbs: ["watch", "list", "get"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["watch", "list"]
- apiGroups: ["apps"]
  resources: ["statefulsets", "replicasets", "daemonsets"]
  verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
  verbs: ["watch", "list", "get"]
- apiGroups: ["batch", "extensions"]
  resources: ["jobs"]
  verbs: ["get", "list", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
- kind: ServiceAccount
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
      - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.25.0
        name: cluster-autoscaler
        resources:
          limits:
            cpu: 100m
            memory: 300Mi
          requests:
            cpu: 100m
            memory: 300Mi
        command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
        env:
        - name: AWS_REGION
          value: us-west-2
        volumeMounts:
        - name: ssl-certs
          mountPath: /etc/ssl/certs/ca-certificates.crt
          readOnly: true
      volumes:
      - name: ssl-certs
        hostPath:
          path: "/etc/ssl/certs/ca-bundle.crt"
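
The --node-group-auto-discovery flag above only finds Auto Scaling Groups carrying the matching tags. A sketch of tagging an existing ASG with the AWS CLI (my-asg and my-cluster are placeholders):

```bash
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=false" \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/my-cluster,Value=owned,PropagateAtLaunch=false"
```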

GKE

bash
gcloud container clusters update my-cluster \
  --enable-autoscaling \
  --min-nodes 1 \
  --max-nodes 10 \
  --zone us-central1-a

AKS

bash
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 10

Node Pool Configuration

yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSMachineTemplate
metadata:
  name: my-cluster-md-0
  namespace: default
spec:
  template:
    spec:
      instanceType: m5.large
      iamInstanceProfile: nodes.cluster-api-provider-aws.sigs.k8s.io
      sshKeyName: my-key
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: my-cluster-md-0
  namespace: default
spec:
  clusterName: my-cluster
  replicas: 3
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: my-cluster
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: my-cluster
    spec:
      clusterName: my-cluster
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: my-cluster-md-0
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSMachineTemplate
        name: my-cluster-md-0
      version: v1.25.0

Cluster Autoscaler Parameters

yaml
command:
- ./cluster-autoscaler
- --v=4
- --logtostderr=true
- --cloud-provider=aws
- --skip-nodes-with-local-storage=false
- --skip-nodes-with-system-pods=true
- --expander=least-waste
- --balance-similar-node-groups=true
- --max-node-provision-time=15m
- --max-unready-nodes=100
- --max-unready-percentage=45
- --ok-total-unready-count=3
- --scale-down-unneeded-time=10m
- --scale-down-unready-time=20m
- --scale-down-delay-after-add=10m
- --scale-down-delay-after-delete=10s
- --scale-down-delay-after-failure=3m
- --scale-down-non-empty-candidates-count=30
- --scale-down-candidates-pool-ratio=0.1
- --scale-down-candidates-pool-min-count=50
- --scan-interval=10s
- --max-empty-bulk-delete=10
- --min-replica-count=0
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
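
Scale-down also honors per-Pod opt-outs: a Pod annotated as below keeps the Cluster Autoscaler from removing the node it runs on (the Pod name is illustrative; in practice the annotation goes on a workload's Pod template):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
  - name: app
    image: nginx:1.25
```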

Worked Examples

Example 1: Autoscaling a Web Application

Deployment configuration

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - name: webapp
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: webapp
  namespace: production
spec:
  selector:
    app: webapp
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP

HPA configuration

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      - type: Pods
        value: 2
        periodSeconds: 60
      selectPolicy: Min
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max

VPA configuration

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
    - containerName: webapp
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits

Example 2: Autoscaling a Batch Workload

Job configuration

yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-processor
  namespace: batch
spec:
  parallelism: 3
  completions: 100
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      containers:
      - name: processor
        image: batch-processor:v1.0.0
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 2000m
            memory: 4Gi
        env:
        - name: QUEUE_URL
          value: "sqs://my-queue"
      restartPolicy: OnFailure

Queue-Length-Based HPA

Note: an HPA cannot scale a Job, so the HPA below assumes the queue consumers also run as a long-lived Deployment named batch-processor. With target averageValue: 10, for example, 200 visible messages push the desired replica count toward ceil(200 / 10) = 20, bounded by maxReplicas.

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: batch-processor-hpa
  namespace: batch
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: batch-processor
  minReplicas: 1
  maxReplicas: 50
  metrics:
  - type: External
    external:
      metric:
        name: sqs_queue_messages_visible
        selector:
          matchLabels:
            queue: "my-queue"
      target:
        type: AverageValue
        averageValue: 10
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600
      policies:
      - type: Percent
        value: 50
        periodSeconds: 120
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
      - type: Pods
        value: 10
        periodSeconds: 30
      selectPolicy: Max

Example 3: Autoscaling Microservices

Microservice Deployments

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
  namespace: microservices
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-gateway
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      containers:
      - name: api-gateway
        image: api-gateway:v1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
  namespace: microservices
spec:
  replicas: 2
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
      - name: user-service
        image: user-service:v1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 300m
            memory: 512Mi
          limits:
            cpu: 1000m
            memory: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: microservices
spec:
  replicas: 2
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
      - name: order-service
        image: order-service:v1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 400m
            memory: 512Mi
          limits:
            cpu: 1500m
            memory: 2Gi

Microservice HPA configuration

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-gateway-hpa
  namespace: microservices
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway
  minReplicas: 3
  maxReplicas: 30
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 500
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: user-service-hpa
  namespace: microservices
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
  namespace: microservices
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: External
    external:
      metric:
        name: kafka_consumer_lag
        selector:
          matchLabels:
            topic: "orders"
            group: "order-service"
      target:
        type: AverageValue
        averageValue: 100

kubectl Commands

Managing HPAs

bash
kubectl get hpa

kubectl get hpa -n production

kubectl describe hpa webapp-hpa

kubectl get hpa webapp-hpa -o yaml

kubectl apply -f hpa.yaml

kubectl delete hpa webapp-hpa

kubectl autoscale deployment webapp --cpu-percent=70 --min=2 --max=10

kubectl get hpa webapp-hpa -o jsonpath='{.status.currentMetrics}'

kubectl get hpa webapp-hpa -o jsonpath='{.status.desiredReplicas}'

kubectl patch hpa webapp-hpa -p '{"spec":{"maxReplicas":20}}'

kubectl edit hpa webapp-hpa

Managing VPAs

bash
kubectl get vpa

kubectl describe vpa webapp-vpa

kubectl get vpa webapp-vpa -o yaml

kubectl apply -f vpa.yaml

kubectl delete vpa webapp-vpa

kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation}'

kubectl get vpa webapp-vpa -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'

Managing the Cluster Autoscaler

bash
kubectl get pods -n kube-system -l app=cluster-autoscaler

kubectl logs -n kube-system -l app=cluster-autoscaler

kubectl logs -n kube-system -l app=cluster-autoscaler --tail=100

kubectl describe deployment cluster-autoscaler -n kube-system

kubectl get nodes

kubectl describe nodes | grep -A 5 "Allocated resources"

kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory

Monitoring and Debugging

bash
kubectl top pods

kubectl top pods -n production

kubectl top nodes

kubectl top pods -l app=webapp

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"

kubectl get events --sort-by='.lastTimestamp'

kubectl describe pod <pod-name>

Troubleshooting Guide

Problem 1: HPA Cannot Fetch Metrics

Symptoms

bash
kubectl get hpa webapp-hpa
NAME         REFERENCE           TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
webapp-hpa   Deployment/webapp   <unknown>/70%   2         10        2          5m

Diagnostic steps

bash
kubectl get pods -n kube-system -l k8s-app=metrics-server

kubectl logs -n kube-system -l k8s-app=metrics-server

kubectl top pods

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"

kubectl describe hpa webapp-hpa

Resolution

  • Check that the Metrics Server is running
  • Verify the Metrics Server configuration
  • Confirm resource requests are set on the target Pods (Utilization targets are computed against requests)
  • Check for network policies blocking metrics traffic

Problem 2: HPA Scaling Flaps

Symptoms

bash
kubectl get hpa webapp-hpa -w
NAME         REFERENCE           TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
webapp-hpa   Deployment/webapp   85%/70%    2         10        5          10m
webapp-hpa   Deployment/webapp   65%/70%    2         10        3          11m
webapp-hpa   Deployment/webapp   90%/70%    2         10        6          12m

Diagnostic steps

bash
kubectl describe hpa webapp-hpa

kubectl get events --field-selector involvedObject.name=webapp-hpa

kubectl top pods -l app=webapp

kubectl get pods -l app=webapp -o wide

Resolution

  • Lengthen the stabilization windows
  • Configure scale-up/scale-down policies
  • Tune the metrics collection interval
  • Examine the application's load pattern

Problem 3: VPA Recommendations Look Wrong

Symptoms

bash
kubectl describe vpa webapp-vpa
...
Recommendation:
  Container Recommendations:
    Container Name:  webapp
    Lower Bound:
      Cpu:     25m
      Memory:  262144k
    Target:
      Cpu:     25m
      Memory:  262144k
    Uncapped Target:
      Cpu:     25m
      Memory:  262144k
    Upper Bound:
      Cpu:     25m
      Memory:  262144k

Diagnostic steps

bash
kubectl describe vpa webapp-vpa

kubectl top pods -l app=webapp

kubectl get pods -l app=webapp -o jsonpath='{.items[0].spec.containers[0].resources}'

kubectl logs -n kube-system -l app=vpa-recommender

Resolution

  • Wait for more usage history to accumulate
  • Adjust the VPA's minAllowed/maxAllowed bounds
  • Examine the application's resource-usage pattern
  • Verify the VPA configuration

Problem 4: Cluster Autoscaler Does Not Scale Up

Symptoms

bash
kubectl get pods -l app=webapp
NAME                      READY   STATUS    RESTARTS   AGE
webapp-6d4b5d6f5-abcde    0/1     Pending   0          5m
webapp-6d4b5d6f5-fghij    0/1     Pending   0          5m

Diagnostic steps

bash
kubectl describe pod webapp-6d4b5d6f5-abcde

kubectl logs -n kube-system -l app=cluster-autoscaler

kubectl get nodes

kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml

kubectl get events --field-selector reason=ClusterAutoscaler

Resolution

  • Check the node pool configuration
  • Verify cloud-provider quotas
  • Confirm node-group tags and labels are correct
  • Check that Pod resource requests fit the available instance types

Problem 5: Pods Are Frequently Evicted

Symptoms

bash
kubectl get events --field-selector reason=Evicted
LAST SEEN   TYPE      REASON    OBJECT                         MESSAGE
2m          Warning   Evicted   pod/webapp-6d4b5d6f5-abcde     Pod was evicted

Diagnostic steps

bash
kubectl describe node <node-name>

kubectl get pods --field-selector=status.phase=Failed

kubectl top nodes

kubectl describe pod <evicted-pod>

kubectl get resourcequota -n production

Resolution

  • Adjust resource requests and limits
  • Configure Pod priorities
  • Rebalance node resource allocation
  • Check ResourceQuota limits

Best Practices

1. HPA Best Practices

Set resource requests

HPA Utilization targets are percentages of each container's requests, so the target workload must declare them:

yaml
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi

Scaling behavior

yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 10
      periodSeconds: 60
  scaleUp:
    stabilizationWindowSeconds: 60
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15

2. VPA Best Practices

Choose an update mode

yaml
updatePolicy:
  updateMode: Auto

Bound the recommendations

yaml
resourcePolicy:
  containerPolicies:
  - containerName: webapp
    minAllowed:
      cpu: 100m
      memory: 128Mi
    maxAllowed:
      cpu: 2000m
      memory: 4Gi

3. Cluster Autoscaler Best Practices

Node pool configuration

  • Set sensible minimum/maximum node counts
  • Mix several instance types
  • Configure node labels and taints
  • Choose instance sizes that fit typical Pod requests

Scaling parameters

yaml
- --scale-down-unneeded-time=10m
- --scale-down-delay-after-add=10m
- --max-node-provision-time=15m
- --balance-similar-node-groups=true

4. Monitoring Best Practices

Metrics collection

  • Deploy the Metrics Server
  • Set up Prometheus monitoring
  • Define alerting rules
  • Monitor scaling events

Alerting rules

The rules below use the kube_hpa_* series exported by older kube-state-metrics releases; kube-state-metrics v2+ renames them to kube_horizontalpodautoscaler_*:

yaml
groups:
- name: autoscaling.rules
  rules:
  - alert: HPAUnableToScale
    expr: |
      kube_hpa_status_condition{condition="ScalingLimited", status="true"} == 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: HPA {{ $labels.hpa }} is unable to scale
      description: HPA {{ $labels.hpa }} has been unable to scale for 5 minutes
  
  - alert: HPAMaxReplicasReached
    expr: |
      kube_hpa_status_current_replicas == kube_hpa_spec_max_replicas
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: HPA {{ $labels.hpa }} reached max replicas
      description: HPA {{ $labels.hpa }} has reached max replicas {{ $labels.max_replicas }}

5. Cost-Optimization Best Practices

Resource planning

  • Set resource requests realistically
  • Use Spot/Preemptible instances where interruption is tolerable
  • Enable cluster autoscaling
  • Optimize Pod scheduling policies

Cost monitoring

The ConfigMap below is an illustrative input for a hypothetical cost-monitoring tool, not a standard Kubernetes schema:

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cost-monitor
data:
  config.yaml: |
    resources:
      cpu:
        cost_per_core: 0.05
      memory:
        cost_per_gb: 0.01
    thresholds:
      warning: 0.8
      critical: 0.9

Summary

This chapter covered the core concepts and practice of Kubernetes autoscaling:

  1. Fundamentals: the HPA, VPA, and Cluster Autoscaler, and how they fit together
  2. HPA configuration: scaling on CPU, memory, custom, and external metrics
  3. VPA configuration: how vertical scaling works and how to configure it
  4. Cluster autoscaling: deploying and configuring the Cluster Autoscaler
  5. Worked examples: end-to-end autoscaling setups for common workloads
  6. Troubleshooting: diagnosing and resolving common problems

Autoscaling is central to running elastic, cost-efficient workloads on Kubernetes.

Next Steps