
StorageClass

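Overview

A StorageClass is a Kubernetes resource that describes a class of storage and enables dynamic volume provisioning. Administrators define classes with different performance and feature characteristics, and users request storage through a PersistentVolumeClaim (PVC); the matching PersistentVolume (PV) is then created automatically.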

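Core Concepts

Dynamic Provisioning

  • PVs are created automatically in response to PVC requests
  • Administrators do not need to pre-provision PVs by hand
  • Multiple storage backends are supported (cloud storage, NFS, Ceph, and others)

StorageClass Capabilities

  • Define storage performance tiers (SSD, HDD, and so on)
  • Configure storage parameters (IOPS, throughput, and so on)
  • Set the reclaim policy and the volume binding mode
  • Allow volume expansion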

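Workflow

  1. The user creates a PVC
  2. The PVC specifies a StorageClass
  3. The provisioner detects the PVC request
  4. The provisioner calls the storage backend API to create the volume
  5. A PV is created automatically and bound to the PVC
  6. The Pod uses the PVC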

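StorageClass Core Fields

Main Configuration Fields

| Field | Description | Example |
|---|---|---|
| provisioner | Volume plugin or CSI driver that provisions the PVs | kubernetes.io/aws-ebs |
| parameters | Provisioner-specific parameters | type: gp2 |
| reclaimPolicy | Reclaim policy for dynamically provisioned PVs (default: Delete) | Delete/Retain |
| volumeBindingMode | When volume binding and provisioning occur (default: Immediate) | Immediate/WaitForFirstConsumer |
| allowVolumeExpansion | Whether PVCs of this class can be expanded | true/false |
| mountOptions | Mount options applied to PVs of this class | ["hard", "nfsvers=4.1"] |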
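To make the role of each field concrete, here is a minimal shell sketch that renders a StorageClass manifest from a handful of arguments. The function name `render_storageclass` and the sample class name `demo-class` are hypothetical; the fallback values mirror the Kubernetes defaults for `reclaimPolicy` and `volumeBindingMode`. The script only prints YAML and does not contact a cluster.

```shell
# render_storageclass NAME PROVISIONER [RECLAIM_POLICY] [BINDING_MODE] [ALLOW_EXPANSION]
# Hypothetical helper: prints a StorageClass manifest to stdout.
render_storageclass() {
  name=$1
  provisioner=$2
  reclaim_policy=${3:-Delete}       # Kubernetes default reclaim policy
  binding_mode=${4:-Immediate}      # Kubernetes default binding mode
  allow_expansion=${5:-false}
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ${name}
provisioner: ${provisioner}
reclaimPolicy: ${reclaim_policy}
volumeBindingMode: ${binding_mode}
allowVolumeExpansion: ${allow_expansion}
EOF
}

# Example: print a manifest for a hypothetical class named demo-class.
render_storageclass demo-class ebs.csi.aws.com Retain WaitForFirstConsumer true
```

Piping the output to `kubectl apply -f -` would create the class; provisioner-specific `parameters` are left out here because they differ for every backend.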

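Binding Modes

| Mode | Description | Typical use |
|---|---|---|
| Immediate | The PV is provisioned and bound as soon as the PVC is created | Cloud or network storage without topology constraints |
| WaitForFirstConsumer | Provisioning and binding are delayed until a Pod that uses the PVC is scheduled | Local storage and zone-constrained storage |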

Complete YAML Configuration Examples

1. AWS EBS StorageClass

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: aws-ebs-gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-east-1a
    - us-east-1b

2. Azure Disk StorageClass

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-disk-premium
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Premium_LRS
  kind: Managed
  cachingmode: ReadOnly
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

3. GCE Persistent Disk StorageClass

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gce-pd-ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone  # zone key for the in-tree kubernetes.io/gce-pd provisioner
    values:
    - us-central1-a
    - us-central1-b

4. NFS StorageClass (using an NFS provisioner)

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: nfs-storage.example.com/nfs
parameters:
  archiveOnDelete: "true"
  pathPattern: "${.PVC.namespace}/${.PVC.name}"
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - hard
  - nfsvers=4.1
  - noatime

5. Ceph RBD StorageClass

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: ceph-cluster
  pool: kubernetes
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: ceph-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph
  csi.storage.k8s.io/node-stage-secret-name: ceph-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
mountOptions:
  - discard

6. Local StorageClass

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete

7. High-Performance SSD StorageClass

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    description: "High performance SSD storage for databases"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "50"
  fsType: xfs
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
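  # allowedTopologies only accepts topology keys reported by the provisioner
  # (for EBS, the zone); node instance types cannot be used here.
  - key: topology.kubernetes.io/zone
    values:
    - us-east-1a
    - us-east-1b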
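The `iopsPerGB` parameter is a ratio: the provisioner multiplies it by the requested volume size to obtain the provisioned IOPS, and the result is capped at the maximum that the volume type allows. The sketch below shows that arithmetic. The function name `provisioned_iops` is hypothetical, and the cap of 64000 in the example is an assumed value; check the current limits for your volume type.

```shell
# provisioned_iops SIZE_GIB IOPS_PER_GB MAX_IOPS
# Hypothetical helper: size times ratio, capped at MAX_IOPS.
provisioned_iops() {
  size_gib=$1
  iops_per_gb=$2
  max_iops=$3
  iops=$((size_gib * iops_per_gb))
  if [ "$iops" -gt "$max_iops" ]; then
    iops=$max_iops
  fi
  echo "$iops"
}

# A 100 GiB claim at 50 IOPS per GiB, with an assumed cap of 64000.
provisioned_iops 100 50 64000   # prints 5000
```

Because IOPS scale with size, a small PVC on an `io1` class may deliver less than expected; requesting a larger volume raises the provisioned IOPS until the cap is reached.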

8. Object Storage StorageClass (MinIO)

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: minio-object-storage
provisioner: minio.direct.csi.min.io
parameters:
  csi.storage.k8s.io/provisioner-secret-name: minio-secret
  csi.storage.k8s.io/provisioner-secret-namespace: minio
reclaimPolicy: Delete
volumeBindingMode: Immediate

kubectl Commands

StorageClass Management Commands

bash
# List all StorageClasses
kubectl get storageclass
kubectl get sc

# Show details of a StorageClass
kubectl describe sc aws-ebs-gp2

# Show the default StorageClass
kubectl get sc -o jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}'

# Create a StorageClass
kubectl apply -f storageclass.yaml

# Delete a StorageClass
kubectl delete sc aws-ebs-gp2

# Edit a StorageClass
kubectl edit sc aws-ebs-gp2

# Show the YAML of a StorageClass
kubectl get sc aws-ebs-gp2 -o yaml

# Mark a StorageClass as the default
kubectl patch sc aws-ebs-gp2 -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# Remove the default marking from a StorageClass
kubectl patch sc aws-ebs-gp2 -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
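After changing the default annotation, it is worth checking that exactly one class is marked as default, since the behaviour with several defaults differs between Kubernetes versions. `kubectl get sc` appends `(default)` to the name of a default class, so a small filter can count them. The helper name `count_default_classes` is hypothetical, and the two sample lines below are made-up illustration data, not real cluster output.

```shell
# count_default_classes: reads `kubectl get sc --no-headers` style text on stdin
# and prints how many lines carry the "(default)" marker.
count_default_classes() {
  grep -c '(default)' || true   # grep exits non-zero on zero matches; still print 0
}

# Illustration with sample text; against a cluster you would run:
#   kubectl get sc --no-headers | count_default_classes
printf '%s\n' \
  'aws-ebs-gp2 (default)   kubernetes.io/aws-ebs   Delete   WaitForFirstConsumer   true   10d' \
  'fast-ssd                kubernetes.io/aws-ebs   Retain   WaitForFirstConsumer   true   10d' \
  | count_default_classes
```

A result other than 1 suggests patching the annotation on the surplus (or missing) class with the commands shown earlier.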

Using a StorageClass from a PVC

bash
# Create a PVC that uses a specific StorageClass
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: aws-ebs-gp2
  resources:
    requests:
      storage: 10Gi
EOF

# Show the StorageClass used by each PVC
kubectl get pvc -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName,STATUS:.status.phase

# List the PVs created through a StorageClass
kubectl get pv -o json | jq '.items[] | select(.spec.storageClassName=="aws-ebs-gp2")'

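Troubleshooting Commands

bash
# View the provisioner logs
kubectl logs -n kube-system deployment/nfs-provisioner

# Describe the StorageClass (configuration and events)
kubectl describe sc aws-ebs-gp2

# Check PVC binding status
kubectl get pvc -o wide

# List dynamically provisioned PVs (the provisioner is recorded in a PV annotation, not a label)
kubectl get pv -o json | jq -r '.items[] | select(.metadata.annotations["pv.kubernetes.io/provisioned-by"]=="kubernetes.io/aws-ebs") | .metadata.name'

# Check that the mounted volume is accessible from a Pod
kubectl exec -it pod-name -- ls -la /data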

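Real-World Scenarios

Scenario 1: Multi-Environment Storage Management

Requirement: configure different storage policies for the development, testing, and production environments.

yaml
# 1. Development environment: low-cost storage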
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dev-storage
  annotations:
    description: "Low cost storage for development environment"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
---
# 2. Testing environment: medium-performance storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: test-storage
  annotations:
    description: "Medium performance storage for testing"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# 3. Production environment: high-performance storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prod-storage
  annotations:
    description: "High performance encrypted storage for production"
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "50"
  fsType: xfs
  encrypted: "true"
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# 4. Namespace quota for the development environment
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota-dev
  namespace: development
spec:
  hard:
    requests.storage: "100Gi"
    persistentvolumeclaims: "20"
---
# 5. Namespace quota for the testing environment
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota-test
  namespace: testing
spec:
  hard:
    requests.storage: "500Gi"
    persistentvolumeclaims: "30"
---
# 6. Namespace quota for the production environment
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota-prod
  namespace: production
spec:
  hard:
    requests.storage: "2Ti"
    persistentvolumeclaims: "50"

Usage Example

yaml
# Development application
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: development
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: dev-storage
  resources:
    requests:
      storage: 10Gi
---
# Production application
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: prod-storage
  resources:
    requests:
      storage: 100Gi

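Scenario 2: Tiered Database Storage

Requirement: provide storage with different performance levels for different types of databases.

yaml
# 1. High-IOPS storage for MySQL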
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mysql-high-iops
  annotations:
    description: "High IOPS storage for MySQL databases"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "50"  # io1 allows at most 50 IOPS per GiB
  fsType: xfs
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-east-1a
---
# 2. PostgreSQL storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: postgresql-storage
  annotations:
    description: "Optimized storage for PostgreSQL"
provisioner: ebs.csi.aws.com  # gp3 volumes are provisioned through the EBS CSI driver
parameters:
  type: gp3
  csi.storage.k8s.io/fstype: ext4
  encrypted: "true"
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# 3. MongoDB storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mongodb-storage
  annotations:
    description: "Storage for MongoDB with good throughput"
provisioner: ebs.csi.aws.com  # gp3 volumes are provisioned through the EBS CSI driver
parameters:
  type: gp3
  csi.storage.k8s.io/fstype: xfs
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# 4. Redis cache storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: redis-cache
  annotations:
    description: "Fast storage for Redis cache"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "50"  # io1 allows at most 50 IOPS per GiB
  fsType: ext4
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# 5. Example MySQL StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: database
spec:
  serviceName: mysql-headless
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: root-password
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
        resources:
          requests:
            cpu: 1000m
            memory: 2Gi
          limits:
            cpu: 4000m
            memory: 8Gi
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: mysql-high-iops
      resources:
        requests:
          storage: 100Gi

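Scenario 3: Hybrid Cloud Storage Strategy

Requirement: in a hybrid cloud environment, choose different storage backends according to application characteristics.

yaml
# 1. Local SSD storage (for high-performance computing)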
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
  annotations:
    description: "Local SSD for high performance computing"
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowedTopologies:
- matchLabelExpressions:
  - key: node.kubernetes.io/storage
    values:
    - local-ssd
---
# 2. NFS shared storage (for file sharing)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-shared
  annotations:
    description: "NFS shared storage for file sharing"
provisioner: nfs-client
parameters:
  archiveOnDelete: "true"
reclaimPolicy: Retain
volumeBindingMode: Immediate
mountOptions:
  - hard
  - nfsvers=4.1
  - noatime
  - rsize=1048576
  - wsize=1048576
---
# 3. Ceph distributed storage (RBD provides block volumes, not object storage)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-object
  annotations:
    description: "Ceph RBD block storage"
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: ceph-cluster
  pool: object-pool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: ceph-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
---
# 4. Cloud storage (for backup and archival)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cloud-archive
  annotations:
    description: "Cloud storage for backup and archival"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: sc1
  fsType: ext4
reclaimPolicy: Retain
volumeBindingMode: Immediate
---
# 5. Local PV definition
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-1
spec:
  capacity:
    storage: 500Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-ssd
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node-1
---
# 6. High-performance computing application
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-training
  namespace: ml
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ml-training
  template:
    metadata:
      labels:
        app: ml-training
    spec:
      containers:
      - name: tensorflow
        image: tensorflow/tensorflow:latest-gpu
        command: ["python", "train.py"]
        volumeMounts:
        - name: training-data
          mountPath: /data
        - name: model-output
          mountPath: /models
        resources:
          limits:
            nvidia.com/gpu: 2
      volumes:
      - name: training-data
        persistentVolumeClaim:
          claimName: training-data-pvc
      - name: model-output
        persistentVolumeClaim:
          claimName: model-output-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: training-data-pvc
  namespace: ml
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: local-ssd
  resources:
    requests:
      storage: 400Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-output-pvc
  namespace: ml
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: nfs-shared
  resources:
    requests:
      storage: 50Gi

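Scenario 4: Volume Snapshots and Cloning

Requirement: take snapshot backups of a database and support fast clones for testing.

yaml
# 1. StorageClass that supports snapshots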
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: snapshot-enabled
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# 2. VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapclass
driver: ebs.csi.aws.com
deletionPolicy: Delete
parameters:
  tagSpecification_1: "Name=snapshot-{{ .VolumeSnapshotName }}"
  tagSpecification_2: "CreatedBy=VolumeSnapshot"
---
# 3. Create a VolumeSnapshot
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-snapshot-20240101
  namespace: database
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: mysql-data
---
# 4. Restore a PVC from the snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-restored
  namespace: database
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: snapshot-enabled
  dataSource:
    name: mysql-snapshot-20240101
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  resources:
    requests:
      storage: 100Gi
---
# 5. Clone from an existing PVC (requires a CSI driver that supports volume cloning)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-clone
  namespace: database
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: snapshot-enabled
  dataSource:
    name: mysql-data
    kind: PersistentVolumeClaim
  resources:
    requests:
      storage: 100Gi
---
# 6. CronJob for automated snapshots
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-snapshot-job
  namespace: database
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: snapshot-creator
          containers:
          - name: snapshot
            image: bitnami/kubectl:latest
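            command:
            - /bin/sh
            - -c
            - |
              # Object names must be valid DNS subdomain names, so the timestamp uses "-" instead of "_".
              TIMESTAMP=$(date +%Y%m%d-%H%M%S)
              # kubectl has no "create volumesnapshot" subcommand; apply a manifest instead.
              cat <<EOF | kubectl apply -f -
              apiVersion: snapshot.storage.k8s.io/v1
              kind: VolumeSnapshot
              metadata:
                name: mysql-snapshot-${TIMESTAMP}
                namespace: database
              spec:
                volumeSnapshotClassName: csi-snapclass
                source:
                  persistentVolumeClaimName: mysql-data
              EOF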
          restartPolicy: OnFailure
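VolumeSnapshot names, like other Kubernetes object names, must be valid DNS subdomain names: lowercase alphanumerics, `-` and `.`, starting and ending with an alphanumeric character, at most 253 characters. A timestamp that contains an underscore is therefore rejected by the API server. The sketch below builds a hyphen-only name and validates it locally; both function names are hypothetical helpers.

```shell
# is_dns1123_subdomain NAME: succeeds if NAME is a valid DNS subdomain name.
is_dns1123_subdomain() {
  [ "${#1}" -le 253 ] &&
    printf '%s\n' "$1" |
      grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
}

# snapshot_name PREFIX: prints PREFIX-YYYYMMDD-HHMMSS using hyphens only.
snapshot_name() {
  printf '%s-%s\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

name=$(snapshot_name mysql-snapshot)
if is_dns1123_subdomain "$name"; then
  echo "valid: $name"
else
  echo "invalid: $name" >&2
fi
```

Running the same check on a name such as `mysql-snapshot-20240101_020000` fails, which is a cheap way to catch naming mistakes before the job runs in the cluster.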

Troubleshooting Guide

Common Problems and Solutions

1. PVC Does Not Get a PV Provisioned

Symptoms

bash
$ kubectl get pvc
NAME       STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-pvc     Pending                                       aws-ebs-gp2    10m

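Troubleshooting steps

bash
# 1. Check the PVC events
kubectl describe pvc my-pvc

# 2. Check that the StorageClass exists
kubectl get sc aws-ebs-gp2

# 3. Check that the provisioner is running
kubectl get pods -n kube-system | grep provisioner

# 4. View the provisioner logs
kubectl logs -n kube-system deployment/ebs-csi-controller -c csi-provisioner

# 5. Check the provisioner's Kubernetes RBAC rules (cloud IAM permissions are checked on the provider side)
kubectl describe clusterrole ebs-csi-controller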

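Possible causes

  • The StorageClass does not exist or is misconfigured
  • The provisioner is not installed or not running
  • The cloud provider permissions are insufficient
  • The storage quota is exhausted
  • The parameters are invalid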

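Solutions

bash
# Check the StorageClass configuration
kubectl get sc aws-ebs-gp2 -o yaml

# Make sure the provisioner is running
kubectl get pods -n kube-system -l app=ebs-csi-controller

# Check the provisioner's RBAC permissions (adjust the ServiceAccount name to your installation)
kubectl auth can-i create persistentvolumes --as=system:serviceaccount:kube-system:ebs-csi-controller-sa

# Verify the parameters
kubectl describe sc aws-ebs-gp2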

2. Volume Provisioning Times Out

Symptoms

Warning  ProvisioningFailed  Failed to provision volume: timeout waiting for volume creation

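Troubleshooting steps

bash
# 1. View the provisioner logs
kubectl logs -n kube-system deployment/ebs-csi-controller -c csi-provisioner --tail=100

# 2. Check the cloud provider API from a workstation with AWS credentials
#    (the CSI controller image typically does not include the aws CLI)
aws ec2 describe-volumes --region us-east-1 --max-items 5

# 3. Check network connectivity to the EC2 endpoint from inside the cluster
kubectl run net-test --rm -it --restart=Never --image=curlimages/curl --command -- curl -sS -o /dev/null -w '%{http_code}\n' https://ec2.us-east-1.amazonaws.com

# 4. Check resource quotas
kubectl describe resourcequota -n default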

Solution

yaml
# Raise the CSI call timeout on the csi-provisioner sidecar
# (fragment only; keep the container's other arguments)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ebs-csi-controller
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: csi-provisioner
        args:
        - --timeout=120s

3. Volume Expansion Fails

Symptoms

Error: volume expansion is not allowed for StorageClass aws-ebs-gp2

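Troubleshooting steps

bash
# 1. Check whether the StorageClass allows expansion
kubectl get sc aws-ebs-gp2 -o jsonpath='{.allowVolumeExpansion}'

# 2. Check the PVC status
kubectl describe pvc my-pvc

# 3. Check the resizer logs to see whether the backend supports online expansion
kubectl logs -n kube-system ebs-csi-controller-xxx -c csi-resizer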

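Solution

yaml
# Enable volume expansion. The provisioner and parameters of an existing
# StorageClass cannot be changed, so they must match the current definition.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: aws-ebs-gp2
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
allowVolumeExpansion: true  # enable expansion

bash
# Alternatively, patch the existing StorageClass in place
kubectl patch sc aws-ebs-gp2 -p '{"allowVolumeExpansion": true}'

# Expand the PVC
kubectl patch pvc my-pvc -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'

# Verify the expansion
kubectl get pvc my-pvc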
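A PVC can only grow; a request for a smaller size is rejected. The following sketch guards the expansion with a simple size comparison and prints the JSON patch body. The helper names `can_expand` and `expansion_patch` are hypothetical, sizes are assumed to be whole GiB values, and the script does not call the cluster.

```shell
# can_expand CURRENT_GI NEW_GI: succeeds only if the new size is strictly larger.
can_expand() {
  [ "$2" -gt "$1" ]
}

# expansion_patch NEW_GI: prints the patch body for `kubectl patch pvc`.
expansion_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%sGi"}}}}\n' "$1"
}

current_gi=10
new_gi=20
if can_expand "$current_gi" "$new_gi"; then
  expansion_patch "$new_gi"
  # To apply it (requires cluster access):
  #   kubectl patch pvc my-pvc -p "$(expansion_patch "$new_gi")"
else
  echo "refusing: a PVC cannot be shrunk" >&2
fi
```

Generating the patch from a function keeps the size in one place, which helps when the same expansion has to be applied to several PVCs.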

4. PVC Stays Pending with WaitForFirstConsumer

Symptoms

bash
$ kubectl get pvc
NAME       STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-pvc     Pending                                       local-storage  5m

Troubleshooting steps

bash
# 1. Check the binding mode of the StorageClass
kubectl get sc local-storage -o jsonpath='{.volumeBindingMode}'

# 2. Check whether any Pod uses the PVC
kubectl get pods -o json | jq '.items[] | select(.spec.volumes[]?.persistentVolumeClaim.claimName=="my-pvc")'

# 3. Check the PVC events
kubectl describe pvc my-pvc

Solution

yaml
# Create a Pod that uses the PVC
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc

5. Storage Performance Problems

Symptoms: the application shows poor I/O performance

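Troubleshooting steps

bash
# 1. Check the StorageClass configuration
kubectl get sc aws-ebs-gp2 -o yaml

# 2. Show details of the PV
kubectl describe pv pvc-xxx

# 3. Test I/O performance inside the Pod (fio must be available in the image; --directory targets the mounted volume)
kubectl exec -it my-app -- fio --name=randwrite --directory=/data --ioengine=libaio --iodepth=16 --rw=randwrite --bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --group_reporting

# 4. Check the mounted filesystem and mount options
kubectl exec -it my-app -- df -h
kubectl exec -it my-app -- mount | grep /data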

Solution

yaml
# Move to higher-performance storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: high-performance
provisioner: ebs.csi.aws.com
parameters:
  type: io1
  iopsPerGB: "50"   # high IOPS (io1 allows at most 50 IOPS per GiB)
  fsType: xfs       # high-performance filesystem
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

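Troubleshooting Flowchart

PVC Pending

Check StorageClass → missing → create the StorageClass
    ↓ exists
Check provisioner → not running → start the provisioner
    ↓ running
Check permissions → insufficient → configure RBAC permissions
    ↓ OK
Check storage parameters → invalid → correct the parameters
    ↓ valid
Check binding mode → WaitForFirstConsumer → create a Pod
    ↓ Immediate
Check cloud provider → API failure → contact the cloud provider
    ↓ healthy
PVC bound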
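The flowchart can be read as an ordered series of checks where the first failing check determines the next action. The sketch below encodes that order as a pure shell function. The function name `next_step` and its yes/no argument convention are assumptions made for illustration, and the extra `POD_EXISTS` argument is an addition so that a WaitForFirstConsumer class with an existing consumer Pod falls through to the backend check.

```shell
# next_step SC_EXISTS PROVISIONER_RUNNING RBAC_OK PARAMS_OK BINDING_MODE POD_EXISTS
# The yes/no arguments are the literal strings "yes" or "no" (hypothetical convention).
next_step() {
  if [ "$1" != yes ]; then echo "create the StorageClass"; return; fi
  if [ "$2" != yes ]; then echo "start the provisioner"; return; fi
  if [ "$3" != yes ]; then echo "configure RBAC permissions"; return; fi
  if [ "$4" != yes ]; then echo "correct the StorageClass parameters"; return; fi
  if [ "$5" = WaitForFirstConsumer ] && [ "$6" != yes ]; then
    echo "create a Pod that uses the PVC"; return
  fi
  echo "check the storage backend or cloud provider API"
}

# Everything is in place, but no Pod consumes the PVC yet.
next_step yes yes yes yes WaitForFirstConsumer no
```

In practice each argument would be filled in from the corresponding `kubectl` check in the sections above; keeping the decision logic separate makes the order of checks easy to review.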

Best Practices

1. Tiered StorageClass Design

Performance tiers

yaml
# 1. Ultra-high-performance tier (databases, caches)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ultra-fast
  annotations:
    description: "Ultra high performance storage"
    tier: "1"
provisioner: ebs.csi.aws.com
parameters:
  type: io2
  iopsPerGB: "500"
---
# 2. High-performance tier (application data)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
  annotations:
    description: "High performance storage"
    tier: "2"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
---
# 3. Standard tier (general purpose)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    description: "Standard storage"
    tier: "3"
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp2
---
# 4. Archive tier (backups, logs)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: archive
  annotations:
    description: "Archive storage"
    tier: "4"
provisioner: ebs.csi.aws.com
parameters:
  type: sc1

2. Security Configuration

Encrypted storage

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: encrypted-storage
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
reclaimPolicy: Retain
allowVolumeExpansion: true

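Access control

yaml
# Read-only access to StorageClasses plus PVC management.
# RBAC cannot limit which StorageClass a PVC references; use a per-class ResourceQuota for that.
# StorageClasses are cluster-scoped, so the first rule only takes effect through a ClusterRoleBinding.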
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: storage-user
rules:
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: storage-user-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: app-service-account
  namespace: production
roleRef:
  kind: ClusterRole
  name: storage-user
  apiGroup: rbac.authorization.k8s.io

3. Cost Optimization

Storage cost monitoring

yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: storage-cost-alerts
  namespace: monitoring
spec:
  groups:
  - name: storage-cost
    rules:
    - alert: HighStorageUsage
      expr: |
        sum(kubelet_volume_stats_used_bytes) by (namespace, persistentvolumeclaim) / 
        sum(kubelet_volume_stats_capacity_bytes) by (namespace, persistentvolumeclaim) > 0.8
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "High storage usage for {{ $labels.persistentvolumeclaim }}"
        description: "PVC {{ $labels.persistentvolumeclaim }} in namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage }} of allocated storage"
    
    - alert: UnusedPVC
      expr: |
        kubelet_volume_stats_used_bytes /
        kubelet_volume_stats_capacity_bytes < 0.1
      for: 1h
      labels:
        severity: info
      annotations:
        summary: "Potentially unused PVC {{ $labels.persistentvolumeclaim }}"
        description: "PVC {{ $labels.persistentvolumeclaim }} in namespace {{ $labels.namespace }} has less than 10% usage"

Autoscaling

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: storage-aware-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: storage_usage_percentage
        selector:
          matchLabels:
            app: my-app
      target:
        type: AverageValue
        averageValue: 70

4. Multi-Zone Deployment

Zone-aware storage

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: regional-storage
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-east-1a
    - us-east-1b
    - us-east-1c
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zonal-storage-us-east-1a
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-east-1a

5. Storage Quota Management

yaml
# Namespace storage quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: production
spec:
  hard:
    requests.storage: "1Ti"
    persistentvolumeclaims: "20"
    # Limits per StorageClass
    aws-ebs-gp2.storageclass.storage.k8s.io/requests.storage: "500Gi"
    aws-ebs-gp2.storageclass.storage.k8s.io/persistentvolumeclaims: "10"
    fast-ssd.storageclass.storage.k8s.io/requests.storage: "300Gi"
    fast-ssd.storageclass.storage.k8s.io/persistentvolumeclaims: "5"
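Per-class quota keys follow the pattern `<storage-class-name>.storageclass.storage.k8s.io/<resource>`, where the resource is `requests.storage` or `persistentvolumeclaims`. A small helper can build these keys so that a typo in the suffix does not silently produce a quota that never matches. The function name `sc_quota_key` is hypothetical.

```shell
# sc_quota_key STORAGE_CLASS RESOURCE
# RESOURCE must be "requests.storage" or "persistentvolumeclaims".
sc_quota_key() {
  case $2 in
    requests.storage|persistentvolumeclaims)
      printf '%s.storageclass.storage.k8s.io/%s\n' "$1" "$2" ;;
    *)
      echo "unsupported resource: $2" >&2
      return 1 ;;
  esac
}

sc_quota_key fast-ssd requests.storage
# prints fast-ssd.storageclass.storage.k8s.io/requests.storage
```

The printed key can be pasted under `spec.hard` of a ResourceQuota, as in the manifest above.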

6. Backup and Restore Strategy

yaml
# Velero backup configuration
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: aws-backup
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: my-k8s-backups
  config:
    region: us-east-1
---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: aws-snapshots
  namespace: velero
spec:
  provider: aws
  config:
    region: us-east-1
---
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-storage-backup
  namespace: velero
spec:
  schedule: "0 2 * * *"
  template:
    includedNamespaces:
    - production
    - database
    storageLocation: aws-backup
    volumeSnapshotLocations:
    - aws-snapshots
    ttl: 720h
    snapshotVolumes: true

7. Monitoring and Alerting

yaml
# Storage monitoring dashboard configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: storage-monitoring-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  storage-dashboard.json: |
    {
      "dashboard": {
        "title": "Kubernetes Storage Monitoring",
        "panels": [
          {
            "title": "PVC Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes",
                "legendFormat": "{{ namespace }}/{{ persistentvolumeclaim }}"
              }
            ]
          },
          {
            "title": "Inode Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "kubelet_volume_stats_inodes_used / kubelet_volume_stats_inodes",
                "legendFormat": "{{ namespace }}/{{ persistentvolumeclaim }}"
              }
            ]
          }
        ]
      }
    }

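Summary

Key Takeaways

  1. Dynamic provisioning: StorageClass automates storage management, so PVs do not have to be created by hand
  2. Performance tiers: different StorageClasses offer storage at different performance levels
  3. Cost optimization: choose storage types deliberately to balance performance and cost
  4. Security: enable encryption, access control, and other security features
  5. Multiple backends: cloud storage, NFS, Ceph, and other storage backends are supported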

Command Quick Reference

bash
# StorageClass management
kubectl get sc                                    # list all StorageClasses
kubectl describe sc <sc-name>                     # show StorageClass details
kubectl apply -f storageclass.yaml                # create a StorageClass
kubectl delete sc <sc-name>                       # delete a StorageClass

# PVC management
kubectl get pvc                                   # list all PVCs
kubectl describe pvc <pvc-name>                   # show PVC details
kubectl patch pvc <pvc-name> -p '{...}'          # expand a PVC

# Snapshot management
kubectl get volumesnapshot                        # list volume snapshots
kubectl apply -f volumesnapshot.yaml              # create a snapshot (kubectl has no "create volumesnapshot" subcommand)

Next Steps

References