PersistentVolume(PV/PVC)
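Overview
PersistentVolume (PV) and PersistentVolumeClaim (PVC) are the core Kubernetes abstractions for managing storage. They decouple storage from the Pod lifecycle, so data persists and can be managed independently of the Pods that use it.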
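Core concepts
PersistentVolume (PV)
- A cluster-scoped storage resource, provisioned statically by an administrator or dynamically through a StorageClass
- Independent of any Pod's lifecycle, so the data outlives the Pods that use it
- Supports many storage backends (NFS, Ceph, cloud storage, and others)
PersistentVolumeClaim (PVC)
- A namespaced request for storage, created by a user
- Declares the required capacity and access modes
- Is bound automatically to a PV that satisfies the request
Storage lifecycle
- Provisioning: a PV is created, statically or dynamically
- Binding: a PVC is bound to a matching PV
- Using: Pods consume the storage through the PVC
- Reclaiming: once the PVC is deleted, the volume is handled according to its reclaim policy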
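
The lifecycle stages show up as the PV's `status.phase` (Available, Bound, Released, or Failed). A minimal sketch for reading it; the helper name `pv_phase` is hypothetical, and it assumes `kubectl` is configured for the cluster:

```shell
# pv_phase NAME: print the current lifecycle phase of a PersistentVolume.
pv_phase() {
  kubectl get pv "$1" -o jsonpath='{.status.phase}'
}

# Usage (against a real cluster):
#   pv_phase pv-nfs-1
```

Polling this value while creating and deleting a PVC is a simple way to watch a volume move through binding and reclaiming.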
Relationship between PV and PVC
```
┌─────────────┐
│     Pod     │
└──────┬──────┘
       │ uses
       ↓
┌─────────────┐         ┌─────────────┐
│     PVC     │ ←bound→ │     PV      │
└─────────────┘         └──────┬──────┘
                               │
                               ↓
                        ┌─────────────┐
                        │   Storage   │
                        │   backend   │
                        │ (NFS/Cloud) │
                        └─────────────┘
```
Access modes
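| Mode | Abbreviation | Description |
|---|---|---|
| ReadWriteOnce | RWO | Read-write by a single node (several Pods on that node can share it) |
| ReadOnlyMany | ROX | Read-only by many nodes |
| ReadWriteMany | RWX | Read-write by many nodes |
| ReadWriteOncePod | RWOP | Read-write by a single Pod (alpha in Kubernetes 1.22, stable in 1.29; CSI volumes only) |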
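
To see which of these modes a claim actually requests, the PVC spec can be queried directly. A small sketch, assuming `kubectl` access; the helper name `pvc_access_modes` is hypothetical:

```shell
# pvc_access_modes NAME [NAMESPACE]: print the access modes requested by a PVC.
pvc_access_modes() {
  kubectl get pvc "$1" -n "${2:-default}" -o jsonpath='{.spec.accessModes[*]}'
}

# Usage (against a real cluster):
#   pvc_access_modes pvc-web-app default
```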
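Reclaim policies
| Policy | Description | Typical use |
|---|---|---|
| Retain | Keeps the data; the volume must be cleaned up manually | Important data, production environments |
| Delete | Deletes the underlying storage asset automatically | Cloud storage, temporary data |
| Recycle | Scrubs the data and makes the volume available again (deprecated) | NFS and similar |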
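
The reclaim policy of an existing PV can be changed in place, and a Released PV with the Retain policy can be made Available again by clearing its stale `claimRef`. A minimal sketch of both operations; the helper names are hypothetical, and the data on a retained volume is not wiped by either command:

```shell
# set_reclaim_policy NAME POLICY: switch a PV to Retain or Delete.
set_reclaim_policy() {
  kubectl patch pv "$1" -p "{\"spec\":{\"persistentVolumeReclaimPolicy\":\"$2\"}}"
}

# make_pv_available NAME: drop the old claimRef from a Released PV so that
# a new PVC can bind to it. Clean up old data yourself before reusing it.
make_pv_available() {
  kubectl patch pv "$1" --type json -p '[{"op":"remove","path":"/spec/claimRef"}]'
}

# Usage (against a real cluster):
#   set_reclaim_policy pv-nfs-1 Retain
#   make_pv_available pv-nfs-1
```

Switching a dynamically provisioned PV to Retain before deleting its PVC is a common way to protect data that would otherwise be removed with the claim.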
Complete YAML examples
1. Static PV example
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-1
  labels:
    type: nfs
    environment: production
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-storage
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    server: 192.168.1.100
    path: "/data/pv1"
```
2. PVC example
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-web-app
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 5Gi
  storageClassName: nfs-storage
  selector:
    matchLabels:
      environment: production
```
3. Using the PVC in a Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app
  namespace: default
spec:
  containers:
    - name: nginx
      image: nginx:1.21
      ports:
        - containerPort: 80
      volumeMounts:
        - name: web-data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: web-data
      persistentVolumeClaim:
        claimName: pvc-web-app
        readOnly: false
```
4. Block-mode PV example
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-block-1
spec:
  capacity:
    storage: 50Gi
  volumeMode: Block
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: fast-block
  local:
    path: /dev/sdb
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1
```
5. Using a block volume in a Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mysql-block
spec:
  containers:
    - name: mysql
      image: mysql:8.0
      ports:
        - containerPort: 3306
      # Block-mode volumes are exposed as raw devices via volumeDevices,
      # not mounted as filesystems via volumeMounts.
      volumeDevices:
        - name: mysql-data
          devicePath: /dev/xvda
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: "password123"   # demo only; use a Secret in practice
  volumes:
    - name: mysql-data
      persistentVolumeClaim:
        claimName: pvc-block-mysql
```
kubectl commands
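PV management commands
```bash
# List all PVs
kubectl get pv
# Show details of a PV
kubectl describe pv pv-nfs-1
# Show capacity, phase, and bound claim for each PV
kubectl get pv -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage,STATUS:.status.phase,CLAIM:.spec.claimRef.name
# Edit a PV
kubectl edit pv pv-nfs-1
# Delete a PV (delete the bound PVC first; otherwise the PV stays in Terminating until the claim is gone)
kubectl delete pv pv-nfs-1
# Show a PV as YAML
kubectl get pv pv-nfs-1 -o yaml
```
PVC management commands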
```bash
# List PVCs in the current namespace
kubectl get pvc
# List PVCs in a specific namespace
kubectl get pvc -n production
# Show details of a PVC
kubectl describe pvc pvc-web-app
# Create a PVC
kubectl apply -f pvc-web-app.yaml
# Delete a PVC
kubectl delete pvc pvc-web-app
# Show the PV bound to a PVC
kubectl get pvc pvc-web-app -o jsonpath='{.spec.volumeName}'
# Expand a PVC (the StorageClass must set allowVolumeExpansion: true)
kubectl patch pvc pvc-web-app -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'
```
Troubleshooting commands
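```bash
# Show PVC events
kubectl describe pvc pvc-web-app | grep -A 10 Events
# Show the PV bound to a given PVC
kubectl get pv -o json | jq '.items[] | select(.spec.claimRef.name=="pvc-web-app")'
# Show actual usage of a mounted volume
# (kubectl top only supports nodes and pods, so check from inside the Pod)
kubectl exec web-app -- df -h /usr/share/nginx/html
# Show the PVCs mounted by a Pod
kubectl get pod web-app -o jsonpath='{.spec.volumes[?(@.persistentVolumeClaim)].persistentVolumeClaim.claimName}'
# Check the reclaim policy of each PV
kubectl get pv -o custom-columns=NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy
```
Real-world scenarios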
Scenario 1: Static website deployment
Requirement: deploy a static website whose HTML files live on persistent storage and can be read by several replicas.
```yaml
# 1. Create the PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static-web
  labels:
    app: static-web
    tier: frontend
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  nfs:
    server: nfs-server.example.com
    path: "/data/static-web"
---
# 2. Create the PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-static-web
  namespace: web
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 5Gi
  storageClassName: manual
  selector:
    matchLabels:
      app: static-web
---
# 3. Create the Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: static-web
  namespace: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: static-web
  template:
    metadata:
      labels:
        app: static-web
    spec:
      containers:
        - name: nginx
          image: nginx:1.21-alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: web-content
              mountPath: /usr/share/nginx/html
              readOnly: true
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
      volumes:
        - name: web-content
          persistentVolumeClaim:
            claimName: pvc-static-web
            readOnly: true
---
# 4. Create the Service
apiVersion: v1
kind: Service
metadata:
  name: static-web-svc
  namespace: web
spec:
  type: LoadBalancer
  selector:
    app: static-web
  ports:
    - port: 80
      targetPort: 80
```
Deployment steps:
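```bash
# Create the namespace
kubectl create namespace web
# Apply the manifests
kubectl apply -f static-web.yaml
# Verify the deployment
kubectl get all -n web
# Upload the site content to NFS. The Deployment mounts the volume read-only,
# so copy through a Pod that mounts pvc-static-web read-write
# (<writer-pod> is a placeholder) or write to the NFS export directly.
kubectl cp ./website/. <writer-pod>:/usr/share/nginx/html/ -n web
```
Scenario 2: Persistent storage for a MySQL database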
Requirement: deploy MySQL on high-performance block storage, with persistent data and backups.
```yaml
# 1. Create a PVC (dynamic provisioning)
# Note: the StatefulSet below creates its own claim (mysql-data-mysql-0) from
# volumeClaimTemplates and does not mount this standalone PVC.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
  namespace: database
  labels:
    app: mysql
    component: database
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 50Gi
  storageClassName: fast-ssd
---
# 2. Create the MySQL configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-config
  namespace: database
data:
  my.cnf: |
    [mysqld]
    innodb_buffer_pool_size = 1G
    max_connections = 500
    # query_cache_size and query_cache_type are omitted:
    # the query cache was removed in MySQL 8.0
    slow_query_log = 1
    slow_query_log_file = /var/log/mysql/slow.log
    long_query_time = 2
---
# 3. Create the Secret
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
  namespace: database
type: Opaque
stringData:
  root-password: "StrongPassword123!"
  user-password: "UserPassword456!"
---
# 4. Create the MySQL StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: database
spec:
  serviceName: mysql-headless
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          ports:
            - containerPort: 3306
              name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: root-password
            - name: MYSQL_DATABASE
              value: "appdb"
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
            - name: mysql-config
              mountPath: /etc/mysql/conf.d
              readOnly: true
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 2000m
              memory: 2Gi
          livenessProbe:
            exec:
              command:
                - mysqladmin
                - ping
                - -h
                - localhost
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - mysql
                - -h
                - localhost
                - -u
                - root
                - -p$(MYSQL_ROOT_PASSWORD)
                - -e
                - SELECT 1
            initialDelaySeconds: 5
            periodSeconds: 2
      volumes:
        - name: mysql-config
          configMap:
            name: mysql-config
  volumeClaimTemplates:
    - metadata:
        name: mysql-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 50Gi
---
# 5. Create the Services
apiVersion: v1
kind: Service
metadata:
  name: mysql-headless
  namespace: database
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: mysql
  ports:
    - port: 3306
      targetPort: 3306
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
  namespace: database
spec:
  type: ClusterIP
  selector:
    app: mysql
  ports:
    - port: 3306
      targetPort: 3306
```
Backup script:
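```bash
#!/bin/bash
# mysql-backup.sh
set -euo pipefail
NAMESPACE="database"
BACKUP_DIR="/backups/mysql"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/mysql_backup_${DATE}.sql.gz"
# Create the backup directory
mkdir -p "${BACKUP_DIR}"
# Run the dump inside the Pod. The single quotes make $MYSQL_ROOT_PASSWORD
# expand in the container, where the Secret is injected, not on this host.
kubectl exec -n "${NAMESPACE}" mysql-0 -- sh -c \
  'exec mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" \
    --all-databases \
    --single-transaction \
    --routines \
    --triggers \
    --events' | gzip > "${BACKUP_FILE}"
# Keep only the backups from the last 7 days
find "${BACKUP_DIR}" -name "mysql_backup_*.sql.gz" -mtime +7 -delete
echo "Backup completed: ${BACKUP_FILE}"
```
Scenario 3: Shared storage for a CI/CD pipeline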
Requirement: the CI/CD pipeline needs shared storage so that several build jobs can share a workspace and a dependency cache.
```yaml
# 1. Create the shared-storage PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-cicd-workspace
  labels:
    type: shared
    environment: cicd
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: shared-storage
  nfs:
    server: nfs-storage.example.com
    path: "/data/cicd-workspace"
---
# 2. Create the workspace PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-cicd-workspace
  namespace: cicd
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 100Gi
  storageClassName: shared-storage
---
# 3. Maven cache PVC
# A PV binds to exactly one PVC, so this claim needs its own PV (or a dynamic
# provisioner for shared-storage); it cannot share pv-cicd-workspace.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-maven-cache
  namespace: cicd
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 20Gi
  storageClassName: shared-storage
---
# 4. Jenkins agent Pod template
apiVersion: v1
kind: Pod
metadata:
  name: jenkins-agent-maven
  namespace: cicd
  labels:
    app: jenkins-agent
    type: maven
spec:
  containers:
    - name: maven
      image: maven:3.8.6-openjdk-11
      command:
        - cat
      tty: true
      volumeMounts:
        - name: workspace
          mountPath: /home/jenkins/agent
        - name: maven-cache
          mountPath: /root/.m2/repository
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          cpu: 2000m
          memory: 4Gi
    - name: docker
      image: docker:20.10
      command:
        - cat
      tty: true
      volumeMounts:
        - name: workspace
          mountPath: /home/jenkins/agent
        - name: docker-sock
          mountPath: /var/run/docker.sock
      securityContext:
        privileged: true
  volumes:
    - name: workspace
      persistentVolumeClaim:
        claimName: pvc-cicd-workspace
    - name: maven-cache
      persistentVolumeClaim:
        claimName: pvc-maven-cache
    - name: docker-sock
      hostPath:
        path: /var/run/docker.sock
```
Jenkinsfile example:
```groovy
pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: maven
      image: maven:3.8.6-openjdk-11
      command:
        - cat
      tty: true
      volumeMounts:
        - name: workspace
          mountPath: /home/jenkins/agent
        - name: maven-cache
          mountPath: /root/.m2/repository
  volumes:
    - name: workspace
      persistentVolumeClaim:
        claimName: pvc-cicd-workspace
    - name: maven-cache
      persistentVolumeClaim:
        claimName: pvc-maven-cache
'''
        }
    }
    stages {
        stage('Checkout') {
            steps {
                checkout scm
            }
        }
        stage('Build') {
            steps {
                container('maven') {
                    sh 'mvn clean package -DskipTests'
                }
            }
        }
        stage('Test') {
            steps {
                container('maven') {
                    sh 'mvn test'
                }
            }
            post {
                always {
                    junit '**/target/surefire-reports/*.xml'
                }
            }
        }
        stage('Deploy') {
            steps {
                echo 'Deploying application...'
            }
        }
    }
}
```
Troubleshooting guide
Common problems and solutions
1. PVC stuck in Pending
Symptom:
```bash
$ kubectl get pvc
NAME          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-web-app   Pending                                      nfs-storage    5m
```
Troubleshooting steps:
```bash
# 1. Show the PVC events
kubectl describe pvc pvc-web-app
# 2. Check whether a matching PV exists
kubectl get pv -l environment=production
# 3. Check the StorageClass
kubectl get storageclass nfs-storage -o yaml
# 4. Check that the storage backend (provisioner) is healthy
kubectl logs -n kube-system nfs-provisioner-xxx
```
Possible causes:
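- No matching PV exists
- The StorageClass is misconfigured
- The storage backend is unavailable
- The access modes do not match
- No PV has enough capacity for the request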
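
The capacity cause is easy to check by hand: a PV can satisfy a claim only if its capacity is at least the requested size. The sketch below illustrates that comparison in plain shell; the helper names `to_bytes` and `capacity_satisfies` are hypothetical, and only whole-number quantities with binary suffixes (Ki, Mi, Gi, Ti) are handled:

```shell
# to_bytes QUANTITY: convert a quantity such as 10Gi to bytes.
# Handles integer values with Ki/Mi/Gi/Ti suffixes, or a plain byte count.
to_bytes() {
  case $1 in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *Ti) echo $(( ${1%Ti} * 1024 * 1024 * 1024 * 1024 )) ;;
    *)   echo "$1" ;;
  esac
}

# capacity_satisfies PV_CAPACITY PVC_REQUEST: succeed if the PV is large enough.
capacity_satisfies() {
  [ "$(to_bytes "$1")" -ge "$(to_bytes "$2")" ]
}

# Usage:
#   capacity_satisfies 10Gi 5Gi && echo "PV can satisfy the claim"
```

With the examples in this document, the 10Gi PV `pv-nfs-1` passes this check for the 5Gi request of `pvc-web-app`; capacity is only one of the binding conditions, alongside storage class, access modes, and selector.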
Solution:
```yaml
# Make sure the PV matches the PVC
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-1
  labels:
    environment: production        # must match the PVC's selector
spec:
  storageClassName: nfs-storage    # must match the PVC's storageClassName
  accessModes:
    - ReadWriteMany                # must include the PVC's accessModes
  capacity:
    storage: 10Gi                  # must be at least the size the PVC requests
```
2. Pod cannot mount the PVC
Symptom:
```
Warning FailedMount Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data]
```
Troubleshooting steps:
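```bash
# 1. Show the Pod events
kubectl describe pod web-app
# 2. Check the PVC status
kubectl get pvc pvc-web-app -o yaml
# 3. Check which claim each PV is bound to
kubectl get pv -o yaml | grep -A 5 "claimRef:"
# 4. Check the kubelet logs. The kubelet usually runs as a systemd service
#    rather than a Pod, so run this on the affected node.
journalctl -u kubelet | grep -i mount
# 5. Check the mounts from inside the Pod (only works if the Pod is running)
kubectl exec web-app -- df -h
kubectl exec web-app -- mount | grep pvc
```
Possible causes: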
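- The PVC is not bound to a PV
- The PV's access modes do not match what the Pod needs
- The storage backend has failed
- Permission problems on the node
- Conflicting mount paths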
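
To narrow down which of these causes applies, it helps to pull only the mount-related events for the Pod. A small sketch, assuming `kubectl` access; the helper name `failed_mount_events` is hypothetical:

```shell
# failed_mount_events POD [NAMESPACE]: list FailedMount events for a Pod,
# oldest first.
failed_mount_events() {
  kubectl get events -n "${2:-default}" \
    --field-selector "involvedObject.name=$1,reason=FailedMount" \
    --sort-by=.lastTimestamp
}

# Usage (against a real cluster):
#   failed_mount_events web-app default
```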
Solution:
```bash
# Check whether the PVC is bound
kubectl get pvc pvc-web-app -o jsonpath='{.status.phase}'
# If it is not bound, find out why
kubectl describe pvc pvc-web-app
# Check that the storage backend is reachable through the mount
kubectl exec web-app -- ls -la /usr/share/nginx/html
# Recreate the Pod
kubectl delete pod web-app
kubectl apply -f pod.yaml
```
3. Data loss
Symptom: data is gone after the Pod restarts
Troubleshooting steps:
```bash
# 1. Check the PV reclaim policy
kubectl get pv -o custom-columns=NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy
# 2. Check that the PVC is bound correctly
kubectl get pvc -o wide
# 3. Check the Pod's volume definitions
kubectl get pod web-app -o jsonpath='{.spec.volumes}'
# 4. Verify that the data really is on the PV
kubectl exec web-app -- ls -la /data
```
Possible causes:
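- The Pod uses an emptyDir instead of a PVC
- The PV's reclaim policy is Delete, so the data was removed when the PVC was deleted
- The mount path is wrong, so the application writes outside the volume
- Data was not flushed to the storage backend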
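
The first cause can be confirmed by listing each volume of the Pod next to its claim name; a volume with an empty claim column is not PVC-backed. A minimal sketch, assuming `kubectl` access; the helper name `pod_volume_claims` is hypothetical:

```shell
# pod_volume_claims POD [NAMESPACE]: print one line per volume in the form
# "<volume name><TAB><claim name>"; the claim is empty for emptyDir,
# configMap, and other non-PVC volumes.
pod_volume_claims() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{range .spec.volumes[*]}{.name}{"\t"}{.persistentVolumeClaim.claimName}{"\n"}{end}'
}

# Usage (against a real cluster):
#   pod_volume_claims web-app default
```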
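Solution:
```yaml
# Make sure the volume is backed by a PVC, not an emptyDir
volumes:
  - name: data
    persistentVolumeClaim:        # correct
      claimName: pvc-web-app
    # emptyDir: {}                # wrong: the data is lost with the Pod
# Set an appropriate reclaim policy on the PV
spec:
  persistentVolumeReclaimPolicy: Retain   # keep the data
```
4. Volume expansion fails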
Symptom:
```
Error: persistentvolumeclaims "pvc-web-app" could not be patched: persistentvolumeclaims "pvc-web-app" is forbidden: only dynamically provisioned pvc can be resized
```
Troubleshooting steps:
```bash
# 1. Check whether the StorageClass allows expansion
kubectl get storageclass -o json | jq '.items[] | {name:.metadata.name, allowExpansion:.allowVolumeExpansion}'
# 2. Check the PVC status
kubectl describe pvc pvc-web-app
# 3. Check whether the storage backend supports online expansion
kubectl logs -n kube-system csi-driver-xxx
```
Solution:
```yaml
# Enable expansion on the StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
allowVolumeExpansion: true   # enable expansion
```
```bash
# Expand the PVC
kubectl patch pvc pvc-web-app -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'
# Verify the new size
kubectl get pvc pvc-web-app
```
5. Multi-node attach conflict
Symptom:
```
Multi-Attach error for volume "pv-nfs-1" The volume is already exclusively attached to one node and can't be attached to another.
```
Troubleshooting steps:
```bash
# 1. Check the PV access modes
kubectl get pv pv-nfs-1 -o jsonpath='{.spec.accessModes}'
# 2. Find the Pods that use the claim
kubectl get pods -A -o json | jq '.items[] | select(.spec.volumes[]?.persistentVolumeClaim.claimName=="pvc-web-app") | {name:.metadata.name, namespace:.metadata.namespace, node:.spec.nodeName}'
# 3. Check whether the storage backend supports multi-node attachment
```
Solution:
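```yaml
# For workloads that need access from several nodes, use RWX
spec:
  accessModes:
    - ReadWriteMany   # read-write from many nodes
  # Make sure the backend supports RWX (NFS does)
  nfs:
    server: nfs-server.example.com
    path: "/data/shared"
```
Troubleshooting flowchart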
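```
PVC Pending
    ↓
Check StorageClass → missing → create the StorageClass
    ↓ exists
Check PV → missing → create a PV or enable dynamic provisioning
    ↓ exists
Check label match → mismatch → fix the PV labels
    ↓ match
Check access modes → mismatch → fix the PV access modes
    ↓ match
Check capacity → insufficient → provide a larger PV
    ↓ sufficient
Check storage backend → failing → repair the backend
    ↓ healthy
PVC bound
```
Best practices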
1. Storage planning
Capacity planning
- Keep 20-30% of storage capacity as headroom
- Account for the data growth trend
- Set sensible resource quotas
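
As a rough illustration of the headroom guideline, the requested size can be derived from the expected data volume plus a percentage margin. This is plain integer arithmetic; the helper name `size_with_headroom` is hypothetical:

```shell
# size_with_headroom DATA_GI PERCENT: size in Gi to request so that DATA_GI
# of data leaves PERCENT extra capacity free, rounded up to a whole Gi.
size_with_headroom() {
  echo $(( ($1 * (100 + $2) + 99) / 100 ))
}

# Usage:
#   size_with_headroom 40 25   # 40Gi of data with a 25% margin
```

For 40Gi of expected data and a 25% margin this yields a 50Gi request.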
```yaml
# Set a storage quota for the namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: production
spec:
  hard:
    requests.storage: "500Gi"
    persistentvolumeclaims: "10"
```
Choosing a storage type
| Application type | Recommended storage | Access mode | Reclaim policy |
|---|---|---|---|
| Static web files | NFS / object storage | RWX | Retain |
| Databases | Block storage (SSD) | RWO | Retain |
| Caches | Local SSD | RWO | Delete |
| Logs | Object storage | RWX | Delete |
| CI/CD workspaces | NFS | RWX | Retain |
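2. Security configuration
Access control
```yaml
# Restrict who can manage PVCs
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pvc-manager
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # PersistentVolumes are cluster-scoped; a namespaced Role cannot grant
  # access to them, so this rule needs a ClusterRole to take effect.
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch"]
```
Data encryption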
```yaml
# StorageClass with encryption enabled
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: encrypted-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
```
3. Performance optimization
Storage performance tuning
```yaml
# High-performance StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "50"
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```
Mount option tuning
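```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-optimized
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  # NFS mount options. ext4-only options such as nobarrier or data=writeback
  # are not valid for an NFS mount and would make the mount fail.
  mountOptions:
    - noatime
    - nfsvers=4.1
    - rsize=1048576
    - wsize=1048576
  nfs:
    server: nfs-server.example.com
    path: "/data/optimized"
```
4. Persistent volume management strategy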
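Volume snapshot management
Volume snapshots create point-in-time copies of a PV for backup and restore. The feature reached beta in Kubernetes 1.17 and became generally available, with the `snapshot.storage.k8s.io/v1` API, in 1.20; it requires a CSI driver that supports snapshots.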
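
A snapshot should only be restored from once the CSI driver reports it as ready, which is exposed in the VolumeSnapshot's `status.readyToUse` field. A small sketch of that check, assuming `kubectl` access and installed snapshot CRDs; the helper name `snapshot_ready` is hypothetical:

```shell
# snapshot_ready NAME [NAMESPACE]: succeed if the VolumeSnapshot reports
# status.readyToUse == true.
snapshot_ready() {
  [ "$(kubectl get volumesnapshot "$1" -n "${2:-default}" \
        -o jsonpath='{.status.readyToUse}')" = "true" ]
}

# Usage (against a real cluster):
#   until snapshot_ready mysql-snapshot database; do sleep 5; done
```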
```yaml
# Create a volume snapshot
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-snapshot
  namespace: database
spec:
  volumeSnapshotClassName: csi-aws-vsc
  source:
    persistentVolumeClaimName: mysql-data
```
```yaml
# Restore from the snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-restored
  namespace: database
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  resources:
    requests:
      storage: 50Gi
  dataSource:
    name: mysql-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
```
Storage migration strategy
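Option 1: migrate with volume snapshots
```bash
# 1. Create the snapshot in the source cluster
kubectl apply -f snapshot.yaml
# 2. Export the snapshot metadata (to a separate file, so the manifest is not overwritten)
kubectl get volumesnapshot mysql-snapshot -n database -o yaml > snapshot-export.yaml
# 3. Import the snapshot into the target cluster. The exported object carries
#    cluster-specific fields; the target cluster normally also needs a
#    pre-provisioned VolumeSnapshotContent that points at the backend snapshot.
kubectl apply -f snapshot-export.yaml
# 4. Create a PVC from the snapshot
kubectl apply -f restore.yaml
```
Option 2: migrate across clusters with Velero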
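```bash
# 1. Create a backup in the source cluster
velero backup create mysql-backup --include-resources=pvc,pv --selector app=mysql
# 2. Make the backup visible to the target cluster. Both clusters must point at
#    the same backup storage location; `download` only fetches a local copy
#    for inspection.
velero backup get mysql-backup
velero backup download mysql-backup
# 3. Restore in the target cluster
velero restore create --from-backup mysql-backup
```
Storage lifecycle management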
Lifecycle management through the StorageClass
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-with-lifecycle
provisioner: kubernetes.io/aws-ebs   # the field is "provisioner", not "type"
parameters:
  type: gp2
  encrypted: "true"
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
mountOptions:
  - debug
```
Reclaiming storage
```bash
# Clean up unused PVs
kubectl get pv | grep Available
kubectl delete pv pv-xxx
# Find PVCs that are bound but not used by any Pod
kubectl get pvc | grep Bound
kubectl get pods -o json | jq '.items[] | select(.spec.volumes[]?.persistentVolumeClaim.claimName=="pvc-xxx")'
```
5. Advanced backup strategy
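Tiered backup strategy
| Backup type | Frequency | Retention | Typical use |
|---|---|---|---|
| Full backup | Daily | 7 days | Important data |
| Incremental backup | Hourly | 24 hours | Frequently changing data |
| Differential backup | Weekly | 30 days | Moderately important data |
Backup storage strategy
Local backup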
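```yaml
# Local backup target. Velero writes to object storage, so this assumes an
# S3-compatible store reachable in the local environment.
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: local-backup
  namespace: velero
spec:
  provider: velero.io/aws
  objectStorage:
    bucket: velero-backups
  credential:
    name: cloud-credentials
    key: cloud
```
Cloud storage backup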
```yaml
# Use AWS S3 as the backup target
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: aws-backup
  namespace: velero
spec:
  provider: velero.io/aws
  objectStorage:
    bucket: kubernetes-backups
  credential:
    name: aws-credentials
    key: credentials
```
Disaster recovery plan
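1. Backup strategy
- Take a full backup of all critical data every day
- Take incremental backups every hour
- Store backup data off-site
2. Recovery testing
- Run a recovery drill every month
- Verify backup integrity
- Record the recovery time
3. Recovery procedure
- Confirm the scope of the disaster
- Choose a suitable backup point
- Run the restore
- Verify service availability
4. Automated backups for recovery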
```yaml
# Scheduled backup that underpins the disaster recovery plan
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: disaster-recovery
  namespace: velero
spec:
  schedule: "@every 6h"
  template:
    includedNamespaces:
      - "*"
    includedResources:
      - "*"
    storageLocation: aws-backup
    volumeSnapshotLocations:
      - aws-snapshots
    ttl: 168h   # 7 days
```
Backup validation framework
Automated validation
```bash
#!/bin/bash
# validate-backup.sh
set -euo pipefail
BACKUP_NAME="daily-backup-$(date +%Y%m%d)"
NAMESPACE="production"
RESTORE_NAME="test-restore"
# Create the backup and wait for it to finish
velero backup create "${BACKUP_NAME}" --include-namespaces "${NAMESPACE}" --wait
# Verify the backup phase (velero supports table, json, and yaml output, not jsonpath)
STATUS=$(velero backup get "${BACKUP_NAME}" -o json | jq -r '.status.phase')
if [ "${STATUS}" != "Completed" ]; then
  echo "Backup failed: ${STATUS}"
  exit 1
fi
# Test the restore into a scratch namespace, under a known restore name
velero restore create "${RESTORE_NAME}" --from-backup "${BACKUP_NAME}" \
  --namespace-mappings "${NAMESPACE}:test-restore" --wait
# Verify the restore
kubectl get pods -n test-restore
# Clean up the test environment
velero restore delete "${RESTORE_NAME}" --confirm
kubectl delete namespace test-restore
echo "Backup validation completed successfully"
```
Encrypted backups
Encryption at rest
```yaml
# Encrypted StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: encrypted-gp2
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
```
Encryption in transit
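- Encrypt backup transfers with TLS
- Configure Velero to use HTTPS endpoints
- Enable server-side encryption on the backup bucket
6. Storage monitoring and alerting
Detailed metrics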
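| Metric | Description | Alert threshold |
|---|---|---|
| kubelet_volume_stats_used_bytes | Bytes used on the volume | 85% of capacity |
| kubelet_volume_stats_available_bytes | Bytes available on the volume | < 10GB |
| kubelet_volume_stats_inodes_used | Inodes used on the volume | 90% of kubelet_volume_stats_inodes |
| node_disk_io_time_seconds_total | Time the device spent doing I/O (node_exporter, per device rather than per PVC) | rate > 0.5 |
| node_disk_reads_completed_total | Completed read operations (node_exporter, per device) | 200% of baseline |
| node_disk_writes_completed_total | Completed write operations (node_exporter, per device) | 200% of baseline |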
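
The percentage thresholds in the table compare used bytes with capacity. The arithmetic can be sketched in shell as follows; the helper names `usage_percent` and `over_threshold` are hypothetical, and the inputs would come from the used and capacity metrics of a volume:

```shell
# usage_percent USED_BYTES CAPACITY_BYTES: integer percentage of capacity used.
usage_percent() {
  echo $(( $1 * 100 / $2 ))
}

# over_threshold USED_BYTES CAPACITY_BYTES PERCENT: succeed if usage exceeds
# the threshold, mirroring a "used / capacity > 0.85" style alert expression.
over_threshold() {
  [ "$(usage_percent "$1" "$2")" -gt "$3" ]
}

# Usage:
#   over_threshold 9000000000 10000000000 85 && echo "volume almost full"
```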
Advanced alert configuration
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: storage-alerts
  namespace: monitoring
spec:
  groups:
    - name: storage
      rules:
        # Storage usage warning
        - alert: StorageUsageWarning
          expr: |
            (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.8
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Storage usage warning"
            description: "PVC {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }} is {{ $value | humanizePercentage }} full"
        # Storage usage critical
        - alert: StorageUsageCritical
          expr: |
            (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.9
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Storage usage critical"
            description: "PVC {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }} is {{ $value | humanizePercentage }} full"
        # Storage I/O performance. The kubelet exposes no per-PVC I/O time
        # metric, so this assumes node_exporter and alerts per device.
        - alert: StorageIOPerformance
          expr: |
            rate(node_disk_io_time_seconds_total[5m]) > 0.5
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Storage IO performance issue"
            description: "High IO time on device {{ $labels.device }} of {{ $labels.instance }}"
        # PVC binding failure
        - alert: PVCBindingFailed
          expr: |
            kube_persistentvolumeclaim_status_phase{phase="Pending"} == 1
          for: 15m
          labels:
            severity: critical
          annotations:
            summary: "PVC binding failed"
            description: "PVC {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }} has been pending for 15 minutes"
```
Storage health check
```bash
#!/bin/bash
# storage-health-check.sh
# List PVCs that are not Bound
kubectl get pvc --all-namespaces | grep -v Bound
# List PVs that are neither Available nor Bound
kubectl get pv | grep -v Available | grep -v Bound
# List the volumes in use on each node
kubectl get --raw /api/v1/nodes | jq '.items[].status.volumesInUse'
# List storage classes
kubectl get storageclass
# List volume snapshots
kubectl get volumesnapshot --all-namespaces
# Show backup status
velero backup get
```
7. Monitoring alerts
Prometheus alerting rules
```yaml
# This rule shares its name with the storage-alerts rule in the previous
# section; applying both overwrites one, so rename it if you need both.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: storage-alerts
  namespace: monitoring
spec:
  groups:
    - name: storage
      rules:
        - alert: PVCAlmostFull
          expr: |
            (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.85
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} almost full"
            description: "PVC {{ $labels.persistentvolumeclaim }} in namespace {{ $labels.namespace }} is {{ $value | humanizePercentage }} full"
        - alert: PVCCriticalFull
          expr: |
            (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.95
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} critical full"
            description: "PVC {{ $labels.persistentvolumeclaim }} in namespace {{ $labels.namespace }} is {{ $value | humanizePercentage }} full"
        - alert: PVCNotBound
          expr: |
            kube_persistentvolumeclaim_status_phase{phase!="Bound"} == 1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} not bound"
            description: "PVC {{ $labels.persistentvolumeclaim }} in namespace {{ $labels.namespace }} is in phase {{ $labels.phase }}"
```
8. Naming conventions
Recommended naming patterns
```yaml
# PV: pv-{storage-type}-{environment}-{purpose}-{index}
metadata:
  name: pv-nfs-prod-web-01
# PVC: pvc-{app-name}-{purpose}
metadata:
  name: pvc-mysql-data
# StorageClass: {performance}-{storage-type}-{environment}
metadata:
  name: fast-ssd-prod
```
9. Documentation
Storage documentation template
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-mysql-prod
  annotations:
    description: "MySQL production database storage"
    owner: "database-team@example.com"
    backup-schedule: "daily at 2:00 AM"
    retention-policy: "30 days"
    cost-center: "CC-12345"
spec:
  # ... PV configuration
```
Summary
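Key points
- PV/PVC decoupling: storage is separated from applications and managed independently
- Lifecycle management: understand the four stages of provisioning, binding, using, and reclaiming
- Access modes: choose the mode that fits the workload (RWO, ROX, RWX, or RWOP)
- Reclaim policy: choose Retain or Delete according to how important the data is
- Dynamic provisioning: use a StorageClass to automate storage management
- Volume snapshots: use VolumeSnapshot for backup and restore
- Storage migration: move storage across clusters and recover from disasters
- Advanced backup strategy: tiered backups, encrypted backups, and a disaster recovery plan
- Storage monitoring: detailed metrics and alert rules
- Performance optimization: storage type selection, mount option tuning, and resource management
Storage management best practices
Storage planning:
- Choose a storage type that fits the application
- Reserve enough capacity
- Set sensible storage quotas
Data security:
- Protect sensitive data with encrypted storage
- Back up on a regular schedule
- Configure an appropriate reclaim policy
Performance optimization:
- Choose a high-performance storage type where the workload needs it
- Tune mount options
- Set resource requests and limits deliberately
Backup and restore:
- Use a tiered backup strategy
- Verify backup integrity regularly
- Maintain a disaster recovery plan
Monitoring and maintenance:
- Monitor storage usage and performance
- Set sensible alert thresholds
- Run storage health checks regularly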
Command quick reference
```bash
# PV management
kubectl get pv                                      # list all PVs
kubectl describe pv <pv-name>                       # show PV details
kubectl delete pv <pv-name>                         # delete a PV
# PVC management
kubectl get pvc                                     # list all PVCs
kubectl describe pvc <pvc-name>                     # show PVC details
kubectl patch pvc <pvc-name> -p '{...}'             # expand a PVC
# Volume snapshot management
kubectl get volumesnapshot                          # list volume snapshots
kubectl create -f snapshot.yaml                     # create a volume snapshot
kubectl get volumesnapshotcontent                   # list snapshot contents
# Backup management
velero backup get                                   # list backups
velero backup create <backup-name>                  # create a backup
velero restore create --from-backup <backup-name>   # restore from a backup
# Troubleshooting
kubectl describe pvc <pvc-name>                     # show PVC events
kubectl get events --field-selector involvedObject.name=<pvc-name>   # show events
kubectl logs -n kube-system <provisioner-pod>       # show provisioner logs
```
Next steps
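- 18-StorageClass: dynamic provisioning and StorageClass configuration
- 19-ConfigMap and Secret: configuration management and handling of sensitive data
- 20-Stateful applications: deploying and managing stateful applications in depth
References
- Kubernetes documentation: Persistent Volumes
- Kubernetes documentation: Storage Classes
- Kubernetes documentation: Volume Snapshots
- Velero documentation
- CSI Volume Snapshots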