CI/CD流水线
概述
CI/CD(持续集成/持续部署)是现代软件开发的核心实践。本章将深入探讨如何在Kubernetes环境中构建完整的CI/CD流水线,涵盖GitOps实践、持续集成、持续部署与自动化测试等关键内容。
核心概念
CI/CD流程
- 持续集成(CI):代码提交后自动构建、测试
- 持续部署(CD):自动部署到不同环境
- GitOps:以Git为单一事实来源的运维模式
- 基础设施即代码:所有配置代码化管理
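上述"提交后自动构建、测试、部署"的流程,可以用一个极简的 shell 片段示意其门禁式执行(build/run_tests/deploy 均为占位函数,仅用于演示,并非真实命令):

```shell
#!/bin/sh
# CI/CD 门禁示意:前一阶段失败则后续阶段不再执行(短路求值)
build()     { echo "building"; }
run_tests() { echo "testing"; }
deploy()    { echo "deploying"; }
build && run_tests && deploy || echo "pipeline failed"
```

实际流水线中,各阶段由 CI 工具(见下文的 GitHub Actions、GitLab CI 配置)按同样的短路逻辑编排。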
GitOps核心原则
- 声明式描述:系统状态声明式定义
- 版本控制:所有变更通过Git提交
- 自动应用:变更自动应用到集群
- 持续协调:持续确保实际状态与期望状态一致
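其中"持续协调"原则可以用下面的 shell 片段做极简模拟(desired/actual 为本地占位变量;真实场景中由 ArgoCD/Flux 持续对比 Git 仓库与集群状态):

```shell
#!/bin/sh
# GitOps 协调循环示意:发现实际状态偏离期望状态时,自动"应用"期望状态
desired="replicas=5"   # Git 中声明的期望状态
actual="replicas=3"    # 集群中的实际状态
if [ "$desired" != "$actual" ]; then
  echo "drift detected: $actual -> $desired"
  actual="$desired"    # 模拟 apply,消除漂移
fi
echo "state: $actual"
```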
CI/CD工具链
- Git仓库:GitHub、GitLab、Bitbucket
- CI工具:Jenkins、GitLab CI、GitHub Actions
- CD工具:ArgoCD、Flux、Spinnaker
- 镜像仓库:Docker Hub、Harbor、ECR
GitOps实践
ArgoCD部署
yaml
apiVersion: v1
kind: Namespace
metadata:
name: argocd
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: my-app
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/myorg/my-app.git
targetRevision: HEAD
path: k8s/overlays/production
destination:
server: https://kubernetes.default.svc
namespace: production
syncPolicy:
automated:
prune: true
selfHeal: true
allowEmpty: false
syncOptions:
- Validate=true
- CreateNamespace=true
- PrunePropagationPolicy=foreground
- PruneLast=true
retry:
limit: 5
backoff:
duration: 5s
factor: 2
maxDuration: 3m
---
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
name: production
namespace: argocd
spec:
description: Production environment project
sourceRepos:
- 'https://github.com/myorg/*'
destinations:
- namespace: production
server: https://kubernetes.default.svc
- namespace: staging
server: https://kubernetes.default.svc
clusterResourceWhitelist:
- group: ''
kind: Namespace
namespaceResourceBlacklist:
- group: ''
kind: ResourceQuota
- group: ''
kind: LimitRange
- group: ''
kind: NetworkPolicy
roles:
- name: admin
description: Admin privileges for production project
policies:
- p, proj:production:admin, applications, *, production/*, allow
- p, proj:production:admin, repositories, *, production/*, allow
---
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: microservices
namespace: argocd
spec:
generators:
- list:
elements:
- name: user-service
path: services/user-service
- name: order-service
path: services/order-service
- name: product-service
path: services/product-service
template:
metadata:
name: '{{name}}'
spec:
project: default
source:
repoURL: https://github.com/myorg/microservices.git
targetRevision: HEAD
path: '{{path}}/k8s/overlays/production'
destination:
server: https://kubernetes.default.svc
namespace: production
syncPolicy:
automated:
prune: true
selfHeal: true
Flux部署
yaml
apiVersion: v1
kind: Namespace
metadata:
name: flux-system
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: GitRepository
metadata:
name: my-app
namespace: flux-system
spec:
interval: 1m0s
ref:
branch: main
secretRef:
name: flux-system
url: ssh://git@github.com/myorg/my-app
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
name: my-app
namespace: flux-system
spec:
interval: 5m0s
path: ./k8s/overlays/production
prune: true
sourceRef:
kind: GitRepository
name: my-app
validation: client
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: user-service
namespace: production
- apiVersion: apps/v1
kind: Deployment
name: order-service
namespace: production
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
name: my-app
namespace: flux-system
spec:
image: registry.example.com/my-app
interval: 1m0s
secretRef:
name: registry-credentials
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
name: my-app
namespace: flux-system
spec:
imageRepositoryRef:
name: my-app
policy:
semver:
range: '>=1.0.0'
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
name: my-app
namespace: flux-system
spec:
interval: 1m0s
sourceRef:
kind: GitRepository
name: my-app
git:
checkout:
ref:
branch: main
commit:
author:
email: flux@example.com
name: flux
messageTemplate: '{{range .Updated.Images}}{{println .}}{{end}}'
push:
branch: main
update:
path: ./k8s
strategy: Setters
持续集成
GitHub Actions配置
yaml
name: CI Pipeline
on:
push:
branches: [ main, develop ]
pull_request:
branches: [ main ]
env:
REGISTRY: registry.example.com
IMAGE_NAME: ${{ github.repository }}
jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Set up JDK 17
uses: actions/setup-java@v3
with:
java-version: '17'
distribution: 'temurin'
- name: Cache Maven packages
uses: actions/cache@v3
with:
path: ~/.m2
key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}
restore-keys: |
${{ runner.os }}-m2
- name: Build with Maven
run: mvn clean package -DskipTests
- name: Run tests
run: mvn test
- name: Generate coverage report
run: mvn jacoco:report
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
file: ./target/site/jacoco/jacoco.xml
flags: unittests
name: codecov-umbrella
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Log in to Container Registry
uses: docker/login-action@v2
with:
registry: ${{ env.REGISTRY }}
username: ${{ secrets.REGISTRY_USERNAME }}
password: ${{ secrets.REGISTRY_PASSWORD }}
- name: Extract metadata for Docker
id: meta
uses: docker/metadata-action@v4
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=ref,event=pr
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=sha
- name: Build and push Docker image
uses: docker/build-push-action@v4
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Run Trivy vulnerability scanner
uses: aquasecurity/trivy-action@master
with:
image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
format: 'sarif'
output: 'trivy-results.sarif'
- name: Upload Trivy scan results
uses: github/codeql-action/upload-sarif@v2
with:
sarif_file: 'trivy-results.sarif'
test:
runs-on: ubuntu-latest
needs: build
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Run integration tests
run: |
docker-compose -f docker-compose.test.yml up -d
sleep 30
mvn verify -P integration-test
docker-compose -f docker-compose.test.yml down
- name: Publish test results
uses: EnricoMi/publish-unit-test-result-action@v2
if: always()
with:
files: |
target/surefire-reports/*.xml
target/failsafe-reports/*.xml
sonarqube:
runs-on: ubuntu-latest
needs: build
steps:
- name: Checkout code
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Set up JDK 17
uses: actions/setup-java@v3
with:
java-version: '17'
distribution: 'temurin'
- name: Cache SonarCloud packages
uses: actions/cache@v3
with:
path: ~/.sonar/cache
key: ${{ runner.os }}-sonar
restore-keys: ${{ runner.os }}-sonar
- name: Cache Maven packages
uses: actions/cache@v3
with:
path: ~/.m2
key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}
restore-keys: |
${{ runner.os }}-m2
- name: Build and analyze
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
run: mvn -B verify org.sonarsource.scanner.maven:sonar-maven-plugin:sonar -Dsonar.projectKey=myorg_my-app
GitLab CI配置
yaml
stages:
- build
- test
- security
- deploy
variables:
MAVEN_CLI_OPTS: "-s .m2/settings.xml --batch-mode"
MAVEN_OPTS: "-Dmaven.repo.local=.m2/repository"
DOCKER_DRIVER: overlay2
DOCKER_TLS_CERTDIR: "/certs"
cache:
paths:
- .m2/repository/
build:
stage: build
image: maven:3.8.6-openjdk-17
script:
- mvn $MAVEN_CLI_OPTS compile
artifacts:
paths:
- target/
expire_in: 1 hour
test:
stage: test
image: maven:3.8.6-openjdk-17
script:
- mvn $MAVEN_CLI_OPTS test
- mvn $MAVEN_CLI_OPTS jacoco:report
artifacts:
reports:
junit:
- target/surefire-reports/TEST-*.xml
coverage_report:
coverage_format: jacoco
path: target/site/jacoco/jacoco.xml
paths:
- target/site/jacoco/
expire_in: 1 week
code_quality:
stage: test
image: maven:3.8.6-openjdk-17
script:
- mvn $MAVEN_CLI_OPTS verify sonar:sonar -Dsonar.projectKey=$CI_PROJECT_NAME -Dsonar.host.url=$SONAR_URL -Dsonar.login=$SONAR_TOKEN
allow_failure: true
security_scan:
stage: security
image: docker:latest
services:
- docker:dind
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
- docker run --rm -v /var/run/docker.sock:/var/run/docker.sock aquasec/trivy:latest image --exit-code 1 --severity HIGH,CRITICAL $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
allow_failure: true
docker_build:
stage: build
image: docker:latest
services:
- docker:dind
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
- |
if [ "$CI_COMMIT_BRANCH" == "main" ]; then
docker tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA $CI_REGISTRY_IMAGE:latest
docker push $CI_REGISTRY_IMAGE:latest
fi
deploy_staging:
stage: deploy
image: bitnami/kubectl:latest
script:
- kubectl config set-cluster k8s --server="$KUBE_URL" --insecure-skip-tls-verify=true
- kubectl config set-credentials admin --token="$KUBE_TOKEN"
- kubectl config set-context default --cluster=k8s --user=admin
- kubectl config use-context default
- kubectl set image deployment/my-app my-app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA -n staging
- kubectl rollout status deployment/my-app -n staging
environment:
name: staging
url: https://staging.example.com
only:
- develop
deploy_production:
stage: deploy
image: bitnami/kubectl:latest
script:
- kubectl config set-cluster k8s --server="$KUBE_URL" --insecure-skip-tls-verify=true
- kubectl config set-credentials admin --token="$KUBE_TOKEN"
- kubectl config set-context default --cluster=k8s --user=admin
- kubectl config use-context default
- kubectl set image deployment/my-app my-app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA -n production
- kubectl rollout status deployment/my-app -n production
environment:
name: production
url: https://www.example.com
when: manual
only:
- main
Jenkins Pipeline配置
groovy
pipeline {
agent {
kubernetes {
yaml '''
apiVersion: v1
kind: Pod
metadata:
labels:
app: jenkins-agent
spec:
containers:
- name: maven
image: maven:3.8.6-openjdk-17
command:
- cat
tty: true
volumeMounts:
- name: m2-cache
mountPath: /root/.m2
- name: docker
image: docker:latest
command:
- cat
tty: true
volumeMounts:
- name: docker-sock
mountPath: /var/run/docker.sock
# 供后续 Deploy 阶段的 container('kubectl') 使用
- name: kubectl
image: bitnami/kubectl:latest
command:
- cat
tty: true
volumes:
- name: m2-cache
emptyDir: {}
- name: docker-sock
hostPath:
path: /var/run/docker.sock
'''
}
}
environment {
REGISTRY = 'registry.example.com'
IMAGE_NAME = "${REGISTRY}/${JOB_NAME}"
DOCKER_CREDENTIALS = credentials('docker-registry')
}
stages {
stage('Checkout') {
steps {
checkout scm
sh 'git rev-parse HEAD > commit-id'
script {
env.COMMIT_ID = readFile('commit-id').trim()
}
}
}
stage('Build') {
steps {
container('maven') {
sh 'mvn clean package -DskipTests'
}
}
}
stage('Test') {
steps {
container('maven') {
sh 'mvn test'
}
}
post {
always {
junit 'target/surefire-reports/*.xml'
jacoco execPattern: 'target/jacoco.exec'
}
}
}
stage('Code Quality') {
steps {
container('maven') {
withSonarQubeEnv('SonarQube') {
sh 'mvn sonar:sonar'
}
}
}
}
stage('Security Scan') {
steps {
container('docker') {
sh """
docker build -t ${IMAGE_NAME}:${COMMIT_ID} .
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy:latest image --exit-code 1 --severity HIGH,CRITICAL \
${IMAGE_NAME}:${COMMIT_ID}
"""
}
}
}
stage('Build and Push Image') {
steps {
container('docker') {
sh """
docker login -u ${DOCKER_CREDENTIALS_USR} -p ${DOCKER_CREDENTIALS_PSW} ${REGISTRY}
docker push ${IMAGE_NAME}:${COMMIT_ID}
if [ "\${GIT_BRANCH}" == "origin/main" ]; then
docker tag ${IMAGE_NAME}:${COMMIT_ID} ${IMAGE_NAME}:latest
docker push ${IMAGE_NAME}:latest
fi
"""
}
}
}
stage('Deploy to Staging') {
when {
branch 'develop'
}
steps {
container('kubectl') {
sh """
kubectl set image deployment/my-app my-app=${IMAGE_NAME}:${COMMIT_ID} -n staging
kubectl rollout status deployment/my-app -n staging
"""
}
}
}
stage('Deploy to Production') {
when {
branch 'main'
}
steps {
input 'Deploy to Production?'
container('kubectl') {
sh """
kubectl set image deployment/my-app my-app=${IMAGE_NAME}:${COMMIT_ID} -n production
kubectl rollout status deployment/my-app -n production
"""
}
}
}
}
post {
always {
cleanWs()
}
success {
slackSend(color: 'good', message: "Build successful: ${env.JOB_NAME} #${env.BUILD_NUMBER}")
}
failure {
slackSend(color: 'danger', message: "Build failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}")
}
}
}
持续部署
部署策略配置
yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: my-app-rollout
namespace: production
spec:
replicas: 5
revisionHistoryLimit: 2
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
spec:
containers:
- name: my-app
image: registry.example.com/my-app:v1.0.0
ports:
- containerPort: 8080
resources:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "2000m"
memory: "2Gi"
livenessProbe:
httpGet:
path: /actuator/health/liveness
port: 8080
initialDelaySeconds: 60
periodSeconds: 10
readinessProbe:
httpGet:
path: /actuator/health/readiness
port: 8080
initialDelaySeconds: 30
periodSeconds: 5
strategy:
canary:
steps:
- setWeight: 20
- pause: {duration: 10m}
- setWeight: 40
- pause: {duration: 10m}
- setWeight: 60
- pause: {duration: 10m}
- setWeight: 80
- pause: {duration: 10m}
analysis:
templates:
- templateName: success-rate
startingStep: 2
args:
- name: service-name
value: my-app
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: success-rate
namespace: production
spec:
args:
- name: service-name
metrics:
- name: success-rate
interval: 5m
successCondition: result[0] >= 0.95
provider:
prometheus:
address: http://prometheus.monitoring.svc.cluster.local:9090
query: |
sum(rate(http_requests_total{status=~"2..",service="{{args.service-name}}"}[5m])) /
sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: my-app-canary
namespace: production
annotations:
nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: "20"
spec:
rules:
- host: www.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-app-canary
port:
number: 80
蓝绿部署配置
yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: my-app-bluegreen
namespace: production
spec:
replicas: 5
revisionHistoryLimit: 2
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
spec:
containers:
- name: my-app
image: registry.example.com/my-app:v1.0.0
ports:
- containerPort: 8080
resources:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "2000m"
memory: "2Gi"
strategy:
blueGreen:
activeService: my-app-active
previewService: my-app-preview
autoPromotionEnabled: false
scaleDownDelaySeconds: 30
prePromotionAnalysis:
templates:
- templateName: pre-promotion
postPromotionAnalysis:
templates:
- templateName: post-promotion
---
apiVersion: v1
kind: Service
metadata:
name: my-app-active
namespace: production
spec:
type: ClusterIP
selector:
app: my-app
ports:
- port: 80
targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
name: my-app-preview
namespace: production
spec:
type: ClusterIP
selector:
app: my-app
ports:
- port: 80
targetPort: 8080
kubectl操作命令
ArgoCD管理
bash
# 安装ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# 获取ArgoCD初始密码
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
# 访问ArgoCD UI
kubectl port-forward svc/argocd-server -n argocd 8080:443
# 登录ArgoCD
argocd login localhost:8080
# 查看应用列表
argocd app list
# 同步应用
argocd app sync my-app
# 查看应用状态
argocd app get my-app
# 查看应用差异
argocd app diff my-app
# 查看应用历史
argocd app history my-app
# 回滚应用
argocd app rollback my-app <revision>
# 删除应用
argocd app delete my-app
Flux管理
bash
# 安装Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash
# 安装Flux到集群
flux install
# 创建Git仓库
flux create source git my-app \
--url=https://github.com/myorg/my-app \
--branch=main \
--interval=1m
# 创建Kustomization
flux create kustomization my-app \
--source=my-app \
--path="./k8s/overlays/production" \
--prune=true \
--interval=5m
# 查看Flux资源
flux get sources git
flux get kustomizations
# 手动同步
flux reconcile source git my-app
flux reconcile kustomization my-app
# 暂停自动同步
flux suspend kustomization my-app
# 恢复自动同步
flux resume kustomization my-app
# 查看日志
flux logs
部署管理
bash
# 查看部署状态
kubectl get deployments -n production
# 查看Rollout状态
kubectl get rollouts -n production
# 查看Rollout详情
kubectl describe rollout my-app-rollout -n production
# 查看部署历史
kubectl rollout history deployment/my-app -n production
# 回滚部署
kubectl rollout undo deployment/my-app -n production
# 查看Pod状态
kubectl get pods -n production -w
# 查看部署事件
kubectl get events -n production --sort-by='.lastTimestamp'
# 查看资源使用
kubectl top pods -n production
# 查看日志
kubectl logs -f deployment/my-app -n production
镜像管理
bash
# 更新镜像
kubectl set image deployment/my-app my-app=registry.example.com/my-app:v2.0.0 -n production
# 查看镜像版本
kubectl get deployment my-app -n production -o jsonpath='{.spec.template.spec.containers[0].image}'
# 查看镜像历史
kubectl rollout history deployment/my-app -n production
# 查看镜像拉取策略
kubectl get deployment my-app -n production -o jsonpath='{.spec.template.spec.containers[0].imagePullPolicy}'
# 查看镜像仓库密钥
kubectl get secrets -n production | grep registry
# 创建镜像仓库密钥
kubectl create secret docker-registry regcred \
--docker-server=registry.example.com \
--docker-username=<username> \
--docker-password=<password> \
--namespace=production
实践示例
示例1:完整CI/CD流水线
场景描述
构建一个完整的CI/CD流水线,从代码提交到生产部署的全流程自动化。
流水线配置
yaml
# .github/workflows/ci-cd.yml
name: Complete CI/CD Pipeline
on:
push:
branches: [ main, develop, 'release/*' ]
pull_request:
branches: [ main ]
env:
REGISTRY: registry.example.com
IMAGE_NAME: ${{ github.repository }}
jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run Super-Linter
uses: github/super-linter@v4
env:
DEFAULT_BRANCH: main
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
build:
needs: lint
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up JDK 17
uses: actions/setup-java@v3
with:
java-version: '17'
distribution: 'temurin'
- name: Build with Maven
run: mvn clean package -DskipTests
- name: Upload build artifacts
uses: actions/upload-artifact@v3
with:
name: build-artifacts
path: target/
test:
needs: build
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up JDK 17
uses: actions/setup-java@v3
with:
java-version: '17'
distribution: 'temurin'
- name: Run unit tests
run: mvn test
- name: Run integration tests
run: mvn verify -P integration-test
- name: Generate coverage report
run: mvn jacoco:report
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
file: ./target/site/jacoco/jacoco.xml
security:
needs: build
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run Trivy vulnerability scanner
uses: aquasecurity/trivy-action@master
with:
scan-type: 'fs'
scan-ref: '.'
format: 'sarif'
output: 'trivy-results.sarif'
- name: Upload Trivy scan results
uses: github/codeql-action/upload-sarif@v2
with:
sarif_file: 'trivy-results.sarif'
docker:
needs: [test, security]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Log in to Container Registry
uses: docker/login-action@v2
with:
registry: ${{ env.REGISTRY }}
username: ${{ secrets.REGISTRY_USERNAME }}
password: ${{ secrets.REGISTRY_PASSWORD }}
- name: Extract metadata for Docker
id: meta
uses: docker/metadata-action@v4
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=semver,pattern={{version}}
type=sha
- name: Build and push Docker image
uses: docker/build-push-action@v4
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
deploy-staging:
needs: docker
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/develop'
steps:
- uses: actions/checkout@v3
- name: Deploy to staging
uses: steebchen/kubectl@v2.0.0
with:
config: ${{ secrets.KUBE_CONFIG }}
command: apply -f k8s/overlays/staging
- name: Wait for deployment
run: |
kubectl rollout status deployment/my-app -n staging --timeout=300s
- name: Run smoke tests
run: |
./scripts/smoke-test.sh staging
deploy-production:
needs: docker
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main'
environment: production
steps:
- uses: actions/checkout@v3
- name: Deploy to production
uses: steebchen/kubectl@v2.0.0
with:
config: ${{ secrets.KUBE_CONFIG }}
command: apply -f k8s/overlays/production
- name: Wait for deployment
run: |
kubectl rollout status deployment/my-app -n production --timeout=300s
- name: Run smoke tests
run: |
./scripts/smoke-test.sh production
- name: Notify deployment
uses: 8398a7/action-slack@v3
with:
status: ${{ job.status }}
text: 'Deployed to production'
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
部署命令
bash
# 触发CI/CD流水线
git push origin main
# 查看GitHub Actions状态
gh run list
gh run view
# 查看ArgoCD应用状态
argocd app get my-app
# 手动同步ArgoCD应用
argocd app sync my-app
# 查看部署状态
kubectl get all -n production
# 查看部署日志
kubectl logs -f deployment/my-app -n production
示例2:GitOps多环境部署
场景描述
使用GitOps模式管理多个环境(开发、测试、生产)的部署。
目录结构
my-app/
├── k8s/
│ ├── base/
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ └── kustomization.yaml
│ └── overlays/
│ ├── development/
│ │ ├── kustomization.yaml
│ │ ├── patches/
│ │ └── config/
│ ├── staging/
│ │ ├── kustomization.yaml
│ │ ├── patches/
│ │ └── config/
│ └── production/
│ ├── kustomization.yaml
│ ├── patches/
│ └── config/
└── .github/
└── workflows/
└── ci-cd.yml
ArgoCD ApplicationSet
yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: my-app-environments
namespace: argocd
spec:
generators:
- list:
elements:
- env: development
namespace: development
branch: develop
- env: staging
namespace: staging
branch: release/*
- env: production
namespace: production
branch: main
template:
metadata:
name: 'my-app-{{env}}'
spec:
project: default
source:
repoURL: https://github.com/myorg/my-app.git
targetRevision: '{{branch}}'
path: k8s/overlays/{{env}}
destination:
server: https://kubernetes.default.svc
namespace: '{{namespace}}'
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
Kustomize配置
yaml
# k8s/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
- service.yaml
commonLabels:
app: my-app
---
# k8s/overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
- ../../base
patchesStrategicMerge:
- patches/deployment-replicas.yaml
- patches/resource-limits.yaml
configMapGenerator:
- name: app-config
behavior: merge
files:
- config/application.yml
secretGenerator:
- name: app-secret
behavior: merge
type: Opaque
files:
- config/db-password
images:
- name: registry.example.com/my-app
newTag: v1.0.0
commonLabels:
environment: production
---
# k8s/overlays/production/patches/deployment-replicas.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
spec:
replicas: 5
示例3:自动化测试集成
场景描述
在CI/CD流水线中集成自动化测试,包括单元测试、集成测试、端到端测试。
测试配置
yaml
# .github/workflows/test.yml
name: Automated Testing
on:
push:
branches: [ main, develop ]
pull_request:
branches: [ main ]
jobs:
unit-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up JDK 17
uses: actions/setup-java@v3
with:
java-version: '17'
- name: Run unit tests
run: mvn test
- name: Generate coverage report
run: mvn jacoco:report
- name: Upload coverage
uses: codecov/codecov-action@v3
with:
file: ./target/site/jacoco/jacoco.xml
fail_ci_if_error: true
integration-test:
runs-on: ubuntu-latest
services:
mysql:
image: mysql:8.0
env:
MYSQL_ROOT_PASSWORD: root
MYSQL_DATABASE: test_db
ports:
- 3306:3306
options: --health-cmd="mysqladmin ping" --health-interval=10s --health-timeout=5s --health-retries=3
redis:
image: redis:latest
ports:
- 6379:6379
options: --health-cmd="redis-cli ping" --health-interval=10s --health-timeout=5s --health-retries=3
steps:
- uses: actions/checkout@v3
- name: Set up JDK 17
uses: actions/setup-java@v3
with:
java-version: '17'
- name: Run integration tests
run: mvn verify -P integration-test
env:
DB_HOST: localhost
DB_PORT: 3306
REDIS_HOST: localhost
REDIS_PORT: 6379
e2e-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Deploy to test environment
run: |
kubectl apply -f k8s/overlays/test
kubectl wait --for=condition=ready pod -l app=my-app -n test --timeout=300s
- name: Run E2E tests
run: |
npm install
npm run test:e2e
env:
TEST_URL: https://test.example.com
- name: Cleanup test environment
if: always()
run: kubectl delete namespace test
performance-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run performance tests
uses: grafana/k6-action@v0.3.0
with:
filename: tests/performance/load-test.js
env:
K6_CLOUD_TOKEN: ${{ secrets.K6_CLOUD_TOKEN }}
- name: Upload performance results
uses: actions/upload-artifact@v3
with:
name: performance-results
path: summary.json
测试脚本
javascript
// tests/performance/load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
export let options = {
stages: [
{ duration: '2m', target: 100 },
{ duration: '5m', target: 100 },
{ duration: '2m', target: 200 },
{ duration: '5m', target: 200 },
{ duration: '2m', target: 0 },
],
thresholds: {
http_req_duration: ['p(99)<1500'],
http_req_failed: ['rate<0.01'],
},
};
export default function () {
let res = http.get('https://test.example.com/api/health');
check(res, {
'status is 200': (r) => r.status == 200,
'response time < 500ms': (r) => r.timings.duration < 500,
});
sleep(1);
}
故障排查指南
常见问题1:ArgoCD同步失败
症状
- 应用状态显示OutOfSync
- 同步操作失败
排查步骤
bash
# 查看应用状态
argocd app get my-app
# 查看应用详情
argocd app get my-app --refresh
# 查看同步状态
argocd app sync my-app --dry-run
# 查看应用日志
kubectl logs -n argocd deployment/argocd-application-controller
# 查看应用事件
kubectl get events -n production --field-selector involvedObject.name=my-app
# 查看资源差异
argocd app diff my-app
解决方案
bash
# 强制同步
argocd app sync my-app --force
# 回滚到上一个版本
argocd app rollback my-app
# 删除并重新创建应用
argocd app delete my-app
argocd app create my-app --file app.yaml
常见问题2:镜像拉取失败
症状
- Pod状态为ImagePullBackOff
- ErrImagePull错误
排查步骤
bash
# 查看Pod状态
kubectl describe pod <pod-name> -n production
# 查看镜像拉取密钥
kubectl get secrets -n production | grep registry
# 查看密钥详情
kubectl describe secret regcred -n production
# 测试镜像拉取
docker pull registry.example.com/my-app:v1.0.0
# 查看镜像仓库日志
kubectl logs -n production deployment/my-app
解决方案
yaml
# 创建镜像拉取密钥
apiVersion: v1
kind: Secret
metadata:
name: regcred
namespace: production
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: <base64-encoded-docker-config>
---
# 在Deployment中引用密钥
spec:
template:
spec:
imagePullSecrets:
- name: regcred
常见问题3:部署超时
症状
- 部署一直处于进行中
- Rollout卡住
排查步骤
bash
# 查看Rollout状态
kubectl get rollout my-app-rollout -n production
# 查看Rollout详情
kubectl describe rollout my-app-rollout -n production
# 查看Pod状态
kubectl get pods -n production -l app=my-app
# 查看Pod事件
kubectl describe pod <pod-name> -n production
# 查看资源使用
kubectl top pods -n production
# 查看节点资源
kubectl describe nodes
解决方案
yaml
# 增加部署超时时间
spec:
strategy:
canary:
steps:
- setWeight: 20
- pause: {duration: 10m}
analysis:
templates:
- templateName: success-rate
args:
- name: timeout
value: "600"
常见问题4:测试失败
症状
- CI流水线测试失败
- 测试覆盖率不达标
排查步骤
bash
# 查看测试日志
cat target/surefire-reports/*.txt
# 查看测试报告
open target/site/surefire-report.html
# 查看覆盖率报告
open target/site/jacoco/index.html
# 运行特定测试
mvn test -Dtest=MyTest
# 调试测试
mvn test -Dtest=MyTest -Dmaven.surefire.debug
解决方案
xml
<!-- 增加测试超时时间 -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<argLine>-Xmx1024m</argLine>
<forkedProcessTimeoutInSeconds>600</forkedProcessTimeoutInSeconds>
</configuration>
</plugin>
常见问题5:GitOps配置冲突
症状
- Git配置与集群状态不一致
- 手动修改被自动覆盖
排查步骤
bash
# 查看Git配置
git diff HEAD~1 k8s/overlays/production/
# 查看集群状态
kubectl get all -n production -o yaml
# 比较差异
diff <(kubectl get deployment my-app -n production -o yaml) k8s/overlays/production/deployment.yaml
# 查看ArgoCD同步历史
argocd app history my-app
# 查看ArgoCD操作日志
argocd app logs my-app
解决方案
bash
# 禁用自动同步
argocd app set my-app --sync-policy none
# 手动同步
argocd app sync my-app
# 启用自动同步
argocd app set my-app --sync-policy automated
# 使用GitOps最佳实践
# 所有变更都通过Git提交,避免手动修改集群资源
最佳实践建议
1. GitOps最佳实践
单一事实来源
yaml
# 所有配置存储在Git仓库
# 集群状态由Git配置驱动
# 避免手动修改集群资源
分支策略
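分支到部署环境的映射可以用一个 case 片段示意(其中 staging/feature-preview 等环境名为假设的占位,与下文列表的对应关系一致):

```shell
#!/bin/sh
# 分支 -> 部署环境 的映射示意
map_env() {
  case "$1" in
    main)      echo "production" ;;       # main 分支 -> 生产环境
    develop)   echo "development" ;;      # develop 分支 -> 开发环境
    release/*) echo "staging" ;;          # release/* 分支 -> 测试环境
    feature/*) echo "feature-preview" ;;  # feature/* 分支 -> 功能预览(假设)
    *)         echo "none" ;;
  esac
}
map_env "release/1.2"   # staging
```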
yaml
# main分支 -> 生产环境
# develop分支 -> 开发环境
# release/*分支 -> 测试环境
# feature/*分支 -> 功能开发
变更审批
yaml
# 生产环境变更需要PR审批
# 至少2人review
# 自动化测试通过
# 安全扫描通过
2. CI/CD最佳实践
流水线设计
yaml
# 分阶段执行
stages:
- lint
- build
- test
- security
- deploy
# 失败快速反馈
# 并行执行独立任务
# 缓存依赖加速构建
安全集成
yaml
# 代码扫描
- 静态代码分析
- 依赖漏洞扫描
- 密钥泄露检测
# 镜像扫描
- 基础镜像漏洞
- 应用依赖漏洞
- 配置安全检查
3. 部署策略最佳实践
金丝雀发布
yaml
# 逐步增加流量
- 20%流量 -> 监控10分钟
- 40%流量 -> 监控10分钟
- 60%流量 -> 监控10分钟
- 80%流量 -> 监控10分钟
- 100%流量
# 自动回滚
- 错误率超过阈值自动回滚
- 响应时间超过阈值自动回滚
蓝绿部署
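蓝绿切换的核心动作是改写 active Service 的 selector,使其指向新版本的 Pod(示意配置,`version` 标签为假设;使用 Argo Rollouts 时,控制器会自动完成等价的 selector 切换):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-active
  namespace: production
spec:
  selector:
    app: my-app
    version: green   # 由 blue 改为 green,即把全部流量切到新版本
  ports:
  - port: 80
    targetPort: 8080
```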
yaml
# 准备新版本
- 部署绿色环境
- 运行冒烟测试
- 运行性能测试
# 切换流量
- 一次性切换所有流量
- 监控关键指标
- 快速回滚能力
4. 监控和告警最佳实践
关键指标
yaml
# 应用指标
- 请求成功率
- 响应时间
- 错误率
# 系统指标
- CPU使用率
- 内存使用率
- 网络IO
# 业务指标
- 订单量
- 用户活跃度
- 转化率
告警配置
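下述告警阈值可以落地为一份 PrometheusRule 声明(示意配置,指标名 http_requests_total、http_request_duration_seconds_bucket 及其标签均为假设,需按实际埋点调整):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-alerts
  namespace: monitoring
spec:
  groups:
  - name: my-app.rules
    rules:
    - alert: HighErrorRate
      expr: |
        sum(rate(http_requests_total{service="my-app",status=~"5.."}[5m]))
          / sum(rate(http_requests_total{service="my-app"}[5m])) > 0.05
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "my-app 错误率超过 5%"
    - alert: HighLatency
      expr: |
        histogram_quantile(0.99,
          sum(rate(http_request_duration_seconds_bucket{service="my-app"}[5m])) by (le)) > 2
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "my-app P99 响应时间超过 2s"
```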
yaml
# 告警规则
- 错误率 > 5% 触发告警
- 响应时间 > 2s 触发告警
- CPU使用率 > 80% 触发告警
# 告警渠道
- Slack通知
- 邮件通知
- 短信通知
5. 测试最佳实践
测试金字塔
yaml
# 单元测试
- 快速执行
- 高覆盖率
- 隔离测试
# 集成测试
- 服务间集成
- 数据库集成
- 外部API集成
# 端到端测试
- 用户场景测试
- 关键路径测试
- 性能测试
测试自动化
yaml
# 自动运行测试
- 每次提交运行单元测试
- 每次PR运行集成测试
- 每次部署运行冒烟测试
# 测试报告
- 测试覆盖率报告
- 性能测试报告
- 安全测试报告
6. 环境管理最佳实践
环境隔离
yaml
# 开发环境
- 快速迭代
- 宽松限制
- 频繁部署
# 测试环境
- 接近生产
- 自动化测试
- 性能测试
# 生产环境
- 严格限制
- 审批流程
- 监控告警
配置管理
yaml
# 环境变量
- 使用ConfigMap
- 使用Secret
- 环境隔离
# 配置版本化
- Git管理配置
- 配置变更审批
- 配置回滚能力
7. 团队协作最佳实践
代码审查
yaml
# PR审查
- 至少2人review
- 自动化检查通过
- 测试覆盖达标
# 审查清单
- 代码质量
- 安全问题
- 性能问题
- 文档完整性
文档管理
yaml
# 架构文档
- 系统架构图
- 部署流程
- 故障处理
# API文档
- OpenAPI规范
- 使用示例
- 变更日志
总结
CI/CD流水线是现代软件开发的核心实践,本章我们学习了:
- GitOps实践:ArgoCD和Flux的部署和配置
- 持续集成:GitHub Actions、GitLab CI、Jenkins Pipeline
- 持续部署:金丝雀发布、蓝绿部署等策略
- 实践示例:完整CI/CD流水线、多环境部署、自动化测试
- 故障排查:常见问题的诊断和解决方案
- 最佳实践:生产环境的CI/CD经验和建议
通过本章的学习,您应该能够构建完整的CI/CD流水线,实现从代码提交到生产部署的全流程自动化。