172.16.0.100 master
172.16.0.101 node01
172.16.0.102 node02

1. Slow boot

# Stop, disable, and mask the wait-online units so they never start
sudo systemctl disable --now NetworkManager-wait-online.service
sudo systemctl mask NetworkManager-wait-online.service
sudo systemctl disable --now systemd-networkd-wait-online.service
sudo systemctl mask systemd-networkd-wait-online.service
# Reload the systemd configuration
sudo systemctl daemon-reload

# Netplan NIC settings: keep boot from blocking while a single NIC waits for DHCP
sudo vim /etc/netplan/*.yaml

network:
  ethernets:
    ens33:
      dhcp4: true
      optional: true  # add this line
  version: 2

sudo netplan apply

# chrony: use the Aliyun NTP servers
sudo vim /etc/chrony/sources.d/aliyun.sources

# Aliyun public NTP service
server ntp.aliyun.com iburst
server ntp1.aliyun.com iburst
server ntp2.aliyun.com iburst
server ntp3.aliyun.com iburst
server ntp4.aliyun.com iburst
server ntp5.aliyun.com iburst
server ntp6.aliyun.com iburst
server ntp7.aliyun.com iburst

sudo systemctl restart chronyd

2. apt source configuration

/etc/apt/sources.list.d/ubuntu.sources:

Types: deb
URIs: https://mirrors.aliyun.com/ubuntu
Suites: resolute resolute-updates resolute-backports
Components: main universe restricted multiverse
Signed-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg

Types: deb
URIs: https://mirrors.aliyun.com/ubuntu
Suites: resolute-security
Components: main universe restricted multiverse
Signed-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg

3. Add hostnames

echo "172.16.0.100 master" >> /etc/hosts
echo "172.16.0.101 node01" >> /etc/hosts
echo "172.16.0.102 node02" >> /etc/hosts

4. Disable the firewall and swap

sudo systemctl stop ufw
sudo systemctl disable ufw
sudo systemctl mask ufw
sudo iptables -F
sudo iptables -X
sudo iptables -t nat -F
sudo iptables -t nat -X
sudo iptables -P INPUT ACCEPT
sudo iptables -P FORWARD ACCEPT
sudo iptables -P OUTPUT ACCEPT

swapoff -a
sed -i '/swap/s/^/#/' /etc/fstab
free -h

5. Load kernel modules and enable network forwarding

vim k8s-network.sh

#!/bin/bash
# 1. Load the kernel modules at boot
cat > /etc/modules-load.d/k8s-network.conf <<EOF
overlay
br_netfilter
EOF
# 2. Persist the network forwarding parameters
cat > /etc/sysctl.d/99-k8s-forward.conf <<EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
# 3. Load the modules immediately
modprobe overlay
modprobe br_netfilter
# 4. Apply the sysctl settings immediately
sysctl --system
# 5. Verify the result
echo "Verify parameters"
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
echo "Verify kernel modules"
lsmod | grep -E "overlay|br_netfilter"

# Check again after a reboot
sysctl net.ipv4.ip_forward
sysctl net.bridge.bridge-nf-call-iptables
sysctl net.bridge.bridge-nf-call-ip6tables
lsmod | grep overlay
lsmod | grep br_netfilter

6. Install containerd

sudo apt install -y containerd
vi fix-containerd2.sh

set -e
mkdir -p /etc/containerd
# Generate a clean default configuration
containerd config default > /etc/containerd/config.toml
# Enable the systemd cgroup driver
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# Point the pinned sandbox (pause) image at the Aliyun mirror
sed -i 's|sandbox = .*|sandbox = "registry.aliyuncs.com/google_containers/pause:3.10.1"|' /etc/containerd/config.toml
# Reload and restart
systemctl daemon-reload
systemctl restart containerd

7. Install the K8s components

mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.36/deb/Release.key | \
  gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.36/deb/ /' | \
  tee /etc/apt/sources.list.d/kubernetes.list
apt update
apt install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

kubeadm version
kubectl version --client
kubelet --version

# Clean up any previous installation
kubeadm reset -f
rm -rf /etc/kubernetes /var/lib/kubelet /var/lib/etcd
rm -rf $HOME/.kube

8. Initialize the master node

# List the required images
# The pause image configured in /etc/containerd/config.toml is pause:3.10.1 (check with grep)
root@ubuntu26:~# kubeadm config images list
registry.k8s.io/kube-apiserver:v1.36.2
registry.k8s.io/kube-controller-manager:v1.36.2
registry.k8s.io/kube-scheduler:v1.36.2
registry.k8s.io/kube-proxy:v1.36.2
registry.k8s.io/coredns/coredns:v1.14.2
registry.k8s.io/pause:3.10.2
registry.k8s.io/etcd:3.6.8-0

# Script that pulls the images ahead of time and retags them
#!/bin/bash
K8S_VERSION="v1.36.2"
COREDNS_VERSION="v1.14.2"
ETCD_VERSION="3.6.8-0"
PAUSE_VERSION="3.10.2"
REGISTRY_PREFIX="swr.cn-north-4.myhuaweicloud.com/ddn-k8s/registry.k8s.io"
images=(
  "kube-apiserver:${K8S_VERSION}"
  "kube-controller-manager:${K8S_VERSION}"
  "kube-scheduler:${K8S_VERSION}"
  "kube-proxy:${K8S_VERSION}"
  "coredns/coredns:${COREDNS_VERSION}"
  "etcd:${ETCD_VERSION}"
  "pause:${PAUSE_VERSION}"
)
for img in "${images[@]}"; do
  source_img="${REGISTRY_PREFIX}/${img}"
  target_img="registry.k8s.io/${img}"
  echo "Pulling into namespace k8s.io: ${source_img}"
  ctr -n k8s.io images pull "${source_img}"
  echo "Tagging: ${target_img}"
  ctr -n k8s.io images tag "${source_img}" "${target_img}"
  # Remove the source image (Huawei Cloud prefix) once it has been retagged
  ctr -n k8s.io images remove "${source_img}"
done
echo "All images are ready"

apt update
apt install -y cri-tools
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
crictl --version
crictl info
# List all containers to see whether apiserver or etcd has exited
crictl ps -a
# View the etcd logs
ETCD_ID=$(crictl ps -a | grep etcd | awk '{print $1}')
crictl logs $ETCD_ID
# View the apiserver logs
APISERVER_ID=$(crictl ps -a | grep kube-apiserver | awk '{print $1}')
crictl logs $APISERVER_ID
# List all containers in the k8s.io namespace
ctr -n k8s.io c list
# List the images
ctr -n k8s.io images list

# kubeadm configuration file: change the default image repository
kubeadm config print init-defaults > kubeadm-config.yaml
vim kubeadm-config.yaml

imageRepository: registry.aliyuncs.com/google_containers
advertiseAddress: 172.16.0.100

kubeadm config images pull --config kubeadm-config.yaml

# Alternative without a config file (set --kubernetes-version to the release being installed)
kubeadm config images pull \
  --kubernetes-version v1.36.1 \
  --image-repository registry.aliyuncs.com/google_containers

kubeadm init --config kubeadm-config.yaml

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# kubeadm init prints a join command such as:
kubeadm join 172.16.0.100:6443 --token abcdef.0123456789abcdef \
  --discovery-token-ca-cert-hash sha256:2daec271f14cf5b143ba8ab7ece30c0b21a874942a1bf809b643f934cdd1c433

# Reset if the initialization has to be redone
kubeadm reset -f
rm -rf /etc/kubernetes /var/lib/kubelet /var/lib/etcd
rm -rf $HOME/.kube

9. Join the worker nodes

kubeadm join 172.16.0.100:6443 --token abcdef.0123456789abcdef \
  --discovery-token-ca-cert-hash sha256:7538b5a6f63256963831309c85ef60ec1a4fa7855144cbaa9a1ca2ff0286caa1

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

10. Install the Calico network plugin

wget https://raw.githubusercontent.com/projectcalico/calico/v3.30.3/manifests/calico.yaml
# Mirrors of the same file
wget https://fastgit.org/projectcalico/calico/raw/v3.30.3/manifests/calico.yaml
wget https://cdn.jsdelivr.net/gh/projectcalico/calico@v3.30.3/manifests/calico.yaml

grep image: calico.yaml
image: docker.io/calico/cni:v3.30.3
image: docker.io/calico/node:v3.30.3
image: docker.io/calico/kube-controllers:v3.30.3

vim pull_calico_images.sh

#!/bin/bash
# Calico version (change as needed)
CALICO_VERSION="v3.30.3"
# Huawei Cloud SWR registry prefix (source address)
REGISTRY_PREFIX="swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico"
# Calico image names, without the version
images=(
  "cni"
  "node"
  "kube-controllers"
)
for img in "${images[@]}"; do
  source_img="${REGISTRY_PREFIX}/${img}:${CALICO_VERSION}"
  target_img="docker.io/calico/${img}:${CALICO_VERSION}"
  echo "Pulling source image from Huawei Cloud (namespace k8s.io): ${source_img}"
  ctr -n k8s.io images pull "${source_img}" || { echo "Pull failed; check that the image exists and that the network is reachable"; exit 1; }
  echo "Tagging as (namespace k8s.io): ${target_img}"
  ctr -n k8s.io images tag "${source_img}" "${target_img}"
  echo "Removing source image (Huawei Cloud prefix): ${source_img}"
  ctr -n k8s.io images remove "${source_img}"
  echo "Done: ${img}"
  echo
done
echo "All Calico images are ready in the k8s.io namespace"
echo "Calico images currently in the k8s.io namespace:"
ctr -n k8s.io images list | grep calico

# CALICO_IPV4POOL_CIDR must match the pod CIDR given to kubeadm init on the master.
# For example, kubeadm init --pod-network-cidr=10.244.0.0/16 sets the pod range to 10.244.0.0/16.
- name: CALICO_IPV4POOL_CIDR
  value: "10.0.17.0/24"  # use your cluster's pod CIDR

# Switch to BGP mode
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
  value: "Always"  # change to "Off"

kubectl apply -f calico.yaml
kubectl get pod -A
kubectl delete pod -n kube-system <pod> --force --grace-period=0

11. nginx

# Create the nginx deployment
kubectl create deployment nginx --image=docker.m.daocloud.io/library/nginx:alpine
# Expose the port for external access (NodePort)
kubectl expose deployment nginx --port=80 --type=NodePort

nginx-deploy.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: docker.m.daocloud.io/library/nginx:alpine
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080

12. Ceph

lsblk -f
# Wipe the disk
# Replace /dev/sdX with your actual device name, for example /dev/sdb
sudo wipefs -a /dev/sdX

# Deploy the Rook Operator
# Clone the Rook repository
git clone --single-branch --branch v1.15.1 https://gitee.com/mirrors/ROOK.git
cd ROOK/deploy/examples

# Image pre-pull script
vim pull_ceph_images.sh

#!/bin/bash
# Pre-pull script for the Rook v1.15.1 and Ceph v18.2.4 images
set -e
# Image map: [source image]=target image
# If the source and target are identical, the image is pulled and not retagged
declare -A images=(
  ["swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/rook/ceph:v1.15.1"]="docker.io/rook/ceph:v1.15.1"
  # The target must match the image named in cluster.yaml (quay.io/ceph/ceph:v18.2.4)
  ["swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/ceph/ceph:v18.2.4"]="quay.io/ceph/ceph:v18.2.4"
  # If quay.io is reachable, the official image can be used instead
  # ["quay.io/ceph/ceph:v18.2.4"]="quay.io/ceph/ceph:v18.2.4"
)
# Pull and tag with ctr (containerd)
for src in "${!images[@]}"; do
  target="${images[$src]}"
  echo "Pulling source image: ${src}"
  ctr -n k8s.io images pull "${src}" || { echo "Failed to pull ${src}; check the network"; exit 1; }
  # Tag only when the source and target differ
  if [ "${src}" != "${target}" ]; then
    echo "Tagging as: ${target}"
    ctr -n k8s.io images tag "${src}" "${target}"
    # (Optional) remove the source image to save space
    # echo "Removing source image: ${src}"
    # ctr -n k8s.io images remove "${src}"
  fi
  echo "Done: ${target}"
  echo
done
echo "All Ceph images are ready"
echo "Related images currently in the k8s.io namespace:"
ctr -n k8s.io images list | grep -E "rook/ceph|ceph/ceph"

# Deploy
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl -n rook-ceph get pod

# Create the Ceph cluster
# Copy and edit the cluster configuration file
cp ROOK/deploy/examples/cluster.yaml cluster.yaml
vim cluster.yaml
# Set the Ceph version in the cluster spec to v18.2.4

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v18.2.4  # Ceph version to run
  dataDirHostPath: /var/lib/rook  # host directory that stores the Ceph configuration
  mon:
    count: 3  # three monitors for high availability
    allowMultiplePerNode: false
  mgr:
    count: 2  # two managers
  dashboard:
    enabled: true  # enable the Ceph dashboard
  storage:
    useAllNodes: true  # use devices on every node
    useAllDevices: false  # must be false so the system disk is never used
    # Key setting: select disks precisely with deviceFilter
    deviceFilter: "^sd[bc]"  # regex: devices whose name starts with sd followed by b or c
    config:
      osdsPerDevice: "1"  # one OSD per disk
  # Resource limits (optional but recommended)
  resources:
    mon:
      limits:
        memory: "2Gi"
      requests:
        memory: "1Gi"
        cpu: "500m"
    osd:
      limits:
        memory: "4Gi"
      requests:
        memory: "2Gi"
        cpu: "500m"
  placement:
    all:  # applies to every Ceph component (mon, mgr, osd, and so on)
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
      - effect: NoSchedule
        key: node-role.kubernetes.io/master

# Start the toolbox if it is not already running
kubectl create -f toolbox.yaml
# Enter the toolbox Pod
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
# Run ceph commands inside the toolbox
ceph status
ceph osd status
ceph df

# Create a StorageClass for applications to use
# Create the block storage (RBD) pool and StorageClass
kubectl create -f csi/rbd/storageclass.yaml
kubectl get sc

# Dashboard
kubectl edit cephcluster rook-ceph -n rook-ceph

spec:
  dashboard:
    enabled: true  # turn the dashboard on
    ssl: true  # HTTPS by default; keeping it on is recommended
    port: 8443  # default port

kubectl get svc -n rook-ceph | grep dashboard

dashboard-nodeport.yaml:

apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-np
  namespace: rook-ceph
spec:
  type: NodePort
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
    mgr_role: active
  ports:
  - port: 8443
    targetPort: 8443
    nodePort: 30443  # custom port (30000-32767)

# Show the password
kubectl -n rook-ceph get secret rook-ceph-dashboard-password \
  -o jsonpath="{.data.password}" | base64 --decode; echo
# Enter the toolbox
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
# Change the password (substitute your own)
ceph dashboard set-login-credentials admin <your-new-password>
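
Since the cluster spec above relies on the deviceFilter pattern ^sd[bc] to keep the system disk out of Ceph, it can help to check locally which device names that pattern would select before applying cluster.yaml. The following is only a rough sketch: the helper name matches_filter and the sample device list are made up for illustration, and it assumes that grep -E behaves the same as Rook's own regex matching for a pattern this simple.

```shell
#!/bin/bash
# Pattern copied from the deviceFilter field in cluster.yaml.
DEVICE_FILTER='^sd[bc]'

# Hypothetical helper: succeeds when the device name matches the filter.
# Assumption: grep -E is a close enough stand-in for Rook's regex engine here.
matches_filter() {
  printf '%s\n' "$1" | grep -Eq "$DEVICE_FILTER"
}

# Sample device names (illustrative; substitute the names shown by lsblk).
for dev in sda sdb sdc sdd nvme0n1; do
  if matches_filter "$dev"; then
    echo "$dev: selected"
  else
    echo "$dev: skipped"
  fi
done
```

With the sample names, only sdb and sdc are reported as selected, so a system disk at sda is left alone. The pattern is anchored only at the start, so a name such as sdb1 would also match.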