
K8S (7): Problems Deploying Metrics Server

itachi-uchiha



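Earlier, when deploying K8S with the kubeadm tool, I also deployed Metrics, and the process was simple. Later, after deploying K8S in production from binaries, creating the Metrics add-on hit one pitfall after another, so I am recording the troubleshooting process here. For the deployment steps, see "K8S (3): Deploying the metrics-server plugin".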

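To make the problems easier to untangle, here is a topology diagram first (the flanneld network plugin can be replaced with calico).

[Figure: cluster topology diagram]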

 Troubleshooting process

Problem 1: metrics server reports a certificate error on startup: "x509: cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs"

 E0725 05:27:26.638019       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.11.191:10250/metrics/resource\": x509: cannot validate certificate for 192.168.11.191 because it doesn't contain any IP SANs" node="k8s-testing-02-191"
 I0725 05:27:33.495998       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"

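Solution:

Add the argument

        - --kubelet-insecure-tls

or

        - --tls-cert-file=/etc/ssl/pki/ca.pem
        - --tls-private-key-file=/etc/ssl/pki/ca-key.pem

Note that --tls-cert-file and --tls-private-key-file set the certificate that metrics server itself serves. To verify kubelets instead of skipping verification, metrics server also provides --kubelet-certificate-authority, and the kubelet serving certificate then has to list the node IP among its SANs.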
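Before picking a fix for this x509 error, it can help to confirm what the kubelet certificate actually contains. The following is a minimal sketch and not part of the original setup: has_ip_san is a hypothetical helper name, and it assumes its stdin is the text produced by openssl x509 -noout -text.

```shell
# Hypothetical helper (not from the original article): succeeds when the
# certificate text on stdin lists the given IP as a Subject Alternative Name.
has_ip_san() {
  ip_regex=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for grep -E
  grep -A1 'Subject Alternative Name' \
    | grep -qE "IP Address:${ip_regex}(,|\$)"
}

# Example invocation against a kubelet (assumes openssl is installed and the
# node is reachable on port 10250):
#   echo | openssl s_client -connect 192.168.11.191:10250 2>/dev/null \
#     | openssl x509 -noout -text | has_ip_san 192.168.11.191 \
#     && echo "IP SAN present" || echo "IP SAN missing"
```

If the helper reports the IP as missing, the certificate matches the error in the log: metrics server connects by IP, but the certificate carries no IP SAN to validate against.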

Problem 2: metrics server never becomes ready, and the log shows: "Failed to scrape node" err="Get \"https://x.x.x.x:10250/metrics/resource\": context deadline exceeded"

 scraper.go:140] "Failed to scrape node" err="Get \"https://linshi-k8s-54:10250/metrics/resource\": context deadline exceeded" node="linshi-k8s-54"
 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"

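Solution:

Keep --kubelet-preferred-address-types consistent with the kube-apiserver setting of the same name. In the log above, metrics server connected to the kubelet by hostname (linshi-k8s-54) and timed out; listing InternalIP first, as in the attached YAML, presumably makes it use the node IP instead.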
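To see why the order of address types matters, the sketch below mimics the selection in plain shell: it walks the preferred types in order and returns the first address the node reports for one of them. This is only an illustration under assumptions: pick_address is a hypothetical name, the input format (one "Type Address" pair per line) is chosen for this sketch, and the real selection happens inside metrics server.

```shell
# Hypothetical helper: print the first node address whose type appears in the
# comma-separated preference list given as $1. Addresses are read from stdin,
# one "Type Address" pair per line.
pick_address() {
  addrs=$(cat)                                   # node addresses from stdin
  for addr_type in $(printf '%s' "$1" | tr ',' ' '); do
    match=$(printf '%s\n' "$addrs" | awk -v t="$addr_type" '$1 == t { print $2; exit }')
    if [ -n "$match" ]; then
      printf '%s\n' "$match"
      return 0
    fi
  done
  return 1                                       # no preferred type reported
}

# Example: feed it the addresses of one node
#   kubectl get node linshi-k8s-54 \
#     -o jsonpath='{range .status.addresses[*]}{.type}{" "}{.address}{"\n"}{end}' \
#     | pick_address "InternalIP,ExternalIP,Hostname"
```

With Hostname ahead of InternalIP, the result is the node name, which matches the https://linshi-k8s-54:10250 URL in the failing log; with InternalIP first, the node IP is chosen.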

Problem 3: metrics server starts successfully, but kubectl top node fails with: Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

 kubectl top node
 Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

Diagnosis:

#-- Inspect the events of the metrics apiservice
Message:               failing or missing response from 
https://10.254.156.1:443/apis/metrics.k8s.io/v1beta1: Get 
"https://10.254.156.1:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.254.156.1:443: i/o timeout
Reason:                FailedDiscoveryCheck
#-- The request to the ClusterIP of metrics timed out. After setting --enable-aggregator-routing=true on the apiserver, the error became
Message:               failing or missing response from 
https://172.254.247.87:4443/apis/metrics.k8s.io/v1beta1: Get 
"https://172.254.247.87:4443/apis/metrics.k8s.io/v1beta1": dial tcp 172.254.247.87:4443: i/o timeout
Reason:                FailedDiscoveryCheck
#-- Direct access to the endpoint timed out as well
#-- Also: the metrics Service has to expose port 443 (the port the APIService expects by default); changing it to 4443 by hand gave
Message:               service/metrics-server in "kube-system" is not listening on port 443
Reason:                ServicePortError
The cause is that this master cannot reach metrics server. The masters in this deployment run neither kubelet nor kube-proxy. With --enable-aggregator-routing=true set on the apiserver, requests for the metrics API are sent straight to the metrics endpoint, but the master cannot reach the pod network on the nodes (it has no kubelet). Without --enable-aggregator-routing=true, requests go through the ClusterIP of the metrics Service, which is unreachable too, because there is no kube-proxy to forward it (see the topology diagram above).


Solution:

    # Edit the metrics server startup YAML file:
     deployment.spec.template.spec.hostNetwork: true
    # or
    # pin the address of the metrics Service, then add routing rules by hand.
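The two Reason values seen during diagnosis point to different fixes. As a rough aid, the sketch below reads kubectl describe apiservice v1beta1.metrics.k8s.io output on stdin and prints a hint for the first Reason it finds. The name apiservice_hint and the hint wording are hypothetical and cover only the cases described in this article.

```shell
# Hypothetical helper: map the Reason of the metrics APIService condition to a
# hint. Reads `kubectl describe apiservice ...` output from stdin.
apiservice_hint() {
  reason=$(awk '$1 == "Reason:" { print $2; exit }')
  case "$reason" in
    FailedDiscoveryCheck)
      echo "apiserver cannot reach metrics server: check routing from the master to the pod or ClusterIP network, or set hostNetwork: true" ;;
    ServicePortError)
      echo "the metrics-server Service is not exposing the port the APIService expects (443 here)" ;;
    Passed)
      echo "APIService is available" ;;
    "")
      echo "no Reason field found" ;;
    *)
      echo "unhandled reason: $reason" ;;
  esac
}

# Example:
#   kubectl describe apiservice v1beta1.metrics.k8s.io | apiservice_hint
```

After switching the Deployment to hostNetwork: true, rerunning the check is a quick way to see whether the FailedDiscoveryCheck condition has cleared before trying kubectl top node again.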

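metrics server startup parameters

#--- The startup parameters of metrics server can be listed with the following command
docker run --rm 192.168.11.101/library/metrics-server:v0.6.1 --help

--cert-dir=/tmp
#-- Directory where the TLS certificates are stored. Ignored if --tls-cert-file and --tls-private-key-file are set.
--secure-port=4443
#-- Port on which to serve HTTPS with authentication and authorization. If 0, HTTPS is not served. (default 443)
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
#-- List of preferred NodeAddressTypes for kubelet connections. Keep this consistent with the kube-apiserver configuration. (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
--kubelet-use-node-status-port
#-- Use the port from the node status. Takes precedence over --kubelet-port.
--metric-resolution=30s
#-- Interval at which metrics server scrapes the kubelets; must be at least 10s. (default 1m0s)
--kubelet-insecure-tls
#-- Do not verify the CA of the serving certificates presented by Kubelets. For testing purposes only. Without this flag the kubelet serving certificates are verified, so they must list the node address in their SANs; Problem 1 above used --tls-cert-file and --tls-private-key-file as the alternative.
--tls-cert-file
#-- File containing the default x509 certificate for HTTPS. If HTTPS serving is enabled and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir.
--tls-private-key-file
#-- File containing the default x509 private key matching --tls-cert-file.
--kubelet-port
#-- The port to use to connect to Kubelets. (default 10250)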

 Attachment: kube-metric-server.yaml startup file

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        k8s-app: metrics-server
      name: metrics-server
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      labels:
        k8s-app: metrics-server
        rbac.authorization.k8s.io/aggregate-to-admin: "true"
        rbac.authorization.k8s.io/aggregate-to-edit: "true"
        rbac.authorization.k8s.io/aggregate-to-view: "true"
      name: system:aggregated-metrics-reader
    rules:
    - apiGroups:
      - metrics.k8s.io
      resources:
      - pods
      - nodes
      verbs:
      - get
      - list
      - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      labels:
        k8s-app: metrics-server
      name: system:metrics-server
    rules:
    - apiGroups:
      - ""
      resources:
      - nodes/metrics
      verbs:
      - get
    - apiGroups:
      - ""
      resources:
      - pods
      - nodes
      verbs:
      - get
      - list
      - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      labels:
        k8s-app: metrics-server
      name: metrics-server-auth-reader
      namespace: kube-system
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: extension-apiserver-authentication-reader
    subjects:
    - kind: ServiceAccount
      name: metrics-server
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      labels:
        k8s-app: metrics-server
      name: metrics-server:system:auth-delegator
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:auth-delegator
    subjects:
    - kind: ServiceAccount
      name: metrics-server
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      labels:
        k8s-app: metrics-server
      name: system:metrics-server
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:metrics-server
    subjects:
    - kind: ServiceAccount
      name: metrics-server
      namespace: kube-system
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        k8s-app: metrics-server
      name: metrics-server
      namespace: kube-system
    spec:
      ports:
      - name: https
        port: 443
        protocol: TCP
        targetPort: https
      selector:
        k8s-app: metrics-server
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        k8s-app: metrics-server
      name: metrics-server
      namespace: kube-system
    spec:
      selector:
        matchLabels:
          k8s-app: metrics-server
      strategy:
        rollingUpdate:
          maxUnavailable: 0
      template:
        metadata:
          labels:
            k8s-app: metrics-server
        spec:
          containers:
          - args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
            - --kubelet-use-node-status-port
            - --metric-resolution=30s
            - --kubelet-insecure-tls
            # - --tls-cert-file=/etc/ssl/pki/ca.pem
            # - --tls-private-key-file=/etc/ssl/pki/ca-key.pem
            image: HARBOR_HOST_NAME/library/metrics-server:v0.6.1
            imagePullPolicy: IfNotPresent
            livenessProbe:
              failureThreshold: 3
              httpGet:
                path: /livez
                port: https
                scheme: HTTPS
              periodSeconds: 10
            name: metrics-server
            ports:
            - containerPort: 4443
              name: https
              protocol: TCP
            readinessProbe:
              failureThreshold: 3
              httpGet:
                path: /readyz
                port: https
                scheme: HTTPS
              initialDelaySeconds: 20
              periodSeconds: 10
            resources:
              requests:
                cpu: 100m
                memory: 200Mi
            securityContext:
              allowPrivilegeEscalation: false
              readOnlyRootFilesystem: true
              runAsNonRoot: true
              runAsUser: 1000
            volumeMounts:
            - mountPath: /tmp
              name: tmp-dir
            # - mountPath: /etc/ssl/pki
            #   name: cert-dir
          nodeSelector:
            kubernetes.io/os: linux
          priorityClassName: system-cluster-critical
          serviceAccountName: metrics-server
          hostNetwork: true
          volumes:
          - emptyDir: {}
            name: tmp-dir
          # - name: cert-dir
          #   hostPath:
          #     path: /etc/ssl/certs/ca-certs/
    ---
    apiVersion: apiregistration.k8s.io/v1
    kind: APIService
    metadata:
      labels:
        k8s-app: metrics-server
      name: v1beta1.metrics.k8s.io
    spec:
      group: metrics.k8s.io
      groupPriorityMinimum: 100
      insecureSkipTLSVerify: true
      service:
        name: metrics-server
        namespace: kube-system
      version: v1beta1
      versionPriority: 100

Reposted from: 学新通技术网
