Rancher: Fix “Failed to list *v1.ClusterRepo” (cannot parse time)

If your cluster agent has suddenly disconnected and you hit this error while troubleshooting, keep reading. After sitting with this issue for far too long, I finally found and fixed the root cause.

In short, we had a cluster where the cluster agent repeatedly failed to connect to Rancher, no matter whether the node was rebooted, reinstalled or recreated. The only errors in the cattle-cluster-agent pod logs were the following:

E1128 22:14:04.027989      39 reflector.go:158] "Unhandled Error" err="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: Failed to watch *v1.ClusterRepo: failed to list *v1.ClusterRepo: parsing time \"2024-01-01T22:42:00\" as \"2006-01-02T15:04:05Z07:00\": cannot parse \"\" as \"Z07:00\"" logger="UnhandledError"
W1128 22:14:46.175887      39 reflector.go:561] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list *v1.ClusterRepo: parsing time "2024-01-01T22:42:00" as "2006-01-02T15:04:05Z07:00": cannot parse "" as "Z07:00"

Finding the cause turned out to be quite simple. If you search the ClusterRepo resources for the offending date string

kubectl get clusterrepo -o yaml | grep -A10 -B10 "2024-01-01T22:42:00"

in one of the ClusterRepo configurations you’d see the following:

  kind: ClusterRepo
  metadata:
    annotations:
      field.cattle.io/description: https://charts.gitlab.io
    creationTimestamp: "2023-02-22T10:53:59Z"
    generation: 2
    name: gitlab
    resourceVersion: "500050112"
    uid: c2ebaa0d-be50-4e8d-ab7b-2f7385148a04
  spec:
    forceUpdate: 2024-01-01T22:42:00
    url: https://charts.gitlab.io
  status:
    conditions:
    - lastUpdateTime: "2023-02-22T10:53:59Z"
      status: "True"
      type: FollowerDownloaded
    - lastUpdateTime: "2024-09-12T21:23:10Z"
      status: "True"
      type: Downloaded
    downloadTime: "2024-09-12T21:23:09Z"
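The grep above only works when you already know the exact timestamp from the log. As a rough sketch, the helper below (find_bad_force_update is a name made up for this example) filters the same YAML output for any forceUpdate value that ends in neither Z nor a +hh:mm / -hh:mm offset. It assumes the field appears on its own line, as in the output shown here.

```shell
# Print every forceUpdate line whose timestamp has no "Z" or "+hh:mm"/"-hh:mm" suffix.
# Reads the output of `kubectl get clusterrepo -o yaml` on stdin.
find_bad_force_update() {
  grep -E '^[[:space:]]*forceUpdate:' \
    | grep -Ev '(Z|[+-][0-9]{2}:[0-9]{2})"?[[:space:]]*$'
}

# Usage:
#   kubectl get clusterrepo -o yaml | find_bad_force_update
```

An empty result should mean that every forceUpdate value already carries a timezone.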

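The field at spec.forceUpdate contains a timestamp without a timezone. The layout in the error message, "2006-01-02T15:04:05Z07:00", is Go's RFC 3339 format, which requires either a trailing Z (UTC) or a numeric offset such as +01:00. Because the value cannot be parsed, listing ClusterRepos fails, which causes the cluster agent to panic before fully starting up. The fix is to add a Z after the forceUpdate time string: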

  forceUpdate: 2024-01-01T22:42:00Z
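If you would rather not open an editor, the same change can probably be applied with a JSON merge patch. This is a minimal sketch, not something the original fix used: patch_force_update and the KUBECTL variable are helpers invented for this example, and the appended Z assumes the original time was meant as UTC.

```shell
# Set KUBECTL=echo to print the command instead of running it (dry run).
KUBECTL="${KUBECTL:-kubectl}"

# patch_force_update NAME TIMESTAMP
# Sets spec.forceUpdate on one ClusterRepo via a JSON merge patch.
patch_force_update() {
  $KUBECTL patch clusterrepo "$1" --type merge \
    -p "{\"spec\":{\"forceUpdate\":\"$2\"}}"
}

# Usage, with the repo name and timestamp from the example above:
#   patch_force_update gitlab 2024-01-01T22:42:00Z
```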

This resolves the parsing problem Rancher has with the value. The same error may be present in multiple ClusterRepos, so check all of them and correct any malformed times in this field:

kubectl edit clusterrepo
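With many repos, editing by hand gets tedious. The sketch below is one possible way to generate the fixes instead: it reads "name<TAB>forceUpdate" pairs and prints a kubectl patch command for each value that lacks a timezone, so you can review the commands before running them. The function name print_force_update_fixes is made up for this example, and appending Z assumes the affected times were meant as UTC.

```shell
# Reads "name<TAB>forceUpdate" lines on stdin and prints one kubectl patch
# command per repo whose timestamp lacks a "Z" or "+hh:mm"/"-hh:mm" suffix.
print_force_update_fixes() {
  while IFS="$(printf '\t')" read -r name ts; do
    [ -n "$ts" ] || continue                      # repo has no forceUpdate set
    case "$ts" in
      *Z|*[+-][0-9][0-9]:[0-9][0-9]) continue ;;  # already has a timezone
    esac
    printf 'kubectl patch clusterrepo %s --type merge -p '\''{"spec":{"forceUpdate":"%sZ"}}'\''\n' "$name" "$ts"
  done
}

# Usage:
#   kubectl get clusterrepo \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.forceUpdate}{"\n"}{end}' \
#     | print_force_update_fixes
```

Once the printed commands look right, piping the output to sh applies them.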

Once done, manually restart one of the cattle-cluster-agent pods, and the agent should reconnect to Rancher.
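As an alternative to deleting a single pod, the whole deployment can be rolled. This sketch assumes the agent runs as the cattle-cluster-agent deployment in the cattle-system namespace, which is where Rancher normally installs it; restart_cluster_agent and the KUBECTL variable are helpers invented for this example.

```shell
# Set KUBECTL=echo to print the command instead of running it (dry run).
KUBECTL="${KUBECTL:-kubectl}"

# Restart all cattle-cluster-agent pods by rolling the deployment.
restart_cluster_agent() {
  $KUBECTL rollout restart deployment/cattle-cluster-agent -n cattle-system
}

# Usage:
#   restart_cluster_agent
```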

© 2024 Cloudyne Systems (Scheibling Consulting AB). All Rights Reserved.