Enhance the Longhorn application configuration by adding the `skipCrds` option and retry settings to improve deployment resilience and error handling.
```diff
@@ -15,6 +15,7 @@ spec:
     chart: longhorn
     targetRevision: "1.11.1"
     helm:
+      skipCrds: false
       valuesObject:
         defaultSettings:
           createDefaultDiskLabeledNodes: false
@@ -23,7 +24,12 @@ spec:
     automated:
       prune: true
       selfHeal: true
+    retry:
+      limit: 5
+      backoff:
+        duration: 20s
+        factor: 2
+        maxDuration: 3m
     syncOptions:
     - CreateNamespace=true
     - PruneLast=true
-    - ServerSideApply=true
```
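With `limit: 5` and `factor: 2`, the wait between failed syncs grows geometrically from `duration` up to the `maxDuration` cap. A quick sketch of the resulting schedule, assuming the usual exponential-backoff reading of these fields (start at `duration`, multiply by `factor` after each failed attempt, cap at `maxDuration`):

```shell
# Sketch: retry delays implied by the backoff settings above.
# Assumes delay starts at duration (20s), doubles per attempt, caps at 3m.
delay=20 factor=2 max=180   # duration: 20s, factor: 2, maxDuration: 3m
for attempt in 1 2 3 4 5; do   # limit: 5
  echo "retry $attempt waits ${delay}s"
  delay=$(( delay * factor ))
  (( delay > max )) && delay=$max
done
```

So a transiently failing sync is retried for roughly eight minutes in total before Argo CD gives up and reports the error.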
````diff
@@ -55,6 +55,39 @@ talosctl -n 192.168.50.20 -e 192.168.50.230 health
 kubectl get nodes -o wide
 ```
 
+### `kubectl` errors: `lookup https: no such host` or `https://https/...`
+
+That means the **active** kubeconfig has a broken `cluster.server` URL (often a
+**double** `https://` or **duplicate** `:6443`). Kubernetes then tries to resolve
+the hostname `https`, which fails.
+
+Inspect what you are using:
+
+```bash
+kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}{"\n"}'
+```
+
+It must be a **single** valid URL, for example:
+
+- `https://192.168.50.230:6443` (API VIP from `talconfig.yaml`), or
+- `https://kube.noble.lab.pcenicni.dev:6443` (if DNS points at that VIP)
+
+Fix the cluster entry (replace `noble` with your context’s cluster name if
+different):
+
+```bash
+kubectl config set-cluster noble --server=https://192.168.50.230:6443
+```
+
+Or point `kubectl` at this repo’s kubeconfig (known-good server line):
+
+```bash
+export KUBECONFIG="$(pwd)/kubeconfig"
+kubectl cluster-info
+```
+
+Avoid pasting `https://` twice when running `kubectl config set-cluster ... --server=...`.
+
 ## 6) GitOps-pinned Cilium values
 
 The Cilium settings that worked for this Talos cluster are now persisted in:
````
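The two malformations described above (a doubled `https://` scheme, a duplicated `:6443` port) are easy to check mechanically. A minimal sketch, where `check_server` is a helper name invented here; feed it the output of the `kubectl config view --minify` command shown in that section:

```shell
# Sketch: classify a kubeconfig cluster.server URL per the failure modes above.
# check_server is a hypothetical helper, not part of kubectl.
check_server() {
  case "$1" in
    https://https*|*:6443*:6443*) echo "broken: $1" ;;   # doubled scheme or port
    https://*)                    echo "ok: $1" ;;
    *)                            echo "unexpected scheme: $1" ;;
  esac
}

check_server "https://https//192.168.50.230:6443"
check_server "https://192.168.50.230:6443"
```

The first call reports `broken`, the second `ok`; anything not starting with `https://` falls through to `unexpected scheme`.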
````diff
@@ -134,6 +167,34 @@ Longhorn is deployed from:
 Monitoring apps are configured to use `storageClassName: longhorn`, so you can
 persist Prometheus/Alertmanager/Loki data once Longhorn is healthy.
 
+### Argo CD: `longhorn` OutOfSync, Health **Missing**, no `longhorn-role`
+
+**Missing** means nothing has been applied yet, or a sync never completed. The
+Helm chart creates `ClusterRole/longhorn-role` on a successful install.
+
+1. See the failure reason:
+
+   ```bash
+   kubectl describe application longhorn -n argocd
+   ```
+
+   Check **Status → Conditions** and **Status → Operation State** for the error
+   (for example Helm render error, CRD apply failure, or repo-server cannot reach
+   `https://charts.longhorn.io`).
+
+2. Trigger a sync (Argo CD UI **Sync**, or CLI):
+
+   ```bash
+   argocd app sync longhorn
+   ```
+
+3. After a good sync, confirm:
+
+   ```bash
+   kubectl get clusterrole longhorn-role
+   kubectl get pods -n longhorn-system
+   ```
+
 ### Extra drive layout (this cluster)
 
 Each node uses:
````