Cozystack Troubleshooting Guide

This guide shows the initial steps to check your cluster’s health and discover problems. At the bottom of the page, you will find links to troubleshooting guides for individual Cozystack components and aspects of cluster operations.

Troubleshooting Checklist

You can use the following commands to check the health of your cluster.

# === Flux CD ===
# no HelmReleases should be listed here: only the header line is expected
kubectl get hr -A | grep -v True

# === Kubernetes ===
# all Nodes should be in the Ready state
kubectl get node

# === LINSTOR ===
alias linstor='kubectl exec -n cozy-linstor deploy/linstor-controller -ti -- linstor'

# LINSTOR nodes are online
linstor node list

# LINSTOR storage-pools are Ok
linstor storage-pool list

# You have no broken resources
linstor resource list --faulty

# === Kube-OVN ===
alias ovn-appctl='kubectl -n cozy-kubeovn exec deploy/ovn-central -c ovn-central -- ovn-appctl'

# Check Northbound database
ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound

# Check Southbound database
ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound

# In the output of both commands, make sure that:
# 1. The number of Servers matches the number of your control-plane nodes
# 2. The server IPs are correct
# 3. There are no duplicates (e.g. two servers with the same IP)

# List your control-plane nodes for comparison
kubectl get node -o wide -l node-role.kubernetes.io/control-plane=

# Check that you are not hitting namespace quotas and have resources for an update
kubectl get resourcequota --all-namespaces
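
The OVN database checks above (server count and duplicate addresses) can be partly scripted. The following is a minimal sketch, not part of Cozystack: `check_ovn_servers` is a hypothetical helper, and it assumes that each server line in the `cluster/status` output looks like `<id> (<id> at <proto>:<ip>:<port>) ...`. Verify that pattern against your own output before relying on it.

```shell
# Hypothetical helper: reads `cluster/status` output on stdin, prints the
# number of Raft servers found, and lists any address that appears twice.
# Assumption: server lines look like "<id> (<id> at <proto>:<ip>:<port>) ...".
check_ovn_servers() {
  # Extract the address inside "(<id> at <address>)" from each server line.
  addrs=$(sed -n 's/^[[:space:]]*[^[:space:]]* ([^[:space:]]* at \([^)]*\)).*/\1/p')
  count=$(printf '%s\n' "$addrs" | grep -c .)
  dups=$(printf '%s\n' "$addrs" | sort | uniq -d)
  echo "servers=$count"
  if [ -n "$dups" ]; then
    echo "duplicates:"
    printf '%s\n' "$dups"
    return 1
  fi
}

# Usage (in an interactive shell where the ovn-appctl alias is defined):
#   ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | check_ovn_servers
```

Compare the reported server count with the number of control-plane nodes. A non-zero exit status means that at least one address is listed more than once.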

Additionally, you can check if there are any non-running pods in your cluster:

kubectl get pod -A | grep -v 'Running\|Completed'
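
The `grep` filter above matches on the whole line, so it can miss a pod whose status is Running while not all of its containers are ready. As a sketch under that assumption, the hypothetical helper `unhealthy_pods` below filters by column instead. It expects the default `kubectl get pod -A --no-headers` column order (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: reads `kubectl get pod -A --no-headers` on stdin.
# Prints pods whose STATUS is neither Running nor Completed, plus Running
# pods whose READY column (for example 0/1) shows containers not ready.
unhealthy_pods() {
  awk 'NF >= 4 {
    split($3, ready, "/")
    if (($4 != "Running" && $4 != "Completed") ||
        ($4 == "Running" && ready[1] != ready[2]))
      print $1 "/" $2, $4, $3
  }'
}

# Usage:
#   kubectl get pod -A --no-headers | unhealthy_pods
```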

Getting basic information

You can see the logs of the Cozystack operator by executing:

kubectl logs -n cozy-system deploy/cozystack-operator -f

All the platform components are installed using Flux CD HelmReleases.

You can get all installed HelmReleases:

$ kubectl get hr -A
NAMESPACE                        NAME                        AGE    READY   STATUS
cozy-cert-manager                cert-manager                4m1s   True    Release reconciliation succeeded
cozy-cert-manager                cert-manager-issuers        4m1s   True    Release reconciliation succeeded
cozy-cilium                      cilium                      4m1s   True    Release reconciliation succeeded
cozy-cluster-api                 capi-operator               4m1s   True    Release reconciliation succeeded
cozy-cluster-api                 capi-providers              4m1s   True    Release reconciliation succeeded
cozy-dashboard                   dashboard                   4m1s   True    Release reconciliation succeeded
cozy-fluxcd                      cozy-fluxcd                 4m1s   True    Release reconciliation succeeded
cozy-grafana-operator            grafana-operator            4m1s   True    Release reconciliation succeeded
cozy-kamaji                      kamaji                      4m1s   True    Release reconciliation succeeded
cozy-kubeovn                     kubeovn                     4m1s   True    Release reconciliation succeeded
cozy-kubevirt-cdi                kubevirt-cdi                4m1s   True    Release reconciliation succeeded
cozy-kubevirt-cdi                kubevirt-cdi-operator       4m1s   True    Release reconciliation succeeded
cozy-kubevirt                    kubevirt                    4m1s   True    Release reconciliation succeeded
cozy-kubevirt                    kubevirt-operator           4m1s   True    Release reconciliation succeeded
cozy-linstor                     linstor                     4m1s   True    Release reconciliation succeeded
cozy-linstor                     piraeus-operator            4m1s   True    Release reconciliation succeeded
cozy-mariadb-operator            mariadb-operator            4m1s   True    Release reconciliation succeeded
cozy-metallb                     metallb                     4m1s   True    Release reconciliation succeeded
cozy-monitoring                  monitoring                  4m1s   True    Release reconciliation succeeded
cozy-postgres-operator           postgres-operator           4m1s   True    Release reconciliation succeeded
cozy-rabbitmq-operator           rabbitmq-operator           4m1s   True    Release reconciliation succeeded
cozy-redis-operator              redis-operator              4m1s   True    Release reconciliation succeeded
cozy-telepresence                telepresence                4m1s   True    Release reconciliation succeeded
cozy-victoria-metrics-operator   victoria-metrics-operator   4m1s   True    Release reconciliation succeeded
tenant-root                      tenant-root                 4m1s   True    Release reconciliation succeeded

Normally, every release should have READY set to True and the status "Release reconciliation succeeded".
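
For scripted checks, the READY column can be tested directly instead of grepping the whole line. This is a minimal sketch: `not_ready_helmreleases` is a hypothetical helper, and it assumes the column order shown in the listing above (NAMESPACE, NAME, AGE, READY, STATUS).

```shell
# Hypothetical helper: reads `kubectl get hr -A --no-headers` on stdin.
# Prints every HelmRelease whose READY column is not "True" and exits
# with status 1 if any such release was found.
not_ready_helmreleases() {
  awk 'NF >= 4 && $4 != "True" { print $1 "/" $2 ": " $4; bad = 1 }
       END { exit bad + 0 }'
}

# Usage:
#   kubectl get hr -A --no-headers | not_ready_helmreleases
```

The exit status makes the helper usable as a gate in a maintenance script, for example before starting an upgrade.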

Packages stuck in DependenciesNotReady

If some packages show DependenciesNotReady status:

$ kubectl get pkg -A | grep -v True
NAME                                        VARIANT          READY   STATUS
cozystack.cozystack-basics                  default          False   One or more dependencies are not ready
cozystack.tenant-application                default          False   One or more dependencies are not ready
cozystack.monitoring-application            default          False   One or more dependencies are not ready

This usually means a package in the dependency chain is missing or disabled. To diagnose:

  1. Find the root cause — check the operator logs for "dependency not found" messages:

    kubectl logs -n cozy-system deploy/cozystack-operator | grep "dependency not found"
    

    This will show which dependency is missing, for example:

    dependency not found, marking as not ready  package=cozystack.monitoring-application  dependency=cozystack.postgres-operator
    
  2. Check if you disabled a required package — some packages have dependencies on other packages. If you disabled a package (e.g. cozystack.postgres-operator) that other packages depend on, the entire dependency chain will be blocked.

  3. Fix the issue — either re-enable the disabled package, or if you intentionally want to keep it disabled, add it to ignoreDependencies on the affected package:

    kubectl edit pkg cozystack.monitoring-application
    
    spec:
      ignoreDependencies:
        - cozystack.postgres-operator
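
When many packages are blocked, the operator log can contain the same message many times. The sketch below condenses it to one line per package and missing dependency. `missing_dependencies` is a hypothetical helper, and it assumes that the log lines carry `package=` and `dependency=` fields as in the example message in step 1. If your operator logs in another format, the pattern needs adapting.

```shell
# Hypothetical helper: reads operator log lines on stdin and prints each
# unique "<package> <- <missing dependency>" pair.
# Assumption: lines contain "package=<name>" followed by "dependency=<name>".
missing_dependencies() {
  grep 'dependency not found' |
    sed -n 's/.*package=\([^[:space:]]*\).*dependency=\([^[:space:]]*\).*/\1 <- \2/p' |
    sort -u
}

# Usage:
#   kubectl logs -n cozy-system deploy/cozystack-operator | missing_dependencies
```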
    

Specific Troubleshooting Guides

Cluster Bootstrapping

See the Kubernetes installation troubleshooting guide.

Cluster Maintenance

Remove a failed node from the cluster

See Cluster Maintenance > Cluster Scaling.

Flux CD

See the Flux CD troubleshooting guide.

Kube-OVN

See the Kube-OVN troubleshooting guide.


Troubleshooting etcd

Explains how to resolve etcd problems and errors.

Troubleshooting Flux CD

Explains how to resolve Flux CD errors.

Monitoring Troubleshooting

Guide to diagnosing and resolving issues with monitoring components in Cozystack.

Troubleshooting Kube-OVN

Explains how to resolve Kube-OVN crashes caused by a corrupted OVN database.

Troubleshooting LINSTOR controller crash loops

Explains how to resolve LINSTOR controller problems.

Troubleshooting LINSTOR CrashLoopBackOff related to a broken database

Explains how to resolve LINSTOR CrashLoopBackOff related to a broken database.

Troubleshooting Piraeus custom resources

Explains how to resolve issues with stuck Piraeus custom resources.