OpenShift Bugs / OCPBUGS-56697

VolumeSnapshotContent is in the failed state due to etcd leader changes


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • Affects Version: 4.19
    • Component: Etcd
      Description of problem:

      On an RDR setup, one of the CephFS PVCs (busybox-pvc-2) has stopped syncing and is stuck in the SyncInProgress state.

      The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc.) and install method (IPI/UPI):

      VMware

      The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc): DR

      The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):

      OCP - 4.19.0-0.nightly-2025-05-19-005500

      ODF - 4.19.0-74.konflux

      MCO - 4.19.0-74

      ACM - 2.14.0-DOWNSTREAM-2025-05-08-21-55-29

      Submariner - 962019

      Steps to Reproduce:

      1. On a regular RDR setup, deploy a CephFS-based application
      2. Let the application keep syncing for more than a day
      3. Observe that one of the PVCs stops syncing

      The PVC stops syncing while waiting for the VolumeSnapshot to become ready, which never happens because of the following error on the VolumeSnapshotContent:

      apiVersion: v1
      items:
      - apiVersion: snapshot.storage.k8s.io/v1
        kind: VolumeSnapshotContent
        metadata:
          annotations:
            snapshot.storage.kubernetes.io/deletion-secret-name: cephfs-provisioner-915bf7d6-65a4-4810-839c-058cb83f78aa
            snapshot.storage.kubernetes.io/deletion-secret-namespace: openshift-storage
            snapshot.storage.kubernetes.io/volumesnapshot-being-created: "yes"
          creationTimestamp: "2025-05-21T15:45:11Z"
          finalizers:
          - snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection
          generation: 1
          name: snapcontent-f3acc2bc-7533-4b2d-a9cb-69c06e653602
          resourceVersion: "2236615"
          uid: 37af38a1-8019-4f5b-90e7-fb30f7e30640
        spec:
          deletionPolicy: Delete
          driver: openshift-storage.cephfs.csi.ceph.com
          source:
            volumeHandle: 0001-0011-openshift-storage-0000000000000001-6f6bb8d3-ee3f-49ae-89f4-3422fd3d17e7
          sourceVolumeMode: Filesystem
          volumeSnapshotClassName: ocs-storagecluster-cephfsplugin-snapclass
          volumeSnapshotRef:
            apiVersion: snapshot.storage.k8s.io/v1
            kind: VolumeSnapshot
            name: volsync-busybox-pvc-2-src
            namespace: test-cephfs-c2
            resourceVersion: "2235641"
            uid: f3acc2bc-7533-4b2d-a9cb-69c06e653602
        status:
          error:
            message: 'Failed to create snapshot: error updating status for volume snapshot
              content snapcontent-f3acc2bc-7533-4b2d-a9cb-69c06e653602: snapshot controller
              failed to update snapcontent-f3acc2bc-7533-4b2d-a9cb-69c06e653602 on API server:
              etcdserver: request timed out'
            time: "2025-05-21T15:46:05Z"
          readyToUse: false
      kind: List
      metadata:
        resourceVersion: ""
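      For triage, the stuck object can be spotted programmatically: a VolumeSnapshotContent whose status carries an error, whose readyToUse is false, and whose volumesnapshot-being-created annotation is still set has never completed after the transient etcd timeout. A minimal sketch of that check (the helper name and the trimmed-down inlined object are illustrative, not part of the snapshotter codebase):

```python
# Minimal sketch: flag VolumeSnapshotContent objects that look stuck after a
# transient API-server error (e.g. "etcdserver: request timed out").
# Helper name and example data are illustrative only, derived from this report.

STUCK_ANNOTATION = "snapshot.storage.kubernetes.io/volumesnapshot-being-created"

def find_stuck_contents(items):
    """Return names of contents that errored out and never became ready."""
    stuck = []
    for item in items:
        status = item.get("status", {})
        annotations = item.get("metadata", {}).get("annotations", {})
        if (not status.get("readyToUse", False)
                and "error" in status
                and annotations.get(STUCK_ANNOTATION) == "yes"):
            stuck.append(item["metadata"]["name"])
    return stuck

# Trimmed-down version of the object shown in the YAML above.
example = {
    "metadata": {
        "name": "snapcontent-f3acc2bc-7533-4b2d-a9cb-69c06e653602",
        "annotations": {STUCK_ANNOTATION: "yes"},
    },
    "status": {
        "readyToUse": False,
        "error": {"message": "Failed to create snapshot: etcdserver: request timed out"},
    },
}

print(find_stuck_contents([example]))
```

      The same condition can of course be checked against live objects (e.g. the JSON output of `kubectl get volumesnapshotcontent -o json`); the item structure is identical.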
      

              dwest@redhat.com Dean West
              egershko Elena Gershkovich
              Ge Liu Ge Liu