#Trident Backend Config fails to create and still points to older backends even after updating it.


cosmic ore
#

I am trying to create a new backend by running
" ./tridentctl create backend --debug -f backend-anf.json"

but it fails and keeps picking up the older backend's name.
FYI, I deleted the older backend.
I am not sure where it is picking up the old backend's name and config from.

#

Here is the debug message I see when I try to create backend using tridentctl

cosmic ore
#

Could anyone please look at the above message, let me know the reason for the error, and help me resolve the issue?
Thanks

warm solstice
#

The blog at https://mobb.ninja/docs/aro/trident/ says:
you must ensure the Service Principal has the privileges in place. Otherwise you will face an error like this: “error initializing azure-netapp-files SDK client. capacity pool query returned no data; no capacity pools found for storage pool”. One way to avoid this situation is by creating a new custom role (Subscription->IAM->Create a custom role) with all privileges associated in official documentation (netapp for azure) and associate this new role to the cluster’s service principal.

cosmic ore
#

Yes, I did that before creating the Trident backend.
FYI, this is a new cluster.
Even on this cluster I see the same error.

cosmic ore
#

I have assigned the Contributor role to the app that I created for NetApp Files; its name, as shown in the above screenshot, is "netappfiles2".
Are these role assignments sufficient to create volumes?
@warm solstice

warm solstice
#

I spend my time in the Trident application and do not have enough knowledge of ANF to answer that.

cosmic ore
#

Hey @warm solstice
One quick question: what is the difference between trident-csi and trident-node?
Do we have an option to choose trident-csi with OCP v4.12.x and Trident version 23.07.1?

warm solstice
#

I don't really understand your question. CSI is a model and Trident fits in that model. You don't choose trident-csi in place of a different option.

tacit night
#

@cosmic ore, are you simply referring to the name of the Trident pods ?

cosmic ore
#

yes @tacit night
I can see differently named pods in 2 different clusters with the Trident setup.
In the 1st cluster the Trident pods are named "trident-csi-xxxx"
and in the 2nd cluster the Trident pods are named "trident-node-xxxx"

tacit night
#

it is just a naming convention change between 2 different Trident versions. at the end, same thing, new name 🙂

cosmic ore
#

ok Thanks @tacit night

cosmic ore
#

Hi
I followed all the steps and was able to set up Azure NetApp Files storage using the Trident Operator on my Azure Red Hat OpenShift cluster.
Could anyone please let me know how to verify the status of the NetApp storage setup on the ARO cluster?
Thanks

warm solstice
#

Create a backend, storage class and PVC. If it all works then the setup is working.

little comet
#

you can use the provided example files (in the sample-input directory) to check if provisioning works
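
One way to check the result of such a test is to scan `oc get pvc` output for claims that are not `Bound`. The following is only a minimal sketch: the helper name `unbound_pvcs` is hypothetical, and the sample rows are copied from output posted later in this thread.

```python
def unbound_pvcs(oc_get_pvc_output: str) -> list[str]:
    """Return the names of PVCs whose STATUS column is not 'Bound'."""
    lines = [ln for ln in oc_get_pvc_output.splitlines() if ln.strip()]
    names = []
    for line in lines[1:]:  # skip the header row
        cols = line.split()
        # NAME is the first column and STATUS the second; later columns
        # may be blank for Pending claims, so only these two are used.
        if len(cols) >= 2 and cols[1] != "Bound":
            names.append(cols[0])
    return names


# Sample rows taken from the `oc get pvc` output shown in this thread.
SAMPLE = """\
NAME            STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
anf-gold        Bound     pvc-63a27bd8-d852-427f-8e83-57e6dcd6bace   4000Gi     RWX            anf-gold       11d
basic-rbd-pvc   Pending                                                                        basic          4d9h
"""

print(unbound_pvcs(SAMPLE))
```

Any name this returns is worth inspecting with `oc describe pvc <name>` to read its events.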

cosmic ore
#

Yes, I have done the same and everything looks good.
I created 3 capacity pools, 3 storage classes, and 3 PVCs,
so the Trident operator created 3 volumes.
Later, I tried to create a new storage class and PVC, but for some reason the volume is not getting created, and the PVC description shows the following error:

" Normal ProvisioningFailed 23s (x3 over 54s) csi.trident.netapp.io encountered error(s) in creating the volume: [Failed to create volume pvc-574edcf7-dbc9-4610-97c7-401b5d9919f5 on storage pool azurenetappfiles_adac1_pool_2 from backend azurenetappfiles_adac1: volume pvc-574edcf7-dbc9-4610-97c7-401b5d9919f5 is in Deleting state, not Succeeded]
"

#

Does each capacity pool allow only one volume?

#

So I am confused about the concepts of volume provisioning and capacity pools in Azure NetApp Files storage 😕

warm solstice
#

Are there any errors about that volume on the ANF side? Is there enough space to create the volume?
The message makes it seem like the volume started to be created on ANF and then was marked for deletion just as quickly as the create started.

cosmic ore
#

Yes, it initiates the volume creation but fails immediately, so I suspect that only one volume can be attached to each Azure capacity pool?
I created 3 capacity pools and then created 3 PVCs manually, so 3 volumes were created and attached to the 3 pools.
But a 4th volume is not getting created.

echo finch
#

What are the sizes of the 3 pools and the 3 PVCs?

#

Also, if you have not explicitly specified a pool in a backend, or a backend in the storage class, and all backends fit your storage class, Trident will randomly choose the underlying pool, which could explain why you have 1 PVC per pool.
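
To make the selection idea concrete, here is a toy model of it: filter the pools whose labels satisfy the storage class selector, then pick one of the candidates at random. This is a simplified sketch, not Trident's actual code. The pool names and the helpers `matching_pools` and `pick_pool` are hypothetical, and only a single `key=value` selector is handled.

```python
import random

# Hypothetical pools and labels, for illustration only.
POOLS = {
    "pool_gold": {"performance": "gold"},
    "pool_silver_a": {"performance": "silver"},
    "pool_silver_b": {"performance": "silver"},
}


def matching_pools(pools: dict[str, dict[str, str]], selector: str) -> list[str]:
    """Return the pools whose labels satisfy a single 'key=value' selector."""
    key, _, value = selector.partition("=")
    return [name for name, labels in pools.items() if labels.get(key) == value]


def pick_pool(pools: dict[str, dict[str, str]], selector: str) -> str:
    """Pick one matching pool at random, mirroring the behaviour described above."""
    candidates = matching_pools(pools, selector)
    if not candidates:
        raise LookupError(f"no pool matches selector {selector!r}")
    return random.choice(candidates)


print(matching_pools(POOLS, "performance=gold"))
print(pick_pool(POOLS, "performance=silver"))
```

With one matching pool per selector, as in the three storage classes here, every claim for a given class lands in the same pool.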

cosmic ore
#

Each pool size is 4 TiB

[core@bastionNode ~]$ oc describe pvc anf-gold
Name:          anf-gold
Namespace:     test-netapp
StorageClass:  anf-gold
Status:        Bound
Volume:        pvc-63a27bd8-d852-427f-8e83-57e6dcd6bace
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: csi.trident.netapp.io
               volume.kubernetes.io/storage-provisioner: csi.trident.netapp.io
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      4000Gi
Access Modes:  RWX
VolumeMode:    Filesystem
Used By:       api-gold-59d8bbffcb-92lk9
               api-gold-59d8bbffcb-lfl2b
Events:        <none>
#
[core@bastionNode ~]$ oc get pvc
NAME            STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
anf-bronze      Bound     pvc-21e7cb23-4dca-4488-8095-e67a7b8b6156   4000Gi     RWX            anf-bronze     11d
anf-gold        Bound     pvc-63a27bd8-d852-427f-8e83-57e6dcd6bace   4000Gi     RWX            anf-gold       11d
anf-silver      Bound     pvc-01764b15-fdf5-4b36-a2cd-65dc3fa5ece1   4000Gi     RWX            anf-silver     11d
basic-rbd-pvc   Pending                                                                        basic          4d9h
#

Storage class

[core@bastionNode ~]$ oc get sc
NAME                    PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
anf-bronze              csi.trident.netapp.io   Delete          Immediate              true                   11d
anf-gold                csi.trident.netapp.io   Delete          Immediate              true                   11d
anf-silver              csi.trident.netapp.io   Delete          Immediate              true                   11d
azurefile-csi           file.csi.azure.com      Delete          Immediate              true                   11d
basic                   csi.trident.netapp.io   Delete          Immediate              true                   4d9h
managed-csi (default)   disk.csi.azure.com      Delete          WaitForFirstConsumer   true                   11d
[core@bastionNode ~]$ oc describe sc anf-gold
Name:            anf-gold
IsDefaultClass:  No
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"anf-gold"},"mountOptions":["nconnect=16"],"parameters":{"backendType":"azure-netapp-files","fsType":"nfs","selector":"performance=gold"},"provisioner":"csi.trident.netapp.io"}

Provisioner:           csi.trident.netapp.io
Parameters:            backendType=azure-netapp-files,fsType=nfs,selector=performance=gold
AllowVolumeExpansion:  True
MountOptions:
  nconnect=16
ReclaimPolicy:      Delete
VolumeBindingMode:  Immediate
Events:             <none>

tacit night
#

mmmmh, can you show us one PVC manifest?
looks like each PVC takes up the whole pool space

cosmic ore
#

here is one of the PVC

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: anf-silver
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 4000Gi
  storageClassName: anf-silver
#

Do I need to reduce this "storage: 4000Gi"?

tacit night
#

the PVC size & the Pool size are 2 different things
basically, you can create as many PVC as you want until you fill up the Pool (granted, you need to check the minimal size for a PVC)

#

try to delete the existing Silver PVC

#

& recreate one with 1000Gi

#

you should then see that you have 3TB left in the pool in the Azure portal
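
The arithmetic behind this can be sketched as follows, assuming a volume's requested size counts against its pool and that 4 TiB equals 4096 GiB. The 100 GiB minimum is the figure quoted later in this thread with "I believe", so treat it as an assumption. The helper names are hypothetical.

```python
GIB_PER_TIB = 1024
MIN_VOLUME_GIB = 100  # assumption: the minimum size mentioned in this thread


def free_gib(pool_tib: int, volume_sizes_gib: list[int]) -> int:
    """Space left in a pool after subtracting the sizes of its volumes."""
    return pool_tib * GIB_PER_TIB - sum(volume_sizes_gib)


def can_fit(pool_tib: int, volume_sizes_gib: list[int], request_gib: int) -> bool:
    """True if a new volume of request_gib meets the minimum and fits the pool."""
    return MIN_VOLUME_GIB <= request_gib <= free_gib(pool_tib, volume_sizes_gib)


# A 4 TiB pool holding one 4000Gi PVC has 96 GiB left: too small for another volume.
print(free_gib(4, [4000]), can_fit(4, [4000], 100))
# After recreating the PVC at 1000Gi, roughly 3 TiB is free again.
print(free_gib(4, [1000]), can_fit(4, [1000], 100))
```

Under these assumptions, three 4 TiB pools that each hold a 4000Gi volume leave no pool with room for a fourth volume.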

cosmic ore
#

sure
So if we do that and then create a new PVC, will Trident choose a specific pool to allocate the new PVC to?

tacit night
#

if you choose the SILVER storage class, Trident will create a PVC in one of the backends that corresponds to it, in case you have configured several SILVER pools/backends.

#

if you have only one SILVER backend, that links Trident to one single SILVER pool, then the relationship will be 1<=>1<=>1 between SC/Backend/Pool

#

backend & pool are always 1<=>1
however SC & backends are more n<=>n

cosmic ore
#

ok Let me try it once
Thank you @tacit night

cosmic ore
#

As you suggested, I deleted the existing PVC "anf-silver" and created a new PVC with 1000GiB, but the volume provisioning failed:

  Type     Reason                Age                  From                                                                                            Message                                                                    
  Normal   ProvisioningFailed    3m46s                csi.trident.netapp.io                                                                           volume state is Creating, not Succeeded
  Warning  ProvisioningFailed    3m46s                csi.trident.netapp.io_trident-controller-5587878776-qngpx_b0adb16f-fb4d-4c32-93ce-f88183328f36  failed to provision volume with StorageClass "anf-silver": rpc error: code = DeadlineExceeded desc = volume state is Creating, not Succeeded
  Normal   ExternalProvisioning  13s (x18 over 4m6s)  persistentvolume-controller                                                                     waiting for a volume to be created, either by external provisioner "csi.trident.netapp.io" or manually created by system administrator
  Normal   Provisioning          8s (x9 over 4m6s)    csi.trident.netapp.io_trident-controller-5587878776-qngpx_b0adb16f-fb4d-4c32-93ce-f88183328f36  External provisioner is provisioning volume for claim "test-netapp/anf-silver"
  Normal   ProvisioningFailed    3s (x8 over 3m38s)   csi.trident.netapp.io                                                                           volume state is still Creating, not Succeeded
  Warning  ProvisioningFailed    3s (x8 over 3m38s)   csi.trident.netapp.io_trident-controller-5587878776-qngpx_b0adb16f-fb4d-4c32-93ce-f88183328f36  failed to provision volume with StorageClass "anf-silver": rpc error: code = DeadlineExceeded desc = volume state is still Creating, not Succeeded

tacit night
#

what volumes do you currently see in the Azure console?

cosmic ore
#

Sorry, I saw it as provisioning failed, but after some time it got created; I am not sure how.
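
A likely explanation, going by the events above, is a retry pattern: each provisioning attempt that finds the volume still in `Creating` is reported as `ProvisioningFailed`, and a later attempt succeeds once the volume reaches `Succeeded`. The toy model below only illustrates that pattern. It is not the provisioner's or Trident's actual code, and the function name is hypothetical.

```python
from typing import Iterable


def attempts_until_succeeded(states: Iterable[str], max_attempts: int = 10) -> int:
    """Return the 1-based attempt on which the observed state is 'Succeeded'.

    `states` stands in for the volume state seen on each retry. Every attempt
    before the successful one would surface as a ProvisioningFailed event.
    """
    for attempt, state in enumerate(states, start=1):
        if attempt > max_attempts:
            break
        if state == "Succeeded":
            return attempt
    raise TimeoutError("volume never reached Succeeded within the allowed attempts")


# Two attempts see 'Creating' (two failure events), the third one succeeds.
print(attempts_until_succeeded(["Creating", "Creating", "Succeeded"]))
```

In this model the PVC ends up Bound even though its event list still shows the earlier failures.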

tacit night
#

so, all good now.
can you confirm in the Azure console that you see 1TB used in the 4TB pool ?

#

for information, I believe the minimal PVC size is 100GiB

cosmic ore
#

Yes, I can see the 1 TB volumes created, but when I tried to run some validation tests, I could see that the PVCs are not getting created.
I just checked the description of the pod that is in the "Pending" state.
It shows that 0/6 nodes are available.

PVC status:

[root@vc-jnk1 cpd-cli-linux-EE-13.1.0-44]# oc get pvc -n storage-validation-1
NAME               STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-sysbench-rwo   Pending                                      basic          93m
pvc-sysbench-rwx   Pending                                      anf-silver     93m
#

I also just checked the status of the volumes in the Azure console.
For one volume, it shows "No valid mount target":

cosmic ore
#

Hi @tacit night
I created one capacity pool with 4 TiB capacity and service level Ultra in the Azure NetApp Files storage configuration on the ARO cluster, but I have now noticed that one of our service installations is trying to create a volume and failing with this error:

 Normal   ProvisioningFailed    2m33s (x148 over 98m)  csi.trident.netapp.io                                                                           encountered error(s) in creating the volume: [Failed to create volume pvc-d2f099fe-a485-4c41-aebc-3c82ab4fd9e8 on storage pool azurenetappfiles_8c958_pool_0 from backend azurenetappfiles_8c958: volume pvc-d2f099fe-a485-4c41-aebc-3c82ab4fd9e8 is in Deleting state, not Succeeded]
#

Is it because the 4 TiB of pool storage is being exceeded?