#Astra-Trident iSCSI login failed


placid plank
#

Hello,
First time here, so please forgive me if I am not posting in the correct place.
Everything below is on-prem.
We are working on deploying Astra/Trident v24.0.6 on our RKE2 v1.28 cluster, running on RHEL 9.4.
We are using iSCSI.
Trident installs successfully and I can create a PVC, but the problem is that we get "iSCSI login failed":
MountVolume.MountDevice failed for volume "pvc-xxxxxxx-xxxxxxxx-xxxxxxxx" : rpc error: code = Internal desc = rpc error: code = Internal desc = failed to stage volume: iSCSI login failed
I have gone through this doc (for iSCSI): https://docs.netapp.com/us-en/trident/trident-use/worker-node-prep.html
I have also restarted the cluster nodes.
So I am a little lost. What should I check to figure out what I missed?

dusky mica
#

This looks like an iSCSI error more than a Trident error. Find a doc that goes through setting up iSCSI on RHEL9. Run through those steps and see if you can manually log into iSCSI from the hosts. Once you can do it manually, Trident will be able to do it too.
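
A minimal sketch of that manual check, assuming the open-iscsi tools (iscsiadm) are installed on the worker node. The portal address and target IQN are placeholders, not values from this thread, and the run helper with its DRY_RUN switch is a hypothetical convenience so the commands can be previewed before being executed as root.

```shell
#!/usr/bin/env bash
# Placeholders (assumptions): replace with your iSCSI LIF address and target IQN.
PORTAL="${PORTAL:-192.0.2.10:3260}"
TARGET_IQN="${TARGET_IQN:-iqn.1992-08.com.netapp:sn.example:vs.1}"

# Hypothetical helper: print the command when DRY_RUN=1 (the default), else run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

manual_iscsi_check() {
  # 1. Ask the portal which targets it offers (SendTargets discovery).
  run iscsiadm -m discovery -t sendtargets -p "$PORTAL" || return 1
  # 2. Log in to the target reported by discovery.
  run iscsiadm -m node -T "$TARGET_IQN" -p "$PORTAL" --login || return 1
  # 3. List active sessions to confirm the login succeeded.
  run iscsiadm -m session || return 1
}

manual_iscsi_check
```

Setting DRY_RUN=0 runs the commands for real and stops at the first one that fails, which shows whether the problem is in discovery or in login.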

magic tartan
#

Did you try to configure one side for CHAP and not the other?

If iSCSI is failing to log in, then either the SVM and/or iSCSI LIFs are down, or you have enabled CHAP on the NetApp but not on the client (or vice versa).

tranquil oak
#

Yeah, and I think ONTAP logs the invalid CHAP logins in the event log too.

placid plank
#

I did go through that NetApp doc. I also tried with and without CHAP, but got the exact same error.

#

# iscsiadm -m discovery -t st -p <target IP>:3260
iscsiadm: retrying discovery login to <target IP>:3260
iscsiadm: connection login retries (reopen_max) 5 exceeded
iscsiadm: Could not perform SendTargets discovery: iSCSI PDU timed out

#

So I need to go back to the NetApp config and double-check it, but I do not have access to that. I am waiting on my storage admin to return to the office.

#

# iscsiadm -m discovery -t st -p <target IP>:3260
iscsiadm: Connection to Discovery Address <target IP> closed
iscsiadm: Login I/O error, failed to receive a PDU
iscsiadm: retrying discovery login to <target IP>

#

I can connect via netcat so the target IP appears to be correct but that's about it

#

# nc -v <target IP> 3260
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to <target IP>:3260.

#

thank you all for feedback on this thread

tranquil oak
#

Is the iSCSI service running on your SVM? Check with "vserver iscsi show"; a stopped service is a common mistake. If it is running, "iSCSI PDU timed out" hints at a network problem: check that you can ping your iSCSI LIFs from the worker nodes, that the iSCSI LIFs have the correct service policy and are in the right VLAN, etc.
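
The network side of that checklist can be sketched from a worker node as below. The LIF addresses are placeholders, and check_tcp is a hypothetical helper built on bash's /dev/tcp redirection and the coreutils timeout command. An open port alone is not proof of a working iSCSI service: earlier in this thread nc connected to port 3260 while discovery still failed.

```shell
#!/usr/bin/env bash
# Placeholder iSCSI LIF addresses (assumptions): replace with your own.
ISCSI_LIFS="${ISCSI_LIFS:-192.0.2.10 192.0.2.11}"

# Hypothetical helper: succeed if a TCP connection to host:port opens within 3s.
check_tcp() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report ICMP and TCP/3260 reachability for every LIF in ISCSI_LIFS.
check_lifs() {
  local lif
  for lif in $ISCSI_LIFS; do
    if ping -c 2 -W 2 "$lif" >/dev/null 2>&1; then
      echo "$lif: ping ok"
    else
      echo "$lif: ping FAILED"
    fi
    if check_tcp "$lif" 3260; then
      echo "$lif: tcp/3260 open"
    else
      echo "$lif: tcp/3260 unreachable"
    fi
  done
}
```

Running check_lifs on each worker node prints two lines per LIF. A LIF that fails both checks points at routing, VLAN, or firewall issues rather than at Trident.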

placid plank
#

In backend.json I have these two:
"managementLIF": "mgmt_IP",
"svm": "svm_IP"
I can ping management LIF IP
I cannot ping SVM IP

I do not have any sort of access on Netapp so I am waiting on my storage admin to get back so he can check

tranquil oak
#

"svm": "svm_ip" <-- this should be the name of the SVM. It actually shouldn't matter, though: if the management LIF is the SVM management LIF, Trident will get the SVM name through that. If it's the cluster management LIF, then you need to specify the SVM name here.
However, in any case, you need to be able to reach the iSCSI LIF IPs (I assume that is what you mean by "SVM IP"?) over the network. So that is probably the first thing you need to fix.

magic tartan
#

One way or another, the backend needs to talk to the cluster mgmt or the svm mgmt. Has to happen. Definitely sounds like a configuration error

magic tartan
#

What I’m getting at is that you either give it the cluster admin IP and then specify the SVM and cluster creds, or you specify the SVM mgmt IP and SVM creds.

#

I typically have done the cluster creds with customers

tranquil oak
#

I usually tell everyone to use SVM admins, not cluster, unless the whole cluster is dedicated to k8s. The credentials are stored in the clear in k8s, and you don't want to leak cluster admin credentials anywhere. It also makes sense from a multi-tenancy POV.
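
For illustration, a backend file following the SVM-scoped approach might look like the sketch below. The field names (version, storageDriverName, managementLIF, svm, username, password) are Trident backend options. Every value is a placeholder, including the SVM name svm_k8s and the management address, and the write_backend helper is hypothetical. Note that svm holds the SVM name rather than an IP address.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a sample ontap-san backend file to the given path.
# All values are placeholders (assumptions), not taken from this thread.
write_backend() {
  local path="$1"
  cat > "$path" <<'EOF'
{
  "version": 1,
  "storageDriverName": "ontap-san",
  "managementLIF": "192.0.2.20",
  "svm": "svm_k8s",
  "username": "vsadmin",
  "password": "CHANGE_ME"
}
EOF
}

write_backend "${BACKEND_FILE:-./backend-example.json}"
```

The resulting file could then be passed to "tridentctl create backend -f backend-example.json -n trident", assuming Trident is installed in the trident namespace.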

placid plank
#

My NetApp storage admin found some missing config, and after he fixed it I am now able to "discover" via the iscsiadm command.

#

In addition to that, now the pods can actually mount the volume

#

thank you all for your inputs and guidance