#┊・astra🔒 | NetApp | Page 1

twilit wharf Jul 29, 2022, 11:56 AM

#

How can I join this group?

hallow zealot Jul 29, 2022, 11:59 AM

#

You're here!

twilit wharf Jul 29, 2022, 12:05 PM

#

Good Morning. Thank you for the confirmation. I don’t see the history

hallow zealot Jul 29, 2022, 1:00 PM

#

Ah - You're not missing anything. This is a relatively new channel, so there's no history to see yet.

steel marsh Jul 29, 2022, 4:42 PM

#

https://github.com/NetApp/trident/releases/tag/v22.07.0

GitHub

Release v22.07.0 · NetApp/trident

Changes since v22.04.0
Fixes:

Kubernetes: Fixed issue to handle boolean and number values for node selector when configuring Trident with Helm or the Trident Operator. (Issue #700)
Kubernetes: Fix...

violet garden Jul 29, 2022, 6:07 PM

#

Congrats team!

coarse obsidian Jul 29, 2022, 6:22 PM

#

Awesome work!

dusty yacht Aug 3, 2022, 1:51 PM

#

https://netapp.io/2022/08/03/trident-v22-07/

thePub

Trident v22.07: Release Announcement - thePub

The Astra Trident team is pleased to announce our latest build: v22.07. v22.07 is now available, and you can download it from Trident’s GitHub webpage! It includes the following features and enhancements: Per-Node Initiator groups for ontap-san volumes: v22.07 will provision an initiator group (igroup) per Kuber...

woeful otter Aug 10, 2022, 4:13 PM

#

Sorry if this has been asked a million times, but I have no idea how to use Discord. I have ONTAP in AWS set up as HA. Any volumes I create by hand or through Trident seem to be created on both HA servers (which is what I want). About a week ago I lost the primary ONTAP HA server, and none of my existing pods were able to mount using the fallback HA server. I killed the pods, but no luck: they still could not mount the PVs. I deleted the deployment and redeployed the stateful app, and the pod still tried to use the primary ONTAP server. In my backend I do not list any data IPs, as someone from support said to let the system handle that by listing only the main management IP. What am I doing wrong, or is this always a manual step where I have to recreate the K8s deployment and some other backend config?

violet garden Aug 10, 2022, 4:23 PM

#

@dusty yacht can you assist here?

steel marsh Aug 10, 2022, 6:16 PM

#

is the mgmt lif on DNS and if so does that DNS fail over too?

#

or if not on dns does the ip fail over?

woeful otter Aug 10, 2022, 6:18 PM

#

No, it's set to the floating IP

misty cargo Aug 10, 2022, 6:21 PM

#

woeful otter Sorry if this has been asked a million times, but I have no idea how to use disc...

In addition to the above, a few ideas:
1. Is the ONTAP LIF still in failover?
2. Does mounting the PV directly from a worker node succeed while in failover? If not, it won't work with a k8s pod either.
3. If not, troubleshoot connectivity to the failover node.
4. What is the status of the pod? What do the pod describe events show?
5. Run # tridentctl get backend -n trident. Is the backend online? You could try running # tridentctl update backend to resync with ONTAP, then retest.
6. This might be a lot to post here. If further troubleshooting is needed, I suggest opening an ONTAP case and/or a Trident case.
Hope that helps!

violet garden Aug 10, 2022, 6:48 PM

#

We welcome it here, but if it gets into private environment info and specifics, take it to a DM or something a little more private. But in general the troubleshooting info is great to keep public for future searches!

lost prismBOT Aug 11, 2022, 3:13 PM

#

📢 Minor Update

In an effort to standardize naming conventions, we’ve renamed the #trident channel to #┊・astra🔒 in order to encompass support for Astra Control, Astra Data Store, and Astra Trident.

woeful otter Aug 11, 2022, 6:11 PM

#

So I upgraded Trident from 21 to 22.x and wanted to use the new Cloud Manager way of installing it. I was forced to do a full uninstall of Trident and had to delete the trident namespace, because the Cloud Manager Kubernetes install kept failing with a message that the namespace already existed. After that, Cloud Manager installed Trident (I really like this option!) and it all seemed to work great; it even shows me all the volumes inside the Cloud Manager Kubernetes screen. But when I looked at it from tridentctl, I was forced to update the backend (the secrets were gone). That worked, but none of my volumes show up anymore in tridentctl get volumes. Is there a way to restore the existing volumes? Do I need to? I ask because, as cool as the Cloud Manager Kubernetes feature is for keeping Trident updated and easy to install, you still cannot seem to do much with it from that GUI.

misty cargo Aug 11, 2022, 7:57 PM

#

Not sure why CM required a new NS; I would need to check further. For the existing PVs: a new Trident backend has no knowledge of, and does not manage, Trident objects created by a previous backend. To regain management of the existing volumes, use the tridentctl import command: https://docs.netapp.com/us-en/trident/trident-use/vol-import.html#drivers-that-support-volume-import. Import will create new PVCs/PVs and Trident volume objects. Then remove the old PVCs/PVs. This KB provides the steps for this situation: https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/Cannot_mount_Kubernetes_PVC_after_deleted_Trident_namespace. Follow the steps regarding importing the volumes.
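
A minimal sketch of the import step mentioned above. It only assembles a `tridentctl import volume` command line for review and does not run anything. The backend name, volume name, and PVC file (`my_backend`, `my_volume`, `pvc.yaml`) are hypothetical placeholders, and the helper `build_import_command` is made up for illustration.

```python
import shlex


def build_import_command(backend, volume, pvc_file, namespace="trident"):
    """Build the argv for importing one existing volume into Trident."""
    return [
        "tridentctl", "import", "volume", backend, volume,
        "-f", pvc_file,   # PVC definition that will be bound to the imported volume
        "-n", namespace,  # namespace where Trident is installed
    ]


if __name__ == "__main__":
    # Hypothetical names; substitute the real backend and ONTAP volume names.
    cmd = build_import_command("my_backend", "my_volume", "pvc.yaml")
    print(shlex.join(cmd))
```

Printing the command first makes it easy to check each volume name against the storage system before anything is imported.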

strong furnace Aug 15, 2022, 1:02 PM

#

Hi All,
does Trident support K3S?

short kestrel Aug 15, 2022, 3:07 PM

#

@strong furnace Officially, no. Have I had success in my own lab with ontap-nas backend, yes. You can't do any fancy stuff with it, and I wouldn't use it in production, but if you just want a feeling for how Trident works then go for it.

coarse obsidian Aug 15, 2022, 3:13 PM

#

I have had success with it in both x86 and aarch64 (manual build of Trident), but that is just for my home lab. It is good enough for learning and a few bits I do here.

strong furnace Aug 15, 2022, 3:18 PM

#

i have customer that wants to use it in production 😅

violet garden Aug 15, 2022, 3:22 PM

#

You'd be missing the autoscaling components of "normal" k8s and using kubelets instead of kubeadm. It's fine for smaller deployed Edge solutions, but I'd never use it in "core" production.

coarse obsidian Aug 15, 2022, 3:25 PM

#

strong furnace i have customer that wants to use it in production 😅

Drop me some more details and we can discuss with the team Jason.Benedicic@netapp.com

feral moon Aug 15, 2022, 3:29 PM

#

violet garden You'd be missing the autoscaling components of "normal" k8s and using kubelets i...

Depends on what core production for the customer looks like... K3s might be a good starting point to get some apps started on RKE2 (if it's a government customer for instance).

proud basalt Aug 15, 2022, 3:42 PM

#

IHAC running Trident 19.07.1 with two Ontap clusters (Yes, I know - officially not supported) and experiencing very slow performance on one of the clusters. Storage provisioning/deletion or even running 'tridentctl get backend -n trident' can be very slow.
They believe it has to do with the tiering policy as the aggregate is tiering to SG by default. We've turned off tiering for future volume creation, but I'm also looking for other items that could be affecting performance.
The biggest difference I identified is that the slow cluster has 2600 volumes (6-nodes), while the cluster performing as expected has 500.
Would utilizing ontap-nas-economy potentially improve performance by reducing the number of volumes? They do a significant amount of volume creates/deletes. Is there a recommended value of qtrees per volume for best performance?
What is the impact of switching from ontap-nas to ontap-nas-economy?

short kestrel Aug 15, 2022, 5:36 PM

#

@proud basalt If ONTAP data access performance is not good, there is nothing Trident can do for you. Trident only provisions volumes, once the volumes are created, Trident is out of the picture and it's all about the host, ONTAP and network.
Utilizing ontap-nas-economy will help reduce volume count, but you might want to have the performance stats looked at by NetApp Support's performance team and find out if reducing the number of volumes will really help in this scenario.
You can read through https://docs.netapp.com/us-en/trident/trident-use/ontap-nas-examples.html#backend-configuration-options to see some of the differences between the two.
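
A rough back-of-the-envelope sketch of the volume-count reduction discussed above. It assumes every volume on the cluster is a Trident PV and that each FlexVol holds up to 200 qtrees; both are assumptions, so check the `qtreesPerFlexvol` setting of the actual backend before relying on the numbers.

```python
import math


def flexvols_needed(pv_count, qtrees_per_flexvol=200):
    """FlexVols required when each PV is a qtree packed into shared FlexVols."""
    return math.ceil(pv_count / qtrees_per_flexvol)


if __name__ == "__main__":
    # Volume counts quoted in the question: 2600 on the slow cluster, 500 on the other.
    for pvs in (2600, 500):
        print(pvs, "PVs ->", flexvols_needed(pvs), "FlexVols")
```

Whether fewer FlexVols actually fixes the slowness is a separate question that the performance review would have to answer.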

proud basalt Aug 15, 2022, 5:45 PM

#

short kestrel @proud basalt If ONTAP data access performance is not good, there is no...

Thanks Scott. If the large number of Ontap volume creates/deletions is in fact the issue, what would be the best migration method to convert from ontap-nas to ontap-nas-economy? Create a new backend and modify the storage class, or create a new storage class?

short kestrel Aug 15, 2022, 5:57 PM

#

If the number of volumes is the issue then yes, going to ontap-nas-economy will help. There are a number of modifications to a storage class that K8s won't let you make. For that reason and to make it simpler to administer, I would create a new backend and a new storage class. Then you can tell which PVC is using what storage based on the storage class it is using. If you need to move existing data, that is a little more difficult as there is no import with ontap-nas-economy. It would have to be a manual process.
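
A minimal sketch of what the separate storage class could look like, generated as JSON so it can be reviewed before being applied. The class name `ontap-gold-economy` is hypothetical, and the provisioner string is the CSI one seen in logs elsewhere in this channel; older non-CSI installs may use a different provisioner value, so treat that as an assumption.

```python
import json


def economy_storage_class(name):
    """Return a StorageClass manifest that selects ontap-nas-economy backends."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Assumption: CSI Trident; verify the provisioner for your release.
        "provisioner": "csi.trident.netapp.io",
        "parameters": {"backendType": "ontap-nas-economy"},
    }


if __name__ == "__main__":
    # Hypothetical class name, kept distinct from the existing ontap-nas class.
    print(json.dumps(economy_storage_class("ontap-gold-economy"), indent=2))
```

Keeping the new class under its own name leaves existing PVCs on the original class untouched, which matches the point about telling the two apart.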

hallow zealot Aug 15, 2022, 6:05 PM

#

lost prism

drowsy mural Aug 15, 2022, 9:02 PM

#

proud basalt IHAC running Trident 19.07.1 with two Ontap clusters (Yes, I know - officially n...

Have you confirmed 100% there isn't another bottleneck?

proud basalt Aug 15, 2022, 10:04 PM

#

drowsy mural Have you confirmed 100% there isn't another bottleneck?

I've encouraged the storage team to open a case for a performance review of the contention. Currently trident is down for one of the backends, and not provisioning volumes after attempting to update the backend to update credentials.
When running tridentctl logs -a -n trident, we only get 2 files - errors and trident-controller
Error from server (BadRequest): previous terminated container "trident-main" in pod "trident-7c844b9564-t9gdb" not found

time="2022-08-15T21:02:37Z" level=info msg="Storage driver initialized." driver=ontap-nas
time="2022-08-15T21:02:38Z" level=info msg="Created new storage backend." backend="&{0xc421e1f380 ontap-gold true online map[d1_c3_700_8_ssd_data:0xc422301e00 D1_C3_8080_1_ssd_data:0xc422301cc0 D1_C3_8080_2_ssd_data:0xc422301d00 d1_c3_700_5_ssd_data:0xc422301d40 d1_c3_700_6_ssd_data:0xc422301d80 d1_c3_700_7_ssd_data:0xc422301dc0] map[]}"
time="2022-08-15T21:05:25Z" level=info msg="Updated backend satisfies no storage classes." backend=ontap-gold
time="2022-08-15T21:05:25Z" level=info msg="Updated a backend." backend=ontap-gold handler=UpdateBackend

E0815 20:38:44.672221 1InvolvedObject:v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mr-5033", Name:"tms-pvc", UID:"f61efddc-1cd7-11ed-9c2a-005056b04233", APIVersion:"v1", ResourceVersion:"463568843", FieldPath:""}, Reason:"ProvisioningFailed", Message:"no available backends for storage class ontap-gold",

strong furnace Aug 16, 2022, 5:45 AM

#

coarse obsidian Drop me some more details and we can discuss with the team Jason.Benedicic@netap...

i will meet the customer in the next 2 weeks to understand more about the request, thanks!

strong furnace Aug 18, 2022, 4:11 PM

#

Hi team. IHAC who is using ACC to manage different Kubernetes clusters (in internal and external networks). Even though the applications from external clusters can be managed with ACC, external customers can't access the ACC GUI. Is there a way to allow external users to access the ACC GUI, even if they are in a different network, while maximizing security? Thanks in advance

limber fable Aug 19, 2022, 4:29 PM

#

@strong furnace thanks for your question! What you are asking for should be possible, provided the different network allows access to the ACC UI

rough shadow Aug 25, 2022, 3:58 PM

#

Hey everyone - did you know that Astra's LIVE on the Azure Marketplace? 🙂

strong furnace Aug 29, 2022, 3:37 PM

#

Yes! I love It!
But for me the best thing is, that AWS EKS is now supported, too! So the data fabric story lives here, as well!

#

I am in the process of adding an AWS EKS cluster to astra right at the moment. 🤗 👀

strong furnace Aug 30, 2022, 7:56 AM

#

Oops ... still pending since yesterday. I have to investigate why this does not finish.

strong furnace Aug 30, 2022, 11:39 AM

#

Does anyone know what this message could mean:
"Unable to connect to server. Try again later. Unexpected token 'B', "Bad Gateway" is not valid JSON"

steel marsh Aug 30, 2022, 11:41 AM

#

Sounds like there was a 502 Bad Gateway error but the client was expecting JSON data to be returned and tried to parse “Bad Gateway” as JSON
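
A small sketch of the failure mode described above and one way a client can avoid it: check the HTTP status before trying to parse the body as JSON. This is illustrative only; `parse_api_response` is a hypothetical helper, not part of any Astra client.

```python
import json


def parse_api_response(status, body):
    """Return parsed JSON for 2xx responses, or a readable error otherwise."""
    if not 200 <= status < 300:
        # A proxy error page such as "Bad Gateway" is plain text, not JSON.
        return {"ok": False, "error": f"HTTP {status}: {body.strip()}"}
    try:
        return {"ok": True, "data": json.loads(body)}
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc.msg}"}


if __name__ == "__main__":
    print(parse_api_response(502, "Bad Gateway"))
    print(parse_api_response(200, '{"state": "pending"}'))
```

With the status check in place, the user would see the 502 itself rather than a JSON parsing complaint about the letter "B".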

strong furnace Aug 30, 2022, 1:22 PM

#

Good one. But strange. At the moment, the cluster is still pending and cannot be removed, as well.

coarse obsidian Aug 30, 2022, 3:48 PM

#

Could you DM me some more details, account name etc, then I’ll ask if someone on the team can take a look.

cunning crane Aug 31, 2022, 1:55 PM

#

Hi, i have a general question with Backup/Restore of PVC's with Trident. We use the "ontap-nas-economy" driver and use Ontap Storage SnapShots on this volume. The question is how to get single PVC's restored out of those snapshots?

short kestrel Aug 31, 2022, 5:14 PM

#

According to https://docs.netapp.com/us-en/trident/trident-use/vol-snapshots.html the ontap-nas-economy driver does not support snapshots. My expectation is that while taking snapshots works, there is no good way to clone a qtree without cloning the whole volume. Therefore any cloning or restoring of qtrees is a completely manual process that cannot be handled by Trident. Anything Trident could do, it would have to do at the volume level, which reduces storage efficiency and creates possible security risks from the extra data that is not actively being used by the clone/restore.

dusty yacht Aug 31, 2022, 5:51 PM

#

@short kestrel is right about this. The ontap-san-economy driver doesn't have this restriction, as ONTAP provides the ability to snapshot and clone LUNs. The same isn't true for qtrees, which are used to represent the PV in the ontap-nas-economy driver.

peak lantern Aug 31, 2022, 10:05 PM

#

Hi, sorry to bump this issue. But ever since the 22.01 release of Trident we have struggled with multi attach issues when k8s nodes are terminated
https://github.com/NetApp/trident/issues/762

GitHub

PVC attachment timeout when a node is terminated / removed from the...

Describe the bug It looks like this bug #691 or a similar bug is introduced in Trident after the 22.01.1 release. Both the 22.04.0 and the 22.07.0 releases suffer from the same behavior. Environmen...

dusty yacht Sep 1, 2022, 1:09 PM

#

@peak lantern how are the nodes being terminated in your cluster?

twilit wharf Sep 1, 2022, 6:59 PM

#

Team, I am using a pv claimed from NetApp using Trident. I am using this PV to mount the postgres database volume. The pod fails because of permission error. kubectl logs postgres-statefulset-0
chmod: changing permissions of '/var/lib/postgresql/data': Read-only file system
chown: changing ownership of '/var/lib/postgresql/data': Read-only file system

#

I tried to use an init container to modify the permissions, but I'm still getting the same error. Do I need to set any permissions in the Astra storage class settings? If anyone has faced this issue, please guide me.

#

kubectl get pvc postgres-pv-claim
NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
postgres-pv-claim   Bound    pvc-97e891c3-3587-4294-adf1-3dd1c08cd571   5Gi        RWO            netapp-nas     4h1m

#

My container spec

#

containers:
  - name: postgres
    image: postgres:13
    envFrom:
      - configMapRef:
          name: postgres-configuration
    ports:
      - containerPort: 5432
        name: postgresdb
    volumeMounts:
      - name: pv-data
        mountPath: /var/lib/postgresql/data
        readOnly: false
    securityContext:
      runAsUser: 1000
      allowPrivilegeEscalation: true
volumes:
  - name: pv-data
    persistentVolumeClaim:
      claimName: postgres-pv-claim

feral moon Sep 1, 2022, 7:42 PM

#

twilit wharf Team, I am using a pv claimed from NetApp using Trident. I am using this PV to m...

Hello Jerin, welcome to the Astra channel! It appears that user ID 1000 (defined in your securityContext) does not have the necessary permissions to interact with /var/lib/postgresql/data. Does this user have the required privileges? Do you create/use that user in your image (check Dockerfile)? There is an additional parameter in Astra Trident's backend configuration called unixPermissions which by default is very permissive (see more at https://docs.netapp.com/us-en/trident/trident-use/ontap-nas-examples.html). Hope this helps to narrow this down!

twilit wharf Sep 1, 2022, 9:49 PM

#

Thank you, Tim. I will check it

vale abyss Sep 2, 2022, 1:50 PM

#

Not sure if this is the place, but I tried something naughty with Trident and it popped!

#

Tried to move trident-operator from manual install to helm... that bit worked

#

few annotations here and there and a label added and all is good.... but! once I have it upgraded

#

it does upgrade the rest... that bit also works

#

flawlessly

twilit wharf Sep 2, 2022, 1:51 PM

#

feral moon Hello Jerin, welcome to the Astra channel! It appears that user ID 1000 (defined...

I tried with the bitnami postgres image and it's working without issues.

vale abyss Sep 2, 2022, 1:52 PM

#

only after that it dies a miserable death

#

all backendconfigs fail with: Failed to apply the backend update; updating the data plane IP address

#

even if no change has been made to any of the configuration.... probably helm deployment trying to be funny

vale abyss Sep 2, 2022, 4:47 PM

#

OK, dudes... have good and bad news...

#

good news is with few annotations and an extra label moving from operator managed trident to helm chart works fine

#

bad news is you guys fecked up upgrade to 22.07.0 - the moment this one gets applied and backends fail

#

fix it!

#

also, your discord settings vacuum - one can't edit one's message if it contains content blocked by the community facepalm

violet garden Sep 2, 2022, 4:59 PM

#

Appreciate the info, but let's keep it clean in here please. 🙂

vale abyss Sep 2, 2022, 5:12 PM

#

We are clean FeelsBadMan

#

Also there appears to be issue on github that helps... so me again FeelsBadMan

feral moon Sep 6, 2022, 4:21 PM

#

Permissions on PV in containers

hollow fulcrum Sep 9, 2022, 4:09 PM

#

Hi team.
We have installed Trident on ROKS (OpenShift on IBM Cloud).
We are able to create a PVC (volume is created on the NetApp & PVC is in status "bound") but when we try to use it in a POD we have the following error:

Sep  9 07:54:18 kube-c97vfvbf0ju83sm08vhg-pocrokspar0-pocroks-000002bf kubelet.service: I0909 07:54:18.775085   25364 reconciler.go:243] "operationExecutor.AttachVolume started for volume \"pvc-1f7b2616-1884-4597-bada-dc3ffa0733af\" (UniqueName: \"kubernetes.io/csi/csi.trident.netapp.io^pvc-1f7b2616-1884-4597-bada-dc3ffa0733af\") pod \"prometheus-k8s-0\" (UID: \"a17ed383-ef1e-4524-90da-9b59af14d817\") "
Sep  9 07:54:18 kube-c97vfvbf0ju83sm08vhg-pocrokspar0-pocroks-000002bf kubelet.service: E0909 07:54:18.775648   25364 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/csi/csi.trident.netapp.io^pvc-1f7b2616-1884-4597-bada-dc3ffa0733af podName: nodeName:}" failed. No retries permitted until 2022-09-09 07:56:20.775590174 -0500 CDT m=+169274.381362825 (durationBeforeRetry 2m2s). Error: recovered from panic "runtime error: invalid memory address or nil pointer dereference". (err=<nil>) Call stack:

Any ideas on what we are doing wrong?

dusty yacht Sep 9, 2022, 4:42 PM

#

@hollow fulcrum you need to look at the Trident logs. Run tridentctl -n trident logs -a which will create a zip file of all of the Trident logs. You'll want to look at the Trident controller logs and the node log where the volume attachment is being performed. It looks like the above "K8S?" log snippet is missing the node name.

hollow fulcrum Sep 9, 2022, 5:00 PM

#

@dusty yacht this is indeed strange.
I generated logs and on every node we have this:

time="2022-09-08T14:14:52Z" level=warning msg="Could not update Trident controller with node registration, will retry." error="could not log into the Trident CSI Controller: error communicating with Trident CSI Controller; Put \"https://172.21.172.196:34571/trident/v1/node/10.xx.xx.xx\": dial tcp 172.21.172.196:34571: connect: connection timed out" increment=9.439905465s requestID=a7f2bd23-4d95-4c7f-8947-85fd5eab63c2 requestSource=Internal

I'm not sure what to do to fix this but this looks like a first step: what do you think?

dusty yacht Sep 9, 2022, 5:02 PM

#

This is likely a networking issue in your K8S cluster where the Trident daemonset pod isn't able to communicate with the Trident controller. The daemonset pod tries to perform node registration with the Trident controller when it starts up.
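
A minimal sketch of a reachability check for this kind of problem, assuming it is run from a pod or host on the affected node. It only tests whether a TCP connection opens; it does not speak HTTPS or authenticate to the controller. The helper name `can_reach` is made up for illustration, and the address and port are the ones from the warning in the node log above.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Address and port copied from the warning in the node log above.
    print(can_reach("172.21.172.196", 34571))
```

A result of False from one node but True from another would point at node-level networking or policy rather than at the controller itself.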

misty cargo Sep 9, 2022, 5:07 PM

#

Confirm the health of the Trident and K8s pods: kubectl get all -n trident, and kubectl get pods -n kube-system. Check that all containers are starting and that all pods are Running and not restarting. Also confirm the Trident backend is online: tridentctl get backends -n trident.

misty cargo Sep 9, 2022, 5:22 PM

#

https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/How_to_test_connectivity_to_Trident_CSI_Controller_from_a_particular_Kubernetes_node

NetApp Knowledge Base

How to test connectivity to Trident CSI Controller from a particula...

hollow fulcrum Sep 9, 2022, 5:33 PM

#

Thank you so much for those precious inputs. We’ll get back to this debug on Monday. I’ll keep you posted.

hollow fulcrum Sep 12, 2022, 8:09 AM

#

misty cargo confirm the health of Trident and K8s pods. kubectl get all -n trident, and ku...

Hi David, everything is running fine, no restart & backends are online.
Connectivity test using the KB procedure is ... (unexpectedly I must admit) working fine ...
... and now we do not have the warning messages anymore but still the same issue.
We'll perform a clean re-install and keep you guys posted.

cunning crane Sep 13, 2022, 12:21 PM

#

Hi all, can somebody please explain how to migrate existing Trident-managed PVCs from one "old" ONTAP storage system (economy driver) to a new ONTAP storage system with the ontap-nas driver? Is there a written-down path to follow somewhere?

misty cargo Sep 13, 2022, 2:27 PM

#

My understanding is you are migrating data residing in qtrees on old ontap array, to new flexvols on a new ontap array.
Correct?
Am not aware if this scenario is covered in 1 doc, however here are the high-level options I see: (others may suggest better..):

Trident doesn't handle migration of data. The migration will need to be performed outside of Trident.
(ontap-nas-economy: each PVC resides in qtree inside a flexvol. ontap-nas: separate flexvol for each PVC)

For the data migration, 2 options to consider depend on # of qtrees, current active writes, and network considerations between the 2 ontap arrays.

a. If having few qtrees: 
   - Stop active writes. 
   - ndmpcopy copy data in each qtree from old array into new flexvols on the new array.

Or, if large # of qtrees or network speed is a concern, or if these qtrees are actively being written to:

b. - SnapMirror the flexvol with qtrees over to the new array. 
   - Stop all writes to qtrees, run final Snapmirror update.
   - On new array, ndmpcopy copy data in each qtree on new flexvol into new flexvols.

Then use the 'tridentctl import' command to import the new flexvols into a new Trident backend.

Helpful links:

https://docs.netapp.com/us-en/ontap/tape-backup/transfer-data-ndmpcopy-task.html
https://docs.netapp.com/us-en/ontap/data-protection/snapmirror-replication-workflow-concept.html
https://docs.netapp.com/us-en/trident/trident-use/vol-import.html#drivers-that-support-volume-import
https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/Cannot_mount_Kubernetes_PVC_after_deleted_Trident_namespace

NetApp Knowledge Base

Cannot mount Kubernetes PVC after deleted Trident namespace

cunning crane Sep 13, 2022, 3:10 PM

#

thank you David, i will try to setup a migration path for our environment. With this information i should be able to get it done.

strong furnace Sep 13, 2022, 4:45 PM

#

hello, is this the right place for discussing about trident ?

hallow zealot Sep 13, 2022, 4:45 PM

#

Sure is, @strong furnace !

strong furnace Sep 13, 2022, 4:46 PM

#

thanks.

#

I am using Rancher with Trident and I would like to know if I can associate more than one SVM with one Kubernetes cluster

dusty yacht Sep 13, 2022, 5:01 PM

#

@strong furnace you can have multiple backend configurations in Trident. Each backend configuration can specify the SVM to use.
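
A minimal sketch of the idea above: two backend definitions that differ only in name and SVM. The backend names, management address, and SVM names are hypothetical, credentials are deliberately omitted, and a real backend file needs them (or certificate settings) added before it can be used with `tridentctl create backend -f`.

```python
import json


def ontap_nas_backend(name, management_lif, svm):
    """Return a minimal ontap-nas backend definition (credentials omitted)."""
    return {
        "version": 1,
        "storageDriverName": "ontap-nas",
        "backendName": name,
        "managementLIF": management_lif,
        "svm": svm,
    }


if __name__ == "__main__":
    # Hypothetical values: one cluster management address, two SVMs.
    backends = [
        ontap_nas_backend("nas-svm-a", "10.0.0.10", "svm_a"),
        ontap_nas_backend("nas-svm-b", "10.0.0.10", "svm_b"),
    ]
    for backend in backends:
        print(json.dumps(backend, indent=2))
```

Each backend can then be targeted by its own storage class, so workloads choose an SVM by choosing a class.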

peak lantern Sep 13, 2022, 10:28 PM

#

dusty yacht @peak lantern how are the nodes being...

rough zealot Sep 14, 2022, 8:57 AM

#

Hey everyone. I'm running into an issue with an ONTAP FSx filesystem + EKS. When creating a statefulset using VolumeSnapshots or CSI Volume Cloning, the PVC is created immediately (as expected) and shows as bound, but I get warnings about timeouts waiting to mount the volume:

#

```
  Warning  FailedMount             3m9s (x2 over 7m42s)   kubelet                  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[configmap data kube-api-access-jgsbc]: timed out waiting for the condition
  Normal   Pulled                  118s                   kubelet                  Container image "poktnetwork/pocket-core:RC-0.9.0" already present on machine
  Normal   Created                 118s                   kubelet                  Created container init-container
```

#

exactly 10mins into the pods lifecycle, it mounts, and runs.

#

this happens consistently

#

anyone else experienced something similar?

#

this is on a Single AZ ontap

sacred lantern Sep 14, 2022, 11:35 AM

#

rough zealot hey everyone. running into an issue with Ontap FSX filesystem + EKS. When creati...

something more is going on on your K8s, it also fails to mount the configmap, which is a default internal k8s mapping (cert store), does this also happen when you create a pod without pvc?

rough zealot Sep 14, 2022, 1:21 PM

#

No. Which is weird. The only thing I've changed is the storage class (from ebs-csi-driver to trident)

#

The volume that is failing to mount is only the one named data

#

There are three other volumes which are unattached but that's not causing the issue. The time out error is from the netapp fsx provisioned volume

solar wren Sep 14, 2022, 1:38 PM

#

@dusty yacht Can you assist with @rough zealot's issue?
This is very weird. Every time they add a new node to the EKS cluster, it takes 10 min before it can mount volumes (after the node is healthy and available). The mount error msgs are above

sacred lantern Sep 14, 2022, 1:43 PM

#

rough zealot The volume that is failing to mount is only the one named `data`

no additional error in there from kubelet MountVolume.SetUp ?

rough zealot Sep 14, 2022, 2:05 PM

#

I'll take a look now and provide you some logs I find

#

does this mean anything:

#

```
W0914 14:08:53.846958       1 csi_handler.go:189] VA csi-3dde5a96842061dac3672675592e0c75da7bed274d3cd85b165c55e50a62963f for volume vol-0b4795fe907bd952f has attached status true but actual state false. Adding back to VA queue for forced reprocessing
W0914 14:09:53.853182       1 csi_handler.go:189] VA csi-8c08357f4505f093a4ef28c576d72c661689dc0966d5af06deb95806d3da7eb5 for volume vol-081c9d0e5caac0122 has attached status true but actual state false. Adding back to VA queue for forced reprocessing
W0914 14:09:53.853243       1 csi_handler.go:189] VA csi-3dde5a96842061dac3672675592e0c75da7bed274d3cd85b165c55e50a62963f for volume vol-0b4795fe907bd952f has attached status true but actual state false. Adding back to VA queue for forced reprocessing
W0914 14:10:53.856108       1 csi_handler.go:189] VA csi-3dde5a96842061dac3672675592e0c75da7bed274d3cd85b165c55e50a62963f for volume vol-0b4795fe907bd952f has attached status true but actual state false. Adding back to VA queue for forced reprocessing
W0914 14:10:53.856297       1 csi_handler.go:189] VA csi-8c08357f4505f093a4ef28c576d72c661689dc0966d5af06deb95806d3da7eb5 for volume vol-081c9d0e5caac0122 has attached status true but actual state false. Adding back to VA queue for forced reprocessing
```

rancid hearth Sep 14, 2022, 4:50 PM

#

Hey Everyone, I so badly need your help. So I am trying to upgrade trident operator from 21.10 to 22.07 on OpenShift. Post upgrade, I see only one trident-CSI pod is running (there should be 6 trident-CSI pod as I have 6 nodes). I am not sure what is happening. All that I did was, deleted the bundle.yaml and created a new one using 22.07 bundle.yaml

#

#

Here is the error that I see when I do “Oc get events”

#

It was working fine with v21.10 where it had 6 trident CSI pods.
Please help, I’m running out of ideas here

short kestrel Sep 14, 2022, 5:20 PM

#

It looks to me like some Trident CRDs are missing in the environment. What commands did you run to do the upgrade?

misty cargo Sep 14, 2022, 5:36 PM

#

Also check this KB: https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/Trident_install_failing_due_to_clusterrolebinding_not_allowing
and the Trident doc page:
https://docs.netapp.com/us-en/trident/trident-reference/pod-security.html#required-kubernetes-security-context-and-related-fields

NetApp Knowledge Base

Trident install failing due to clusterrolebinding not allowing

rancid hearth Sep 14, 2022, 5:46 PM

#

short kestrel It looks to me like some Trident CRDs are missing in the environment. What comm...

I ran the following commands to install v21.10:
oc create -f deploy/crds/trident.netapp.io_tridentorchestrator_crd_post1.16.yaml
oc create -f deploy/bundle.yaml
oc create -f deploy/crds/tridentorchestrator_cr.yaml
Later, for the upgrade to v22.07, I downloaded the package from GitHub and ran the following commands:
oc delete -f deploy/bundle.yaml (pointed at v22.07; I also tried with v21.10)
oc create -f deploy/bundle.yaml (pointed at v22.07)

#

After this, I noticed the trident operator pod being terminated and a new one created with v22.07. Next, just one trident CSI pod got created

rancid hearth Sep 14, 2022, 6:05 PM

#

OpenShift Version is 4.10

rancid hearth Sep 14, 2022, 6:39 PM

#

any idea, please?

hallow zealot Sep 14, 2022, 6:45 PM

#

@rancid hearth We're all volunteers here. Please be patient.

rancid hearth Sep 14, 2022, 6:46 PM

#

sorry

short kestrel Sep 14, 2022, 6:54 PM

#

@rancid hearth Did you look at the KB article that David suggested? https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/Trident_install_failing_due_to_clusterrolebinding_not_allowing

rancid hearth Sep 14, 2022, 6:56 PM

#

Yeah, I don't have access to read the solution

#

I did go through the second URL which David suggested. I checked the namespace label and it is set to enforce:privilege

hallow zealot Sep 14, 2022, 6:58 PM

#

Do you have a NetApp login account, or are there problems getting a guest account?

rancid hearth Sep 14, 2022, 6:59 PM

#

I can get a guest account, I thought kb pages won't be available for guests. Let me go ahead and create one

hallow zealot Sep 14, 2022, 7:00 PM

#

That particular one just requires a guest account. If you run into trouble with it, let me know.

rancid hearth Sep 14, 2022, 7:00 PM

#

awesome 🙂

misty cargo Sep 14, 2022, 7:39 PM

#

rancid hearth Yeah, I don't have access to read the solution

@vinod: was a custom cluster role used in the previous install, or trident-operator (the default)?
Or were there any other edits to custom YAMLs for the service account, cluster role, etc.?

searching also found this NetApp KB matching the error in your screenshot: https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/The_trident-csi_pods_are_not_rebuilt%2C_erroring_on_security_constraints_in_Openshift

NetApp Knowledge Base

The trident-csi pods are not rebuilt, erroring on security constrai...

rancid hearth Sep 15, 2022, 9:48 AM

#

@misty cargo, I used default trident operator. No changes were made to the YAMLs (for both 21 and 22). The above link has a different error message.

sacred lantern Sep 15, 2022, 11:31 AM

#

rough zealot does this mean anything:

That does not give any hint as to why it is failing. I think it is better to generate a support bundle (tridentctl logs -a -n trident), open a case, and send in the logs for verification.

misty cargo Sep 15, 2022, 4:37 PM

#

rancid hearth @misty cargo, I used default trident operator. No changes were ...

FYI - I sent you a DM in case you need further assistance; otherwise, please open a support case. 😀

rough zealot Sep 15, 2022, 5:50 PM

#

rough zealot hey everyone. running into an issue with Ontap FSX filesystem + EKS. When creati...

so digging further into this @sacred lantern ```I0912 12:44:36.203980 12 event.go:291] "Event occurred" object="pokt-dispatch/data-pokt-dispatcher-fsx-clone2-0" kind="PersistentVolumeClaim" apiVersion="v1" type="Normal" reason="ExternalProvisioning" message="waiting for a volume to be created, either by external provisioner "csi.trident.netapp.io" or manually created by system administrator"```

#

so the delay def seems to be on the trident csi

#

it then binds

#

I0912 12:44:39.058861 12 pv_controller.go:879] volume "pvc-df7f62d8-4633-4966-9a19-2e98b70cfdac" entered phase "Bound"

#

I0912 12:44:39.991234      12 reconciler.go:304] attacherDetacher.AttachVolume started for volume "pvc-df7f62d8-4633-4966-9a19-2e98b70cfdac" (UniqueName: "kubernetes.io/csi/csi.trident.netapp.io^pvc-df7f62d8-4633-4966-9a19-2e98b70cfdac") from node "ip-10-0-10-162.eu-central-1.compute.internal"

sacred lantern Sep 16, 2022, 7:30 AM

#

so digging further into this Daniel

cloud quarry Sep 21, 2022, 8:57 AM

#

Hi, is there any document that talks about the Trident backend with an ONTAP self-signed certificate?
The references that I found are all about CA-signed certificates.

sacred lantern Sep 21, 2022, 11:36 AM

#

cloud quarry Hi Is there any document that is talking about the Trident backend with ONTAP se...

https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/Enable_Certificated-based_Authentication_for_a_Trident_backend

NetApp Knowledge Base

Enable Certificated-based Authentication for a Trident backend

#

Is this what you were looking for?

pine birch Sep 21, 2022, 6:28 PM

#

Hi all! We are currently running rancher with all downstream clusters at k8s v1.20.15 and trident v20.10.1 (operator deployed) for provisioning PV/PVCs. I just upgraded trident in one of our dev/test clusters via the operator-based cluster-scoped upgrade, instead of the namespace-scoped upgrade (20.10.1 to 21.10.1), and it worked without issue. It looks like the only difference is the one extra step of manually creating the tridentorchestrator in the namespace-scoped upgrade? Am I missing something here, or can the operator-based cluster-scoped upgrade procedure be used when upgrading from 20.10.1 to 21.10.1? Upgrade doc ref: https://github.com/NetApp/trident/blob/stable/v21.10/docs/kubernetes/upgrades/operator-upgrade.rst

GitHub

trident/operator-upgrade.rst at stable/v21.10 · NetApp/trident

Storage orchestrator for containers. Contribute to NetApp/trident development by creating an account on GitHub.

misty cargo Sep 22, 2022, 5:32 PM

#

rancid hearth

Note: Issue was with upgrade to Trident 22.07.0 on OCP 4.7. Resolved by upgrading to OCP 4.8. Trident upgrade to v22.07.0 successful. daemonset created. 22.07.0 added support for Pod Security Standards, and OCP 4.7 support expired Aug 22, 2022.

velvet sable Sep 25, 2022, 11:07 AM

#

Using Trident 22.01, how do I configure two Storage Classes with different nfsMountOptions for the same ontap-nas-economy backend? Tried using selectors but did not make it work.

sacred lantern Sep 26, 2022, 9:21 AM

#

velvet sable Using Trident 22.01, how do I configure two Storage Classes with different nfsMo...

using something like the following:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ontapnasudp
provisioner: netapp.io/trident
mountOptions: ["rw", "nfsvers=3", "proto=udp"]
parameters:
  backendType: "ontap-nas"

https://netapp.io/2018/01/25/trident-18-01/

thePub

Trident 18.01 is Here! - thePub

Facebook Twitter LinkedIn Happy (belated) New Year, and welcome to 2018, Pub readers! Over the last few months, despite the holidays, our engineers toiled at their keyboards to bring some […]
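
Applied to the two-StorageClass case asked about above, a minimal sketch could look like the following. It assumes a CSI-based Trident install, where the provisioner is `csi.trident.netapp.io`. The class names and the specific mount options are hypothetical and only illustrate that each class carries its own `mountOptions` while both select the same backend type:

```yaml
# Sketch only: two StorageClasses on the same backend type,
# differing in their NFS mount options. Names are hypothetical.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nas-eco-nfsv3
provisioner: csi.trident.netapp.io
mountOptions:
  - nfsvers=3
parameters:
  backendType: "ontap-nas-economy"
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nas-eco-nfsv41
provisioner: csi.trident.netapp.io
mountOptions:
  - nfsvers=4.1
parameters:
  backendType: "ontap-nas-economy"
```

A PVC then picks the mount behavior simply by naming one class or the other in `storageClassName`.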

velvet sable Sep 26, 2022, 9:25 AM

#

sacred lantern using something like the following: apiVersion: storage.k8s.io/v1 kind: StorageC...

Thanks! That sure definitely works. Is there a way to work with selectors and a different Trident backend that specifies nfsMountOptions? Or is that reserved to specifying virtual storage pools?

sacred lantern Sep 26, 2022, 12:53 PM

#

velvet sable Thanks! That sure definitely works. Is there a way to work with selectors and a ...

yes you can, see:
https://docs.netapp.com/us-en/trident/trident-use/ontap-nas-examples.html#map-backends-to-storageclasses

you are on version 22.07?

velvet sable Sep 26, 2022, 1:04 PM

#

sacred lantern yes you can, see: https://docs.netapp.com/us-en/trident/trident-use/ontap-nas-ex...

I'm on 22.01, thanks!

peak lantern Sep 28, 2022, 10:13 PM

#

I hope I can draw some attention to this issue by posting a link here. It's a major show stopper for our AWS FSx for NetApp adoption rate. Competing storage solutions like Rook Ceph do not suffer the same issue, running in the same cluster.
https://github.com/NetApp/trident/issues/762

GitHub

PVC attachment timeout when a node is terminated / removed from the...

Describe the bug It looks like this bug #691 or a similar bug is introduced in Trident after the 22.01.1 release. Both the 22.04.0 and the 22.07.0 releases suffer from the same behavior. Environmen...

violet garden Sep 28, 2022, 10:51 PM

#

peak lantern I hope I can draw some attention to this issue by posting a link here. It's a ma...

Yes this is definitely the right place for this attention.

#

//cc @elfin verge @dusty yacht

elfin verge Sep 28, 2022, 11:26 PM

#

peak lantern I hope I can draw some attention to this issue by posting a link here. It's a ma...

If you want to send me a DM we can try and figure this out offline

dusty yacht Sep 29, 2022, 3:25 PM

#

@peak lantern, we've recently confirmed that this only happens when a K8S node is terminated while a volume attachment still exists on the K8S worker node. The root cause hasn't been determined yet, which is why the GitHub issue hasn't been updated yet.

peak lantern Sep 29, 2022, 6:33 PM

#

@dusty yacht , yes that's exactly the problem. This happens often in AWS when running on spot nodes. Thanks for confirming the issue.

dusty yacht Sep 29, 2022, 7:06 PM

#

Chuck Fouts0462 yes that s exactly the

cunning fable Sep 29, 2022, 9:34 PM

#

Trident question: if I add iscsi LIFs to an SVM that trident is accessing, how do I tell trident to consider using the new LIFs?

cunning fable Sep 29, 2022, 9:54 PM

#

cunning fable Trident question: if I add iscsi LIFs to an SVM that trident is accessing, how d...

A little reading suggests the answer is "nothing."

short kestrel Sep 30, 2022, 12:08 PM

#

cunning fable A little reading suggests the answer is "nothing."

Correct. There has to be a management LIF specified so that APIs can be sent from Trident to the SVM, but Trident will discover the data LIFs (for either NFS or iSCSI) by querying the SVM.
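
As an illustration, a backend definition along these lines specifies only the management LIF and leaves the data LIFs for Trident to discover. This is a sketch under assumptions: the backend name, IP address, SVM name, and Secret name are placeholders, and the `TridentBackendConfig` form is just one way to define a backend:

```yaml
# Sketch only: ontap-san backend with a management LIF and no dataLIF.
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-ontap-san          # hypothetical name
  namespace: trident               # assumed Trident namespace
spec:
  version: 1
  storageDriverName: ontap-san
  managementLIF: 10.0.0.1          # placeholder address
  svm: svm1                        # placeholder SVM name
  credentials:
    name: backend-ontap-san-secret # hypothetical Secret holding username/password
```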

peak lantern Oct 5, 2022, 9:53 PM

#

Any comments on this issue? Are we doing something wrong, or is it in fact a bug with the EBS CSI drivers?
https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/1417

GitHub

EBS PVCs will not mount on nodes with iSCSI multipath installed · I...

Bug What happened? EBS PVCs fails to mount on hosts where open-iscsi and multipath-tools is installed. How to reproduce it (as minimally and precisely as possible)? On an Ubuntu host execute the fo...

strong furnace Oct 6, 2022, 9:22 AM

#

Hello All, we are using trident for pvc in our k8s clusters and at this time we are testing velero backup, but it does not seem to work well. We can consider alternatives, but we would like to know what we need to add to our infrastructure. Does Astra use a generic S3 for the backup repo, or do we need to acquire NetApp StorageGRID?

coarse obsidian Oct 6, 2022, 10:14 AM

#

We support a range of S3 backends; some are listed here. Note that while we support Generic S3, not all object stores will work, because this depends on how they have implemented the spec: https://docs.netapp.com/us-en/astra-control-center/use/manage-buckets.html

strong furnace Oct 6, 2022, 10:39 AM

#

thanks

pine birch Oct 6, 2022, 8:24 PM

#

Howdy all, we are upgrading our k8s environment that uses Trident and need to settle on a k8s version hopefully either 1.24 or 1.25. Does anyone know when Trident will support either of those? v22.10? Thanks

dusty yacht Oct 6, 2022, 8:49 PM

#

@pine birch, K8S 1.25 was released after Trident 22.07 came out. There are changes that needed to be made to support K8S 1.25 and Trident 22.10 will support K8S 1.25.

#

Trident 22.10 will be released at the end of October

pine birch Oct 6, 2022, 8:54 PM

#

dusty yacht @pine birch, K8S 1.25 was released after Trident 22.07 came out. Ther...

Thanks for the info, @dusty yacht We'll go with v22.07 and target k8s v1.24.

astral dirge Oct 7, 2022, 5:54 AM

#

Hi all, I'm facing some issues of deploying Astra Control Center. When I create ACC instances, it always gets stuck because of a pod "polaris-mongodb-0". Do you have any ideas to resolve?

$ oc get pod -n netapp-acc
NAME                             READY   STATUS        RESTARTS        AGE
acc-helm-repo-844696b68d-d7vz2   1/1     Running       0               49m
influxdb2-0                      1/1     Running       0               48m
loki-0                           1/1     Running       0               48m
nats-0                           1/1     Running       0               48m
nats-1                           1/1     Running       0               48m
nats-2                           1/1     Running       0               48m
polaris-consul-consul-server-0   1/1     Running       0               48m
polaris-consul-consul-server-1   1/1     Running       0               48m
polaris-consul-consul-server-2   1/1     Running       0               48m
polaris-mongodb-0                0/3     Terminating   0               41s
polaris-vault-0                  1/1     Running       7 (2m56s ago)   48m
polaris-vault-1                  1/1     Running       7 (2m56s ago)   48m
polaris-vault-2                  1/1     Running       7 (2m56s ago)   48m
$ oc describe pod -n netapp-acc polaris-mongodb-0
Events:
  Type     Reason                  Age                From                     Message
  ----     ------                  ----               ----                     -------
  Normal   Scheduled               97s                default-scheduler        Successfully assigned netapp-acc/polaris-mongodb-0 to ocp-gn2ns-worker-nkr8r
  Normal   SuccessfulAttachVolume  97s                attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-9549c2ca-7d35-47fb-83cb-8a2ef09304a4"
  Warning  FailedMount             33s (x8 over 97s)  kubelet                  MountVolume.SetUp failed for volume "certs" : secret "tls-polaris-mongodb" not found

astral dirge Oct 7, 2022, 5:55 AM

#

astral dirge Hi all, I'm facing some issues of deploying Astra Control Center. When I create ...

Also here is the ACC manifest that I deployed

kind: AstraControlCenter
apiVersion: astra.netapp.io/v1
metadata:
  name: astra
  namespace: netapp-acc
spec:
  accountName: Example
  additionalValues: {}
  astraAddress: astra.apps.ocp.opt-test.local
  astraResourcesScaler: Default
  astraVersion: 22.08.1-26
  autoSupport:
    enrolled: true
  crds:
    externalCertManager: false
    externalTraefik: false
  email: admin@example.com
  firstName: Yu
  imageRegistry:
    name: east-master.local:8443/netapp/astracc/22.08.1-26
  ingressType: Generic
  lastName: Shimizu
  storageClass: nfs
  volumeReclaimPolicy: Retain

coarse obsidian Oct 7, 2022, 11:06 AM

#

ACC Install Issue

cunning fable Oct 7, 2022, 9:34 PM

#

I've got some iscsi PVCs that were created a couple years ago by trident before we figured out multipathing. Even though the SVM has 4 links available for iscsi, the old PV is only using one. How can I update the PVC to use the added links + the installed and configured multipathing drivers? If I delete the PVC and reimport the volume, would that get me there?

dusty yacht Oct 7, 2022, 9:54 PM

#

I ve got some iscsi PVCs that were

strong furnace Oct 10, 2022, 8:27 AM

#

Hello, I am facing an issue with a velero migration between clusters on trident, and the velero community suggested that I check whether trident supports cross-cluster migration

#

I would like to backup a cluster with velero and restore it on another cluster but the pvc on the restored gave me some errors: kubectl describe pvc mysql-pv-claim -n miko-test-ns

#

Any help, please ?

📎 message.txt

#

Both clusters are using the same svm

dusty yacht Oct 10, 2022, 2:10 PM

#

Any help please

pallid dirge Oct 10, 2022, 7:31 PM

#

Is the trident operator also providing the external provisioner, or is it something that has to be installed on its own?

pallid dirge Oct 10, 2022, 8:23 PM

#

strong furnace Both clusters are using the same svm

BTW the version of trident is 21.10

dusty yacht Oct 10, 2022, 8:23 PM

#

The Trident Operator can install and uninstall Trident. During the install process, required images like the external provisioner are also pulled.

pallid dirge Oct 10, 2022, 8:31 PM

#

dusty yacht The Trident Operator can install and uninstall Trident. During the install proce...

but I do not see any external provisioner container in the trident-csi pod

#

dusty yacht Oct 10, 2022, 8:33 PM

#

It is the csi-provisioner

pallid dirge Oct 10, 2022, 8:42 PM

#

ok thanks

sacred lantern Oct 11, 2022, 6:30 AM

#

strong furnace I would like to backup a cluster with velero and restore it on another cluster b...

@strong furnace , you can import the volume in the other cluster, is that what you are looking for?
See:
https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/How_to_attach_the_same_Trident_created_PVC_to_multiple_k8s_clusters

#shamelessplug but you can also look at :https://cloud.netapp.com/astra-control

NetApp Knowledge Base

How to attach the same Trident created PVC to multiple k8s clusters

NetApp Astra Control

Project Astra is a Kubernetes application data lifecycle management service for stateful applications across public clouds & your on-premises data center.

fallow niche Oct 11, 2022, 8:15 AM

#

Can Astra Datastore provide a single NFS namespace across 2 regions in AWS (with a single mount point)?

sacred lantern Oct 11, 2022, 10:09 AM

#

fallow niche Can Astra Datastore provide a single NFS namespace across 2 regions in AWS (with...

@fallow niche , can you elaborate on that question?

fallow niche Oct 11, 2022, 10:19 AM

#

sacred lantern @fallow niche , can you elaborate on that question?

Hi Daniel, thanks for answering. My thinking is the following. Imagine 2 regions in AWS. Imagine that we create a volume on FSxO per site. Is there a way to have the 2 volumes seen as a single mount point? The idea is to have an active/active NFS export where EKS can write on both sites. I wonder if Astra Datastore would be able to do that.

covert trellis Oct 11, 2022, 11:46 AM

#

If you are using Astra Trident, or have customers who are using Astra Trident, please register / invite them to register for our next Webinar. It will be the first part of a multiple part series. An overview / intro to the product from a support perspective:

Knowledge and Know-how with NetApp Support - Episode 8: Astra Trident

https://netapp.zoom.us/webinar/register/WN_z2xar1GOSDepusAzslghJw?mkt_tok=MDExLVRXSy02MzYAAAGHYVV6EmaVGuX6AyxgDDJP6vPbsNU6MzdXgKSsOMsb4zhZNRU7RMjJoAKyCYzH6soDN5sdgzuVPznAwGeQjH8

Zoom Video Communications

Welcome! You are invited to join a webinar: Knowledge and Know-how ...

We are in love with the cloud, and we want the whole world to know it, so we’ve got an exclusive and exciting webinar coming up in our Knowledge and Know-how with NetApp Support series: Episode 8: Intro to Astra Trident hosted by:

Shivanjali Pothan, Technical Support Engineer II
David Crosson, Escalation Support Engineer
Scott Stanton, Seni...

strong furnace Oct 11, 2022, 3:32 PM

#

sacred lantern @strong furnace , you can import the volume in the other cluster, is that ...

I can try

sacred lantern Oct 12, 2022, 3:34 PM

#

fallow niche Hi Daniel, thanks for answering. My thinking is the following. Imagine 2 regions...

Hi Christian, think this better discussed in a meeting, to go over the requirements that you are having and maybe suggest a few possible solutions, would that be OK?

strong furnace Oct 13, 2022, 2:35 PM

#

sacred lantern @strong furnace , you can import the volume in the other cluster, is that ...

Hello Daniel, I tried to follow the kb you sent me but it does not solve the issue. To recap what's happening: I create a velero backup on a cluster (A) and it works. I have a cluster (B) which is using the same svm used by Cluster A. When I restore the velero backup on cluster B it creates the pvc and pv and they are in bound state, but no deployment can use them (volume attach failed). I think this is because the tridentctl command does not show the related volumes. So I tried to import the volumes on cluster B with tridentctl and it works, but it is a manual trick because I have to modify my restored deployments. I wonder if velero does not call some trident api during the restore phase or if something is missing in trident.

fallow niche Oct 13, 2022, 3:46 PM

#

sacred lantern Hi Christian, think this better discussed in a meeting, to go over the requireme...

Hi @sacred lantern. It is just a question out of curiosity. Customer is requiring a single NFS namespace across Regions (not sure this would be OK performance-wise, since the RTT must be low) and I was looking for solutions. Of course we are suggesting volume cross-replication, but I was looking for other possible solutions and could not find much on Astra Datastore regarding this particular requirement. Thanks anyway Daniel.

sacred lantern Oct 14, 2022, 6:52 AM

#

strong furnace Hello Daniel, I tried to follow the kb you sent me but it does not solve the iss...

I would not dare say it is one or the other without a full investigation, but yes, the backup application should request the volume to be imported when it is not there, or maybe it is a pre-req. So, just to confirm: the manual import, with the pod config changed to match the import, was working for you, right?

sacred lantern Oct 14, 2022, 6:57 AM

#

fallow niche Hi @sacred lantern. It is just a question out of curiosity. Customer is r...

Yes we were thinking along FSxN MAZ setup as well, or maybe some other thing, our cloud team wanted to discuss this to understand the business need and possible future enhancements of our products, based on your customer's requirements. On ADS, I never tested a MAZ cluster.

strong furnace Oct 14, 2022, 8:36 AM

#

sacred lantern I would not dare say it is one or the other without a full investigation, but ye...

Hello, we are going to try the manual import before restoring. I will keep you updated

strong furnace Oct 14, 2022, 10:49 AM

#

sacred lantern Yes we were thinking along FSxN MAZ setup as well, or maybe some other thing, ou...

I can confirm that importing volumes before restoring works fine, so velero misses this phase
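
A sketch of the PVC definition that could be handed to the manual import step described above, for example with `tridentctl import volume <backendName> <volumeName> -f <pvc-file>`. The PVC name and namespace are taken from the earlier `kubectl describe pvc` command; the storage class name and access mode are placeholders and would need to match the target cluster:

```yaml
# Sketch only: PVC used for a Trident volume import on cluster B,
# created before running the velero restore.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mysql-pv-claim
  namespace: miko-test-ns
spec:
  accessModes:
    - ReadWriteOnce                      # assumed access mode
  storageClassName: my-storage-class     # placeholder StorageClass name
```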

#

The velero community confirmed that trident is not supported by velero

#

Which CSI drivers are supported by astra control?

#

at this time we are working with longhorn and trident

#

I presume astra control is made for working only with netapp

tardy orchid Oct 14, 2022, 12:59 PM

#

Astra Control Center only works with CSI Trident, you are correct

#

however Astra Control Service also supports: GCP GPD, Azure AMD & AWS EBS

coarse obsidian Oct 14, 2022, 1:38 PM

#

If you have use cases you’d like us to evaluate then we can take it back up to product management. As Yves said above in our managed service we have support for cloud native disks via CSI. These are what we’ve tested so far.

strong furnace Oct 14, 2022, 3:45 PM

#

tardy orchid however Astra Control Service also supports: GCP GPD, Azure AMD & AWS EBS

thanks

naive flax Oct 20, 2022, 8:13 AM

#

Hi all..
I'm new to kubernetes and trident.. But since Ansible AWX now requires kubernetes, we have set up a k3s cluster with 2cp's and 2 workers, installed trident v22.07.0 and AWX 0.28.0.
Everything seems to be working correctly initially. Ansible AWX has its projects dir and internal postgresql database on pvc's managed by trident.
But sometimes playbooks suddenly stop running and finally fail with an Error with no further detail. The awx automation-job pod, just stops logging and after some timeout it is removed.
The only thing I see, at the same time that the pod stops working, is the following for the trident-csi pod on that worker:

Liveness probe failed: Get "https://***.***.***.***:17546/liveness": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

and

 Readiness probe failed: Get "https://***.***.***.***:17546/readiness": net/http: request canceled (Client.Timeout exceeded while awaiting headers)

pointing to the IP of the worker where the awx automation-job is running on.
In the trident-csi pod on that worker, I see:

2022/10/20 07:11:15 http: TLS handshake error from ***.***.***.***:49360: EOF
2022/10/20 07:11:19 http: TLS handshake error from ***.***.***.***:49374: EOF
2022/10/20 07:11:46 http: TLS handshake error from ***.***.***.***:58240: EOF
2022/10/20 07:11:50 http: TLS handshake error from ***.***.***.***:58256: EOF

But I have no idea how to troubleshoot any further. Why this happens, why many playbooks do run and finish correctly but some don't, with this behaviour.

Can anyone here help me with this ?

naive flax Oct 20, 2022, 9:28 AM

#

I reinstalled trident using tridentctl -d to add more debugging. But that does not give any extra clues..

time="2022-10-20T09:18:26Z" level=debug msg="<<<< filesystem_linux.GetFilesystemStats" requestID=2a5d51b3-bfba-460c-bb27-022faae9d4e6 requestSource=CSI
time="2022-10-20T09:18:26Z" level=debug msg="GRPC response: usage:<available:2152660992 total:4295032832 used:2142371840 unit:BYTES > usage:<available:124242 total:131072 used:6830 unit:INODES > " requestID=2a5d51b3-bfba-460c-bb27-022faae9d4e6 requestSource=CSI
2022/10/20 09:18:38 http: TLS handshake error from ***.***.***.***:51104: EOF
2022/10/20 09:18:49 http: TLS handshake error from ***.***.***.***:47786: EOF
2022/10/20 09:18:49 http: TLS handshake error from ***.***.***.***:47790: EOF
time="2022-10-20T09:18:57Z" level=info msg="Shutting down."
time="2022-10-20T09:18:57Z" level=info msg="Deactivating plain CSI helper frontend."
time="2022-10-20T09:18:57Z" level=info msg="Deactivating CSI frontend." requestID=b3edee06-b774-4bed-8015-f4eae763533a requestSource=Internal
2022/10/20 09:18:58 http: TLS handshake error from ***.***.***.***:59646: EOF
time="2022-10-20T09:19:17Z" level=debug msg="Transaction monitor stopped."
time="2022-10-20T09:19:17Z" level=info msg="Deactivating HTTPS REST frontend." address=":17546"
time="2022-10-20T09:19:17Z" level=info msg="Stopping periodic node access reconciliation service." requestID=0c665dd9-aa62-4dbf-9e5a-4bffa544d8dd requestSource=Periodic

At the point where this pod is terminated and restarted due to the failing health probes, the awx-automation-job pod starts hanging, and after the timeout the awx job fails.

naive flax Oct 20, 2022, 11:21 AM

#

alright. I found out the trident pods have a livenessprobe and readinessprobe configured with a timeout of 1 sec. And deriving from the fact that those probes seem to work most of the time, but not when some Ansible playbooks are executed, I'm assuming that the pods are too slow in responding to the probes, hence the timeout in k3s and the EOF on the pods.

But how do I change/customize the timeouts of those probes in the trident pods ?

dusty yacht Oct 20, 2022, 11:44 AM

#

I don't think that it is the timeout on the liveness probes unless your K8S cluster is running with very low resources.

#

More than likely it is a connectivity issue on that node where the kubelet is unable to reach the liveness probe port. The liveness probe is basically a heartbeat status operation that takes very little time. 1s is the K8S default and is more than enough time in most situations.

naive flax Oct 20, 2022, 12:13 PM

#

I managed to increase the timeout to 10s using tridentctl --generate-custom-yaml and --use-custom-yaml .. and now the trident pods keep on running without errors during such a playbook..
But the playbook itself still suddenly hangs 😕 and gets killed after some time.. now without any further lead to what could be wrong..
The workers have 2CPU's and 16G ram.
After increasing the workers CPU's to 4.. the playbook seems to finish correctly..it seems that the awx automation-job is quite resource hungry, as those workers don't run anything else beside trident and rancher agents..
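
For reference, a sketch of what the edited probe section might look like in the generated custom YAML. The port and paths come from the probe errors quoted earlier (17546, `/liveness`, `/readiness`, over HTTPS); the container name and the overall layout are assumptions and are not copied from Trident's actual manifest:

```yaml
# Sketch only: fragment of a pod spec. The one change discussed here
# is timeoutSeconds, raised from the Kubernetes default of 1.
containers:
  - name: trident-main        # assumed container name
    livenessProbe:
      httpGet:
        path: /liveness
        port: 17546
        scheme: HTTPS
      timeoutSeconds: 10
    readinessProbe:
      httpGet:
        path: /readiness
        port: 17546
        scheme: HTTPS
      timeoutSeconds: 10
```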

dusty yacht Oct 20, 2022, 12:16 PM

#

@naive flax , you may want to ask about the CPU load in #╭・ansible🔒 in that case. It sounds like Trident is working correctly. Again 1s should be more than enough for a heartbeat operation.

naive flax Oct 20, 2022, 12:43 PM

#

dusty yacht @naive flax , you may want to ask about the CPU load in ...

I've now reset the timeout values for the probes on the trident-csi pods, and indeed, with 4 CPU's in the workers, the pods still keep on running correctly now.
I'll play with the number of forks for the ansible jobs, which seem to default to 5.. to decrease the resource hungriness of it..
Thanks anyway.

cloud quarry Oct 21, 2022, 8:45 AM

#

In the link below
https://docs.netapp.com/us-en/trident/trident-docker/volume-driver-options.html#ontap-volume-options
"unixPermissions" is listed as NFS only.
If I want to change the permissions to 777 for iSCSI,
how can I do that?
I saw UnixPermissions in the storage driver source code.
https://github.com/NetApp/trident/blob/b69aef94a369d1648225ff43f9537bbe7ee114bd/storage_drivers/ontap/ontap_san.go

GitHub

trident/ontap_san.go at b69aef94a369d1648225ff43f9537bbe7ee114bd · ...

Storage orchestrator for containers. Contribute to NetApp/trident development by creating an account on GitHub.

peak lantern Oct 21, 2022, 12:55 PM

#

Hi. I'm using the Trident Operator Helm chart to deploy Trident CSI. Is it possible to define resource requests and limits for the provisioner pods and CSI pods?

formal ingot Oct 25, 2022, 8:51 AM

#

Any official release date of Trident 22.10? 😃
Really looking forward to this feature getting included: https://github.com/NetApp/trident/issues/672

dusty yacht Oct 25, 2022, 12:58 PM

#

formal ingot Any official release date of Trident 22.10? 😃 Really looking forward to this fe...

The Trident 22.10 release is expected to be out by 10/31/22. 🎃

tulip ginkgo Oct 27, 2022, 2:11 PM

#

Is there any way to specify/configure the storage efficiency of the volumes created by the ontap drivers in Trident, especially the nas variants? If you are using an AFF you get it automatically, but what if you have a FAS system?

solar wren Oct 27, 2022, 2:21 PM

#

tulip ginkgo Is there any way to specify/configure the storage efficiency of the volumes crea...

I don't believe you can. These are the available backend options for ONTAP NAS
https://docs.netapp.com/us-en/trident/trident-use/ontap-nas-examples.html?q=dedup#backend-configuration-options

formal ingot Nov 1, 2022, 10:24 AM

#

dusty yacht The Trident 22.10 release is expected to be out by 10/31/22. 🎃

Hmm.. I don't see any mention of Ontap NAE in the Trident 22.10 changelog, is it in there?

dusty yacht Nov 1, 2022, 12:55 PM

#

@formal ingot, this topic is covered in the Use Astra Trident with NVE and NAE section of the security documentation. https://docs.netapp.com/us-en/trident/trident-reco/security-reco.html

formal ingot Nov 1, 2022, 1:00 PM

#

dusty yacht @formal ingot, this topic is covered in the Use Astra Trident with NVE ...

Sweet, thank you!

dusty yacht Nov 1, 2022, 4:18 PM

#

tridentntap Astra Trident v22.10 Release

The Trident v22.10 release is now available!

🚨 Critical Information 🚨
IMPORTANT: Kubernetes 1.25 is now supported in Trident. Please upgrade Trident prior to upgrading Kubernetes.
IMPORTANT: Trident will now strictly enforce the use of multipathing configuration in SAN environments, with a recommended value of find_multipaths: no in multipath.conf file. Use of non-multipathing configuration or use of find_multipaths: yes or find_multipaths: smart value in multipath.conf file will result in mount failures. Trident has recommended the use of find_multipaths: no since the 21.07 release.

Read the release announcement to find out about new Trident capabilities in v22.10.
https://netapp.io/2022/11/01/astra-trident-v22-10/

Download the release and read about fixes, enhancements, and deprecations in the changelog available on GitHub.
https://github.com/NetApp/trident/releases/tag/v22.10.0

As always, find detailed information for any Astra Trident version in our documentation.
https://docs.netapp.com/us-en/trident/index.html

limber fable Nov 1, 2022, 6:27 PM

#

Hello all! Are you interested in learning more about Kubernetes, Astra Trident, and Astra Control? Take a look at these curated courses from NetApp Learning Services!
If you would like to enroll, please use the links below!

Course title: Kubernetes Administration
Enrollment link: https://netapp.sabacloud.com/Saba/Web_spf/NA1PRD0047/app/me/learningeventdetail/cours000000000045318

Course title: Using Astra Trident with Kubernetes
Enrollment link: https://netapp.sabacloud.com/Saba/Web_spf/NA1PRD0047/app/me/learningeventdetail/cours000000000045559

Course title: Using Astra Control with Kubernetes
Enrollment link: https://netapp.sabacloud.com/Saba/Web_spf/NA1PRD0047/app/me/learningeventdetail/cours000000000046623

📎 Astra_Control_with_Kubernetes.pdf 📎 Astra_Trident_with_Kubernetes.pdf 📎 Kuberneters_administration.pdf

wind mason Nov 4, 2022, 2:36 AM

#

I'm excited to know Astra Trident v22.10.0 is now available, and I found that it added a new operator YAML (bundle_post_1_25.yaml). If I am going to deploy Astra Trident v22.10.0 with the Trident operator in an OCP 4.10 environment, can I use bundle_post_1_25.yaml so that the configuration is ready to support K8S 1.25 in the future?
Or should I still use bundle_pre_1_25.yaml for now, until OCP is upgraded to 1.25 or later, and then deploy the Trident operator with bundle_post_1_25.yaml afterwards?
Thanks for your support.

random gale Nov 4, 2022, 11:27 AM

#

hey, has anyone changed/migrated QOS policies with trident, we want to migrate a bunch of iscsi luns to new QOS policies but we're unsure on the impact of this. Would it require new backends or can we migrate without new backends? Ideally we'd rename the old qos policies move the luns to the new qos policies with the existing backend QOS policy name on the netapp but I am unsure what would happen to those existing objects before we moved them to the "new" policy

dusty yacht Nov 4, 2022, 9:27 PM

#

wind mason It’s excited to know Astra Trident v22.10.0 is now available and I found that it...

Hi @wind mason, you do want to use the pre 1.25 bundle until OCP supports K8S 1.25. If Red Hat follows previous release patterns, I'd expect to see an OCP release in 01/2023 that supports K8S 1.25.

dusty yacht Nov 4, 2022, 9:35 PM

#

random gale hey, has anyone changed/migrated QOS policies with trident, we want to migrate a...

@random gale, for volumes that are already created there isn't a way to update the QOS policy that has been assigned to those volumes. However, the qosPolicy and adaptiveQosPolicy parameters in the backend configuration are only used when the volume is created. So you should be able to migrate existing volumes to a new QOS policy without changing the backend configuration file. I do recommend that you test this first on a few temporary LUNs to verify that it will work as you want it to work.

wind mason Nov 5, 2022, 2:41 AM

#

dusty yacht Hi @wind mason, you do want to use the pre 1.25 bundle until OCP supp...

Thanks for the reply.

random gale Nov 7, 2022, 2:45 PM

#

dusty yacht @random gale, for volumes that are already created there isn't a way to...

thanks for the response, we will test this out

random gale Nov 7, 2022, 3:13 PM

#

tested and can confirm this works as described above

long yoke Nov 7, 2022, 5:47 PM

#

Hi we see "CSINode wrkra4 does not contain driver csi.trident.netapp.io" when trying to attach a volume. After some googling, we added --kubelet-dir /opt/rke/var/lib/kubelet to tridentctl install, but we are still getting the same error. "kubectl get ds -n trident trident-csi -o json" shows it still uses /var/lib/kubelet. We use Trident 22.07.0 and k8s v1.20.15. Thanks!

sacred lantern Nov 8, 2022, 1:22 PM

#

long yoke Hi we see "CSINode wrkra4 does not contain driver csi.trident.netapp.io" when tr...

what does the following in the .spec.drivers section give you?
kubectl get csinode wrkra4 -o yaml

have a look at:
https://kb.netapp.com/Advice_and_Troubleshooting/Cloud_Services/Astra_Trident/Trident_CSI_pods_are_missing_the_storage_driver_csi.trident.netapp.io

NetApp Knowledge Base

Trident CSI pods are missing the storage driver csi.trident.netapp.io

long yoke Nov 8, 2022, 2:53 PM

#

sacred lantern what does the following in the .spec.drivers section give you? kubectl get csino...

Here is output from kubectl get csinode wrkra4 -o yaml. spec.drivers: null
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  creationTimestamp: "2022-09-12T17:54:26Z"
  name: wrkra4
  ownerReferences:
  - apiVersion: v1
    kind: Node
    name: wrkra4
    uid: c5e0fe68-b26c-4c81-9fbf-b35457dc68d3
  resourceVersion: "7108394"
  uid: 6115cc56-8a9f-4e22-b489-5465f2be250c
spec:
  drivers: null

we did reinstall Trident a few times but got the same error. Some nodes crashloop with this error:
level=fatal msg="Unable to start the CSI frontend. open /certs/aesKey: no such file or directory

sacred lantern Nov 9, 2022, 7:31 AM

#

long yoke Here is output from kubectl get csinode wrkra4 -o yaml. spec.drivers: null apiVe...

ok, anything in the log for the driver registrar?
find the pod (trident-csi-xxxxx) for that node ( k -n trident get pod -o wide)
show the logs
k -n trident logs trident-csi-xxxxx --container=driver-registrar

if there is nothing obvious there I think it is better to open a support case and upload the trident logbundle

long yoke Nov 9, 2022, 3:30 PM

#

yeah the logs are filled with registration failure.
time="2022-11-09T15:24:48Z" level=warning msg="Could not update Trident controller with node registration, will retry." error="could not log into the Trident CSI Controller: error communicating with Trident CSI Controller; Put "https://10.3.128.106:34571/trident/v1/node/wrkrb5": dial tcp 10.3.128.106:34571: connect: connection refused" increment=2m1.305724064s requestID=1d02da8e-ea17-43b9-8c2e-59da039f590a requestSource=Internal

sacred lantern Nov 9, 2022, 3:42 PM

#

long yoke yeah the logs are filled with registration failure. time="2022-11-09T15:24:48Z...

looks like it is not able to connect to service/trident-csi, which should be of type ClusterIP and be reachable by all nodes. Can you also post the following 2?
k -n trident describe service/trident-csi
k -n trident get pod -l app=controller.csi.trident.netapp.io -o wide

long yoke Nov 9, 2022, 3:45 PM

#

kubectl -n trident describe service/trident-csi
Name: trident-csi
Namespace: trident
Labels: app=controller.csi.trident.netapp.io
k8s_version=v1.20.15
trident_version=v22.07.0
Annotations: <none>
Selector: app=controller.csi.trident.netapp.io
Type: ClusterIP
IP Families: <none>
IP: 10.3.128.106
IPs: 10.3.128.106
Port: https 34571/TCP
TargetPort: 8443/TCP
Endpoints:
Port: metrics 9220/TCP
TargetPort: 8001/TCP
Endpoints:
Session Affinity: None
Events: <none>

kubectl -n trident get pod -l app=controller.csi.trident.netapp.io -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
trident-csi-64f9f9fd5b-sg4dn 4/6 CrashLoopBackOff 308 14h 10.2.8.238 wrkra4 <none> <none>

sacred lantern Nov 9, 2022, 3:48 PM

#

ah, only 4/6, 2 containers seem to fail to start

long yoke Nov 9, 2022, 3:49 PM

#

yeah all in crashloop

sacred lantern Nov 9, 2022, 4:04 PM

#

long yoke yeah all in crashloop

put in a DM

misty cargo Nov 9, 2022, 8:26 PM

#

FYI - @long yoke @sacred lantern Case 2009362016 issue resolved

long yoke Nov 9, 2022, 10:19 PM

#

misty cargo FYI - @long yoke @sacred lantern Case 2009362016 issue re...

Thanks!

rustic summit Nov 11, 2022, 9:19 AM

#

Hi guys,
I would like to serve Trident volumes on a VLAN behind a firewall, I have ontap-nas and ontap-san drivers. What are the ports that I would need to allow from one VLAN to the other?

short kestrel Nov 11, 2022, 1:18 PM

#

Assuming the VLAN/firewall is between the K8s cluster and storage, you would need port 443 for APIs, then whatever ports NFS and iSCSI need.
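
A quick way to verify the firewall rules from a worker node could look like the Python sketch below. Port 443 is from the answer above; 2049 (NFS) and 3260 (iSCSI) are the standard default ports and are assumptions about this environment, and NFSv3 typically needs additional ports (portmapper on 111 and others) that are not covered here. The function and dictionary names are made up for illustration.

```python
import socket

# 443 comes from the answer above; 2049 and 3260 are the standard defaults
# for NFS and iSCSI and are assumptions about this particular setup.
DEFAULT_PORTS = {"ontap-api": 443, "nfs": 2049, "iscsi": 3260}


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_storage_ports(host: str, ports: dict = DEFAULT_PORTS) -> dict:
    """Probe every port in `ports` and return {name: reachable}."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling check_storage_ports() with the address of the LIF you want to probe returns one True/False per port, which makes it easy to see which rule is still missing.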

long yoke Nov 11, 2022, 9:55 PM

#

Hi, have a question about san driver. we have a 4-node filer. node 1/2 have hdd, node 3/4 have ssd. the trident SVM has access to all 4 aggrs. we created 2 StorageClass, silver and bronze for ssd and hdd. our iscsi LIFs are only on node 3/4 (ssd nodes). when we try to create pvc for hdd (node1/2), it fails and complains node1/2 have no LIFs configured with the iSCSI or FCP protocol. do we have to create iscsi LIFs on every node? or there is some setting so we don't have to?

pallid dirge Nov 14, 2022, 10:03 AM

#

Hi I'm using Trident to connect to a NetApp storage. We had some network issues between the cluster and the storage and now we have the tridentbackendconfig that states that the backend is lost, but the backend is still there.

#

we are getting this error:

#

time="2022-11-14T10:00:15Z" level=info msg=-------------------------------------------------
time="2022-11-14T10:00:15Z" level=info msg=-------------------------------------------------
time="2022-11-14T10:00:15Z" level=error msg="error syncing backend configuration 'trident/fas-backend-svil', requeuing; could not find backend during update; backend dbcbdc3c-0829-4a6c-a2d4-9e051f5b5fbb was not found" logSource=trident-crd-controller requestID=f8688418-ead0-47e4-970d-f8866944eda7 requestSource=CRD
time="2022-11-14T10:00:34Z" level=error msg="GRPC error: rpc error: code = InvalidArgument desc = no available storage for access modes: [ReadWriteMany]" requestID=cae4eb6e-efe6-4117-a582-940620272868 requestSource=CSI
time="2022-11-14T10:00:36Z" level=error msg="Could not find backend during update." backendConfig.Name=fas-backend-svil crdControllerEvent=update logSource=trident-crd-controller requestID=590f49e1-0a9b-40fe-81a0-fd6a7d0e67f6 requestSource=CRD
time="2022-11-14T10:00:36Z" level=info msg="New status is same as the old phase, no status update needed." TridentBackendConfigCR=fas-backend-svil
time="2022-11-14T10:00:36Z" level=error msg="error syncing backend configuration 'trident/fas-backend-svil', requeuing; could not find backend during update; backend dbcbdc3c-0829-4a6c-a2d4-9e051f5b5fbb was not found" crdControllerEvent=update logSource=trident-crd-controller requestID=590f49e1-0a9b-40fe-81a0-fd6a7d0e67f6 requestSource=CRD

#

also how to properly update the config and the backend when we have existing PVC? it seems that is not possible to change it without making a mess with volumes

#

kubectl --kubeconfig kubeconfig-kira.yaml get tbc -n trident
NAME BACKEND NAME BACKEND UUID PHASE STATUS
fas-backend-svil ontap-nas-svmp3-k8scsisvil dbcbdc3c-0829-4a6c-a2d4-9e051f5b5fbb Lost Failed
PS D:\docker> kubectl --kubeconfig kubeconfig-kira.yaml get tbe -n trident
NAME BACKEND BACKEND UUID
tbe-tlt2x ontap-nas-svmp3-k8scsisvil dbcbdc3c-0829-4a6c-a2d4-9e051f5b5fbb

pallid dirge Nov 14, 2022, 3:08 PM

#

we solved it by scaling the deployment to 0 and then back to 1, but we are interested in understanding why trident entered this state of "confusion"

tropic fog Nov 14, 2022, 4:00 PM

#

Our customer is planning for a major DR test next year where one site (of their 2-site DC infrastructure) will be shut down for a couple of weeks. They utilize Trident and were asking about how this plays out with Trident for a D/R scenario, and we provided the information at the following link: https://netapp-trident.readthedocs.io/en/stable-v19.04/dag/kubernetes/backup_disaster_recovery.html.

#

Specifically, for section 9.4.3 we see this:

9.4.3. SnapMirror SVM Disaster Recovery Workflow for Trident
The following steps describe how Trident can resume functioning during a catastrophe from the secondary site (SnapMirror destination) using the SnapMirror SVM replication.

1. In the event of the source SVM failure, activate the SnapMirror destination SVM. Activating the destination SVM involves stopping scheduled SnapMirror transfers, aborting ongoing SnapMirror transfers, breaking the replication relationship, stopping the source SVM, and starting the destination SVM.

2. Uninstall Trident from the Kubernetes cluster using the tridentctl uninstall -n <namespace> command. Don’t use the -a flag during the uninstall.

3. Before re-installing Trident, make sure to change the backend.json file to reflect the new destination SVM name.

4. Re-install Trident using “tridentctl install -n <namespace>” command.

5. Update all the required backends to reflect the new destination SVM name using the “./tridentctl update backend <backend-name> -f <backend-json-file> -n <namespace>” command.

6. All the volumes provisioned by Trident will start serving data as soon as the destination SVM is activated.

#

Customer is now asking about steps 2-4, and "Why Trident must be uninstalled and reinstalled?"

#

Anyway, wanted to ask for validation since I am likely missing something fundamental since K8s and Trident aren't in my wheelhouse. 🙂

coarse obsidian Nov 14, 2022, 4:19 PM

#

I can't speak for this process entirely as that is referenced from a fairly old version of the docs, I'll take a look and see if that has changed in newer versions. However for Kubernetes Disaster Recovery I would really talk to them about Astra Control Center. We can handle the SnapMirror and failover of the apps for them automatically between sites, even reverse the replication etc.

#

If you'd like more information just DM me and we can sort out a call/demo

tropic fog Nov 14, 2022, 4:30 PM

#

Thanks Jason, will be looking forward to what you find out...and will also bring up ACC to them.

coarse obsidian Nov 14, 2022, 5:07 PM

#

The latest process is documented here https://netapp-trident.readthedocs.io/en/latest/dag/kubernetes/backup_disaster_recovery.html

#

There are some assumptions you'll have to work through with the customer, they are listed there.

tardy orchid Nov 14, 2022, 6:06 PM

#

I think the Trident ReadTheDocs is retired, documentation should be read from https://docs.netapp.com/us-en/trident/trident-reco/backup.html#recover-date-by-using-ontap-snapshots

#

@tropic fog , just to confirm, you dont have one K8S stretched across both sites, right? you have 2 completely separate environments?

tropic fog Nov 14, 2022, 8:11 PM

#

@tardy orchid Will verify with the customer and let you know...thanks!

vale abyss Nov 15, 2022, 12:17 PM

#

Tell me I am holding it wrong, please! This is what I get:

$ helm -n trident upgrade trident netapp-trident/trident-operator
Error: UPGRADE FAILED: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first"

short kestrel Nov 15, 2022, 1:29 PM

#

@vale abyss Everything I'm seeing about PodSecurityPolicy is showing that it should be giving a warning during the upgrade and not an error. Have you upgraded K8s recently? What version of K8s are you using?

north spade Nov 16, 2022, 12:16 AM

#

I am trying to install trident csi in a windows kubernetes cluster but am getting the error below
0/8 nodes are available: 3 node(s) had taint {cattle.io/os: linux}, that the pod didn't tolerate, 5 node(s) didn't match Pod's node affinity.
@short kestrel any suggestions?

short kestrel Nov 16, 2022, 7:02 PM

#

@north spade The K8s for Windows is new for us in support as well as for you. I may be able to help, but I'm going to need more information than that to go on. Is the Win environment ANF or on prem, or other? Can you describe the trident pod that isn't fully coming up (I'm assuming it's the orchestrator node, but it may be one of the temporary ones that usually doesn't stick around long enough for me to memorize the name) and see what it shows?

north spade Nov 16, 2022, 11:56 PM

#

short kestrel @north spade The K8s for Windows is new for us in support as well as fo...

Win environment is onprem.
yes, it's the trident-operator pod that is causing the issue.
We overcame this by defining tolerations in the values.yaml file, but the pods that get deployed by the operator (trident-csi and trident-csi-windows) are failing. I think it's because they are trying to pull linux based docker images instead of windows based docker images.
Any idea how we can define what image to be pulled via helm?
are you using both google and docker hub as registry?

short kestrel Nov 17, 2022, 2:35 PM

#

What does a describe on one of the trident-csi-windows pods look like? Does it show the correct image? What events does it show?

short kestrel Nov 17, 2022, 4:07 PM

#

Also, did you define any tolerations for node affinity?

wind mason Nov 18, 2022, 12:58 AM

#

Hi Support,
May I know whether all the nodes in OCP need to be able to communicate with the ONTAP management interface on port 443, since there is a daemonset pod on each node?

dusty yacht Nov 18, 2022, 1:55 PM

#

Hi Support

wind mason Nov 21, 2022, 4:02 AM

#

Hi @dusty yacht,
If I would like to customize the deployment of the Trident controller pod on the infra nodes of OCP, I can find the useful information here https://docs.netapp.com/us-en/trident/trident-get-started/kubernetes-customize-deploy.html#sample-configurations.
But our infra nodes also have taints. Is there a sample showing the format for adding toleration parameters alongside the nodeSelector when editing the TridentOrchestrator?
Thanks for your support.

pallid dirge Nov 21, 2022, 6:14 PM

#

Any way to solve backend in "lost" status or how to debug it?

short kestrel Nov 21, 2022, 6:21 PM

#

Backend Lost: The backend associated with the TridentBackendConfig CR was accidentally or deliberately deleted and the TridentBackendConfig CR still has a reference to the deleted backend.

#

I would try to update it using the "tridentctl update backend <Backend Name> -f <Backend File.json>"

#

This assumes you have the json file.

pallid dirge Nov 21, 2022, 6:41 PM

#

We did not delete tridentbackendconfig that is still there

#

We configure the cluster with gitops so no tridentctl

#

We have both tbc and tbe but tbe is lost

pallid dirge Nov 21, 2022, 7:25 PM

#

Do workers also need to talk to the APIs, or only masters?

vale abyss Nov 22, 2022, 3:58 PM

#

@short kestrel kubernetes is 1.25.3 and yes that is the problem....
Interesting fact helm template | kubectl apply -f works fine.... helm upgrade not so much... even helm uninstall fails miserably on 1.25 leaving tons of crd with a non-existent finaliser.... really sloppy helm chart that is.... smells of java developers again...

#

after all, we still remember "ClientPrivateKey: ''" that no one bothered to fix 😄

vale abyss Nov 22, 2022, 4:36 PM

#

ok, a workaround would be to helm uninstall, then delete trident-operator deployment and sh.helm.... secret and redeploy

#

also delete tridentorchestrator and recreate it from template (helm template output)

short kestrel Nov 22, 2022, 7:32 PM

#

@vale abyss Sorry to hear you are having issues with the helm installer. If you are willing to document the problems at https://github.com/NetApp/trident/issues that would put it on our development team's radar.

#

@pallid dirge Steps to triage:

1. Look at the tbc's YAML output, in particular metadata.uid, status.backendInfo.backendName and status.backendInfo.backendUUID.
2. Look at the tbe's YAML output and ensure configRef matches the tbc's metadata.uid, and that the backendName and backendUUID are also consistent with the tbc's YAML output.
3. If they are consistent, then update the tbc using the kubectl apply -f tbc.yaml command. The update could be a change to either of these values in the tbc:

debugTraceFlags:
  api: true
  method: true

If this does not help then capture the controller logs to see what may have put the tbc to be in a Lost state and open up a case with our support team.
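
The comparison in the first two steps can be scripted. Below is a minimal Python sketch, assuming both objects were dumped with kubectl get ... -o json and loaded into dicts. The tbc field paths are the ones named in the steps; reading configRef, backendName and backendUUID as top-level keys of the tbe is an assumption to verify against your own output, and the function name is hypothetical.

```python
def backend_ref_mismatches(tbc: dict, tbe: dict) -> list:
    """Return descriptions of tbc/tbe inconsistencies; empty means consistent.

    Assumption: the tbe exposes configRef, backendName and backendUUID
    as top-level keys. Adjust the lookups if your output differs.
    """
    info = tbc.get("status", {}).get("backendInfo", {})
    problems = []
    if tbe.get("configRef") != tbc.get("metadata", {}).get("uid"):
        problems.append("tbe configRef does not match tbc metadata.uid")
    if tbe.get("backendName") != info.get("backendName"):
        problems.append("backendName differs between tbe and tbc")
    if tbe.get("backendUUID") != info.get("backendUUID"):
        problems.append("backendUUID differs between tbe and tbc")
    return problems
```

An empty list means the references line up and the next step (re-applying the tbc with a small change) is worth trying; anything else points at the field to look into.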

vale abyss Nov 22, 2022, 9:09 PM

#

@short kestrel did this - https://github.com/NetApp/trident/issues/783 - hope it helps. I may be the only idiot running 22.07 on 1.25, but if there is anyone else done the same mistake... hope this helps

GitHub

trident-operator helm upgrade to v22.10 fails when kubernetes versi...

In the rare cases when trident-operator 22.07.0 runs on kubernetes 1.25 (with PodSecurityPolicies deprecated), helm upgrade to 22.10 fails with the following message: 'Error: UPGRADE FAILED...

#

from what I discovered, seems helm gets confused with release details so anything in the chart will miserably fail even before touched. Not very good helmer myself - used to hate the thing in pre v3 era - so I may be talking rubbish as usual...

wind mason Nov 24, 2022, 6:30 AM

#

wind mason Hi @dusty yacht, If I would like to customize the deployment of Trident...

Hi Support,
May I have any hints on this? I would like to deploy trident controller pod on infra nodes only, and my infra node taints are defined as “infra=reserved:NoSchedule” and “infra=reserved:NoExecute”.

sacred lantern Nov 24, 2022, 12:05 PM

#

wind mason Hi Support, May I have any hints on this? I would like to deploy trident control...

I would patch the deployment csi and operator like:
kubectl patch deployment.apps/trident-csi -n trident --type=merge -p '{"spec":{"template":{"spec":{"tolerations":[{"key":"infra","operator": "Equal","value": "reserved","effect":"NoSchedule"},{"key":"infra","operator": "Equal","value": "reserved","effect":"NoExecute"}]}}}}'

wind mason Nov 24, 2022, 1:54 PM

#

sacred lantern I would path the deployment csi and operator like: kubectl patch deployment.app...

Hi @sacred lantern Thanks for reply. But I wonder if we patch TridentOrchestrator or patch the deployment of trident-csi directly?
If we just patch the trident-csi deployment, will it revert to the original configuration when the trident-csi deployment is deleted or during a Trident upgrade?
I found that there is a configuration parameter in TridentOrchestrator called “controllerPluginTolerations” in the Trident doc, but I am not sure how to set it to match the taints on the infra nodes in my environment; I have tried many times but it still fails.

sacred lantern Nov 25, 2022, 10:46 AM

#

wind mason Hi @sacred lantern Thanks for reply. But I wonder if we patch TridentOrch...

trying this as well
that worked, added the toleration in the deploy/operator.yaml before you run the kustomize, and added the following in the deploy/crds/tridentorchestrator_cr.yaml (had to add the master because it had that taint as well)
controllerPluginTolerations:
  - key: "infra"
    operator: "Equal"
    value: "reserved"
    effect: "NoSchedule"
  - key: "infra"
    operator: "Equal"
    value: "reserved"
    effect: "NoExecute"
  - key: "node-role.kubernetes.io/master"
    operator: "Exists"
    effect: "NoSchedule"
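
To sanity-check a toleration list against node taints before applying it, here is a small Python sketch of the standard Kubernetes matching rule (effect must match or be empty, key must match or be empty, operator Exists ignores the value, Equal compares it). The function names are illustrative; the taints and tolerations in the usage note are the ones from this thread.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False
    key = toleration.get("key", "")
    if key and key != taint.get("key"):
        return False
    operator = toleration.get("operator", "Equal")  # Equal is the default
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def untolerated_taints(tolerations: list, taints: list) -> list:
    """Return the taints that no toleration in the list matches."""
    return [t for t in taints if not any(tolerates(tol, t) for tol in tolerations)]
```

With the infra=reserved:NoSchedule and infra=reserved:NoExecute taints and the three tolerations above, untolerated_taints() comes back empty; drop the NoExecute toleration and that taint is reported as untolerated.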

gritty imp Nov 28, 2022, 6:31 AM

#

hello, i am trying to add a bucket (https://docs.netapp.com/us-en/astra-control-center/get-started/setup_overview.html#add-a-bucket) hosted on AWS with Type: Generic S3 via Virtual-hosted–style access. However, it shows State: "Unavailable" and Status: "An event happened internally that stopped the system from obtaining state"

checking IAM on AWS console I noticed the access key created for this was never used, any idea what might be wrong?

short kestrel Nov 28, 2022, 4:43 PM

#

hello i am trying to httpsdocs netapp

wind mason Nov 30, 2022, 1:56 AM

#

sacred lantern trying this as well that worked, added the toleration in the deploy/operator.yam...

Hi @sacred lantern, I tried your provided solution and it works. I have a few questions:

1. Can I add the toleration to deploy/crds/tridentorchestrator_cr.yaml only, and not to deploy/operator.yaml, if I just want the trident controller pod to be created on infra nodes?
2. When adding the tolerations, is it a must to add the master?
Thanks for your support.

sacred lantern Nov 30, 2022, 6:47 AM

#

wind mason Hi @sacred lantern, I tried your provided solution and it works. I have f...

Hi @wind mason , on your questions:

1. yes you can
2. no, I only did that because in my configuration by default it did not want to go to my master node, and I wanted to be sure it got scheduled on the node I want. So, if you don't want it on your master node, you can remove that toleration.

wind mason Nov 30, 2022, 8:43 AM

#

sacred lantern Hi @wind mason , on your questions: 1. yes you can 2. no, I only did ...

Hi @sacred lantern
Thanks for advice.

dense tendon Dec 5, 2022, 9:11 AM

#

Hi All, Is it possible to add list label (fabric_clusters in below example) to backend configuration? If yes how to use it in storage class?
Example below :
"labels": {
    "environment": "DEV",
    "location": "EW2",
    "fabric_clusters": [
        "eng",
        "ldg",
        "frt",
        "dev"
    ]
},

I now want to use "fabric_clusters" label as a selector in my storage class, how to do that?

short kestrel Dec 5, 2022, 1:03 PM

#

@dense tendon The older documentation shows examples for this, it hasn't changed to my knowledge. https://netapp-trident.readthedocs.io/en/stable-v20.07/kubernetes/operations/tasks/backends/ontap/ontap-nas/examples.html

solar wren Dec 6, 2022, 9:37 PM

#

Hello
Was Astra Trident tested with Google Container-Optimized OS on GKE?

dusty yacht Dec 6, 2022, 10:31 PM

#

solar wren Hello Was Astra Trident tested with Google Container-Optimized OS on GKE?

Hi Mickey, Trident has worked with Google's COS for several years now. We didn't specifically test GCP COS with the Trident v22.10 release in GKE though.

gritty imp Dec 7, 2022, 5:53 AM

#

hello, is it possible to get astra to backup applications with ontap-nas-economy backend type?

sacred lantern Dec 7, 2022, 6:57 AM

#

gritty imp hello, is it possible to get astra to backup applications with ontap-nas-economy...

Hi @gritty imp , yes it is:
https://docs.netapp.com/us-en/astra-control-center/get-started/requirements.html#operational-environment-requirements
Astra Trident / ONTAP configuration: Astra Control Center requires that a storage class be created and set as the default storage class. Astra Control Center supports the following ONTAP drivers provided by Astra Trident:
ontap-nas
ontap-nas-flexgroup
ontap-san
ontap-san-economy (not supported for app replication)

gritty imp Dec 7, 2022, 7:00 AM

#

sacred lantern Hi @gritty imp , yes it is: https://docs.netapp.com/us-en/astra-contr...

Hi Daniel, how about ontap-nas-economy ?

sacred lantern Dec 7, 2022, 7:10 AM

#

gritty imp Hi Daniel, how about ontap-**nas**-economy ?

🤦‍♂️ ,* me going to get coffee first.......*, ah NAS.... ehhh no, apparently not, let me see if I can find out more...

ok, ontap-nas-economy driver cant take snapshots for a specific pvc, see https://docs.netapp.com/us-en/trident/trident-concepts/snapshots.html

gritty imp Dec 7, 2022, 8:26 AM

#

does this mean astra cannot backup applications using ontap-nas-economy pvc ?

sacred lantern Dec 7, 2022, 8:27 AM

#

gritty imp does this mean astra cannot backup applications using ontap-nas-economy pvc ?

unfortunately it does

pine birch Dec 7, 2022, 1:47 PM

#

Morning all. I've been in the process of slowly upgrading Trident, via the operator, in all our kubernetes clusters. Just recently It's been taking quite a while to complete and noticed the delay is from imagepullbackoff errors pulling netapp/trident-operator:21.10.1 as we are hitting docker registry rate limiting. Again this is very recent development, within the last couple/few weeks, and the docker rate limiting has been in place for quite some time. Is this now expected behavior when pulling down Trident images from Docker? If so, does NetApp have their own registry supported images can be pulled from?

sacred lantern Dec 7, 2022, 2:49 PM

#

pine birch Morning all. I've been in the process of slowly upgrading Trident, via the opera...

If you are 'hitting docker registry rate limiting', it means you probably did not log in to docker. Did you do a docker login for your profile? Did the password change?
Alternatively, you can set up your own registry and upload the bundle there, see:
https://docs.netapp.com/us-en/astra-control-center/get-started/install_acc.html#download-and-unpack-the-astra-control-center-bundle

pine birch Dec 7, 2022, 3:30 PM

#

sacred lantern If you are 'hitting docker registry rate limiting' , means you probably did not ...

We've never had to login to docker before, even after their rate-limiting was put in place. I was under the assumption that NetApp, like other companies, had exceptions to the rate-limiting for their supported images. I guess not. Anyhoo, yeah I was thinking we could just push/pull to/from our internal private registry. So, for Trident, the only line in the trident-installer/deploy/bundle.yaml that would need to be changed when performing an install or upgrade would be adding our registry to image: netapp/trident-operator:21.10.1 ? Correct?

dusty yacht Dec 7, 2022, 3:50 PM

#

We ve never had to login to docker

placid vortex Dec 13, 2022, 4:27 PM

#

Hey team. Has Astra Control been tested with Kubevirt or OpenShift Virtualization? I didn't see anything specifically referencing it in the docs.

cloud quarry Dec 14, 2022, 3:24 AM

#

Hi all,
I have a question about Trident monitoring.
I am trying to parse the Trident log.
Is there any alert rule or keyword that can detect a trident-csi problem?

short kestrel Dec 14, 2022, 2:01 PM

#

placid vortex Hey team. Has Astra Control been tested with Kubevirt or OpenShift Virtualizatio...

Alan, I had to ask around since I haven't personally tested installs of Astra control, but this is what I received from one of our experts... "It's nothing we test/qualify at the moment. I don't see any reason why it wouldn't work, it is a regular PVC in the end."

placid vortex Dec 14, 2022, 2:04 PM

#

short kestrel Alan, I had to ask around since I haven't personally tested installs of Astra co...

I did some tests a while ago, I think the 21.08 release at the time. A backup worked, but restore left the VM unbootable. I just didn't know if there had been an update in the last year or so that tackled that.

still bone Dec 14, 2022, 2:53 PM

#

My Azure secret expired making my Azure Blob unavailable in Astra. I've created a new secret in Azure and updated the credentials in Astra but still get an unavailable error. I can access the blob storage via BlueXP so I know the new secret is working properly.

pine parcel Dec 15, 2022, 12:36 PM

#

Hi, I just noticed that Astra DS has been removed from Trident. Also there is no Astra DS documentation anymore on docs.netapp.com... What happened to Astra DS? Has it been discontinued?

viral stump Dec 15, 2022, 2:10 PM

#

hi. I got a PV which is Released and trident is trying to delete it without success. if I describe the PV the Events say:

rpc error: code = Unknown desc = object is being deleted: tridenttransactions.trident.netapp.io "pvc-xxx" already exists

if I look in the trident namespace I can see a CR of type tridenttransactions.trident.netapp.io with this name

any tips how to fix the state trident is in right now?

viral stump Dec 15, 2022, 3:14 PM

#

viral stump hi. I got a PV which is Released and trident is trying to delete it without succ...

a bit more context. it seems the original cause was that the volume had child clones and therefore couldn't be deleted. and at some point the trident pod had trouble talking to the apiserver due to a network hiccup. after that all we see in the logs are those "tridenttransaction already exist" type of failures.

I suspect it somehow ended up in a limbo state and can't reconcile the transaction properly.

now the clones have been split from its parent so it should be good to delete the parent but trident can't remove it because of the already existing transaction object

pine parcel Dec 15, 2022, 3:19 PM

#

maybe try deleting the trident pod(s)?

viral stump Dec 15, 2022, 3:19 PM

#

pine parcel maybe try deleting the trident pod(s)?

I could try a rollout restart of the daemonset

viral stump Dec 15, 2022, 3:21 PM

#

viral stump I could try a rollout restart of the daemonset

actually probably the controller that needs a restart.

#

restarted them both, but unfortunately no change. so I am wondering if its safe to delete the transaction and let trident try again.

sacred lantern Dec 15, 2022, 3:26 PM

#

viral stump restarted them both, but unfortunately no change. so I am wondering if its safe ...

not delete, patch the finalizer for it
kubectl patch tridenttransaction <pvc_name> -n <trident_namespace> -p '{"metadata":{"finalizers":[]}}' --type=merge

viral stump Dec 15, 2022, 3:27 PM

#

sacred lantern not delete, patch the finalizer for it kubectl patch tridenttransaction <pvc_nam...

yes, thats what I was thinking. I see it (the transaction) has a finalizer

short kestrel Dec 15, 2022, 3:36 PM

#

I did some tests a while ago I think the

viral stump Dec 15, 2022, 3:51 PM

#

yes thats what I was thinking I see it

dusty yacht Dec 15, 2022, 6:00 PM

#

Hi I just noticed that Astra DS has been

pine kernel Dec 15, 2022, 9:52 PM

#

Anyone know if Astra has some log files that can be used for debugging? I cannot connect to a GKE Cluster and the Service Principal has all the correct roles assigned to it.

#

violet garden Dec 15, 2022, 10:42 PM

#

@coarse obsidian might be able to dig into this one for ya

coarse obsidian Dec 15, 2022, 11:22 PM

#

Have you checked all the other pre-requisites for using Google Cloud? There are some APIs that also need to be enabled.

#

https://docs.netapp.com/us-en/astra-control-service/get-started/set-up-google-cloud.html#quick-start-for-setting-up-google-cloud

#

Check all the APIs in step 3

pine kernel Dec 16, 2022, 8:09 PM

#

@coarse obsidian - All looks good from the API's and the Roles for the Service Principal. Still getting the same error. Any way to see what is actually throwing the error?

coarse obsidian Dec 16, 2022, 9:47 PM

#

If it’s not in the activity section then no I don’t know a way to find it in ACS. I’ll see if any one else on the team knows, or if someone can take a look for you

pine kernel Dec 21, 2022, 10:26 PM

#

@coarse obsidian - I am using the same JSON file as part of the backend definition for Trident (CVS Backend) and it is working fine. Just can't use it for the Astra connection. All of the API's are enabled as well as the roles assigned to the service principal. Moving on....

pine kernel Dec 22, 2022, 7:40 PM

#

@coarse obsidian - got it working.

coarse obsidian Dec 22, 2022, 7:40 PM

#

That’s good to know, what was up?

#

I’ve got a request going in for better error messages where we can

pine kernel Dec 22, 2022, 7:47 PM

#

@coarse obsidian - I know its strange but .... I changed..... NOTHING. It just worked yesterday. I wish I could give you a definitive answer. But that's the truth..

coarse obsidian Dec 22, 2022, 7:51 PM

#

Ok, I’ll try get that looked at. I’m on break for the holidays but I’ll speak to the team when I’m back

olive zealot Dec 28, 2022, 6:02 PM

#

I'm trying to use trident as data source for kasten

#

I already installed the trident operator with helm, and everything is up: I can create volumes and mount them in a pod

#

I don't know what else kasten needs

violet garden Dec 28, 2022, 7:38 PM

#

Out of curiosity, did you know that NetApp Astra is an equivalent product with the same (and more!) functionality and has trident support built-in?

olive zealot Dec 28, 2022, 10:31 PM

#

Didn't try astra