#Upgrade trident-operator 22.10 to 23.01 using helm after upgrading to kubernetes 1.25 fails

devout storm
#

Helm fails to find the tridentoperatorpods (and, I suspect, tridentpods) PSPs when it tries to upgrade trident-operator from version 22.10 to 23.01. The Kubernetes cluster was upgraded from 1.24 to 1.25 after the Trident 22.10 upgrade (as instructed). The error message is:

$ helm -n trident upgrade trident netapp/trident-operator
Error: UPGRADE FAILED: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first

This is the current version of trident-operator installed:

$ helm -n trident list
NAME      NAMESPACE   REVISION   UPDATED                                   STATUS     CHART                      APP VERSION
trident   trident     2          2022-11-18 11:29:03.809021419 +0100 CET   deployed   trident-operator-22.10.0   22.10.0

And to be fair a 1.24 cluster still has some PSPs available:

$ kubectl get psp
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
NAME                  PRIV    CAPS   SELINUX    RUNASUSER   FSGROUP    SUPGROUP   READONLYROOTFS   VOLUMES
tridentoperatorpods   false          RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            projected
tridentpods           true           RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            hostPath,projected,emptyDir

Maybe I am doing the upgrade wrong.....

idle horizon
#

Hi @devout storm, it looks like we missed documenting a fix that went into Trident v23.01 for this. We're working on that now.

#

If you are performing a helm upgrade with Trident v23.01 on a K8S v1.25+ cluster then you need to set the excludePodSecurityPolicy var to true.

#

The command would look like helm upgrade -n trident trident chart.tgz --set excludePodSecurityPolicy=true.

#

You can also update the Helm chart's values.yaml file to set excludePodSecurityPolicy to true.
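A minimal sketch of the values-file route, assuming an override file passed with -f is acceptable in place of editing the chart's own values.yaml. The file name trident-values.yaml and the function name upgrade_trident are made up for illustration; the release name, chart name and version are the ones used elsewhere in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: trident-values.yaml is a hypothetical file name.
cat > trident-values.yaml <<'EOF'
# Value named in this thread for Helm upgrades on Kubernetes 1.25+
excludePodSecurityPolicy: true
EOF

# Wrapped in a function so nothing touches a cluster until it is called.
upgrade_trident() {
  helm -n trident upgrade trident netapp/trident-operator \
    --version 23.01.0 -f trident-values.yaml
}
```

Calling upgrade_trident then runs the same upgrade as the --set form, with the value read from the file.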

idle horizon
#

It seems that Helm isn't capable of ignoring removed APIs the way Kubernetes is, which is why there is this additional parameter just for the upgrade. It also can't evaluate which version of K8S is being used to select the correct value.

devout storm
#

Hey Chuck, I am going to try this right now. Oddly enough half of my 1.25 clusters took 23.01 like champs. The other half however....

Thank you for the hint! I usually read the values of new charts, but there aren't that many values in Trident, so I neglected them.

devout storm
#

Hmm, seems helm does not register my request to be ignorant:

helm -n trident upgrade trident netapp/trident-operator --version=23.01.0 --set excludePodSecurityPolicy=true
Error: UPGRADE FAILED: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first

idle horizon
#

@devout storm, it sounds like you upgraded Kubernetes to v1.25+ already. We're going to have to test how to get out of that situation with Helm. You can't just do a Helm uninstall and then Helm install to fix this issue.

devout storm
#

Ok, I've 4 clusters in this state. Want me to keep a cluster or 2 to test helm upgrades?

idle horizon
#

That would be great if you can verify the steps. We're also talking about how to improve the documentation to make this issue more obvious.

red mango
#

@devout storm You can delete the helm release from your cluster and reinstall the chart. I think that'll be easier than editing the release.

  1. Find release secret or configmap (should be named something like sh.helm.release.v1.trident.v1 in the trident namespace)
  2. Delete that secret or configmap
  3. Reinstall using the same values as your original trident install

Because you're already on k8s 1.25, the chart will ignore excludePodSecurityPolicy, so you don't need to set it
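The three steps above could be sketched as follows. This assumes the release is stored as secrets (the Helm 3 default) carrying the labels owner=helm and name=<release>; if your release lives in configmaps, the resource type would need to change. The function names and the backup file name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the delete-and-reinstall route; nothing runs until a function is called.
# Assumption: Helm 3 release secrets are labelled owner=helm,name=<release>.

# Print the release secrets, e.g. secret/sh.helm.release.v1.trident.v1
list_release_secrets() {
  local namespace=$1 release=$2
  kubectl -n "$namespace" get secret -l "owner=helm,name=$release" -o name
}

# Save a copy of those secrets to a local file, then delete them.
delete_release_secrets() {
  local namespace=$1 release=$2
  kubectl -n "$namespace" get secret -l "owner=helm,name=$release" -o yaml \
    > "$release-helm-secrets.bak.yaml"
  kubectl -n "$namespace" delete secret -l "owner=helm,name=$release"
}
```

For example, list_release_secrets trident trident shows what would be removed, delete_release_secrets trident trident removes it, and a plain helm -n trident install trident netapp/trident-operator with your original values covers step 3.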

devout storm
#

I think I tried deleting and reinstalling the operator with helm, but then it refused to bump Trident itself to the newer version... Anyway, I tried the helm fix and it worked like a charm. The bit that needs to be removed is right at the beginning of the manifest field (note that everything in there is a single line) and looks like this:

# Source: trident-operator/templates/podsecuritypolicy.yaml\napiVersion: policy/v1beta1\nkind: PodSecurityPolicy\nmetadata:\n name: tridentoperatorpods\n labels:\n app: operator.trident.netapp.io\nspec:\n privileged: false\n seLinux:\n rule: RunAsAny\n supplementalGroups:\n rule: RunAsAny\n runAsUser:\n rule: RunAsAny\n fsGroup:\n rule: RunAsAny\n volumes:\n - projected\n---\n

The re-encoding is as they describe it, except that I added -w0 to the last base64 so the result fits on a single line in the secret YAML. After that the upgrade went flawlessly.
This could probably all be scripted easily, with a regex used to remove the text from "# Source: trident-operator/templates/podsecuritypolicy.yaml" all the way to the next "# Source" or something like that...
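The removal suggested above could be sketched in plain bash string operations instead of a regex. This is a sketch under assumptions: the decoded release is single-line JSON in which line breaks appear as the two characters \n, and the PodSecurityPolicy block is followed by a ---\n separator, as in the snippet above. The function name strip_psp is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: drop the PodSecurityPolicy document from a decoded Helm release payload.
# Reads the payload on stdin and writes the patched payload to stdout.
strip_psp() {
  local payload marker sep head tail
  payload=$(cat)
  marker='# Source: trident-operator/templates/podsecuritypolicy.yaml'
  sep='---\n'   # literal backslash-n, as stored inside the JSON string

  # Pass the payload through untouched if the marker is absent.
  if [[ $payload != *"$marker"* ]]; then
    printf '%s' "$payload"
    return 0
  fi

  head=${payload%%"$marker"*}   # everything before the first marker
  tail=${payload#*"$marker"}    # everything after the first marker
  tail=${tail#*"$sep"}          # drop the PSP body through its closing separator
  printf '%s' "$head$tail"
}
```

It could be used as strip_psp < decoded-release > patched-release (both file names hypothetical). Only the first matching block is removed, and a block with no ---\n after it is not handled, so it is worth checking the output before re-encoding.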

#

So far this method of patching the Helm release works great and is easy to go through. One cluster is upgraded and happy... I will try a more scripted patch on the others tomorrow and then tell you how it went.

#

Thank you for all your help.... I wish I asked about helm much earlier.... everyday is a school day 🙂

devout storm
#

OK, all clusters now upgraded and happy and these are the steps I used in a bash file of sorts:

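#!/usr/bin/env bash
# Usage: pass the namespace and the Helm release name as arguments
set -euo pipefail

NAMESPACE=$1
RELEASE=$2

# Name of the secret that holds the currently deployed Helm release
RELEASESECRET=$(kubectl -n "$NAMESPACE" get secret -l "owner=helm,status=deployed,name=$RELEASE" | awk '{print $1}' | grep -v NAME)

echo "Updating $RELEASESECRET"

# Dump the secret and keep an untouched backup copy
kubectl -n "$NAMESPACE" get secret "$RELEASESECRET" -o yaml > "$RELEASESECRET.yaml"
cp "$RELEASESECRET.yaml" "$RELEASESECRET.bak"
# The payload is base64-encoded twice (by Kubernetes and by Helm) and gzipped
grep -oP '(?<=release: ).*' "$RELEASESECRET.yaml" | base64 -d | base64 -d | gzip -d > "$RELEASESECRET.data.decoded"

# <here i tried some sed magic to no avail deleting the string above.... I'm sure more clever people here will have no trouble figuring it out>
# Edit $RELEASESECRET.data.decoded at this point to remove the PodSecurityPolicy block.

echo "Encoding $RELEASESECRET"

# -w0 on the last base64 keeps the value on a single line for the sed below
NEWRELEASE=$(gzip -c "$RELEASESECRET.data.decoded" | base64 | base64 -w0)

sed -i "s/ release:.*/ release: $NEWRELEASE/" "$RELEASESECRET.yaml"

kubectl -n "$NAMESPACE" apply -f "$RELEASESECRET.yaml"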

#########
All of the above worked like a charm on all 4 clusters and is pretty simple to implement as a fix. Not sure what the point of keeping release history is if those releases cannot be reverted to, but...

And with all that I will now stop rambling and just want to say 'uge thank you for the help and the insights! THANK YOU!