I've got some wild issues that support can't fully explain yet. While they are working it out, thought I check the community's pulse.
We replaced our under powered AFF-A400s with a set of A900s. We started to experience controller crashes (only one side, thankfully) every time a Rubrik NAS protection SLA would run. It was like clockwork, the SLA kicks off at 1:00 AM, boom, node goes down. Wild part (1) is that when the work fails over to the partner controller, it completes without issue. Hell, it's done before the first controller is back. Since it was new, we requested a replacement controller. Wild part (2) the replacement controller has the exact same failure state, right when that SLA kicks off...
...or at least it did. Now it works just fine. I have had 3 peaceful nights of sleep with no failures. I'm optimistic that they will continue. However, we have made no changes to our Rubrik or NetApp environments. At this point, support has been looking at the ASUPs and core dumps. They haven't asked for or made any changes.
Anyway, anyone out there with weird Rubrik/NetApp crashes that just magically went away?