#┊・ransomware
Thanks Matt!
Since ransomware solution vendors (like NetApp with ARP or Cloud Secure, or ProLion CryptoSpike, Cleondris SnapGuard, ...) usually provide only test scripts and not the real deal, I'm asking myself: Have you ever had a case where a NetApp ransomware solution actually noticed/blocked a real ransomware attack? (I know you can't provide names.)
We sold SnapGuard quite often last year and still have no real proof of it defending against in-the-wild malware (not that I would wish it on my customers... I'm just curious 😅)
You also have to remember that detection is only part of the job.
Is taking NetApp Snapshots of datastores containing Oracle DB VMs a valid approach to ransomware protection and recovery?
Are you storing the /u0x mount points separately?
(or is this Oracle for Windows?)
Don't understand your first question. Sorry, and please explain.
It is Oracle for Linux
The default naming conventions for Oracle DB’s are like /u01, /u02, etc.
Where are your binaries, db files, redo logs, etc stored? Locally or on a separate set of volumes on a NAS?
We didn't use that default naming, unfortunately.
We have mixed environments. On some DB servers, all those files you mentioned are local. On the others, they are mounted separately on different NFS volumes. Please explain for both situations.
I haven’t done much with it since the good ol’ 10g/11g days, but I don’t think much has changed fundamentally. I’m gonna include @fierce blade here just in case.
The problem with snapshotting the whole VM when vols are remote is that you need to quiesce the DB before snapshotting, either way. So a pre/post ALTER DATABASE OPEN/CLOSED needs to be issued. Otherwise you’re gonna have dirty DB backups. In most instances, with a good redo log setup, this can be sufficient, but personally I wouldn’t trust it.
The Oracle plugin for SnapCenter can automate a lot of this for you
"vols are remote" meaning these files, binaries, db files, redo logs etc are stored on separated NFS volumes?
Excellent. So you basically need TWO separate distinct plans. One for all of the volumes containing data that the db attaches to (we can take care of this with SnapCenter), and one for the VMs holding the Oracle config and installation. Something like Veeam or the VMware plugin for SnapCenter.
If the DB VM itself contains binaries, all DB files, redo logs, etc, SnapShotting the VM could be valid for recovery?
you still need to put the DB into a consistent state, so a pre and post ALTER DATABASE CLOSED > snap VM > ALTER DATABASE OPEN set of scripts should accompany it. But theoretically, yes. I would test the hell out of it.
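The ordering described above (put the DB into a consistent state, snapshot, resume) can be sketched as a small wrapper that always runs the post step, even if the snapshot fails. This is only a sketch: the three callables are hypothetical placeholders, and what they actually execute (SQL statements, SnapCenter, a hypervisor API) is deliberately left open.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(pre_step, post_step):
    """Run pre_step, hand control to the caller, and always run post_step."""
    pre_step()
    try:
        yield
    finally:
        # Runs even when the snapshot step raises, so the DB is never left quiesced.
        post_step()


def consistent_snapshot(pre_step, take_snapshot, post_step):
    """Hypothetical helper: pre-script -> snapshot -> post-script, in that order."""
    with quiesced(pre_step, post_step):
        return take_snapshot()
```

The point of the try/finally is that a failed snapshot should not leave the database in its quiesced state. Test the real pre/post scripts thoroughly before relying on anything like this.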
If you're doing that for a production workload, we should have a different conversation. 🙂
Current Oracle versions are better at recovering from crash-consistent backups than in the "good ol' 10g/11g" days, but the snapshot of all the volumes should be consistent. There are ways in current versions of ONTAP to make "Consistency Group" (CG) snapshots without backup software. That would get you a crash-consistent backup as if the plug were pulled on the entire datacenter. Personally, I'm still old school, so I prefer the application approach to backups using something like SnapCenter to snapshot the database (datafiles and archive logs) and a corresponding application volume snapshot to capture the rest of the volumes required (control, redo, binaries).
Hi, I have a customer who has enabled ARP on their ONTAP system running 9.11.1P3. They have a few questions about the behavior of the ARP feature.
1. They have an old ARP snapshot from January 6. Shouldn't these snapshots get deleted automatically after 5 days?
2. After the learning period, ARP found 1821(!!) unique file extensions in a single volume, and some of the extensions listed were very long. This doesn't look right, so I wonder if there are any known issues with how ARP determines file extensions?
3. On several occasions an ARP snapshot has been taken but nothing shows up under Events in System Manager or in the volume Security tab, so the sysadmin is not notified. Isn't some kind of notification expected even if the attack probability is low?
Any comments on these observations? Any improvements in newer releases of 9.11.1?
Took me a second too - not Address Resolution Protocol 😆
Autonomous Ransomware Protection 😉
Hi Hammer, I'd suggest starting with this KB article. It should address most of this. https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/Understanding_Anti-ransomware_protection_attacks_and_the_anti-ransomware_snapshot
Anti-ransomware protection is protection against Anti-ransomware solutions, right? 😆
Hi Matt,
related question here: on 9.12.1 our SysAdmin switched on ARP (after 30 days of dry-run) and it almost immediately created an anti-ransomware snapshot. He asked why...?
I went through the documentation and the man pages (and the KB above), but it does not anywhere mention this behavior.
There was no relevant event in the event log, neither was any attack recognized:
FLHAM-Cluster::> security anti-ransomware volume show -instance -volume volflanehh
Vserver Name: FLHAM-cifs
Volume Name: volflanehh
State: enabled
Dry Run Start Time: -
Attack Probability: none
Attack Timeline: -
Number of Attacks: -
Snapshot also cannot be (easily) deleted:
FLHAM-Cluster::> snapshot delete -volume volflanehh -snapshot Anti_ransomware_backup.2023-03-09_0921 -vserver FLHAM-cifs
Error: command failed: This Snapshot copy is currently used as a reference Snapshot copy by Job IDs: anti_ransomware. Deleting the Snapshot copy can fail the jobs.
I do suspect "normal" behavior, this snapshot being used as a "reference". But if so, it should be documented and logged in the event log IMHO.
Hi @wet comet, sorry for the late reply, I've been on PTO. An ARP snapshot can be generated without an alert. Basically, ARP takes the snapshot right away if it thinks an attack is occurring, but then it analyzes the data, and if it doesn't find an attack, no alert is generated. The snapshot sticks around for a little while (configurable duration), but then it can be deleted. The KB article I posted above goes over the options to change the duration if you need to.
In case you haven't seen it yet, some announcements were made today: https://www.netapp.com/blog/ransomware-recovery-guarantee/
That seems like it was an easy guarantee to make :) Snapshots still behave like snapshots, and SnapLock is as SnapLock does.
One of the big announcements is that it's no longer extra cost
The ONTAP license changes do seem to also be gathering a lot of attention for sure
You're not wrong. It's something NetApp customers have enjoyed for decades now. We decided to formalize it. I was an extensive user of what-we-know-today as SnapLock Compliance in the public healthcare space wayyyy back in 2008. PII regulatory archives, Malpractice legal holds, you name it. With the rise of ransomware, having an immutable destination made a whole lot of sense to leverage SnapLock functionality for that. It's SnapLock as we've always known and loved it, finally getting some love again. It's just maybe got a new bumper sticker slapped on it. 😎 (oh, and everybody gets it included free now with ONTAP One)
When will the PowerShell module (data.ontap) support anti-ransomware with all the functions available in ONTAP (Observed File Types, Volume's Workload Characteristics, ...)?
Just to confirm/clarify, are you asking when the NetApp PowerShell Toolkit will support Autonomous Ransomware Protection (aka on-box anti-ransomware)? This toolkit? https://mysupport.netapp.com/site/tools/tool-eula/ontap-powershell-toolkit. If so, is it that it doesn't support it at all, or that only certain cmdlets aren't supported? While it's not a product I cover as Product Manager, I know the PM who does and can mention it to him if you can help clarify what the gaps are for me. Thanks.
It would be nice if we could also get some information about the learned statistics (observed file types, surge statistics, new file types). That could be useful if we later want to create an FPolicy (as part of automation).
Is the on-box ransomware feature recommended for NFS datastores? Can't find any documentation that mentions vSphere.
It's not aware of files inside the VMDKs. If that's where you're going with the question.
Not thinking it would be that granular. A single file within a VM folder however
If it's a file on the storage itself, ONTAP will be able to do things against it. However, files within files won't happen. ONTAP isn't that sophisticated.
No performance impact? Not looking to cover files within a VM.
Not sure on perf impact.
(ironically with me being a perf TSE :D)
I believe it is minimal for detecting it.
There's minimal impact.
Roger that, appreciate
Please post this in #1062048885847117935 . This is the channel to discuss ransomware.
Ohhk..thank.
You said playbook, so I assume you’re referring to an Ansible playbook
Check page 7 here where it goes into expected perf impact (which is minimal as mentioned): https://docs.netapp.com/us-en/ontap/pdfs/sidebar/Ransomware_protection.pdf
Is it possible to use multi admin verification not only for local users on the storage system but also for AD-Users with domain-tunnel?
Hans, please post this question in the #1062049169520476220 support forums channel
Hi, I need some assistance designing ONTAP-based backup storage, preferably FAS, with anti-ransomware protection enabled, where the backup software is from Veeam. What would be the best practices around deploying ONTAP? I have seen ExaGrid being positioned in almost all the places where Veeam is running. ARP works with NAS, but Veeam would prefer block, using ReFS on top for efficiency. Can SnapLock be enabled if an FC/iSCSI LUN is mounted? And what about backup retention vs. SnapLock retention?
So you are right that ARP is not the right fit for Veeam. For Veeam backups, the goal is to make sure they do not change after they have been created. SnapLock is your answer there. It does support volumes with LUNs, so after the Veeam backup completes, create a Tamperproof Snapshot to hold the backup. Now if you want EXTRA protection, do a SnapMirror transfer to a SnapLock volume and lock that recovery point for your desired retention. That will protect your recovery points.
I want our IT service people to manage ARP low attack probability with System Manager, following KB How_to_create_a_custom_role_and_user_for_ONTAP_System_Manager_in_ONTAP_9.7_and_later. Volume modification works fine, so I added this: "security login rest-role create -vserver xxx -role xxx -api /api/security/anti-ransomware/suspects -access read_modify". Unfortunately the user is still not authorized to access api/security/anti-ransomware/suspects/...
I want to say REST may require more permissions.
We are planning on testing ONTAP Anti-ransomware Protection. Is there a way we can have scripts like simulate_create.sh and simulate_attack.sh to create multiple files and encrypt the files on Linux clients?
You can do that. I've triggered low attack probabilities in the lab by quickly creating a bunch of files with random extensions on a fresh install / volume.
How and where could you trigger "low attack probabilities"?
What did you mean by "a fresh install/volume"?
I am sorry, but I have no clue what you are talking about.
I did this via an NFS client. I simply ran a bunch of touch commands back-to-back, creating random/nonsensical file names and extensions. It took a bit of time but eventually ONTAP flagged it as a low attack probability for my NFS volume that had ARP enabled.
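A minimal sketch of the "bunch of touch commands" approach described above, assuming the ARP-enabled NFS volume is already mounted on the client. The function name, the default file count, and the extension length are illustrative assumptions, not a NetApp-provided test script, and there is no guarantee that this triggers a detection on any given volume.

```python
import random
import string
from pathlib import Path


def create_random_files(target_dir, count=1000, ext_len=6, seed=None):
    """Create `count` empty files with random names and random extensions.

    Roughly equivalent to running many `touch` commands back-to-back.
    `target_dir` is assumed to be a directory on the mounted test volume.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    created = []
    for i in range(count):
        stem = "".join(rng.choices(alphabet, k=12))
        ext = "".join(rng.choices(alphabet, k=ext_len))
        # The index prefix guarantees that names never collide.
        path = target / f"{i:06d}_{stem}.{ext}"
        path.touch()
        created.append(path)
    return created
```

Example call, with a hypothetical mount point: `create_random_files("/mnt/arp_test", count=5000)`. Only run this against a lab volume.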
What did you mean by "a fresh install/volume"?
I mean that the cluster was newly re-initialized. I thought I'd mention it because it seems like it might be easier to trigger an attack probability on a cluster/volume with no history, but I am not positive that it matters.
creating random/nonsensical file names and extensions
In other words, normal EDA workloads 🙂
Can you please provide some examples as for what random/nonsensical file names should look like? Thanks!
@sharp rock @harsh relic Could you please give me some real examples about what these file names should look like?
These are actual file or directory names - yes, including special characters.
kittg2smp.top.vcs64:sci_top_test.all_traffic+regr_normal_mode+smoke_test+hacks:00000002
smc.lp5_u_fp_v.vcs64:memory_tagging.mtc_invd_corner_case:130c9724
svlib_dpi_imports.svh#inc__svlib_private_base_pkg__211#.tdc
smc_regr_kt2_evt2_asym_chan_smc_3484_bug_hunting.132.b83805160bc0f14eL - it's quite common to have a hash as the "file type"
I've also seen UTF-8 file names 😦
Does anyone have a good way of monitoring ransomware alerts on NetApp systems? A PowerShell script, or some other good way of getting all the info? Right now I'm only getting an AutoSupport email that gives no useful information about why it's sending me an alarm, so I have to look in the cluster to find out what the message is about (in this case a ransomware alert).
Built-in File System Analytics (FSA) displays some things, or you can use the functionality in Cloud Insights monitoring suite.
Will look into that!
I enabled ARP (learning mode) on a single volume at one of our sites. You recommend a 30-day learning phase. How or where can I view this activity? Only using CI, or also FSA? Thanks.
Does FSA cause any additional load on your stack/volume, e.g. an IOPS increase? Thanks.
The initial scan does, depending on the number of directories and files. Later on there should not be much latency increase.
https://docs.netapp.com/us-en/ontap/file-system-analytics/considerations-concept.html#performance-considerations
https://kb.netapp.com/onprem/ontap/Performance/High_or_fluctuating_latency_after_turning_on_NetApp_ONTAP_File_System_Analytics
ONTAP 9.14.1 improved on that.
@digital pond Where did you see 9.14.1 improved on it? I swear I missed that communication.
If you have a crapton (usually >10b inodes) of inodes in a FlexVol, FSA on slightly older releases can really bog down the system to make it unusable on the initial pass.
With 9.14.1 you can monitor the progress of the initialization scan but I think it was also mentioned that the initial scan will throttle if clients need the IO. Which did not happen with earlier versions.
Will need to find the slides tomorrow.
Also mentioned here: https://docs.netapp.com/us-en/ontap/file-system-analytics/considerations-concept.html#scan-considerations
Ah ok.
When you enable capacity analytics, ONTAP conducts an initialization scan for capacity analytics. The scan accesses metadata for all files in volumes for which capacity analytics is enabled. No file data is read during the scan. Beginning in ONTAP 9.14.1, you can track the progress of the scan with the REST API, in the Explorer tab of System Manager, or with the volume analytics show CLI command. If there is a throttling event, ONTAP provides a notification.
I knew they were making some changes to the performance of FSA on large inode volumes in 9.13 or 9.14, but I'm glad to see that is out.
I'm not sure how it works so I'll wait to update the KB.
Oh yeah it did improve it by adding a throttling enhancement and reason. Ok I'll update the KB. Thanks OG1.
@brazen dune https://mysupport.netapp.com/site/bugs-online/product/ONTAP/BURT/1583697 --> fixed today in 9.13.1P7
Is there a way to determine why ARP creates snapshots? We enabled it a while ago after a dry run of a few weeks and now have a bunch of volumes that constantly create new ARP snapshots, although there are no observed surges or false positives to be cleared.
The algo thinks some sort of anomaly happened, so it creates a snapshot. Most times these are low "attack probabilities" which resolve themselves once ONTAP determines "oh, false alarm". IIRC they get deleted after two days and at most there are 6x ARP snapshots per volume. Simply ignore them; there is no way around it if you want to use ARP. (Or wait for ARP/AI, which should be much more accurate.)
The problem is they won't be deleted. Some volumes have 2 daily ARP snapshots from the last 3 days, for the max of 6, and some others just have snapshots over 10 days old lying around that can't be deleted (we left the SVM options for retention on default, so it should be 5 days max).
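One way to spot such leftovers is to parse the timestamp embedded in the snapshot name. This is a rough sketch that assumes ARP snapshot names follow the pattern seen earlier in this thread (`Anti_ransomware_backup.2023-03-09_0921`) and uses the 5-day retention mentioned above. How the names are collected, and which time zone the embedded timestamp uses, are not covered here; the function name is made up for illustration.

```python
from datetime import datetime, timedelta

# Name prefix taken from the snapshot name quoted earlier in the thread.
ARP_PREFIX = "Anti_ransomware_backup."


def stale_arp_snapshots(snapshot_names, now, max_age_days=5):
    """Return ARP snapshot names whose embedded timestamp is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for name in snapshot_names:
        if not name.startswith(ARP_PREFIX):
            continue  # not an ARP snapshot
        stamp = name[len(ARP_PREFIX):]
        try:
            taken = datetime.strptime(stamp, "%Y-%m-%d_%H%M")
        except ValueError:
            continue  # unexpected name format, skip rather than guess
        if taken < cutoff:
            stale.append(name)
    return stale
```

Feeding it the snapshot list per volume would give a list of candidates to raise with support, rather than something to delete blindly.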
As we bill our customers by used space this is a bit annoying, and we couldn't fix this by prolonging the learning period. Idk, it would be super helpful to see in the event logs why those snapshots were created in the first place.
Have you checked if there are volumes with an attack-probability higher than "none"?
security anti-ransomware volume show -vserver * -volume * -attack-probability !none
Also what ONTAP-version are you on? You can configure that an event is being triggered once a snapshot has been created but I think that's only in 9.14.1.
https://docs.netapp.com/us-en/ontap-cli-9141/security-anti-ransomware-volume-event-log-show.html
yeah those affected volume have none and also no observed surges
If possible at least update to the latest patch-versions of 9.12.1 or 9.13.1. They included quite some ARP-related bugfixes.
we're at 9.13p4. I think I will disable ARP for those volumes for now. We keep 30-90 daily locked snapshots anyway :/
They were fixed with 9.13.1P7, check the bugfixes on the download-page.
It includes some of my reported bugs.
oh okay that's good to know thanks 👍
yeah, ARP is mainly helpful for the detection part, so you get notified that "something" has happened.
Of course if you have your Tamperproof snapshots you can simply restore from them. But since they're only daily you might miss some encrypted files.
true
tbh I wouldn't care about a few snapshots if there was a field in the volume object for used space with snapshots but without ARP snapshots 😅 I would probably just create an alert in grafana when >10% of the volume space is consumed by ARP snapshots or something like that.
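The threshold check itself is simple once you have per-snapshot sizes from somewhere. A sketch, assuming you can get (name, size in bytes) pairs per volume from whatever exporter or API you already use; the function names are hypothetical, and summing snapshot sizes is only an approximation of the space you would actually reclaim.

```python
# Name prefix taken from the ARP snapshot name quoted earlier in the thread.
ARP_PREFIX = "Anti_ransomware_backup"


def arp_snapshot_share(snapshots, volume_size_bytes):
    """Fraction of the volume size consumed by ARP snapshots.

    `snapshots` is an iterable of (name, size_bytes) pairs; how they are
    collected (CLI, REST, exporter) is left open here.
    """
    if volume_size_bytes <= 0:
        raise ValueError("volume_size_bytes must be positive")
    arp_bytes = sum(size for name, size in snapshots if name.startswith(ARP_PREFIX))
    return arp_bytes / volume_size_bytes


def should_alert(snapshots, volume_size_bytes, threshold=0.10):
    """True when ARP snapshots use more than `threshold` of the volume."""
    return arp_snapshot_share(snapshots, volume_size_bytes) > threshold
```

In Grafana the same ratio would be expressed as a query over the exported metrics; the 10% default just mirrors the number mentioned above.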
Hi all, what is the recommended approach in the current situation - to enable VSCAN or ARP/ Workload Security for Malware protection on ONTAP?
All three are not mutually exclusive, and they use different detection mechanisms, so you can use all of them if you really want to. The only one that is free of charge (included in your ONTAP licenses) is ARP, though. You don't need to pay for the Vscan feature, but most likely for your 3rd-party virus scanner.
There’s a thread for this over here:
https://discord.com/channels/855068651522490400/1216313556044222504
Hello all, I'm curious how you tackle the alerts / warnings associated with ARP. In larger environments, you can have quite a few volumes that have alerts. Do you work with your respective security teams? Reach out to the app owners of the data in question?
ARP has no additional cost?
referring to Autonomous Ransomware Protection (ARP) ?
I thought that it required the license specifically for this to be enabled
ONTAP 9.11.1 and later License required : Anti_ransomware
or is there something else that doesn't require a license?
Correct, It’s part of ONTAP one license. If you can get ONE for your system you can get arp
right, One, not just ONTAP
Does no response imply that others are having the same issue? The problem with the quantity of alerts is that they become overwhelming and are then possibly ignored.
In my experience you need to tweak it especially in the beginning. The aim needs to be that once you do that for some weeks and allow many false positives the alerts will/should decrease.
For some volumes you need to adjust some parameters, for others you need to disable some of the features, and for others ARP simply does not fit.
Also I would really suggest to update to a current version of ONTAP (last 1-2 months) because many bugs are fixed which should help when you're trying to decide if it's a false positive or a real attack.
Ultimately you need to handle each and every alert if you're taking it seriously. And yes, I know that's time-consuming. I have customers who also ended up simply ignoring the alerts for some volumes. 🤷
Also checking and allowing alerts for like 50 volumes is really not fun with System Manager. You're jumping around from page to page and it takes forever reloading again. The sorting of the alerts is also nonexistent (it sorts by vol-UUID instead by alert date or something 😑). A simple page which includes all necessary information and where you could handle one alert after the other would be really helpful.
If it's too much, disable ARP for volumes with high number of false positives and wait for ARP/AI which should be better in that regard.
Looks like NetApp is aiming to "solve" this issue with the new ransomware dashboard in BlueXP: https://docs.netapp.com/us-en/bluexp-ransomware-protection/index.html
At least the information on the alert page seems to be more helpful there. It's currently in Public Preview.
I totally agree with that. We tested ARP with 9.11.1 and came to the conclusion that it works fine for a small company with one or two volumes, but for a provider it is not feasible. We will have another look later this year.
Hopefully ARP AI will solve a lot of these issues: https://www.netapp.com/blog/first-enterprise-storage-with-ai-powered-ransomware-detection/
SE Labs ® tested NetApp ONTAP Autonomous Ransomware Protection with AI against a range of ransomware attacks designed to extort victims
Download the PDF with this link
"Overall, NetApp ONTAP Autonomous Ransomware Protection with AI provided 99% detection of the advanced ransomware attacks. It detected 3,585 of the 3,635 attacks, earning a AAA award."
Finally an independent test! 🎉
Have you looked at something like ProLion yet?
ProLion CryptoSpike uses active FPolicy, so it actively stops activity and blocks a user before they can do harm. I have multiple customers using it.
@gloomy harbor Cloud Insights - Workload Security - is the product from NetApp to detect Insider threats like ransomware. Easy to implement, based on AI/ML, integrates ONTAP ARP.
you mean Data Infrastructure Insights Storage Workload Security?
Marketing was busy again 🙄
Yeah, exactly 😃
There's no "snap" or "flex", marketing fail
TBH I like the new name much more. 'Cloud' Insights was sometimes misleading for an infrastructure observability product. But that's a different story 😀
I like CI, but our size makes CI prohibitive.
Looking at options to automate ransomware reporting on false positives to our security team. At the moment you can generate a report per volume to a share on a volume. Surely there's a better way to forward reports automatically? I would imagine going to 9.15.1 will help lower the number of alerts, but I'm not in a position to do that yet. I'm on 9.14.1P9. The email alerts don't specify the details at present. They just say ransomware attack, lol.
Is it possible to sync a NetApp array with an NTP server temporarily to initialize the ComplianceClock and then disable synchronization later?
Also is there any documentation or article on this?
Sorry, misunderstood your question.
Yes, you can configure NTP and then initialize the SCC and later remove NTP config. The System Compliance Clock is initialized once and cannot be changed afterwards (outside of the scenarios documented below), so we don't much "care" if you change how the regular system clock behaves after initializing.
https://docs.netapp.com/us-en/ontap/snaplock/initialize-complianceclock-task.html
The docs don't exactly cover that specific scenario, but explain how the SCC behaves.
As with any SnapLock configuration, recommend testing in the lab / virtual appliance before applying to production.
If you were maybe thinking about this command, that's only valid for the Compliance clock in ONTAP Select: https://docs.netapp.com/us-en/ontap-cli/snaplock-compliance-clock-ntp-modify.html