#changing existing volume language to utf8mb4
it's actually sort of an extension of the utf8 encoding, so nothing really changes for what's already there... anything that fits in 1 to 3 bytes is encoded exactly the same, the system can just use a 4th byte if needed
"forward compatible", you might say... and with that, there is also no switching it back
it can impact you when you have old NFS clients that don't speak proper UTF-8, that's why the NetApp KB says to open a case before switching. In all other cases it has no impact
it has nothing to do with UTF-16 though, UTF-16 is never used anywhere in ONTAP (and that's good, because UTF-16 was an abomination)
basically, utf8mb4 allows encoding characters outside the BMP, i.e. in the supplementary planes of Unicode: the SMP (plane 1), which contains emoji, and the SIP (plane 2), which contains the rarer CJK ideographs. For the BMP, 3 bytes are enough and NetApp probably thought "gee, 65536 characters should be enough for everyone!" when they implemented UTF-8, even though even back then it was pretty clear that the additional planes would have to be used at some point. Then everyone was surprised when code points that need 4 bytes in UTF-8 came around (MySQL had the same realization, they also arbitrarily limited UTF-8 text to max. 3 bytes per code point)
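if you want to see the 3-byte vs. 4-byte split for yourself, here's a rough Python sketch. It only shows how standard UTF-8 behaves, it doesn't talk to ONTAP at all; the helper names (`utf8_len`, `plane`, `needs_utf8mb4`) and the sample filenames are made up for illustration

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character takes in UTF-8."""
    return len(ch.encode("utf-8"))


def plane(ch: str) -> int:
    """Unicode plane of a character: 0 = BMP, 1 = SMP, 2 = SIP, ..."""
    return ord(ch) >> 16


def needs_utf8mb4(name: str) -> bool:
    """True if the name contains any code point outside the BMP,
    i.e. anything that needs a 4-byte UTF-8 sequence."""
    return any(ord(ch) > 0xFFFF for ch in name)


if __name__ == "__main__":
    samples = {
        "A": "A",                      # ASCII
        "euro sign": "\u20ac",         # BMP
        "emoji": "\U0001F600",         # SMP (plane 1)
        "CJK ext. B": "\U00020000",    # SIP (plane 2)
    }
    for label, ch in samples.items():
        print(f"{label}: U+{ord(ch):04X} plane {plane(ch)}, {utf8_len(ch)} byte(s)")

    # hypothetical filenames, just to exercise the check
    for name in ("report_\u20ac.txt", "party_\U0001F600.txt"):
        print(name.encode("unicode_escape").decode(), needs_utf8mb4(name))
```

running something like `needs_utf8mb4` over a directory listing would be one way to find out whether you have any names that a 3-byte-only volume language can't represent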
if you hate your data, use nfsv2