I have written a Java method that takes a file as input and computes its MD5 hash.
The issue is that I need to run this code on two different machines, one Windows and one Linux.
On Windows the file is encoded as UTF-16, whereas on Linux it is encoded as UTF-8.
The contents of the file are the same, but because the encodings differ, the underlying bytes differ, so the MD5 hash computed for each file is different. My aim is to compare the content of the files character by character.
Here is my current Java method:
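To illustrate why the hashes differ, here is a small sketch that hashes the same string after encoding it two ways. The class name `EncodingHashDemo` and the helper `md5Of` are made up for this illustration, and `UTF_16LE` is only an assumption about what the Windows file might contain (a real file may also start with a byte order mark).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class EncodingHashDemo {

    // Hypothetical helper: MD5 of the bytes produced by encoding `text` in `charset`.
    static String md5Of(String text, Charset charset) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(text.getBytes(charset));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        String text = "hello";
        // Same characters, different byte sequences:
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length);    // prints 5
        System.out.println(text.getBytes(StandardCharsets.UTF_16LE).length); // prints 10
        // MD5 is computed over bytes, so the two digests do not match.
        System.out.println(md5Of(text, StandardCharsets.UTF_8));
        System.out.println(md5Of(text, StandardCharsets.UTF_16LE));
    }
}
```

Since MD5 only ever sees bytes, any byte-level hash of the two files will disagree even when the decoded text is identical.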
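```java
import java.io.FileInputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// The method signature was not included in the snippet; the name md5Hash is a placeholder.
public static String md5Hash(String filename) throws IOException, NoSuchAlgorithmException {
    try (FileInputStream fis = new FileInputStream(filename)) {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // Read the file's raw bytes and update the message digest
        byte[] buffer = new byte[8192];
        int bytesRead;
        while ((bytesRead = fis.read(buffer)) != -1) {
            md.update(buffer, 0, bytesRead);
        }
        // Get the MD5 hash
        byte[] md5Bytes = md.digest();
        // Convert the byte array to a two-digit-per-byte hexadecimal string
        StringBuilder sb = new StringBuilder();
        for (byte md5Byte : md5Bytes) {
            sb.append(Integer.toString((md5Byte & 0xff) + 0x100, 16).substring(1));
        }
        return sb.toString();
    }
}
```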
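
One possible direction, sketched under assumptions rather than offered as a definitive fix: decode each file with its own charset and hash the decoded characters instead of the raw bytes. The class `TextHash` and method `md5OfText` are hypothetical names. The sketch assumes the encoding of each file is known in advance, skips a leading byte order mark if the decoder leaves one in the text, and does not normalize line endings (so `\r\n` versus `\n` would still produce different hashes).

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class TextHash {

    // Hypothetical helper: MD5 over the decoded characters of a text file.
    // Each char is fed to the digest as two bytes (high byte first), so the
    // result does not depend on how the file was encoded on disk.
    static String md5OfText(Path file, Charset fileCharset)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (Reader reader = Files.newBufferedReader(file, fileCharset)) {
            char[] buffer = new char[8192];
            boolean atStart = true;
            int charsRead;
            while ((charsRead = reader.read(buffer)) != -1) {
                for (int i = 0; i < charsRead; i++) {
                    char c = buffer[i];
                    if (atStart) {
                        atStart = false;
                        if (c == '\uFEFF') {
                            continue; // skip a byte order mark left in the text
                        }
                    }
                    md.update((byte) (c >> 8));
                    md.update((byte) c);
                }
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Usage (assumed): java TextHash <file> <charset name, e.g. UTF-8 or UTF-16>
        System.out.println(md5OfText(Paths.get(args[0]), Charset.forName(args[1])));
    }
}
```

With this approach the Windows file would be hashed with `UTF-16` and the Linux file with `UTF-8`, and the two results can then be compared. Note that the value is not the same as the output of a byte-level tool such as `md5sum`; it is only meaningful when both sides compute it the same way.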