I am looking for a way to compare two files (especially large ones) in the same S3 bucket using the AWS SDK for Java.
I do not need to check the whole bucket for duplicates; as I understand it, Athena would be a good fit for finding all duplicates in a bucket. I only need to compare two specific files (objects) in S3 and nothing else.
Is there a better way than downloading the data locally? I know that I can compare the MD5 hashes, but even if the hashes match, I would still need to download both files and compare them to be sure they are really identical. Downloading two large files from S3 is pretty inefficient.
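
A minimal sketch of the download-and-compare fallback described above, assuming both objects are already available as `InputStream`s (for example from the SDK's `getObject` call, which is not shown here). `StreamCompare` and `sameContent` are hypothetical names, and the in-memory streams in `main` are stand-ins for real S3 data. It compares the content while streaming, so nothing has to be written to local disk, but every byte of both objects is still transferred.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamCompare {

    // Returns true only if both streams yield exactly the same byte sequence.
    // Both streams are closed when the method returns.
    static boolean sameContent(InputStream a, InputStream b) {
        try (InputStream x = new BufferedInputStream(a);
             InputStream y = new BufferedInputStream(b)) {
            int p;
            int q;
            do {
                p = x.read();
                q = y.read();
                // Covers differing bytes and one stream ending before the other.
                if (p != q) {
                    return false;
                }
            } while (p != -1);
            return true;
        } catch (IOException e) {
            // Assumption: callers prefer an unchecked exception on I/O failure.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In-memory stand-ins for the two S3 object streams.
        byte[] first = "same content".getBytes(StandardCharsets.UTF_8);
        byte[] second = "same content".getBytes(StandardCharsets.UTF_8);
        byte[] third = "other content".getBytes(StandardCharsets.UTF_8);

        System.out.println(sameContent(new ByteArrayInputStream(first),
                                       new ByteArrayInputStream(second)));
        System.out.println(sameContent(new ByteArrayInputStream(first),
                                       new ByteArrayInputStream(third)));
    }
}
```

Comparing object sizes and hashes first would let a check like this exit early for objects that clearly differ; the streaming comparison is only needed when those match.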