I used Google Takeout to download all of my photos from Google Photos and realized that Google compresses these images 2-3x to give me free storage. This is great, but a lot of my images are stored at original size. Unfortunately, both 'google-compressed' images and 'original-high quality' images are stored with the jpg extension. I am wondering how to figure out which is which? Does Google add metadata tags to identify if they have been recompressed?
2 Answers
Google does add some tags to images that it recompresses, including images stored as "High Quality", which may be downsampled to 16MP or less. Images stored at "Original" quality appear to be kept unaltered. At this time, the following tags appear to be added or altered:
- XMPToolkit = XMP Core 5.5.0
- ImageUniqueID
The following command may list images that have been altered by Google:
exiftool -if '($XMPToolkit =~ /^XMP\ Core\ [\.\d]+$/) and ($ImageUniqueID)' -s2 -q -FilePath -ext jpg .
Some caveats
The command may include images not altered by Google. Other programs may use the same, or similar, XMPToolkit strings that Google does, especially if they happen to use the same image-writing library that Google does. For instance, GIMP uses "XMP Core 4.4.0-Exiv2". Photoshop uses "Adobe XMP Core 5.3-c011 66.145661, 2012/02/06-14:56:27" (as noted by StarGeek).
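As a rough illustration of why the pattern in the command is anchored at both ends, here is a minimal sketch that applies an equivalent pattern to the three XMPToolkit strings mentioned in this answer. The function name classify_xmptoolkit and the labels "candidate" and "other" are made up for this example, and grep -E stands in for the Perl regex that exiftool evaluates.

```shell
#!/bin/sh
# Hypothetical helper (not part of exiftool): classify an XMPToolkit string.
# Prints "candidate" when the whole string is "XMP Core <dotted version>",
# the form this answer associates with Google, and "other" otherwise.
classify_xmptoolkit() {
    if printf '%s\n' "$1" | grep -Eq '^XMP Core [.0-9]+$'; then
        echo candidate
    else
        echo other
    fi
}

classify_xmptoolkit 'XMP Core 5.5.0'          # prints: candidate
classify_xmptoolkit 'XMP Core 4.4.0-Exiv2'    # prints: other (GIMP)
classify_xmptoolkit 'Adobe XMP Core 5.3-c011 66.145661, 2012/02/06-14:56:27'  # prints: other (Photoshop)
```

A "candidate" result only means the string has the same shape as the one seen on Google-processed files; another program that writes a bare "XMP Core x.y.z" string would match too.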
The command may miss images altered by Google. This depends on how Google has changed their image processing over the years. For instance, it's not known (to me) when Google started using the ImageUniqueID tag. So some images may not have it set.
There are other tags that may be altered by Google, but they are not reliable to check because many JPEG images have them, including those straight from my camera (FujiFilm X-T20):
- JPEGDigest
- YCbCrSubSampling
Other options
You may also guess whether images have been altered by comparing file sizes or using tools like jpegjudge.
If you know the typical size of a jpeg from your camera, you would then be able to tell which images had additional compression added simply by their file size.
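A minimal sketch of the file-size approach, assuming you have already picked a byte threshold below which a JPEG from your camera would be unusually small. The function name list_small_jpegs and the threshold are hypothetical; a small file is only a hint, since simple scenes also compress well straight out of the camera.

```shell
#!/bin/sh
# Hypothetical helper: print "<size><TAB><path>" for each JPEG in a directory
# whose size in bytes is below a threshold that you choose yourself.
list_small_jpegs() {
    dir=$1
    threshold=$2
    for f in "$dir"/*.jpg "$dir"/*.JPG; do
        [ -f "$f" ] || continue            # skip glob patterns that matched nothing
        size=$(( $(wc -c < "$f") ))        # byte count; arithmetic strips padding
        if [ "$size" -lt "$threshold" ]; then
            printf '%s\t%s\n' "$size" "$f"
        fi
    done
}

# Example call (the 2000000-byte threshold is an arbitrary placeholder):
#   list_small_jpegs . 2000000
```

The loop only looks at one directory and does not recurse, which keeps the sketch short; a Takeout archive with nested album folders would need a call per folder.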
... XMPToolKit tag had a value of XMP Core 5.5.0. You could list just those images with this command: exiftool -if "$XMPToolkit eq 'XMP Core 5.5.0'" -filename -ext jpg . (Reverse double/single quotes if on Linux/Mac) – StarGeek Aug 28 '18 at 23:24
... $XMPToolkit=~/^XMP Core/ to ignore version numbers – StarGeek Aug 28 '18 at 23:33
... Adobe XMP Core 5.3-c011 66.145661, 2012/02/06-14:56:27 in that field. This can be used to filter out files which have definitely not been re-compressed. – StarGeek Aug 28 '18 at 23:39
... JPEGDigest and YCbCrSubSampling will also be changed by the recompression. This does not guarantee that a file was recompressed, but can be used to filter out files that were not recompressed. – StarGeek Aug 28 '18 at 23:47
... YCbCr4:2:0 (2 2) and Unknown and didn't start that way? Quite possible, my test sample consisted only of the three images that were on the HuffingtonPost article. – StarGeek Aug 28 '18 at 23:57
... exiftool output, Google also appears to add ImageUniqueID – xiota Aug 29 '18 at 00:08