#Google Takeout / Partner Sharing / google-photos-exif / Duplicates


tender geyser
#

My wife and I have had Partner Sharing turned on in Google Photos for years, so her images are in my account and mine are in hers. But it was not always like that. My question: if I use https://github.com/mattwilson1024/google-photos-exif along with the immich API to upload my photos, is either of them going to account for duplicate photos?

GitHub

A tool to populate missing DateTimeOriginal EXIF metadata in Google Photos takeout, using Google's JSON metadata.

balmy solar
#

In gphotos you have your photos on your account and she has her photos in hers. Partner sharing just means you can see each other's photos

tender geyser
#

I mean this option

#

I had it set to save all photos

tawny scroll
#

Auto save would make a copy of the file in your account, so each google takeout would include a copy of the photo. Immich only enforces unique photos per user, so it would let you both upload the same image individually. If you want to avoid that, you will probably need to do some extra work to sort out and exclude photos from each account (before uploading them to immich)

tender geyser
#

what if I upload both hers and mine to my immich account

tawny scroll
#

Yeah, you could upload all of them to your account and that would most likely result in not having duplicates. I'm not 100% sure if both versions of the image have the same hash, but I would hope and expect them to.

tender geyser
#

Does it use an MD5 hash to decide whether something counts as a duplicate?

#

I can manually check a bunch to make sure

#

before I try it

tawny scroll
#

Pretty much, it does sha1 actually
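
For anyone who wants to spot-check this before uploading, one option is to hash the files from both takeouts locally and see which ones match. The sketch below assumes two extracted takeout folders and uses SHA-1 only because that is the hash mentioned above. The function names are made up for illustration, and this is not immich's own code.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_shared_files(dir_a: Path, dir_b: Path) -> list[tuple[Path, Path]]:
    """Pair files from dir_a and dir_b whose contents have the same SHA-1."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(dir_a.rglob("*")):
        # Skip the .json sidecar files; only media bytes matter here.
        if path.is_file() and path.suffix.lower() != ".json":
            by_hash[sha1_of_file(path)].append(path)

    pairs = []
    for path in sorted(dir_b.rglob("*")):
        if path.is_file() and path.suffix.lower() != ".json":
            for match in by_hash.get(sha1_of_file(path), []):
                pairs.append((match, path))
    return pairs
```

Calling `find_shared_files(Path("takeout-mine"), Path("takeout-hers"))` (hypothetical folder names) returns the byte-identical pairs. A pair that looks the same but is missing from the result would be uploaded as two separate photos.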

#

I think I have the same problem, at least for a period of time, with my wife's account. I haven't looked into it yet though. I'm hoping there is some metadata in the json export that will make it possible to figure out who was the original taker of the photo.

tender geyser
#

will that hash be the same after google-photos-exif does its thing on both takeouts?

tawny scroll
#

I've not used that personally. I assume it actually writes exif data back to the file, which would change the hash. So it depends if it writes the same data to both files or not.
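
Whether a processing step actually changed a file's bytes can be checked directly by hashing the input and output copies. This is a minimal sketch, assuming the output folder is flat and keeps the original file names (as in the run shown later in this thread). `files_changed_by_processing` is a hypothetical helper, not part of google-photos-exif.

```python
import hashlib
from pathlib import Path


def sha1_hex(path: Path) -> str:
    """SHA-1 of the whole file (reads it into memory; fine for a spot check)."""
    return hashlib.sha1(path.read_bytes()).hexdigest()


def files_changed_by_processing(input_dir: Path, output_dir: Path) -> list[str]:
    """Names of output files whose bytes differ from the same-named input file."""
    # If the same name appears in several input subfolders, the last one found wins.
    originals = {p.name: p for p in input_dir.rglob("*") if p.is_file()}
    changed = []
    for out in sorted(output_dir.iterdir()):
        source = originals.get(out.name)
        if out.is_file() and source is not None and sha1_hex(out) != sha1_hex(source):
            changed.append(out.name)
    return changed
```

An empty result means the processed copies still hash the same as the originals. Any name in the list was rewritten, so its hash no longer matches the untouched takeout file.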

#

I've just heard google takeout is a huge pain/mess.

tender geyser
#

yes 😦

#

Should never have done auto-save lol

tawny scroll
#

I'm dreading mine.

#

I have auto-saved some of my wife's. I've uploaded both originals and lower quality versions. I think that might be true for the auto-saved ones as well. Then I have originals backed up to immich, which means I need to reconcile originals vs compressed there as well.

#

Luckily I have never renamed any of my files, I'm hoping I can use that somewhat reliably to detect duplicates and auto-pick the better resolution one.

tender geyser
#

Good luck

tawny scroll
#

In immich we use a hash because it's the only way to guarantee uniqueness without accidentally skipping a picture, which could result in data loss. In reality, depending on the files, you might be able to employ smarter de-duplication methods, like file name + modified date, camera models, etc.
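
A file-name-based pass could look roughly like the sketch below. It assumes files were never renamed, treats the same name (case-insensitively) as the same photo, and uses file size as a stand-in for "better resolution", which is only a rough proxy. The extension list is the one google-photos-exif prints in the scan output later in this thread; `pick_largest_per_name` is a made-up name.

```python
from collections import defaultdict
from pathlib import Path

MEDIA_EXTENSIONS = {".jpeg", ".jpg", ".heic", ".gif", ".mp4", ".png", ".avi", ".mov"}


def pick_largest_per_name(root: Path) -> dict[str, Path]:
    """Group media files by lowercase file name and keep the largest copy of each."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in MEDIA_EXTENSIONS:
            groups[path.name.lower()].append(path)
    # Assumption: the biggest file is the original rather than a compressed copy.
    return {
        name: max(paths, key=lambda p: p.stat().st_size)
        for name, paths in groups.items()
    }
```

Unlike a hash, this can merge two different photos that happen to share a name, so it is safer to review the groups than to delete anything automatically.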

tender geyser
#

I might look into checking file names before I upload

#

I guess if I merge both takeouts together, it might get some of them

#

looking at the JSON from a few of these files, I don't see a "photo taken by" property

tender geyser
#

In the JSON object there is a googlePhotosOrigin property that has another property in it, fromPartnerSharing

#

I might be able to go through and delete these out

#

"googlePhotosOrigin": {
  "fromPartnerSharing": {}
}

#

other files have this

#

"googlePhotosOrigin": {
  "mobileUpload": {
    "deviceFolder": {
      "localFolderName": ""
    },
    "deviceType": "ANDROID_PHONE"
  }
}

#

I'll have to do some investigating, but I might be able to run a script to see if that property is in the JSON and delete the JSON/image from the folder
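
A script along those lines might look like the following sketch. It assumes each media file sits next to a sidecar named `<file>.json` (for example `3.jpg` and `3.jpg.json`, as in the directory listing later in this thread) and that `fromPartnerSharing` sits directly under `googlePhotosOrigin`, as in the snippets above. The function names are hypothetical, and the code only lists matches instead of deleting anything.

```python
import json
from pathlib import Path


def is_partner_shared(sidecar: Path) -> bool:
    """True if the sidecar JSON has googlePhotosOrigin.fromPartnerSharing."""
    try:
        metadata = json.loads(sidecar.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # unreadable or invalid JSON: treat as not partner shared
    if not isinstance(metadata, dict):
        return False
    origin = metadata.get("googlePhotosOrigin")
    return isinstance(origin, dict) and "fromPartnerSharing" in origin


def find_partner_shared(root: Path) -> list[tuple[Path, Path]]:
    """Return (media, sidecar) pairs for files saved from partner sharing."""
    pairs = []
    for sidecar in sorted(root.rglob("*.json")):
        media = sidecar.with_suffix("")  # "3.jpg.json" -> "3.jpg"
        if media.is_file() and is_partner_shared(sidecar):
            pairs.append((media, sidecar))
    return pairs
```

Moving the returned pairs into a separate folder rather than deleting them keeps the step reversible if the property turns out to be missing or different on some files.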

#

another option that might work is in Google Photos itself: you can actually search for the name of the device in the search box

#

so If I type Google Pixel 6 Pro

#

it brings up all the photos that were taken by that phone, because it's in the info of the photo on the website

#

so I could go through and delete them and then re-export

#

we have never had the same device

tender geyser
#

I added code into a forked repo of google-photos-exif

#

➜ google-photos-exif git:(master) ✗ yarn start --inputDir /Users/seion/Documents/forks/google-photos-exif/test/Takeout --outputDir /Users/seion/Documents/forks/google-photos-exif/test/output --errorDir /Users/seion/Documents/forks/google-photos-exif/test/error --excludePartnerSharingMedia

#
--- Scan complete, found: ---
0 files with extension .jpeg
3 files with extension .jpg
0 files with extension .heic
0 files with extension .gif
0 files with extension .mp4
0 files with extension .png
0 files with extension .avi
0 files with extension .mov
--- Processing media files ---
--- Partner shared media will be excluded ---
Copying file 0 of 3: /Users/seion/Documents/forks/google-photos-exif/test/Takeout/Google Photos/20230513_120534.jpg -> 20230513_120534.jpg
Copying file 1 of 3: /Users/seion/Documents/forks/google-photos-exif/test/Takeout/Google Photos/3.jpg -> 3.jpg
Skipping file 2 of 3 because it was partner shared: /Users/seion/Documents/forks/google-photos-exif/test/Takeout/Google Photos/PXL_20230219_193430224.jpg
--- Finished processing media files: ---
0 files with extension .jpeg
3 files with extension .jpg
0 files with extension .heic
0 files with extension .gif
0 files with extension .mp4
0 files with extension .png
0 files with extension .avi
0 files with extension .mov
--- 1 partner shared files were excluded ---
--- The file modified timestamp has been updated on all media files ---
--- We did not edit EXIF metadata for any of the files. This could be because all files already had a value set for the DateTimeOriginal field, or because we did not have a corresponding JSON file. ---
Done 🎉
✨  Done in 0.82s.
#
20230513_120534.jpg 3.jpg
➜  google-photos-exif git:(master) ✗ ls test/Takeout/Google\ Photos
20230513_120534.jpg             3.jpg                           PXL_20230219_193430224.jpg
20230513_120534.jpg.json        3.jpg.json                      PXL_20230219_193430224.jpg.json
➜  google-photos-exif git:(master) ✗
#

added a flag for --excludePartnerSharingMedia

#
➜  google-photos-exif git:(master) ✗ yarn start -h
yarn run v1.22.19
$ ./bin/run -h
Takes in a directory path for an extracted Google Photos Takeout. Extracts all photo/video files (based on the conigured list of file extensions) and places them into an output directory. All files will have their modified timestamp set to match the timestamp specified in Google's JSON metadata files (where present). In addition, for file types that support EXIF, the EXIF "DateTimeOriginal" field will be set to the timestamp from Google's JSON metadata, if the field is not already set in the EXIF metadata.

USAGE
  $ google-photos-exif

OPTIONS
  -e, --errorDir=errorDir       (required) Directory for any files that have bad EXIF data - including the matching metadata files
  -h, --help                    show CLI help
  -i, --inputDir=inputDir       (required) Directory containing the extracted contents of Google Photos Takeout zip file
  -o, --outputDir=outputDir     (required) Directory into which the processed output will be written
  -v, --version                 show CLI version

  --excludePartnerSharingMedia  Include this parameter if you do not want the output to contain media that was saved from partner sharing 
                                in google phots

✨  Done in 0.75s.
➜  google-photos-exif git:(master) ✗
#

I'll do some more testing and run a bunch of checks to make sure that the following structure is accurate for all the photos I have

"googlePhotosOrigin": {
  "fromPartnerSharing": {}
}

"googlePhotosOrigin": {
  "mobileUpload": {
    "deviceFolder": {
      "localFolderName": ""
    },
    "deviceType": "ANDROID_PHONE"
  }
}

tender geyser
#

seems to be the same JSON format all the way back to when photos were stored in Drive; I checked photos going all the way back to 2012 or later for me
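
One way to run that kind of consistency check across a whole takeout, rather than by eye, is to tally which keys appear under `googlePhotosOrigin` in every sidecar. This is a sketch under the assumption that all `.json` files under the folder are photo sidecars. `tally_origins` and the `<...>` bucket labels are invented for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def tally_origins(root: Path) -> Counter:
    """Count the keys found under googlePhotosOrigin across all sidecar JSON files."""
    counts: Counter = Counter()
    for sidecar in root.rglob("*.json"):
        try:
            metadata = json.loads(sidecar.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            counts["<unreadable>"] += 1
            continue
        origin = metadata.get("googlePhotosOrigin") if isinstance(metadata, dict) else None
        if not isinstance(origin, dict) or not origin:
            counts["<no origin>"] += 1
            continue
        for key in origin:
            counts[key] += 1
    return counts
```

Any key other than the ones seen above (`fromPartnerSharing`, `mobileUpload`), or a large `<no origin>` count, would be a sign that some files need a closer look before relying on the flag.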

tawny scroll
#

Oh wow, nice

tawny scroll
#

You should open a PR against the original repo

tender geyser
#

I want to see if there are any comments about the code before I do