My wife and I have had Partner Sharing turned on in Google Photos for years, so her images are on mine and mine on hers. But it was not always like that. My question is: if I use https://github.com/mattwilson1024/google-photos-exif along with the immich API to upload my photos, is either of these going to account for duplicate photos?
#Google Takeout / Partner Sharing / google-photos-exif / Duplicates
In gphotos you have your photos on your account and she has her photos in hers. Partner sharing just means you can see each other's photos
Auto save would make a copy of the file in your account, so each google takeout would include a copy of the photo. Immich only enforces unique photos per user, so it would let you both upload the same image individually. If you want to avoid that, you will probably need to do some extra work to sort out and exclude photos from each account (before uploading them to immich)
what if I upload both hers and mine to my immich account
Yeah, you could upload all of them to your account and that would most likely result in not having duplicates. I'm not 100% sure if both versions of the image have the same hash, but I would hope and expect them to.
does it check md5 hash as counting it as a duplicate?
I can manually check a bunch to make sure
before I try it
Pretty much, it does sha1 actually
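For spot-checking a batch before uploading, here is a minimal Python sketch that hashes the media files in two extracted Takeout folders with SHA-1 and reports which hashes appear in both. The function names and the two-folder layout are assumptions for illustration. It only shows whether the bytes are identical; it does not call immich or reproduce its upload logic.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_hash(root: Path) -> dict[str, list[Path]]:
    """Map SHA-1 digest -> files under root, skipping Takeout's .json sidecars."""
    hashes: dict[str, list[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() != ".json":
            hashes.setdefault(sha1_of_file(path), []).append(path)
    return hashes


def find_shared_hashes(dir_a: Path, dir_b: Path) -> dict[str, tuple[list[Path], list[Path]]]:
    """Return every digest present in both folders, with the files that carry it."""
    a, b = index_by_hash(dir_a), index_by_hash(dir_b)
    return {digest: (a[digest], b[digest]) for digest in a.keys() & b.keys()}
```

Calling `find_shared_hashes(Path("takeout-mine"), Path("takeout-hers"))` (hypothetical folder names) lists the byte-identical pairs. Running it on the raw Takeout folders and again on processed output would show whether any metadata rewriting changed the bytes.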
I think I have the same problem, at least for a period of time, with my wife's account. I haven't looked into it yet though. I'm hoping there is some metadata in the json export that will make it possible to figure out who was the original taker of the photo.
will that hash be the same after google-photos-exif does its thing on both takeouts?
I've not used that personally. I assume it actually writes exif data back to the file, which would change the hash. So it depends if it writes the same data to both files or not.
I've just heard google takeout is a huge pain/mess.
I'm dreading mine.
I have auto saved some of my wife's. I've uploaded both originals and lower quality versions. I think that might be true for the auto-saved ones as well. Then I have originals backed up to immich, which means I need to reconcile originals vs compressed there as well.
Luckily I have never renamed any of my files, I'm hoping I can use that somewhat reliably to detect duplicates and auto-pick the better resolution one.
Good luck
In immich we use a hash because it's the only way to guarantee uniqueness and avoid accidentally skipping a picture, which could result in data loss. In reality, depending on the files, you might be able to employ smarter de-duplication methods, like file name + modified date, camera models, etc.
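A rough Python sketch of the file-name approach, under stated assumptions: copies of the same photo keep the same file name, and the larger file is the better-quality one. File size is only a stand-in for resolution here, and two unrelated photos can share a name, so the result is a list of candidates to review rather than a safe delete list. The extension set mirrors the one google-photos-exif reports in its scan output; the function name is made up for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Extensions taken from the google-photos-exif scan summary.
MEDIA_EXTENSIONS = {".jpeg", ".jpg", ".heic", ".gif", ".mp4", ".png", ".avi", ".mov"}


def pick_largest_per_name(roots: list[Path]) -> dict[str, Path]:
    """For each media file name found under any root, keep the largest copy."""
    candidates: dict[str, list[Path]] = defaultdict(list)
    for root in roots:
        for path in sorted(root.rglob("*")):
            if path.is_file() and path.suffix.lower() in MEDIA_EXTENSIONS:
                # Compare names case-insensitively (IMG_1.JPG vs img_1.jpg).
                candidates[path.name.lower()].append(path)
    # Assumption: the biggest file is the original-quality copy.
    return {
        name: max(paths, key=lambda p: p.stat().st_size)
        for name, paths in candidates.items()
    }
```

Passing both extracted takeouts as `roots` gives one chosen file per name. Adding the modified date or camera model to the grouping key would make accidental name collisions less likely.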
I might look into checking file names before I upload
I guess if I merge both takeouts together, it might get some of them
I am having this issue now too
https://discord.com/channels/979116623879368755/1109830963065786378
looking at the JSON from a few of these files, I don't see a "photo taken by" property
In the JSON object there is a googlePhotosOrigin property that has another property in it, fromPartnerSharing
I might be able to go through and delete these out
"googlePhotosOrigin": {
  "fromPartnerSharing": {
  }
}
other files have this
"googlePhotosOrigin": {
  "mobileUpload": {
    "deviceFolder": {
      "localFolderName": ""
    },
    "deviceType": "ANDROID_PHONE"
  }
}
I'll have to do some investigating, but I might be able to run a script to see if that property is in the JSON and delete the JSON/image from the folder
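A minimal Python sketch of such a script, assuming each sidecar is named `<media file>.json` (for example `3.jpg.json` next to `3.jpg`) and that partner-shared items carry a `fromPartnerSharing` key under `googlePhotosOrigin`, as in the fragment above. The function names are hypothetical, and it defaults to a dry run so nothing is deleted until the list has been checked.

```python
import json
from pathlib import Path


def is_partner_shared(metadata: dict) -> bool:
    """True when googlePhotosOrigin contains a fromPartnerSharing key.

    The key's value is an empty object, so test for presence, not truthiness.
    """
    origin = metadata.get("googlePhotosOrigin")
    return isinstance(origin, dict) and "fromPartnerSharing" in origin


def find_partner_shared(takeout_dir: Path) -> list[tuple[Path, Path]]:
    """Return (sidecar, media) pairs for items saved from partner sharing."""
    matches = []
    for sidecar in sorted(takeout_dir.rglob("*.json")):
        try:
            metadata = json.loads(sidecar.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # Not a readable metadata file; leave it alone.
        if isinstance(metadata, dict) and is_partner_shared(metadata):
            # Assumption: "photo.jpg.json" describes "photo.jpg" in the same folder.
            matches.append((sidecar, sidecar.with_suffix("")))
    return matches


def remove_partner_shared(takeout_dir: Path, dry_run: bool = True) -> list[Path]:
    """List (and, when dry_run is False, delete) partner-shared media and sidecars."""
    affected = []
    for sidecar, media in find_partner_shared(takeout_dir):
        for path in (media, sidecar):
            if path.exists():
                if not dry_run:
                    path.unlink()
                affected.append(path)
    return affected
```

Calling `remove_partner_shared(Path("Takeout"))` returns what would be removed; passing `dry_run=False` actually deletes. Working on a copy of the Takeout folder is the safer choice, since a sidecar naming scheme that differs from the assumption above would make the script miss or mismatch files.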
another option that might work is in Google Photos itself: you can actually search the name of the device in the search box
so if I type Google Pixel 6 Pro
it brings up all the photos that were taken by that phone, because it's in the info of the photo on the website
so I could go through and delete them and then re-export
we have never had the same device
I added code into a forked repo of google-photos-exif
➜ google-photos-exif git:(master) ✗ yarn start --inputDir /Users/seion/Documents/forks/google-photos-exif/test/Takeout --outputDir /Users/seion/Documents/forks/google-photos-exif/test/output --errorDir /Users/seion/Documents/forks/google-photos-exif/test/error --excludePartnerSharingMedia
--- Scan complete, found: ---
0 files with extension .jpeg
3 files with extension .jpg
0 files with extension .heic
0 files with extension .gif
0 files with extension .mp4
0 files with extension .png
0 files with extension .avi
0 files with extension .mov
--- Processing media files ---
--- Partner shared media will be excluded ---
Copying file 0 of 3: /Users/seion/Documents/forks/google-photos-exif/test/Takeout/Google Photos/20230513_120534.jpg -> 20230513_120534.jpg
Copying file 1 of 3: /Users/seion/Documents/forks/google-photos-exif/test/Takeout/Google Photos/3.jpg -> 3.jpg
Skipping file 2 of 3 because it was partner shared: /Users/seion/Documents/forks/google-photos-exif/test/Takeout/Google Photos/PXL_20230219_193430224.jpg
--- Finished processing media files: ---
0 files with extension .jpeg
3 files with extension .jpg
0 files with extension .heic
0 files with extension .gif
0 files with extension .mp4
0 files with extension .png
0 files with extension .avi
0 files with extension .mov
--- 1 partner shared files were excluded ---
--- The file modified timestamp has been updated on all media files ---
--- We did not edit EXIF metadata for any of the files. This could be because all files already had a value set for the DateTimeOriginal field, or because we did not have a corresponding JSON file. ---
Done 🎉
✨ Done in 0.82s.
20230513_120534.jpg 3.jpg
➜ google-photos-exif git:(master) ✗ ls test/Takeout/Google\ Photos
20230513_120534.jpg 3.jpg PXL_20230219_193430224.jpg
20230513_120534.jpg.json 3.jpg.json PXL_20230219_193430224.jpg.json
➜ google-photos-exif git:(master) ✗
added a flag for --excludePartnerSharingMedia
➜ google-photos-exif git:(master) ✗ yarn start -h
yarn run v1.22.19
$ ./bin/run -h
Takes in a directory path for an extracted Google Photos Takeout. Extracts all photo/video files (based on the conigured list of file extensions) and places them into an output directory. All files will have their modified timestamp set to match the timestamp specified in Google's JSON metadata files (where present). In addition, for file types that support EXIF, the EXIF "DateTimeOriginal" field will be set to the timestamp from Google's JSON metadata, if the field is not already set in the EXIF metadata.
USAGE
$ google-photos-exif
OPTIONS
-e, --errorDir=errorDir (required) Directory for any files that have bad EXIF data - including the matching metadata files
-h, --help show CLI help
-i, --inputDir=inputDir (required) Directory containing the extracted contents of Google Photos Takeout zip file
-o, --outputDir=outputDir (required) Directory into which the processed output will be written
-v, --version show CLI version
--excludePartnerSharingMedia Include this parameter if you do not want the output to contain media that was saved from partner sharing
in google phots
✨ Done in 0.75s.
➜ google-photos-exif git:(master) ✗
I'll do some more testing and a bunch of checks to make sure that the following structure is accurate for all the photos I have
"googlePhotosOrigin": {
  "fromPartnerSharing": {
  }
}

"googlePhotosOrigin": {
  "mobileUpload": {
    "deviceFolder": {
      "localFolderName": ""
    },
    "deviceType": "ANDROID_PHONE"
  }
}
seems to be the same JSON format all the way back to when photos were stored in Drive; I checked photos going all the way back to 2012 or later for me
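One way to run that check across a whole export is to tally which keys appear under `googlePhotosOrigin` in every sidecar. The Python sketch below does that. The placeholder labels for unreadable files and for files without the property are made up for illustration, and only `fromPartnerSharing` and `mobileUpload` are known from the samples above, so any other key that shows up is worth inspecting by hand.

```python
import json
from collections import Counter
from pathlib import Path


def tally_origins(takeout_dir: Path) -> Counter:
    """Count the keys found under googlePhotosOrigin across all .json sidecars."""
    counts: Counter = Counter()
    for sidecar in sorted(takeout_dir.rglob("*.json")):
        try:
            metadata = json.loads(sidecar.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            counts["<unreadable>"] += 1
            continue
        origin = metadata.get("googlePhotosOrigin") if isinstance(metadata, dict) else None
        if not isinstance(origin, dict) or not origin:
            # Property missing or empty: origin unknown for this file.
            counts["<no googlePhotosOrigin>"] += 1
            continue
        for key in origin:
            counts[key] += 1
    return counts
```

Printing `tally_origins(Path("Takeout")).most_common()` gives a quick overview. A large "no googlePhotosOrigin" count would mean the property cannot be relied on to exclude every partner-shared file.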
Oh wow, nice
You should open a PR against the original repo
I want to see if there are any comments about the code before I do