#Immich is incorrectly marking photos as duplicates during fresh install

stark pine
#

I recently had to reinstall Immich and re-upload all my photos. However, the upload terminates before all the photos have been uploaded. I had previously uploaded 20k photos, but am only able to import 15k. The rest are refusing to upload (likely because they are being treated as duplicates). I am uploading using the CLI. I don't believe the 5k photos are duplicates, as that is a pretty large number. There are no errors in my server or microservices logs; the last lines just tell me the server has started.

marsh patio
#

Can you show the screenshots when the upload completed?

stark pine
marsh patio
#

It has finished uploading those to the server

stark pine
#

It has finished uploading, but it hasn't actually uploaded the photos

#

When I try running it again it says that 6879 photos still need to be uploaded

marsh patio
#

so those got rejected by the server because of duplication

stark pine
#

I have two folders, a fresh one containing the photos I am now uploading to immich and an old one containing the photos I previously uploaded to immich

#

I am uploading the images into the fresh one from the old directory

#

Technically, the photo count in both should be the same. The old one contains all the photos uploaded to Immich, so it should contain no duplicates. The new one is where the photos uploaded from my old directory end up

#

Unless there was a significant change in detecting duplicates, I don't think 25% of the photos I previously uploaded to immich would now be marked as duplicates

marsh patio
stark pine
#

When I run something like fdupes, it doesn't find any duplicates. Is there another way you guys are finding duplicates?

marsh patio
#

just by calculating the hash of the files
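A hash-based check like the one described here can be reproduced locally to see whether any files in the source directory have identical bytes. The sketch below is only an illustration, not Immich's actual code: the thread does not name the hash algorithm, and SHA-1 is assumed because the 40-hex-digit checksum quoted later in this thread has the length of a SHA-1 digest. The function names `file_sha1` and `group_by_hash` are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha1(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()  # assumption: SHA-1, inferred from the checksum length
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_hash(root: Path) -> dict[str, list[Path]]:
    """Map each digest to the files under root that share it (groups of 2+ only)."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_sha1(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `group_by_hash(Path("/path/to/old/photos"))` (a placeholder path) returns an empty dict when no two files have the same content. Files with different names but identical bytes end up in the same group.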

#

Can you check the server logs when it is uploading from the CLI?

#

If there are duplicates, it will have messages like this

stark pine
#

Both microservices and server logs don't print anything

#

Do I need to change the log verbosity?

marsh patio
#

It should print from the postgres container

stark pine
#

Note that some photos may be shared between users: User A may have some photos that User B has also uploaded, although I'm not sure whether duplicates are considered across users.

marsh patio
#

Duplicates are only considered within a single user

#

Are you a developer?

stark pine
#

Yes, although I'm not super familiar with docker or postgres

marsh patio
#

If you want to make sure, you can pull up the database and check one of the assets based on its reported checksum value

lime kettleBOT
marsh patio
#

This section

#

Choose one of the reported checksums, find the createdAtDate, and check for that photo on the timeline

#

and then check your directory again to see whether you are re-uploading it
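The directory check suggested above can be scripted. The sketch below takes a reported checksum and lists every local file whose content hashes to it. It assumes the checksum is a SHA-1 digest of the file bytes, which the thread does not confirm, and `sha1_of` and `find_by_checksum` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Hex SHA-1 of a file, read in 1 MiB chunks to keep memory flat."""
    digest = hashlib.sha1()  # assumption: the reported checksum is SHA-1
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_by_checksum(root: Path, checksum: str) -> list[Path]:
    """Return files under root whose SHA-1 equals the reported checksum."""
    wanted = checksum.lower()
    if wanted.startswith("\\x"):  # Postgres prints bytea values with a \x prefix
        wanted = wanted[2:]
    return [p for p in sorted(root.rglob("*")) if p.is_file() and sha1_of(p) == wanted]
```

If this returns more than one path, the same bytes exist under several names in the source directory, which would be consistent with the server rejecting the later copies.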

stark pine
#

So in something like this

#

2023-10-17 02:34:58.586 UTC [18846] DETAIL: Key ("ownerId", "libraryId", checksum)=(9cee2165-ed52-4cde-8ba0-317c934f37b9, 18c94438-777d-437d-8068-c8c718401fbf, \x4f17d99a4978d4b89ffdd134c5ef47bcfd7a1d3d) already exists

#

9cee.., 18c9.., and \x4f.. are 3 different checksums?

marsh patio
#

\x4f17d99a4978d4b89ffdd134c5ef47bcfd7a1d3d is the checksum

#

9c is the ownerId

#

18 is the libraryId
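To pull the three fields out of many such log lines at once, a small parser can help. This is a minimal sketch that assumes every duplicate-key DETAIL line has the same shape as the one quoted above; the pattern and the function name `parse_duplicate_detail` are illustrative and not part of Immich or Postgres.

```python
import re
from typing import Optional

# Matches: Key ("ownerId", "libraryId", checksum)=(<uuid>, <uuid>, \x<hex>)
DETAIL_PATTERN = re.compile(
    r'Key \("ownerId", "libraryId", checksum\)='
    r"\((?P<owner_id>[0-9a-f-]+), (?P<library_id>[0-9a-f-]+), \\x(?P<checksum>[0-9a-f]+)\)"
)


def parse_duplicate_detail(line: str) -> Optional[dict]:
    """Return ownerId, libraryId and checksum from a DETAIL line, or None."""
    match = DETAIL_PATTERN.search(line)
    return match.groupdict() if match else None


line = (
    '2023-10-17 02:34:58.586 UTC [18846] DETAIL: Key ("ownerId", "libraryId", checksum)='
    "(9cee2165-ed52-4cde-8ba0-317c934f37b9, 18c94438-777d-437d-8068-c8c718401fbf, "
    r"\x4f17d99a4978d4b89ffdd134c5ef47bcfd7a1d3d) already exists"
)
fields = parse_duplicate_detail(line)
print(fields)
```

The `checksum` field is returned without the `\x` prefix, which is the form used in the query mentioned in the next message.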

stark pine
#

So I ran the query using 4f17d99a4978d4b89ffdd134c5ef47bcfd7a1d3d and found one row. I'm guessing that means that it is indeed a duplicate?

marsh patio
#

yes sir