#Immich v1.106.4 faceDetection/face-detection Error: request failed with ECONNREFUSED :3003


dusky flint
#

Long time user and huge advocate of Immich! I have managed to avoid asking anyone for assistance up until this morning....

Though the issue I seem to be experiencing is not being actively discussed here - which gives me that feeling like perhaps I didn't do something properly.

To make troubleshooting easier, I simply deleted everything I had for previous Immich versions, pruned the docker system and rebuilt everything following the latest release notes + latest .env and latest docker-compose files.

Upon uploading a single picture the following error appears:

[Nest] 7 - 06/14/2024, 9:03:00 AM ERROR [Microservices:JobService] Unable to run job handler (faceDetection/face-detection): Error: Machine learning request to "http://immich_machine_learning:3003" failed with SocketError: other side closed

[Nest] 7 - 06/14/2024, 9:03:00 AM ERROR [Microservices:JobService] Error: Machine learning request to "http://immich_machine_learning:3003" failed with SocketError: other side closed

I have attempted to change that http address to point to the container's IP, the machine's IP, the localhost and all result in the same problem
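A quick way to check whether the address itself is the problem is to probe the port directly. The sketch below is only an illustration: `probe_tcp` is a hypothetical helper, and the host and port in the usage comment are taken from the error message above. Note that "other side closed" usually means the connection was accepted and then dropped, which is a different failure from the connection being refused, so a result of "open" here would suggest the address is fine and the service is dying mid-request.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP connection attempt to host:port.

    Returns one of "open", "refused", "timeout" or "unreachable".
    """
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on that port (what ECONNREFUSED reports)
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        # Covers DNS failures, no route to host, and similar errors
        return "unreachable"


# Example usage (assumed to be run from inside the Immich server container,
# where the compose service name resolves):
#   print(probe_tcp("immich_machine_learning", 3003))
```

If this reports "open" while jobs still fail, the machine-learning container's own log is the next place to look.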

Using all the latest files, without any previous versions setup on this server - what step am I likely missing in the documentation?

Note: This is a dedicated Immich server (Debian 12), no proxy, no external routing, it's primarily for testing as I slowly de-google my family. Point being, I've attempted to keep it as simple as possible with the most complex piece probably being my Nvidia card...though that works flawlessly.

timid plinthBOT
#

:wave: Hey @dusky flint,

Thanks for reaching out to us. Please follow the recommended actions below; this will help us be more effective in our support effort and leave more time for building Immich.

Checklist

  1. :ballot_box_with_check: I have verified I'm on the latest release (note that mobile app releases may take some time).
  2. :ballot_box_with_check: I have read applicable release notes.
  3. :ballot_box_with_check: I have reviewed the FAQs for known issues.
  4. :ballot_box_with_check: I have reviewed Github for known issues.
  5. :ballot_box_with_check: I have tried accessing Immich via local ip (without a custom reverse proxy).
  6. :ballot_box_with_check: I have uploaded the relevant logs, docker compose, and .env files using the buttons below or the /upload command.
  7. :ballot_box_with_check: I have tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable

(an item can be marked as "complete" by reacting with the appropriate number)

If this ticket can be closed you can use the /close command, and re-open it later if needed.

delicate orchid
#

Does machine-learning have any errors in the logs?

dusky flint
#

Actually its log looks pretty standard for what I've seen previously:

[06/14/24 12:54:38] ERROR Worker (pid:10) was sent code 139!
[06/14/24 12:54:38] INFO Booting worker with pid: 48
[06/14/24 12:54:41] INFO Started server process [48]
[06/14/24 12:54:41] INFO Waiting for application startup.
[06/14/24 12:54:41] INFO Created in-memory cache with unloading after 300s of inactivity.
[06/14/24 12:54:41] INFO Initialized request thread pool with 8 threads.
[06/14/24 12:54:41] INFO Application startup complete.
[06/14/24 13:03:00] INFO Setting 'ViT-B-32__openai' execution providers to ['CUDAExecutionProvider', 'CPUExecutionProvider'], in descending order of preference
[06/14/24 13:03:00] INFO Loading visual model 'ViT-B-32__openai' to memory
[06/14/24 13:03:02] ERROR Worker (pid:48) was sent code 139!
[06/14/24 13:03:02] INFO Booting worker with pid: 189
[06/14/24 13:03:05] INFO Started server process [189]
[06/14/24 13:03:05] INFO Waiting for application startup.
[06/14/24 13:03:05] INFO Created in-memory cache with unloading after 300s of inactivity.
[06/14/24 13:03:05] INFO Initialized request thread pool with 8 threads.
[06/14/24 13:03:05] INFO Application startup complete.

#

The code 139! occurs frequently but nothing that's acted as a blocker.
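To see how often the worker is actually dying, the crash lines can be pulled out of the machine-learning log. This is a minimal sketch that assumes the log format shown above; `find_worker_crashes` and the regular expression are illustrative and not part of Immich or gunicorn.

```python
import re

# Matches lines such as: "ERROR Worker (pid:10) was sent code 139!"
CRASH_RE = re.compile(r"Worker \(pid:(\d+)\) was sent code (\d+)")


def find_worker_crashes(log_text: str) -> list[tuple[int, int]]:
    """Return a (pid, code) pair for every worker-crash line in the log."""
    return [(int(pid), int(code)) for pid, code in CRASH_RE.findall(log_text)]


# Sample lines copied from the log above
sample = """\
[06/14/24 12:54:38] ERROR Worker (pid:10) was sent code 139!
[06/14/24 12:54:38] INFO Booting worker with pid: 48
[06/14/24 13:03:02] ERROR Worker (pid:48) was sent code 139!
"""

print(find_worker_crashes(sample))  # [(10, 139), (48, 139)]
```

In the log above, the second crash comes two seconds after the "Loading visual model" line, which points at model loading rather than at the network.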

#

I also would normally change the model - but for this I kept it all vanilla

delicate orchid
#

Can you try disabling hwaccel for machine-learning and seeing if it still happens?

#

The code 139 means it's segfaulting, which is definitely a problem
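For reference, the 139 comes from the common shell convention that an exit status of 128 + N means the process was killed by signal N, and signal 11 is SIGSEGV. A small sketch of that decoding follows, assuming the code in the log follows this convention; `describe_exit_code` is a hypothetical helper name.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode an exit status using the 128 + N 'killed by signal N' convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform
            return f"killed by unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited with status {code}"


print(describe_exit_code(139))  # killed by SIGSEGV (signal 11)
```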

dusky flint
#

Roger that - will update once I shut down all the nvidia related things

dusky flint
#

Update: Rebuilt again, this time removing all hwaccel.ml and transcoding references.
Uploaded a few tests and:

The great:

  • The kernel trap is gone. No more code 139, and the gunicorn[65321] general protection fault is no longer showing up
  • Uploads completed and logs appeared to be completely error free for all containers...

The meh:

  • Facial detection doesn't seem to be working any longer. People is empty, as is the 'Explore' section.
    I upped the logging a bit and saw a bunch of these zoom by for the pics uploaded:

[Nest] 7 - 06/14/2024, 11:35:49 AM DEBUG [Microservices:PersonService] 5 faces detected in upload/thumbs/cb420871-617f-471f-9e7e-bea77840be78/49/5c/495cfe6e-5852-419a-aed4-e80e82abf39b-preview.jpeg
[Nest] 7 - 06/14/2024, 11:35:54 AM DEBUG [Microservices:PersonService] Skipping facial recognition queueing because 31 jobs are already queued

#

Those were followed by a spamming of api calls - due to me browsing the web container and debug being thorough.

dusky flint
#

Ok - pushed a few thousand images in and now I am beginning to get facial recognition and parsing. Whew! Still error free...

I'm going to gut the nvidia install and walk through it all again to see if that helps allow hwaccel to work without filling up logs with segs

delicate orchid
#

Great! Just for completeness, what model is the GPU?

dusky flint
#

GeForce 1070-Ti
One of the ole' classics... i7, 32GB ram, 20TB and our family has an insane anti-google device. 🙂

#

sadly all this works fine in windows... but eeew