#Machine learning failing second instance


true quiver
#

I have a second instance that handles additional machine learning jobs, and I added its IP to the API instance, but I keep getting these random WARN logs:

[Nest] 253  - 04/17/2025, 11:56:35 AM    WARN [Microservices:MachineLearningRepository] Machine learning request to "http://192.168.1.226:3003" failed: fetch failed

The second instance's URL is http://192.168.1.226:3003, and when I visit that URL I get

{"message":"Immich ML"}

and these are my machine learning container settings

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, rocm, openvino, rknn] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino
    extends:
      # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: openvino # set to one of [armnn, cuda, rocm, openvino, openvino-wsl, rknn] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false
    ports:
      - 3003:3003
cinder carbon BOT
#

:wave: Hey @true quiver,

Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich.


Checklist

I have...

  1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
  2. :ballot_box_with_check: read applicable release notes.
  3. :ballot_box_with_check: reviewed the FAQs for known issues.
  4. :ballot_box_with_check: reviewed GitHub for known issues.
  5. :ballot_box_with_check: tried accessing Immich via local IP (without a custom reverse proxy).
  6. :ballot_box_with_check: uploaded the relevant information (see below).
  7. :blue_square: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable

(an item can be marked as "complete" by reacting with the appropriate number)

Information

In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:

  • Your docker-compose.yml and .env files.
  • Logs from all the containers and their status (see above).
  • All the troubleshooting steps you've tried so far.
  • Any recent changes you've made to Immich or your system.
  • Details about your system (both software/OS and hardware).
  • Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
  • The version of the Immich server, mobile app, and other relevant pieces.
  • Any other information that you think might be relevant.

Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Without the right information we can't work out what the problem is. Help us help you ;)

If this ticket can be closed you can use the /close command, and re-open it later if needed.

sullen kestrel
#

Have you made sure that the ML container is up to date on the remote machine? I would bring the stack down, prune, and bring it back up just to verify.

true quiver
#

Yeah, I am on the latest version, v1.131.3. I did the prune just in case, and it's still the same thing.

#

Also, I wanted to note that it doesn't always occur; sometimes it works fine and other times it doesn't.
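
Since the failures are intermittent, it may help to measure how often the remote endpoint is actually unreachable, independently of the Immich server. The following is only a minimal sketch: the URL is the one quoted earlier in this thread, while the function names, the 5-second interval, and the 60 attempts are arbitrary assumptions, not anything Immich provides.

```python
import time
import urllib.error
import urllib.request

# URL quoted in the thread; everything else here is an illustrative assumption.
ML_URL = "http://192.168.1.226:3003"


def probe_once(url: str, timeout: float = 5.0) -> tuple[bool, str]:
    """Send one GET request and return (ok, detail)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == 200, body.strip()
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, timeouts, and HTTP errors all end up here.
        return False, f"{type(exc).__name__}: {exc}"


def summarize(results: list[bool]) -> str:
    """Summarize probe outcomes as 'failed/total'."""
    failed = sum(1 for ok in results if not ok)
    return f"{failed}/{len(results)} probes failed"


def run(url: str = ML_URL, attempts: int = 60, interval: float = 5.0) -> None:
    """Probe the endpoint repeatedly and print a timestamped line per attempt."""
    results = []
    for _ in range(attempts):
        ok, detail = probe_once(url)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {'OK  ' if ok else 'FAIL'} {detail}")
        results.append(ok)
        time.sleep(interval)
    print(summarize(results))
```

Calling `run()` from the machine that hosts the API instance while ML jobs are queued would show whether the FAIL lines cluster around the times of the WARN entries. A healthy probe should print the same `{"message":"Immich ML"}` body seen in the browser.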

faint vine
#

Worker (pid:121) was sent SIGKILL! Perhaps out of memory?

#

How much memory is there?

#

Does it really never reach it, or does it spike to 6 GiB and then immediately die? 😛
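
A monitoring graph can miss a short spike, so one way to check whether the kernel really killed the worker for memory is to read the cgroup counters directly. This is a rough sketch that assumes cgroup v2 mounted at /sys/fs/cgroup inside the container; `memory.peak` may be missing on older kernels, the helper names are made up for illustration, and on an LXC host the limit that matters may be enforced one level up, where these files would not show it.

```python
from pathlib import Path
from typing import Optional

# Assumption: cgroup v2, with the container's own counters under this path.
CGROUP_DIR = Path("/sys/fs/cgroup")


def parse_memory_events(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def read_int(path: Path) -> Optional[int]:
    """Read one integer from a cgroup file; None if missing or non-numeric (e.g. 'max')."""
    try:
        raw = path.read_text().strip()
    except OSError:
        return None
    return int(raw) if raw.isdigit() else None


def report(cgroup_dir: Path = CGROUP_DIR) -> None:
    """Print the OOM kill counter plus the memory limit and peak, in bytes."""
    events_file = cgroup_dir / "memory.events"
    try:
        events = parse_memory_events(events_file.read_text())
    except OSError:
        print(f"{events_file} not readable (cgroup v1 or a different mount?)")
        return
    print("oom_kill count:", events.get("oom_kill", 0))
    print("memory.max:", read_int(cgroup_dir / "memory.max"))
    print("memory.peak:", read_int(cgroup_dir / "memory.peak"))
```

If `report()` shows the `oom_kill` count increasing after a "Worker ... was sent SIGKILL" message, memory is the likely cause even when the graph never shows the limit being reached.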

sullen kestrel
#

Also if you try without openvino, does it work?

true quiver
#

There is 6 GiB of memory, but it looks like usage never reaches that.

faint vine
#

I can't tell because the time is different 😛

#

Are you looking at average or peak?

true quiver
#

I am looking at peak (maximum), and the same thing happens without OpenVINO.

main ledge
#

Is this VM or LXC?

true quiver
#

This is LXC, and I don't think memory is the issue, since it happens even when I assign 30 GB to it.

main ledge
#

I understand you probably need an LXC for the passthrough, but that’s most likely related to the issue here.

true quiver
#

So you're saying that the issue is that I am using an LXC container instead of a VM?

faint vine
#

It may be; we don't know for sure. LXC containers have some quirks that make investigating troublesome and unrewarding, which is why we advocate VMs.