#Machine learning container start.sh failed

ruby swallow
#

Hey everyone, I’ve encountered an issue with Immich and could use some help. The smart search feature is not working, and when I checked the logs for the machine learning container, I found the following errors:

[FATAL tini (7)] exec ./start.sh failed: No such file or directory
[FATAL tini (7)] exec ./start.sh failed: No such file or directory
[FATAL tini (6)] exec ./start.sh failed: No such file or directory
[FATAL tini (8)] exec ./start.sh failed: No such file or directory

I decided to repull the machine learning container (because I update the Immich server this way), but I hadn’t updated any of the non-server containers since the initial setup in January 2025. After repulling the ML container, it stopped starting altogether, and I’m still seeing the same error messages in the logs.

I’m not super experienced with containers, so I’m a bit lost here. ChatGPT suggested recreating the ML container, but I’m afraid I might break something even more. Does anyone have advice on what I should do next? Thanks in advance!

barren mulchBOT
#

:wave: Hey @ruby swallow,

Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich.

Checklist

I have...

  1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
  2. :ballot_box_with_check: read applicable release notes.
  3. :ballot_box_with_check: reviewed the FAQs for known issues.
  4. :ballot_box_with_check: reviewed GitHub for known issues.
  5. :ballot_box_with_check: tried accessing Immich via local IP (without a custom reverse proxy).
  6. :blue_square: uploaded the relevant information (see below).
  7. :blue_square: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable

(an item can be marked as "complete" by reacting with the appropriate number)

Information

In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:

  • Your docker-compose.yml and .env files.
  • Logs from all the containers and their status (see above).
  • All the troubleshooting steps you've tried so far.
  • Any recent changes you've made to Immich or your system.
  • Details about your system (both software/OS and hardware).
  • Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
  • The version of the Immich server, mobile app, and other relevant pieces.
  • Any other information that you think might be relevant.

Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Without the right information we can't work out what the problem is. Help us help you ;)

If this ticket can be closed you can use the /close command, and re-open it later if needed.

full crow
#

You missed a breaking change from a while ago

barren mulchBOT
ruby swallow
full crow
#

Hmm

#

Can you post the info requested by the bot?

kindred sail
#

Where did you get your compose file @ruby swallow ?

ruby swallow
kindred sail
#

Marius likes to do things differently for whatever reason, best post your compose here and we'll see what the way forward is

ruby swallow
kindred sail
#

Actually looks fine to me, pulling all containers should get it back up

ruby swallow
#

I recreated all of the four containers with the option "re-pull image".
But the ML container still refuses to start, with the same log lines as above.

kindred sail
#

Alright, compose down, stop and delete the ML container, delete any cache volume it has (it's the docker volume). Delete the ML container image too.

Then re-pull the ML image

Be sure that you don't delete anything related to UPLOAD_LOCATION or DB_DATA_LOCATION
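For anyone following along on a plain Docker Compose setup, the steps above could be sketched roughly as follows. The container, volume, image, and service names are assumptions based on a typical Immich compose file and were not posted in this thread, so check `docker ps -a`, `docker volume ls`, and `docker image ls` before running anything. The script only prints the commands unless `dry_run=False` is passed, and nothing in it refers to UPLOAD_LOCATION or DB_DATA_LOCATION.

```python
import subprocess


def ml_reset_commands(
    container="immich_machine_learning",   # assumed container name
    volume="immich_model-cache",           # hypothetical volume name, check `docker volume ls`
    image="ghcr.io/immich-app/immich-machine-learning:release",  # assumed image tag
    service="immich-machine-learning",     # assumed compose service name
):
    """Return the docker commands for the ML reset, in the order described above."""
    return [
        ["docker", "compose", "down"],            # stop the stack
        ["docker", "rm", "-f", container],        # remove the ML container if it still exists
        ["docker", "volume", "rm", volume],       # delete the ML model cache volume
        ["docker", "image", "rm", image],         # delete the local ML image
        ["docker", "compose", "pull", service],   # re-pull the ML image
        ["docker", "compose", "up", "-d"],        # bring everything back up
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=False: a step may fail harmlessly, e.g. the container
            # is already gone after `docker compose down`.
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    run(ml_reset_commands())  # dry run: prints the commands only
```

On Portainer or Cosmos the same steps would be done through the UI instead, so treat this only as a checklist in code form.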

ruby swallow
#

I did it like that: removed the ML container, deleted the image and cache, re-pulled the image, and recreated the container.
The container now starts but is 'unhealthy'. The Immich-SERVER doesn't seem to reach it (fetch failed).

kindred sail
#

unhealthy is a known issue, it's actually the healthcheck that is missing

#

But the server not reaching it isn't the known issue 🙃

ruby swallow
#

Alright. That's good and not good 🫣
In the ML container the log tells me the following:
[03/31/25 22:09:02] INFO Starting gunicorn 23.0.0
[03/31/25 22:09:02] INFO Listening at: http://[::]:3003 (8)
[03/31/25 22:09:02] INFO Using worker: immich_ml.config.CustomUvicornWorker
[03/31/25 22:09:02] INFO Booting worker with pid: 9
[03/31/25 22:09:06] INFO Started server process [9]
[03/31/25 22:09:06] INFO Waiting for application startup.
[03/31/25 22:09:06] INFO Created in-memory cache with unloading after 300s of inactivity.
[03/31/25 22:09:06] INFO Initialized request thread pool with 4 threads.
[03/31/25 22:09:06] INFO Application startup complete.

kindred sail
#

What's the ML URL in your admin settings?

ruby swallow
kindred sail
#

and the fetch error is from the immich-server container I assume?

ruby swallow
#

The full lines:
[Nest] 8 - 04/01/2025, 12:12:50 AM WARN [Microservices:MachineLearningRepository] Machine learning request to "http://immich-machine-learning:3003" failed: fetch failed
[Nest] 8 - 04/01/2025, 12:12:50 AM ERROR [Microservices:{"id":"241293c4-0dc0-48d4-8681-522d67143b06"}] Unable to run job handler (smartSearch/smart-search): Error: Machine learning request '{"clip":{"visual":{"modelName":"ViT-B-32__openai"}}}' failed for all URLs
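A "fetch failed" like this can mean the hostname does not resolve or the port is not reachable from the server's side. A minimal sketch for telling those apart from a plain connection check is below. It uses only the Python standard library and has to be run from a container on the same Docker network as the server, otherwise the name `immich-machine-learning` will not resolve. The host and port are taken from the log line above; everything else is illustrative.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_connect("immich-machine-learning", 3003)` returning `False` from a container on the server's network would suggest a networking or naming problem rather than a problem inside the ML service itself.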

kindred sail
#

I have to leave now but do you mind telling about your platform a bit? Like VM, host stuff like that

wintry grove
#

@kindred sail @full crow I have the same issue and get the same errors since today as @ruby swallow and yeah, I think I definitely missed a breaking change.

#

Is there a way for me to re-do the breaking change (could you maybe tell me which one, if you know) or do I have to completely reinstall Immich?

kindred sail
#

Below the top {/if} there is a ] missing a , after it, so turn it into:

      {/if}
    ],

@wintry grove

#

This has nothing to do with any of this topic though.

wintry grove
#

@kindred sail Thanks for the quick reply!

#

Oh uhm, may I ask why? Because I have the exact same problem as OP

#

My machine learning container keeps restarting with [FATAL tini (7)] exec ./start.sh failed: No such file or directory since today

kindred sail
#

There is no breaking change this version 👀

#

And your error message was SyntaxError: JSON.parse: expected ',' or ']' after array element at line 16 column 7 of the JSON data

wintry grove
#

Sorry for not being clear, the syntax error is unrelated to my issue

#

I just pasted it here because you asked @ruby swallow for the source of his Immich compose too

kindred sail
#

Ok but other than parsing the syntax I/we don't know anything about Cosmos

wintry grove
#

Okay, yeah I see

kindred sail
#

There are issues with nvidia hardware acceleration in the latest release, but I haven't seen the start.sh error anywhere else

wintry grove
#

But if we ignore Cosmos for a second:

Do you know whether it is possible to apply the breaking change now (the one that, according to @full crow, leads to the error), or do I have to completely reinstall my instance?

wintry grove
kindred sail
#

bo0tzz meant a fairly old change, if you went through cosmos you definitely don't have it (and I can see from the compose that you don't)

#

But I think your issue might be the container image pulled badly?

wintry grove
#

I have been using Immich for almost 2 years, so I think I could definitely have missed it

#

Ah wait, my bad. I didn't post my actual Cosmos compose (Cosmos' own implementation of Docker Compose); I posted the Cosmos Compose you would get when pulling Immich from the Cosmos market right now

#

One sec, I'll paste my compose which is actually currently in use

kindred sail
#

"entrypoint": "tini -- /bin/bash",

#

But what's this? "org.opencontainers.image.version": "v1.106.4"

#

Oh that's just a label, nevermind

#

This entire JSON reads like it needs an update badly

wintry grove
#

haha oh

kindred sail
#

Did it build from source? So confused

wintry grove
#

no, I don't think it did

#

I thought the line "command": "start.sh", is the culprit

kindred sail
wintry grove
wintry grove
kindred sail
#

It's not much, but that's the change bo0tzz meant

wintry grove
#

Ah okay, thanks for finding it

#

I guess my Cosmos Compose didn't update for some reason

wintry grove
kindred sail
#

Were you actually running 1.130 ?

#

or 1.129 or any recent change 😛

wintry grove
kindred sail
#

If the database user/pass/location and upload location are the same, everything should ™️ work

wintry grove
#

Man, I am confused haha

#

About what I need to do now

kindred sail
#

Let me rephrase, immich doesn't care what happened to/in the containers, you could wipe it all, and just point a freshly installed operating system/docker+immich to the old folders and it should work (it probably won't because postgres won't like all those changes, but docker will be fine)

#

I really have to wonder whether you're not actually somehow running 2 installs though

kindred sail
#

First things first, do backups work for you

#

Because they were somewhat recently introduced, but with your compose weirdness... 👀

#

Should be over at UPLOAD_LOCATION/backups

wintry grove
#

Maybe I should just completely reinstall Immich and just restore one of my backups?

kindred sail
#

Ok just to be clear here, the backups are database dumps only, no images 😛

#

don't go wiping the image folder

#

Does cosmos allow you to stop the individual containers? You'll need only the DB running for a proper restore

barren mulchBOT
elder nexus
#

Figured I'd post here for anyone else looking but it seems to be an issue exclusive to cosmos and I'll post here if I find a solution :p

wintry grove
elder nexus
#

Hopefully I can figure something out tonight :p

elder nexus
#

aight everyone on cosmos having issues go to your compose in the machine learning container and change the line with "command": to this:

#

"command": "python -m immich_ml",

kindred sail
#

@wintry grove @ruby swallow @quick grove ^

elder nexus
#

#1083875835741741096 message

#

i think he said he had to remove some other stuff that had to do with start.sh but i only had to change the command line from start.sh to this python command. im on the cuda image so idk if theres much of a difference

wintry grove
#

Eyy, thanks so much, y'all!!

#

That fixed it! Just had to change the command line in the compose for the ML container to "command": "python -m immich_ml", like @elder nexus said :))
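For reference, the change described here can be sketched as a small script that edits a compose-style JSON document. The layout (a top-level "services" mapping keyed by service name) and the service name are assumptions, since the real Cosmos compose was not posted in this thread; only the `start.sh` command and its replacement come from the discussion.

```python
import json

NEW_COMMAND = "python -m immich_ml"  # the value reported as working in this thread


def patch_ml_command(compose, service="immich-machine-learning"):
    """Replace a start.sh command on the given service. Returns True if changed.

    Assumes compose["services"][service]["command"] is a string;
    the service name is a hypothetical default.
    """
    svc = compose.get("services", {}).get(service)
    if svc is None:
        return False
    command = svc.get("command")
    if isinstance(command, str) and "start.sh" in command:
        svc["command"] = NEW_COMMAND
        return True
    return False


# Hypothetical, heavily trimmed fragment; a real export has many more keys.
example = json.loads("""
{
  "services": {
    "immich-machine-learning": {"command": "start.sh"}
  }
}
""")

if patch_ml_command(example):
    print(json.dumps(example, indent=2))
```

Only the one service is touched on purpose, so that other containers' commands are left alone.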

elder nexus
#

glad to hear its working now :3

ruby swallow
#

In Portainer the command was already set to 'python' '-m' 'immich_ml'. I changed it to 'python -m immich_ml' and the container is now healthy.
Thanks for the help!

But my Immich-SERVER can't reach the ML container still:
[Nest] 6 - 04/02/2025, 5:14:41 PM WARN [Microservices:MachineLearningRepository] Machine learning request to "http://immich-machine-learning:3003" failed: fetch failed
[Nest] 6 - 04/02/2025, 5:14:41 PM ERROR [Microservices:{"id":"8f6ef92f-8a95-4897-b337-c9121656093b"}] Unable to run job handler (smartSearch/smart-search): Error: Machine learning request '{"clip":{"visual":{"modelName":"ViT-B-32__openai"}}}' failed for all URLs

kindred sail
#

I think the healthy part is down to update 1.131.3, which fixed that @ruby swallow 😛