#Tons of repeating errors in the log after upgrading to 1.86. please help me decipher/troubleshoot

stoic tinsel
#

I only updated because my mobile app auto-updated and I couldn't log into Immich via mobile without upgrading the server. Now I'm getting repeating errors in the log that look like this. Something to do with Typesense?

immich_server            | Request #1700229163220: Sleeping for 4s and then retrying request...
immich_proxy             | 2023/11/17 13:53:24 [error] 51#51: *27 connect() failed (111: Connection refused) while connecting to upstream, client: 172.22.0.1, server: , request: "GET /api/socket.io/?EIO=4&transport=websocket HTTP/1.1", upstream: "http://172.22.0.6:3001/socket.io/?EIO=4&transport=websocket", host: "my.domain.com"
immich_proxy             | 2023/11/17 13:53:24 [error] 44#44: *8 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.5.146, server: , request: "GET /api/socket.io/?EIO=4&transport=polling&t=OlTZjYh HTTP/1.1", upstream: "http://172.22.0.6:3001/socket.io/?EIO=4&transport=polling&t=OlTZjYh", host: "192.168.95.149:2283", referrer: "http://192.168.95.149:2283/photos"
immich_typesense         | E20231117 13:53:25.039273   550 raft_server.cpp:601] Node not ready yet (known_applied_index is 0).
immich_microservices     | Request #1700229174953: Request to Node 0 failed due to "undefined Request failed with HTTP code 503 | Server said: Not Ready or Lagging"
immich_microservices     | Request #1700229174953: Sleeping for 4s and then retrying request...
immich_server            | Request #1700229163220: Request to Node 0 failed due to "undefined Request failed with HTTP code 503 | Server said: Not Ready or Lagging"
immich_server            | Request #1700229163220: Sleeping for 4s and then retrying request...


#

also: out of memory errors? running on Unraid with 128 GB of RAM, and only 54% is used?

immich_microservices     | 
immich_microservices     | <--- Last few GCs --->
immich_microservices     | 
immich_microservices     | [7:0x3fe1a310000]   221028 ms: Mark-Compact 4045.9 (4133.5) -> 4031.3 (4134.7) MB, 3133.87 / 0.00 ms  (average mu = 0.099, current mu = 0.013) allocation failure; scavenge might not succeed
immich_microservices     | [7:0x3fe1a310000]   224206 ms: Mark-Compact 4047.1 (4134.7) -> 4032.4 (4136.0) MB, 3151.07 / 0.00 ms  (average mu = 0.055, current mu = 0.008) allocation failure; scavenge might not succeed
immich_microservices     | 
immich_microservices     | 
immich_microservices     | <--- JS stacktrace --->
immich_microservices     | 
immich_microservices     | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
immich_microservices     |  1: 0xc99960 node::Abort() [immich_microservices]
immich_microservices     |  2: 0xb6ffcb  [immich_microservices]
immich_microservices     |  3: 0xebe910 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [immich_microservices]
immich_microservices     |  4: 0xebebf7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [immich_microservices]
immich_microservices     |  5: 0x10d06a5  [immich_microservices]
immich_microservices     |  6: 0x10d0c34 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [immich_microservices]
immich_microservices     |  7: 0x10e7b24 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*) [immich_microservices]


#

running a pretty standard docker compose. let me know if you see any issues:

#
version: "2.4"
#version: "3.8"

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
#    cpus: 4
    cpuset: 0,1,2,3,4,10,11,12,13,14
    command: [ "start.sh", "immich" ]
    volumes:
      - ${LIBRARY_LOCATION}:/usr/src/app/upload/library
      - ${UPLOAD_LOCATION}:/usr/src/app/upload/upload
      - ${THUMBS_LOCATION}:/usr/src/app/upload/thumbs
      - ${ENCODED_VIDEO_LOCATION}:/usr/src/app/upload/encoded-video
      - ${PROFILE_PICTURE_LOCATION}:/usr/src/app/upload/profile
    env_file:
      - .env
    depends_on:
      - redis
#      - database
      - typesense
    restart: always

  immich-microservices:
    container_name: immich_microservices
    cpuset: 0,1,2,3,4,10,11,12,13,14
#    cpus: 4
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends:
      file: hwaccel.yml
      service: hwaccel
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${LIBRARY_LOCATION}:/usr/src/app/upload/library
      - ${UPLOAD_LOCATION}:/usr/src/app/upload/upload
      - ${THUMBS_LOCATION}:/usr/src/app/upload/thumbs
      - ${ENCODED_VIDEO_LOCATION}:/usr/src/app/upload/encoded-video
      - ${PROFILE_PICTURE_LOCATION}:/usr/src/app/upload/profile
    env_file:
      - .env
    depends_on:
      - redis
#      - database
      - typesense
    restart: always
    environment:
      - TZ=America/New_York

#
  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    cpuset: 0,1,2,3,4,10,11,12,13,14
#    cpus: 6
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  immich-web:
    container_name: immich_web
    image: ghcr.io/immich-app/immich-web:${IMMICH_VERSION:-release}
    env_file:
      - .env
    restart: always

  typesense:
    container_name: immich_typesense
    image: typesense/typesense:0.24.1@sha256:9bcff2b829f12074426ca044b56160ca9d777a0c488303469143dd9f8259d4dd
    cpuset: 0,1,2,3,4,10,11,12,13,14
#    cpus: 4
    environment:
      - TYPESENSE_API_KEY=${TYPESENSE_API_KEY}
      - TYPESENSE_DATA_DIR=/data
      # remove this to get debug messages
      - GLOG_minloglevel=1
    volumes:
      - tsdata:/data
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:70a7a5b641117670beae0d80658430853896b5ef269ccf00d1827427e3263fa3
    restart: always

#  database:
#    container_name: immich_postgres
#    image: postgres:14-alpine@sha256:28407a9961e76f2d285dc6991e8e48893503cc3836a4755bbc2d40bcc272a441
#    env_file:
#      - .env
#    environment:
#      POSTGRES_PASSWORD: ${DB_PASSWORD}
#      POSTGRES_USER: ${DB_USERNAME}
#      POSTGRES_DB: ${DB_DATABASE_NAME}
#    volumes:
#      - pgdata:/var/lib/postgresql/data
#    restart: always

  immich-proxy:
    container_name: immich_proxy
    image: ghcr.io/immich-app/immich-proxy:${IMMICH_VERSION:-release}
    environment:
      # Make sure these values get passed through from the env file
      - IMMICH_SERVER_URL
      - IMMICH_WEB_URL
    ports:
      - 2283:8080
    depends_on:
      - immich-server
      - immich-web
    restart: always

volumes:
#  pgdata:
  model-cache:
  tsdata:
stoic tinsel
#

should note that this is after uploading 80K+ photos via cli yesterday. import/jobs took a while but seemed to work without issue, but it's not happy today

silk plover
stoic tinsel
#

hmm, the user in the issue you linked was running TrueNAS. don't know much about that, but with Unraid, unless you limit the container it should be able to access all addressable free memory. Guessing this might be a combination of the container's Node.js heap settings and library size. Any ideas I could try to confirm?

stoic tinsel
#

just added NODE_OPTIONS="--max-old-space-size=16384" to my .env file
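
For reference, a minimal sketch of the same setting applied to one service in the compose file instead of globally through .env. The placement is my assumption, not something from the Immich docs. NODE_OPTIONS is the standard Node.js environment variable, --max-old-space-size takes a value in MiB, and 16384 is just the number I picked. The GC lines above top out at roughly 4 GB, which suggests the Node heap cap, not system RAM, is what was being hit.

```yaml
services:
  immich-microservices:
    environment:
      - TZ=America/New_York
      # Assumption: raise the V8 old-space heap cap to 16384 MiB for this service only.
      # Values set here take precedence over the same key in env_file.
      - NODE_OPTIONS=--max-old-space-size=16384
```

Putting it in .env instead hands the variable to every service that lists that env_file, which is what I actually did.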

#

no out of memory errors yet, but still tons of:

immich_microservices     | [Nest] 7  - 11/17/2023, 10:08:22 AM   ERROR [JobService] Unable to run job handler (objectTagging/classify-image): TypeError: fetch failed
immich_microservices     | [Nest] 7  - 11/17/2023, 10:08:22 AM   ERROR [JobService] TypeError: fetch failed
immich_microservices     |     at Object.fetch (node:internal/deps/undici/undici:11372:11)
immich_microservices     |     at async MachineLearningRepository.post (/usr/src/app/dist/infra/repositories/machine-learning.repository.js:16:21)
immich_microservices     |     at async SmartInfoService.handleClassifyImage (/usr/src/app/dist/domain/smart-info/smart-info.service.js:55:22)
immich_microservices     |     at async /usr/src/app/dist/domain/job/job.service.js:108:37
immich_microservices     |     at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:350:28)
immich_microservices     |     at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:537:24)
immich_microservices     | [Nest] 7  - 11/17/2023, 10:08:22 AM   ERROR [JobService] Object:
immich_microservices     | {
immich_microservices     |   "id": "f2ef9bc2-4967-4844-bc99-2d7fbed38221",
immich_microservices     |   "source": "upload"
immich_microservices     | }
#

and:

immich_microservices     | Request #1700233698269: Request to Node 0 failed due to "ECONNRESET socket hang up"
immich_microservices     | Request #1700233698269: Sleeping for 4s and then retrying request...
immich_microservices     | Request #1700233723039: Request to Node 0 failed due to "ECONNRESET socket hang up"
immich_microservices     | Request #1700233723039: Sleeping for 4s and then retrying request...
#

oof - segmentation fault now:

#
immich_microservices     | [Nest] 7  - 11/17/2023, 10:09:17 AM     LOG [SearchService] Indexing 289318 faces
immich_typesense         | W20231117 15:09:18.837061   635 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
immich_typesense         | W20231117 15:09:18.841943   635 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
immich_typesense         | W20231117 15:09:18.841964   635 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
immich_typesense         | W20231117 15:09:18.845333   635 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
immich_typesense         | E20231117 15:09:20.430876    44 backward.hpp:4199] Stack trace (most recent call last) in thread 44:
immich_typesense         | E20231117 15:09:20.430892    44 backward.hpp:4199] #11   Object "", at 0xffffffffffffffff, in 
immich_typesense         | E20231117 15:09:20.430895    44 backward.hpp:4199] #10   Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x151717b25bb3, in __clone
immich_typesense         | E20231117 15:09:20.430896    44 backward.hpp:4199] #9    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x151717a94b42, in 
immich_typesense         | E20231117 15:09:20.430902    44 backward.hpp:4199] #8    Source "../../../../../libstdc++-v3/src/c++11/thread.cc", line 80, in execute_native_thread_routine [0x15126cf]
immich_typesense         | E20231117 15:09:20.430902    44 backward.hpp:4199] #7  | Source "/usr/local/gcc-10.1.0/include/c++/10.1.0/thread", line 215, in operator()

#
immich_typesense         | E20231117 15:09:20.430938    44 backward.hpp:4199]       Source "/usr/local/gcc-10.1.0/include/c++/10.1.0/bits/invoke.h", line 60, in operator() [0x6301fc]
immich_typesense         | E20231117 15:09:20.430939    44 backward.hpp:4199] #1    Source "/typesense/src/index.cpp", line 1014, in operator() [0x6300a9]
immich_typesense         | E20231117 15:09:20.430941    44 backward.hpp:4199] #0    Source "/typesense/external-Linux/hnswlib-21de18ffabea1a9d1e8b16b49afc6045d7707e4c/hnswlib/hnswalg.h", line 848, in insertPoint [0x6241fb]
immich_typesense         | Segmentation fault (Address not mapped to object [0x8])
immich_microservices     | [Nest] 7  - 11/17/2023, 10:09:20 AM     LOG [MediaService] Successfully generated WEBP image thumbnail for asset 58916f70-4c11-4091-86ba-26212ea007a3
immich_typesense         | E20231117 15:09:20.984668    44 typesense_server.cpp:102] Typesense 0.24.1 is terminating abruptly.
immich_microservices     | Request #1700233760502: Request to Node 0 failed due to "ECONNRESET write ECONNRESET"
immich_microservices     | Request #1700233760502: Sleeping for 4s and then retrying request...
immich_typesense exited with code 1
#

and:

immich_typesense         | W20231117 15:14:51.559652   597 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
immich_typesense         | W20231117 15:14:51.566134   597 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
immich_typesense         | W20231117 15:14:51.579839   597 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
immich_typesense         | W20231117 15:14:51.598605   597 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
immich_typesense         | W20231117 15:14:51.641995   597 index.cpp:5406] Error while removing field `embedding` from document, message: Label not found
#

sounds like this is all related to typesense. I tried removing the typesense (tsdata) volume so it could be recreated, but still getting all the errors above

errant frost
#

Yeah, we've had a similar issue reported

stoic tinsel
#

thx for confirming - happy to test out a fix if there's anything you can suggest

errant frost
#

We are working to move away from Typesense, so I am not sure we will ever find a fix for this issue >.<

faint path
#

Same here 🙂 Lots of typesense errors 🙂

stoic tinsel
#

forgive me - but what does typesense do? what am i missing by it not working in my stack?

errant frost
#

It does vector search, which CLIP search and facial recognition need in order to work

stoic tinsel
#

ahh - so getting rid of CLIP and facial recognition in the future? or just changing the architecture for it?

errant frost
#

since it can also do vector search

stoic tinsel
#

so, couldn't I just pull typesense from my compose.yml and wait for the release that switches vector search to psql? is there a proper way to disable typesense? do you recommend that?

errant frost
#

You can go into settings and disable machine learning features

stoic tinsel
#

cool, I just did that - logs are still full of

immich_typesense         | E20231117 15:53:00.740618   549 raft_server.cpp:601] Node not ready yet (known_applied_index is 0).
immich_microservices     | Request #1700236352431: Request to Node 0 failed due to "undefined Request failed with HTTP code 503 | Server said: Not Ready or Lagging"
immich_microservices     | Request #1700236352431: Sleeping for 4s and then retrying request...
immich_microservices     | Request #1700236382424: Request to Node 0 failed due to "undefined Request failed with HTTP code 503 | Server said: Not Ready or Lagging"
immich_microservices     | Request #1700236382424: Sleeping for 4s and then retrying request...
immich_typesense         | W20231117 15:53:03.741865   549 node.cpp:811] [default_group:172.24.0.4:8107:8108 ] Refusing concurrent configuration changing
immich_typesense         | E20231117 15:53:03.742159   731 raft_server.h:62] Peer refresh failed, error: Doing another configuration change
immich_microservices     | Request #1700236352431: Request to Node 0 failed due to "undefined Request failed with HTTP code 503 | Server said: Not Ready or Lagging"
immich_microservices     | Request #1700236352431: Sleeping for 4s and then retrying request...
immich_microservices     | Request #1700236387458: Request to Node 0 failed due to "undefined Request failed with HTTP code 503 | Server said: Not Ready or Lagging"
immich_microservices     | Request #1700236387458: Sleeping for 4s and then retrying request...

maybe it takes a while to clear?

errant frost
#

I think you will also need to add an environment variable to disable it entirely

stoic tinsel
#

added TYPESENSE_ENABLED=false to the .env file but the logs are still flooded with errors from the typesense container. With the env variable set and machine learning turned off, can I just pull the container and any references to it from my docker compose yml?

#

I should add that, yes I restarted the stack

errant frost
#

I think after adding the variable to disable Typesense, you can comment it out of the services in the docker-compose file

stoic tinsel
#

ok, trying that

#

ahh quiet logs 😆 and now my OCD can rest for a bit

#

also deleted/removed the reference to the tsdata volume.
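
In case it helps anyone else, a rough sketch of the parts of my compose file that changed. This is only what worked for me on 1.86 with TYPESENSE_ENABLED=false already set in .env and machine learning turned off in settings. It is not an official procedure, and everything not shown is unchanged from the file I posted earlier.

```yaml
services:
  immich-server:
    depends_on:
      - redis
      # - typesense   # removed so the server no longer depends on Typesense

  immich-microservices:
    depends_on:
      - redis
      # - typesense   # same change here

  # typesense:        # the whole service block is commented out
  #   container_name: immich_typesense

volumes:
  model-cache:
  # tsdata:           # no longer referenced by any service
```

After editing, I brought the stack down and back up so the old typesense container was no longer running.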

#

thx for the support Alex. Proud to support this project. Looking forward to the pivot to psql for vector search. Should be easy enough to re-run facial recognition/ml job once it's available

errant frost
#

Correct. I assume you were doing some face merging and then ran into this issue with Typesense, right?

stoic tinsel
#

actually no

#

I did a cli import of 80K+ photos yesterday

#

noticed all the errors today

errant frost
#

Hmm strange

stoic tinsel
#

didn't have an issue with cli imports up until yesterday

#

but that was the first time I uploaded such a large amount

#

up until then I was uploading about 10K at a time max

#

didn't touch the face tagging feature in the web UI

errant frost
#

Thank you for the data point

stoic tinsel
#

anytime - thank you for the project