#Official paperwork documentation and Home Assistant LLM integration


unreal storm
#

As discussed in #the-water-cooler
I'm tired of dealing with my paper and electronic official paperwork.

@cerulean tapir mentioned Paperless-ngx, which seems very suitable for this situation.

It's a FOSS document management system (which is what we need), and it has an extensive API. I asked ChatGPT, and it said the following:

  1. Searching and opening a document through the API is possible.

So I can write Linux shell scripts (I can't write Home Assistant integrations) that search documents and read them. This way an LLM can use a script (that fires the shell script) to search and read documents.
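
A minimal sketch of the search step, written in Python rather than shell so it is easy to test. It assumes the instance is reachable on port 9926 (the host port from the compose file in this thread) and that an API token has been created in the Paperless-ngx web UI; `BASE_URL`, `API_TOKEN` and the function names are placeholders, not anything Paperless ships.

```python
import json
import urllib.parse
import urllib.request

# Assumptions: host and port follow the "9926:8000" mapping from the compose file;
# the token is a placeholder for one created in the Paperless-ngx web UI.
BASE_URL = "http://localhost:9926"
API_TOKEN = "changeme"


def build_search_request(query, base_url=BASE_URL, token=API_TOKEN):
    """Build a GET request for the full-text search on /api/documents/."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{base_url}/api/documents/?{params}",
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )


def summarize_results(payload, max_chars=200):
    """Reduce the JSON response to id, title and a short excerpt of the text."""
    return [
        {
            "id": doc["id"],
            "title": doc.get("title", ""),
            "excerpt": (doc.get("content") or "")[:max_chars],
        }
        for doc in payload.get("results", [])
    ]


def search(query):
    """Run the search against a live instance (needs network access)."""
    with urllib.request.urlopen(build_search_request(query), timeout=10) as resp:
        return summarize_results(json.load(resp))
```

Calling `search("tax 2024")` would return a short list of hits that a script can hand to the LLM; keeping the excerpt short avoids sending whole documents on every query.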

  2. Webhooks. The API also has webhooks that fire when a document is created.

This allows HA to be triggered on document creation. It can then read the document and take actions based on it.
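
One way to get a "document created" signal into HA, sketched under assumptions: Paperless-ngx can run a post-consume script (set via `PAPERLESS_POST_CONSUME_SCRIPT`) and passes document details in environment variables such as `DOCUMENT_ID`. The script could forward those to a Home Assistant webhook trigger. The HA address and the webhook id `paperless_document_added` are made up for illustration.

```python
import json
import os
import urllib.request

# Assumptions: the Home Assistant address and the webhook id are placeholders.
HA_WEBHOOK_URL = "http://homeassistant.local:8123/api/webhook/paperless_document_added"


def build_payload(env):
    """Collect the details Paperless-ngx exposes to post-consume scripts."""
    return {
        "document_id": env.get("DOCUMENT_ID"),
        "file_name": env.get("DOCUMENT_FILE_NAME"),
        "correspondent": env.get("DOCUMENT_CORRESPONDENT"),
        "tags": env.get("DOCUMENT_TAGS"),
    }


def build_webhook_request(payload, url=HA_WEBHOOK_URL):
    """Build the JSON POST that a Home Assistant webhook trigger accepts."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify_home_assistant(env=None):
    """Send the notification (needs network access to Home Assistant)."""
    payload = build_payload(os.environ if env is None else env)
    with urllib.request.urlopen(build_webhook_request(payload), timeout=10) as resp:
        return resp.status
```

On the HA side, an automation with a webhook trigger for the same id would receive the JSON and could then fetch the document text through the API.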

This gives an end user the following:
A functional document management system.
And HA that can create to-do lists and calendar items based on documents.

It would also help with explaining a document to a user by letting them ask questions about it.
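
For the to-do part, a rough sketch of calling Home Assistant's `todo.add_item` service over its REST API. The address, the long-lived access token and the entity `todo.paperwork` are placeholders; the to-do list itself would have to exist in HA already.

```python
import json
import urllib.request

# Assumptions: address, token and entity id are placeholders for illustration.
HA_URL = "http://homeassistant.local:8123"
HA_TOKEN = "changeme"
TODO_ENTITY = "todo.paperwork"


def build_todo_request(item, due_date=None, entity_id=TODO_ENTITY,
                       ha_url=HA_URL, token=HA_TOKEN):
    """Build a POST to the todo.add_item service; due_date is 'YYYY-MM-DD'."""
    data = {"entity_id": entity_id, "item": item}
    if due_date is not None:
        data["due_date"] = due_date
    return urllib.request.Request(
        f"{ha_url}/api/services/todo/add_item",
        data=json.dumps(data).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def add_todo(item, due_date=None):
    """Send the request (needs network access to Home Assistant)."""
    with urllib.request.urlopen(build_todo_request(item, due_date), timeout=10) as resp:
        return resp.status
```

Something like `add_todo("Pay municipal tax", "2025-01-31")` is what an LLM-driven script could call after reading a new document.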

First things to do: pick a document system and set up a basic environment.

unreal storm
#
services:
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - redisdata:/data

  db:
    image: docker.io/library/postgres:16
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless

  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
    ports:
      - "9926:8000"
    volumes:
      - /mnt/usb/paperless-data:/usr/src/paperless/data
      - /mnt/usb/paperless:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - /opt/docker/paperless-ngx/consume:/usr/src/paperless/consume
    env_file: docker-compose.env
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db

volumes:
  pgdata:
  redisdata:
#

I have issues with Paperless. It expects an insecure Docker setup (the user has to be in the docker group, and that effectively means root rights).

cerulean tapir
#

Hm? The user inside the container?

unreal storm
#

And I set PAPERLESS_CONSUMPTION_DIR=/mnt/usb/nextcloud/Documenten/archief/
and it's not working.

#
```
[2024-12-30 22:30:50,116] [DEBUG] [paperless.classifier] Document classification model does not exist (yet), not performing automatic matching.
[2024-12-30 22:30:50,149] [DEBUG] [paperless.classifier] Document classification model does not exist (yet), not performing automatic matching.
[2024-12-30 22:38:32,354] [DEBUG] [paperless.classifier] Document classification model does not exist (yet), not performing automatic matching.
[2024-12-30 22:38:32,377] [DEBUG] [paperless.classifier] Document classification model does not exist (yet), not performing automatic matching.
[2024-12-30 22:39:24,499] [DEBUG] [paperless.classifier] Document classification model does not exist (yet), not performing automatic matching.
[2024-12-30 22:39:24,508] [DEBUG] [paperless.classifier] Document classification model does not exist (yet), not performing automatic matching.
[2024-12-30 22:43:10,291] [INFO] [paperless.management.consumer] Using inotify to watch directory for changes: /mnt/usb/nextcloud/Documenten/archief
[2024-12-30 22:43:28,129] [INFO] [paperless.management.consumer] Received SIGINT, stopping inotify
[2024-12-30 22:43:28,132] [DEBUG] [paperless.management.consumer] Consumer exiting.
[2024-12-30 23:05:03,135] [INFO] [paperless.tasks] No automatic matching items, not training
[2024-12-31 00:05:03,127] [INFO] [paperless.tasks] No automatic matching items, not training
[2024-12-31 01:05:03,144] [INFO] [paperless.tasks] No automatic matching items, not training
[2024-12-31 02:05:00,166] [INFO] [paperless.tasks] No automatic matching items, not training
```
cerulean tapir
#

It should be independent of the outside user. I run my instance without Docker 🤷
Though I think you are doing something slightly weird and trying to set the consumption dir to a path outside the container? 🤔

#

The compose yaml mounts /opt/docker/paperless-ngx/consume into the container.

  - /opt/docker/paperless-ngx/consume:/usr/src/paperless/consume

I guess that's where your statement about the user comes from? 🤔
Depending on how you plan to do this in the long term, you could replace /opt/docker/paperless-ngx/consume with /mnt/usb/nextcloud/Documenten/archief/
Though that seems questionable, since the path suggests that's a portable drive.
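
To illustrate why a host path in PAPERLESS_CONSUMPTION_DIR does not work, here is a small sketch that maps host paths to what the container sees, using the bind mounts from the compose file in this thread. `container_path` is a helper made up for this example; Docker itself does the real mapping.

```python
from pathlib import PurePosixPath

# Bind mounts copied from the compose file in this thread (host -> container).
# The relative "./export" mount is left out for simplicity.
MOUNTS = {
    "/mnt/usb/paperless-data": "/usr/src/paperless/data",
    "/mnt/usb/paperless": "/usr/src/paperless/media",
    "/opt/docker/paperless-ngx/consume": "/usr/src/paperless/consume",
}


def container_path(host_path, mounts=MOUNTS):
    """Return the path as seen inside the container, or None if it is not mounted."""
    host = PurePosixPath(host_path)
    for source, target in mounts.items():
        try:
            relative = host.relative_to(source)
        except ValueError:
            continue  # host_path is not below this mount source
        return str(PurePosixPath(target) / relative)
    return None
```

With these mounts, `container_path("/mnt/usb/nextcloud/Documenten/archief/")` gives `None`: the directory is simply not visible inside the container, so the consumer watches an empty path there. Changing the left-hand side of the consume mount would make it visible at `/usr/src/paperless/consume`.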

#

FWIW, sudo chmod 0777 /opt/docker/paperless-ngx/consume would give you the ability to write into that dir with your normal user
But I don't know if that's a persistent dir, or created when you start the container
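
Before reaching for 0777, it can help to look at what the directory's permissions actually are. A minimal sketch, assuming it is run on the host as the normal user; `describe_dir` is a made-up helper name.

```python
import os
import stat


def describe_dir(path):
    """Report permission bits, owner, and whether the current user can write here."""
    info = os.stat(path)
    return {
        "mode": oct(stat.S_IMODE(info.st_mode)),
        "owner_uid": info.st_uid,
        # Creating files in a directory needs both write and execute permission.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }
```

Running `describe_dir("/opt/docker/paperless-ngx/consume")` before and after a container restart would also show whether the directory keeps its mode or gets recreated.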

unreal storm
#

oh nice

#

great. [2024-12-31 11:22:56,394] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Ingebed bericht (2).eml: Unknown file extension.

[2024-12-31 11:22:56,411] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Ingebed bericht .eml: Unknown file extension.

#

I thought I'd make a regular expression: .*.eml

#

oh.. it reads a pdf

cerulean tapir
#

Yeah, .eml is an email export.
Paperless has some email functionality, but I think that's IMAP-based and mostly meant to read PDF attachments out of mail.
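
If the PDFs are sitting inside exported .eml files, one workaround is to pull the attachments out first and drop only those into the consume dir. A sketch using Python's standard `email` module; the function names are made up, and it only looks at top-level attachments of type `application/pdf`.

```python
from email import message_from_bytes, policy
from pathlib import Path


def pdf_attachments(eml_bytes):
    """Return (filename, data) pairs for each PDF attachment in a raw .eml message."""
    message = message_from_bytes(eml_bytes, policy=policy.default)
    found = []
    for index, part in enumerate(message.iter_attachments(), start=1):
        if part.get_content_type() == "application/pdf":
            name = part.get_filename() or f"attachment-{index}.pdf"
            # Keep only the final path component so a crafted name cannot escape the dir.
            found.append((Path(name).name, part.get_payload(decode=True)))
    return found


def extract_to_consume_dir(eml_path, consume_dir):
    """Write every PDF attachment of one .eml file into consume_dir."""
    written = []
    for name, data in pdf_attachments(Path(eml_path).read_bytes()):
        target = Path(consume_dir) / name
        target.write_bytes(data)
        written.append(target)
    return written
```

For example, `extract_to_consume_dir("Ingebed bericht .eml", "/opt/docker/paperless-ngx/consume")` would leave the .eml itself alone and write any PDFs it carries where the consumer can pick them up.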