#Checking space usage and shrinking/migrating database

1 messages ยท Page 1 of 1 (latest)

lilac dune
#

If you show me pictures I can explain some of them.

tender scroll
lilac dune
#

overlay2 are the file systems of the containers. Mostly addons. Leave them alone. ha supervisor repair does housekeeping for this.

tender scroll
lilac dune
#

It's mounted in different places. Doesn't necessarily use double the space though.

tender scroll
#

ah, but it's duplicated right?

#

ahh, fuck

#

is that normal?

lilac dune
#

Yeah. Seems to be.

#

What's in /mnt/data/supervisor? All backups I assume?

tender scroll
lilac dune
tender scroll
#

I can see my mariaDB is f'ing large (I have 2 years set to retain) but I didn't expect it to get that large

lilac dune
#

So basically /mnt/data/data and /var/lib/docker is the same.
I recommend using a proper Time Series database for long term data. SQL is very inefficient for that.

tender scroll
#

I'm confused now though. HA reports 156GB used.

#

gdu shows 164+34GiB

lilac dune
#

I'd have to check where HA gets its data from. You can also check df -h.

#

Also note that gdu shows GiB while HA shows GB and so on and gdu counts the docker directory double.

tender scroll
#

wiztree doesn't allow me to scan network devices

lilac dune
tender scroll
#

I am forced to choose a subfolder

#

as the device is not considered a folder

lilac dune
#

You might have to run it as admin. Or if you are try without. I remember there was some strangeness with this.

tender scroll
#

no, tried both ways

lilac dune
#

It's only really faster for local drives so it doesn't matter much what you use.

tender scroll
#

so basically the 2 main lumps that smb doesn't detect are the /addons folder, and that is mostly mariaDB (55GB). And then the docker x2 (37GB x2)

#

so, for the docker thing it's definitive that it's standard and normal that it's doubled. Nothing to be done there?

#

I'd like to understand what makes it grow, and how to keep it down, at least if possible

lilac dune
#

You can't change it being there two times. There's quite some inconsistencies how the space is calculated/shown.
Here's some data from my test HAOS VM

# df -h /mnt/data/docker/
Filesystem                Size      Used Available Use% Mounted on
/dev/sda8                30.8G      6.0G     23.5G  20% /mnt/data
# du -shc /mnt/data/docker/* | sort -rh
10.4G   total
10.4G   /mnt/data/docker/overlay2
27.8M   /mnt/data/docker/image
...

As you can see, df says 6G used, du says 10.4G. gdu says 10.4G too.

tender scroll
#

weird

#

why is different df from du and gdu?

lilac dune
#

You can try docker system df to see some of it but like I said, supervisor should clean this up periodically and through ha supervisor repair.
If you have lots of addons it's normal to have a big overlay2 folder.

#

Not sure. How overlay works I guess.

tender scroll
#

got it

#

will live with it then

lilac dune
#

You should think about the Time Series database though. HA has long term statistics so you don't need 2 years worth of recorder data to see all.

tender scroll
#

yeah, I get that. That's going to be my focus

#

I have heard about the long term statistics but tbh I don't get what it means

#

for me long term statistics = storing the data for years ๐Ÿ˜‚

lilac dune
#

Trying to find the blog post.

#

I use the default 10 days and that already makes my database bloat to 4-8G. I have my long term data in VictoriaMetrics: #installation-archived message

tender scroll
#

so if I turn down my keep_days back to 10 I will still be able to see the long-term data?

lilac dune
#

Assuming the data is eligible for it. Yes. It's not as ganular.

tender scroll
#

I actually did the change to mariadb just because I wanted to keep all the data, and had heard that sql wasn't as good for it.

lilac dune
#

The problem with 2 years of data in HA is you can't do anything with it. You can count, let's say, average daily energy consumption over a month or so.

#

If you have it in InfluxDB or similar you can use grafana do create dynamic dashboard to calculate pretty much everything.

tender scroll
#

yeah, I heard of them too, but never got around to research more and go that route

lilac dune
#

Well HeidiSQL is still SQL. In my expeience it performs rather badly when you have more data than memory anyways.

#

I used MariaDB in the beginning because the tools are just better and I'm familiar with it. Like checking space usage per table and so on is infinitely easier.

tender scroll
#

tbh it's not that I'm looking for dashboards etc

#

I just like being able to see temperatures over a long time

lilac dune
#

Nowadays I just use the defaults as my important data is in VictoriaDB. I use HA's graphs just for a quick peek.

tender scroll
#

got it

#

never heard of the ones you mentioned, only InfluxDB

#

seems to be quite popular, so I guess the others are more advanced, and I'm clearly not at that user stage

lilac dune
tender scroll
#

do you think InfluxDB will be good enough then?

lilac dune
#

I don't like its syntax but it's the most popular choice for this and there's lots of dashboard already.

tender scroll
#

I'm still confused tho

#

the point is to set up influxdb and then keep 720 days to retain?

lilac dune
#

Why 720? The recorder is not involved in it. They are separate.

tender scroll
#

or is 720 def not needed because these supposed long term statistics are still keeping the data somehow?

#

that's what I guess I'm missing

lilac dune
tender scroll
#

The long-term statistics table is never purged. ahhhhhhhhhhh

lilac dune
tender scroll
#

but it has nothing to do with the long term ones, right?

#

well, influxDB stores it forever if you want, so it acts as another copy of the full data

lilac dune
tender scroll
#

is that right?

#

ahhh

lilac dune
#

I guess you can try to select last 5 years in your history tab and see if it becomes light blue like mine too: <#1216777289270951957 message>
Might load for 1 an hour...

#

Not all your entities might support long term stats.

tender scroll
#

nah my mariaDB it's only been set up for less than a year now

#

so it's still accumulating

lilac dune
#

I can't imagine having to back up 50G+ of data for HA.

#

HA's backups are not that efficient.

tender scroll
#

it doesn't show light blue or anything

lilac dune
#

That's not the stock history panel. History explorer is dead, btw.

tender scroll
#

really?

tender scroll
lilac dune
#

Hmm. Have you run HA longer than you had MariaDB?

tender scroll
#

yep

#

5 or 6 more months

lilac dune
tender scroll
#

so sad though, it was a great tool

lilac dune
#

I think we can go back to the main HA channels now. The thread served its purpose.

tender scroll
#

thank you SO much for all the help

#

would you mind if I ping you to continue with this?

lilac dune
#

No that's fine.

tender scroll
#

really feel it's the right time for me to do this db change and set it up fine for future

lilac dune
#

I'd recommend you run both in parallel until your're ready to switch. Like I said earlier, they are separate.

tender scroll
#

you meant it should have the earlier 6 months before mariaDB?

#

also, just to clarify for me, even if I switch keep_days back to default 10, that should purge (whenever the next one is) the recorder values but still retain the long-term ones, correct?

lilac dune
#

Yes. You should have long term stats from the time of first installation. At least as far as I know. You are using a recent HA version, yeah?

tender scroll
#

yeah, last stable version

tender scroll
#

idk, maybe I somehow cleared everything when changing the db?

#

tbh didn't know much of what I was doing

lilac dune
#

The attributes for that sensor are

state_class: measurement
unit_of_measurement: ยฐC
device_class: temperature
friendly_name: ...
#

As far as I know you can't really delete long term stats.

tender scroll
#

weird then :S

#
recorder:
  db_url: !secret mariadb_url # Use new MariaDB
  purge_keep_days: 730 # Retain data for 2 years instead of default 10 days
#

this is what I did to my config

lilac dune
#

Well unless you destroy the whole DB, of course. Switching from SQLite to MariaDB might count as such a case as you basically create a new database from scratch.

tender scroll
#

nothing else

#

but isn't' that what I'm supposed to do with inlfuxDB?

lilac dune
#

Yes. You can run both in parallel for a year though and don't lose anything. The point of the switch is to have better efficiency and ways to inspect the data.

tender scroll
#

but whenever I switch that recorder url it will get replaced and deleted, right?

lilac dune
#

Well yes and no. InfluxDB does NOT replace the recorder.

tender scroll
#

ah

lilac dune
#

The idea is to store ganular recorder data for only 10 days, then have recorder long term stats and granular + long term via InfluxDB.

tender scroll
#

got it. So what should I do with my mariaDB then? Atm it seems I'm storing granular for 2 years, so long term cannot even be seen

#

so I guess if I purge over 10 days, it will switch to granular for 10, then long term

#

and then I set up influxDB as a "backup" DB for granular long term

#

am I right?

lilac dune
#

You could try reducing purge days by a month or so and see if long term stats show up. But if you only have 6 months of data ๐Ÿคท

tender scroll
#

what do you mean only 6 months? I have almost a year.

lilac dune
#

11 months then ๐Ÿ˜„

tender scroll
#

okok got it

lilac dune
#

I mean that 11 months is less than, let's say, 700 days.

tender scroll
#

yeah yea

#

reduced to 630 now

#

will go and purge, and check

lilac dune
#

If not maybe the folks in #integrations-archived can tell you more. I'm not an expert with the long term stats.

tender scroll
#

not managing to purge

#

just to get my roadmap clear. Would you to switch the recorder back to default db?

#

or stick to mariaDB?

lilac dune
#

Depends why you use MariaDB. I use SQLite at the moment. The only benefits of MariaDB I see is the better tools to inspect the database.

tender scroll
#

I never looked at the db myself really

#

just thought it was more efficient for long term

#

but now understanding the recorder won't be responsible of long term, I don't think it matters

#

right?

lilac dune
#

I think it's still a bit faster. I switched because it uses more memory and space.

#

Long term stats are still saved in the recorder database.

#

They don't take a lot of storage though as they are not that granular.

tender scroll
#

ah ok. Well so to avoid losing all the data again, I will just stick with mariaDB for now

#

tbh don't think I even need influxDB if the long term thing really works as it seems

#

don't really use the granular bit that much

lilac dune
#

InfluxDB is useful if you want to do stuff with the data. Like let's say check the average power consumption of the last hour or so.

#

Or store glances data for CPU usage per process or whatever.

tender scroll
tender scroll
#

but I get that even after purged my 55GB won't reduce much

#

so I guess my only option is to start from 0 again

lilac dune
#

Think of the database file like belly skin that is now flappy after you've lost weight.

tender scroll
#

yeah, makes sense

#

is that the same for any db though? Or just mariaDB?

#

meaning, is everyone forced to reset their DB because it just keeps inflating infinitely?

#

ah no, if you stick to the 10d rule, it won't increase

#

silly me

lilac dune
#

It only inflates to the biggest it has been.

tender scroll
#

yep

#

so it was dumb of me to do it ๐Ÿ˜‚

#

so once again. Knowing I'm starting form 0. What do you recommend for recorder DB? Keeping in mind I'm a user, not a dev. ๐Ÿ™‚

lilac dune
#

I'd stick to SQLite.

tender scroll
#

priority is ease of use, and popularity (for community support)

#

ok

#

funny thing it still had the original sqlite

#

so there's the old data

lilac dune
#

So at least it works.

tender scroll
#

do you know if there's a way to migrate the data?

lilac dune
#

Not officially.

tender scroll
#

well, I guess I don't want the granular data, and the long term one wasn't stored

#

so it'd be complex

#

nvm

#

I'll survive

#

happy to recover 55GB

#

also prob mariaDB and that db was responsible of some of that docker folder

#

so might see even a larger reduction

lilac dune
tender scroll
#

yeah found something, but again I'd need to trim it down to 10d granular, or else I'll have the same issue if I migrate everything

#

and I don't think making mariadb process the data and convert it to long term manually is going to be straightforward

#

nvm

tender scroll
#

found a migration guide and the video suggested it was possible too, so going to try

#

and first I am purging all the long term

#

looking good!

tender scroll
#

Hey @lilac dune just FYI. I learned a few things.

  1. You can check db size without SSH access by going to Settings > System > Repairs > 3 dots > System information, Scroll down to Recorder.
  2. I had a 55GB db as you saw. I started purging the db before migrating back to SQLite. https://file.coffee/u/QjQ68HVvSSv5c46W-zbH5.png Still working on it after 2 days. ๐Ÿ˜‚ Anyway, when purging there are these handy options https://file.coffee/u/VDAQBiggt4N22i45t3qZ1.png that seem to take much longer but help reducing the db size after the purge.
  3. It seems to be working! https://file.coffee/u/Pk1OxvrxbWD9rw7O2wX_Z.png Db currently sitting at 19GB and going down. ๐Ÿ˜„ Will see how small it ends with. I will still migrate it back to SQLite. The video info was very useful.
lilac dune
#

1
You can also use a SQL sensor for this: https://www.home-assistant.io/integrations/sql/
And file size for SQLite: https://www.home-assistant.io/integrations/filesize/
This is mine right now: https://i.imgur.com/IZTSjN5.png
See here for queries and tops to keep the database small. Not sure if the queries work with MySQL too: https://community.home-assistant.io/t/how-to-keep-your-recorder-database-size-under-control/295795
See here for MySQL queries you can run in phpMyAdmin: https://stackoverflow.com/questions/9620198/how-to-get-the-sizes-of-the-tables-of-a-mysql-database

2
Yeah that's handy but unfortunately it doesn't show progress or anything. This also happens periodically on its own btw: https://www.home-assistant.io/integrations/recorder/#auto_repack

tender scroll
#

Cool, Might create a sensor once I migrate then. It's handy to have the history. What's the diff between the 2 sensors you got?

#
  1. Then that solves the issue of "skin staying flacid" no? Sort of
#

I think my purge is completed now. Got it down to 900MiB omfg

#

such a reduction

lilac dune
#

Well one checks the actual file size, the other the data SQLite reports.

tender scroll
lilac dune
#

Maybe it does a recreate.

tender scroll
#

I'm trying to double check with gdu but can't find the command to launch it

lilac dune
tender scroll
#

How come HA now reports less (110GB) that gdu (120+37.7GiB)?

lilac dune
#

For what exactly? I thought I explained that alst time.

lilac dune
#

I don't know how often that updates.

tender scroll
#

maybe you did, sorry. I don't understand

#

you think gdu is more reliable? Not sure why it would increase suddenly though, as I've only been reducing it

lilac dune
#

The gist is that overlays skew the data a bit and they calculate it differently.

tender scroll
#

I just checked and can confirm db has shrunk 54GB ๐Ÿ™‚

#

overlays folder hasn't changed a bit though ๐Ÿ˜ฆ I thought it would since you said it was related to addons, and thought it would be MariaDB fault

lilac dune
#

You can actually trigger the compaction ayourself in phpMyAdmin with optimize table but it's kind of write heavy as it writes a copy of the database, then deletes the old one.

#

I think SQLite does the same, btw.

tender scroll
#

trigger what? The purge? I'm happy to do it via HA purge service. Not that familiar with phpmyadmin and SQL anyway

lilac dune
#

The repack.

tender scroll
#

yeah, prefer to do it via HA

#

will migrate back to sqlite now, and delete mariadb altogether. And will check again. Really hoping to see overlays reduce somewhat ๐Ÿ˜ฆ

tender scroll
#

appreciate it, but really, way too messy when HA is one click ๐Ÿ™‚

lilac dune
#

Still good to know about.

#

As the livestream video said, if you use MySQL you are the DB admin ๐Ÿ™‚

tender scroll
#

yeah, true. Notice how quick I'm moving back ๐Ÿ˜‚

#

got an error when trying to install the migration tool

lilac dune
#

Try pipx.

tender scroll
#

apk add py3-pipx ??

lilac dune
#

Check if it's installed first. I'd have to check the package name first. Am mobile right now.

tender scroll
#

no, command not found

lilac dune
#

Try apk add pipx.

tender scroll
#

yeah, it worked

lilac dune
#

Then run the same command but just with x appended.

tender scroll
#
 installed package mysql-to-sqlite3 2.1.10, installed using Python 3.11.8
  These apps are now globally available
    - mysql2sqlite
โš ๏ธ  Note: '/root/.local/bin' is not on your PATH environment variable. These apps will not be globally accessible until your PATH is
    updated. Run `pipx ensurepath` to automatically add it, or manually modify your PATH in your shell's config file (i.e. ~/.bashrc).
done! โœจ ๐ŸŒŸ โœจ

Got this

#

should I do the ensurepath?

lilac dune
#

Sure, why not.

#

Because we're doing it the proper way.

tender scroll
#

zsh: command not found: mysql2sqlite :S

lilac dune
#

It told you it's not in your path

#

Allso please post this in code blocks, no pictures.

tender scroll
#

but I did the ensurepath

#

sorry

lilac dune
#

If you do I can copy paste your paths from there and give you a proper command. I don't want to retype it.

#

It's also easier to read.

tender scroll
#
โžœ  ~ pipx install mysql-to-sqlite3
  installed package mysql-to-sqlite3 2.1.10, installed using Python 3.11.8
  These apps are now globally available
    - mysql2sqlite
โš ๏ธ  Note: '/root/.local/bin' is not on your PATH environment variable. These apps will not be globally accessible until your PATH is
    updated. Run `pipx ensurepath` to automatically add it, or manually modify your PATH in your shell's config file (i.e. ~/.bashrc).
done! โœจ ๐ŸŒŸ โœจ
โžœ  ~
โžœ  ~
โžœ  ~ pipx ensurepath
Success! Added /root/.local/bin to the PATH environment variable.

Consider adding shell completions for pipx. Run 'pipx completions' for instructions.

You will need to open a new terminal or re-login for the PATH changes to take effect.

Otherwise pipx is ready to go! โœจ ๐ŸŒŸ โœจ
โžœ  ~ ha core stop
Processing... Done.

Command completed successfully.
โžœ  ~ mysql2sqlite -f /config/home-assistant_v2.db -d homeassistant -u homeassistant -h homeassistant.local -P 3306 -p -V --use-buffered-cursors
zsh: command not found: mysql2sqlite
โžœ  ~ zsh: command not found: mysql2sqlite
lilac dune
#

Try /root/.local/bin/mysql2sqlite ....

tender scroll
#

cd there? Or just add that instead of just the name at the beginning of the command?

lilac dune
#

No.

#

You can also run export PATH="/root/.local/bin:$PATH" to make it runnable with just mysql2sqlite

#

RIght now it doesn't know where mysql2sqlite is. There's specific paths it looks at to check for commands

#

Try echo $PATH.

tender scroll
#

migrating now

#

tahnk you very much

lilac dune
#

Checking space usage and shrinking/migrating database

tender scroll
lilac dune
#

Which commands?

tender scroll
#

the ones shown at the bottom

lilac dune
#

I don't see commands?

#

Oh.

tender scroll
#

I mean key shortcuts

lilac dune
#

Yeah sorry it took me a second. Just CTRL+X and so on.

tender scroll
lilac dune
#

Just enter.

tender scroll
#

that is SO misleading lol

lilac dune
#

nano is one of the easiest editors. You should try vi ๐Ÿ˜„

tender scroll
#

soooo... after rebooting with the sqlite db now the db is empty xD

#

the migration seemed to work, and no errors

#

but now history shows nothing. Not even the old data from the initial months

#

will ask in the forum thread :S

#

config/home-assistant_v2.db.corrupt.2024-03-13T21:03:18.844016+00:00

#

maybe that's a flag

lilac dune
#

Yeah I'm not sure if that conversion works properly.

tender scroll
lilac dune
#

Guess there's a reason the devs don't have docs for it if it were that simple.

tender scroll
lilac dune
#

Did you stop HA while doing the migration?

tender scroll
#

yep

tender scroll
#

I managed to make it work trying the migration again. But this time, instead of migrating into the old original HA database and merging with it, I did it into a brand new db. And it worked fine. Obv I don't have the data from those first HA months now though.

But still have the file from the first migration which holds that data. Do you know if there's a way to properly migrate that date range only into the current working sqlite db?

tender scroll
#

nvm, I'll live without it

#

just happy the migration is over and back to sqlite

tender scroll
#

SO happy right now