maiden verge Aug 11, 2024, 1:17 PM

#

@real jungle

midnight karma Aug 11, 2024, 1:19 PM

#

im sorry i dont know these things

maiden verge Aug 11, 2024, 1:19 PM

#

Can u tag

#

Who know

#

@stray gulch

#

Hello brother

#

@livid solstice

livid solstice Aug 11, 2024, 1:20 PM

#

please don't ping random people.

maiden verge Aug 11, 2024, 1:22 PM

#

livid solstice please don't ping random people.

Can u help

livid solstice Aug 11, 2024, 1:26 PM

#

no

maiden verge Aug 11, 2024, 1:26 PM

#

livid solstice no

Why

brisk flame Aug 11, 2024, 3:40 PM

#

What do you think about an ORM dedicated only to creating tables, validating and displaying those tables via GUI? I want to create such a tool for fun, because I think ORM is mainly helpful for table management and validation, but imo creating queries in raw SQL is easy enough that I don't need this functionality.

keen minnow Aug 11, 2024, 10:54 PM

#

brisk flame What do you think about an ORM dedicated only to creating tables, validating and...

Two main feedback from me would be:

To manage my schemas I use tools like liquibase or flyway
I do like to use ORMs for everything else

mint wharf Aug 12, 2024, 12:14 AM

#

I'm trying to learn AWS features in prep for an interview, but I don't know how to create a database without costing me. It says it's in free tier, but both options for configuring Dynamo and RDS are saying they'll cost me. Any help?

wise goblet Aug 12, 2024, 12:49 AM

#

brisk flame What do you think about an ORM dedicated only to creating tables, validating and...

i could add to answer of @keen minnow that i saw quite elegantly made and feature rich schema migrating tool Atlas https://atlasgo.io/
That could be used as inpiration for making own tool. Luquibase/Flyway looked way more messy in comparison

Atlas | Manage your database schema as code

wise goblet Aug 12, 2024, 12:51 AM

#

brisk flame What do you think about an ORM dedicated only to creating tables, validating and...

but imo creating queries in raw SQL is easy enough that I don't need this functionality.
i think it is still important feature for static typing safety. At least lightweight abstraction helping with SQL query building as SQLAlchemy provides or Golang-bun provides is very nice to have

#

i don't use any ORMs only if i develop tool... which as basis i build around smth which has docs providing me with Raw SQL only

#

in this case i am lazy to translate to ORM, and can go raw, since usually this tool is small enough anyway and i would not benefit from using ORM

#

otherwise if it is any even small tool that has a lot of CRUD stuff... then ORM is obligatory for me to simplify things and having type safety

#

TLDR: The essential problem is with Raw SQL that it is unsafe.

#

but at the same time it could be nice having tool like Atlas that defines tables, helps with migrations

hallow pelican Aug 12, 2024, 12:54 AM

#

Hello,

I'm going to start a project with FastApi, and it would also be my first Python project, so I would like some advice about the stack.
I was going to use SQL Alchemy because it was what I had seen a lot, but then I read about Prisma Python Client, and I've already used Prisma but in Node and it's great, how is it with Python?

wise goblet Aug 12, 2024, 12:54 AM

#

Atlas kind of sucks in having almost everything behind paywall

#

would be nice having similar tool as FOSS

wise goblet Aug 12, 2024, 12:55 AM

#

hallow pelican Hello, I'm going to start a project with FastApi, and it would also be my first...

i would make an educated guess it will suck completely unless you are node.js dev

#

at least that's how my Prisma experience went i tried to check its usage for golang client

#

better use Python native solutions like SQLAlchemy or Django ORM

#

they will provide with far better experience, because both natively intergrate with Python

#

and provide even full static typing capabilities (in case of SQLAlchemy ORM)

#

Prima will double suck, because its models aren't even described in Python, but in its own custom language/files

hallow pelican Aug 12, 2024, 12:58 AM

#

wise goblet i would make an educated guess it will suck completely unless you are node.js de...

Thank you very much, this is the kind of information I was looking for 🫰🏻

wise goblet Aug 12, 2024, 12:59 AM

#

hallow pelican Thank you very much, this is the kind of information I was looking for 🫰🏻

basically if u use SQLAlchemy (with mypy integration optionally for best experience)
you will have Python native acccessed data structs, so it will for you like u use Typescript stuff in full capabilities
Mypy integration will especially make extra things top notch

#

and as far as i see Python Prisma Client has not really good level of... type safety or providing intellisence support in general

#

so it will be... full of runtime surprises thing

#

that is at least what i see from short Prima offered doc examples, and is suspect there is nothing more to it 🤔

#

Although Prisma does have Pyright integration

#

may be it is not that bad

#

    post = await db.post.create(
        {
            'title': 'Hello from prisma!',
            'desc': 'Prisma is a database toolkit and makes databases easy.',
            'published': True,
        }
    )

Still

#

Prisma example has DICTIONARY as input

#

that's nightmare

#

i can't seem them being capable to catch it with Pyright

hallow pelican Aug 12, 2024, 1:02 AM

#

I'd rather go the safe route then since this api will move money hehe

wise goblet Aug 12, 2024, 1:03 AM

#

hallow pelican I'd rather go the safe route then since this api will move money hehe

never heard python devs also using Prisma in general too

#

so your python devs will be way more surprised too if u go with it

#

SQLAlchemy has nice module called Alembic to manage migrations and even can auto generate them based on introspection of a database for you
Should be quite good

hallow pelican Aug 12, 2024, 1:04 AM

#

thank you for taking the time to reply dude

candid mica Aug 12, 2024, 1:26 AM

#

hello does anyone know to link sqlserver with django python

maiden verge Aug 12, 2024, 12:17 PM

#

livid solstice no

padhe likhe gawar 💀

maiden verge Aug 12, 2024, 12:17 PM

#

candid mica hello does anyone know to link sqlserver with django python

not python but ik how to link with xammp

cedar tiger Aug 12, 2024, 1:58 PM

#

candid mica hello does anyone know to link sqlserver with django python

!pip mssql-django was suggested to you. Did you try it?

delicate fieldBOT Aug 12, 2024, 1:58 PM

#

mssql-django v1.5

Django backend for Microsoft SQL Server

Released on <t:1714172280:D>.

last panther Aug 12, 2024, 3:35 PM

#

ok so i'm in a debate with this guy at work for large mysql tables. 😦

#

to either partition large table or create explicit new archive table

coral wasp Aug 12, 2024, 4:03 PM

#

last panther to either partition large table or create explicit new archive table

I try to keep history separate from 'current' data, rather than using a single table for both.

#

One big reason is recovery: I don't want to have to restore massive tables to get my app back online.

last panther Aug 12, 2024, 4:05 PM

#

coral wasp I try to keep history separate from 'current' data, rather than using a single t...

So you would rather write unions or seperate queries.

coral wasp Aug 12, 2024, 4:06 PM

#

No, why would I need to union? If I have some operational data that I also need a history for, I keep it separate. So I only need to query one side

#

I'm eating the cost of two writes for the benefit of query time

last panther Aug 12, 2024, 4:07 PM

#

We have an app that would have to query the current and archive .

coral wasp Aug 12, 2024, 4:09 PM

#

We're just talking abstractly right now, could you explain more?

last panther Aug 12, 2024, 4:11 PM

#

we have a large table. and we want to either put records < 2020 in a different table or create a partition on the table by the date field with <2020

#

We have a use case that would have to scan all partitions or union all tables.

#

a query by only id and not a date

coral wasp Aug 12, 2024, 4:20 PM

#

Nothing wrong with just throwing a partition on the year of a history table. Makes it easy to drop a year, or recover specific years.

#

I've done both tho, but problem with a year specific table is needing another one in a few years,
And so on

#

Lately, I've just been putting history in hive partitioned parquet files, but that's a different story

last panther Aug 12, 2024, 4:23 PM

#

coral wasp I've done both tho, but problem with a year specific table is needing another on...

Yes. It's alot easier to throw a partion than to create new tables every year.

#

Models don't have to be updated on all the applications.

coral wasp Aug 12, 2024, 4:29 PM

#

MySQL also isn't so great at parallelizing queries, so I don't love it for large historical analytical queries

last panther Aug 12, 2024, 4:30 PM

#

This company has been using mysql since the 90's

coral wasp Aug 12, 2024, 4:30 PM

#

A lot of my MySQL work was working around its scale limitations, a lot of denormalization and 'manual' parallelization

#

Oh, I ran a SaaS backend on MySQL, sharded the heck out of it. We got it done, just needed a lot of love

last panther Aug 12, 2024, 4:40 PM

#

Thanks for help.

brisk flame Aug 12, 2024, 8:02 PM

#

wise goblet but at the same time it could be nice having tool like Atlas that defines tables...

yess sir, I'm thinking about simple table-maker tool with Atlas support. At this stage of development it's dataclass with SQL-style syntax, example:

class Color(BaseTable):
    name: VARCHAR(max=50)
    rgb: VARCHAR(max=6)

torn sphinx Aug 12, 2024, 8:04 PM

#

is this stupid

#

i e slow

#

class DuckDBLogger:
    def __init__(self):
        self.con = duckdb.connect()
        self.con.sql("CREATE TABLE metrics (name VARCHAR PRIMARY KEY)")
        self.con.sql("CREATE TABLE steps (step INTEGER PRIMARY KEY)")
        self.con.sql("CREATE TABLE logs (metric VARCHAR REFERENCES metrics(name), step INTEGER REFERENCES steps(step), value FLOAT)")
        
    def log(self, step, metric, value):
        self.con.sql(f"INSERT OR IGNORE INTO metrics VALUES ('{metric}')")
        self.con.sql(f"INSERT OR IGNORE INTO steps VALUES ({step})")
        self.con.sql(f"INSERT INTO logs VALUES ('{metric}', {step}, {value})")

#

i mean yes it is slow but is it slow because it is badly written or because databases won't help with that

#

its actually 5688.87785007 times slower than a nested python dictionary

coral wasp Aug 12, 2024, 8:36 PM

#

torn sphinx is this stupid

Yes, don't use duckdb to write logs.

#

It's the wrong kind of database. Use almost any other database.

#

Preferably, just stream to a .csv file.

#

Also, never write SQL queries with f-strings like that:

#

!sql

delicate fieldBOT Aug 12, 2024, 8:37 PM

#

SQL & f-strings

Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.

Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.

For example, the sqlite3 package supports using ? as a placeholder:

query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)

Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.

See Also

Python sqlite3 docs - How to use placeholders to bind values in SQL queries
PEP-249 - A specification of how database libraries in Python should work

fallen vault Aug 12, 2024, 9:07 PM

#

Would it be bad practice to have my database open to all IP addresses when setting up a azure Postgres database?

thorny anchor Aug 12, 2024, 10:26 PM

#

yes

frank cloak Aug 13, 2024, 1:24 AM

#

table.sql is like this:

CREATE TABLE IF NOT EXISTS BLACKLIST(
  user_id BIGINT NOT NULL PRIMARY KEY,
  reason TEXT,
);

vs something like this:

https://paste.pythondiscord.com/I2QA

I am trying to convert my old schema.sql to the original table.sql

#

I could probaly find the old things

#

but it would be way easier to conver it from the schema.sql to table.sql

#

and I'll send the schema.sql if someone replies with ping only
(so I know you responded)

torn sphinx Aug 13, 2024, 6:43 AM

#

sqlite is still 16.4110429335 times slower than python nested dictionary for logging

torn sphinx Aug 13, 2024, 6:44 AM

#

coral wasp Preferably, just stream to a .csv file.

I would need to know all metrics beforehand which is a little bit annoying

#

next up I try redis

#

nooooo they dont have a redis tutorial on w3s how am I going to be able to understand it

#

"Redis is not officially supported on Windows"

#

we will see about that

torn sphinx Aug 13, 2024, 7:07 AM

#

torn sphinx "Redis is not officially supported on Windows"

it was actually true💀

paper flower Aug 13, 2024, 7:19 AM

#

torn sphinx sqlite is still 16.4110429335 times slower than python nested dictionary for log...

You can't compare sqlite with a dictionary and redis they all offer different functionality

torn sphinx Aug 13, 2024, 7:21 AM

#

paper flower You can't compare sqlite with a dictionary and redis they all offer different fu...

I am comparing their performance for my logging

paper flower Aug 13, 2024, 7:21 AM

#

How would dictionary be useful for logging though? 🤔

torn sphinx Aug 13, 2024, 7:22 AM

#

paper flower How would dictionary be useful for logging though? 🤔

this is what im using

{
"metric1":{
  1: 100,
  4: 200,
  7: 300,
  },
"metric2":{
  2: 15,
  3: 25,
  },
}

paper flower Aug 13, 2024, 7:22 AM

#

Well, would they be persistent anywhere?

#

How would you fetch them?

#

What operations would you perform on these logs? E.g. finding averages, means, etc

#

Writing logs/metrics isn't that useful unless you can do something with them later

torn sphinx Aug 13, 2024, 7:24 AM

#

paper flower What operations would you perform on these logs? E.g. finding averages, means, e...

I just do it with python, convert it to numpy and do stuff return np.array(list(self.logs[key].values()), copy = False)

#

fetching is not an issue, its instant, only writing

paper flower Aug 13, 2024, 7:25 AM

#

If your application crashes or just exists you lose all of your data though

torn sphinx Aug 13, 2024, 7:25 AM

#

oh yeah I save logs into numpy savez compressed

paper flower Aug 13, 2024, 7:25 AM

#

Also memory could be a concern depending on how long your application runs for and how many metrics you collect

torn sphinx Aug 13, 2024, 7:25 AM

#

true as well

#

but I am trying to find an alternative and everything I tried is much slower

paper flower Aug 13, 2024, 7:26 AM

#

torn sphinx oh yeah I save logs into numpy savez compressed

That's why using something like sql is just easier than to deal with numpy, etc

torn sphinx Aug 13, 2024, 7:31 AM

#

paper flower That's why using something like sql is just easier than to deal with numpy, etc

its actually very easy with numpy, because it savez_compressed basically saves a dictionary, ease of use is not an issue, its mainly writing performance

paper flower Aug 13, 2024, 7:31 AM

#

I mean, you'd have to know how that was packaged, etc, and it would only be accessible with python

#

sql is much easier to use

#

Also regarding write performance - you didn't share how you actually benchmarked it
also I'm not completely sure if you want metrics or logs

torn sphinx Aug 13, 2024, 7:32 AM

#

thats sort of true about sql, how its packaged depends on database and databases can have different column names

#

this is my benchmarking code

dlogger = DictLogger()
from glio.python_tools import perf_counter_context
with perf_counter_context():
    for i in range(1_000_000):
        dlogger.log(i, "loss", 1.)
        dlogger.log(i, "accuracy", 1.)
        dlogger.log(i, "dice", 1.)
        dlogger.log(i, "iou", 1.)
        if i % 2 == 0: dlogger.log(i, "loss2", 1.)
        if i % 2 == 0: dlogger.log(i, "acc2", 1.)
        if i % 4 == 0: dlogger.log(i, "dice4", 1.)
        if i % 4 == 0: dlogger.log(i, "iou4", 1.)
        if i % 10 == 0: dlogger.log(i, "loss10", 1.)
        if i % 10 == 0: dlogger.log(i, "dice10", 1.)

paper flower Aug 13, 2024, 7:33 AM

#

So, it's just logs

#

not some sort of metrics?

torn sphinx Aug 13, 2024, 7:33 AM

#

well it logs metrics

#

dictionary is 2.6316637999843806 perf_counter seconds
sqlite is 44.37015440012328 perf_counter seconds
and duckdb I had to reduce number of iterations to 1000 and it still took 11.377755700144917 perf_counter seconds

paper flower Aug 13, 2024, 7:35 AM

#

Imo there's a difference between metrics and logs, logs usually are just strings, when metrics could be something more structured 🤔 Honestly I don't have a good explanation for that

#

For logs you can just use logging

torn sphinx Aug 13, 2024, 7:36 AM

#

paper flower Imo there's a difference between metrics and logs, logs usually are just strings...

it logs metrics, I just put "1." as the value everywhere for performance testing

#

but 1 is replaced by the value of the metric

#

dlogger.log(i, "accuracy", 1.) means at step i metric accuracy was 1

paper flower Aug 13, 2024, 7:37 AM

#

Seeing your duckdb code - I don't think you need two of those tables at all? 🤔

#

Also you're not batching your operations in any way

#

Why insert records one after the other in different sql operations when you can insert like 10 thousands of them at once

torn sphinx Aug 13, 2024, 7:38 AM

#

paper flower Seeing your duckdb code - I don't think you need two of those tables at all? 🤔

is it possible to have 1 table if some metrics are not logged at some steps? Like some metrics are logged only once every 128 stepzs

paper flower Aug 13, 2024, 7:39 AM

#

In your code you just insert the same values into metrics and steps tables, so they're kind of redundant, no?

torn sphinx Aug 13, 2024, 7:40 AM

#

paper flower In your code you just insert the same values into `metrics` and `steps` tables, ...

probably

paper flower Aug 13, 2024, 7:41 AM

#

Can you share your dict code too?

torn sphinx Aug 13, 2024, 7:41 AM

#

class DictLogger:
    def __init__(self):
        self.logs = {}

    def log(self, step, metric, value):
        if metric not in self.logs: self.logs[metric] = {step: value}
        else: self.logs[metric][step] = value

near tapir Aug 13, 2024, 7:45 AM

#

Miscellaneous Data Types

Transact-SQL supports several data types that do not belong to any of the data type groups described previously:

Binary data types
BIT
Large object data types
CURSOR (discussed in Chapter 8)
UNIQUEIDENTIFIER
SQL_VARIANT
TABLE (discussed in Chapters 5 and 8)
XML (discussed in the previous edition of this book)
Spatial (e.g., GEOGRAPHY and GEOMETRY) data types (discussed in Chapter 30)
HIERARCHYID
TIMESTAMP
User-defined data types (discussed in Chapter 5)

Why is TIMESTAMP included in miscellaneous? I thought it would be a temporal data type

torn sphinx Aug 13, 2024, 7:54 AM

#

is this correct in sqlite "INSERT INTO logs (step, ?) VALUES (?, ?)"? It says OperationalError: near "?": syntax error, even though I passed it a length 3 list

#

before I had "?" not being replaced because of quotes

paper flower Aug 13, 2024, 7:54 AM

#

torn sphinx ```py class DictLogger: def __init__(self): self.logs = {} def ...

For 100k records with sqlite it takes me 15 seconds, and with dict logger 0.05, did youconsider how much time does it take for numpy to do the whole processing too? I don't think it's that slow though

#

Also you can always send logs in the background thread or make use of asyncio

torn sphinx Aug 13, 2024, 7:56 AM

#

paper flower For 100k records with sqlite it takes me 15 seconds, and with dict logger 0.05, ...

I didn't benchmark numpy processing because it seems instant

paper flower Aug 13, 2024, 7:56 AM

#

torn sphinx is this correct in sqlite `"INSERT INTO logs (step, ?) VALUES (?, ?)"`? It says ...

I didn't write raw sql in a while

#

(I was using an orm for this example, so this may be the reason for it being slow 😅)

torn sphinx Aug 13, 2024, 8:00 AM

#

torn sphinx is this correct in sqlite `"INSERT INTO logs (step, ?) VALUES (?, ?)"`? It says ...

it was because one of the substituted strings was empty

#

still weird, why would this not work self.cur.execute("ALTER TABLE logs ADD COLUMN ? FLOAT", [metric, ]), it says OperationalError: near "?": syntax error, even though metric = "a", why would it not subsititute

#

this works fine self.cur.execute(f"ALTER TABLE logs ADD COLUMN {metric} FLOAT")

#

man substituting is so weird in sqlite

paper flower Aug 13, 2024, 8:10 AM

#

Just tested, and simply inserting all records into a sqlite db shouldn't take that long

#

import time
from collections import defaultdict

from sqlalchemy import create_engine, insert
from sqlalchemy.orm import Mapped, mapped_column, \
    sessionmaker, DeclarativeBase


class Base(DeclarativeBase):
    pass


class Metric(Base):
    __tablename__ = "metric"

    id: Mapped[int] = mapped_column(primary_key=True)
    iteration: Mapped[int]
    name: Mapped[str]
    value: Mapped[float]


engine = create_engine("sqlite:///db.sqlite3")
session_factory = sessionmaker(bind=engine)


class DictLogger:
    def __init__(self):
        self.logs = defaultdict(dict)

    def log(self, step, metric, value):
        self.logs[metric][step] = value


def main() -> None:
    Base.metadata.create_all(bind=engine)

    t = time.perf_counter()
    with engine.begin() as connection:
        logger = DictLogger()
        for i in range(100_000):
            logger.log(i, "loss", 1.)
            logger.log(i, "accuracy", 1.)
            logger.log(i, "dice", 1.)
            logger.log(i, "iou", 1.)
            if i % 2 == 0: logger.log(i, "loss2", 1.)
            if i % 2 == 0: logger.log(i, "acc2", 1.)
            if i % 4 == 0: logger.log(i, "dice4", 1.)
            if i % 4 == 0: logger.log(i, "iou4", 1.)
            if i % 10 == 0: logger.log(i, "loss10", 1.)
            if i % 10 == 0: logger.log(i, "dice10", 1.)

        values = []
        for metric, obj in logger.logs.items():
            for step, value in obj.items():
                values.append({"name": metric, "iteration": step, "value": value})

        connection.execute(insert(Metric), values)
    print(time.perf_counter() - t)


if __name__ == "__main__":
    main()

#

You can swith to just sqlite here, but I'm just used to sqlalchemy

#

It still takes around 10 times longer, which is to be expected, but you can parallelize your operations and writes

torn sphinx Aug 13, 2024, 8:14 AM

#

I just tried 1 table batched yeah it takes 10 times longer

#

I need to try asyncio now

paper flower Aug 13, 2024, 8:19 AM

#

asyncio would be beneficial if you just do that in the background

coral wasp Aug 13, 2024, 11:32 AM

#

torn sphinx still weird, why would this not work `self.cur.execute("ALTER TABLE logs ADD COL...

You can't parameterize colunn or table names. Just values.

torn sphinx Aug 13, 2024, 11:33 AM

#

coral wasp You can't parameterize colunn or table names. Just values.

Oh, do I need to somehow avoid f strings there though?

coral wasp Aug 13, 2024, 11:34 AM

#

torn sphinx Oh, do I need to somehow avoid f strings there though?

You can't avoid them for dynamic columns, but you'd need to sanitize the inputs to avoid an injection attack.

pastel vale Aug 13, 2024, 3:23 PM

#

torn sphinx Oh, do I need to somehow avoid f strings there though?

And make sure to put your variables in brackets for your placeholders e.g. (var1, ) because for some reason it wont recognise your variable as one variable but as several different variables

#

Someone here, much smarter than me, will explain it better

whole mica Aug 13, 2024, 3:41 PM

#

I am currently using MongoDB for saving data and PyMongo in Python.
Here is what my data look like:

{
    "_id": 123,
    "weapons": [
        { "name": "Sword", "durability": 100 },
        { "name": "Shield", "durability": 50 }
    ]
}

{
    "_id": 456,
    "weapons": [
        { "name": "Sword", "durability": 100 },
        { "name": "Shield", "durability": 0 }
    ]
}

{
    "_id": 789,
    "weapons": [
        { "name": "Sword", "durability": 50 },
        { "name": "Hammer", "durability": 50 }
    ]
}

how can I filter out a list of "_id" that does not have a "name": "Hammer" in its "weapons" ( i.e. [ 123, 456 ] )?

coral wasp Aug 13, 2024, 3:58 PM

#

pastel vale And make sure to put your variables in brackets for your placeholders e.g. (var1...

(Var1,) is a tuple containing one element. (Var1) is just var1

#

The trailing comma makes a difference

cedar tiger Aug 13, 2024, 4:02 PM

#

whole mica I am currently using MongoDB for saving data and PyMongo in Python. Here is wha...

Are each of these _id in a list?

whole mica Aug 13, 2024, 4:11 PM

#

cedar tiger Are each of these _id in a list?

yes

#

or i think it can be a cursor using find() because i would like to add a hammer to each of those who dont have it

cedar tiger Aug 13, 2024, 4:12 PM

#

then loop over each dictionary ids and loop over each weapons name to find the Hammer. If its not Hammer, add it to a new list

#

oh you wanna add a hammer to ids that don't have it

whole mica Aug 13, 2024, 4:13 PM

#

cedar tiger then loop over each dictionary ids and loop over each weapons name to find the H...

had this idea but would it be slow if there are many entries? im not sure about it 👀

whole mica Aug 13, 2024, 4:13 PM

#

cedar tiger oh you wanna add a hammer to ids that don't have it

yep lol

cedar tiger Aug 13, 2024, 4:13 PM

#

yeah then I guess do that double loop I mentioned and then append hammer to it after

cedar tiger Aug 13, 2024, 4:13 PM

#

whole mica had this idea but would it be slow if there are many entries? im not sure about ...

Yeah it would. Easier if its in SQL.

whole mica Aug 13, 2024, 4:14 PM

#

cedar tiger Yeah it would. Easier if its in SQL.

oof alright thanks lemme have a try

grim vault Aug 13, 2024, 4:35 PM

#

whole mica oof alright thanks lemme have a try

!e

data_list = [
  {"_id": 123, "weapons": [{"name": "Sword", "durability": 100}, {"name": "Shield", "durability": 50}]},
  {"_id": 456, "weapons": [{"name": "Sword", "durability": 100}, {"name": "Shield", "durability": 0}]},
  {"_id": 789, "weapons": [{"name": "Sword", "durability": 50}, {"name": "Hammer", "durability": 50}]}
]


def check_for_weapon(weapons_list, weapon_name):
    for weapon in weapons_list:
        if weapon["name"].upper() == weapon_name.upper():
            return True
    return False


weapon_all_should_have = "Hammer"
for row in data_list:
    if not check_for_weapon(row["weapons"], weapon_all_should_have):
        print(f"{row['_id']} needs a {weapon_all_should_have}")

delicate fieldBOT Aug 13, 2024, 4:35 PM

#

grim vault !e ```py data_list = [ {"_id": 123, "weapons": [{"name": "Sword", "durability"...

:white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | 123 needs a Hammer
002 | 456 needs a Hammer

whole mica Aug 13, 2024, 4:36 PM

#

grim vault !e ```py data_list = [ {"_id": 123, "weapons": [{"name": "Sword", "durability"...

😮 thanks

#

so it still needs a double loop

paper flower Aug 14, 2024, 10:36 AM

#

whole mica I am currently using MongoDB for saving data and PyMongo in Python. Here is wha...

@whole mica

db.collection.find({
  "weapons.name": {
    "$nin": [
      "Hammer"
    ]
  }
})

#

https://www.mongodb.com/docs/manual/reference/operator/query/nin/#op._S_nin

$nin

#

https://mongoplayground.net/p/8SijSeQjJi4

Mongo playground

Mongo playground: a simple sandbox to test and share MongoDB queries online

whole mica Aug 14, 2024, 10:39 AM

#

paper flower <@673484505771802635> ```py db.collection.find({ "weapons.name": { "$nin"...

:O thanks a lot!!! by the way is there a way to search for the value instead of the key, like searching for those with a durability of 50 (i.e. [123, 789])? 🤔

paper flower Aug 14, 2024, 10:39 AM

#

I think you can add an and condition, too there

#

No idea how to do it 😅

#

I don't use mongodb at all, mostly just sql

#

To only get ids you can add a projection:

db.collection.find({
  "weapons.name": {
    "$nin": [
      "Hammer"
    ]
  }
},
{
  "_id": 1
},
)

paper flower Aug 14, 2024, 10:41 AM

#

whole mica :O thanks a lot!!! by the way is there a way to search for the value instead of ...

You want a specific weapon with a durability of 50 or just any weapon?

#

e.g. sword with durability 50

whole mica Aug 14, 2024, 10:42 AM

#

paper flower You want a specific weapon with a durability of 50 or just any weapon?

any weapon

paper flower Aug 14, 2024, 10:42 AM

#

Just

db.collection.find({
  "weapons.durability": 50
},
)

?

#

It's kind of the same

#

Seems like in firs query you can do the same instead of using nin operator

#

Ah, no, nvm 😅

whole mica Aug 14, 2024, 10:44 AM

#

paper flower Just ```py db.collection.find({ "weapons.durability": 50 }, ) ``` ?

wait :O it works!!!

#

i thought like weapons.name and weapons.durability wont work if weapons is a list of objects, but came out i was wrong

paper flower Aug 14, 2024, 10:45 AM

#

If you want a specific weapon youl could use $elemMatch:

db.collection.find({
  "weapons": {
    "$elemMatch": {
      "name": "Sword",
      "durability": 50
    }
  }
},
)

#

And you should be able to use comparison operators too instead of just 50 if you need that

whole mica Aug 14, 2024, 10:46 AM

#

thanks a lot!! 🙌

west blaze Aug 14, 2024, 12:42 PM

#

Hi! Can someone tell me if it makes sense in context of django/psql to create a generic model class with fields like created_at, deleted_at and perhaps some helper classmethods like create/delete for other actual models to inherit from?

paper flower Aug 14, 2024, 12:44 PM

#

west blaze Hi! Can someone tell me if it makes sense in context of django/psql to create a ...

A mixin class would probably do, but you don't need a create/delete method here 👀

#

Django's managers already have create/delete methods, and created_at is supported by django itself

#

For deleted objects you can just mark them as deleted instead of calling delete

west blaze Aug 14, 2024, 12:51 PM

#

paper flower Django's managers already have create/delete methods, and created_at is supporte...

Hm, thats wierd. I created a model, made a migration that added the model itself, than made a data migration to populate the table of the said model and I don't have a created_at column 👀

the only column that was actually autogenerated was id PKey

west blaze Aug 14, 2024, 12:52 PM

#

paper flower A mixin class would probably do, but you don't need a `create/delete` method her...

Ah, I see

paper flower Aug 14, 2024, 12:52 PM

#

west blaze Hm, thats wierd. I created a model, made a migration that added the model itself...

The column you have to add yourself, but there's a parameter for DateTimeField 🤔

keen ravine Aug 14, 2024, 3:03 PM

#

I am starting to work on a timescaledb project(db built on top of postgres for time series). I am working on my wsl2. I just pulled the official image. I want to know if I am not using python, how do I work on package management and dependency issues? My PC is trash so I do not want to trash it further by keep installing on my local, and I do not want to use virtual machine.

storm mauve Aug 14, 2024, 3:06 PM

#

Which "package management and dependency issues" exactly?

It depends heavily on which manager you are using

fading patrol Aug 14, 2024, 4:20 PM

#

keen ravine I am starting to work on a timescaledb project(db built on top of postgres for t...

If a venv isn't good enough, then sounds like Docker is what you need

keen ravine Aug 14, 2024, 4:36 PM

#

fading patrol If a venv isn't good enough, then sounds like Docker is what you need

Someone told me to use vscode devcontainers, I want to develop, scale and deploy the project, I dont understand why I should run as entire os on docker just to code.

fading patrol Aug 14, 2024, 4:54 PM

#

keen ravine Someone told me to use vscode devcontainers, I want to develop, scale and deploy...

Why should you run a container? Precisely because you said you're having trouble managing dependencies and that you don't want to install things locally. What other solution could there be?

I'm assuming you've already looked at venv. If that's good enough, then that's what you should use. Maybe layer Poetry on top of that if you really want. But to me Devcontainers is just simple and easy.

#

BTW, this has absolutely nothing to do with this channel, please move it over to #tools-and-devops if you have more questions

keen ravine Aug 14, 2024, 5:33 PM

#

fading patrol Why should you run a container? Precisely because you said you're having trouble...

Thanks

dry cipher Aug 15, 2024, 12:12 AM

#

how can I use python to create an edit a txt file, for example I can put save data. however the open() wont let me do it is there a way to do it without root access

fierce kite Aug 15, 2024, 12:33 AM

#

dry cipher how can I use python to create an edit a txt file, for example I can put save da...

open can write to a text file. What are you unable to do?

dry cipher Aug 15, 2024, 12:35 AM

#

welll

fierce kite Aug 15, 2024, 12:39 AM

#

Looks like permissions for the location you are saving the file. I haven't run into that issue.

dry cipher Aug 15, 2024, 12:40 AM

#

hmmm its in a foulder that i created

fierce kite Aug 15, 2024, 12:43 AM

#

Maybe it is the permission level of the python interpreter

#

Is the file being saved in the same location as the .py file?

dry cipher Aug 15, 2024, 1:09 AM

#

yea

obsidian basin Aug 15, 2024, 1:26 AM

#

Hello I am using flask and flask sqlachemy. To create my db I am using flask migrate I am wondering if there are better and more advance ways like postgresSQL. I am going to research postgresSQL more but will it work with flask and create the db with flask sqlachemy ? Also is it easy to setup in flask.

wise goblet Aug 15, 2024, 2:01 AM

#

obsidian basin Hello I am using flask and flask sqlachemy. To create my db I am using flask mig...

postgresql is supported by sqlalchemy

#

but your migrations created for Sqlite3 highly likely will not work for it 😉

#

so you would have to "regenerate" migrations

obsidian basin Aug 15, 2024, 2:02 AM

#

If I am understanding correctly that okay I haven't launched a site so it is nothing important

#

Thanks

#

Do you think postresql is the best tool to use

wise goblet Aug 15, 2024, 2:15 AM

#

obsidian basin Do you think postresql is the best tool to use

that depends on a project and somewhat even programming language.
for web related pet projects which have user database, Sqlite3 is kind of more lazy option to go for as default
if you utilize a good programming language and project is simple, depending on its data... less conventional dbs like redis or mongodb can fit too, or even just storing in memory of a program (+using some of its native storing in file) and that's enough too. some projects run on stuff like etcd and etc, it is very app specific.
if you develop backend api for production for work (once again we have user database we wish to persist), then postgres is good default.

TLDR: it depends on project requirements.

#

Postgresql is nice in allowing to bruteforce its usability for everything

#

but... for simple enough pet projects it is too much, if u are lazy to maintain it, sqlite3 is better then

obsidian basin Aug 15, 2024, 2:58 AM

#

Thanks it is just a pet project which I hope grows but might not , I am just an amateur

paper flower Aug 15, 2024, 6:18 AM

#

wise goblet but your migrations created for Sqlite3 highly likely will not work for it 😉

Alembic migrations generated for sqlite generally should work with postgres

#

But postgres ones if you postgres specific data types won't

tawny orchid Aug 15, 2024, 7:55 PM

#

What should I do to make this query go fast?

SELECT * FROM words
WHERE
    (book_number, chapter_number, verse_number) IN (SELECT book_number, chapter_number, verse_number FROM words WHERE word LIKE 'jeesus%')
AND
    (book_number, chapter_number, verse_number) IN (SELECT book_number, chapter_number, verse_number FROM words WHERE word LIKE 'elää%');

Db is sqlite, and it takes over 200ms to respond from backend. I mean to run it on every key-stroke, though, but at least once second (or two 😄 ). ChatGPT suggested some bizarre join-gadget which was meant to eliminate subqueries, but it just worsened it. :O.

storm mauve Aug 15, 2024, 8:06 PM

#

tawny orchid What should I do to make this query go fast? ```sql SELECT * FROM words WHERE ...

use full text search instead of just matching ~~regex~~ LIKE on a text column

https://www.sqlite.org/fts5.html

tawny orchid Aug 15, 2024, 8:06 PM

#

i don't use regex. But I would need to transform the database

storm mauve Aug 15, 2024, 8:07 PM

#

How many rows do you have in that table?

tawny orchid Aug 15, 2024, 8:08 PM

#

bible words

storm mauve Aug 15, 2024, 8:08 PM

#

How many rows does that amounts to in your db?

tawny orchid Aug 15, 2024, 8:09 PM

#

some 54k

#

oops

#

*10

storm mauve Aug 15, 2024, 8:10 PM

#

so you expect to perform around a million string comparisons for each keystroke?

idk if there is some simpler way to create a text index, but I am pretty sure you will have to transform the database

grim vault Aug 15, 2024, 8:12 PM

#

This could be faster: ~~deleted~~ doesn't work like I thought.

tawny orchid Aug 15, 2024, 8:13 PM

#

Mmm... i really like the idea of of having flexible search i get with this. One can write words, and the wording need not be exact

storm mauve Aug 15, 2024, 8:13 PM

#

also, side note: storing each word as a separate row sounds like a pretty bad idea in first place but idk the best/right way to do it besides "use FTS"

tawny orchid Aug 15, 2024, 8:13 PM

#

full text search doesn't give that?

#

fts?

storm mauve Aug 15, 2024, 8:14 PM

#

the thing I linked earlier

tawny orchid Aug 15, 2024, 8:14 PM

#

ah

#

yes

#

i mean word counts if you start it correctly. It has to be 3 or more chars (or it won't generate sql for that). Words can be out of order too

#

😮 So i prefer not to have full text search, because this is more flexible

#

maybe i just don't send every keystroke, but instead set a timer and after a time make query

#

maybe im just imagining things and i should change the db ;D

storm mauve Aug 15, 2024, 8:17 PM

#

at least consider de-duplicating if you haven't yet I guess

tawny orchid Aug 15, 2024, 8:17 PM

#

(just making sure)

#

i haven't :< I just edited something and plöräytin it from pandas to db ;D

waxen finch Aug 15, 2024, 8:38 PM

#

tawny orchid i mean word counts if you start it correctly. It has to be 3 or more chars (or i...

i believe you can do that with FTS too using the trigram tokenizer and OR boolean operators, although i'd need a benchmark to say if it's a faster approach

waxen finch Aug 15, 2024, 9:05 PM

#

tawny orchid What should I do to make this query go fast? ```sql SELECT * FROM words WHERE ...

also, could this query be simplified to SELECT * FROM words WHERE word LIKE 'jeesus%' OR word LIKE 'elää%', or am i misunderstanding it?
edit: oh oops, i misread the AND operator in your query

tawny orchid Aug 16, 2024, 12:30 AM

#

i forgot --release 😮 It's still slow, but 3x faster still than before. Thanks for fts links 😄 I just kind of assumed it wasn't suitable and thus made the db one word per line. Why is it always I only learn after the mistakes. 😄

paper flower Aug 16, 2024, 3:12 AM

#

tawny orchid What should I do to make this query go fast? ```sql SELECT * FROM words WHERE ...

Honestly I don't undrestand this query 🤔 You're trying to find a word that contains search terms?

#

Hm, you could check how long it takes to just do a select with specific book_number, chapter_number and verse_number?

#

Maybe filtering is just slow

tawny orchid Aug 16, 2024, 6:19 AM

#

It searches bible verses that contain both search terms

cursive phoenix Aug 16, 2024, 7:52 AM

#

Hello All, I have a challenge in implementing a scenario. I have 2 databases. first is updating with data for every 10 min interval and the second database is updating 3 times a day. I have to compare data and similarities between these 2 databases. which tools fit for this scenario and how to do it

delicate fieldBOT Aug 16, 2024, 11:36 AM

#

:warning: Your 3.12 eval job has completed with return code 0.

[No output]

fading patrol Aug 16, 2024, 11:56 AM

#

cursive phoenix Hello All, I have a challenge in implementing a scenario. I have 2 databases. fi...

If these don't absolutely need to be two separate databases, they probably shouldn't be. Why are they separate?

If you really need to sync two unrelated databases with two different schemas there are discussions online about the possibilities like: https://softwareengineering.stackexchange.com/questions/288370/best-way-to-synchronize-data-between-two-different-databases

Software Engineering Stack Exchange

Best way to synchronize data between two different databases

I have to implement data synchronization between two big databases which have completely different structures. Basically, I need to grap some data about products in different tables in the first da...

west blaze Aug 16, 2024, 5:39 PM

#

Hey everyone! I got a quick question about the json module.


def populateLocations(apps, schema_editor):
    file = open('./data/geo_list_1.json')
    locations_data = json.load(file)

    for location_data in locations_data:      
        
        location = location_data
        Location.populate(**location_data)
    file.close()

Do I need to .close() the file after the loop, as shown above, or can I .close() the file right after it loads?
Also are there any constraints keeping the file "open" for a while?

grim vault Aug 16, 2024, 6:32 PM

#

You can close it after the load but maybe use the with statement:

def populateLocations(apps, schema_editor):
    with open('./data/geo_list_1.json') as file:
        locations_data = json.load(file)

    for location_data in locations_data:      
        Location.populate(**location_data)

tranquil island Aug 16, 2024, 10:12 PM

#

writes full stack app in lua wishing I could use python

near tapir Aug 17, 2024, 12:28 PM

#

Hey guys I understood the basic concept of CLUSTERED and NONCLUSTERED in MS SQL but I am unsure when's the right time to use either of the two. And I'm wondering if there are any disadvantages between them?

coral wasp Aug 17, 2024, 3:01 PM

#

Clustered indices?

near tapir Aug 18, 2024, 1:42 AM

#

coral wasp Clustered indices?

yep

#

In Transact-SQL, the names of all data types and system functions, such as CHARACTER and
INTEGER, are not reserved keywords. Therefore, they can be used to denote objects. (Do not
use data types and system functions as object names! Such use makes Transact-SQL statements
difficult to read and understand.)

They say it isn't a reserved keyword but why is it highlighted in my case? (I'm using SQL Server)

torn sphinx Aug 18, 2024, 1:07 PM

#

mongoDb is the best ngl

fading patrol Aug 18, 2024, 1:37 PM

#

torn sphinx mongoDb is the best ngl

https://www.youtube.com/watch?v=b2F-DItXtZs

YouTube

gar1t

Episode 1 - Mongo DB Is Web Scale

Q&A discussion discussing the merits of No SQL and relational databases.

▶ Play video

brazen charm Aug 18, 2024, 1:46 PM

#

old but gold

cinder spear Aug 18, 2024, 2:50 PM

#

thorny anchor Aug 18, 2024, 3:22 PM

#

near tapir > In Transact-SQL, the names of all data types and system functions, such as CHA...

it's like how you can use list in python as a variable name, but you really shouldn't

near tapir Aug 19, 2024, 10:54 AM

#

Hey guys, I'm trying to use MS SQL but there are so many variations? I don't know which one to choose.

astral harbor Aug 19, 2024, 11:03 AM

#

Hello there

#

I hope you are all doing well.

#

I am new and I need help, can someone please help me? I have a problem with relationships in my sqlalchemy database.

#

fading patrol Aug 19, 2024, 1:09 PM

#

near tapir Hey guys, I'm trying to use MS SQL but there are so many variations? I don't kno...

Who is managing your database? Ask them how you're supposed to connect, not us. But my guess would be TCP/IP

fading patrol Aug 19, 2024, 1:10 PM

#

astral harbor

It's warning you that whatever is at line 31 of your file is depricated. It should still work for now but you should put that error into a web search to find the relevant documentation, and then look at your code

near tapir Aug 19, 2024, 1:12 PM

#

fading patrol Who is managing your database? Ask them how you're supposed to connect, not us. ...

i'm managing the database

fading patrol Aug 19, 2024, 1:17 PM

#

near tapir i'm managing the database

So unless you set up one of the other options, TCP/IP

astral harbor Aug 19, 2024, 2:40 PM

#

fading patrol It's warning you that whatever is at line 31 of your file is depricated. It shou...

Thank you very much, the problem is already solved.

deep heron Aug 19, 2024, 5:15 PM

#

I could use a pointer, if someone knows, on how to properly handle excluding zero dates (0000-00-00) and NULL dates in MariaDB using SQLAlchemy.

def _build_search_query(self, table, fields, term, like):
        term_string = str(term)
        search_conditions = []
        query = smc_database.sql.select(table)
        
        for field_name in fields:
            term = term_string
            field_attribute = getattr(table, field_name, None)
            if field_attribute:
                field_type = self._get_field_type(table, field_name).lower()
                if "int" in field_type or "float" in field_type or "decimal" in field_type:
                    try:
                        term = float(term) if "." in term else int(term)
                        search_conditions.append(field_attribute == term)
                    except ValueError:
                        continue
                elif "date" in field_type or "time" in field_type:
                    term = self._parse_date(term)
                    if term:
                        search_conditions.append(
                            smc_database.sql.and_(
                                field_attribute == term,
                                field_attribute != None,
                                field_attribute != "0000-00-00"
                            )
                        )
                else:
                    if like:
                        search_conditions.append(field_attribute.like(f"%{term}%"))
                    else:
                        search_conditions.append(field_attribute == term)
        
        if search_conditions:
            query = query.where(smc_database.sql.or_(*search_conditions))#.limit(500)
        print(f"\n{query}\n")
        return query

#

The part in question is elif date. If I understand sql correctly, I think the generated sql query is correct. (Excerpt)

WHERE `SMC_Billing`.`Customer_No` = ? OR `SMC_Billing`.`Invoice_Date` = ? AND `SMC_Billing`.`Invoice_Date` IS NOT NULL AND `SMC_Billing`.`Invoice_Date` != ? OR `SMC_Billing`.`Item_No` = ?   
2024-08-19 12:11:41,935 INFO sqlalchemy.engine.Engine [generated in 0.00169s] ('2016-09-23', datetime.date(2016, 9, 23), '0000-00-00', '2016-09-23')

paper flower Aug 19, 2024, 10:41 PM

#

deep heron I could use a pointer, if someone knows, on how to properly handle excluding zer...

Why excatly this function is so complex? 👀 Do you want to exclude these dates only on specific query or on all of them?

deep heron Aug 20, 2024, 1:28 AM

#

I might break it up a bit once it's all working. This is a general search program where the user can type a string, number, date, or date object, and it has the ability to search every column for something like it. It's complex because it's attempting to match data type as best as possible.

deep heron Aug 20, 2024, 1:28 PM

#

Funny enough, I think it was working the whole time, but I didn't realize it. Pointers in general would still be great though.

deep heron Aug 20, 2024, 1:31 PM

#

paper flower Why excatly this function is so complex? 👀 Do you want to exclude these dates o...

I may have some work to do on the complexity 😂

paper flower Aug 20, 2024, 1:32 PM

#

deep heron I might break it up a bit once it's all working. This is a general search progra...

Are you using this function for multiple models or just one? 🤔

deep heron Aug 20, 2024, 1:47 PM

#

It's for any model.

#

The idea is that I can set it up so if I need to give a department access to edit fields of our core database, I can just make a little program that imports the module that includes this. Then I call with the necessary fields, which fields have which permissions, and then display a search followed by an output display.

#

For very simple views and edits, I should add. Anything more complicated deserves more program, but can still use this module for whatever would be common. It's part of a set of core modules I'm writing for the company.

coral wasp Aug 20, 2024, 1:57 PM

#

The or and and there makes me wonder if you got the order of operations right

paper flower Aug 20, 2024, 2:42 PM

#

deep heron The idea is that I can set it up so if I need to give a department access to edi...

So, do you only need to exclude fields that are of type date/dateime and have value of 0000?

#

Maybe like this:

from datetime import date

from sqlalchemy import inspect, select

from app.db.models.manga import Manga

mapper = inspect(Manga)
stmt = select(Manga)
for column in mapper.columns:
    try:
        python_type = column.type.python_type
    except NotImplementedError:
        continue
    if issubclass(python_type, date):
        stmt = stmt.where(column != "0000-00-00")

print(stmt)

change the model for your own here

#

But honestly I'd work on fixing your db to exclude these zero dates 👀

#

Honestly I'd make a separate set of searchable columns for each model which you'd apply your search term to

stray cipher Aug 20, 2024, 4:06 PM

#

class User(UserMixin, db.Model):
# ...
following: so.WriteOnlyMapped['User'] = so.relationship(
secondary=followers, primaryjoin=(followers.c.follower_id == id),
secondaryjoin=(followers.c.followed_id == id),
back_populates='followers')

#

I don't get the logic of the secondary join, it makes sense that we're joining the user table with the assoiciation table where the follower_id == id. But I'm not sure what the secondaryjoin does

deep heron Aug 20, 2024, 4:12 PM

#

paper flower Honestly I'd make a separate set of searchable columns for each model which you'...

It's dynamic. I can't account for all of the models and columns this program will encounter.

#

If it works, that's the beauty of it.

#

And the database is full of garbage data from poor imports when the tables were created, so I need to account for that, too. Including the zero dates.

coral wasp Aug 20, 2024, 5:19 PM

#

First, how about using is distinct from operator? https://docs.sqlalchemy.org/en/20/core/sqlelement.html#sqlalchemy.sql.expression.ColumnOperators.is_distinct_from (I'm not a sqla user, I'm just thinking about how I'd handle in sql), second, why are the 0000's an issue? Do you want to treat them as unknown or what?

paper flower Aug 20, 2024, 5:37 PM

#

In case you're using a secondary table secondaryjoin establishes how you join from your secondary table to a child model, and primaryjoin how you join from this model to a secondary table

#

In most cases you don't need to use primary or secondary join attributes as sqlalchemy can figure out how a typical relationship should work

outer yacht Aug 20, 2024, 6:30 PM

#

I'm testing some python automation for some reports I am tasked with. I have some excel sheets that connect to our MS-SQL databases, but I am not too sure about how to connect to them with my python environment. It's health data, so the security can be weird. Does anybody know how I can generate a sqlalchemy connection strings? I have the connection string that works in excel but I don't exactly understand how to transfer that to something python can understand

deep heron Aug 20, 2024, 6:45 PM

#

coral wasp First, how about using is distinct from operator? <https://docs.sqlalchemy.org/e...

What would I use IS DISTINCT FROM to accomplish? Would I need to know the fields ahead of time?
I was getting some weird results from "0000-00-00" when I was searching for things, so I decided to exclude them entirely. I don't think anyone would be searching for it specifically.

#

I wish I could just clear them out, but there's so many garbaged up tables with tens of thousands of rows that I can't possibly go through and fix them.

coral wasp Aug 20, 2024, 6:47 PM

#

deep heron What would I use IS DISTINCT FROM to accomplish? Would I need to know the field...

Is distinct from would simplify your null checks.

deep heron Aug 20, 2024, 6:48 PM

#

🤔 I might have to take a look at it. I thought it just checked to make sure you didn't return results with repeated information. I might have misunderstood the reading.

coral wasp Aug 20, 2024, 6:48 PM

#

The way I'd do this in sql is: id write a cte to filter out irrelevant records, then query against that, like: with q1 as (select * from mytable where date is not null and date != 0000....) select * from q1 where conditions...

pastel vale Aug 20, 2024, 7:00 PM

#

Could someone link me a database project theyve done that can still be understood by a beginner (as in the code is understandable to someone like me, even if parts of the code is above my level)?

#

Also itd help if the code itself has a lot of comments so I can understand what each part functions as? Thank you

deep heron Aug 20, 2024, 7:38 PM

#

I finally found why I needed the protections against the zero date and null entries. If I am searching for something that isn't a date, when it gets to the date field, I get a "match" with those, even though they don't match.

#

So I added a few checks to block the search and block against those returns as a sort of multi level protection against false positives.

#

So for example, If I search for "WSB-BX-LGCOVER#2" with the protections, I get one result. Without the protections, I get a segmentation fault (I think). There's so many results that the program just kind of... gives up. It exits with no error to catch. I had to create and pull a minidump to find that error.

paper flower Aug 20, 2024, 7:52 PM

#

deep heron So for example, If I search for "WSB-BX-LGCOVER#2" with the protections, I get o...

Maybe this is an issue with nulls, but I don't think zero dates quite matter here pithink

#

You can just add a limit to see the small subset of what you get back from db

deep heron Aug 20, 2024, 7:55 PM

#

paper flower Maybe this is an issue with nulls, but I don't think zero dates quite matter her...

Oh yes. I did that to confirm the issue.

#

To make sure the results I would get back were actually bad if it weren't for the error. They were.

#

On a related note, is it better to query a date field with a date object? Or the string of said date object? I'm leaning towards object since it would be a closer match in my mind.

paper flower Aug 20, 2024, 7:56 PM

#

deep heron On a related note, is it better to query a date field with a date object? Or the...

If you're talking about using it in a condition in where it shouldn't quite matter

#

But date would be preferred

deep heron Aug 20, 2024, 7:56 PM

#

I thought so. Thanks 😄

paper flower Aug 20, 2024, 7:57 PM

#

paper flower Maybe like this: ```py from datetime import date from sqlalchemy import inspect...

^ Btw you could check if python type is a date and add a condition based on that

deep heron Aug 20, 2024, 7:57 PM

#

I use this to parse a given date from one of two formats.

def _parse_date(self, date):
        try:
            if "/" in date:
                return datetime.datetime.strptime(date, "%m/%d/%Y").date()
            elif "-" in date:
                return datetime.datetime.strptime(date, "%Y-%m-%d").date()
            else:
                return None
        except ValueError:
            return date

paper flower Aug 20, 2024, 7:57 PM

#

And you can use stmt = stmt.where(...) instead of collection all the where clauses in a list

paper flower Aug 20, 2024, 7:57 PM

#

deep heron I use this to parse a given date from one of two formats. ```py def _parse_date(...

Horrific 😬

#

Can't you handle that with orm?

deep heron Aug 20, 2024, 7:58 PM

#

lol why?

paper flower Aug 20, 2024, 7:58 PM

#

are you getting these values from db?

deep heron Aug 20, 2024, 7:58 PM

#

The date to search? Or the date returned?

paper flower Aug 20, 2024, 7:58 PM

#

deep heron The date to search? Or the date returned?

The date you're parsing

#

Or is that what you're getting from your client?

deep heron Aug 20, 2024, 7:59 PM

#

The date being parsed is from the user.

#

So yeah, client.

#

And the source of part of my headache with item "numbers" like given before 😄

paper flower Aug 20, 2024, 8:00 PM

#

It's not that bad then 😅 I would be scared If I had dates in my db as strings and in different formats

deep heron Aug 20, 2024, 8:00 PM

#

Nono. That at least is one of the correct things about this database. Dates are in date type.

paper flower Aug 20, 2024, 8:00 PM

#

Honestly if you can - use ISO dates/datetimes

#

And force your client to use that ducky_devil

deep heron Aug 20, 2024, 8:01 PM

#

I'd love to. I don't have that control. The joke is that some of the data in this database is older than I am.

#

Imported from other sources of course.

#

I'm curious though, why ISO?

obsidian basin Aug 20, 2024, 9:46 PM

#

Please ping on reply.

I have a few questions about sqlachemy + flask sqlalchemy could someone please answer?

In my database/models I have the RouteToken table and the a column called token.
Can someone think of a better name for RouteToken and the column token.
For example RouteToken and the token column is used when when I have a route like

@email_password_reset.route('/verify_email_token/<username_db>/<token_db>', methods = ['GET', 'POST'])

The reason I want a different name is because RouteToken and token are so similar. Or do you think my reasoning is incorrect.

Another question I have is in RouteToken I want to add a column called attempts_token_tried: so.Mapped[int] = so.mapped_column(sa.Integer, default=0). It would reset to zero if someone tried to create more then 5 tokens. Should I create a 1 to many relationship or just have a 1 to 1 and delete the token each time and keep track with attempts_token_tried?

https://pastebin.com/hYRJgXti

Also how is the 2nd example in the 1 to many link different then the 1st example in 1 to 1 link?

https://docs.sqlalchemy.org/en/20/orm/basic_relationships.html#one-to-many

https://docs.sqlalchemy.org/en/20/orm/basic_relationships.html#one-to-one

Pastebin

class User(UserMixin, db.Model): ''' one to many relationship...

Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.

fallen vault Aug 21, 2024, 2:31 PM

#

@coral wasp figured we should move to a dedicated channel. I haven’t played with the idea much yet since it was just meantioned yesterday, but the software would probably have to save locally and preform a scheduled online backup rather than saving in both locations every time a record is created.

coral wasp Aug 21, 2024, 2:34 PM

#

This can get complicated quickly, just warning; having a central database with local onsite replicas raises a lot of issues. The other approach is many onsite databases that 'push' to a central database. A third approach is a dual db, where transactions are applied across two db's.

pastel vale Aug 21, 2024, 4:25 PM

#

https://github.com/YourLocalPurveyorOfExplsoveGoods/MyDatabaseProjects if anyone doesnt mind, could someone here give as detailed as feedback as you can on my database projects please? Just as a preface, they're all quite basic projects but I want to start advancing my skills. Thank you

GitHub

GitHub - YourLocalPurveyorOfExplsoveGoods/MyDatabaseProjects

Contribute to YourLocalPurveyorOfExplsoveGoods/MyDatabaseProjects development by creating an account on GitHub.

simple pendant Aug 21, 2024, 5:18 PM

#

Hii ,
I am dishant. I work with startups & businesses to find suitable freelancers they require. If you are a web developer or editor you can contact me with your portfolio and pricing

sudden delta Aug 21, 2024, 5:52 PM

#

is there any way to fix the thrift error when trying to make a connection to a hive database ?

fading patrol Aug 21, 2024, 6:25 PM

#

simple pendant Hii , I am dishant. I work with startups & businesses to find suitable freelance...

Not here, please delete this and read the #rules .

simple barn Aug 21, 2024, 7:19 PM

#

I need to store JSON in an SQLITE table column. JSON1 extension or Blob? Pros and Cons?

waxen finch Aug 21, 2024, 7:29 PM

#

simple barn I need to store JSON in an SQLITE table column. JSON1 extension or Blob? Pros an...

what do you mean by blob, like storing JSON strings as blobs? or using the jsonb* functions provided by the JSON1 extension?

simple barn Aug 21, 2024, 8:07 PM

#

waxen finch what do you mean by blob, like storing JSON strings as blobs? or using the jsonb...

A Blob datatype column. Supposedly you can store JSON in a BLOB (Don't know if it has to be a string or not). And yes for the latter question, using the JSON1 extension

waxen finch Aug 21, 2024, 8:08 PM

#

https://sqlite.org/json1.html#the_json_blob_input_bug
apparently it was originally considered a bug for JSON1 functions to operate on BLOB values and was initially fixed in 3.45.0 before they restored backwards compatibility in 3.45.1

#

i think if you want to use BLOBs for the slight performance benefit of sqlite's jsonb format, that's fine

#

but otherwise it's probably more convenient to use TEXT for json strings

(to clarify, converting {"Hello": "world!"} into a BLOB has no benefits over storing it as TEXT, but in 3.45.0+ you can convert it into sqlite's JSONB format using the jsonb() function if you want slightly more compact storage and/or other JSON1 functions like jsonb_insert() to run more efficiently)

deep heron Aug 21, 2024, 8:16 PM

#

Is there a trick to matching a python float to a MariaDB float? I know there's a slight difference (python 64 bit, MariaDB 32 bit?) and I can't change it to decimal.

#

I confirmed an issue exists when I even ran the query in MariaDB through Heidi, same result. I have a row with a float value, and when I put that float value in a WHERE clause, it returns no results.

grim vault Aug 21, 2024, 8:42 PM

#

deep heron Is there a trick to matching a python float to a MariaDB float? I know there's ...

!e You could try to convert the double to a float, like:

import struct
from decimal import Decimal

double = 0.3
float32 = struct.unpack("f", struct.pack("f", double))[0]

print(f"{Decimal(double) = }")
print(f"{Decimal(float32) = }")

delicate fieldBOT Aug 21, 2024, 8:42 PM

#

grim vault !e You could try to convert the double to a float, like: ```py import struct fro...

:white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | Decimal(double) = Decimal('0.299999999999999988897769753748434595763683319091796875')
002 | Decimal(float32) = Decimal('0.300000011920928955078125')

deep heron Aug 21, 2024, 8:57 PM

#

It's already a float by default, right?

#

If you don't specify?

brazen charm Aug 21, 2024, 8:58 PM

#

python floats are double precision so 64 bit

deep heron Aug 21, 2024, 8:58 PM

#

Ah. I'm thinking backwards.

brazen charm Aug 21, 2024, 8:58 PM

#

the driver should cast it correctly tho

deep heron Aug 21, 2024, 8:59 PM

#

What's weird is that using a mariadb program also doesn't seem to find it.

#

deep heron Aug 21, 2024, 9:18 PM

#

grim vault !e You could try to convert the double to a float, like: ```py import struct fro...

I was having trouble with that, but I tried this out and it's hitting the matches, at least. It's not the most precise...

def _build_numeric_query(self, attribute, value, conditions):
        try:
            if isinstance(value, int):
                numeric_value = int(value)
            elif isinstance(value, float) or "." in value:
                numeric_value = float(value)
                conditions.append(attribute.between(numeric_value - 0.005, numeric_value + 0.005))
                return None
            else:
                numeric_value = int(value)
        except (ValueError, TypeError) as error:
            numeric_value = None

        if numeric_value or numeric_value == 0:
            conditions.append(attribute == numeric_value)

#

(Right before the return None)

#

This is using SqlAlchemy, btw.

green yew Aug 22, 2024, 4:14 AM

#

I'm using SQLAlchemy and i think i may have written my app pretty wrong, are you suppost to do:

engine = create_engine(get_db_url())

def nameFromID(ID:int):
    with Session(engine) as session:
        stmt = select(Customer).where(Customer.id == ID)
        result = session.scalar(stmt)
    return result.name

def userFromID(ID:int):
    with Session(engine) as session:
        stmt = select(Customer).where(Customer.id == id)
        result = session.scalar(stmt)
    if not result:
        raise Exception('invalid id!')
    return result

or

engine = create_engine(get_db_url())
session = Session(engine)

def nameFromID(ID:int):
    stmt = select(Customer).where(Customer.id == ID)
    result = session.scalar(stmt)
    return result.name

def userFromID(ID:int):
    stmt = select(Customer).where(Customer.id == ID)
    result = session.scalar(stmt)
    if not result:
        raise Exception('invalid id!')
    return result

outer tulip Aug 22, 2024, 7:45 AM

#

hi, db masters. In sqlalchemy I got a model SKU that has decimal/number col price and category (str) . However different clients have their discount config in ClientCo.disc_config (json) field.

    def client_price(self, clientco_disc_config: dict[str, dict]) -> Decimal:
        '''returns client price with computed discount for client catalog'''
        cat_config_dict: dict = clientco_disc_config.get(self.category, {})
        cat_disc_str = cat_config_dict.get(CatConfigKey.DISCOUNT.value, '0')
        disc_fraction = Decimal(cat_disc_str) / 100
        return round(self.price * (1 - disc_fraction), 2)

This method works well. but I would like to sort paginated catalog of SKUs based on this computed-disc_config based client_price. As per my learnings in recent day, hybrid_property does not take any extra arguments. What are my options?

outer tulip Aug 22, 2024, 8:35 AM

#

appears, solved it. Not quite nice, but works.

from sqlalchemy.sql import case
from sqlalchemy import Select, asc, desc

def apply_client_price_sort(stmt: Select, clientco_disc_config, sort_asc: bool = True) -> Select:
    cases = []
    for category, config in clientco_disc_config.items():
        cat_disc_str = config.get(CatConfigKey.DISCOUNT.value, '0')
        disc_fraction = Decimal(cat_disc_str) / 100
        cases.append(
            (SKU.category == category, SKU.price * (1 - disc_fraction))
        )
    client_price_expression = case(*cases, else_=SKU.price)
    sort_order = asc(client_price_expression) if sort_asc else desc(client_price_expression)
    return stmt.order_by(sort_order)
...
skus_stmt = apply_client_price_sort(skus_stmt, clientco_disc_config, sort_asc=False)

uneven lynx Aug 22, 2024, 9:30 AM

#

Heyo people! Earlier I have only worked with xampp and mysql for a earlier project. But now my current project is getting close to needing a database to store some data. What do you say about using xampp and host it with AWS server? All ideas are appreciated

#

The database needs to hold a number (Id) and under every Id there will be three columns of data (temperature, time and sensors)

#

So I believe mysql would work fine for it but I have no experience having a database cloud hosted. Earlier it was just local

near tapir Aug 22, 2024, 11:29 AM

#

how can i fix this

pastel vale Aug 22, 2024, 11:44 AM

#

any book recommendations for learning SQL (particularly books that have projects to immediately apply your knowledge)? Prefer reading books as it gives more structure to my learning. Been using sqlite and python but feel ive hit a wall as, I'm very much a beginner

coral wasp Aug 22, 2024, 11:55 AM

#

pastel vale any book recommendations for learning SQL (particularly books that have projects...

For beginner resources, these are pretty good: https://owencampbell.me.uk/sql_python_tutorial/pages/intro.html https://www.hackerrank.com/domains/sql https://sqlbolt.com https://selectstarsql.com/ https://www.datawars.io/

pastel vale Aug 22, 2024, 12:00 PM

#

coral wasp For beginner resources, these are pretty good: <https://owencampbell.me.uk/sql_p...

Ill have a look. Thank you

#

The owencampbell one looks like itll be the perfect resource for me as Ive essentially been learning sql through python

jagged garden Aug 22, 2024, 12:43 PM

#

hi

#

I have about select data using pandas

#

this line not filter

#

aprovados = escola_selecionada[escola_selecionada['situacao_nome'] ==
'APROVADO']

#

I am using this csv

📎 situacaofinal2023_2.csv

#

my code https://pastebin.com/CVb9t07q

Pastebin

import pandas as pddef calcular_porcentagem_situacao_por_falta(arqu...

Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.

sudden delta Aug 22, 2024, 1:13 PM

#

How do I connect to a hive database ?

fallen vault Aug 22, 2024, 3:42 PM

#

coral wasp This can get complicated quickly, just warning; having a central database with l...

Thanks for the help. Sorry, ended up getting busy at work and forgetting about discord.

wicked flame Aug 23, 2024, 9:16 AM

#

Can I ask a question about vector databases here?

storm mauve Aug 23, 2024, 12:00 PM

#

wicked flame Can I ask a question about vector databases here?

yes, also preferably ask your question instead of asking to ask

if it's off-topic, you'll just get redirected elsewhere
if it's on topic, someone may answer it without the extra delay/compromise of responding to the original message and having you reply with the actual question

blissful yoke Aug 23, 2024, 3:54 PM

#

Is this a good chat to get help with creating a data logger?

#

I mean, it’s more just a functionality in Python rather than purely this but just wanna ask just in case

#

But basically

#

I have created a data logger for a module system for a raspberry pi to track various modules for a car. This is just context

#

My issue is, I’m creating a framework for it, so that it would act as a library, you create a logger object, and you have the options to add sources, information you want to log, and those will be consistently updated with a update_source() function

#

There are 2 types of updates, updating a specific source, and updating all sources of the logger, both with the same problem

#

I want it to be as easy as possible, so by simply specifying the name of the source, or just running the update_all_sources() function, it will update and add the values into the list storing each update

#

This is going to be difficult to explain, but how I was thinking was to instead of getting the value, to instead store where the data is being changed, so you would simply just have to read the stored variable and then obtain the value

#

Because if not you would have to plug these variables into the parameters of update_all_sources and that ruins convenience to an extent. Instead do it once when you create the source and then run said code. Just wanted to know if it was possible

#

Sorry for the long post, it’s hard to describe my thought process, hopefully it helps

keen minnow Aug 24, 2024, 12:42 AM

#

blissful yoke Sorry for the long post, it’s hard to describe my thought process, hopefully it ...

I would suggest to start by the problem you are trying to solve

#

A datalogger is a start, but then how much retention do you need? Who would need access to that? How often would they need access to that? Would they go straight to the RPI or would it go to a DB or a cloud or something else?

halcyon slate Aug 24, 2024, 3:44 AM

#

how do i make an er diagram for students submit issues

#

and normalize it

hidden creek Aug 24, 2024, 10:38 AM

#

hi

so i have a many to many relationship
using postgres

#

say a table called records
and a table called meds

#

when the users create a record
they can make any number of med for it
the med can be new or one if the meds already in the database

#

so i need a query the first checks if each med exists or not
if it exists, make the relationship in the middle table
otherwise create it then make the relationship

#

if anyone knows how this can be done in an efficient way

wispy cloud Aug 24, 2024, 3:41 PM

#

Guys i forgot abt the cursor what does it do

coral wasp Aug 24, 2024, 3:42 PM

#

The cursor is just the thing you do operations on, see https://docs.python.org/3/library/sqlite3.html#sqlite3-tutorial for how to use it. One connection can have multiple cursors.

wispy cloud Aug 24, 2024, 3:43 PM

#

Ooooo

#

Thankuuu

fair coral Aug 24, 2024, 5:35 PM

#

anyone know any good books on implementing databases? namely looking into learning how distributed keyvalue stores work.
could even just be theory, just a decent book on how databases work

coral wasp Aug 24, 2024, 6:14 PM

#

fair coral anyone know any good books on implementing databases? namely looking into learni...

Maybe check the syllabus of a graduate database class at a large CS Uni? Carnegie Mellon has a large database group, or Stanford

thorny anchor Aug 24, 2024, 6:24 PM

#

fair coral anyone know any good books on implementing databases? namely looking into learni...

the book "database internals"

mental cove Aug 24, 2024, 8:22 PM

#

Hi, I'm making a login system as a part of a app project I'm working on and I want the app to be accessed publicly. I plan to code it in python using the sqlite3 library and a sqlite3 database to store all of the users's information (passwords, usernames, etc...). Anyway, I was wondering if I have to code it differently as a result of wanting the app to be public ? Do I have to import libraries other than sqlite3 and consider things like protocols or am I ok to just code it normally ?

torn sphinx Aug 25, 2024, 6:32 AM

#

hi. in my python project i used pure python mysql connector. why do ppl use sqlalchemy or other stuff to work with database??

storm mauve Aug 25, 2024, 6:52 AM

#

torn sphinx hi. in my python project i used pure python mysql connector. why do ppl use sqla...

they abstract SQL away into Python classes and methods, are sometimes more portable than SQL with how each database has a different dialect and can help make database migrations and versioning easier

there are many people that prefer to use raw SQL queries instead of using an ORM though
(the reason varies per person but in general extra tool, extra complexity, extra dependency, more things that can go wrong and that you have to manage, sometimes their queries are less efficient than hand-written queries etc)

wise goblet Aug 25, 2024, 12:31 PM

#

torn sphinx hi. in my python project i used pure python mysql connector. why do ppl use sqla...

SQLAlchemy stuff + Adding things like Alembic + Factory boy can take care for you of

Creating SQL migrations
Applying not applied to db
Easier to unit test code logic (thanks to Factory boy)
Having DECLARATIVE table/columns definitions, where u define them once and they are used to autogenerate migrations
Having type safety, you could is using recognizable attributes of those ORM class instances
Increased uniformness/flexibility to implement more complex in terms of Python syntax itself code logic (raw SQL is pretty much a mess)
You can use raw SQL if necessary too, but only if u exhausted all ORM lib capabilities
Good to choose if u develop fully fledged backend api/service (or just big in code size app. Lets say over 20-40k code lines u are drastically needing that)
And also IMP very good for any CRUD api

#

On another hand using raw SQL has advantages of:

Having increased flexibility to SQL syntax (At the cost of almost no integration with a programming language itself, and no typing safety)
Less dependencies involved (u depend only on a single minimal lib!)
Good to choose if u develop very minimalistic tool that will be maintained only once in several months. Haven't encountered usage cases when else it would be good to use raw only, since u can use it if u need for ORM too anyway

torn sphinx Aug 25, 2024, 2:35 PM

#

storm mauve they abstract SQL away into Python classes and methods, are sometimes more porta...

i also prefer that but maybe because i havent tried python orm yet and also feel less interested in trying but will learn anyways. but i see what u mean

torn sphinx Aug 25, 2024, 2:36 PM

#

wise goblet SQLAlchemy stuff + Adding things like Alembic + Factory boy can take care for yo...

can u look at my code and tell if sqlalchemy would have made it better or now

#

https://github.com/maisha-maliha/MySQL-Python-Jinja2/blob/master/backend/database/getdata.py

GitHub

MySQL-Python-Jinja2/backend/database/getdata.py at master · maisha-...

This project is a simple blogsite made using purely Python http.server - maisha-maliha/MySQL-Python-Jinja2

wise goblet Aug 25, 2024, 2:40 PM

#

torn sphinx can u look at my code and tell if sqlalchemy would have made it better or now

Yeah, the regular CRUD (Create, Update, Delete simple operations) app, ORMs are best fit here.
You would benefit the most if u utilize Mypy integration of SQLAlchemy

torn sphinx Aug 25, 2024, 2:40 PM

#

wise goblet Yeah, the regular CRUD (Create, Update, Delete simple operations) app, ORMs are ...

whyyyyyy??? but i love seeing my sql i literally put sql

#

why put extra stuff in it

#

even tho i havent learned them yet

wise goblet Aug 25, 2024, 2:41 PM

#

torn sphinx why put extra stuff in it

described above before. 😏 #databases message

#

anyway your app is simple enough to live without ORM

#

student level project after all

#

i would be more excited to see unit tests first in it than ORM

torn sphinx Aug 25, 2024, 2:43 PM

#

wise goblet i would be more excited to see unit tests first in it than ORM

okay will look into it

#

tnx for the advice

ashen crown Aug 25, 2024, 4:16 PM

#

hack

#

hacking

delicate fieldBOT Aug 25, 2024, 10:43 PM

#

:incoming_envelope: :ok_hand: applied timeout to @gusty scroll until <t:1724626385:f> (10 minutes) (reason: duplicates spam - sent 4 duplicate messages).

The <@&831776746206265384> have been alerted for review.

quaint mauve Aug 25, 2024, 11:18 PM

#

Anyone know if My SQL can be used in Jupiter noteboooks on Mac

coral wasp Aug 25, 2024, 11:33 PM

#

quaint mauve Anyone know if My SQL can be used in Jupiter noteboooks on Mac

Yes

quaint mauve Aug 25, 2024, 11:34 PM

#

How do I connect

#

This is what w3 schools says

#

But I don’t know my username

thorny anchor Aug 25, 2024, 11:34 PM

#

did you run MySQL locally?

quaint mauve Aug 25, 2024, 11:35 PM

#

Wdym

quaint mauve Aug 25, 2024, 11:35 PM

#

quaint mauve Wdym

Yes

visual bough Aug 26, 2024, 9:21 AM

#

bluds, in my face attendance system, I am using sheety API to store the data in google sheets, it is working well but it can't save images in google sheets, any alternative ?

robust grotto Aug 26, 2024, 9:31 AM

#

HI, does MariaDB Connector support async operations ? I try to create SQLALchemy async engine with valid URL started with mariadb+mariadbconnector:// using create_async_engine(...) , then SQLAlchemy raises following exception :

File "/path/to/my-project/.venv/lib/python3.12/site-packages/sqlalchemy/ext/asyncio/engine.py", line 1031, in __init__
    raise exc.InvalidRequestError(
sqlalchemy.exc.InvalidRequestError: The asyncio extension requires an async driver to be used. The loaded 'mariadbconnector' is not async

Should I switch to other connector packages such as asyncmy (looks like the only option) ?

[update]
I switched to asyncmy , async things work well

visual bough Aug 26, 2024, 9:32 AM

#

not blud creating anchors 💀

visual bough Aug 26, 2024, 2:41 PM

#

visual bough bluds, in my face attendance system, I am using sheety API to store the data in ...

bruh 😐

somber ember Aug 26, 2024, 4:30 PM

#

~~if i have a problem that I was able to resolve, but not able to comprehend the log messages, should I move on?~~

figured out...

fading patrol Aug 26, 2024, 7:36 PM

#

visual bough bluds, in my face attendance system, I am using sheety API to store the data in ...

Is there a reason you're using Google sheets and not a normal database like SQLite or Postgres?

ashen wraith Aug 26, 2024, 8:51 PM

#

visual bough bluds, in my face attendance system, I am using sheety API to store the data in ...

Using Google Sheets as a database is a wired idea, and you also want to store images? You can use the Google Drive API. But using Google sheets/drive is the most wired and unreasonable way to stor data.

#

Use a database like MongoDB, MySQL, or Postgres.

#

And store files in a server, store a file as a file.

fathom beacon Aug 26, 2024, 8:56 PM

#

:D

ashen wraith Aug 26, 2024, 8:57 PM

#

DDoS?

#

Did I understand you well?

#

You mean Distributed Denial-of-Service (DDoS) Attack?

#

You want malicious code..

#

And the server ToS doesn't agree with you.

#

Rule 5
Do not provide or request help on projects that may violate terms of service, or that may be deemed inappropriate, malicious, or illegal.

fathom beacon Aug 26, 2024, 8:59 PM

#

Do I need to delete messages?

ashen wraith Aug 26, 2024, 8:59 PM

#

idk

#

But for now read #rules

fathom beacon Aug 26, 2024, 9:00 PM

#

I think I need to delete

ashen wraith Aug 26, 2024, 9:00 PM

#

Okay

fathom beacon Aug 26, 2024, 9:01 PM

#

Why would a person write code if he isn't going to do such things?

ashen wraith Aug 26, 2024, 9:01 PM

#

To make a billion dollar company and be rich?

fathom beacon Aug 26, 2024, 9:01 PM

#

Coding a calculator is seriously boring

ashen wraith Aug 26, 2024, 9:02 PM

#

To take $100K a year?

ashen wraith Aug 26, 2024, 9:02 PM

#

fathom beacon Coding a calculator is seriously boring

bruh

#

Did someone tell you that's coding?

fathom beacon Aug 26, 2024, 9:02 PM

#

Be realistic my friend.

ashen wraith Aug 26, 2024, 9:02 PM

#

Realistic?

fathom beacon Aug 26, 2024, 9:02 PM

#

ashen wraith Did someone tell you that's coding?

what is this

thorny anchor Aug 26, 2024, 9:03 PM

#

fathom beacon Why would a person write code if he isn't going to do such things?

are you asking why someone would write code if it isn't going to be malicious?

ashen wraith Aug 26, 2024, 9:03 PM

#

thorny anchor are you asking why someone would write code if it isn't going to be malicious?

yeah, can you help me explain that to him?

fathom beacon Aug 26, 2024, 9:04 PM

#

I don't think it's right to continue this discussion from here.

ashen wraith Aug 26, 2024, 9:04 PM

#

Google Top Results:
The average salary for Software Developer is USD 104,440 per year in the India, MS

#

Ok, #cybersecurity

ashen wraith Aug 26, 2024, 9:05 PM

#

fathom beacon I don't think it's right to continue this discussion from here.

Are you here?

fathom beacon Aug 26, 2024, 9:05 PM

#

:D

#

joke

ashen wraith Aug 26, 2024, 9:06 PM

#

bad joke D:

#

😡

fathom beacon Aug 26, 2024, 9:06 PM

#

I know :D

#

where are you from

ashen wraith Aug 26, 2024, 9:06 PM

#

if you know my github page, Lives in the terminal

fathom beacon Aug 26, 2024, 9:06 PM

#

ashen wraith Google Top Results: The average salary for Software Developer is USD 104,440 per...

Do you think I care about money? :D

ashen wraith Aug 26, 2024, 9:07 PM

#

fathom beacon Do you think I care about money? :D

No, you care about stealing banks.

ashen wraith Aug 26, 2024, 9:07 PM

#

thorny anchor are you asking why someone would write code if it isn't going to be malicious?

plz don't delete this comment

fathom beacon Aug 26, 2024, 9:08 PM

#

ashen wraith No, you care about stealing banks.

I can't make it unless I come across a stupid manager in a bank's system.

#

:D

#

My goal is not to enter the bank

ashen wraith Aug 26, 2024, 9:08 PM

#

fathom beacon I can't make it unless I come across a stupid manager in a bank's system.

that uses JSON & plain text to store data.

fathom beacon Aug 26, 2024, 9:08 PM

#

That's why my friend went to court today.

ashen wraith Aug 26, 2024, 9:08 PM

#

fathom beacon That's why my friend went to court today.

bruh

fathom beacon Aug 26, 2024, 9:09 PM

#

ashen wraith bruh

Really

ashen wraith Aug 26, 2024, 9:09 PM

#

?

fathom beacon Aug 26, 2024, 9:09 PM

#

He was released on the condition that he pay back the money he stole to its owner.

#

2.058$

#

I'm not looking for money

#

where are you from @ashen wraith

ashen wraith Aug 26, 2024, 9:12 PM

#

ok, if you want to be a programmer, I recommend Flask & HTML/CSS, then Django, then beautifulsoup to autosearch for jobs

#

From earth

fathom beacon Aug 26, 2024, 9:12 PM

#

ashen wraith From earth

me too :D

ashen wraith Aug 26, 2024, 9:13 PM

#

grumpchib

fathom beacon Aug 26, 2024, 9:13 PM

#

I think yes

#

Do you want to help me with this?

ashen wraith Aug 26, 2024, 9:13 PM

#

no

fathom beacon Aug 26, 2024, 9:13 PM

#

:(

ashen wraith Aug 26, 2024, 9:13 PM

#

it's illegal

fathom beacon Aug 26, 2024, 9:14 PM

#

you will just teach

ashen wraith Aug 26, 2024, 9:14 PM

#

i'm wondering deleted messages are viewable by maintainers

fathom beacon Aug 26, 2024, 9:15 PM

#

anyway thank you

#

o7

ashen wraith Aug 26, 2024, 9:16 PM

#

thank me for what?

next shadow Aug 26, 2024, 9:16 PM

#

ashen wraith i'm wondering deleted messages are viewable by maintainers

They are, but a @rugged locust report is always appreciated if we need to look out for something

fathom beacon Aug 26, 2024, 9:16 PM

#

ashen wraith thank me for what?

for talking to me

ashen wraith Aug 26, 2024, 9:17 PM

#

next shadow They are, but a <@683001325440860340> report is always appreciated if we need to...

@fathom beacon bruh

fathom beacon Aug 26, 2024, 9:17 PM

#

see you buddy

fathom beacon Aug 26, 2024, 9:17 PM

#

next shadow They are, but a <@683001325440860340> report is always appreciated if we need to...

🤨

ashen wraith Aug 26, 2024, 9:19 PM

#

next shadow They are, but a <@683001325440860340> report is always appreciated if we need to...

Hi brad, this buddy wants code for DDoS, which is illegal..

fathom beacon Aug 26, 2024, 9:20 PM

#

I think I'm about to get banned

next shadow Aug 26, 2024, 9:20 PM

#

Yup, and they have now been informed of the rules (thanks for that by the way) and have discontinued their search here.

ashen wraith Aug 26, 2024, 9:20 PM

#

What do you mean by " discontinued their search"?

fathom beacon Aug 26, 2024, 9:20 PM

#

next shadow Yup, and they have now been informed of the rules (thanks for that by the way) a...

I promise I won't open a topic like this again.

next shadow Aug 26, 2024, 9:20 PM

#

this That is what I mean

ashen wraith Aug 26, 2024, 9:22 PM

#

i don't understand, do you mean you told them to not open a topic like this again?

fathom beacon Aug 26, 2024, 9:22 PM

#

yeah

ashen wraith Aug 26, 2024, 9:22 PM

#

ok

next shadow Aug 26, 2024, 9:23 PM

#

They have understood that we will not allow it again on the server. Leniency is usually a good policy if someone is not aware of the server rules (even though everyone clicks the button saying they have read and understood the server rules when they join)

ashen wraith Aug 26, 2024, 9:24 PM

#

next shadow They have understood that we will not allow it again on the server. Leniency is ...

just like every other app, they need to read the rules/ToS/privacy policy!

pastel vale Aug 27, 2024, 2:30 PM

#

How do I try to level up my sql skills (in terms of learning and projects). Like Ive got a good grasp in the basics but I want to go beyond that. Using sqlite atm before I go onto other sql variants

cedar tiger Aug 27, 2024, 2:34 PM

#

pastel vale How do I try to level up my sql skills (in terms of learning and projects). Like...

I think I was able to push my SQL abilities the most is when I was making company reports. See if you can do the same.

pastel vale Aug 27, 2024, 2:54 PM

#

Ive been working on a ufc database but its getting kinda boring now. Going to move onto a new project thatll properly test me

#

I swear this shit is so difficult because Im self learning everything. Im not going to spend money on courses unless i really want to

coral wasp Aug 27, 2024, 3:12 PM

#

pastel vale How do I try to level up my sql skills (in terms of learning and projects). Like...

What have you done?

#

Here's a few resources:
https://www.hackerrank.com/domains/sql
https://sqlbolt.com/
https://selectstarsql.com/
https://www.datawars.io/

pastel vale Aug 27, 2024, 3:26 PM

#

coral wasp What have you done?

https://github.com/YourLocalPurveyorOfExplsoveGoods/MyDatabaseProjects this is the link to the database projects Ive done. Hit bit of a wall with it. I feel like I need to challenge myself a lot more and pick projects that will properly test my skills and force me to learn more

GitHub

GitHub - YourLocalPurveyorOfExplsoveGoods/MyDatabaseProjects

Contribute to YourLocalPurveyorOfExplsoveGoods/MyDatabaseProjects development by creating an account on GitHub.

coral wasp Aug 27, 2024, 3:28 PM

#

pastel vale https://github.com/YourLocalPurveyorOfExplsoveGoods/MyDatabaseProjects this is t...

There's several sides to the database world. One side is the transactional world of using databases for application data. The other is using it for analytical purposes, such as stock analysis. (there's also the no-sql world too).

pastel vale Aug 27, 2024, 3:29 PM

#

Ive been focusing more on the analytical side as thats what I'm more interested in. just dont know what to do to further my skills

coral wasp Aug 27, 2024, 3:29 PM

#

You'd want to balance it with some transactional stuff, if you want to be "expert" you need to know both sides.

#

But, I'd suggest working through some of those sites I linked above. It looks like you're just scratching the surface, and haven't yet worked with CTEs, aggregates, subqueries, and window functions.

pastel vale Aug 27, 2024, 3:31 PM

#

Ill do that then for the next month or so

somber ember Aug 27, 2024, 4:03 PM

#

when i was in game industry i was constantly told that when i write code it should be future ready, that if something unforeseen happens, it should be able to handle with minimal intervention. But looks like same practice doesn't apply in data engineering. I was told, that instead of writing dynamic code, i should only apply changes when user request since table is a fixed structure.

in general is this a standard in data engineering or should i see this as case by case?

fathom beacon Aug 27, 2024, 4:22 PM

#

@next shadow Hey man, do you mind if I find teammates here?

next shadow Aug 27, 2024, 4:23 PM

#

We don't allow recruitment or advertising

fathom beacon Aug 27, 2024, 4:24 PM

#

next shadow We don't allow recruitment or advertising

thank you buddy

cunning plover Aug 27, 2024, 4:25 PM

#

Guys, im a beginner

#

what database should i install in my pc

#

like postgresql

tulip cove Aug 27, 2024, 6:36 PM

#

not a real database but yesterday i created a small package for storing your data models. (In cas you use pydantic)
https://github.com/42dotmk/slowstore

GitHub

GitHub - 42dotmk/slowstore

Contribute to 42dotmk/slowstore development by creating an account on GitHub.

#

maybe it will work for you.

fading patrol Aug 27, 2024, 7:43 PM

#

cunning plover like postgresql

If you've never touched SQL at all before, I recommend SQLite first. Postgres is also a fine choice but SQLite is just a bit easier to dive into quickly and start building stuff with.

fallow otter Aug 28, 2024, 2:37 AM

#

Is it normal to have SQL query like this? Or I should have do more logic in python instead

WITH search AS (SELECT id.id         as item_id,
             td.scalar_min as scalar_min,
             td.scalar_max as scalar_max,
             id.type       as type
      FROM item_held
               INNER JOIN main.item_log il on il.item_id = item_held.item_id
               INNER JOIN main.item_def id on il.source_item_id = id.id
               INNER JOIN main.tool_def td on id.id = td.item_id
      WHERE id.type in ('tool_dirt', 'tool_fish')
        and holder_user_id = 0),
search2 AS (SELECT scalar_min, scalar_max, type
    FROM search
    WHERE (
        (scalar_max = (SELECT max(scalar_max) FROM search WHERE type = 'tool_dirt') and type = 'tool_dirt')
        or
        (scalar_max = (SELECT max(scalar_max) FROM search WHERE type = 'tool_fish') and type = 'tool_fish')
)),
search3 AS (SELECT scalar_min, scalar_max, type
FROM(
    SELECT scalar_min, scalar_max, type
    FROM search2
WHERE (scalar_min = (SELECT max(scalar_min) FROM search2 WHERE type = 'tool_dirt') and type = 'tool_dirt')
    or (scalar_min = (SELECT max(scalar_min) FROM search2 WHERE type = 'tool_fish') and type = 'tool_fish')
)),
dirt_out AS (
    SELECT scalar_min, scalar_max, type
    FROM search3
    WHERE type = 'tool_dirt'
    ORDER BY scalar_max DESC, scalar_min DESC
    LIMIT 1),
fish_out AS (
    SELECT scalar_min, scalar_max, type
    FROM search3
    WHERE type = 'tool_fish'
    ORDER BY scalar_max DESC, scalar_min DESC
    LIMIT 1
)
SELECT * FROM dirt_out
UNION ALL
SELECT * FROM fish_out;

cedar tiger Aug 28, 2024, 2:41 AM

#

cunning plover like postgresql

I agree with dowcet. Try with sqlite first and you can easily reset and redo a single db file whenever. Once you're comfortable, you can then start looking at setting up Postgresql server. I like messing around with one in a docker container https://github.com/devrimgunduz/pagila

GitHub

GitHub - devrimgunduz/pagila: PostgreSQL Sample Database

PostgreSQL Sample Database. Contribute to devrimgunduz/pagila development by creating an account on GitHub.

cedar tiger Aug 28, 2024, 2:41 AM

#

fallow otter Is it normal to have SQL query like this? Or I should have do more logic in pyth...

Thats normal

fallow otter Aug 28, 2024, 2:43 AM

#

Ok

coral wasp Aug 28, 2024, 3:17 AM

#

fallow otter Is it normal to have SQL query like this? Or I should have do more logic in pyth...

Yah, normal, but some opportunities to simplify. Such as consolidating the two max's, and combining dirt out and fish out by using a window function to take the scalar max of both types

fallow otter Aug 28, 2024, 10:13 AM

#

coral wasp Yah, normal, but some opportunities to simplify. Such as consolidating the two m...

Hmm, I have no clue what they are, probably would learn later

visual bough Aug 28, 2024, 10:20 AM

#

fading patrol Is there a reason you're using Google sheets and not a normal database like SQLi...

ye, it's bcz I haven't learnt them yet

fallow otter Aug 28, 2024, 10:27 AM

#

visual bough ye, it's bcz I haven't learnt them yet

It would be recommend to start learning the basic since they are quite helpful

fading patrol Aug 28, 2024, 10:59 AM

#

visual bough ye, it's bcz I haven't learnt them yet

If you really want to keep using Google Sheets and need help troubleshooting then share the details in a help channel #❓｜how-to-get-help . But switching to sqlite3 shouldn't be too difficult and should be easier / more reliable

cunning plover Aug 28, 2024, 11:11 AM

#

fading patrol If you really want to keep using Google Sheets and need help troubleshooting the...

hi

#

#

can anyone help me choose which one to download

coral wasp Aug 28, 2024, 11:33 AM

#

Sqlite is in the Python stdlib , you don't need to install

#

https://docs.python.org/3/library/sqlite3.html#sqlite3-tutorial

Python documentation

sqlite3 — DB-API 2.0 interface for SQLite databases

Source code: Lib/sqlite3/ SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate server process and allows accessing the database using a nonstandard ...

fading patrol Aug 28, 2024, 11:34 AM

#

cunning plover can anyone help me choose which one to download

What BillyBobby said

cunning plover Aug 28, 2024, 11:44 AM

#

oh.....

#

i already installed it

#

i thought itwas like postgresql

fading patrol Aug 28, 2024, 11:53 AM

#

cunning plover i thought itwas like postgresql

a SQLite database is just one simple file, that's the whole beauty of it 🙂

cunning plover Aug 28, 2024, 12:04 PM

#

fading patrol a SQLite database is just one simple file, that's the whole beauty of it 🙂

How does that work though?

#

Like it was jus 5mb to download.

#

I was surprised

#

cause postgresql gave me a lot to download

fading patrol Aug 28, 2024, 12:08 PM

#

cunning plover How does that work though?

What was 5mb to download? The Python library should be all you need. It will create a sqlite file when you use it, and that file is your whole database

#

Just like writing to JSON or a text file basically

cunning plover Aug 28, 2024, 12:09 PM

#

does that mean i can uninstall what i installed

fading patrol Aug 28, 2024, 1:02 PM

#

cunning plover does that mean i can uninstall what i installed

Probably, but you didn't say what you installed

cunning plover Aug 28, 2024, 1:09 PM

#

fading patrol Probably, but you didn't say what you installed

lemme show

#

fading patrol Aug 28, 2024, 2:03 PM

#

cunning plover lemme show

Might be useful but probably not required. (Not so sure to be honest, because Windows)

waxen finch Aug 28, 2024, 3:12 PM

#

yeah, python's windows installer bundles its own sqlite library so it works out of the box, but if desired, you can replace the sqlite3.dll inside your installation with a newer version from that SQLite downloads page

#

those sqlite executables are just convenient CLI tools you can use on your database files (they have some neat formatting options too)
https://sqlite.org/cli.html

spark bluff Aug 28, 2024, 9:50 PM

#

Yes

teal hare Aug 28, 2024, 10:42 PM

#

Hey guys, I've found this cool open-source project for visualizing databases. I want to test it, but I've never used PostgreSQL before, and I'm using it as a Docker container. Can anyone tell me how to make it work?

Here's the link: https://app.chartdb.io/diagrams/diagramexample01

I tried adding the script mentioned in my PostgreSQL terminal, but it didn't return any JSON at all.

fading patrol Aug 29, 2024, 12:03 AM

#

teal hare Hey guys, I've found this cool open-source project for visualizing databases. I ...

It looks like whatever the results are from that query, you need to paste them into the box.

teal hare Aug 29, 2024, 12:04 AM

#

This is the result

 fk_info 
---------

(1 row)

teal hare Aug 29, 2024, 12:05 AM

#

fading patrol It looks like whatever the results are from that query, you need to paste them i...

it should return in json

teal hare Aug 29, 2024, 12:05 AM

#

teal hare This is the result ``` fk_info --------- (1 row) ```

instead of this

#

Here's how I run

postgres=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# WITH fk_info AS (
postgres(#   SELECT json_agg(row_to_json(t)) AS fk_info
postgres(#   FROM (
postgres(#     SELECT
postgres(#       schema_name,
postgres(#       table_name,
postgres(#       fk_column,
postgres(#       foreign_key_name,
postgres(#       reference_table,
postgres(#       reference_column
postgres(#     FROM (
postgres(#       SELECT
postgres(#         tc.constraint_schema AS schema_name,
postgres(#         tc.table_name,
postgres(#         kcu.column_name AS fk_column,
postgres(#         ccu.table_name AS reference_table,
postgres(#         ccu.column_name AS reference_column,
postgres(#         tc.constraint_name AS foreign_key_name
postgres(#       FROM
postgres(#         information_schema.table_constraints AS tc
postgres(#         JOIN information_schema.key_column_usage AS kcu
postgres(#           ON tc.constraint_name = kcu.constraint_name
postgres(#           AND tc.table_schema = kcu.table_schema
postgres(#         JOIN information_schema.constraint_column_usage AS ccu
postgres(#           ON ccu.constraint_name = tc.constraint_name
postgres(#           AND ccu.table_schema = tc.table_schema
postgres(#       WHERE
postgres(#         tc.constraint_type = 'FOREIGN KEY'
postgres(#     ) AS sub
postgres(#   ) AS t
postgres(# )
postgres-# SELECT * FROM fk_info;
 fk_info 
---------

(1 row)

fading patrol Aug 29, 2024, 10:41 AM

#

teal hare Here's how I run ``` postgres=# \c postgres You are now connected to database "p...

Again I think the JSON will come after you paste results in the box...

But the problem I see here is with line breaks. Your query is getting ready line by line instead of as one line. I think you need to fix it in an editor before pasting in your terminal

gusty flint Aug 29, 2024, 12:08 PM

#

is there a way to connect and execute HSQLDB commands through python script ?

coral wasp Aug 29, 2024, 12:10 PM

#

gusty flint is there a way to connect and execute HSQLDB commands through python script ?

I've never used it but: https://hsqldb.org/doc/2.0/verbatim/sample/sample.py might help

#

Looks like they connect through an odbc connection, so it's really about python -> pyodbc -> hsqldb

teal hare Aug 29, 2024, 12:12 PM

#

fading patrol Again I think the JSON will come after you paste results in the box... But the...

Tried that as well same output

fading patrol Aug 29, 2024, 12:15 PM

#

teal hare Tried that as well same output

I would try an IDE like DBeaver instead of terminal to see if the query executes properly.

teal hare Aug 29, 2024, 3:15 PM

#

fading patrol I would try an IDE like DBeaver instead of terminal to see if the query executes...

give it a shot and let me know

#

if it works

lone raft Aug 29, 2024, 4:08 PM

#

Will work

unreal hemlock Aug 30, 2024, 8:19 PM

#

look like just a filter

#

it will only show rows where that column value is "UDF111"

#

np

obsidian basin Aug 31, 2024, 6:00 PM

#

Hey I am using back_populate because the modern docs doesn't include backref? Also if I have a 1 to many relationship in the many if I have a column in the many table how do I get a column in the one table in flask sqlalchemy?
https://docs.sqlalchemy.org/en/20/orm/basic_relationships.html#one-to-many
The link below is using backref + back_populate. But the example below are from the old docs. Is the link below accurate with the new docs? I can't find anything with the modern docs involving my previous question. Any help would be appreciated.
https://stackoverflow.com/questions/39869793/when-do-i-need-to-use-sqlalchemy-back-populates

Please ping on reply.

Stack Overflow

When do I need to use sqlalchemy back_populates?

When I try SQLAlchemy Relation Example following this guide: Basic Relationship Patterns

I have this code

#!/usr/bin/env python

encoding: utf-8

from sqlalchemy import create_engine
from sqlalch...

mental cove Aug 31, 2024, 6:46 PM

#

#

Can somebody help me with this please ?

#

I'm making a create account screen using sqlite3 and hashlib in pycharm. Whenever I create an account it says "Database locked"

waxen finch Aug 31, 2024, 8:39 PM

#

mental cove I'm making a create account screen using sqlite3 and hashlib in pycharm. Wheneve...

"database is locked" means you have some uncommitted transaction from another connection to your database file

#

that can happen from leaving an unclosed connection in your script, but also from external programs like database viewers, or just having your script running twice - can't say for sure what's causing it in your case without seeing the rest of the code

slate lichen Aug 31, 2024, 9:28 PM

#

any naming usggestions for these methods??

#

storm mauve Aug 31, 2024, 9:32 PM

#

slate lichen

get_guild_modules, (get|list|find|search)_guilds_with_module

slate lichen Aug 31, 2024, 10:06 PM

#

storm mauve `get_guild_modules`, `(get|list|find|search)_guilds_with_module`

thanks

#

Finally, i can continue coding 😂

mental cove Aug 31, 2024, 11:08 PM

#

waxen finch that can happen from leaving an unclosed connection in your script, but also fro...

Yeah you’re right turns out it wasn’t working because I forgot to add conn.commit() and conn.close(). Thanks for the help

paper flower Sep 1, 2024, 11:24 AM

#

Maybe create more generic methods with related filter objects? So if you have to fetch a module/guild by different property you won't have to make a new method?

@dataclasses.dataclass
class SomeFilter:
    module_name: str | None = None

async def guilds(self, filter: SomeFilter) -> list[Guild]: ...

delicate fieldBOT Sep 1, 2024, 9:35 PM

#

:incoming_envelope: :ok_hand: applied timeout to @delicate lodge until <t:1725227150:f> (10 minutes) (reason: duplicates spam - sent 4 duplicate messages).

The <@&831776746206265384> have been alerted for review.

unreal hemlock Sep 2, 2024, 5:35 PM

#

does anyone have a database class which I could get some inspiration from

fading patrol Sep 2, 2024, 5:40 PM

#

unreal hemlock does anyone have a database class which I could get some inspiration from

Database class? Inspiration for what purpose?

unreal hemlock Sep 2, 2024, 5:40 PM

#

fading patrol Database class? Inspiration for what purpose?

I was thinking of some ideas to create a database class to manage all queries

#

and wanted to get some inspiration from others

west hill Sep 3, 2024, 6:40 AM

#

asyncpg?

unreal hemlock Sep 3, 2024, 8:52 AM

#

west hill asyncpg?

Anything works

fluid mortar Sep 3, 2024, 9:17 PM

#

I'm starting a new back-end project with fairly loose persistence requirements. I have experience with SQLAlchemy but it feels like a sledgehammer to crack a nut, plus it seems like there are so many different query syntaxes these days it's hard to remember how everything works.
So, does anyone have any alternative suggestions? I don't necessarily need SQL, but I do need to be able to have trivial local hosting on Windows and Linux, even better if it's self-contained like sqlite. Looking for something easy to use from Python but powerful if needed.

wise goblet Sep 3, 2024, 10:42 PM

#

fluid mortar I'm starting a new back-end project with fairly loose persistence requirements. ...

Sqlite3 is a good choice for local engine of data (that is just a file)

wise goblet Sep 3, 2024, 10:43 PM

#

fluid mortar I'm starting a new back-end project with fairly loose persistence requirements. ...

otherwise... i recommend actually to suck it up and use it through SQLAlchemy (or Django ORM in Django Ninja)

#

Python ORMs are too powerful, they make it all a class, with easy to use abstractions and ecosystem around for easy unit testing
And they are able to autogenerate migrations and have inbuilt migrating versioning accordinly

#

Like...Other options would be very hard to refactor later if u will pick anything else

#

ORMs are abstracting away which SQL engine u use and later if u will need to make the project more powerful than at its start u will be able easily switching Sqlite3 to postgres or other fully fledged engine

wise goblet Sep 3, 2024, 10:47 PM

#

fluid mortar I'm starting a new back-end project with fairly loose persistence requirements. ...

So u can use SQLAlchemy with SQlite3 as mentioned. and to make it even more powerful u could ensure u use its mypy integration 😉
https://docs.sqlalchemy.org/en/20/orm/extensions/mypy.html the the code will be awesome.
Because despite u using leaky abstraction like orm, it will have all the validations for typing and correct usage

from sqlalchemy import Column, Integer, String, select
from sqlalchemy.orm import Mapped
from sqlalchemy.orm.decl_api import DeclarativeMeta


class Base(metaclass=DeclarativeMeta):
    __abstract__ = True


class User(Base):
    __tablename__ = "user"

    id: Mapped[Optional[int]] = Mapped._special_method(
        Column(Integer, primary_key=True)
    )
    name: Mapped[Optional[str]] = Mapped._special_method(Column(String))

    def __init__(self, id: Optional[int] = ..., name: Optional[str] = ...) -> None: ...


some_user = User(id=5, name="user")

print(f"Username: {some_user.name}")

select_stmt = select(User).where(User.id.in_([3, 4, 5])).where(User.name.contains("s"))

Ensuring u use only existing attributes to a model, including during a syntax to SQL query commands

#

https://factoryboy.readthedocs.io/en/stable/orms.html#factory.alchemy.SQLAlchemyModelFactory well synergizes with factory boy for easier unit testng 😉

#

There are some other options... but they are meh in average
like if you are going to maintain your project only once in year, then good choice can be just using std lib sqlite3 and nothing else

paper flower Sep 4, 2024, 6:57 AM

#

fluid mortar I'm starting a new back-end project with fairly loose persistence requirements. ...

Currently there's only one query syntax (select(...)), Query has been deprecated

paper flower Sep 4, 2024, 6:58 AM

#

wise goblet So u can use SQLAlchemy with SQlite3 as mentioned. and to make it even more powe...

Why are you using _special_method instead of mapped_column?

fluid mortar Sep 4, 2024, 8:28 AM

#

paper flower Currently there's only one query syntax (`select(...)`), `Query` has been deprec...

But if I try to find any examples anywhere, there is always a mix of syntaxes. And the docs are far, far too long to easily work out what is current and what is not

paper flower Sep 4, 2024, 8:28 AM

#

"Just read the docs"

#

They cover all the relevant techniques

#

Perhaps https://docs.sqlalchemy.org/en/20/orm/quickstart.html

#

https://docs.sqlalchemy.org/en/20/orm/quickstart.html#simple-select

fluid mortar Sep 4, 2024, 8:29 AM

#

Like I said, the docs are far, far too long.

paper flower Sep 4, 2024, 8:30 AM

#

You don't need to read everything

fluid mortar Sep 4, 2024, 8:30 AM

#

I've been using SQLAlchemy for something like 15 years, on and off. I'm not a newbie at this. I'm just looking for something a bit more convenient.

paper flower Sep 4, 2024, 8:30 AM

#

quickstart section is pretty short

wise goblet Sep 4, 2024, 10:27 AM

#

paper flower Why are you using `_special_method` instead of `mapped_column`?

i copy pasted doc example, i blame on them any inconsistencies

paper flower Sep 4, 2024, 10:44 AM

#

wise goblet i copy pasted doc example, i blame on them any inconsistencies

Which example? 🤔

wise goblet Sep 4, 2024, 10:45 AM

#

paper flower Which example? 🤔

the one u asked question about and i have code posted about

paper flower Sep 4, 2024, 10:45 AM

#

Can you link it please?

wise goblet Sep 4, 2024, 10:53 AM

#

paper flower Can you link it please?

#databases message

wise goblet Sep 4, 2024, 10:54 AM

#

paper flower Can you link it please?

#databases message

#

copy pasted from https://docs.sqlalchemy.org/en/20/orm/extensions/mypy.html

paper flower Sep 4, 2024, 10:54 AM

#

Deprecated since version 2.0: The SQLAlchemy Mypy Plugin is DEPRECATED

wise goblet Sep 4, 2024, 10:56 AM

#

paper flower `Deprecated since version 2.0: The SQLAlchemy Mypy Plugin is DEPRECATED`

ORM Declarative Models
SQLAlchemy 1.4 introduced the first SQLAlchemy-native ORM typing support using a combination of sqlalchemy2-stubs and the Mypy Plugin. In SQLAlchemy 2.0, the Mypy plugin remains available, and has been updated to work with SQLAlchemy 2.0’s typing system. However, it should now be considered deprecated, as applications now have a straightforward path to adopting the new typing support that does not use plugins or stubs.
🤔 https://docs.sqlalchemy.org/en/21/changelog/whatsnew_20.html#whatsnew-20-orm-declarative-typing

#

https://docs.sqlalchemy.org/en/21/changelog/whatsnew_20.html#step-three-apply-exact-python-types-as-needed-using-orm-mapped

#

okay mapped_column and Mapped[int] is latest

fluid mortar Sep 4, 2024, 2:39 PM

#

This is exactly why I’m considering other options. Almost every example of SqlAlchemy out there uses old syntax, whether it’s for queries, declaring ORM fields, or something else. It gets tiring trying to mentally translate it all the time. So I’m just interested in alternatives.

wise goblet Sep 4, 2024, 2:42 PM

#

fluid mortar This is exactly why I’m considering other options. Almost every example of SqlAl...

check Django ORM then 😇

paper flower Sep 4, 2024, 2:42 PM

#

Depends on the queries you want to run really and how far you're ready to stray away from sql

fluid mortar Sep 4, 2024, 2:52 PM

#

Not sure I want to try and integrate Django ORM into a non-Django project 🙂 Tempted to consider MongoDB again but that doesn’t seem popular any more

wise goblet Sep 4, 2024, 5:01 PM

#

fluid mortar Not sure I want to try and integrate Django ORM into a non-Django project 🙂 Tem...

MongoDB I think can work nice if u will use Pydantic to set data structure
And if u don't care about data to be lost between app version upgrades

#

Since MongoDB is not sql, data is migratable to new format only manually by self written scripts

#

Sql (and especially popular python orms) take care about it for you

#

Also if u use mongo, u usually need to be not caring about complex syntax quering with data joins.
Supposedly you are okay your db having simple operations only

#

Tldr: mongodb is even more poor alternative than Sqlite3, but can play nice enough if u have very simple needs and it is one time development application without future growth in terms of code and data complexity

wise goblet Sep 4, 2024, 5:06 PM

#

fluid mortar Not sure I want to try and integrate Django ORM into a non-Django project 🙂 Tem...

Tbh at this point u can just use shelve
https://docs.python.org/3/library/shelve.html

Python documentation

shelve — Python object persistence

Source code: Lib/shelve.py A “shelf” is a persistent, dictionary-like object. The difference with “dbm” databases is that the values (not the keys!) in a shelf can be essentially arbitrary Python o...

#

Like mongo, except inbuilt std lib and bd as simple file 🙂

fluid mortar Sep 4, 2024, 5:09 PM

#

I hadn't thought about using Pydantic with Mongo before, which is an interesting idea. At some point I will probably need the data to be readable from multiple processes in a fairly reliable way so I don't think shelve would be good enough, but thanks for the reminder that it exists

wise goblet Sep 4, 2024, 5:17 PM

#

fluid mortar I hadn't thought about using Pydantic with Mongo before, which is an interesting...

Sprinkle with mypy/Pyright(pick one) to validate Pydantic correct usage across code.
U will very potentially sweet nice app

#

As long as u wrote unit tests involving their running against local (preferably in docker compose) instance of mongo

thorny anchor Sep 4, 2024, 5:59 PM

#

wise goblet Sql (and especially popular python orms) take care about it for you

not really. you still need to validate them manually, and more complex migrations can't be done automatically

wise goblet Sep 4, 2024, 6:03 PM

#

thorny anchor not really. you still need to validate them manually, and more complex migration...

you are probably speaking about SQLAlchemy having manual part for validation

#

it is not present in Django ORM 😉

#

trully automatically

obsidian heart Sep 4, 2024, 8:02 PM

#

kind of an odd question but i want to make a webapp with python that would show some charts on an analysis of data that i have in a csv. this data has tags but it is currently just comma separated in a single cell for each item which feels so wrong. I am not sure if I am gong to move the data to a db or use pandas, but i am struggling to find the best way to store tags that will make it easy to read and sort by (e.x. click on a tag in the webapp and it will show all items that have that tag, and each item can have multiple tags). anyone have experience with this?

#

ik in something like bigquery i could use something like an array in a column for each line item but i also dont want to get that involved if it is a personal passion project that is to just track books i am reading and the tags i am adding to them

#

i have also considered the option of having a denormalized set where there is an entry for each book and each tag it has but that also feels kind of icky to me and could cause issues with latency (i sometimes have more than 20 tags on a book and there will be at least 1500 books with the list growing each day)

wise goblet Sep 5, 2024, 12:24 AM

#

obsidian heart kind of an odd question but i want to make a webapp with python that would show ...

kind of an odd question but i want to make a webapp with python that would show some charts on an analysis of data that i have in a csv. this data has tags but it is currently just comma separated in a single cell for each item which feels so wrong. I am not sure if I am gong to move the data to a db or use pandas, but i am struggling to find the best way to store tags that will make it easy to read and sort by (e.x. click on a tag in the webapp and it will show all items that have that tag, and each item can have multiple tags). anyone have experience with this?
Relational database like Postgres has several solutions for that.

Just using Array and store as it is tags https://www.postgresql.org/docs/current/arrays.html
U already again ability to query
Postgres has all the syntax to treat them correctly and return u query if only specific tag is present

https://docs.sqlalchemy.org/en/14/dialects/postgresql.html#array-types

from sqlalchemy.dialects import postgresql

mytable = Table("mytable", metadata,
        Column("data", postgresql.ARRAY(Integer, dimensions=2))
)
mytable.c.data.contains([1, 2])

SQLAlchemy with its Postgresql dialect can use this functionality for example

PostgreSQL Documentation

8.15. Arrays

8.15. Arrays # 8.15.1. Declaration of Array Types 8.15.2. Array Value Input 8.15.3. Accessing Arrays 8.15.4. Modifying Arrays 8.15.5. Searching in …

#

More traditional SQL solution...
using Normalization. Creating extra table with unique tags for example

And then creating third table for many to many relationship
where each record links your row from main db with specific tag.
Repeat records for linking multiple tags with specific row from main table
Django ORM probably has it implemented as Many To Many relationship shortcut or smth like that

#

Also can be just used some pervert solution like using "LIKE" syntax for searching values in any pseudo array for any SQL engine (including Sqlite3)

wise goblet Sep 5, 2024, 12:29 AM

#

obsidian heart kind of an odd question but i want to make a webapp with python that would show ...

ultimately solution depends on size of your database.

if i used golang, and it fitted in memory, i would just created data structs and kept in memory.. in python same would have worked less good but can do too. Depends on size
Then next good pick Sqlite3
Then postgres 😎

#

if it is personal project as it is, i would have crossed postgres away. too heavy for pet projects

wise goblet Sep 5, 2024, 12:30 AM

#

obsidian heart i have also considered the option of having a denormalized set where there is an...

🤔 if u expect the number growing fast, may be good to pick postgres.
Otherwise sqlite3 is a good start then

paper flower Sep 5, 2024, 7:34 AM

#

wise goblet if it is personal project as it is, i would have crossed postgres away. too heav...

To be honest it's fine for personal projects if you already have a db running on your pc/local network

cedar tiger Sep 5, 2024, 9:17 AM

#

-1 would mean the last character. 6 would mean the 6th position. -1:6 means last character and up to the 6th character, but it doesn't work that way

mossy swift Sep 5, 2024, 9:18 AM

#

I think it's because name[-1] (the last element) is to the right of name[6] and the default step for slicing is 1, so you move to the right
You can specify your own "step" for slicing, for example:

#

!e
name='Jennifer'
print(name[-1:6:-1])

delicate fieldBOT Sep 5, 2024, 9:18 AM

#

mossy swift !e name='Jennifer' print(name[-1:6:-1])

:white_check_mark: Your 3.12 eval job has completed with return code 0.

cedar tiger Sep 5, 2024, 9:18 AM

#

If you just wanted the last 2 characters, just do -2:

#

!e

name='Jennifer'
print(name[-2:])

delicate fieldBOT Sep 5, 2024, 9:18 AM

#

cedar tiger !e ```py name='Jennifer' print(name[-2:]) ```

:white_check_mark: Your 3.12 eval job has completed with return code 0.

er

torn sphinx Sep 5, 2024, 9:43 AM

#

Hey guys I have 20tb of images to store

#

I need to store it to train a model later (yolov8)

#

Where should I store it? I need it long term too

#

If I use amazon s3/glacier, will I be able to train models with the stored data directly or will I have to download it again

weak yoke Sep 5, 2024, 11:14 AM

#

torn sphinx If I use amazon s3/glacier, will I be able to train models with the stored data ...

Download it again

weak yoke Sep 5, 2024, 11:15 AM

#

torn sphinx If I use amazon s3/glacier, will I be able to train models with the stored data ...

glacier is very expensive to retrieve data from

torn sphinx Sep 5, 2024, 11:17 AM

#

weak yoke glacier is very expensive to retrieve data from

What’s the best solution? Should I get a physical hard drive

coral wasp Sep 5, 2024, 12:17 PM

#

torn sphinx What’s the best solution? Should I get a physical hard drive

It's basically a math equation... how much data, how long you're going to keep it, where's the model running, etc.

#

Are you training locally or in AWS?

torn sphinx Sep 5, 2024, 12:36 PM

#

coral wasp Are you training locally or in AWS?

sagemaker

#

I heard that sagemaker can read data directly from s3 but I’m not sure

coral wasp Sep 5, 2024, 12:41 PM

#

torn sphinx sagemaker

play around with this calculator to estimate costs - https://calculator.aws/#/createCalculator/S3

AWS Pricing Calculator

AWS Pricing Calculator lets you explore AWS services, and create an estimate for the cost of your use cases on AWS.

formal latch Sep 5, 2024, 5:32 PM

#

More of a conceptual and generic question.

Best or good practice: how should I handle my database connection code? Where should I place it, how should I structure it, etc.

fading patrol Sep 5, 2024, 5:58 PM

#

formal latch More of a conceptual and generic question. Best or good practice: how should I...

What "database connection code"? You're writing your own instead of using a library?

storm mauve Sep 5, 2024, 5:59 PM

#

depends on the framework, not that generic

for FastAPI, you should use Dependencies to create connections that draw from a global pool
for Flask, iirc you typically just save the pool as part of the app configs

#

using https://github.com/fastapi/full-stack-fastapi-template/blob/master/backend/app as an example

formal latch Sep 5, 2024, 6:14 PM

#

fading patrol What "database connection code"? You're writing your own instead of using a libr...

Im using a prebuild wrapper, asyncpg. I just want to know how to structure where the code that interacts within PG is.

fluid mortar Sep 5, 2024, 7:32 PM

#

To be honest, structuring good DB code within a non trivial FastAPI app is a pain in the ass. Like Flask, it’s too keen on using globals.

#

The example above is okay but it relies on a module level ‘engine’ variable. But at least it gets the settings from the environment

storm mauve Sep 5, 2024, 7:39 PM

#

fluid mortar The example above is okay but it relies on a module level ‘engine’ variable. But...

what is the problem with having the module level ‘engine’ variable?

#

at least for FastAPI, you can and should use their dependency injection overwrites if you need to mock it for testing

if you need to use a different engine depending on the request, that feels pretty cursed but could be done by having another dependency instead of leaving it global I guess

fluid mortar Sep 5, 2024, 8:04 PM

#

I don’t want logic being executed at import time. The engine global has side-effects, so you have to be careful when importing that file.

Their dependency override system for testing is nasty because it uses the original object as the key. Which means you get the side effects from importing it even though you won’t use it

rustic remnant Sep 5, 2024, 9:00 PM

#

Hey all, I have an eight month long class for designing a project with 2-3 people that fills some need in CS that hasn't been filled. I really am trying to orient my career path towards databases (with a little bit of DS), but for the life of me can't think of a project that would take that long to make with that many people. Of course it wouldn't be full time work on the project, but I would need to get something presentable and just really don't know what I could even present related to databases. Any ideas?

storm mauve Sep 5, 2024, 9:07 PM

#

collect usage analytics in whichever project you do, and look for ways to store it efficiently and analyse them

databases are part of nearly any CS project ever, but not a project on their own unless you're creating your own (which I wouldn't really recommend)

severe basin Sep 7, 2024, 1:17 AM

#

Do you have any idea how I can add exceptions for the database calls?

#

I'm using AsyncIOMotorClient

potent quartz Sep 7, 2024, 1:29 PM

#

do i need a conn pool with async conn?

#

wont like the async-ness of the conn act as a paralelism mechanism by itself?

#

just like with conn pool

waxen finch Sep 7, 2024, 1:42 PM

#

potent quartz wont like the async-ness of the conn act as a paralelism mechanism by itself?

from my understanding, a pool still gives you a convenient way of running queries without needing to open a new connection each time (connection reuse), along with queuing up workers to prevent excessively high concurrent connections

potent quartz Sep 7, 2024, 1:43 PM

#

waxen finch from my understanding, a pool still gives you a convenient way of running querie...

well on the point of conn reuse

#

i would just pass single conn via app state

potent quartz Sep 7, 2024, 1:44 PM

#

waxen finch from my understanding, a pool still gives you a convenient way of running querie...

along with queuing up workers to prevent excessively high concurrent connections
and with that as i understood async should do even better than with sync (beacuse thats where async strikes at, as i understood)

#

however

#

i think i should just like test

#

with pool vs no pool

#

beacuse about high concurrency

#

i am going to store binary data in db

#

so perhaps separating would help

waxen finch Sep 7, 2024, 1:46 PM

#

potent quartz > along with queuing up workers to prevent excessively high concurrent connectio...

async tasks are definitely more efficient to create compared to threading, but servers cant tolerate infinite connections regardless of which concurrency model you use

potent quartz Sep 7, 2024, 1:47 PM

#

waxen finch async tasks are definitely more efficient to create compared to threading, but s...

but wont the db be the bottleneck at that point?

#

or more like process conn limit

waxen finch Sep 7, 2024, 1:47 PM

#

not to mention there's the overhead of handshaking every new connection

#

someone else or a blog post can explain it better than me, but in short, the benefits of using them don't become redundant simply because your program is asynchronous, and having fewer connections can outperform a database with too many connections
https://stackoverflow.blog/2020/10/14/improve-database-performance-with-connection-pooling/

potent quartz Sep 7, 2024, 1:58 PM

#

i read up a bit of docs actually

#

and

#

it seems like i misunderstood how it works internally a bit pithink

potent quartz Sep 7, 2024, 1:59 PM

#

waxen finch someone else or a blog post can explain it better than me, but in short, the ben...

yeah i will need a conn pool

#

because i cant use same conn from different threads at the same time without a lock..?

#

i use rust btw, just to clear up some possible misunderstandings

waxen finch Sep 7, 2024, 2:04 PM

#

potent quartz because i cant use same conn from different threads at the same time without a l...

you could definitely try, but mixing queries into one connection without locking is always bound to cause some obscure race conditions: sql BEGIN; INSERT INTO foo VALUES (7, 8, 9); BEGIN; SELECT a, b, c FROM foo WHERE a < 5; UPDATE foo SET b = b * 10 WHERE a = 2; ROLLBACK; DELETE FROM foo WHERE c = 10; COMMIT;

potent quartz Sep 7, 2024, 2:05 PM

#

waxen finch you could definitely try, but mixing queries into one connection without locking...

ic

#

and pg's pipeline mode

#

seems more like just batch request

#

rather than some automatic magic

waxen finch Sep 7, 2024, 2:12 PM

#

seems about right, its described as just client-side so i'd assume the database was already able to queue up queries (from one connection) before the feature was added

thorny anchor Sep 7, 2024, 7:18 PM

#

the model in relational databases is multiple concurrent connections, not multiple concurrent transactions on one connection

tender tree Sep 8, 2024, 7:32 PM

#

hey guys why are dbt docs such a mess

torn sphinx Sep 9, 2024, 1:38 AM

#

MongoDB is best

oak pumice Sep 9, 2024, 11:14 AM

#

This may be a silly question but it's Monday morning and i'm only two sips into my coffee (That's my story and I'm sticking to it):

I have two tables [PartsInventory] & [PartHistory] that pull just under 60k items every morning from our software's DB via API calls. I also have three more tables [CurrentDayInventory] CDI, [PreviousDayInventory] PDI, and [ArchivedInventory] AI. THe current process is to dump PDI's data into AI. Then dump CDI into PDI. Then dump the up to date data from a View I created into CDI.

I recently began working on this piece again and decided to compare the View to CDI and which ever Part required updating to update it instead of deleting all data in CDI and then inserting 60k items.

My question, is whether or not it is more efficient to simply delete the data in PDI, then insert all of CDI into it. Then delete CDI data and insert all of my View into it. OR iterate over each item and update where necessary? The latter method appears to have increased the run time on the script.

Thanks!

#

I should also add that I am using SQLAlchemy within my Python script to do all of this.

clever topaz Sep 9, 2024, 11:54 AM

#

oak pumice This may be a silly question but it's Monday morning and i'm only two sips into ...

How are you updating the existing records?

#

I wonder also if you could get away with renaming the tables (at least for CDI -> PDI). Should be pretty fast : https://dba.stackexchange.com/a/53850

Database Administrators Stack Exchange

How long will it take to rename table

How long will it take to rename a table with a size of 219 GB?

This is the specification of our server:
Memory: 65 GB
Available disk space: 266 GB
MySQL Ver 14.12 Distrib 5.0.77

Sorry for the

oak pumice Sep 9, 2024, 12:00 PM

#

I have CDI and PDI set up so that I can compare yesterday's items with today's items which I use for a couple of reports for the managers.
Changing their names wouldn't help i nthis scenario. As for the update, I will grab a code sample and post it

clever topaz Sep 9, 2024, 12:06 PM

#

oak pumice I should also add that I am using SQLAlchemy within my Python script to do all o...

Also beware indexes for slowing down the inserts into the AI table (https://dba.stackexchange.com/a/55575). What are your current benchmarks and what are what speed are you aiming for?

spark aurora Sep 9, 2024, 12:06 PM

#

wsg ppl

oak pumice Sep 9, 2024, 12:11 PM

#

Here is the function that updates iterates through my View and updates CDI where necessary. Previously, I simply deleted the data in CDI and inserted everything from the View into CDI. I figured updating would be quicker but it seems to have lengthed the process.

#

https://paste.pythondiscord.com/R5QQ

#

That might be easier to read

clever topaz Sep 9, 2024, 12:28 PM

#

oak pumice Here is the function that updates iterates through my View and updates CDI where...

Hmm curious why you decided to go with SQLAlchemy vs pure SQL against the DB.

In terms of the update, you are making multiple queries per row, which is going to be quite slow. Also I'm not sure these operations are being batched, but that will have a big impact if not either.

#

I would expect a query like this to take on the order of 10s of seconds, a couple mins max. How long is it taking at the moment?

paper flower Sep 9, 2024, 12:40 PM

#

oak pumice I should also add that I am using SQLAlchemy within my Python script to do all o...

It shouldn't be hard to just merge exiting entities 🤔

clever topaz Sep 9, 2024, 12:42 PM

#

paper flower It shouldn't be hard to just merge exiting entities 🤔

In terms of efficiency it will likely (definitely?) be slower though

paper flower Sep 9, 2024, 12:43 PM

#

clever topaz In terms of efficiency it will likely (definitely?) be slower though

A bit, yes, but it's just 60k entities, how slower do you think it'll be?

#

And also why?

#

Let's say you have a code like this to add new entities:

session: Session

for chunk in chunked(scraped_data, 1000): # Probably would be a good idea to chunk your data there
    for schema in chunk:
         model = SomeModel(id=shcema.id, name=schema.name, ...)
         session.add(model)
    session.flush()

You can just modify it a bit to update existing records:

session: Session

for chunk in chunked(scraped_data, 1000):
    stmt = select(SomeModel).where(SomeModel.id.in_([schema.id for schema in chunk]))
    existing_models = {model.id: model for model in session.scalars(stmt)}
    for schema in chunk:
         model = existing_models.get(schema.id) or SomeModel(id=schema.id)
         model.name = schema.name
         session.add(model)
    session.flush()

#

That's if you're using orm of course

#

With core you could just use on conflict do update?

clever topaz Sep 9, 2024, 12:45 PM

#

paper flower A bit, yes, but it's just 60k entities, how slower do you think it'll be?

Not much, a couple seconds. Not 100% sure what OP is attempting to optimise for.

paper flower Sep 9, 2024, 12:46 PM

#

Honestly if you're after raw speed you probably ~~shouldn't use python~~ should just benchmark all the options 🤔

#

Maybe yeah, just call delete/truncate on a table and insert all the records again, see if that would be better

#

But generally sqlalchemy should try to optimize updates, if model didn't change nothing should be emitted into db

slate lichen Sep 9, 2024, 2:27 PM

#

Hmm, I will either get praise or a tonne of backlash due to this code

class DictWrapper:
    def __init__(self, dictionary):
        self._dict = dictionary

    def __getattr__(self, key):
        if key in self._dict:
            value = self._dict[key]
            if isinstance(value, dict):
                return DictWrapper(value)
            return value
        else:
            return None

    def __setattr__(self, key, value):
        if key == "_dict":
            super().__setattr__(key, value)
        else:
            if isinstance(value, dict):
                self._dict[key] = DictWrapper(value)
            else:
                self._dict[key] = value

    def __repr__(self):
        return repr(self._dict)

The only reason I made it is so that I can easily read and manupulate the dictionary without getting random index errors. The code looks prettier as well.

||This code was motivated by the javascript objects||

clever topaz Sep 9, 2024, 2:32 PM

#

slate lichen Hmm, I will either get praise or a tonne of backlash due to this code ```py clas...

I don't think trying stuff out is bad, and this is a nice approach. Re. the problem that you had though (random index errors), I'm not sure how you using dicts, but dict.get(key) can be used instead as it just returns None by default (default val is configurable).

slate lichen Sep 9, 2024, 2:42 PM

#

clever topaz I don't think trying stuff out is bad, and this is a nice approach. Re. the prob...

Yh thanks for that info, but typing response.StatusCode is way cleaner than response.["StatusCode"] or response.get("StatusCode") where response is a dictionary

next shadow Sep 9, 2024, 6:50 PM

#

Hello, your messages have been removed for violating rule 6

slate lichen Sep 9, 2024, 7:08 PM

#

oak pumice Here is the function that updates iterates through my View and updates CDI where...

How is the picture such high resolution 💀?

pearl lodge Sep 9, 2024, 7:55 PM

#

guys why my table content is not deleting?

#

Template.py is ready
result = [(None, 1130268405333753857, 'Hikaru', 'Hikaru:', None)]
deleted

#

the process is occuring but the deletation si not happening

#

async def delete_default_character(
        self, *, user_id: int, name: str | None = None, prompt_prefix: str | None = None
    ) -> None | str | list:
        cursor = await self.db.execute_fetchall(
            "SELECT * FROM default_characters WHERE user_id = ? and (prompt = ? or char_name = ?)",
            (user_id, prompt_prefix, name),
        )

        result = list(i for i in cursor)

        print(f"{result = }")

        if len(result) > 0:
            if len(result) == 1:
                if name:
                    await self.db.execute(
                        "DELETE FROM default_characters WHERE user_id = ? and char_name = ?",
                        (user_id, name),
                    )
                    await self.db.commit()

                    print("deleted")

                    return "SUCESS"
                elif prompt_prefix:
                    await self.db.execute(
                        "DELETE FROM default_characters WHERE user_id = ? and char_name = ?",
                        (user_id, name),
                    )

                    print("deleted")

                    await self.db.commit()
                return "SUCESS"
            else:
                result_list = list()

                for i in result:
                    data = {
                        "name": i[2],
                        "prompt_prefix": i[3],
                        "image_url": i[4],
                    }
                    result_list.append(data)

                return result_list
        else:
            return "ERROR"

#

why it's not deleting from my database?

oak pumice Sep 9, 2024, 9:59 PM

#

slate lichen How is the picture such high resolution 💀?

I used CodeSnap extension in VSC 💪😎

mossy whale Sep 10, 2024, 1:36 AM

#

pearl lodge guys why my table content is not deleting?

in the elif prompt_prefix: block you attempt to delete an entry using name, even though the condition implies that name might not be provided.

harsh pulsar Sep 10, 2024, 4:36 AM

#

what's the purpose of doing this?
https://github.com/dbt-labs/dbt-adapters/blob/c707b534184a1a631e9b27437b4737626e19a5bd/dbt/include/global_project/macros/materializations/snapshots/helpers.sql#L62
won't that always be null 100% of the time? why not just write null? is it some kind of workaround to avoid an explicit cast?

delicate fieldBOT Sep 10, 2024, 4:36 AM

#

dbt/include/global_project/macros/materializations/snapshots/helpers.sql line 62

nullif({{ strategy.updated_at }}, {{ strategy.updated_at }}) as dbt_valid_to,```

finite edge Sep 10, 2024, 11:20 AM

#

What is the fastest way to check "in" on integer on dbs?

Get the whole list from JSON and placing it in a set?

For example, a list of channels ids

coral wasp Sep 10, 2024, 12:33 PM

#

finite edge What is the fastest way to check "in" on integer on dbs? Get the whole list fro...

Can you give an example of what you mean?

finite edge Sep 10, 2024, 12:39 PM

#

coral wasp Can you give an example of what you mean?

channels ids in a json file

[
    16382728238381982384,
    16382728238381982386,
    16382728238381982385,
    16382728238381982383,
    16382728238381982387,
    16382728238381982388,
    16382728238381983389,
]

on message event for example

white_channels = getallwhitechannelsjson() #returns the whole json in set

if message.channel.id in white_channels:
     ...

coral wasp Sep 10, 2024, 12:40 PM

#

And what do you want to query from a db? (And what type of db?)

finite edge Sep 10, 2024, 12:41 PM

#

coral wasp And what do you want to query from a db? (And what type of db?)

Json is considered a database...

finite edge Sep 10, 2024, 12:44 PM

#

finite edge What is the fastest way to check "in" on integer on dbs? Get the whole list fro...

Well, thats not the point, i have this issue

coral wasp Sep 10, 2024, 12:47 PM

#

finite edge Well, thats not the point, i have this issue

Why does 'fastest' matter? Is there a huge number of objects or something? Is there a complex json file? If it's just your example, that's just find an element in a list.

finite edge Sep 10, 2024, 12:49 PM

#

coral wasp Why does 'fastest' matter? Is there a huge number of objects or something? Is th...

Exceeding my example

thorny anchor Sep 10, 2024, 1:04 PM

#

finite edge Exceeding my example

if you're just using a json file, there's no faster way

finite edge Sep 10, 2024, 1:08 PM

#

thorny anchor if you're just using a json file, there's no faster way

So is json the best way to do something like that?

thorny anchor Sep 10, 2024, 1:08 PM

#

maybe

#

json could be a good choice for other reasons. it's not well suited for this task

coral wasp Sep 10, 2024, 1:15 PM

#

finite edge So is json the best way to do something like that?

Since this is the db channel, I'll give you my ducky_drawing solution: ```py
with open('myfile.json', 'w') as f:
f.write("""
[
"16382728238381982384",
"16382728238381982386",
"16382728238381982385",
"16382728238381982383",
"16382728238381982387",
"16382728238381982388",
"16382728238381983389"
]
""")

import duckdb
df = duckdb.sql("select * from 'myfile.json' where json = '16382728238381982384' ").df()
print(df)

thorny anchor Sep 10, 2024, 1:16 PM

#

well that's impressive

slate lichen Sep 10, 2024, 1:17 PM

#

oak pumice I used CodeSnap extension in VSC 💪😎

Imma try it out. Seems like something pretty cool 😎

coral wasp Sep 10, 2024, 1:20 PM

#

thorny anchor well that's impressive

It's actually cooler when you deal with more complex json...https://duckdb.org/2023/03/03/json.html

thorny anchor Sep 10, 2024, 1:24 PM

#

that is indeed cool

finite edge Sep 10, 2024, 2:09 PM

#

coral wasp It's actually cooler when you deal with more complex json...<https://duckdb.org/...

Ty

#

👍

harsh pulsar Sep 10, 2024, 4:41 PM

#

finite edge Json is considered a database...

json is not suitable as a database when partial reading and writing is required. the json file format does not support partial rewriting.

#

and duckdb is cool but you might want to consider just using regular python for this... json.load interprets json arrays as python lists. so once your data is loaded, you just have a normal list, and you can use normal python code on it

last perch Sep 11, 2024, 3:11 AM

#

I set up a db but have no idea how to work it. Still using json because we are less than 5 users so far

fading patrol Sep 11, 2024, 11:57 AM

#

last perch I set up a db but have no idea how to work it. Still using json because we are l...

It's worth learning and won't take long. SQLite is the easiest step up from a JSON file and good enough if you don't need simultaneous transactions. The main difference from JSON is just that you need a schema and tables

spare fox Sep 12, 2024, 7:40 AM

#

Can I set automatic attachment downloading of specific email sender that I receive every day and set it to be also automatically saved on my pc when I receive it?

cedar tiger Sep 12, 2024, 10:22 AM

#

spare fox Can I set automatic attachment downloading of specific email sender that I recei...

yes you can

spare fox Sep 12, 2024, 10:42 AM

#

Hi ! Tnx...Is there a software for that or I have to write code ?

spare fox Sep 12, 2024, 10:43 AM

#

cedar tiger yes you can

Can I do it through python?

cedar tiger Sep 12, 2024, 11:11 AM

#

spare fox Can I do it through python?

Yes

torn sphinx Sep 12, 2024, 6:48 PM

#

which database offers the most storage for free with the most read/write cycles?

fading patrol Sep 12, 2024, 8:06 PM

#

torn sphinx which database offers the most storage **__for free__** with the most read/write...

Question is vague... are you asking about fully managed cloud providers or what?

You can self-host Postgres and move a whole lot of data through it, depending on your hardware

torn sphinx Sep 12, 2024, 8:16 PM

#

fading patrol Question is vague... are you asking about fully managed cloud providers or what?...

database on the cloud, i cant self host a database because it needs to be available 24/7

fading patrol Sep 12, 2024, 8:31 PM

#

torn sphinx database on the cloud, i cant self host a database because it needs to be availa...

You're not going to get a reliable cloud-managed database for free. If your app is on a VPS you can manage the DB yourself on the same server

fading patrol Sep 12, 2024, 8:33 PM

#

torn sphinx database on the cloud, i cant self host a database because it needs to be availa...

This is a bit old, not sure if any of these options are still available https://www.reddit.com/r/Heroku/comments/x2qwst/free_alternatives_to_heroku_with_free_sql_database/

From the Heroku community on Reddit

Explore this post and more from the Heroku community

lime dagger Sep 12, 2024, 8:39 PM

#

torn sphinx which database offers the most storage **__for free__** with the most read/write...

if you can afford the transition to KV store instead of sticking to relational databases, serverless kv store like cloudflare workers kv and vercel hobby tier is also decent

torn sphinx Sep 12, 2024, 8:50 PM

#

lime dagger if you can afford the transition to KV store instead of sticking to relational d...

idk what a kv store is but i currently use mongodb

#

i was thinking of upgrading to serverless mongodb if i have to

torn sphinx Sep 12, 2024, 8:51 PM

#

fading patrol This is a bit old, not sure if any of these options are still available https://...

i use mongodb right now but ill look into this

ornate wave Sep 13, 2024, 10:32 AM

#

Hi, I have a question about SQLAlchemy's cascade deletion, specifically when using a secondary join.

If I declare cascade="all, delete", does it delete rows from the association (middle) table only, or does it also delete rows from the related table on the other side of the relationship? Or does it delete from both?

If I declare onDelete="CASCADE" on the middle table, do I also need to declare cascade="all, delete" on the left and right tables?

gilded ferry Sep 13, 2024, 1:10 PM

#

i feel dumb looking at all of this

warm igloo Sep 13, 2024, 1:14 PM

#

I have a question. So I appear to be having an issue with my connection string pyodbc. I mean the code runs perfectly on my computer, but when this other person tries running it, she gets the following error:

Login failed for user 'janedoe'

Here's the code:

`
server = 'servername'
database = 'dbname'
username = 'readonly'
password = 'xxxxxxxxxx'

conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password, trusted_connection='yes', autocommit=True)
`

fading patrol Sep 13, 2024, 1:33 PM

#

gilded ferry i feel dumb looking at all of this

Normal... I've been programming professionally for a few years and I still don't understand half the stuff people discuss on this server 😉

fading patrol Sep 13, 2024, 1:34 PM

#

warm igloo I have a question. So I appear to be having an issue with my connection string p...

Do they have network connectivity to servername?

warm igloo Sep 13, 2024, 1:36 PM

#

fading patrol Do they have network connectivity to `servername`?

I'll just check with her.

#

Now to wait for a response.

warm igloo Sep 13, 2024, 3:02 PM

#

fading patrol Do they have network connectivity to `servername`?

So no she does not have a connectivity to the server. In SSMS, it says cannot connect to ServerName. Under additional information, it says Login failed for user '...\JDoe'. My guess is that she has no access to the server.

#

I guess I'll contact someone to try and get her access.

sly acorn Sep 13, 2024, 7:51 PM

#

hello all, is it bad practice to name certain keys in the database as "name.bio" instead of name_bioi? The original source where I'm getting the data from has keys names with both underscore and dots as well.

Should I make the schema according to that or rename the incoming data to match the database with all keys in underscore?

paper flower Sep 13, 2024, 9:36 PM

#

sly acorn hello all, is it bad practice to name certain keys in the database as "name.bio"...

It's probably better to be more consistent, I personally prefer _, also depending on db I think you'd have to wrap a name with . in quotes, but I may be wrong here 🤔

#

On python side if you use orm it would certainly be easier to use an underscore

#

Also regarding the field name.bio itself, can't it be just bio?

coral wasp Sep 13, 2024, 10:16 PM

#

sly acorn hello all, is it bad practice to name certain keys in the database as "name.bio"...

Uh don't use periods in names. Ever.

#

Periods are for scoping, and you'll be forced (as Doctor said) to always double quote the names

paper flower Sep 13, 2024, 10:26 PM

#

Yeah, e.g. schema.table, table.field

#

Would be much easier to write table.name_fio than table."name.fio"

static badger Sep 14, 2024, 3:15 AM

#

Not sure if this is the right channel but I’m curious for dealing with local data files as part of Jupyter hosted data analysis I’ve found my three most useful external modules are pandas, pyarrow and duckdb. Am I missing anything that’s better than these?

clever topaz Sep 14, 2024, 12:06 PM

#

static badger Not sure if this is the right channel but I’m curious for dealing with local dat...

Depends entirely on what you are trying to do. Throw polars into the mix if you are looking to work with large Data frames.

harsh pulsar Sep 14, 2024, 12:49 PM

#

static badger Not sure if this is the right channel but I’m curious for dealing with local dat...

Ibis might be interesting to you: it's a common interface over several data frame libraries, including Pandas, Polars, and Duckdb

#

Other options include Xarray for when you want >2-dimensional Pandas data frames, or Dask for when you want to do Spark-style partitioned parallel batch processing without the fuss of Spark (arguably this is less useful for local computing than it used to be, now that we have higher-performance multithreaded stuff like Duckdb that can handle out-of-core processing)

#

Pandera is good if you are building applications around data frames

sly ember Sep 14, 2024, 12:57 PM

#

Hi, not sure if this is the right channel to ask this in but I was having some trouble so I figured I'd try, https://github.com/vn-ki/film-raffle-bot I'm attempting to recreate a discord bot that uses python, I think I've got the main elements in the config file down but I'm having trouble with setting up a database for it since I have no experience with that. Any suggestions or recommendations for what to use?

GitHub

GitHub - vn-ki/film-raffle-bot

Contribute to vn-ki/film-raffle-bot development by creating an account on GitHub.

harsh pulsar Sep 14, 2024, 3:43 PM

#

sly ember Hi, not sure if this is the right channel to ask this in but I was having some t...

How are you planning to host the bot? Sqlite is probably the easiest option, but you will need write access to the filesystem where you host the bot

#

(The sqlite database is just a file)

sly ember Sep 14, 2024, 3:46 PM

#

harsh pulsar How are you planning to host the bot? Sqlite is probably the easiest option, but...

i was planning on just running it from my computer since it doesnt need to be online 24/7 if thats possible

#

i’m not really sure how hosting solutions work

fading patrol Sep 14, 2024, 5:09 PM

#

sly ember i was planning on just running it from my computer since it doesnt need to be on...

That's fine. What exactly you'll need to do to get it exposed to the Internet may depend on your router or ISP but basically the options are to set up port forwarding or open some kind of tunnel

As for the database, I agree SQLite is the easiest place to start, perfect for something that's not going to support a huge amount of real-time traffic

sly ember Sep 14, 2024, 5:40 PM

#

i created the file in the same folder as the rest of the files, edited the config with the necessary info but the main.py isnt working no matter what and im not sure what the issue is

burnt maple Sep 14, 2024, 10:01 PM

#

I have a question how can I make subfolders for my gui? My apologies

fading patrol Sep 14, 2024, 10:04 PM

#

sly ember i created the file in the same folder as the rest of the files, edited the confi...

You should show the relevant code and other details in a help channel #❓｜how-to-get-help

sly ember Sep 14, 2024, 10:06 PM

#

oh i got that fixed but now it’s just the commands not working, i’ll post in a bit

spring palm Sep 15, 2024, 3:10 AM

#

why am I get floating point errors with my float columns using asyncpg? i'm not doing anything exotic with my data except from loading it into my class

paper flower Sep 15, 2024, 3:49 AM

#

spring palm why am I get floating point errors with my float columns using asyncpg? i'm not ...

What was in the initial data? How are you loading it?

spring palm Sep 15, 2024, 4:33 AM

#

paper flower What was in the initial data? How are you loading it?

wdym by that? i'm just using a select query

paper flower Sep 15, 2024, 4:34 AM

#

I don't understand what kind of errors are you getting since these seem like normal values to me

#

Also is that war thunder?

spring palm Sep 15, 2024, 4:34 AM

#

paper flower Also is that war thunder?

yep

#

also i should've clarified, i'm getting this

paper flower Sep 15, 2024, 4:35 AM

#

Are you getting this from your api or from warthunder's api?

spring palm Sep 15, 2024, 4:37 AM

#

paper flower Are you getting this from your api or from warthunder's api?

at it's core a python script get's it from a datamine and puts it into the database

paper flower Sep 15, 2024, 4:37 AM

#

spring palm also i should've clarified, i'm getting this

I mean where are these values coming from?

spring palm Sep 15, 2024, 4:46 AM

#

paper flower I mean where are these values coming from?

can you be more specific?

paper flower Sep 15, 2024, 4:48 AM

#

That's what I'm asking you to simply understand what's the cause of that problem 😅

#

If that's a screenshot from warthunder's api for example then you can't really fix that

#

other thatn just round it to nearest N digits after coma yourself

spring palm Sep 15, 2024, 4:51 AM

#

paper flower If that's a screenshot from warthunder's api for example then you can't really f...

it's from my own api, which i'm trying to fix

#

and i guess i could round it, but i would prefer a more robust solution

paper flower Sep 15, 2024, 4:55 AM

#

Depending on how your client works with these values you may want to use decimal.Decimal here and a numeric type in your db 🤔

#

Float values can't be accurately represented with IEEE754 floats if that's what you're going for

#

But generally that should be fine if your client can handle formatting correctly

spring palm Sep 15, 2024, 5:07 AM

#

paper flower Depending on how your client works with these values you may want to use `decima...

i tried using DecimalField and it crashed from numeric overflow 🫠

#

i'll just round it for now until i work out something better

hasty helm Sep 15, 2024, 5:23 AM

#

spring palm wdym by that? i'm just using a select query

what font are you using
looks cool

spring palm Sep 15, 2024, 5:23 AM

#

hasty helm what font are you using looks cool

https://github.com/IdreesInc/Monocraft

GitHub

GitHub - IdreesInc/Monocraft: A monospaced programming font inspire...

A monospaced programming font inspired by the Minecraft typeface - IdreesInc/Monocraft

coral wasp Sep 15, 2024, 2:13 PM

#

Open help thread and share code plz: #❓｜how-to-get-help

faint marlin Sep 15, 2024, 2:23 PM

#

coral wasp Open help thread and share code plz: <#704250143020417084>

done, thanks ^^

grim quest Sep 15, 2024, 7:47 PM

#

hey guys i got a job interview for a data analyst position in two days

#

what topics should i go over

fading patrol Sep 15, 2024, 9:41 PM

#

grim quest what topics should i go over

I googled it for you: https://www.datacamp.com/blog/how-to-prepare-for-a-data-analyst-interview

In addition to lists like these I would especially be researching the specific company and role

14 Data Analyst Interview Questions: How to Prepare for a Data Anal...

Prepare for your data analyst interview and practice some data analyst interview questions. Get help landing your dream job!

static badger Sep 15, 2024, 11:25 PM

#

harsh pulsar Ibis might be interesting to you: it's a common interface over several data fram...

I’ve also been using AWSwrangler + boto for non remote stuff. (Mostly for integrating Athena queries into my notebooks or to dynamically generate glue records for S3 hosted data..)

Ibis has been on my radar to checkout, as was Polars but i actually ran into an issue with Polars and would have had to compile it myself to handle datasets with more than 4 billion (or so) records, which ironically pandas has no issue with if you have the ram. (64 bit index).

My largest dataset had 300+ billion rows. I melted my laptop trying to analyze it, so I ended up spinning up my notebook on a 1TB ec2 instance with Vscode remote. Believe it or not pandas (with Pyarrow) was able to handle it fine. (But took 12 hours.) in hindsight they t would have made sense to use a distributed processing engine but I was trying to contain my costs. (Was about $70 for the ec2 instance)

Polars is still on my list for its Iceberg support.. but I’m more and more attracted to duckdb’s native format for those cases I need an extendable file based data store that supports schema evolution.

Ibia remains on my list.. I’ll learn more about out it at the very least. I kind of like learning to work directly with the different libraries so may wait to actually use ibis.

coral wasp Sep 15, 2024, 11:28 PM

#

static badger I’ve also been using AWSwrangler + boto for non remote stuff. (Mostly for integr...

DuckDB discord is active and helpful, so ask for help there if you get stuck. It can handle larger than memory datasets, but as with everything, the details matter. My preferred model involves a lot of hive partitioned parquet files, which duckdb is really good with

static badger Sep 15, 2024, 11:30 PM

#

Yeah I’m in there. Duckdb was a recent discovery. I love it.. refactored a bunch of parquet code to use it.

#

It’s not as fast with parquet as its native format but I can’t give up the space efficiency of parquet. I ended up writing a simple query caching function which more than made up for the performance difference.

#

(It’s gross but I instantiate a global variable dictionary which uses a tuple of the query parameters as the cache key.)

It speeds up the first run and since I live in Jupyter I get the benefits on subsequent runs as long as I do t restart the kernel

#

Next time I have to analyze a large dataset I’m going to try duckdb

coral wasp Sep 15, 2024, 11:34 PM

#

static badger (It’s gross but I instantiate a global variable dictionary which uses a tuple of...

Could also just create a table in an in memory duckdb table

#

I use a dbt pipeline that mostly starts from parquet, too

#

(Which also creates various tables for the intermediate steps)

#

And, if you want to survive restarts, then make it an on disk db rather than :memory:

static badger Sep 15, 2024, 11:36 PM

#

All this data is data to help me. (I do Finops and I’m largely left to fend for myself.. I had to teach myself python and enough data analysis/engineering to get it done.

And I love it!

coral wasp Sep 15, 2024, 11:37 PM

#

Nice, I'm on data engineering side of fintech

static badger Sep 15, 2024, 11:38 PM

#

Ah neat! My background is devops/cloudops/linux.

coral wasp Sep 15, 2024, 11:38 PM

#

One important skill: understanding SQL window functions. Windows are the answer to many complex queries.

static badger Sep 15, 2024, 11:38 PM

#

(I’ve had to do ops support for de teams in the past so I wasn’t scared to get my hands dirty)

#

Ok good to know. Is this particularly to duckdb queries or in general?

#

As of now I largely only use sql with Athena and duckdb. (In both cases I dynamically generate the queries in code)

#

I also have to use psql sql for generating cloud watch metrics in lambda. (Much more lightweight as it’s for tracking database metrics like length of longest running query)

coral wasp Sep 15, 2024, 11:42 PM

#

static badger Ok good to know. Is this particularly to duckdb queries or in general?

In general if you're doing analytics work with databases.

static badger Sep 15, 2024, 11:45 PM

#

coral wasp Could also just create a table in an in memory duckdb table

I’ll try it and see if there is a performance difference.

harsh pulsar Sep 15, 2024, 11:49 PM

#

static badger I’ve also been using AWSwrangler + boto for non remote stuff. (Mostly for integr...

Interesting experience. I definitely found Polars' support for "out of core" / "streaming" to be incomplete, eg you can't do a streaming partitioned write to a pyarrow dataset even though pyarrow supports it

#

Duckdb does better

#

I think Dask can do some out of core stuff but meh, Duckdb just works most of the time

coral wasp Sep 15, 2024, 11:57 PM

#

harsh pulsar Duckdb does better

I think they're all somewhat 'incomplete': it's an area of active work and optimization for all of them... I haven't compared them side by side, I generally can just throw RAM at a problem and partition the dataset.

harsh pulsar Sep 15, 2024, 11:58 PM

#

coral wasp I think they're all somewhat 'incomplete': it's an area of active work and optim...

Absolutely, I don't mean that to be negative -- it's just been my experience

#

They all have areas where they are good and less good

coral wasp Sep 15, 2024, 11:59 PM

#

Yah, i guess what I really wanted to say is that this stuff is the state of the art, and as such, will have bumps along the path of using

static badger Sep 16, 2024, 12:21 AM

#

Looking at window functions. Can they kinda replace the groupby functions in pandas? (Among other things.. really just looking g to see why I would need them)

#

Oh nvm I see.. seems like it could be useful.. but maybe not for any problems I’ve had to solve YET.

coral wasp Sep 16, 2024, 12:40 AM

#

static badger Looking at window functions. Can they kinda replace the groupby functions in pan...

Groupby is just group by in sql.

#

Windows are good for things like: give me the top three products by country, or: show me rolling 7 day sales average

#

Or, let's say you want to compare sales this quarter to sales 4 quarters ago

#

(These are all things you can also do in pandas, just explaining why windows are useful)

torn sphinx Sep 16, 2024, 12:48 AM

#

I'm in a bit of a predicament and i need help please because i need to deploy
so first of all every time i need to execute sql on my webapp i open a new connection but this is a very expensive operation to do like opening new tcp connection to postgres db in the cloud so what people do is that they create a connection pool and pull connections from it so that there is no opening and closing overhead but the problem is that the appache webserver makes my code in process and each process has 5 threads so my server code gets ran num_ processes * 5 times which results in initializing a number of connections higher than the available (5)

coral wasp Sep 16, 2024, 1:24 AM

#

You can use a connection pool with only one connection.

#

This just means that each process will reuse the same connection, rather than making new connections each time

paper flower Sep 16, 2024, 1:27 AM

#

Honestly I was thinking that maybe don't use apache or use pgbouncer so it does the pooling instead pithink

austere badge Sep 16, 2024, 4:29 AM

#

Is it bad to run a database server and a website on the same machine?

#

I only have 1 machine and I am trying to run both so any recommendations which database would be fine running on the same machine as my website

thorny anchor Sep 16, 2024, 4:33 AM

#

it's fine

barren abyss Sep 16, 2024, 4:53 AM

#

what happen if i index on column with alphabetical data/ char type in SQL ?

#

like name for example

paper flower Sep 16, 2024, 5:25 AM

#

barren abyss what happen if i index on column with alphabetical data/ `char` type in SQL ?

What do you mean by "what happen"?

#

Index would get created pithink

barren abyss Sep 16, 2024, 5:29 AM

#

paper flower What do you mean by "what happen"?

I mean i can see if its on int

#

But idk if possible for char type

paper flower Sep 16, 2024, 5:30 AM

#

btree indexes (which are default in most databases, I believe), so anything that can be compared can be indexed

#

chars can be indexed (if we're talking about single characters), strings can be too

obsidian basin Sep 16, 2024, 6:59 AM

#

@weak yoke , hopefully it is okay if I ping you

I created a simpler example

Like stated in the post

I am having trouble inserting and then querying data in flask sqlalchemy 2.0. Does anyone know how?

Also can someone please show an an example?
Can someone use sqlalchemy 2.0 querying when answering?

Also if I need to add other columns at another time how would I add the columns in a many to many relationship? I assume this would be equivalent to updating columns even though the columns are optional.

Here is the error.
https://paste.pythondiscord.com/TPWQ
app.py
https://paste.pythondiscord.com/TPTQ

home.html
https://paste.pythondiscord.com/B6FA

create_book.html
https://paste.pythondiscord.com/WBUQ

Thanks

craggy hornet Sep 16, 2024, 2:25 PM

#

Hi, I'm looking for someone to give me a hand with some facebook scraper, I have to collect phone numbers of a particular location.

fading patrol Sep 16, 2024, 3:50 PM

#

craggy hornet Hi, I'm looking for someone to give me a hand with some facebook scraper, I have...

Nobody will help you get around Facebook ToS here against server rules. Also that's not a database question

acoustic lava Sep 16, 2024, 5:21 PM

#

Hey everyone!
Can anyone tell me how can I store 3d models in sql database and how I can retrieve to render it on website.

fading patrol Sep 16, 2024, 5:50 PM

#

acoustic lava Hey everyone! Can anyone tell me how can I store 3d models in sql database and h...

Why do you think you want to store 3D models in your database? It's probably better to store the URLs of the files in your DB instead, but if they are small blobs that should be trivial

The normal way to retrieve any data from a DB for a website is an API which you can build with whatever framework you like

For the rendering part, pick a library https://medium.com/lai4d/how-to-embed-an-interactive-3d-model-on-a-web-page-9bd631046927

Medium

How to embed an interactive 3D model on a web page

Thanks to the HTML5 standard nowadays it is possible to visualize 2D and 3D vector graphics in any web browser and in any device without…

jade gorge Sep 17, 2024, 12:24 AM

#

fading patrol Why do you think you want to store 3D models in your database? It's probably bet...

this, ive tried storing the blobs and its way worse, just store the URLs

whole mica Sep 17, 2024, 7:00 AM

#

How can I update all values in an object in MongoDB?
For example, the data look like ```
{
"_id": ...,
"myObject": {
"a": 1, "b": 2, "c": 3
}
}

and I would like to change all 1, 2 and 3 to 0s

jade gorge Sep 17, 2024, 11:38 AM

#

You can use the $set operator

#

query = {"_id": your_document_id}

update = {
    "$set": {
        "myObject.a": 0,
        "myObject.b": 0,
        "myObject.c": 0
    }
}

collection.update_one(query, update)

sacred shell Sep 17, 2024, 12:10 PM

#

Hello, Ive been trying to make this work for the past 2 days, I honestly dont know if this is the correct place to come, But my bot keeps "closing" the databases connection

Ive learned that this is because of my init_database_schema function that loads the schema.sql and gets it contents and executes the commands idk, But, Nothing is wrong there, I load my schema.sql and execute, Weirdly it shows that error, But fun fact, When i do return <asyncpg.Pool>._closed, It for some reason returns, False. Even if the error occured, Can someone help me?

paper flower Sep 18, 2024, 5:00 AM

#

sacred shell Hello, Ive been trying to make this work for the past 2 days, I honestly dont kn...

Perphaps connection is getting stale 🤔

#

Honestly I didn't work with asyncpg directly but sqlalchemy's pool includes a pre_ping functionality which checks if the connection is alive before sending any requests

#

https://github.com/MagicStack/asyncpg/issues/309 Seems to be related?

delicate lodge Sep 18, 2024, 6:06 AM

#

B

#

Mysql or posgrrs

tender kernel Sep 18, 2024, 3:57 PM

#

Alright... here's the thingy...

I'm building a database in PostgreSQL of raw trade data from the Kraken cryptocurrency exchange. There are 724 .csv files. Each one represents a unique trading pair. Each files contains anywhere from 10,000 rows up to around 50 million rows of data. The data in each row always consists of unix time stamp, price, volume like so:

1628609725,2.50300,50.36358000
1628609725,3.07400,2.00200000
1628609725,3.07400,2.00200000
1628609725,3.07400,2.00200000
1628609733,2.50300,7.87232000
1628609739,2.50300,355.43768000
1628609750,2.50300,3.93616000
1628609757,2.50300,2.00400000```

As you can see, entries 3-5  are identical, which is fairly common in raw trade data due to trading algorithms executing trades in rapid succession. Herein lies the problem...

This following link is to the kraken exchange support page
https://support.kraken.com/hc/en-us/articles/360047543791-Downloadable-historical-market-data-time-and-sales-
There you will find two links within, on for the bulk of their entire historical trade data in its most granular form and the second like is to quarterly data dumps. So the problem is, I am currently aggregating the bulk of the data into my database upto and including Q2 2024. Sometime in early Oct they will release the Q3 datadump.

My plan is to be consistently running a datacenter on my raspberry pi to capture live trade data and add it to my database via websocket, however, I'm considering the fact that this data stream can and will be interrupted at some point in the future. For example, if I started the data stream now, there would be  gap of missing data from the end of the bulk data up to the end of Q2 all the way until I start the websocket for live data. 

So If I added the Q3 data to the database, there is no way to differentiate between identical entries, and duplicate entries. I can't seem to figure out how to 'fill in the gaps' without duplicating trade data.

#

https://tenor.com/view/leeloo-fifth-element-please-help-gif-1926920342784571712

Tenor

#

So, the best I can come up with is giving a unique trade id's to the bulk data that identifies as a data dump, and giving unique trade id's to the live data that identifies as live trade data so if there are identical entries that already exist from one side or the other, then processing will skip over adding it as a duplicate?

Does that make sense?

clever topaz Sep 18, 2024, 9:48 PM

#

tender kernel Alright... here's the thingy... I'm building a database in PostgreSQL of raw tr...

Make rows unique on all three columns and then run an insert that just skips on conflict.

#

Also lots of other ways to approach this.

thorny anchor Sep 18, 2024, 11:12 PM

#

clever topaz Make rows unique on all three columns and then run an insert that just skips on ...

but they said that there are genuine duplicates

coral wasp Sep 18, 2024, 11:28 PM

#

Maybe compute a rolling checksum (ie rabin) over last N entries to find the overlap points?

#

https://en.m.wikipedia.org/wiki/Rabin–Karp_algorithm

coral wasp Sep 18, 2024, 11:30 PM

#

tender kernel Alright... here's the thingy... I'm building a database in PostgreSQL of raw tr...

See above

tender kernel Sep 19, 2024, 1:12 AM

#

coral wasp See above

reading. I'll get back to you, thank you.

paper flower Sep 19, 2024, 2:18 AM

#

tender kernel Alright... here's the thingy... I'm building a database in PostgreSQL of raw tr...

Since you already have a timestamp (1628609757 i guess for example?) can't you just skip all the data before that?

cedar tiger Sep 19, 2024, 6:18 AM

#

!pip dataset is interesting. Avoids the whole sql injection and the methods looks easy to use.

delicate fieldBOT Sep 19, 2024, 6:18 AM

#

dataset v1.6.2

Toolkit for Python-based database access.

Released on <t:1689152052:D>.

storm mauve Sep 19, 2024, 7:17 AM

#

cedar tiger !pip dataset is interesting. Avoids the whole sql injection and the methods look...

"Avoids the whole sql injection"?
SQL Injection is not a problem if you are using any library correctly, and I bet that if you use it wrong enough you'll still end up with code vulnerable to sql injections

For extremely simple projects it might be fine, but for anything with a non-trivial schema I would recommend using SQLAlchemy (which that library is built on top off) instead of using it

placid plume Sep 19, 2024, 3:55 PM

#

Hey guys. I'm new to this channel and our team recently built a SQLAlchemy extension for VS Code. It's basically an AI chat (RAG) system built out around the SQLAlchemy docs that allows you to ask questions about the project and get relevant answers without leaving VS Code (it's also completely free). Wanted to share here in case anyone found it useful! https://marketplace.visualstudio.com/items?itemName=buildwithlayer.sqlalchemy-integration-expert-jYSzG&

SQLAlchemy - Visual Studio Marketplace

Extension for Visual Studio Code - An AI Chat System Trained on SQLAlchemy Documentation

fervent stump Sep 22, 2024, 1:13 AM

#

hello everyone, apologies if the answer is obvious, but i need some assistance identifying my server name.
i am trying to connect to my local sql server inside my .py script. however, i get the following error:
Error connecting to database: ('HYT00', '[HYT00] [Microsoft][ODBC SQL Server Driver]Login timeout expired (0) (SQLDriverConnect)')
when running select @@SERVERNAME , it shows me the default DESKTOP-###### server name.

how do i setup my connection string?
prior to this i was using a docker instance, however, the image for that was lost so i am no longer using it.

tender kernel Sep 22, 2024, 5:39 AM

#

paper flower Since you already have a timestamp (`1628609757` i guess for example?) can't you...

the problem is that multiple identical entries can occur within the same timestamp for example:

1628609725,3.07400,2.00200000
1628609725,3.07400,2.00200000```

paper flower Sep 22, 2024, 5:39 AM

#

tender kernel the problem is that multiple identical entries can occur within the same timesta...

Well, you can just delete the duplicates? 🤔

tender kernel Sep 22, 2024, 5:40 AM

#

paper flower Well, you can just delete the duplicates? 🤔

they arent duplicates. they are legitimate trades

#

recurring trades from algo bots

paper flower Sep 22, 2024, 5:40 AM

#

I'm just saying that you can for example:

Get latest timestamp from your db
Delete everything with that timestamp
Start receiving data from the timestamp you're deleted

#

so you won't get any duplicates

#

I don't think there's really a need to overcomplicate, how many trades per second are you expecting to get on average?

#

Correct me if I'm wrong, but I think something like

# Pseudocode
last_record = select(Something).order_by(Something.timestamp.desc()).limit(1)
delete(Something).where(Something.timestamp == last_record.timestamp)

for record in data_stream(from_=last_record.timestamp):
   add_to_db(record)

should work?

#

If there's not a lot of records that could be in the same second/timestamp then impact of that delete statement is pretty minimal, so there shouldn't be any obvious problems?

tender kernel Sep 22, 2024, 5:58 AM

#

paper flower I'm just saying that you can for example: 1. Get latest timestamp from your db 2...

Timestamp is placed into the raw .csv files by the exchange not my db.
Can't delete duplicates because they aren't duplicates, they are identical trades that occur in rapid succession on the excahnge within the same timestamp. It's down to the second so multiple trades can occur within the same second, in fact they commonly do.
The data dumps occur about 1-2 weeks after each quarter ends. So I would be using a data dump to fill in the gaps between the last data dump and whatever gets missed by the live websocket. but if the entire entry from the live data is identical to the data dump, there is no way to differentiate between a duplicate and an identical trade.

#

there are hundreds of millions of trades

paper flower Sep 22, 2024, 5:58 AM

#

There would be duplicates if you accidentally go over the same timestamp twice

tender kernel Sep 22, 2024, 5:59 AM

#

tender kernel there are hundreds of millions of trades

32gb of .csv files i'm processing

paper flower Sep 22, 2024, 5:59 AM

#

that's what I was trying to avoid

tender kernel Sep 22, 2024, 5:59 AM

#

paper flower 2) There would be duplicates if you accidentally go over the same timestamp twic...

do you mind a voice conversation?

paper flower Sep 22, 2024, 5:59 AM

#

sorry, I'm not really in the mood

tender kernel Sep 22, 2024, 6:01 AM

#

paper flower 2) There would be duplicates if you accidentally go over the same timestamp twic...

no worries.

The examples I provided are real entries from the raw .csv file. multiple identical trades are legitimately occurring within the same second which subsequently produces what look like duplicate entries but are actual legitimate trades.

#

so if live data is recorded....

obtuse halo Sep 22, 2024, 6:02 AM

#

That's not typically how you do this

tender kernel Sep 22, 2024, 6:02 AM

#

hang on

obtuse halo Sep 22, 2024, 6:02 AM

#

You add a 4th column for batch identity

#

Which records which input task inserted that row

tender kernel Sep 22, 2024, 6:03 AM

#

so if live data is recorded... and then I add the data dump from the whole quarter to fill in any gaps, the data dump and the live trade data that exists incrementally, there is no way to tell the dumplicates from the identical trades

obtuse halo Sep 22, 2024, 6:03 AM

#

Right

#

You keep track of the job that inserted each row