#databases

1 messages · Page 24 of 1

paper flower
#

lol

fallen vault
#

That’s something I’m thinking about but I don’t think they would want extra things to do (manage a flask website over paying for a database manager)

wise goblet
#

i'll vote that this is one of the top horrors i saw for the last year. Not really maintainable at all.
full of magical values solution

fallen vault
#

Easier isn’t always better.

frosty mirage
frosty mirage
wise goblet
#

i don't like in it only that it returns still SQLqlalchemy statement a bit

#

would be nice wrapping this logic entirely into Pydantic returns for input and output 😅

paper flower
wise goblet
paper flower
#

I would rather use sqlalchemy models instead of returning pydantic objects

wise goblet
#

i hate ORM code propagating through my code base further than from pydantic wrapper 🙈 it is the most type unsafe code usually. Or at least difficult to type without Any stuff

paper flower
#

Usually you'd have some kind of adapter to your specific api (rest, cli, etc) and map a model there

#
class Book(MappedAsDataclass, Base, kw_only=True):
    __tablename__ = "books"

    id: Mapped[int] = mapped_column(primary_key=True, init=False)
    title: Mapped[str]
#

It's already a mapped table, no need to map it again

#

Unless you're returning some kind of aggregate

wise goblet
#

haven't seen this thing

paper flower
wise goblet
paper flower
#
Book(id=42)
src\app\db\models\_user.py:28:1: error: Unexpected keyword argument "id" for "Book"  [call-arg]
    Book(id=42)
Book()
src\app\db\models\_user.py:28:1: error: Missing named argument "title" for "Book"  [call-arg]
    Book()
wise goblet
#

still not liking though fragility of input params into SQL queries though (since they can be Any kwargs.) (or at least haven't seen input param checking)

#

there is no type checking anyway

#

so default behavior to wrap into repository thing is still valid solution

paper flower
#

It should be possible to add that sqlalchemy though, they should have their reasons to not do that

blissful seal
paper flower
delicate fieldBOT
wise goblet
#

in off topic channels

blissful seal
#

Yes, because I'm unaware of the offtopic channels.

paper flower
#

@wise goblet
In SQLCoreOperations they're typed as Any

def __eq__(self, other: Any) -> ColumnElement[bool]:  # type: ignore[override]  # noqa: E501
    ...
def __ne__(self, other: Any) -> ColumnElement[bool]:  # type: ignore[override]  # noqa: E501
    ...
#

But it should be possible to specify stricter types, since the generic is there:

class SQLCoreOperations(Generic[_T_co], ColumnOperators, TypingOnly):
blissful seal
#

I've heard that some databases attempt to be "helpful" in implicit conversion of types. I thought it might amuse.

paper flower
#

@frosty mirage When was the last time you tried an orm in python or any other language?

#

You could give SQLAlchemy a go, and in C# there's Entity Framework Core, which works on top of LINQ and expressions

#

Also if you don't want to use an orm you can always at least build queries with sqlalchemy core

frosty mirage
paper flower
#

That's a more safe option

frosty mirage
#

This is what I mean about adding all this extra mechanism on top of mechanism, just to avoid something that is so simple to do.

paper flower
#

For example like this:

options = {
    "1": Book.id,
    "2": Book.title,
}

select(options[...])
paper flower
#

And simplifies query building

frosty mirage
paper flower
#

Why? 😅

frosty mirage
paper flower
frosty mirage
paper flower
#

I sent one

frosty mirage
paper flower
#

That's what I call a hand-waving argument 😅

frosty mirage
paper flower
#

If you want me to replicate your code could you explain what it does first?

#

What fields are there, how does it search them?

frosty mirage
glacial current
# frosty mirage I kept it simple: all it does is assemble the part after the “`where`” in the qu...

There are other things that are nice about an ORM though that makes the code simpler. I used to do this sort of thing you are doing but when i would look back on it later it took a while to comprehend what it was actually doing. but the ORM method is pretty straight forward pattern. plus you can use other tools on top of it like alembic that will take the model and play an alter condition into it with no effort.

paper flower
#

@frosty mirage Maybe like this?

def escape_ilike(text: str, escape_char: str) -> str:
    return (
        text
        .replace(escape_char, escape_char * 2)
        .replace('%', f"{escape_char}%")
        .replace("_", f"{escape_char}_")
    )

class Model(Base):
    __tablename__ = "model"

    id: Mapped[int] = mapped_column(primary_key=True)

    make: Mapped[str]
    model: Mapped[str]
    details: Mapped[str]
    class_name: Mapped[str]
    allocation: Mapped[str]
    location_name: Mapped[str]


@dataclasses.dataclass
class ModelFilter:
    make: str | None = None
    model: str | None = None
    details: str | None = None
    class_name: str | None = None
    allocation: str | None = None
    location_name: str | None = None


def build_query(filter_: ModelFilter) -> Select[tuple[Model]]:
    stmt = select(Model)

    for field, value in (
        (Model.make, filter_.make),
        (Model.model, filter_.model),
        (Model.details, filter_.details),
    ):
        if value is None:
            continue
        stmt = stmt.where(
            field.ilike(escape_ilike(value, escape_char="/"), escape="/"))

    for field, value in (
        (Model.class_name, filter_.class_name),
        (Model.allocation, filter_.allocation),
        (Model.location_name, filter_.location_name),
    ):
        if value is None:
            continue
        stmt = stmt.where(field == value)

    return stmt
#

Builds a query like

SELECT 
    model.id, 
    model.make, 
    model.model, 
    model.details,
    model.class_name,
    model.allocation, 
    model.location_name 
FROM model 
WHERE model.make ILIKE 'Maker ' ESCAPE '/' 
AND model.model ILIKE 'a////:42' ESCAPE '/' 
AND model.details ILIKE 'b|////\\\\' ESCAPE '/' 
AND model.class_name = 'cls'
fallen vault
#

Does anyone know of a good Postgres client app for iOS?

paper flower
fallen vault
#

Alright. Didn’t want to pay for something I wasn’t sure about.

slender otter
#

hiiiiiiiiiii y'all

fading patrol
fallen vault
#

I have Pythonista. I could write a script, just wanted something prepackaged.

fading patrol
fallen vault
#

Does it have Postgres specific courses?

floral surge
#

does anyone have reading suggestions on Data manipulation languages?

raw reef
#

any who can help tell me why this is only making 1 table then stopping


    lines = []

    for line in file:

        line = line.strip()
        lines.append(line)

        name = str(lines[0])
        name = name.replace('b', '')
        name = name.replace("'", '')        

        cur.execute("CREATE TABLE IF NOT EXISTS " + name +" (Date DATE,Open float4, High float4, Low float4, Close float4)")

        conn.commit()

cur.close()
conn.close()```
fading patrol
glacial current
raw reef
#

the same coed works perfectly fine with mysql database but it's not working with postgres

#

but looks like with postgres it just loops the first line of the .txt file

#

wait lol nvm

#

i realized what it was lol

#

i was looping through them all but i was setting name to lines[0]

floral surge
#

suppose a table undergoes a set number of changes during a period of a year. is it common place to make a new table each time a change is made, and begin making new entries into this new table, instead of modifying the previously existing table?

coral wasp
naive trellis
#

no

#

There's tools like sql compre but you may manually alter the table. you don't recreate prod tables

raw reef
proud sparrow
#

Hello. I hope you are well. Could you tell me about your industry experience? How did you get into the professional world?

coral wasp
glacial current
#

With alembic what is best practice in the alembic.ini for the database connection url? sqlalchemy.url = driver://user:pass thing? Obviously don't want to commit that file. But what is normally done with that?

wise goblet
glacial current
#

Yeh I already set that up in the last project. And checked that file into the project with passwords and all. My question is more like should that file contain the naked password or do you usually set pass a different way? Are you saying I should set %(pass)s for the variable in the url?

glacial current
#

Well I guess that you just set the raw pass in there when you need to configure it. What I usually do is have some configuration set the credit in $HOME/.config/appname/config.ini

stone kestrel
#

what's the preferred way to select an entry from a table using a hash of some columns?
for example, say i have a table with two columns, id bigint, name varchar(20), and i want to be able to find a row using the hash of the id and name combined in some way (say, md5(str(id) + name))

#

i could hash it before i insert and store it as another field, but is that the best solution?

hexed estuary
stone kestrel
#

that sounds perfect, ill look into it, thanks

paper flower
#

config.set_main_option("sqlalchemy.url", get_settings(DatabaseSettings).url)

glacial current
paper flower
#

And get it from your environment

glacial current
#

OK thanks.

#

I really like the supervisor.d conf env interpolation. That could be useful in alembic

hexed estuary
# stone kestrel that sounds perfect, ill look into it, thanks

!e looks like it works for sqlite. I used a user-defined function for md5 here, but for performance one'd want to use a sqlite extension instead.

import sqlite3
import hashlib


# in reality you can load an extension for that I think, but I don't know offhand how to do that from python
def md5sum(t: str):
    return hashlib.md5(t.encode("utf-8")).hexdigest()


con = sqlite3.connect(":memory:")
# we need to promise to sqlite that this function is deterministic for it to let us use it in an index
# again, this won't be a problem for a built-in md5 from some sqlite extension
con.create_function("md5", 1, md5sum, deterministic=True)

with con:
    con.execute("CREATE TABLE mytable (id bigint, name varchar(20))")
    con.execute("INSERT INTO mytable values (?, ?)", (512, "yay"))
    lst = con.execute("SELECT *, md5(id || name) FROM mytable").fetchall()
    print(lst)
    val = lst[0][2]
with con:
    con.execute("CREATE INDEX mytable_hash ON mytable(md5(id || name))")
with con:
    print(con.execute("SELECT * FROM mytable WHERE md5(id || name) == ?", (val,)).fetchall())
    print(con.execute("EXPLAIN QUERY PLAN SELECT * FROM mytable WHERE md5(id || name) == 5").fetchall())
con.close()
delicate fieldBOT
#

@hexed estuary :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | [(512, 'yay', 'ff2b984f07eac019cc3694eb1e3fc6d0')]
002 | [(512, 'yay')]
003 | [(3, 0, 0, 'SEARCH mytable USING INDEX mytable_hash (<expr>=?)')]
stone kestrel
merry cobalt
#

hi could anyone help me on how to get specific data

stone kestrel
merry cobalt
#

I am trying to find a dataset that is within the past 3 years and has a lot of house features. It also needs the house price and location. Something like this https://www.openintro.org/data/index.php?data=ames. I just need newer data. That link is data from Ames, Iowa from 2006-2010, but I need features just like that

delicate fieldBOT
#

:incoming_envelope: :ok_hand: applied timeout to @torn sphinx until <t:1698999406:f> (10 minutes) (reason: role mentions spam - sent 4 role mentions).

The <@&831776746206265384> have been alerted for review.

terse stump
#

!pban 1168331268551888896

delicate fieldBOT
#

:incoming_envelope: :ok_hand: applied ban to @tired flax permanently.

mossy socket
#

<@&831776746206265384>

subtle basin
#

Hello. I want to implemet Postgres query

select * from delivery.client_contacts_notification ccn 
where array_to_string(ccn.phones, ';') like '%22298749%';

into SQLAlchemy ORM query

filter(any_(ClientContactsNotification.phones.like(f"%{client_phone}%"))

Because any_ doesn't work as I expected. Each item should be compared to expression with like operator

potent spire
#

in json file like this

{'id': '1165181902852984913', 'name': 'SwesRa Members', 'icon': '2a5a93228b81b999e24122db3b7ccc27'}

how to get name?

slender atlas
#

You can use json.load to load a JSON file as a dictionary

#

I'm alarmed you've posted this message in the #databases channel

slender atlas
#

It's a text format

mossy socket
#

database can refer to either a collection of data (which json is), or database management system (which json isn't)

slender atlas
#

It's not a memory efficient data collection format either

#

I suppose it was made to be convenient for web communication

paper flower
#
class Book(MappedAsDataclass, Base):
    __tablename__ = "book"

    id: Mapped[int] = mapped_column(primary_key=True, init=False)
    title: Mapped[str] = mapped_column(String(255), unique=True)
    elements: Mapped[list[str]] = mapped_column(ARRAY(String))


stmt = select(Book).where(Book.elements.contains("target"))
print(stmt.compile(dialect=postgresql.dialect(), compile_kwargs={"literal_binds": True}))
SELECT book.id, book.title, book.elements 
FROM book 
WHERE book.elements @> 'target'
#

Maybe this works for your case?

#

I guess ccn.phones is an array

subtle basin
# paper flower I guess `ccn.phones` is an array

Right. But if I use strictly contains in 2.0 { "error_msg": "(sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.UndefinedFunctionError'>: operator does not exist: character varying[] @> character varying\nHINT: No operator matches the given name and argument types. You might need to add explicit type casts." }

paper flower
#

What postgres version?

subtle basin
paper flower
#

And why any doesn't work?

subtle basin
# paper flower And why `any` doesn't work?

Plain SQL -> at least 1 result in set

select * from delivery.client_contacts_notification ccn 
where '+79*****749' like any(ccn.phones) ;

SQLAlchemy empty result set

FROM delivery.post_operation JOIN delivery.client_contacts_notification ON delivery.post_operation.id = delivery.client_contacts_notification.id_post_operation 
WHERE $1::VARCHAR LIKE ANY (delivery.client_contacts_notification.phones)
paper flower
#

Shouldn't it be any(field) ilike 'target'?

#

I honestly don't know if order matters here

#

Also you're doing a join here pithink

#

Does it return anything if you remove it and select directly from delivery.client_contacts_notification table?

subtle basin
paper flower
#

I mean, again, shouldn't any work?

#
SELECT book.id, book.title, book.elements 
FROM book 
WHERE ('target' LIKE '%%' || (ANY (book.elements)) || '%%')

I think the only reason it's possibly not working is your join in the second query you shared

#

select(Book).where(any_(Book.elements).contains("target"))

subtle basin
# paper flower `select(Book).where(any_(Book.elements).contains("target"))`
ef any_(expr: _ColumnExpressionArgument[_T]) -> CollectionAggregate[bool]:
    """Produce an ANY expression.

    For dialects such as that of PostgreSQL, this operator applies
    to usage of the :class:`_types.ARRAY` datatype, for that of
    MySQL, it may apply to a subquery.  e.g.::

        # renders on PostgreSQL:
        # '5 = ANY (somearray)'
        expr = 5 == any_(mytable.c.somearray)

        # renders on MySQL:
        # '5 = ANY (SELECT value FROM table)'
        expr = 5 == any_(select(table.c.value))

    Comparison to NULL may work using ``None`` or :func:`_sql.null`::

        None == any_(mytable.c.somearray)

    The any_() / all_() operators also feature a special "operand flipping"
    behavior such that if any_() / all_() are used on the left side of a
    comparison using a standalone operator such as ``==``, ``!=``, etc.
    (not including operator methods such as
    :meth:`_sql.ColumnOperators.is_`) the rendered expression is flipped::

        # would render '5 = ANY (column)`
        any_(mytable.c.column) == 5

Right, I have no unittest for that model, may be event selectinload changes behaviour

paper flower
subtle basin
odd estuary
#
async def get_player_item_value_by_name(user_id,items:list):
    """returns int value of items from database"""
    async with sql.connect(DATABASE_PATH) as db:
        cursor = await db.cursor()
        list_of_item_values = []
        for item in items:
            query = f"SELECT {item} FROM users WHERE user_id = ?"
            await cursor.execute(query,(user_id,))
            value = await cursor.fetchone()
            list_of_item_values.append(value[0])
    return list_of_item_values

async def get_player_item_value_by_name(user_id,items:list):
    """returns int value of items from database"""
    async with sql.connect(DATABASE_PATH) as db:
        cursor = await db.cursor()
        final_items = ""
        for item in items:
            final_items = final_items + f"{item} , "
        query = f"SELECT {final_items} FROM users WHERE user_id = ?"
        await cursor.execute(query,(user_id,))
        value_list = await cursor.fetchone()
    return value_list

which one is the best method that will actually work? becoz i havent tested them yet

plush quartz
#

“Cannot increment with non-numeric argument”

#

how suppose i fix this error?

paper flower
plush quartz
rich trout
#

then why are you asking your db to increment it?

floral surge
#

Is there any interface that generates the actual Python sqlalchemy Base code for the declaration of a table from a table already existing in a database?

plush quartz
#

i got my first bot(first bot use mongodb) worked, basically

if the data found ->
replace_one
elif not found ->
insert_one

And im working on another bot rn
I want make a game leaderboard (also with mongodb)
and I received an error while i add the if the data found…… stuff into my @client.event(this event hold the minigame fuction, and i add the db fuction at the place hold winner and loser) but I received this error

#
@client.event
TypeError: ‘Database’ object is not callable```
#

is that mean the db can not use able in @client.event?

spare dock
#

Hello community!

paper flower
#

o

floral surge
#

how heavy is the table-binding operation in sqlalchemy? resource-wise

wise goblet
#

python -m cProfile [-o output_file] [-s sort_order] (-m module | myscript.py) for easy invocation for any code.
python -m profile [-o output_file] [-s sort_order] (-m module | myscript.py)

floral surge
crisp panther
#
  File "/home/runner/NOA-sqlite/.pythonlibs/lib/python3.10/site-packages/discord/ext/commands/bot.py", line 1350, in invoke
    await ctx.command.invoke(ctx)
  File "/home/runner/NOA-sqlite/.pythonlibs/lib/python3.10/site-packages/discord/ext/commands/core.py", line 1029, in invoke
    await injected(*ctx.args, **ctx.kwargs)  # type: ignore
  File "/home/runner/NOA-sqlite/.pythonlibs/lib/python3.10/site-packages/discord/ext/commands/core.py", line 244, in wrapped
    raise CommandInvokeError(exc) from exc
discord.ext.commands.errors.CommandInvokeError: Command raised an exception: InterfaceError: Error binding parameter 0 - probably unsupported type.```

What's going on here?
#

In case it isn't clear, I'm trying to use a command on discord that adds an id unto the database

#

And when I try to add an id number, say "12345" that comes out

wise goblet
#

i refactored my code out of discord.py, and it was best decision i made

crisp panther
#

huh?

#

im not gonna not use discord.py for that, it's literally the bot's function lol

plush quartz
paper flower
paper flower
crisp panther
somber niche
#

INTEGER ?

crisp panther
#

i tried both but it doesn't work

paper flower
crisp panther
# paper flower Can you share your query though?

This is the table

This is the command that adds an id unto the database

                name='add',
                brief='Adds user id into the user list')
async def _add_(ctx, user_ids: commands.Greedy[int]):
  if user_ids:
    conn.execute("INSERT INTO users (id) VALUES (?)", (user_ids,))```
paper flower
#

And what was the full error?

crisp panther
# paper flower And what was the full error?
  File "/home/runner/NOA-sqlite/.pythonlibs/lib/python3.10/site-packages/discord/ext/commands/bot.py", line 1350, in invoke
    await ctx.command.invoke(ctx)
  File "/home/runner/NOA-sqlite/.pythonlibs/lib/python3.10/site-packages/discord/ext/commands/core.py", line 1029, in invoke
    await injected(*ctx.args, **ctx.kwargs)  # type: ignore
  File "/home/runner/NOA-sqlite/.pythonlibs/lib/python3.10/site-packages/discord/ext/commands/core.py", line 244, in wrapped
    raise CommandInvokeError(exc) from exc
discord.ext.commands.errors.CommandInvokeError: Command raised an exception: InterfaceError: Error binding parameter 0 - probably unsupported type.```
paper flower
#

That's not the full error

crisp panther
#

wait hold on

crisp panther
digital coral
#

user_ids is a list

#

are you sure you didnt want this?

@client.command(aliases=['a'],
                name='add',
                brief='Adds user id into the user list')
async def _add_(ctx, user_ids: commands.Greedy[int]):
  for u_id in user_ids:
    conn.execute("INSERT INTO users (id) VALUES (?)", (u_id,))
digital coral
#

is that command supposed to take multiple user ids?

#

or just one?

crisp panther
#

yep, that's why it's greedy

digital coral
#

oh

#

nice

paper flower
#

Didn't know Greedy[int] would be a list

slender atlas
# crisp panther yep, that's why it's greedy

You should change it from a loop of conn.execute to one conn.executemany:

conn.executemany("INSERT INTO users (id) VALUES (?)", tuple(zip(user_ids)))
conn.commit()

The tuple(zip(...)) is a trick to turn an iterable with items to a tuple of 1-item-long tuples.
On another note, consider using an asynchronous database API because your function is asynchronous and you wouldn't want your database to block your client. There's aiosqlite if you come from sqlite3

crisp panther
#

I'll consider using an asynchronous database

#

Tuples are the stuff that aren't meant to be modified right? Kinda goes against my intentions of it being modifiable at will so I think not using tuples is the best for it

real forum
#

Hey, I have a problem. Does anyone know how to fix this error?

The update method itself works with other values, only with the solution it doesn't. The insert function works as well.

ERROR

C:\Users\vhipn\Desktop\Eject[RECODE] V1.1\venv\Lib\site-packages\aiomysql\cursors.py:458: Warning: Truncated incorrect DECIMAL value: 'Hochzeit'
  await self._do_get_result()```
# **QUERY** 
```py
    async def update_solution(self, guild_id: int, solution: str) -> None:
        return await MariaDB.execute(("UPDATE emojiquiz SET solution = %s WHERE gid = %s", (solution, guild_id)))```
slender atlas
#

user_ids is a list of user IDs, but executemany requires an iterable of arguments. If it's only one argument, it needs an iterable of 1-item-long argument iterables

#

You can do [[user_id] for user_id in user_ids] if you really want both iterable types to be list, but I'm just saying my conversion trick is shorter

slender atlas
real forum
slender atlas
#

The letter after the % says its data type

#

d for (decimal) integers, s for variable-length character strings

real forum
#
    return await MariaDB.execute(("UPDATE emojiquiz SET solution = %s WHERE gid = %d", (solution, guild_id)))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\vhipn\Desktop\Eject[RECODE] V1\database\database.py", line 22, in execute
    await cursor.execute(*operation)
  File "C:\Users\vhipn\Desktop\Molly V1\venv\Lib\site-packages\aiomysql\cursors.py", line 237, in execute
    query = query % self._escape_args(args, conn)
            ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: %d format: a real number is required, not str```
slender atlas
#

What's the type of guild_id?

#

str?

real forum
#

BIGINT

slender atlas
#

Not gid

#

guild_id

real forum
#

gid = guild_id

slender atlas
#

Before the updating

#

In any case

#

If guild_id is an ID string, just put it in int(...)

real forum
#

guild_id: int, solution: str

slender atlas
#

Apparently guild_id: str

real forum
#

I managed to fix it, but thanks anyway.

slender atlas
valid shoal
#

Suggest best database language to learn for python

fading patrol
hidden creek
#

Hi

#

So lets say i have a many to many relationship

A table that stores sports and a table that stores people
Many people can play a sport and each person can Play multiple sports

#

And i have an app that lets people register their name and the sport they play and inserts that in the database

#

So if a person comes and makes a sport named football

And then another person comes and does the same

Does it create a new data?
Or use the same footbal?

#

This is happening in postgres
If that makes a difference

vague fable
#

If your talking about predefined sports in the database (not letting the user create a sport), you would just take the user imported sport, query your table for said sport, then put the new row in your peoples table where sport = sport.id (sport id being your foreign key)

hidden creek
#

What I'm asking is

If two users input the same name

Does it store it twice?

vague fable
hidden creek
#

Yes

vague fable
#

Then yes it would, unless you add a unique column for the sport name, then what would happen is you would get an error when the second user tries

hidden creek
#

So i have to validate manually then

#

Thanks

vague fable
hidden creek
#

It's not actually sports

It's a web project
And the subject is too big for me to predefine all the possibilities

#

I'd probably have to make billions if cases, which wouldn't be good for me 🙂

vague fable
hidden creek
#

There are relations
Multiple actually

ruby elm
#

How do I connect a mySQL database to a django rest_framework API

#

?

wise goblet
hardy cedar
#

I have a specific question, I would like to transfer my .db database to .sql to use it with postgres instead of sqlite3 as I did before but without losing any data

real ridge
#

hi can someone help me connect my database with pycharm using psycopg2, it says “no data sources are configured to run this SQL and provide advanced code assistance”. I’ve been trying to fix it all day and it’s making me go crazy atp, feel free to dm me

ruby elm
#
    'default': {
        'NAME': 'myDB',
        'ENGINE': 'mysql.connector.django',
        'HOST': '127.0.0.1',
        'PORT': 3306,
        'USER': 'root',
        'PASSWORD': 'my_password',
        'OPTIONS': {
          'autocommit': False,
          'use_oure': True,
          'init_command': "SET foo='bar';"
        },
    }
}``` What should I change my options too when connecting mySQL server to django
#

when I tried to runserver it threw an error

plucky holly
real timber
plush quartz
#

Why it store nothing?
Code:

#

When using command:

#

It do have data in database

#

Anyone please help me solve this issue 🙏

ruby elm
#

why can I not download mysqlclient using pip

#

does anyone know how to do it

coral wasp
floral surge
#

typically, is there a one-to-one mapping between a Pydantic model and a database table?

floral surge
paper flower
#

That is if you want to map your tables that way

#

But pydantic isn't very suitable for that by itself

floral surge
#

when would we have a Pydantic model that maps to more than one database table?

paper flower
#

Because you don't always select all columns or relationships

#

e.g. you may have a model

class User(BaseModel):
    id: int
    username: str

But what if you only select id?

floral surge
paper flower
#

I just think pydantic is a bad choice here, since you can just use an ORM that would do the conversion into python object for you

paper flower
#

Yep

floral surge
# paper flower Yep

in which case, there actually is a one-to-one relationship between an ORM model and the SQL table

paper flower
#

Usually, yes

floral surge
#

noted, thanks!

paper flower
#

Why are you asking about this by the way? pithink

#

There could be some edge-cases with inheritance, etc

floral surge
#

my questionnaire takes time in the form "hours minutes", and then I need to transform it into a valid datetime object by inferring year, month, day, and only then it becomes a valid database entry

#

so I was wondering, what is the usual way this sort of cohersion is done on incoming data

paper flower
#

Well, certainly not in your db models, where are you getting that data from?

floral surge
#

user input

paper flower
#

What kind of?

floral surge
#

prompt-answer CLI

paper flower
#

You can just pass user input into a method/function and do preprocessing/mapping into an intermediate data object or directly into a model

#

First option may be overengineered if you don't need to support other types of inputs

#

E.g. you may have a CLI and Web API

floral surge
#

yeah, but the data normalization/cohersion procedure is not done usually done through the ORM model itself? or in this case would a Pydantic model make sense?

paper flower
#

Depends 🤷

#

I personally prefer to not do any complex validation in pydantic

#
@dataclasses.dataclass
class QuestionaireCreateDTO:
    a: int
    b: str
    c: datetime

def map_cli_args_to_dto(data: dict[str, Any]) -> QuestionaireCreateDTO:
    # Map them how you want
    ...

def main():
    user_input: dict[str, Any] = get_cli_args()
    dto = map_cli_args_to_dto()
#

Then you could pass that DTO into a method/function that creates your db record

floral surge
#

DTO: data transfer object?

paper flower
#

Yep

#

You can map everything straight into your model, but that would be less "clean" imo

floral surge
#

the QuestionnaireCreateDTO has all validation/cohersion methods attached to itself? or do you do them in map_cli_args_to_dto?

paper flower
#

DTO is just that - a DTO, it doesn't have any methods, just data

floral surge
#

and you would create a different function in the case of incoming data from, e.g., an API, correct?

paper flower
floral surge
paper flower
#

If you have some specific data format coming from that API you do that in said API

#

So that logic doesn't pollute anything else

#

You mentioned that you have to infer time

floral surge
#

yeah, the time inference is based on finding which time-tracking database entry the currently defined object is related to, at which point it will use that primary entry's datetime as it's primary key which is also a foreign key

paper flower
#
async def books_create(
    schema: BookCreateSchema,
    command: Annotated[BookCreateCommand, Inject],
) -> BookSchema:
    book = await command.execute(dto=BookCreateDTO(title=schema.title))
    if isinstance(book, Err):
        match book.err_value:
            case BookAlreadyExistsError():  # pragma: no branch
                raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST)

    return BookSchema.model_validate(book.ok_value)

class BookService:
    def __init__(
        self,
        repository: BookRepository,
        db_context: DBContext,
    ) -> None:
        self._repository = repository
        self._db_context = db_context

    async def create(
        self,
        dto: BookCreateDTO,
    ) -> Result[Book, BookAlreadyExistsError]:
        if await self._repository.get(title=dto.title) is not None:
            return Err(BookAlreadyExistsError())

        book = Book(
            title=dto.title,
        )
        self._db_context.add(book)
        await self._db_context.flush()
        return Ok(book)

(Exampe from fastapi app)

#

I do mapping here but I usually make a to_dto method on Schema classes which maps them to dtos

book = await command.execute(dto=BookCreateDTO(title=schema.title))
#

I prefer to not use **kwargs since they won't be type-safe

#

e.g.

class BookCreateSchema(BaseSchema):
    model_config = ConfigDict(title="BookCreate")

    title: str = pydantic.Field(max_length=255)
    
    def to_dto(self) -> BookCreateDTO:
        return BookCreateDTO(title=self.title)
#

It's a simple example though

floral surge
#

hm, i'll study this for a while. thanks a lot for sharing, very helpful stuff

floral surge
paper flower
#

There's also BookCreateCommand class but in this case it just passes dto into BookService.create method

floral surge
# paper flower Yes, it's a fastapi endpoint

so validation is split multiple ways here: BookService makes sure the book doesn't already exist in the database, books_create calls for the validation requirements stored in BookSchema

paper flower
#

You need to leave some validation in fastapi's schema here just so it's built-in validation works

#

But some frameworks I integrated with don't have any built-in validation

floral surge
#

I'm trying to streamline it into a clearly visible high-level structure

pastel canyon
#

Hello everyone, i'm currently needing help with my safe chat i want to have a database and i need to have one for users and messages but sadly nothing is working as efficient and instead i have json errors, i woud like some help from anyone that have the time to do so : https://paste.pythondiscord.com/SWDQ

paper flower
#

Here's another example ( strawberry-graphql doesn't provide any built-in validation):

@strawberry.input(name="DashboardCreateInput")
class DashboardCreateInputGQL:
    name: str

    def to_dto(self) -> DashboardCreateDTO:
        return DashboardCreateDTO(
            name=self.name,
        )

@strawberry.type
class DashboardMutations:
    @strawberry.mutation
    @inject
    async def create_dashboard(
        self,
        info: Info,
        command: Annotated[DashboardCreateCommand, Inject],
        input_: Annotated[
            DashboardCreateInputGQL,
            strawberry.argument(name="input"),
        ],
    ) -> DashboardCreatePayloadGQL:
        dto = validate_callable(input_.to_dto)
        if isinstance(dto, ValidationErrors):
            return DashboardCreatePayloadGQL(error=dto)
        result = await command(
            dto=dto,
            user=await info.context.user_model,
        )
        return DashboardCreatePayloadGQL(
            dashboard=DashboardGQL.from_orm_(result.ok_value),
            error=None,
        )
#

validate_callable is pretty simple:

def validate_callable(callable_: Callable[[], T]) -> T | ValidationErrors:
    try:
        return callable_()
    except pydantic.ValidationError as exc:
        return _to_graphql_error(exc)
floral surge
#

Would this sort of structure make sense?

Questionnary method
    Obtains a set of user imputs in a prompt-answer format

Cohersion method
    Coherses questionnary strings into determinate data types

Validation method
    Complex validation (Does validation which requires server access)
    Simple validation (Simple Pydantic)

Model-creation method
    Creates one or more ORM models from the incoming cohersed and validated data

To database method
    Sends models to database
paper flower
#

For example you could have a cli adapter and rest api adapters, both could use the same methods from core

#

Adapters could use core, but not the other way around, keeping the core more or less isolated

floral surge
paper flower
#

cohersion? Depends

#

Adapter should map it's data into the format that core understands

floral surge
#

I see. so validation, model-creation and transfer-to-database would be the elements of the core

paper flower
#

Yep

#

I think you can use pydantic models in core too, so the data received is guaranteed to be validated

floral surge
#

noted! thank you very much for the help seriously, this has done a great deal in orienting me for finishing my application

paper flower
#

There are some libraries that can validate your imports

#

e.g. pytest-archon

delicate fieldBOT
#

tests/test_imports.py lines 10 to 16

def test_core_cant_import_adapters() -> None:
    (
        archrule("core_imports")
        .match("app.core.**")
        .should_not_import("app.adapters.*")
        .check("app.core")
    )```
pastel canyon
#

@paper flower

paper flower
#

Why ping me?

pastel canyon
#

If you are not busy

paper flower
#

I would have helped you if I wanted to or was interested

#

We're all volunteers here

pastel canyon
#

I posted a thread in python help if you have time

floral surge
floral surge
#

nice! this all makes nice structural sense. thanks a lot again

paper flower
#

@floral surge Generally you want your adapters to interact with core of your applications via pre-defined ports, I use Command/Query classes for that

#

You could use services directly but I think that exposes too much for the adapters

#

e.g. BookService.create returns Book or BookAlreadyExistsError, if we would want to add permission validation it wouldn't really be the right place to do that

floral surge
#

hm, a service is a distinct conceptual unit in your program structure, right?

paper flower
#

Yes

#

So I would put that into the command class:

class BookCreateCommand:
    def __init__(
        self,
        book_service: BookService,
        permission_service: PermissionService, # <-- New class
    ) -> None:
        self._book_service = book_service
        self._permission_service = permission_service

    async def execute(self, dto: BookCreateDTO) -> Result[Book, BookAlreadyExistsError | PermissionDeniedError]:
        if not await self._permission_service.can(Action.create, obj=Book):
            return Err(PermissionDeniedError())

        return await self._book_service.create(dto=dto)
#

Existing code that uses BookCreateCommand and properly handles error values would fail type-checking, since PermissionDeniedError is a new error

#

Like this:

src\app\adapters\api\books\_router.py:39:38: error: Item "Err[BookAlreadyExistsError | PermissionError]" of "Ok[Book] | Err[BookAlreadyExistsError | PermissionError]" has no attribute "ok_value" 
[union-attr]                                                                                
        return BookSchema.model_validate(book.ok_value)
                                         ^~~~~~~~~~~~~ 
src\app\adapters\graphql\apps\books\_mutations.py:37:37: error: Item "Err[BookAlreadyExistsError | PermissionError]" of "Ok[Book] | Err[BookAlreadyExistsError | PermissionError]" has no attribute
"ok_value"  [union-attr]                                                                    
                result=BookGQL.from_orm(result.ok_value),
floral surge
tough kiln
#

Say if we are making a database where attendance of students comes into place

Over there we have to get date wise attendance

Would it be better to have a separate table for each date, where the student serves as the primary key and it just has two columns (student and present/absent column)

Or have a single table with the date column as primary key and each student has their own column (date, student 1, student 2..... student n)

paper flower
#

@tough kiln Never go for the last option

tough kiln
#

Alright,so first option it is

#

Thx

paper flower
#

There may be a better design

tough kiln
paper flower
#

Probably a table like this would do:

student_id, date, is_present
#

I would use a surrogate pk here and put an unique constraint on (student_id, date) pithink

#

Composite pkeys are a pain to deal with in most cases

tough kiln
fringe sigil
#

alright i got a fairly hard-ish question for who ever can help.

torn sphinx
#

how would one store data into a db that resembles the github activity calendar basically. Like just the logic behind it, how would be the best way to store that activity data to an account. Im fairly new to data modelling etc, sorry!

fading patrol
torn sphinx
#

What would be the best way to store around 10+ billion records, around 5 - 10 TB of data total

All of the cloud solutions like aws / google cloud / etc seem to cost hundreds if not thousands per month which I can't afford for a hobby

I'm assuming the best solution would be to buy a physical server and self host it and then just worry about electricity costs, but wondering if anyone has any experience on this front

coral wasp
#

Maybe explain what you're trying to do, what kind of data it is, etc.

torn sphinx
coral wasp
#

oh, so graph data?

torn sphinx
#

like as long as it could return a query within 10 seconds then that's fine tbh and it doesn't need to update the data quickly

#

no it's just user profiles. Like imagine parsing every instagram profile in existence or something like that (but for another website) as well as relationships like friends, profile summaries, avatar history, etc.

coral wasp
#

My usual thought process for this type of problem is first "Can I do it with parquet files?" (hive partitioned ofc)

#

Then second: Can I fit it into memory? If so, duckdb (is my personal choice)

torn sphinx
coral wasp
#

Do you know how well this data will compress?

#

If you don't, I would create a test case (like 1 million records into a parquet file)

torn sphinx
#

no i just sorta calculated the end would be like 4 - 5 tb once everything's indexed etc in a typical sql database

coral wasp
#

Yah but you wouldn't use a typical sql database for this. You'd use an OLAP / columnar database, for sure.

#

This is static data, right?

#

Or are you appending? Or are you modifying?

torn sphinx
#

there's multiple tables but there's say like 10+ billion relationships that tells you the date when x and y user were last seen as friends. New records would be added and old records would be updated with more recent timestamps like last_seen

And maybe 1 bill profile records with info such as summary, avatar hash, and whatnot, that might also be updated

#

i'm guessing it'd best to like store the relationships in a sql database but then individual profiles in parquet files or some other way

coral wasp
#

You'd just put everything in a columnar store.

#

Oh, sorry, read closer: new records added / old records update. I dunno, you'd have to test... if it's constantly updating, you might be better off with a key-value store, rather than a rdbms

torn sphinx
#

do you think it'd be ok to buy like one of those 32cpu 256gb ram servers and then i'd mostly just worry about electricity costs and network stuff

#

idk that much about databases

coral wasp
#

Candidly, I think you're trying to run before you crawl. Get more data before spending $$.

#

Do some experiments on a smaller dataset.

#

Parquet, for instance, is especially nice when you can accumulate the data. ie: having a daily file for changes, etc.

#

Or, when you can do something like run a periodic job to update the "last seen" value... rather than computing it in real time.

torn sphinx
#

yea there's also definitely ways i can optimize the db to make it smaller but i'll test it more later, ty

coral wasp
torn sphinx
coral wasp
#

But, a 32core / 128GB system would cost about $3k usd.

torn sphinx
#

granted they're used tho and stuff

coral wasp
#

(plus GPU)

torn sphinx
#

oh mhm i haven't rly bothered to check the gen of the specs or anything, i think i even saw like 512gb ram 44 cpu for $2400 but that was without hard drives or a gpu and stuff

#

i'm assuming u need a bunch of ram to query rly large databases

storm mauve
#

if you index it properly, it shouldn't require that much RAM I believe?
I'd be more worried about the disk usage personally

coral wasp
torn sphinx
kindred merlin
#

Hi all this is more of a mysql python scripting issue.

But currently with this function, I get an addition of extra quotes for the parameters I pass in:

    cur.execute("CREATE DATABASE IF NOT EXISTS `%s`;", (db, ))```
Example: ```createDB('supplies')```ends up being `''supplies''`, I've done this with f-strings but I know that it's best practice to avoid using string formatting... does anyone have any idea how I overcome the addition of extra quotes?
coral wasp
#

It’s that the db is adding quotes already when you use a parameter. I believe you’d just drop the quotes around the %s

#

Im not 💯 sure tho about mysqls support for parameterized database names: I understand the quote issue but was a little surprised this even worked.

kindred merlin
#

If I take off those `` i get "mysql.connector.errors.ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''chatgpt'' at line 1"

#

with them CMySQLCursor: SHOW DATABASES; ("'chatgpt'",) ('information_schema',) ('mysql',) ('performance_schema',) ('sys',)

coral wasp
#

Yah, see, that’s what I -expected- would happen. Usually you’d need to prepare a dynamic statement or something

#

So, i am surprised your first thing worked at all

kindred merlin
#

How would a dynamic statement be formated?

kindred merlin
#

Oh I see, and I would would have to set the conn.cursor(prepared=True) and then setup the dynamic sql phase.

#

In your experience would there be a better way of doing this? Other than dynamic sql? Maybe use cursor.execute in a different way? like more directly?

coral wasp
#

Most dbs I work with don’t play well with dynamic table or column names. Query parameters are meant for values.

#

This is a common request/problem

kindred merlin
#

Right, so the process of architecting the DB should be done with specifying values in the sql instead of passing them in. Sounds annoying xD

coral wasp
#

Very annoying indeed. It’s such a common problem: wanting dynamic table names.

kindred merlin
#

I guess I could always just run the sql from a file, specific to each use case.

#

and use the function to call that specific action. Hand change it if I need to.

paper flower
kindred merlin
# paper flower Why do you need to create db dynamically?

Im new to db creation, so I thought it would be more convenient and possibly efficient if I could run most of what I need using functions that I can change on the go If need be, especially If I'm linking it to another python script that might pass in different values.

#

Im not really used to setting up scripts for Database creation, and I want to have an understanding of how it works at scale and eventually fit for production.

paper flower
kindred merlin
# paper flower If we're talking about a typical app usually a developer or a devops person, who...

Seeing as This is more of my own project, I will be doing that myself, I think you gave a good point out to the approach I should take for planning my db and structuring it for implementation. How would you go about setting up logical/physical db? and also how do you setup the script for it? How do you automate the sql for it to be easily integrated into the whole program? Hopefully I am not asking too many questions 😛

obtuse magnet
#

could use an ORM or write raw sql for ur app to interact with the database

kindred merlin
# obtuse magnet https://dbdiagram.io/ to plan ur database

Thanks, I did consider using such tools, but didn't follow through yet. I believe MySQL comes with Workbench tool which does that. I was trying to get the functions coded before I setup the database, but that might be the wrong order... : )

paper flower
#

You just create a db and use some kind of migration tool to create all the tables and other db object

#

e.g. alembic

#

When time goes on you don't need to think about designing a db/tables that much becaues you already have experience with that, unless is's some kind of complex relationship, so you just write/create them

#

For example:
I need to store authors and their books: well, that's simple one-many relationship, create author and book tables, add author_id to books, done
Turns out there could be co-authors, well, add a third table to facilitate many-to-many relationship, migrate existing data with simple sql, you're done 😉

kindred merlin
kindred merlin
paper flower
#

Yeah, you should be able to create a copy of your production database schema locally, and you shouldn't make any change manually, so it's reproducable

kindred merlin
paper flower
#

What do you need exactly?

kindred merlin
#

I like how you go to the point with it

#

Was just thanking you for the example lol 😛

paper flower
#

Ah, I see 😅

#

I thought you wanted me to provide a concrete example (e.g. code) so I was a bit confused

#

I usually use sqlalchemy as an orm and alembic to handle migrations, it can generate migrations for most of the changes in sqlalchemy models automatically

#

But some things it doesn't handle well, e.g. column renames, it would think that you deleted and added a new column, you have to modify such migrations manually

#

You can take a look at django orm too, but I don't really like it

kindred merlin
#

Oh interesting, and here i thought the ORM was for visualization purposes (literally seeing it).

paper flower
#

Haha, no

kindred merlin
#

Thanks for all the info, I'm gonna read up and document what I need for a working framework for understanding this process!! Have a good one pithink

paper flower
#

sqlalchemy itself is quite easy to bootstrap

#

But you may need to fiddle a bit with alembic

kindred merlin
#

I'm still not at that stage, but def will keep it in mind

median marsh
#

Hi All,
I've a query regarding an issue we've been facing lately in our MongoDB v4.4.18 cluster running over AWS EC2 instances and managed by Cloud Mongo service.

We are running with a Primary, 2 RR and 1 delayed hidden. The delayed hidden in 6 hours behind current time.

Lately, the replication lag between our delayed hidden instance and primary has gone over 300+ secs 2-3 times.

Once, the issue was due to our EC2 volume configuration, it was not able to handle the throughput of incoming data.
We checked it with AWS and fixed it and it went to normal.

2 weeks later, we again start getting the alerts of lag and the metric which we checked before for volume was normal.

So we don't have any clue what could be the issue here. Any open ideas?

queen rose
#

hey guys

#

I know sqlite3 has some JSON support, do I need to install the extension or how does it work when using it trough Python?

slender atlas
#

SQLite has no JSON data type

#

It's a bad design choice anyway for SQLite

queen rose
#

oh interesting

slender atlas
#

Cool, good to know

#

It seems it uses TEXT

#

Just like datetime functions

queen rose
#

ok, so I'm rewriting an API that deals with a graph, and the most common operation I need to do is find the descandants of a given node, so instead of storing two columns as parent: str, children: list[str] I will simply store the parent once for each child, some thing like this:

create table parent_child (
  parent integer,
  child integer,
  constraint parent_child_pk primary key (parent, child)
);

insert all
  into parent_child values (1, 2)
  into parent_child values (1, 3)
  into parent_child values (3, 4)
  into parent_child values (3, 5)
  into parent_child values (2, 6)
  into parent_child values (7, 8)
  into parent_child values (7, 9)
  select 1 from dual;
commit;

then I can use smth like this query to navigate it

with descendants (node) as (
  select 1 from dual -- root
  union all
  select child from descendants inner join parent_child on parent = node
)
select node from descendants order by node;
#

what do you guys think of the design? I plan on using an in process SQLite3 DB

gusty jolt
queen rose
#

can I index a column with duplicate values in SQLite3?

grim vault
#

Yes, just don't use the UNIQUE keyword on index creation.

pine saddle
#

How long is altering a table to add a foreign key in one of your columns after you created an index for that column suppose to take? It’s been processing for 10 minutes so far.

slender atlas
#

I think altering a table never takes that long

storm mauve
#

depends on how large the tables are and which database you're using I think?

queen rose
queen rose
#

I tried this query and it work great, until it finds a closed loop.
is there a way I can modify this query to keep track of the nodes that were already seen:

with descendants (node) as (
  select 1 from dual -- root
  union all
  select child from descendants inner join parent_child on parent = node
)
select node from descendants order by node;
slender atlas
queen rose
#

I had a MySQL table that had more than 50GB of data, and when I try to drop it and it took a few minutes and eventually the DB crashed...

queen rose
#

I dont want to use Python 😭

wild prairie
#

Hi guys, i'm using python firebase admin sdk and uploading csv file to firebase storage, now i'm trying to store the public downloadble and also setting that url in user collection document... but when i tap on the generated link, it throws permission error: Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist).

firebase storage permissions:

service firebase.storage {
  match /b/{bucket}/o {
    match /{allPaths=**} {
      allow read: if true;
      allow write: if request.auth != null;
    }
  }
}

python code to generate public url:

from firebase_admin import storage, firestore

bucket = storage.bucket()
blob = bucket.blob(file_path)
blob.upload_from_string(df.to_csv())
ref = db.collection(f"users/{user.uid}/history")
ref.add({
  "query": str(search_query),
  "result_file_url": blob._get_download_url(storage._StorageClient),
  "created_at": firestore.firestore.SERVER_TIMESTAMP,
})
fringe sigil
#

howdy all, i was given the task of creating a very barebones ERM in python, im having a problem with getting everything onto one window, any suggestions?

crisp panther
#

Hello, what value type do I use so that sqlite doesn't use decimals? Trying to print the database that has numbers in it results in the numbers ending with .0 (those are whole numbers)

hexed estuary
coral wasp
#

Altering a table can indeed take a long time, lots of ‘it depends’ here

#

Adding a column in MySQL would result in rewriting the entire table, for instance

#

In most cases, tables are considered unordered… results can be returned in any order, unless you specify an order by. PerhPs it’s more precise to say: query results are unordered, unless order is specified in the query

queen rose
#

guys, any ideas how I can keep track of visited nodes in a recursive CTE query?

#

I wonder if its possible to create a temporary table inside the CTE query that has one column, and I simply INSERT each node I visited

fleet pebble
#

~~Why does SELECT DATE_ADD("2020-9-22", INTERVAL 2 MONTH); fail to run? I've seen this as the syntax on multiple websites? ~~

The error I get is: "Wrong arguments '2': Date_ADD() only takes TINYINT/SMALLINT/INT types as second argument, got INTERVAL_YEAR_MONTH. Looks like we use HiveQL 3sKMS

#

This SELECT DATE_ADD("2020-9-22", 2); works just fine, by adding 2 days.

queen rose
#

yoo how fu**ing clean is this:

conn.execute("""
WITH descendants (node) AS (
  SELECT 149505
  UNION ALL
  SELECT child FROM descendants INNER JOIN categories ON parent = node AND NOT node_seen(node)
)
SELECT node FROM descendants ORDER BY node;
""").fetchall()
#

and its super fast

slender atlas
#

The magic number is bugging me

queen rose
#

the crazy part is actually the Python function node_seen()

#

but I think I will still use a different method

queen rose
#

Is there a way I can optimize storage and performance if a copumn has many duplicate values?
I thought about using an ENUM but Mysql has a limit of just 64k unique values, is there an alternative?

#

Would be nice if there was something similar to pd.Categorical so I can hot encode the column

paper flower
#

If there are 64k+ unique values it certainly shouldn't be any kind of enum

brazen charm
#

also, this is why your database has compression... The compression algorithms will almost certainly be removing all the duplicate data if it is sane to do so

queen rose
coral wasp
queen rose
#

I suppose I will store the INTEGER to STRING mapping in a separate table, and in the other I will just keep the IDs/integers

coral wasp
#

Yah, and parquet files too… column orientation unlocks columnar compression

queen rose
#

I need indexed columns tho, with O(1) lookup

#

I dont think there is a way to avoid a full-scan with parquets

coral wasp
#

And row groups have pages/etc.

#

But, columnar comes with other tradeoffs, and I was just pointing out the diff.

queen rose
#

I'm aware of that, also file metadata can help to know if to scan that file ...

#

but what I'm doing is too performance intensive, graph traversal

#

so I will manually hot-encode the values

paper flower
paper flower
#

The most optimal type would be something like int32/int64, check how much storage you'll save and if it's worth it

#

Also how many records do you have?

queen rose
paper flower
#

And how many records do you have?

#

Mybe just use an int32 + foreign key if you can 🤷

paper flower
queen rose
paper flower
#

How many rows?

queen rose
#

I'll use unsigned int32 then

#

idk, why?

paper flower
#

Because if it's, say 200k rows it's not work optimizing at all

queen rose
#

ofc

#

its at least 10M

paper flower
#

You don't know how much data you have and yet you want to optimize something?

#

If we assume average size of your string is 10 chars, 8 bytes each, then for 10m rows it would be roughly 700mb of storage, but I'm not aware of how db would actually store it

#
>>> 8*9 * 10_000_000 / 1024 / 1024
686.6455078125

Just think if it's worth optimizing for or not 🤷

#

You could definitely use a foreign key here but if you'd need to filter by name you'd have to join to a different table

#

Which would be slower

paper flower
#

Probably my mistake, should be just 9?

#

Assuming utf8

queen rose
#

yeah utf8

paper flower
#

so just 9, if your awerage str length is around 10

#

85mb for 10m rows

queen rose
#

I'll use integers, I just gotta check how to configure that with SQLalchemy

paper flower
#

Don't 😅

#

Do you actually need those string values?

#

If yes - where would you be storing them?

queen rose
paper flower
#

what other columns do you have?

#

Also db could have been not truncated too

queen rose
#

anyway dont worry, I like optimizing

worldly hull
#

hello

queen rose
#

does anybody know how to define an int32 with sqlalchemy?

paper flower
#

Just Mapped[int] should work

#

If not - override it with Integer

queen rose
#

god, I hate how SQLalchemy requires a PK in each table

wise goblet
#

With your desired default pk and type

queen rose
queen rose
#

I want some tables without any PK

storm mauve
#

make all columns into a composite PK

queen rose
#

but I dont want to spend that extra storage?

#

why should I have to have a PK if I dont use the ORM interface

#

its very annoying

nova forge
#

Are there any library which allow us to load a mysql database in redis?

paper flower
#

pk could be "virtual", but yeah, it requires it

#

does not have to be an actual pk though

#

Oh, etrotta already linked the relevant docs 😅

fading patrol
elder elk
#

Hello, I'm trying to combine two crosstab in postgres but stuck it will be great if someone guide me

#
-- rainfall CROSS TAB
SELECT * FROM
CROSSTAB(
'SELECT block_id, forecast_date, rainfall_mm
 FROM weather.imd_forecast 
    ORDER BY 1'
)
AS rainfall_crosstab(
    block_id int, rainfall_day_1 numeric, rainfall_day_2 numeric, 
    rainfall_day_3 numeric, rainfall_day_4 numeric, rainfall_day_5 numeric
);
    
-- temp_max CROSS TAB
SELECT * FROM
CROSSTAB(
'SELECT block_id, forecast_date, temp_max_deg_c 
 FROM weather.imd_forecast 
    ORDER BY 1'
)
AS temp_max_crosstab(
    block_id int, temp_max_day_1 numeric, temp_max_day_2 numeric, 
    temp_max_day_3 numeric, temp_max_day_4 numeric, temp_max_day_5 numeric);

#

my aim is to make one crosstab of rainfall and temp

queen rose
#

so I benchmarked my graph traversal with MySQL and SQLite, and with SQLite it only took 1.2 seconds to scan the whole graph from the root-node, while with MySQL, with the same exact data and query it took 10 seconds

#

10X slower

#

SQLite really is the GOAT db

#

after adding indexes it only takes 6 seconds, still 5x slower

brazen charm
#

That is pretty suspicious

#

but also idk why you are using SQL for graph traversal unless you are creating connected components

#

Graphs in SQL is a hugeeeeeeeeeeee inefficient mess

wise goblet
# queen rose so I benchmarked my graph traversal with MySQL and SQLite, and with SQLite it on...

There is modern shiny this lib to have it working nicely.
https://github.com/feincms/django-tree-queries
U could try this instead of whatever you are doing

GitHub

Adjacency-list trees for Django using recursive common table expressions. Supports PostgreSQL, sqlite, MySQL and MariaDB. - GitHub - feincms/django-tree-queries: Adjacency-list trees for Django usi...

#

this solution is optimal from the point of writing at least
while having nicely clean.... from the point of running sql code for reading

#

There are also alternate solutions which are very optimized to reading operations... but they are...mmmm in a very messy state.

neon quarry
#

As someone who's been developing restful apps for quite awhile, how/where do I start learning about graphql?

queen rose
#

how trivial is it to close a cursor (not connection) when working with a database?

brazen charm
#

very

#

generally a cursor is per-operation

#

a db connection is not

zealous spire
#

!rule paid

delicate fieldBOT
#

9. Do not offer or ask for paid work of any kind.

crisp panther
#

Is there a way to remove commas on integer when I fetch the database? When I use a command to get a database the output is like this

[(123,)]

#

that might be the reason why a certain function isn't working but I'm not too sure

storm mauve
#

that is

  • a list containing a
  • tuple containing a
  • integer
    you could use thing[0] to select the first element of the list, or thing[0][0] to select the integer in that case
crisp panther
#

how xD

#

probably something with the add command on my program

blissful linden
#

Hey all friends! I want to grab the coordinate data of a large number of data points from a graph, but my graph is not complete, I need to do mathematical fitting to get the missing data points and complete the curve, and I want to import the coordinate data into Excel or a database for subsequent point fetching tasks. Can this be done using Matlab OR Python and what tools or libraries do I need to use?

torn sphinx
#

Can I use sqlite on production?

torpid chasm
#

If anyone here is a MariaDB user, I could use your help! I'm running a bot that uses a MariaDB MySQL database connection, but it's not very active, so it times out rather frequently. Is there a good way to prevent this happening? I've thought of checking to see if the server is active, and if not, reinstating the connection, but I don't know how to check whether or not the connection is still active. Any help is appreciated!

slender atlas
paper flower
fading patrol
# torn sphinx Can I use sqlite on production?

I work with a legacy customer who does and it's been running that way for years but a) it's a fairly low-traffic app and b) maintenance is pretty annoying and time consuming.

So depending on the use case I'd say you might get away with it, but would avoid it if possible

radiant lichen
coral wasp
shrewd turret
#

guys, im having a column in a table for storing email ids
for that, i've decided to add a constraint to check whether they match the email regex before inserting the data
is this ok to do? like, im just validating the input one step prior
is it really worth it?
im using postgres and this is the regex ^(?!\\.)[\\w.-]*[^.]@\\w+(?:\\.\\w+){1,2}[^.\\W]$
but another problem is, its not working
pls ping me on reply

torn sphinx
#

Hi guys

#

How can I use sympy ?

glacial current
#

Does anyone know if catch an sqlalchemy error like this one
<bound method SQLAlchemyError._message of DataError('(pymysql.err.DataError) (3263, "NULL supplied to NOT NULL column \'future_releases\' at row 1")')>
How do i actually use that obj to get at the message itself "NULL supplied to NOT NULL column \'future_releases\' at row 1" ?

glacial current
glacial current
pine saddle
#

Hello. I would like to get rid of the escape characters format in my database. They were updated like this in Python but I changed the code to stop that from happening. Now I would like to change the formatting of the JSON string in the table. How can I do that?

pine saddle
#

I tried ```sql

SELECT *
FROM jb_projects
WHERE JSON_UNQUOTE(JSON_EXTRACT(obj, '$')) LIKE '[______,%';

#

It says


[22001][3140] Data truncation: Invalid JSON text: "The document root must not be followed by other values." at position 4256 in value for column 'jb_projects.obj'.
wise goblet
#

We can calculate if object has children as

SELECT 1 AS "a" FROM "zone_zone" WHERE "zone_zone"."parent_id" = '728046be-0df2-432c-0001-ac781965844f'::uuid LIMIT 1

How to rewrite it in a good way to be optimal for postgres for multiple zone objects in bulk to check?
Some kind of window function i would guess to apply? 🤔

#

or alternatively we could go with just subquery perhaps

#
SELECT *, (SELECT 1 AS "a" FROM "zone_zone" WHERE "zone_zone"."parent_id" = ZONE.PK::uuid LIMIT 1)
FROM ZONE
WHERE ZONE IN desired list of PKs

As a dumb way i can think of 🤔

#

(i am trying to affirm that zone is not having any children (parent_id is a thing from adjucency list to represent graph)

#
SELECT *,
       count/exists() OVER (PARTITION BY zone_zone.parent_id)
FROM zone_zone
WHERE ZONE IN desired list of PKs

something like this may be possible to write

#

or it is less efficient than handling as a subquery above 🤔

Essentially i am trying to write query: SQL check if foreign key exists for every row

coral wasp
#

for example: sql with mydata as (select 'somejsontext' as obj) select * from mydata where JSON_UNQUOTE(JSON_EXTRACT(obj, '$')) LIKE '[______,%';

wise goblet
#
SELECT z.pk
FROM zone_zone z
WHERE NOT EXISTS (SELECT 1 AS "a" FROM "zone_zone" WHERE "zone_zone"."parent_id" = z.pk::uuid LIMIT 1))
#

in theory perhaps should be efficient way to check SQL for every row, check that no row has parent id belonging to it

#

subquery way with abusing EXISTS to maximum

coral wasp
pine saddle
coral wasp
wise goblet
pine saddle
coral wasp
pine saddle
coral wasp
#

If you just want to remove 8 characters, why json_extract/unquote? You could just take the substring.

wise goblet
coral wasp
pine saddle
coral wasp
#

Replace somejsontext with an example of your JSON text.

wise goblet
pine saddle
coral wasp
#

Well, that's easy. '[' is invalid JSON. Give it valid JSON.

pine saddle
#

If ‘[‘ isn’t JSON, then I can’t remove the first 8 characters that are in the obj column

coral wasp
#
with jb_projects as (SELECT '{"a": 2}' as obj) SELECT * FROM jb_projects WHERE JSON_EXTRACT(obj, '$');
#

If you just want to remove the first 8 characters of a string, then: ```sql
select substring('123456789abcdef', 9)

#

JSON_EXTRACT requires the input to be valid JSON.

pine saddle
#

I’m trying to remove “[______,” that are in my obj column. I have some rows like this in my obj column.

coral wasp
#

Use substring to remove the text you want to remove.

pine saddle
#

Ok

#

My goodness.

[22001][3140] Data truncation: Invalid JSON text: "The document root must not be followed by other values." at position 4256 in value for column 'jb_projects.obj'.

My query

UPDATE jb_projects SET jb_projects.obj = SUBSTRING(obj, 9) WHERE obj LIKE '[______,%';
thorny anchor
#

let me know if i understand the problem correctly. you inserted a list [a_number, json_string], and you want to just change it to be the json object that json_string represents

glacial current
coral wasp
#

Just open the data in VSCode, in a .json file. and see what errors pop up.

#

I assume this particular error is because you didn't remove the trailing ']' after removing the leading '[', but who knows what other issues there are.

thorny anchor
#

the original question didn't make sense to me. the error wouldn't have been caused by escape characters, rather trying to insert a string with 2 json objects concatenated together i think

coral wasp
thorny anchor
#

it should be valid json though, right? as long as the string was escaped properly

coral wasp
#

Nah, it should look like this https://dbfiddle.uk/vE_RIkoa: ```sql
WITH jb_projects AS (
SELECT '{"id": 367603, "name": "name_goes_here", "short_description": null, "custom_id": "12345", "version": "1.4"}' as obj
)
SELECT JSON_EXTRACT(obj, '$') AS extracted_id
FROM jb_projects;

#

So, need to extract the second element from the list, and also unquote it

thorny anchor
#

i'm a bit out of my depth (breadth?) with mysql though

coral wasp
thorny anchor
#

i think there are some good use cases, but it's very icky from a theoretical perspective

stone blaze
#

hi guys what is this error:

sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 7932089088 and this is thread id 6135279616.
pine saddle
#

That’s why I’m trying to remove it.

#

columns got updated that way with Python

coral wasp
pine saddle
#

I will also want to remove the escape characters

coral wasp
pine saddle
#

But I have been told that with escape characters, it’s valid JSON syntax

thorny anchor
#

the error was not pointing to an invalid string, it mentioned that you have two json objects in the column

pine saddle
thorny anchor
#

i'm not sure. i don't know how you would do it in mysql, and i'm honestly not sure how you would do it with python

coral wasp
#

You can substring and remove the leading and trailing character(s). That why I suggested copying to vscode and seeing what’s required to make the string valid json

thorny anchor
#

yeah if it's just one entry, then you can just do it manually

coral wasp
thorny anchor
#

yeah

shrewd turret
glacial current
#

chat GPT suggests this type

CREATE TABLE employees (
    id serial PRIMARY KEY,
    first_name varchar(50),
    last_name varchar(50),
    email varchar(100) CHECK (email ~* '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$')
);
#

otherwise both your regex and chatGPT regex seem to work on my tests with re.match

shrewd turret
#

i'll try this and report it later

glacial current
#

thre are a few forms. but yours worked for a few i tried but as the stackoverflow showed a+b@whatever.com is also a valid mail address. so you possibly could just test the entries with re.match(patt, email) on all entries in the DB to see if it holds up

shrewd turret
#

guys, how can i encrypt the entire postgres db at rest?

shrewd turret
glacial current
#

but its not very good

#

re.match(r1, 'a.bc@domain.com')
<re.Match object; span=(0, 15), match='a.bc@domain.com'>

shrewd turret
#

did u try that in postgres?

#

it didnt work for me

glacial current
#

no i dont use postgres. did the stack overflow help ?

shrewd turret
shrewd turret
#

guys, im thinking of encrypting the whole db at rest using pgcrypto to encrypt each column
but i read somewhere that if there's any error occurred while running the sql query, and if the error statements gets logged with the query, it'll expose my secret key that i use for encryption/decryption
so, is there any better way to encrypt the db?

shrewd turret
#

guys, ive decided to make a personal password manager
and i've also decided to store all the info that i get from a site, like username, api keys, even the stuff like recovery codes, etc. in it, in a separate table
im creating a table for each one like this

CREATE Table IF NOT EXISTS emails (
  email_id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY NOT NULL,
  email TEXT UNIQUE NOT NULL
  CONSTRAINT valid_email
    -- CHECK (email ~* '^[a-z0-9_-][a-z0-9._-]*@[a-z0-9_]+(\.[a-z0-9_]+){1,2}[a-z0-9]$')
    CHECK (email ~* '^([^<>()[\]\.,;: \r\n\t@"]+(\.[^<>()[\]\\.,;: \r\n\t@"]+)*|".+")@(([a-z0-9-]+\.)+[a-z]{2,})$')

);

CREATE TABLE IF NOT EXISTS sites (
  site_id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY NOT NULL, 
  email TEXT,
  site TEXT UNIQUE NOT NULL,
  username TEXT,
  CONSTRAINT fk_email
  Foreign Key(email)
    REFERENCES emails(email)
    ON DELETE No ACTION
);

CREATE TABLE IF NOT EXISTS api_keys (
  api_key_id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY NOT NULL,
  site TEXT NOT NULL,
  apikeys JSON NOT NULL,
  CONSTRAINT fk_site
  Foreign Key (site)
    REFERENCES sites(site)
    ON DELETE NO ACTION
);

CREATE TABLE IF NOT EXISTS recovery_vault (
  recovey_key_id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY NOT NULL,
  site TEXT NOT NULL,
  recovery_data JSON NOT NULL,
  CONSTRAINT fk_site
  Foreign Key (site)
    REFERENCES sites(site)
    ON DELETE NO ACTION
);

now the question is, is the schema for creating a structured relational database is good here?
im keeping emails, sites, pretty much a separate table for each attribute
but now the problem i see is, if i wanted to insert a new entry in sites table, or in apikeys table, and the email id doesnt exist, then i have to run several queries to update the whole db
plus im doing full db encryption
what should i have to do here?

#
  1. just store everything in a single table, and fill null to empty values
  2. make separate constraints to add values to email table if not exists
  3. just make those extra queries (again, i've done a full db encryption)
wise goblet
queen rose
#

guys, should I create my own graph database in Rust even tho I've never used a proper graph db like Neo4J ?

mortal crescent
#

very simply and probably dumb question
what is the point of a database, or where is it useful

queen rose
mortal crescent
#

okay, can you give an example of such?
i see how in huge scale things such as a search engine or hosting sites need that, but what else

queen rose
#

for example if you have an e-commerce website you want to store all the products that you have avaialble in a database (with their stock, price, description, etc.), you might also want to save some data on your customers; whenre they live, whats their email, how many orders have they made, etc.

mortal crescent
#

okay

#

so mostly fairly large scale projects, right?

queen rose
mortal crescent
#

okay

queen rose
#

I mean it depends by what your doing ..

thorny anchor
mortal crescent
#

Ah okay, ive done that a bit by litteraly changing the code lol

glacial current
# mortal crescent Ah okay, ive done that a bit by litteraly changing the code lol

I will point out too, that there are many ways to get at data in a database. All programming languages support it in some way, and can intercommunicate though the database on that data. Some tasks can be updating and others displaying. Most databases have a common language SQL that in itself is very powerful data hunter that a few words can almost always, get you the answer. SQL is the single most important aspect of the top databases. On top of all that people have developed ORM which is a way to access all that data, with class objects that most languages support, making it a compelling choice of data storage even for a small project.

mortal crescent
#

Okay

glacial current
# mortal crescent Okay

And today Python ( which i assume you know python ) which is rather slow, is becoming the center of data analysis because its easy to learn. But because its a slow language, people use databases to get results and update data faster. So, if you get a job doing anything accounting or data related, chances are someone has a database doing a critical job in the middle. So its important enough to learn.

coral wasp
glacial current
#

The python is slow. yes it is. getting several million rows from a database could introduce 3 seconds if you grab those rows with a pure python api. you can chop that down to a thenth of a second if you grab the rows from a call to the db with a the c client mysql -u user "select * from whatever" so what i have found is that you do as little work in python itself. Its the same code but you jsut made an improvement to you code by taking the workload off python

#

that is the patterrn over and over that i see on a day to day basis with python devs that i work with

slender atlas
#

Depends on the data

glacial current
#

when you are joining 10 million rows to a billion .. definitlly doing that in sql not in python

slender atlas
#

I wouldn't want an SQL database for a fluid simulation

#

as an example

#

I wouldn't use an SQL database for training a neural network

#

as another example

#

Every program manages its data in the best suitable way, which isn't always in an SQL database

#

Tables are suitable when you want to use table records to begin with

#

Also, SQL tables can have constraints and relations to other tables

#

It's just a data management system which is suitable for one scenario

coral wasp
glacial current
# coral wasp Then you are agreeing with me: database choice and language choice have very lit...

I believe it was you who raised the disagreement with my observations over the years that a database becomes the core of many systems written in many different languages. At the heart of all those systems is a common language SQL. You can say there is a side case that dosnt match. Fine. 99% of all systems i have seen meet the SQL standard. Perl used to be the Python 20 years ago. they come and go. SQL is still here at the heart of those systems. that was my point.

coral wasp
glacial current
#

The python is slow business is not so much important for me. There are times when im told "this is too slow, make it faster" which inevitably comes down to doing some trick of getting the work done externally to python and taking delivery of those results. a very simple example would be if i have a 400Meg binary and i need to copy that. Its definitely too slow to do a open("file").read() then write that file out.. instead in python you make a file system call to cp file file2. im excited to see python 3.11 and 3.12 trying to make it faster though.

fallen vault
#

How strict are data types in Postgres when passing data from Python? For example, if I have a date data type column in Postgres, can I pass the date as a string from Python?

wise goblet
#

STRING::DATE or something is available to cast value from string to date in sql

fallen vault
#

What if I put everything down as a string and convert it in python lol.

#

Mostly as a quick solution for a proof of concept. I signed up for CS50 last night and plan on learning Sql from start to end.

wise goblet
#

why would u wish hack strict static typing into Serialized Strings?

#

you could have not used Postgres then, and just used Sqlite3 then

#

you are creating yourself pits to fail on your own with this decision

fallen vault
#

I’m using SQLite3 currently but have been asked if I could make a mock software where the information can be accessed from anywhere. I want to get it working to prove I can do it and use the extra time given to rewriting for the new version to learn Postgres correctly.

paper flower
#

Because queries should look basically the same

fallen vault
#

Kinda. I currently don’t have extensive data validation as I’m racing someone else to get this done lol. SQLite doesn’t care it seems.

paper flower
#

I think postgres would try to coerce string values but I'm not sure

#

You can always explicitly cast them

#

Also, you could use an orm

fallen vault
#

Not sure how to do that but will definitely be looking into it, either when this goes well or whenever I finish cs50.

fading patrol
fallen vault
fading patrol
fallen vault
storm mauve
#

uh, if you only care about reading you could literally copy/paste the file, but if you want for multiple people to be able to edit it and keep it synced, definitely use a "proper" database like postgres or mysql instead of sqlite

waxen finch
#

see also https://sqlite.org/whentouse.html

... SQLite will work over a network filesystem, but because of the latency associated with most network filesystems, performance will not be great. Also, file locking logic is buggy in many network filesystem implementations (on both Unix and Windows). If file locking does not work correctly, two or more clients might try to modify the same part of the same database at the same time, resulting in corruption. Because this problem results from bugs in the underlying filesystem implementation, there is nothing SQLite can do to prevent it.

A good rule of thumb is to avoid using SQLite in situations where the same database will be accessed directly (without an intervening application server) and simultaneously from many computers over a network.

fallen vault
#

Alright. Get Postgres “functional” and worry about doing it correctly when I get full approval.

queen rose
#

also SQLite has a STRICT statement for creating strongly typed tables

tiny hearth
#

What is the best method of handling database migrations for a containerized python application? Should the migrations be ran when the container is initialized or can/should they be automated externally? Examples appreciated. Thank you.

fading patrol
# tiny hearth What is the best method of handling database migrations for a containerized pyth...
raven raft
#

If anyone has experience with PySpark/Spark, I have a question in #data-science-and-ml that people here might find interesting

somber ember
#

is it normal to be angry at a person who inserted
"Map(key->value)" as a single string value in database? my query is flooded with regexp_replace, because i need to use presto sql.

wise goblet
somber ember
#

Me: yes i finally fixed "Map(key->value)" to the format I need it to be. Feels great!
5 mins later...
Also me: nooooooo who tf put these data. What the hell is "Map(val1->val2->val3->val4)" or "Map(val1->val2->val3, val4->val5->val6)" or "Map(val1->val2, val3, val4->val5->val6)"?!

marsh ledge
#

If you do not wish to click the link, here's a screenshot:

#

For some reason, the data set returned is a simple dictionary with only the first database column requested.

waxen finch
marsh ledge
#

Hm, if I change the query to look like the one below, it works fine but returns columns I do not want:

query = select(User).order_by(User.firstname)
waxen finch
#

https://docs.sqlalchemy.org/en/20/orm/queryguide/select.html#selecting-orm-entities

When selecting a list of single-element rows containing ORM entities, it is typical to skip the generation of Row objects and instead receive ORM entities directly. This is most easily achieved by using the Session.scalars() method to execute, rather than the Session.execute() method ...
so if you wanted to your query to be select(User) it would make sense to use the scalars shorthand, but otherwise it would throw away the rest of your columns

marsh ledge
#

Man, I badly want this to work so I can use Python for APIs but NodeJS is so much easier (maybe perhaps because I have been writing it for so many years). We're using Python in many other ways with Lambda functions but not in any real APIs. I just cannot seem to grasp this. I tried changing it to session.execute(query) and it just throws up everywhere.

waxen finch
#

did you still want to retrieve the results as a list? cause .all() exists for Result too

marsh ledge
#

Oh wait, maybe it does work. It's throwing up trying to serialize the result from what I can tell

#

Ah, it isn't returning key value pairs

paper flower
#

Also rows should be named tuples iirc

marsh ledge
marsh ledge
# paper flower Why don't you select an orm object?

Do you have an example? The documentation is pretty large and to be honest, it seems like there's 10 ways to do the same thing but for some reason, it doesn't work for me. For example, there seems to be multiple ways to define a table model. For what reason?

paper flower
#

And there's only one way to define a model right now

marsh ledge
#

Okay, now I am really confused. I already mentioned if I define the select like that, it returns all the columns for each row in the database, which I do not want. I do not want to return the password column which means I have to select which columns I want returned. Right now, that works however it is returning a tuple instead of dict.

paper flower
#

Forgive me my formatting, I'm on a phone

marsh ledge
#

No worries, I appreciate you trying to help from your phone. That's definitely an interesting concept but can get real ugly, in my opinion. I am trying to find a more elegant way of doing it based on the documentation but nothing seems to be working and most search results are for SQLAlchemy versions prior to 2.0

paper flower
#

I remember there was "only" function too

paper flower
marsh ledge
#

With SQLAlchemy, there's actually two ways to define tables: directly by using the Table constructor, or indirectly by using ORM Mapped classes.

#

The result I am looking for is this:

[
  {
    "created_at": "2023-01-01 00:00:00",
    "email": "johnny@appleseed.com",
    "firstname": "Johnny",
    "id": 1,
    "is_active": 1,
    "is_admin": 0,
    "lastname": "Appleseed",
    "updated_at": "2023-01-01 00:00:00",
    "uuid": "FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF"
  }
]

If I use select(User).order_by(User.firstname) I get that exact result but with the password column, which I do not want.

#

If it's important, I am using FastAPI and have Pydantic models defined which I was expecting the model to be applied and the columns not defined in the model to be stripped but that doesn't happen. Haha

#

I know, I have been told in the past that FastAPI + SQLAlchemy + Pydantic is overkill and difficult to learn and I should use something like Django instead because it has tons of features built-in but I like doing things the hard way and Django has massive overhead.

paper flower
#

Orm model defines, well, a model and a table underneath

marsh ledge
#

So, from my understanding, I am using ORM Mapped Classes. I could be wrong though.

paper flower
#

You can attach your own table instance to it if you want too

marsh ledge
#

This appears to be the Table constructor, which I am not using:

from sqlalchemy import Table, Column, Integer, String
user_table = Table(
    "user_account",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(30)),
    Column("fullname", String),
)

I am wondering if I should be, though.

#

I tend to see a lot of references to a column like this user_table.c.name in the documentation which is not available when using mapped classes

paper flower
#

You don't have to use it really, sqlalchemy can and does create a table automatically for you

marsh ledge
#

I definitely like the mapped class approach over the table constructor.

paper flower
#

@marsh ledge I think using Column in orm class definitions is deprecated, you should use mapped_column

marsh ledge
#

Yeah, I changed it. My class now looks like this:

from datetime import datetime
from lib.database import Base
from sqlalchemy import String, DateTime, Boolean
from sqlalchemy.orm import Mapped, mapped_column


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(primary_key=True)
    uuid: Mapped[str] = mapped_column(String(36))
    firstname: Mapped[str] = mapped_column(String(32))
    lastname: Mapped[str] = mapped_column(String(32))
    email: Mapped[str] = mapped_column(String(128))
    password: Mapped[str] = mapped_column(String(128))
    is_admin: Mapped[bool] = mapped_column(Boolean(False))
    is_active: Mapped[bool] = mapped_column(Boolean(True))
    created_at: Mapped[datetime] = mapped_column(DateTime)
    updated_at: Mapped[datetime] = mapped_column(DateTime)
#

Which to me, is clean and understandable.

marsh ledge
#

Thank you, I appreciate it. I am going through the unified tutorial in the documentation and will have a look at those as well.

paper flower
#

So you would use something like

stmt = select(User).order_by(User.firstname).options(defer(User.password))
#

And probably pass a raiseload to it too, if you don't use asyncio with sqlalchemy

#

i.e. defer(column, raiseload=True) if I'm not mistaken

paper flower
marsh ledge
# paper flower Cool, I think sqlalchemy is a great library but it definitely has a high barrier...

Yes sir, I am quickly finding that out. If I can get close to mastering it, this API will pretty much write itself. I am pretty good with logic having a programming background for 20 years but in other languages so I am just struggling with new concepts and what not. I have only been writing Python for a year now and have nearly 20 AWS Lambda functions and some other code that works great but definitely still learning every day. We do a lot with GIS and drone mapping so it has definitely been a fun learning experience.

paper flower
marsh ledge
paper flower
#

Honeslty if you know how to use sql - you'll manage it with sqlalchemy

#

The API is pretty similar to sql, and it adds some helpful things on top, like relationships

marsh ledge
#

Yeah, I can definitely write raw SQL queries, I am not a master but I do well with joins and grouping and what not.

paper flower
#

And it would build a correct join condition for you

marsh ledge
#

Yeah, I have seen that in the docs, which makes complete sense. Just for some reason without using scalars(), the data set is always a tuple, even the documentation shows this 😦

#

Unfortunately, this does not work either though

            results = session.execute(select(
                User.uuid,
                User.firstname,
                User.lastname,
                User.email,
                User.is_active,
                User.is_admin,
                User.created_at,
                User.updated_at
            ).order_by(User.firstname)).scalars().all()
paper flower
#

But you mostly use scalars

marsh ledge
#

So, how do you return a dict from the database? LoL

paper flower
marsh ledge
paper flower
#

scalar and scalars just takes the first column of your select

#

If you select an orm model it would be that model 👌

marsh ledge
#

Yeah, I read that in the docs.

#

I just do not understand why returning a dict seems so difficult. Haha

paper flower
#

scalars may be impossible to use in a case like this:

stmt = select(User.id, Address.street).join(User.address)
for id_, street in session.execute(stmt):
    print(id_, street)
marsh ledge
paper flower
#

It's easier to work with

#

And you're selecting almost all fields

marsh ledge
#

But again, I am getting tuples and not dicts so I cannot reference row.firstname

#

Oh, yes I can! It does work. LoL

#

Super weird. So now, how to convert that to a JSON object / dict so I can return it from my API?

paper flower
marsh ledge
#

FastAPI?

paper flower
#

I have to go for like 1 hour though 😉

#

In case of fastapi create according pydantic model

marsh ledge
paper flower
#

And then return that from your API endpoint

marsh ledge
#

And then this is the route I am working with right now:

@routes.get(
    "/",
    summary="Get All Users",
    response_model=List[User],
    status_code=status.HTTP_200_OK,
    responses={
        status.HTTP_200_OK: {
            "model": List[User],
            "description": "OK",
        }
    }
)
async def get(session: Session = Depends(get_session)):
    result = service_user.get(session)

    return JSONResponse(
        content=jsonable_encoder(result)
    )

Just trying to return all the users in the database.

#

I wish that would just strip the columns from the result set that are not in the User model, but that is not the case.

#

Well, this works:

            results = session.execute(select(
                User.uuid,
                User.firstname,
                User.lastname,
                User.email,
                User.is_active,
                User.is_admin,
                User.created_at,
                User.updated_at
            ).order_by(User.firstname))

            return [dict(r._mapping) for r in results]

It isn't pretty, if you ask me. But it works.

#

And, I am accessing a protected member of a class which PyCharm hates

marsh ledge
paper flower
marsh ledge
#

Okay, so something is not working properly then with those.

paper flower
#

And you can always write tests for that

#

Share your code 🤔

marsh ledge
#

I am a bad developer, I have never written test cases for code. That's something I need to learn.

#

I've always been in startups in the last 20 years where I am the only developer and server administrator doing everything so I did not get time to learn to be a real developer, if you will. Haha

#

Well, based on the models defined:

class UserBase(BaseModel):
    firstname: constr(strip_whitespace=True) = Field(
        max_length=32,
        min_length=2
    )
    lastname: constr(strip_whitespace=True) = Field(
        max_length=32,
        min_length=2
    )
    email: EmailStr = Field(
        max_length=128
    )
    is_admin: bool = Field(
        default=0
    )
    is_active: bool = Field(
        default=1
    )

    model_config = ConfigDict(extra="ignore", from_attributes=True, json_schema_extra={
        "example": {
            "firstname": "Johnny",
            "lastname": "Appleseed",
            "email": "johnny@appleseed.com",
            "is_admin": 0,
            "is_active": 1
        }
    })


class User(UserBase, CreatedUpdatedAtModel, IDModel, UUIDModel):
    model_config = ConfigDict(extra="ignore", from_attributes=True, json_schema_extra={
        "example": {
            "id": 1,
            "uuid": "FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF",
            "firstname": "Johnny",
            "lastname": "Appleseed",
            "email": "johnny@appleseed.com",
            "is_admin": 0,
            "is_active": 1,
            "created_at": "2023-01-01 00:00:00",
            "updated_at": "2023-01-01 00:00:00"
        }
    })

The result set should include all columns in the database except the password column. I tell FastAPI that the response model for this particular request is response_model=List[User] but without explicitly not selecting the password column from the database, it is still returned so not sure what the deal is there.

#

The example is correct

#

But the actual response has the password column

#

Ah ha, I GOT IT!

paper flower
marsh ledge
#

Don't laugh! LoL - I had the following code:

async def get(session: Session = Depends(get_session)):

Instead of

async def get(session: Session = Depends(get_session)) -> list[User]:

Not sure what the Python term is for the end there but it wasn't mapping to the return model?

marsh ledge
#

Okay, now to do some really fun stuff. Time to write a depend to ensure a user exists before trying to update or delete it!

marsh ledge
# paper flower What was the problem?

From a newbie Python developer to a seasoned veteran, it would be awesome to get your feedback on my structure and code. Is this something you might be up for doing? It wouldn't have to be for free, I can compensate you in someway.

paper flower
paper flower
fathom axle
marsh ledge
#

Right on! I chose a different route. My SQLAlchemy models use the ObjectSchema convention. For example, UserSchema because to me a schema is a table in a database where my Pydantic models follow the ObjectModel convention for example UserBaseModel or UserCreateModel.

marsh ledge
#

I am pretty organized, at least from my perspective. The only thing I am not doing (yet) is API versioning. This is a pretty small test project that I am hoping to be able to apply to larger projects.

paper flower
#

A lot of changes could be incremental too, also versioning is not as important if it's an internal API or it serves your frontend specifically

marsh ledge
#

Just an example of my structure

paper flower
#

Public API's are a different story

paper flower
#

And db models too

marsh ledge
#

Yeah, this is going to be just to serve my front end. It's a website for a softball team that I built. I want it to be dynamic because it's all static right now

paper flower
#

I try to do something like this:

.
└── src/
    ├── adapters/
    │   └── api/
    │       ├── users/
    │       │   ├── _router.py
    │       │   └── _schema.py
    │       ├── books/
    │       │   ├── _router.py
    │       │   └── _schema.py
    │       ├── app.py
    │       └── schema.py (common pydantic models)
    ├── db/
    │   ├── engine.py
    │   └── models.py (Could be a package too)
    └── domain/
        ├── users/
        │   ├── repositories.py
        │   └── services.py
        └── books/
            └── ...
marsh ledge
#

I dislike the src directory name so I used app. LoL

paper flower
#

src/app, change my mind

marsh ledge
#

Haha, the only reason I have is that src is overused and honestly, there could be code outside the src directory. For example. my main.py is not inside my app directory. Probably bad structure, but it works for me.

paper flower
marsh ledge
viral hillBOT
paper flower
#

@haughty tree <@&831776746206265384>

#

(Deleted message)

marsh ledge
marsh ledge
paper flower
#

So, they're not really separated, imo

#

With my structure I can just nuke adapters/api for example 😅

marsh ledge
#

I thought they had to be a package so I coud do things like from app.auth.services import AuthService? Should I be doing something else?

paper flower
#

Also such structure just gets bloated when you need other "adapters" - maybe you want to add some other protocol like graphql or grpc, cronjobs, rabbitmq, kafka, etc...

#

All of this code gets piled up in the same directory

marsh ledge
#

Haha, funny story. I wanted to use GraphQL instead of REST but I couldn't find any easy tutorials. Everything I tried didn't work so I gave up after about 8 hours. Haha

marsh ledge
#

For the company I work for, I would prefer to go the GraphQL route so I will have a look at that when I have time (it's 2AM where I am)

#

As for things like cronjobs. queueing services, etc, I have come accustomed to using AWS services. For example, if I need to do something periodically, I write an AWS Lambda function in Python + Docker that is deployed via AWS ECR and then uses AWS EventBridge Scheduler to execute it. For queueing, I rely on AWS SQS and use an AWS Lambda function that is kicked off when a message is entered into the queue.

#

I prefer as much serverless as possible. That's not saying I do not have AWS EC2 instances and even physical servers because I do. When processing drone imagery, that's not cost effective on AWS and cannot be done with a Lambda function. Those applications are Python deployed via Docker containers on physical servers that simply listen to the AWS SQS queue.

paper flower
#

I just use k8s 🥴

marsh ledge
#

Yeah, I do not have that luxury. Working in a startup has its drawbacks.

#

Hey quick question, with FastAPI, can I secure the routes at the root level instead of having to secure them using Depends individually? For example, I import the routes into a routes.py file which is then loaded in my main.py. The routes.py looks like this:

from fastapi import APIRouter
from app.auth.routes import routes as auth
from app.users.routes import routes as users

routes = APIRouter()

routes.include_router(auth, prefix="/auth", tags=["Authentication"])
routes.include_router(users, prefix="/users", tags=["Users"])

Can I use a Depends here to secure the users routes?

#

Hm, looks like a middleware would be the best bet except I do not want to block the auth routes

#

Well, my authentication system works! 🙂

paper flower
marsh ledge
#

Oh yeah, I can! Thank you! I just saw that.

#

Nice, that works great! Thank you

marsh ledge
#

Well, thank you for the help @paper flower , I appreciate it.

delicate fieldBOT
#

:incoming_envelope: :ok_hand: applied timeout to @analog shell until <t:1700075684:f> (10 minutes) (reason: burst spam - sent 8 messages).

The <@&831776746206265384> have been alerted for review.

gray surge
#

@analog shell I’m going to let that stick. If you’re trying to spam for voice verification, don’t. It’ll get you voice muted.

queen rose
#

so before I was using SQLAlchemy + SQLite3 and I was doing many queries like this: SELECT .. WHERE col IN (very_big_list_of_values), and everything worked fine.
now I removed SQLalchemy (because I was having issues with the dependencies) and I'm passing the parameters outside the query using qmarks: SELECT .. WHERE col IN ({','.join('?' for _ in range(len(values)))}) and I keep getting the error:

sqlite3.OperationalError: too many SQL variables
#

I understand that SQLite has a max of 999 values which you can pass as parameters, but how was SQLalchemy able to bypass that restriction?

#

or perhaps there is a way to make the number higher?

waxen finch
delicate fieldBOT
#

lib/sqlalchemy/dialects/sqlite/base.py lines 2037 to 2039

if self.dbapi.sqlite_version_info < (3, 32, 0):
    # https://www.sqlite.org/limits.html
    self.insertmanyvalues_max_parameters = 999```
queen rose
#

but I did a quick test and I was actually able to pass just below 250k parameters using sqlite3 directly

harsh pulsar
#

maybe not in all cases, but i remember reading about it and being surprised

somber pasture
#

Hello,

I am pretty new to sql alchemy and im trying to use it in my fast api app. I am getting a type error when trying to call this get all referrals function.

    def get_all_referrals(self):
        try:
            referrals = (
                self.db.query(DbReferral.id, DbReferral.code, DbReferrer.name, DbProduct.name)
                .join(DbReferrer, DbReferral.referrer_id == DbReferrer.id)
                .join(DbProduct, DbReferral.product_id == DbProduct.id)
                .all()
            )

            print(referrals)
            return referrals
        except Exception as e:
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
                detail = f"error retrieving referrals"
            )
 

ValueError: [TypeError('cannot convert dictionary update sequence element #0 to a sequence'), TypeError('vars() argument must have __dict__ attribute')]       
#

any idea into why this might happen?

#

wel I do self.db.query(DbReferral).all() this has no problem :

    def get_all_referrals(self):
        try:
            referrals = (
                self.db.query(DbReferral).all()
            )

            print(referrals)
            return referrals
        except Exception as e:
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
                detail = f"error retrieving referrals"
            )
marsh ledge
#

https://stackoverflow.com/questions/76322342/fastapi-sqlalchemy-cannot-convert-dictionary-update-sequence-element-0-to-a-seq - You have a TypeError meaning that you passed a value that is not a dict since it is expecting a dict

#

Are you using SQLAlchemy 2.0? If so, a lot has changed since the 1.x versions. Look back in this chat from last night between Doctor and I.

somber pasture
#

yeah i am using 2.0

marsh ledge
#

I have yet do any joins but what I can tell you is to ensure your models are defined properly and that you pay close attention to the documentation

somber pasture
#

gotcha. I am pretty sure they are. When i run normal sql queries against my db, the right results come up, but i dont know how to translate the sql queries to a sql alchemy query

marsh ledge
#

Have you tried using scalars?

self.db.query(DbReferral.id, DbReferral.code, DbReferrer.name, DbProduct.name)
                .join(DbReferrer, DbReferral.referrer_id == DbReferrer.id)
                .join(DbProduct, DbReferral.product_id == DbProduct.id)
                .scalars().all()
#

Also, the query function is legacy at this point. I would switch to using execute

somber pasture
#

Ill try that.

somber pasture
#

@marsh ledge got it working. Thank you for the assistance.

marsh ledge
#

No worries, glad you figured it out!

somber pasture
#

I didnt realize I was using out dated commands and the select statement made things much easier to understand

marsh ledge
#

Haha, yeah, I was falling into the same issues last night.

marsh ledge
#

Hey @paper flower you available right now for a question?

marsh ledge
#

May I private message you? It doesn't really belong in this channel.

paper flower
#

I would prefer if you don't

marsh ledge
#

No problem. It relates to FastAPI not necessarily SQLAlchemy or databases, but it definitely could. Is there a better place?

paper flower
queen rose
#

should I attempt to do that as well? SQL injection is not a problem in this case, I'm more afraid of breaking the string if one of the values has a quote ...

queen rose
#

if I do:

SELECT 1, 2, 4

it will create a single record: [(1, 2, 3)]
but how can I make it create one column that has 3 values: [(1,), (2,), (3,)] ?

#

(dynamically)

waxen finch
#

oh wait, did you mean you wanted 3 rows?

#

hmm for sqlite/psql at least, you could write VALUES (1), (2), (3);

coral wasp
queen rose
queen rose
slow shore
#

I've got a novice question--I (at least in part) come from data compression, and there's a rich world which approaches in data compression through a lens of information theory. Is there a resource for SQL or databases in general that teach how to organize and design a database (and pick an architecture) from that perspective?

I went through the pinned comments and they largely focus on *how* to use sql (or other types of dbs), but I'm curious about the decision making when it comes to organizing a database:

  • if I have time series data of multiple sensors, which I'll typically select first by a date range, then by a sensor subset, should the rows or columns be the timestamps?
  • If I then take that time series data for a couple of sensors out of the database, transform it into some combined value, and want to load it back into a database, should I put it into the original table or create a new table, especially if the transformation operates on the full length of time series and the transformation needs to be re-performed as often as once each hour (when the data is added to the db at 1hz)

What resources are there which talk about database design from these perspectives? Should I just live in the docs for postgress, mongodb, redis, graphql for a few weeks or is there something more general that I should start with?

storm mauve
#

you're better off asking in a community related to whichever database you are using

#

there are a bunch of generic tools for relational databases, but niche NoSQL databases are a bit harder to find (/ build) generic tools for

storm mauve
# slow shore I've got a novice question--I (at least in part) come from data compression, and...

should the rows or columns be the timestamps
typically you'll want to have a fixed amount of tables and columns, not dynamic table names / column names
... combined value, and want to load it back into a database, should I put it into the original table ...
how would an aggregate/combined value even fit in the original table? duplicate it for every row used to calculate it? do these rows need to have it?

The first thing I would do is probably try to see if other companies that operate with data similar to yours have blog posts explaining what their infrastructure looks like.
Other than that... idk, maybe ask in the data engineering server

#

Uh, any tools you find for manipulating a database are quite much by definition database administration tools.
You could use Roles or other permission systems to limit what users are allowed to do, but abstracting away how users interact with the database is literally your job

#

at best you can try to use an open source project as a base/reference I guess

cedar nebula
#

hello, i am trying to create a discord bot which can run few servers without using few bots a bot for every server so i decided to use a json file to store data on each server. well i got to a point which i struggel and someone told me that json file is not that good and he suggested that i got to here and talk to you. I would like some guidness on how I can create a database for my bot. https://discord.com/channels/267624335836053506/1174793064099889284

coral wasp
#

There’s a PostGres timeseriesdb engine, along with influx and other dbs, but hard to talk about applicability without knowing your particulars