#databases

1 messages ยท Page 22 of 1

potent bolt
#

To identify users

#

Say the client wants to interact with the server

#

Authentication tokens

paper flower
potent bolt
#

Oh, right! Thank you

#

I'll figure it out. Don't worry

#

I'm done here. Thanks to all those who helped out. Y'all the real VIPs :)

lost cypress
#

Is learning how to interact with databases in python an important knowledge in general? I have been wanting to look into it but i have no idea on where to start or if i should start. Thanks in advance to anyone who answer me!

coral wasp
lost cypress
wise goblet
#

learning more obscure databases on another hand is mostly field of Backend engineers, DevOps engineers and Data engineers. So at postgresql most devs could finish up their education regarding databases ๐Ÿ˜‰

lost cypress
potent bolt
#

Guys, one LAST question, am I supposed to modify the revision files generated by alembic myself inorder to create database tables???

wise goblet
potent bolt
#

But I had run the revision command and it just had the pass keyword in the upgrade/downgrade functions

#

What am I doing wrong?

wise goblet
#

Like.. where is the code or errors man?

#

where is link to github what are you doing ๐Ÿ˜

potent bolt
#

Error looks like this:

   raise translated_error from error
sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.UndefinedTableError'>: relation "users" does not exist
[SQL: SELECT users.id, users.email, users.password, users.created, users.modified 
FROM users 
WHERE users.email = $1::VARCHAR]
[parameters: ('test@email.com',)]
(Background on this error at: https://sqlalche.me/e/20/f405)
#

It's not on GH yet unfortunately

#
"""fixed models

Revision ID: 1fe65c3d80f5
Revises: bc4522ef3b3d
Create Date: 2023-09-24 22:20:43.507352

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = '1fe65c3d80f5'
down_revision: Union[str, None] = 'bc4522ef3b3d'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    pass


def downgrade() -> None:
    pass
#

That's the most recent revision file

wise goblet
#

๐Ÿ˜‰ check your ORM declaring files now

#

something is missing to integrate ORM with alembic

#

also u can check alembic config

#

may be it is just not pointing to defined ORM model

potent bolt
wise goblet
#

i have no idea how u named your files

#

i am not a farseer

potent bolt
#

The env.py file is autogenerated by alembic though

#

Or you mean my regular model files?

wise goblet
#

everything matters here.
alembic file is supposed to be somehow connected to your model files ๐Ÿ˜‰

#

otherwise it will not have its autogenerating migrations command working

potent bolt
#

I'm quite confused.

#

I can see the .ini file from alembic and its env.py file. Other than that, it's my regular model files which I defined by subclassing DeclarativeBase class

#

Do you have a public code I could look at?

#

My model is similar to this

#

Ah, wait, am I supposed to use the --autogenerate flag??

#

Works now :)

full prairie
#

For searching purposes in mongodb which is faster , splitting one big db in multiple smaller db or making smaller collections within the big db and query to collection under specific conditions.

charred gate
#

I know that this is a python and not SQL related discord but usually python for analysis uses SQL and vice versa - how easy is SQL at a basic to intermediate level to learn?

spare pendant
#

basics aren't that hard to learn
intermediate is also probably not that hard either
the harder part is probably coming up with a proper database design for your needs

stiff radish
tidal mica
#

Any ideas why this may not work?

#

Sqlite3

#

It doesn't raise any errors

#

And yet the values don't appear in the table

#

The table exists

#

When I try to execute the same thing through the DB browser, it works

hexed estuary
#

Make sure you con.commit() afterwards.

tidal mica
#

I'm dumb

fallen vault
#

is there a way to check if you are currently connect to a sqlite3 databse?

fallen vault
#

That returns the value. Maybe Iโ€™m looking to see if a SQLite3.connect() has been made? Does the close() function destroy the connect object created with connect()?

coral wasp
#

You're wondering if the db connection is still alive?

fallen vault
#

Yes.

coral wasp
# fallen vault Yes.

If you have a connection object, you could run a test query. Without a connection object, I dunno. I guess you could check the db file for open file handles (like lsof)

obtuse magnet
#

Any idea why the materialised view of a table consumes much less disk space than the actual physical table?

Also, reltuples is 0 for a materialized view?

fringe sundial
#

guys how can i fix this sqlalchemy errror?

sqlalchemy.exc.ArgumentError: 'SchemaItem' object, such as a 'Column' or a 'Constraint' expected, got Uuid() 
id: Mapped[Uuid] = mapped_column(Uuid(as_uuid=True), primary_key=True, default=uuid4, nullable=False, unique=True)
fringe sundial
paper flower
#

No

#

Also you can shorten it to

id: Mapped[UUID] = mapped_column(primary_key=True, default=uuid4)
fringe sundial
#

is my import wrong?

rom sqlalchemy import ForeignKey
from sqlalchemy import String
from sqlalchemy import DateTime
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.sql import func
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import relationship
from uuid import uuid4
paper flower
#

uuid.UUID

#

primary keys are already not nullable, indexed and type is inferred from your Mapped argument

fringe sundial
#

oh

paper flower
fringe sundial
#

oh, thanks

fringe sundial
paper flower
fringe sundial
#

thanks

hollow oasis
#

If anyone here
Can someone tell
How is mongodb scheme is different than other dbs scheme ( according to some google search results And website )

And how can I build a schema in mongodb? ( have no idea how to build a schema )

#

if you reply turn on ping please

brazen charm
#

if you are trying to build a schema in mongodb you are using the wrong database

fading patrol
hollow oasis
#

fun fact: I don't even need schema for my project. I'm just curious can we do it or not

brazen charm
#

no, because that would defeat the purpose of mongo

#

but also, If you data can be correctly represented in a schema

#

why use mongo at all ๐Ÿ˜…

fading patrol
maiden light
#

Ping when replying

harsh pulsar
#

you did SELECT guild FROM antinuke WHERE v1 = 1 AND guild = ... and it returned a row? outside of a database transaction?

harsh pulsar
#

it might be the case that the table does not contain the data you think it contains, or you're using the wrong values to query it

maiden light
#

But I'm using database in python application

maiden light
harsh pulsar
maiden light
#

I only update value between 0 and 1

#

And here I'm asking for 1

harsh pulsar
#

or even in a python script, try writing out the query without any query parameters, just put the values directly into the query

maiden light
#

Ohh

#

I'll try something

#

And see if it works

hollow oasis
#

Real database?? Isn't mongodb a real database??

tight junco
#

no

obtuse magnet
#

It's not a relational database

wise goblet
hollow oasis
#

As long as I know Postgres or MySQL is considered SQL based database. While mongodb is nosql database ๐Ÿ˜

wise goblet
# hollow oasis As long as I know Postgres or MySQL is considered SQL based database. While mong...

difference in Relational database providing typing validation of structure
and giving access to migrate tables from one state to another one

it gives
a) protection from having code that uses db with wrong data
b) gives data integrity, you are sure all your data reflects set structure and linked with data from other tables
c) allows smooth transition to new data structure, when deploying new application versions

#

with mongodb u a basically making... application that will not scale in terms of your code growth

#

one time app, which will be able to live only in short amount of tables and only its first version prod version
when u will start adding new features and making new releases -> you are going to be growing in complexity exponentially very quickly

#

especially bad it happens when people not having unit tests. Relational databased code can survive dozens of times longer without them, mongo db code can not

hollow oasis
#

Fun fact is mongodb can handle those things effectively

#

as long as I know I guess

obtuse magnet
#

Mongo has it uses but it shouldn't be the default

wise goblet
hollow oasis
#

Btw I need speed and performance at database so I think nosql db is faster generally

obtuse magnet
#

What kind of scale r u running lel

#

Discord runs on scylladb iirc

#

Which is built on top of postgres

#

Nvm Scylla is nosql

hollow oasis
#

XD

#

๐Ÿ˜†๐Ÿ˜‚

#

Theory proven: if you have no idea how your database will be look like then pick nosql database

#

๐Ÿ˜†

obtuse magnet
#

Point is I doubt the speed and performance u need justifies nosql compared to what SQL can offer u

#

U never mentioned anything about not knowing ur schema beforehand?

hollow oasis
#

In most cases nosql is faster

obtuse magnet
#

Which I doubt again

hollow oasis
#

*ik mongodb is schemaless db

wise goblet
hollow oasis
#

๐Ÿ˜†

#

Btw I got got some objections in 2nd one

#

Code will be still maintained in future

#

By November 2015, as they reached 100 million messages, various issues were observed like data and indexes exceeded available RAM. This prompted the transition to a more suitable database, ensuring a high-quality user experience. Considering the above points Cassandra was chosen as the successor of MongoDB at Discord.Sep 12, 2023
https://www.linkedin.com โ€บ pulse
Discord's journey from MongoDB to Cassandra - LinkedIn

#

Looks like mongodb will be good choice for begining to long run

#

๐Ÿ˜…

wise goblet
#

yeah, and u are 100 million messanging application, capable to actually unit test your code so well that u will not suffer issues araised from using mongodb
everyone forges their own fate. u was warned of problems u will suffer
I audit code projects, regular people suffer from mongo problems after creating just 4 tables

hollow oasis
#

Personal opinion:
They changed mongodb to another db cause mongodb documents size must be in 16mb
How much data discord stores in one document ๐Ÿคจ

wise goblet
hollow oasis
#

๐Ÿ˜

#

Ok gtg ๐Ÿ˜†

#

Cya

obtuse magnet
#

U keep talking about scale as justification for nosql

brazen charm
#

Just because a DB isn't relational doesnt mean its not a DB

#

it just means it is not a relational db

brazen charm
wise goblet
#

from the point of view of choosing it as main database for beginners, it is not really db for them

brazen charm
#

Well, its much easier to shoot yourself in the foot and make less performant queries

obtuse magnet
#

idek how to write a query in mongo HAHAHA

brazen charm
#

what people often don't realise is relational DBs are actually really fast when querying data they don't have indexes. A lot of NoSQL/NewSQL databases suffer quite hard if you are missing a index which means it can't do a hashtable lookup.

wise goblet
#

anyway, the main problem of mongodb usage for beginners
is that... if they had poor code quality (they usually all have poor code quality), mongodb will multiply their problems further with having unstructured untyped database
SQL database gives a more room for application growth

brazen charm
#

and if you not have indexes, your performance falls off a cliff as mongo starts doing linear scans across the data

torn sphinx
#

I need help my customer is getting this error with my compiled app he cant install pysqlite3 on windows

obtuse magnet
#

might wanna try installing a specific version

#

maybe give a requirements.txt tile to them or smth

torn sphinx
#

I did

#

its just not installiong

brazen charm
#

why on earth are you using pysqlite3

torn sphinx
#

as its not a massive thing, and its only for a small sqlite3 database

brazen charm
#

you know that a) it hasnt been maintained for over a year and b) is already included within windows?

torn sphinx
#

so you dont actually need to install it

obtuse magnet
#

y sqlite in the first place?

torn sphinx
#

as what else would I use its just for a small program for storing confi

obtuse magnet
#

and customer? hmmm

torn sphinx
#

its a tool

#

called scrapemate. it stores what amazon stores they want to monitor for new products

#

and stores around 90 products per store

brazen charm
#

if you have python installed on windows

#

you have SQLIte by default

sudden dagger
carmine tiger
lavish nebula
#

For some reason i get this error from my flask app

Traceback (most recent call last):
  File "/home/reikimann/coding/awesome_recipes/main.py", line 3, in <module>
    app = create_app()
          ^^^^^^^^^^^^
  File "/home/reikimann/coding/awesome_recipes/app/__init__.py", line 19, in create_app
    db_parser.add_user(new_user)
  File "/home/reikimann/coding/awesome_recipes/app/models/db_parser.py", line 28, in add_user
    self.cursor.execute(query, values)
sqlite3.IntegrityError: UNIQUE constraint failed: user.username

Even if I delete the database I get this error. It runs for one second then crashes.

Main.py:

from app import create_app

app = create_app()

if __name__ == "__main__":
    app.run()

app.init() file:

def create_app():
    ...
    db_parser = SqliteParser("test.db", app.instance_path)
    new_user = ("asdfjasoifjayeeet", "yet")
    db_parser.add_user(new_user)
    db_parser.close_connection()
    ...

SqliteParser file:

class SqliteParser:
    """ Class containing all the DB interaction methods. """

    def __init__(self, database, instance_path):
        self.dbName = database
        self.db = sqlite3.connect(database)
        self.cursor = self.db.cursor()
        # Temp
        self.db_url = Config(instance_path).get_db_url()
        self.config = Config(instance_path)

        with open(self.config.schema_path, "r") as f:
            schema = f.read()

        self.cursor.executescript(schema)

    def add_user(self, values):
        """ Creates a new user """
        query = "INSERT INTO user (username, password) VALUES(?, ?)"
        self.cursor.execute(query, values)
        self.db.commit()

Schema.sql file:

CREATE TABLE IF NOT EXISTS user (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  username TEXT UNIQUE NOT NULL,
  password TEXT NOT NULL
);
harsh pulsar
cedar tiger
#

Hence the error that its not unique

lyric bane
#

what is this erver

lavish nebula
lavish nebula
torn sphinx
#

Ya'll

#

i must be stupid or sum shit cuz

#

how am i struggling on the first level

obtuse magnet
#

sql injections?

brave bridge
#

I have this schema in PostgreSQL:

message(id, content)
message_tags(message_id, tag)

I know that I can find all messages that feature both foo and bar tags with where tag = all('{foo,bar}'), or where tag = all($1).
But what if I want to select all messages which match (foo AND bar) or (foo AND fizz AND buzz) or (fizz AND other)? I know I can combine three all()s with an OR, but is it possible without dynamically constructing a query?

brave bridge
#

wait, I think all() is exactly wrong here

brave bridge
#

So this is a query I came up with:

select id, tags from
    (
        select m.id, array_agg(t.tag) as tags
            from messages m
            join message_tags t on t.message_id = m.id
            group by m.id
    ) subq
    where (%s::TEXT[]) <@ tags;

Here's the code for generating sample data: ```py
def genstring(size: int) -> str:
return "".join([random.choice("0123456789abcdef") for _ in range(size)])

messages = [(i, genstring(100)) for i in range(1, 100_001)]
tags = [genstring(10) for _ in range(100)]

message_tags: set[tuple[int, str]] = set()
for tag in tags:
for _ in range(500):
message_id, _ = random.choice(messages)
message_tags.add((message_id, tag))

#

The query does work, but it's absolutely pathetic

GroupAggregate  (cost=5533.12..6090.08 rows=1 width=40) (actual time=72.131..72.133 rows=0 loops=1)
  Group Key: m.id
  Filter: ('{5032b7c164,f605eff885,ab362709ba,0f142381da,d08c93cff3,87ec1815f9,647004d3a2,870b369760}'::text[] <@ array_agg(t.tag))
  Rows Removed by Filter: 39376
  ->  Sort  (cost=5533.12..5595.00 rows=24754 width=19) (actual time=48.828..51.552 rows=49854 loops=1)
        Sort Key: m.id
        Sort Method: quicksort  Memory: 3703kB
        ->  Hash Join  (cost=2941.12..3726.65 rows=24754 width=19) (actual time=26.542..40.114 rows=49854 loops=1)
              Hash Cond: (t.message_id = m.id)
              ->  Seq Scan on message_tags t  (cost=0.00..720.54 rows=24754 width=19) (actual time=0.010..4.589 rows=49854 loops=1)
              ->  Hash  (cost=2482.72..2482.72 rows=36672 width=8) (actual time=26.368..26.369 rows=100000 loops=1)
                    Buckets: 131072 (originally 65536)  Batches: 1 (originally 1)  Memory Usage: 4931kB
                    ->  Seq Scan on messages m  (cost=0.00..2482.72 rows=36672 width=8) (actual time=0.006..13.915 rows=100000 loops=1)
Planning Time: 0.478 ms
Execution Time: 72.600 ms
coral wasp
harsh pulsar
#

this smells LATERAL potentially

#

good question though, i can work up a sql fiddle thing

coral wasp
#

Yah, in duckdb Iโ€™d write a lateral against the unnested array

harsh pulsar
#

that or ive been brain poisoned by snowflake's LATERAL FLATTEN

harsh pulsar
#

actually i think snowflake is like... columnar, but each column is an entire row? or something like that

brave bridge
#

I changed the table to have a tags text[] not null column, and the query is extremely simple: select id from messages where (%s::TEXT[]) <@ tags

Bitmap Heap Scan on messages  (cost=837.43..841.44 rows=1 width=8) (actual time=17.145..17.146 rows=0 loops=1)
  Recheck Cond: ('{83dde7b967,367177e827,0ebeecc7ae,df70048773,e50b676be8,d3d74f1285,444cc44eb2,b2966b3689}'::text[] <@ tags)
  ->  Bitmap Index Scan on messsage_tags_idx  (cost=0.00..837.42 rows=1 width=0) (actual time=17.144..17.144 rows=0 loops=1)
        Index Cond: (tags @> '{83dde7b967,367177e827,0ebeecc7ae,df70048773,e50b676be8,d3d74f1285,444cc44eb2,b2966b3689}'::text[])
Planning Time: 0.217 ms
Execution Time: 17.165 ms

test data: ```py
tags = [genstring(10) for _ in range(100)]
messages = [(i, genstring(100), random.choices(tags, k=random.randint(0, 10))) for i in range(1, 1_000_001)]

harsh pulsar
#

curious what your duckdb solution would look like @coral wasp if you have the chance to write it up

brave bridge
#

I'm actually not married to postgresql, it's just for prototyping mostly

harsh pulsar
#

that's basically what you were creating with your subquery, right?

coral wasp
#

I could just join on any(select unnest(?))

#

(I think, or something like that)

brave bridge
#

that is complete black magic to me

#

just like most of SQL

harsh pulsar
#

is this a good interview question or no

#

(assuming it's live and they are allowed access to the postgres docs)

brave bridge
#

I do not understand SQL and I hate postgres documentation, so please don't interview me

coral wasp
#

Unnest unpivots an array

#

So, instead of a row of values, you have a column

brave bridge
#

you can already use any() on an array in postgres tho

harsh pulsar
#

yeah maybe postgres is just too good

coral wasp
#

Oh, yah: so you want unnest so you can join the unnested values on the tags: without having to do a string comparison)

harsh pulsar
brave bridge
coral wasp
brave bridge
#

I think sticking with an array could be okay for my purposes ๐Ÿ‘€

#

I know that it's a bit of a poo poo, but I will need to research this further as I don't understand SQL at all

coral wasp
#

Arrays are fine. Itโ€™s just a relatively new thing in sql land

#

One option in PostGres is to normalize the tags (have a separate tag table). Rather than storing as strings.

#

Thatโ€™ll shrink the message_tags association table to just two ids.

brave bridge
# brave bridge I changed the table to have a `tags text[] not null column`, and the query is ex...

Tried it on 1 million messages: ```
Bitmap Heap Scan on messages (cost=1387.16..1391.17 rows=1 width=8) (actual time=30.443..30.444 rows=0 loops=1)
Recheck Cond: ('{37d2f,2b84e,34608,8712f,b9ff6,fd099,cce90,fb20d}'::text[] <@ tags)
-> Bitmap Index Scan on messsage_tags_idx (cost=0.00..1387.16 rows=1 width=0) (actual time=30.441..30.441 rows=0 loops=1)
Index Cond: (tags @> '{37d2f,2b84e,34608,8712f,b9ff6,fd099,cce90,fb20d}'::text[])
Planning Time: 0.709 ms
Execution Time: 30.461 ms

coral wasp
#

Or: store the ids as an array in the message table and make it completely flat (at least, Iโ€™d consider that)

#

What was that query?

brave bridge
#

Table ```sql
create table messages (
id bigserial primary key,
author_id bigint not null,
created_at timestamp with time zone not null default NOW(),
content text not null,
tags text[] not null,

unique(id)

);
create index messsage_tags_idx on messages using GIN (tags);

#

i guess this is what GIN is for lmao

coral wasp
#

Do you have a data generator? Might try it on duckdb later

brazen charm
#

this sort of thing is where text-search engines take over

coral wasp
#

Oh I was leaning towards array operations

#

And maybe normalizing a bit

brazen charm
#

I think for Postgres' case they could make it more efficient by creating a DFA over the btree

#

but I guess realistically if you're trying to do those queries, a DFA is not going to help you that much, compared to an actual index

brave bridge
#

The tags are completely arbitrary if that matters

brazen charm
#

The issue you tend to have is that postgres is having to do the job of building an inverted index, on every query.
Since effectively the inverted index takes that whole scan and instead goes "Create this sets and either perform a union or intersection with other terms"

brave bridge
#

I wonder if I can keep a read model in RAM with just this index, if I need some kind of extreme performance

brazen charm
#

For PG just adding a GIST index would be more than enough

brave bridge
#

what's the difference between GIST and GIN?

brazen charm
#

or at least it will be better

brave bridge
#

from what I've read in the docs gist is for like, geometry?

brazen charm
#

Basically size and performance is most of the difference

#

GIST indexes are technically lossy and can produce false positives (which PG then filters out when it validates the row) but they are much smaller than GIN indexes

#

and much faster to update

#

so if you are in a situation where you have a large number of inserts coming in

#

a GIN index might be too expensive for what you really want out of it

#

GIN indexes are actual inverted indexes, GISTs are a bit different

coral wasp
#

And, fyi, you can cut your load time significantly with: psycopg2.extras.execute_values(cursor, "INSERT INTO messages(id, content, tags) VALUES %s", messages, page_size=1000)

torn sphinx
#

so im facing a small issue, i need a certain button that when pressed does 2 things, calculates the sum of all items in a table and then deletes all the records from the table. im doing the delete part first but it doesnt seem to be working so am i missing something here

brave bridge
#

&& finds posts that have at least one tag in my query

torn sphinx
slender drum
#

sqlalchemy.exc.ProgrammingError: (psycopg2.errors.InvalidSchemaName) no schema has been selected to create in
LINE 2: CREATE TABLE alembic_version (
^

[SQL:
CREATE TABLE alembic_version (
version_num VARCHAR(32) NOT NULL,
CONSTRAINT alembic_version_pkc PRIMARY KEY (version_num)
)

this happens when i try to run an alembic.
did somebody encouter this and how did you fix it?

potent bolt
#

Hi guys! Please I'm working with sqlalchemy, how can I implement bulk deletion using the core API?

#

I guess I should just use a for loop??

slender atlas
# torn sphinx so im facing a small issue, i need a certain button that when pressed does 2 thi...

Tip: Create one global database connection and pass it in the functions that require it or put a global conn line in them. You can create cursors as you need, and you can probably use a with statement so you don't have to close them yourself. Connecting to the database and disconnecting is a waste of time and may even result in being locked from using the database because there is an established connection in use

shut tiger
brave bridge
#

The official docs were not very helpful

brazen charm
#

Mmm maybe the Postgres source code? ๐Ÿ˜… I'm not really aware of anyone using the indexes outside of postgres

brave bridge
#

Though it is even slower to create

#

Anyway, I think I'm really overthinking stuff at this stage. <1s to search a million messages will serve me very well

#

and I can always implement some kind of eventually consistent read model

brave bridge
#

clearly I know about data access more than the people behind Postgres ||/s||

#

RUM seems to be able to store additional stuff like timestamps, that could be useful

#

to avoid a heap scan

brazen charm
#

RUM looks like a more traditional FTS index

#

so i would probably say it's a solid option

brave bridge
torn sphinx
# slender atlas Tip: Create one global database connection and pass it in the functions that req...

ah ok will do i was planning on refining the code once i got done with the initial workings, i had a slighter more complex issue and that is the app im making is for a restaraunt so i have add to cart buttons next to the items. When the buttons are pressed i need them to be added to an sql table with default price values which is the issue so is there any way to do that since i only know how to set price values for everything at once

paper flower
#

You could do something like

delete(Record)

or

delete(Record).where(Record.id.in_(ids))
lunar cargo
#

hello guys
Im a little bit confused. My context manager doesn't work properly it doesn't close the session each time Im calling it

Base = declarative_base()
engine = create_async_engine('sqlite+aiosqlite:///my_url?check_same_thread=False', echo=False)
__session = async_sessionmaker(engine, expire_on_commit=False,)

class Player(Base, SerializerMixin):
    data...

async def global_init():
    async with engine.begin() as connection:
        await connection.run_sync(Base.metadata.create_all)


async def create_session() -> AsyncSession:
    async with __session() as session:
        async with session.begin():
            return session
async def main():

    data = [370996968811397122, 423231892943536139, 428188503390814212, 453197176714166273]
    session = await create_session()

    for player in data:
        result = await session.execute(select(Player).where(Player.id == player))
        p = result.scalars().fetchall()
        for user in p:
            print(user)


if __name__ == '__main__':
    asyncio.run(main())

# The garbage collector is trying to clean up non-checked-in connection <AdaptedConnection <Connection(Thread-1, stopped daemon 12448)>>, which will be dropped, as it cannot be safely terminated.  Please ensure that SQLAlchemy pooled connections are returned to the pool explicitly, either by calling ``close()`` or by using appropriate context managers to manage their lifecycle.
#

I think I did something wrong

harsh pulsar
harsh pulsar
#

you need @asynccontextmanager and yield instead of return

#

also i strongly suggest not tying together creating a session with opening a transaction

#

that's asking for trouble

#

@lunar cargo

Base = declarative_base()
engine = create_async_engine('sqlite+aiosqlite:///my_url?check_same_thread=False', echo=False)
session_factory = async_sessionmaker(engine, expire_on_commit=False)

class Player(Base, SerializerMixin):
    data...

async def global_init():
    async with engine.begin() as connection:
        await connection.run_sync(Base.metadata.create_all)

@contextlib.asynccontextmanager
async def create_session() -> AsyncSession:
    async with session_factory() as session:
        yield session

async def main():
    data = [370996968811397122, 423231892943536139, 428188503390814212, 453197176714166273]
    async with create_session() as session:
        async with session.begin():
            for player in data:
                result = await session.execute(select(Player).where(Player.id == player))
                p = result.scalars().fetchall()
                for user in p:
                    print(user)

if __name__ == '__main__':
    asyncio.run(main())
coral wasp
# brave bridge Anyway, I think I'm really overthinking stuff at this stage. <1s to search a mil...

I'd consider a few things here... date-based partitioning, as I'd assume your message searches would probably have some sort of time component... and with partitions, may address long-term gradation problems as the tables grow massive. I'd also think about factoring out the tags, rather than storing them as strings in the messages table. Storing an array of int[] (tag ids) may be more efficient than an text[]

torn sphinx
#

I have sort of a weird idea i need to implement, im creating an app right now using python and tkinter. So when a button is pressed i need a record to be added to a mysql table and ofc if multiple are pressed i need them to be added in there but the records have to come from a seperate table with pre existing data i was thinking with a key. is there someway to implement this?

potent bolt
#

Ps: what I have (for loop) currently works but I was wondering if it had any impact on perf

brave bridge
#

Though reading the row also takes up time, so the space might actually increase the performance

#

also, I think I completely botched my syntetic test tbh

#

because I tested on uniform data, not a real-world distribution

#

i.e. there will be a few extremely popular tags and a lot of unpopular tags

#

If I understand correctly, to find a post that matches every one of foo, bar, baz a GIN index literally does a linear scan, like ```py
relevant_indices: list[set[MessageId]]: ...
relevant_indices.sort(key=len) # traverse the least popular tags first
final_message_ids = reduce(lambda a, b: a & b, relevant_indices)

#

(if not, I think that's what I am missing in how GIN works)

#

So even if the intersection itself is small, if all three tags are "popular", it will have to do some hefty scanning

simple barn
#

So, has anyone ever considered a markdown to SQlite script? I document my schema before data modeling, and figured I could just parse it to create a DB. Working on inserts now.

brave bridge
#

Also you usually have some SQL file that you use to initialize. Or a migrations folder

#

Though to understand the result of a series of migrations you'll have to do a fold manually...

simple barn
coral wasp
coral sparrow
#

Hi. Can someone tell me which database would be suitable. I want to store daily changes in about a million products. I get the data as

{id:some, field1:value ...........field33: value}

It is not guaranteed that all fields will change everyday. Think of it like Description of the product which might change in a year, number of sold which will change daily.

What is the best setup for this

Thank you!

coral sparrow
brave bridge
coral sparrow
#

Values will change daily which i need to capture

brave bridge
#

Then you can just make a table with columns like (id, name, description, foo, bar, baz, ...)

coral sparrow
#

Got it. So rdms is preferred way for this instead of nosql

brave bridge
coral sparrow
#

Thank you! @brave bridge

brave bridge
#

NoSQL (which is a very questionable name tbh) does have some applications, but for most applications a SQL database is a good default

simple barn
# coral sparrow I did think using mongo but how would I go about recording the changes? Edit in ...

I think this quote is good from a website:

The five critical differences between SQL and NoSQL are:

SQL databases are relational, and NoSQL databases are non-relational.
SQL databases use structured query language (SQL) and have a predefined schema. NoSQL databases have dynamic schemas for unstructured data.
SQL databases are vertically scalable, while NoSQL databases are horizontally scalable.
SQL databases are table-based, while NoSQL databases are document, key-value, graph, or wide-column stores.
SQL databases are better for multi-row transactions, while NoSQL is better for unstructured data like documents or JSON.
#

Like fix error said, you can also use SQL database with a JSONB field

coral sparrow
#

Thank you @simple barn @brave bridge

Looks like postgres with jsonb it is

brave bridge
#

well, this list is oversimplifying stuff a bit

#

"NoSQL" is a very broad term that can mean different things

#

For example, Cassandra has schemas

coral sparrow
#

Document db to column stores

storm mauve
#

that point in particular

SQL databases are vertically scalable, while NoSQL databases are horizontally scalable.
is oversimplying wayyy too much

brave bridge
#

yeah, replicas are a thing

#

and partitioning

simple barn
#

I think ChatGPT wrote that post, haha. Disregard.

brave bridge
#

I guess SQLite is not horizontally scalable

#

But if you have a SQLite database and you need horizontal scaling, you made some extremely backwards design decisions lol

coral sparrow
brave bridge
#

I think defining terms as a negative of something usually turns out in confusion and vagueness

#

Like, is EdgeDB a "NoSQL" database?

#

as it's just a fancy frontend on top of postgresql, but doesn't use SQL for querying

#

also "non-functional requirements"

coral wasp
brave bridge
#

Well, I guess you can partition it "manually"

potent bolt
#

Not different attributes

shut tiger
potent bolt
#

Oh yeah, I forgot the Q class

#

It's been long I did that kind of query in django

#

but in django, calling delete() on the queryset can delete all the objects

#

I'm not sure that's the case with sqlalchemy

#

I'm using sqlalchemy orm. Seems django orm already abstracts that part

shut tiger
potent bolt
#

Oh..

shut tiger
potent bolt
#

Well, I haven't seen a solution yet. Maybe I need to keep searching

potent bolt
shut tiger
potent bolt
#

The Q class was in reference to the logical operators

#

AND, OR

brazen charm
#

It would actually be a pretty cool concept to see, you could do a fairly abstract version and a more efficient version which modifies some more of the sqlite internals

#

But personally, I would like even to see a POC of a DB built on top of a set of sqlite dbs

#

I'm sure you could make a more efficient system

#

But damn sqlite setup right could probably do pretty solid

coral wasp
#

The key part of all of that is writing a sql rewriter/interpreter to subdivide the work/etc.

kindred hawk
#

Is this function correct because even after executing this, it is still retuning None

def enable_v1(self, value_1: int):
    self.cursor.execute("UPDATE table_1 SET v1 = ? WHERE guild = ?", (1, value_1,))
    self.conn.commit()
brazen charm
#

what is still returning none?

#

also, cursors should be short lived

#

you should make a new one effectively for each query

#

rather than as a class var

kindred hawk
# brazen charm *what* is still returning none?
  def check_v1(self, value_1) -> bool:
    cursor.execute('''SELECT guild FROM table_1 WHERE v1 = ? AND guild = ?''', (1, value_1))  
    value = cursor.fetchone()
    print(value)
    if value is None:  
      return False  
    return True

This ^

kindred hawk
kindred hawk
finite lily
#

All, trying to deploy something and noticed my tests were failing. Apparently its a DROP DATABASE command. Using Postgresql. Noticed I was on 12 so updated to 16

#
psql (16.0 (Ubuntu 16.0-1.pgdg20.04+1), server 12.16 (Ubuntu 12.16-0ubuntu0.20.04.1))
Type "help" for help.

amelia=> DROP DATABASE IF EXISTS foo WITH (FORCE);
ERROR:  syntax error at or near "WITH"
LINE 1: DROP DATABASE IF EXISTS foo WITH (FORCE);
                                    ^
amelia=>
fading patrol
finite lily
#

Any ideas why I am getting syntax errors?

kindred hawk
#

But it was working in my previous code

#

Seems like it isn't fetching correctly or not inserting values into table

fading patrol
finite lily
#

yeah it just errors on WITH FORCE each time

#

if I remove it its fine

#

but wondering why

hollow oar
finite lily
#

yeah just realized that. I saw the 16 in the front and thought I was good to go

hollow oar
#

ah 16 is your client version

#

in fact it should have warned you saying you are connecting to PG 12 with a newer client version

normal frigate
#

Hi,
im working on a flask app which uses sqlalchemy.
Locally when testing, im using sqlite without problems. I even set up a local mariadb 11 server to test the production config, without problems.
When i deploy it to my server, which also runs mariadb 11, i get errors that the table already exists.

sqlalchemy.exc.OperationalError: (MySQLdb.OperationalError) (1050, "Table 'events' already exists")

database.py

from .config import Config
from flask_sqlalchemy import SQLAlchemy  # noqa
from sqlalchemy import create_engine, MetaData
from sqlalchemy.ext.declarative import declared_attr
from sqlalchemy.orm import declarative_base, scoped_session, sessionmaker


engine = create_engine(
    Config.SQLALCHEMY_DATABASE_URI, pool_pre_ping=True, pool_recycle=3600
)

db_session = scoped_session(
    sessionmaker(autocommit=False, autoflush=False, bind=engine)
)


class CustomBase:
    @declared_attr
    def __tablename__(self):
        return self.__name__.lower()


Base = declarative_base(
    metadata=MetaData(
        naming_convention={
            "ix": "ix_%(column_0_label)s",
            "uq": "uq_%(table_name)s_%(column_0_name)s",
            "ck": "ck_%(table_name)s_%(constraint_name)s",
            "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
            "pk": "pk_%(table_name)s",
        }
    ),
    cls=CustomBase,
)


def init_db():
    import apps.models  # noqa

    Base.metadata.create_all(bind=engine, checkfirst=True)


def update(self, data):
    # Update an object with the data provided
    for key, value in data.items():
        if hasattr(self, key) and value is not None:
            setattr(self, key, value)


Base.query = db_session.query_property()
Base.update = update

I dont understand why this can happen.

  1. why does my local mariadb server work, but the remote mariadb server (fresh install, dockerized) does not?
  2. Base.metadata.create_all(bind=engine, checkfirst=True) should add IF NOT EXISTS afaik - how can it even try to re-create the tables
odd zenith
#

If there is a caching mechanism for admin panel dashboards, do people use Redis? And if they do, what happens if a cached data gets deleted by LRU tactic and admin starts seeing old events?

#

like how all does that shit work

woven bobcat
odd zenith
#

just trust on the main database?

woven bobcat
# odd zenith so they dont do cache at all?

It depends,

Case 1: Data is not frequently updating in real-time

eg: records of user data

  • admin panel do cache, but, everytime before UI updates, or in a fixed interval it would check for updates.

  • or.., there would be a webhook, which notifies, if there is an update in database

Case 2: Data is updating real-time frequently

eg: stock market

  • it's based on, like I said, timestamps, and it always updates
odd zenith
#

And they get updated frequently

#

there isnt one admin, there will be more than one (expected, might be one)

woven bobcat
#

You can check for yourself which case it comes under

odd zenith
#

Yes, i host the admin panel from the backend directly, I just mount the panel to FastAPI app.

#

i can connect to redis or postgresql

odd zenith
#

like there is 2-ish month development time in ahead of us,

woven bobcat
#

It's better to be safe than sorry ๐Ÿ˜…

odd zenith
#

i can populate with bunch of random data

coral wasp
# odd zenith Probably case 2? I expect a lot of records, users, events, transactions, ban app...

Even in case 2, sometimes data needs to be pre-computed and pushed to a cache rather than live-queried. ie: If you're drawing stock market candles with a 1 minute granularity, you could pre-compute (denormalize) and cache the OHLCV candles, rather than querying/constructing the results directly. Or, adding a materialized view that has a refresh interval or refreshes on certain events, etc.

odd zenith
#

That makes sense, but i am worried about how will I cache things.

coral wasp
#

Are you perhaps thinking about premature optimization? Dropping caching in is relatively easy, once you ahve an understanding of your bottlenecks.

#

You'd be, perhaps, surprised at how well databases perform under reasonable load: often caching is not needed.

odd zenith
#

Maybe yeah, main PostgreSQL database might be more than enough

#

With the admin panel and the backend itself, we will do like, what, 500k row reads monthly? (again, expected)

coral wasp
#

There's also lots of things you can do within the database to optimize (namely: indexing, partitioning, materialization, etc), before you even get to caching.

odd zenith
#

Thanks a lot

wise goblet
#

Plus there is caching available at the level of reverse proxy Nginx

#

Plus there is caching available at the level of CDN

#

Too many caching layers, at this point worry becomes how to invalidate them ๐Ÿ˜…

glass rose
#

hey can anyone explain this diagram a little pleaseItsukiBorgarPlease

pure mortar
#

@coral wasp thought you would be interested. i found this insightful:

coral wasp
pure mortar
coral wasp
pure mortar
#

oh yeah that would be dope

queen rose
fallen vault
#

Im trying to count records based on a partial information from a column. I have a date column with the format mm/dd/yyyy and i want to pull data based on a specific year. I know i need count and in but im having trouble with the parsing the year out.

#

Or should i just do this with a python loop?

unkempt prism
fallen vault
#

Sqlite3 .

unkempt prism
fallen vault
#

Iโ€™m using strings

unkempt prism
#

so you can parse it into the date type and then format it to how you need it.

pure mortar
unkempt prism
#
SELECT strftime('%Y', you_col) FROM your_table;
fallen vault
#

I donโ€™t think sqlite3 has a date data type.

#

At least I donโ€™t remember seeing it in their website.

wise goblet
fallen vault
#

Iโ€™ll have to look into like, thatโ€™s a new statement for me. And I just realize Iโ€™m using test data, I donโ€™t remember if the software generated it or I dumped it. If I dumped it, the software may create a different format.

#

Thank you.

balmy ice
#

is it difficult to link py with sql for databeses and stuff

storm mauve
#

not really, there's a sqlite3 library built-in you can use from the moment you install python and most databases have python drivers

wise goblet
#

It is entirely another question how good, maintainable or performant result will be though

fading patrol
unkempt prism
fallen vault
pastel wren
#

is this the general form of doing bulk updates for sqlalchemy orm?

session.execute(update(magicInfo),bulk,)# bulk is [{id:1,"set_names":"abc,def"},{id:2,"set_names":"hif,klm"}
wise goblet
#

oh... also... how the hell to make SQL update in bulk if raw SQL is not supporting it ๐Ÿ˜…

#

hehe, under the hood it is not very bulky as expected

UPDATE employee SET name=? WHERE employee.id = ?
[...] [('scheeks', 1), ('eugene', 2)]
UPDATE manager SET manager_name=? WHERE manager.id = ?
[...] [('Sandy Cheeks, President', 1), ('Eugene H. Krabs, VP Marketing', 2)]
#

Probably at least has advantage of submitting query over single network query

torn sphinx
#

Hi

#

I need a help to make database of song

#

How to Start?

#

So I can play Multiples of songs

#

This is my website, and I'm working on it

#

Help me please

simple barn
hearty siren
#

I am using sqlalchemy using uselist = False should have made it one-to-one?
but on erd diagram(pgadmin 4) it still shows one-to-many
i want it to be one-to-one with out using the unique constraint in UserSettigs.user_id

class Users(Base):
   __tablename__ = "user"

   id: Mapped[int] = mapped_column(primary_key=True)
   tg_id: Mapped[int] = mapped_column(nullable=False, index=True, unique=True)

   settings = relationship(
       "UserSettings", backref="user", uselist=False, cascade="all, delete-orphan"
   )

class UserSettings(Base):
   __tablename__ = "user_settings"

   id: Mapped[int] = mapped_column(primary_key=True)
   is_premium: Mapped[bool] = mapped_column(Boolean, default=False)
   is_banned: Mapped[bool] = mapped_column(Boolean, default=False)
   enable_notifications: Mapped[bool] = mapped_column(Boolean, default=True)

   user_id: Mapped[int] = Column(Integer, ForeignKey("user.id"),unique=True)

   )```
graceful widget
#

does anyone know how do i get to here

#

i found this image on stackoverflow

torn sphinx
graceful widget
#

does this mean i did the PATH thing correctly?

#

look like i forgot the password ๐Ÿ’€ how do i change it

graceful widget
#

i reset my password and its still wrong =.=

#

i use the change scram-sha-256 to trust method

#

i try finding but cant find it where do i change it

sterile pelican
#

Currently I don't see anything wrong, just expand "Servers" tab

graceful widget
#

when i try to create a table it ask me for a password

#

and idk how to reset the password ;-;

sterile pelican
hearty siren
#

How to create 1 to 1 and 1 to many relationship using sqlalchemy

blissful yoke
#

Good Afternoon. Im trying to ceate an application which stores internships and gradschemes im applying to. Im using Pandas and openpyxl to store the database in an excel spreadsheet.

#

My issue is whenever i create a file, it is always corrupted, and i have no idea why

#

ill send the relevant code below:

graceful widget
sterile pelican
sterile pelican
#

What do you want to do

graceful widget
#

i wanna createdb match

#

then after i enter the password ERROR

sterile pelican
#

You want to create a database?

#

Create it from pgadmin

graceful widget
#

oo ok

blissful yoke
# blissful yoke ill send the relevant code below:

#this checks to see if the file exists or not:
try:
    oldData = pd.read_excel("Tracker.xlsx", sheet_name="Data")
    existingReminders = pd.read_excel("Tracker.xlsx", sheet_name="Reminders")
    mode = 'a'
except FileNotFoundError:
    oldData = pd.DataFrame()
    existingReminders = pd.DataFrame()
    mode = 'w'
#creates the writer: 
with pd.ExcelWriter("Tracker.xlsx", engine="openpyxl", mode=mode) as writer:
    toExit = False
    while not toExit:
        choice = int(input("1, 2 or 3: "))
        if choice == 1:
            newEntry(writer) 

#uploads data:
    if correct in ["YES", "Y"]:
        toTransfer = pd.DataFrame(data, index=[0])
        toSubmit = pd.concat([toTransfer, oldData])
        toSubmit.to_excel(writer, sheet_name = "Data")

#I am certain a sheet is being created, its just the file itself does not open
#Ive reached the cap, so will be putting the error after this message


#
#Here is the error:
Traceback (most recent call last):
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\Grad Tracker.py", line 166, in <module>
    newEntry(writer)
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\Grad Tracker.py", line 154, in newEntry
    updateEntry(data, data["Name"], writer)
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\Grad Tracker.py", line 97, in updateEntry
    if (not data["Second Stage"]) or data["Second Stage"] == "No Response":
            ~~~~^^^^^^^^^^^^^^^^
TypeError: string indices must be integers, not 'str'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\Necus\AppData\Local\JetBrains\Toolbox\apps\PyCharm-P\ch-0\232.9559.58\plugins\python\helpers\pydev\pydevconsole.py", line 364, in runcode
    coro = func()
           ^^^^^^
  File "<input>", line 1, in <module>
  File "C:\Users\Necus\AppData\Local\JetBrains\Toolbox\apps\PyCharm-P\ch-0\232.9559.58\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Necus\AppData\Local\JetBrains\Toolbox\apps\PyCharm-P\ch-0\232.9559.58\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", 
#
line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\Grad Tracker.py", line 161, in <module>
    with pd.ExcelWriter("Tracker.xlsx", engine="openpyxl", mode=mode) as writer:
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\pandas\io\excel\_base.py", line 1370, in __exit__
    self.close()
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\pandas\io\excel\_base.py", line 1374, in close
    self._save()
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\pandas\io\excel\_openpyxl.py", line 110, in _save
    self.book.save(self._handles.handle)
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\openpyxl\workbook\workbook.py", line 386, in save
    save_workbook(self, filename)
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\openpyxl\writer\excel.py", line 294, in save_workbook
    writer.save()
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\openpyxl\writer\excel.py", line 275, in save
    self.write_data()
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\openpyxl\writer\excel.py", line 89, in write_data
    archive.writestr(ARC_WORKBOOK, writer.write())
                                   ^^^^^^^^^^^^^^
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\openpyxl\workbook\_writer.py", line 150, in write
    self.write_views()
  File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\openpyxl\workbook\_writer.py", line 137, in write_views
    active = get_active_sheet(self.wb)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
#
 File "C:\Users\Necus\OneDrive\Documents\Grad Tracker\venv\Lib\site-packages\openpyxl\workbook\_writer.py", line 35, in get_active_sheet
    raise IndexError("At least one sheet must be visible")
IndexError: At least one sheet must be visible
#

any solution would be beneficial

fallen vault
#

how detailed can you get with sql? Ive seen someone use it to do advent of code.

torn sphinx
#

but does anyone have experience with making Servers, Databases and User Authentiction / Accounts

wise jewel
#

https://paste.pythondiscord.com/USUA

I made this query.

Yet i have a contract that had their last visit from attendancelist last week or in the last month that get state_id = 4.

I don't see what triggers this behavior.

Does anybody has any pointers?

wise goblet
wise goblet
#

It is common as dirt under foot thing for backend devs

torn sphinx
#

yeah, just asking if anyone can help really im making a game (unity with C#) and im trying to make server for it to send user creds too so if they need to reset password its possible

wise goblet
#

Because then u can reuse server code at desktop app game code, less complexity, automated typing validation

torn sphinx
#

dm me

wise goblet
#

Not doing dms, ask in public

torn sphinx
#

alr i was gonna ask if you would like to help with the game but alr

normal gale
#

anyone good at databases

wise goblet
# torn sphinx explain

lets supposed we make CRUD backend API that can accept JSON requests and answers with them at server.
If we use C#(same language) at server and backend

We could define Struct of data that is possible input for Request body, and struct for answer
https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/how-to?pivots=dotnet-8-0

using System.Text.Json;

namespace SerializeBasic
{
    public class WeatherForecast
    {
        public DateTimeOffset Date { get; set; }
        public int TemperatureCelsius { get; set; }
        public string? Summary { get; set; }
    }

    public class Program
    {
        public static void Main()
        {
            var weatherForecast = new WeatherForecast
            {
                Date = DateTime.Parse("2019-08-01"),
                TemperatureCelsius = 25,
                Summary = "Hot"
            };

            string jsonString = JsonSerializer.Serialize(weatherForecast);

            Console.WriteLine(jsonString);
        }
    }
}

With using same code for backend and desktop app
Desktop app could import same Structs that server app uses
and to send correct data from desktop app with validation at the level of compiler that u used correct input fields, and u can reuse backend code for Struct answer to deserialize recieved in request answer from json back to C# struct

#

this alone simplifies situation and making u using all the time correct input/output fields to communicate between desktop and server
Making possible refactoring easily (renaming fields), adding/removing freely fields for communications, and all without super need for unit tests

torn sphinx
#

yeah i would but my friend knows c# and i only know python

#

so thats why i want the server in Python so i can make it

wise goblet
#

it will autogenerate documentation for your API

#

if u will use framework at a good capacity, friend will always know automatically possible input and output and possible endpoints

torn sphinx
#

thanks, can i add you i can ask for advice whenever rather then having to check this server all the time

wise goblet
wise goblet
torn sphinx
#

alr

fallen vault
#

I have this query
SELECT SUM(pulls) from toning group by job that returns a list of numbers (pulls). How do i average the data thats returned?

grim vault
fallen vault
#

Thank you.

#

Didnโ€™t realize Sql could get so complex. And this is the shallow end lmao

knotty kindle
#

which database best for web developmnenr

wise goblet
#

Sqlite3 good for pet projects

knotty kindle
#

oh thanks man

snow marsh
#

how to check table is exists or not in sqlite3

snow marsh
grim vault
snow marsh
#

i usually depend on the error lol

try:
    for svinfo in range(0,3):
        cur.execute(f"CREATE TABLE sv{svinfo}({', '.join(str(v) for v in svbase)})")
    

# This means that the db exists
except sql.OperationalError:
    pass
#

but that looks reliable, thank you

grim vault
#

The CREATE TABLE statement does have IF NOT EXISTS if you want that, like:

CREATE TABLE IF NOT EXISTS mytablename
wise goblet
magic galleon
#

any SQL God here?

fading patrol
vivid trench
#

Thou walkest among sql atheists

deep heron
#

As a sanity check, I'll ask here. I have a program that uses SQL to access a MariaDB database. I have a few users that use the program, sometimes simultaneously. After one user commits, the other cannot see the new data until they either commit, or when they restart the program.
My understanding on this is that the default isolation level, repeatable read, runs on a snapshot grabbed at the first connection and updated with each commit. My options appear to be to have the program constantly commit wihout any changes, or change the isolation level. This is where my question lies.
If I am understanding correctly, I can use Read Commited to update the snapshot with each read. Is there a downside to doing this?
Thanks!

coral wasp
deep heron
coral wasp
deep heron
#

Oh I see. They use the program for (usually) minutes at a time, unless they are mass updating inventory. I do open a single connection and then use the same cursor for all transactions. Out of curisoity, and the fact that I'm new to SQL, what is the issue when keeping a transaction open?

coral wasp
#

The problem with a long-lived transaction is: the database is constantly changing.

#

Eh, there's lots of issues, actually. The way databases handle locks and the like, etc.

#

Generally, best practice is to do things in small chunks.

#

Use a cursor/transaction for when you need to update. Usually get in and get out.

#

Otherwise you end up with the isolation issues you're describing, as one side effect.

#

Isolation is a good thing, makes your code a lot more complex without it.

deep heron
#

So should I open and close the overall connection as well as the cursor, each time I do a read or write to the database?

coral wasp
deep heron
#

I see. I think the transactions are short. I mainly grab the information from the database I need to initialize the program, and then only hit the database for when doing searches and saving to the database.

novel dagger
#

Hey guys I got a Q. I have a column of lists in my dataframe, and I'm trying to pull the first value from these lists to make another column. The problem is that when it runs df['col_2'] = df['col_1'] [0] it uses the first value in the column, instead of the first value in every list in the column. How do I make it look at each row individually?

modern zealot
#

Learning sqlite im a beginner do u guys have any tips or other light weight data bases

storm mauve
delicate fieldBOT
#

@storm mauve :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 |            A  B
002 | 0  [1, 2, 3]  a
003 | 1  [4, 5, 6]  b
004 | 2  [7, 8, 9]  c
005 | 0    1
006 | 1    4
007 | 2    7
008 | Name: A, dtype: int64
river mantle
#

I use sqlite

#

A lot

digital coral
#
class Comment(Base):
    __tablename__ = "comment"

    id: Mapped[int] = mapped_column(primary_key=True)
    content: Mapped[str] = mapped_column(String(4000))
    parent_id: Mapped[int] = mapped_column(ForeignKey("comment.id", ondelete="CASCADE"), nullable=True)
    parent: Mapped["Comment"] = relationship("Comment", remote_side=[id], back_populates="children")
    children: Mapped[list["Comment"]] = relationship("Comment", remote_side=[parent_id], back_populates="parent")
    idea_id: Mapped[int] = mapped_column(ForeignKey("idea.id", ondelete="CASCADE"))
    idea: Mapped[Idea] = relationship(back_populates="comments")
    author_id: Mapped[int] = mapped_column(ForeignKey("user.id", ondelete="CASCADE"))
    author: Mapped[User] = relationship(back_populates="comments")
    reply_count: Mapped[int] = column_property(
        select(func.count(text("comment.id")))
        .select_from(text("comment"))
        .where(text("comment.parent_id = comment.id"))
        .correlate_except(text("comment"))
        .scalar_subquery()
    )

am i doing this right? (the reply_count) part

#

it's always returning 0

#

unlike this

class Idea(Base):
    __tablename__ = "idea"

    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str] = mapped_column(String(70))
    description: Mapped[str] = mapped_column(String(4000))
    author_id: Mapped[int] = mapped_column(ForeignKey("user.id", ondelete="CASCADE"))
    author: Mapped[User] = relationship(back_populates="ideas")
    likes: Mapped[list["IdeaLike"]] = relationship(back_populates="idea")
    likes_count: Mapped[int] = column_property(
        select(func.count(text("idea_like.id")))
        .select_from(text("idea_like"))
        .where(text("idea_like.idea_id = idea.id"))
        .correlate_except(text("idea_like"))
        .scalar_subquery()
    )
    comments: Mapped[list["Comment"]] = relationship(back_populates="idea")
    comment_count: Mapped[int] = column_property(
        select(func.count(text("comment.id")))
        .select_from(text("comment"))
        .where(text("comment.idea_id = idea.id"))
        .correlate_except(text("comment"))
        .scalar_subquery()
    )

which return the correct count of comments and likes

slender atlas
patent schooner
#

hey i am trying a login interface for my site and it says in the tutorial i need a server side script how do i create on in python?

wise goblet
# patent schooner hey i am trying a login interface for my site and it says in the tutorial i ne...

if you are just wishing to learn more stuff in a raw way, go through flask mega tutorial for that
https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-i-hello-world

Welcome! You are about to start on a journey to learn how to create web applications with Python and the Flask framework. In this first chapter, you are going to learn how to set up a Flask projectโ€ฆ

#

If you are doing smth for work... https://www.djangoproject.com/start/
I would recommend actually using django for that.
More boilerplated and error safe, and more... devproof in general
(if u need rest API, it has ready to use plugins too)

patent schooner
#

yes i need to create a site in 2 days for a competition

wise goblet
wise goblet
#

result is messy, but it works

patent schooner
#

thanks !

wise goblet
wise goblet
patent schooner
#

ok so flask more complicated time consuming djiango easier shitier result less time

wise goblet
# patent schooner ok so flask more complicated time consuming djiango easier shitier result less ...

Flask is more raw, time consuming to setup from zero correctly, full flexibility to make everything. easy to screw up completely beyond recoverability.
Django is more idiotproof, everything is boilerplated (configured out of the box). At low skill level Django is easier to make something good enough in a fast rocket way (assuming u make it in Django way). Django cost is more rigid solution in a super long run (if you are expert level dev, u can feel contrained by it. Not really issue for your 2 days timeline at all)

patent schooner
#

ok

wise goblet
# patent schooner ok

From personal experience, half of time will be consumed by figuring out how to use correctly SQLAlchemy if starting with Flask :/
Although u can bypass this issue by actually just using raw SQL.

Django ORM has out of the box ready to be used with SQLite3(or could be switched to another engine)

snow marsh
#

how do i check if a table exists?

#

(sql btw)

fading patrol
wise goblet
snow marsh
wise goblet
snow marsh
wise goblet
snow marsh
#

Thank you

snow marsh
orchid apex
#

Hey, using sqlite in an async application. Is there a popular async ORM that any of y'all would recommend (or is an async even a requirement for an ORM)?

storm mauve
#

there are a few others too but tbh sqlite is so 'lite' that it probably wouldn't make much of a difference if you just used the normal standard library sync api for it, assuming that you do not have like dozens/hundreds of things trying to access it at the same time, in which case you really should consider using a bigger database like postgres

storm mauve
orchid apex
deep heron
snow marsh
#

im making a game, would it make sense to use a .txt file as a logbook (boot ups, log offs, sessions) and a sqlite table for game saves?

obtuse magnet
#

why not just use sqlite for everything?

snow marsh
#

im actually coming off of sqlite and using csv for my project

orchid apex
snow marsh
halcyon summit
#

Would anybody know where to retrieve data about revenue per square m of a specific restaurant?
Im trying to find out how the position of the restaurant correlates with its revenue, and I was wondering where I can find such data.

obtuse magnet
#

proly annual reports of restaurant businesses

fading patrol
hollow oar
# halcyon summit Would anybody know where to retrieve data about revenue per square m of a specif...

for a single restaurant? or for a brand of restaurant?

if your meant the brand as a whole and that they are listed company then as mentioned above - quarterly/annual report is your best bet.

if you meant revenue broken down by individual restaurant of the brand, you are probably looking for spend data, you are unlikely to find spend data for free (if you found it, please ping me as i am also interested)

(also for location of the restaurants, open street map will likely have a almost complete list)

wise goblet
#

Saving Objects in Java ๐Ÿ˜…

#

Otherwise in python... closest achievable thing with Pydantic. All pydantic objects are serializable/deserializable too.

wise goblet
#

because i will be able to achieve saveability at a lesser code effort and same code is bidirectional to save and load itself

wise goblet
waxen finch
wise goblet
#

looks like the closest analog to java inbuilt serializer

#

going to help a lot in one tricky place at my library

#

i need to serialize deserialize very complex object of uknown structure. Pickle looks like will manage to do a better job there

#

it works for now with just json lib though, hopefully i will not need pickle may be

wise goblet
#

i would say though picke and pydantic serve different roles

#

pydantic is using json under the hood

#

it could work with pickle too

#

pydantic is better structs, json and pickes are different serializing engines

#

there is some interpolability i can admit, but still they serve different intentions

coral wasp
brittle lava
#

hey

#

what library you use for make a crud in python?

fading patrol
#

An ORM is not strictly necessary though

#

And then what kind of interface? Is this a web backend, CLI or GUI

brittle lava
#

GUI

#

GUI or web

#

to make a crud using cli I use mysql.connector

#

how long have you been working with this?

wise goblet
#

Very CRUD boilerplateful

shut tiger
native valve
#

just woke up to almost all my mongodb collections droppped and a single object in my database asking for 0.0125 BTC to decrypt my data ๐Ÿ˜ณ

I was just running this docker-compose.yml:

version: '3.8'
services:
  mongodb:
    command: mongod
    image: mongo:latest
    ports:
      - '27017:27017'
    volumes:
      - data:/data/db
volumes:
  data:

I was shocked to discover that even though I didn't set my firewall, ufw, to allow remote connections on 27017, some thing did anyway. I am still not sure how this happens, but I believe docker just bypassese/overwrites whatever ufw configurations exist and just edits iptables by itself? I don't know. But I want to disallow remote connections to my docker container. I tried adding --bind_ip 127.0.0.1 to the mongod command above (in the hopes of explicitly not binding to 0.0.0.0), the service starts up, it waits for connections, but when I try to connect with connection string mongodb://localhost:27017, it times out, not even refuses connection.

I am at a loss here. How do I setup a mongodb instance with a docker-compose file such that only local connections are allowed?

#

thankfully this is just a toy project of mine, but it still feels like almost being hit by a bus. I can totally see some poor unaware soul just blindly running this docker compose file in production expecting to be safe.

brazen charm
#

ah yes the good old docker ufw ๐Ÿ˜…

#

Basically

#

Docker bypasses your firewall, specifically it bypasses UFW by default

#

you need to configure your iptables to have it not route to the outside world beyond the firewall

#

but also i'd recommend doing


    ports:
      - '127.0.0.1:27017:27017'

as a standard practice though since it's rare you actually want your container exposed to the outside world

native valve
#

spooky!

native valve
#

thanks for the heads up i didn't know about this ufw/docker thing

brazen charm
#

side note, but you should definitely also add authentication to you database anyway

native valve
#

noted

native valve
wise goblet
native valve
#

i just need to live dangerously.

modest fjord
#

I just read this article and I want to know if this seems like a good idea to anyone else?

little gyro
#

help me

#

help meee

potent spire
#

how I can make a list in MySQL

#

like I want to store some ids

#

and then check if an Id in it

#

and also add Ids to list

#

I can do that with MySQL?

wise goblet
#

don't try to make list ๐Ÿ˜…

potent spire
#

I think Json for this is better

wise goblet
#

let SQL engine work for you properly

#

don't break it with usage of list/json stuff if u don't understand what u are doing

#

create separate table and link values via foreign keys

#

that is best recommended way, good default working

#

in SQL databases u need to think trice if u really want lists and jsons

#

they break proper functionality of database design and must be avoided unless there is some specific usage case that can justify it...
...the trick it is very rarely when it can justify it

tardy elbow
#

Hi im new, somebody knows where can i learn python for data analysis?, especifically, pandas, numpy. and matplotlib

fading patrol
# tardy elbow Hi im new, somebody knows where can i learn python for data analysis?, especific...

The options are endless so try whatever you like. This intro course is totally free. https://www.freecodecamp.org/news/how-to-analyze-data-with-python-pandas/

I also like interactive apps like DataCamp and DataQuest

Ultimately there's no substitute for real projects though. Once you know some basics try Kaggle competitions and other projects like that

freeCodeCamp.org

Data Analysis is an in-demand field but it can be hard to get into as a beginner. We've just released a 10-hour beginner-friendly video course to teach people how to analyze data with Python, Pandas, and Numpy. This course offers a coding-first introduction to data analysis. Besides the video content,

golden vector
#

Is it generally a better approach to store images as blobs in a database, or store them locally on the file system, for a bot? I am developing a discord bot that will hold items with images, that will be sent as embeds on a discord. I just don't know if the images are stored permanently, until the message is deleted. The database would be PostgreSQL.

wise goblet
#

it is great antipattern trying to store images in relational db directly

#

but in blob storage like S3 this is fine and meant to be

#

otherwise if your app is super is super simple, just store in filesystem (and keep link in relational)

golden vector
fading patrol
wise goblet
brazen charm
# fading patrol I swear I remember seeing some research by Microsoft that I can't find right now...

If images are under 1MB they are typically more performant in a wide column DB than blob store yes, I don't thing relational DBs are as good at is because they (at least in postgres) start using TOAST pages which require additional hops around the disk to fetch each row. But things like Cassandra can effectively be told "Here's this blob, dont both compressing it" which ends up being pretty damn efficient.

wise goblet
#

CQL ๐Ÿ™‚

coral wasp
queen rose
#

the space complexity of the query: SELECT DISTINCT col FROM table LIMIT :N is always the limit-value, regardless of the cardinality or the COUNT(*) of col, correct?

#

cause I'm doing this query with datafusion on an S3 dataset (parquet) and the query is taking a long time and 3-5GB of RAM, even tho I'm only selecting the first 10 values,
and I know for a fact that any parquet file has atleast 10 unique values, so it should just read the first file and return instead of doing a full scan ...

brazen charm
#

but I suspect datafusion is probably making a very intelligent query plan

#

Probably test with duckdb and see how that performs?

fringe sundial
#

guys whats best way of using db in fastapi,i am using asyncpg as cdriver

fading patrol
queen rose
#

with duckdb I can at least reduce memory usage trough the settings, with datafusion I havent been able to configure it

brazen charm
#

So that probably suggests your parquet files may be difficult to only do partial reads with

#

how big are the files?

queen rose
#

the thing that drives me nuts, is that all 10 values are present in the first file fo the Pyarrow dataset ((x in pd.read_parquet(dataset.files[0])['col'] for x in out['col']))

#

it did a full scan when all the necessary values could be found in the first partition/file

brazen charm
#

Does data fusion actually give you a query plan?

queen rose
queen rose
fringe sundial
queen rose
#

I found the the query plan generated by datafusion, for the statment: SELECT DISTINCT my_column FROM dataset LIMIT 10 (

# sql.logical_plan()
Limit: skip=0, fetch=10
  Distinct:
    Projection: dataset.my_column
      TableScan: dataset

# sql.optimized_logical_plan()
Limit: skip=0, fetch=10
  Aggregate: groupBy=[[dataset.my_column]], aggr=[[]]
    TableScan: dataset projection=[my_column]

# sql.execution_plan()
GlobalLimitExec: skip=0, fetch=10
  CoalescePartitionsExec
    LocalLimitExec: fetch=10
      AggregateExec: mode=FinalPartitioned, gby=[my_column@0 as my_column], aggr=[]
        CoalesceBatchesExec: target_batch_size=8192
          RepartitionExec: partitioning=Hash([my_column@0], 8), input_partitions=568
            AggregateExec: mode=Partial, gby=[my_column@0 as my_column], aggr=[]
              DatasetExec: number_of_fragments=568, projection=[my_column]
#

@brazen charm

coral wasp
simple barn
brazen charm
#

The fact they call it a database really annoys me somewhat

#

It's be better off being called a gloried set of text files

brazen charm
#

pyson is:

  • not a database
  • not atomic
  • not crash safe
  • not efficient
  • not concurrency safe
  • not storage efficient
queen rose
queen rose
coral wasp
queen rose
#

Bruh

queen rose
coral wasp
#

All I see from@that bug is โ€˜lots of timeโ€™ and โ€˜most of my ramโ€™. How many rows/how large is the parquet file? How does it perform differently if you run against a local parquet file?

#

(I doubt anyone will pick up that bug as it lacks anything tangible or comparative or reproducible)

fading patrol
coral wasp
wise goblet
coral wasp
#

What's tough for me is I'm often trying to do stuff that's just hard in SQL to begin with: stuff where I might be using some esoteric features, new functions, etc, where I'm running ahead of what's reasonable to expect from an ORM. This is maybe a side effect of OLAP stuff tending to be on the bleeding edge.

simple barn
queen rose
fading patrol
merry lake
#

ะป

toxic oasis
#

Is it bad to store data in 150k json files

wise goblet
# toxic oasis Is it bad to store data in 150k json files

it depends on what u store. if datasets, then csv is great
if application data, then better using stuff like postgres (or at least sqlite3, or smth like duckdb... u get the idea)
Otherwise... 150k json files i guess are acceptable for datasets (but not for app data)

fair cloud
#

BigQuery SQL: Can someone more knowledgable than me explain why I often see examples using ROW_NUMBER() OVER or RANK() OVER in conjuction with PARITION BY to get the most recent row instead of doing a GROUP BY and choosing MAX(ROW_CHANGED_DATE) (of course we have a row_changed_date column)

sour roost
#

when should I create sessions? persist it throughout the whole app or one for every request

#

(sqla)

coral wasp
fringe sundial
fading patrol
polar ether
#

Hi everybody, I'm trying to create unit tests for a function which uses a pymongo db. My idea was to use an embed db by mocking it with mongomock but nothing of what I tried works. I want to do something like this:


    def setUp(self):
        self.mongo_client = mongomock.MongoClient()
        self.db = self.mongo_client['test_db']

    def tearDown(self):
        self.mongo_client.close()

    @patch('app.config.database.db')
    def test_payment_body_creates_successfully(self, mock_db):
        mock_db.return_value = self.db```
This is the config for my db
```from pymongo import MongoClient

client = MongoClient(host=mongo_host, port=mongo_port)
db = client[str(db_name)]```
#

The idea is for the tests to be runnable without having mongo up locally

wise goblet
#

ORMs are a mess... and if u know how it works under the hood... u gain... a lot of productivity (10-1000X times)

#

not every ORM is a mess though (i am mostly talking about Django ORM as a super abstractional ORM mess)

#

Python SQLALchemy or Golang Bun are a different story

#

they are faaar less abstracting raw SQL from its usage

#

they just compliment it with language features to do it in a more static typed/nicer/structured way

#

despite its issues, Django ORM is still awesome (hey, a lot of boilerplating for CRUD) (But as i mentioned again... u really have to know how operates raw SQL first highly preferably)

somber ember
#

does this belong to normal level, good, or bad (or genius)?

concat('{"someKey":', case substr(regexp_replace(someKey, 0, 2) when '[[' then regexp_replace (regexp_replace(regexpt_replace(someKey, '\\\\|\"\\[|\\]\"', ''), '@@@', '\\\\') when'["' then concat('[', regexp_replace(regexp_replace(regexp_replace(someKey, '\\\\{3}', '@@@'), '\\\\|\"\\[|\\]\"', ''), '@@@', '\\\\'), ']') else '[""]' end

it's in hql but just wanted to share. We are migrating database and need to update all our queries, and this is what I need to work on.

EDIT:
I guess it's normal, since my new query isn't too far from this...

foggy rune
#

Has anyone worked with graphdb and graphql? Something like go gqlgen but not for rdb

Dgraph.io looks very interesting with native gql support.

Any opinions on how we can manage protobufs and their relationship with graph databases? wouldn't it be better to have a single source of truth instead of marshaling and unmarshaling via json?

cloud pond
#

Hey all does anyone here know DAX?

torn sphinx
cloud pond
#

ive got a powerbi question.. iv made this on excel... this creates a running total counting the amont of start dates and end dates on each calender date... eg there are 9 start dates from 01/07/22 to 05/07/22... anyways how do i do this in powerbi? pref using DAX

#

@torn sphinx

fair cloud
coral wasp
fair cloud
#

Well the problem I have with the window function is that it seems to return different number of rows every time I run it

#

whilst the group by does

#

which I don't get

#

I think it's because of hte order by clause in the window function

#

but I would have assumed that group by would have the same problem

coral wasp
#

Could you share an example? That doesnโ€™t make sense to me.

fair cloud
#

This is how I did it with the window function:

WITH FOO AS (
  SELECT
    RANK() OVER (
      PARTITION BY name
      ORDER BY date_time DESC
    ) AS rank,
    OTHER_FIELDS,
    ...
    FROM FOO_TABLE
)
SELECT ... 
FROM FOO 
WHERE rank = 1 and state = 'CREATED'
#

And this would return different results every time I run in. I believe it's because there is a tie in the ORDER BY clause and BigQuery just picks one randomly

coral wasp
#

Yah, well, that was what I was going to say is one problem with the groupby method

fair cloud
#

In some cases the state is CREATED other times it's DELETED and the WHERE clause filters out those

coral wasp
#

You can get the max value, but what if itโ€™s not unique

fair cloud
#

But when I use Group by I always the same number of rows

coral wasp
#

Secondly: youโ€™re filtering on created in the select but not window. So if most recent state is not created, youโ€™ll get nothing

fair cloud
#
WITH FOO AS (
  SELECT
    name,
    MAX(date_time) AS most_recent
  FROM FOO_TABLE
  GROUP BY name
)
SELECT ... FROM FOO 
INNER JOIN FOO_TABLE 
  ON FOO.name = FOO_TABLE.name 
    AND FOO.most_recent = FOO.date_time
WHERE state = 'CREATED'
#

That's fine this table is a log of the state of files in a google cloud storage bucket. A single file can have states CREATED, DELETED, CREATED, etc. but I always want the most recent entry so that I know whether it's in the bucket or not

coral wasp
#

I mean, if thatโ€™s what you want, then you could just get the last() value via window and skip the whole rank thing and drop the cte

#

Oh, nm, you just want the last entry, so yah, nm

red lagoon
#

Is there anybody who had problem at connecting mysql with python3 in Mac M1 and got NameError: name 'mysql' is not defined ?

wise goblet
#

if u would have used strict mypy/pyright, it could be not an issue to you

red lagoon
#

oh you are right lemon_zipped

digital walrus
#

Hello, I have three list of dicts. called vpc, po, and eth,
Now, these are somewhat related. An item in vpc, can be related to a po, a po can be related to an eth.
I want to visualize this somehow, maybe networkx or similar.
all the dicts in each list is unique, and the are related by name. like vpc_member: 'po2', or similar.
where do I even start? ๐Ÿ™‚ suggestions

coral wasp
#

Unclear what youโ€™re asking, are you asking how to visualize this? Or put it in a db?

digital walrus
#

Visualise it. Not sure why I put it in databases channel,

coral wasp
digital walrus
sand zephyr
#

Anybody know where I can get a psycopg2 wheel for python 3.12???? For windows 10.

waxen finch
sand zephyr
# waxen finch the official [psycopg2-binary](<https://pypi.org/project/psycopg2-binary/#files>...

I can't seem to get that to work. ร— Getting requirements to build wheel did not run successfully.
โ”‚ exit code: 1
โ•ฐโ”€> [5 lines of output]
running egg_info
writing psycopg2_binary.egg-info\PKG-INFO
writing dependency_links to psycopg2_binary.egg-info\dependency_links.txt
writing top-level names to psycopg2_binary.egg-info\top_level.txt
error: [WinError 2] The system cannot find the file specified
[end of output]

waxen finch
sand zephyr
brittle lava
#

hey guys

#

Does anyone here use a pre-made database for their applications?

delicate cedar
brittle lava
#

for a database

#

you create all of models or you get any pre-made model?

fading patrol
brittle lava
#

ok

#

thanks

sand zephyr
wise goblet
# brittle lava you create all of models or you get any pre-made model?

The important part is keeping as a code for serious custom applications

Files of migrations in your git repository.
How they will be written it does not matter. They should be present as defined library versions or anything.

Django ORM migrations, SqlAlchemy Alembic migrations.

In Django we keep our migrations as a files of code of our applications
And third party installed libraries can run their own migrations from installed django applications (those are premade tables).

primal sparrow
#

hey guys!

#

I've started to make this project a few days ago

#

my goal is to make a weight tracker

#

and the required qualities are as below:

#

1-ask the user for weight

#

2-save the date of the added weight

#

3-show a diagram for the user progress

#

4-calculate bmi and ....

#

since I need to store the data long term I was loking for a way to do so

#

but I'm getting overwhelmed by the variety of file formats and their uses

#

until now I have csv,json and pickling in mind .

#

any suggestion?

wise goblet
#

๐Ÿ’

#
  • It will have strong data integrity between all data (if foreign check integrity is on)
  • Enforced data structure
  • Ability to migrate already existing data to new state (works well if u use solutions to keep migrations between app versions as a git commit files)
  • enforcing rules/constraints on saved data
#
  • easiest out of all Relational dbs
primal sparrow
#

is there an specific module i need to leanr to work with it?

fading patrol
wise goblet
# primal sparrow is there an specific module i need to leanr to work with it?

inbuilt https://docs.python.org/3/library/sqlite3.html sqlite3 lib is sufficient to work with in minimal capacity.

ORM like SQLAlchemy+Alembic(migrating lib) or Django ORM (just hook it in standalone mode), or any other solution is desirable to work easier with it through python classes and having migrating autoversioned

  • (I like in general Django ORM for migrating system, it works)
  • (SQLalchemy has better syntax to make queries for custom projects may be. I like its migrating a bit less but it is good enough too and somewhat more reliable perhaps)
  • haven't tried Peewee, having no opinions. Important to have good migrating system
    • checked docs... no support for introspection in migrating system (and smth else named is missing, database "versioning"?). i would not have chosen Peewee
lethal zinc
#

Hello guys, do u use Python in Excel?

fading patrol
lethal zinc
#

in the Microsoft Beta version now u can use Python in Excel

#

the libary for sure is Pandas

wise goblet
# wise goblet inbuilt https://docs.python.org/3/library/sqlite3.html sqlite3 lib is sufficient...

@primal sparrow in the end i will recommend using just Django ORM. for simple projects it is all that need it and it has everything configured out of the box
Django ORM simplifies dealing with SQL significantly more than any other ORM.

  • U get good enough code code structure (very life saving point for novices)
  • migrating system (that u don't need to configure like in SQLAlchemy) ๐Ÿ˜„
  • unit testing framework (again not need to configure anything, it just works)
  • integrations with other unit testing solutions to simplify it even further.
  • highly abstracted python class dealing syntax
  • ability to integrate with other boilerplating stuff

Dealing with SQLAlchemy will be kind of pain in the ass to make it all working correctly i think
(Or just using sqlite3 inbuilt library alone can be an option as well ๐Ÿ˜… )

wise goblet
#

Using Python to operate Excel from outside is acceptable idea to me
Or preferably putting data into DuckDB/Sqlite3 which is even better

untold seal
#

This isn't really in Python, but I feel like a more general understanding is needed here.
I am trying to setup Redis. I keep getting this error and I don't understand what it means. I don't get any errors in my editor, but when the code executues i just get this.

https://paste.md-5.net/iyoguwodal.bash

glacial current
untold seal
#

I know your not much of a java dev, but would the class being written in Java 7 be a problem while im in Java 21? I opened the package to see if Jedis was in there and it is, but its in Java 7.

glacial current
untold seal
#

It's alright, I'll go on to some forum site. This is just tiring, I've been at this for 4 days and I don't get how its not working.

coral wasp
sharp wagon
#

im triyng to use sqlite3 to compare elements from a column of jsons to a given element. I want to fetch all rows where the element is somewhere in the json. can I do that? I tried using json_extract but it doesnt seem to be working for me

#
        staffTask = cursor.fetchone()
#

and then i have a while statement which manipulates that row and then goes to next row until the staffTask = None. the issue seems to be that the staffTask is never getting set to anything

fringe sundial
#

guys i have this in db.py

class Connection:
    DB_URL = f"postgresql://{getenv('USER')}:{getenv('PASSWORD')}@{getenv('HOST')}:{getenv('PORT')}/{getenv('DATABASE')}"

    DB_POOL = asyncpg.create_pool(DB_URL)

async def get_db():
    async with Connection.DB_POOL.acquire() as db:
        try:
            yield db
        finally:
            await db.close()

i am not getting intellicense or i dont see a method like fetchrow, etc on the DB ,how to fix it?

@user_router.post("/register")
async def register(
    request: Request,
    DB: Annotated[Connection, Depends(get_db)]
):
    body: dict = await request.json()
    if not check(body):
        raise HTTPException(400, createError("Missing required fields",400))

    user = RawUser(
        email = body["user"]["email"],
        name = body["user"]["name"],
        password = body["user"]["password"]
    )
paper flower
fringe sundial
paper flower
#

Cursor is acquired from a connection

fringe sundial
paper flower
#

Actually there is a fetchrow method on a connection

#

I think you type hinted it as your custom Connection clas

plush quartz
#

i got an issue with mongodb

#

basically i want use it to make the fuction that could save userโ€™s friend code id
and i want it delete old id user signed before when user signed again
but it isnโ€™t deleted old documents

fringe sundial
#

guys what best of using sqlalchemy in fastapi? using asyncpg as driver

wise goblet
fringe sundial
wise goblet
fringe sundial
wise goblet
#

not known to me metaphore for programming.

fringe sundial
#

i mean like
1: just spread whole code and use bad practices
2: organise code into classes and use good practices

wise goblet
# fringe sundial i mean like 1: just spread whole code and use bad practices 2: organise code int...

at minimum i highly recommend using covering your database interactive code with pytest
highly preferably with running --cov module and ensuring having at least 80% coverage.
Factory boy integration with SQLALchemy is also great to simplify testing

If u want to increase quality of your code further, make sure to run strict mypy with SQLalchemy stubs

Enforce unit testing coverage and mypy via CI check for every commit.

If you are specially frisky. Consider wrapping database code into pydantic base model structs, with organizing code in... classes somewhat yeah... in order to define minimal strictly defined interface for your other code parts in terms what it needs from database.
The trick is not in using classes, the trick is minimizing amount of strictly typed variables your code needs in order to interact with database. That makes kind of nicer architecture considering how much ORM/database code is kind of fragile. (Pydantic variables are good)
That's more optional thing, consider to be not using it. Use unit tests first, and go for mypy first as thing that will bring you more benefit.

#

organizing code is usually more difficult for people so it is not very obvious how to achieve good and bad results in it
(In order for classes to work better, u need to minimize amount of local variables mutations u have. Don't change preferably __init__ configured instance attributes of a class once instance of a class was created)

fringe sundial
#

thansk you so much

glacial current
#

What is the preferred method of doing migrations in sqlalchemy ? Ive only used flask-migrate so far. Looking at alembic and https://sqlalchemy-migrate.readthedocs.io/ anyone have any recomendatioins/preferences on this ?

#

I would prefer simple over full featured because i have to make this pallet able to others on my team that are more ruby based. Ruby has a very straight forward migrate system.

floral surge
#

does anyone have an SQL cheat sheet somewhere, to reference for syntax when I'm fuzzy on details?

simple barn
floral surge
glacial current
wise goblet
glacial current
quasi vector
#

Hi everyone!
Is there anyone working on Databases here
I need a small interview for my masters assignment. Please DM me ASAP if possible.

potent spire
glacial current
wise goblet
cunning bone
#

Hello all, Iโ€™m looking to schedule a Python script to run, scrape a website, and store the data. Any advice on the scheduling/storing part of this?

sharp tangle
cunning bone
sharp tangle
#

Sorry i dont have

fading patrol
# cunning bone Any advice on the scheduling?

It depends on your use case. If you just need to run a single script at a certain frequency then a cron job may do. For more complex stuff you've got Celery, Airflow, who knows what else

wise goblet
#
  1. dumb solution is just using Cron. Zero monitoring, easy to make. Well, u can have some measure of low monitoring with logging at least, but logging in a good way is hard.
#
  1. infrastructure profficient choice to run rare to work jobs via AWS Lambda in event bridge (Great option if u know Terraform or Pulumi and how to make them work with AWS. book Terraform up and running is great to start learning it)
#

u pay only for time of your application running then, highly optimized in terms of money spending (can be less than 10 cents at a month for all your runs ๐Ÿ˜… )
Comes with inbuilt AWS monitoring)

#
  1. programmatic python cool choice to go with Celery. this option is good if u learned docker-compose at least preferably (book Docker Deep Dive is great to learn compose). Celery comes with periodic scheduler Beat. and capable to run its jobs through redis broker. has easy to install Flower Monitoring (it is important to have observability to your script runs!)
#
  1. there are more of course advanced options... AWS batch jobs, or mentioned airflow and etc, but their complexity jumps further and not really justified unless u have matching level of complexity. At somewhat limited capacity you can chain task perfectly fine just with Celery
#

there is also another option...

#

Just having While True and some solution fully built on using shared memory of your application and smart usage of multithreading and multiprocessing, optionally asyncio stuff too. It has its own advantages and disadvantages. Works great for me in golang with goroutines and its shared memory without GIL lock at least.
Great option in terms of... simplicity with full programmatic control to have if necessary multiple in memory periodic jobs. Somewhat lacking proper monitoring, but good enough logging can compensate here.

glacial current
plush quartz
#

basically i want use it to make the fuction that could save userโ€™s friend code id
and i want it delete old id user signed before when user signed again
but it isnโ€™t deleted old documents
How can i solve it?

@client.tree.context_menu(name='Save NNID')
async def save_nnid(interaction: discord.Interaction, message: discord.Message):
  user = message.author.id
  key = {'ussr': user}
  data = {
      "NNID": message.content,
      "user": message.author.id,
  }
  user_file = {f'nnid_id.{secrets.token_hex(6)}': data}
  db.nnid_log.update_one(key, {'$set': user_file}, True)
  await interaction.response.send_message("NNID saved!")
@client.tree.context_menu(name='Find NNID')
async def find_nnid(interaction: discord.Interaction, message: discord.Message):
    user = message.author.id
    key = {'user': user}
    mes = ''โ€
    for m in db.nnid_log.find(key):
        mes = mes + "\n" + m["NNID"]
        print(m['NNID'])
    await interaction.response.send_message(f"NNID found\n{mes}")```
torn sphinx
#

when i open a scripts on python it close auto

fading patrol
fallen vault
#

I'm wanting to get the index of a record based on its row.

#

when i use SELECT base_name FORM bases I get the following data Mixing Black Base Mixing Blue Mixing Green Mixing Orange Mixing Red Mixing Violet Mixing Yellow I want to return 4 for Mixing Orange. Any suggestions?

#

Or would i be better off converting the list to python and getting the index from that?

waxen finch
fallen vault
#

hmm that seems to auto sort my data. I guess im going to have to add a pid column. Thanks

obtuse magnet
#

Yes u should almost always have an idex column for each table

paper flower
obtuse magnet
#

Not necessarily

#

It's just a row number column

paper flower
#

It's recommended to have a primary key in all tables, and they're indexed

fallen vault
paper flower
fallen vault
#

Primary id. Itโ€™s just what I call any column I auto-increment.

paper flower
#

primary key shouldn't necessarily be an auto incrementing integer

#

But in a lot of cases surrogate primary keys are easier/better choice than a natural key

fallen vault
#

Thatโ€™s why I had it set to my base_name so I could also autocheck for duplicate bases.

viscid coral
#

can sql alchemy update non primitive arrays?

#

i have an array in class 'User' and whenever i try updating the order of it, the array order won't save

#

but if i try to add to it, it works

jaunty linden
#

has anyone ever used mongoose.SchemaTypes.Buffer as a schema type for storing the password and salt hashes? the stored values are in a different format than i'm used to (both String and Blob). They are hashed because they went thru a hashing function. that much i'm sure but i'm not sure about the end result.. it's not plain text but it's also not not plain text

paper flower
#

๐Ÿค”

#

Sqlalchemy most likely marks models as dirty on assignment to the fields

#

You can either use this extension or assign to your array field when you're mutating it

tall abyss
#

%%sql
select count(community_area_number) as Total_Reports, (select community_area_name from census_data where chicago_crime_data.community_area_number = census_data.community_area_number) as Community_Area_Name from chicago_crime_data
group by community_area_name
order by Total_reports DESC limit 1 ;

#

does this count as using a sub query?

coral wasp
#

Yah, it's a correlated subquery

tall abyss
#

good

#

that question was ass

coral wasp
#

Normally tho, if someone is asking for a subquery, you'd write a join

tall abyss
#

had another way to solve it without using subqueries spent last 2 hours trying to rewrite it

coral wasp
#

The problem probably wanted you to write: ```sql

select ...
from chicago_crime_data
join census_data
on ...

tall abyss
#

i dont recall learning something like that

coral wasp
#

? How could they teach subqueries without teaching joins?

tall abyss
#

idk but it didnt look like that

#

implicit joins

#

inner joins

coral wasp
#

actually, they probaly wanted this: ```sql

select ...
from census_data
join (select count(community_area_number), ... from chicago_crime_data group by ...)
on ...

tall abyss
#

we have never actually used the word join in our work

#

i can send you some example files to show you what ive been taught

coral wasp
#

if implicit, you're prob used to something like this: ```sql

select ...
from census_data,
(select count(community_area_number), ... from chicago_crime_data) crime_query
where ...

tall abyss
#

this question absolutely ate into my supposed break time on my timetable. Scheduled to break at 12 for 1 hour its now 13:30

brazen charm
#

Has anyone else noticed an increase in AWS S3 5xx errors within the last month or two?
Recently everything seems to slowly be catching fire and all AWS is saying is

Amazon S3 can't handle the request at that time

pulsar fiber
#

idk if this is the right chat but i have a homework problem do in 4 hours, could someone hop in a call with me and help me with this shit, need to Determine Distance of Robot Arm Movement from givin points

pulsar fiber
#

rad

past charm
#

It keeps on throwing the same error untill I terminate it manually while executing the command to convert the standalone mongodb instance to replica sets.
Does anybody have a fix?

queen rose
#

sqlite3 Vs Redis

#

do you guys think the former could beat Redis for SELECT * WHERE pk_col = x LIMIT 1 queries?

#

like for a table that has about 1M records

wise goblet
wise goblet
#

or at least somewhere from second request when it was already automatically cached, then there would not have been difference

wise goblet
# wise goblet Redis by default is in RAM memory db, with periodic saving onto disk. Sqlite3 is...

Addition to answer @queen rose
There exists also File system caching in Linux. for small dbs out of just 1M records it would have potentially equalized Sqlite3 with Redis, despite them being in unequal situations

File system caching in Linux is a mechanism that allows the kernel to store frequently accessed data in memory for faster access. The kernel uses the page cache to store recently-read data from files and file system metadata.22 ั„ะตะฒั€. 2023โ€ฏะณ.

#

to this equation both dbs have also indexes, but that's probably not important detail at this step

#

the important step that Redis is not usable as main database (because data can be lost if power surge happened)

#

and because it is noSQL crap not usable to be main db ๐Ÿ˜…

queen rose
#

ok I think I will still try SQLite3 because its simpler

#

do you know by chance if sqlite3 actually uses indexes/primary keys like real databases do?

paper flower
#

Honestly postgres is easy to set up, either in docker or on your pc pithink

#

But you could go with sqlite

queen rose
#

I love sqlite's simplicity

#

no need to add any dependencies, drivers, or start any CLI processes

paper flower
queen rose
#

but I will be reading (not writing) data from multiple proccesses cuncorrently

#

should be fine, right?

paper flower
#

Also personally I just use ORM these days

paper flower
wise goblet
paper flower
#

Yep, they're disabled by default

wise goblet
#

also, if you are going to use stuff like max_length in varchars of ORM, it will be ignored. Sqlite3 is more simple
It does have constraints at least though

queen rose
# paper flower Also personally I just use ORM these days

ugh I hate ORMs, I like SQLalchemy but I use the 'core' instead of ORM.
I basically enjoy the OOP way to create queries (select(Person).where(Person.name == 'abc')) but I dislike how the ORM will try to sync the DB to the state in the program and vice versa

paper flower
#

And in a lot of applications you'd want to map everything into an object anyway

#

You could write something like this btw:
select(Person.id, Person.name, Person.email).where(Person.name == 'abc')

#

Basically works like core

queen rose
paper flower
queen rose
#

ok got it

paper flower
#

You're using ORM model

queen rose
#

no why?

paper flower
#

And queries are unified since 1.4/2.0

queen rose
#

why? because you create a table/class Person?

paper flower
#

Person is an orm model. tables are a bit more annoying to work with

queen rose
#

is that considered ORM?

paper flower
#
class Person(Base):
    ...

Is orm

#
from sqlalchemy import Table, Column, Integer, String

user = Table(
    "user",
    metadata_obj,
    Column("user_id", Integer, primary_key=True),
    Column("user_name", String(16), nullable=False),
    Column("email_address", String(60)),
    Column("nickname", String(50), nullable=False),
)

Is core

glacial current
# paper flower ```py from sqlalchemy import Table, Column, Integer, String user = Table( "...

What do they really mean by Core? Because these are class objects, (Table Column, Integer) that are describing table creation by sending

create table user (
  user_id integer,
...

I mean it has to translate that into different dialects of SQL, but isnt that, in itself, taking a set of objects and translating that into a a dialect of declarative table creation which would in some aspect be ORM?

paper flower
#

like Table

#

ORM provides features like declarative model/table definition, relationships, state management, etc

#

With core you can just declare tables and make queries to them, everything else is on you

glacial current
paper flower
#

You could use sqlalchemy's query builder: select(table.c.id).where(table.c.username == "Doctor")