#databases
1 messages · Page 30 of 1
anybody has recognizes this query before? I need to find the database that uses this syntax
{
"operator":"and",
"operands":[
{
"operation":{
"operator":"or",
"operands":[
{
"operation":{
"operator":"and",
"operands":[
{
"expression":{
"left":"type",
"operation":"equal",
"right":"stock"
}
},
{
"expression":{
"left":"typespecs",
"operation":"has",
"right":[
"common"
]
}
}
]
}
},
{
"operation":{
"operator":"and",
"operands":[
{
"expression":{
"left":"type",
"operation":"equal",
"right":"stock"
}
},
{
"expression":{
"left":"typespecs",
"operation":"has",
"right":[
"preferred"
]
}
}
]
}
},
full query: https://pastebin.com/NENzXH1r
Pastebin
Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.
This looks like a NoSQL query
It could be anything really, may even be elasticsearch query
I am devloping my first web app and i am stetting up my first database and its working very fine. So far so good.
I felling a bit unsure with my database scheme. I need to answear following questions:
1. Which job got the most scrap/bad parts?
2. Which article got the most scrap parts?
Normally the job and the article should be on the same layer in the pov of the Scrapbooking, but a Scrapbooking can only exist with a registered Job.
My solution is to give my ScrapBooking two foreign keys article_id and job_id. I think its not best practice, but i can't figure out a smarter solution. Maybe someone got an better solution.
Hey guys im having trouble predicting some values with my lstms
and im lost
Basically im trying to predict the number of people (n_termin_dist_quad_c1) in a certain time frame(DateTime) from a specific Grid_ID
The tutorials on lstms dont really help me in this problem
Perhaps you want to ask in #data-science-and-ml
Spark question: I know I can create a database/schema/etc in PySpark SparkSql, but I don't see any equivalent PySpark methods. Am I missing something?
edited to fix a typo; sorry for the confusion
equivalent to what?
If anyone can help me my server and client server license key checks out and send to client yet
!
In Spark SQL, you can say, "CREATE DATABASE x". I don't see the equivalent PySpark methods in the API doc.
Hey guys, I’m currently using SQLite and SQLAlchemy on a local database for an expense tracker app I’m making. I was wondering any good practices on storing “a database version” so I can keep track wether there are some migrations to do or not, I was thinking between the lines of having a version table with just one row but I’m not sure if that’s the best approach
Also, since I’m using SQLAlchemy, in the event of having to migrate or to update some schemas, migrations have to be hard coded in plain SQL and execute them with SQLAlchemy, right? Are there other good approaches for managing local DB migrations?
alembic is their first-party migration tool, to my understanding it automatically manages an alembic_version table in your db and uses python scripts to define upgrades/downgrades, optionally with multiple branches
https://alembic.sqlalchemy.org/en/latest/
cant say much more about it cause ive only used alembic in one project (and ive forgotten how to use sqlalchemy after not touching it for a while)
how can i set the "innodb_force_recovery = 3" to "innodb_force_recovery = 0"
i alreadddy tried openinng my my.ini file and trying to change it from there but there isnt any "innodb_force_recovery"
This might be a pretty dumb question, but do I need an AWS access key to download public files hosted on AWS?
wdym by this
I have an S3 link from where I need to download some data (which is public), in order to access the link and download the files do I need a ID/Secret? - I am used to using Linode, where I have an ID and key to download from my private repo, but there I create my own credentials @rare epoch
this might sound really dumb, but I am a bit lost
the last time i looked at my.ini it didn't have every option in existence there, so most likely you need to add a new line there, following the manual
https://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
look i get this error everytime
no u dont need an id or secret
What do I write in aws_acccess key etc then?
try leaving it empty
and you haven't touched this setting before? i got no clue how it would be enabled by itself, and i barely know anything about mysql anyway
like see, i have a command with which i can ban users from using the bot commands, since i did that to every command so they cant just use 1 command but every command, andd since then the error appears when using a command
@rare epoch I get unable to locate credentials then
can u take a full screenshot of where it is asking u to enter the access key
I mean I just get a "NoCredentialsError: Unable to locate credentials" error
correct me if im wrong but what u are trying to do is to access an s3 bucket that u have created in aws? if thats the case it should be quite simple if your s3 is set to to public. just need to copy and paste the link for that s3 bucket and you are able to access
Yeah, but how do I download it? Suppose I can just use requests?
not sure if this is super related to python, but my goal is to build my own ebook hosting server for myself and have a front end so i can access my books wherever. i know i could use calibre but i think it would be fun to do it on my own. anyone have any tips on best way to store epub files and their metadata if i want to use a python backend?
guys how can i change the innodb_force_recovery from 3 back to 0?
did you check if it was in the other option files?
https://dev.mysql.com/doc/refman/8.0/en/option-files.html
cant find them
im actually so lost, dont know if i cccan ever fix that trash
dont even know why it changed 0 to 3
damn
i dont know any best practices for serving files, but i guess it could be as simple as storing .epub files in a directory and recording their metadata in your database for more complex queries (whether it be sqlite, postgres, etc.)
Abstracting postgresql foreign tables in flask-sqlalchemy in a dataclass-esque way
I'm working on a webserver that's supposed to configure a testing tool. For this, I have two databases, the on for it's config, and the one for the data to run the tests on. Said second database contains tables with data I need to constrain config values with (both in that config db, as well as in the web backend). I got the foreign table setup working no issue, but I would like to query the foreign table in a similar way than I do with @dataclass.
Could someone please explain to me, how I'd go about doing that?
The most I found myself was a guide on how to handle creating/deleting those tables via DDLElement to have them called in db.drop_all() or db.create_all(), but that doesn't handle the querying of those tables data
How do I fix this error?
"C:\Users\kasperb\AppData\Local\Programs\Python\Python312\Lib\site-packages\pymysql\connections.py", line 644, in connect sock = socket.create_connection( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\kasperb\AppData\Local\Programs\Python\Python312\Lib\socket.py", line 852, in create_connection raise exceptions[0] File "C:\Users\kasperb\AppData\Local\Programs\Python\Python312\Lib\socket.py", line 837, in create_connection sock.connect(sa) TimeoutError: timed out
yeah i figured id potentially have to store metadata separately but wasn’t sure bc i think epubs are technically zips that have xml and opf files within for each the book content and metadata respectively. when you say directory you mean just system storage right?
yeah, in opposition to storing them in the database
this seems to be a decent comparison between either choice:
https://softwareengineering.stackexchange.com/a/150787
IMO, deeming the storage of files in the database or not as "bad" requires more information about the circumstances and requirements. Are the size and/or number of files always going to be small? Are there no plans to use cloud storage? Will the files be served up on a website or a binary executable like a Windows application?
In general, my experience has found that storing paths is less expensive to the business even accounting for the lack of ACID and the possibility of orphans. However, that does not mean that the internet is not legion with stories of lack of ACID control going wrong with file storage but it does mean that in general that solution is easier to build, understand and maintain.
Any chess databases you guys recommend?
Can anyone recommend resources to learn SQL and C# for developing Windows Forms applications?
should i use redis as a in memory database(without needing to connect to a server) for storing multimedia files(can be video, image and any other file)?
i want to optimise both speed & memory
or keydb?
redis and keydb both run as a server, though
yes, compared to doing something in-process. the benefits of having a server are for when you need caching across processes
well i don't wanna use a server, instead i want the keyDB's speed but locally(just like sqlite3)
if there is none
how can i convert it to a file that contains the data?
Normally such files are stored on some object storage system like an S3 compatible system.
mhm
Hello fellow programmers! I have an issue with sqlite3: my program works perfectly fine, while the same program doesn't work on customers PC.
The issue appears in a SELECT query, specifically in
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: no such column: username```
The username column IS presented in table, query for table creation:
```CREATE TABLE IF NOT EXISTS user_configs
(user_id TEXT, return_timestamps INTEGER, talkers TEXT, is_admin INTEGER DEFAULT 0, can_use INTEGER DEFAULT 0, username TEXT)```
Can anyone tell me a possible reason why such thing happens?
The general rule is that the computer is right and it's your job to figure out why your brain is confused.
Maybe you're looking at the wrong db?
That's a cool notion, I agree, but no, I am looking at right db. Actually there isn't a problem on my computer, the issue appears only on his computer. Can there be any dependencies, versions or some other kind of thing that his computer probably missing, by any chance?
if the table already existed without that column, it would not be dropped and recreated, and the new column wouldn't be added
we recreated the db with second query, still no result, though column is indeed in the table
how did you test that the column is in the table?
perhaps two unique indices on (company_id, name) and (company_id, barcode) is what you're looking for? either could be a compound primary key too
would (company_id, barcode) uniquely identify an item? that could probably be a composite key
generally CREATE UNIQUE INDEX ix_something ON (col_a, col_b);, plus NOT NULL constraints if you want to fully mimic a primary key
although a sidenote, do databases normally handle primary keys differently from unique indices + not null? i know in sqlite, WITHOUT ROWID tables cause the data to be arranged according to the primary key which wouldn't be true with a unique index, but are there similar impacts in other databsase systems?
oh wait, wikipedia mentions that too https://en.wikipedia.org/wiki/Unique_key
On some RDBMS a primary key generates a clustered index by default.
this is called a clustered index in other databases
i wouldn't recommend it in this case; clustered indexes are best when only one index is needed
like you would actively avoid making one of them a primary key in favour of two indices?
would name be unique for a given company_id? or is it only the combination of (name, barcode) that is unique
if you have multiple candidate keys, the choice of which is a primary key doesn't matter. so you could just pick one to be the primary key and the other to be an index
ah ok, thats what i was thinking
i thought you were suggesting to avoid a primary key where clustering would be implicit
TIL postgres doesn't cluster rows by default until you run CLUSTER with an index, possibly in a background job
i think in this case it would make sense to include all 3 columns in the primary key, since all 3 are needed to uniquely describe an item (unless i interpreted wrong ?), and create a separate index for the other ordering of name and barcode
like, (company_id, name, barcode) and (company_id, barcode, name)?
(company_id, name, barcode) and (company_id, barcode) would be enough. though if the items don't have any other attributes, you could choose to include name in the second index to take advantage of index only scans
^ so back to this then
yeah
o wait i didnt notice the third column in your message, so its slightly different
hm, without name in this second index, does this mean there is no guarantee of name being unique within a company?
or in other words the same name can be given two barcodes
the indexes don't need the third item. a primary key should be a minimal set of columns that uniqely defines the row
hmm i dont get it
if i inserted (1, 'foo', 'xyz'), wouldn't i still be allowed to insert (1, 'foo', 'zyx') afterwards?
(assuming PK and unique index defined by your message #databases message , for clarification)
also i just read this, which is incredibly amusing
According to the SQL standard, PRIMARY KEY should always imply NOT NULL. Unfortunately, due to a bug in some early versions, this is not the case in SQLite. Unless the column is an INTEGER PRIMARY KEY or the table is a WITHOUT ROWID table or a STRICT table or the column is declared NOT NULL, SQLite allows NULL values in a PRIMARY KEY column. SQLite could be fixed to conform to the standard, but doing so might break legacy applications. Hence, it has been decided to merely document the fact that SQLite allows NULLs in most PRIMARY KEY columns.
ah yes, good ol sqlite compatibility
oh. yes, that would be possible
huh. i thought you wanted only one item to have specific name for a given company id
nevermind. it seems you have what you want
hellooo! im new to this server so idk if this is how it works but i need help 😖
i'm trying to find a csv file to link in my code that looks exactly like this https://raw.githubusercontent.com/MainakRepositor/Datasets/master/Pokemon.csv but for a different topic ( but like something similar - eg LoL, Overwatch, etc). does anyone know of any or how to find them?
well, for example if I google "lol character stat dataset", I get https://www.kaggle.com/datasets/carralas/league-of-legends-champion-stats-922 as first link, which looks right.
(oh, and if that dataset didn't exist, you could scrape the same data from the big table on the LoL wiki)
As a beginner, I'm working on a python project where I need to use Redis as the primary database (for speed) and AstraDB as the persistant database (to permanently store all the data). Is there any easy way or workflow of doing this? If some data was not found in Redis, it should load the data from AstraDB into Redis. Similarly, it should write new data instantly into Redis and then also write the data in AstraDB before it vanishes from Redis. Is there a built in way or do I have to manually put if/else conditionals everywhere?
how much speed are we talking about
Im getting an "Error converting column "issue" to bytes using encoding UTF8. Original error: bad argument type for built-in operation" error when dumping my pandas df into parquet, anyone know how I can find the issue? Also is it possible to just replace all non-ut8 chars with " " or something?=
I mean, currently, the speed Redis is providing is perfectly working for me. I just wanna add a second permanent database because redis storage is costly.
I see. I'm asking because a lot of times people ask how to combine Redis and their database when they have like 5 queries a second where it's kind of unnecessary effort.
I think if you want to prevent duplicate effort in form of if / else conditionals you could write a helper function. Assuming you'd be using a SQL database, you could do something like
result = query_or_cache(f"users:{user_id}", "SELECT * FROM users WHERE id = $1", (user_id,))
with the function being something like
def query_or_cache(cache_key: str, query: str, params: list[Any]):
from_redis = redis_client.get(cache_key)
if from_redis is not None:
return json.loads(from_redis) # assuming it's json formatted
db_cursor.execute(query, params)
results = db_cursor.fetchall()
marshalled_results = json.dumps(results)
redis_client.set(cache_key, marshalled_results)
return results
Something like this would work, but the problem is that 1) you now need to have both the Redis client and the DB client available anywhere (doable and you should do it for easier testing) but 2) if you change the database format you need to invalidate your cache as otherwise your code will expect other things to be returned than that are actually returned (imagine adding a new column and then it fetches the old cache object).
These are all solvable but I think if as you say your current approach works for you then I would stick to it. How would the database be less expensive than Redis? They both need disk resources and I'm not aware of Redis being blissfully inefficient compared to disk
I can't thank you more. I appreciate your help from the bottom of my heart. This is all I needed. And yeah, I get it that you asked to save my efforts. The thing is my project is currently in beta, but we can expect a lot of users in a month or two, for which I need to prepare the database backend. I'm thinking around 70-200 requests per second right now. Also, would a CQL database work in this workflow? AstraDB seems to be a CQL database.
If you're in 2nf, and the relation is like R(K, A, B, C), and B is dependent on A, and C is dependent on B, what's the better way to change it to 3nf? Like this?
R1(K, A)
R2(A, B)
R3(B, C)
Or like this?
R1(K, A, C)
R2(A, B)
Should it be done the latter way so that B doesn't need to be in two places?
I‘m not familiar with what CQL is but I don‘t see why not
Also a thing to be aware of, as is the function above would also cache writes, which you probably don‘t want
Or you just call it on reads^^
what if a B can have multiple C's?
R1(K, A)
R2(A, B)
R3(B, C1, C2, C3)
vs
R1(K, A, C1, C2, C3)
R2(A, B)
Maybe I have it backwards what's redundant and what isn't
I'm not really sure
i don't mean multiple columns, i mean
create table books (
title varchar(300) primary key,
author text,
author_age int
);
``` if you have this table, you could have multiple records `('CLRS', 'C', 10)`, `('CLRS', 'L', 5)`, and so on, if a book has multiple authors. if you use the second version, then you could not represent this
But if C is functionally dependent on B then doesn't that mean each B maps to one specific C?
ah. yeah that is true
I'm now trying hard to figure out which option results in both less columns and rows combined, because I'm pretty sure that's the actual goal.
Not just less columns.
R1: every value of K times every value of A, plus
R2: every value of A times every value of B, plus
R3: every value of B times every value of C
vs
R1: every value of K times every value of A times every value of C, plus
R2: every value of A times every value of B
I think it's like this.
(k * a * c) + (a * b) vs (k * a) + (a * b) + (b * c)
if you have multiple books with the same author, then you're duplicating the author_age for each book. so after a year, you need to update the age (maybe we should just track the birthday, lol), so you need to update more than 1 record
author_age is a fact about the author, not the book, so it shouldn't be present on the books table in 3nf
I just randomized k, a, b, and c values and tested what had the better score in python, but is that cheating?
How do you figure out which of these expressions will be less just by looking at them?
In our practice exercises, they really are just called "A B C D etc"
it's not about minimizing columns. you're probably going to create more columns when normalizing
I'm trying to figure out how to minimize tuples now
if there's only one B for a given A, how does R2 have A * B rows in it? wouldn't it just be how many As there are?
Yeah you're right, it would just be the number of As
I made a mistake there
k + a maximum rows vs k + a + b maximum rows?
you can't just count the rows though, because you're duplicating data in the example i gave
(k * 3) + (a * 2) maximum pieces of data vs (k * 2) + (a * 2) + (b * 2) maximum pieces of data?
these are all 1:1? each C references exactly one B, each B references exactly one A, each A references exactly one K?
Yes
is this for a software application, or a data warehouse?
do you have any practical goals/constraints here, or is it just a thought exercise?
I think the original R(K, A, B, C) has the fewest distinct cells. All the other designs should be equivalent, but you have additional columns for the foreign key references
If they're all 1:1 you could just concatenate all tuples across with INNER JOIN + USING, resulting in the exact same output as the original "quadruple entity" design
Btw if I'm not understanding you right then I'm sorry about that, I know it's irritating to deal with people when they're like that. But my brain fog is really bad rn.
It's like I can barely think more than 1 or 2 levels deep if that makes sense
from db.base_class import Base
from sqlalchemy import Boolean, Column, Integer, String, DECIMAL
from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship
class Sport(Base):
id = Column(Integer, primary_key=True, index=True)
name = Column(String, nullable=False, index=True)
matches = relationship("Match", back_populates="sport")
player = relationship("Player", back_populates="sport", uselist=False)
class Match(Base):
id = Column(Integer, primary_key=True, index=True)
name = Column(String, nullable=False, unique=False)
number_of_teams = Column(Integer)
team1_id = Column(Integer, ForeignKey("team.id"))
team2_id = Column(Integer, ForeignKey("team.id"))
sport_id = Column(Integer, ForeignKey("sport.id"))
sport = relationship("Sport", back_populates="matches")
team1 = relationship("Team", foreign_keys=[team1_id])
team2 = relationship("Team", foreign_keys=[team2_id])
entry_fees = Column(DECIMAL, nullable=False)
class Team(Base):
id = Column(Integer, primary_key=True, index=True)
name = Column(String, nullable=False, unique=False)
number_of_players = Column(Integer, nullable=False, unique=False)
players = relationship("Player", back_populates="team")
class Player(Base):
id = Column(Integer, primary_key=True, index=True)
name = Column(String, nullable=False, unique=False)
team_id = Column(Integer, ForeignKey("team.id"))
sport_id = Column(Integer, ForeignKey("sport.id"))
sport = relationship("Sport", back_populates="players", uselist=False)
team = relationship("Team", back_populates="players")
is_active = Column(Boolean, nullable=False)
is_substitute = Column(Boolean, nullable=False)
is_inactive = Column(Boolean, nullable=False)
idk why alembic is trying to drop all tables in this schema upon running alembic revision command
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_index('ix_team_id', table_name='team')
op.drop_table('team')
op.drop_index('ix_player_id', table_name='player')
op.drop_table('player')
op.drop_index('ix_sport_id', table_name='sport')
op.drop_index('ix_sport_name', table_name='sport')
op.drop_table('sport')
op.drop_index('ix_match_id', table_name='match')
op.drop_table('match')
op.drop_index('ix_user_email', table_name='user')
op.drop_index('ix_user_id', table_name='user')
op.drop_table('user')
# ### end Alembic commands ###```
each A references exactly one K?
Actually not this.
I think that's what he was trying to point out.
what is the relationship then?
A -> B
B -> C
I think then that he was saying, with R1(K, A, C) and R2(A, B), A may appear multiple times in R1 with different values of C. But if you split it into three different relations instead then that won't happen.
Did I understand what you were saying right?
I'm getting filtered by this really hard.
what about K though?
I guess it's just K -> A, B, C
yes but 1:1 or 1:many or many:1?
the best way to deal with brain fog in my experience is to write everything out in as much detail as possible
I don't know about you, but the main issue with brain fog for me is that my working memory goes to absolute shit, so writing everything out and breaking it down into the smallest possible decision points can help
I'll need to do that probably.
Each K to one A B C and each A B C to one K.
yeah. i thought multiple Ks could have the same A
Actually, if each A goes to one B and each B goes to one C, then could an A go to two different Cs?
no
Oh yeah, but this could happen.
K1, A1, B1, C1
K2, A1, B1, C1
working memory
Yeah it's that. Really bad working memory. I actually had to do that once, I had to write out each step of an algorithm because I kept forgetting what step came next and where I was.
yes, so if you keep the table like (K, A, C), then you would duplicate C when you have multiple occurences of the same A
Like this except without the B1/B2
so it's all 1:1 then, no? K 1:1 A 1:1 B 1:1 C 1:1
or could two different As be related to the same B?
I fucked it up again, sorry. An A B C could go to multiple Ks.
There's no bidirectional stuff
Or whatever you call it when it's both surjective and injective.
I don't remember the terminology, but yeah, X -> Y but no Y -> X.
Anyways, I think I can try to finish this from here. Thanks for your help, @thorny anchor and @harsh pulsar, I appreciate it. I probably would have answered R1(K, A, C) and R2(A, B) and lost points otherwise.
previous question here:
I think SQL + some kind of cache (e.g. redis) would work fine for 200rps, but it depends on what kind requests you'd be getting
We can write up to more than a hundred thousand of rows per second into SQL db.
Get SQL db like Postgres ^_^
High chance u will not be able to go beyond its limits
even a hundred thousand is tiny for postgres
in my web app, the python event loop seems to saturate before the postgres connection does
the python-side database i/o latencies are like 10x what the postgres logs show, which indicates to me that the event loop is letting completed db operations sit around for a long time before being picked up
it's still good enough for what we need, but it's definitely kind of disappointing to see what looks like a severe bottleneck. i'd love to optimize it one day if possible
maybe 3.12 will help, not sure we can switch to pypy due to some data science dependencies
this is all true, but keep in mind the original question was about adding a database to persist what's in redis, not to add redis to cache what's in the database :D
the stock default event loop or other drop in replacement like https://github.com/MagicStack/uvloop ?
i'm using uvicorn already, which as far as i know already uses uvloop
gunicorn with uvicorn workers, load-balanced across 4 relatively small ECS task instances
not sure if the gunicorn + uvicorn setup is suboptimal
might be worth checking.
you have to install uvloop if you just brought in uvicorn without the standard optional group it seems
uvicorn/loops/auto.py line 3
import uvloop # noqa```
`docs/index.md?plain=1` line 130
--loop [auto|asyncio|uvloop] Event loop implementation. [default: auto]```
also the driver one uses matters, i always reach for https://github.com/MagicStack/asyncpg (to which i am kinda shock that it also doesn't bring in uvloop as a default)
I did add the "standard" extra, but I am using psycopg3 via sqlalchemy. The generated SQL looks how I want it so I think sqla is fine, but is asyncpg significantly faster than psycopg3?
We were originally using asyncpg but migrated to psycopg3 to make testing easier without making all our db setup fixtures and everything async
I'm not even 100% sure my diagnosis is correct, but the difference between the db-side request handling time and the python-side request-handling time is really suspicious
tbh i don't understand how asyncpg is that fast.. i have just stuck with it and never really did my own extended testing
serious question then: how do you know it's "fast" compared to anything else?
the throughput was higher from metrics that i was tracking + me trusting in my colleague's words + me trusting the author's benchmark
interesting. i'll have to try it, maybe i can get permission to A/B test it in prod (we don't have a good load testing setup at the moment)
class Base(DeclarativeBase):
pass
class BaseMixin:
@declared_attr.directive
def __tablename__(cls):
return cls.__name__.lower()
id: Mapped[int] = mapped_column(primary_key=True)```
class Dms(BaseMixin,Base):
sender: Mapped[str] = mapped_column(ForeignKey("users.username", ondelete="CASCADE"))
receiver: Mapped[str] = mapped_column(ForeignKey("users.username", ondelete="CASCADE"))
def __repr__(self):
return f"Dms(id={self.id},sender={self.sender},receiver={self.receiver})"```
Whenever I access Base.metadata I get: MetaData(). That's the reason the tables aren't being created.
Could someone explain to me why the Dms table isn't added to the Base.metadata?
hey i started learning coding recently and im kind of confused with some terms. could someone please explain me what does "data type" and type function in python means???
Dose anyone know of a simple db like shelve that is hosted online?
For free specifically
Only if you have it installed
You don't run any sync operations in async endpoints, right?
Also as ry mentioned - try asyncpg, should be a simple pip install + a db url change
Yes I am using the "standard" extra which unless I misread pulls in uvloop
There is some sync code (unavoidable) but it's trivial, on the order of microseconds to complete at most
I'm seeing some people say asyncpg is faster and some people say it's not. Like I said I switched from asyncpg already, I can try a benchmark but I'm skeptical that somehow psycopg3 with the binary protocol is that much worse
Just compare them 🤔
Haven't been able to reproduce in artificial load testing, doing actual proper load testing is on our roadmap but we are a DS team so we have actual DS work to do first
have someone ever used async sqlmodel? I wanted to look at some project using it. For get some examples
Hi! I'm using psycopg2 to connect to my PostgreSQL Database. However I don't want the user, key, database name and host to be hard coded. Is it safe to have them formatted in the connection function (psycopg2.connect()) like this ? con = psycopg2.connect( f"dbname={DB_POSTGRESQL} user={USER_POSTGRESQL} password={KEY_POSTGRESQL} host={DB_HOST}" ) The variables are stored in an .env by the way. Thanks!
yes, that's fine. you can also just store a connection url, something like postgresql://username:password@host
what specifically are you trying to do? SELECT IF EXISTS isn't a thing
checking if there is something in the table if exist associate that, else don't give me anything
but i realize i don't need it
i'm trying to create a cache for my macro, but even changing the cache algoryth
it don't change anything
guess i need to add something to my lark code
like, something releated to leaves or something
but even making all the big fat ass code of lark
i still lack some knowledge
that is where stocking variables becomes a problem
if someone have Larlk library knowledge please tell me
i really need
that rare knowledge
i'm basically mining for ancients scraps (I.E Minecraft)
you can ask precise questions to get answers for things you're confused about
are you talking about lark the tool for building parsers?
yeah
few peoples knows that
library
i can't imagine why you would need sqlite with lark
just for cache
sqlite don't need
but when you need cache
you need sqlite
but i made the test and changing the cache changes nothing
i noted that the program is executing the macro from left to right
and i don't know how i fix it
due to limited knowledge of Lark
Thx !
from sqlalchemy.orm import Mapped,mapped_column,declared_attr,as_declarative
@as_declarative()
class BaseMixin:
@declared_attr.directive
def __tablename__(cls):
return cls.__name__.lower()
id: Mapped[int] = mapped_column(primary_key=True)```
class Dms(BaseMixin):
sender: Mapped[str] = mapped_column(ForeignKey("users.username", ondelete="CASCADE"))
receiver: Mapped[str] = mapped_column(ForeignKey("users.username", ondelete="CASCADE"))
def __repr__(self):
return f"Dms(id={self.id},sender={self.sender},receiver={self.receiver})"```
The table isn't being added to the BaseMixin.metadata because when I access BaseMixin.metadata, I get:
MetaData()```
Can someone explain to me what I am doing wrong?
Can anyone help me with my tables on SQLAlchemy?
If you ask a specific question someone may answer. If it's complicated grab a help channel. #❓|how-to-get-help
Depending on the DB, UPSERT may be an option
what will you do if it doesn't exist?
@paper flower I know you're an expert with sqlalchemy. I've spent hours trying to figure this out but no luck.
Are you sure you're importing the module containing that Dms class before you try to use the metadata?
also, what version of sqlalchemy is this? You should be inheriting that base mixin from sqlalchemy.orm.DeclarativeBase
I ran a print statement of the BaseMixin.metadata after the creation of the Dms class and it returned an empty collection.
so which version of sqlalchemy are you running?
oh yeah, I think you need to call configure_mappers() first
anyways, try configure_mappers() first, then you should see something in the metadata
Where specifically do I need to put it in the code? I currently don't have access to my pc as of now.
Or if you want to reference the section in the documentation
that function fills in the metadata
it's usually not needed to be called explicitly: https://docs.sqlalchemy.org/en/20/orm/mapping_api.html#sqlalchemy.orm.configure_mappers
Thanks.
Do you have a specific example?
of what?
The mappers? Im sorry if Im not providing enough context because I'm on my phone.
I don't understand. I told you that you need to call configure_mappers() to get the metadata filled in. Do you need example on how to import that, or what?
Yes
I already linked the full import to you, do you know how to import things in Python?
Okay maybe I misread the documentation. Thanks for the help.
Im thankful
I have to test it out when I get back on my pc.
I think your Dms class just doesn't exist since you didn't import that module when accessing your Base.metadata
Also it should inherit from Base to be added into metadata
Just to be sure - put both classes into the same file and see if that works, if it does - your Dms class just didn't exist
@as_declarative()
class BaseMixin:
@declared_attr.directive
def __tablename__(cls):
return cls.__name__.lower()
id: Mapped[int] = mapped_column(primary_key=True)
class Dms(BaseMixin):
sender: Mapped[str] = mapped_column(ForeignKey("users.username", ondelete="CASCADE"))
receiver: Mapped[str] = mapped_column(ForeignKey("users.username", ondelete="CASCADE"))
def __repr__(self):
return f"Dms(id={self.id},sender={self.sender},receiver={self.receiver})"
print(BaseMixin.metadata)``` It still printed MetaData()
Any more context you need?
Try to not use a decorator and inherit from DeclarativeBase instead
@as_declarative()
class BaseMixin:
->
class BaseMixin(DeclarativeBase):
this is the file structure btw
File structure shouldn't really matter
I feel like I made some stupid mistake somewhere in the code and I can't find it
Did you try changing it to DeclarativeBase?
yes, still didn't work
Can you make a reproduction of this in a single file?
sure
I printed Base.metadata.tables instead and got this: https://paste.pythondiscord.com/OYFQ
I fixed the problem
I thought alembic created the tables but I thought wrong so I just added this: ```py
from app.db.base_class import BaseMixin
from app.db.database import engine
from app.models.dms import *
from app.models.friend_requests import *
from app.models.friends import *
from app.models.notifications import *
from app.models.servers import *
from app.models.user import *
import asyncio
async def create_tables():
async with engine.begin() as conn:
await conn.run_sync(BaseMixin.metadata.create_all)
asyncio.run(create_tables())```
alembic should create tables
what's the command?
the command in the docker-compose file: command: bash -c "alembic upgrade head && uvicorn app.main:app --host 0.0.0.0 --port 80"
example?
sorry, I'm new to alembic
Like do I have to explicitly make all the tables in the versions directory?
alembic revision --autogenerate -m "whatever changed"
that will generate the change file
yep I did some research but thanks.
k cool
hey guys got a question , is it possible to create a local database without sql's involvment, like just a python data base (fyi i want to be able to get information from the data base)?
python has built-in support for a database format called "dbm" but there is no structure to it, it's just a "key-value" store.
Why not SQLite?
I'm wanting to make a database backup feature for an application. Would it be correct to just create a copy of the sqlite database and how the user pick where to save it and then delete the main database and replace it with this one when the user use the load database function?
SQLite databases are literally just a file so yeah that should be fine - the file contains all information about the database that exists
Alright, thanks. Makes my life a bit easier lol.
you might want to add options like exporting to/importing from common formats like CSV or Parquet though
I currently have export and import options that save to CSV for the main part fo the data saved. I want to add seperate export/import for the entire database to save config data instead of using a config.json for themes, user limits and uom.
json is actually a fairly descent way to save configurations tbh
I guess i could export to csv for the main data and package that with the json to create a zip maybe? I just wanted the export to be one file intead of multiple files for when the user updates the software to a new version.
This seems overly complex. You should probably implement a schema migration system for software updates. And yea you can copy the db file for backup.
Im not familiar with that process. Do you have any resources? I have the software packaged in an installer (which I only recently learned how to do properly) so ive offloaded the responsibility of migrating to the end user via two buttons (one for exporting and one for importing)
There are some.. I use Django and that's built in and very nice. There's Alembic I think it's called too.
I’d probably prefer Django, but it currently isn’t worth migrating from desktop to web app, unless more than one company starts using it, imo currently. Will definitely keep in mind for my next project.
Alembic says it’s for SQAlchemy, so I guess it’s time to learn something new lol. Thanks for the advice.
You can use the ORM part of Django and just ignore the web parts. Django is more like 5 libraries packaged together.
Honestly since it requires django config file, etc it's kinda finnicky to set up and use
I mean.. so is SQLAlchemy imo. But in different ways. The config file / environment variable is not really a big deal.
And it just gives you less control compared so say sqla
That's mostly just overblown imo. Djangos ORM has caught up quite a bit in the last 10 years, while SQLAlchemy is still as confusing as ever 🤣
I didn't use django 10 years ago, but did around 2 years ago
Writing complex queries is confusing, connection management just isn't there
What does "connection management" mean?
Explicit connection management, pooling, etc
But yea, complex queries can be annoying. But so is it in SQLAlchemy imo. Django makes joins way nicer than in SQLAlchemy and that's 90% of what you need to do.
Join in sqla is just join(Model.relationship)
Well.. ok. I have never needed that 🤷♂️
Join in Django is just __ :P
Which uses magic strings essentially, honestly I just don't like it
The aggregation/group by is super confusing and bad in Django though, that's for sure. I have to read the docs every damn time.
SQLAlchemy uses magic attribute access and operator overloading no? 🤷♂️
Without language support you can't win.
Hm, for what exactly?
Model attributes in sqlalchemy are typed since 1.4-2.0
e.g. Model.id is InstrumentedAttribute[int]
model.id is int
select(foo.bar = 3) or something? I don't remember the syntax
select(Model).where(Model.id == 4)
?
yea, that's super magic operator overloading
But it looks exactly like sql
I mean... it's perfectly ok and pretty pythonic, but magic is magic
sure, that's my point
You get magic a or magic b. Saying "magic is bad" when both are magic is not really a good argument either way.
kwarg juggling on one side, operator overloading on the other 🤷♂️
For connection management - I don't think django can use connection pooling very well and using multiple databases could be a challenge
Multiple databases is well supported for ~15 years. Connection pooling I don't know though.
For multiple dbs - depends on the usecase, with sqlalchemy you can just create as many engines as you want
Sure, and in Django you can configure as many dbs as you want. It's only useful for apps though, as you really REALLY shouldn't do it at runtime.
e.g. if you're running a multitenant system and want to use dbs/schemas/whatever
Yea, then don't use Django for sure
Well, yeah, you'd have to do that at runtime
But imo if you are doing that you are doing something very weird imo
Not really, multitenant systems are a thing
Db sharding is a thing
It's just not the 99% case
I… I think I’m gonna go with the save file method because all of that was way above my head. I need a dedicated Django class I’m sure at this point based on my learning style.
If you have import/export you can easily implement migration yourself. Just write the version in the files somewhere and you can see if you need to upgrade and do it yourself.
Version of the DB format that is, not the app.
Well the user will only be exporting and importing during version updates, and the database is set at this point, so it’s just moving the file itself.
If you use standalone django orm it's quite vague when connection is created/closed/commited, I think in case of sqla it's quite clear.
Also django orm has some weird behaviors:
- If you call
model.saveall properties are saved/updated - Not really a weird behavior, but since it doesn't use a UoW pattern it's generally less optimized for writes, at least probably have to handle that yourself with
bulk_update/create - It doesn't handle creating nested models too
Well.. then you can move the file yourself then? Why bother the user?
I will technically be moving the file, they just have to click the button to run the function.
@fallen vault What would you be using that DB for?
The nested models thing is super bad yea. That I agree on. Just super frustrating if you hit that.
Recording footage on rollers.
but.. why? If you have to do it why the button?
The button will prompt the user for a save location for exports, and to select a file for imports.
Ehm, could you explain? What footage? What's rollers? 😅
Are those video files?
And the function will handle the moving and such.
Hm.. for exports? You mean for the current db of the app? Isn't that known from before?
The location to grab it from, not the location to save it to.
The user picks the location to save to.
Well.. if it's an upgrade you will presumably already know the location as the user chose it in a previous version?
I don’t know how to communicate between version…
Currently it’s set up to uninstall the old version and run the new version installer. Then import the data from a csv file.
Which the software creates and the end user knows where they saved it.
I’m really just taking these ideas from software I currently use daily. One is quite literally just a bat file that copies the database to a user specified location.
Although, they hard coded F: so that caused issues when we got a new flash drive lol.
how your experience with creating abstract classes to handle database queries? Does it actually help?
How about using an existing ORM? That certainly helps.
@fading patrol yes, but I'm trying to cover the case where there is a change in the database technology, such as between postgres and sqlite or even psql to mongodb or whatever. I could create an ABC that needs to be implemented, but if I implement a class for the new DB then all I have to do is rename the import where I use it and it should work, right?
I don't think you can create a good abstraction for all DBs, e.g. SQL and Mongo aren't quite "compatible" with each other
Usually if you really need it you can create your abstaction layer on top of an orm, but you probably won't need to switch between dbs that much
My point was that SQLAlchemy or Django would do that, no need to reinvent the wheel
You can't really switch from SQL to mongo db willy-nilly though 🤔
No, Mongo is schemaless so I don't see how that could work. But between SQL engines it's fine
Yep, that was kind of my point too
Just pick one, you can scale out pretty well with either option
And it's not like it would be a problem in their case
I use redis and MySQL together with a data layer abstraction. The app above has no idea.
As my app evolved some things tranisioned from redis to MySQL as the way the data was being used was more suitable in one than the other
What exactly do you mean by "redis and mysql together"
Are you using redis just as a cache?
Some data is temporal and I don't want to keep it, just need it in one form for quick access then I'm done. This system ingests a lot of data and only some gets to the db
But also caching, but not for a web front end, but for services that use the incoming data for their purposes
It's like shared memory for my own micro services
thanks for the feedback, my initial feeling was that the abstraction might be unnecessary as well, but I thought perhaps I could make an ABC or something that has some simple find() insert() delete() methods where I could just implement a new class and do a simple find/replace to use the new db.
What’s best database for a discord bot, local database on my raspberry pi or an online one like MongoDB?
What kind of data? logs? user list? etc
configurations?
Server specific warns, mutes lists, prefixes ect
And some custom stuff regarding a specific server
Users submit things it gets put in a database then mods accept/deny those things
Also some user specific ones as well
sqlLite would probably be fine. What is the scale? Like 10s of users or 1000s
Sqlite would do fine for both 😅
That API sounds pretty limiting, maybe you could just use SQL and in the case you really need a separate storage, for example a mongodb or redis - you just move a part of your data into it?
It doesn't make sense to just stuff everything into a single storage if different data is used differently
Ah Oki, also is there a way to visualise the database like MongoDB does?
SQLiteStudio and DB Browser for SQLite are GUIs for it, but they need access to the database file to read it
if you're using raspberrypi headless, i.e. without a desktop, that might be difficult, and last time i tried setting up xrdp for a remote desktop i think it didnt go so well (though i had once gotten VNC working)
i guess some other ideas to use the GUI in the case that you don't have a desktop:
- set up a network drive to remotely access it, but sqlite's known to have issues being used over network https://sqlite.org/useovernet.html
- copy the database file to your computer and then use the GUI
the latter would be simpler, but if you need to make any changes, you'll have to carefully repeat them on your raspberrypi's copy to avoid losing any new data (or turn off the bot until you're finished with any changes)
if network access is really necessary, a client-server database like postgresql would be more suitable, then you can use something like DBeaver to connect to it
!cban 1231095712083607612 ban evasion
:incoming_envelope: :ok_hand: applied ban to @errant axle permanently.
I'm working on an AI powered discord bot where I'm using Redis as the primary database, and I've decided to pair Redis with AstraDB. Now, I need to build an efficient and optimized workflow that will put Redis in between AstraDB and the application. I need a workflow in which the application reads data in Redis, if it's not available in Redis, it's fetched from AstraDB to Redis for a period of time. For writing, My application should be able to write on Redis immediately and move on, while further writing can be done asynchronously without slowing down the application. As my application is a chatbot, database writing will be a crucial part. What are the solutions? How can I write code for an write-behind pattern? I'm a beginner, any kind of help will be appreciated. 🩷
For the write behind you could use redis rq workers that you queue up after the data is written to redis. The fetching is simple the application looks forward in redis and if it's not there it gets it from the database. to be honest redis might be a bit of Overkill what kind of volume are you expecting?
I would like to call a function in a postgresql check constraint, in a class I define via SQLAlchemy ORM.
I've found https://docs.sqlalchemy.org/en/20/core/constraints.html#sqlalchemy.schema.CheckConstraint, but I don't understand how I'd go about using it
this is the function I want to check against
CREATE FUNCTION mytable_valid_name_constraint(identifier VARCHAR(20))
RETURNS BOOLEAN
this is the current column definition, but I don't understand how I am supposed to craft the SQL statement
name: Mapped[str] = mapped_column(TEXT(), CheckConstraint(text("()")), nullable=False)
``` Any suggestions?
Probably
CheckConstraint(text("mytable_valid_name_constraint(name)"))
Also nullable=False is redundant since that information could be taken from the type hint
that makes sense yeah
I'll test that, thanks
yeah, that was it
thank you very much
i changed one param in my db an now it says that its locked
If u plan to support multiple databases for your program, it will help. Some do need that, if they offer running against multiple different databases for some reason
Otherwise it will not.
YAGNI, you aren't going to need it.
It increases code amount and complexity, so it will not be justified unless u indeed needing it
How to implement it easier in python? 😄 Using strictly configured mypy/pyright, abusing less any, and yeah ABC/Protocols will help
====
Some examples when it was justified
Well, i had a case of needing to create read only database user for multiple different database engines with all configurations to run datadog user
as aws lambda with automatic expiring in 24 hours.
since i needed to create user periodically for postgresql, mariadb, mysql. it was justified to me making this abstraction
Also it was justified for me during writing code for provisioning some parts during database replications from production to staging.
as this code needed again working for different database engines (postgres, mariadb, mysql)
=====
if u develop backend application meant for distribution to other developers for running, where each dev could be wishing to run it at different db, it will be justified too then
for example some application monitoring health of smth, or chat app and etc
=======
If you know you develop private backend app that will run always only on postgresql...
...it has no meaning to support anything but postgresql.
use as docker container postgresql locally too, to have it during local development
Question is there a reason that sqlite3 in python is fetching an DATETIME collumn on my db table as str instead of datetime?
This is causing an issue in my c extnesion code when I go to process the sequence of values in each row for the collumn values 😅.
sqlite doesn't have a DATETIME type, it uses either a number or a string to represent it
Sounds like you just need to import the datetime library and convert the string
issue is that I do import it, on the C extension side and in the python script side too.😅
You can use adapter/converter: https://docs.python.org/3/library/sqlite3.html#adapter-and-converter-recipes
Sadly it seems the only thing that works atm is this:
Because I implemented the function call that grabs the values from each Row stored in the list otherwise from a C extension module itself.
Do you know a good website to download Hierarchical series?
#python-discussion message @wheat thistle, cc @opal flint
could it be solved using an aggregate query? the sqlite documentation notes that using MAX() on one column causes other columns to return values from a row containing the maximum, so it could be applied to the datetime column to return just the latest reports, something like: sql SELECT id, location, type, MAX(created_at) FROM report GROUP BY location, type ORDER BY location, type; ┌────┬──────────┬─────────┬───────────────────────┐ │ id │ location │ type │ MAX(created_at) │ ├────┼──────────┼─────────┼───────────────────────┤ │ 3 │ 'loc1' │ 'type1' │ '2024-04-20 10:20:30' │ │ 6 │ 'loc1' │ 'type2' │ '2024-04-20 10:20:30' │ │ 21 │ 'loc2' │ 'type1' │ '2024-04-21 10:20:30' │ │ 24 │ 'loc2' │ 'type2' │ '2024-04-21 10:20:30' │ │ 33 │ 'loc3' │ 'type1' │ '2024-04-22 10:20:30' │ │ 36 │ 'loc3' │ 'type2' │ '2024-04-22 10:20:30' │ └────┴──────────┴─────────┴───────────────────────┘ MRE: https://paste.pythondiscord.com/463Q
(explaining the query shows ix_report_created_at was unused, so it might be better to combine that into the ix_report_location_type index)
Yes, that's excellent!
wow that behavior is really surprising. it makes sense to take the id of the max row, but 🥴
I am wanting to delete the databse file while the software is running. Is there a way to ensure its not being used? I tired running con.close()
On Windows? Probably not 😅
Apparently it does it anyway, no idea why it’s throwing the error. Decided to wrap it in a try/except block and move on. Now im having issues with the installer due to write abilities, can’t remember how I solved this issue last time.
in this er model, should there be 3 tables. an artist table that has a primary key for identifying artists, a song table that has a primary key for identifying songs, and a performs table that links the two together?
Depends on if you want to do that m2m or not.
The code isn't in English. You didn't describe the problem, or what tech you use.
nvm bro ı just forgot to delete this message ı was new in server thanks for replying me someone solved it
Guys, I'm trying to add a simple unique field to a model of mine, just for learning using SQLAlchemy plus alembic
I made a simple person model:
class Person(Base):
id = Column(Integer, primary_key=True)
name = Column(String)
age = Column(Integer)
I made the Person.name field unique.
class Person(Base):
id = Column(Integer, primary_key=True)
name = Column(String, unique=True)
age = Column(Integer)
I added render_as_batch to my alembic.context.confuguration methods in env.py.
I'm getting this error that doens't take care of itself automatically:
File "C:\Users\zackp\.virtualenvs\migrations-learning-XL53Svsw\Lib\site-packages\alembic\operations\batch.py", line 669, in add_constraint
raise ValueError("Constraint must have a name")
ValueError: Constraint must have a name
My metadata comes form a Base = declarative_base() line, so I don't directly instantiate metadata to add conventions.
How can I fix this?
Working with a simple sqlite database, and to re-emphasize, I already added render_as_batch...
Take ink and splash it on paper
context: #discord-bots message #discord-bots message
i want to store a bunch of numbers like this:
py 43623536647361
to represent a connect 4 board game position. whats the best way to put this into a database?

CREATE TABLE IF NOT EXISTS foo (data BLOB)
it takes bytes
low big do you expect your integers to be in terms of bytes?
no like how do i make the integers blobs and vice versa
very long
like idk
20-30 digits?
bytes(num)
Idk :kekw:
dumb af 😭
me when
ok so its .from_bytes()
He said in terms of bytes 💀
idfk then
turns out it isnt
fucks sake
"Overflow error" py n = 12345 a = n.to_bytes() b = a.decode() print(b)
😭
Shorten your integers
What are you doing with all those
they're moves for connect 4
so 4343 would be:
play 4, play 3, play 4, play 3
im relying on idiots 
pls chill out before you violate #code-of-conduct even further
wait i've already broken it?
Yeah kinda
#discord-bots message can you give me the values of both vars?
I want to experiment a bit with it

im a genius
n = 12345
a = n.to_bytes(10, 'big')
b = int(a.hex(), base = 16)
print(b, type(b))```
there were 3 i posted
should be just under there
int ➡️ bytes ➡️ hex ➡️ back to int
easy
(im done for)
well with a db:
- get int ➡️ make bytes ➡️ store bytes
- get bytes ➡️ make hex ➡️ get int
you made it way harder 😭
Lmao
i blame you lot 🙏🙏
me
which db are u using?
alr solved
using text?
you can text to ai
I need to have this exact functionality but if command is awaited twice in cache() function, it raises errors. How to have this exact functionality without awaiting command twice.
async def store_data(key, value):
if isinstance(value, dict):
await r.hmset(key, value)
elif isinstance(value, list):
await r.rpush(key, value)
elif isinstance(value, (str, bytes)):
try:
json_data = json.loads(value)
await r.json.set(key,'.', json_data)
except json.JSONDecodeError:
await r.set(key, value)
else:
await r.set(key, str(value))
async def cache(command, key):
result = await command
if result:
print('DEBUG : Retrieved from Redis')
return result
else:
result = db.get_collection(key.split(':', 1)[0]).find_one({'_id' : key.split(':', 1)[1]})
value = result['data']
print('DEBUG : Cached from AstraDB')
store_data(key,value)
print('DEBUG : Added to Redis')
result = await command
print('DEBUG : Cache working in Redis')
return result
async def main():
await cache(command=r.lrange('users:1',0,-1),key='users:1')
asyncio.run(main())```
Raised Error :
File "d:\Coding\Applications\LLM-Apps\Projects\Snow\database-read-backend.py", line 31, in cache
result = await command
RuntimeError: cannot reuse already awaited coroutine
Never say "errors". Always copy paste the error. In full.
looks better?
You already have the result, why are you asking for it again? Just to check that it works?
You're trying to await coroutine twice
yes, and I'm asking about is it possible to have this functionality without getting this error or awaiting coroutine twice?
@shut tiger the first time if it returns none, the function retrieves data from the NoSQL database (the db object), stores it inside Redis. Then I need to read using the command again, to read the new cached value in Redis.
Perhaps make a lambda or use functools.partial
Or move redis call to cache function if you'll always be using lrange
I mean.. the code as it is now is kinda bad. You aren't trusting the library. That's madness.
How do I fix it? I'm a beginner. As I know the libraries don't have a solution which can let me pair with AstraDB effortlessly.
that's the complex part, I may need to use several different commands including different datatypes.
If you write the data, just don't try to read it. You know what you will get: exactly what you just wrote.
I was about to do that until I realized my application has a couple of places where it tries to read a particular field of a hash. Sometimes it's a particular field of a json. If it returns the whole document, that makes it extra work.
That seems irrelevant. If you have a dict like {'foo': {'bar': 2}} and you want the 2, reading the full dict from redis or not doesn't change anything.
You had the full dict before you read it anyway. You can just change the second result = await command to result = result and it would be the same. Or delete the line of course, which is my suggestion.
So it doesn't provide any optimization advantage if one single field is retrieved instead of a long hash as a python dictionary?
That's not what I'm saying. I'm saying your code as you showed it does this:
- write X
- read X
- return X
But the read is redundant. If you delete it you would get:
- write X
- return X
which is faster, and removes your problem.
yes, because each time the command might be completely different. Also, Some commands specifically target a specific field in a hash. It's not reading X, it's reading a part of X, which creates the main complexity.
But you are trying to call the exact same function twice. Do you expect to get different data from it the second time?
(an await is like a function call)
I'm making a project that is basically an atm system. I need to make a database for the cards and accounts. How should I best structure the cards and/or accounts database? As I am allowing for an account to have more than one card
can somsone please explain to me how does the i in game work
and what does this do
Sounds like a many to many relationship, which you can read about. But basically, you can have a cards table, an accounts table, and then a accounts_cards table. That last table would just have rows of pairs of IDs of accounts and cards.
If it's one to many (only one account can have a given card) then you just need a basic foreign key, no extra table
That's bad code from someone who has done too much C. i should be called game. And game should be games. And a tuple of two ints is horrible. And temp should be called current_best. And the return value makes no sense.
Basically everything is wrong.
Also, this is not about databases. So that is wrong too 🤣
yes there is a problem in the code but i got it thank you!
this code is supposed to check for a word that is the same written backwords (with a list) but there is a problem here can you help me?
what's wrong with a tuple of two ints
(supposed to be i not i(6) btw)
im only in 9th grade so dont expect much (also i didnt write this)
and i dont know if C and python somehow intertwine but okay youre the boss
so
do you even know sql

I never said that? 😭
https://sqlbolt.com - go back to the dungeon
SQLBolt provides a set of interactive lessons and exercises to help you learn SQL
didnt even write a url 😭
we love https://sqlbolt/
this things lying to me
SQLBolt provides a set of interactive lessons and exercises to help you learn SQL
how so?
Idk Ion understand it so I gave up for now 💀
Ima get the part of calling the function to generate the vid first then ill worry abt charging a token for it
You can't tell from the code what the different values mean semantically.
Hi
So i found a csv file with all movie names from 2006 to 2016 (im a noob in dis)
so if i give genre name
i want to get a list of all movies from tht genre
dataset link
ping me
or dm me
I would load that data into an SQL database and then write queries against that
oh k
isn't that just saying that tuples are bad in general?
I mean.. often yea they are. Returning values from a function is ok-ish. But only because python has no nicer thing.
And, because: ```py
import duckdb
df = duckdb.sql("select * from 'myfile.csv' ").df()
wt does this do
It runs sql against a csv file
anyone up?
You might as well ask your actual question either way
Hey everybody, I'm trying to understand how storing audio and video files on a database would work. For learning purposes, I'm just trying to create a website that people can just upload videos on and then have those videos saved and played back to them later using Django. How could I save these files? I assume I wouldn't actually be storing the audio/video file on MongoDB or some SQL database, but I'm not sure. Any advice is appreciated that points me in the right direction
the usual way to do this is to just store the files in some blob storage like s3 and store a file path or url to where you stored it
oh i see, that seems much easier then i thought it would be 😅
@grim holly If you're familiar with docker you can host Minio or SeaweedFS for S3 storage, if you don't want to use any cloud option
you can also just use your local filesystem, django supports it natively. unless you see yourself storing terrabytes of files that‘s definitely the most pain free variant
could someone explain to me whats wrong with my code? https://paste.pythondiscord.com/N3LQ
whats not working?
Save files to local file system with some kind of organization to keep the number of files in a singe folder under control. Save that path or relative path in the db with all other needed data. I would not try and put large binaries in a db. It will create a single gigantic file that is impossible to backup.
why is a large file impossible to backup?
Whatever you're trying to do, it sounds like you're doing it wrong... But any key-value store will let you do that as long as the keys are all unique
No, but there's SQLite
It isn't
I think they are looking for a database service provider.
With databases, you usually don't backup the database as a file, but through either a database export or other database backup integration. Recovering from files is very error prone.
Yes but a file is still produced no matter how you do it, and a large db will require a lot of io and time or bandwidth. If the media files are not in the db they can be backed up much more efficiently, and without locking the db.
Agree on solution (don't put large media in a database), I was just commenting on the backup part. I've backed up (and occasionally restored) some monstrous databases, it's not so bad with some planning.
I have a MySQL db that is about 200gb on disk with a table with about a billion rows and it is a nightmare to deal with. Long locking times on backup, and honestly would be a disaster to try and recover given the app downtime it would cause. You have any tricks? Honestly would love to hear about as I just have not dedicated the time to figuring out something better.
Have you looked at partitioning? What storage engine?
(Nowadays, I push as much as possible to parquet and duckdb when the use case allows, but I abused MySQL for many years)
For backups, you can do incrementals and do full backups less frequently. I've never replicated MySQL, but in past lives I'd backup off a replica (that was Oracle tho)
Also with sql server, we'd use filesystem snapshots and then backup off the snapshot. In AWS, could snap the EBS volume.
Another idea is copying it to parquet via DuckDB. No idea how the read performance would be, of course https://www.pauldesalvo.com/how-to-export-data-from-mysql-to-parquet-with-duckdb/
More on partitioning: you could partition by, say, year... and then just backup the current year. Then you're only backing up live data.
I tried replication one on mysql just feels really unstable and overly complicated. I think your point about incrementals is solid and that's probably something I need to look into. I've just been lazy using MySQL default backup. Going to have to get more custom.
Using innodb. I looked a little bit into partitioning but it was more from a performance point of view. I haven't really thought about it as being part of a backup strategy.
Partitions aren't technically needed for the backup case, it just fits well with the strategy. You can always backups/dump data in ranges without partitions.
I have to work with a postgres database that only supports bigint, however all my unique values are stored as sha256 hashes, which are naturally materially larger by quite a factor - whats the best hash function that returns in bigint format?
You can split it into four bigint slices. Or store it as a char
Its a pretty odd system I only have 1 column to work with and that 1 column only supports bigint
or uuid but that doesnt help much
what are these unique values for?
The database is structured with incremental id, the problem is I often need to go back and update individual elements
and incremental id is sketchy for that
why not generate uuids for primary keys if its readily available?
uuid is perfect for this
two primary reasons 1) I have to delete and recreate the database as it is incremental ID atm, and it cant be changed to UUID; 2) All my data in my other tables identified using sha256 hashes
ah rip then ur best bet is to do as boxed says i think u can store the hashes in TEXT columns
or varchar
Yeah that would have been my preferred solution as well, but I only have 2 columns to work with - ID (bigint or for new tables UUID) and json column (where all data is stored as a json object)
so I cant just create a new column make it text and use that as my effective unique key column
so my thinking is to take my sha256 and then hash that into a format that is bigin compatible
collisisions should still be extremely extremely rare
or I mean I could slice out the last digits of my sha256 hash, but that feels so hacky
perhaps using hashlib then using the hexdigest to get the hex number and finally convert it to int like this? int(digest, base=16)
Yeah exactly
def f_sha256_to_8_byte(f_sha256_input_value):
return int(f_sha256_input_value, 16) % 10**8
it just hurts my eyes to effectively hack off the rest of my hash
just calculated risk of collision, with 10 million records the probability of one overlap stands at 5e-06
so should be alright
hmm its still 5% at 1 bil records tho
Okay minor mistake in the above it should not be mod 10**8, but something substantially higher
8 bytes iirc bigint is
Yeah should be 2**63
def f_sha256_to_8_byte(f_sha256_input_value):
return int(f_sha256_input_value, 16) % 2**63
I could technically double it because it also uses negative values
but meh
roughly 10**19 i think
BIGINT 8 bytes -9,223,372,036,854,775,808 +9,223,372,036,854,775,807
!e
import math
x, y = (-9_223_372_036_854_775_808, 9_223_372_036_854_775_807)
z = y - x
print(math.log(z, 10))
@west hill :white_check_mark: Your 3.12 eval job has completed with return code 0.
19.265919722494793
or 2**64
That would be unsigned, you'll need one bit for the sign.
yeah
True I will only be using positives though, so that would make it 63 no?
Can I ask questions about pandas to study in this chat or will I be banned if I do that?
Hi, I need a help with Mongodb. When I am updating data in db for a user I want that a particular field value should not be exceed 100, How to do that.
await cllection.update_one({"user_id": 111}, {"$inc": {'data': 30}}, upsert= True) # now i want that this data value should not be exceed 100, From begining if data value was 80 it should become 100 not 110.
instead of increase set the value using something like this min(original + increment, 100)
Well then I have to do one more query to get the original value right. I wanted to do in a single query.
Ask
$min operators just checking the existing value and the current value then compared between them and whichever is less set that value.
yeah i am not too experienced with mongo but can u chain the two operations inc and min in one command or perhaps run update twice if not thats prolly closest u will get
u just need to do min with 100 as an argument
It's okay, I am also learning.
No it will raise an error, if I try to update same value twice in a single query.
Ah, I wanted to do in a single query, If I have to do 2 query then I can just use findOne then check the current value basis on that update.
using inc + min is probably faster than fetching the data i think
Yup right
Let me get this straight...
So SQLAlchemy functions typically depend on the session object.
But that session object requires an engine that has been bound in some way to the Base.metadata.create_all(engine) method...
How on earth are you supposed to manage all of that, make sure that your engine has touched the Base.metadata.create_all() method, and that every session you're using has been created from that engine when you're designing your units of work?
I don’t think you need the students.[column_name] at the start as you specified from Students. I’d recommend calling the table tblStudents, but it’s up to you
Is the screenshot your entire database or are there some columns missing? Just looks like you’re requesting stuff that doesn’t exist
Ah ok cool let me have a Quick Look again
What does the left join bit mean, I’ve not seen that before?
Ah cool cool
Ah
Your problem is you’ve only asked for two courses
It’s getting the FirstName, MiddleName, LastName, Description, and two names
I think that’s why
Potentially
Oh okok
So having Name1 + Name2 in one column? Is that what you mean
Oh ok. The only way I can think of doing that is with python but you mentioned you’re not able to do that.
I am not 100% sure if there’s a way to do it as one field within SQL 🤔
That’s alr no problem. Good luck with finding your solution ☺️
The current function adds numbers between 0-9 into a queue named q
She gets a number that goes into x aswell
this function gets a number and checks if it’s in the queue then returns true if it is and false if not.
Can I get help with filling it?
This is not a database question and looks like homework.
how do i write a program that: when clicking a button it generates something from a determined list?
which part is the problem? running a function when a button is clicked, finding the list, or selecting randomly?
Selecting randomly
ok i have another question
if i want to put an image on tkinter
but i want it to be displayed as various symbols like (-,*,@)
how
This is the wrong channel. Your question is not related to databases.
You iterate over the pixels and insert different chatacters depending on the color
Base.metadata.create_all dosen't "Bind" anything in any way, it just creates tables
And it wouldn't be used in production environment, you'd use a migration tool, typically alembic
But when I wasn't using it I kept getting an error saying that the models didn't exist when I was testing 🤔
Well, yes, because table didn't exist
Ohhhhh
Wait
Maybe its because it was a memory database that wasn't instantiated yet
When testing...
Potentially?
I don't know how you set it up so I can't tell
Just wondering how you would make it work without using Base.metadata.create_all 🤔
The engine has to know at some point the models exist right?
You have to create tables in your DB somehow
Ah wait
Otherwise how would you expect it to work?
I'll make a little repo
I just always thought it needed the base.metadata.create_all to work 😅
create_tables is only really useful for development or testing, otherwise you'd want to use alembic
No, if model in your application is compatible with table in your db it will just work
Sqlalchemy just generates sql and sends it to the server, if it can be executed you'll get the result
I've been misunderstanding it this whole time then
rip
But ok nbd
Not much code to change
E.g. if we have table users(id, username, email) but on python/sqla side we only have id, username it will still work, at least when querying that table
inserts may fail depending on if email is nullable or if it has a default value
does anyone know how to make a database to let only a few commands be used in a specific channel ?
You may just need to clarify but this looks like a question for #discord-bots that has nothing to do with databases?
Someone said that i need a database for that
Pasting large amounts of code
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
You need a database to have a bot that does much of anything. SQLite is a good beginner option. There are plenty of tutorials you can follow
Omg, sqlalchemy and alembic finally make so much sense to me.
Migrations aren't something you should include in a package. They're something you should include in a project / script that uses a Python package.
The database is sort of a separate thing.
You can define models in a package but you don't necessarily have to include all the migrations etc. for it, or even abstract the session creation part or the Base.metadata.create_all() piece.
If you need the database models you can just work with those in a package, and just use the other details on the scripts where you actually run the code and connect everything with the db_uri 🫠
makes sense
At least, I think that's how that works 🤔
Maybe it still makes sense to keep migrations in a package.
Just if you put the migrations in a package, and you pip install that package, how are you supposed to use alembic to migrate it inside the new project? 😅 Now it doesn't make sense.
You need some kind of install script or you can do it manually.
I've never used this but it looks helpful; https://flask-migrate.readthedocs.io/en/latest/
^
dont know if this is the right channel, but I am encountering an issue when using Python to refresh my SQL queries in Excel. I want to create a button in Excel that runs a pyhton script. This python script will firstly refresh all data in excel and should then do some additonal steps. However, for some reasons it will not load all the data before starting to do the next steps. The sleep function does not work as it still doesnt refresh the data. any suggestions to avoid this issue?
Ok, so only the first part of what I said applies
Show your code. #❓|how-to-get-help
Migrations are just the history of how your models have changed over time. This makes upgrading possible.
the python code below. the code in excel to click the button is attached as picture.
import xlwings as xw
from datetime import datetime
from time import sleep
import win32com
def SQLOpdater():
app = xw.apps.active
wb = app.books.active
wb.api.RefreshAll()
sheet = wb.sheets['Afgrænsninger']
today_date = datetime.now().date()
sheet.range("C26").value = today_date
value = sheet.range("C7").value
sheet.range("C27").value = value
def RunQueryAndAppendResult():
app = xw.apps.active
wb = app.books.active
ws = wb.sheets["Afgrænsninger"]
last_row = ws.cells(10, "K").end("up").row
for i in range(3, last_row + 1):
if ws.cells(i, "K").value is not None:
ws.range("C4").value = ws.cells(i, "K").value
ws.range("C5").value = ws.cells(i, "L").value
ws.range("C7").value = ws.cells(i, "M").value
ws.range("C9").value = ws.cells(i, "N").value
ws.range("C11").value = ws.cells(i, "O").value
ws.range("C12").value = ws.cells(i, "P").value
ws.range("C14").value = ws.cells(i, "Q").value
ws.range("G3").value = ws.cells(i, "R").value
SQLOpdater()
for table in wb.sheets['Finans Udtræk'].api.ListObjects:
while table.QueryTable.Refreshing:
sleep(1)
pass
AppendToFinalTab(i)
else:
# Exit the loop if a blank cell is encountered
break
when i put this code it it should create a new database file right ? conn = sqlite3.connect('All_agents.db')
Hi Everyone, has anyone achieved running an Oracle DB container natively on ARM (apple silicon specifically) without Colima as a runtime replacement?
I want to learn about database coding bc i need a database for this highlighted text. But i dont know how i could use it in my bot code
can you clarify what your discord bot needs to do? from that description alone, it doesn't sound like you need a database
in this case, do you have data that needs to change dynamically? e.g. does the server admin need a way to allow/disallow the command in different channels?
or do you already know, at the time of writing your python script, which channels you want to allow your commands in?
if you don't expect the channels to change then it would be reasonable to hardcode the allowed channels into your script, saving you the time and effort
Alembic feels really annoying to use 😅 How do you guys point your alembic file to different database urls? Like if you have a database url for production and development, how would you separate them and make the migrations with alembic?
i havent used migration tools much, but perhaps you could check for an environment variable in env.py or use Context.get_x_argument() with the alembic -x foo ... option to toggle between URLs? cant say if either will be a good dev experience
help me pls #1234993295881670749
I apply migrations before/during deployment automatically, if it fails I investigate
Gotcha, also just found a programmatic way to do it with the alembic config and alembic command (surprisingly thanks to Google Gemini I'll add haha).
Now the only thing is,
I created a new database (simply sqlite3) with the most recent Base.metadata.create_all(engine) since Alembic doesn't create databases from scratch I guess (?) and I'm getting Key Errors when I try to upgrade head OR errors saying this database isn't up to date with the latest version... which is true... when I try to make a new revision because I'm trying to create a new database lol.
So it looks like I'm missing something 🤔 maybe I should just make a new empty database first without any tables then let alembic upgrade it from scratch?
Trying stuff
I'm a bit confused why it wouldn't work. In Django you literally just have to migrate and it both creates a new db and applies all the history to it.
With alembic it's so... I mean I kind of hate it to be honest lmao. It's just the only option for SQLAlchemy, which is kind of the only option for ORMs outside of Django, primarily because it's the only one with a somewhat existing automigration system alembic, which ironically is pretty manual 😅 but not as manual as writing your own migrations.
I already know while writing python code what channels i want to allow certain commands in. Let me explain what i mean:
I want to make multiple commands like !cypher_ascend_a and !cypher_ascend_ b. I want to make it that those 2 commands (i want to add more commands) can only be send in a channel named “Cypher” and i want to block all the other channels to that command. So eventually i will have multiple commands like those 2 and than that j can choose in my code what command can be send in witch channel. I asked it in #discord-bots but someone said i need an database for that so thats why im in this channel (Let me know if you understand what i mean otherwise i explain it a different way)
ah ok, hardcoding the channel name would be the easier option then
ill write an answer in the discord bots channel momentarily
that would be very kind of you
Why not use Django then?
It's a webscraper, not really a website. I know there's a way to make it a Django project and just use the ORM, but it seems like a lot just for the ORM and all the extra setup.
imo it's not a lot. You just copy paste the 3 lines from inside manage.py for the setup.
I want to use alembic and SQLAlchemy it's just hard for me to understand haha
Whoa
Harder than copy pasting 3 lines of code from manage.py? :P
Then I definitely misunderstood something haha
Lots of people are super confused about django... I hear so much nonsense all the time
https://discordapp.com/channels/267624335836053506/1235281158640504923
if anyone could take a quick look
What is the best PostgreSQL driver for Python?
asyncpg or psycopg3
Thank you
Hey, wondering if anyone can take a look at this issue I’ve been consistently having
https://discord.com/channels/267624335836053506/1235767714228736020
In SSRS. I have columns grouped together by company name but for some reason I can't get the title to appear ABOVE. it only appears to the side. How do i get it to appear above and take up less space. I'm really just trying to fit everything on one page
Anyone can give me a hand?
questions about pandas belong in #data-science-and-ml. Be sure to post the relevant code, the whole error message as text, and a sample of the dataframe with print(df.head().to_dict('list')), without posting screenshots.
👍
@rough hearthhow do i send code to discord btw
!code
ok thx
I bet this bot was made using python
!src
you can see the source code using the link above, but that is pretty offtopic for this channel - if you are just testing random stuff please use #bot-commands
import aiosqlite
import tkinter as tk
import threading
class DatabaseUser:
def __init__(self):
self.conn = None
async def get_connection(self):
if self.conn is None:
self.conn = await aiosqlite.connect("nation.db")
await self.create_table()
return self.conn
async def create_table(self):
conn = await self.get_connection()
async with conn:
cursor = await conn.cursor()
await cursor.execute('''CREATE TABLE IF NOT EXISTS users (
user_id INTEGER PRIMARY KEY,
nation_id TEXT
)''')
await conn.commit()
async def add_user_nation(self, user_id, nation_id):
conn = await self.get_connection()
async with conn:
cursor = await conn.cursor()
await cursor.execute("INSERT INTO users (user_id, nation_id) VALUES (?, ?)", (user_id, nation_id))
await conn.commit()
async def is_user_registered(self, user_id):
conn = await self.get_connection()
async with conn:
cursor = await conn.cursor()
await cursor.execute("SELECT * FROM users WHERE user_id=?", (user_id,))
result = await cursor.fetchone()
return result is not None
async def remove_user(self, user_id):
conn = await self.get_connection()
async with conn:
cursor = await conn.cursor()
await cursor.execute("DELETE FROM users WHERE user_id=?", (user_id,))
await conn.commit()
async def get_user_nation_id(self, user_id):
conn = await self.get_connection()
async with conn:
cursor = await conn.cursor()
await cursor.execute("SELECT nation_id FROM users WHERE user_id=?", (user_id,))
result = await cursor.fetchone()
return result[0] if result else None```
An error occurred: threads can only be started once
Whatever command I use it returns this error
Hi is there any free cloud database for mobile app development using python?
- This is the wrong channel
- The error message is quite clear
-
I opened the post in #1035199133436354600 but still no one replied there due to which is automatically closed
-
I'm not opening the threads only
Do you have a complete traceback?
Wdym?
The full error with line numbers and filenames
It's not giving the error in console, when I use the command then it gives error
You can't run this code locally?
I got the answer and it got fixed. Thank you for your support
What was the problem?
async with conn:
Was that the problem or the solution?
Problem
I had to remove all of them
Hmm.. you wrote that much code without testing the parts firsts?
Yeah, I thought it would be correct
hey guys. python newbie here. need some help with above
class ORMFlops(Base):
__tablename__ = 'Flops'
id: Mapped[String] = mapped_column(primary_key=True, default=uuid.uuid4)
cardoneid: Mapped[String] = mapped_column( ForeignKey('PossibleCards.id'))
cardtwoid: Mapped[String] = mapped_column( ForeignKey('PossibleCards.id'))
cardthreeid: Mapped[String] = mapped_column(ForeignKey('PossibleCards.id'))
handid: Mapped[String] = mapped_column(ForeignKey('Hands.id'),index=True)
# relationships
# how can i populate cardone by cardoneid cardtwo by cardtwoid
# and cardthree by cardthreeid from PossibleCards?
cardone:Mapped["ORMPossibleCards"] = relationship()
cardtwo:Mapped["ORMPossibleCards"] = relationship()
cardthree:Mapped["ORMPossibleCards"] = relationship()
how can i properly define relationship of each card to map to a row by the matching foreign key?
the code is using sqlalchemy lib
would appreciate your help , I find the documentation thin
https://docs.sqlalchemy.org/en/20/orm/join_conditions.html I found an example similar to yours.
Why have 3 ids? Can you just have a many to many relationship here? Also link that Blaziken sent you should solve that issue
Ik this is super questionable but can someone walk me through how I can make a database for this code?
heyo, i got a question. whats the difference between mySQL and postgresql. im currently using postgres but dont know if i should switch to mySQL. my use case would be for a single server bot of 250 members with moderate growth and a leveling system
mysql is simpler and has less features, postgres is more complex. they implement many of the same features, though
can mySQL scale like postgres can?
and does mySQL need a database to host like postgres?
for a single guild bot expected to serve a few hundred members, i imagine any relational database would be sufficient for your use case
last time i did a sqlite benchmark with 10 concurrent connections, it got CPU-bound and only managed like 140 select queries per second which was much lower than i expected, but that was still 140/s, or 12 million if you stretch it to a day, and when i switched to a single connection it managed ~2000/1s - so how much traffic do you expect, in terms of concurrent users and queries? doing some napkin math, maybe you run 3 queries on every message and you get 1000 messages daily, so thats 3000 queries daily / 1 query every ~30s, plus however many you get from users running your bot's commands
in other words, databases are really fast, so "scaling" in this context might mean going to hundreds or thousands of queries a second, which sounds like way more than you'll need for the situation you've described
they both run as servers, yes
would you guys recommend this? https://aiven.io/free-mysql-database
I am generally very skeptical of any "free" SAAS
because they are never truly "free" or without some serious catches
for the sake of running one local postgres or sqlite instance...
good?
cursor.execute('''CREATE TABLE IF NOT EXISTS config (user_id BIGINT PRIMARY KEY, guild_name INTEGER, channel_name INTEGER, role_name INTEGER, message_content INTEGER, invite_content INTEGER)''')```
are they names or ids? usually names aren't integers.
I am using PostgreSQL database with asyncpg.
How do I get the id of that column? the column id is auto increment.
result = await db.execute("INSERT INTO table(user_id) VALUES($1) RETURNING id", ...)
print(result)```
it prints `INSERT 0 1`
And what did you expect?
value of id
actually i need to use db.fetchval()
I mean.. we don't know what that value would be...
it is auto increment
...yea but.. the first row will be zero or one...
- Building a Custom Context Manager in Python for Efficient Database Operations with SQLite
- Python Programming Tutorial: Designing a Custom Context Manager for SQLite Database CRUD Operations
- How to Create a Custom Context Manager for SQLite Database in Python for Table Manipulations
- Step-by-Step Guide to Building a Python Custom Conte...
fyi what you printed there is supposedly called a command status or tag, though i couldnt find more generic info about it in docs
https://magicstack.github.io/asyncpg/current/api/index.html#asyncpg.connection.Connection.execute
Return str: Status of the last SQL command.
https://www.postgresql.org/docs/16/sql-insert.html
On successful completion, an INSERT command returns a command tag of the formINSERT oid count ``` The count is the number of rows inserted or updated. oid is always 0 ...
What is the best SQLite async driver for Python3 using asyncio?
Not sure about "the best" but aiosqlite is probably the most popular
https://github.com/Rapptz/asqlite
i like asqlite's implementation and their convenient/ideal defaults, but it's not well known, has no test suite, must be installed with pip install git+https://..., and, if you're planning to use an ORM, sqlalchemy only supports aiosqlite
And to write on filesystem, like as to write images (jpeg) in asyncronous wasy, are there any lib to write with Python using asyncio?
Could someone please help with this? #1237180469150486670 message
you could use aiofiles for that, although if they're small files, writing is not likely to block for a significant amount of time
I have a doubt.
I have a program that must always be turned on.
It connects to the database and then leaves the connection open and whenever you want to do a select and a commit you don't need to open the connection again.
However, this creates a problem, if you lose connection to WiFi, the code understands that it is the database that is not responding and is forever waiting for a response.
I'm using this connection in a try/Exception, but the program does not generate an exception.
How do I generate an exception if I don't have internet access?
I'm using mysql-connector-python==8.3.0
Database libraries typically have an option for the connection timeout
I think that's connect_timeout here: https://dev.mysql.com/doc/connector-python/en/connector-python-connectargs.html
Yes, but this timeout only works when connecting to the database and not if the connection has already been opened.
There should be a timeout
So if I set a timeout in the database connection configuration, does it stay for all parameters? (Commit, select, ect..)
I believe the timeout is more on the socket level 🤷♂️
:incoming_envelope: :ok_hand: applied timeout to @real raptor until <t:1715085858:f> (10 minutes) (reason: duplicates spam - sent 5 duplicate messages).
The <@&831776746206265384> have been alerted for review.
I'm going through a difficult time, can anyone help me (conceptually)
I need to process a folder that has several csv files and generate 3 other files (also csv)
The problem is that the files are of varying sizes (some very large)
currently I try to treat the files by dask dataframe
build the code that accesses the files and generates dataframe but when exporting it keeps getting a memory error
I guess they are related in some way? Meaning you need to use data from them all at the same time?
Has anyone ever gone through something similar?
yes, since I need to generate a concatenated file they
You said "yes" and then the rest of the sentence did not back up that yes. Just concatenating files can be done row by row.
generally speaking you have to either do things in chunks or via streaming when it becomes larger than your memory
just concatenating things do not require having any more than a single line in memory at a time though, even pure python can do that with no problems if you don't needlessly collect all things into memory
Is there another way to join files without having them all in memory? If there is, I don't know
JOIN or CONCATENATE?
concatenate
literally ```py
for file in files:
for line in file:
output_file.write(line)
You can write byte by byte. I think you have some conceptual problem about what file APIs look like.
You can open files in append mode. You can open them memory mapped. There are many ways.
not sure about how it works in dask
with pandas you would have to manually work in chunks since it does not have a lazy api
with polars it would also be trivial, scan_csv -> concat -> sink_csv
Skip/write first line of first file (should contain a header), then simply read some bytes from that file, write them into your destination
Repeat for each file 🤔
You don't have to keep everything in memory if you just want to concatenate them
Hey, do you guys have any tips on how to look for weird data/value in our database that would cause slowness in connected applications? (MySQL)
For example: app executing queries is slower than directly executing app's queries in the database terminal. I'm not sure if it's because database caching is available for either case or not, but I'm 100% sure that the problem is with the weird data in our database, because I'm able to reproduce the problem in another instance.
Is it bad idea to have something like this done for default admin user? https://github.com/Nipa-Dev/lemon-API/blob/main/postgres/init.sql#L27
Obviously not really secure in any way, but is there a way to create this admin user based on ENV variables? And how would I do that as password hash is required, which is done in here: https://github.com/Nipa-Dev/lemon-API/blob/main/lemonapi/utils/auth.py#L88-L102
or would it make more sense to try something else, maybe trying to initialize the account in app startup event?
also, what about the SQL queries in general, should something be changed/done differently?
Using the asyncpg/aiosqlite for non blocking use, is the same as to create a second thread and use a normal (blocking) postgresql/sqlite driver?
So you've done queries that don't involve this "weird data" and found those queries are as fast as terminal?
Then fix your data and whatever it is causing it to be "weird".
Yes, but I'm finding a way to discover the cause. Because that data is causing general slowness in the search function of our Laravel app (Snipe-IT). Would you able to suggest something like a query that would find data that often interferes with MySQL connectors (not language-specific, in general)?
can someone explain me what is sneak peek in sql database?
never heard about it in this context. Where did you hear it?
idk someone random in my dm asked me that
i was also thinking what he is trying to say
You should give us the full quote I think
he just said me that
he gave me this code only:
# Method to fill sneak peek number 1 data
pass
the database is mysql
I think you should ask him. This is not a database thing.
i also think so anyway thx
Why are you so sure it has to do with your data and not the way that you're accessing it?
Because I was able to reproduce this in a fresh Docker instance of that software. I exported the database and imported it into Snipe-IT's database, and the seatch is still slow.
it doesn't seem feasible to me that it's caused by only the data. if the query in the app is using parameterized queries, then the database can't optimize for the specific values of the parameters. but when you run the query directly, the DB does have access, and can optimize the query better. I would check the query plan to see if they are what i expect.
In my program I'm attempting to use sqlite3 database but my program isn't doing what I want it to do. I've deeply analysed my programming and nothing is wrong with my code, but there is something wrong with the database, because when I click on the database file, at the top it says: "this file was loaded in wrong encoding utf-8 ". How do I fix this? (I use pycharm Community edition)
I don't want to contradict you, but it sounds like there is something wrong with your code, or how you are running it. Can you pastebin the code?
or, there is something wrong with clicking it to open it. What application says the file was loaded in the wrong encoding?
The database file is a binary file; you can't open it in a text editor. See also https://www.sqlite.org/fileformat.html
import sqlite3
def make_user(name,pin,balance):
connection = sqlite3.connect('data.db')
cursor = connection.cursor()
cursor.execute('INSERT INTO users VALUES(?, ?, 0.0)', (name,pin))
connection.commit()
connection.close()
This is the creating user function. This function is imported and is being called by another file
import sqlite3
def find_user_by_name_and_pin(name, pin):
try:
# Establish a connection to the database
connection = sqlite3.connect('data.db')
cursor = connection.cursor()
# Execute the query
cursor.execute('SELECT name, pin FROM users')
user_name_and_pin = [{'name': row[0], 'PIN': row[1]} for row in cursor.fetchall()]
# Search for the user
for user in user_name_and_pin:
if user['name'] == name and user['PIN'] == pin:
print('User found!')
return user
# If no user is found
print(f'User with name "{name}" and PIN "{pin}" not found.')
return None
except sqlite3.Error as e:
print(f"Error accessing the database: {e}")
return None
finally:
# Close the database connection
connection.close()
And this the file where I try to compare the arguments I receive from the user and compare them to all the usernames and passwords in order to find the user
OK, and what application is open the file when you click on it?
What do you mean by application?
You said, "when I click the file, at the top it says..." Where are you clicking the file? Where exactly is "the top" where it says that?
So after the database was created, I clicked on the database file.
On the now opened database file, at the top of the file (as a message) the message said this file was loaded in wrong encoding utf-8.
I'll send a picture
@baki ☝️
Oh ok
I searched online if when the message "this file was loaded in wrong encoding utf-8" comes it can give you errors... is this true?
If not, please can you look at my code and see if there are any problems
your database is probably fine. You shouldn't try to open it here (I think this is VS Code?)
I've been trying for hours
Im using pycharm Community edition
notice the blue bar suggests installing a plugin that will know how to open SQLite files.
ok, that's the application opening the file then.
Ok I'll try downloading it
Ok, thanks for the help sir
Agree it seems fine, that photo looks about like what you see when you open a sqlite db somewhere you aren't supposed to 😛
Hey I am using dbeaver and I am using a sqlalchemy db. To connect to the db I use sqlite. When I try searching a column in the username in dbeaver I get the error below. Everything works in the code why am I getting the error in dbeaver?
Here is the error in dbeaver.
https://paste.pythondiscord.com/YO5A
I just want to add I view the the specific record piciho8411@ociun.com in the the user table but I cannot search it.
hi guys, do u know SAS programming language? there's a job with it in the description, I know c#, SQl, javascript, and Python primarily...does it work similar to one of those? which one? and how fast could i learn enough to mention it in my resume?..... what r your thoughts?
Out of curiosity googled it.
SAS looks like Fortran/Cobol to me in terms of syntax.
Supposedly data management language for data scientists and some data engineers accordingly.
I have some questions regarding how much it is outdated and used.
https://www.reddit.com/r/datascience/comments/j2x2wp/sas_is_easily_one_of_the_worst_languages_i_have/
This thread is useful full of opinions on it.
I will summarize it as tech from 30 years into past, that has some usages among non developers (language for mathematicians) for its a lot of GUI drag and drop stuff.
Tldr, prehistoric alternative to R/python looks like
I would avoid its learning at any cost I think, even if I was data scientist/mathematician
as funny argument, if u will learn it, then u will posses negotiating skills similar to having learned COBOL i guess
Oh yes, COBOL and SAS are very closely located. 0.85% for COBOL https://www.tiobe.com/tiobe-index/
And 0.76% for SAS.
So u will know language even more rare someone knows than COBOL. That is some argument for very interesting salary negotions.
TIOBE
why would they use it then? maybe I can change the whole structure, make them migrate to Postgresql.... however it said to be required 😔 and I really need the upgrade of jobs... say bye to Walmart forever 🤣
even better! (worse) this language is not open sourced 😄 A single company SAS smth is owning it and desinging its GUI to click things
shrugs. For some companies COBOL/FORTRAN/SAS can be indeed a thing they long used
because they are probably over 50-60 years old in average and that is a thing they got used to
like... u can hit every nail with a tool you got overly comfortable
they were not lucky to get comfortable to this, or it was a right decision at a moment of their time, or they learned it because of succesful SAS propaganda and or they had many ||ancient|| mathematicians in their ranks in general.
Considering that it is closed source ancient language of mathematicians with such... history. Tbh nothing comes to my mind for real reason except succesful subterfuge propaganda among mathematicians in a very long play (it was first released in 1972 year after all)
funny enough developed and owned by company in US as well. So it is a very close brother to COBOL.
I think it's probably easy to learn.
I think you are confused about fortran vs cobol
I'm finally less dumb about SQL now 🙂
Database triggers
So basically in my application i have to inc or dec a few values in a few tables on creation / update / deletion of rows in some other table
Currently using sqlalchemy i query all required rows modify them and commit to database these are never used again, only queried to modify 1 column of said row
So i came across database triggers on postgres, i think with them this can be done faster and simpler
Am i in the right direction ? Does this sound like a good way to achieve my requirements ?
Maybe yea. But also maybe you are doing the update wrong? You can tell SQL to increase/decrease instead of giving the full number. update foo = foo + 1 where .... or something like that. In Django you do that with F objects.
Yeah instead of querying i could just execute a update statement on the database in sqlalchemy, now I'm not sure which way sounds better
If i do that in backend it's a bit simpler and explicit easy to understand how the value gets updated
But with triggers it's abstracted to the database side, might be quicker and backend code becomes simpler
If the trigger does a full select it's also more likely to be correct.
Yeah i think triggers are the way to go here, I'll look into them, thank you for the help
I personally find triggers hard to work with because your logic because fractured beween db and your application 🤔
As boxed mentioned you can simply do an update, is has some benefits over doing simple update ... set ... like sqlalchemy would do when you just save your model
update ... set field = field + 1 wouldn't cause any race conditions
I think if you DO have triggers, you want them written in your code and have code on deploy that makes sure the triggers that exists are exactly matching those in the code.
with sqlalchemy, i believe you can define triggers in your code. i agree it's easy to get them out of sync (along with stored procs, rules, and so on), so you need to do some effort to keep them in sync
hey guys anyone knows how to work with couch base
Hello guys, little question here. In college while doing Databases I they told us that relational databases shouldn't have "lists" in a column.
I'm trying to do a very simple library database. I have topics, and authors. Each book can have more than one topic and more than one author and visceversa.
Should I make 2 different tables for the relation topic - book and author - book?
I was just wondering if this is generally considered "the best way". I'm not particularily concerned about speed or storage since it's gonna be a relatively small database but still it would be good to do something good.
I was thinking maybe I have a table for topic-book relationships and then I also have one "list" inside the books so that I can quickly fetch the topics from that book? That way if I want to quickly see all the information about a book I don't have to do another look up in that other table and if I want to search by topics I don't have to look inside every single book to see if they have that topic.
i believe doing this would make it easier to accidentally get your topics out of sync since you'd be duplicating the topics in your book and book_topic tables, hence why avoiding lists is the first normal form in database normalization
https://en.wikipedia.org/wiki/Database_normalization
one association table with a compound primary key and an index with the keys in reverse should be enough to get efficient lookups for book->topics and topic-books while satisfying 1st normal form
an index with the keys in reverse, I haven't really messed with that kind of stuff, is something that you configure within the table?
I see, thank you very much :)
Generally yes, but it depends
If the column doesn't depend on any other piece of your database and nothing depends on it then it can be anything, e.g. json
Alright, I have the following query now:
SELECT book.isbn, book.title, book.place, book.publisher, book.dateIssued,
book.edition, book.abstract, book.description, book.ddcClass,
GROUP_CONCAT(DISTINCT author.name) AS authors,
GROUP_CONCAT(DISTINCT topic.topicName) AS topics
FROM book
LEFT JOIN bookAuthor ON book.isbn = bookAuthor.isbn
LEFT JOIN author ON bookAuthor.authorName = author.name
LEFT JOIN bookTopic ON book.isbn = bookTopic.isbn
LEFT JOIN topic ON bookTopic.topic = topic.topicName
WHERE book.isbn = ?
GROUP BY book.isbn, book.title, book.place, book.publisher, book.dateIssued, book.edition;
Which returns authors as a single string, for example: "Bioy Casares, Adolfo,Borges, Jorge Luis,Ocampo, Silvina"
As you can see it's pretty hard to parse (authors are ["Bioy Casares, Adolfo" , "Borges, Jorge Luis", "Ocampo, Silvina"])
I could try to use a separator to make it easier to parse, but GROUP_CONCAT(DISTINCT author.name, "|") is not supported. I could try using subqueries like so:
SELECT book.isbn, book.title, book.place, book.publisher, book.dateIssued, book.edition,
(SELECT GROUP_CONCAT(author.name, '|') FROM (SELECT DISTINCT author.name FROM author INNER JOIN bookAuthor ON author.name = bookAuthor.authorName WHERE bookAuthor.isbn = book.isbn)) AS authors,
(SELECT GROUP_CONCAT(topic.topicName, '|') FROM (SELECT DISTINCT topic.topicName FROM topic INNER JOIN bookTopic ON topic.topicName = bookTopic.topic WHERE bookTopic.isbn = book.isbn)) AS topics
FROM book
WHERE book.isbn = '?';
But I think subqueries are less performant than joins? I want to retrieve the authors and topic strings and parse them to get a list of authors and topics. What would be the best way of doing this? (is using GROUP_CONCAT even advisable?)
(bookAuthor and bookTopic are indexed as was suggested earlier)
Why would you need DISTINCT in the concat, the link should be unique already? And if you select the same column you are using to link with another table, you do not need the other table at all.
SELECT book.isbn, book.title, book.place, book.publisher, book.dateIssued,
book.edition, book.abstract, book.description, book.ddcClass,
GROUP_CONCAT(bookAuthor.authorName, '|') AS authors,
GROUP_CONCAT(bookTopic.topic, '|') AS topics
FROM book
LEFT JOIN bookAuthor ON book.isbn = bookAuthor.isbn
LEFT JOIN bookTopic ON book.isbn = bookTopic.isbn
WHERE book.isbn = ?
GROUP BY book.isbn, book.title, book.place, book.publisher, book.dateIssued,
book.edition, book.abstract, book.description, book.ddcClass;
the link is not unique:
('9789875662445', 'Antología de la literatura fantástica', 'Buenos Aires', 'Debolsillo', '2014', '9na. ed.', None, '407 p. : il.', 'A863 732ant', 'Bioy Casares, Adolfo|Bioy Casares, Adolfo|Bioy Casares, Adolfo|Bioy Casares, Adolfo|Bioy Casares, Adolfo|Borges, Jorge Luis|Borges, Jorge Luis|Borges, Jorge Luis|Borges, Jorge Luis|Borges, Jorge Luis|Ocampo, Silvina|Ocampo, Silvina|Ocampo, Silvina|Ocampo, Silvina|Ocampo, Silvina', 'ARGENTINA|CUENTOS|DOCUMENTOS TEORICOS O METODOLOGICOS|LITERATURA|LITERATURA ARGENTINA|ARGENTINA|CUENTOS|DOCUMENTOS TEORICOS O METODOLOGICOS|LITERATURA|LITERATURA ARGENTINA|ARGENTINA|CUENTOS|DOCUMENTOS TEORICOS O METODOLOGICOS|LITERATURA|LITERATURA ARGENTINA')
The separator is working correctly but it's very clear that authors and topics repeat multiple times :/
although it's true that I was using redundant LEFT JOINs
What database is this? And the join should be unique. Why is the same author joined multiple times with the book?
sqlite
also this is what I get from this query:
sqlite> SELECT * FROM bookAuthor;
9789875662445|Borges, Jorge Luis
9789875662445|Bioy Casares, Adolfo
9789875662445|Ocampo, Silvina
9789875780620|Cortázar, Julio
9500420457|Borges, Jorge Luis
9879243501|Muslip, Eduardo
043527032X|Dickens, Charles
043527032X|Tarner, Margaret
043527032X|Wilson, Kay Mary
9500718227|Benedetti, Mario
Doesn't look like the same author is repeated in the same book, so it's not a problem with bookAuthor having repeated entries or anything
