#databases

1 messages · Page 21 of 1

paper flower
#

using a lock?

waxen finch
#

yeah i added a lock to my benchmark just after pool.acquire()

paper flower
#

🌚

waxen finch
#

man that like defeats the purpose of a connection pool

#

well probably not because 1000/7s is still 142 executions per second, but i expected better performance from running them concurrently... i guess the alternative is carefully sharing one connection without holding it for too long

#

or opening an indefinite number of connections, which i found wasnt a good idea when i implemented my NullPool and attempted running 1000 queries in 1000 threads

vernal blade
#

someone who is very good at sql, please help me. ive open a help channel

warped flax
#

JSON Database best

queen rose
#

why is it such a pain in the ass to install the mysqlclient library smh

halcyon dew
#

some projects such as dbots and traditional web apps are best suited for SQL, while others are more suited for MongoDB

warped flax
halcyon dew
#

MongoDB is a NoSQL database that stores data as documents in BSON format. Not recommended in general as most of Discord data you are storing is relational (e.g. economy things) while mongodb is for non-relational data, hence there is no reason to use NoSQL over SQL to store relational data.
^ that's the reason above

warped flax
#

other than JSON DB

halcyon dew
warped flax
halcyon dew
warped flax
#

database.json

halcyon dew
#

Leads to data corruption, file locking, file permission issues, no atomic reads/writes, loading all of the data into RAM (which is a costly mistake), and many others

warped flax
#

wha?

halcyon dew
#

funnily enough even MongoDB would be better than just a file JSON db

halcyon dew
queen rose
# warped flax discord.py

by the sound of it, you might want to look into Sqlite3, its plenty fast, and stored as a single file

halcyon dew
#

SQLite works great for smaller bots

#

i just primarly use PostgreSQL bc i know that most of my bots will probably be running off of a PostgreSQL server instead. Primary bot uses both PostgreSQL and Redis in order to not screw over my RAM usage and Redis is used as a cache for server config values

warped flax
halcyon dew
warped flax
#

Because PostgreSQL requires money in r- ok.

halcyon dew
#

and plus replit's fs is ephermal, which means that any file you saved via your bot will be overwritten when you next launch.

warped flax
#

But replit:

halcyon dew
#

pretty much meaning that your data would be gone

halcyon dew
halcyon dew
warped flax
halcyon dew
#

Replit's postgres db are trash anyways. I can just get a vps and just host it there instead

halcyon dew
warped flax
#

Please don't think of me as stupid, I'm new to coding and all.

delicate fieldBOT
#
Discord Bot Hosting

Using free hosting options like repl.it or Heroku for continuous 24/7 bot hosting is strongly discouraged.
Instead, opt for a virtual private server (VPS) or use your own spare hardware if you'd rather not pay for hosting.

See our Discord Bot Hosting Guide on our website that compares many hosting providers, both free and paid.

You may also use #965291480992321536 to discuss different discord bot hosting options.

warped flax
#

I want free

#

Free VPS

halcyon dew
halcyon dew
#

Please make sure you read this fully

When you look for hosting your bot, you might be tempted to use free services or providers that claim to support hosting discord bots . Do not use them. Why? Let me explain.

Free hosts and tiers always have drawbacks - The point of a free tier or a host is to get the user to use their product, and upgrade for all of the good features. Hosts such as replit use shared cpus, oftentimes will run into ratelimits, and have "gotchas" that make you tempted to upgrade to their paid tiers. They also literally remove all control out of how you run your bot. Think about it: Hosts need to make money somehow in order to either break even or gain profit. So naturally they will tempt you into upgrading in order to do that. This is basic economics and quite literally how businesses run.

#

If you are still looking for a host after reading this, then you have some options. And here are them:

  1. Pay up - Pay up and use a VPS provider. My pick is Heznter, but see ?tag vps for the full list (if you are on the Python discord server, then see !hosting for your options). More than likely if you are not an eligible HS or college student, this is your option.

  2. Self Host - If you are able to self host your own server, then go for it. This is really the only "free" way if you already pay for your electricity bill and have the equipment to do so. Raspberry Pis oftentimes are great for this.

  3. GitHub Student Developer Pack - IF you are a enrolled student in an HS or college (you will need to either provide an valid student ID or proof of education of your HS or college), then you may apply for this deal. More info about this can be found here. The deal offers $200 free credits on DigitalOcean for 1 year, so if you are a student and need to host your bot, go for it.

#

ik this is more dbot related stuff but just had to put it out there

gentle zinc
#

how do i clear my console on windows? i'm using the mysql shell plugin for vscode

#

! cls doesnt work

fading patrol
gentle zinc
#

that's for linux mate

#

clear doesnt work on windows \ ! cls supposedly works

#

but it aint

#

works on the cli

#

as u can see

fading patrol
# gentle zinc that's for linux mate

And PowerShell (on Windows), like I said... but that's not what you're using. You did say MySQL... here's the first result from a web search: https://stackoverflow.com/questions/8807836/how-to-clear-mysql-screen-console-in-windows

gentle zinc
brazen charm
#

A master class in how not to get helped next time 😅

torn sphinx
#

i have an almost fully functional stat bot tracker. i just need help making the data go to a database.. the time in vc resets every time i leave the call and i want all of the data to be on a database but i am unsure of how to set it up

blissful steppe
#

Hello, can anyone suggest any libraries for working with large parquet datasets?

cosmic drift
blissful steppe
coral wasp
blissful steppe
#

Can't find a way to stream rows with polars, duckdb doesn't appear to have any sort of streaming

#

Dataset is larger than memory'

paper flower
blissful steppe
#

I have a table with 3 columns, (a, b, c) and I want to create a new table with 2 columns ([a | b], c). - So that a query for rows with some a/b value returns the unique c values that is associated with the a/b value on the original data

# (a, b, c)
(5, 6, 7)
(8, 5, 7)
(9, 3, 5)
(8, 6, 9)
# ->
# (a|b, c)
(5, 7)
(6, 7)
(6, 9)
(8, 7)
(8, 9)
(3, 5)
(9, 5)
fringe sundial
#

Guys is docker only way to use cockroachdb locally?

brazen charm
#

Docker is by far the easiest way to set it up

fringe sundial
brazen charm
#

I mean.. Why are you using cockroach of postgres then?

brazen charm
#

I mean if you have super low resource requirements then postgres is going to be better

fringe sundial
fringe sundial
brazen charm
#

why?

#

Are you actually doing anything which actually makes these alternatives useful?

fringe sundial
brazen charm
#

neon DB is not really just a postgres host

fringe sundial
#

shoudl i use mongodb the, i dont wana use any db locally

brazen charm
#

It's complicated, but it's a cloud giving you effectively serverless postgres but has things like storage decoupling

#

Litterally every cloud provider and their dad has some hosted postgres offering

fringe sundial
#

but neon db is free i guess

brazen charm
#

Mmm yes but you're charged per unit of a metrics (Compute, Storage, Data transfer, Data written)

#

For small projects though it is pretty cheap

fringe sundial
#

ye, i just making a social media app for practice

fringe sundial
brazen charm
#

I would prefer using postgres over mongo for 99% of things, so probably neon IG if you really don't want to self-host

brazen charm
#

sorry, mistyped

#

I meant to say postgres over mongo

fringe sundial
brazen charm
#

I mean you just said neon

#

Neon provides a postgres API

#

which is all you need really

river vortex
#

I am trying to create a simple IRC chat platform using the socket library in Python. I've got that part going well, but I want to implement encryption to ensure security. The solution I thought of for that would be to have one public key for the server and one private key for each client. The user would define a matching encryption passphrase within the server & client and matching keys could be generated individually based on that passphrase. With this logic messages could be kept within the bubble of a single server securely and the server would only have to encrypt it once and every client would be able to decrypt it with their own key. Would this work? If so, what's a good library I could use to do it or could someone send a good tutorial/documentation?

bright hound
#

That's not really a database question

#

But this is

#

I'm making an app to track some grocery prices over time. I'm going to be tracking potentially thousands of UPCs. What's the best way to do this? I've used Parquet in the past to store stock price data but there were only a hundred or so tickers. I could use a SQL database also and have been looking at QuestDB. I have used MongoDB in the past also.

Overall I'm not super familiar with using databases so I'm wondering if anyone has any thoughts on the pros and cons of each type of database for this particular application

keen minnow
#

maybe with an extension for timeseries if it's really required for fancy timeseries computations. But I would suggest to keep it simple and stupid and start with something simple like postgres

coral wasp
keen minnow
#

(it comes down to if you want to prioritize the metadata and relationships around the UPCs or the TS itself)

bright hound
#

Ideally I'd like to build a dashboard to display charts of price trends etc

#

Nothing really fancy on the math side

#

I have used Postgres in the past and was underwhelmed with the GUI for it

#

But maybe this time I'll do strictly command line

#

The only thing I'm wondering is this: I'm only going to be storing UPC, Description, Price, and Date. With so few rows/columns wouldn't it make sense to just store everything in one table? And if that's the case does it make sense to use a relational database?

brazen charm
brazen charm
bright hound
#

Yeah I was looking at QuestDB, but really it's a timeseries for each UPC

#

So idk what the heck that would look like

brazen charm
#

I mean that shouldn't affect things too much, since that could just be your partition key probably

bright hound
#

What would be

brazen charm
#

eh dw about it actually

#

I don't think QuestDB should really struggle with anything, or ClickHouse really.
Both take very little configuration to be performant

bright hound
#

So here's what I'm thinking

#

If I did parquet I could have a parquet file for each read date (once a week) and then 6000 columns inside it (one for each upc)

#

oh no

#

nvm

brazen charm
#

Why cant it just be like this: UPC, Description, Price, and Date

bright hound
#

and price and date are lists?

#

oh

#

like every week make a new entry for each UPC

#

man

brazen charm
#

Well I would just have 1 row = one price, one date, one upc

bright hound
#

right

brazen charm
#

you can have multiple rows for a single date and upc

bright hound
#

Yeah. Just feels like so much duplicated data

brazen charm
#

but for storage efficiency, performance and convenience, you probably shouldnt have them in nested collections

bright hound
#

What if I added a new column each week

brazen charm
#

Don't add columns

bright hound
#

oh I see

#

Maybe I'll go with QuestDB then

#

It was built for this kind of thing

brazen charm
#

Quest or ClickHouse yeah

bright hound
#

Ok it makes more sense to me now. I wasn't even thinking that you could just add another row each week for the same upc

brazen charm
#

Yeah, having them be consistent columns and flat also makes aggregation and what not easier

bright hound
#

right just do a groupby or select where or whatever

#

Appreciate the help

brazen charm
#

They are cherry picking the query for that graph 😅

#

I wouldn't put too much thought into it

bright hound
#

Yeah I figured. I also realized my math was wrong though and I need to track 325,000 products

#

6500 api calls * 50 products per call

coral wasp
#

fwiw, since clickhouse was mentioned, I'd also throw duckdb into the discussion. Postgres is probably fine, but if you're looking at the analytical db route, and are throwing it in parquet, then it's as simple as: ```py
import duckdb
df = duckdb.execute("select category, count(*) from 'myfile.parquet' group by category").df()
print(df)

bright hound
coral wasp
bright hound
#

Yeah. I enjoyed using it in the past. Just not sure if it's the right way to go with this. Almost tempted to use something I've never used before just for the learning experience, but idk

#

Kind of leaning toward mongo

coral wasp
#

that's probably the worst choice for what you've said so far.

#

(sorry)

bright hound
#

Really?

#

Dang it

#

I have no sense of scale. 325,000 products updated once per week is going to end up being a lot of data right?

coral wasp
#

It sounds like you have an analytical problem with a simple structured dataset. Therefore, an OLAP/columnar db is probably a good fit.

#

(but again, postgres is probably fine, and has columnar extensions)

bright hound
#

If I'm going to use columnar I'll just use parquet and I'll check out duckdb

coral wasp
#

And in a columnar store, you'll get significant compression.

bright hound
#

Not sure. Haven't done one yet but each row would have maybe 6 columns max

coral wasp
#

Just doesn't seem like that much data to me. If it were once per hour or minute, I'd worry about timeseries.

bright hound
#

Ok. Yeah the number just seems big to me because I don't do this often. I know databases get much much larger

#

Appreciate the input

#

It appears I should also switch to polars from pandas lol

coral wasp
#

yah, it's pretty ridiculous how bad pandas is. Although, I avoid both as much as possible and use duckdb (this is not really a popular view around here, it's just my view)

keen minnow
#

At this stage, I wouldn't worry about the database

harsh pulsar
harsh pulsar
#

if you're using pandas to process the 325k records weekly you will have no problems, although for a production app polars will probably save you some cpu cycles

#

for analyzing a time series dataset of millions or billions of rows, definitely don't use pandas. i believe polars can do out-of-core data processing with lazy data frames, but really you'd want to consider something like spark at that point, or an actual OLAP database

#

definitely do not use mongo

bright hound
#

I'll have to see what I'm actually going to end up doing with the data. I think very rarely would I ever be accessing all the data at once

#

Usually it will just be plotting the price trend of one product at a time

#

Or possibly the averages of products in a category

harsh pulsar
bright hound
#

I was, yes

harsh pulsar
#

so you're writing 325 million rows / week?

bright hound
#

I guess so. Really all I need to do is add a data point per product

harsh pulsar
#

(assuming some of those are updates to existing entities, but you shouldn't be throwing out historical data so it's the same either way)

#

thats like what, 16 million rows a year?

keen minnow
harsh pulsar
bright hound
#

But honestly I think I'm going to go with parquet

harsh pulsar
#

hell you can put all that in a single pandas dataframe in memory and have little trouble

#

ive worked with 1 billion rows before completely in memory using the R data.table package

bright hound
#

lmao\

coral wasp
#

Fwiw; my main issue with pandas (and polars) is its yet another API to learn when there’s already SQL: the one data language to rule them all.

harsh pulsar
#

oh yeah i remember that argument. i don't agree at all (partly because SQL is a super clunky API that you have to embed in python strings) but that's a separate discussion

coral wasp
#

(Just clarifying my earlier point)

harsh pulsar
#

yeah its an interesting perspective

keen minnow
#

imho, most questions related to the choice of db should start with: why is postgresql not working for you? What have you tried?

waxen finch
harsh pulsar
#

++++

bright hound
#

You can't do a lot of transformation with SQL can you?

harsh pulsar
#

ask any data analyst what they think of in-place updates

#

you'll get groans and head shakes

waxen finch
#

ah right, cant plot data thats deleted

bright hound
harsh pulsar
#

not just that. what if the CEO wants a report of user % with profile pics over time? i can't give them the current number, i need to know who had a profile picture every month

keen minnow
harsh pulsar
#

@bright hound how is this data actually coming to you?

bright hound
#

Yeah I was storing streaming data from a websocket for that one, but for this project I'm going to be making async api calls

harsh pulsar
#

is it a weekly dump of 325k records, or is it through an API or something?

coral wasp
bright hound
#

6500 api calls

keen minnow
bright hound
#

It was pgadmin

#

And I couldn't even open it to view my database because it would take 30mins to unfreeze

harsh pulsar
# bright hound 6500 api calls

honestly? save each raw request/response pair to a file and figure it out later. ideally you'd use a database but there's something to be said for storing the rawest form of data possible and cleaning it up later

bright hound
#

Not sure what the problem was

coral wasp
bright hound
#

I've got the data cleaned pretty much. Whittled it down to 7 columns. Figuring out how to make pandas accept a nested list when creating the dataframe

coral wasp
bright hound
#

1 sec

#

So I have a list of lists, and the last item in each sub-list is also a list

#

So pandas is giving me trouble about that

coral wasp
bright hound
#

yes

#

1 sec

#

Yeah

coral wasp
#

You could convert the categories to a string. Or, you could onehotencode them

bright hound
#

So the last column needs to be of type 'object' I think

#

oh just make more columns

#

I hate that

#

lol I'm OCD

coral wasp
#

Yah. But if you just want this to work, convert to a str

bright hound
#

Yeah you're right. Then just split it later

coral wasp
harsh pulsar
#

you know honestly, JSON request/response pairs are actually an OK use case for mongo

#

json files in an s3 bucket do a similar job though

bright hound
harsh pulsar
#

assuming you don't need to query this data except to process it a database

coral wasp
#

Not to be a broken record but duckdb supports lists and structs as a column type 🙂

bright hound
#

I will need to query it though. Like if I want to track the trend of the average price of Frozen sausages I'll need to be able to get that

coral wasp
#

(That was more for srl, stick with what you’re doing

bright hound
#

Which I still could with one hot

#

Oh right

#

Not actually querying it

#

I'll be loading the data as needed

coral wasp
#
import pandas as pd
data = [["a", "b", "frog, cow"]]
df = pd.DataFrame(data, columns=["col1", "col2", "col3"])
dummied = df["col3"].str.get_dummies(',')
pd.concat([df, dummied], axis=1)
#
0    a    b    frog, cow    1    1```
coral wasp
bright hound
#

Right. Then use loc to find where cow == 1 after you read it back in from the parquet file

#

Or something similar

#

Or idk honestly if what I'm saying even makes sense. Been coding all day. Brain fried lol

torn sphinx
#

how do i make a db

spare pendant
#

that's a loaded question
what db do you need/want

torn sphinx
#

just to store my keys

#

and restic there hwid/ip

spare pendant
#

i can only speak about SQLite, if you need anything else i can't really help

torn sphinx
spare pendant
# torn sphinx idc what is it i just neeed something

you might want to do more reading on what a database is, because if i start telling you to use SQLite without you knowing what SQL database is or how you interact with it we're both going to have a hard time understanding each other

torn sphinx
#

ok

#

k

#

but what can i do for now cause i got to drop this shit soon

spare pendant
#

i would recommend trying sqlite because its a simple disk-based SQL database, and what knowledge you get can be transferred to other sql databases like MySQL, PostgresSQL, etc

spare pendant
# torn sphinx but what can i do for now cause i got to drop this shit soon

this should get you up to speed
https://realpython.com/python-sqlite-sqlalchemy/

In this tutorial, you'll learn how to store and retrieve data using Python, SQLite, and SQLAlchemy as well as with flat files. Using SQLite with Python brings with it the additional benefit of accessing data with SQL. By adding SQLAlchemy, you can work with data in terms of objects and methods.

#

you dont need sqlalchemy just yet i dont think, but it's a good thing to know and learn

torn sphinx
#

ok

#

ty

pearl lodge
#

guys, anyone here knows mongodb atlas?

#

i need help with this:

raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: encoding with 'idna' codec failed (UnicodeError: label empty or too long), Timeout: 30s, Topology Description: <TopologyDescription id: 64fe05c96c3a72c737ffff88, topology_type: Unknown, servers: [<ServerDescription ('mongodb%2bsrv%3a%2f%2fhirukarogue%3adv%40plasma123%40rpucloudserver.bscpl0p.mongodb.net%2f%3fretrywrites%3dtrue%26w%3dmajority', 27017) server_type: 
Unknown, rtt: None, error=UnicodeError("encoding with 'idna' codec failed (UnicodeError: label empty or too long)")>]>

i'm with problem in idna

#

i have my database but i can't connect to it

#

in atlas mongodb

warm bough
#

greetings

viral girder
#

anyone here care to share there fav database ?

paper flower
wise goblet
coral wasp
keen minnow
#

with docker/podman, it's so easy to get postgres up and running anyway

wise goblet
coral wasp
#

But yes for op: PostGres is the first stop for most ppl

wise goblet
paper flower
#

Honestly even for pet projects I'd prefer posgresql - more features, no pitfalls such as disabled foreign keys by default

halcyon dew
#

if you dont know what the #cal reference is, look up on google where postgres was first made

#

Hint: It's the same university that made BSD

keen minnow
halcyon dew
#

i mean UC Berkeley apps are open rn so feel free to apply lol

keen minnow
#

lol true

#

it's a nice campus too

halcyon dew
#

yep agreed

paper flower
orchid forge
#

ay yes, DATA

brazen charm
#

you can enable it pretty easily though

paper flower
brazen charm
#

yeah... Unfortunately backwards compatibility is a bitch

median marten
#

hi

#

i have this function

def to_datetime(_v: Optional[str]) -> datetime.datetime:
    if _v is None:
        return datetime.datetime.now(datetime.timezone.utc)
    
    date = datetime.datetime.strptime(_v, "%Y/%m/%d %H:%M:%S").replace(tzinfo=datetime.timezone.utc)
    return date

that returns two datetime aware objects, i pass these datetime aware objects to asyncpg and it raises:

Ignoring exception in slash command 'create':
Traceback (most recent call last):
  File "asyncpg\protocol\prepared_stmt.pyx", line 168, in asyncpg.protocol.protocol.PreparedStatementState._encode_bind_msg
  File "asyncpg\protocol\codecs/base.pyx", line 206, in asyncpg.protocol.protocol.Codec.encode
  File "asyncpg\protocol\codecs/base.pyx", line 111, in asyncpg.protocol.protocol.Codec.encode_scalar
  File "asyncpg\pgproto\./codecs/datetime.pyx", line 152, in asyncpg.pgproto.pgproto.timestamp_encode
TypeError: can't subtract offset-naive and offset-aware datetimes
#

but they're both offset-aware datetimes, are they?

hollow oar
median marten
#

i have two variables that represent a date in the following format "%Y/%m/%d %H:%M:%S", on both i'm calling the to_datetime method and passing them to asynpg to be inserted into my table, now they should be both offset-aware datetimes

hollow oar
#

it would be beneficial to post the snippet of code where to_datetime is used as well.

#

and is that the entire traceback?

median marten
#

no, lemme paste everything

#

!paste

delicate fieldBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

median marten
hollow oar
#

what's the column type of created_at?

median marten
#
                id                  SERIAL       PRIMARY KEY,
                created_by          bigint,
                starts_at           timestamp,
                ends_at             timestamp,
                winner_numbers      integer,
                win_goal            integer,
                prize               text,
                partecipants        bigint[],
                title               text        NOT NULL,
                description         text        NOT NULL
delicate fieldBOT
#

codecs/datetime.pyx line 152

delta = obj - pg_epoch_datetime```
median marten
#

is pg_epoch_datetime a tz aware object?

delicate fieldBOT
#

codecs/datetime.pyx line 17

pg_epoch_datetime = datetime.datetime(2000, 1, 1)```
median marten
#

then isn't that the issue?

hollow oar
#

sorry, isn't what the issue?

#

timestamp you have in your table definition is timestamp without time zone

asyncpg/pgproto is using the encoding function for timestamp without time zone correctly, which uses tz naive pg epoch

yet you have supplied a tz aware datetime object to the query, as such it blew up with such a cryptic message in the encoding function

naive grotto
#

im using sql for storing the body text of my blog posts with django, is sql good for this application?

harsh pulsar
#

you'll almost certainly never get useful help asking questions like this. it forces people to "interview" you to even determine if they can answer your question. the general principle is "don't ask to ask". just state your question.

also this is a python server, java questions aren't on topic

cinder cypress
viral girder
shut tiger
naive grotto
wise jewel
#

I don't know if this is the correct channel to talk about the subject. But i'm syncronizing a microsoft sql database to a mysql database. But there a some columns with file stored in base_64 in it. This is really slow to copy and bottlenecking my speed. How should i approach this to fasten it up

brazen charm
#

😅 store the files in something like s3 rather than the DB

#

If these are big files, you are always going to be killing your DB

wise jewel
#

Yeah, I know but I didn't start the project and have to clone the db nightly

#

I know it's a bad approach but i'll have to work with it :p

brazen charm
#

The other way to speed it up is to throw more money at the problem, I e. nvme disks, more memory for more caching

wise jewel
#

Ok thanks for the insights, i was hoping for a little magic but i guess i'll just run them in batches and cope with the long runtime

brazen charm
#

you could maybe look to see if you can incrementally update the DB? Like actually do the duplicatation to the second DB when it happens rather than doing it all at once

#

Or maybe... If it's supported dump MSSQL DB to a flat file like parquet, CSV, etc... and then ingest that way

wise jewel
#

ow, the csv route is actually quite interesting. I'll check to do it that way. cause there is one table already with 40K records and 4 of those columns so it's really thrashing the speed

coral wasp
#

(So you don’t have e to mount the entire dump). But yah, generally; externalize you large blobs lest you end up with an unrecoverable database

#

Me: I’d consider exporting to parquet files. But everything I do involves parquet nowadasys

wise jewel
torn sphinx
#

Hola
Alguien habla español?

fading patrol
fading patrol
pastel wren
#

does anyone know the right way to do declarative base in sql alchemy 2.0? I'm updating to the latest and i want to not get this warning anymore
The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base()

#

do i import it differently now?

waxen finch
white gyro
#

Is 15 days a plausible amount of time to develop a mobile app that can perform and update database interactions in real time from scratch?

#

Taking chatgpt into account too

foggy juniper
#

Why is this:

@group.command(name="decree", description="Store a mayoral decree to the database")
    @app_commands.checks.has_any_role(583496754712805376, 1087922383698014279)
    async def decree(self, interaction: discord.Interaction, decree_name: str, decree_number: int, signed_by: str, decree_status: str, decree_link: str):
        if decree_link.startswith("https://drive.google.com/file/d/") or decree_link.startswith("https://forums.stateoffirestone.com/") or decree_link.startswith("https://docs.google.com/document/d/"):
            async with aiosqlite.connect("/home/pi/Documents/Arborfield_Automation/arborfield_backup.db") as db:
                message = await interaction.response.send_message("Storing decree to database...", ephemeral=True)
                try:
                    await db.execute(f"INSERT INTO decrees values('{decree_name}', {decree_number}, '{decree_status}', '{signed_by}', '{decree_link}')")
                    print(f'Added Mayoral Decree {decree_number} - {decree_name} to database')
                    await db.commit()
                    print('Saved')
                    await message.edit(content="Decree stored to database!", ephemeral=True)
                except Exception as e:
                    await message.edit(content=f"Error: {e}", ephemeral=True)
                    print(f'Ignoring exception in command decree: {e}')
                    pass
                pass
        else:
            await interaction.response.send_message("Use a Google Drive or Forums link", ephemeral=True)
            pass
        pass```
#

Creating this error?

Traceback (most recent call last):
  File "/home/pi/Documents/Arborfield_Automation/cogs/arborfield-automation-store.py", line 20, in decree
    await db.execute(f"INSERT INTO decrees values('{decree_name}', {decree_number}, '{decree_status}', '{signed_by}', '{decree_link}')")
  File "/home/pi/.local/lib/python3.9/site-packages/aiosqlite/core.py", line 184, in execute
    cursor = await self._execute(self._conn.execute, sql, parameters)
  File "/home/pi/.local/lib/python3.9/site-packages/aiosqlite/core.py", line 129, in _execute
    return await future
  File "/home/pi/.local/lib/python3.9/site-packages/aiosqlite/core.py", line 102, in run
    result = function()
sqlite3.OperationalError: no such table: decrees```
keen minnow
#

sqlite3.OperationalError: no such table: decrees

#

It means it could not find such table

foggy juniper
#

From SQLite Studio

keen minnow
# foggy juniper

is it connecting to the same db?
It may be worth logging which db it's connecting to and listing the available tables on start

foggy juniper
#

Whoops

#

Bruh, today has just been one of those days

keen minnow
#

It happens. That's why it's good to have a rubber duck or to talk it out

foggy juniper
#

I perfect bonking sticks

torn sphinx
#

random but I've started using regex so much in my job and I'm loving it. previously I was relying on the limited LIKE operator in Snowflake lol

shut tiger
#

easy, more seductive

ripe horizon
#

hi,
In psycopg2 is is possible to catch this exception type? ObjectNotInPrerequisiteState

I tried to write except psycopg2.ObjectNotInPrerequisiteState:, but PyCharm is telling me that doesn't exist

#

it does let me write this though: except psycopg2.OperationalError:

brazen charm
#

Its the error code

#

not an actual type

#

so you need to catch the operational error

#

and then get the error code from that

#

e.g.

try:
    cur.execute("LOCK TABLE mytable IN ACCESS EXCLUSIVE MODE NOWAIT")
except psycopg2.OperationalError as e:
    if e.pgcode == psycopg2.errorcodes.LOCK_NOT_AVAILABLE:
        locked = True
    else:
        raise
ripe horizon
#

thanks!

waxen finch
#

seems to work fine importing psycopg2.errors.ObjectNotInPrerequisiteState

waxen finch
brazen charm
#

ah yeah, you're right

ripe horizon
#

my IDE doesn't seem to like it though :/

hollow oar
#

use types-psycopg2 from typeshed
https://pypi.org/project/types-psycopg2/
it will reveal the dynamically generated types to your editor for type checking/hinting purposes
see: https://github.com/python/typeshed/blob/main/stubs/psycopg2/psycopg2/errors.pyi

if you inspect the source of psycopg2, you will see the types are infact dynamically generated and highlights the need of using something from typeshed (or anything that provides the correct *.pyi really..)

(edit: as mentioned below, you would still need to do a import psycopg2.errors even with types-psycopg2, i forgot to mention this)

waxen finch
#

the error message says __init__.pyi so i think it is correctly using the typeshed, but there's no mention of an errors module in that type stub

#

i guess as a workaround you can explicitly import psycopg2.errors

oak pumice
#

Does anyone have experience using SQLServerAgent to run SSIS Packages that execute Python scripts?

#

Or any suggestions on a means to schedule my scripts to run regardless if my machine is online or not?

autumn torrent
#

liuljhhhh,h

torn sphinx
#

Hello

halcyon dew
#

hi

lucid crane
#

I'm getting a sqlite3.IntegrityError: UNIQUE constraint failed: Canada.line_name error even though I'm using update

UPDATE {country} SET line_name = ?, `0-7` = ?, `8-15` = ?, `16-25` = ?, `26-40` = ?, `>40` = ?, undelivered = ?, cache_time = ?
#

What could be the problem?

trail rune
#

Are there certain rules that can determine whether it makes sense to create a reference table based on the number of distinct values and the max length of those values?

brazen charm
lucid crane
#

Oh shit

#

mb

ashen heart
#

Hi! Is it possible to get the inserted IDs after bulk-inserting rows using insert described at https://docs.sqlalchemy.org/en/20/orm/queryguide/dml.html#orm-bulk-insert-statements?
Simplified example of my code:

session.execute(insert(model), inserted_rows))  # inserted_rows = list of dicts
session.flush()
# return inserted_row_ids

I previously did it row by row which allowed me to access the id after each add/flush:

session.add(inserted_row)  # inserted_row = SQLAlchemy object
session.flush()
return inserted_row.id
hexed estuary
#

not sure offhand how to make the ORM return only the id and not the entire object, though

ashen heart
#

Oh, thats nice. Probably returning(User.id)

ashen heart
paper flower
#
session.add_all(models)
await session.flush()
print([model.id for model in models])
#

It should bulk insert them, I don't see a good reason to use raw insert there pithink

paper flower
#

What version of sqlalchemy and what db are you using?

ashen heart
#

2.0.20

paper flower
#

And what db?

ashen heart
#

postgresql + psycopg3

paper flower
#

I think it should work that way already

#

enable echo flag on your engine and see what sql it produces to be extra sure

ashen heart
#

Looks good, thanks for the input

wooden topaz
#

Hi,
i have a sqlite query as below which is working fine

query = SELECT * FROM JIMBO
        WHERE emp_id IN %(emp_id)s

execute_query(query, 1)

but, when i try to query an emp_id which is not in DB,

sqlite3.OperationalError: near "%": syntax error
queen rose
#

do guys consider the database name to be senstive info? something you would put in the .env?

somber ember
#

tfw the reason your spark-submit was keep throwing "classNotFoundException" is because your class name had single capital letter instead of lower case along spark-submit line, and that capital letter looked like it was part of actual class name.

fading patrol
wise goblet
#

Domain address? may be. 🤔 but if i made proper private network setup, then no

trail rune
# harsh pulsar what is a reference table?

I think a reference table is a type of table that stores data that you reference in other tables. For example, if you have a warehouse, you might have a shipping status reference table. Instead of typing in "shipped", "processing", etc for each order, you would just reference the ID of the status in the Status (reference) table. If you have lots of rows, it might save space to store integers instead of a varchar with many characters.

coral wasp
#

Enum fields are also helpful: far more efficient than joining on a second table for the lookup

compact harness
#

😫

trail rune
coral wasp
trail rune
coral wasp
#

Then look at ENUMs, if you have a lot of constant values

trail rune
#

After reading a bit about them, it sounds like you can't add to the list of values because it is created at the time the table is created. That being said, for some of my values, it is unlikely that I will need to add to them.

coral wasp
#

ENUMs are just stored as ints, so modifying makes sense when you’re adding a new possible value

trail rune
#

In that case, perhaps I could get some use out of ENUMs.
Changing topic. After doing some reading online I've come to the conclusion that 1:1 relationships shouldn't exist and should be merged into a single table. Would you agree with this statement? If so, why? If not, why not?

coral wasp
#

, 1:1 is often 1:[0,1], and it’s cleaner to model as two separate tables, especially if second is sparse

#

Logically, one giant flat table can cause performance problems in row oriented databases: the cost of retrieving each row is a factor. In OLAP, which is generally columnar, this is less of a concern

#

Sometimes you might just need to join against one or the other, so there potentially a cost of a giant wide table

#

So, it’s a good thought and worth considering: usually wide is simpler and more performance, but not as a rule

trail rune
#

I do have a 1:[0,1]. However, I have many more that are actually 1:1. I was using this 1:1 relationship to organize attributes that were similar. Is this not necessary, or worse, inefficient?

coral wasp
#

In OLAP, we tend to go wide, in transaction, small is fast

#

It’s like object oriented design: start based on the classes- separate tables for separate things, not for attributes of a thing

sage night
#

Hi, I have a database with which I'm accessing with Sqlite3. I can commit data into it, which I can see if I open the .db file. And I can see the data with a select query. But if I close my python file and run it again, the data is still in the database file, but I cannot select it with a query, does anybody know what's going on?

frosty mirage
harsh pulsar
harsh pulsar
halcyon dew
#

just pass your db name as a part of your uri connection string

austere steeple
#

i have 1 file nyp how to open ít

gentle patio
#

hi

#

alter user system ACCOUNT UNLOCK
Error report -
ORA-01031: insufficient privileges
01031. 00000 - "insufficient privileges"
*Cause: An attempt was made to perform a database operation without
the necessary privileges.
*Action: Ask your database administrator or designated security
administrator to grant you the necessary privileges

fading patrol
fallen sage
#

Hi i installed mariaDB and mySQL workbench but when i try to connect to the server i get the following error Authentication plugin 'auth_gssapi_client' cannot be loaded: The specified module could not be found. I read online and it says that the module should come with the mariaDB installation so im not sure why i get this problem.

slate pumice
#

How can I fix this error with aiosqlite?


    @app_commands.commands.command(name="unrole-stript", description="nothing")
    @app_commands.checks.has_any_role(owner_role, co_owner_role) 
    async def unroe_stript(self, interaction: discord.Interaction, member: discord.Member):


        async with self.bot.db2.cursor() as cursor:
            await cursor.execute("SELECT role FROM user =?", (member.id))
            role = await cursor.fetchone()

        
        await member.add_roles(role)```
waxen finch
slate pumice
waxen finch
#

oh alright

slate pumice
#

how could I fix this formatting?

#
  
    @app_commands.commands.command(name="warn-check", description="checks if you have a warn or not")
    @app_commands.checks.has_any_role(mod_role)
    async def warn2(self, interaction: discord.Interaction, member: discord.Member):

        embed = discord.Embed(
            title="Warns",
            description=member.mention,
            color=discord.Color.red()
        )
        async with self.bot.db3.cursor() as cursor:
            await cursor.execute("SELECT reason FROM warn WHERE user =?", (member.id,))
            
            #mod = await cursor.execute("SELECT mod FROM warn")

            

            warn = await cursor.fetchall()
            for user in warn:
                embed.add_field(name="Warns", value=user, inline=False)




                await self.bot.db3.commit()

        await interaction.response.send_message(embed=embed)
                
slender atlas
#

fetchall returns a list of records. Each record consists of multiple columns, even though you may have selected one column, so it's a tuple (the warn table record which you named user in the for loop consists of only reason so the tuple consists of one item)

#

You just need to use user[0] when setting the value of the embed field

shut tiger
#

uggh no wait

#

you are selecting reason, not user

#

(reason, ) = row is the correct line there

warm scroll
#

hi, I am having a problem with mySQL database in Xampp, it does not start at all, have anyone know why?(apache and proftpd is working properly)
running linux

inland pawn
#

why is this query returning a big object instead of returning the query iteself

#

?

obtuse magnet
#

Any resource on query performance optimization?

coral wasp
coral wasp
obtuse magnet
#

yeah ive read abit on how a database plans and executes a query

coral wasp
#

Good (experts?) study their databases "explain plan" output, and in particular, the intricacies of certain paths: the subtleties of how, say, MYSQL vs Postgres handle certain optimizations do matter.

coral wasp
# obtuse magnet yeah ive read abit on how a database plans and executes a query

Yah, I think there's three skills involved: understanding the domain/data model - you can often do things differently (optimize upstream of the database, for instance, such as via caching/etc. Understanding indexes and the various indices available in your particular database. And, understanding the details of your databases implementation... at least at more than a superficial layer. And... for bonus: data types / layout optimization. Sometimes laying the data out differently can help with certain types of problems.

obtuse magnet
#

ait thanks, seems like i just need to read up and understand how a db runs the query and reading the analyze/explain

hollow oar
obtuse magnet
coral wasp
#
  • I might not really be helping with my answers because perhaps you just needed something like ry posted, rather than the advanced stuff
dire nebula
#

please can someone help me out here all my sqlmap scan

rotund bear
#

Hey I am having problem in writing SQL query for this
The public dataset of Ubereats available on BigQuery can be accessed using -
SELECT * FROM arboreal-vision-339901.take_home_v2.virtual_kitchen_ubereats_hours LIMIT 10;

It has several columns and in response column there is json data from where I have to extract endTime and startTime
This is the JSON structure I have attached.
Please if possible provide or help in sql query

fallen vault
#

Can i have 2-3 instances of an app connected to a single sqlite3 database?

wise goblet
wise goblet
fallen vault
wise goblet
fallen vault
#

Ive never used anything outside of sqlite3 so im fairly maliable lmao

#

Sucks that I have to switch to webapp though lmao.

wise goblet
#

best idiot proof decision

#

if u will just use Django boilerplating tools to maximum, u will be able to screw up far less

#

Flask newbies can screw up way worse

wise goblet
wise goblet
#

applications written in python without unit tests are usually screwed up beyond recoverability always

fallen vault
fallen vault
fallen vault
wise goblet
#

will it need Moderator/Administrator input to add data?

#

Django has an awesome feature called Django Admin

#

boilerplated the hell out of it, Admin web site creator

#

Pretty much stuff admin level access out of the box

fallen vault
#

I should’ve call it a management system. It’s more of a data collection system. I’ll have a login but it doesn’t need to be secure. Just enough of a deterrent to keep the customer from trying to log in.

#

Users will input two sets of data, production data or downtime data. And it will create reports based on the information in the database.

#

I wish I could show you but since I started working on it from a company laptop it’s their property and I have no ownership over it lol.

wise goblet
#

it will still benefit u greatly in comparison to Flask

#

because Django has FAR easier to use ORM interacting with db

#

everything is configured out of the box 😉 including easy to use mgirating system

fallen vault
#

I’ll look into it.

austere walrus
#

I am just starting to use databases. I currently am using mongo and creating collections of different data for stocks as far as company management team, job titles, names, company, ticker, etc. I have very little experience and after reading, I want to ask you all this. What database should I learn how to structure financial data with? I like the visualization aspect of the mongodb compass application. I want to stress I am just learning. I see this as needing to track relationships between people and companies, as well as people and their job titles as well as the companies that employ them (sometimes more than 1), companies latest: dcf, cash flow statement, revenue report, expense report, balance sheet, esg ratings, historical earnings, etc.(lots of numerical data with relation to the companies), sec fillings for each company, and lobbying information from each company (keyword search, numerical filter). Open to any suggestions on what to start with. again, I do well when I have a visual such as mongo compass. not stuck on mongo though.

obtuse magnet
#

can probs read data warehouse toolkit, got examples there

halcyon dew
#

if you are doing more analytical stuff, then use a analytical DB such as snowflake to do the analysis on instead. If your data is massive, then probably look into data warehouses instead

wise goblet
#

It is universal database serving well for all cases until u reached TBs of data

#

At some point more specialized dbs for specific usage causes could be wished though

subtle basin
#

Could anyone help me to understand how to compare value from PG array with 'like' statement? SQLAlchemy 2.0

        or_(
            Packaging.barcode.like(q),
            any_(Packaging.additional_barcodes).contains(q),
            any_(ClientContactsNotification.phones).contains(q),
        )
)```
So I need to check 2 arrays if any string contains symbols (param `q`)
fading patrol
austere walrus
#

Thanks! @halcyon dew @wise goblet @fading patrol

stone wolf
#

who good with excel? i got a question i can pay pm me right now

storm mauve
#

!rule 9

delicate fieldBOT
#

9. Do not offer or ask for paid work of any kind.

fallen vault
#

Can I connect to postgre over wifi?

wise goblet
#

And where u can make TCP connection, u can connect postgres

fallen vault
#

Sounds like I’ll have to get IT involved because that’s over my head.

unreal hemlock
unreal hemlock
#

poor guy got confused

fallen vault
# unreal hemlock you overcomplicated the explanation

I'm not confused, just working on a network I don't have that level of access to. Fairly certain the PC ill be working on is isolated on its own network as the location and thus network belongs to the customer and not my employer.

unreal hemlock
#

if that's a yes then you can access the postgre

fallen vault
#

Not sure, I’m not at the facility. Was just told I’ll be going there to set up a script I wrote. I know at my main location, I cannot access the wifi. Its password protected which means I would need IT to set my our devices (highly doubt they’ll just give me the password lol)

pseudo compass
#

Hello, I just wanted to get an idea as to how to go about it. I'm thinking of using python to retrieve JSON data from a website and load it into postgreSQL data base. However, I want to create make it such that the new data is updated daily from the website onto my database. How should I go about it?

hexed estuary
#

You could set up a cronjob (or a task scheduler job) to run your script every day.

austere walrus
# fading patrol As others have said, you want a relational (SQL) database. If you have any troub...

My friend showed me a google program that could take a json dump and had a user interface that showed all of the data as intended. If you know of something else that can take this json and store it while being able to visualize the individual arrays I'd like to know. I'm probably just missing something fundamental. Here is the Json, it is different financial analysis data exported from OpenBBTerminal. earnings, cashflow, earnings estimates, balance sheet, revenue, growth, discounted cashflow, etc. total of I think like 10 tables per dump but I'm trying to start with the whole S&P500. @halcyon dew @wise goblet, suggestions?

slim sleet
#

Hello everyone!

#

I have to do a project using machine learning but I would like to know if anyone has an interesting dataset to share with me?

austere walrus
#

would you like to have access to the fmp pro api key for ESG Rating analysis?? @slim sleet

austere walrus
austere walrus
fallen vault
#

How fast is a Sql select count(*) compared to a Python len function?

wise goblet
#

i can recommend "The manga guide to databases" for getting started

#

if u will not learn proper normalization, your SQL will suck ^_^

#

Sqlite3Browser is great to navigate sqlite

#

Beekeper Studio is good universal SQL interactive GUI in general ( i use it for postgres usually, sometimes for other engines like mariadb/mysql)

austere walrus
wise goblet
#

Exception can be given to trully Enum data (very limited amount of constant value choices that is not going to be ever changed probably). This is fine to denormalize for sure

halcyon dew
torn sphinx
#

if anyone could help me solve this

#

and just explain what a join is in general using that above questio

#

i know the answer but im just really confused

austere walrus
#

If I’m totally miss understanding bcnf lmk

austere walrus
# wise goblet until u got experienced with SQL, recommending always normalizing to 100% capaci...

That was just the format that the OpenBB Terminal(api management hub), exports the data as. I can either choose json, csv, or xlsx. But now I am starting to work directly with the financialmodelingprep api. I can export with the same name (structured as {data type}_{TICKER}.{json or csv}, so that after the data is updated and the database updates with time stamp, the old data gets overwritten, therefore only the database grows and my data for time series analysis is in the database. The export folder should end up being the same size and same structure of data everyday theoretically.

#

If I already have the automation to pull revenue forecasts, cash flow analysis, balance sheet, dcf cash score, earnings history, earnings estimates, and income statements… automatically format itself into a layered json dump… also maintaining each csv file that is composed in the dump… and it translates perfectly in google big query… how far off am I from getting this into a solid format? Does this, being over a period of time with a number of entries following the same data frame, normalize it in a horizontal sort of way?

obtuse magnet
austere walrus
#

(2, 4)🤞

austere walrus
# torn sphinx if anyone could help me solve this

ChatGPT:
import pytesseract
from PIL import Image

Load the image from the provided path

image_path = "/mnt/data/5C7E6F87-BB95-4915-BCD8-24E20CF98987.png"
image = Image.open(image_path)

Use pytesseract to extract text from the image

extracted_text = pytesseract.image_to_string(image)

extracted_text

The provided image contains two tables: LOAN and BORROWER.

The LOAN table has the following structure:
[
\begin{array}{|c|c|c|}
\hline
\text{LOAN_NO} & \text{B_LNAME} & \text{AMOUNT} \
\hline
\text{L-170} & \text{DELHI} & 3000 \
\text{L-230} & \text{KANPUR} & 4000 \
\hline
\end{array}
]

The BORROWER table has the following structure:
[
\begin{array}{|c|c|}
\hline
\text{CUST_NAME} & \text{LOAN_NO} \
\hline
\text{JOHN} & \text{L-171} \
\text{KRISH} & \text{L-230} \
\text{RAVYA} & \text{L-170} \
\hline
\end{array}
]

The question is: How many rows and columns will be there in the natural join of these two tables?

A natural join combines rows from two tables based on the equality of specified columns. Here, the common column between the two tables is LOAN_NO.

Let's find out how many rows and columns will be there in the natural join of these two tables.

Data for tables

loan_data = [
{"LOAN_NO": "L-170", "B_LNAME": "DELHI", "AMOUNT": 3000},
{"LOAN_NO": "L-230", "B_LNAME": "KANPUR", "AMOUNT": 4000}
]

borrower_data = [
{"CUST_NAME": "JOHN", "LOAN_NO": "L-171"},
{"CUST_NAME": "KRISH", "LOAN_NO": "L-230"},
{"CUST_NAME": "RAVYA", "LOAN_NO": "L-170"}
]

Perform natural join on LOAN_NO column

natural_join = []
for loan in loan_data:
for borrower in borrower_data:
if loan["LOAN_NO"] == borrower["LOAN_NO"]:
combined_data = {**loan, **borrower}
natural_join.append(combined_data)

Number of rows and columns in the result

num_rows = len(natural_join)
num_columns = len(natural_join[0]) if natural_join else 0

num_rows, num_columns

(2, 4)

The natural join of the two tables will have 2 rows and 4 columns.

austere walrus
# torn sphinx i know the answer but im just really confused

Join:
A join operation in databases combines rows from two or more tables based on a related column between them. It allows querying data from multiple tables as if they were a single table. There are several types of joins:

  1. Inner Join (or simply Join): Returns only the rows for which there's a match in both tables.
  2. Left (Outer) Join: Returns all rows from the left table, and the matched rows from the right table. If there's no match, the result will contain NULL.
  3. Right (Outer) Join: Returns all rows from the right table, and the matched rows from the left table. If there's no match, the result will contain NULL.
  4. Full (Outer) Join: Returns all rows when there's a match in either the left or the right table. Rows that do not have matching counterparts in the other table will contain NULL values.
  5. Self Join: Joins a table with itself. This can be useful when the data related to one row can be found in another row of the same table.
  6. Cross Join: Returns the Cartesian product of the two tables. It returns every combination of rows from the two tables.

Natural Join:
A Natural Join is a type of join that is based on column(s) with the same name in both tables. When performing a natural join, the database system:

  1. Looks for columns in both tables with the same name.
  2. Performs an inner join on these columns.
  3. Returns only one column for each pair of columns with the same name (i.e., it doesn't duplicate columns).

For instance, given two tables with a common column "ID", a natural join will combine rows with the same "ID" value from both tables. The resulting table will have only one "ID" column.

It's important to note that while natural joins can be convenient, they can also be risky if the schema of the tables changes or if columns with the same name have different meanings in different tables. For this reason, it's often recommended to explicitly specify the columns on which you're joining, as you would with an inner join.
-openai

#

I know this is a db chat but does anyone use obsidianMD? thinking about starting a new vault for database definitions and relationships.

wise goblet
delicate fieldBOT
#

10. Do not copy and paste answers from ChatGPT or similar AI tools.

obtuse magnet
#

dont post chatgpt answers lel

shy dragon
#

Basically i did this in Sqlalchemy
query = select(User, Attendance).join(Attendance, (User.id == Attendance.user_id) & (Attendance.date == date.date())) \ .where(User.designation_id != None, User.organization_id == organization_id)

suppose there is user and attendance model. user may or may not have attendance in a particular date.
now write a query that will fetch all the users of a organization along with the attendance object of a given date. if any user doesnt have attendance on given that date that user will still be listed but their attendance will be null.
how can i do it in django ?

torn sphinx
austere walrus
austere walrus
#

wont anymore. Also thought the OCR aspect was cool.

torn sphinx
torn sphinx
harsh pulsar
#

"you're only cheating yourself"

potent spire
#

what is thats mean? discord.ext.commands.errors.CommandInvokeError: Command raised an exception: TypeError: 'int' object is not subscriptable

#

thats in SQLite

harsh pulsar
verbal scaffold
#

can anyone help me here because no one replied to my post?

fringe sundial
#

guys have i made relationship correctly, it should delete exp if user gets deleted , i using sqlalchemy

class User(Base):
    __tablename__ = 'users'

    id: Mapped[int] = mapped_column(primary_key=True)
    guild_id: Mapped[int] = mapped_column()
    user_name: Mapped[str] = mapped_column(String(32))
    avatar_url: Mapped[str] = mapped_column(String(128))
    exp: Mapped["Exp"] = relationship("Exp", back_populates="user", uselist=False, cascade="all, delete-orphan", single_parent=True)

    def __init__(self, id: int, guild_id: int, user_name: str, avatar_url: str):
        self.id = id
        self.guild_id = guild_id
        self.user_name = user_name
        self.avatar_url = avatar_url

    def __repr__(self):
        return f"<User(id={self.id}, guild_id={self.guild_id}, user_name={self.user_name}, avatar_url={self.avatar_url})>"


class Exp(Base):
    __tablename__ = 'exp'

    id: Mapped[int] = mapped_column(ForeignKey("users.id"), primary_key=True)
    level: Mapped[int] = mapped_column()
    exp: Mapped[int] = mapped_column()
    user: Mapped["User"] = relationship("User", back_populates="exp", uselist=False, cascade="all, delete-orphan", single_parent=True)

    def __init__(self, id: int, level: int, exp: int):
        self.id = id
        self.level = level
        self.exp = exp

    def __repr__(self):
        return f"<Exp(id={self.id}, level={self.level}, exp={self.exp})>"
halcyon dew
#

also applies to the level col

paper flower
halcyon dew
#

Np

fringe sundial
paper flower
#

Well, you already have a relationship there

#

You just need to add ondelete to your foreignkey

fringe sundial
fringe sundial
#

guys how can i fix this error

qlalchemy.exc.ArgumentError: Error creating backref 'user' on relationship 'User.rank': property of that name exists on mapper 'Mapper[Rank(ranks)]'
class User(Base):
    __tablename__ = 'users'

    user_id: Mapped[int] = mapped_column(primary_key=True)
    guild_id: Mapped[int] = mapped_column()
    user_name: Mapped[str] = mapped_column(String(32))
    avatar_url: Mapped[str] = mapped_column(String(128))

    rank: Mapped["Rank"] = relationship("Rank", backref=backref("user", cascade="all, delete-orphan"))

    def __init__(self, id: int, guild_id: int, user_name: str, avatar_url: str):
        self.id = id
        self.guild_id = guild_id
        self.user_name = user_name
        self.avatar_url = avatar_url

    def __repr__(self):
        return f"<User(id={self.id}, guild_id={self.guild_id}, user_name={self.user_name}, avatar_url={self.avatar_url})>"
    

class Rank(Base):
    __tablename__ = 'ranks'
    user_id: Mapped[int] = mapped_column(ForeignKey('users.user_id'), primary_key=True)
    level: Mapped[int] = mapped_column()
    exp: Mapped[int] = mapped_column()

    user: Mapped["User"] = relationship("User", backref=backref("rank", cascade="all, delete-orphan"))

    def __init__(self, user_id: int, level: int, exp: int):
        self.user_id = user_id
        self.level = level
        self.exp = exp

    def __repr__(self) -> str:
        return f"<Rank(user_id={self.user_id}, level={self.level}, exp={self.exp})>"

i am using sqlalchemy

torn sphinx
#
CREATE TABLE store
(store_name varchar (15),sales int (8),st_manager varchar (20));
INSERT INTO store
(store_name,sales,st_manager)
VALUES (colombo, 1200, Perera);
INSERT INTO store
(store_name,sales,st_manager)
VALUES (Kandy, 8000, Silva);
SELECT *
FROM store;
SELECT store_name, sales
FROM store;
WHERE sales >=6000;
SELECT store_name, sales
FROM store;
WHERE store_name = 'Silva';
UPDATE store
SET sales = 10000
WHERE store_name = 'Silva';
ALTER TABLE store
ADD (telephone int (10));

Can someone tell me if this SQL code I have written is correct?

coral wasp
#

It’s not. You have where’s by themselves. Alter table is missing ‘add column’, etc.

torn sphinx
coral wasp
#

Semicolons divide sql into separate statements. Separate what you wrote by each semicolon and look at each statement

ashen heart
#

Can I somehow use ON CONFLICT DO NOTHING with SQLAlchemy's session.add_all()? I'm inserting multiple rows, some of the data will already exist and will trigger my UniqueConstraint set on the table. I want to ignore this data but insert the ones who are OK.

harsh pulsar
#

i remember looking into this kind of thing recently and i don't believe Session.add supports it

#

that is, you can use ORM objects, but you have to use Core syntax

harsh pulsar
#

maybe you can pass a list of ORM objects though? i never tried it

ashen heart
paper flower
warm jasper
#

Anyone that can help with a difficult sql issue i've been stuck on for days. I want to create a view that has 2 CTE's. One of the CTE's uses the
sql ROW_NUMBER() window function which partitions based on two rows, with an order by. I then join these two cte's and fetch the relevant columns which have a rank 1 from the window function. The problem is that when I create this view, and I want to query & filter it on a date column, it only filters the resulting data and not the CTE's, leading to very long query times because the window function ends up ranking large amounts of data even though I only want to get a specific date. Is it possible to adjust the view to force the filters to apply to the CTE's?

coral wasp
#

But, it sounds like you need to move the where clauses into the CTEs.

warm jasper
#

Yeah but I can't have the where clauses hardcoded, I want to be able to query the created view with different dates/filters. I may have found a solution where the window function seems much more optimised by only selecting the columns that are used within the function, and then fetching the remaining columns from a 3rd CTE. But I can't seem to get the JOIN's properly now that theres 3 CTE's as it's returning double the amount of rows as its supposed to.

coral wasp
#

Why would it matter if the where clause is in the cte vs outside? You can parameterize eithrr.

#

Oh. You’re creating a view

warm jasper
# coral wasp Can you share the sql or a minimal reproduction?
ID,
Date,
SUM(Col1),
...
MAX(Col2)
FROM MainTable
GROUP BY ID, Date
), CTE_Rank as (SELECT
ID,
Date,
Load_Date
ROW_FUNCTION () OVER (PARTITION BY ID, Date ORDER BY CASE WHEN ID IS NOT NULL THEN 0 ELSE 1 END, Load_Date DESC) as rank
FROM MainTable
), CTE_Remaining as (SELECT
ID,
Date,
RemainingCol1,
...
RemainingCol2
FROM MainTable)
SELECT 
CTE_Aggregate.ID,
CTE_Aggregate.Date,
CTE_Rmeaining.RemainingCol1,
...
FROM CTE_Aggregate
JOIN CTE_Rank ON (CTE_Aggregate.ID = CTE_Rank.ID)
AND CTE_Aggregate.Date = CTE_Rank.Date)
JOIN CTE_Remaining ON (CTE_Aggregate.ID = CTE_Remaining.ID)
AND CTE_Aggregate.Date = CTE_Remaining.ID``` This is the 3 CTE one i'm trying, with the original being 2 cte's with the CTE_Remaining cols all being in CTE_Rank. This new one seems to be much more efficient but I can't get the JOIN's proper
coral wasp
#

So the first problem is that the where filter can’t one pushed down to the CTE clause because row_number is operating on the whole table: sometimes the database can optimize/push a filter down, but not in this case

warm jasper
#

Yeah I think thats unfortunately the main issue here but I think the 3 CTE sql is the compromise but I'm not sure how to join them

coral wasp
#

This would be much easier as a parameterized query, because you could parameterized the date field directly the cte. It’d perform the best.

coral wasp
warm jasper
coral wasp
#

And why aren’t you filtering out rank is null in cte_rank? It’ll never join in the select with a null id

warm jasper
#

But incase one is not null and the other is null I want it to prioritize the non null one, which is what I believe the function does now

coral wasp
#

Then your join is wrong, because you’re joining on cte_rank.id, which doesn’t work for nulls

warm jasper
# coral wasp Then your join is wrong, because you’re joining on cte_rank.id, which doesn’t wo...

Sorry I just looked at the code again and saw that it's supposed to account for nulls, heres the actual join: FROM CTE_Aggregate JOIN CTE_Rank ON (CTE_Aggregate.ID = CTE_Rank.ID OR (CTE_Aggregated.ID IS NULL AND CTE_Rank.ID IS NUL)) AND CTE_Aggregate.Date = CTE_Rank.Date) JOIN CTE_Remaining ON (CTE_Aggregate.ID = CTE_Remaining.ID OR (CTE_Aggregated.ID IS NULL AND CTE_Remaining.ID IS NULL)) AND CTE_Aggregate.Date = CTE_Remaining.ID

coral wasp
#

What db are you using?

warm jasper
#

Netezza

coral wasp
#

Oh wow, that’s a name I haven’t heard in a while. I don’t know but some dbs have null safe equality’s

#

(Just simplifies the code)

warm jasper
#

Well the original 2 cte solution worked fine and returned the expected amount of rows, including the ones that had null IDs, using the (one of the) JOIN from above. I've just tried to expand it to account for the 3rd CTE

coral wasp
#

Are you only interested in rank 1?

warm jasper
#

Yeah, the entire query is ended with WHERE CTE_Rank.rank =1;

coral wasp
#

So why not just filter it to the first row in the cte with a limit clause?

#

And drop the row number stuff

warm jasper
coral wasp
#

Cte rank is just sorting and assigning a rank from the row number, and then you’re filtering to rank 1. That’s the same as sorting by same criteria and taking first row, right?

#

Sort and limit is (likely) much faster than window ranking

warm jasper
#

I thought that even if there was manual sort logic that did the same as the window function that the performance impact would end up being equivelant but that may very well be wrong

coral wasp
warm jasper
somber ember
#

i now understand why everyone hates mapping; especially when names of the columns change. holy cow

brazen charm
#

You mean like with MongoDB?

somber ember
#

not necessarily. We are migrating from one data warehouse to another, and in that process the team that's doing the migration spread all the data to somewhere else with different names for columns. Some were easy and were found within minutes, but some took hours and days to figure out. Going back and forth between old and new, trying to match the data, and scratching my head why name is similar but value is different, and where did original value go, then find the expected value in totally different name.

coral wasp
fallen vault
#

For Sqlite3, :memory: is what you need for a temporary database right?

storm mauve
#

yeah, if you use sqlite3.connect(":memory:") it will only be stored in memory instead of saving to / loading from a file

fallen vault
#

Alright, I think my issue is the way im using the class. Oh well, ill just make a module level script.

fallen vault
#

Nvm, its becuase my code automatically connect and disconnect to the database

pliant imp
#

very basic question here. Is this the proper way of defining the objects in a set through user input? Example; credentials = input({"enter your number: ", "enter your name: "})

fallen vault
pliant imp
#

True…I’m gonna give that a try. Thank you!

austere walrus
obtuse magnet
#

Hi,

At work, for ETL we basically have functions (stored procedures) defined in pgadmin which are run on a daily basis by a cron job.

However, there isnt any version control for the code in these functions. Does anyone know of any tool that can integrate pgadmin with some form of version control?

wise goblet
# obtuse magnet Hi, At work, for ETL we basically have functions (stored procedures) defined in...

Flyway, Liquibase are more language agnostic
But we are in python server, Python has really strong solutions already for that.

Django ORM has excellent migrating system
Also Alembic from SQLAlchemy is excellent migrating system

I would recommend Alembic or Django ORM. Not sure which one for sure to recommend. Both are good
thinking more towards Django ORM migrating system first perhaps, it has more features 🤔

obtuse magnet
#

Afaik that is more for app purposes

wise goblet
# obtuse magnet How does Django ORM work with typical data engineering pipelines tho?

https://docs.djangoproject.com/en/4.2/topics/migrations/
https://docs.djangoproject.com/en/4.2/ref/migration-operations/#runsql
just abuse its migrating system only

Define migrations
with runSQL (or with its other features if desired). Define migration to move forward and to revert it back just in case
and use python3 manage.py migrate

Thus u will have version controlled applyable and changable postgres procedures (as well as if desired all SQL structure changing code having version controlled pretty much 😄)

obtuse magnet
#

Will look into it when I get home, thanks

wise goblet
#

it has some features that may be advantageous over Django ORM

#

Alembic is more friendly towards being used as raw SQL migrating tool

wise goblet
#

and having ability to revert them easily

#

and it will be code reviewed in pull requests

#

and stored in git

#

so...

paper flower
obtuse magnet
paper flower
#

And also how do you change your tables? pithink

obtuse magnet
#

So we define these functions in pgadmin. There's a cron job that executes these functions on a daily basis

#

So far I haven't had to change a table (add new col etc)

wise goblet
#

leave SELECT/EXECUTE and etc for reading only

#

in order to prevent future GUI creep

obtuse magnet
#

Huh?

wise goblet
obtuse magnet
#

We are the only users

wise goblet
#

enforce changes via code only

obtuse magnet
#

GUI as in pgadmin?

wise goblet
obtuse magnet
#

Ah right okok

#

Will be a big overhaul hahahah

#

Especially when we're in the middle of building more tables for more analytics

potent spire
#

how to delete a row in mysql?

fallen vault
#

DELETE FROM table WHERE condition = ?

paper flower
#

The same as in any other SQL db

potent spire
#

so?

#

solved

stiff radish
#

This is definitely a place where performance matters and where you want to be in control of the exact queries

spark bear
#

Hello

#

Guys

#

I need help

#

Asap

#

I created two tables called staff and dept

#

I have added primary keys to both

#

I wanted to make joints but i forgot to add foreign key

#

Now i am using alter table function to add it

#

But its giving some weird error

#

Help pls?

#

@stiff radish sry for the ping mate. But can you help me 😭🙏🏻

#

Oh i got it

#

Thanks tho

obtuse magnet
stiff radish
#

The codebase I inherited used a SQLalchemy ORM based pipeline. It was super clean, 0 comments or docs but I was able to finish the work and it ran. Yay! It scaled super super poorly though. At a certain throughput it became a massive bottleneck.

#

Then it became a question of digging through all the SQLalchemy docs to see what could be done. I ended up removing and rewriting critical parts without the ORM. I think if you fully master SQLalchemy you could do it but it's not worth the effort, you'll constantly be thinking at the SQL level and finding what features do what you want

brazen charm
#

For us it tends to not be an issue, if we hit a bottleneck it tends to be an expense related bottleneck where just using SQL in general for the pipeline becomes too expensive to run

#

Regardless of how well optimized the query is

#

So then it gets of loaded to spark

obtuse magnet
#

just so i can see where i currently lie on that line

brazen charm
#

idk it depends on the data

#

but we process several TB of data as part of single pipelines

obtuse magnet
#

ah ok, we more on the multiple GB scale currently

#

altho we do face some long runtimes

foggy iron
#

someone know a better way to access a int from the .idxmin() (pandas library) instead using .idxmin().values[0]?

#

thats one example of return of a .idxmin()

hexed estuary
#

.iloc[0] might work

harsh pulsar
#

you might be running idxmin on the dataframe but you want to just run it on one column?

#

!e ```python
import pandas as pd

data = pd.DataFrame({'x': [1,2,3]}, index=list('abc'))

idxmin on all columns

print( data.idxmin() )

print()

idxmin on column 'x'

print( data['x'].idxmin() )

delicate fieldBOT
#

@harsh pulsar :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | x    a
002 | dtype: object
003 | 
004 | a
harsh pulsar
#

!e ```python
import pandas as pd

data = pd.DataFrame({'x': [1,2,3]}, index=list('abc'))

print( data['x'].idxmin() )
print( data.idxmin()['x'] )
print( data.idxmin().at['x'] )
print( data.idxmin().loc['x'] )

delicate fieldBOT
#

@harsh pulsar :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | a
002 | a
003 | a
004 | a
harsh pulsar
#

@foggy iron

stiff radish
pale ruin
#

What is an xlmx file??
Or did the guy I heard it from mean xlsx?

slender atlas
#

Might have meant xmlx

neon salmon
#

How can i find out whats my database name and collection name from mongoDB?

slender atlas
harsh pulsar
#

I don't get it

#

It's one thing to try to make a business supporting your open source software

#

But VC funding? I feel like the only possible reasonable exit scenario is acquisition

west sentinel
#

ServerSelectionTimeoutError: SSL handshake failed: ac-qb0gg1h-shard-00-00.2lrs6ai.mongodb.net:27017: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:992),SSL handshake failed: ac-qb0gg1h-shard-00-02.2lrs6ai.mongodb.net:27017: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:992),SSL handshake failed: ac-qb0gg1h-shard-00-01.2lrs6ai.mongodb.net:27017: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:992), Timeout: 30s, Topology Description: <TopologyDescription id: 650ef00ec787e7cf6195619f, topology_type: ReplicaSetNoPrimary, servers: [<ServerDescription ('ac-qb0gg1h-shard-00-00.2lrs6ai.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('SSL handshake failed: ac-qb0gg1h-shard-00-00.2lrs6ai.mongodb.net:27017: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:992)')>, <ServerDescription ('ac-qb0gg1h-shard-00-01.2lrs6ai.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('SSL handshake failed: ac-qb0gg1h-shard-00-01.2lrs6ai.mongodb.net:27017: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:992)')>, <ServerDescription ('ac-qb0gg1h-shard-00-02.2lrs6ai.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('SSL handshake failed: ac-qb0gg1h-shard-00-02.2lrs6ai.mongodb.net:27017: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:992)')>]>

i keep getting this error when i try to access my database in mongodb, ive gotten it before but fixed by adding tls=true in the uri but now its showing up again

cobalt jasper
#

anyone good with php?

#

in vscode

#

please help asap 🥲

torn sphinx
#

https://paste.pythondiscord.com/2I2Q bascially, i'm having a flask run server problem here and it's database for postgresql since this text is too long for me to paste and that i don't have nitro, i have this link here that y'all can look at it my problem

fading patrol
fading patrol
cobalt jasper
fading patrol
coral wasp
#

Duckdb is open source, owned by Duckdb labs and the funding has nothing to do it it. MotherDuck is a separate entity (with some overlap in people), which is the ventures backed entity building a cloud service around duckdb. They have a fairly active discord.

brazen charm
#

A lot of other systems I get

#

but DuckDB seems slightly odd in terms of how it generates revenue

coral wasp
brazen charm
#

Yeah i suppose

#

my only concern is it's a bit of a hard field for them to compete with VS the other current heavyweight champions

#

Since effectively motherduck tries to compete with Starburst, Athena, BigQuery, Sneller, maybe ClickHouse, etc... as Comerical offerings.
And has to compete with Trino, Presto, etc... for self-hosted services.

torn sphinx
#

did i do anythin wrong here idk what i did wrong

brazen charm
#

In particular, Starburst, BigQuery and Sneller are just giants with really solid products, but idk, I haven't seen MotherDuck's pricing/I cant find it?

coral wasp
brazen charm
#

yeah, like DuckDB has a lot of use cases

#

and is a super powerful tool

#

but IDK how well that ends up happening in the world of datalakes

coral wasp
coral wasp
harsh pulsar
harsh pulsar
coral wasp
harsh pulsar
#

Fortunately it's open source so if they fuck it up the community can fork

#

Im not sure if apache drill and friends are comparable

coral wasp
harsh pulsar
#

Right

coral wasp
#

So no fork problem.

brazen charm
#

since datalakes now days are just execution engines ontop of S3/Blobstorage

#

on some file format, normally parquet

harsh pulsar
#

right but they're optimized for different things as far as i understand

#

maybe you could build a data warehouse using duckdb internally but that'd be a whole project

brazen charm
#

They seem to be targetting the same things no? I don't think it's anything near capable of competing with Spark processing, i'd argue they'd have a better time trying to compete with Athena

coral wasp
harsh pulsar
#

the data lake was just a bunch of delta lake tables + docs

coral wasp
#

Yah, the vendors always want to make something easy hard.

brazen charm
#

I can't work out how to use their cloud product atm, but it's in beta so meh
but from what I can see in their docs its effectively:

  • Attach Motherduck to S3 as a primary datalake (Other sources available but point is the same)
  • In your notebook you can create queries and do processing of the files like parquet files on the lake
coral wasp
#

I guess reminiscent of how I use jupyter remote kernels.

brazen charm
#

it's your fairly standard datalake saas offering from playing around with it

wise goblet
#

Mother Duck

#

(never used it, but excited in advance because of their theme 😁 )

potent bolt
#

Anybody has experience with relational databases in FastAPI specifically using the sqlalchemy ORM?

fading patrol
potent bolt
#

Alright. So I just want to know if the FastAPI docs is more than enough for my needs? Should I just follow the example?

fading patrol
potent bolt
#

Alright. Thanks.

harsh pulsar
#

SQLA isn't that hard to use, but I suggest working on one element at a time because both frameworks are fairly rich in features and it will be a lot more effective with your time to learn one at a time instead of trying to learn both halfway

#

the one tip i have is to use fastapi Depends to obtain a database session instance from your sessionmaker. that won't make a lot of sense now, but file it away and come back to it once you start working on the SQLA stuff

#

fastapi also makes nontrivial use of pydantic and is a relatively thin layer over starlette's asgi implementation, so you're already looking at two additional frameworks before you get to any database interaction

paper flower
# potent bolt Specifically: https://fastapi.tiangolo.com/tutorial/sql-databases/

I think fastapi's tutorial is not that good - it's outdated, uses legacy SQLA model definition API, uses synchronous API, specifies unnecessary aguments (e.g. Column(Integer, primary_key=True, index=True), PKeys are already indexed), uses outdated pydantic API, uses legacy SQLA query API, uses try-finally instead of with
TLDR: Don't follow that tutorial, just read sqlalchemy docs, though a lot of people find it quite hard to read/understand

potent bolt
#

My problem is the plenty boilerplate code.

paper flower
potent bolt
#

I've used SQLA with Flask, and the flask wrapper makes it relatively easier

potent bolt
paper flower
#

Nobody can read minds

potent bolt
#

Ok please hold

paper flower
potent bolt
#

I see what you mean. But would it be cool if I just remove the indexargument?

#

Also, what do you mean by synchronous API?

#

I'm using the sqla package directly, not some wrapper

paper flower
potent bolt
#

I kinda understand this because it's really similar to when I was working with Flask

#

I mean the Column stuff

paper flower
#

That tutorial is just outdated

potent bolt
#

But it seems specific to FastAPI. Would it be easy for me to integrate with FastAPI by reading the official docs?

paper flower
#

fastapi does not require anything specific to do

#

If you can use sqlalchemy you can use it anywhere

#

As @harsh pulsar it would be a good idea to move session creation to a dependency, so you don't have to do that by hand everywhere, that's all

#

But it's not required

potent bolt
#

That's what they do

paper flower
#

That could be shortened to just

async def get_session() -> AsyncIterator[AsyncSession]:
    async with async_session_factory() as session:
        yield session
potent bolt
#

With FastAPI, I don't need to install any package to use the async/await right? I mean, I've done it before but I've only worked with FastAPI once so I'm asking

paper flower
#

To use async/await where?

#

With fastapi? No.
Sqlalchemy would require a specific DB driver depending on what DB you use

potent bolt
#

Yeah, with fast api

#

But I don't see the dependency that enables that feature

#

Something like asyncio

#

I can only see anyio

#

Is that it?

#

I'm using postgresql db

#

What do you mean specific driver? Isn't sqla the driver?

paper flower
#

You need psycopg3 or asyncpg

potent bolt
#

hmmm... I see what you mean. THat's simply as a dependency I guess, not that I'm using it directly

paper flower
#

Yep

potent bolt
#

I appreciate the help guys! I'll try my best :)

timid loom
#

why doesnt it replace?? i get as output medium, hard and soft

import fastf1
import pandas as pd
from fastf1 import plotting
import numpy as np
import csv
from datetime import datetime


track = "Catalunya"


race = fastf1.get_session(2023, track, "R")
race.load()
point_finishers = race.drivers[:10]
print(point_finishers)
driver_laps = race.laps.pick_quicklaps()
driver_laps_view = driver_laps[["Compound", "TyreLife", "LapTime"]]
driver_laps_view["LapTimeSeconds"] = driver_laps_view["LapTime"].dt.total_seconds()
 
del driver_laps_view['LapTime']
driver_laps_view["Compound"].replace("SOFT", 0)
driver_laps_view["Compound"].replace("MEDIUM", 1)
driver_laps_view["Compound"].replace("HARD", 0)
driver_laps_view.to_csv("D:/Dev/Python/F1Strategy/tyre-model/tyredeg.csv", header="Compound, TyreLife, LapTime", index=False)
potent bolt
#

The closest I've seen is async_session or something like that

ancient kraken
#

I think thats why doctor say just follow sqla's docs rather than fastapi's docs.

potent bolt
#

I'm following sqla docs and that's why I asked that

#

The sqla docs is really hard to read :)

#

@paper flower I read somewhere that I don't need to specify psycopg2 when creating the engine

#

As sqla uses psycopg2-binary under the hood

potent bolt
#

Yoooo

#

For a datetime field, do I need a special data type (say DateTime from sqlalchemy.types)?

#

Or I can use Mapped[str]??

#

I need answers 😢

#

This is an interview task I'm building that would be submitted

halcyon dew
#

just use the datetime field. that's why that exists

potent bolt
#

So would it be Mapped[DateTime]???

halcyon dew
#

no clue what Mapped does

potent bolt
#

That's the new model definition syntax for version 2.0 I guess

#

Also, have any idea how I can provide validation for email field?

halcyon dew
#

so Mapped is basically REFERENCE table (id)?

potent bolt
#

REFERENCE is supposedly a FK relationship right? No, it's not

#

There's the relationship() function for that

#

Mapped is just used for the data type mapping

halcyon dew
#

i dont work with sqlalchemy so can't help you that part. only work with SQL

potent bolt
#

Alright, that's fine. Thanks for the help

harsh pulsar
#
engine = create_engine(...)
async_session_factory = async_sessionmaker(engine, ...)

async def get_db() -> AsyncSession:
    async with async_session_factory() as async_session:
        yield async_session

app = fastapi.FastAPI()

@app.get("'/users"
async def get_users(
    db: Annotated[AsyncSession, fastapi.Depends(get_db)]
):
    ...
#

or something like that anyway

#

that's just off the top of my head, you'll need to confirm with the docs

paper flower
paper flower
#

Not for async

hearty siren
#

consider I have a 2 tables in db user (with columns primary_key and id) and user_collections (with columns primary_key, some_value and user_id) why should I use foreign key in user_collection(user_id column) when i can just use the user id real value (because that value is also uniue)

#

does this make sense can I use either option and it's fine or 1 is better then another

paper flower
hearty siren
potent bolt
potent bolt
paper flower
paper flower
#

Sqlalchemy has a bunch of standard types already "mapped" to sql types

#

ints, floats, datetime, date, etc

potent bolt
#

I see... thank you

#

I will regularly ask for help here. Is it even okay I DM you directly?

paper flower
#

You could but I'd prefer if you don't

potent bolt
#

Alright. I'll stick with here then

potent bolt
#

Please is this valid code?
date_created: Mapped[datetime] = mapped_column(default=datetime.now)

#

I mean the default argument stuff

#

@paper flower

#

Also, how can I implement the equivalent of auto_now_add and auto_now in Django for datetime fields?

#

I guess onupdate argument solves this?

paper flower
#

And onupdate is called when you update the model

potent bolt
#

Nice... thanks

#

Please how can I add email validation at the model definition level?

#

I'm really sorry for the many questions :)

paper flower
#

Also you'd be surprised what emails are valid

#

We're not quite sure exactly when email was invented. Sometime around 1971. We do know exactly when spam was invented: May 3rd, 1978, when Gary Thuerk emailed 400 people an advertisement for DEC computers. It made a lot of people very angry... but it also sold a few computers, and so junk email was born.

Fast forward half a century, and the rel...

▶ Play video
potent bolt
#

Okay okay

#

What about FK relationships. So I have two tables, User and Todo. A user can definitely have multiple todos. Should I use a 1:M or M:1 r/ship?

#

My guess is that 1:M solves this even though it sounds counter-intuitive

#

??

paper flower
#

1:M is the way to go there

potent bolt
#

Yeah, I'm actually looking at that part of the docs

#

Well, now I think about it, it doesn't sound counter-intuitive

#

One User to Many Todos

#

The docs uses List[] data type though as opposed to list[]

#

Does that mean I'd have to import typings module?

#

I thought that's not needed for pydantic v2?

paper flower
#

They're the same, and it's not related to pydantic at all

#

If your python version supports list[] - use it over typing.List

potent bolt
#

Okay. Thanks. My python version does - it's 3.10+

#

Even in FastAPI docs, I noticed they didn't have to import typing

#

I have a problem

#

In the docs, they just used the child model in string, like "Todos"

#

but mine doesn't really work.you can see the editor highlight

#

I guess this is because the model is defined in another file

#

Should I just import the model, and add it as typing argument?

#

Without the strings?

paper flower
#

You can't do that because of circular imports

#

Also model name preferrably should be Todo

#

You could import it in TYPE_CHECKING block

potent bolt
#

Ah, yes Todo actually

potent bolt
paper flower
potent bolt
#
from typing import TYPE_CHECKING
from datetime import datetime
from sqlalchemy import String
from sqlalchemy.orm import Mapped, mapped_column, relationship
from .base import Base

if TYPE_CHECKING:
    from .todos import Todo


class User(Base):
    __tablename__ = "users"

    id = Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column(unique=True)
    password: Mapped[str] = mapped_column(String(10))
    created: Mapped[datetime] = mapped_column(default=datetime.now)
    modified: Mapped[datetime] = mapped_column(
        default=datetime.now, onupdate=datetime.now
    )
    children: Mapped[list["Todo"]] = relationship()

    def __str__(self) -> str:
        return f"User: {self.email}"
#

Is this correct?

paper flower
#

replace = with : in id = Mapped...

potent bolt
#

Yes, thanks for the oversight

#

How do I handle creating database table and migrations

#

From the docs, I'm seeing something like

        await conn.run_sync(meta.create_all)
#

Is that what that line does?

#

Whereas in FastAPI docs, I think they recommend alembic?

paper flower
potent bolt
#

Too many moving parts. Phew

burnt abyss
#

How can I improve mysql data insertion speed? I'm importing data from an xml file, but it's taking about 2 seconds to insert the data, I wanted to improve the speed at which it inserts the data...

import pymysql

db_config = {
    ...
}
table_name = 'intimacoes'
xml_file_path = 'my_file.xml'

conn = pymysql.connect(**db_config)
cursor = conn.cursor()

load_xml_query = f"""
LOAD XML LOCAL INFILE '{xml_file_path}'
INTO TABLE {table_name}
ROWS IDENTIFIED BY '<intimacoes>';
"""
cursor.execute(load_xml_query)

conn.commit()
cursor.close()
conn.close()

coral wasp
#

Also; local infile is going to be slower than infile (serverside), so if you can move the file to the server, that’s also help.

burnt abyss
#

1.152.175 lines

burnt abyss
coral wasp
#

Does your table have any indices or constraints? If so, you could disable them before load.

#

Second, you could transform the xml to csv before loading. I would compare xml vs csv load times for your dataset before writing any code here tho

potent bolt
#

@paper flower From the alembic docs, I can see that they modified the .ini file to add the database url. However, this is risky as the creds are exposed. I intend to read from an .env file which I'm using. Please how do I go about this?

#

It seems I have to edit the env.py file but I'm not very sure about this

#

Please, again, forgive my plenty questions

#

I have a deadline in less than 24 hours and this is just the backend of the task. I still have frontend and devops left

#

I can't afford to do things myself

#

I edited the run_migrations_online() function in the env.py:

def run_migrations_online() -> None:
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.

    """
    connectable = create_engine(settings.SQLALCHEMY_DATABASE_URL)

    with connectable.connect() as connection:
        context.configure(connection=connection, target_metadata=target_metadata)

        with context.begin_transaction():
            context.run_migrations()

Does this make any sense?

#

I substituted engine_from_config function with create_engine

potent bolt
potent bolt
#

Using the same config above, I get this error when I run migrations (specifically alembic upgrade head):

sqlalchemy.exc.MissingGreenlet: greenlet_spawn has not been called; can't call await_only() here. Was IO attempted in an unexpected place? (Background on this error at: https://sqlalche.me/e/20/xd2s)
#

If I change create_engine to create_async_engine, I get the error:

with engine.connect() as connection:
AttributeError: __enter__
fading patrol
potent bolt
#

Wouldn't that take long to get a response?

#

I just need a quick response to my problem

#

And I promise I might not disturb here again

harsh pulsar
harsh pulsar
hearty siren
#

Not sure if it supports xml though

potent bolt
#

I just needed to run alembic init -t async alembic

#

the -t flag was missing previously

paper flower
potent bolt
#

What libs exists for tokens?

#

auth tokens?

paper flower
#

What tokens?