#databases

1 messages · Page 35 of 1

tender kernel
#

so in one second, you can have like a dozen trades that occur that just happen to be at the same price and volume and all within the same exact second

obtuse halo
#

Then you purge the data for the given range in the file at the time you do the bulk insert

paper flower
#

How would that help with duplicates? pithink

#

Jobs are ephemeral, no?

#

you can just use a timestamp instead in that case

obtuse halo
#

To know if that batch has been processed already, that specific file

#

Or if it's from the realtime strea

tender kernel
#

yes, but there is no truly unique identifier for each trade

#

so...

#

I thought maybe i could create them and identify data dump entries from live trade data entries to make sure that identical trades, have a unique ID so if they are from different sources they can be identified to have already been recorded

#

or not recorded...

#

soo, i mean that if I put like....'dump' or 'live' at the beginning of each trade id that gets created for each entry...duplicates would be identifiable. so if the identical trade entries are already present from the live trade data, dump doesn't need to record to db. if dump trade entries are already present and identical. they are also, obviously duplicates.

#

Add a unique identifier for data sources: Differentiate between live data and data dumps by adding a column (e.g., "source") in your database to identify whether the data is from a live feed or bulk dump.

Handle timestamp precision: If possible, use higher precision timestamps (milliseconds or microseconds) to distinguish between trades occurring in the same second.

Create a composite unique key: Use a combination of timestamp, price, volume, and source to ensure identical trades are not incorrectly marked as duplicates.

Use PostgreSQL's ON CONFLICT: This allows you to skip inserting duplicate rows based on defined conditions.

Batch processing with checksums: Use a rolling checksum or hash of each trade to identify duplicates more efficiently.

This approach helps manage legitimate, identical trades while avoiding unintentional duplicates when merging data from multiple sources.```
#

^ thats what chatgpt had to say about the conversation we are having

#

I think I'm right

obtuse halo
#

That's what I wrote 🙂

tender kernel
#

So I'm right?

tender kernel
obtuse halo
#

Sorry was super distracted by a game

tender kernel
#

no worries

obtuse halo
#

You have two sources:

  1. Bulk data inserts
  2. Live, api-sourced (or whatever)
tender kernel
#

yes

obtuse halo
#

You keep track of the job/task/whatever that inserted each row so you can tell if it came from a live feed or not

tender kernel
#

yes

obtuse halo
#

When you process a bulk report, since you know it's for a specific time range and is complete for that time range, you delete everything in that range in the same transaction that inserts the new bulk report

tender kernel
obtuse halo
#

Depends on the guarantees of the bulk report

tender kernel
#

its the most granular data possible. they dont register deeper than one second

#

i may not be understanding that correctly though. could you elaborate please?

#

what guarantees do you mean

#

as in, are those all the trade that occurred? yes.

#

more precisely, even if they make a mistake, it's reflected in their charts and I need to make the same mistake so my charts reflect the same candle patterns

torn sphinx
obtuse halo
tender kernel
obtuse halo
#

Like I said above you destroy the data you got from another source for the overlapping period when you get the bulk report

#

You only need incremental data for a period not covered by a bulk report

tender kernel
#

Oh my God I think I see what you're saying now

#

Bulk data kills live data

#

I'm so dumb

obtuse halo
#

Yes, the bulk kraken reports are complete for that time period

tender kernel
#

I'm so dumb and you're so smart thank you so much.

#

Oh my God that's so f****** simple

obtuse halo
#

Nah I'm just explaining it very poorly getting my ass kicked

tender kernel
#

I'm sitting here trying to figure out how to fit these chunks together down to the millisecond and ultimately it doesn't even matter

#

Because my program will just rewrite the candlesticks anyway

#

Thank you @obtuse halo 👍

#

Good luck in your game

cold timber
#

I have script for web scraping in which i used function for scrolling web page but its not working is there anyone who can help me

cold timber
coral wasp
#

Open a help thread please, I don't understand your question

cold timber
#

Ok

floral folio
#

SQL lite my beloved fr

restive current
#

I'm finding way to convert sql alchemy row obj -> dict

#

Isn't there a function like this in sqlalchemy?

cedar tiger
restive current
#

like Instance of Model Class

#

class Model(Base):
tablename = 'models'
....

obj = Model()
obj to dict

paper flower
#

because depending on what you're using it for there could be better ways of doing that

restive current
#

to create another class instance easily.
obj = Model()
model_dict = convert_to_dict(obj)
Another(**model_dict)

paper flower
#

Is there really a guarantee that their fields will match? Is Another a sqlalchemy model too?

restive current
#

no

paper flower
#

I asked two questions 😅

restive current
#

The translator app is slow to turn on.

It's a construct that creates a dataclass object that has a one-to-one correspondence with the model. The dataclass is easy to convert, but sqlalchemy doesn't provide a way to convert the model.

paper flower
#

I thought that maybe you could use pydantic models instead but if you want dataclasses it would be easier to just convert your model to dict 🤔

#
def to_dict(model: DeclarativeBase) -> dict[str, object]:
    mapper = inspect(model)
    return {
        attr.key: getattr(model, attr.key)
        for attr in mapper.attrs
    }
restive current
paper flower
#

I don't remember if column.name could be different from attr.key

#

but

class Test(Base):
    a: Mapped[int] = mapped_column("id")
restive current
urban stream
#

hey yall I'm looking for some advice. I am working on building an app that is more or less and inventory system for multiple users to track their own personal inventory. My only real experiance with databases is SQL, I was wondering If SQL would work for this. For a little more context, users will be able to input item, what the item type is(i,e. tool, electronic, book, toy), what comes packaged with said item, and the condition of the items. My idea is that as people build out their inventories and categorize items it will offer suggested categories to other people inputting the same item to their inventory. I hope this makes sense, any help would be appreciated thank you.

brave bluff
#

I'm starting to learn SQLAlchemy, having worked a bunch with Django. First, am I alone in really disliking the official docs? I feel like it's really hard to figure out... More importantly, how do people handle helper functions in SQLAlchemy? I'm currently thinking about something like how Django does managers. For example, if I have a column with a timestamp for when something was deleted, when I query for existing values, I'll want to filter by where that field is null. Easy enough to do with Django managers, but not sure if SQLAlchemy goes more for the "write it into every query" style, or there's another mechanic I should know.

austere badge
#

So my current way of storing data is using json. Problem is I went so far into coding my project to store it all in json that I would need to rewrite the entire format of my code if I wanted to get a real database, mysql. Heres how Im thinking about this... if Im running a website why not just store it locally on the same machine as json file so its more simplistc? Why should I get mysql? If I do run mysql should I create another script to just translate from json to mysql format like a middleware thing or should I simply just rewrite the entire code to directly just store into mysql. I like the way json is formatted its easier to read in my opinion and look at the data vs mysql. But I think its just because ive been doing json for so long thats the reason why. Any opinions please.

#

I was debating on having it both. Store as json locally so I can look into the data much easier and store it in mysql as the real deal

prime abyss
#

@modern ice

#

' or '1'='1

brave bluff
# austere badge So my current way of storing data is using json. Problem is I went so far into c...

A few things there.

  • JSON is slow. As the file gets larger, it'll get slower. It also runs into issues if you have two people using the app at the same time.
  • If your current code already uses JSON and you just want to get to using a DB as fast as possible, having a translator is not a bad idea, though in general, you'd want to have code that does reading/writing separated anyway, for exactly this reason.
  • I totally get that JSON is easy to read, but I'd argue SQL databases are just as easy, once you get over the inital "new" hump. There are plenty of GUIs that help out too.
  • If you want to stick to JSON and use a database, NoSQL is an option, but it's not a replacement for SQL. Each has its own use cases.
    tl;dr; Static files are fine for small things that won't change/won't change much. They don't scale though, and you need a database for that. If you're interested in programming at any scale, I strongly recommend pushing past the discomfort and learning DBs; you'll wonder why you didn't before.
austere badge
# brave bluff A few things there. - JSON is slow. As the file gets larger, it'll get slower. I...

Gotcha. So I do plan to scale in the future the database by alot. So I guess my final question is. Since I do plan on rewriting the entire code to just directly store into mysql as my main database for the website. Since im new to mysql and how the database is structured. I do have a bunch of nested lists/dicts. So when trying to layout how I want to save the data in mysql should I just have a bunch of tables for those nested lists?

#

Or have it all written out in 1 table

coral wasp
brave bluff
# austere badge Gotcha. So I do plan to scale in the future the database by alot. So I guess my ...

In databases, we refer to what you're talking about as "normalization". N1 basically means you just have a giant table, each column representing a field, up to N6, which basically means you have a separate table for every value (it's far more nuanced than that, but a generalization). You generally want to aim for N3/N4, which means you group data by idea (into a table), and relate other tables via foreign keys. In JSON terms, if something is nested, the vast majority of the time, that means it should be a separate table, because each nested item represents it's own group of data.
For example, a json object representing a student: You'd probably have personal information about them at the top tier, such as name, phone number, address. While address is it's own data group, it's more closely related so the student's personal info, so you group all that into a table. Then you may have a list of classes. Now the classes aren't really linked to their personal info, and different students can have the same class. So you'd make classes their own table. Then you'd add a third table that links students with classes. I know that sounds like a lot, but I just mean it to help you think about how to break up groups of data.
Also, don't fret too much about over/under normalizing. You only learn good database design by doing it, IMO.

#

I'd just start by thinking of each nested structure as it's own table. If something is nested in a nested structure, it's likely it's own table, etc.

#

From there, I'd do some reading on normalizing, and look up how other software breaks up it's data.

austere badge
#

Sounds good!

brave bluff
#

One BIG advantage to the database is you define types. It helps you make sure your data is consistent

#

In JSON, each object can vary, each value can be different.

#

In SQL, if a column is a number, only numbers can go there, so you know when you retrieve, you'll get a number. A table has a fixed definition, so you know when you get a row, you know what columns you can expect.

austere badge
#

Is it possible I can set up the database locally on my computer before trying to run it on the server? As like a test.

brave bluff
#

Yah, 100%

austere badge
#

Perfect

brave bluff
#

MySQL has installers

#

And if you want to do a small side track, containers are perfect for testing like this. You can learn docker basics to do what you're talking about pretty quickly. But that is another thing to learn, so don't stress over it.

austere badge
#

Will look into it in the future.

#

Thankyou so much man!

brave bluff
#

Sounds good.

#

If you're on mac, homebrew has mysql

#

Just make sure you install the client and server

#

They're two different things: the server holds the data, the client is how you connect to the server

austere badge
#

Did not know. I thought I just needed the client

brave bluff
#

Also, if you're in Python, I recommend Postgres over MySQL

brave bluff
#

A lot of Python tools have more native support for Postgres, wherehas MySQL requires additional packages

austere badge
#

Oh. But is it the same concept as we talked about?

brave bluff
#

Mind you, if you're just doing your own stuff, it doesn't matter

#

PostgreSQL and MySQL are just two different flavors of SQL

#

If you learn SQL basics, it applies to any SQL database

#

There are just specifics in each you end up tweaking

austere badge
#

I know basics of SQL but not crazy stuff

brave bluff
#

If you plan on staying in Python, I recommend go with Postgres. If you're just messing around, either one works. You won't go wrong.

austere badge
#

Alright since my projects is mainly Python I will go with Postgres

brave bluff
austere badge
#

lol

brave bluff
#

So yah, a LOT of tutorials on how to use Postgres with Python.

#

They'll teach you how to connect your python app to your postgres server. I suspect you can probably find tutorials that will tell you how to set up the postgres server too, but I can't guarentee that.

austere badge
#

Ill figure it out.

#

But thanks again

brave bluff
#

Best of luck!

brave bluff
#

The more I try to learn SQLAlchemy, the more I think I should just write the queries myself and use dataclasses data structuring...

unborn cradle
harsh pulsar
#

Check out Sqlmodel if you like Pydantic

#

!pypi sqlmodel

delicate fieldBOT
#

SQLModel, SQL databases in Python, designed for simplicity, compatibility, and robustness.

Released on <t:1725097404:D>.

brave bluff
#

I've been following sqlmodel, I just worry it's really early. I am using fast api, so it's convenient.

#

And yah, there's a lot of reading/writing, which is why I want an orm...

#

I want to learn Sqlalchemy, but oh man...

#

And unless I'm wrong, won't I be using sqlalchemy for the actual querying with sqlmodel? Doesn't it just wrap the orm part of sqlalchemy?

barren abyss
#

in-memory databases maintain a disk data structure such as B-tree and write the log entries periodically to the B-tree. After applying log entries to the B-tree, we purge them from the log. This is called checkpointing. Now, when the system restarts, we first load the checkpoint and write more recent log entries that are not included in the checkpoint to the memory.
so,in-memory database simply append the current data onto the checkpoint?

#

in a situation where checkpoint alr loaded into the disk

cedar tiger
paper flower
brave bluff
woeful heath
#

I'm trying to follow this tutorial https://learndjango.com/tutorials/django-docker-and-postgresql-tutorial to set up docker and use postgres, but when I do docker-compose up I get the error

Couldn't import Django. Are you sure it's installed and available on your PYTHONPATH environment variable? Did you forget to activate a virtual environment?

I thought the entire point was to work in docker rather than a local virtual environment. Did this tutorial miss something?

brave bluff
woeful heath
pastel vale
#

are there any good books or website on learning more advanced sql techniques?

cedar tiger
restive current
#
import abc
from sqlalchemy.ext.declarative import declarative_base, DeclarativeMeta

class DeclarativeABCMeta(DeclarativeMeta, abc.ABCMeta):
    pass

class Base(declarative_base(metaclass=DeclarativeABCMeta)):
    __abstract__ = True
    @abc.abstractmethod
    def implement_me(self):
        """Canhaz override?"""

Will this code work in SQLAlchemy 2.0, or has there been a better way to do it since the version change?

jaunty linden
#

how to get good at the art of database normalization?

#

i need a resource where i can practice lots of database normalization

paper flower
#

Why do you want an abstract method there?

#

(just curious)

#

If you're using mypy you can just do

class Base(DeclarativeMeta):

    @abc.abstractmethod
    def implement_me(self) -> None:
        """Canhaz override?"""


class Test(Base):
    pass

But there's a downside - mypy would only throw an error if you try to instantiate Test, you can also mark it with @typing.final, forcing mypy to type check it when you define it

paper flower
#

There's really no benefit in using ABC baseclass/metaclass here, IMO

#

Unless you're instantiating your classes dynamically (which mypy won't catch)

wild nymph
#

Hi, I have some questions regarding the optimistic or pessimistic approach to database locking.
I found this good answer on stackoverflow:
https://stackoverflow.com/questions/129329/optimistic-vs-pessimistic-locking/129397#129397

But I have a couple of additional questions in case someone can help me:

  1. The optimistic approach means that records are not locked, while the pessimistic approach does lock records. Do these locks only imply not being able to write to them, or do they also imply not being able to read them?
  2. If the locks only imply not being able to write to them, what advantage does the optimistic approach have? If two operations are done almost at the same time, one of the two will end up being discarded due to the version change caused by the first.
coral wasp
fervent charm
#

I 💗 duckdb

#

handles a table with 30k columns no problem

coral wasp
#

And I thought my tables were wide. Wow.

half thorn
echo raven
#

@ocean grail

ocean grail
#

Oh nvm

#

Mb

echo raven
#

long press

ocean grail
#

Somehow I still didn't hit a 100

echo raven
#

how many u got

ocean grail
#

99

echo raven
#

on my msg

ocean grail
#

Nope

#

Not working

#

Welp it is what it is

#

Anyways I gotta hit the bed

echo raven
ocean grail
#

It still didn't hit a 100

#

No biggie

fervent charm
harsh pulsar
coral wasp
vague fable
#

Just a general question, I've been practicing my SQL (not my strongest point), I inserted into a table, 10,000 records, and did a simple select * and it took around 10-12 seconds to pull all the records, when i added a where clause for 3 columns, it dropped the time from 10 seconds to 0.5-1 second, the thing thats confusing me, is the filters did not change the dataset, it was the same 10000 record dataset from the select *, so even though the dataset is the exact same, how come the time difference is so large? can someone explain that to me?

vague fable
pastel vale
#

can i use a alias for a subquery for sqlite? currently going through learn sql (3rd edition) and it uses mysql but for now i want to stick with sqlite until ive 'mastered' sqlite.

fervent charm
coral wasp
vague fable
# coral wasp What was your faster query? (What you said is inconsistent, which is why I'm ask...

SELECT * FROM events WHERE server_id = 1 AND state = 'ACTIVE' AND created_at BETWEEN NOW() - INTERVAL 30 DAY AND NOW();
This took 1 second for 10k records, however when i did the SELECT without the where clause it took 10 seconds, and it was the same dataset because all the server ids, event sates and created_at were all the same for each records, as i was just testing the query, was just mock_data

harsh pulsar
#

Which I guess is queryable from Duckdb

harsh pulsar
#

What database was this?

coral wasp
normal ore
#

I would like to know is duckdb for vector storing is realiable in python?

coral wasp
pastel vale
#

is ROLLUP in sqlite? if not, whats the sqlite alternative? the book im going through uses mysql throughout

coral wasp
pastel vale
#

Idk. Im going though a book called LearnSQL (3rd edition) to learn more about what I can do in SQL. Theres certain features I cant use because theyre not available in SQLite. Just asking for the alternative way to make up for it

flint folio
#

can someone help me in coding im new where do i even start i need soemone to teach me the basics

harsh pulsar
#

That's what group by rollup, cube, and grouping sets is (conceptually): grouping at various levels and unioning the results

harsh pulsar
pastel vale
thorny anchor
#

why not just use that, then

pastel vale
#

Cba with setting up temp server. Ill stick with sqlite for now. Plus sqlite does what i need. Ill move onto other versions when im skilled with sqlite

thorny anchor
#

just run a mysql docker container. pz ez

outer palm
#

I'd like to know if I am strange to go crazy when I see something like this, am I the only one?

cedar tiger
harsh pulsar
harsh pulsar
cedar tiger
#

I wanted to say that its very excel of them to think that its very easy to change columns etc

harsh pulsar
#

There are so many stories out there of databases designed exactly like this, either by amateurs who didn't understand how to do it right, or by lazy sloppy pros who fetishize YAGNI beyond what would ever make sense

cedar tiger
#

I forget whats YAGNI?

harsh pulsar
#

"You aren't going to need it" -- don't add features, interfaces, etc. speculatively, only build for what you know for sure that you need

cedar tiger
#

ah ok

harsh pulsar
#

It's a good principle to avoid over designing too early in the process and ending up with wrong/bad abstractions, too much indirection, etc

#

But some people take it too far

#

If there's one argument in favor of this, it's that your database lookup is slightly faster

cedar tiger
#

Always KISS it is what my PM usually say

harsh pulsar
#

Right

nova hawk
#

HR might want to hear about that

cedar tiger
harsh pulsar
#

And this is partly a matter of knowing your domain. If this is recording temperature sensors from an IoT device, my own experience tells me that you should always plan for a variable number of sensors, because you never know when they will come out with a new version of the device that has more or fewer sensors

#

But if this is something more specific where you are definitely designing the database around a known-fixed set of sensors, then yeah it's actually fine

outer palm
outer palm
harsh pulsar
outer palm
harsh pulsar
#

That would explain the column names

#

Excel -> R data frame -> your database

outer palm
#

Plus the mix with complexity of LIMS software and 💥

#

But I think sometimes it doesn't need to be DB Experts... Today I see a person creating a simple table and in the first column add title info1/info2 ... Why why don't you simply add a new column????

nova hawk
#

as long as they normalize it afterwards

outer palm
nova hawk
#

the process of organising a database properly is called normalization

outer palm
#

Yes yes it was just an example of everyday work, ok you have to think how to organize a DB... But I think it is a matter of mental organisation not to mix different info inside the same table columns... Then why creating a table, just write some sentences:)

harsh pulsar
#

The problem is that scientists are busy being scientists, and either don't take the time to learn, or are never even introduced to good practice and don't even know they're doing it weird

#

The latter was a problem generally in scientific computing until maybe a decade ago

#

It's a little different now because more people care than they did in the past, but there was a long period of time when scientists wrote the worst most unmaintainable code imaginable

outer palm
#

Ok I know I'm probably part of the category and then part of the problem (I think I wrote and still writing terrible code). But if the scientific culture does not get you the mind structure to create a simple table I think we have deeper problems

delicate fieldBOT
#

:incoming_envelope: :ok_hand: applied timeout to @torn sphinx until <t:1727708836:f> (9 minutes and 59 seconds) (reason: mentions spam - sent 10 mentions).

The <@&831776746206265384> have been alerted for review.

mossy swift
outer palm
outer palm
#

Db stories part 2
I have different tables formatted like this

Sample| test| parameter_1| par2...|res_nwne1|res_name2..

One table for every test because parameters and results are different from one test to the other, and one sample can have different results if parameters values are different

Now someone would like to have a table like
Sample| res1_test1|res2_test1| res1_ test2

And I cannot explain to him that it is not that easy to merge all tables in one and not lose parameters information (which could be essential for test result meaning)

pastel vale
#

if im using sql with python, are case expressions still useful?

thorny anchor
#

why wouldn't they be?

pastel vale
#

idk. i feel id rather use python than sql for anything to do with conditional logic

thorny anchor
#

generally, you want to push calculations into SQL, for efficiency

coral wasp
harsh pulsar
#

But honestly, if you have a fixed set of sensors, and this is collecting data for a specific experiment, the "wide" data layout in the screenshot is actually 100% OK

#

it would be much worse for an application that might need to run for years and will probably change at some point

grim vault
#

You missed thermal_conductivity but I think the "how to" is clear.

brave bluff
#

Anyone have any suggestions for a database migration tool for raw sql? As much as I want to get into sqlalchemy, it seems like just an overly complex abstraction of sql, to the point where I wonder why I shouldn't just use raw sql and dataclasses. But now I'm not sure how to manage my database, other than write/run sql files.

smoky hare
#

Anyone here familiar with postgres orm? I wanted an implementation of array field in postgres.

brave bluff
smoky hare
harsh pulsar
#

there's an official container image too which is very useful for deployment

waxen finch
#

not sure if you figured it out yet, but you have to pass your tuple of arguments to Connection.execute() directly, no *args unpacking needed

#

otherwise aiosqlite will think your 5 letter string should fill 5 placeholders in your query

#

you dont need to at all

#

to clarify, if you were to do: py cursor = await conn.execute("SELECT ?, ?, ?", "Bob") then aiosqlite would pass 'B', 'o', 'b' to each three placeholders

#

if you put your strings inside a tuple or list instead, it'll be given to one placeholder each: py cursor = await conn.execute("SELECT ?, ?, ?", ("Bob", "Alice", "John"))

#

dont use f-strings

#

formatting strings directly to add data to your queries is bug prone and a security risk, placeholders are the easiest way to manage it

#

remember that *args in a function signature puts all your arguments into a tuple, so you can pass that directly to connection.execute(), nothing fancy required: ```py
async def execute(self, query, *args):
cursor = await self.connection.execute(query, args)
... # no * needed ^^^^

await db.execute("SELECT ?, ?, ?", "Alice", "Bob", "John")```

maiden light
#

Why are you using SELECT?

waxen finch
#

oh yeah, you only want insert/update/delete for your execute method

#

the query can be whatever, but you want to give a tuple of arguments to aiosqlite, not unpack * your arguments

maiden light
#

Lemme test whatever i understood

#

@waxen finch it worked thanks

night garnet
#

When working with rest api, how do we validate or stop user from deleting data that doesn't belong to them?

keen minnow
night garnet
#

Hmm, I will just try to do the basics for now. Will revisit validation and constrain check later.

keen minnow
#

that's already a good start

night garnet
#

@keen minnow I'm just thingking it wouldn't be hard for other players to find out other player account id

night garnet
#

Would other player able to do Delete if they have other people info?

keen minnow
#

only if you let them 😉

#

same shit though

night garnet
#

Right but that's where my brain shut down.

keen minnow
#

if your APIs have the player id in the URL

night garnet
#

So my understanding is api is stateless? No hand shake what so ever right?

#

So I need to figure out the validation bit

keen minnow
#

the point is it can scope the operation

night garnet
#

I see, well not gonna be worried too much about that for now. I have yet touched api in my life and about to start.

keen minnow
#

if your API is like /<player_id>/etc. you can scope out any operation on that

keen minnow
#

the point is to enforce/scope out the operation on an account/player

#

so that anything outside of it would not work

night garnet
#

I will keep that in mind 🙇‍♂️

#

Been stuck with console project in python.

#

Hopefully making a simple data base isn't a huge leap

keen minnow
#

yeah no worries

#

one problem at a time

pastel vale
#

can someone give me a better definition of what window functions are?

grim vault
weak maple
#

i know fastapi and flask

#

will that work

#

out

#

with django

next shadow
#

Hello, your post has been removed for advertising

cursive pollen
#

Aahh... It's not a ad but I got ur point

#

Should I just say ..

#

Looking for team mate for a hackathon(django+js+web3)

weak maple
#

what is the best db btw

#

for me its redis

#

and im making my own db

cedar tiger
#
reason, time = await self.db.fetch(f"SELECT reason, time FROM staff WHERE employee = {id}")
maiden light
#

How will it return data?

#

("reason", "time",) or ("reason",), ("time",)

cedar tiger
#

The first one

maiden light
#

Oh

#

Easy then

delicate fieldBOT
#
SQL & f-strings

Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.

Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.

For example, the sqlite3 package supports using ? as a placeholder:

query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)

Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.

See Also

  • Python sqlite3 docs - How to use placeholders to bind values in SQL queries
  • PEP-249 - A specification of how database libraries in Python should work
maiden light
coral wasp
wise goblet
pastel vale
#

whats a good transactional database project I could do? Im thinking of doing a weather based project where I use the openmeteo api to extract real time data over a period of time, store the data in an SQLite database and then create a graph from the data

weak maple
#

Yah actuallly

real forum
#

Can someone help me?

async def execute(self):
        statement = SimpleStatement(self.query)
        return self.session.execute_async(statement, (4, "Cassandra"))```
```py
async def main():
    async with Cassandra("INSERT INTO test (id, test_value) VALUES (%s, %s)") as cassandra:
        await cassandra.execute()```

```py
Traceback (most recent call last):
  File "C:\Users\vhipn\IdeaProjects\Cassandra\main.py", line 40, in <module>
    asyncio.run(main())
  File "C:\Users\vhipn\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Users\vhipn\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 649, in run_until_complete
    return future.result()
  File "C:\Users\vhipn\IdeaProjects\Cassandra\main.py", line 35, in main
    await cassandra.selectAll()
  File "C:\Users\vhipn\IdeaProjects\Cassandra\main.py", line 27, in selectAll
    self.session.execute(statement, 4, "mariadb")
  File "C:\Users\vhipn\IdeaProjects\Cassandra\venv\lib\site-packages\cassandra\cluster.py", line 2677, in execute
    return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state, host, execute_as).result()
  File "C:\Users\vhipn\IdeaProjects\Cassandra\venv\lib\site-packages\cassandra\cluster.py", line 2720, in execute_async
    future = self._create_response_future(
  File "C:\Users\vhipn\IdeaProjects\Cassandra\venv\lib\site-packages\cassandra\cluster.py", line 2979, in _create_response_future
    query_string = bind_params(query_string, parameters, self.encoder)
  File "C:\Users\vhipn\IdeaProjects\Cassandra\venv\lib\site-packages\cassandra\query.py", line 905, in bind_params
    return query % tuple(encoder.cql_encode_all_types(v) for v in params)
TypeError: 'int' object is not iterable```
coral wasp
weak maple
#

any one contribute

#

ping me fr questions

keen minnow
#

And it's not complex enough to be that interesting

weak maple
#

complex enough

#

????????????

#

thats not

#

complex EnOuGH

keen minnow
#

no

weak maple
#

Hello
I am Cools9 i am here with a project for intermediate python devs it is a db called elementaldb its simple and fast and made in python for those who want to see it check www.github.com/cools9/ElementalDB
Pinh me fr it
*Ping

paper flower
weak maple
#

its faster

paper flower
#

Can I see the benchmarks?

weak maple
#

well

#

yah

formal current
#

lol, right. Storing the "database" as JSON files will certainly not be faster than using a real DB.

weak maple
#

it is faster

#

well thats how mongodb stores it data what do you say about that

#

its not a real db

#

My db faster than sql lets goo

keen minnow
formal current
keen minnow
#

this is far to primitive to be compared with mongodb

weak maple
#

yah

#

its in dev

#

im making a vector db for it

keen minnow
delicate fieldBOT
#

6. Do not post unapproved advertising.

keen minnow
#

you should see authorization prior to advertising your work

weak maple
#

for god sake isnt there a colab or showcase channel

keen minnow
#

none as far as I know

weak maple
#

ok

#

but still fasterr than sql lets goo

keen minnow
weak maple
#

i shared the benchmark

keen minnow
#

you shared a random screenshot. That has nothing to do with what benchmarks mean

weak maple
#

thats proof

keen minnow
#

nuh uh

weak maple
#

that its faster than mysql

paper flower
#

Also you compared it with sqlite, not mysql

weak maple
#

well yes

#

i used mysql

#

both are same

paper flower
#

Even if fetching a single row is faster, what about fetching multiple rows? Aggregations? Does it support ACID, Locks, Scaling (Vertical/Horizonal), Indexing, Searching, defining schemas?

#

What if your db contains 1000 rows? 1 million rows?

keen minnow
paper flower
#

Would it still be faster?

weak maple
#

well most of them

#

well i tested that too

#

well how else are you supposed to use sql in python

#

without sqlite

paper flower
#

sqlite is not mysql

weak maple
#

then

#

how are u supposed to use mysql in python

paper flower
weak maple
#

still faster than sqlite

paper flower
# weak maple still faster than sqlite

Key difference between your project and sqlite is that sqlite is far more feature rich and for example data won't get corrupted in case of power outage or similar events

weak maple
#

ok

formal current
#

Also, you're not faster than sqlite. You might be faster in a tiny example, but in a proper benchmark sqlite will beat you easily.

cedar tiger
#

no clue

thorny anchor
#

no

#

yes. generally, file extensions don't change anything about the actual file

wise folio
#

Any snowflake experts or anyone with a good snowflake cheat sheet?

lethal swan
#

I didnt undertand you very well, do you need tips to try to make the queries on VS, and try to use them locally?

paper flower
#

It is

barren abyss
#

anyone here know how cassandra works?

#

like where i can learn the nuances ?

keen minnow
barren abyss
barren abyss
#

?

#

i mean like specific book that you use

terse viper
barren abyss
#

i know docs

#

but the docs doesnt really spit the information

#

i wan

coral wasp
barren abyss
#

i mean can google for the docs like for sure

#

im also learn programming that way

#

but it just doesnt gives me the information i wan

#

that you guys might hv

#

btw, thanks for not being classic toxic

#

: )

#

idc whatever the mod gonna do with me

#

but it just so classic that this ppl when they cant answer iinway more friendly like as if they're some prodigy kid that has solved PhD level math

coral wasp
barren abyss
#

im not complaining you

coral wasp
#

I understand, I'm just saying that some questions are so broad that you won't get a good answer

thorny anchor
#

the original dynamo paper in 2007 is fairly readable. Cassandra is based off that design

brave bluff
#

Anyone familiar with sql alchemy know if the query_class still holds for v2? Looking through the docs, I don't see any examples in the v2 docs, and the only documentation on it is one sentance long and points to documentation that doesn't mentions query_class and is prefaced with a note saying that technique is legacy.

#

When googling, I'm seeing multiple techniques for soft delete, but the query_class one is the only one that seems to make sense to me (others feel overly complicated)

cedar tiger
#

!d sqlalchemy.orm.relationship

delicate fieldBOT
#
function sqlalchemy.orm.relationship(argument: _RelationshipArgumentType[Any] | None = None, secondary: _RelationshipSecondaryArgument | None = None, ...) → _RelationshipDeclared[Any]```
Provide a relationship between two mapped classes.

This corresponds to a parent\-child or associative table relationship. The constructed class is an instance of [`Relationship`](https://docs.sqlalchemy.org/en/stable/orm/internals.html#sqlalchemy.orm.Relationship).

See also

[Working with ORM Related Objects](https://docs.sqlalchemy.org/en/stable/tutorial/orm_related_objects.html#tutorial-orm-related-objects) \- tutorial introduction to [`relationship()`](https://docs.sqlalchemy.org/en/stable/orm/relationship_api.html#sqlalchemy.orm.relationship) in the [SQLAlchemy Unified Tutorial](https://docs.sqlalchemy.org/en/stable/tutorial/index.html#unified-tutorial)

[Relationship Configuration](https://docs.sqlalchemy.org/en/stable/orm/relationships.html) \- narrative documentation
cedar tiger
#

Doesn't seem to show it but according to this, its towards a Query

brave bluff
#

Oh, I didn't think to look at relationsnhip for that. So seems like I can use query_class to build a soft delete mechanism.

tranquil aspen
#

What's the best way to store password salt bytes in sqlalchemy? base64 encode? LargeBinary type?

cedar tiger
# tranquil aspen What's the best way to store password salt bytes in sqlalchemy? base64 encode? ...
paper flower
#

most password hashing algorithms already store salt with the password, don't invent you own hashing algorithm

#

And bcrypt uses similar structure, but it only includes one parameter into the resulting hash string

paper flower
#

Query was deprected in 2.0 in favor of using select/Select

brave bluff
# paper flower Query was deprected in 2.0 in favor of using `select/Select`

Can you expand on that? I've seen examples of how query_class can add where segments onto a query. How would I do that with a select? And I'm also considering I'm looking at this wrong. I'm used to Django and thinking how I'd do a soft delete there. I'm guessing there are different mechanisms/patterns I should be following for sqlalchemy.

paper flower
#

Code example would be great too

brave bluff
# paper flower Explain what exactly you want to do, as I didn't really use Query api, it alread...

I'm looking to build a soft delete mechanism. I don't have code yet; I'm brand new to SQLAlchemy so learning how it works. But I found these two examples of methods:
https://blog.miguelgrinberg.com/post/implementing-the-soft-delete-pattern-with-flask-and-sqlalchemy
https://theshubhendra.medium.com/mastering-soft-delete-advanced-sqlalchemy-techniques-4678f4738947
The second, using a listener, seems an overly complex way of doing things. The first seems like a really reasonable way to build the mechanism, but it's also 8 years old, so I'm not sure if applies to v2

paper flower
#

But I don't entirely remember if it plays nicely with relationships

brave bluff
#

Oh, and I just noticed the first one has code that doesn't exist anymore

paper flower
#

(Yes, via a listener)

brave bluff
#

Ooof, a listener makes sense but feels so needless. At that point, I could just create a manager to setup queries

#

Or a helper function

paper flower
#
@event.listens_for(Session, "do_orm_execute")
def _add_filtering_criteria(execute_state):
    """Intercept all ORM queries.   Add a with_loader_criteria option to all
    of them.

    This option applies to SELECT queries and adds a global WHERE criteria
    (or as appropriate ON CLAUSE criteria for join targets)
    to all objects of a certain class or superclass.

    """

    # the with_loader_criteria automatically applies itself to
    # relationship loads as well including lazy loads.   So if this is
    # a relationship load, assume the option was set up from the top level
    # query.

    if (
        not execute_state.is_column_load
        and not execute_state.is_relationship_load
        and not execute_state.execution_options.get("include_private", False)
    ):
        execute_state.statement = execute_state.statement.options(
            orm.with_loader_criteria(
                HasPrivate,
                lambda cls: cls.public == true(),
                include_aliases=True,
            )
        )

As you see it's not that complicated

brave bluff
#

Oh, def not complicated

paper flower
#

class HasPrivate:
    """Mixin that identifies a class as having private entities"""

    public = Column(Boolean, nullable=False)

It doesn't seem to have been updated to use new syntax though

brave bluff
#

A lot of the docs use the old syntax 😛

paper flower
#

Should be

class HasPrivate:
    public: Mapped[bool]
brave bluff
#

Yah, that's more or less what the second link I provided above does

paper flower
#

Does it work with relationships?

#

If so - just use it to be honest 🤔

brave bluff
#

It's not complicated, but I'm not sure how it's better than

def SoftDeleteQuery(my_class):
    return select(my_class).where(deleted == False)
#

(that was thrown together really quick, so please ignore the syntax errors :P)

paper flower
brave bluff
paper flower
#

While you can just add that mixin to the classes you need

paper flower
brave bluff
#

Yah, you'd want a mixin to add the column

brave bluff
#

Like I said, I may be thinking about it wrong, since I'm new to SQLAlchemy

#

I don't dislike the listener

paper flower
#

To make it more explicit you still can set include_private to True by default

#

But generally use whatever you find easier to use

brave bluff
#

Yah, a param of some sort

paper flower
#

For soft deletes it may be simpler to just create a base query in your repository if you use that pattern

brave bluff
#

But I'll look at the listener more... it just feels like a lot for a little

paper flower
#

e.g.

_base_stmt = select(Model).where(Model.is_deleted.is_(False))
brave bluff
#

I really do wish the docs were better

#

Finding something like that would take forever as someone new to the code

paper flower
brave bluff
#

Yah, I'll read more on the listener

brave bluff
#

Is putting the old instant yeast in warm water and sugar the best way to tell if it's still good?

tranquil aspen
elder cipher
#

Hello Mongo db user here

livid coyote
#

Is there any particular reason why something like this would be unsafe?

from sqlalchemy.orm import DeclarativeBase
from pydantic import BaseModel


class Input(BaseModel):
    """Base input object for ORM mapped classes."""


T = TypeVar("T", bound=Input)


class Base(DeclarativeBase, Generic[T]):

    def __init__(self, base_input: T) -> None:
        input_dict = base_input.model_dump()
        for key, val in input_dict.items():
            if hasattr(self, key):
                setattr(self, key, val)
#

The idea being that inheriting mapped classes can just pass in a pydantic validated object to the __init__ - I suppose I could add an else that raises an error at the ORM level where hasattr is False

paper flower
livid coyote
#

So long as my inheriting classes are correctly typed, how would that be different from passing each value separately as an argument to __init__?

#

e.g.

class PhoneInput(Input):
    number: str
    phone_type: PhoneType
    extension: str | None = None


class Phone(Base[PhoneInput]):
    __tablename__ = "phones"

    phone_id: Mapped[int] = mapped_column(
        primary_key=True,
        autoincrement=True,
    )

    number: Mapped[str]
    phone_type: Mapped[PhoneType] = mapped_column(Enum(PhoneType))
    extension: Mapped[str | None] = mapped_column(nullable=True)
brave bluff
#

I don't understand the square bracket syntax of this line class Phone(Base[PhoneInput]):. What's that indicating? Base of type PhoneInput? But that doesn't make sense for inheritance?

real robin
#

if somebody can explain the ussage

brave bluff
#

I think it's a case where Base is a generic

brave bluff
#

SQLAlchemy question: from the docs alone, I'm not really getting the WHY of sessionmaker over Session(engine) . It says it's for when you have consistent configs, but like what? I suspect this is mostly because of how bad the sqlalchemy docs are, but coudl someone direct me?

thorny anchor
brave bluff
brave bluff
#

Yah, I read that, and it's still not clear to me. As always, the "example" they give doesn't mean much.

thorny anchor
#

if you use a sessionmaker, you can do Session(). if you don't use a sessionmaker, you need to do Session(engine)

brave bluff
#

Yah, heh, that's obvious, so it's one less import 😛

brave bluff
#

Another SQLAlchemy question. If I want to make a mixin class, am I better doing:

class SoftDeleteMixin(Base):
    __abstract__ = True

    deleted: Optional[Mapped[datetime.datetime]] = mapped_column(default=None)

or

class SoftDeleteMixin:
    deleted: Optional[Mapped[datetime.datetime]] = mapped_column(default=None)

Inherit from the DeclarativeBase and set the class as abstract or just make a standard class?

torn sphinx
#

i've been thinking bout database internals and indexing strategies. I've not actually looked under the hood of databases yet but I want an opinion if the strategy i describe would be viablee!!

each table column get a BST which holds indexes in the main table structure

and each row item has a row_id, the items in the BST are sorted based on the column values, and each node holds a row id to the said row

so, can someone, by this logic, create a O(log(n)) time complexity based indexing system for a table using this strategy?

so a query like SELECT * FROM table WHERE name = 'abc' should conduct a search in the BST of the name column, once it finds the row_id for that, it should be able to return all columns from the table structure using that row_id as an index.

and since insertion in a BST is also a O(log(n)) operation, that should also be relatively easy

data retreival and insertion would be a O(m log(n)) operation for a table of columns m, and rows n

(pls ping)

paper flower
paper flower
torn sphinx
#

that's so good to knoww!! i was just thinking how db works with this stuff so fast!!

plucky holly
#

hey, anyone has experienced in Apache Beam? especially WriteToJdbc

thorny anchor
# torn sphinx i've been thinking bout database internals and indexing strategies. I've not act...

this captures the basic idea pretty well, and you technically could use a BBST for an index. as Doctor said though, most commonly you'll use a BTree, because they have a high fanout, leading to shallower trees, leading to less IO operations, since each node takes an IO operation to load into memory.

also, generally, you won't have an index for each column, though you could. you usually only put indexes on columns that you want to search by, because maintaining indexes increases the cost of insertions and deletions, which you saw with the O(m log n) time complexity

torn sphinx
#

mmm, i'll look into BBST and BTree rq, ty

thorny anchor
#

a bbst is just a bst that balances itself

torn sphinx
#

indeed, i just gotta see how it functions

ashen timber
#

Would you guys recommend working when working with databases to use SQLAlchemy?

tranquil aspen
#

I like it

#

It is a well established and well documented library

tired beacon
#

Made my own php .txt database hmu if you're interested

#

It's programmable like mysql

cyan bay
#

I'd really like to learn more about using ORM in SQLAlchemy for work, specifically with Oracle DB. I'm not sure I can easily buy or make a mock setup of Oracle DB at home, though. Is there a good way to practice this?

coral wasp
#

It's not too hard, no harder than installing postgresql

steep basalt
brave bluff
#

I know this is a common pattern in a bunch of packages, but I guess I never bothered to understand the underlying technique, so... I'm trying to set up a foreign key in sqlalchemy:

owner: Mapped["models.user.User"] = mapped_column(ForeignKey("models.users.User.id"))

All my models are in the folder models. I can't import the models directly into each other due to circular referencing, so I'm using the string format for referencing. But I'm not sure if I should be putting the full path to model, a relative path, or just the class name. I'd love some direction.

tranquil aspen
brave bluff
tranquil aspen
#

Yeah

brave bluff
#

And that applies to the Mapped[?] as well?

tranquil aspen
#

The full line is user_id: Mapped[str] = mapped_column(ForeignKey("user_account.id"))

brave bluff
#

Well, int, but yah

tranquil aspen
#

In my case the primary key is a UUID

brave bluff
#

Yah, makes sense

#

Ok, that helps

#

As usual, why are the docs not more clear...

#

Thanks for helping make that clearer

brave bluff
#

@tranquil aspen Sorry to ping directly, but as you seem to know about this, I think my previous question confused ForeignKey and relationship. The docs have this example:

class Parent(Base):
    __tablename__ = "parent_table"

    id: Mapped[int] = mapped_column(primary_key=True)
    children: Mapped[List["Child"]] = relationship(back_populates="parent")


class Child(Base):
    __tablename__ = "child_table"

    id: Mapped[int] = mapped_column(primary_key=True)
    parent_id: Mapped[int] = mapped_column(ForeignKey("parent_table.id"))
    parent: Mapped["Parent"] = relationship(back_populates="children")

So ForeignKey uses the table.column notation as you mentioned. This would obviously create a column in the table. The relationship I'm guessing won't, and is instead used to get matching relationships in code? There, the Mapped["Parent"] seems a reference to the class; so if the class is in a different file, how do I refer do it? And I'm guessing I need a relationship on both tables for this to work?

tranquil aspen
#

Yeah the relationship part works differently

#

Based on your naming it seems you are looking at the same post I used to make my tables, so you'll recognize the comments I used to remind me of how it works:

# Child
class UserSession(Base):
    __tablename__ = "user_session"
    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=str(uuid4()))

    created_date: Mapped[DateTime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )
    updated_date: Mapped[DateTime] = mapped_column(
        DateTime(timezone=True), server_default=func.now(), onupdate=func.now()
    )

    # parent: Mapped["Parent"] = relationship(back_populates="child")
    # parent_id: Mapped[int] = mapped_column(ForeignKey("parent_table.id"))
    user: Mapped["User"] = relationship(back_populates="user_session")
    user_id: Mapped[str] = mapped_column(ForeignKey("user_account.id"))

    UniqueConstraint(user_id)

    def to_json(self):
        return {
            "id": self.id,
            "created_date": str(self.created_date),
            "updated_date": str(self.updated_date),
        }


def uuid_str():
    return str(uuid4())


# Parent
class User(Base):
    __tablename__ = "user_account"

    id: Mapped[str] = mapped_column(String(36), primary_key=True, default=uuid_str)
    username: Mapped[str] = mapped_column(String(16), nullable=False)
    password: Mapped[str] = mapped_column(Text, nullable=False)
    salt: Mapped[str] = mapped_column(Text, nullable=False)
    created_date: Mapped[DateTime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )
    updated_date: Mapped[DateTime] = mapped_column(
        DateTime(timezone=True), server_default=func.now(), onupdate=func.now()
    )
    active: Mapped[Boolean] = mapped_column(Boolean(), server_default="t", default=True)

    # child: Mapped["Child"] = relationship(back_populates="parent")
    user_session = relationship(UserSession, back_populates="user", uselist=False)
#

In the above example each UserSession is linked to a user id

brave bluff
#

Ok, great, thanks

#

So it seems to refer to the class, regardless of location

tranquil aspen
#

Yes

brave bluff
#

It's not a location pointer, just a name

#

These docs...

#

Thank you again

brave bluff
# tranquil aspen Yes

Yah, this is where I'm getting thrown:

permissions: Mapped[List["Permission"]] = relationship()

Pylance is throwing a squiggly line under User saying it's not defined. But I'm not sure how to handle that.

tranquil aspen
#

Show code

brave bluff
#

Eer, that is the code. You mean the two separate classes involved? I can do that in the morning, as it's late for me, heh.

brave bluff
#

In models/role.py I have

class Role(Base, TimestampMixin, SoftDeleteMixin):
    __tablename__ = "roles"

    name: Mapped[str] = mapped_column(String(64), unique=True)
    plural: Mapped[str] = mapped_column(String(64), unique=True)
    owner_id: Mapped[int] = mapped_column(ForeignKey("users.id"))
    permissions: Mapped[List["Permission"]] = relationship(
        secondary="role_permissions", back_populates="roles"
    )

I don't import permission there, as it would cause a circular dependency.

paper flower
#

@brave bluff Sorry, but what's the issue? pithink

brave bluff
#

Pylance is throwing an error on Mapped[List["Permission"]] saying Permission is not defined.

paper flower
#

Import permission in a if TYPE_CHECKING block

#
if typing.TYPE_CHECKING:
    from somewhere import Permission
#

This is needed for type checkers to understand where that type actually comes from

brave bluff
#

Ah, but won't cause the circular imports?

paper flower
#

No, TYPE_CHECKING is always False at runtime

brave bluff
#

Is this new? Or just something I somehow haven't run into before

paper flower
#

It's quite old pithink

brave bluff
#

Oh, so it's not a literal import, it's just a definition

paper flower
#

Added in version 3.5.2.

brave bluff
#

I don't know how I've never run into this before

#

Yah, wild

#

I've been doing Python for like 5 years, heh, but maybe this would have come up sooner if I had started SQLAlchemy earlier

#

Thanks

delicate fieldBOT
#

src/app/db/models/manga/_manga.py line 30

if TYPE_CHECKING:```
brave bluff
#

Thanks

#

Yah, seems clearer now

#

Oh, I guess the professional projects I've been working on have been older and using minimal type checking, so that's probably also why I havent run into this.

#

Good to know. I assume this will come up a lot more as I add more type checks

paper flower
#

Just configure ruff with all rules enabled brainmon

brave bluff
#

Not sure what ruff is, but I'll look into it

paper flower
brave bluff
#

Oh, I've been using black

#

and pylance

#

But I'll look at ruff

paper flower
#

ruff format can replace black and isort

brave bluff
#

Oh, that'd be nice, heh

paper flower
#

I personally use mypy, but coworker says pylance is not bad either

#

It's not a type checker btw, just a formatter and linter

brave bluff
#

Yah, I'm trying to learn more about tooling, as it's always been set up at a company before I got there

paper flower
#

Also look into uv 🙏

brave bluff
#

Yah, been seeing uv come up a lot lately

#

I've been using pipenv for a long time

#

But it's resolver is meh at best

#

I'm thinking of switching my project to uv instead

subtle swift
#

Yo who’s good at SQL

coral wasp
brave bluff
#

What's the SQLAlchemy way of setting a value on save? I have a field in model that is dependent on another field. I could obviously set both any time the primary is set, but that's redundent. If it were a normal class, I'd use a getter/setter, but I'm not sure how I'd do that with sqlalchemy. In Django, I'd use the save method, but not sure how I should with SQLA. Is it just "make a method"?

brave bluff
#

I think hybrid_property is the mechanism to use here. Still curious about checks before saving, to test data is set, for example

brave bluff
cedar tiger
# brave bluff Not how to save a model, but rather how to do something before saving.

Like this?

from sqlalchemy.orm import Session

with Session(engine) as session:
    spongebob = User(
        name="spongebob",
        fullname="Spongebob Squarepants",
        addresses=[Address(email_address="spongebob@sqlalchemy.org")],
    )
    sandy = User(
        name="sandy",
        fullname="Sandy Cheeks",
        addresses=[
            Address(email_address="sandy@sqlalchemy.org"),
            Address(email_address="sandy@squirrelpower.org"),
        ],
    )
    patrick = User(name="patrick", fullname="Patrick Star")
    session.add_all([spongebob, sandy, patrick])
    session.commit()
brave bluff
#

No, not the example out of the docs. As I said above, I have a field who's value is dependent on another. So some mechanism to modify/check the model before it's saved.

cedar tiger
brave bluff
cedar tiger
brave bluff
#

As long as the type matches, it's fine.

cedar tiger
#

Are you looking at the insert or update?

#

If its an update, wouldn't you need to do a select query and find if the value entered matches the value from the db?

#

If its an insert, surely you would need some sort of guard that the value or format of the data input is within an acceptable range or format

#

like emails for example, you would need regex

silent flicker
#

hello, i need help with mysql and python connection
here is my code:

from mysql.connector import connect, Error

try:
    with connect(
        host="localhost",
        user=input("Enter username: "),
        password="123456"
    ) as connection:
        print(connection)
except Error as e:
    print(e)```

i was having problems installing and doing stuff with mysql connector at first, but now that i've done everything, i dont even get any outputs, no error either, nothing.
Thanks for the help in advance!

MySQL server version - 8.0.39
Using - VScode or IDLE. spyder doesnt even know that mysql.connector exists, returns error
delicate fieldBOT
#
Traceback

Please provide the full traceback for your exception in order to help us identify your issue.
While the last line of the error message tells us what kind of error you got,
the full traceback will tell us which line, and other critical information to solve your problem.
Please avoid screenshots so we can copy and paste parts of the message.

A full traceback could look like:

Traceback (most recent call last):
  File "my_file.py", line 5, in <module>
    add_three("6")
  File "my_file.py", line 2, in add_three
    a = num + 3
        ~~~~^~~
TypeError: can only concatenate str (not "int") to str

If the traceback is long, use our pastebin.

torn sphinx
#

Are SQL and MySQL the same thing?

hexed estuary
#

if in doubt, you probably just want Sqlite. It's the right choice in most cases. And when it's not, consider postgresql.

silent flicker
#

look, no output
as per my code, i believe it should printing connection variable or e variable, but it doesnt

coral wasp
silent flicker
# coral wasp Add additional print statements to see what's happening. It's also possible you'...

i am running the code i pasted above in my original message, where exactly would you suggest me to add the print statements?

i believe its not working after the

try:
    with connect(
        host="localhost",
        user=input("Enter username: "),
        password="123456"
    ) as connection:```
since any print statement after that doesn't work, but it does ask me for the username input

again, please do tell me where should i add print statements if there is anywhere it should be added
tranquil aspen
#

Try obtaining the username via input before the with statement

silent flicker
coral wasp
coral wasp
drowsy sedge
#

connection = sqlite3.connect('business_management_system.db')

cursor = connection.cursor()

createtable = """CREATE TABLE IF NOT EXISTS
bms(username_id TEXT, password_id TEXT)"""

cursor.execute(createtable)

def loginaccount():
password = password_field.get
username = username_field.get
cursor.execute("SELECT username_id, password_id WHERE username_id, password_id = username, password")
if password == password_id and username == username_id:
accessgranted = true

def createaccount():
password = password_field.get
username = username_field.get
cursor.execute("""INSERT INTO bms(username_id, password_id)VALUES(username_id, password_id)""")

bms.commit()
bms.close()
#

does anyone know how to fix 'column does not exist' when i try to store account details to my database?

coral wasp
#

What's wrong with this query: cursor.execute("SELECT username_id, password_id WHERE username_id, password_id = username, password") || hint: multiple things, but let's start with - no from clause ||

paper flower
pastel ice
#

Hello need help in CrewAI with Memory main issue is Database is not saving the data if anyone allow me should I explain my complete problem

gentle surge
#

There are two tables T1 and T2, an entity from T1 must be related to at least one entity of T2. An entity from T2 may be related to at most one entity of T1.

#

is that correct?

waxen finch
# gentle surge is that correct?

i think its meant to be read the other way around, i.e. T1 may be related to at most one of T2, and T2 must be related to at least one T1

gentle surge
#

I need to make the diagram

waxen finch
#

oh oops

gentle surge
paper flower
#

I don't think there's a good way to guarantee that T1 relates to at least one T2? But otherwise it looks like a typical one-to-many relationship

gentle surge
#

And I should just use numbers

#

So like if it says at most 1 just put a 1 on the top of the line

paper flower
#

They're not uncommon, but you can use numbers too

gentle surge
paper flower
#

Each?

gentle surge
#

Doesn’t the text describe the relation from t1 to t2 and t2 to t1

#

So each side needs a number or am I wrong

paper flower
#

Yeah, but it says at least 1 and at most 1 on the other side

gentle surge
#

One wide is 1 to many and the other side is 0 to 1

#

So I put 1:N and 0:1 on either side

paper flower
#

Yeah

gentle surge
#

Or the other table

daring latch
#

How can I talk in the room?

gentle surge
waxen finch
pastel vale
#

is there a faster way of converting a csv to sql database for SQLite? I know theres a faster method using postgreSQL but Im using SQLite atm for practice and getting used to database management.

silent flicker
pastel vale
# thorny anchor what is your current method

I just use to_sql. Simple and does the job. Its not slow but for the future, if i working with a much larger dataset, want to learn a faster and efficient method for csv to sql conversion

coral wasp
#

The faster method is usually using a native loader: most database have some direct (fast) way to ingest a csv file

#

And, in some cases, doing batch updates rather than inserting record individually

#

This is somewhat a rathole of incomplete work, but when it works it works pretty well

drowsy sedge
coral wasp
#

From what table does the id come from?

drowsy sedge
coral wasp
drowsy sedge
coral wasp
drowsy sedge
#

Bms the one I made above

coral wasp
#

But bms is not in the query.

drowsy sedge
#

Does it have to be?

coral wasp
drowsy sedge
#

What about the create account function?

coral wasp
#

You have many issues with your queries.

#

Let's just deal with the first

drowsy sedge
#

Okay

coral wasp
#

Also, you have to learn how to pass parameters to your queries

#

!sql is a good instruction

delicate fieldBOT
#
SQL & f-strings

Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.

Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.

For example, the sqlite3 package supports using ? as a placeholder:

query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)

Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.

See Also

  • Python sqlite3 docs - How to use placeholders to bind values in SQL queries
  • PEP-249 - A specification of how database libraries in Python should work
drowsy sedge
#

cursor.execute("SELECT username_id, password_id FROM bms WHERE username_id, password_id = username, password")

?

coral wasp
#

You're not passing any parameters there, see the embed

#

And, that should be a=b and c=d, not a,c = b,d

drowsy sedge
#

What do you mean?

coral wasp
#

(That's valid in some dbs but I hate it)

coral wasp
drowsy sedge
#

I’m really confused

#

cursor.execute("SELECT username_id, password_id WHERE username_id = username, password_id = password")

#

?

coral wasp
#

Remove the comma, use 'and'

drowsy sedge
#

cursor.execute("SELECT username_id, password_id WHERE username_id = username AND password_id = password")

coral wasp
#

Yes

#

Now, add the parameters

#

!sql

delicate fieldBOT
#
SQL & f-strings

Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.

Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.

For example, the sqlite3 package supports using ? as a placeholder:

query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)

Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.

See Also

  • Python sqlite3 docs - How to use placeholders to bind values in SQL queries
  • PEP-249 - A specification of how database libraries in Python should work
coral wasp
#

And add a print statement so you can see what username and password is being used

drowsy sedge
#

cursor.execute("SELECT username_id, password_id WHERE username_id = username, password_id = password")
parameters = (username, password)

coral wasp
#

Try it and see what happens

#

(It's not right, but it'll help you see the message)

drowsy sedge
coral wasp
#

Almost

coral wasp
#

Or ||import sqlite3

conn = sqlite3.connect(':memory:')
cursor = conn.cursor()

print(cursor.execute("select :name", {"name": "bobby"}).fetchall())
||

drowsy sedge
#

print(cursor.execute(“SELECT username”, {“username_id”:”username”})fetchall())

drowsy sedge
#

What you sent me is really different to what I had

coral wasp
#

I'm showing you how to use parameters in sqlite

drowsy sedge
#

All I really need to do is let it add account to database and check if it matches an account logging in

coral wasp
drowsy sedge
# coral wasp Yes

passwordcheck = “SELECT * FROM bms WHERE password_id = password AND username_id = username;”
parameters = (username, password)
db.execute(passwordcheck, parameters)

#

Why does the parameters help this function?

coral wasp
#

How does the database know where the username or password goes?

#

Go ahead and run the query and read the error.

drowsy sedge
#

“Column does not exist”

coral wasp
drowsy sedge
#

1 sec

delicate fieldBOT
#
SQL & f-strings

Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.

Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.

For example, the sqlite3 package supports using ? as a placeholder:

query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)

Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.

See Also

  • Python sqlite3 docs - How to use placeholders to bind values in SQL queries
  • PEP-249 - A specification of how database libraries in Python should work
drowsy sedge
#

def createaccount():
password = password_field.get
username = username_field.get
cursor.execute("""INSERT INTO bms(username_id, password_id)VALUES(username, password)""")

#

??

#

I’m confused on the symbol, ? And commas

#

Have I passed it through here?

coral wasp
drowsy sedge
#

query = “INSERT INTO bms WHERE symbol = ?;”
parameter1(username, password)
dbexecute(query, parameter1)

#

Or am I supposed to replace symbol with column and ? With username_id, password_id

coral wasp
#

The ? Represents one value in your parameter tuple. If you have two parameters, you'd have two ?'s

drowsy sedge
#

query = “INSERT INTO bms WHERE symbol = (?, ?)”
parameter1(username, password)
dbexecute(query, parameter1)

coral wasp
#

What's symbol ?

drowsy sedge
#

placeholder

#

would it be column?

coral wasp
#

Yes

drowsy sedge
#

I’ll test it in a minute

#

and I’m correct in passing through the user inputs?

#

and how does it know whether it is in username_id or password_id column

#

wait I have annidea

#

Idea

#

query = “INSERT VALUES(username, password) INTO bms WHERE column = (?, ?)”
parameter1(username_id, password_id)
dbexecute(query, parameter1)

coral wasp
drowsy sedge
coral wasp
drowsy sedge
#

Why?

coral wasp
#

Run it

#

The placeholder only represent the data you want to pass in. Like: user_id = ? : Means you want where the userid column to equal the input parameter.

spiral delta
#

can someone give tips for this? especially db

The platform must be able to handle 10,000 requests per second, with a registered base of 100,000 drivers and 50 million users globally.

backend - docker-compose of django,python
I had to build quick some 35 40 endpoints and few websockets so py dont hate
for db I am using postgres
platform is something like uber so constant writes too not just read
currently thinking of sharding postgres with redis in b/w but more tips would be nice
does k8s work well with dbs?

using gcp free 300 usd credits

hidden garden
#

Hi guys this is probably easy but I have to complete this for a school competition. I never did python before but I think it could help

The step I reached is that

I need to automate a massive excel file with 19 tabs for countries, each tab compares 4 columns, CODE1 VALUE1, CODE2 VALUE2 for different fictional cities and buildings. I made some formulas to show if they match and where which code is missing (1 or 2) on excel

I manually populate the excel files that are extracted from different applications from each country.

I need to match them excluding a legend of values that are based on another set of excel files that are from the past month that are commented as known gaps.

I'm really stuck with this, I've done an excel, but I'm not sure it's fast enough. I know VBA is out of the question since it doesn't do much that excel already can.

Should I go with python? I never used python before. Will I need an interface for it? Maybe power bi can match the values but I'm not sure if there is a limit for the number of tables. Could it concatenate code1-value1? but how will this work for the past month legend check.

I've been thinking of doing an automated extract from the application's API, or from the excel that will be in the cloud for each month on Sharepoint that populates a database with python that matches the concatenated columns ( city 1 & building 1 vs city 2 & building 2) for each country

Then, populate three new columns if there is a match, if there is a missing building or a missing city and then using the past month excel flag the past gaps as a known gap in an new column.

All of this is then sent into power BI where it is filtered by country.

With a Sharepoint comment button done on power apps ill annotate a comment for each gap and then another power app creates a final report from the entire power bi.

I have midterms coming up and I've been thinking about this for a while, but I'm open to new ideas.

coral wasp
coral wasp
hidden garden
#

Thank you! I will look into it for sure, I'm working on Jupyter notebook with anaconda navigator

real pollen
#

do you love it when you have the exact some code in 2 functions, but one works the other doesnt?

#

works:
count = cur.execute("select colum1, sum(colum2) from table group by colum1 order by colum1 asc") for rec in count.fetchall(): colum1.append(rec[0]) colum2.append(rec[1])
IndexError: tuple index out of range:
count2 = cur.execute("select colum3, sum(colum2) from table group by colum3 order by sum(colum2) desc") for rec in count2.fetchall(): colum3.append(rec[0]) colum2.append(rec[1])
can someone explain?

#

and error appears at the last line: colum2.append(rec[1])

coral wasp
real pollen
#

now it is working, lol.
thanks, whatever that was it helped

wild knoll
#

yooo

manic spruce
#

how long do you think it takes to learn MySQL for web dev?

keen minnow
slow temple
#

Hello guys l, can someone please help me connecting DataGrip with oracle

#

I am not able to create a database

#

And now when I tried to create new connectio, it's not working

#

This is when i am trying to create a database

manic spruce
torn sphinx
#

what's the difference b/w sqlite3.connect(":memory:") and sqlite3.connect("file::memory:")

brave bluff
#

Anyone know if with SQLAlchemy polymorphic classes, if there's a mechanism to hard code the value used for the polymorphism? If I'm already setting polymorphic_identity, it'd be convenient to set the field to the same value?

lavish heron
#

Please share resources to learn Dependency Injection.

drowsy sedge
#

Anyone know why it says column does not exist in sqlite3?

cedar tiger
drowsy sedge
cedar tiger
drowsy sedge
# cedar tiger whats the full error message?

Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\Oscar\AppData\Local\Programs\Python\Python312\Lib\tkinter_init_.py", line 1968, in call
return self.func(*args)
^^^^^^^^^^^^^^^^
File "C:\Users\Oscar\OneDrive\Documents\PYTHON CODE O3\v10.py", line 60, in createaccount
cursor.execute("""INSERT INTO bms (usernameid, passwordid) VALUES (username, password);""")
sqlite3.OperationalError: no such column: username

cedar tiger
drowsy sedge
cedar tiger
# drowsy sedge how?

E.g.

cur.execute("INSERT INTO movie VALUES(?, ?, ?)", ("Monty Python's Life of Brian", 1979, 8.0))
drowsy sedge
#

isnt that basically how i did it?

cedar tiger
drowsy sedge
#

cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))

cedar tiger
#
    cursor.execute("""INSERT INTO bms (usernameid, passwordid) VALUES (username, password);""")

The username here is assumed as a column name. Same as password. Not usernameid or passwordid

drowsy sedge
# cedar tiger Correct

accountid = int(0)
accountid = accountid + 1

would this be correct for creating an order of accountids?

#

or would that be incorrect logic in python

cedar tiger
drowsy sedge
#

and my error this time is

       ^^^^^^^^^^^^^^^^

File "C:\Users\Oscar\OneDrive\Documents\PYTHON CODE O3\v10.py", line 60, in createaccount
cursor.execute(cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.ProgrammingError: Error binding parameter 1: type 'method' is not supported

drowsy sedge
# cedar tiger https://tenor.com/view/inception-deeper-go-deeper-we-need-to-go-deeper-leonardo-...

ive removed it and still getting

Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\Oscar\AppData\Local\Programs\Python\Python312\Lib\tkinter_init_.py", line 1968, in call
return self.func(*args)
^^^^^^^^^^^^^^^^
File "C:\Users\Oscar\OneDrive\Documents\PYTHON CODE O3\v10.py", line 60, in createaccount
cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))
sqlite3.ProgrammingError: Error binding parameter 1: type 'method' is not supported

cedar tiger
cedar tiger
drowsy sedge
cedar tiger
drowsy sedge
cedar tiger
delicate fieldBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

drowsy sedge
#

def createaccount():
password = password_field.get
username = username_field.get
cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))

cedar tiger
#

after get

cedar tiger
drowsy sedge
#

do i need db.commit?

#

its saying it is not defined

cedar tiger
drowsy sedge
#

bms.commit()

#

i have it as bms

#

and bms is my database

#

bms(usernameid TEXT, passwordid TEXT)

nova hawk
#

iirc it autocommits by default

drowsy sedge
#

ive just got no error message..

cedar tiger
drowsy sedge
#

does that mean its in the database

cedar tiger
#

!pastebin your result/code

delicate fieldBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

drowsy sedge
#

it just has random symbols and create table

cedar tiger
#

Stop sharing your file here

#

Use the pastebin service ☝️

drowsy sedge
cedar tiger
drowsy sedge
#

my database

cedar tiger
drowsy sedge
cedar tiger
drowsy sedge
#

and db-journal cant be opened in viewer

cedar tiger
cedar tiger
# drowsy sedge

Type this in place of that:

-SELECT * FROM 'bms' LIMIT 0,30
+INSERT INTO 'bms'(usernameid, passwordid) VALUES ("AgentQ", "IsSuperAwesome")

And then click "Execute"

drowsy sedge
cedar tiger
# drowsy sedge

OK now retry the select statement: SELECT * FROM 'bms' LIMIT 0,30

cedar tiger
drowsy sedge
#

wheres my input?

cedar tiger
#

It didn't go in

drowsy sedge
#

def createaccount():
password = password_field.get()
username = username_field.get()
cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))

#

is this not correct?

cedar tiger
drowsy sedge
cedar tiger
nova hawk
#

can you share the full code/

drowsy sedge
cedar tiger
drowsy sedge
cedar tiger
drowsy sedge
cedar tiger
drowsy sedge
cedar tiger
#

What happened to my record?

hexed estuary
#

isn't LIMIT 0,30 equivalent to LIMIT 30?

drowsy sedge
formal epoch
#

https://www.youtube.com/watch?v=7mz73uXD9DA&t=917s

is this a good vid to get started with SQL?

Course Problems & Certificate 👉 https://lukeb.co/sql
Course Databases & Links 👉 https://lukebarousse.com/sql

⚠️ Fix Database Load Issue in Ch 2 👉 https://lukeb.co/sql_error

Thanks to Kelly Adams for her work in producing this course 🎉🙌
👩🏻‍💼 Kelly's LinkedIn: https://www.linkedin.com/in/kellyjianadams/
🌐 Kelly's Website: https://www.kellyjadam...

▶ Play video
#

if anyone has referred to this.

coral wasp
hollow oar
#

anyone good with snowflake?

i have a snowflake data warehouse with a sparsely populated time series. for example, the table might look like this:

DATETIME                VALUE
2023-10-01 00:00:00    10
2023-10-05 12:30:00    20
2023-10-12 08:45:00    30

i'm trying to generate a minutely series, where each minute would represent the most recent observation that's at most 10 days old. so for example:

  • at 2023-10-05 12:30:00, the value is 20.
  • at 2023-10-11 12:30:00, the value is still 20 (because it's within 10 days of2023-10-05 12:30:00).
  • at 2023-10-12 08:44:00, the value is still 20 (because it's within 10 days of2023-10-05 12:30:00).
  • at 2023-10-12 08:45:00, the value is now 30 (because we actually saw 30 at exactly 2023-10-12 08:45:00).
  • until 2023-10-22 08:45:00, every single minute have value 30

someone suggested i first create a CTE that computes the "valid_until" timestamp for each row (either the next timestamp or 10 days later, whichever comes first)[1]. this part runs really fast, but when i try to join it with a generated table of all minutely timestamps in a range (either using SELECT time_add(...) FROM table(generator(...)) or a fancy recursive CTE), the query performance drops significantly and it takes forever.

any recommendation?

[1] e.g.

DATETIME                VALUE     VALID UNTIL
2023-10-01 00:00:00    10        2023-10-05 12:30:00
2023-10-05 12:30:00    20        2023-10-12 08:45:00
2023-10-12 08:45:00    30        2023-10-22 08:45:00
brave bluff
#

With SQLAlchemy, anyone know how I can store a class object as json into a field? I saw sqlalchemy.types.json in the docs, but I'm not sure if there's a mechanism to automatically serialize/deserialize it into another class? Or point me where I should be looking in the docs?

#

SQLA is a serious PITA. I do not understand how it became so popular. I'm always ready to admit I'm the problem when it comes to a package, but the SQLA docs are so hard to follow...

coral wasp
#

Let's assume first that your data is rounded to the nearest minute (if not, round it)... that'll make this next part easy:

#

I'm going to write this in duckdb, since it's handy... but should be translatable to snowflake

#

create or replace table irregular_data as 
    SELECT DATE '2000-01-01' + INTERVAL (i) hours as datetime, random() val FROM range(2000) t(i)
    ;

create or replace table regular_intervals as 
(with q1 as (select min(datetime) as dt_start, max(datetime) as dt_end from irregular_data)
select unnest(generate_series(q1.dt_start, q1.dt_end, interval 1 minute)) datetime from q1)
;
with q1 as (
    select r.datetime, val, from regular_intervals r left outer join irregular_data i on i.datetime=r.datetime)
select 
    datetime, 
    last(val ignore nulls) over (order by datetime range between interval 10 hours preceding and current row) as filled_val
from q1
#

Idea is: given irregular_data that is not aligned to your desired regular_intervals, left outer join then use a window function.

#

** Requirement is that irregular_data is rounded to nearest minute, so the left outer join works with regular_intervals.

#

*** Alternative would be to use an asof join. Considered that, but this seemed easier to explain.

hollow oar
# coral wasp So, you have irregular data and you want to regularize it to 1 minute granularit...

indeed!

yeah if last / last_value supported range-based window then your example query is exactly what i would have written
unfortunately snowflake doesn't support that :\

i wonder if making a proper table instead of a transient one in CTE would make a difference, something to test tomorrow 🤷‍♂️


and yeah i am aware of asof join, it's also not quite performant, perhaps ultimately the issue is that we aren't using a proper table for the regular_intervals but generating on the fly

coral wasp
#

hmm, I have a dumb idea

lament parcel
coral wasp
#

Then, you'd add one more subquery to compare datetime to filled_date to see if the lagged value is within your window (and if not, null)

brave bluff
hollow oar
coral wasp
#

In other words, you first take the infinite loopback... then you replace any "too far backs" with null.

hollow oar
brave bluff
lament parcel
brave bluff
torn sphinx
#

Hi

median wave
#

Which database is most suitable for having a few read/writes very frequently (every couple seconds/minute)
but the records will be deleted after use too. sqlite is suitable right?

paper flower
median wave
median wave
#

And that answers it thank you. And I assume no problems with be caused by using aiosqlite right? I’m new to this as you can probably say

paper flower
#

How much is a few? Under ~100? You should be fine with sqlite

#

even with thousands too, but I didn't exactly benchmark sqlite myself

median wave
#

Yeah definitely under 100 thanks

heavy spindle
#

Does anyone here used Sqlalchemy only for query builder ?

hollow oar
# coral wasp Yes, you'd then add an if/case to check filled_date against datetime

I just took a proper look at the query profile, turns out it's the join that's causing the issue... It using 97% of wall time.

So what I did was to rewrite query to not do any join, just generate an array of minutely timestamp where the row is valid, unnest it and fill in the gaps application side.

1000x faster than whatever the hell we were doing now. Phew.

coral wasp
heavy spindle
abstract ruin
#

so an index in a dataframe is just a row header?

brave bluff
#

Can anyone recommend a migration manager that's tool agnostic? If I'm not using any ORM.

brave bluff
paper flower
brave bluff
paper flower
#

I mean, you can try whatever tool you like

#

I think alembic can use sqlalchemy stuff but only if it's installed

steady shard
#

Hi, if you are building or looking to build projects that include Azure Cosmos DB, please reach out to me. I am on the team and would love to share what you are doing or provide feedback to our team.

coral wasp
steady shard
coral wasp
#

Interesting. I'm more a olap guy, so generally shy away from this world, but it's interesting

formal epoch
#

i reinstalled sql workbench and now i can't find it on my pc

#

life is difficult

formal epoch
#

Guys? Trying to open it on my Mac and this pops up.

brave ingot
#

hey! I'm trying to avoid hard-coding relationship strings in sqlalchemy to make everything more type safe

for example, if I have such models:

class User(Base):
    id = Column(Integer, primary_key=True)
    is_active = Column(Boolean, default=True)

    items = relationship(lambda: Item, back_populates="owner")


class Item(Base):
    id = Column(Integer, primary_key=True)
    owner_id = Column(Integer, ForeignKey(User.id))

    owner = relationship(lambda: User, back_populates="items")

how do I avoid hard-coding values like "owner" or "items" of back_populates?

keen ravine
#

Anyone used weaveiate?

keen ravine
#

help I need to choose between milvus and weaviate

cedar tiger
#

I just use pgvector in those cases

cedar tiger
paper flower
# brave ingot hey! I'm trying to avoid hard-coding relationship strings in sqlalchemy to make ...

You wouldn't always be able to use lambda (e.g. when having circular references between files), also you're not using sqlalchemy 2.0 syntax, so your code isn't that type safe, maybe try using new Mapped and mapped_column:

class User(Base):
    id: Mapped[int] = mapped_column(primary_key=True)
    is_active: Mapped[bool] = mapped_column(default=True)

    items: Mapped[list[Item]] = relationship(back_populates="owner")


class Item(Base):
    id: Mapped[int] = mapped_column(primary_key=True)
    owner_id: Mapped[int] = mapped_column(ForeignKey(User.id))

    owner: Mapped[User] = relationship(back_populates="items")
#

That way mypy can check if you use your relationships correctly

pastel vale
#

Is there something similar to the Kindling project but for databases? I find it difficult to try to put a use case for a project when I dont really need to code something for my general life (if that makes sense)

craggy canyon
#

Hi all. I am just about to lose my mind. I’m trying to create a bi-directional many to many relationship between a Tag model and any number of other models via sqlalchemy. All of the examples in the documentation seem to generate a one-to-many. I don’t care if it creates multiple tables like user_tag, image_tag, etc, but I would expect those tables to essentially consist of just parent_id and tag_id. I do need to be able to query all tags. So User.tags, Image.tags, Tag.all, Tag.users, Tag.images, etc should all work. Does anyone have any examples of how this relationship can be constructed, ideally using something like a HasTags mixin?

harsh pulsar
#

That is, you have Foo and Bar and each of those has a one-to-many relationship with a 3rd FooBar

#

The first example there is a many-to-many relationship relationship without manually constructing the association table but you still need to manually write out something to create an association table

#

You could automate that with a helper function or maybe even a decorator if you design it carefully, but probably not a mixin without doing nasty metaclass stuff

paper flower
# harsh pulsar Wait, I stand corrected. https://docs.sqlalchemy.org/en/20/orm/extensions/associ...

I think they're looking for a "generic foreign key" pithink
@craggy canyon Basically if you want to be able to tag any model there are two options:

  • Junction table for each relation, this can guarantee database/fk integrity
  • Single table with a discriminator signifying which table/model this relationship belongs to

I would say first option is better since you can use foreign keys with it, and these junction tables pretty much can be autogenerated

hollow oar
#

any recommendation for database migration management? have used alembic in the past, wondering if other people have different opinions

and any recommendation for sharing database migration magnement-related code across many git repos? might do git subtree but wondering if it's a well-known solved problem

lean olive
hollow oar
lean olive
#

there are language agnostic DB-migrations-as-code tools, too

hollow oar
hollow oar
lean olive
#

it's a Git repo, so using branches for trying things out works fine

hollow oar
#

oh i see what you mean now!

spiral pebble
#

can anyone help me fix this error ?

spiral pebble
cedar tiger
cedar tiger
spiral pebble
#

this should have fixed it na

cedar tiger
spiral pebble
cedar tiger
spiral pebble
cedar tiger
spiral pebble
#

in my database

#

it is storing 0 or price

cedar tiger
spiral pebble
cedar tiger
spiral pebble
spiral pebble
cedar tiger
keen ravine
cedar tiger
keen ravine
cedar tiger
keen ravine
#

and can handle previous queries

cedar tiger
keen ravine
keen ravine
cedar tiger
cedar tiger
#

zeroes is also empty

keen ravine
cedar tiger
#

Hence the error

cedar tiger
spiral pebble
#

some items are also returing this

cedar tiger
keen ravine
spiral pebble
cedar tiger
keen ravine
cedar tiger
spiral pebble
cedar tiger
spiral pebble
keen ravine
cedar tiger
cedar tiger
# spiral pebble

!rule AI doesn't understand the context of your StatisticsError.

delicate fieldBOT
#

10. Do not copy and paste answers from ChatGPT or similar AI tools.

cedar tiger
#

!d statistics.mode

delicate fieldBOT
#

statistics.mode(data)```
Return the single most common data point from discrete or nominal *data*. The mode (when it exists) is the most typical value and serves as a measure of central location.

If there are multiple modes with the same frequency, returns the first one encountered in the *data*. If the smallest or largest of those is desired instead, use `min(multimode(data))` or `max(multimode(data))`. If the input *data* is empty, [`StatisticsError`](https://docs.python.org/3/library/statistics.html#statistics.StatisticsError) is raised.

`mode` assumes discrete data and returns a single value. This is the standard treatment of the mode as commonly taught in schools:

```py
>>> mode([1, 1, 2, 3, 3, 3, 3, 4])
3
```  The mode is unique in that it is the only statistic in this package that also applies to nominal (non\-numeric) data:
cedar tiger
spiral pebble
#

0.0 is not empty

cedar tiger