#databases
1 messages · Page 35 of 1
Then you purge the data for the given range in the file at the time you do the bulk insert
How would that help with duplicates? 
Jobs are ephemeral, no?
you can just use a timestamp instead in that case
To know if that batch has been processed already, that specific file
Or if it's from the realtime strea
yes, but there is no truly unique identifier for each trade
so...
I thought maybe i could create them and identify data dump entries from live trade data entries to make sure that identical trades, have a unique ID so if they are from different sources they can be identified to have already been recorded
or not recorded...
soo, i mean that if I put like....'dump' or 'live' at the beginning of each trade id that gets created for each entry...duplicates would be identifiable. so if the identical trade entries are already present from the live trade data, dump doesn't need to record to db. if dump trade entries are already present and identical. they are also, obviously duplicates.
Add a unique identifier for data sources: Differentiate between live data and data dumps by adding a column (e.g., "source") in your database to identify whether the data is from a live feed or bulk dump.
Handle timestamp precision: If possible, use higher precision timestamps (milliseconds or microseconds) to distinguish between trades occurring in the same second.
Create a composite unique key: Use a combination of timestamp, price, volume, and source to ensure identical trades are not incorrectly marked as duplicates.
Use PostgreSQL's ON CONFLICT: This allows you to skip inserting duplicate rows based on defined conditions.
Batch processing with checksums: Use a rolling checksum or hash of each trade to identify duplicates more efficiently.
This approach helps manage legitimate, identical trades while avoiding unintentional duplicates when merging data from multiple sources.```
^ thats what chatgpt had to say about the conversation we are having
I think I'm right
That's what I wrote 🙂
thats super funny btw
Sorry was super distracted by a game
no worries
You have two sources:
- Bulk data inserts
- Live, api-sourced (or whatever)
yes
You keep track of the job/task/whatever that inserted each row so you can tell if it came from a live feed or not
yes
When you process a bulk report, since you know it's for a specific time range and is complete for that time range, you delete everything in that range in the same transaction that inserts the new bulk report
okay, but how do I track time if the unix stamp from the raw data is only down to the second? the range of time would overlap, wouldn't it?
Depends on the guarantees of the bulk report
its the most granular data possible. they dont register deeper than one second
i may not be understanding that correctly though. could you elaborate please?
what guarantees do you mean
as in, are those all the trade that occurred? yes.
more precisely, even if they make a mistake, it's reflected in their charts and I need to make the same mistake so my charts reflect the same candle patterns
Just thought I'd share some SQL tips and tricks I shared with the folk in reddit.com/r/SQL:
https://www.reddit.com/r/SQL/comments/1fl1fvz/ive_put_together_a_list_of_some_sql_tips_that_i/
Do the reports overlap? It would be weird for them to publish overlapping bulk reports
No the reports from the bulk data do not overlap. The only concern is an overlap between the incremental live data that I'll be recording and layering in the quarterly bulk data to fill in the gaps
Like I said above you destroy the data you got from another source for the overlapping period when you get the bulk report
You only need incremental data for a period not covered by a bulk report
Oh my God I think I see what you're saying now
Bulk data kills live data
I'm so dumb
Yes, the bulk kraken reports are complete for that time period
I'm so dumb and you're so smart thank you so much.
Oh my God that's so f****** simple
Nah I'm just explaining it very poorly getting my ass kicked
I'm sitting here trying to figure out how to fit these chunks together down to the millisecond and ultimately it doesn't even matter
Because my program will just rewrite the candlesticks anyway
Thank you @obtuse halo 👍
Good luck in your game
I have script for web scraping in which i used function for scrolling web page but its not working is there anyone who can help me
Can you provide any function or any python code for scrolling webpage?
Open a help thread please, I don't understand your question
Ok
SQL lite my beloved fr
I'm finding way to convert sql alchemy row obj -> dict
Isn't there a function like this in sqlalchemy?
Remind me again, what does a row object look like?
like Instance of Model Class
class Model(Base):
tablename = 'models'
....
obj = Model()
obj to dict
Could you explain why you want to convert it to dict? 🤔
because depending on what you're using it for there could be better ways of doing that
Thats a model to a dictionary. You can already access the model's attributes
to create another class instance easily.
obj = Model()
model_dict = convert_to_dict(obj)
Another(**model_dict)
Is there really a guarantee that their fields will match? Is Another a sqlalchemy model too?
no
I asked two questions 😅
The translator app is slow to turn on.
It's a construct that creates a dataclass object that has a one-to-one correspondence with the model. The dataclass is easy to convert, but sqlalchemy doesn't provide a way to convert the model.
I thought that maybe you could use pydantic models instead but if you want dataclasses it would be easier to just convert your model to dict 🤔
def to_dict(model: DeclarativeBase) -> dict[str, object]:
mapper = inspect(model)
return {
attr.key: getattr(model, attr.key)
for attr in mapper.attrs
}
Thank you. However, after looking into some more data, I think this is a little faster than inspect.
def model_to_dict(model: Model):
return {c.name: getattr(model, c.name) for c in model.__table__.columns}
I don't remember if column.name could be different from attr.key
but
class Test(Base):
a: Mapped[int] = mapped_column("id")
model__dict__ => {'a': val}
model.table => {'id': val}
I guessed the result
hey yall I'm looking for some advice. I am working on building an app that is more or less and inventory system for multiple users to track their own personal inventory. My only real experiance with databases is SQL, I was wondering If SQL would work for this. For a little more context, users will be able to input item, what the item type is(i,e. tool, electronic, book, toy), what comes packaged with said item, and the condition of the items. My idea is that as people build out their inventories and categorize items it will offer suggested categories to other people inputting the same item to their inventory. I hope this makes sense, any help would be appreciated thank you.
I'm starting to learn SQLAlchemy, having worked a bunch with Django. First, am I alone in really disliking the official docs? I feel like it's really hard to figure out... More importantly, how do people handle helper functions in SQLAlchemy? I'm currently thinking about something like how Django does managers. For example, if I have a column with a timestamp for when something was deleted, when I query for existing values, I'll want to filter by where that field is null. Easy enough to do with Django managers, but not sure if SQLAlchemy goes more for the "write it into every query" style, or there's another mechanic I should know.
So my current way of storing data is using json. Problem is I went so far into coding my project to store it all in json that I would need to rewrite the entire format of my code if I wanted to get a real database, mysql. Heres how Im thinking about this... if Im running a website why not just store it locally on the same machine as json file so its more simplistc? Why should I get mysql? If I do run mysql should I create another script to just translate from json to mysql format like a middleware thing or should I simply just rewrite the entire code to directly just store into mysql. I like the way json is formatted its easier to read in my opinion and look at the data vs mysql. But I think its just because ive been doing json for so long thats the reason why. Any opinions please.
I was debating on having it both. Store as json locally so I can look into the data much easier and store it in mysql as the real deal
A few things there.
- JSON is slow. As the file gets larger, it'll get slower. It also runs into issues if you have two people using the app at the same time.
- If your current code already uses JSON and you just want to get to using a DB as fast as possible, having a translator is not a bad idea, though in general, you'd want to have code that does reading/writing separated anyway, for exactly this reason.
- I totally get that JSON is easy to read, but I'd argue SQL databases are just as easy, once you get over the inital "new" hump. There are plenty of GUIs that help out too.
- If you want to stick to JSON and use a database, NoSQL is an option, but it's not a replacement for SQL. Each has its own use cases.
tl;dr; Static files are fine for small things that won't change/won't change much. They don't scale though, and you need a database for that. If you're interested in programming at any scale, I strongly recommend pushing past the discomfort and learning DBs; you'll wonder why you didn't before.
Gotcha. So I do plan to scale in the future the database by alot. So I guess my final question is. Since I do plan on rewriting the entire code to just directly store into mysql as my main database for the website. Since im new to mysql and how the database is structured. I do have a bunch of nested lists/dicts. So when trying to layout how I want to save the data in mysql should I just have a bunch of tables for those nested lists?
Or have it all written out in 1 table
Generally nested = separate tables
In databases, we refer to what you're talking about as "normalization". N1 basically means you just have a giant table, each column representing a field, up to N6, which basically means you have a separate table for every value (it's far more nuanced than that, but a generalization). You generally want to aim for N3/N4, which means you group data by idea (into a table), and relate other tables via foreign keys. In JSON terms, if something is nested, the vast majority of the time, that means it should be a separate table, because each nested item represents it's own group of data.
For example, a json object representing a student: You'd probably have personal information about them at the top tier, such as name, phone number, address. While address is it's own data group, it's more closely related so the student's personal info, so you group all that into a table. Then you may have a list of classes. Now the classes aren't really linked to their personal info, and different students can have the same class. So you'd make classes their own table. Then you'd add a third table that links students with classes. I know that sounds like a lot, but I just mean it to help you think about how to break up groups of data.
Also, don't fret too much about over/under normalizing. You only learn good database design by doing it, IMO.
I'd just start by thinking of each nested structure as it's own table. If something is nested in a nested structure, it's likely it's own table, etc.
From there, I'd do some reading on normalizing, and look up how other software breaks up it's data.
Sounds good!
One BIG advantage to the database is you define types. It helps you make sure your data is consistent
In JSON, each object can vary, each value can be different.
In SQL, if a column is a number, only numbers can go there, so you know when you retrieve, you'll get a number. A table has a fixed definition, so you know when you get a row, you know what columns you can expect.
Is it possible I can set up the database locally on my computer before trying to run it on the server? As like a test.
Yah, 100%
Perfect
MySQL has installers
And if you want to do a small side track, containers are perfect for testing like this. You can learn docker basics to do what you're talking about pretty quickly. But that is another thing to learn, so don't stress over it.
Sounds good.
If you're on mac, homebrew has mysql
Just make sure you install the client and server
They're two different things: the server holds the data, the client is how you connect to the server
Did not know. I thought I just needed the client
Also, if you're in Python, I recommend Postgres over MySQL
How come?
A lot of Python tools have more native support for Postgres, wherehas MySQL requires additional packages
Oh. But is it the same concept as we talked about?
Mind you, if you're just doing your own stuff, it doesn't matter
PostgreSQL and MySQL are just two different flavors of SQL
If you learn SQL basics, it applies to any SQL database
There are just specifics in each you end up tweaking
I know basics of SQL but not crazy stuff
If you plan on staying in Python, I recommend go with Postgres. If you're just messing around, either one works. You won't go wrong.
Alright since my projects is mainly Python I will go with Postgres
You're only going to learn the crazy stuff by diving in 😄
lol
So yah, a LOT of tutorials on how to use Postgres with Python.
They'll teach you how to connect your python app to your postgres server. I suspect you can probably find tutorials that will tell you how to set up the postgres server too, but I can't guarentee that.
Best of luck!
The more I try to learn SQLAlchemy, the more I think I should just write the queries myself and use dataclasses data structuring...
My head hurts... https://jepsen.io/consistency
It's good for bigger CRUD-heavy OO applications, but for smaller things I agree 100%
Check out Sqlmodel if you like Pydantic
!pypi sqlmodel
I've been following sqlmodel, I just worry it's really early. I am using fast api, so it's convenient.
And yah, there's a lot of reading/writing, which is why I want an orm...
I want to learn Sqlalchemy, but oh man...
And unless I'm wrong, won't I be using sqlalchemy for the actual querying with sqlmodel? Doesn't it just wrap the orm part of sqlalchemy?
in-memory databases maintain a disk data structure such as B-tree and write the log entries periodically to the B-tree. After applying log entries to the B-tree, we purge them from the log. This is called checkpointing. Now, when the system restarts, we first load the checkpoint and write more recent log entries that are not included in the checkpoint to the memory.
so,in-memory database simply append the current data onto the checkpoint?
in a situation where checkpoint alr loaded into the disk
append the current data onto the in-memory database and that is set as the new checkpoint.
sqlmodel is built on top of sqlalchemy, so yeah
So then back to 1, learning sqlalchemy, which seems to be easier said than done.
I'm trying to follow this tutorial https://learndjango.com/tutorials/django-docker-and-postgresql-tutorial to set up docker and use postgres, but when I do docker-compose up I get the error
Couldn't import Django. Are you sure it's installed and available on your PYTHONPATH environment variable? Did you forget to activate a virtual environment?
I thought the entire point was to work in docker rather than a local virtual environment. Did this tutorial miss something?
Not a DB issue, but you copied the requirements file and the dockerfile exactly?
Yes. They both match. What is a better channel to put this question?
are there any good books or website on learning more advanced sql techniques?
practice problems online
The intermediate part of the SQL is https://sqlbolt.com/topics but best if you go through the previous topics (as a quick revision or see if you missed something)
SQLBolt provides a set of interactive lessons and exercises to help you learn SQL
import abc
from sqlalchemy.ext.declarative import declarative_base, DeclarativeMeta
class DeclarativeABCMeta(DeclarativeMeta, abc.ABCMeta):
pass
class Base(declarative_base(metaclass=DeclarativeABCMeta)):
__abstract__ = True
@abc.abstractmethod
def implement_me(self):
"""Canhaz override?"""
Will this code work in SQLAlchemy 2.0, or has there been a better way to do it since the version change?
how to get good at the art of database normalization?
i need a resource where i can practice lots of database normalization
🤔
Why do you want an abstract method there?
(just curious)
If you're using mypy you can just do
class Base(DeclarativeMeta):
@abc.abstractmethod
def implement_me(self) -> None:
"""Canhaz override?"""
class Test(Base):
pass
But there's a downside - mypy would only throw an error if you try to instantiate Test, you can also mark it with @typing.final, forcing mypy to type check it when you define it
That's Correct
There's really no benefit in using ABC baseclass/metaclass here, IMO
Unless you're instantiating your classes dynamically (which mypy won't catch)
Hi, I have some questions regarding the optimistic or pessimistic approach to database locking.
I found this good answer on stackoverflow:
https://stackoverflow.com/questions/129329/optimistic-vs-pessimistic-locking/129397#129397
But I have a couple of additional questions in case someone can help me:
- The optimistic approach means that records are not locked, while the pessimistic approach does lock records. Do these locks only imply not being able to write to them, or do they also imply not being able to read them?
- If the locks only imply not being able to write to them, what advantage does the optimistic approach have? If two operations are done almost at the same time, one of the two will end up being discarded due to the version change caused by the first.
A lock is more about making sure someone else doesn't modify the data before you finish. Optimistic db locking means you hope nobody else modifies the database before you finish. Pessimistic means you don't let anyone else. Theres many layers to this topic, like transaction isolation, etc.
And I thought my tables were wide. Wow.
wait a minute, 30k columns?
@ocean grail
Somehow I still didn't hit a 100
how many u got
99
Thankfully was able to reduce it to only 500 1700 after getting rid of sparse columns, but yeah that's what happens when you one-hot encode
you're using duckdb as a feature store?
What do you use?
Just a general question, I've been practicing my SQL (not my strongest point), I inserted into a table, 10,000 records, and did a simple select * and it took around 10-12 seconds to pull all the records, when i added a where clause for 3 columns, it dropped the time from 10 seconds to 0.5-1 second, the thing thats confusing me, is the filters did not change the dataset, it was the same 10000 record dataset from the select *, so even though the dataset is the exact same, how come the time difference is so large? can someone explain that to me?
Have you tried select * again?
I was testing it consistently, even when the select * is cached its still around 7-8s, but with the filters it drops to below 1s
can i use a alias for a subquery for sqlite? currently going through learn sql (3rd edition) and it uses mysql but for now i want to stick with sqlite until ive 'mastered' sqlite.
I guess I am
What was your faster query? (What you said is inconsistent, which is why I'm asking)
oh my bad, 1 second
SELECT * FROM events WHERE server_id = 1 AND state = 'ACTIVE' AND created_at BETWEEN NOW() - INTERVAL 30 DAY AND NOW();
This took 1 second for 10k records, however when i did the SELECT without the where clause it took 10 seconds, and it was the same dataset because all the server ids, event sates and created_at were all the same for each records, as i was just testing the query, was just mock_data
Parquet files honestly
Which I guess is queryable from Duckdb
Did you check the EXPLAIN output? Seems like the query planner was able to use the WHERE as some kind of optimization hint
What database was this?
Yah, that's basically us.
that's insane
I would like to know is duckdb for vector storing is realiable in python?
Wdym? Also; duckdb discord might be a better place to ask
is ROLLUP in sqlite? if not, whats the sqlite alternative? the book im going through uses mysql throughout
Can you give an example of what you have/want?
Idk. Im going though a book called LearnSQL (3rd edition) to learn more about what I can do in SQL. Theres certain features I cant use because theyre not available in SQLite. Just asking for the alternative way to make up for it
can someone help me in coding im new where do i even start i need soemone to teach me the basics
You could use duckdb; https://duckdb.org/docs/sql/query_syntax/grouping_sets.html
Union multiple group by's
That's what group by rollup, cube, and grouping sets is (conceptually): grouping at various levels and unioning the results
What database does the book use?
MySQL
why not just use that, then
Cba with setting up temp server. Ill stick with sqlite for now. Plus sqlite does what i need. Ill move onto other versions when im skilled with sqlite
just run a mysql docker container. pz ez
I'd like to know if I am strange to go crazy when I see something like this, am I the only one?
yea that's a strange db design indeed
It's not a very forward-thinking design
It's lazy
I wanted to say that its very excel of them to think that its very easy to change columns etc
There are so many stories out there of databases designed exactly like this, either by amateurs who didn't understand how to do it right, or by lazy sloppy pros who fetishize YAGNI beyond what would ever make sense
I forget whats YAGNI?
"You aren't going to need it" -- don't add features, interfaces, etc. speculatively, only build for what you know for sure that you need
ah ok
It's a good principle to avoid over designing too early in the process and ending up with wrong/bad abstractions, too much indirection, etc
But some people take it too far
If there's one argument in favor of this, it's that your database lookup is slightly faster
Always KISS it is what my PM usually say
Right
HR might want to hear about that

And this is partly a matter of knowing your domain. If this is recording temperature sensors from an IoT device, my own experience tells me that you should always plan for a variable number of sensors, because you never know when they will come out with a new version of the device that has more or fewer sensors
But if this is something more specific where you are definitely designing the database around a known-fixed set of sensors, then yeah it's actually fine
Ok now at least I feel less strange and alone in this mess xD
I see, the strange thing is that this is a database of lab test results, the other tests are designed with some criteria, I do no not why here did something completely different, they already had the structure to manage this 😩
Was this in fact migrated from an Excel sheet?
I have to check how the set it like this , it could be
I think the idea maybe was : temp1 2 3 are always the same they are test parameters, they should be always fixed... And then ah no let the user choose the temperature he/she likes
Plus the mix with complexity of LIMS software and 💥
But I think sometimes it doesn't need to be DB Experts... Today I see a person creating a simple table and in the first column add title info1/info2 ... Why why don't you simply add a new column????
as long as they normalize it afterwards
Sorry I don't know if I get what you mean... Do you mean being consistent with format info1/info2?
the process of organising a database properly is called normalization
Yes yes it was just an example of everyday work, ok you have to think how to organize a DB... But I think it is a matter of mental organisation not to mix different info inside the same table columns... Then why creating a table, just write some sentences:)
There's a difference between expertise and following basic conventions and established good practice
The problem is that scientists are busy being scientists, and either don't take the time to learn, or are never even introduced to good practice and don't even know they're doing it weird
The latter was a problem generally in scientific computing until maybe a decade ago
It's a little different now because more people care than they did in the past, but there was a long period of time when scientists wrote the worst most unmaintainable code imaginable
Ok I know I'm probably part of the category and then part of the problem (I think I wrote and still writing terrible code). But if the scientific culture does not get you the mind structure to create a simple table I think we have deeper problems
:incoming_envelope: :ok_hand: applied timeout to @torn sphinx until <t:1727708836:f> (9 minutes and 59 seconds) (reason: mentions spam - sent 10 mentions).
The <@&831776746206265384> have been alerted for review.
I'm new to databases, what's a different/better way to do this?
I cannot be very professional in telling you but at least be consistent in what you write. Do not put your information inside the columns. It would be better to have a column of temperature one of conductivity, one for diffusivity and one to take care of order/numbering 1,2,3,4
Db stories part 2
I have different tables formatted like this
Sample| test| parameter_1| par2...|res_nwne1|res_name2..
One table for every test because parameters and results are different from one test to the other, and one sample can have different results if parameters values are different
Now someone would like to have a table like
Sample| res1_test1|res2_test1| res1_ test2
And I cannot explain to him that it is not that easy to merge all tables in one and not lose parameters information (which could be essential for test result meaning)
if im using sql with python, are case expressions still useful?
why wouldn't they be?
idk. i feel id rather use python than sql for anything to do with conditional logic
generally, you want to push calculations into SQL, for efficiency
One way to look at it is; keep logic as localized as possible. Sometimes it makes sense to keep the logic and transformations all together in SQL, sometimes in Python. Sometimes half and half (step 1 in sql, step 2 in Python), etc.
One option: run_id | sensor_id | temperature | diffusivity and then you have sensor ids 1,2,3,4 within each run
But honestly, if you have a fixed set of sensors, and this is collecting data for a specific experiment, the "wide" data layout in the screenshot is actually 100% OK
it would be much worse for an application that might need to run for years and will probably change at some point
You missed thermal_conductivity but I think the "how to" is clear.
Anyone have any suggestions for a database migration tool for raw sql? As much as I want to get into sqlalchemy, it seems like just an overly complex abstraction of sql, to the point where I wonder why I shouldn't just use raw sql and dataclasses. But now I'm not sure how to manage my database, other than write/run sql files.
Anyone here familiar with postgres orm? I wanted an implementation of array field in postgres.
Which ORM? Postgres is a database.
oh I meant to say tortoise
We use Liquibase at work
there's an official container image too which is very useful for deployment
not sure if you figured it out yet, but you have to pass your tuple of arguments to Connection.execute() directly, no *args unpacking needed
otherwise aiosqlite will think your 5 letter string should fill 5 placeholders in your query
you dont need to at all
to clarify, if you were to do: py cursor = await conn.execute("SELECT ?, ?, ?", "Bob") then aiosqlite would pass 'B', 'o', 'b' to each three placeholders
if you put your strings inside a tuple or list instead, it'll be given to one placeholder each: py cursor = await conn.execute("SELECT ?, ?, ?", ("Bob", "Alice", "John"))
dont use f-strings
formatting strings directly to add data to your queries is bug prone and a security risk, placeholders are the easiest way to manage it
remember that *args in a function signature puts all your arguments into a tuple, so you can pass that directly to connection.execute(), nothing fancy required: ```py
async def execute(self, query, *args):
cursor = await self.connection.execute(query, args)
... # no * needed ^^^^
await db.execute("SELECT ?, ?, ?", "Alice", "Bob", "John")```
Why are you using SELECT?
oh yeah, you only want insert/update/delete for your execute method
the query can be whatever, but you want to give a tuple of arguments to aiosqlite, not unpack * your arguments
When working with rest api, how do we validate or stop user from deleting data that doesn't belong to them?
- API design
- Validation and constraint checks
Hmm, I will just try to do the basics for now. Will revisit validation and constrain check later.
tbh, if you api is like /<account_id>/<doc_id>
It will be quite difficult to delete or retrieve any doc not belonging to your account
that's already a good start
@keen minnow I'm just thingking it wouldn't be hard for other players to find out other player account id
so?
same principles
Would other player able to do Delete if they have other people info?
Right but that's where my brain shut down.
if your APIs have the player id in the URL
So my understanding is api is stateless? No hand shake what so ever right?
So I need to figure out the validation bit
it can be quite a loaded statement
the point is it can scope the operation
I see, well not gonna be worried too much about that for now. I have yet touched api in my life and about to start.
if your API is like /<player_id>/etc. you can scope out any operation on that
best time to start is today
the point is to enforce/scope out the operation on an account/player
so that anything outside of it would not work
I will keep that in mind 🙇♂️
Been stuck with console project in python.
Hopefully making a simple data base isn't a huge leap
can someone give me a better definition of what window functions are?
Better than what? Here is the SQLite docu: https://www.sqlite.org/windowfunctions.html
Hello, your post has been removed for advertising
Aahh... It's not a ad but I got ur point
Should I just say ..
Looking for team mate for a hackathon(django+js+web3)
reason, time = await self.db.fetch(f"SELECT reason, time FROM staff WHERE employee = {id}")
The first one
!sql
Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.
Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.
For example, the sqlite3 package supports using ? as a placeholder:
query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)
Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.
See Also
- Python sqlite3 docs - How to use placeholders to bind values in SQL queries
- PEP-249 - A specification of how database libraries in Python should work
Just using it to fetch data and bot insert or delete command
Don't (don't use an f string), just learn to pass a param. Not hard.
Postgresql. Usable for everything and good serving until very very very large sized applications.
for most people it is the only db they need
Everyone knows what I'd say
(Postgres, because of https://github.com/duckdb/pg_duckdb)
whats a good transactional database project I could do? Im thinking of doing a weather based project where I use the openmeteo api to extract real time data over a period of time, store the data in an SQLite database and then create a graph from the data
Yah actuallly
Can someone help me?
async def execute(self):
statement = SimpleStatement(self.query)
return self.session.execute_async(statement, (4, "Cassandra"))```
```py
async def main():
async with Cassandra("INSERT INTO test (id, test_value) VALUES (%s, %s)") as cassandra:
await cassandra.execute()```
```py
Traceback (most recent call last):
File "C:\Users\vhipn\IdeaProjects\Cassandra\main.py", line 40, in <module>
asyncio.run(main())
File "C:\Users\vhipn\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "C:\Users\vhipn\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 649, in run_until_complete
return future.result()
File "C:\Users\vhipn\IdeaProjects\Cassandra\main.py", line 35, in main
await cassandra.selectAll()
File "C:\Users\vhipn\IdeaProjects\Cassandra\main.py", line 27, in selectAll
self.session.execute(statement, 4, "mariadb")
File "C:\Users\vhipn\IdeaProjects\Cassandra\venv\lib\site-packages\cassandra\cluster.py", line 2677, in execute
return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state, host, execute_as).result()
File "C:\Users\vhipn\IdeaProjects\Cassandra\venv\lib\site-packages\cassandra\cluster.py", line 2720, in execute_async
future = self._create_response_future(
File "C:\Users\vhipn\IdeaProjects\Cassandra\venv\lib\site-packages\cassandra\cluster.py", line 2979, in _create_response_future
query_string = bind_params(query_string, parameters, self.encoder)
File "C:\Users\vhipn\IdeaProjects\Cassandra\venv\lib\site-packages\cassandra\query.py", line 905, in bind_params
return query % tuple(encoder.cql_encode_all_types(v) for v in params)
TypeError: 'int' object is not iterable```
fixed
Open a help thread plz: #❓|how-to-get-help . ... nm, they said fixed
any one contribute
ping me fr questions
This is way too close to an ad
And it's not complex enough to be that interesting
no
Hello
I am Cools9 i am here with a project for intermediate python devs it is a db called elementaldb its simple and fast and made in python for those who want to see it check www.github.com/cools9/ElementalDB
Pinh me fr it
*Ping
This seems like an ad, also why would someone use your project instead of postgresql, etc?
its faster
Can I see the benchmarks?
lol, right. Storing the "database" as JSON files will certainly not be faster than using a real DB.
it is faster
well thats how mongodb stores it data what do you say about that
its not a real db
My db faster than sql lets goo
that has nothing to do with how mongodb work
[citation needed]
this is far to primitive to be compared with mongodb
!rule 6
you should see authorization prior to advertising your work
for god sake isnt there a colab or showcase channel
none as far as I know
no one will believe it until they see a basic benchmark at least
i shared the benchmark
you shared a random screenshot. That has nothing to do with what benchmarks mean
thats proof
nuh uh
that its faster than mysql
You shared a screenshot, people need code that they can run locally to verify your claims
Also you compared it with sqlite, not mysql
Even if fetching a single row is faster, what about fetching multiple rows? Aggregations? Does it support ACID, Locks, Scaling (Vertical/Horizonal), Indexing, Searching, defining schemas?
What if your db contains 1000 rows? 1 million rows?
they have nothing in common besides sql in the name
Would it still be faster?
well most of them
well i tested that too
well how else are you supposed to use sql in python
without sqlite
sqlite is not mysql
https://dev.mysql.com/doc/connector-python/en/
Get mysql and a python connector for it (e.g. mysqlclient, pymysql, aiomysql, etc)
still faster than sqlite
Key difference between your project and sqlite is that sqlite is far more feature rich and for example data won't get corrupted in case of power outage or similar events
ok
Also, you're not faster than sqlite. You might be faster in a tiny example, but in a proper benchmark sqlite will beat you easily.
no clue
Any snowflake experts or anyone with a good snowflake cheat sheet?
I work with snowflake, what do you need to do?
I didnt undertand you very well, do you need tips to try to make the queries on VS, and try to use them locally?
sqlite is severless isnt?/
It is
I learned them through their docs and books
books?
LMAO
I don't do much with cassandra, but my search strategy for new topics includes some docs but also conference talks, such as: Cassandra Summit 2023... and now I'm watching this retrospective: Apache Cassandra: What’s New and What’s Next
i mean can google for the docs like for sure
im also learn programming that way
but it just doesnt gives me the information i wan
that you guys might hv
btw, thanks for not being classic toxic
: )
idc whatever the mod gonna do with me
but it just so classic that this ppl when they cant answer iinway more friendly like as if they're some prodigy kid that has solved PhD level math
"how cassandra works" is such a broad question that, almost necessarily, it starts with explaining the original academic (phd level) paper.
It's like asking how magnets work
im not complaining you
I understand, I'm just saying that some questions are so broad that you won't get a good answer
what information do you want
the original dynamo paper in 2007 is fairly readable. Cassandra is based off that design
Anyone familiar with sql alchemy know if the query_class still holds for v2? Looking through the docs, I don't see any examples in the v2 docs, and the only documentation on it is one sentance long and points to documentation that doesn't mentions query_class and is prefaced with a note saying that technique is legacy.
When googling, I'm seeing multiple techniques for soft delete, but the query_class one is the only one that seems to make sense to me (others feel overly complicated)
!d sqlalchemy.orm.relationship
function sqlalchemy.orm.relationship(argument: _RelationshipArgumentType[Any] | None = None, secondary: _RelationshipSecondaryArgument | None = None, ...) → _RelationshipDeclared[Any]```
Provide a relationship between two mapped classes.
This corresponds to a parent\-child or associative table relationship. The constructed class is an instance of [`Relationship`](https://docs.sqlalchemy.org/en/stable/orm/internals.html#sqlalchemy.orm.Relationship).
See also
[Working with ORM Related Objects](https://docs.sqlalchemy.org/en/stable/tutorial/orm_related_objects.html#tutorial-orm-related-objects) \- tutorial introduction to [`relationship()`](https://docs.sqlalchemy.org/en/stable/orm/relationship_api.html#sqlalchemy.orm.relationship) in the [SQLAlchemy Unified Tutorial](https://docs.sqlalchemy.org/en/stable/tutorial/index.html#unified-tutorial)
[Relationship Configuration](https://docs.sqlalchemy.org/en/stable/orm/relationships.html) \- narrative documentation
Doesn't seem to show it but according to this, its towards a Query
Oh, I didn't think to look at relationsnhip for that. So seems like I can use query_class to build a soft delete mechanism.
What's the best way to store password salt bytes in sqlalchemy? base64 encode? LargeBinary type?
Text. And I like how this article explains it: https://variable-scope.com/posts/storing-and-verifying-passwords-with-sqlalchemy
I really enjoy abstractions. Abstractions are the lifeblood of programming. They take complex operations and make them easy to work with through accessible interfaces. This article will be about doing that with the way we store and verify passwords (or rather their cryptographic hashes) in (web-)applications based on SQLAlchemy. And in such a wa...
Just text is fine
most password hashing algorithms already store salt with the password, don't invent you own hashing algorithm
e.g. argon2 uses PHC format https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md
And bcrypt uses similar structure, but it only includes one parameter into the resulting hash string
If it requires a Query class/subclass then it's probably deprecated 
Query was deprected in 2.0 in favor of using select/Select
Can you expand on that? I've seen examples of how query_class can add where segments onto a query. How would I do that with a select? And I'm also considering I'm looking at this wrong. I'm used to Django and thinking how I'd do a soft delete there. I'm guessing there are different mechanisms/patterns I should be following for sqlalchemy.
Explain what exactly you want to do, as I didn't really use Query api, it already was superseded by select in version 1.4, roughly when I started using sqlalchemy 🤔
Code example would be great too
I'm looking to build a soft delete mechanism. I don't have code yet; I'm brand new to SQLAlchemy so learning how it works. But I found these two examples of methods:
https://blog.miguelgrinberg.com/post/implementing-the-soft-delete-pattern-with-flask-and-sqlalchemy
https://theshubhendra.medium.com/mastering-soft-delete-advanced-sqlalchemy-techniques-4678f4738947
The second, using a listener, seems an overly complex way of doing things. The first seems like a really reasonable way to build the mechanism, but it's also 8 years old, so I'm not sure if applies to v2
There's actually an example of how you can implement soft deletes with sqlalchemy in their docs
But I don't entirely remember if it plays nicely with relationships
Oh, and I just noticed the first one has code that doesn't exist anymore
(Yes, via a listener)
Ooof, a listener makes sense but feels so needless. At that point, I could just create a manager to setup queries
Or a helper function
@event.listens_for(Session, "do_orm_execute")
def _add_filtering_criteria(execute_state):
"""Intercept all ORM queries. Add a with_loader_criteria option to all
of them.
This option applies to SELECT queries and adds a global WHERE criteria
(or as appropriate ON CLAUSE criteria for join targets)
to all objects of a certain class or superclass.
"""
# the with_loader_criteria automatically applies itself to
# relationship loads as well including lazy loads. So if this is
# a relationship load, assume the option was set up from the top level
# query.
if (
not execute_state.is_column_load
and not execute_state.is_relationship_load
and not execute_state.execution_options.get("include_private", False)
):
execute_state.statement = execute_state.statement.options(
orm.with_loader_criteria(
HasPrivate,
lambda cls: cls.public == true(),
include_aliases=True,
)
)
As you see it's not that complicated
Oh, def not complicated
class HasPrivate:
"""Mixin that identifies a class as having private entities"""
public = Column(Boolean, nullable=False)
It doesn't seem to have been updated to use new syntax though
A lot of the docs use the old syntax 😛
Should be
class HasPrivate:
public: Mapped[bool]
Yah, that's more or less what the second link I provided above does
It's not complicated, but I'm not sure how it's better than
def SoftDeleteQuery(my_class):
return select(my_class).where(deleted == False)
(that was thrown together really quick, so please ignore the syntax errors :P)
Won't you have to add that to every relationship?
I'm suggesting using that as a replacement for select(Model)
While you can just add that mixin to the classes you need
Ah, but that won't filter the relationships 
Yah, you'd want a mixin to add the column
Right, it becomes more explicit, at the downside of you need to track what's deleted and not.
Like I said, I may be thinking about it wrong, since I'm new to SQLAlchemy
I don't dislike the listener
To make it more explicit you still can set include_private to True by default
But generally use whatever you find easier to use
Yah, a param of some sort
For soft deletes it may be simpler to just create a base query in your repository if you use that pattern
But I'll look at the listener more... it just feels like a lot for a little
e.g.
_base_stmt = select(Model).where(Model.is_deleted.is_(False))
Yes! That's the sort of thing I'm looking for! Of course, then how do you override that when you need everything.
I really do wish the docs were better
Finding something like that would take forever as someone new to the code
Honestly I don't think you can easily do that, I guess that's why listener could be better 😅
Yah, I'll read more on the listener
Is putting the old instant yeast in warm water and sugar the best way to tell if it's still good?
Thanks, taking a look!
Thanks
Hello Mongo db user here
Is there any particular reason why something like this would be unsafe?
from sqlalchemy.orm import DeclarativeBase
from pydantic import BaseModel
class Input(BaseModel):
"""Base input object for ORM mapped classes."""
T = TypeVar("T", bound=Input)
class Base(DeclarativeBase, Generic[T]):
def __init__(self, base_input: T) -> None:
input_dict = base_input.model_dump()
for key, val in input_dict.items():
if hasattr(self, key):
setattr(self, key, val)
The idea being that inheriting mapped classes can just pass in a pydantic validated object to the __init__ - I suppose I could add an else that raises an error at the ORM level where hasattr is False
I'd generally advocate against stuff like this because there's no guarantee that your models are compatible between each other
So long as my inheriting classes are correctly typed, how would that be different from passing each value separately as an argument to __init__?
e.g.
class PhoneInput(Input):
number: str
phone_type: PhoneType
extension: str | None = None
class Phone(Base[PhoneInput]):
__tablename__ = "phones"
phone_id: Mapped[int] = mapped_column(
primary_key=True,
autoincrement=True,
)
number: Mapped[str]
phone_type: Mapped[PhoneType] = mapped_column(Enum(PhoneType))
extension: Mapped[str | None] = mapped_column(nullable=True)
I don't understand the square bracket syntax of this line class Phone(Base[PhoneInput]):. What's that indicating? Base of type PhoneInput? But that doesn't make sense for inheritance?
i dont understand it, but it seems too cool
if somebody can explain the ussage
I think it's a case where Base is a generic
SQLAlchemy question: from the docs alone, I'm not really getting the WHY of sessionmaker over Session(engine) . It says it's for when you have consistent configs, but like what? I suspect this is mostly because of how bad the sqlalchemy docs are, but coudl someone direct me?
the idea is that you can abstract over the specific engine you use
As in two database types in the same code? I guess I'm still not getting it :/ Sorry.
Yah, I read that, and it's still not clear to me. As always, the "example" they give doesn't mean much.
if you use a sessionmaker, you can do Session(). if you don't use a sessionmaker, you need to do Session(engine)
Yah, heh, that's obvious, so it's one less import 😛
Another SQLAlchemy question. If I want to make a mixin class, am I better doing:
class SoftDeleteMixin(Base):
__abstract__ = True
deleted: Optional[Mapped[datetime.datetime]] = mapped_column(default=None)
or
class SoftDeleteMixin:
deleted: Optional[Mapped[datetime.datetime]] = mapped_column(default=None)
Inherit from the DeclarativeBase and set the class as abstract or just make a standard class?
i've been thinking bout database internals and indexing strategies. I've not actually looked under the hood of databases yet but I want an opinion if the strategy i describe would be viablee!!
each table column get a BST which holds indexes in the main table structure
and each row item has a row_id, the items in the BST are sorted based on the column values, and each node holds a row id to the said row
so, can someone, by this logic, create a O(log(n)) time complexity based indexing system for a table using this strategy?
so a query like SELECT * FROM table WHERE name = 'abc' should conduct a search in the BST of the name column, once it finds the row_id for that, it should be able to return all columns from the table structure using that row_id as an index.
and since insertion in a BST is also a O(log(n)) operation, that should also be relatively easy
data retreival and insertion would be a O(m log(n)) operation for a table of columns m, and rows n
(pls ping)
I don't know much about internals, but IIRC indexes are stored separately from tables and, at least in case of postgres usually btree is used (https://en.wikipedia.org/wiki/B-tree), and If I remember correctly the upside is that you can have multiple children on single node, thus reducing amount of memory/disk reads
I'd say there's no need to inherit from base
that's so good to knoww!! i was just thinking how db works with this stuff so fast!!
hey, anyone has experienced in Apache Beam? especially WriteToJdbc
this captures the basic idea pretty well, and you technically could use a BBST for an index. as Doctor said though, most commonly you'll use a BTree, because they have a high fanout, leading to shallower trees, leading to less IO operations, since each node takes an IO operation to load into memory.
also, generally, you won't have an index for each column, though you could. you usually only put indexes on columns that you want to search by, because maintaining indexes increases the cost of insertions and deletions, which you saw with the O(m log n) time complexity
mmm, i'll look into BBST and BTree rq, ty
a bbst is just a bst that balances itself
indeed, i just gotta see how it functions
Would you guys recommend working when working with databases to use SQLAlchemy?
Made my own php .txt database hmu if you're interested
It's programmable like mysql
I'd really like to learn more about using ORM in SQLAlchemy for work, specifically with Oracle DB. I'm not sure I can easily buy or make a mock setup of Oracle DB at home, though. Is there a good way to practice this?
You can download and install Oracle. I believe there's a docker container too.
It's not too hard, no harder than installing postgresql
It won't be hard if you use docker to setup postgresql. https://hub.docker.com/_/postgres/
I know this is a common pattern in a bunch of packages, but I guess I never bothered to understand the underlying technique, so... I'm trying to set up a foreign key in sqlalchemy:
owner: Mapped["models.user.User"] = mapped_column(ForeignKey("models.users.User.id"))
All my models are in the folder models. I can't import the models directly into each other due to circular referencing, so I'm using the string format for referencing. But I'm not sure if I should be putting the full path to model, a relative path, or just the class name. I'd love some direction.
The FK should look like this:
ForeignKey("user_account.id") where user_account is the defined table name and id is the primary key of the other table
So it's the table, not the model?
Yeah
And that applies to the Mapped[?] as well?
The full line is user_id: Mapped[str] = mapped_column(ForeignKey("user_account.id"))
Well, int, but yah
In my case the primary key is a UUID
Yah, makes sense
Ok, that helps
As usual, why are the docs not more clear...
Thanks for helping make that clearer
@tranquil aspen Sorry to ping directly, but as you seem to know about this, I think my previous question confused ForeignKey and relationship. The docs have this example:
class Parent(Base):
__tablename__ = "parent_table"
id: Mapped[int] = mapped_column(primary_key=True)
children: Mapped[List["Child"]] = relationship(back_populates="parent")
class Child(Base):
__tablename__ = "child_table"
id: Mapped[int] = mapped_column(primary_key=True)
parent_id: Mapped[int] = mapped_column(ForeignKey("parent_table.id"))
parent: Mapped["Parent"] = relationship(back_populates="children")
So ForeignKey uses the table.column notation as you mentioned. This would obviously create a column in the table. The relationship I'm guessing won't, and is instead used to get matching relationships in code? There, the Mapped["Parent"] seems a reference to the class; so if the class is in a different file, how do I refer do it? And I'm guessing I need a relationship on both tables for this to work?
Yeah the relationship part works differently
Based on your naming it seems you are looking at the same post I used to make my tables, so you'll recognize the comments I used to remind me of how it works:
# Child
class UserSession(Base):
__tablename__ = "user_session"
id: Mapped[str] = mapped_column(String(36), primary_key=True, default=str(uuid4()))
created_date: Mapped[DateTime] = mapped_column(
DateTime(timezone=True), server_default=func.now()
)
updated_date: Mapped[DateTime] = mapped_column(
DateTime(timezone=True), server_default=func.now(), onupdate=func.now()
)
# parent: Mapped["Parent"] = relationship(back_populates="child")
# parent_id: Mapped[int] = mapped_column(ForeignKey("parent_table.id"))
user: Mapped["User"] = relationship(back_populates="user_session")
user_id: Mapped[str] = mapped_column(ForeignKey("user_account.id"))
UniqueConstraint(user_id)
def to_json(self):
return {
"id": self.id,
"created_date": str(self.created_date),
"updated_date": str(self.updated_date),
}
def uuid_str():
return str(uuid4())
# Parent
class User(Base):
__tablename__ = "user_account"
id: Mapped[str] = mapped_column(String(36), primary_key=True, default=uuid_str)
username: Mapped[str] = mapped_column(String(16), nullable=False)
password: Mapped[str] = mapped_column(Text, nullable=False)
salt: Mapped[str] = mapped_column(Text, nullable=False)
created_date: Mapped[DateTime] = mapped_column(
DateTime(timezone=True), server_default=func.now()
)
updated_date: Mapped[DateTime] = mapped_column(
DateTime(timezone=True), server_default=func.now(), onupdate=func.now()
)
active: Mapped[Boolean] = mapped_column(Boolean(), server_default="t", default=True)
# child: Mapped["Child"] = relationship(back_populates="parent")
user_session = relationship(UserSession, back_populates="user", uselist=False)
In the above example each UserSession is linked to a user id
Yes
Yah, this is where I'm getting thrown:
permissions: Mapped[List["Permission"]] = relationship()
Pylance is throwing a squiggly line under User saying it's not defined. But I'm not sure how to handle that.
Show code
Eer, that is the code. You mean the two separate classes involved? I can do that in the morning, as it's late for me, heh.
In models/role.py I have
class Role(Base, TimestampMixin, SoftDeleteMixin):
__tablename__ = "roles"
name: Mapped[str] = mapped_column(String(64), unique=True)
plural: Mapped[str] = mapped_column(String(64), unique=True)
owner_id: Mapped[int] = mapped_column(ForeignKey("users.id"))
permissions: Mapped[List["Permission"]] = relationship(
secondary="role_permissions", back_populates="roles"
)
I don't import permission there, as it would cause a circular dependency.
@brave bluff Sorry, but what's the issue? 
Pylance is throwing an error on Mapped[List["Permission"]] saying Permission is not defined.
Import permission in a if TYPE_CHECKING block
if typing.TYPE_CHECKING:
from somewhere import Permission
This is needed for type checkers to understand where that type actually comes from
Ah, but won't cause the circular imports?
No, TYPE_CHECKING is always False at runtime
Is this new? Or just something I somehow haven't run into before
It's quite old 
Oh, so it's not a literal import, it's just a definition
Added in version 3.5.2.
I don't know how I've never run into this before
Yah, wild
I've been doing Python for like 5 years, heh, but maybe this would have come up sooner if I had started SQLAlchemy earlier
Thanks
@brave bluff Here's an example btw:
https://gitlab.com/dpy.su/manga/-/blob/ce21ba4dcedc9a5f3e5ad901dd94345dc5367772/src/app/db/models/manga/_manga.py#L30
src/app/db/models/manga/_manga.py line 30
if TYPE_CHECKING:```
Thanks
Yah, seems clearer now
Oh, I guess the professional projects I've been working on have been older and using minimal type checking, so that's probably also why I havent run into this.
Good to know. I assume this will come up a lot more as I add more type checks
Just configure ruff with all rules enabled 
Not sure what ruff is, but I'll look into it
An extremely fast Python linter and code formatter, written in Rust.
ruff format can replace black and isort
Oh, that'd be nice, heh
I personally use mypy, but coworker says pylance is not bad either
It's not a type checker btw, just a formatter and linter
Yah, I'm trying to learn more about tooling, as it's always been set up at a company before I got there
I have taskfile set up in that repo, it should match the CI: https://gitlab.com/dpy.su/manga/-/blob/ce21ba4dcedc9a5f3e5ad901dd94345dc5367772/Taskfile.yml
But it's a bit outdated and uses black
Also look into uv 🙏
Yah, been seeing uv come up a lot lately
I've been using pipenv for a long time
But it's resolver is meh at best
I'm thinking of switching my project to uv instead
Yo who’s good at SQL
Just ask your question. You're literally in a database channel.
What's the SQLAlchemy way of setting a value on save? I have a field in model that is dependent on another field. I could obviously set both any time the primary is set, but that's redundent. If it were a normal class, I'd use a getter/setter, but I'm not sure how I'd do that with sqlalchemy. In Django, I'd use the save method, but not sure how I should with SQLA. Is it just "make a method"?
I think hybrid_property is the mechanism to use here. Still curious about checks before saving, to test data is set, for example
You .commit() the query. https://docs.sqlalchemy.org/en/20/orm/quickstart.html#create-objects-and-persist
Not how to save a model, but rather how to do something before saving.
Like this?
from sqlalchemy.orm import Session
with Session(engine) as session:
spongebob = User(
name="spongebob",
fullname="Spongebob Squarepants",
addresses=[Address(email_address="spongebob@sqlalchemy.org")],
)
sandy = User(
name="sandy",
fullname="Sandy Cheeks",
addresses=[
Address(email_address="sandy@sqlalchemy.org"),
Address(email_address="sandy@squirrelpower.org"),
],
)
patrick = User(name="patrick", fullname="Patrick Star")
session.add_all([spongebob, sandy, patrick])
session.commit()
No, not the example out of the docs. As I said above, I have a field who's value is dependent on another. So some mechanism to modify/check the model before it's saved.
If you can see from the example, they use classes to check
To check what? The example sets values and saves them. It does no checking of any sort.
Its a very simple example. So obviously, you've gotta set your own setter/getter logic behind the classes
As long as the type matches, it's fine.
Are you looking at the insert or update?
If its an update, wouldn't you need to do a select query and find if the value entered matches the value from the db?
If its an insert, surely you would need some sort of guard that the value or format of the data input is within an acceptable range or format
like emails for example, you would need regex
hello, i need help with mysql and python connection
here is my code:
from mysql.connector import connect, Error
try:
with connect(
host="localhost",
user=input("Enter username: "),
password="123456"
) as connection:
print(connection)
except Error as e:
print(e)```
i was having problems installing and doing stuff with mysql connector at first, but now that i've done everything, i dont even get any outputs, no error either, nothing.
Thanks for the help in advance!
MySQL server version - 8.0.39
Using - VScode or IDLE. spyder doesnt even know that mysql.connector exists, returns error
What error do you get?
!trace
Please provide the full traceback for your exception in order to help us identify your issue.
While the last line of the error message tells us what kind of error you got,
the full traceback will tell us which line, and other critical information to solve your problem.
Please avoid screenshots so we can copy and paste parts of the message.
A full traceback could look like:
Traceback (most recent call last):
File "my_file.py", line 5, in <module>
add_three("6")
File "my_file.py", line 2, in add_three
a = num + 3
~~~~^~~
TypeError: can only concatenate str (not "int") to str
If the traceback is long, use our pastebin.
Are SQL and MySQL the same thing?
mysql is a specific SQL database: https://www.mysql.com/
SQL is the language.
if in doubt, you probably just want Sqlite. It's the right choice in most cases. And when it's not, consider postgresql.
as i said, there is no error, but there is no output either
look, no output
as per my code, i believe it should printing connection variable or e variable, but it doesnt
Add additional print statements to see what's happening. It's also possible you're not running the code you think you're running.
i am running the code i pasted above in my original message, where exactly would you suggest me to add the print statements?
i believe its not working after the
try:
with connect(
host="localhost",
user=input("Enter username: "),
password="123456"
) as connection:```
since any print statement after that doesn't work, but it does ask me for the username input
again, please do tell me where should i add print statements if there is anywhere it should be added
Try obtaining the username via input before the with statement
still no output, no errors, works the same as before
Add print statements before and after your code, for starters
Also, change except Error to except Exception.
connection = sqlite3.connect('business_management_system.db')
cursor = connection.cursor()
createtable = """CREATE TABLE IF NOT EXISTS
bms(username_id TEXT, password_id TEXT)"""
cursor.execute(createtable)
def loginaccount():
password = password_field.get
username = username_field.get
cursor.execute("SELECT username_id, password_id WHERE username_id, password_id = username, password")
if password == password_id and username == username_id:
accessgranted = true
def createaccount():
password = password_field.get
username = username_field.get
cursor.execute("""INSERT INTO bms(username_id, password_id)VALUES(username_id, password_id)""")
bms.commit()
bms.close()
does anyone know how to fix 'column does not exist' when i try to store account details to my database?
What's wrong with this query: cursor.execute("SELECT username_id, password_id WHERE username_id, password_id = username, password") || hint: multiple things, but let's start with - no from clause ||
See above
How does your except block looks like? 🤔
Hello need help in CrewAI with Memory main issue is Database is not saving the data if anyone allow me should I explain my complete problem
Read #❓|how-to-get-help
There are two tables T1 and T2, an entity from T1 must be related to at least one entity of T2. An entity from T2 may be related to at most one entity of T1.
is that correct?
i think its meant to be read the other way around, i.e. T1 may be related to at most one of T2, and T2 must be related to at least one T1
The text is what’s given to me
I need to make the diagram
oh oops
Does that mean my line is the wrong way round
I don't think there's a good way to guarantee that T1 relates to at least one T2? But otherwise it looks like a typical one-to-many relationship
I asked the TA and he said he’s never seen those symbols
And I should just use numbers
So like if it says at most 1 just put a 1 on the top of the line
They're not uncommon, but you can use numbers too
So for that question do I just put 1 on each end of the line?
Each?
Doesn’t the text describe the relation from t1 to t2 and t2 to t1
So each side needs a number or am I wrong
Yeah, but it says at least 1 and at most 1 on the other side
One wide is 1 to many and the other side is 0 to 1
So I put 1:N and 0:1 on either side
Yeah
Okay thanks last thing the 1:N do I put that on the side of the T1 table
Or the other table
How can I talk in the room?
I think u need a certain level
to my understanding, you're meant to read the cardinality at the end of the line, so if you were to write T1 1:N ---- 0:1 T2, it would say "a T1 relates to zero or one T2", and conversely "a T2 relates to one or more T1s"
Alright thank you so much
is there a faster way of converting a csv to sql database for SQLite? I know theres a faster method using postgreSQL but Im using SQLite atm for practice and getting used to database management.
nevermind, i reinstalled everything and it got fixed
apparently i had to add python to path and also install mysql shell
thank you
what is your current method
I just use to_sql. Simple and does the job. Its not slow but for the future, if i working with a much larger dataset, want to learn a faster and efficient method for csv to sql conversion
To sql usually generates terrible inefficient sql, for a few reasons
The faster method is usually using a native loader: most database have some direct (fast) way to ingest a csv file
And, in some cases, doing batch updates rather than inserting record individually
For some dbs, there's a multi method in to sql: https://pandas.pydata.org/pandas-docs/dev/reference/api/pandas.DataFrame.to_sql.html
This is somewhat a rathole of incomplete work, but when it works it works pretty well
Why does it need a from if the stuff is from user input?
You're saying: select .... where id = someinput , right?
From what table does the id come from?
I’m trying to make the input box put the username and password into the database and then it lets you login
Just focus on that one line of code. What is it supposed to do?
Check if the input matches one in the database
What table in the database is it checking?
Bms the one I made above
But bms is not in the query.
Does it have to be?
Yes
What about the create account function?
Okay
Also, you have to learn how to pass parameters to your queries
!sql is a good instruction
Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.
Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.
For example, the sqlite3 package supports using ? as a placeholder:
query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)
Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.
See Also
- Python sqlite3 docs - How to use placeholders to bind values in SQL queries
- PEP-249 - A specification of how database libraries in Python should work
cursor.execute("SELECT username_id, password_id FROM bms WHERE username_id, password_id = username, password")
?
You're not passing any parameters there, see the embed
And, that should be a=b and c=d, not a,c = b,d
What do you mean?
(That's valid in some dbs but I hate it)
'A = b and c=d': make the comparisons clear, rather than comparing tuples
I’m really confused
cursor.execute("SELECT username_id, password_id WHERE username_id = username, password_id = password")
?
Remove the comma, use 'and'
cursor.execute("SELECT username_id, password_id WHERE username_id = username AND password_id = password")
Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.
Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.
For example, the sqlite3 package supports using ? as a placeholder:
query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)
Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.
See Also
- Python sqlite3 docs - How to use placeholders to bind values in SQL queries
- PEP-249 - A specification of how database libraries in Python should work
And add a print statement so you can see what username and password is being used
cursor.execute("SELECT username_id, password_id WHERE username_id = username, password_id = password")
parameters = (username, password)
passwordcheck = “SELECT * FROM bms WHERE password_id = password AND username_id = username;”
parameters = (username, password)
db.execute(passwordcheck, parameters)
Almost
Hint: ||import sqlite3
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
print(cursor.execute("select ?", (213,)).fetchall())||
Or ||import sqlite3
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
print(cursor.execute("select :name", {"name": "bobby"}).fetchall())
||
I’m confused
print(cursor.execute(“SELECT username”, {“username_id”:”username”})fetchall())
What are you confused about?
What you sent me is really different to what I had
I'm showing you how to use parameters in sqlite
Do I need them?
All I really need to do is let it add account to database and check if it matches an account logging in
Yes
passwordcheck = “SELECT * FROM bms WHERE password_id = password AND username_id = username;”
parameters = (username, password)
db.execute(passwordcheck, parameters)
Why does the parameters help this function?
How does the database know where the username or password goes?
Go ahead and run the query and read the error.
“Column does not exist”
Yes, because it thinks 'username' is a column. You have to make it a parameter
My computer science a levels teacher spends most of our lessons on side quests across town it’s not easy out here
1 sec
Read this carefully:
!sql
Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.
Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.
For example, the sqlite3 package supports using ? as a placeholder:
query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)
Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.
See Also
- Python sqlite3 docs - How to use placeholders to bind values in SQL queries
- PEP-249 - A specification of how database libraries in Python should work
def createaccount():
password = password_field.get
username = username_field.get
cursor.execute("""INSERT INTO bms(username_id, password_id)VALUES(username, password)""")
??
I’m confused on the symbol, ? And commas
Have I passed it through here?
Read the message, and follow the links if not clear. This is importsnt to understand.
query = “INSERT INTO bms WHERE symbol = ?;”
parameter1(username, password)
dbexecute(query, parameter1)
Or am I supposed to replace symbol with column and ? With username_id, password_id
The ? Represents one value in your parameter tuple. If you have two parameters, you'd have two ?'s
query = “INSERT INTO bms WHERE symbol = (?, ?)”
parameter1(username, password)
dbexecute(query, parameter1)
What's symbol ?
Yes
I’ll test it in a minute
and I’m correct in passing through the user inputs?
and how does it know whether it is in username_id or password_id column
wait I have annidea
Idea
query = “INSERT VALUES(username, password) INTO bms WHERE column = (?, ?)”
parameter1(username_id, password_id)
dbexecute(query, parameter1)
No. Where user_id= ? ....
So would this statement be better?
It'd be wrong
Why?
Run it
The placeholder only represent the data you want to pass in. Like: user_id = ? : Means you want where the userid column to equal the input parameter.
can someone give tips for this? especially db
The platform must be able to handle 10,000 requests per second, with a registered base of 100,000 drivers and 50 million users globally.
backend - docker-compose of django,python
I had to build quick some 35 40 endpoints and few websockets so py dont hate
for db I am using postgres
platform is something like uber so constant writes too not just read
currently thinking of sharding postgres with redis in b/w but more tips would be nice
does k8s work well with dbs?
using gcp free 300 usd credits
Hi guys this is probably easy but I have to complete this for a school competition. I never did python before but I think it could help
The step I reached is that
I need to automate a massive excel file with 19 tabs for countries, each tab compares 4 columns, CODE1 VALUE1, CODE2 VALUE2 for different fictional cities and buildings. I made some formulas to show if they match and where which code is missing (1 or 2) on excel
I manually populate the excel files that are extracted from different applications from each country.
I need to match them excluding a legend of values that are based on another set of excel files that are from the past month that are commented as known gaps.
I'm really stuck with this, I've done an excel, but I'm not sure it's fast enough. I know VBA is out of the question since it doesn't do much that excel already can.
Should I go with python? I never used python before. Will I need an interface for it? Maybe power bi can match the values but I'm not sure if there is a limit for the number of tables. Could it concatenate code1-value1? but how will this work for the past month legend check.
I've been thinking of doing an automated extract from the application's API, or from the excel that will be in the cloud for each month on Sharepoint that populates a database with python that matches the concatenated columns ( city 1 & building 1 vs city 2 & building 2) for each country
Then, populate three new columns if there is a match, if there is a missing building or a missing city and then using the past month excel flag the past gaps as a known gap in an new column.
All of this is then sent into power BI where it is filtered by country.
With a Sharepoint comment button done on power apps ill annotate a comment for each gap and then another power app creates a final report from the entire power bi.
I have midterms coming up and I've been thinking about this for a while, but I'm open to new ideas.
Python is exactly the right tool for this. Start with pandas.read_excel, which creates a dataframe from an excel file.
Then, use Pandas to transform the data how you want. Kaggle.com/learn has some instructions, or you can open a help thread: #❓|how-to-get-help
Thank you! I will look into it for sure, I'm working on Jupyter notebook with anaconda navigator
do you love it when you have the exact some code in 2 functions, but one works the other doesnt?
works:
count = cur.execute("select colum1, sum(colum2) from table group by colum1 order by colum1 asc") for rec in count.fetchall(): colum1.append(rec[0]) colum2.append(rec[1])
IndexError: tuple index out of range:
count2 = cur.execute("select colum3, sum(colum2) from table group by colum3 order by sum(colum2) desc") for rec in count2.fetchall(): colum3.append(rec[0]) colum2.append(rec[1])
can someone explain?
and error appears at the last line: colum2.append(rec[1])
You're most likely not running what you think you're running. Add a 'print(len(rec))' inside the loop to show the size of the rec tuple
now it is working, lol.
thanks, whatever that was it helped
yooo
how long do you think it takes to learn MySQL for web dev?
depends from where you start
Hello guys l, can someone please help me connecting DataGrip with oracle
I am not able to create a database
And now when I tried to create new connectio, it's not working
This is when i am trying to create a database
I only knew the basic SELECT BY, WHERE, JOIN etc
I want to learn index, transaction, locks, logs
and Master-Slave Replication, Sharding, High Availability etc
what's the difference b/w sqlite3.connect(":memory:") and sqlite3.connect("file::memory:")
Anyone know if with SQLAlchemy polymorphic classes, if there's a mechanism to hard code the value used for the polymorphism? If I'm already setting polymorphic_identity, it'd be convenient to set the field to the same value?
Please share resources to learn Dependency Injection.
Anyone know why it says column does not exist in sqlite3?
it means the column does not exist in your database. Common thing to look out for:
- Did you spell out the column name correctly?
- Are you referencing to the right "from" table that may have that column?
- Did you alter or created the table that have that column to begin with?
createtable = """CREATE TABLE IF NOT EXISTS
bms(accountid INTEGER, usernameid TEXT, passwordid TEXT)"""
cursor.execute("""INSERT INTO bms (usernameid, passwordid) VALUES (username, password);""")
whats the full error message?
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\Oscar\AppData\Local\Programs\Python\Python312\Lib\tkinter_init_.py", line 1968, in call
return self.func(*args)
^^^^^^^^^^^^^^^^
File "C:\Users\Oscar\OneDrive\Documents\PYTHON CODE O3\v10.py", line 60, in createaccount
cursor.execute("""INSERT INTO bms (usernameid, passwordid) VALUES (username, password);""")
sqlite3.OperationalError: no such column: username
Oh so you're not passing in values to your sql statement. SQL thinks that you're referring to the column "username", not the value
Use parameterized queries.
how?
E.g.
cur.execute("INSERT INTO movie VALUES(?, ?, ?)", ("Monty Python's Life of Brian", 1979, 8.0))
isnt that basically how i did it?
No. You didn't pass the values as parameters. In fact its not being passed in at all.
cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))
cursor.execute("""INSERT INTO bms (usernameid, passwordid) VALUES (username, password);""")
The username here is assumed as a column name. Same as password. Not usernameid or passwordid
Correct
accountid = int(0)
accountid = accountid + 1
would this be correct for creating an order of accountids?
or would that be incorrect logic in python
maybe if you set the accountid column as a PRIMARY KEY AUTOINCREMENT, it would do it for you
oh okay
and my error this time is
^^^^^^^^^^^^^^^^
File "C:\Users\Oscar\OneDrive\Documents\PYTHON CODE O3\v10.py", line 60, in createaccount
cursor.execute(cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.ProgrammingError: Error binding parameter 1: type 'method' is not supported
why cursor.execute inside cursor.execute?
ive removed it and still getting
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\Oscar\AppData\Local\Programs\Python\Python312\Lib\tkinter_init_.py", line 1968, in call
return self.func(*args)
^^^^^^^^^^^^^^^^
File "C:\Users\Oscar\OneDrive\Documents\PYTHON CODE O3\v10.py", line 60, in createaccount
cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))
sqlite3.ProgrammingError: Error binding parameter 1: type 'method' is not supported
it looks like its something to do with tkinter.
How are you calling this database insert?
wdym
what triggers the insert?
ok and in code please
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
def createaccount():
password = password_field.get
username = username_field.get
cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))
You're missing ()
after get
you may need to add a sql select statement to check if the username exists or not
Where is db defined?
bms.commit()
i have it as bms
and bms is my database
bms(usernameid TEXT, passwordid TEXT)
iirc it autocommits by default
ive just got no error message..
bms in this line is the table name
does that mean its in the database
check it and find out
I always have https://sqlitestudio.pl/ installed in my pendrive. In case I need to check someone's sqlite db file
!pastebin your result/code
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
it just has random symbols and create table
where is this from?
my database
Use a SQlite Browser or SQLite Viewer or something to look inside it.
it created a journal but theres nothing new in the db
share a screenshot
yea its empty. It didn't get inserted
Type this in place of that:
-SELECT * FROM 'bms' LIMIT 0,30
+INSERT INTO 'bms'(usernameid, passwordid) VALUES ("AgentQ", "IsSuperAwesome")
And then click "Execute"
OK now retry the select statement: SELECT * FROM 'bms' LIMIT 0,30
Will it show at least one row?
It didn't go in
def createaccount():
password = password_field.get()
username = username_field.get()
cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))
is this not correct?
can you confirm if that createaccount function gets called?
createaccount_button = ttk.Button(mainframe, text= 'Create Account',command=createaccount)
Why not add a print statement in the function and then create another account? If the print statement shows up, that means it did get called correctly
can you share the full code/
Can you show the code that shows this print?
def createaccount():
password = password_field.get()
username = username_field.get()
cursor.execute("INSERT INTO bms VALUES(?, ?)", (username, password))
print("Account Created")
Ok good. Now print the username and password before the insert and test again
Does it show in your db?
isn't LIMIT 0,30 equivalent to LIMIT 30?
i think it only happened on the web version
https://www.youtube.com/watch?v=7mz73uXD9DA&t=917s
is this a good vid to get started with SQL?
Course Problems & Certificate 👉 https://lukeb.co/sql
Course Databases & Links 👉 https://lukebarousse.com/sql
⚠️ Fix Database Load Issue in Ch 2 👉 https://lukeb.co/sql_error
Thanks to Kelly Adams for her work in producing this course 🎉🙌
👩🏻💼 Kelly's LinkedIn: https://www.linkedin.com/in/kellyjianadams/
🌐 Kelly's Website: https://www.kellyjadam...
if anyone has referred to this.
Never seen it. Here's my short list: #python-discussion message
Thank you!!
anyone good with snowflake?
i have a snowflake data warehouse with a sparsely populated time series. for example, the table might look like this:
DATETIME VALUE
2023-10-01 00:00:00 10
2023-10-05 12:30:00 20
2023-10-12 08:45:00 30
i'm trying to generate a minutely series, where each minute would represent the most recent observation that's at most 10 days old. so for example:
- at 2023-10-05 12:30:00, the value is 20.
- at 2023-10-11 12:30:00, the value is still 20 (because it's within 10 days of2023-10-05 12:30:00).
- at 2023-10-12 08:44:00, the value is still 20 (because it's within 10 days of2023-10-05 12:30:00).
- at 2023-10-12 08:45:00, the value is now 30 (because we actually saw 30 at exactly 2023-10-12 08:45:00).
- until 2023-10-22 08:45:00, every single minute have value 30
someone suggested i first create a CTE that computes the "valid_until" timestamp for each row (either the next timestamp or 10 days later, whichever comes first)[1]. this part runs really fast, but when i try to join it with a generated table of all minutely timestamps in a range (either using SELECT time_add(...) FROM table(generator(...)) or a fancy recursive CTE), the query performance drops significantly and it takes forever.
any recommendation?
[1] e.g.
DATETIME VALUE VALID UNTIL
2023-10-01 00:00:00 10 2023-10-05 12:30:00
2023-10-05 12:30:00 20 2023-10-12 08:45:00
2023-10-12 08:45:00 30 2023-10-22 08:45:00
With SQLAlchemy, anyone know how I can store a class object as json into a field? I saw sqlalchemy.types.json in the docs, but I'm not sure if there's a mechanism to automatically serialize/deserialize it into another class? Or point me where I should be looking in the docs?
SQLA is a serious PITA. I do not understand how it became so popular. I'm always ready to admit I'm the problem when it comes to a package, but the SQLA docs are so hard to follow...
So, you have irregular data and you want to regularize it to 1 minute granularity, with the most recent value that's no more than 10 days old
Let's assume first that your data is rounded to the nearest minute (if not, round it)... that'll make this next part easy:
I'm going to write this in duckdb, since it's handy... but should be translatable to snowflake
create or replace table irregular_data as
SELECT DATE '2000-01-01' + INTERVAL (i) hours as datetime, random() val FROM range(2000) t(i)
;
create or replace table regular_intervals as
(with q1 as (select min(datetime) as dt_start, max(datetime) as dt_end from irregular_data)
select unnest(generate_series(q1.dt_start, q1.dt_end, interval 1 minute)) datetime from q1)
;
with q1 as (
select r.datetime, val, from regular_intervals r left outer join irregular_data i on i.datetime=r.datetime)
select
datetime,
last(val ignore nulls) over (order by datetime range between interval 10 hours preceding and current row) as filled_val
from q1
In snowflake, you'd use last_value https://docs.snowflake.com/en/sql-reference/functions/last_value instead of last()
Idea is: given irregular_data that is not aligned to your desired regular_intervals, left outer join then use a window function.
** Requirement is that irregular_data is rounded to nearest minute, so the left outer join works with regular_intervals.
*** Alternative would be to use an asof join. Considered that, but this seemed easier to explain.
indeed!
yeah if last / last_value supported range-based window then your example query is exactly what i would have written
unfortunately snowflake doesn't support that :\
i wonder if making a proper table instead of a transient one in CTE would make a difference, something to test tomorrow 🤷♂️
and yeah i am aware of asof join, it's also not quite performant, perhaps ultimately the issue is that we aren't using a proper table for the regular_intervals but generating on the fly
hmm, I have a dumb idea
For such cases if you really trust the source of data you can use pickle type to store the entire class as it takes care of serializing/deserializing, it comes with a security risk though as it can run arbitrary code. https://pydoc.dev/sqlalchemy/latest/sqlalchemy.types.PickleType.html, otherwise you may define your own serialization/deserialization into JSON
Alternatively, ```sql
with q1 as (
select r.datetime, i.datetime datetime_i, val, from regular_intervals r left outer join irregular_data i on i.datetime=r.datetime)
select
datetime,
lag(val ignore nulls) over (order by datetime) as filled_val,
lag(datetime_i ignore nulls) over (order by datetime) as filled_date
from q1
Then, you'd add one more subquery to compare datetime to filled_date to see if the lagged value is within your window (and if not, null)
Yah, this is entirely internal, the format is based off what I will probably make a dataclass. I'll look at PickleType. And otherwise, I was thinking maybe doing a getter/setter? I guess I should look into if there are post load/pre save methods. But as usual, I can rarely seem to make sense of the SQLA docs. And I don't know if it's me or it's the docs.
this would take the previous value with infinite lookback right? 🤔
Yes, you'd then add an if/case to check filled_date against datetime
In other words, you first take the infinite loopback... then you replace any "too far backs" with null.
ah gotcha, will have a play with this tomorrow once i am at work, thanks!
I just realized, I'd rather not pickle it, because then I can't use JSON and lose the query abilities on that. Any idea how I'd create a custom serialization? I can't seem to find anything other than custom type, and as before, I'm having trouble figuring out those docs.
Was about to mention Custom types, other than that you can also override the dict method in your class and store that as json
Ok, I'll take a look at custom classes. And at overriding dict.
Hi
Which database is most suitable for having a few read/writes very frequently (every couple seconds/minute)
but the records will be deleted after use too. sqlite is suitable right?
Most databases should handle thousands of reads per second, and usually less writes, how much traffic exactly you want to handle?
Specifically the aiosqite library *
Well I just want to be prepared if there are few writes or reads per seconds. But it’s not like a big amount
And that answers it thank you. And I assume no problems with be caused by using aiosqlite right? I’m new to this as you can probably say
How much is a few? Under ~100? You should be fine with sqlite
even with thousands too, but I didn't exactly benchmark sqlite myself
Yeah definitely under 100 thanks
Does anyone here used Sqlalchemy only for query builder ?
I just took a proper look at the query profile, turns out it's the join that's causing the issue... It using 97% of wall time.
So what I did was to rewrite query to not do any join, just generate an array of minutely timestamp where the row is valid, unnest it and fill in the gaps application side.
1000x faster than whatever the hell we were doing now. Phew.
Nice. Sometimes we get too clever in db land 🙂
Thanks, I also have found this "select(Users).where(Users.GL_USER_GID == user_gid).compile(compile_kwargs={"literal_binds": True}" to work too 🙂
so an index in a dataframe is just a row header?
Can anyone recommend a migration manager that's tool agnostic? If I'm not using any ORM.
Maybe alembic?
I can go over the docs again, but even without SQLA, doesn't alembic use SQLA's language?
You can just use text with it
Ah, straight SQL? I'll look at that. I was considering Atlas as it is.
I mean, you can try whatever tool you like
I think alembic can use sqlalchemy stuff but only if it's installed
Hi, if you are building or looking to build projects that include Azure Cosmos DB, please reach out to me. I am on the team and would love to share what you are doing or provide feedback to our team.
I wanted to make a joke about mongodb being webscale, but 🙂 Is the stack mongo based or mongo compatible?
haha i worked at mongo and actually made tshirts with that. We have multiple APIs, a native NoSQL one that speaks a cosmos specific SQL syntax, two different mongodb offerings and cassandra
Interesting. I'm more a olap guy, so generally shy away from this world, but it's interesting
Guys? Trying to open it on my Mac and this pops up.
hey! I'm trying to avoid hard-coding relationship strings in sqlalchemy to make everything more type safe
for example, if I have such models:
class User(Base):
id = Column(Integer, primary_key=True)
is_active = Column(Boolean, default=True)
items = relationship(lambda: Item, back_populates="owner")
class Item(Base):
id = Column(Integer, primary_key=True)
owner_id = Column(Integer, ForeignKey(User.id))
owner = relationship(lambda: User, back_populates="items")
how do I avoid hard-coding values like "owner" or "items" of back_populates?
Anyone used weaveiate?
help I need to choose between milvus and weaviate
I just use pgvector in those cases
I think that's normal. Like how often do you need to change them? They should be designed once and should rarely be changed
yeah that's right, thanks!
You wouldn't always be able to use lambda (e.g. when having circular references between files), also you're not using sqlalchemy 2.0 syntax, so your code isn't that type safe, maybe try using new Mapped and mapped_column:
class User(Base):
id: Mapped[int] = mapped_column(primary_key=True)
is_active: Mapped[bool] = mapped_column(default=True)
items: Mapped[list[Item]] = relationship(back_populates="owner")
class Item(Base):
id: Mapped[int] = mapped_column(primary_key=True)
owner_id: Mapped[int] = mapped_column(ForeignKey(User.id))
owner: Mapped[User] = relationship(back_populates="items")
That way mypy can check if you use your relationships correctly
You can read on what's new was added in 2.0 and on how to migrate here:
https://docs.sqlalchemy.org/en/20/changelog/whatsnew_20.html
https://docs.sqlalchemy.org/en/20/changelog/migration_20.html
Is there something similar to the Kindling project but for databases? I find it difficult to try to put a use case for a project when I dont really need to code something for my general life (if that makes sense)
Hi all. I am just about to lose my mind. I’m trying to create a bi-directional many to many relationship between a Tag model and any number of other models via sqlalchemy. All of the examples in the documentation seem to generate a one-to-many. I don’t care if it creates multiple tables like user_tag, image_tag, etc, but I would expect those tables to essentially consist of just parent_id and tag_id. I do need to be able to query all tags. So User.tags, Image.tags, Tag.all, Tag.users, Tag.images, etc should all work. Does anyone have any examples of how this relationship can be constructed, ideally using something like a HasTags mixin?
I am not a SQLA expert, but I did some searching and it looks like you need to manually set up an association table
That is, you have Foo and Bar and each of those has a one-to-many relationship with a 3rd FooBar
Wait, I stand corrected. https://docs.sqlalchemy.org/en/20/orm/extensions/associationproxy.html
The first example there is a many-to-many relationship relationship without manually constructing the association table but you still need to manually write out something to create an association table
You could automate that with a helper function or maybe even a decorator if you design it carefully, but probably not a mixin without doing nasty metaclass stuff
I think they're looking for a "generic foreign key" 
@craggy canyon Basically if you want to be able to tag any model there are two options:
- Junction table for each relation, this can guarantee database/fk integrity
- Single table with a discriminator signifying which table/model this relationship belongs to
I would say first option is better since you can use foreign keys with it, and these junction tables pretty much can be autogenerated
any recommendation for database migration management? have used alembic in the past, wondering if other people have different opinions
and any recommendation for sharing database migration magnement-related code across many git repos? might do git subtree but wondering if it's a well-known solved problem
I've used alembic and prisma both
prisma is comparatively less mature for Python
as for sharing the migrations-as-code, I've had a good experience with keeping the DB interaction entirely as a separate, installable library
I've had a good experience with keeping the DB interaction entirely as a separate, installable library
how does your development workflow look like if you need to make schema changes?
some of my repo is in java btw if that changes things 💀
I bump the version of the DB models library with schema changes
my repos are all Python, so it's a bit simpler
there are language agnostic DB-migrations-as-code tools, too
ooo. that's plenty of reading for me to do, thanks!
doesn't that mean you are always 100% sure about a schema change?
what happens if you just want to try things out 🤔 ?
it's a Git repo, so using branches for trying things out works fine
oh i see what you mean now!
can anyone help me fix this error ?
complex_keys is empty/None
Try printing out complex_keys before line 52
for i, _ in enumerate(complex_numbers):
if _.real is not None:
complex_keys.append(_.real)
if len(complex_keys) > 0:
mode_keys_price = statistics.mode(complex_keys)
else:
mode_keys_price = 0.0
this should have fixed it na
Did it?
nope
Did you find out what is in complex_keys?
give 3 mins
I'll give you 1 min 😈
it is working
in my database
it is storing 0 or price
Is 0 or price from complex_keys?
So no more errors?
yes see inside keys there is value but when i am calling from fastapi it giving error
no there is
You have to follow the trail. Print out complex_keys when you are calling from FastAPI
May I ask when exactly do you use pgvector?
When I need a Vector Database
ok doing
what about requirements?
What requirements?
I want to make a big data chatbot that can infer from self stored text and a cloud database.
and can handle previous queries
I'd ask you the same. When do you need to use Milvus or Weaviate?
Weaviate cause it has similarity search and I dont know much about db so f Milvius i heard its hard
is it storing zero
I have only used chromadb before
ok. Well I can't really say much about either Weaviate or Milvus as I have no experience with either. The most I've used is Pinecone but since I self-host, I use pgvector.
I am self hosting then? is pgvector easy to use? can I augment it with non available featuers
Hence the error
Depends. Do you know Postgresql?
so what do i do if there is only zero ?
some items are also returing this
I don't know. Follow the trail. What fills/assigns value to complex_keys?
No, is it easy?
It depends™️
I find it easy.
to learn? I dont know jack of backend. I need to get into it though
is complex_keys just a list of float numbers?
if there are only zeros in the list then only i should get the error right if i have more numbers instead of zeros the value of list should not be none right ?
idk what to tell you but get started ig 🙂
yes i am finding the mode in that list
okay thanks boss
well if its all 0, isn't that just nothing?
!rule AI doesn't understand the context of your StatisticsError.
!d statistics.mode
statistics.mode(data)```
Return the single most common data point from discrete or nominal *data*. The mode (when it exists) is the most typical value and serves as a measure of central location.
If there are multiple modes with the same frequency, returns the first one encountered in the *data*. If the smallest or largest of those is desired instead, use `min(multimode(data))` or `max(multimode(data))`. If the input *data* is empty, [`StatisticsError`](https://docs.python.org/3/library/statistics.html#statistics.StatisticsError) is raised.
`mode` assumes discrete data and returns a single value. This is the standard treatment of the mode as commonly taught in schools:
```py
>>> mode([1, 1, 2, 3, 3, 3, 3, 4])
3
``` The mode is unique in that it is the only statistic in this package that also applies to nominal (non\-numeric) data:
If the input data is empty, StatisticsError is raised.
0.0 is not empty
Good you've established that.