#databases
1 messages ยท Page 8 of 1
this is good too, it means you understand what people actually do with data and how to serve their needs. don't undersell this
Alright
okay, so you might have a lot to learn. it's much better to know "what you don't know" so you can at least understand your own limits better. spend the evening reading about sql joins, window functions, and indexes
Alright
I am just doing a course on Datacamp on Data Engineering rn to improve my understanding of it
so at least you can say "i want to learn how to use sql more efficiently and effectively" vs "idk i never heard of that"
happy to help set people in the right direction
vscode
Are there any sqlalchemy/sqlmodel experts here? I feel like I have a very niche but critical problem... I am dynamically generating a sql table from a pydantic model. This works fine, my problem is that the queries using that table are incredibly slow as SQLAlchemy does not seem to be able to the dynamically created table, instead if queries EVERY table in the full select for every id as a separate query which is obviously not scalable. Has anyone encountered this before?
e.g. querying 1000 rows becomes 5000 queries instead of just one query with a few joins
def _create_orm_class(
self, variety_data_model: type[BaseModel]
) -> type[WithInteractionIdBase]:
return type(
(
f"InteractionVariety{self.interaction_type}"
f"{self.interaction_source.capitalize()}ORM"
),
(variety_data_model, WithInteractionIdBase),
{
"__tablename__": (
f"Interaction_VarietyData_{self.interaction_type}_"
f"{self.interaction_source}_{self.hash}"
),
"__table_args__": {"extend_existing": True},
"metadata": WithInteractionIdBase.metadata,
},
table=True,
)
^ how I am creating the table, where variety_data_model and WithInteractionIdBase are both SQLModel types
Why are you creating them dynamically?
Is the main issue only present when querying from these tables?
because they are not known until runtime (they are generated off of a json schema coming from another microservice)
and yeah it seems to be only when joining to this table specifically
If you have such requirements - why use sql?
Dynamically creating tables is generally a bad idea
You could use json fields with postgres or simply a different db for that
the practical answer is real world requirements, I work somewhere that used a mssql stack when this app was created
the shift to sqlalchemy is intended to help migrate to something like postgres
also at the end of the day it works completely fine except for this one problem
a developer's time isn't cheap
Database schema question
I have a database for storing messages in my messenger, what is the best way to split chats? make one table messages, and make columns there chat_id, message_id, content, author_id or for each chat create a table with the name chat_%chat_id%(for example chat_111) and make columns message_id, content, author_id in it?
can you share query itself and what sql it produces?
ah stepping through the debugger has narrowed down my issue, I think it's to do with lazy loading. I.e. the original query is correct (with joins), but then when the ORM objects are converted to non-ORM models it queries every single object with data from the database
first option, chat_id is your primary key
Are you using synchronous sqlalchemy? ๐ค
yeah mssql doesnt have async drivers (in python world at least)
If you can run your application with engine(echo=True) enabled - do it, maybe you're just running into N+1 problem?
I think I am
Do you use sentry?
ive never ran into it before but it fits the bill, I assumed it was becuase I was trying to do something edge casey with sqlmodel/dynamic tables etc
sentry is for frontend I thought ๐ค
You can use it with frontend too
99% sure its the N+1 problem, question is how do I make it not lazy load? ๐
But it has intergrations with python frameworks, including sqlalchemy so you can check what queries are emitted
Try adding eager loading to your query
(I am not using relationships via ORM)
Oof
ah cool good to know thanks
Wait, are you sure? ๐ค
it's an existing database schema, the tables already existed as such
so ORM models are just reflections of it
Usually you run into n+1 if you have an ORM model and do something like this:
for book in books:
print(book.author) # Implicitly does a select from database
with Session(self._sql_engine) as session:
interaction_rows = get_many_interaction_data_with_variety_data(
session,
interaction_ids,
self.type_orm_class,
self.format_orm_class,
self.variety_orm_class,
)
return {
interaction_id: self._build_slim_interaction(
interaction_rows[interaction_id]
)
for interaction_id in interaction_ids
}
๐
self._build_slim_interaction accesses every attribute coming back
SELECT call.interactions.id, ... many columns ..., call.[Interaction_VarietyData_Call_twilio_066eb22372e80da2].disposition
FROM call.interactions
JOIN call.[Interaction_TypeData_Call] ON call.interactions.id = call.[Interaction_TypeData_Call].interaction_id
JOIN call.[Interaction_FormatData_Audio] ON call.interactions.id = call.[Interaction_FormatData_Audio].interaction_id
JOIN call.[Interaction_VarietyData_Call_twilio_066eb22372e80da2] ON call.interactions.id = call.[Interaction_VarietyData_Call_twilio_066eb22372e80da2].interaction_id
JOIN call.interactionqueryhelper ON call.interactionqueryhelper.process_id = ? AND call.interactionqueryhelper.interaction_id = call.interactions.id
and then afterwards when going through the above loop it does it one by one 
SELECT call.interactions.id AS call_interactions_id, ... many cols ..., call.interactions.has_ml_info AS call_interactions_has_ml_info
FROM call.interactions
WHERE call.interactions.id = ?
for each table in the join and for each id, so to query 10k rows it takes like 45 minutes ๐
but as you say the first sql query has all the attributes in it i am confused
Hey I got a schema related doubts, I want to make such table as database and how do define primary keys ? What should I be using if I update a row the previous row gets "flag" column updated..
Why do you want to keep row history?
it's a task I was given to create such table..and I am interested to know even if I timestamp instead of flag what should I be doing?
As you mentioned there's two ways of implementing it - either using composite primary key with id, timestamp or just move that data into separate history table
okay is there any way to keep it in the same table with flag column..
But why use flag?
Ideally you'd use a timestamp or some sort of version id, since primary key should be unique
I know its dumb but in the task I was given they want me to use this
You can't ๐คทโโ๏ธ
okay ๐
You should either use a id + version_id, like
1234, 1
1234, 2
1234, 3
etc ...
or a timestamp instead of version id
Since primary key should be unique using a flag for more that two records with same id won't work
This is the description of the table they want
what do you mean by user created? the table with the hash in it is created dynamically, the rest already existed in database
In SELECT call.interactions.id AS call_interactions_id, ... query
Since there's many selects to call.interactions and it's not created by user it should be easy to fix
also I can't seem to disable lazy loading here ๐ข trying all the permutations of this doesn't do anything:
results_raw = session.exec(
select(
InteractionORM,
type_orm_class,
format_orm_class,
variety_orm_class,
)
.join(
type_orm_class,
InteractionORM.id == type_orm_class.interaction_id,
)
.... a few more joins
.options(selectinload("*"))
).fetchall()
raiseload, eagerload, joinedload etc
easy fix? sounds good but I don't follow what you mean sorry
all the documentation is talking about relationships which I have none of, and also says that inner joins aren't lazy loaded by default ๐ค ยฏ_(ใ)_/ยฏ
You can add raiseload first to see where exception is raised
bleh I added .options(defer(variety_orm_class.disposition, raiseload=True)) and it didn't error
had to try that as "raiseload() applies to relationship() attributes only. "
you're deferring single column though
can you wild card defer?
Yep, but I think you should try raiseload("*") first
yeah I tried that it didn't do anything, I think because it only applies to relationship()
Share code?
I am using a very low cpu shared vm to run my bot and it takes about 3-4 minutes for python start the process up but whenever the code is trying to access the db it's returning this error asyncpg.exceptions.ConnectionDoesNotExistError: connection was closed in the middle of operation an the code follows as this ```py
async def connect(self) -> None:
self.cxn = await asyncpg.create_pool(
user=self.user,
host=self.host,
port=self.port,
database=self.dbname,
password=self.password,
loop=asyncio.get_running_loop(),
max_inactive_connection_lifetime=3,
)
Can you copy and paste the full text of the insert command here? The end is cut off in the screenshot you posted.
looks like you have 6 columns but only 5 values
insert into person.persons (PersonID,LastName,FirstName,Address,City,Score) values(7,'Mathews','Jacob','Bay Area','California');
Which is this program?
where is Score in the values?
Ah yeah, you're missing the Score column from your values.
If you want to keep it NULL, I think you have to remove Score from the column names (in the insert statement)
insert into person.persons (PersonID, LastName, FirstName, Address,City)
values (7, 'Mathews', 'Jacob', 'Bay Area', 'California');
so drop column?
i got it...u removed score
Don't drop it from the table. I just mean remove it from your statement
i got it...thanks
with mongodb can you watch for changed kinda like websocket? like as soon as a value is changed grab it instantly or am i only to keep checking the values with intervals?
is there someone who can help me, i want to connect my moderation screen with my database. so when i send a message from my gui it has to come on to my moderation screen and then i have to accept it or reject and then the message will show in my database
anyone else have an opinion?
balls
Hello everybody. Im coding Telegram bot with web hooks + async sqlalchemy. And I get this error. Why? Query is correct
https://paste.pythondiscord.com/igoqufucom
trying to make a database thats its taking the member who did the giveaway and then he can end it w the id he did
I want an object to be automatically deleted after a while (sqlalchemy and orm), for this what can I do?
Hey
can someone help me with mongodb im having a right brainmelt with it
type 1 v type 2 makes no sense to me
Which filetype can hold objects
Hello.
Anyone know the 'query' to retrieve the 5 rows with the highest ID?
Thanks!
(Ping me)
Select, sort, limit. If that doesn't give you the answer you need, show us what you tried
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/discord/ui/view.py", line 425, in _scheduled_task
await item.callback(interaction)
File "/Users//Desktop/Code/CookieMarket/src/tickets/tickets.py", line 314, in callback
await db.execute('INSERT INTO applications VALUES (?,?,?);', (interaction.user.id, interaction.channel.id, self.values))
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/aiosqlite/core.py", line 184, in execute
cursor = await self._execute(self._conn.execute, sql, parameters)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/aiosqlite/core.py", line 129, in _execute
return await future
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/aiosqlite/core.py", line 102, in run
result = function()
sqlite3.InterfaceError: Error binding parameter 2 - probably unsupported type.``` ```py
await db.execute('INSERT INTO applications VALUES (?,?,?);', (interaction.user.id, interaction.channel.id, self.values))``` `self.values` is a list, what can I do instead?
If I print `self.values`, I get `['Illustrator', 'GFX Designer', 'Skin Designer']`.
There's no error
I need some help with something.
Let's say that I have a table of movies and a table of tickets. Let's say that each ticket has a Boolean field called "sold" and a foreign key referring to the movie. Let's say that I want to find the remaining tickets for each movie, but I also want the results to include movies without any tickets remaining.
It seems straight forward
SELECT m.title, COUNT(CASE WHEN t.sold THEN 1 END) as tickets_sold, m.total_tickets, (m.tickets - COUNT(CASE WHEN t.sold THEN 1 END) as remaining
FROM movie m
LEFT JOIN tickets t ON ...
If you include sample tables and data it would be easier to test my theory.
or maybe
select
title,
(select count(*) from tickets t where t.movie_id = m.id and not t.sold) as tickets_remaining
from movies m
Good idea. This makes more sense.
The following code is raising this error
Code -> https://hastebin.com/uyayafopar.py
Error -> ```
Traceback (most recent call last):
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 218, in release
self._con.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/workspace/Blue-Brain/bluebrain/db/db.py", line 49, in wrapper
return await func(self, *args, conn=conn)
File "/workspace/Blue-Brain/bluebrain/db/db.py", line 66, in sync
await self.executemany("INSERT INTO system (GuildID) VALUES ($1) ON CONFLICT DO NOTHING", [(g.id,) for g in my_guilds])
File "/workspace/Blue-Brain/bluebrain/db/db.py", line 47, in wrapper
async with self.cxn.acquire() as conn:
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 1005, in aexit
await self.pool.release(con)
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 879, in release
return await asyncio.shield(ch.release(timeout))
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 220, in release
raise ex
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 210, in release
await self._con.reset(timeout=budget)
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/connection.py", line 1366, in reset
await self.execute(reset_query, timeout=timeout)
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/connection.py", line 317, in execute
return await self._protocol.query(query, timeout)
File "asyncpg/protocol/protocol.pyx", line 338, in query
asyncpg.exceptions.ConnectionDoesNotExistError: connection was closed in the middle of operation
Just handle the asyncpg.exceptions.ConnectionDoesNotExistError exception
Is there any similar approach like pojo classes in Java/spring boot for python(pyside6) to deal with the databases? I have a db table with primary key and foreign key references and i want to store those in a python variable like list of class object in java
That's a solution really
Why are you closing whole asyncpg pool?
Ahh...i am new to asyncpg
Can you show how you're using that class?
Also what's the purpose of
@with_connection
async def commit(self, conn: asyncpg.Connection) -> None:
if self.bot.ready.ok:
async with conn.transaction() as tr:
await self.execute("UPDATE bot SET Value = CURRENT_TIMESTAMP WHERE Key = 'last commit'")
To ensure that transactions are being executed
solaris/bot/bot.py line 50
self.db = Database(self)```
`solaris/bot/bot.py` line 100
```py
await self.db.connect()```
Also it seems like @with_connection would create new connection on nested function calls, it probably won't work as expected
Ok...
Then how should I do this?
Do what exactly?
Learn how to use asyncpg ๐
What did you mean here and where?
That doesn't matter, your .close method is only called once in your bot
Issue might be related to your decorator
It is bc I was getting rollback. But I've fixed this
It's not an error though
is there a better way to do this with pymongo? feel its super long winded and there is a way to do it much easier
def get_user(collection_name: str):
print(f"search for: {collection_name}")
for data in utils.MONITOR.find({}):
for user in data['Collections'][collection_name]['Users']:
print(user)
When I want to import a backup which way should be preferred:
- Use
subprocess.run()and do it likemysql -u username -ppassword < backup.sql - Use the MySQL connector for Python, iterate through each line of
backup.sqlto usecursor.execute(line)
It's still raising the same error as before after removing the decorator
Code -> https://hastebin.com/baxijefawu.rust
Error -> ```
Traceback (most recent call last):
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 218, in release
self._con.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/workspace/Blue-Brain/bluebrain/bot/bot.py", line 89, in on_starting
await self.db.connect()
File "/workspace/Blue-Brain/bluebrain/db/db.py", line 37, in connect
await self.executescript(self.build_path)
File "/workspace/Blue-Brain/bluebrain/db/db.py", line 128, in executescript
async with self.cxn.acquire() as conn:
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 1005, in aexit
await self.pool.release(con)
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 879, in release
return await asyncio.shield(ch.release(timeout))
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 220, in release
raise ex
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/pool.py", line 210, in release
await self._con.reset(timeout=budget)
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/connection.py", line 1366, in reset
await self.execute(reset_query, timeout=timeout)
File "/workspace/Blue-Brain/.venv/lib/python3.10/site-packages/asyncpg/connection.py", line 317, in execute
return await self._protocol.query(query, timeout)
File "asyncpg/protocol/protocol.pyx", line 338, in query
asyncpg.exceptions.ConnectionDoesNotExistError: connection was closed in the middle of operation
Very weird, can you check if you only call connect once?
ok...let's try...
ahh...what did you mean?
did you mean asyncpg.connect(...)?
Ok, is it right?
SELECT title, content, url, image FROM news ORDER BY id DESC LIMIT 5;
Looks good, did you try it?
SQLAlchemy is a Python library for implementing SQL databases without using the SQL language itself.
why are you posting these messages?
Yes, and it work, thank you
Good to know
while True:
try:
CID = input("Enter Customer ID: ").upper()
mycur.execute("select * from Booking")
mycur.fetchall()
srno = mycur.rowcount + 1
vals = (srno,CID,tmp1_checkin,tmp1_checkout,roomNum)
comd = "insert into Booking values(%s,%s,%s,%s,%s)"
mycur.execute(comd,vals)
db.commit()
print("Booking Successful")
break
except
print("Wrong Customer ID!")
print()
What should I type in except if the foreign key gives an error?
I have the CID linked
so if someone enters incorrect CID which does not exist in the parent table it returns an error
so what is the except block? Please tell
unfortunately this will depend on the specific database library that you use, they all use different exception classes.
mysql.connector
you'll have to check the docs
sometimes the docs aren't very helpful. in that case, you can try to deliberately trigger the error and then see what the exception class is
also, do not use database cursors for more than one query.
one query per cursor only
i see you using mycur which makes me think you're creating 1 cursor and using it over and over
yea
i do fetchall everytime tho
it leaves the cursor empty
dw about it cuz the rest of it works fine
all thats left in that project is that one exception
trust me, it causes weird problems especially when inserting data
wha-
there's no downside to making 1 cursor per query. it's the correct thing to do, and not doing it will cause problems.
so u mean rather than creating one global cursor
I use temp cursors for each query?
yes, exactly. that's the intended use of cursors
btw the possible errors are listed here https://dev.mysql.com/doc/connector-python/en/connector-python-api-errors.html
so that means they teach us wrong in school ๐
you can read about the different kinds of errors and put the right one in your except
ty
note that CREATE CURSOR in the database is different from a "cursor" in python database libraries
i don't think that's even the right syntax. some databases have a built-in "cursor" feature. it's advanced anyway
i've never needed it
that's good. it lets you make queries and work with the results right away in python
but whoever is teaching this should know better than to reuse cursors
... lmao PGT CS be like
our main syllabus is python with sql as backend
btw @harsh pulsar
this works?
it will recreate mycur variable again and again with that
cuz Home() function is called after every data insertion / fetching
i aint good ok m beginner
it's not your fault if this is how you were taught
-
why capitalize the names of functions? that's usually meant to indicate a class
-
reusing cursors, as discussed above
-
a global shared cursor. why??
this is not uncommon. there are a lot of people teaching python who clearly aren't good programmers or who don't like python and teach style from other languages in python, even when they don't make sense in python
we didnt learn creating classes yet
reusing more like recreating
global shared cursor cuz of above code
i am wrong on 2nd point but yeah reusing but it still recreates it cuz mycur is still a var
we didnt learn creating classes yet
fine, then. don't use capital letters for function names, it will confuse other programmers who are reading your code. "code is read more often than it is written".
reusing more like recreating
you're reusing it for every query until it's re-created
global shared cursor cuz of above code
which is a bad idea for many reasons. a global database connection is maybe excusable in some cases. but in a program like this, it's much much better to pass the db connection as a function parameter
can someone help me
i want to take the last message from my database
but now it only takes the first
yeah ok will keep that in mind
also every function uses mycur just once and only 1 function uses it twice
database rows in general do not have a defined order unless you use ORDER BY. if you want the "last" message according to a certain attribute, use ORDER BY with DESC. then your fetchone should work
all the more reason not to use a global. now you have to think about the state of some global variable in every single function, instead of getting the db conn through a parameter and just creating a new cursor for every query.
should I just place mycur = db.cursor() in every function indivisually then?
def get_user(conn, user_id):
cursor = conn.cursor()
cursor.execute("SELECT * FROM users WHERE id = ?", [user_id])
return cursor.fetchone()
this is how most people use databases in python, and for good reason
also it was integrityerror
So I am using sqlite3 to connect to a db
Anyone knows how can I get the number and names of tables in the db?
Nvm found it
I am having an issue with sqlalchemy (using postgresql DB) where session.bind is None: ```py
def is_clocked_in(user_id):
"""
uses the user id to query the database to see if the most recent timekeeper event from today was a clock_in_event
:param user_id:
:return: Tuple where first element is whether the user is clocked in, and the second element is any errors that
may have occured while looking this up
"""
user_id = int(user_id)
#TODO query by datetime: todays_events_query2 = TimekeeperEventModel.query.filter_by(user_id=1, datetime=datetime.now().date())
start_of_today = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
#db.session.commit()
todays_events_query = TimekeeperEventModel.query.filter(TimekeeperEventModel.user_id == user_id,
TimekeeperEventModel.datetime > start_of_today)
todays_events_df = pd.read_sql(todays_events_query.statement, todays_events_query.session.bind)
# if there are no records for the user, they are not clocked in
if todays_events_df.shape[0] == 0:
return False
todays_events_df.sort_values(by='datetime', ascending=False, inplace=True)
if todays_events_df.iloc[0]['clock_in_event']:
return True
return False
Running the above function yields the error AttributeError: 'NoneType' object has no attribute 'cursor'
Why do you only have terminal access?
https://sqlbolt.com this is a nice online tutorial
SQLBolt provides a set of interactive lessons and exercises to help you learn SQL
It's a website-
Terminal on what?
And depends what database you want to use
..what?
.
You should be able to install software in the terminal. Other wise you may want to spin up a linux server and to learn databases on the server. For your purposes, you could probably use free tier of server hosting from linode or aws or whatever.
Hey
I am getting a
_mysql_connector.MySQLInterfaceError: Unknown database
How do I find the databse name?
I know that is the issue
When I want to import a backup of MySQL database which way should be preferred:
- Use
subprocess.run()and do it likemysql -u username -ppassword < backup.sql - Use the MySQL connector for Python, iterate through each line of
backup.sqlto usecursor.execute(line)
Try show databases
I'd think 1 especially if you capture the stdout and stderr into a log.
You could try my tutorial. It has an option to run entirely in the browser: https://owencampbell.me.uk/sql_python_tutorial/
I have a bigquery table that logs purchases per customer
e.g.
id,type,amount
1,food,5
2,electronics,20
etc..
I want to label each customer with the type they buy more often, what is the best approach to this? customers are 100k+, types are around 50
my initial approach is:
get count and group by id, type
then select top 1 rows per customer
hello just a question are we always to only use aggregate functions in sql immediately after select
does postgres transfer between platforms? for example if i moved from railway to some other hosting service for my bot in the future how would i move my database over
You should be able to dump and import fairly easily but I haven't had to do it myself
file = db.getDb("saves.json")
CharacterInfo = {"name":name,"Class":MainClass,"Exp":"0","Farm Exp":"0","Hunt Exp":"0","Kills":"0","Stats":Stats}
file.add(CharacterInfo)```
Got an error here, i this lines, anyone able to get what's wrong in there?
Not without more context... What is db and what is the error message?
db is from pysondb
Maybe an issue with line breaks? What does the error message say?
Is this the appropriate place to ask a question about converting a text file to a dataframe?
No though I scroll through the help channels looking for interesting pandas problems. In my experience it is better if you are prepared to share a minimal repeatable example much like many of the pandas docs do.
If you get a help channel I should be able to jump in and help.
How can i make this calculation where column is passed as a parameter
main_df['next_order_date'] = datetime.today() + timedelta(days=main_df['day_of_supply'])
'next_order_date' and 'day_of_supply' are just strings you can just replace them with regular variables
I got an issue connecting to Mongo, and I have no idea what's going wrong as it was working perfectly a second ago.
I'm using it for a discord bot, but I assumed as it's a database issue I should be coming here.
File "C:\Users\main\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\discord\client.py", line 409, in _run_event
await coro(*args, **kwargs)
File "C:\Users\main\Downloads\Personal Stuff\Christmasbot\cbot.py", line 558, in on_message
if (cb_cdata.count_documents(query_one) == 0):
File "C:\Users\main\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pymongo\collection.py", line 1835, in count_documents
return self._retryable_non_cursor_read(_cmd, session)
I believe this is where the issue is, but the error is quite long. I can attach a .txt if needed. Any ideas?
Hi everyone, I need some suggestion/solutions regarding a project that I'm doing, so basically the aim of my project is:-
There will be some fixed number of scripts that will be available to every user (an app for frontend), these scripts can be purchased (purchasing logic later) and run by the user (with authentication ofcourse), now i'm confused as to how do I manage this for multiple users, how do I create different instances of these running scripts individually for users and maintaing their data, is a cron job necessary here? because once a user press "RUN", it will take the authenticated user info, arguments and run the script in the backend, and he/she could stop it whenever he/she wants, also these scripts will be running on sensor data (sensor data will be common for every script, so maybe I can create a layer and take data from this layer for every script running instead of fetching data for every running script..??) which would also be fetched in the backend and those scripts will do some mathematical computations and update the results for that user....
Can someone pls help me with defining a scalable architecture for this? I intend to use Python/NodeJs for backend, Firebase for database...
How can i do an on lookup to mysql if something have added
Need it for my Discord bot
cron probably won't be sufficient, if only because you don't want to be writing to the crontab file from concurrent processes. if you want to offload "scheduled" jobs to a separate tool, anacron is a better option. however i think it sounds like you will need to implement your own scheduling system. as for database architecture, i think it's going to be the same as for any app with concurrent users doing a mix of reads and writes.
and yes, each "script" should probably be run independently. i think you'll want to run them as separate processes (not thread or async task). either using multiprocessing, or probably subprocess or its asyncio equivalent.
however this is getting more generally into #software-architecture
Is this achievable in flask python? I have heard it causes problems using concurrency in flask...
@harsh pulsar when i hear stuff like that, my first association is kubernetes cluster.
It allows literally launching different backend... instances multiple times, with different arguments. And disabling them back if desiring.
plus with kubernetes it will be really horizontally scalable xD, and solves issues of correctly handling workload distribution, requesting necessary amount of hardware resources for its running
technically kubernetes is exactly... container scheduling system
solution it will be very simple i think.
Lets say... some SQL table with rasks we are supposed to be having running -> we convert to Skaffolding of stuff we need to launch. Applying at Skaffolding level necessary parameters to jobs.
We just apply them to cluster as skaffold run current_user_scheme
kind of wondering if skaffold will be able to delete no longer needed applications on its own based on the removed code
even if not, it will be simple job to do, to request from user namespace deployed package names, and then removing those ones which aren't present in skaffold
๐ค
For starters, i was storing each script in firebase storage and importing it in real-time then running for every user... But as you can see it's a bad approach...
So you're suggesting to use kubernetes clusters instead?
Ergh, it really fits description of what u wish at least.
Yea I'm just asking as I'm don't really know how kubernetes work, will have to look into it...
What about the sensor data layer? ๐ค
What about it?
how application is supposed to interact with it?
Like the scripts are going to execute some logic based on sensor data which is being fetched realtime on server...
okay. How script is going to receive this data from sensor data layer?
how this data is accessable?
Idk like, it will be a process running separately, fetching sensor data, and from that I'll call a function to get me the current values?
I'm not really sure about this part too...
where this process will receive data from?
From api
A paid api
So i was thinking to maintain a specific data layer, and fetch from it for every script that is running...?
what are expected allowed delays to get the data?
Instead of calling api in every script
how often your scripts will be quering this process?
Like, every second...
It's a websocket api
It sounds to me like u need to make a backend microservice. Backend microservice will be quering your third party API, and saving somewhere results.
And making available through... rest api or through web socket stuff to your user launchable applications
it can be launched in same kubernetes cluster xD just once
Is this achievable with python?
pretty much yes
i would recommend FastAPI ๐ for the sake of automatically generated documentation.
although if u a that new to python, it could be not a bad idea to go with Django
I know machine learning stuff but not backend stuff..
So yea pretty much new to this field...
Django solves code architecture backend stuff for you... Just u need to use django way and using its already suggested stuff
Good option for new people to backend
I see, thank you for your valuable suggestions, really ๐
Can i DM you if I run into any problems?
u a welcome. Suggesting next books to read.
Docker Deep Dive is necessary prerequisite to work with Kubernetes
The kubernetes book to jump start using kubernetes
Code Complete by McConnel to get started in code architecture from backend point of view
Clean Architecture by Robert Martin to see stuff at more global scale of architecture
And going to book From Monolith to Microservices book in order to see architecture of backend code at infra level size (splitting to micro backends stuff)
Building microservicesbook
If it's okay..?
ask here, feel free to ping in current channels in addition as a possible person to answer that too. Just don't DM me with stuff like that.
It is better asking in public, in order to have all people involved.
Here people are answering when they have free time, and with doing it in... crowds, overall quality of answers is better, and each participant gets additional... experience catching from other people. I want to see stuff like what @harsh pulsar says in order to catch some useful bits for me from him, or seeing already made dialogs, and coming to answer to already acquired requirements instead of interrogating you for requirements on my own. More time efficient to do that for me.
Alright, got it!! I'll be active here until I solve my problem ๐ซก
Anyway, i would say additional remark that it is the only option i see to have it horizontally scalable at least at the moment. And as any... backend dev, i think how to have it horizontally scalable first.
Horizontal scalbility = is making app workload distributed between multiple servers
So for running the scripts, you suggested to run kubernete cluster for every instance, suppose there are 4 scripts available to a user, and he purchased 1, now he taps on RUN with some default arguments needed, the request goes to my fastapi server, now how do you suggest to run it for that user because other users can run the same script but at different time...
I'm using firebase db to store all data related...
I suggest having One Kubernetes cluster only (or may be two, kind of wishing to separate your main application from your user applications just to be sure)
one cluster can distribute workload for multiple applications across multiple servers
So is there a way to execute a cluster from my fastapi/django/flask server..?
Based on user
yeah... sure
Alright
It will be kind of tricky to make it sanitinized from user input though 
I'm using firebase authentication
Like I'm gonna pass authentication token in request header...
Oh
What do you mean,m
?*
Ergh, forget about it for now, u a very eons afar from any solutions at the moment.
i suggested solution ideal for me, knowing those technologies already. U need to make a pretty much big jump in learning unknown technologies to reach it.
May be it is actually bad idea to recommend it for you
What is expected amount of users
perhaps to make a more simple solutions based on some sort of subprocesses indeed
Multiprocessing?
yeah
So no kubernetes for now?
it depends on requirements to project u wish to achieve (how many users it is expected to handle), and available money to spend on developers and maintanance
But imagine how many subprocesses will be created if all users run their scripts at once...
indeed. kubernetes solves it by being scheduling system distributing stuff across multiple servers
but question is, do you need it? how many users u a going to have, and how much workload from each one, and what type
Not much, around 50-100 for now
Workload is like running the scripts, doing some simple mathematical computations, then sending push notifications to devices if some condition becomes true
will be this workload one time running on user request, or it will be running in background every second for each user?
Yes it will be running unless the user stops it or the script stops itself
What do you suggest now? ๐ฅฒ
well, as i said kubernetes ideally fits it
the problem is... i am not sure if u can afford cost of this solution. U are missing a lot of knowledge to use it
therefore solution on multithreading-multiprocessing can be better xD
Multiprocessing or multithreading?
Flask is not good for concurrency tasks right..?
I tried earlier and it choked
as good as any other python web framework.
may be u used in the wrong way
Yea maybe lol
So... How do u suggest to manage the scripts? Like right now I'm storing them in firebase storage and importing them realtime as user press "RUN" on app...
lets try to rewrite our solution for multiprocessing then 
Yea...
can we set name to process, which we could query outside of process? 
๐ค we could, for example?
When the user runs a script, we import it, give the required user-specific arguments from the request, run it inside while true loop...?๐ค Doesn't sound good..
subprocess.run("echo '{\"abc\": 123}' && sleep 30", shell=True, check=True)
$ ps aux | grep "abc"
naa 749439 0.0 0.0 2616 600 pts/5 S+ 19:14 0:00 /bin/sh -c echo '{"abc": 123}' && sleep 30
Okay, i can request processes my user opened.
So the raw solution if we would not find any library would be...
making in process something to identify it as a user application for specific user, launching with necessary params
u can just grep ps aux, and seeing all applications running for specific user then
ergh... it will involve a lot of reinveing scheduling system
that's bad idea
a no limits to resources usage
I think the only problem would be to manage large number of process... Will it be able to manage, suppose there are 100 users, and each user has purchased 4 of the scripts, now every user runs the script, there would be 400 processes... ๐ฅฒ
perhaps u made wrong assumtion what u need in the first place
we should question ourselves if we need processes/separate applications for user in the first place.
perhaps what we need is actually just a regular message queue xD
Lets imagine simple Flask application with Celery.
you have some process, that quering present users and processes it needs to run for them
And just schedules as tasks into celery.
Celery completes as soon as free resources are available, not taken by any task
Scalable up to 10`000 tasks per second with RabbitMQ as a message broker
Sounds good...
both technologies are simple as... a stool.
Thus we eliminated technological... overhead with learning too much stuff.
and made solution simple to your level of knowledge
Yea... Okay we use celery, now, let's talk about the scripts management...? Will my current solution work with this one? Like I've told I'm currently storing the scripts in firebase storage and importing them when user runs them...
I don't really like this approach tbh
The storage one that I'm doing right now...
can user edit scripts? or only developer can edit them and they will change only between your application different deployments?
then why do you keep them in firebase. They are supposed to be present in backend application filesystem all the time then.
Only developer can edit if needed and redeploy it....
Hmm yea I mean I could do that...
I thought separating the backend from static files...
how static files are present here?
what are they
No i was talking about the scripts...
u said they are scripts invokable by flask app, which only developer can change. Those aren't static assets, this is just a working code of your application. Part of flask/celery app code.
Yea... U r right
Static assets are html/css/js/jpeg/png and etc files, served to user. Sometimes user can upload pictures too.
Yes my bad i used the wrong terms
Okay, we store the scripts in backend application only, next is to import them when user needs to run any of them...
import them only once on application start xD (Flask app / Celery workers)
invoke entry functions from them.
u will see how it will interact with celery. Just directly call entry function of script from celery task
Hmm, okay yeah, will need to read about celery...
Yes i think this should work...
xD, i liked how we refactorized solution. And suggested and destroyed completely different implementations without writing even single code line
That's called system design and analysis. (book like that is present by Alan Dennis for example)
Yes totally, thank you i really appreciate it
I will develop a demo and will share it to you ๐
If I succeed*
although this book would not help with this particular situation. U need cloud/backend specific system design, like learning stuff from here https://github.com/donnemartin/system-design-primer
Message Queue pattern is explained there too
yeah, all is left to erase from my memory that i once again suggested kubernetes as a solution to the problem.
it is kind of like shooting birds from Death Star Laser Array
That's okay haha,i learned a thing or two from there too
docker+kubernetes is still great to launch backend stuff though. I would have used it anyway just to launch your Flask app + Celery workers.
now it is optional stuff xD
docker alone will greatly improve your deployment process
If I manage to do this, then I'll definitely give kubernetes a try...
Yes I've heard...
forget that i mentioned it, u are having enough stuff to catch up with already
Yes... I thank you once again...
u a welcome. Highly recommend learning Docker/Docker-compose, because it makes usage of Flask+Celery very simple.
Also i googled out of curiosity Firebase...
is it not having any data integrity? U can just in real time any structure making and changing? adding any fields from your already working code and deleting them?
This is trully horrible if true
One last optimization. I would suggest abandoning Firebase in favour of Postgresql
and using Django + Celery + Django ORM(with Postgresql engine)
Django has inbuilt Django ORM for the easiest approach
get it Postgresql as engine, and u'll get happiness with better data integrity
u will be sure that all your user data, will be having all fields/valid across all present data, migrated to new application state
๐คhmm
I would but I have not used postgresql yet...
I would definitely in future if this prototype works...
What am I deleting?
lets imagine u created user with code that created them as name and address
if i understand correctly your Firebase, if you will change your working app code, new users can be created with name, phone for example
and u will have trouble... that you have already hundreds users with data structure name, address and no mechanism to get them into new state
Then lets imagine u have some declared tasks for users.
In order for task to know to which user it is related... are u going to do what? copying whole user data to a task declaration? or making some kind of anchor that leads to already present user data?
how will you be able to query tasks that only specific user has? Quering all tasks that have ID leading to necessary user?
Hmm, i think firebase provides a realtime listener, that if a user is added, it will listen to that change and get it right away...
Ergh. noSQL databases are a trap. Surely easy to use. but u will regret it pretty quickly.
How would u do it with postgresql?
lets imagine your code had misspelling in one place... and in order to change name of a user, it accidentally added it as a Name to user data
your database will not stop you from mistake, it will add Name and having name at the same time to user
We could use types for that..?
types?
Yea like an interface, for example a USER will have a type { name: string, address: string}
Other than that if anything else is provided, we can throw a custom exception
in SQL database data structure is declared only once. (and then can be altered for all records but u need to satisfy altering rules)
If user tries to insert fields which table does not have => it will prevent from insertion
postgresql enforces also data types.
if user tries to insert integer into string field, it will give error too.
Postgresql protects from mistakes regarding data integrity
Yes... We could create custom data types for this too, like defining data types at early stage and write custom exceptions based on that...
but your custom data type will exist only in python code, no?
Yea...
What about realtime ness? How would you achieve it with postgresql? Using websockets?
So imagine if u changed python code, and deleted one field from your data type, and added new field names.
You will be able to insert data into database, despite all previous records having your data structure of user from previous application version
in postgresql it will be a bit more regulated procedure to do migration of data.
Yes that is an issue...i think firebase rules comes into place to tackle this issue, I'm not sure...
as far as i get, all u need just requesting your database for all task rows once in a while in a loop. U said that 1 second is enough delay between tasks. within 1 seconds and even way faster it can do pretty much for 100 users for sure.
Hmm, sounds good
There are other reasons also why I'm using firebase for now, for its notification service, so like why not use its storage and db... But definitely I agree postgresql seems like a better option...
what it does for notification service?
I need to learn it first though...
It provides free push notification service for Android/iOS
So in my scripts, when a condition is satisfied, i will send a push notification to that user device about the trigger event...
based on which trigger?
so... why not to continue using it. Sounds like a good feature.
just don't use it for main storage xD
Like whenever a condition is satisfied inside my script based on sensor data..
As I know, but as I said, i will first need to learn it and be good at it, but I know firestore been playing around with it for a while, so an easier option for now...
Sure. use Firestore then
But i will go with postgresql in future for sure ..
Firestore + Flask + Celery will be minimal solution to your problem, with least learning curve to tackle the problem
i would be highly recommending learning Docker-compose for the sake of easier raising Celery and its dependency though / for the sake of keeping infra deps as a code
Yes i hope so... Will definitely improve it with tech stack u suggested once I get it smooth and runnin
Yes for sure!
I got an issue connecting to Mongo, and I have no idea what's going wrong as it was working perfectly yesterday.
I'm using it for a discord bot, but I assumed as it's a database issue I should be coming here.
File "C:\Users\main\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\discord\client.py", line 409, in _run_event
await coro(*args, **kwargs)
File "C:\Users\main\Downloads\Personal Stuff\Christmasbot\cbot.py", line 558, in on_message
if (cb_cdata.count_documents(query_one) == 0):
File "C:\Users\main\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pymongo\collection.py", line 1835, in count_documents
return self._retryable_non_cursor_read(_cmd, session)
I believe this is where the issue is, but the error is quite long. I can attach a .txt if needed. Any ideas?
How can I get the TOP 10 Records with the highest RepAmount from my sqlite database
That's just a basic SELECT/ORDER/LIMIT query. If you're still stuck show us what you've tried
I managed to get it working
hi
anyone know mysql database?
i got a question asking Display department name with Number of employees less than 5 Numbers?
howdy howdy howdy
it seems your variable name in Core is __tablename_ instead of __tablename__ (one underscore is missing), possibly this is the source of your error.
Hi @wise goblet I tried to make a simple template using Flask, Celery and I've got some questions regarding running the scripts until the user requests it to stop or it stops itself when a condition as met, I have this right now:
index.py
app = Flask(__name__)
simple_app = Celery('workers', broker='RabbitMQ')
@app.route('/run', methods=['POST'])
def run_script():
content = request.json
additional_args = content["args"]
script_id = content["script_id"] # "one"
task = simple_app.send_task(f'tasks_queue.script_{script_id}', kwargs={'args': additional_args})
app.logger.info(f'SCRIPT RUN WITH ID: {task.id}')
return jsonify({"task_id": task.id}), 202
@app.route('/stop', methods=['POST'])
def stop_script():
content = request.json
task_id = content["task_id"]
simple_app.control.revoke(task_id, terminate=True)
return jsonify({"result": "script stopped"}), 202
@app.route('api/get_status/<task_id>', methods=['GET'])
def get_status(task_id):
status = simple_app.AsyncResult(task_id, app=simple_app)
return status
from celery import Celery
celery = Celery('tasks_queue', broker='RabbitMQ')
@celery.task(name='script_one')
def script_one(args):
logger.info('Got request, starting worker thread')
'''
1) update script status to "RUNNING" in firestore for this user
2) while True:
-> Do some computation on sensor data every second (get sensor data from data layer) until the user stops this script
from frontend or the script stops itself due to some other reason/condition.
-> Send push notification to user device if some condition is met.
3) update script status to "STOPPED" in firestore for this user
'''
return args
Please read the 3 comments under "script_one" Also, there could be 100-1000s of those scripts... So my question is, in order to run it continously, is While True good or scheduler function?
Also, as I discussed yesterday about a common independent data layer in which I will be getting and modifying my sensor data, which then should be available for every script, in simple words -> I want to provide simple access of sensor data accross different scripts from a single place i.e. data layer
No, it is not good
You missed important part from yesterday
Screw your model of literal running processes
It can be done in different better way
Have process as running only as a state in database
Query database within task launched in Celery Beat (loop task) for what should be active processes
For each active processes send task into queue to perform, having specific user and script

If u need to stop it, just turn in database state of process for user as off
Your loop task will stop capturing it then
by database u mean firestore or redis?
I assumed Firestore or postgresql it will be
so is all this of any use right now?
Celery task should not be continuous. But they can be repeating short tasks very often.
do you have an example of what you're suggesting to implement? If possible for you, can you make necessary changes in my code above...?
like sudo code only... not actuall working code
Ergh. May be. I guess. Later during those weekends I could write example.
any resource you could point me to? like the implementation ...
Quite easy to write 
so where do you suggest to run the continous process? the code under while True
I will drop you today or tomorrow link to github with written example
But essentially it will be
Any backend framework... FastAPI or Django.
Celery beat
https://docs.celeryq.dev/en/stable/userguide/periodic-tasks.html
-
Celery regular tasks
-
I will configure easy run of all of it through one command in docker compose. As it is the only option to show it to you in easy way.
Feel free to choose for which database I will configure it 
Link in msg above for celery beat
While true is replaced with celery beat rule how often to repeat
hmm I see, thank you, 1 last thing
...
should I make a worker for this layer too? then how should I share the data between different scripts...?
I'm right now looking at redis pub/sub..
firestore thankyou!!
xD not sure if I am willing to work with Firestore.
I could set for u in mongodb which is essentially same firestore but capable to run locally
there might be one problem with this, if I run the script_one task in a scheduled manner then I would be updating the firestore too at that frequency... that is a really big problem...
yes sure, mongodb is fine
How much data script recieves and writes?
I need to update the status of that script for that user in firestore, once it is started successfully, and once it is stopped
I shall schedule it for every second....
Is the data collected in time series, or it is replacing previous collected data
Data size
So u collect its history?
but a good thing is that, the server shall be active for 6 hours only for 5 days
yes...
because I need to apply some indicators on whole data ...
like calculating avg, moving day averages etc
What is size of one collected batch of data?
xD Inbuilt into SQL. Would be solved at the step of requesting from database.
there is a very good library for calculating this type of indicators -> talib for python
so I'm gonna use that...
its -> technical analysis lib for python
Will u have data retention policy. Deleting data older than X days?
Technically not really. The point of nosql DBs being fast to write and read.
Redis as in memory dB yields highest result of 100'000 read and write operations per second
Even SQL database will reach its limit not very soon
So... Depending on your amount of data, u a trying to optimize something u don't need to optimize
yes but I don't want that, I only want to update my user firestore db when the script successfully starts and when it stops... not any in between...
Like: status: RUNNING/STOPPED
hello i'm trying to run this code and its giving me this error
U need somewhere saving history of collected stock data
U will have one heavy writing dB in any case
hmm, is running While True inside the task not recommended?
Pub/sub subscriptions aren't magic too, they use dBs to store intermediate data
Very heavily not recommended
due to blocking?
Yes. U a breaking principles of event driven programming then (book Python Expert Programming, reading from async to event driven)
And u a breaking its horizontal scaling capabilities in this approach.
Short living tasks it can launch at any free machine.
While true tasks can't be moved
Let me think of best option.
I know default best option at least, but wishing in memory to check other ones
sure np, take your time...
meanwhile I will try to convert the current code with scheduling tasks instead of while true
Memory checked
Separate Backend application, with celery beat task in a loop requesting stock data
And writing to any database it finds comfortable.
Yields results to other our infra as Rest API end point in JSON format as default option.
Easy optimization is available to put in front of it Nginx reverse proxy with server side caching of this endpoint.
It will make it highly optimized in less time requesting database, if necessary later
for this, firestore would not be a good option, it's prices will go to sky
How is it calculating price
per read doc, per doc write
What is money budget for Infra?
The least hassle free cheap option would be using managed database from DigitalOcean
Or if u a in Google, using their managed databases
Considering that u need very little space, it should be cheap
i am using digital ocean right now, for deployement purpose
Well... Almost free option but with more difficulties will be having database in Droplet instance.
Hassle free option will be using DO managed db
DO has plenty of choice in managed DBs, postgresql and even mongodb should be available
Even Redis is there xD
yea i think mongo is there
What is money budget per month for Infra?
so as a conclusion, you're saying, to use DO managed db to store stock data and expose it as an api endpoint to use inside each script running..?
Yes
it is not decided yet but not that much, can't tell right now
Anyway... Depending on size of data, i would expect dB will cost pretty cheap.. like 10 or 20$ per month
will this be expensiv3e?
Your data retention makes things easier
It depends on wished reliability requirements
yes this easily do
i wont be having large userbase
If u a good with 99% uptime reliability, it will be cheap
If u wish 99.99%, it will cost u at least 3 or 10 times more
okay then
thank u once again
i shall keep disturbing u look this until i make this work xD
what will be the rate limit of DO's db api?
i am hoping per second
It has no rate limit, beyond which dB u choose, how much hardware CPU/ram/ssd type u selected for it being available
And for how much traffic u wish to pay (digital Ocean I think has traffic less than 4TB as free)
+Depends on how u query data from it
Managed db can be resized at any moment to higher hardware
what if I delete data from DO's mongodb at the end of every day, will it reset my usage? or will it still count...? it might be a stupid question but still... xD
because I won't be needing the same data the next day
In managed database in DO, u pay hourly price for current available limit of SSD storage to database (with data deleting u will reset this limit), CPU to handle heavier queries and more users in parallel (with having single script to write, and nginx caching u will have essentially one user only constantly for writing and reading), and RAM memory (proportional to currently needed ssd/CPU usage)
Traffic limit will not be reset, but u a unlikely to reach its limit in the first place
As long as u aren't resizing to have higher amount of storage/CPU/memory or HA reliability more available to database, u pay same fixed hourly price for current available hardware to database (+additional cost for backups if u selected them)
Data retention/deletion and current chosen caching strategy allows just to keep same small sized database without increasing its costs in a future
Hello Guys, I'm wondering how do you evaluate your skills in SQL (Intermediate, Advanced or otherwise) i've been practicing and working on projects yet i've not been able to identify my level yet.
yeah, btw i think i am probably not having time to setup everything for you, at it can kind of already became many things.
But i can give you link to my this one repo https://github.com/dd84ai/darkbot
Essentially it is having most of things i described to you
scrappy folder is... standalone backend microservice that scraps third party service and collects data into database, and then expose to my other infra.
Well, technically configurator is my rest api for settings
- Project is full of examples how to setup FastAPI + Celery + Celery Beat, and everything documented as a code in docker-composes
Well, except i am using Postgresql here... and i have stuff async here.
flowchart TB
Users-->Frontend
subgraph Web_Infra
subgraph GUI
Frontend
end
subgraph Scrappy
REST_API-->MessageBroker
REST_API-->DatabaseHistoryOfStocks
CeleryWorker-->MessageBroker
CeleryWorker-->DatabaseHistoryOfStocks
CeleryBeat-->MessageBroker
CeleryFlower-->MessageBroker
MessageBroker
end
subgraph Processes
CeleryWorker2-->REST_API
Frontend-->REST_API2
REST_API2-->DatabaseRedis
CeleryWorker2-->MessageBroker2
CeleryWorker2-->DatabaseRedis
CeleryBeat2-->MessageBroker2
end
end
style Web_Infra fill:#a9f,stroke:#333,stroke-width:4px
Estimated look of the possible infra in mermaid js
everything can be merged into one monolith if desired though
how do you put data into a SQL data base
I got this far
@bot.command() async def create(message): print(message.author.id) db = sqlite3.connect('main.sqlite') print('connected') userid = message.author.id #username = message.author.tag cursor = db.cursor() db.commit() cursor.close() db.close()
i would try searching the internet before asking about something so vague. id search insert into sql database that should give you some results to look at sql syntax. you put that syntax inside cursor.execute() to make it happen inside your code. do this before your db.commit() stage so your changes are committed to the database
heres an example of one of the spots i insert into a database
it is not recommended that you do it with f-strings like i do, it leaves you vulnerable to someone dropping your tables by their username and such
Im pretty much trying to learn how to make a database that records peoples user ids. Something simple like that
thanks
thats exactly what that does, but also points, level, and name. and if theres already that user id then it updates their level and points
i came with a question of my own. does anyone know if i can grab columns from all tables in mariadb? im trying to make a global leaderboard command where i make a temp table then i want to INSERT INTO temptable (value1, 2, 3,4) SELECT * FROM alltables ORDER BY value1 DESC;
but im not sure how to select from all tables not just one table
you can UNION the tables together. Just how many of them are there?
i saw some examples with joining the tables together but you needed the names of the tables specifically. right now theres 16 tables and the names of them are discord guild IDs, and there will be more added as the bot grows
I'm not familiar with mariadb and its been a long time since I've use mysql. Though would Prepared statements work using syntax like: https://stackoverflow.com/a/27542965
ill have to look into something like that more tomorrow
the main problem seems to be the varying size of the list of tables as well as being able to specify the names of unknown tables if i were to join them
Am creating a aiomysql pool object at start of program and using it through out the life of program by using it within functions by doing global pool is this good?
Thank you so much i will go through it ๐
Do you really need one table per guild? It would be better to have one users table where the guild_id is a column.
Making separate tables per guild is certainly a mistake, you only need one table for all users and one for all guilds (if you store that info)
As Berndulas said
Hello, I am trying to execute this sql statement: select * from test where Vooropleiding not like lower('%havo%') and Stakingsdatum is not null
(i am working with pandas and sqlite)
What i am trying to do is select EVERYTHING that doesn't match with the above statement. So both conditions need to be true at the same time. However, what I am getting back is results where in either the one condition seems to be true or the other. Basically I want to leave out every row where both conditions at the same time are true.
what am i doing wrong?
Seems to be ok except the lower part:
select * from test where lower(Vooropleiding) not like '%havo%' and Stakingsdatum is not null
oh yes, i tried that too but also didn't work
Wish I had thought of that before lmao I actually have 2 per guild. One for member leveling info and one for all their option info they can set with the bot
Although wouldn't that create an issue if the same user is in 2 guilds?
Wouldn't be able to use user ID as my key I'd have to use an auto increment
Which I guess isn't a big deal
No, you would just use a composite key (two columns key).
Hi I have a few questions
Is the relationship between a doctor and patient ( on to many )
Are you asking what it should be,.or what the diagram shows?
What it should be depends on whether you want to have the.possibility of a patient being associated with multiple doctors or only one.
The line in the diagram suggests one to many but it seems to be attached at random. Instead it should link the specific fields of the foreign key relationship
In this scenario
Many doctors and nurse can be present in an operation
So would they have a relationship of many to one with the operation entity
But a doctor is associated with many patients as well, right? Then many to many
in the diagram, only one doctor is associated with an operation, and no nurses.
whether it's "correct" depends on the real-world setup
e.g. patients and operations should probably be associated directly
every patient can have zero or more operations
some nurses work in the OR and some nurses work in patient rooms. they aren't necessarily the same group of nurses
so you have OR nurses associated with operations and room nurses associated with patients
likewise a doctor could be associated with a patient or a procedure or both
@keen ledge Per Rule 6, your invite link has been removed. If you believe this was a mistake, please let staff know!
Our server rules can be found here: https://pythondiscord.com/pages/rules
CREATE INDEX CONCURRENTLY
ON post (id);
when i try creating concurrent indexes,
I end up with error in sqlalchemy
CREATE INDEX CONCURRENTLY cannot run inside a transactio...
I tried different ways but nothing work
from sqlalchemy import text
from sqlalchemy.engine import create_engine
engine = create_engine(
database.full_url,
pool_pre_ping=False,
echo=True,
isolation_level = "AUTOCOMMIT",
)
connection = engine.connect()
connection.execute(text(schema_sql_code))
any ideas how to run for sure sqlalchemy query without transactions?
recreated even db just to e sure 
Nvm, i am idiot.
i did it.
Issue was that i tried multiple Indexes concurrently to create
CREATE INDEX CONCURRENTLY
ON post (id);
CREATE INDEX CONCURRENTLY
ON post (author_id);
CREATE INDEX CONCURRENTLY
ON post_edition (post_id);
so no matter what options i would have written, SQL statement was transactional already xD
Use alembic ๐
Nope. I go thorough raw SQL company tutorial
Everything must be raaaaw
alembic can render raw sql
I still use pytest, SQLalchemy and mkdocs though
And hacking with base automap to map to ORM and having easy model objects creations in factories for tests
Almost raw ๐
All queries for tutorial tasks itself are raw at least
It is fun tutorial/training. If to be correctly, it is about learning depth of postgresql
I have kind of a db manipulation task in #help-candy
Hello, do yu have some library example that can read an sql file without connecting to a database ?
(user + password)
What do you mean by "read sql file"?
Read tables and datas dumping in file.sql with sql syntax
You can't connect to db without proper authorization
ok i tested other way like psycop2 and sqlite3 (only work for file db) but I think you're right
ty for the response
my goal is to search a solution that can take data dumping and store in json
Just have username and password from database you want to dump ๐คจ
Also postgres has pg_dump util
I'm finding the best way to collect these data without requiring a lot of dependancy
With this solution, I need to install postgres
but np if I have no choice
can i copy data from windows function?
you can save a result of a window function to a temp table and copy it where you like.
can someone explain how the OVER() function works in SQL?
Im trying to create a column containing the index for each row using the ROW_NUMBER(), and I believe this is how its done
SELECT ROW_NUMBER() OVER (ORDER BY _rowid_) AS _rowid_, forest_area_sqkm
FROM (SELECT _rowid_, forest_area_sqkm FROM forest_area
ORDER BY forest_area_sqkm ASC)
Basically creating a new _rowid_ column after sorting the table
OVER() by itself is not a function. what you are looking to learn more about is called window function
and -
A window function performs a calculation across a set of table rows that are somehow related to the current row.
in your query, you are basically ordering the entire subquery dataset by its _rowid_ and then getting the row number of each row without any partitioning
(to explain partitioning, imagine you have column country in the subquery, you can split the dataset by country and get the row number per country by adding PARTITION BY country inside the OVER())
makes sense?
ok so no need for a subquery, this seems to work
SELECT ROW_NUMBER() OVER AS _rowid_, forest_area_sqkm FROM forest_area
ORDER BY forest_area_sqkm ASC'
I've heard of the partition by thing before, but havent used it.
It seems like its basicaly: GROUP BY and then apply the function to each group
correct, if you only use order by and partition by.
however window function is more powerful than this, because you can do a 7-day rolling mean for example (and not just limited to group mean), by using ROWS BETWEEN 7 PRECEDING AND CURRENT ROW or something to that effect.
ok thanks for explaing that, I need more practice with this
my ex-colleague made this http://www.windowfunctions.com/, maybe that's useful ๐คทโโ๏ธ
Thanks! will have a look at it
Hi if i have a comma seperated list, i want to add first element to first column, second elem to second column and third elem to first column and so on
Do i iterate over list to do this
is this a question about sql? or a question about python lists? if the latter, this is a better question for a help channel. read #โ๏ฝhow-to-get-help for instructions.
Is it silly to make some sort of graph representation of the possible combinations of fields in my database?
Hey folks, I need to execute certain raw SQL queries within the scope of a context manager in SQLAlchemy.
One way to execute raw queries in SQLAlchemy is to use the engine's connection. But how can I do this using the session?
My code would look something like the following:
def bulk_create_parsed_messages(session, objects):
session.bulk_save_objects(objects)
with Session.begin() as session:
bulk_create_parsed_messages(session, parsed_message_objects)
for update_query in update_queries_list:
execute_raw_query(session, update_query)
How do I build this execute_raw_query method within the context manager's scope?
Currently using:
def execute_raw_query(query):
result = engine.execution_options(isolation_level="AUTOCOMMIT").execute(sqlalchemy.text(query))
return result
The session object also has an execute method! Issue resolved.
def execute_raw_query(session, query):
result = session.execute(query).all()
return result
Hello, i am using Django for getting data from API and display them in HTML
the thing is that array key has spaces and i cannot call it
anyone has any idea how it should be ?
{% for i in products%}
<tr>
<td></td>
<td>{{ i.Lloji i dokumentit}}</td>
</tr>
{% endfor %}
it does not work
Hi folks, what is the most efficient way to implement ON CONFLICT ... DO NOTHING in SQLAlchemy (1.4) ORM?
Currently while doing bulk insert on a table with multiple objects (PK is pre-generated), for each object, I'm looking up records using PK, if not found I'm appending it to a final list of objects and then at the end doing bulk_save_objects on the list.
But this has latency implications. Please suggest a better approach!
i do this in my not-so-performance-critical tooling
statement = (
insert(self.model.__table__)
.values(
lat=bindparam("lat"),
lng=bindparam("lng"),...
)
.on_conflict_do_update(
index_elements=[self.model.unique_ref],
set_=dict(
lat=bindparam("lat"),
lng=bindparam("lng"),...
),
)
)
db.execute( # type: ignore
statement=statement,
params=[
{
"lat": task.lat,
"lng": task.lng,...
}
for task in snapshot_in.tasks
],
)
obviously you want to swap it out for on conflict do nothing, but the jist of it is use Session.execute's executemany mode (https://docs.sqlalchemy.org/en/20/orm/session_api.html#sqlalchemy.orm.Session.execute), so far no performance concern for me, worth a try for you i guess๐คทโโ๏ธ
Any help ?
Doesn't seems database related, but have you tried i['Lloji i dokumentit'] instead of i.Lloji i dokumentit
sounds like a question for #web-development
I am creating a new database and related schema to be able to monitor an experiment and multiple samples of data. I am thinking of creating a table for every experiment run, and then having a many-to-one relationship of another table (samples) and having a key in the samples table relating to the experiment created in the experiment table.
Because the samples taken happen almost immediately after the experiment is started and an experiment record is created, I am wondering what the best way to keep track of the ROWID pk (sqlite3) key would be so that I could insert this as part of the sample record created. Is it appropriate to do something like SELECT MAX(ROWID) FROM experiment LIMIT 1; and set the result to a global to share with the rest of the program?
"Use a line chart to display the monthly Fahrenheit temperatures in Stavanger in 2017."
Idk how to specify to only focus on the city of "Stavarnger" in my dataset
Do I really need a relational database? Can't I just start with one? My project is an inventory database. I could see it expanding, but all I hear is relational database which implies multiple tables
You can use a use a relational database (e.g. SQLite) and only use a single table if that's what's appropriate for your use case
Without details of any sort it's hard to answer but more than likely multiple tables will be beneficial
Thanks all. My project CAN have one huge table, but I suppoe the benefit of multiple is that the size of [Table 1 , Table 2] << merge(Table1,Table2)
I'll start with one, get stuff working, them move onto a relational one later. Let me know if that's a terrible idea tho
To be clear, a single table in a relational database is still a relational database. If you're talking about starting with a non-relational (schemaless) database like Mongo, that's significantly more complicated to change later and probably not a good idea
ahhh thank you for that distinction! Taking from https://www.quora.com/Can-Be-tables-in-a-relational-DB-without-any-relation-for-example-a-user-table-just-for-login
"You could have a relational database with many tables, all unrelated to each other, as long as each table is a relation.
Informally, a relation is a set of rows, where every row has the same columns, and there are no duplicate rows. Thatโs it. This includes a lone table that has no related tables."
I don't think i agree with that, mongo specifically... maybe, because people tend to be bad at creating database structures without being screamed at. But it's definitely not more complicated than setting up a relational structure compared to something like a wide column DB.
Just because something is not relational does not mean it's not schemaful or cant have types, Mongo is just a bad example.
provide good example?
I would default to relational databases, unless a non-relational one can be justified (which is quite rare and hyper specific).
No the benefit is in DRY. Don't repeat yourself.
Say you have a vendor with each item in your inventory, possibly multiple vendors. It's way better to give the vendors their own table.
Or say your inventory has classifications, you might want to extract that to it's own table so if you need to update a classification it'll change for all inventory
who could help me to make a login system in a python program but the user and pass are hosted on rental.co or other?
supporting @keen minnow comment. And @fading patrol comment says it correctly.
The main advantage of relational database is data integrity with migration of data between different application versions.
U can be confident in relational database... that for each ForeignKey / link from one table to another table that resource is present.
That all constraints regarding Not Null elements, or elemements having certain type, or being in certain range and other rules are enforced.
That all tables have exactly columns/fields as your current migrational code tells it to have.
That gives serious boost already to starting application. And especially crucial when your application code will become to change, and you will need your already deployed database data schema to match new state of application code.
Using non relational databases for main db => easy way to... painful, very painful situations later. Quick way to death from tech debt xD
could someone help me with a query? i'm trying to count the number of times users have said no to a house after a viewing (which is recorded as false), but if i use COUNT(renter_accepted = false), it groups the count of all false values under just the single user. how can i edit this so that it counts each time an individual user has said no?
edit: working now
SELECT
renter_id,
CONCAT(SUBSTRING(first_name, 1, 1), ' ', SUBSTRING(last_name, 1, 1)) AS initials,
COUNT(renter_accepted = false) AS properties_rejected
FROM appointments a
INNER JOIN (SELECT * FROM renters) rent ON (a.renter_id = rent.id)
WHERE renter_accepted IS false
GROUP BY renter_id;
example: without the COUNT() column, there are 2 user_ids who have rejected a viewing, but if i put COUNT() back in, it gives them all under user_id 1
nevermind, fixed it by adding GROUP BY renter_id after the where clause
using SUM(IF(renter_accepted = false, 1, 0)) matches the behaviour i was after
i hope this is useful for someone ๐คฃ
You could just SUM(IF(renter_accepted, 0, 1)) for false and SUM(IF(renter_accepted, 1, 0)) for true.
The SQL way is SUM(CASE WHEN renter_accepted THEN 0 ELSE 1 END) (and vice vers).
@bot.event
async def on_ready():
print(f"{bot.user.name} is online!")
await bot.change_presence(status=nextcord.Status.idle, activity=nextcord.Activity(type=nextcord.ActivityType.watching, name="Prefix !"))
async with aiosqlite.connect("main.db") as db:
async with db.cursor() as cursor:
await cursor.execute('CREATE TABLE IF NOT EXISTS role (role_id INTEGER , guild INTEGER)')
await db.commit()
@bot.command()
async def set_staff(ctx):
async with aiosqlite.connect("main.db") as db:
async with db.cursor() as cursor:
await cursor.execute('SELECT role_id FROM role WHERE guild = ?', (ctx.guild.id,))
data = await cursor.fetchone()
if data:
await cursor.execute('UPDATE role SET role_id = ? WHERE guild = ?', (ctx.guild.get_role, ctx.guild.id,))
else:
await cursor.execute('INSERT INTO role (role_id, guild) VALUES (?, ?)', (ctx.guild.get_role, ctx.guild.id,))
await db.commit()```
nextcord.ext.commands.errors.CommandInvokeError: Command raised an exception: InterfaceError: Error binding parameter 0 - probably unsupported type.```
what is a cursor connection i read up in google and i still dont understand
Don't overthink it, seems abstract but it's simple to use. You have a connection to the database and a cursor to issue commands. You need both to do anything, connection first. Keep the connection as long as you need, but cursors can be closed and opened as much as you want on the same connection
The way to understand is to use it. Show your code if you're stuck on something
# creating a connection cursor if needed
async with connection.cursor() as cursor:
await cursor.fetch("FETCH QUERY...")
```trying to understand this
You have an existing connection and in that code you're just creating a cursor to perform a query.
i see
thanks for ur explanation :D
Hi, I'm working on a dialect for SQLAlchemy for InterSystems IRIS Database, proprietary but with Community Edition available
If anyone, could give at a try, and give some feedback?
https://pypi.org/project/sqlalchemy-iris/
I have a table like this
items (
id SERIAL,
name TEXT,
condition INT,
PRIMARY KEY(id)
)
the item is identified by name and condition, but condition can be null, how can I set this as constraint? would UNIQUE(name, condition) work? I'm using postgresql
hey so i have this json file:
{"mojito": {
"main" : "rum",
"secondary" : ["lime", "sugar", "mint", "soda"]
}}
```and this is my code
```python
import json
with open('cocktails.json', "rb") as f:
cocktails = json.load(f)
ingredient = [cocktail for cocktail in cocktails if "sugar" in cocktail.get("main")]
i think you get the point but for some reason im getting this attribute error:
AttributeError: 'str' object has no attribute 'get'
my head is fried trying to think of this, im trying to create a simple inventory management.
How would i link multiple items into one inventory?
Is there a good website/ document i can read about this?
Have a look at cocktails... I think you'll find it's value is not what you think it is. Maybe you mean to iterate through cocktails["mojito"]["secondary"]
What exactly is the problem you're trying to solve here? For example is it that you are using a SQLite database or something else and have a doubt about how to structure your tables?
im new to databases so i thought id try a small inv management project to start, im using mysql and just wrapping my head around is a but blowing atm lol
if you don't want to reinvent the wheel this looks like a good start: https://mysql.tutorials24x7.com/blog/guide-to-design-database-for-inventory-management-system-in-mysql
cheers ill have a look
this is a perfect start cheers, the bit thats really puzzling me which unfortunately this doesn't include is say there was 2 uses with two separate inventories, how would you save/store the products in each of these inventories. like would it be a a new table for each inventory?
Yo
I don't think I follow what you mean by "two separate inventories". In that example the database itself is an overall inventory.
There is no inventories table, but you could design one however you like
And if there would be multiple inventories with multiple items and those same items appearing in multiple inventories, that's what's called a many-to-many relationship and you need an intermediate table that just tracks each connection
thats what i mean it doesnt explain that, id like to try make something with multiple users
ahh okay, right ill look into that
How to make the time saved and continue after restarting the bot. I know to keep the time, but how do I take the time and continue it after a reboot? (postgre)
My mongo Collection Structure :
{ _id : "value" , "name" : "Vegeta" "data" : [{d: "2022-05-16" , v : 10.06}, {d: "2022-05-18" , v : 9.11}, {d: "2022-05-20" , v : 7.06}]
Now I want to make a query say
{name : "something" , "data.d" : "2022-05-18"} --> this filter returns v = 9.11 {name : "something" , "data.d" : "2022-05-17"} --> this filter also should return v = 10.06 {name : "something" , "data.d" : "2022-05-19"} --> this filter also should return v = 9.11
Basically I want to the return the result from data which matches the given date if not present then return the previous date data.
I have tried using elemmatch. Unwind works for me but need to increase the performance of the query.If possible can we do in the mongo shell itself.
You'll probably need to share your code before anyone can understand what problem you're trying to solve there
When using sqlalchemy (in conjunction with alembic) echo=True can be passed to create_engine, along with some logging settings, in order to log the raw DDL commands to a file.
It also prints all the commands out to the terminal as well - which is annoying, I just want them logged to the file.
Is anyone with a setting / approach that will enable me to log the raw DDL commands to a file, but not print them out to the console ?
I don't have a direct answer for you, but I would look at https://docs.python.org/3/howto/logging.html. I think the logging being done in sqlalchemy is just normal python logging. You should be able to grab that logger and alter how it outputs
Thanks - yeah for some reason i can only seem to get it working if i have the logger setup and echo=True... I get a file output, but also output to the screen.
logging.basicConfig()
data_logger = logging.getLogger("sqlalchemy")
data_logger.setLevel(logging.DEBUG)
data_logger_output = logging.FileHandler("ddl.txt")
data_logger.addHandler(data_logger_output)
was the logging config that i used, along with echo = True.
no - create_engine( ... , echo = True)
yeah echo=True logs to stdout... Which isn't what you want, but when you turn it off, you're saying the file logging doesn't work?
yea - if i turn that off then the log file has nothing :S
@clear stirrup ok it doesn't have nothing - it just has Connection / Pool stuff ๐ค
none of the ddl
this is what's in the output log file if i don't have echo=True within create_engine
Sorry, I'm not sure. Is it possible that those logs are being output under a name other than "sqlalchemy"?
i'm not sure really - i tried sqlalchemy.engine as well
How can one define SQLAlchemy tables in a separate python script file then import it into the main script?
what if you just
logging.basicConfig(filename='ddl.txt', level=logging.DEBUG)
without the rest. Does it show up then?
@rugged locust #databases message spam
@clear stirrup ๐ค I'm not sure I'll try that now
it's going to log more than you want probably, but maybe it will show up then
i was hoping to be able to get a reproducible DDL script but i'll settle for something i can see at this point ๐
OK just ran that @clear stirrup - didn't work unfortunately
(it was empty)
oh
maybe it's an alembic thing
where are you defining the logger? Somewhere at the beginning of the alembic script? Not familiar with alembic
I think it does need to be defined immediately
it's in the env.py script in the alembic file - and i'm defining it at the top of that
just found this https://stackoverflow.com/questions/42427487/using-alembic-config-main-redirects-log-output , dk if it's useful yet
the disable_existing_loggers kwarg seems promising
yea - i seem to get everything but DDL now
going to leave it for the eve, uk time ๐ฉ
never managed to create a column constraint, am struggling to argue to myself for alembic + sqla over just writing ddl scripts in sql atm
bleugh, sorry. Hope you figure it out or someone else more knowledgeable comes along
no worries, thanks for the help ๐
Should be a basic import... If you're having trouble, show what you're trying. Maybe this is relevant: https://stackoverflow.com/questions/58512291/importing-sqlalchemy-models-used-in-relationship
nice, thanks
can I get this functionality some other way preferably in the table definition? I use postgresql https://cdn.discordapp.com/attachments/268002637905920001/1042726611847483492/image.png
Hi guys, long time lurker first time poster ๐
I've been scratching my head trying to get a dynamodb query to work for a few days now.
Is anyone aware of how I can essentially do a "select * from my_table where attribute1 in ('group1','group2','group3')" in DynamodDB?
Example of code:
my_table = dynamodb.Table('my_table')
my_group = ['group1','group2','group3']
response = my_table.scan(
ย ย ย ย FilterExpression = Attr('attribute1').eq('group1') or Attr('attribute1').eq('group2') or Attr('attribute1').eq('group3'),
#FilterExpression = Attr('attribute1').contains(my_group),
)
print(response["Items"])
Query and Scan are different in Dynamodb
@upbeat bronze
This should help if you working with Scan.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.Scan.html
Thanks @dusk patio - I can perform scans and queries without this particular filter expression without any problems.
What I'm just trying to do is scan for results where the value of an attribute is present in a list e.g. If this was tsql:
Select * from my table where attribute in ("option1", "option2")
In filter expression, can you try using bitwise operator | instead of or
(According to this: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/dynamodb.html#querying-and-scanning:~:text=You are also able to chain conditions together using the logical operators%3A %26 (and)%2C | (or)%2C and ~ (not). For example%2C this scans for all users whose first_name starts with J and whose account_type is super_user%3A )
@upbeat bronze
@bruhhh you absolute legend. I replaced the "or" for "|" and it seems to be working as expected now.
I've just manually typed in the attributes but I'm just going to write some logic to build the filter expression and then test again.
Will keep you updated!
The link you've send above doesn't seem to be working though could you resend it with a working link? I would be interested to take a look at that example.
Are you getting an error message?
Looking at the docs the boto3 Attr has an is_in method? https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/dynamodb.html#boto3.dynamodb.conditions.Attr.is_in
my_table = dynamodb.Table('my_table')
my_group = ['group1','group2','group3']
response = my_table.scan(
ย ย ย ย FilterExpression = Attr('attribute1').is_in(my_group)
)
print(response["Items"])
Ok, but you seem to imply that a relational database must have multiple tables. Can I start with a single table that is a relational database? I'm not at production yet
Most of what Darkwind says there would still be relevant with one table... The foreign key part is a clear exception
Most of my stuff will indeed apply even to one table
As for having one table or multiple, read the manga, it will explain difference between denormalized and normalized data up to third level
In general it is very good to have data normalized in order to utilize relational db to full capacity, in order to have access to its rich query syntax asking for your DRY data in whatever way u need combined
LOL that cover
Got a situation for a complex query.
Let's say I've got a table called CarParkings with the carparking ID, license plate number, car make, car model, dateCreated and dateFulfilled.
Let's say that a reservation can be made for a car, giving the make and model, or specifically giving the license plate number, with the "arrived" datetime set to null. When going around the parking lot to check the cars, you first try to update a carparking reservation with the matching license plate (a specific reservation) where you update the arrived date, if you can't, you try to update a carparking reservation with the make and model only given (a general reservation) where you set the license plate and mark the arrived date, and if you can't find a reservation for either, insert a new reservation for this make, model and license plate, and mark it as created and fulfilled at current timestamp.
So like,
if specific reservation found, try to update it, else, if general reservation found, try to update it, else, insert a new reservation and mark it as arrived.
How do I translate that into a SQL query?
How to store the time in the database so that after restarting the bot, it continues and does not reset? (postgre)
This seems like a good case to use an UPSERT .
Sticking a timestamp in a table is easy enough. But without seeing code we can't guess how your bot works and what you mean by continuing/ not resetting
Perhaps easiest to have your bot inserting into a bot_log table that has a auto timestamp periodically while its alive. And when it restarts it just gets the max of the timestamp out of the log.
Do not create custom time measurers, use timestamps (Datetime)
Just store in SQL timestamp of your start.
And query time difference with now from it
(And therefore no need each second updating dB xD)
But it feels more complex than an upsert
Because of updating under two separate conditions
I do it through tasks, I want that after restarting the bot, the time for the team is not reset, but continues to go
When it is supposed to be reset and when not?
After restarting the bot, the time is reset, but I want it to continue
If both specific and general conditions have the same unique keys I don't see the difference.
Well, do not reset time then after restart ๐คฃ
Just do not send query that updates timestamp of your start, if u use timestamps
We don't know the unique ID (carparking ID). It's an auto-generated ID for every carparking. This table is kinda like a transaction table.
I understand. Though if you can always derive the auto-generated ID from the combination of license plate number, car make, car model, dateCreated then you can define this combination as a unique constraint and use it as a UPSERT on conflict condition.
would this be right? im new to learning sql and im using this website https://sqlbolt.com/
the example says this
SELECT column, another_column, โฆ
FROM mytable;
wait nvm i got it
i needed to just remove the *
Is sqlite3 or aiosqlite more suitable for making lots of quick/simple queries into a database?
sry but can u do it this way ๐
async with self.bot.db_pool.acquire() as connection:
matchcreated = await connection.execute(f"SELECT * FROM match WHERE matchid = {ctx.channel.id}")
third level? >< well the definition of relational database being no rows repeating and all rows having the same columns seems easy enough to start with and what should happen anyway. I"ll read the manga i guess lmao
what does normalized/denormalized mean in this context? Probably in te manga
there is 0NF 1NF 2NF and 3NF i believe that is what he is talking about
fucking hilarious... https://oberstar.eu.org/share/Documents/The-Manga-guide-to-databases.pdf
manga guide...
every entry should be scalar in a row/column right?
seems like definitions i found didn't specify that
thanks! this is good stuff. seems like you increase the number of total rows by going up normalization levels.
basically yes but no
in 1NF we make one table for each entity
in 2NF we destroy any half related (whatever it is)
in 3NF we destroy Transitive (whatever it is)
my school teaches computer science in malay so ye
i only know the malay term
async def exec_write_query(self, query: str, data: typing.Optional[tuple] = None) -> None:
if data:
await self.pool.execute(query, *data)
return
await self.pool.execute(query)
```what is the `if data:` for
@west hill
already answered it in #discord-bots
wait really?
its for parameterized queries
yep
but what happens during if data tho
it unpacks data and executes the query
and if data is None it just executes the statement itself
data is the argument given?
yeah a tuple of stuff
i see
i still dont understand why they make the syntax so confusing
it was me ๐
and now i dont even know how to start mine
i made it
abstraction 
how would i go about inputting a class object into a database?
for context im trying to create a save and load system for my discord bot game.
i have stuff like the below;
class dagger:
name = 'Dagger'
dmg = 4
value = 25
class game:
name = 'bob'
money = 70
weps = [Dagger]
i get how id write and load the name, money etc into the db, but how would i go about doing it with an object? esp as i have objects w diff attributes
so the query is just a string and i can make it a formatted string with variables inside right?
not recommended coz injection attacks
better to use parameterized queries
injection attacks?
its a monopoly bot u see
google it
most of the value is just easy integer
and some is just ctx.channel.id and ctx.author.id
i tried creating a iter of the dagger class and input that objects stats but i cant figure how to do that; plus idk how id load it again and reverse the process
so can i go
'SELECT * FROM match WHERE playerid = {ctx.author.id}'
ok
Helo
If I have a relation {A, B, C} and the candidate key is {A, B, C} with no non trivial functional deps then what would the nf be?
I think the FDs wuld all be Prime -> Prime so it in 3NF?
Hi, using sqlalchemy, does anyone know what is the column type for a list of key value pairs please?
I thought it might be Column(ARRAY(dict)) but I get an error
Hello, I'm having trouble with SQLAlchemy, maybe someone here can help me. I want to ensure that a table is referenced by at least a fk of another table. Maybe doing a check on create/update. How can I achieve this?
Normalisation level is level of how DRY data in relational database. In maximum level, all data is supposed to take smallest possible size
Holy shit! I don't know how I missed this but it works ๐ my Google fu failed me on this occasion! Many thanks for this @grim vault
Hey I am new to SQL
Currently working on MySQL with python
I have to connect two tables A and B
A having primary key(auto increment) and Category
B having Sub-Category
so how do I connect these tables with a foreign key?
In this case Category is also unique but contains VARCHAR so cant implement autoincrement
and Subcategory is a part of categories also contains VARCHAR
Each Category having multiple Sub-categories
sounds like a regular one to many relationship resolved with just one Foreign Key in table B
have unique autoincremented A_id in A
and unique autoincremented B_id in B (just for consistency and future)
write to table B foreign key that references A_id
???
PROFFIT!
I have created two columns in A A_id and Category
and created three columns in B B_id , Subcategory, categoryid (which will use for foreign key)
but now the problem is if I want to insert data I have two use the ID as a foreign key
but i don't want to do that
I want my foreign key linked with the category name
so is it possible to do that?
and do i have to make three columns in Table B?
yes, u can use category name as a foreign key too.
but it will break if u will have any category name with spaces in it
so u would have to be very careful then. in general it is bad idea and should be avoided
U can mitigate damage by creating check/constraint that verifies no names with space were added
plus integers have better performance
additional danger to using varchar as foreign keys unique to some databases (mysql can be in it), some values are case insensetive treated.
And do I need to add three columns in Table B ?
u can have added only one for it to work, or if desired u can add two or three, it will work anyway, your choice
CREATE TABLE JobType1 (Category_ID INT AUTO_INCREMENT PRIMARY KEY, Category VARCHAR(50))
CREATE TABLE JobType2 (Sub_Category_ID INT AUTO_INCREMENT PRIMARY KEY, Sub_Category VARCHAR(50), Category_ID INT, FOREIGN KEY(Category_ID) REFERENCES JobType1(Category_ID))
These are the two queries
So if i remove third column from table 2 then where i assign foreign key ?
Sqlalchemy: how can i create a primary key with an autoincrementing id so that when column 2 is deleted then column 3 gets the id 2 and all foreign keys refrencing to column3 are updated to 2?
Is it a problem if I have like 1000 indexes in a collection (mongoDB)?
I'm trying to figure out the best possible way to organize it, and there is a variable that takes up about 1000 different values, each having its own numeric value as well. So I was thinking to use the variable's text value as a key and the number as a value, but then I'd have to add an index to each 1000 different key
the result would be that I could search the 30 required items (having the same number on the same key) very quickly over 1.5 million entries, which is what I try to achieve
I just don't know if this would be a bad practice and there might be a similarly fast solution that requires fewer indexes
Why would you want to do that? I suspect there's a better way to achieve whatever you want to achieve.
I have no idea what u have there, but very highly likely you are digging yourself a pithole with just using wrong database.
perhaps perhaps. Postgresql has syntax that with one declared (inverted) index maps everything in such values as json.
but probably u could be needing Elastic Search if you have such complex text like data
Anyone have idea, how can I insert data from an array in mySql
Are you asking how to do this in Python? Like this: https://www.w3schools.com/python/python_mysql_insert.asp
yes but it is not working, because I have data in this format:
list = ["Lorem", "Ipsum", "Nor", "Neither", "etc..."]
I want to insert this data in database
How do i do that?
A Computer Science portal for geeks. It contains well written, well thought and well explained computer science and programming articles, quizzes and practice/competitive programming/company interview Questions.
this is also not working
because it creates the whole list as a tuple not the content inside list
tuple = ("Lorem", "Ipsum", "Nor", "Neither", "etc...")
like this but I want data like this:
data = [("Lorem"), ("Ipsum"), ("Nor"), ("Neither"), ("etc...")]
you need a comma , inside each () to make it a tuple. otherwise (x) is just x
that is, (x,) is a length-1 tuple, but (x) is just x
this is just an ugly python syntax quirk
I only search for exact matches, so I don't need elastic. I don't intend to work on anything using mongo after this, but I'm not going to rewrite this entire database to postgres
@silver crow the answer of course is to loop over elements of the list, wrapping each one in a tuple; [(x,) for x in data]
query = 'INSERT INTO ...'
params = [(x,) for x in data]
cursor.executemany(query, params)
does anyone know if this is a good video to watch for sql https://youtu.be/HXV3zeQKqGY
In this course, we'll be looking at database management basics and SQL using the MySQL RDBMS.
Want more from Mike? He's starting a coding RPG/Bootcamp - https://simulator.dev/
The course is designed for beginners to SQL and database management systems, and will introduce common database management topics.
Throughout the course we'll be lookin...
if for some reason you feel like you need to watch a video, sure, freecodecamps is popular and I've heard no complaints
but you won't learn until you do stuff, and there are lots of interactive sites/apps like Datacamp, Dataquest, etc. that cover the basics of SQL
Iโm fine with interactive lessons, but reading it all makes me inattentive
Like I learn better by someone teaching me like on a video and showing examples
yeah, those sites I mentioned aren't heavy on reading so hopefully you find them useful in combination with the video
@commands.command()
@commands.cooldown(1, 1, commands.BucketType.user)
async def test2(self, ctx):
try:
await ctx.send('1')
async with connection.cursor() as cursor:
await cursor.fetch("SELECT part_id, part_name FROM parts ORDER BY part_name")
rows = cur.fetchall()
await ctx.send('2')
await ctx.send(rows)
except Exception as e:
print(e)
```how do i define connection in this case
How to make database?
Hello. I am using python aiomysql, and I am trying to create a table with the condition it doesnt already exist. why doesn't it work?
await cur.execute("CREATE TABLE IF NOT EXISTS tweets(id INT)")? ๐ค
Idk
u can do it basically anywhere
online, in ur desktop and even on a piece of paper
is there a discord group like this to solve mysql queries
flips pages in bold YOULL NEVER LEARN IT PEASANT
relax my man
nah dawg im chill its freezing rn
im using psql after long time
did psql have password
was i supposed to remember it
cause i forgot
or did it have a default pass
ahh nvm
i just went to pgadmin4
and changed pass
items PK(id, date)
| id | value | date |
| 3 | 30 | 2022 |
| 4 | 40 | 2021 |
| 3 | 59 | 2019 |
I have something like this, I need to get the latest price, can I get it without a nested query? this was my attempt
SELECT id, value, date
FROM items
GROUP BY id
ORDER BY date DESC
but value is not related to the aggregate function so I get an error
You need a window function or a sub-query.
SELECT *
FROM items AS a
WHERE date = (SELECT MAX(date) FROM items AS b WHERE b.id = a.id);
wait would this work?
SELECT id, value, date
FROM items
GROUP BY id, value, date
ORDER BY date DESC
No, you'll get all rows.
SELECT *
FROM (SELECT *, row_number() OVER (PARTITION BY id ORDER BY date DESC) AS row_nr
FROM items) AS a
WHERE row_nr = 1;
This happens when im trying to run the bot API web.
I suppose some model issues.. OAuth2 tokens are not being saved to db
class User(DiscordEntity):
"""Base User model, meant to be reference by extensions"""
async def fetch(self, bot: InteractionBot) -> DUser | None:
"""Returns the associated user id for the given user Model."""
return bot.get_user(self.id)
class Welcomer(Guild, Model):
guild = OneToOneField("Bot.Guild")
channel_id = BigIntField(null=False)
enabled = BooleanField(default=False)
title = TextField(null=True)
description = TextField(null=True)
icon = TextField(null=True)
image = TextField(null=True)
color = TextField(null=True)
footer_text = TextField(null=True)
footer_icon = TextField(null=True)
class OAuth2User(Model):
"""Oauth2 User."""
id: str
token: str
expires_in: int
oauth_response = await OAuth2User.get(id=member.id)
print(oauth_response.token)
add_to_guild(oauth_response.token, member.guild.id, member.id)
I also tried printing the oauth token to see if there's something
Hey can someone please check my schema in free time and tell me, if there is anything i could do better? Thank you
Hastebin is a free web-based pastebin service for storing and sharing text and code snippets with anyone. Get started now.
What are you doing and what doesn't work?
I'm trying to learn SQL Queries, i want to insert a email,username and a hashed password into a table
but it seems to not work
Is there an error?
Nope
Are you committing the changes to the database?
do you need to commit
Does that work?
probably cursor.commit() i imagine
Oh yes that worked, but it is '%s'
I tried to use the santization way you showed me
cursor.execute("INSERT INTO lookup (username) VALUES ('%s')", ("test"))
Can you post the actual code? Not everyone can read screenshots.
You'll want ("test",) at the end. Make it a tuple.
you probably don't want to wrap it in '' you probably just want %s on it's own.
mysql.connector.errors.ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '%s)' at line 1
send the full query
cursor.execute("INSERT INTO lookup (username) VALUES (%s)", ("test"))
Okay, adding a comma at the end of the tuple made it work
("test") isn't a tuple. Adding the comma makes it a tuple.
Quirk of ()s in the language. ("test") is just "test" while ("test",) is a one entry tuple.
Thanks
whats the best module to open csv file?
Hello there! I'm trying to make a function to add a new table a database in sqlite3 with this function
def create_table(table_name):
connection = sql.connect(f'{directory}\main.sqlite')
cursor = connection.cursor()
cursor.execute("CREATE TABLE ? (name TEXT, price REAL, url TEXT)",(table_name,))
connection.close()``` but I get the following error: `cursor.execute("create table ? (name text, price real, url text)",(table_name,)) sqlite3.OperationalError: near "?": syntax error`. I refuse to use f-string to avoid problems but I am a bit rusty on the use of `?`. Does anyone know how to fix that? Apparently I can't use the `?` for table name assignement ?
And I refuse to use f-sting
you cannot do parametrised queries on table name, column names, etc...
You shouldn't be in a situation where you're creating tables based on user input though, so technically it should be safe to format the string.
if you are in the case where you're generating tables based on user input, then you're probably going the wrong direction
What is the fastest SQL Server for Reading and writing to, i am storing 20 billion records and i want the fastest response time possible, i am experimenting this on a Azure Server with fast NVMe speeds.
Hello, I need a db and I am stumbling at the first hurdle. All I have done is follow the unified sqlalchemy tutorial and step 1 fails for me. I cannot from sqlalchemy.orm import DeclarativeBase because I cannot import anything at all from sqlalchemy.orm, I get ImportError: cannot import name 'DeclarativeBase' from 'sqlalchemy.orm'. This is a default brand new installation (1.4.44) and therefore there is little for me to debug. Can anyone offer any suggestions?
ok cool so installed beta 2.0.0b3 and now it functions, despite the documentation being clear that this tutorial is also for version 1.4 ("SQLAlchemy 1.4 / 2.0 Tutorial"), but this new declarative format does not appear to function in 1.4. This is not indicated in the documentation, so I will just stay with 2.0 because I have no idea what parts of the doc will work for 1.4, despite the fact the rest of the doc had thus far indicated when there was a version mismatch and the new DeclarativeBase is not mentioned in the "migrating from 1.4" docs. Truly, open source is a blessing but also a blazing, hellish curse.
Do you actually need SQL relations and JOINs or can you just use a wide column.
If the answer to that is yes then use Scylla and CQL over SQL.
otherwise:
- if it's timeseries data and heavily relational:
timescaleDB - If it's timeseries and not heavily relation (basically no joins):
clickhouse - If it's anything else:
postgres
We can add here that Apache Cassandra Cassandra is greatly scalable for writing
+As universal writing/reading thing can be used Mongo DB... At least works well for Amazon
But in reality the fastest alive there is Redis for writing and reading.... For 20 million records u can easily run out RAM though, plus u can easily have data lost since it is in Ram and only sometimes synced to drive (it has difference mods though)
you could also do something like dumping data into some super fast blob/schemaless storage and processing it later into a data warehouse
idk how fast blob stores actually are though, never benchmarked