#databases
1 messages · Page 81 of 1
I've some background in analytics, so that's not my concern.. my concern is more of how to support mobile applications because I don't have any background with those
no but like what kind of analytics are you expecting to be doing
e.g.
i have a sensor
i want a realtime display of what that sensor is putting out on my phone, with full sampling granularity
vs
i have a fitbit, i want a daily summary of my sensor data
completely different solutions will meet those needs
yes, but I'm not going to be involved with serving the mobile application.. and focusing right now on how to get the data to the application and making it available to its developers
yeah i can't really tell what your scenario is. Things that might be worth checking out: ksqldb, postgres, clickhouse, apache flink, spark-streaming
if all you need to do is sink a high resolution kafka feed and summarise it into time buckets, clickhouse can do all that in one shot
i'd still be moving data out of clickhouse into pg for long term storage though
as far as autoscaling, nah. Just pick a large enough cluster size from the start. And how you scale up/out will be governed by what you're doing (read heavy vs write heavy etc)
i'd say 19/20 times i encounter organisations trying to manage complex clusters for their data, the whole thing could have been done on 1-4 high spec servers instead and cost a fuckton less in both development and opex
spark-streaming for?
I'll take a look at clickhouse to serve as a mobile application backend for data upto 1 year, and if users want to retrieve data beyond that from postgres, would that work?
no idea because you never described what the actual usage patterns for the data are
clickhouse is an analytics column store
if that's what your workloads look like, great you can serve directly from it
ok, my usage pattern is this:
- Users want to retrieve historical data for their device. 2. Users want to see a live feed of their device data.
All analytics related to the data being served on the mobile application, happens on the mobile application.
why doesn't the live feed get done locally on the phone
why does it need to traverse through your infra
because the live feed is from an iot device that might be further away from the mobile phone.. and connectivity to the iot device is managed by the network
ok well if you want full sampling resolution live feeds with things that emit something like 1 datum per 1-10 seconds
databases are pretty much out
that's where something like flink or spark-streaming can come in, and you then stream the result to the end user
while also dumping it into a db for long term history
ok, that solves 1.
also if you don't actually need to do any complex processing or multiplexing of the data you can probably just take your kafka stream and dump it into a websocket server to feed to the end users
pardon my ignorance, but I'm only aware of how we can run queries on a DB.. so wouldn't there be a huge load on the DB service when there's hundred thousand users trying to query for historical data?
historical data comes from a db
realtime stream comes from the realtime hot path
an example topology would be iot devices -> kafka -> (streaming server, write records to db)
I was hoping to cut out reliance on traditional db and retain data up to a year on ksqldb
then i might query the historical db for records up until now to initialise my chart and then start appending the realtime feed records to it as they come in
you might be able to do that, i'm trying to answer similar questions and trialling it at the moment
but i'm treating it the same way as clickhouse, an exotic accelerator and not a viable primary store
with something like pg there's high reliability and well documented patterns for how not to lose data and recover from fuckups
with the exotic stuff like ch and ksqldb it's anyone's guess
you also need to think about the ratio of writers to readers
if i have 10 readers for every writer than i might have 1 node sinking data but then replicating it to 10 nodes for serving it
also if people are actually querying a year of data at high res, that's a fuckton of bandwidth
you might be bound by the io and network throughput rather than the database itself
what you're asking is a pretty big architectural design thing not just a database question
and i can't stress this enough, if you don't actually have hundreds of thousands of users querying and pushing data on a constant basis, don't worry about any of this. Worry about getting users first
I understand.. but we do have lots of resources and a lot of load
I dont have background in databases or mobile applications.. which is why I'm trying to find alternatives so no one else can screw this up for me by asking me to support something legacy
what do you mean by the 10 readers part, I dont get it
for each iot device or whatever source you have
how many devices are watching the realtime stream or querying for it
https://www.enterprisedb.com might be worth considering
it auto scales so there's that if you really want it
is there something open source I dont have to pay for...
else I could just pay the license for ksqldb
ksqldb is open source what license
yeah, it's not..
?
you're not making a competing saas offering
the only way to trip that clause if you try to take ksqldb and extend or rebrand it and then compete in the same space
and as far as i know, there's no option to get a license that negates it?
the second part
Excluded purpose
so I can't make a PaaS offering that incorporates it, that competes with any service that offers it
again idk what you're doing, but 'competes with' is a pretty narrow envelope
I'm building something like cloud data fusion, but where people can manage their iot workflows
i'm not a lawyer but our legal dept did scan this and approved it for our paas project
check out that book i linked
it will help you figure out the right stuff to consider
ok
the main issue i have with ksqldb is it's very immature, not super well documented, and doesn't yet have a community critical mass. So if you run into some issues, you're pretty much on your own
I'll go through the legal terms with someone.. it got me wondering if in case I use any other traditional db, do I still have to pay a license.. because I'm offering it up as a service
and since i'm offering it on our infrastructure.. does that mean I'm liable just like a cloud provider would be
yeah that can get complex. For permissive licenses it's pretty much do whatever you want, for licenses like the confluent one it's generally 'don't just take our shit and extend/rebrand it to compete with us'. Incorporating those components as part of a larger composed offering is generally fine but again IANAL check with your lawyers
i think the confluent license is specifically targeted at AWS
they're notorious for taking foss projects, slapping a new name and some minor changes to them, and then offering them up as saas/paas
lol
I think I need to read an entire book on databases before Im going to be good at designing them.
Hi !
I'm trying to finish a query and don't understand how to implement my last statement :
SELECT
Relic.Name,
Relic.Era,
Relic.Quality,
GROUP_CONCAT(`Pseudo`) AS Pseudo_g,
GROUP_CONCAT(`Quantity`) AS Number_g
FROM
Relic
WHERE
Relic.Era = 'Lith'
INNER JOIN User ON Relic.IDOwner = User.IDUser
GROUP BY
Relic.Name,
Relic.Era,
Relic.Quality
I want to add a
WHERE Era = 'Lith'
But everytime i get syntax errors :/
Thans in advance !
Look, i edited it above
All the query without the WHERE was workin
But i want to add a layer of filtering
SELECT
Relic.Name,
Relic.Era,
Relic.Quality,
GROUP_CONCAT(`Pseudo`) AS Pseudo_g,
GROUP_CONCAT(`Quantity`) AS Number_g
FROM
Relic
INNER JOIN User ON Relic.IDOwner = User.IDUser
WHERE
Relic.Era = 'Lith'
GROUP BY
Relic.Name,
Relic.Era,
Relic.Quality```
THis should work
After the JOIN ?
obviously
It works ! Thanks a lot !
I often struggle to understand the syntax logic in SQL.
You place the where here, because it filter after the join ? I initially tried after the group BY, but wasn't working either.
GROUP BY is always the last
because it first needs the data before it can group
OH thanks a lot, it 'll be really usefull for me in the future !
one exception btw, for GROUP BY always being last. Having. I hate having.
because it filters the group
So having is after group by
https://www.postgresqltutorial.com -- is this a good website?
Learn PostgreSQL quickly through a practical PostgreSQL tutorial designed for database administrators and application developers.
You know how certain books in Python will raise debate about how questionable their material is while others will be authority material? I'm trying to make sure I'm learning from the agreed best source if there is such a thing.
the postgresql doc itself is very good
Cool. I found that was also true of Django docs so I'll do that then
But..
I am very new to databases period.
So it may not be targeted at me
They do explain about concepts in the begining
Good to know.
Welp, they endorse that website in their docs, too so
I might continue with it since I already started
PostgreSQL 10.9 (Ubuntu 10.9-0ubuntu0.18.10.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 8.3.0-6ubuntu1~18.10.1) 8.3.0, 64-bit
I should upgrade to 11 right? I had this installed automatically from ubuntu server packages in the ubuntu os installation
I want stable and secure, not really concerned with features
I might just stick with this. It just seems one version behind and it seems like it would be best practice to make sure I have the most to date secure version before I even get started
My conclusion is that at this stage in my learning, whatever features it offers are an unnecessary distraction to worry about upgrading it now. I'm just trying to understand this and in all likelihood the first thing I build isnt the thing that will go to production anyway.
If I had started reading this documentation from the very beginning @solid void I can't even begin to quantify how much better off and less trouble I would have had.
Ive learned a valuable lesson about always just opening documentation.
yep, has been my experience a few times as well, before just googling and following blogposts or stackoverflow answer, just have a look at the official doc of stuff, that can save a lot of time.
@acoustic leaf First of all, performance. Second, you can have relations between the tables.
So if you have a list of cities you make a table with it, if a user comes from one of those, you only insert the id of the city into the city field of that user. That is called a relation.
It does not only save space, but also create logical connection between the fields
keeping some of your sanity.
Am I able to get the ping to the database through asyncpg?
If you need a database for running scripts so you don't have to query scraping a website a gagillion times, do I use a framework, do I do raw queries or do I use a module? If so, which framework / module?
And also, is there something like Eloquent ORM for python?
what exactly are you looking for @torn sphinx? storing data from scraping or what?
How can I keep my db updated? For example, I have a db that stores a players kills in a game, but their kills won't be the same in like an hour or a month. So how can I keep my db always up to date?
If that makes sense...
well, you make a request to your db to update the information when that happens.
you could just save the current kill count, or you could save each individual kills, and count them to display the info, depending on how much data that'll make and how much info you need for your app.
Well I'm using a limited API so I can't make like constant requests to the API.
@solid void
To keep grabbing data.
then only update up to your limit?
Okay but how do I update it every like hour?
I feel like I'm not explaining this well.
just put your program to sleep for an hour I guess? that's up to you
yeah we don't follow I guess. you first asked how to update a database entry now it's more about the API
that's entirely up to you. that's like tshirt said, you make an update request to your database to update those entries
Oh, hello, Dexter.
Haven't seen many people with the name. For me, it's just an alias
any1 here has good understanding of elasticsearh?
!ask
Asking good questions will yield a much higher chance of a quick response:
• Don't ask to ask your question, just go ahead and tell us your problem.
• Don't ask if anyone is knowledgeable in some area, filtering serves no purpose.
• Try to solve the problem on your own first, we're not going to write code for you.
• Show us the code you've tried and any errors or unexpected results it's giving.
• Be patient while we're helping you.
You can find a much more detailed explanation on our website.
SELECT * from users where server_id=$1 and $2=ANY(json_object_keys(contacts));
contacts is a json column
I am trying to search whether $2 exists in a row's contacts's keys
but I'm getting this error:
ERROR: set-returning functions are not allowed in WHERE
any workaround for this issue?
using psql fyi
Elasticsearch:
I want to do a "sum" sub aggregation for each user and get top 10 ordered by that "sum" sub aggregation. How would I go about this?
According to official documentation, this is not possible as sorting by sub aggregation, with size, increases the error on document.
is there a way in sqlite to be like:
SELECT points, id FROM users WHERE points > 0 ORDER BY points DESC and then find where in that list id: 12 is?
so id: 12 would be 5 from top since he has the 5th most points
aka find what position a certain row is in the selected rows when ordered
ROW_NUMBER () OVER () is what i was looking for
SELECT rows from ( SELECT ROW_NUMBER () OVER (ORDER BY points) rows, points, id from demo) where id = 12
ok so
i'm having a strange issue
my final queary looks line
SELECT rows FROM ( SELECT ROW_NUMBER () OVER (ORDER BY total) rows, total, user_id, guild_id FROM users WHERE guild_id = '674355323972091948' ) WHERE user_id = '146416331896520704'
which works fine in DB Browser for SQLite
but aiosqlite says OperationalError: near "(": syntax error
???
why does aiosqlite hate me
did you put your request in double quotes?
yes
can you show the bit of code that does the request?
here's my ungodly long line
cursor = await db.execute("SELECT rows FROM ( SELECT ROW_NUMBER () OVER (ORDER BY total) rows, total, user_id, guild_id FROM users WHERE guild_id = '674355323972091948' ) src WHERE user_id = '146416331896520704'".format(guild=str(ctx.guild.id), user=str(member.id)))
i thought it was me splitting the request into multiple lined string
but
making it one line didnt fix it
the format is left over
instead of the hard coded numbers it usually uses the format
but i wanted to test if the format is what was breaking it
which it is not
just havnt turned it back yet :P
the src was to test
it wont run if i remove that
ill show ya
gone, and still gives the error
here's what the line looks like normally
cursor = await db.execute("SELECT rows FROM ( SELECT ROW_NUMBER () OVER (ORDER BY total) rows, total, user_id, guild_id FROM users WHERE guild_id = '{guild}' ) WHERE user_id = '{user}}'".format(guild=str(ctx.guild.id), user=str(member.id)))
oops
yeah, that's not your problem here, but that's not the proper way to inject params, but as you said you have the error even with hardcoded ones.
i like to use format because it looks nice and makes me happy :p
i'm not sure what goes wrong here 😐 no experience with aiosqlite myself, i don't see how it would parse the thing differentely.
it should just pass it through
yeah i dont get whats wrong
you tried with simpler functions to see on which ( it would fail?
simpler?
fair fair
it could have some weird syntax requirement that neither db browser or sql online has
its def not the select rows from where guild id = guild id
that ive tried
so
its very likely the row_number or over ( order by)
select row_number () over (order by user_id) from users
it still hates that
i'm curious about why you put a space before ()
it's a function name, no?
i get that sqlite doesn't care as you ran it like this in it
but maybe it confuses aiosqlite
cursor = await db.execute("select row_number() over (order by user_id) from users")
still broken
i did that because
the example code i was following did /shrug
ok
np, now we are sure that's not the problem
Window function support was first added to SQLite with release version 3.25.0 (2018-09-15).
im on that page rn
i would check that the version of sqlite your python links against or builds into, is superior or equal to that version
it's not that old
idk what version of sqlite3 aiosqlite uses
sqlite3 is a standard python module
i know
probably aiosqlite wraps that
well
im using python 3.7
maybe its just time to update python
(and break fucking everything lol)
😬
you can check first though
import sqlite3; print(sqlite3.sqlite_version)
3.21
fffff
FFFFFFFFFF
FFFFFFFFFFFF
i guess im updating python then lmao
guess im going to 3.8 since 3.9 isnt out yet
here on linux i'm seeing the same version in 3.7 and 3.6, so i think it comes from the system, so the details might depend on your os/distro
damn
nope
the newest sqlite3 is 3.31.1
h
python ;-;
well, thanks for helping
♡ @solid void
happy to help, seems 3.31 should be good then! 🙂
Hi all, so I'm not sure if this is something that's relevant for this server, but I wasn't able to find a better place to ask about this.
I run a moderator bot on Discord (it's python based) and one of the issues I've been running into is that one of my databases (Postgres) needs to have bi-monthly mainteance. While this is part and parcel of any kind of bot that gets lots of traffic, the problem is that to perform the maintenance to the highest quality, I have to shut down the bot for up to 20 minutes so that I can run the maintenance tasks with an exclusive lock across the database.
Running maintenance windows every 2 months is also not particularly wise either, because in some cases, there are spikes of activity to the database that can cause problems much earlier than this.
So my questions are really this:
- How do I configure the autovacuum daemon in a way that it'll run only when there's a large amount of dead tuples from inserts/updates etc?
- How do I make sure that autovacuum will do VACUUM, ANALYZE and REINDEX?
- What is the most effective way to partition my postgres database table (which contains over 25.5 million records) in a way where each partition is manageable and not slow?
- In addition to the above question, how would I develop a trigger that auto creates new partitions on the fly as and when the conditions have been met? Currently, I don't store datetime objects (i convert them from a message id) but should I consider including this to make partitioning easier?
Thanks for the help if you can provide it! (and if this server isn't an appropriate place to ask, do let me know where a better place is 😅)
@tender dagger oh, i misread you last night though it was up to date in latest python, if it's not you might need to compile python yourself to link it against the proper libsqlite3, or maybe it's actually dynamicaly linked in your system and you should be able to just update libsqlite3 in your system.
if i do apt show libsqlite3-0 i see the same version number that python gets when i run it.
so finding a way to upgrade that could be enough.
In SQLITE, how can I insert a timestamp in the SQL as iso8601?
Here is my current statement: SQL INSERT OR REPLACE INTO t_character_info(characterId,alliance, corporation,faction,ancestry,bloodline,race,gender,securitystatus,birthday,timestamp) VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, DATETIME('now'));
I know I can just generate it with Python, but I'd like a sql solution
Does someone know a good ORM for python? I know the one used by Django is good but I don't use the framework
SQLAlchemy is basically the python go-to afaik @lofty summit
Yeah but I've heard mixed feedback on it
Lemon talked about it yesterday, have anyone used it?
I tried to use psycopg2 driver directly, but basically I was reinventing the wheel...
Well maybe I didn't use it in a proper way, but what I was doing was just creating SQL queries with some args in strings and sending them
Using an ORM is like working on a higher level, and therefore it seems faster to code with it @cinder sierra
i know what an ORM is, it's okay
I know you know
if you're looking for easy then yes, ponyorm should be your go-to. if you're looking for full suite and capabilities, go with SQLAlchemy
Why do you use psycopg2 directly? When the database become a bit complex, I find writing directly SQL queries hard/boring
I thought like that before, but now I find the price to high for that, I and hope ORM are well made enough not to constrain me
Sqlalchemy is nice, except when it's not and you spend time finding how to express something properly through the abstraction, and doing without is not so hard, both have merits.
I want to find all links in my mongodb thats first_seen is less then now. but somehow my query never matches. anyone can help? ```
urls = collection.distinct(
'link',
{
'first_seen': {
'$lt': datetime.now()
}
}
)
date is ISODate in the database
Hello I have a question. What terms do you wish you knew before going into this part of python? Please DM me as i might miss the answers!
@fervent valley You want to know what normal forms are and SQL
You also want to know the impact of PRIMARY KEY and what NOT NULL and UNIQUE do to a data set. Knowing basic data types is also a requirement
SQLITE does not statically type, but a lot of DBMS do
But then I guess that isn't the point of views
You can't use the ? for that type of replacement, it would only be suitable for WHERE clauses for comparison. You have to edit the query string
Yeah I have thought wrong
but be careful because that opens you up to SQL injections
Views are supposed to make things faster, tha tis why they are static
I know I can construct the sql string myself, but if you do that all world yells at you
no matter if there is no user input at that point
I have database models and they all share 2 functions that are the same
only the table names differ
I'm not sure what you mean "views are static", views are basically SQL queries stored in the database
has_entry() and is_expired()
Yeah, but views are meant to be static
at least the theory says so
Anyway, I'll just pass the whole sql string
It's clunk and verbose, but alas
my error:
unsupported operand type(s) for +: 'NoneType' and 'int'.
my code:
user = await self.bot.pg_con.fetch("SELECT * FROM punish WHERE user_id = $1 AND guild_id = $2", author_id, guild_id)
if not user:
await self.bot.pg_con.execute("INSERT INTO punish (user_id, guild_id, mutes, kicks, bans, warns) VALUES ($1, $2, 0, 0, 0, 0)", author_id, guild_id)
user = await self.bot.pg_con.fetchrow("SELECT * FROM punish WHERE user_id = $1 AND guild_id = $2", author_id, guild_id)
await self.bot.pg_con.execute("UPDATE punish SET warns = $1 WHERE user_id = $2 AND guild_id = $3", user['warns'] + 1, author_id, guild_id)
yes warns is a int value
In this context, it is apparently None
put a print(type(user["warns"]) there and see what it spits out
@shell drift
I think I have broken SQLite
It always says "database locked", even after reacreating it
doesn't it use flock?
@shell drift I'd assume a NULL value is being converted into python's None. print(None + 1) returns the exact same error
@rancid root do you think I can use apache druid as a mobile application backend
I haven't found a good option to support as a mobile application backend.. for all my iot use cases
right now, it's like device protocol layer -> kafka -> transformations (possibly with sparkSQL and streaming) ----> Druid--> metatron (for visualization)
device protocol layer -> kafka -> transformations (possibly with sparkSQL and streaming) ----> Backend for mobile applications (???)
^ this part is missing
@rancid root could you suggest some options.. preferably open source
did you read the book i linked
yeah that's because the fact that it's a 'mobile' app changes nothing
focus on the data flows you need to support, they're pretty agnostic of what the client physically is
number of writers, number of readers, typical types of query, latency requirements, SLAs, HA requirements, security requirements
that should dictate your choices
SLAs?
more broadly, uptime requirements and timings around managing downtime
hmmm
I found this
@rancid root https://www.yugabyte.com/yugabytedb/
and this
Thanks @rain wagon !
does anyone have any knowledge on sqli injections
my python sql dont refresh the changes from other computer like phpmyadmin
like when i update a value on mysql via php i cant see the changes on my python script
i have to re-run the script to see the changes
???
I've been trying to write a MYSQL query that returns a list of groups a given member (users.id) is not a part of. Essentially, the flow is this: When a user joins a group, a row is added to group_members (groupID, userID) -> (Example: user (id: 5) joins group (id: 3), new row in group_members = 5 | 3.
I can get a list of groups that a member belongs to with no problem, it's just a simple left join - but when I try to do the inverse, I can't seem to get it right and I'm not quite sure why.
Attached is the diagram. Thanks in advance for your help!
TL;DR: I want to return a list of groups that a given user is not a part of.
Should work with inner join
can someone help me?
SELECT username, email, fname, lname
FROM users
INNER JOIN group_members AS gm ON gm.userID LIKE users.id
INNER JOIN groups ON gm.groupID NOT like groups.id```
Not sure if this works, but should
@red tapir
@rain wagon do you know how to help me ?
@stark lion There is no issue with what you describe
Databases don't live update, you have to query data
@stark lion then phpmyadmin didn't commit the changes yet
The problem is that I'm looking for a list of groups that the specific user isn't in. When I try to reverse the query you wrote (grabbing data from groups -> inner join group_members -> inner join users), it essentially just returns what's already in the group_members table.
I'm not sure how it'd work, but I need to get a list of groups that a user isn't in. So, if user "admin" (id 1) is in groups (5, 6, 7), I'd want to return groups (1, 2, 3, 4).
i will show the new value
what is cli
command line interface
k let me check via putty
Why did you reverse my query?
The last line was supposed to filter out users in a group
| 1 | ΔΗΜΗΤΡΙΟΣ ΠΡΕΒΑΖΗ | 69696 | 2 ΕΠΑΛ | Δ1 | 0 | 1 | 100 | 100 |
I mean, I could be wrong, Inner join was never my favorite
did you run the python script AFTER you have updated the value?
That's why I was thinking left join would be the way to go, since I want the data from the groups table filtered through the scope of group_members. Essentially in pseudocode it'd be smth like: If ([groupID, userID] not in group_members) {return groups}
I flipped it around because I needed groups returned, not users haha.
oh
i run the script before i update a value on phpmyadmin
meh
SQL can't see into the future
Because it's a query language.
mycursor.execute("SELECT * FROM members")
MyResultMembers = mycursor.fetchall()
print('\n')
print(tabulate(MyResultMembers, headers=['ID','ΟΝΟΑΤΕΠΩΝΥΜΟ', 'ΑΡ.ΜΕΛΟΥΣ','ΣΧΟΛΕΙΟ','ΤΜΗΜΑ','ΕΠΙΣ/ΕΒΔ','ΣΥΝΟΛ/ΕΠΙΣ','ΒΙΒΛΙΟ 1','ΒΙΒΛΙΟ 2'], tablefmt='fancy_grid'))
print('\n')
here the code shows all the members
why it dont get update for the new values
Dude, again. If you run the script before you change a value, then it won't work
You need to query after each change
Because SQL is a query language.
yes, run another select
how?
wait
no
you dont understant
def ShowMembers():
clear()
display = [{"ΜΕΛΗ ΤΗΣ ΒΙΒΛΙΟΘΗΚΗΣ"}]
print(tabulate(display, tablefmt='fancy_grid'))
mycursor.execute("SELECT * FROM members")
MyResultMembers = mycursor.fetchall()
print('\n')
print(tabulate(MyResultMembers, headers=['ID','ΟΝΟΑΤΕΠΩΝΥΜΟ', 'ΑΡ.ΜΕΛΟΥΣ','ΣΧΟΛΕΙΟ','ΤΜΗΜΑ','ΕΠΙΣ/ΕΒΔ','ΣΥΝΟΛ/ΕΠΙΣ','ΒΙΒΛΙΟ 1','ΒΙΒΛΙΟ 2'], tablefmt='fancy_grid'))
print('\n')
bye = input("ΓΡΑΨΕ ΟΤΙΔΗΠΟΤΕ ΓΙΑ ΕΠΙΣΤΡΟΦΗ: ")
here is my def
i run this def after the update
and it show me the old value
You are also reusing the cursor
you are not supposed to
get a new cursor for each query
or at least do mycursor = mycursor.execute("SELECT * FROM members")
But it's really better to get a new one
wait
def ShowMembers():
clear()
display = [{"ΜΕΛΗ ΤΗΣ ΒΙΒΛΙΟΘΗΚΗΣ"}]
print(tabulate(display, tablefmt='fancy_grid'))
mycursor = mydb.cursor()
mycursor.execute("SELECT * FROM members")
MyResultMembers = mycursor.fetchall()
print('\n')
print(tabulate(MyResultMembers, headers=['ID','ΟΝΟΑΤΕΠΩΝΥΜΟ', 'ΑΡ.ΜΕΛΟΥΣ','ΣΧΟΛΕΙΟ','ΤΜΗΜΑ','ΕΠΙΣ/ΕΒΔ','ΣΥΝΟΛ/ΕΠΙΣ','ΒΙΒΛΙΟ 1','ΒΙΒΛΙΟ 2'], tablefmt='fancy_grid'))
print('\n')
bye = input("ΓΡΑΨΕ ΟΤΙΔΗΠΟΤΕ ΓΙΑ ΕΠΙΣΤΡΟΦΗ: ")
like this?
yes
yea i did it before
again the python have the old values
on phpmyadmin and my server(putty) i see the new values
and i run the def ShowMember() after i update the values
how could i modify a column that is used as a foreign key in another table
@lime cobalt Turn off referential integrity, modify it on both tables, make sure the transfer was successful, turn referential integrity on again. However, the better way would be to export the tables, re-create them and inject the data again.
If it is sqlite, that does not care about referential integrity unless you pragma it after connecting
is learning DBMS necessary for data science ??
or any web development related stuffs ??
@rain wagon i found```sql
set foreign_key_checks=0;
@feral spruce Webdev definitely
someone able to help me with a mongodb query?
thnx @rain wagon
@lime cobalt It is different for each dbms, so use what your manual tells you to
I would like to be able to have 2 computers (1 local windows, 1 VPS ubuntu) be able to access the same database. How would I go about being able to do that? And because I would like to access the database quicker on the VPS, is it possible to set the database local to VPS and remote from my computer?
@reef hawk you just set up the database on your vps. then youd use localhost:<port> to connect to the db on your vps. on your local computer you just use <ip_of_the_vps>:<port>
ofc make sure that your firewall on the vps doesnt block you
hmm okay, thanks! do you know what's the best resource to learn about this stuff?
I've been reading on it and it says I need to make sure it has to listen as well?
tbh I think you should start with the basics. what is an ip, what is a port, localhost vs remote, etc. maybe start here https://commotionwireless.net/docs/cck/networking/learn-networking-basics/
Community Technology Field Guide: (Re)Building Technology
np
I made a discord bot and my database is mysql. I use mysql connector in the code. Now the problem is that I get errors after running it for a few hours. errors.OperationalError('Mysql connection not available') is what I get. It can't be an idle timeout as it constantly reads and writes data as the bot is a level system. How can I make it so that the bot will reconnect to database when this error is caused? Anyone can help?
Also in the code, the connection is a function and I call the function every hour incase it disconnects. But once it is disconnected, the function doesn't work. The only fix I know is to restart the bot.
aight so i have a mysql database for discord leveling system and i made all id related columns of type integer
but ids on discord are massive numbers
so i get the out of range column error
is there is a way to change every column that has 'id' in the names' datatype to varchar
I'm not on mysql right now, but you could try getting the columns and checking with LIKE for '%id%' .. not sure it'll work, but worth a shot
then cast the columns after checking type
You know.. what I don't get.. why we use = instead of ==
Im fairly new to database programming in python, so hear me out here. I have created an RDF graph, a fairly simple one, that i then parse and save on a .txt file, which i then copy, paste and update to my database, but i am gettin a weird prefix on all my triples called "ns1" that i have no idea where comes from. It just appears out of thin air, and is an empty prefix that messes up my queries in the database.
it looks like this @prefix ns1: <http://> .
@lime cobalt u are storing IDs and IDs are only integers
@grand lark thanks for the info but is there is a way to change all columns that contain ids instead of manually altering each one
All of the id columns end with _id
I'm looking to transfer the db from a local one to a remote one
however, I want to designate the remote server to have the local db, and the local computer to be remotely connected to it
well, just make them both connect to the address of the remote server
Hi all, I'm new to this community, just joined a couple days ago.
Just wondering if I have SQL related questions, should I ask in one of the help channels or here is fine?
Here is probably fine
ah ok
Was wondering how to query this on SQL.
To give some context, I have a table that is presented in as "events" or logs that shows changes made to (let's say) an account.
Table Event
|event_id|acc_id|changes |timestamp |
|1 |1 |"status": ["pending", "reviewing"]} |2019-01-01 07:00:00|
|2 |1 |"status": ["reviewing", "passed"]} |2019-01-01 16:32:10|
|3 |1 |"status": ["passed", "activated"]} |2019-01-01 22:10:59|
|4 |2 |"status": ["reviewing", "passed"]} |2019-01-02 18:34:22|
|5 |2 |"status": ["passed", "activated"]} |2019-01-02 15:09:43|
But for the end result, I would like to aggregate the acc_id's together into one row and have the different timestamps as new columns, something like:
|acc_id|reviewing timestamp |passed timestamp |activated timestamp
|1 |2019-01-01 07:00:00 |2019-01-01 16:32:10 |2019-01-01 22:10:59
|2 | xx |2019-01-02 18:34:22 |2019-01-02 15:09:43
what sql server are you running
postgres
sorry, cant help; I only know mysql
ah ok no worries
pg_dump -C -h remotehost -U remoteuser dbname | psql -h localhost -U localuser dbname
what does the remotehost stand remoteuser stand for, vs localhost and localuser?
ah ok nevermind it's the IP address right?
yes
and user would be postgres if I didn't edit it?
remotehost = db to connect to, remoteuser = user on the database
okok thanks!
Has anyone already tried to use redis with python?
I am pretty sure people have
how do I find the remotehost / localhost address for postgresql? I don't have much clue where to find it 😔
where is the database hosted (what is their IP address / domain )
or is it your own?
it's all my own, I'm reallly dumb with this stuff
basically I have a local computer running on windows, and a VPS on Ubuntu
I'm trying to transfer to database over to the VPS, and start running the the db on the vps
oh fk wait ik the vps one
is it just the ipv4 that's shown in the VPS?
pg_dump -C -h localhost -U postgres (database on your personal computer)
| psql -h (ip of your vps) -U postgres (database on your vps)
just splitting the command for ease of reading, put back together upon running it
ah okay, and the ip is the ipv4 right?
yep
I get an error that says FATAL: no pg_hba.conf entry for host"x.x.x.x"
I'm assuming that just seems I have to put an entry point in my config file?
or could that mean another error
(while trying to access the server on PGadmin4 on my local computer)
or actually when trying to paste it too
sorry, im not a postgres fella
np
instead of localhost, how would I be able to find my ip address of the local computer? I think I need to add that to the remote
Are you on your local computer with the database?
yep
localhost will always be resolved to 127.0.0.1 - changing localhost to anything else will not change the command.
I'd suggest copying dump file over with scp
or starting a ssh tunnel
exposing your database over public IP may be dangerous
ah alright, thanks for the info
(postgresql will connect unencrypted by default, I think)
tunnel should be easy, like ssh hostname_you_usually_use -L 54321:localhost:5432 - then localhost:54321 will be a tunnel to postgres inside hostname_you_usually_use so you can use localhost:54321 in pgadmin on your local machine to connect there (5432 is postgres port, I've added 1 at the end of the local port to avoid conflicting with your local database)
I am using flask-sqlalchemy, I want to change some column names, is there a way to update my database, keeping the current data?
There should be. Are you using sqlite?
If you're not, then you're looking for Alembic
Just learning python , what sql software to use for database create edit and all
@rancid root I have a predicament
so, I'm using kafka to manage my event streams, I'll use an open source schema registry to maintain the schemas.. but
lol
the thing is, my source of streaming data (payload from devices) gives me new json every few seconds
the payload contains information for n devices as inner jsons within a json array
yep
the problem is.. not all devices send information at the same time.. so if I have 10 devices, I might receive payload for 1 device one time, then after a few seconds later few more devices.. and later, the rest of the devices among the 10
so how do I account for this, when I want to convert payload to Avro and write to a sink for example
ok, now I realize it
if they are all devices, they go under the same columns anyway..
phew
im so stupid
it'd be great if you could let me know of some gotcha's when handling data like this.. or your experience with those telecom data streams
so I know what to design and expect
ok so my internet connection died... random. On hotspot now
what you're describing is pretty typical, idk how you're ingesting the data so the answer is general
if it's something like clickhouse, it'll do it natively
if you're at low level, just parse out the individual records that make up a row and push them onto the sink
yes, but to infer that they are individual records, I need to see the schema first, or allow users to see it..
I found a sink that can serve as an application database.. yugabyteDB
looks promising
the schema is, an array of dictionaries representing a row each
if i'm understanding correctly
doesn't matter
you know the structure of each dict
just iterate on them
again clickhouse does this natively
idk about yugabyte how it works in that scenario
the thing is, I was hoping to do this early on in the pipeline, and then convert to a lighter format like avro to write to the DB
and another thing about doing it early in the pipeline is being able to write it out for doing analytics
shrug then just write a kafka worker that eats a list of dicts and spits out individual dicts or something
keep in mind that will cost in perf, kafka is bottlenecked by messages per second rather than bytes per second
Hm. I believe Alembic has problems with SQLIte, you still want to use it probably, but I'm not near experienced enough to be able to help you
If I were you, I'd dump my data to something like JSON, and start again with SQLAlembic from the start - but its very much a hacky solution
MySQL db question: am I able to update table add column... and then fill that column in based on the data being sorted a particular way (ORDER BY...)?
actually nvm there is an ORDER BY RAND() 😄
How come when I use the host ip address instead of localhost for postgres, it requires authentication? e.g. it doesn't recognize its own IP address
and if I put the IP address instead of local host, would it be a remote connection instead or still be local?
There are rules for who needs to authenticate.
yeah, but I'm surprised it wouldn't recognize it's own IP address
it works when I leave it as localhost
Its not meant to for security reasons.
Even when connecting to localhost, it will still use sockets and still be a remote connection.
oh i see
I'm basically trying to make it fast I guess..
would there be a difference if I put localhost vs its own ip address?
In speed, yes; but its so minimal and for your use case its not a deal breaker.
If i install mysql on windows
It will start automatically when the computer starts or i have to run it every time i open my computer?
Can I insert 4 value in one column
did you have an option during install to use mysql as a windows service @stark lion ?
but i would assume there are some options to have it auto-start if it does not do that already
I'm trying to collect all columns in a table to display, if the column is not equal to 0. How would I go abouts doing that using asyncpg?
e.g.
useritems = await self.bot.pg_con.fetchrow("SELECT * FROM items WHERE user_id = $1", user_id) #if the column is not 0
CREATE TABLE items (
item_id BIGINT,
guild_id BIGINT,
title VARCHAR(30),
info VARCHAR(400),
price INT,
reply VARCHAR(3000),
given_role INT,
removed_role INT,
dm_msg VARCHAR(3000),
PRIMARY KEY (item_id, guild_id),
FOREIGN KEY (guild_id) REFERENCES guilds(guild_id) ON DELETE CASCADE
);```
#1215 - Cannot add foreign key constraint
why
what's the guilds table definition?
i don't think it'll work in mysql
anyway, i'm not sure there is a unicity constraint on the guild_id there
was it defined as such?
CREATE TABLE guilds (
guild_id BIGINT,
min INT,
max INT,
cooldown INT,
PRIMARY KEY (guild_id)
);
what is \d+ guilds supposed to do
it's just a psql command to display basically the same info you showed in that capture, so you can forget about it
oh
https://www.percona.com/blog/2017/04/06/dealing-mysql-error-code-1215-cannot-add-foreign-key-constraint/ seems like there are quite many reasons this could happen
1) The table or index the constraint refers to does not exist yet (usual when loading dumps).
it exists
2) The table or index in the constraint references misuses quotes.
it doesnt have quotes
3) The local key, foreign table or column in the constraint references have a typo:
nope
4) The column the constraint refers to is not of the same type or width as the foreign column:
uh i will try specifying the width maybe?
nope
maybe
5) The foreign object is not a KEY of any kind
isnt having foreign keys as primary keys valid....
that would be pretty broken
i think it means the opposite, and that's not your problem, as the FK is a PK
that was my first question
6 is not your problem either
10 nope too i assume
11 too
not 12 either and not 13 unless it's implicit
so i would maybe add a "NOT NULL" to the "guild_id BIGINT" line
it's already implicit in the guilds table (primary key), so only on the items one
Creating a foreign key constraint requires the REFERENCES privilege on the parent table.
theck you have that permission
yep
Error Code: 1142. SELECT command denied to user 'mHThQOLvY3'@'154.237.72.84' for table 'user'
oof
(also, i pointed to the 5.7 doc, make sure to check with the version of mysql you use)
hm, so that's a good hint you are not using the root user, so maybe that's the reason, though i don't see why you wouldn't have this permission.
oh ok
The permissions you have on your database are SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER. Others will come in the future, but not at the moment. These can all be viewed on your dashboard.
yeah
so that one seems good as well
that's weird.
maybe someone else more knowledgeable in mysql than me will have an idea, because i'm a bit short here.
well thanks for your help anyway
yw
hello
so my teacher gave me some quaries that i put into pg admin.
and im suppose to find the ID and name of athletes who perform better than someone at some sport
was thinking i would try it liek this
id and name is from a table called person. peopleid is a foreign key in results and result is the records of everyone in all sports
but im getting an error in the last line when i try this
maybe i should rather try to do it like this?
but i also get an error
What does this mean?
sqlalchemy.exc.ArgumentError: Textual SQL expression 'Example' should be explicitly declared as text('Example')
Hi Guys I have that many lines like BR:username:password in file.txt. How I can read each line and get 3 variables a=BR, b=username, c=password . The idea is get everything on txt and send to mongodb ( that part i already know ). Many thanks
csv.DictReader
Thanks I will check that
Why while using Discord.py SQLite databse acts like a ghost? I can put values in it and just after putting data I can get them out, but after closing the connection and reconnecting... There's no data
My code: py try: conn = lite.connect('users.db') c = conn.cursor() print('User database data:') for row in c.execute('SELECT * FROM last_online ORDER BY uid'): print(f'- {row}') print('That\'s all') except lite.Error as e: print(f'Error: {e}') finally: if conn: conn.close()
Searching C:/ for users.db has no effects
Output is of course:
Hm...
when inserting the users that is
Okay, now data is saved ^-^
Searching C:/ for .db file ;-;
It still doesn't exist in script directory
Neither next to __main__.py and python.exe
What is your current project structure?
Are you running your code with Pycharm?
If so, try putting it in the root of your pycharm project
main_folder
| - bot
| | - __main__.py
| | - __init__.py
| | - cogs
| | | - __init__.py
| | | - basic.py
| - README.md
| - LICENSE
```Mostly
If so, try putting it in the root of your pycharm project
@upbeat lily what "it"?
the file you're trying to open
Doesn't work
Also putting it next to __main__.py doesn't work
But... I can read data from SQLite databse
Maybe file is in cloud? ;v
If you can connect to the database at all
you should be able to both read and write to it
Yup
how are you trying to write to it?
Writing works
then whats the problem
you
you create the database
but you can't find the file anywhere
?
Writing code: ```py
conn = lite.connect('users.db')
c = conn.cursor()
sample_data = [
('KH06-GH78-SGYR-8J06', '2020-01-20 07:10:04')
]
c.executemany('INSERT INTO last_online VALUES (?, ?)', sample_data)
conn.commit()
Yup
File doesn't exist
try providing it an absolute path
so something like conn= lite.connect('C:\\path-to-your-directory\\example.db') and see if that works
assuming that works, then the problem is to do with how you're running your python Code I'd guess
Which is an area I'm rubbish at fixing problems in
Works
So this: https://stackoverflow.com/questions/4060221/how-to-reliably-open-a-file-in-the-same-directory-as-a-python-script/4060259#4060259 should also work
What are you using to run your code?
yeah
that should work
although I believe pathlib is a better solution
Project virtual env
Are you using Pycharm, or another tool?
PyCharm
If its Pycharm, you can fix it by messing about with the project interpreter configuration
I think
but the solution you posted is better anyway
SELECT tcs.characterId, ts.name AS station_name, tco.name as constellation_name,
t_region.name AS region_name, tss.name AS solarsystem_name, tss.securityStatus
FROM t_location
INNER JOIN t_characters AS tcs ON tcs.characterId = t_location.characterId
INNER JOIN t_station AS ts ON t_location.stationId = t_location.stationId
INNER JOIN t_solar_system AS tss ON tss.solarSystemId = ts.solarSystemId
INNER JOIN t_constellation AS tco ON tco.constellationId = tss.constellationId
INNER JOIN t_region ON t_region.regionId = tco.regionId;```does someone spot my mistake? The statemet is returning multiple rows for some reason but I don't see it
it should return one row per ID, but it returns 2
Found it, thanks discord highlighting 😄
couldn't spot it for the life of me in dbeaver
So I want to save user generated IDs (that have a relationship in other table) that I generate in groups. For example a group would look like [4, 593, 122, 600, 500] but since they're related to another table to have the relationship done I would need to store one ID per row. The issue is I want to keep track of which group they belong to (maybe on a per-user basis)
I thought of something like:
user_id, group_id, id
1, 1, 4
1, 1, 593
1, 1, 192
1, 1, 600
1, 1, 500
1, 2, 233
...
How can I achieve something like that?
with a foreign key @torn sphinx
Your group table must have a primary key
then store each number or group in a table with the group id as a FK
but remember: atomic data
only one value per field
@rain wagon How would I know what group_id to use? As a serial?
yes, auto incrementing integer
@torn sphinx https://dbdiagram.io/d/5e5b1f8c4495b02c3b879b35
If I understood you correclty
@rain wagon Well the amount of ids per group would vary so wouldn't it be better to have them in different rows in this case?
yes
I was just unsure about the amount
if it is one number per row or otherwise does not matter
but what you have here is a typical n:m relation
and that third table resolves that relation
Yeah I guess it makes sense to do it that way then, thanks
Generally if you're inserting the current UTC time on the database, would you use a default column (TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT (NOW() AT TIME ZONE 'UTC')) or calculate it on the application side (in this case a Python script) and insert it with the rest of the values?
Hello friends. Not technically a Python but it’s linked to my usage of Python.
I need to create a pretty voluminous SQL database (500+ Gb) for my work but it doesn’t need to be super high performance or reliability as it’s mostly for analytics that’ll be computed once or twice a day. Would it make sense then to make it as an AWS EC2 instance with big storage?
Also I’ll run quite a lot of code that’ll make a lot of small but stupid requests to the server, so being on the same machine could help (SQLAlchemy yay)
Hi I am looking to update the rows of a user but I am getting AttributeError: can't set attribute
def updatehandler(User):
print(User.username)
userobject = session.query(users).filter_by(username=str(User.username)).first()
print(userobject)
userobject.username = User.username
userobject.password = User.password
userobject.email = User.email
userobject.firstname = User.fn
userobject.lastname = User.ln
userobject.birthdate = User.date
userobject.about = User.aboutme
session.commit()
return
this is my function
suggest me some some dbms project idea
database management system
@rancid flicker Write a database abstraction layer to get to know the lower functions a bit better
Use sqlite, write some classes that have a database as a backend to store their data
but abstract in a way that you don't need to worry about it
I will think about
Hi I am using sqlalchemy. I have created an Upadte User Info panel where users choose the info they want to change. I want the values that the user has filled to be changed and the rest should stay the same
This is an example response I get with only some of the fields filled
ImmutableMultiDict([('username', 'efa'), ('password', ''), ('email', ''), ('firstname', ''), ('lastname', ''), ('birthdate', ''), ('aboutme', '')])```
So like, if you have a song by Dr Dre featuring Eminem, Lil Kim, Snoop Dogg, and DatBoi... the specific song on the song table has a OneToMany Relationship to the artist table?
Yes
What if you wanted to differentiate between the head artist and the featuring artists? Would each song then have its own table with two columns, one with a OneToOne for the Main Artist and OneToMany for featured artists?
songs <-> artist is somehow many to many, then you can add data to the relation table
Hrm, I think at this point the main thing that would help me understand database relations any better is to see as many example databases with accompanying PSQL/SQL as possible.
Are there any good repos or lists of such things that helped you guys?
The way I learned databases somewhat properly was reading through Wikipedia articles during my database lectures
they have a lot of good examples there for things like normalisation
Never would have guessed.
A database lecture might be exactly what I need! Im going to look on YouTube.
In this video we will begin learning SQL Basics. To get started writing SQL, we need to first set up a server and a database. The server we will be using for these tutorials is Postgres. We'll walk through how to download, install, and setup PostgreSQL. Then we will create our...
Look who it is.
I should have remembered that this guy generally has great content on this kinda stuff I never got around to back when I used to watch his old videos.
Hi I am trying to update a user row using SQLAlchemy
When trying to update the user's attributes I am getting this:
userobject.username = str(User.username) AttributeError: can't set attribute
Please help me I have been stuck for hours
hey guys
how do mobile applications use databases.. like, I'm not familiar with the concepts.. to me it seems like a lot of clients to service.. and I'd like to understand how it works
also, what's partitioning and indexing in this context
they mostly use sqlite as a local db
and use a rest api for accessing data online
Because having database access tokens such as passwords in an app in the hands of a user is not advisable
ok rest api.. check..
but how does one DB service so many clients (same app running on thousands of phones)
because to me, a DB is something you query, and maybe allow a couple more users to query
For something that services many users they typically have many nodes distributed horizontally to share the load
Or that ^^
it would be nice to see a picture or something.. of the architecture
With things like MariaDB you can also configure to store a lot in memory to cache it
The kinds of databases used to run big services like a social network are designed to scale massively
so distributed databases would have lot of nodes, ok.. but data is replicated across the nodes? how do you know where to go for which query
There's a couple of ways to do that
typically an incoming request is just dispatched to whatever node is available, and the data is synchronized between nodes. Some services don't need absolute consistency between nodes to be useful
a financial institution has very different consistency requirements than a social network
Also, with the REST api in front of the db, you can cache a lot of the requests, if there is no new data
There is something called ETag which is used to facilitate that
so every node has the same data to be able to service all incoming requests? that means writes are done to only one node in some cases?
If no new data is there, it just sends back a 301
causing very little cpu/network time
@torn sphinx Yes, with a replicating db, each db sends all DML type queries to the other
there's a couple of different ways you handle consistency - if the DB cluster is "eventually consistent" (many NoSQL systems do this) then the writes are propagated out from the node where they were made without a guarantee that they'll be immediately available in all nodes at the same time
You do need to take measures against deadlocks though
obvs this doesn't work for a bank :D
ok, so in that case some nodes are behind the master
(I hope I'm using the master reference here correctly; i
my mind is on k8s when I think master node, and that does very little over there)
On each db, you can set one slave
so that master sends all DML type requests to their slave
now you activate the slave on the other db for their master
and hence, master-master replication
No, you need to load balance it
oh
So each db gets the next user, one left, one right so to say
and then they replicate each other
so wherever the write happens (in either master) the read can happen on any slave, because it gets replicated across all slaves?
yes
yay im finally getting it
could you tell me what partitioning and indexing are..
it would be great if you could say it in context of a mobile application..say fit bit
This was my final exam work for my profession. It utilizes nginx and mariadb for a load balanced wordpress environment
And 2 master-master replicated dbs
like, how do they keep each user's location info and have it available to be retrieved.. they must maintain one instance of DB serving thousands of clients right?
A database is made to serve a lot of data in a short amount of time
And there are a lot of techniques to make this happen
for example, clustering
memory storage
and so on
listening
What shall I say? On my last job we had a cluster running 30 linux RHEL servers for websites
With 3 memory servers that do nothing but store user sessions
Data was delivered with a SAN
ok, I understand clustering.. you have more nodes to service more traffic
Whcih is a high performance Storage Area Network using Fibre channel
user sessions?
that has about 2.5 tb/s throughput
oh.. the stateful stateless thing.. so you can remember and service users without new login?
yes
ok
got it
this is a fibre switch
eh wrong link
QFabric System,QFX Series,EX4600. Fibre Channel Transport Protocol, How FC Works on the Switch, FCoE-FC Gateway, FCoE Transit Switch, FCoE VLANs, Supported FC Features and Functions, Lossless Transport Support
So, with a data connection like that, you can deliver a lot of data in a short amount of time with a low latency on a very high distance
We've actually put 1km more fibre into one rack, so the latency to both locations is the same
I dont think I need to understand the hw details.. I'm on the data engineering side, and trying to get a handle on what to provide to mobile apps to make our sink available for those applications
it's why I'm trying to understand partitioning and indexing in this context..
It basically splits many tables into smaller slices
but tbh, I haven't personally dabbled with that 😄
like.. for example, in normal analytics context, we would partition by key, which would make it easy to write the messages to specific partitions.. that makes it faster to query for end users
so i'm basically trying to understand, if partitioning and indexing concepts work in serving mobile apps.. in ways to make querying faster and reduce latency
ooh this is useful
I don't think there is an universal answer on that though
I think it really depends on how large your db is, to tell if the partitioning has an effect
im wondering how it works for consumer facing apps
I mean, the only unique thing is the User ID or something,
so what would they apply in that case
I am sure there are more uniques
Article ID
just one example
Category ID
you get the gist
Post ID
or pokemon go
like.. they store user information linked to things on the app
wonder if they partition or index by anything
You'd have to ask the creators of such systems 😄
speculating isn't knowing 🙂
https://instagram-engineering.com/ Maybe you find something here
ok.. so ideally this approach should work
I find out what db service they use and see best practices are there for those
Just select one database system you want and have a look
MySQL is just as capable as PostGres or Oracle etc
Or look at what those db do best
Some are better in some areas the other isn't
yes, but the methods for partitioning change depending on the type of service yeah?
for example, storing items for an ecommerce website.. is different from maintaining customer linked info
thanks for the help.. I learned a lot.. will do some reading in the AM
I just read this entire conversation and my mind is blown. Thats all.
Ok, I’m losing my mind over hybrid properties in SQL Alchemy.
https://docs.sqlalchemy.org/en/13/orm/extensions/hybrid.html
I have a column that’s a timestamp and that I would like to keep as such, but I would like to add a field that’s a datetime object that I can filter on.
They even specify:
The Hybrid Value pattern is very useful for any kind of value that may have multiple representations, such as timestamps, time deltas, units of measurement, currencies and encrypted passwords.
Ok for timestamps it’s actually pretty easy to do a hybrid property with SQL functions.
But let’s say I want to have a full Python hybrid property and want to be able to query/filter on it, what is the exact syntax and which functions do I need to implement in the comparator?
This looks like what I’m trying to do (and I have a similar problem as what he’s talking about with FX rates)
I think I have a python/database question? I have a query that is a tad bit complicated. I'm updating a particular column in a database and setting some null values to a string value under the condition that ANOTHER column left joins with a column in a DIFFERENT table in that database. Actually, it's a little bit more complicated than a left join because it's a left join where the other value is null, to make it just the outside bit of the venn diagram. whew.
I think between the query being complicated and the database being large, it's just not working very well trying to get it to go through. It's taking forever. And it times out.
I'm trying to find a way to fix this and my guess is I should chunk the data so the query gets run multiple times but just on portions of the data? Is it not super complicated to write something in python to do that?
Or am I thinking about this the wrong way and should be doing something to my database to make it work better?
oh sorry Tolki, your stuff first
Nah it’s ok everybody is lost here anyways
"""Hybrid value representing the champion name in English"""
def __init__(self, champion_id):
self.champion_name = id_getter.get_champion_name(champion_id)
self.champion_id = champion_id
def __eq__(self, other):
return self.champion_id == id_getter.get_champion_id(other)
def __str__(self):
return self.champion_name
key = 'champion_name'
This is sort of doing what I wanted but I’m not sure it’s right 🤔
I can filter on it with that but when accessing the hybrid field directly it’s... not very good looking
It is working but I spent 3 hours reading the doc and I’m still not sure I did the right thing so I’m just gonna go to sleep, gl hf @wary panther
If you’re using SQLAlchemy the yield_per option is very useful when dealing with big subqueries
MySQL database, using pymysql when I'm using python to connect and do things with it from python
id = db.Column(db.Integer, primary_key=True)
paper = db.Column(db.String)
time = db.Column(db.DateTime)
author = db.Column(db.String, db.ForeignKey('User.username'))``` am I using the ForeignKey right here? I'll still set the author when creating a new row but I am connecting it to my User table. right?
do u have to say its a primary key everytime u access it 
@hexed epoch That should be, yes, although I'd strongly suggest not using a db.String foreignkey and using a user id number instead
hi guys
I'm doing a new project it needs redis, but I use windows 10 system
Does redis support for windows
if so please guide me how to install and run in windows
I have worked with postgres, mysql and mongodb but I'm new to redis
Hello!
I am currently trying to scrape some data from a firebase realtime database I found online. I have the apikey, authDomain, databaseURL, projectID, storageBucket, and messagingSenderID. I have also found on the website a few instances of url suffixes that produce json data. However, I have reason to believe there are other possible suffixes that store data which I'd like to access. Is there a way of listing all of the available files on a domain with the info I have?
I don't think firebase exposes an endpoint that tells you all other endpoints available
I have a non-python issue in that i have an established MSSQL database and need to prepare a script that (destructively) enforces new constraints. My current approach is to add the constraints in a disabled state, get the offending lines with DBCC CHECKCONSTRAINTS, delete the offending lines and enable the constraints. However i have issues with adding DISABLED NOVALIDATE constraints in MSSQL
any ideas?
can someone explain RAFT to me please
I dont get it.. do all the nodes contain all the data
The former is a single constraint on two columns, meaning that individually the columns could have duplicate values, but the two columns together must be a unique combination.
The latter has a separate constraint for each column, meaning each column must have unique values on its own.
so the former has the tightest constraint ?
I'm not certain on what tightest means but I think it'd be the latter, since each column must be unique. The former would allow for duplicates as long as the combinations were unique.
In the latter, it is not even possible to have two different people which share a birthday.
SQL = "SET @potion_name = (SELECT `Item Name` FROM `UserItems`,`Items` WHERE `UserItems`.`UID` = {0} AND `UserItems`.`SID` = {1} AND `UserItems`.`PID` = `Items`.`IID`);SELECT @potion_name".format(member_id, self.server_ID)
print(SQL)
self.cur.execute(SQL)```
the query itself works perfectly in phpmyadmin
but when i run it on python lik eabove,
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SELECT @potion_name' at line 1")```
whats going on..... 😦
@glad bobcat if you want to be able to run SQL order/filter by computed properties with sqlalchemy
you need to define the computed property as an SQL function in your db
I actually managed to do it just by defining a comparator
then in your sqlalchemy class definition (after class but before def __init__(self) if you have it)
I'm unsure if that's the right way to do it but it works sooooo
computed_property = Column(Float, server_default=FetchedValue())
You can have pure python attributes computed at the instance level though
And still filter on them
My understanding is that it's generally faster to offload that onto the db
Well for my case it definitely isn't, it's just a comparison to a dictionary and the SQL produced directly queries the right DB files with the transformed values
I just implemented eq() because I only filter on equalities for that field and it works perfectly.
class GameDate(object):
def init(self, timestamp):
self.timestamp = timestamp
try:
self.date = datetime.fromtimestamp(timestamp / 1000)
except TypeError:
self.date = None
def __lt__(self, other):
return self.timestamp < other.timestamp() * 1000
def __gt__(self, other):
return self.timestamp > other.timestamp() * 1000
def __str__(self):
return str(self.date)
def __repr__(self):
return str(self.date)
key = '
I'm on phone sorry
But this worked to translate a timestamp to a proper datetime object on instance level
And still be able to filter on it
class GameDate(object):
def __init__(self, timestamp):
self.timestamp = timestamp
try:
self.date = datetime.fromtimestamp(timestamp / 1000)
except TypeError:
self.date = None
def __lt__(self, other):
return self.timestamp < other.timestamp() * 1000
def __gt__(self, other):
return self.timestamp > other.timestamp() * 1000
def __str__(self):
return str(self.date)
def __repr__(self):
return str(self.date)
Formatted
Why is the result of sqlite3 sometimes simply None, and sometimes a tuble containing only None?
So either
None
or (None,)
This is absolutley confusing, and just makes checking for none so much harder
it's i think that sqlite connector requires a tuple to interact with database
So do I always have to do a bunch of try, except stuff just to check if the result is None, or contains None? This is so tedious...
class Class(db.Model):
id = db.Column(db.Integer, primary_key=True)
label = db.Column(db.String(4))
pupils = db.relationship('User', backref='class_', passive_deletes=True)
assignments = db.relationship('Assignment', backref='class_', passive_deletes=True)
class Assignment(db.Model):
id = db.Column(db.Integer, primary_key=True)
title = db.Column(db.String)
number = db.Column(db.Integer)
class_id = db.Column(db.Integer, db.ForeignKey('class.id', ondelete='CASCADE'))
task = db.Column(db.String)
deadline = db.Column(db.DateTime)
papers = db.relationship('Paper', backref='assignment', passive_deletes=True)
class Paper(db.Model):
id = db.Column(db.Integer, primary_key=True)
paper = db.Column(db.String)
time = db.Column(db.DateTime)
author_id = db.Column(db.String, db.ForeignKey('user.id', ondelete='CASCADE'))
assignment_id = db.Column(db.Integer, db.ForeignKey('assignment.id', ondelete='CASCADE'))``` I'd like to get all assignments provided a User's class_id where User hasn't got any papers.
I've tried python Assignment.query.filter((Assignment.class_id == current_user.class_id) & (Assignment.papers.any(Paper.author_id != current_user.id))).all() doesn't work tho.
Anybody got ideas?
(it's flask_sqlalchemy)
I got it figured out, it works but only if there's a Paper object in Assignment.Paper, I mean a solution could be to just always create a blank paper with every Assignment but can I check someway if Assignment.papers is None?
can I partition a database table that is already created or do I have to do that at table creation?
what's the use of time series being supported as a native data type in time series databases
how to add values to a table in sqlite3
@torn sphinx often faster/more intuitive operations
I'm trying to connect mysql with python and got this:
mysql.connector.errors.InterfaceError: 2003: Can't connect to MySQL server on 'localhost:3306' (10061 No connection could be made because the target machine actively refused it)
my code:
import mysql.connector
'''from mysql.connector import errorcode
try:
cnx = mysql.connector.connect(user='root',database='menagerie')
except mysql.connector.Error as err:
if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
print("Something is wrong with your user name or password")
elif err.errno == errorcode.ER_BAD_DB_ERROR:
print("Database does not exist")
else:
print(err)
else:
cnx.close()'''
cnx=mysql.connector.connect(user="root",host="localhost",passwd="password",database="menagerie")
...do you have MySQL running?
yes
https://controlc.com/2f34542e --> isnt working for some reason, im tryna query using variables but i dont get how to make it work
import pymysql from pymysql import * con = pymysql.connect('localhost', 'root', 'Siddh - 2f34542e
@shell ocean can I use spark streaming on small data, like few mb every few seconds.. i'm trying to find some benchmarks.. but can't seem to
uh...why not?
