#databases

1 messages · Page 81 of 1

rancid root
#

that will dictate your choice of tech and it might not involve databases in the hot path at all

#

night be something like flink

torn sphinx
#

I've some background in analytics, so that's not my concern.. my concern is more of how to support mobile applications because I don't have any background with those

rancid root
#

no but like what kind of analytics are you expecting to be doing

#

e.g.

#

i have a sensor
i want a realtime display of what that sensor is putting out on my phone, with full sampling granularity

#

vs

#

i have a fitbit, i want a daily summary of my sensor data

#

completely different solutions will meet those needs

torn sphinx
#

yes, but I'm not going to be involved with serving the mobile application.. and focusing right now on how to get the data to the application and making it available to its developers

rancid root
#

yeah i can't really tell what your scenario is. Things that might be worth checking out: ksqldb, postgres, clickhouse, apache flink, spark-streaming

#

if all you need to do is sink a high resolution kafka feed and summarise it into time buckets, clickhouse can do all that in one shot

#

i'd still be moving data out of clickhouse into pg for long term storage though

#

as far as autoscaling, nah. Just pick a large enough cluster size from the start. And how you scale up/out will be governed by what you're doing (read heavy vs write heavy etc)

#

i'd say 19/20 times i encounter organisations trying to manage complex clusters for their data, the whole thing could have been done on 1-4 high spec servers instead and cost a fuckton less in both development and opex

torn sphinx
#

spark-streaming for?

#

I'll take a look at clickhouse to serve as a mobile application backend for data upto 1 year, and if users want to retrieve data beyond that from postgres, would that work?

rancid root
#

no idea because you never described what the actual usage patterns for the data are

#

clickhouse is an analytics column store

#

if that's what your workloads look like, great you can serve directly from it

torn sphinx
#

ok, my usage pattern is this:

#
  1. Users want to retrieve historical data for their device. 2. Users want to see a live feed of their device data.
#

All analytics related to the data being served on the mobile application, happens on the mobile application.

rancid root
#

why doesn't the live feed get done locally on the phone

#

why does it need to traverse through your infra

torn sphinx
#

because the live feed is from an iot device that might be further away from the mobile phone.. and connectivity to the iot device is managed by the network

rancid root
#

ok well if you want full sampling resolution live feeds with things that emit something like 1 datum per 1-10 seconds

#

databases are pretty much out

#

that's where something like flink or spark-streaming can come in, and you then stream the result to the end user

#

while also dumping it into a db for long term history

torn sphinx
#

ok, that solves 1.

rancid root
#

also if you don't actually need to do any complex processing or multiplexing of the data you can probably just take your kafka stream and dump it into a websocket server to feed to the end users

torn sphinx
#

pardon my ignorance, but I'm only aware of how we can run queries on a DB.. so wouldn't there be a huge load on the DB service when there's hundred thousand users trying to query for historical data?

rancid root
#

historical data comes from a db

#

realtime stream comes from the realtime hot path

#

an example topology would be iot devices -> kafka -> (streaming server, write records to db)

torn sphinx
#

I was hoping to cut out reliance on traditional db and retain data up to a year on ksqldb

rancid root
#

then i might query the historical db for records up until now to initialise my chart and then start appending the realtime feed records to it as they come in

#

you might be able to do that, i'm trying to answer similar questions and trialling it at the moment

torn sphinx
#

ok, and what about the load on the db server by users?

#

ok

#

sorry

rancid root
#

but i'm treating it the same way as clickhouse, an exotic accelerator and not a viable primary store

#

with something like pg there's high reliability and well documented patterns for how not to lose data and recover from fuckups

#

with the exotic stuff like ch and ksqldb it's anyone's guess

#

you also need to think about the ratio of writers to readers

#

if i have 10 readers for every writer than i might have 1 node sinking data but then replicating it to 10 nodes for serving it

#

also if people are actually querying a year of data at high res, that's a fuckton of bandwidth

#

you might be bound by the io and network throughput rather than the database itself

#

what you're asking is a pretty big architectural design thing not just a database question

#

and i can't stress this enough, if you don't actually have hundreds of thousands of users querying and pushing data on a constant basis, don't worry about any of this. Worry about getting users first

torn sphinx
#

I understand.. but we do have lots of resources and a lot of load

#

I dont have background in databases or mobile applications.. which is why I'm trying to find alternatives so no one else can screw this up for me by asking me to support something legacy

#

what do you mean by the 10 readers part, I dont get it

rancid root
#

for each iot device or whatever source you have

#

how many devices are watching the realtime stream or querying for it

#

it auto scales so there's that if you really want it

torn sphinx
#

is there something open source I dont have to pay for...

torn sphinx
#

else I could just pay the license for ksqldb

rancid root
#

ksqldb is open source what license

torn sphinx
#

yeah, it's not..

rancid root
#

?

torn sphinx
#

gimme a sec

rancid root
#

you're not making a competing saas offering

#

the only way to trip that clause if you try to take ksqldb and extend or rebrand it and then compete in the same space

#

and as far as i know, there's no option to get a license that negates it?

torn sphinx
#

the second part

#

Excluded purpose

#

so I can't make a PaaS offering that incorporates it, that competes with any service that offers it

rancid root
#

again idk what you're doing, but 'competes with' is a pretty narrow envelope

torn sphinx
#

I'm building something like cloud data fusion, but where people can manage their iot workflows

rancid root
#

i'm not a lawyer but our legal dept did scan this and approved it for our paas project

#

check out that book i linked

#

it will help you figure out the right stuff to consider

torn sphinx
#

ok

rancid root
#

the main issue i have with ksqldb is it's very immature, not super well documented, and doesn't yet have a community critical mass. So if you run into some issues, you're pretty much on your own

torn sphinx
#

I'll go through the legal terms with someone.. it got me wondering if in case I use any other traditional db, do I still have to pay a license.. because I'm offering it up as a service

#

and since i'm offering it on our infrastructure.. does that mean I'm liable just like a cloud provider would be

rancid root
#

yeah that can get complex. For permissive licenses it's pretty much do whatever you want, for licenses like the confluent one it's generally 'don't just take our shit and extend/rebrand it to compete with us'. Incorporating those components as part of a larger composed offering is generally fine but again IANAL check with your lawyers

#

i think the confluent license is specifically targeted at AWS

#

they're notorious for taking foss projects, slapping a new name and some minor changes to them, and then offering them up as saas/paas

torn sphinx
#

lol

torn sphinx
#

I think I need to read an entire book on databases before Im going to be good at designing them.

glacial sparrow
#

Hi !

I'm trying to finish a query and don't understand how to implement my last statement :

SELECT
  Relic.Name,
  Relic.Era,
  Relic.Quality,
  GROUP_CONCAT(`Pseudo`) AS Pseudo_g,
  GROUP_CONCAT(`Quantity`) AS Number_g
FROM
  Relic
WHERE
  Relic.Era = 'Lith'
  INNER JOIN User ON Relic.IDOwner = User.IDUser
GROUP BY
  Relic.Name,
  Relic.Era,
  Relic.Quality
#

I want to add a

WHERE Era = 'Lith'
#

But everytime i get syntax errors :/

#

Thans in advance !

rain wagon
#

Where did you add it?

#

oh you edited

glacial sparrow
#

Look, i edited it above

#

All the query without the WHERE was workin

#

But i want to add a layer of filtering

rain wagon
#
SELECT
  Relic.Name,
  Relic.Era,
  Relic.Quality,
  GROUP_CONCAT(`Pseudo`) AS Pseudo_g,
  GROUP_CONCAT(`Quantity`) AS Number_g
FROM
  Relic
  INNER JOIN User ON Relic.IDOwner = User.IDUser
WHERE
  Relic.Era = 'Lith'
GROUP BY
  Relic.Name,
  Relic.Era,
  Relic.Quality```
#

THis should work

glacial sparrow
#

After the JOIN ?

rain wagon
#

obviously

glacial sparrow
#

It works ! Thanks a lot !
I often struggle to understand the syntax logic in SQL.
You place the where here, because it filter after the join ? I initially tried after the group BY, but wasn't working either.

rain wagon
#

GROUP BY is always the last

#

because it first needs the data before it can group

glacial sparrow
#

OH thanks a lot, it 'll be really usefull for me in the future !

rain wagon
#

one exception btw, for GROUP BY always being last. Having. I hate having.

#

because it filters the group

glacial sparrow
#

So having is after group by

rain wagon
#

yes

#

It's a WHERE clause for the group, essentially

torn sphinx
#

You know how certain books in Python will raise debate about how questionable their material is while others will be authority material? I'm trying to make sure I'm learning from the agreed best source if there is such a thing.

solid void
#

the postgresql doc itself is very good

torn sphinx
#

Cool. I found that was also true of Django docs so I'll do that then

#

But..

#

I am very new to databases period.

#

So it may not be targeted at me

solid void
#

They do explain about concepts in the begining

torn sphinx
#

Good to know.

#

Welp, they endorse that website in their docs, too so

#

I might continue with it since I already started

torn sphinx
#

PostgreSQL 10.9 (Ubuntu 10.9-0ubuntu0.18.10.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 8.3.0-6ubuntu1~18.10.1) 8.3.0, 64-bit

I should upgrade to 11 right? I had this installed automatically from ubuntu server packages in the ubuntu os installation

#

I want stable and secure, not really concerned with features

#

I might just stick with this. It just seems one version behind and it seems like it would be best practice to make sure I have the most to date secure version before I even get started

#

My conclusion is that at this stage in my learning, whatever features it offers are an unnecessary distraction to worry about upgrading it now. I'm just trying to understand this and in all likelihood the first thing I build isnt the thing that will go to production anyway.

solid void
#

Or 12

#

But 10 is not too bad.

torn sphinx
#

If I had started reading this documentation from the very beginning @solid void I can't even begin to quantify how much better off and less trouble I would have had.

#

Ive learned a valuable lesson about always just opening documentation.

solid void
#

yep, has been my experience a few times as well, before just googling and following blogposts or stackoverflow answer, just have a look at the official doc of stuff, that can save a lot of time.

acoustic leaf
#

What’s the perks of using SQL over JSON?

#

Please ping*

rain wagon
#

@acoustic leaf First of all, performance. Second, you can have relations between the tables.

#

So if you have a list of cities you make a table with it, if a user comes from one of those, you only insert the id of the city into the city field of that user. That is called a relation.

#

It does not only save space, but also create logical connection between the fields

solid void
#

keeping some of your sanity.

torn sphinx
#

Am I able to get the ping to the database through asyncpg?

solid void
#

What do you mean the ping of the database?

#

@torn sphinx

torn sphinx
#

If you need a database for running scripts so you don't have to query scraping a website a gagillion times, do I use a framework, do I do raw queries or do I use a module? If so, which framework / module?

#

And also, is there something like Eloquent ORM for python?

cinder sierra
#

what exactly are you looking for @torn sphinx? storing data from scraping or what?

keen rock
#

How can I keep my db updated? For example, I have a db that stores a players kills in a game, but their kills won't be the same in like an hour or a month. So how can I keep my db always up to date?

#

If that makes sense...

solid void
#

well, you make a request to your db to update the information when that happens.

#

you could just save the current kill count, or you could save each individual kills, and count them to display the info, depending on how much data that'll make and how much info you need for your app.

keen rock
#

Well I'm using a limited API so I can't make like constant requests to the API.

#

@solid void

#

To keep grabbing data.

cinder sierra
#

then only update up to your limit?

keen rock
#

Okay but how do I update it every like hour?

#

I feel like I'm not explaining this well.

cinder sierra
#

just put your program to sleep for an hour I guess? that's up to you

#

yeah we don't follow I guess. you first asked how to update a database entry now it's more about the API

keen rock
#

Okay, let me try again.

#

How do you keep a database updated all the time.

cinder sierra
#

that's entirely up to you. that's like tshirt said, you make an update request to your database to update those entries

keen rock
#

Hm, alright.

late stratus
#

Oh, hello, Dexter.

keen rock
#

@late stratus Dexter 😛

#

Hello there, Dexter.

late stratus
#

Haven't seen many people with the name. For me, it's just an alias

hazy peak
#

any1 here has good understanding of elasticsearh?

solid void
#

!ask

delicate fieldBOT
#
ask

Asking good questions will yield a much higher chance of a quick response:

• Don't ask to ask your question, just go ahead and tell us your problem.
• Don't ask if anyone is knowledgeable in some area, filtering serves no purpose.
• Try to solve the problem on your own first, we're not going to write code for you.
• Show us the code you've tried and any errors or unexpected results it's giving.
• Be patient while we're helping you.

You can find a much more detailed explanation on our website.

sinful stratus
#
SELECT * from users where server_id=$1 and $2=ANY(json_object_keys(contacts));
#

contacts is a json column

#

I am trying to search whether $2 exists in a row's contacts's keys

#

but I'm getting this error:

ERROR:  set-returning functions are not allowed in WHERE
#

any workaround for this issue?

sinful stratus
#

using psql fyi

hazy peak
#

Elasticsearch:
I want to do a "sum" sub aggregation for each user and get top 10 ordered by that "sum" sub aggregation. How would I go about this?
According to official documentation, this is not possible as sorting by sub aggregation, with size, increases the error on document.

tender dagger
#

is there a way in sqlite to be like:
SELECT points, id FROM users WHERE points > 0 ORDER BY points DESC and then find where in that list id: 12 is?

#

so id: 12 would be 5 from top since he has the 5th most points

#

aka find what position a certain row is in the selected rows when ordered

#

ROW_NUMBER () OVER () is what i was looking for

#

SELECT rows from ( SELECT ROW_NUMBER () OVER (ORDER BY points) rows, points, id from demo) where id = 12

tender dagger
#

ok so

#

i'm having a strange issue

#

my final queary looks line

#

SELECT rows FROM ( SELECT ROW_NUMBER () OVER (ORDER BY total) rows, total, user_id, guild_id FROM users WHERE guild_id = '674355323972091948' ) WHERE user_id = '146416331896520704'

#

which works fine in DB Browser for SQLite

#

but aiosqlite says OperationalError: near "(": syntax error

#

???

#

why does aiosqlite hate me

solid void
#

did you put your request in double quotes?

tender dagger
#

yes

solid void
#

can you show the bit of code that does the request?

tender dagger
#

here's my ungodly long line

#
cursor = await db.execute("SELECT rows FROM ( SELECT ROW_NUMBER () OVER (ORDER BY total) rows, total, user_id, guild_id FROM users WHERE guild_id = '674355323972091948' ) src WHERE user_id = '146416331896520704'".format(guild=str(ctx.guild.id), user=str(member.id)))
#

i thought it was me splitting the request into multiple lined string

#

but

#

making it one line didnt fix it

#

the format is left over

#

instead of the hard coded numbers it usually uses the format

#

but i wanted to test if the format is what was breaking it

#

which it is not

#

just havnt turned it back yet :P

solid void
#

you have more there than in the other one, no?

#

the "src" wasn't there

tender dagger
#

the src was to test

#

it wont run if i remove that

#

ill show ya

#

gone, and still gives the error

#

here's what the line looks like normally

#
        cursor = await db.execute("SELECT rows FROM ( SELECT ROW_NUMBER () OVER (ORDER BY total) rows, total, user_id, guild_id FROM users WHERE guild_id = '{guild}' ) WHERE user_id = '{user}}'".format(guild=str(ctx.guild.id), user=str(member.id)))


#

oops

solid void
#

yeah, that's not your problem here, but that's not the proper way to inject params, but as you said you have the error even with hardcoded ones.

tender dagger
#

i like to use format because it looks nice and makes me happy :p

solid void
#

i'm not sure what goes wrong here 😐 no experience with aiosqlite myself, i don't see how it would parse the thing differentely.

#

it should just pass it through

tender dagger
#

yeah i dont get whats wrong

solid void
#

you tried with simpler functions to see on which ( it would fail?

tender dagger
#

simpler?

solid void
#

maybe it dislikes the space after ROW_NUMBER

#

yeah, just simpler queries

tender dagger
#

fair fair

#

it could have some weird syntax requirement that neither db browser or sql online has

#

its def not the select rows from where guild id = guild id

#

that ive tried

#

so

#

its very likely the row_number or over ( order by)

#

select row_number () over (order by user_id) from users

#

it still hates that

solid void
#

i'm curious about why you put a space before ()

#

it's a function name, no?

#

i get that sqlite doesn't care as you ran it like this in it

#

but maybe it confuses aiosqlite

tender dagger
#
cursor = await db.execute("select row_number() over (order by user_id) from users")
#

still broken

#

i did that because

#

the example code i was following did /shrug

solid void
#

ok

#

np, now we are sure that's not the problem

#

Window function support was first added to SQLite with release version 3.25.0 (2018-09-15).

tender dagger
#

im on that page rn

solid void
#

i would check that the version of sqlite your python links against or builds into, is superior or equal to that version

#

it's not that old

tender dagger
#

idk what version of sqlite3 aiosqlite uses

solid void
#

sqlite3 is a standard python module

tender dagger
#

i know

solid void
#

probably aiosqlite wraps that

tender dagger
#

aiosqlite is a wrapper

#

yea

solid void
tender dagger
#

well

#

im using python 3.7

#

maybe its just time to update python

#

(and break fucking everything lol)

solid void
#

😬

tender dagger
#

god i hope not

#

lets find out, ill update python just for this project

solid void
#

you can check first though

tender dagger
#

how do i check that :p

#

i mean, i should probably be updating to 3.9 anyway

solid void
#

import sqlite3; print(sqlite3.sqlite_version)

tender dagger
#

3.21

#

fffff

#

FFFFFFFFFF

#

FFFFFFFFFFFF

#

i guess im updating python then lmao

#

guess im going to 3.8 since 3.9 isnt out yet

solid void
#

here on linux i'm seeing the same version in 3.7 and 3.6, so i think it comes from the system, so the details might depend on your os/distro

tender dagger
#

damn

#

nope

#

the newest sqlite3 is 3.31.1

#

h

#

python ;-;

#

well, thanks for helping

#

♡ @solid void

solid void
#

happy to help, seems 3.31 should be good then! 🙂

tender dagger
#

or i can use this hot garbage

#
from users a order by total desc```
#

lmao

onyx seal
#

Hi all, so I'm not sure if this is something that's relevant for this server, but I wasn't able to find a better place to ask about this.

I run a moderator bot on Discord (it's python based) and one of the issues I've been running into is that one of my databases (Postgres) needs to have bi-monthly mainteance. While this is part and parcel of any kind of bot that gets lots of traffic, the problem is that to perform the maintenance to the highest quality, I have to shut down the bot for up to 20 minutes so that I can run the maintenance tasks with an exclusive lock across the database.

Running maintenance windows every 2 months is also not particularly wise either, because in some cases, there are spikes of activity to the database that can cause problems much earlier than this.

So my questions are really this:

  • How do I configure the autovacuum daemon in a way that it'll run only when there's a large amount of dead tuples from inserts/updates etc?
  • How do I make sure that autovacuum will do VACUUM, ANALYZE and REINDEX?
  • What is the most effective way to partition my postgres database table (which contains over 25.5 million records) in a way where each partition is manageable and not slow?
  • In addition to the above question, how would I develop a trigger that auto creates new partitions on the fly as and when the conditions have been met? Currently, I don't store datetime objects (i convert them from a message id) but should I consider including this to make partitioning easier?

Thanks for the help if you can provide it! (and if this server isn't an appropriate place to ask, do let me know where a better place is 😅)

solid void
#

@tender dagger oh, i misread you last night though it was up to date in latest python, if it's not you might need to compile python yourself to link it against the proper libsqlite3, or maybe it's actually dynamicaly linked in your system and you should be able to just update libsqlite3 in your system.

#

if i do apt show libsqlite3-0 i see the same version number that python gets when i run it.

#

so finding a way to upgrade that could be enough.

tender dagger
#

i got it

#

3.8 has high enough

rain wagon
#

In SQLITE, how can I insert a timestamp in the SQL as iso8601?

#

Here is my current statement: SQL INSERT OR REPLACE INTO t_character_info(characterId,alliance, corporation,faction,ancestry,bloodline,race,gender,securitystatus,birthday,timestamp) VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, DATETIME('now'));

#

I know I can just generate it with Python, but I'd like a sql solution

lofty summit
#

Does someone know a good ORM for python? I know the one used by Django is good but I don't use the framework

cinder sierra
#

SQLAlchemy is basically the python go-to afaik @lofty summit

lofty summit
#

Yeah but I've heard mixed feedback on it

#

Lemon talked about it yesterday, have anyone used it?

#

I tried to use psycopg2 driver directly, but basically I was reinventing the wheel...

cinder sierra
#

reinventing what wheel?

#

i directly use psycopg2 all the time

lofty summit
#

Well maybe I didn't use it in a proper way, but what I was doing was just creating SQL queries with some args in strings and sending them

#

Using an ORM is like working on a higher level, and therefore it seems faster to code with it @cinder sierra

cinder sierra
#

i know what an ORM is, it's okay

lofty summit
#

I know you know

cinder sierra
#

if you're looking for easy then yes, ponyorm should be your go-to. if you're looking for full suite and capabilities, go with SQLAlchemy

lofty summit
#

Why do you use psycopg2 directly? When the database become a bit complex, I find writing directly SQL queries hard/boring

cinder sierra
#

full control

#

if something's inefficient or going wrong, it's only my fault

lofty summit
#

I thought like that before, but now I find the price to high for that, I and hope ORM are well made enough not to constrain me

solid void
#

Sqlalchemy is nice, except when it's not and you spend time finding how to express something properly through the abstraction, and doing without is not so hard, both have merits.

median night
#

I want to find all links in my mongodb thats first_seen is less then now. but somehow my query never matches. anyone can help? ```
urls = collection.distinct(
'link',
{
'first_seen': {
'$lt': datetime.now()
}
}
)

#

date is ISODate in the database

fervent valley
#

Hello I have a question. What terms do you wish you knew before going into this part of python? Please DM me as i might miss the answers!

rain wagon
#

@fervent valley You want to know what normal forms are and SQL

#

You also want to know the impact of PRIMARY KEY and what NOT NULL and UNIQUE do to a data set. Knowing basic data types is also a requirement

#

SQLITE does not statically type, but a lot of DBMS do

rain wagon
#

But then I guess that isn't the point of views

woeful maple
#

You can't use the ? for that type of replacement, it would only be suitable for WHERE clauses for comparison. You have to edit the query string

rain wagon
#

Yeah I have thought wrong

woeful maple
#

but be careful because that opens you up to SQL injections

rain wagon
#

Views are supposed to make things faster, tha tis why they are static

#

I know I can construct the sql string myself, but if you do that all world yells at you

#

no matter if there is no user input at that point

#

I have database models and they all share 2 functions that are the same

#

only the table names differ

woeful maple
#

I'm not sure what you mean "views are static", views are basically SQL queries stored in the database

rain wagon
#

has_entry() and is_expired()

#

Yeah, but views are meant to be static

#

at least the theory says so

#

Anyway, I'll just pass the whole sql string

#

It's clunk and verbose, but alas

shell drift
#

my error:

unsupported operand type(s) for +: 'NoneType' and 'int'.

my code:

                user = await self.bot.pg_con.fetch("SELECT * FROM punish WHERE user_id = $1 AND guild_id = $2", author_id, guild_id)
                
                if not user:
                    await self.bot.pg_con.execute("INSERT INTO punish (user_id, guild_id, mutes, kicks, bans, warns) VALUES ($1, $2, 0, 0, 0, 0)", author_id, guild_id)

                user = await self.bot.pg_con.fetchrow("SELECT * FROM punish WHERE user_id = $1 AND guild_id = $2", author_id, guild_id)
                await self.bot.pg_con.execute("UPDATE punish SET warns = $1 WHERE user_id = $2 AND guild_id = $3", user['warns'] + 1, author_id, guild_id)

yes warns is a int value

rain wagon
#

In this context, it is apparently None

#

put a print(type(user["warns"]) there and see what it spits out

#

@shell drift

rain wagon
#

I think I have broken SQLite

#

It always says "database locked", even after reacreating it

solid void
#

doesn't it use flock?

woeful maple
#

@shell drift I'd assume a NULL value is being converted into python's None. print(None + 1) returns the exact same error

torn sphinx
#

@rancid root do you think I can use apache druid as a mobile application backend

rancid root
#

druid is a timeseries database

#

and not a very good one

torn sphinx
#

I haven't found a good option to support as a mobile application backend.. for all my iot use cases

#

right now, it's like device protocol layer -> kafka -> transformations (possibly with sparkSQL and streaming) ----> Druid--> metatron (for visualization)

#

device protocol layer -> kafka -> transformations (possibly with sparkSQL and streaming) ----> Backend for mobile applications (???)

#

^ this part is missing

torn sphinx
#

@rancid root could you suggest some options.. preferably open source

rancid root
#

did you read the book i linked

torn sphinx
#

yeah

#

nothing in here about mobile apps

rancid root
#

yeah that's because the fact that it's a 'mobile' app changes nothing

#

focus on the data flows you need to support, they're pretty agnostic of what the client physically is

#

number of writers, number of readers, typical types of query, latency requirements, SLAs, HA requirements, security requirements

#

that should dictate your choices

torn sphinx
#

SLAs?

rancid root
#

more broadly, uptime requirements and timings around managing downtime

torn sphinx
#

hmmm

#

I found this

#

and this

fervent valley
#

Thanks @rain wagon !

serene nest
#

does anyone have any knowledge on sqli injections

stark lion
#

my python sql dont refresh the changes from other computer like phpmyadmin

#

like when i update a value on mysql via php i cant see the changes on my python script

#

i have to re-run the script to see the changes

#

???

red tapir
#

I've been trying to write a MYSQL query that returns a list of groups a given member (users.id) is not a part of. Essentially, the flow is this: When a user joins a group, a row is added to group_members (groupID, userID) -> (Example: user (id: 5) joins group (id: 3), new row in group_members = 5 | 3.

I can get a list of groups that a member belongs to with no problem, it's just a simple left join - but when I try to do the inverse, I can't seem to get it right and I'm not quite sure why.

Attached is the diagram. Thanks in advance for your help!

TL;DR: I want to return a list of groups that a given user is not a part of.

rain wagon
#

Should work with inner join

stark lion
#

can someone help me?

rain wagon
#
SELECT username, email, fname, lname
FROM users
INNER JOIN group_members AS gm ON gm.userID LIKE users.id
INNER JOIN groups ON gm.groupID NOT like groups.id```
#

Not sure if this works, but should

#

@red tapir

stark lion
#

@rain wagon do you know how to help me ?

rain wagon
#

@stark lion There is no issue with what you describe

#

Databases don't live update, you have to query data

stark lion
#

the 100 is the new value i changed via phpmyadmin

#

but python see the old one

rain wagon
#

@stark lion then phpmyadmin didn't commit the changes yet

stark lion
#

i dind

#

if i restart my program now

red tapir
#

The problem is that I'm looking for a list of groups that the specific user isn't in. When I try to reverse the query you wrote (grabbing data from groups -> inner join group_members -> inner join users), it essentially just returns what's already in the group_members table.

I'm not sure how it'd work, but I need to get a list of groups that a user isn't in. So, if user "admin" (id 1) is in groups (5, 6, 7), I'd want to return groups (1, 2, 3, 4).

stark lion
#

i will show the new value

rain wagon
#

COnnect to the database via cli and run a select command

#

Then you know for sure

stark lion
#

what is cli

rain wagon
#

command line interface

stark lion
#

k let me check via putty

rain wagon
#

Why did you reverse my query?

#

The last line was supposed to filter out users in a group

stark lion
#
|  1 | ΔΗΜΗΤΡΙΟΣ ΠΡΕΒΑΖΗ                   |     69696 | 2 ΕΠΑΛ     | Δ1    |          0 |            1 |   100 |   100 |

rain wagon
#

I mean, I could be wrong, Inner join was never my favorite

stark lion
#

see

#

the value is updated

#

python dind get the update

rain wagon
#

did you run the python script AFTER you have updated the value?

red tapir
#

That's why I was thinking left join would be the way to go, since I want the data from the groups table filtered through the scope of group_members. Essentially in pseudocode it'd be smth like: If ([groupID, userID] not in group_members) {return groups}

#

I flipped it around because I needed groups returned, not users haha.

rain wagon
#

oh

stark lion
#

i run the script before i update a value on phpmyadmin

rain wagon
#

@stark lion Run it after

#

of course it shows old values if you run it before

stark lion
#

meh

rain wagon
#

SQL can't see into the future

stark lion
#

why

#

look here

rain wagon
#

Because it's a query language.

stark lion
#
mycursor.execute("SELECT * FROM members")
    MyResultMembers = mycursor.fetchall()

    print('\n')
    print(tabulate(MyResultMembers, headers=['ID','ΟΝΟΑΤΕΠΩΝΥΜΟ', 'ΑΡ.ΜΕΛΟΥΣ','ΣΧΟΛΕΙΟ','ΤΜΗΜΑ','ΕΠΙΣ/ΕΒΔ','ΣΥΝΟΛ/ΕΠΙΣ','ΒΙΒΛΙΟ 1','ΒΙΒΛΙΟ 2'], tablefmt='fancy_grid'))
    print('\n')
#

here the code shows all the members

#

why it dont get update for the new values

rain wagon
#

Dude, again. If you run the script before you change a value, then it won't work

#

You need to query after each change

stark lion
#

k then why

#

when i change something via python

rain wagon
#

Because SQL is a query language.

stark lion
#

it show the updated value

#

is possible to make it update?

rain wagon
#

yes, run another select

stark lion
#

how?

#

wait

#

no

#

you dont understant

#
def ShowMembers():
    clear()
    display = [{"ΜΕΛΗ ΤΗΣ ΒΙΒΛΙΟΘΗΚΗΣ"}]
    print(tabulate(display, tablefmt='fancy_grid'))

    mycursor.execute("SELECT * FROM members")
    MyResultMembers = mycursor.fetchall()

    print('\n')
    print(tabulate(MyResultMembers, headers=['ID','ΟΝΟΑΤΕΠΩΝΥΜΟ', 'ΑΡ.ΜΕΛΟΥΣ','ΣΧΟΛΕΙΟ','ΤΜΗΜΑ','ΕΠΙΣ/ΕΒΔ','ΣΥΝΟΛ/ΕΠΙΣ','ΒΙΒΛΙΟ 1','ΒΙΒΛΙΟ 2'], tablefmt='fancy_grid'))
    print('\n')

    bye = input("ΓΡΑΨΕ ΟΤΙΔΗΠΟΤΕ ΓΙΑ ΕΠΙΣΤΡΟΦΗ: ")
#

here is my def

#

i run this def after the update

#

and it show me the old value

rain wagon
#

You are also reusing the cursor

#

you are not supposed to

#

get a new cursor for each query

#

or at least do mycursor = mycursor.execute("SELECT * FROM members")

#

But it's really better to get a new one

stark lion
#

wait

#
def ShowMembers():
    clear()
    display = [{"ΜΕΛΗ ΤΗΣ ΒΙΒΛΙΟΘΗΚΗΣ"}]
    print(tabulate(display, tablefmt='fancy_grid'))

    mycursor = mydb.cursor()
    mycursor.execute("SELECT * FROM members")
    MyResultMembers = mycursor.fetchall()

    print('\n')
    print(tabulate(MyResultMembers, headers=['ID','ΟΝΟΑΤΕΠΩΝΥΜΟ', 'ΑΡ.ΜΕΛΟΥΣ','ΣΧΟΛΕΙΟ','ΤΜΗΜΑ','ΕΠΙΣ/ΕΒΔ','ΣΥΝΟΛ/ΕΠΙΣ','ΒΙΒΛΙΟ 1','ΒΙΒΛΙΟ 2'], tablefmt='fancy_grid'))
    print('\n')

    bye = input("ΓΡΑΨΕ ΟΤΙΔΗΠΟΤΕ ΓΙΑ ΕΠΙΣΤΡΟΦΗ: ")
#

like this?

rain wagon
#

yes

stark lion
#

yea i did it before

#

again the python have the old values

#

on phpmyadmin and my server(putty) i see the new values

#

and i run the def ShowMember() after i update the values

lime cobalt
#

how could i modify a column that is used as a foreign key in another table

rain wagon
#

@lime cobalt Turn off referential integrity, modify it on both tables, make sure the transfer was successful, turn referential integrity on again. However, the better way would be to export the tables, re-create them and inject the data again.

#

If it is sqlite, that does not care about referential integrity unless you pragma it after connecting

feral spruce
#

is learning DBMS necessary for data science ??

#

or any web development related stuffs ??

lime cobalt
#

@rain wagon i found```sql
set foreign_key_checks=0;

rain wagon
#

@feral spruce Webdev definitely

median night
#

someone able to help me with a mongodb query?

feral spruce
#

thnx @rain wagon

rain wagon
#

@lime cobalt It is different for each dbms, so use what your manual tells you to

reef hawk
#

I would like to be able to have 2 computers (1 local windows, 1 VPS ubuntu) be able to access the same database. How would I go about being able to do that? And because I would like to access the database quicker on the VPS, is it possible to set the database local to VPS and remote from my computer?

median night
#

@reef hawk you just set up the database on your vps. then youd use localhost:<port> to connect to the db on your vps. on your local computer you just use <ip_of_the_vps>:<port>

#

ofc make sure that your firewall on the vps doesnt block you

reef hawk
#

hmm okay, thanks! do you know what's the best resource to learn about this stuff?

#

I've been reading on it and it says I need to make sure it has to listen as well?

median night
reef hawk
#

okay yeah that'd help

#

thank u!

median night
#

np

grand lark
#

I made a discord bot and my database is mysql. I use mysql connector in the code. Now the problem is that I get errors after running it for a few hours. errors.OperationalError('Mysql connection not available') is what I get. It can't be an idle timeout as it constantly reads and writes data as the bot is a level system. How can I make it so that the bot will reconnect to database when this error is caused? Anyone can help?

#

Also in the code, the connection is a function and I call the function every hour incase it disconnects. But once it is disconnected, the function doesn't work. The only fix I know is to restart the bot.

lime cobalt
#

aight so i have a mysql database for discord leveling system and i made all id related columns of type integer

#

but ids on discord are massive numbers

#

so i get the out of range column error

#

is there is a way to change every column that has 'id' in the names' datatype to varchar

torn sphinx
#

I'm not on mysql right now, but you could try getting the columns and checking with LIKE for '%id%' .. not sure it'll work, but worth a shot

#

then cast the columns after checking type

#

You know.. what I don't get.. why we use = instead of ==

stray thorn
#

Im fairly new to database programming in python, so hear me out here. I have created an RDF graph, a fairly simple one, that i then parse and save on a .txt file, which i then copy, paste and update to my database, but i am gettin a weird prefix on all my triples called "ns1" that i have no idea where comes from. It just appears out of thin air, and is an empty prefix that messes up my queries in the database.

#

it looks like this @prefix ns1: <http://> .

grand lark
#

@lime cobalt make the column data type to BIGINT

#

Not INT

lime cobalt
#

Why not string

#

@grand lark

#

Varchar*

grand lark
#

@lime cobalt u are storing IDs and IDs are only integers

lime cobalt
#

@grand lark thanks for the info but is there is a way to change all columns that contain ids instead of manually altering each one

#

All of the id columns end with _id

reef hawk
#

I'm looking to transfer the db from a local one to a remote one

#

however, I want to designate the remote server to have the local db, and the local computer to be remotely connected to it

solid void
#

well, just make them both connect to the address of the remote server

shadow path
#

Hi all, I'm new to this community, just joined a couple days ago.

Just wondering if I have SQL related questions, should I ask in one of the help channels or here is fine?

grand estuary
#

Here is probably fine

shadow path
#

ah ok

#

Was wondering how to query this on SQL.

To give some context, I have a table that is presented in as "events" or logs that shows changes made to (let's say) an account.

Table Event
|event_id|acc_id|changes                              |timestamp          |
|1       |1     |"status": ["pending", "reviewing"]}  |2019-01-01 07:00:00|
|2       |1     |"status": ["reviewing", "passed"]}   |2019-01-01 16:32:10|
|3       |1     |"status": ["passed", "activated"]}   |2019-01-01 22:10:59|
|4       |2     |"status": ["reviewing", "passed"]}   |2019-01-02 18:34:22|
|5       |2     |"status": ["passed", "activated"]}   |2019-01-02 15:09:43|

But for the end result, I would like to aggregate the acc_id's together into one row and have the different timestamps as new columns, something like:

|acc_id|reviewing timestamp   |passed timestamp      |activated timestamp 
|1     |2019-01-01 07:00:00   |2019-01-01 16:32:10   |2019-01-01 22:10:59
|2     | xx                   |2019-01-02 18:34:22   |2019-01-02 15:09:43
grand estuary
#

what sql server are you running

shadow path
#

postgres

grand estuary
#

sorry, cant help; I only know mysql

shadow path
#

ah ok no worries

reef hawk
#

pg_dump -C -h remotehost -U remoteuser dbname | psql -h localhost -U localuser dbname

#

what does the remotehost stand remoteuser stand for, vs localhost and localuser?

#

ah ok nevermind it's the IP address right?

grand estuary
#

yes

reef hawk
#

and user would be postgres if I didn't edit it?

grand estuary
#

remotehost = db to connect to, remoteuser = user on the database

reef hawk
#

okok thanks!

lofty summit
#

Has anyone already tried to use redis with python?

rain wagon
#

I am pretty sure people have

reef hawk
#

how do I find the remotehost / localhost address for postgresql? I don't have much clue where to find it 😔

grand estuary
#

where is the database hosted (what is their IP address / domain )

#

or is it your own?

reef hawk
#

it's all my own, I'm reallly dumb with this stuff

#

basically I have a local computer running on windows, and a VPS on Ubuntu

#

I'm trying to transfer to database over to the VPS, and start running the the db on the vps

#

oh fk wait ik the vps one

#

is it just the ipv4 that's shown in the VPS?

grand estuary
#

pg_dump -C -h localhost -U postgres (database on your personal computer)

| psql -h (ip of your vps) -U postgres (database on your vps)

#

just splitting the command for ease of reading, put back together upon running it

reef hawk
#

ah okay, and the ip is the ipv4 right?

grand estuary
#

yep

reef hawk
#

gotcha

#

lemme try right now

#

thanks a lot!

reef hawk
#

I get an error that says FATAL: no pg_hba.conf entry for host"x.x.x.x"

#

I'm assuming that just seems I have to put an entry point in my config file?

#

or could that mean another error

#

(while trying to access the server on PGadmin4 on my local computer)

#

or actually when trying to paste it too

grand estuary
#

sorry, im not a postgres fella

reef hawk
#

thanks a lot anyways, I think I can figure the rest out on my own

#

appreciate it!

grand estuary
#

np

reef hawk
#

instead of localhost, how would I be able to find my ip address of the local computer? I think I need to add that to the remote

grand estuary
#

Are you on your local computer with the database?

reef hawk
#

yep

grand estuary
#

localhost will always be resolved to 127.0.0.1 - changing localhost to anything else will not change the command.

flat solstice
#

I'd suggest copying dump file over with scp

#

or starting a ssh tunnel

#

exposing your database over public IP may be dangerous

reef hawk
#

ah alright, thanks for the info

flat solstice
#

(postgresql will connect unencrypted by default, I think)

#

tunnel should be easy, like ssh hostname_you_usually_use -L 54321:localhost:5432 - then localhost:54321 will be a tunnel to postgres inside hostname_you_usually_use so you can use localhost:54321 in pgadmin on your local machine to connect there (5432 is postgres port, I've added 1 at the end of the local port to avoid conflicting with your local database)

finite lynx
#

I am using flask-sqlalchemy, I want to change some column names, is there a way to update my database, keeping the current data?

upbeat lily
#

There should be. Are you using sqlite?

#

If you're not, then you're looking for Alembic

cold karma
#

Just learning python , what sql software to use for database create edit and all

torn sphinx
#

@rancid root I have a predicament

#

so, I'm using kafka to manage my event streams, I'll use an open source schema registry to maintain the schemas.. but

rancid root
torn sphinx
#

lol

#

the thing is, my source of streaming data (payload from devices) gives me new json every few seconds

#

the payload contains information for n devices as inner jsons within a json array

rancid root
#

yep

torn sphinx
#

the problem is.. not all devices send information at the same time.. so if I have 10 devices, I might receive payload for 1 device one time, then after a few seconds later few more devices.. and later, the rest of the devices among the 10

#

so how do I account for this, when I want to convert payload to Avro and write to a sink for example

#

ok, now I realize it

#

if they are all devices, they go under the same columns anyway..

#

phew

#

im so stupid

#

it'd be great if you could let me know of some gotcha's when handling data like this.. or your experience with those telecom data streams

#

so I know what to design and expect

rancid root
#

ok so my internet connection died... random. On hotspot now

#

what you're describing is pretty typical, idk how you're ingesting the data so the answer is general

#

if it's something like clickhouse, it'll do it natively

#

if you're at low level, just parse out the individual records that make up a row and push them onto the sink

torn sphinx
#

yes, but to infer that they are individual records, I need to see the schema first, or allow users to see it..

#

I found a sink that can serve as an application database.. yugabyteDB

#

looks promising

rancid root
#

the schema is, an array of dictionaries representing a row each

#

if i'm understanding correctly

torn sphinx
#

yes

#

and the number of dicts is not constant

rancid root
#

doesn't matter

#

you know the structure of each dict

#

just iterate on them

#

again clickhouse does this natively

#

idk about yugabyte how it works in that scenario

torn sphinx
#

the thing is, I was hoping to do this early on in the pipeline, and then convert to a lighter format like avro to write to the DB

#

and another thing about doing it early in the pipeline is being able to write it out for doing analytics

finite lynx
#

I am, but hie can I do that? @upbeat lily

#

How*

rancid root
#

shrug then just write a kafka worker that eats a list of dicts and spits out individual dicts or something

#

keep in mind that will cost in perf, kafka is bottlenecked by messages per second rather than bytes per second

upbeat lily
#

Hm. I believe Alembic has problems with SQLIte, you still want to use it probably, but I'm not near experienced enough to be able to help you

#

If I were you, I'd dump my data to something like JSON, and start again with SQLAlembic from the start - but its very much a hacky solution

wild nest
#

can anyone help me with connecting to firebase?

#

i keep getting this error

wary panther
#

MySQL db question: am I able to update table add column... and then fill that column in based on the data being sorted a particular way (ORDER BY...)?

#

actually nvm there is an ORDER BY RAND() 😄

reef hawk
#

How come when I use the host ip address instead of localhost for postgres, it requires authentication? e.g. it doesn't recognize its own IP address

#

and if I put the IP address instead of local host, would it be a remote connection instead or still be local?

grand estuary
#

There are rules for who needs to authenticate.

reef hawk
#

yeah, but I'm surprised it wouldn't recognize it's own IP address

#

it works when I leave it as localhost

grand estuary
#

Its not meant to for security reasons.

reef hawk
#

ah alright

#

would the connection then be a remote connection or

grand estuary
#

Even when connecting to localhost, it will still use sockets and still be a remote connection.

reef hawk
#

oh i see

#

I'm basically trying to make it fast I guess..

#

would there be a difference if I put localhost vs its own ip address?

grand estuary
#

In speed, yes; but its so minimal and for your use case its not a deal breaker.

reef hawk
#

ahh alright

#

thank you so much!!

stark lion
#

If i install mysql on windows

#

It will start automatically when the computer starts or i have to run it every time i open my computer?

steady epoch
#

Can I insert 4 value in one column

terse stump
#

did you have an option during install to use mysql as a windows service @stark lion ?

#

but i would assume there are some options to have it auto-start if it does not do that already

reef hawk
#

I'm trying to collect all columns in a table to display, if the column is not equal to 0. How would I go abouts doing that using asyncpg?

#

e.g.
useritems = await self.bot.pg_con.fetchrow("SELECT * FROM items WHERE user_id = $1", user_id) #if the column is not 0

lime cobalt
#
CREATE TABLE items (
    item_id BIGINT,
    guild_id BIGINT,
    title VARCHAR(30),
    info VARCHAR(400),
    price INT,
    reply VARCHAR(3000),
    given_role INT,
    removed_role INT,
    dm_msg VARCHAR(3000),
    
    PRIMARY KEY (item_id, guild_id),
    FOREIGN KEY (guild_id) REFERENCES guilds(guild_id) ON DELETE CASCADE
);```
#

#1215 - Cannot add foreign key constraint

#

why

solid void
#

what's the guilds table definition?

lime cobalt
#

@solid void i dont have the sql

solid void
#

\d+ guilds

#

(assuming postgresql)

lime cobalt
#

mysql

#

\d+ guilds?

solid void
#

i don't think it'll work in mysql

#

anyway, i'm not sure there is a unicity constraint on the guild_id there

lime cobalt
#

unicity?

#

its a primary key

solid void
#

was it defined as such?

lime cobalt
#
CREATE TABLE guilds (
    guild_id BIGINT,
    min INT,
    max INT,
    cooldown INT,
    PRIMARY KEY (guild_id)
);
solid void
#

oh right, the key icon says that

#

been a long time i didn't use mysql/phpmyadmin

#

ok

lime cobalt
#

what is \d+ guilds supposed to do

solid void
#

it's just a psql command to display basically the same info you showed in that capture, so you can forget about it

lime cobalt
#

oh

solid void
lime cobalt
#

1) The table or index the constraint refers to does not exist yet (usual when loading dumps).

#

it exists

#

2) The table or index in the constraint references misuses quotes.

#

it doesnt have quotes

#

3) The local key, foreign table or column in the constraint references have a typo:

#

nope

#

4) The column the constraint refers to is not of the same type or width as the foreign column:

#

uh i will try specifying the width maybe?

#

nope

solid void
#

maybe

lime cobalt
#

5) The foreign object is not a KEY of any kind

#

isnt having foreign keys as primary keys valid....

solid void
#

that would be pretty broken

#

i think it means the opposite, and that's not your problem, as the FK is a PK

#

that was my first question

#

6 is not your problem either

lime cobalt
#

7

#

both are bigints

#

8

#

nope

#

9

#

is still a nope

solid void
#

10 nope too i assume

#

11 too

#

not 12 either and not 13 unless it's implicit

#

so i would maybe add a "NOT NULL" to the "guild_id BIGINT" line

lime cobalt
#

hm

#

in which table

#

both?

solid void
#

it's already implicit in the guilds table (primary key), so only on the items one

lime cobalt
#

wow that

#

didnt work

solid void
#

Creating a foreign key constraint requires the REFERENCES privilege on the parent table.
theck you have that permission

lime cobalt
#

no idea how

#

oh show grants

solid void
#

yep

lime cobalt
#

Error Code: 1142. SELECT command denied to user 'mHThQOLvY3'@'154.237.72.84' for table 'user'

#

oof

solid void
#

(also, i pointed to the 5.7 doc, make sure to check with the version of mysql you use)

#

hm, so that's a good hint you are not using the root user, so maybe that's the reason, though i don't see why you wouldn't have this permission.

lime cobalt
#

i am poor

solid void
#

oh ok

solid void
#

The permissions you have on your database are SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER. Others will come in the future, but not at the moment. These can all be viewed on your dashboard.

#

yeah

#

so that one seems good as well

#

that's weird.

lime cobalt
#

should i just

#

drop guilds and make it again

#

dunno

solid void
#

maybe someone else more knowledgeable in mysql than me will have an idea, because i'm a bit short here.

lime cobalt
#

well thanks for your help anyway

solid void
#

yw

long cedar
#

hello

#

so my teacher gave me some quaries that i put into pg admin.

#

and im suppose to find the ID and name of athletes who perform better than someone at some sport

#

was thinking i would try it liek this

#

select p.id, p.name
from people p

except

select distinct p.id, p.name
from results r
join people p on r.peopleid = p.id
where r.result = select min(r.result)

#

id and name is from a table called person. peopleid is a foreign key in results and result is the records of everyone in all sports

#

but im getting an error in the last line when i try this

#

maybe i should rather try to do it like this?

#

select r.peopleid, p.name
from results r
join people p on p.id = r.peopleid
group by r.peopleid, p.name
where r.result != min(r.result)

#

but i also get an error

solemn ridge
#

What does this mean?

#
sqlalchemy.exc.ArgumentError: Textual SQL expression 'Example' should be explicitly declared as text('Example')

vestal wadi
#

Hi Guys I have that many lines like BR:username:password in file.txt. How I can read each line and get 3 variables a=BR, b=username, c=password . The idea is get everything on txt and send to mongodb ( that part i already know ). Many thanks

grand estuary
#

csv.DictReader

vestal wadi
#

Thanks I will check that

gleaming quest
#

Why while using Discord.py SQLite databse acts like a ghost? I can put values in it and just after putting data I can get them out, but after closing the connection and reconnecting... There's no data

#

My code: py try: conn = lite.connect('users.db') c = conn.cursor() print('User database data:') for row in c.execute('SELECT * FROM last_online ORDER BY uid'): print(f'- {row}') print('That\'s all') except lite.Error as e: print(f'Error: {e}') finally: if conn: conn.close()

#

Searching C:/ for users.db has no effects

upbeat lily
#

You need to commit your changes I believe

#

with conn.commit()

gleaming quest
#

Hm...

upbeat lily
#

when inserting the users that is

gleaming quest
#

Okay, now data is saved ^-^

#

Searching C:/ for .db file ;-;

#

It still doesn't exist in script directory

#

Neither next to __main__.py and python.exe

upbeat lily
#

What is your current project structure?

#

Are you running your code with Pycharm?

#

If so, try putting it in the root of your pycharm project

gleaming quest
#
main_folder
| - bot
|   | - __main__.py
|   | - __init__.py
|   | - cogs
|   |   | - __init__.py
|   |   | - basic.py
| - README.md
| - LICENSE
```Mostly
#

If so, try putting it in the root of your pycharm project
@upbeat lily what "it"?

upbeat lily
#

the file you're trying to open

gleaming quest
#

Doesn't work

#

Also putting it next to __main__.py doesn't work

#

But... I can read data from SQLite databse

#

Maybe file is in cloud? ;v

upbeat lily
#

If you can connect to the database at all

#

you should be able to both read and write to it

gleaming quest
#

Yup

upbeat lily
#

how are you trying to write to it?

gleaming quest
#

Writing works

upbeat lily
#

then whats the problem

#

you

#

you create the database

#

but you can't find the file anywhere

#

?

gleaming quest
#

Writing code: ```py
conn = lite.connect('users.db')
c = conn.cursor()
sample_data = [
('KH06-GH78-SGYR-8J06', '2020-01-20 07:10:04')
]
c.executemany('INSERT INTO last_online VALUES (?, ?)', sample_data)
conn.commit()

#

Yup

#

File doesn't exist

upbeat lily
#

Oh sorry

#

now I understand

gleaming quest
#

Before I've added 3 rows of sample data

#

Now added 1 more

#

And I can read all 4

upbeat lily
#

try providing it an absolute path

#

so something like conn= lite.connect('C:\\path-to-your-directory\\example.db') and see if that works

#

assuming that works, then the problem is to do with how you're running your python Code I'd guess

#

Which is an area I'm rubbish at fixing problems in

gleaming quest
#

Works

upbeat lily
#

What are you using to run your code?

#

yeah

#

that should work

#

although I believe pathlib is a better solution

gleaming quest
#

Project virtual env

upbeat lily
#

Are you using Pycharm, or another tool?

gleaming quest
#

PyCharm

upbeat lily
#

If its Pycharm, you can fix it by messing about with the project interpreter configuration

#

I think

#

but the solution you posted is better anyway

gleaming quest
#

Everything works fine ^-^

#

Very thanks to you @upbeat lily

rain wagon
#
SELECT      tcs.characterId, ts.name AS station_name, tco.name as constellation_name,
            t_region.name AS region_name, tss.name AS solarsystem_name, tss.securityStatus
FROM        t_location 
INNER JOIN  t_characters AS tcs ON tcs.characterId = t_location.characterId
INNER JOIN  t_station AS ts ON t_location.stationId = t_location.stationId
INNER JOIN  t_solar_system  AS tss ON tss.solarSystemId = ts.solarSystemId
INNER JOIN  t_constellation AS tco ON tco.constellationId = tss.constellationId
INNER JOIN  t_region ON t_region.regionId = tco.regionId;```does someone spot my mistake? The statemet is returning multiple rows for some reason but I don't see it
#

it should return one row per ID, but it returns 2

#

Found it, thanks discord highlighting 😄

#

couldn't spot it for the life of me in dbeaver

torn sphinx
#

So I want to save user generated IDs (that have a relationship in other table) that I generate in groups. For example a group would look like [4, 593, 122, 600, 500] but since they're related to another table to have the relationship done I would need to store one ID per row. The issue is I want to keep track of which group they belong to (maybe on a per-user basis)
I thought of something like:

user_id, group_id, id
1, 1, 4
1, 1, 593
1, 1, 192
1, 1, 600
1, 1, 500
1, 2, 233
...
#

How can I achieve something like that?

rain wagon
#

with a foreign key @torn sphinx

#

Your group table must have a primary key

#

then store each number or group in a table with the group id as a FK

#

but remember: atomic data

#

only one value per field

torn sphinx
#

@rain wagon How would I know what group_id to use? As a serial?

rain wagon
#

yes, auto incrementing integer

#

If I understood you correclty

torn sphinx
#

@rain wagon Well the amount of ids per group would vary so wouldn't it be better to have them in different rows in this case?

rain wagon
#

yes

#

I was just unsure about the amount

#

if it is one number per row or otherwise does not matter

#

but what you have here is a typical n:m relation

#

and that third table resolves that relation

torn sphinx
#

Yeah I guess it makes sense to do it that way then, thanks

toxic rune
#

Generally if you're inserting the current UTC time on the database, would you use a default column (TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT (NOW() AT TIME ZONE 'UTC')) or calculate it on the application side (in this case a Python script) and insert it with the rest of the values?

glad bobcat
#

Hello friends. Not technically a Python but it’s linked to my usage of Python.
I need to create a pretty voluminous SQL database (500+ Gb) for my work but it doesn’t need to be super high performance or reliability as it’s mostly for analytics that’ll be computed once or twice a day. Would it make sense then to make it as an AWS EC2 instance with big storage?

#

Also I’ll run quite a lot of code that’ll make a lot of small but stupid requests to the server, so being on the same machine could help (SQLAlchemy yay)

solemn ridge
#

Hi I am looking to update the rows of a user but I am getting AttributeError: can't set attribute

#
def updatehandler(User):
    print(User.username)
    userobject = session.query(users).filter_by(username=str(User.username)).first()
    print(userobject)
    userobject.username = User.username
    userobject.password = User.password
    userobject.email = User.email
    userobject.firstname = User.fn
    userobject.lastname = User.ln
    userobject.birthdate = User.date
    userobject.about = User.aboutme
    session.commit()
    return
#

this is my function

rancid flicker
#

suggest me some some dbms project idea
database management system

rain wagon
#

@rancid flicker Write a database abstraction layer to get to know the lower functions a bit better

#

Use sqlite, write some classes that have a database as a backend to store their data

#

but abstract in a way that you don't need to worry about it

rancid flicker
#

I will think about

solemn ridge
#

Hi I am using sqlalchemy. I have created an Upadte User Info panel where users choose the info they want to change. I want the values that the user has filled to be changed and the rest should stay the same

#

This is an example response I get with only some of the fields filled

#
ImmutableMultiDict([('username', 'efa'), ('password', ''), ('email', ''), ('firstname', ''), ('lastname', ''), ('birthdate', ''), ('aboutme', '')])```
torn sphinx
#

So like, if you have a song by Dr Dre featuring Eminem, Lil Kim, Snoop Dogg, and DatBoi... the specific song on the song table has a OneToMany Relationship to the artist table?

void otter
#

Yes

torn sphinx
#

What if you wanted to differentiate between the head artist and the featuring artists? Would each song then have its own table with two columns, one with a OneToOne for the Main Artist and OneToMany for featured artists?

cerulean pendant
#

songs <-> artist is somehow many to many, then you can add data to the relation table

torn sphinx
#

Hrm, I think at this point the main thing that would help me understand database relations any better is to see as many example databases with accompanying PSQL/SQL as possible.

#

Are there any good repos or lists of such things that helped you guys?

upbeat lily
#

The way I learned databases somewhat properly was reading through Wikipedia articles during my database lectures

#

they have a lot of good examples there for things like normalisation

torn sphinx
#

Never would have guessed.

#

A database lecture might be exactly what I need! Im going to look on YouTube.

#

Look who it is.

#

I should have remembered that this guy generally has great content on this kinda stuff I never got around to back when I used to watch his old videos.

solemn ridge
#

Hi I am trying to update a user row using SQLAlchemy

#

When trying to update the user's attributes I am getting this:

#

userobject.username = str(User.username) AttributeError: can't set attribute

#

Please help me I have been stuck for hours

torn sphinx
#

hey guys

#

how do mobile applications use databases.. like, I'm not familiar with the concepts.. to me it seems like a lot of clients to service.. and I'd like to understand how it works

#

also, what's partitioning and indexing in this context

rain wagon
#

they mostly use sqlite as a local db

#

and use a rest api for accessing data online

#

Because having database access tokens such as passwords in an app in the hands of a user is not advisable

torn sphinx
#

ok rest api.. check..

#

but how does one DB service so many clients (same app running on thousands of phones)

#

because to me, a DB is something you query, and maybe allow a couple more users to query

solemn ridge
#

Db and server arent the same

#

Db request are almost instant

celest zodiac
#

For something that services many users they typically have many nodes distributed horizontally to share the load

solemn ridge
#

Or that ^^

torn sphinx
#

it would be nice to see a picture or something.. of the architecture

rain wagon
#

With things like MariaDB you can also configure to store a lot in memory to cache it

celest zodiac
#

The kinds of databases used to run big services like a social network are designed to scale massively

torn sphinx
#

so distributed databases would have lot of nodes, ok.. but data is replicated across the nodes? how do you know where to go for which query

celest zodiac
#

There's a couple of ways to do that

rain wagon
#

@torn sphinx The key here is master-master replication

celest zodiac
#

typically an incoming request is just dispatched to whatever node is available, and the data is synchronized between nodes. Some services don't need absolute consistency between nodes to be useful

#

a financial institution has very different consistency requirements than a social network

rain wagon
#

Also, with the REST api in front of the db, you can cache a lot of the requests, if there is no new data

#

There is something called ETag which is used to facilitate that

torn sphinx
#

so every node has the same data to be able to service all incoming requests? that means writes are done to only one node in some cases?

rain wagon
#

If no new data is there, it just sends back a 301

#

causing very little cpu/network time

#

@torn sphinx Yes, with a replicating db, each db sends all DML type queries to the other

celest zodiac
#

there's a couple of different ways you handle consistency - if the DB cluster is "eventually consistent" (many NoSQL systems do this) then the writes are propagated out from the node where they were made without a guarantee that they'll be immediately available in all nodes at the same time

rain wagon
#

You do need to take measures against deadlocks though

celest zodiac
#

obvs this doesn't work for a bank :D

torn sphinx
#

ok, so in that case some nodes are behind the master

#

(I hope I'm using the master reference here correctly; i

#

my mind is on k8s when I think master node, and that does very little over there)

rain wagon
#

On each db, you can set one slave

#

so that master sends all DML type requests to their slave

#

now you activate the slave on the other db for their master

#

and hence, master-master replication

torn sphinx
#

so writes happen on both masters?

#

concurrently?

rain wagon
#

No, you need to load balance it

torn sphinx
#

oh

rain wagon
#

So each db gets the next user, one left, one right so to say

#

and then they replicate each other

torn sphinx
#

so wherever the write happens (in either master) the read can happen on any slave, because it gets replicated across all slaves?

rain wagon
#

yes

torn sphinx
#

yay im finally getting it

#

could you tell me what partitioning and indexing are..

#

it would be great if you could say it in context of a mobile application..say fit bit

rain wagon
#

This was my final exam work for my profession. It utilizes nginx and mariadb for a load balanced wordpress environment

#

And 2 master-master replicated dbs

torn sphinx
#

like, how do they keep each user's location info and have it available to be retrieved.. they must maintain one instance of DB serving thousands of clients right?

rain wagon
#

A database is made to serve a lot of data in a short amount of time

#

And there are a lot of techniques to make this happen

#

for example, clustering

#

memory storage

#

and so on

torn sphinx
#

listening

rain wagon
#

What shall I say? On my last job we had a cluster running 30 linux RHEL servers for websites

#

With 3 memory servers that do nothing but store user sessions

#

Data was delivered with a SAN

torn sphinx
#

ok, I understand clustering.. you have more nodes to service more traffic

rain wagon
#

Whcih is a high performance Storage Area Network using Fibre channel

torn sphinx
#

user sessions?

rain wagon
#

that has about 2.5 tb/s throughput

torn sphinx
#

oh.. the stateful stateless thing.. so you can remember and service users without new login?

rain wagon
#

yes

torn sphinx
#

ok

rain wagon
#

And carts

#

Shoppping carts

torn sphinx
#

got it

rain wagon
#

this is a fibre switch

#

eh wrong link

#

So, with a data connection like that, you can deliver a lot of data in a short amount of time with a low latency on a very high distance

#

We've actually put 1km more fibre into one rack, so the latency to both locations is the same

torn sphinx
#

I dont think I need to understand the hw details.. I'm on the data engineering side, and trying to get a handle on what to provide to mobile apps to make our sink available for those applications

#

it's why I'm trying to understand partitioning and indexing in this context..

rain wagon
#

It basically splits many tables into smaller slices

#

but tbh, I haven't personally dabbled with that 😄

torn sphinx
#

like.. for example, in normal analytics context, we would partition by key, which would make it easy to write the messages to specific partitions.. that makes it faster to query for end users

#

so i'm basically trying to understand, if partitioning and indexing concepts work in serving mobile apps.. in ways to make querying faster and reduce latency

#

ooh this is useful

rain wagon
#

I don't think there is an universal answer on that though

#

I think it really depends on how large your db is, to tell if the partitioning has an effect

torn sphinx
#

im wondering how it works for consumer facing apps

#

I mean, the only unique thing is the User ID or something,

#

so what would they apply in that case

rain wagon
#

I am sure there are more uniques

#

Article ID

#

just one example

#

Category ID

#

you get the gist

torn sphinx
#

what about instagram

#

or fit bit

rain wagon
#

Post ID

torn sphinx
#

or pokemon go

#

like.. they store user information linked to things on the app

#

wonder if they partition or index by anything

rain wagon
#

You'd have to ask the creators of such systems 😄

torn sphinx
#

T.T

#

but I need to know

#

I'll keep searching for resources then

rain wagon
#

speculating isn't knowing 🙂

torn sphinx
#

ok.. so ideally this approach should work

#

I find out what db service they use and see best practices are there for those

rain wagon
#

Just select one database system you want and have a look

#

MySQL is just as capable as PostGres or Oracle etc

#

Or look at what those db do best

#

Some are better in some areas the other isn't

torn sphinx
#

yes, but the methods for partitioning change depending on the type of service yeah?

#

for example, storing items for an ecommerce website.. is different from maintaining customer linked info

#

thanks for the help.. I learned a lot.. will do some reading in the AM

torn sphinx
#

I just read this entire conversation and my mind is blown. Thats all.

glad bobcat
#

They even specify:
The Hybrid Value pattern is very useful for any kind of value that may have multiple representations, such as timestamps, time deltas, units of measurement, currencies and encrypted passwords.

#

Ok for timestamps it’s actually pretty easy to do a hybrid property with SQL functions.

But let’s say I want to have a full Python hybrid property and want to be able to query/filter on it, what is the exact syntax and which functions do I need to implement in the comparator?

#

This looks like what I’m trying to do (and I have a similar problem as what he’s talking about with FX rates)

wary panther
#

I think I have a python/database question? I have a query that is a tad bit complicated. I'm updating a particular column in a database and setting some null values to a string value under the condition that ANOTHER column left joins with a column in a DIFFERENT table in that database. Actually, it's a little bit more complicated than a left join because it's a left join where the other value is null, to make it just the outside bit of the venn diagram. whew.

I think between the query being complicated and the database being large, it's just not working very well trying to get it to go through. It's taking forever. And it times out.

I'm trying to find a way to fix this and my guess is I should chunk the data so the query gets run multiple times but just on portions of the data? Is it not super complicated to write something in python to do that?

Or am I thinking about this the wrong way and should be doing something to my database to make it work better?

#

oh sorry Tolki, your stuff first

glad bobcat
#

Nah it’s ok everybody is lost here anyways

#
    """Hybrid value representing the champion name in English"""

    def __init__(self, champion_id):
        self.champion_name = id_getter.get_champion_name(champion_id)
        self.champion_id = champion_id

    def __eq__(self, other):
        return self.champion_id == id_getter.get_champion_id(other)

    def __str__(self):
        return self.champion_name

    key = 'champion_name'
#

This is sort of doing what I wanted but I’m not sure it’s right 🤔

#

I can filter on it with that but when accessing the hybrid field directly it’s... not very good looking

#

It is working but I spent 3 hours reading the doc and I’m still not sure I did the right thing so I’m just gonna go to sleep, gl hf @wary panther

#

If you’re using SQLAlchemy the yield_per option is very useful when dealing with big subqueries

wary panther
#

MySQL database, using pymysql when I'm using python to connect and do things with it from python

hexed epoch
#
    id = db.Column(db.Integer, primary_key=True)
    paper = db.Column(db.String)
    time = db.Column(db.DateTime)
    author = db.Column(db.String, db.ForeignKey('User.username'))``` am I using the ForeignKey right here? I'll still set the author when creating a new row but I am connecting it to my User table. right?
pliant pendant
#

do u have to say its a primary key everytime u access it tired

rich trout
#

@hexed epoch That should be, yes, although I'd strongly suggest not using a db.String foreignkey and using a user id number instead

turbid mulch
#

hi guys

#

I'm doing a new project it needs redis, but I use windows 10 system

#

Does redis support for windows

#

if so please guide me how to install and run in windows

#

I have worked with postgres, mysql and mongodb but I'm new to redis

half basalt
#

Hello!

I am currently trying to scrape some data from a firebase realtime database I found online. I have the apikey, authDomain, databaseURL, projectID, storageBucket, and messagingSenderID. I have also found on the website a few instances of url suffixes that produce json data. However, I have reason to believe there are other possible suffixes that store data which I'd like to access. Is there a way of listing all of the available files on a domain with the info I have?

radiant minnow
#

I don't think firebase exposes an endpoint that tells you all other endpoints available

prisma kernel
#

I have a non-python issue in that i have an established MSSQL database and need to prepare a script that (destructively) enforces new constraints. My current approach is to add the constraints in a disabled state, get the offending lines with DBCC CHECKCONSTRAINTS, delete the offending lines and enable the constraints. However i have issues with adding DISABLED NOVALIDATE constraints in MSSQL
any ideas?

torn sphinx
#

can someone explain RAFT to me please

#

I dont get it.. do all the nodes contain all the data

torn sphinx
#

what is the difference between those two ?

#

which is the "tightest" constraint?

pure cypress
#

The former is a single constraint on two columns, meaning that individually the columns could have duplicate values, but the two columns together must be a unique combination.

#

The latter has a separate constraint for each column, meaning each column must have unique values on its own.

torn sphinx
#

so the former has the tightest constraint ?

pure cypress
#

I'm not certain on what tightest means but I think it'd be the latter, since each column must be unique. The former would allow for duplicates as long as the combinations were unique.

#

In the latter, it is not even possible to have two different people which share a birthday.

gaunt jay
#
SQL = "SET @potion_name = (SELECT `Item Name` FROM `UserItems`,`Items` WHERE `UserItems`.`UID` = {0} AND `UserItems`.`SID` = {1} AND `UserItems`.`PID` = `Items`.`IID`);SELECT @potion_name".format(member_id, self.server_ID)
print(SQL)
self.cur.execute(SQL)```
#

the query itself works perfectly in phpmyadmin

#

but when i run it on python lik eabove,

#
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SELECT @potion_name' at line 1")```
#

whats going on..... 😦

wicked fog
#

@glad bobcat if you want to be able to run SQL order/filter by computed properties with sqlalchemy

#

you need to define the computed property as an SQL function in your db

glad bobcat
#

I actually managed to do it just by defining a comparator

wicked fog
#

then in your sqlalchemy class definition (after class but before def __init__(self) if you have it)

glad bobcat
#

I'm unsure if that's the right way to do it but it works sooooo

wicked fog
#
computed_property = Column(Float, server_default=FetchedValue())
glad bobcat
#

You can have pure python attributes computed at the instance level though

#

And still filter on them

wicked fog
#

My understanding is that it's generally faster to offload that onto the db

glad bobcat
#

Well for my case it definitely isn't, it's just a comparison to a dictionary and the SQL produced directly queries the right DB files with the transformed values

#

I just implemented eq() because I only filter on equalities for that field and it works perfectly.

#

class GameDate(object):
def init(self, timestamp):
self.timestamp = timestamp
try:
self.date = datetime.fromtimestamp(timestamp / 1000)
except TypeError:
self.date = None

def __lt__(self, other):
    return self.timestamp < other.timestamp() * 1000

def __gt__(self, other):
    return self.timestamp > other.timestamp() * 1000

def __str__(self):
    return str(self.date)

def __repr__(self):
    return str(self.date)

key = '
#

I'm on phone sorry

#

But this worked to translate a timestamp to a proper datetime object on instance level

#

And still be able to filter on it

wicked fog
#
 class GameDate(object):
     def __init__(self, timestamp):
         self.timestamp = timestamp
         try:
             self.date = datetime.fromtimestamp(timestamp / 1000)
         except TypeError:
             self.date = None
 
     def __lt__(self, other):
         return self.timestamp < other.timestamp() * 1000
 
     def __gt__(self, other):
         return self.timestamp > other.timestamp() * 1000
 
     def __str__(self):
         return str(self.date)
 
     def __repr__(self):
         return str(self.date)

#

Formatted

torn sphinx
#

Why is the result of sqlite3 sometimes simply None, and sometimes a tuble containing only None?
So either
None
or (None,)
This is absolutley confusing, and just makes checking for none so much harder

void otter
#

it's i think that sqlite connector requires a tuple to interact with database

torn sphinx
#

So do I always have to do a bunch of try, except stuff just to check if the result is None, or contains None? This is so tedious...

hexed epoch
#
class Class(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    label = db.Column(db.String(4))
    pupils = db.relationship('User', backref='class_', passive_deletes=True)
    assignments = db.relationship('Assignment', backref='class_', passive_deletes=True)


class Assignment(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String)
    number = db.Column(db.Integer)
    class_id = db.Column(db.Integer, db.ForeignKey('class.id', ondelete='CASCADE'))
    task = db.Column(db.String)
    deadline = db.Column(db.DateTime)
    papers = db.relationship('Paper', backref='assignment', passive_deletes=True)


class Paper(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    paper = db.Column(db.String)
    time = db.Column(db.DateTime)
    author_id = db.Column(db.String, db.ForeignKey('user.id', ondelete='CASCADE'))
    assignment_id = db.Column(db.Integer, db.ForeignKey('assignment.id', ondelete='CASCADE'))``` I'd like to get all assignments provided a User's class_id where User hasn't got any papers.
#

I've tried python Assignment.query.filter((Assignment.class_id == current_user.class_id) & (Assignment.papers.any(Paper.author_id != current_user.id))).all() doesn't work tho.

#

Anybody got ideas?

#

(it's flask_sqlalchemy)

hexed epoch
#

I got it figured out, it works but only if there's a Paper object in Assignment.Paper, I mean a solution could be to just always create a blank paper with every Assignment but can I check someway if Assignment.papers is None?

wary panther
#

can I partition a database table that is already created or do I have to do that at table creation?

torn sphinx
#

what's the use of time series being supported as a native data type in time series databases

calm charm
#

how to add values to a table in sqlite3

shell ocean
#

@torn sphinx often faster/more intuitive operations

bold pond
#

I'm trying to connect mysql with python and got this:

#

mysql.connector.errors.InterfaceError: 2003: Can't connect to MySQL server on 'localhost:3306' (10061 No connection could be made because the target machine actively refused it)

#

my code:

#

import mysql.connector

'''from mysql.connector import errorcode
try:
cnx = mysql.connector.connect(user='root',database='menagerie')
except mysql.connector.Error as err:
if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
print("Something is wrong with your user name or password")
elif err.errno == errorcode.ER_BAD_DB_ERROR:
print("Database does not exist")
else:
print(err)
else:
cnx.close()'''

cnx=mysql.connector.connect(user="root",host="localhost",passwd="password",database="menagerie")

shell ocean
#

...do you have MySQL running?

bold pond
#

yes

indigo flame
torn sphinx
#

@shell ocean can I use spark streaming on small data, like few mb every few seconds.. i'm trying to find some benchmarks.. but can't seem to

shell ocean
#

uh...why not?

torn sphinx
#

ok.. let me explain

#

isn't it too much overhead