#databases

1 messages Β· Page 28 of 1

topaz aurora
#

Side note: I'm a bit nervous so I really appreciate your insight and advice

#

I'm thinking of looking into it

topaz aurora
#

MySQL is the one I've worked a bit with

#

Should I?

stoic stream
#

ok, many things are similar across the board to be honest

#

are you interested in being an analyst/data scientist? Or more of a database admin/data engineer?

#

or just use databases in software development?

topaz aurora
#

Both but more Data Engineer now

stoic stream
#

ok you want to be learning DDL/DML features of SQL, creating tables and views etc.

#

and also understanding the performance of different queries, different joins, what happens when you join on int vs join on string, how the join differs between two big tables, vs one small table and one big table

#

also ETL, processing data to make it ready to then load onto a database/data warehouse, e.g. python on unstructured data to turn it into csv

#

I'm an analyst

#

so I won't pretend I know too much about data engineering

#

you can start getting into Terraform etc.

topaz aurora
stoic stream
#

There are books as well as courses

topaz aurora
#

Any in particular you'd recommend?

stoic stream
#

but see if you can find an equivalent book on AWS

topaz aurora
stoic stream
#

Good luck!

stoic stream
#

You're welcome!

west hill
#

anyone has a tutorial to get oracle db up and running for remote connections in azure vms i tried setting it up though azure cli but towards the last steps it asks for a password for the db but i have no idea what its set to based on the tutorial Microsoft has on this

#

sudo su - It asks for a password when i am logged in as oracle user and try to invoke the above command

paper flower
#

What's the problem with using enums? πŸ€”

strong aurora
#

does anyone know of an opensource relational database that I can use to practice with?

strong aurora
#

not the system software, a database example that I can use

keen minnow
strong aurora
#

yes

fading patrol
wheat goblet
#

guys i need to choose between two database to store a thousands of records in a single table like a table containing thousands of names in a single column. I need help for what to use Mysql or Mongo for such purposes? so that its faster to store and fetch at the same time

wise goblet
# wheat goblet guys i need to choose between two database to store a thousands of records in a ...

Mongo is for lazy devs, making app that will not be (ideally) ever improved later or having next versions. There is less than 1% specific usage case when it could be needing for some stuff beyond relational dbs though. Mongo can flare managably somewhat if u utilize Pydantic very strictly, but eventually u will still face consequences if it is constantly developed app.
Mysql is needed in 97% cases to build a backend that will be of adequate quality and improved later. Fitting for millions/terrabytes of data

wheat goblet
wise goblet
#

sqlite3 can operate fully locally

wheat goblet
#

Thanks for that info as well.

still geyser
#

Not sure if this is the right place, but I'm having some issues with my code I've wrote for interpretting and plotting datasets for my uni physics work. Is anyone able to maybe help me out? I imagine its's fairly basic stuff but I've only been coding a few months.

storm mauve
paper flower
wise goblet
#

it is a real string in almost everything, and writeable and comparable and etc as a string, without need for .name/.value stuff

#

it may fail only in case of exact isinstance(str) smth comparisons, there is some warning written about it in docs

coral wasp
tough bough
#

Can anyone with knowledge in SQLAlchemy 2.0 help me with some queries? I'm really really really lost here. The docs only makes me more confused

manic ibex
#

I have a SQLite table and I want to fzf/fuzzy search any number of columns. There could be multiple readers, but it’s unlikely. Should I ad-hoc return all columns or implement some caching, for example with a local file and flock

#

All readers are local, I can just pass a file handle with the cached file

native zinc
#

Hey guys! I'm looking for good workflows for building databases in Python. I've found working with SQLAlchemy to be typing the same stuff over and over again and kind of a nightmare to change things, and Django is a dream, but understanding concepts like unit of work and some other software architecture concepts that should supposably "Make things easier" hasn't been so fun.

I'm looking for the best place to learn the following as quickly as possible:

  • Database migrations
  • Organizing and deciding what makes a "unit of work" for SQLAlchemy

I find database migrations to take forever when it seems like it should be something that should be easy (Add a column in an excel sheet, no? πŸ€·β€β™‚οΈ)

Trying to find a good general workflow for Python projects, but like, Domain model, sqlalchemy orm, pydantic validators, 3 things to create the same concept in fast api? Seems a little bit like overkill imo πŸ˜…

Wondering how you guys are experiencing this and how to get to a point where it's not confusing as hell.

I only know databses through the Python ORM but need to know how to keep things simple.

Does anyone really know these things like the back of their hand?

fading patrol
#

I've only done one project each with SQL Alchemy and Django but am finding Django substantially easier when it comes to migrations just working as expected

As for the "typing the same stuff over and over", that just sounds like an IDE issue. In the age of Copilot (better) or CodeWhisperer (completely free) I no longer find that much of a problem

As for organizing and deciding what makes a unit of work, I don't really see a problem there specific to using an ORM so maybe you want to elaborate

tender oak
#

Hi i am new to this community, I am currently working on a project from work that involves PySpark. I have some implementations issues, can I ask them here or am I in the wrong channel?

stoic stream
wet bay
mild agate
#

hi, i was planning on making an application like "medium" where you can create your own articles. Of course, in a much smaller scale since i believe i would be the only one using it. Which type of database should I use? Because I know SQL but i don't know if it would be appropriate because i don't know how would i query each article and then style it. My first thought was of doing a row for every paragraph but.... i don't think that's a very good idea. Could anyone give me a hand? also i was planning on using django

west hill
#

!pypi wagtail

delicate fieldBOT
west hill
#

a cms seems good for this endeavour

wise wind
#

If you don't commit with sqlalchemy, does it rollback all the changes you made to the database?

signal junco
#

Somewhat off-topic. We have Grafana in our stack and I am trying to find out if there exists a Discord for discussions on Grafana, or perhaps if not, a more general Discord for discussions around data visualization and or "NoSQL"'ish systems.

#

I would then have a question regarding Variables in Dashboards in Grafana, in particular, whether or not it is possible to use a Variable within the definition of another Variable, and if yes, whether or not Grafana is smart enough to resolve the first Variable before resolving the second Variable.

#

There is a notion of chained variables in Grafana, but that is not quite it, it seems that chained variables are more a thing revolving around namespace'ish kinda property-path'ish logic.

signal junco
# wise wind If you don't commit with sqlalchemy, does it rollback all the changes you made t...

I found https://docs.sqlalchemy.org/en/20/orm/session_basics.html#what-does-the-session-do and https://docs.sqlalchemy.org/en/20/core/connections.html to be helpful on this question. If you visit these two pages, make sure that you do read the very first paragraphs on either page as well, they already contain key information that points in the right directions, I mean just don't TL;DR-skip over the first paragraphs 😊

rare crown
#
class BaseCount(models.Model):
    property = models.CharField(max_length=32)
    subgroup = models.CharField(max_length=16, null=True)
    end_time = models.DateTimeField()
    value = models.BigIntegerField()

    class Meta:
        abstract = True

class UserCount(BaseCount):
    user = models.ForeignKey(UserProfile, on_delete=models.CASCADE)
    realm = models.ForeignKey(Realm, on_delete=models.CASCADE)

    class Meta:
        # Handles invalid duplicate UserCount data
        constraints = [
            UniqueConstraint(
                fields=["user", "property", "subgroup", "end_time"],
                condition=Q(subgroup__isnull=False),
                name="unique_user_count",
            ),
            UniqueConstraint(
                fields=["user", "property", "end_time"],
                condition=Q(subgroup__isnull=True),
                name="unique_user_count_null_subgroup",
            ),
        ]
        # This index dramatically improves the performance of
        # aggregating from users to realms
        indexes = [
            models.Index(
                fields=["property", "realm", "end_time"],
                name="analytics_usercount_property_realm_id_end_time_591dbec1_idx",
            )
        ]
INSERT INTO analytics_usercount(property, subgroup, end_time, value, realm_id, user_id)
VALUES ('messages_read::hour', NULL, '2024-02-24T20:00:00+00:00'::timestamptz, 4, 2, 11)
ON CONFLICT (property, subgroup, end_time, realm_id, user_id)
WHERE subgroup is NULL
DO UPDATE SET
value = analytics_usercount.value + excluded.value

psycopg2.errors.InvalidColumnReference: there is no unique or exclusion constraint matching the ON CONFLICT specification

wise goblet
signal junco
#

Oh I need to ask ChatGPT about my above question, too. Thanks for the reminder.

stiff gale
#

hey yall im using sqlite to save some values and stuff and for some reason this code gives me this error and i do not understand why...
code:

self.db['cursor', database].execute(f"SELECT {values} FROM {table}{filters}{order_by}")

Error:

Traceback (most recent call last):
  File "D:\Python Project\Origins Updater\main.py", line 430, in <module>
    open_second_gui()
  File "D:\Python Project\Origins Updater\main.py", line 282, in open_second_gui
    app = App()
  File "D:\Python Project\Origins Updater\main.py", line 229, in __init__
    for item in self.forks[self.sqdb.dict_getone('settings', 'fork')]:
  File "D:\Python Project\Origins Updater\libs\database.py", line 163, in dict_getone
    data = self.fetchone(table, filters=f'itemkey={key}')
  File "D:\Python Project\Origins Updater\libs\database.py", line 135, in fetchone
    self.db['cursor', database].execute(f"SELECT {values} FROM {table}{filters}")
sqlite3.OperationalError: no such column: fork

Process finished with exit code 1

the query that gets generated if i take f"SELECT {values} FROM {table}{filters}{order_by}" into a print statement:

SELECT * FROM settings WHERE itemkey=fork
#

chatgpt cant figure it out either

#

okay i just figured it out. if i put fork in "" it works

coral wasp
#

!sql

delicate fieldBOT
#
SQL & f-strings

Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.

Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.

For example, the sqlite3 package supports using ? as a placeholder:

query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)

Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.

See Also

  • Python sqlite3 docs - How to use placeholders to bind values in SQL queries
  • PEP-249 - A specification of how database libraries in Python should work
thorny remnant
#

(note: using sqlite for this project)
I really feel like I should be using several .db files to keep things organised
for example one with general data that the application uses and one with user data but realistically I don't see why would I, it's just my organisational spirit telling me to
is there a good reason to use seperate databases sometimes? what are benefits of it? (as in cons I see that I lose out on possible joins or whatnot if it is ever needed)

jade wing
waxen finch
jade wing
paper flower
thorny remnant
#

alright, I'll bite my lips and keep it in one file

tough bough
#

How can I make a select in SQLAlchemy 2.0 ORM in table which have a relationship and the attribute of the relationship is a list?

idle girder
#

So I have this abstract idea I need to implement for my web app written in Python:

1. Upload various time series as CSV/polars DataFrame from user input to database - done
2. Catalogue each of these inputs in a separate table including metadata on the uploads - in progress
3. Allow user to make a join plan between different time series and define new/derivative columns.
4. Define a domain-specific schema for use in my application that dictates how the data will be treated internally (e.g., there are special columns that my application needs to be defined for certain operations to be valid on it, certain columns/series will be treated differently based on tags in this schema)
5. Reify the specified joined + transformed dataset and save it for later use
6. Save the schema so that it can be passed to downstream objects that will operate on the data
7. Map the reified dataset to the defined schema so that they are always used together when the dataset is selected from the DB

I was wondering if anyone has any suggestions for how to implement 2-4 in particular and whether you think I should enforce a schema definition as part of 1 or defer the schema definition until the data is being prepared for use in a job as the numbering suggests

#

if anyone has any experience representing database schemas as UI inputs from a web app thatwould be helpful I guess

jovial yew
idle girder
#

I'm trying to create a web app that show training and results of reinforcement training applied to time series problems

#

part of it would be allowing people to define their own datasets from data they upload to the app

#

I think it makes it a lot more usable

jovial yew
#

have you tried some existing solution?

idle girder
#

Not sure what I can integrate with

jovial yew
#

snowflake, big query

#

which cloud provider are you using?

idle girder
#

I'm developing local for now because GPU costs on cloud too much

#

I have a home server with a threadripper and a few RTX gpus

jovial yew
#

you can host a db locally

idle girder
#

I do have a postgres db

#

that's what the app uses

#

I'm blobbing dataframes and just unblobbing them for joins/column definitions

#

but I need a way to make it easy to join and apply transformations to the data after the user uploads them

#

basically I'm trying to generalize a particular use case I had myself and so I need to go back and parameterize things so it can work on arbitrary data with user provided schemas

jovial yew
#

all the use cases you mentioned can be done with a db and a database manager, don't you think?

idle girder
#

You mean if I actually have users do all the DB operations themselves to define the tables they want to use for a job?

#

Not sure I follow you

#

I mean yes I have the capability to do things I'm trying to figure out how to abstract it so it can be user configurable

jovial yew
#

who are end end users?

#

if there are technical, sql is a good frontend for them

idle girder
#

people who want to upload stock time series and other exogeneous data and try to fit reinforcement models for them

#

so I'm guessing mixed levels of sophistication with DB use

#

I'm seeing the value provided as being a place to ingest data, prepare it for time series analysis, fit a reinforcement model to it to try to find a trading strategy, and give analytics on the results

#

the thing is financial data is kind of BYOB

#

because it's all very commercialized and has proprietary licenses

#

and people might want to try different types of exogeneous data

#

so I need to make it a BYO data system

jovial yew
#

all this sound like what snowflake does, tbh

#

i'm not affiliated with snwoflake or anything, like i have used it the past

idle girder
#

what does snowflake provide besides the cloud DB

jovial yew
#

easy of use basically

#

data and compute is seperated

#

have you heard about datalake architecture?
snowflake is basically that

idle girder
#

that's more about how the DB's state is structured isn't it

#

like how they serve the shards of the DB etc.

jovial yew
#

you can make something similar with apache iceberg, if you want to do it yourself

idle girder
#

but if I replace postgresql with snowflake what does that get me in terms of something like a representation of how to join tables together to create a deferred result table

#

that's kind of what I'm talking about with making the data user configurable

#

they upload a bunch of CSVs let's say, then they have a UI to define how to join them

jovial yew
#

to summarize upload data, analyze, visualize and make models?

idle girder
#

yes

jovial yew
#

there are tools to already do this, why make one yourself, that was the initial question

idle girder
#
  1. Not sure what's out there
  2. Don't have a lot of money to spend on development
jovial yew
#
  1. low-code or code?
  2. making one cost $ too
idle girder
#

I'm doing it all myself so it only costs me time

#

and I already have the hardware

jovial yew
#

just you use it then?

idle girder
#

for now yes

#

I'm planning on making something usable more generally

#

like I said I implemented something for my particular use case

#

with specific logic

#

now I want to parameterize it to make it more generally useful

#

and part of that is being able to replace my specific dataset with configurable datasets

#

also it will save me time in the future too tbh

#

preparing a dataset can be a timesink

jovial yew
#

so to upload files to some place you need a file server, something like minio
and make schema, tables on them, maybe apache iceberg

#

connect to it via some database manager, jupyter or make a custom ui

idle girder
#

what does minio do that replaces something like this

@main_app.post("/upload_csv")
async def upload_csv(request: Request):
    """
    Uploads a CSV file, saves it to a specified output path,
    and inserts its contents into a database as a raw table.

    Args:
        request (Request): The request object containing the file path,
        output path, parse_dates flag, and index_col.

    Returns:
        dict: A dictionary with the status of the upload process.
    """
    if not request.files:
        return json({"status": "error"})
    input_file = request.files.get("file")
    if not input_file:
        return json({"status": "error"})
    if input_file.type != "text/csv":
        return json({"status": "error"})
    output_path = request.args.get("output")
    parse_dates = bool(int(request.args.get("parse_dates", 0)))
    file_path = os.path.join(app_path, "data", output_path)
    logger.info("Uploading %s to %s", input_file.name, file_path)
    with open(file_path, "w", encoding="utf-8") as f:
        f.write(input_file.body.decode("utf-8"))
    logger.info("Uploaded %s to %s", input_file.name, file_path)
    df = pl.read_csv(file_path, try_parse_dates=parse_dates)
    db.RawData.insert(df, output_path)
    logger.info("Inserted %s into database as raw table", output_path)
    return json({"status": "success"})
idle girder
#

ya

jovial yew
#

how would you deal with files with bad schema?

idle girder
#

I need to get rid of the output path as an arg too that's just a testing thing mostly

#

but good question

jovial yew
#

the common pattern for this would be to save the raw data as it is and then separately treat them afterwards

idle girder
#

I see

jovial yew
#

this is where datalake comes in

idle girder
#

thanks

jovial yew
#

ignore the products and branding

idle girder
#

even if it doesn't turn out to be useful for anyone but me I still think this is a valuable learning exercise for practical skills

#

I think knowing how to do this stuff yourself is a valuable skillset

#

seeing as everyone wants to sell it

#

I'm looking into MinIO now though

stiff gale
idle girder
#

I see so I will replace Postgresql with MinIO and use Kubernetes

jade wing
jovial yew
jovial yew
keen minnow
tribal oak
#
        CHROME_PATH_LOCAL_STATE = os.path.normpath(
            r"%s\AppData\Local\Microsoft\Edge\User Data\Local State" % (os.environ['USERPROFILE']))
        CHROME_PATH = os.path.normpath(r"%s\AppData\Local\Microsoft\Edge\User Data" % (os.environ['USERPROFILE']))
``` why is it this somehow causes a file to get written to with the information as it should, but when the file is sent to a webhook it sends an empty file with the filename
i have the EXACT same copy paste code for google chrome and opera gx, same send to discord function but they both send the full files
#
        CHROME_PATH_LOCAL_STATE = os.path.normpath(
            r"%s\AppData\Roaming\Opera Software\Opera GX Stable\Local State" % (os.environ['USERPROFILE']))
        CHROME_PATH = os.path.normpath(r"%s\AppData\Roaming\Opera Software" % (os.environ['USERPROFILE']))

        CHROME_PATH_LOCAL_STATE = os.path.normpath(
            r"%s\AppData\Local\Google\Chrome\User Data\Local State" % (os.environ['USERPROFILE']))
        CHROME_PATH = os.path.normpath(r"%s\AppData\Local\Google\Chrome\User Data" % (os.environ['USERPROFILE']))
``` this is the google chrome and opera gx versions and they both work fine
coral wasp
#

(might make some of that path stuff easier)

#

Also, this seems nothing to do with this channel. #python-discussion would be a better place to start.

tacit acorn
#

Any one here good with sqlmodel? I'm having a lot of problems getting multiple Relationships in the same table working

jovial yew
tacit acorn
#
class User(SQLModel, table=True):
    __tablename__ = "botsettings_user"

    id: Optional[str] = Field(default=None, primary_key=True)
    servers: List["Guild"] = Relationship(back_populates="owner", sa_relationship_kwargs={'cascade': 'all, delete'})
    premium_servers: List["Guild"] = Relationship(back_populates="premium_owner", sa_relationship_kwargs={"cascade": ""})

class Guild(SQLModel, table=True):
    __tablename__ = "botsettings_guild"

    id: Optional[str] = Field(default=None, primary_key=True)
    owner_id: str = Field(foreign_key="botsettings_user.id")
    owner: User = Relationship(back_populates="servers", sa_relationship_kwargs=dict(foreign_keys=["owner_id"]))
    premium_owner_id: Optional[str] = Field(foreign_key="botsettings_user.id")
    premium_owner: Optional[User] = Relationship(back_populates="premium_servers", sa_relationship_kwargs=dict(foreign_keys=["premium_owner_id"]))

Im getting sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child tables on relationship User.servers - there are multiple foreign key paths linking the tables.

jovial yew
tacit acorn
jovial yew
#

also the list of guilds in the user, would violate the primary key condition right?

#

i guess a mapping table to store user and guild info would makes sense

tacit acorn
#

trying to convert from a Django model

jovial yew
#

something like User -> UserGuild <- Guild

#

user guild would save foreign keys to user and guild

tacit acorn
#

db is already made using django, just now trying to make a sqlmodel maping

jovial yew
#

the django model looked like that? pithink

tacit acorn
#

Looks like:

class User(models.Model):
    id = models.CharField(max_length=20, primary_key=True)

class Guild(models.Model):
    id = models.CharField(max_length=20, primary_key=True)
    owner = models.ForeignKey("User", on_delete=models.CASCADE)
    premium_owner = models.ForeignKey("User", related_name="premium_owner", null=True, blank=True, on_delete=models.SET_NULL)
jovial yew
tacit acorn
#

Not sure what you mean

jovial yew
#

maintain this same schema in sqlmodel

jovial yew
tacit acorn
#

what im asking help to do

fluid cipher
#

Hello!
I need someone who can build an Asterisk server.
Please DM me.

tacit acorn
jovial yew
# tacit acorn what im asking help to do
class User(SQLModel, table=True):
    __tablename__ = "botsettings_user"

    id: Optional[str] = Field(default=None, primary_key=True)


class Guild(SQLModel, table=True):
    __tablename__ = "botsettings_guild"

    id: Optional[str] = Field(default=None, primary_key=True)
    owner: User = Relationship(
        back_populates="servers", sa_relationship_kwargs=dict(foreign_keys=["owner_id"])
    )
    premium_owner: Optional[User] = Relationship(
        back_populates="premium_servers",
        sa_relationship_kwargs=dict(foreign_keys=["premium_owner_id"]),
    )

something like this would be the equivalent right?

jovial yew
jovial yew
#

ah i see, in the django model it'll become null on user delete

idle girder
#

I have an interesting issue. Async reads were working on my native install of my application and postgres but when I try to run my app in a container I get a message like this:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/src/sanic_vec_env.py", line 61, in _worker
    env = _patch_env(env_fn_wrapper.var())
  File "/workspace/src/trader.py", line 135, in __init__
    self.data = db.StdCVData.read(table_name, chunk).sort("date")
  File "/workspace/src/db.py", line 588, in read
    blob = cls.get(table_name, chunk)
  File "/workspace/src/db.py", line 610, in get
    fetched = cls.conn.execute(query).fetchone()
  ...
  File "/opt/conda/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1960, in _exec_single_context
    self.dialect.do_execute(
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) lost synchronization with server: got message type "5", length 808464432

Do I need to use asyncio and the async sqlalchemy API here so I don't lose synchronization or could it be something else?

hollow oar
idle girder
#

It's hard to share the full context without a huge dump but yes that's probably happening

#

let me post some bits that I think are relevant

#
class CVData:
    conn = sa.create_engine(CONN).connect()
    metadata = sa.MetaData()
    metadata.reflect(bind=conn)
    try:
        table = sa.Table("cv_data", metadata, autoload_with=conn)
    except sa.exc.NoSuchTableError:
        table = sa.Table(
            "cv_data",
            metadata,
            sa.Column("id", sa.Integer, primary_key=True, autoincrement=True),
            sa.Column("table_name", sa.String),
            sa.Column("chunk", sa.Integer),
            sa.Column("data", sa.LargeBinary),
        )

    @classmethod
    def get(cls, table_name: str, i: int) -> bytes:
        """
        Retrieve CV data by table name and chunk index.

        Args:
            table_name (str): The name of the table.
            i (int): The chunk index.

        Returns:
            bytes: The CV data.

        Raises:
            AssertionError: If data is not found for the given table name and
            chunk index.
        """
        table = cls.table
        query = (
            sa.select(table.c.data)
            .where(table.c.table_name == table_name)
            .where(table.c.chunk == i)
        )
        fetched = cls.conn.execute(query).fetchone()
        assert fetched is not None, f"Data not found for {table_name} chunk {i}"
        res = fetched[0]
        return res

    @classmethod
    def read(cls, table_name: str, i: int) -> pl.LazyFrame:
        """
        Read a chunked table from the database.

        Args:
            table_name (str): The name of the table.
            i (int): The index of the chunk.

        Returns:
            pd.DataFrame: The chunked table as a pandas DataFrame.
        """
        blob = cls.get(table_name, i)
        df = cloudpickle.loads(blob)
        return df.lazy()
hollow oar
#

just lower your multiprocessing worker count to 1 and retry, if it doesn't blow up then it's like 90% confirmed

idle girder
#

ah yeah I already didt hat and it worked

#

just trying to figure out how to fix it

#

so this is what's happening:

  def __init__(
        self,
        env_fns: List[Callable[[], gym.Env]],
        sanic_app: Sanic,
        jobname: str,
        start_method: Optional[str] = None,
    ):
        self.waiting = False
        self.closed = False
        n_envs = len(env_fns)

        if start_method is None:
            start_method = "spawn"
        ctx = sanic_app.shared_ctx.mp_ctx
        pipes = [ctx.Pipe() for _ in range(n_envs)]
        self.remotes, self.work_remotes = zip(*pipes)
        for work_remote, remote, env_fn in zip(
            self.work_remotes, self.remotes, env_fns
        ):
            hex_name = md5(
                str(uniform(0, 1)).encode(),  # nosec
                usedforsecurity=False,
            ).hexdigest()
            worker_name = f"RLVecEnv{hex_name[:10]}"
            kwargs = {
                "remote": work_remote,
                "parent_remote": remote,
                "env_fn_wrapper": CloudpickleWrapper(env_fn),
            }
            sanic_app.m.manage(
                ident=worker_name,
                func=_worker,
                kwargs=kwargs,
                workers=1,
                restartable=True,
            )

env_fn is actually a class definition and the callable returns the instance. the init connects to the db to read some data

#

this is happening in async across n cpus

#

so it spawns like 64 processes asynchronously let's say

#

they all start reading from the DB

#

when the env_fn is actually called

#

Sanic is an async http server

#

all this stuff worked NATIVE though

#

it's only when I moved the app (but not postgres) to a container and connected to the host postgres service that I started getting this error

hollow oar
#

and in native is the worker count > 1?

idle girder
#

yea

hollow oar
#

that's odd..

idle girder
#

agree, I thought it was a postgres security for external connections at first

#

then I read the traceback

hollow oar
#

in any case, you can always defer the creation of connection until your worker actually uses it.
right now you are creating the connection upon class definition.

there are probably other ways of fixing this (e.g. make all process go through the import/class definition process, iirc that's possible), but i think the above is cleaner.

idle girder
#

I wonder if it "didn't lose synchronization" because the hardware was all running on the same kernel

#

but when I switched to the container the app has its own kernel now right

#

can I fix this using async though?

#

I know you can do async engines with sqlalchemy

hollow oar
#

i wouldn't even try, since you are using multiprocessing already 🀷

i am curious, can you do a little multiprocessing.get_start_method() and report what it shows you?

#

my guess is native shows spawn, dockerised shows fork.

#

i wouldn't even try, since you are using multiprocessing already
i say this because by using multiprocessing, multiple python interpeters would be spun up, and then subsequently potentially using the same python object for db connection, i don't think this is particularly safe, nor asyncio will help you in anyway here, because the interpreters would just fight for control of that python object, with 0 coordination or intent of collaboration. (well this is just my hand wavy way of explaining this without me actually knowing EXACTLY what's happening under the hood)

thorny marsh
#

I need help

idle girder
#
dfs = asyncio.gather(*[cls.read(table_name, i) for i in range(chunk)])
std_df = pl.concat([df.collect() for df in await dfs])

I didn't know that you could use await like this

#

nice

jovial yew
#

doesn't polars automatically do a connection pool and get it in parallel?

#

just specualting, now sure

#

again reading this code, the collect is in the second line, so the actual work happens only after that, the async section is not likely doing anything, other than batching or creating connections, so idk 😐

#

you should just make something single thread, measure performance and then try to optimize it

idle girder
#

I was trying to fix an issue with concurrency in the container

#

that only happens when using the container

#

(the db is still on the host)

vernal shore
#

hi

#

i need some help

idle girder
#

so i wanted to see if using the async API would help with losing synchronization with the db on the host

jade wing
idle girder
#

how is that possible when you are running different versions of linux though

vernal shore
#

how i can input the name of table and the other columns by ask user do it

#

con = sqlite3.connect('data.db')

    nameT = self.ui.table.text()
    col_1 = self.ui.culonm_1.text() #+ self.typeC_1.text()
    col_2 = self.ui.culonm_2.text() #+  self.typeC_1.text()
    col_3 = self.ui.culonm_3.text() #+ self.typeC_1.text()
    col_4 = self.ui.culonm_4.text() #+ self.typeC_1.text()

    
    st = f""" CREATE TABLE '{nameT}' (
              id INTEGER PRIMARY KEY,'{col_1}', '{col_2}', '{col_3}', '{col_4}');"""


    crs = con.cursor()
    #crs.execute(cle)
    crs.execute(st)
#

i can do this ?

storm mauve
#

!sql-fstring

delicate fieldBOT
#
SQL & f-strings

Don't use f-strings (f"") or other forms of "string interpolation" (%, +, .format) to inject data into a SQL query. It is an endless source of bugs and syntax errors. Additionally, in user-facing applications, it presents a major security risk via SQL injection.

Your database library should support "query parameters". A query parameter is a placeholder that you put in the SQL query. When the query is executed, you provide data to the database library, and the library inserts the data into the query for you, safely.

For example, the sqlite3 package supports using ? as a placeholder:

query = "SELECT * FROM stocks WHERE symbol = ?;"
params = ("RHAT",)
db.execute(query, params)

Note: Different database libraries support different placeholder styles, e.g. %s and $1. Consult your library's documentation for details.

See Also

  • Python sqlite3 docs - How to use placeholders to bind values in SQL queries
  • PEP-249 - A specification of how database libraries in Python should work
storm mauve
#

iirc it doesn't allows for you to specify table/column names though, so you may as well just verify that it is within a set of allowed values then use normal string formatting techniques

#

that said, this is mostly for when you are running untrusted user input against a database they do not have direct access to.

If you're using a local SQLite file as the database, and the user is inputting directly on the same device + has full access to the database file anyway, it is not as important, but you must be careful not to create a bad habit of it

jade wing
# vernal shore con = sqlite3.connect('data.db') nameT = self.ui.table.text() c...

this is something that is very dangerous to do and you need to sanitize input data rigorously and know exactly what you are doing if you even consider going this route
one way is to match the input as an exact match against a key in a dictionary or an element in a tuple or a list of allowed values for that exact item/part of the SQL statement and use that as the string in to the SQL statement

jovial yew
#

SQL injection πŸ’€

jovial yew
idle girder
#

yes

jovial yew
#

With polars right?

#

Do some benchmarks with just one thread

idle girder
#

well the reason it's doing it concurrently is that all those threads are doing work on the data concurrently

#

I was originally passing the data to the spawned threads via pipes

#

that was super slow

#

so now they read concurrently from the DB

#

it worked without container but when put in a container vs host OS DB it started to have issues

#

so I went in and made alot of changes

#

I probably should have changed less and tested it without async first

#

but it's working now so will probably just keep it

#

maybe improve the async over time

jovial yew
#

Could we take a step back and understand what you're trying to do?

idle girder
#

I'm using vectorized environments to train a reinforcement learning model in parallel on an asynchronous webserver that will provide a UI for data selection and processing as well as analytics on the model fitting and the out of sample results

#

the environments need to get data from the db to do their job

#

they run in parallel

#

the reason they are running in parallel is so that I can do a parallel search of the parameter space of the model

#

the action space to be more precise

jovial yew
#

Each activity is a sequence of steps which are independent from each other? They all run in parallel and do their own stuff?

#

Why does this sound like a data pipeline to me?

idle girder
#

they are running independent searches of the action space for each batch and then their results get aggregated when calculating the policy gradient

#

there's a vectorized learning environment API from stable baselines 3 that I'm using

jovial yew
#

Ya like it can do it's own work

idle girder
#

yea

jovial yew
#

Whatever that could, get data, pre process, train model, evaluate, predict? Something in these lines?

idle girder
#

yes

jovial yew
vernal shore
jovial yew
idle girder
#

yeah but it's managed through a web server UI

#

so it's interactive

#

you can define different models interactively

#

that's the point

jovial yew
jade wing
jovial yew
idle girder
#

is your point that I should be doing something else?

jovial yew
#

You don't have to do this by hand, plenty of tools available

idle girder
#

will do although I will say I'm having fun

#

waiting to start new job and have some time

jovial yew
#

This reminds of myself making an etl tool when IT said you can't use external packages when I started out

jovial yew
idle girder
#

so what do those pipeline orchestration tools provide per se

#

what in my code would they replace

#

I haven't implemented the full general case yet btw that's what I'm working on now

jovial yew
#

Most have nice ui, track previous runs stats etc

#

Just try something small like prefect, it's just a pip install away

jade wing
# vernal shore new

now you are really treading into extra dangerous ground as you can no longer match the name against known good/safe value 😬

jovial yew
idle girder
#

yeah I'm looking at it

jade wing
# vernal shore ty

if you decide to continue on this path regardless of our repeated warnings you need to take several steps to try to safeguard your security

#

@vernal shore what database are you planing to use and is it for administrating a database on the users own system or on a remote system?

vernal shore
#

it just simple practice

#

sqlite

jade wing
#

on the users own system (local application) or on a remote system relative to the system the user is sitting in front of or owns?

vernal shore
#

local

#

wait i will tell u

#

i create a simple app with pyqt5 and sqlite3

#

to create a table and ask the user to input the name of table and columns

#

wait i will show u

jade wing
#

oh, then it's much less of an issue as it's local on the users own system where they have full access to the database anyways and if they would successfully exploit the application they would only be exploiting themself and do what they could have done directly in the database if they have the knowledge

#

just don't make it a habit so that you forget about the dangers of SQL injection when developing other types of applications and do something even remotely similar there

#

sa you are working with an embedded database that runs within the application itself you don't have to worry that the user might use the database to elevate privileges on the local system either

vernal shore
#

yes

#

ofcurse

jade wing
#

so i think you are rather safe in this instance πŸ‘

vernal shore
#

now u have solutin for me ?

jade wing
# vernal shore now u have solutin for me ?

as security is much less of an issue in this specific scenario you can do what you where planing on
or you can use a "SQL Query Builder" like SQLBuilder or PyPika (there are others as well) that hopefully has defenses against SQL injection and takes care of protecting you from such things already

vernal shore
jade wing
hollow oar
hidden creek
#

which is better
make a lot of small sql queries to get different data

or make one big query then filter the data in python
?

#

by big i mean the amount if data it would return

fading patrol
idle girder
#

I got the container working in docker but when I moved the app and db to kubernetes now I'm getting connection forcibly closed by peer

#

so the config now is db and app are both running on minikube cluster hosted on Fedora server

#

everything works, but long requests get forcibly closed

hollow oar
# idle girder the container and the db are running on a fedora server on my LAN, I am sending ...

gotcha.
if you were curious why it worked on your windows machine but not linux after it's been dockerised, it's because the default start method of multiprocessing on windows/mac is spawn, meaning your db connection is never shared between workers. whereas on linux, the default is fork which means whatever you have defined prior to starting the workers could be shared.

and sharing a db connection that's not designed to be thread/process-safe would potentially yield such weird errors without a clear explaination of what actually went wrong - that's what you were seeing when running in a dockerised environment

idle girder
#

thanks

#

I was running it in Linux without the container though

#

I am just using windows as the web client

#

the web host went from being the baremetal linux host to the container running on linux while the db host remained on baremetal linux

jade wing
hollow oar
# idle girder I was running it in Linux without the container though

right, i assumed wrongly. i assumed originally when you meant "native" or "host", it meant windows, and anything else is just dockerised/linux.

anyway, next time you see an issue that might be related to how multirpcoessing spawn process , you can chuck a multiprocessing.get_start_method() in and see what it gets.
i believe you will see fork instead of a spawn.

though again if it were me, i would just defer the creation of db connection to after the worker processes has started.

(i think you might have a different issue if you are seeing long requests getting forcibly closed, so you can very well just ignore me)

idle girder
#

that's what I ended up doing

#

I think they were two separate issues

#

I face different sets of issues when going from baremetal linux -> container -> kubernetes

#

think the last one is some kubernetes config issue

#

but not sure

hollow oar
#

do you have 2+ machines?

idle girder
#

I have 2 machines yes

#

the windows one I'm typing on now and a headless homeserver running fedora

hollow oar
#

hmm all of them are used for compute?

idle girder
#

no just the fedora one is being used for compute

hollow oar
#

πŸ‘

idle girder
#

whether it's in the form of baremetal, docker container, or kubernetes cluster

#

the thing that changes when I go from kubernetes are #1 I'm using the db in a pod instead of on the baremetal host #2 using kubernetes pod network instead of host network

#

but I think the latest error is more likely to do with the latter

#

maybe also an nginx thing since I'm reverse proxying the cluster to the host network

#

so I can access it from windows

#

I kinda want to post everything for a code review but it's not "complete yet" and I doubt people would want to spend the time understanding my application architecture or why I did things the way I did them

hollow oar
#

how long are the long requests btw?

#

long requests in what sense as well?
DB-related or your HTTP API?

idle girder
#

so what I'm doing right now and should probably change to a lazy representation is having the training of a model be a request and it returns when it's done training

#

so it's like a 30 minute request

#

I should probably just make it so it creates a process that is managing the training in the background

#

and returns when it's done with that

#

it works on baremetal and container though

#

I'm sure it's some default setting I'm running afoul of in kubernetes but that means that I'm also doing things well out of the ordinary

jade wing
# idle girder so it's like a 30 minute request

you would need to keep the connection alive with some periodic traffic if you are running that long of an operation or you risk getting the connection killed after for example five minutes or some other arbitrary timer

idle girder
#

I'm thinking the more elegant pattern is to have the request just start a process and having an API for checking in on the process

#

rather than waiting for it to finish to return

jade wing
#

many times you would want to do this asynchronously and just let the client poll at intervals to see if the job with a certain job id has completed or maybe implement webhooks or something to send a notification when it's done

idle girder
#

right I think that's what it should be

#

I'm learning web dev though so starting from very simple implementations and a place of ignorance

#

like I know it's not good yet but I'm not sure what good looks like

#

have some ideas

#

but need to figure out how to implement them

jade wing
#

many people use celery for such jobs on the server, even if i don't care much for celery myself

idle girder
#

why don't you like it?

jade wing
#

it's probably just because the implementations i've seen it used in that i really didn't like

idle girder
#
@main_app.get("/start")
async def start_handler(request: Request):
    """
    Handles the start request for training.

    Args:
        request (Request): The request object containing the run parameters.

    Returns:
        A redirect to the summary page.
    """
    args = request.args
    logger.info("Starting training")
    table_name = args.get("table_name")
    jobname = args.get("jobname", "default")
    start_i = int(args.get("i", 0))
    cv_periods = await db.CVData.get_cv_no_chunks(table_name)
    await db.Jobs.add(jobname)
    logger.info(
        "Starting training for %s with cv_periods=%s",
        table_name,
        cv_periods,
    )
    for i in range(start_i, cv_periods - 1):
        logger.info("Starting training on fold %i of %i", i, cv_periods)
        await train_cv_period(args, i, request)
        await validate_cv_period(args, i)
    redirect_summary = request.app.url_for(
        "get_job_summary", jobname=jobname, table_name=table_name
    )
    return redirect(redirect_summary)

so this is the bad boy request that takes a very long time to return

#

I want to move this part out to a managed process or coroutine:

    for i in range(start_i, cv_periods - 1):
        logger.info("Starting training on fold %i of %i", i, cv_periods)
        await train_cv_period(args, i, request)
        await validate_cv_period(args, i)
    redirect_summary = request.app.url_for(
        "get_job_summary", jobname=jobname, table_name=table_name
    )
    return redirect(redirect_summary)

and just have the original return some handle for it that gets stored

#

an async task or a subprocess etc.

#

I prefer async task

#

jobname can handle the labeling

#

I'm not sure if I need anything besides the event loop I already have

jade wing
#

i would have it in an external process rather then a coroutine so that you can restart the web server when ever you want without killing the long running job

idle girder
#

then the server cant manage its resources though

#

or can it

#

I guess what I'd have to do is keep the record of what's running in persistent storage

jade wing
#

depends on what you mean with "the server" and "manage its resources"

idle girder
#

just check if the GPU is already being used for a job that was started previously

#

I guess that's where the DB comes in

jade wing
idle girder
#

I want to have a more sophisticated GPU VRAM management system eventually too

#

so it knows if a job can fit before it tries to put it in the GPU

#

but that's pie in the sky for now

jade wing
#

you could have a job queue

idle girder
#

I don't know how to get any of the existing resource management stuff to work with GPUs

#

like I think the GPU will just eagerly try to put everything into VRAM

#
@main_app.get("/start")
async def start_handler(request: Request):
    args = request.args
    logger.info("Starting training")
    table_name = args.get("table_name")
    jobname = args.get("jobname", "default")
    cv_periods = await db.CVData.get_cv_no_chunks(table_name)
    await db.Jobs.add(jobname)
    logger.info(
        "Starting training for %s with cv_periods=%s",
        table_name,
        cv_periods,
    )
    coro = manage_training(args, request, cv_periods)
    asyncio.get_event_loop().create_task(coro)
    return json({"status": "success"})

async def manage_training(args, request, cv_periods):
    table_name = args.get("table_name")
    jobname = args.get("jobname", "default")
    start_i = int(args.get("i", 0))
    for i in range(start_i, cv_periods - 1):
        logger.info("Starting training on fold %i of %i", i, cv_periods)
        await train_cv_period(args, i, request)
        await validate_cv_period(args, i)
    redirect_summary = request.app.url_for(
        "get_job_summary", jobname=jobname, table_name=table_name
    )
    return redirect(redirect_summary)

so I guess the question is now what happens with that redirect

#

anyway think this is decently offtopic for dbs now, thanks everyone for the help

torn sphinx
#
for x in characters:
    name = x[0].replace(" ", "_")


    def column_exists():
        c.execute("SELECT * FROM pragma_table_info('users') WHERE name = ?", (name,))
        return c.fetchone() is not None

    if not column_exists():
        c.execute(f"ALTER TABLE users ADD COLUMN ? INTEGER PRIMARY KEY AUTOINCREMENT", (name,))

gives an error:

sqlite3.OperationalError: near "?": syntax error
storm mauve
#

SQLite does not allows for you to use placeholders for table and column names, these should be fixed, static, ideally hardcoded

torn sphinx
#

is there any way i could add a column to an existing table then?

#

i mean i can but

#

i also am forced to check if the column already exists because il be constantly adding more

storm mauve
#

the "professional" solution for that would typically be database migration tools

torn sphinx
#

oh, so sqlite just doesnt work with this?

storm mauve
#

just throw it in a separate file and run it manually

#

You should not have to update your tables often?

torn sphinx
#

alright then

#

just wanted a bit of automation but thats fine

storm mauve
#

you'll have to edit the code whenever you update the tables anyway

torn sphinx
#

fair

vernal shore
#

@jade wing i get the solution of my problam

#

in PyQt5 we have qtsql

#

there is many statement there for sqlite

#

ty for help me

shut tiger
torn sphinx
#

im aware ab SQL injections, but this is a very small project for me and my friends only

shut tiger
pine saddle
#

I want to know if this is conceptually possible. Say you are using a buying/selling third party website. You list a bunch of items you want to sell from your account. Is there a way for your account on this third party website to interact with a database that you created? Such as when you create a list of items you want to sell, those items get inserted to a table. When you remove an item from the front-end, it also reflects that deletion on the database.

fading patrol
pine saddle
fading patrol
pine saddle
#

That’s why I delete and repopulate in case a user deletes something from the front-end

fading patrol
#

If you're going to constantly delete your whole database then you probably don't need a database

pine saddle
#

In practicality, yes I could manually delete a row that has to be deleted. The secondary issue is if I go on vacation, my co-workers don’t know how to access our database and delete data

#

But yes you are right.

fading patrol
#

To be more explicit... If a record exists in your data but not in the latest pull, then you know it needs to be deleted, right?

pine saddle
#

And when I pull and insert data I get the SQL duplicate primary key error anyway

fading patrol
signal junco
paper flower
tired breach
#

Guys, is it a good idea to reuse ID's for user id i the user record was deleted?

signal junco
tired breach
#

I'm using postresql

#

So UUID then..

grim vault
wheat sequoia
#

A question about constructing a SQL query using python (Oracle dialect if it matters):
Say i have a column named PROVISIONED_AT which can be either a timestamp or null (not provisioned yet). A query needs to filter out the active requests, which means that column is null. How can I construct a sql statement that selects the active/expired rows based on the boolean value of the filter active?

#

Ideally I believe it's best to not use string concatenation to construct the query, as it will hinder DB optimization (correct me if I am wrong)

oak pumice
#

Hopefully this is the right channel to ask in:

Does anyone have any suggestions on a means of using SQLAlchemy in a similar means to MERGE?
I am trying to update a table that contains approximately 50,000 rows of Parts with a table that contains only Updated Parts. If I were to write raw SQL I could execute a merge statement and be done with it, but I'd like to try and do this by means of ORM.

Any suggestions?

coral wasp
coral wasp
wheat sequoia
#

I am not writing 3 separate queries because there are many other where conditions and I hope to chain this one with and

coral wasp
#

So you want β€˜(field = value and field2 is null) or (field=value and field2 is not null)’

wheat sequoia
#

Can a bind variable be mapped to both null and 1/0 like that?

coral wasp
#

I guess that’d work. A bit weird, normally I’d hand that condition in code, in something like this

safe berry
#

Hello chat , I need suggestions for a sqlachemy database for flask . I’m working on a mechanic scheduler for a project and I was wondering what type of model tables I should have . I currently have a user model for users . User can book appointments based on mechanic availability slot . User can input their zip code and when user submits zip code a list of mechanics pop up with their profile .I want to integrate the Google booking api . I was wondering what other model classes I should have . I have already a login and register page functionality with the sqlalchemy database working . Suggestions thanks..

subtle basin
#

is it possible in SQLAlchemy load relationship with selectinload() technique but assign attribute with alias?

    person: Mapped["Person"] = relationship(

    )
    packaging_works = relationship(
        "PackagingWork",
        uselist=True,
        viewonly=True,
    )
    client_packaging_works = relationship(
        "PostOperationExtractClientChild",
        uselist=True,
        viewonly=True,
    )
    client_person = relationship(
        "ClientContactsNotification",
        uselist=False,
        viewonly=True,
    )

and alias client_person as person in statement, because it's a second case and relationship person not loaded

sharp isle
#

Hey , can i have a suggestion for postgresql book?

cyan bay
#

New "data scientist" with a messy question.

What are some potential strategies to reverse-engineer the primary-foreign keys that a massive obscure set of databases logically have, when these pk-fk constraints are not actually present in the database, and I do not have access to the code of most of the applications that use the database?

Got a bunch of big old legacy databases to work with, a lot of which is profoundly abstract data with no data definitions that are meaningful. These databases have no foreign keys (or at least available to me in my test/production views of it.). It is part of my role to query these databases for data. I want to know some tactics to reverse-engineer the relationships that the tables should have. At least some of the tables have what are clearly intended to be primary-foreign key relationships, it's just these constraints are not imposed on the database. I am hoping to build an interactive ER diagram in Python to help start addressing this.

#

My not-experienced self is considering starting like this: compiling a Dataframe in Python with every table and all of their columns. Not all of the PK-FK columns are identically named, but most (80-90%) I've found (so far) are. I would begin by building relationships by checking if column names are identical and having it draw a connection (maybe working with Mermaid ?) For each connection, I would check a random sample of data, and see if there is matching data on the other side. Then, it would check the matching data to see the # of occurrences of that data in each identically named column, to see if I can imply any one-to-one, many-to-many, or one-to-many relationships for it.

#

I would then collect notes from interacting with the applications and interviewing the developers that maintain them to see if I can include contextual data and snip/mend/update connections.

coarse ermine
#

123

coral wasp
#

The idea of trying to build a model with no domain/semantic knowledge sounds terrible. But, with domain knowledge, I might then do something like you suggested; explore the dataset to see what the relationships look like and confirm my expectations.

#

.describe() of each df (or subsets of a df) would be one early step , paying attention to both count and unique

somber ember
#

anyone worked with below scenario:

I have source table, and final output csv file. The final output csv file is transformed. I have user who's requesting to have same data as csv file to be inserted into new table. But csv file is already transformed. Source table has 150k rows while csv file only holds 900 rows. I can track how some of transformation is done, but there are some columns in csv file that gives me headache. For example, in csv file, there's columnA that has value of "Yes." But in source table, the key with same name has a nested json as value, and none of kv pairs inside that nested json has value of "Yes".

Is it possible to figure out how final output was created from source table even with these attributes? How did you solve this situation? I also do not know who worked on transformation logic (no one who's involved in this task knows about it)

storm mauve
cyan bay
somber ember
royal sparrow
#

whats something that is different for every chrome browser user so like an ID that I can get and verify?

fading patrol
jovial yew
unkempt prism
tired breach
#

hey guys,,

I get this error RuntimeError
RuntimeError: The session is unavailable because no secret key was set. Set the secret_key on the application to something unique and secret.

but this code seems to work
`@app.route('/authorize/<provider>')
def oauth2_authorize(provider):
if not current_user.is_anonymous:
return redirect(url_for('index'))

provider_data = current_app.config['OAUTH2_PROVIDERS'].get(provider)
if provider_data is None:
    abort(404)

# generate a random string for the state parameter
session['oauth2_state'] = secrets.token_urlsafe(16)

# create a query string with all the OAuth2 parameters
qs = urlencode({
    'client_id': provider_data['client_id'],
    'redirect_uri': url_for('oauth2_callback', provider=provider,
                            _external=True),
    'response_type': 'code',
    'scope': ' '.join(provider_data['scopes']),
    'state': session['oauth2_state'],
})

# redirect the user to the OAuth2 provider authorization URL
return redirect(provider_data['authorize_url'] + '?' + qs)`

Does anybody know what the hell?

#

RuntimeError: The session is unavailable because no secret key was set. Set the secret_key on the application to something unique and secret.

#

ok... needed to add

#

app.secret_key = 'super secret key'

#

for flask

shut tiger
#

So.. exactly what the error said? 🀣

tidal mica
#

A dumb question

#

I need to store a python class object inside a db

#

Stackoverflow recommends pickling it and storing it as a blob

shut tiger
tidal mica
#

sqlite3.binary(pickle.dumps(Object))

tidal mica
shut tiger
tidal mica
#

Well, JSON could be another option

shut tiger
#

But writing data properly to the db as rows in tables is much better.

tidal mica
#

I fear there will be too many rows

frank cloak
#

I have never had a problem with postgresql

tidal mica
#

Though this could be a motivation to funally learn sqlalchemy

frank cloak
#

at least from it

tidal mica
#

Okay, I'll consider switching to postgresql

wide schooner
#

bol

#

imagine trying to get voice chat

shut tiger
shut tiger
#

Trillion?

autumn void
#

For school I have to create a fullstack webshop, so a database aswell. Do I create one whole database for all the data (customer, products, orders etc). Or do I do them separate?

paper flower
#

It can contain multiple tables, if you create a db per table then you wouldn't be able to use foreign keys or do joins

autumn void
#

Okay thank you for making it clear.

#

Like how do I set up a database from scratch? Do I first collect all the data I am working with and make relevant tables with it. Or are there templates?

#

What is the proces

paper flower
#

You generally just design db from scratch

#

Depending on what you want to store and how

autumn void
#

okay than I will just do that

#

thanks

dusky sequoia
#

is it possible to get help for csv files

shut tiger
dusky sequoia
#

really

#

ty

#

its a stupid question, im stupid so dont be surprised why i need help with that

#

import csv

filnavn = "..."

with open(filnavn, "r") as fil:
reader = csv.reader(fil)
for linje in reader:
print(linje[1])

#

why is it telling me list index out of range

shut tiger
dusky sequoia
#

yeah i know

#

My csv file looks like that: ",""id"",""track_name"",""size_bytes"",""currency"",""price"",""rating_count_tot"",""rating_count_ver"",""user_rating"",""user_rating_ver"",""ver"",""cont_rating"",""prime_genre"",""sup_devices.num"",""ipadSc_urls.num"",""lang.num"",""vpp_lic""";
"1,""281656475"",""PAC-MAN Premium"",100788224,""USD"",3.99,21292,26,4,4.5,""6.3.5"",""4+"",""Games"",38,5,10,1";
"2,""281796108"",""Evernote - stay organized"",158578688,""USD"",0,161065,26,4,3.5,""8.2.2"",""4+"",""Productivity"",37,5,23,1";

shut tiger
dusky sequoia
shut tiger
#

hmm.. that's a new error. Did you remove the print(linje[1]) line without telling me?

#

or any other change?

dusky sequoia
#

i removed it without telling you yeah sowwy

shut tiger
#

btw, you should REALLY write all your code in English. It makes it a lot easier to get help.

dusky sequoia
#

shit

#

import csv

filename = "..."

with open(filename, "r") as file:
reader = csv.reader(file)
for line in reader:
print(repr(line))

dusky sequoia
#

import csv

filename = "..."

with open(filename, "r") as file:
reader = csv.reader(file)
for line in reader:
print(repr(line))
print(line[2])

#

nah wait

#

ok

#

it still gives error

#

import csv

filename = "-"

with open(filename, encoding="utf-8-sig") as file:
reader = csv.reader(file)

for line in reader:
    print(line[0])
#

why does that print the whole data

#

im tryna get the first indexes from the lines

shut tiger
dusky sequoia
#

oh sorry

#

ohshit

#

sorry

shut tiger
shut tiger
# dusky sequoia I got

why are you destroying random parts of your code in the middle of all this? It's impossible to help you.

dusky sequoia
#

its right above

#

wait

#

import csv

filename = "..."

with open(filename, "r") as file:
reader = csv.reader(file)
for line in reader:
print(repr(line))
print(line[2])

shut tiger
dusky sequoia
#

import csv


filename = "..."

with open(filename, "r") as file:
    reader = csv.reader(file)
    for line in reader:
        print(repr(line))
        print(line[2]) 
shut tiger
#

That's what I mean by randomly destroying the code

dusky sequoia
#

right

#

sorry

#

thanks for pointiong it out

shut tiger
#

You are flailing around. You need to take a deep breath. Think about things. Go slow. If you can't, then go sleep and come back to it tomorrow.

dusky sequoia
dusky sequoia
shut tiger
dusky sequoia
#

😦 Ik

shut tiger
# dusky sequoia

Stop with the pixels.

Anyway. The point of the print I told you to add was that you can look at the last line before the crash. Then you don't have to guess in panic.

dusky sequoia
#

yeah right

#

i tried a few alternatives but it keeps saying out of range

#

Might there something off with the file it self?

shut tiger
#

You have been ignoring what I say for 3.5 hours. How much of a hurry are you in? Clearly you are ok with wasting 3.5 hours.

shut tiger
#

ONLY THIS. Then look at what it prints before it crashes.

#

Then copy paste it so we can all see.

dusky sequoia
#

You told me to give me the output and I did

#

I didnt ignore it Im just stupid I dont understand your requests since Im not good english, neither am I at coding

dusky sequoia
#

the print

#

the output

shut tiger
dusky sequoia
#

I cant print the output

shut tiger
dusky sequoia
#

O

#

k

#

[",""id"",""track_name"",""size_bytes"",""currency"",""price"",""rating_count_tot"",""rating_count_ver"",""user_rating"",""user_rating_ver"",""ver"",""cont_rating"",""prime_genre"",""sup_devices.num"",""ipadSc_urls.num"",""lang.num"",""vpp_lic""";]
Traceback (most recent call last):
File "C:/Users/sha01021/OneDrive - Vestfold og Telemark fylkeskommune/IT2/BlaBlaBla/csv/AppleStore.csv", line 10, in <module>
print(line[2])

IndexError: list index out of range

#

there is an error line under print(line[2])

#

And I didn't know you couldn't see it, would've helped if u told me

shut tiger
dusky sequoia
#

its the first line from my csv file

shut tiger
#

Look at it. Carefully.

dusky sequoia
#

A list I know

shut tiger
#

Good. A list of what?

dusky sequoia
#

a list of strings

#

is this a test?

grim vault
dusky sequoia
#

you mean the frist line is made of one column?

#

I get you now

grim vault
#

Yes, somehow your csv file only consists of one column. Looks like a failure at generation time.

dusky sequoia
#

but if i could explain it that way, the csv shows information for each app in app store. So each line represents an app, and has its own information

dusky sequoia
#

oh wait

grim vault
ocean stirrup
#

Alguien que sepa hablar espaΓ±ol para enseΓ±arme Python en directo

dusky sequoia
#

goddamn

#

i get it

dusky sequoia
#

I didnt make the file so I didnt recognize the problem

grim vault
#

That's what a help channel is about: help you find the problem and solve it. ;)

dusky sequoia
#

yeah thanks

#

but cant I still use delimeter ?

#

because I asked, and the teach said it is supposed to be in one column

#

import csv


filename = "the name is there"



with open(filename, encoding="utf-8-sig") as file:
    reader = csv.reader(file, delimiter=",")
    for line in reader:
        print(line[2])

shut tiger
dusky sequoia
#

Nuh uh I didn't

#

you do realize I didn't solve my problem right? Cuz even if they're in one column, there should be a solving. And I actually tried to send another code that might be right, but you said I'm going too fast

#

I do realize the elements are in the same colunm now, great. But shouldn't the commas between the elements help to show indexes

#

At this rate I'm not even gonna ask for help

grim vault
#

It's not really a database problem now, is it?

fading cargo
#

format1 = ff.FortranRecordReader('(F8.4,F9.4,ES12.4, F8.4, 1x, 1x, 1x, I3, I3, ES12.4, F8.3, F8.3, 1x, ES12.4, 1x, ES12.4, ES12.4, 1x, ES12.4, ES12.4 1x, ES12.4, 1x, 1x,I3,I2,I3,I2,I3,I2)')
Im trying to read a fortran formatted output file using this command but I get the following error
InvalidFormat:
Token: type=ED7, value=ES has invalid neighbouring token
at first I had this error but with value=x, I fixed it by putting a 1 in front of every x. But when I try this fix with ES, I still get the same error
anyone know what Im doing wrong?

dusky sequoia
#

he told me its not the right place and i knew it wasnt

#

but he said I should just ask right away and I did

#

so it wasnt a database problem from the beginning

grim vault
#

A CSV file is some kind of database. We did find the error: Your csv file is not really a csv file. Now I would just correct the csv file and than your code would work.

shut tiger
grim vault
#

!e Just write out the read in line into a new file, that new file will be a correct csv file:

import io
import csv

old_file = io.StringIO('''",""id"",""track_name"",""size_bytes"",""currency"",""price"",""rating_count_tot"",""rating_count_ver"",""user_rating"",""user_rating_ver"",""ver"",""cont_rating"",""prime_genre"",""sup_devices.num"",""ipadSc_urls.num"",""lang.num"",""vpp_lic""";
"1,""281656475"",""PAC-MAN Premium"",100788224,""USD"",3.99,21292,26,4,4.5,""6.3.5"",""4+"",""Games"",38,5,10,1";
"2,""281796108"",""Evernote - stay organized"",158578688,""USD"",0,161065,26,4,3.5,""8.2.2"",""4+"",""Productivity"",37,5,23,1";
''')

old_reader = csv.reader(old_file, delimiter=";")
new_file = io.StringIO()
for line in old_reader:
    new_file.write(line[0] + "\n")

new_file.seek(0)
new_reader = csv.reader(new_file)
for line in new_reader:
    print(line[2])
delicate fieldBOT
#

@grim vault :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | track_name
002 | PAC-MAN Premium
003 | Evernote - stay organized
dusky sequoia
dusky sequoia
#

Pythong was treating every column as an index, but in reality I want every element in that column as an index, so I had to strip and split. ```py

import csv
from collections import Counter

filename = "I think everyone knows my location because of that already"

categories = []

with open(filename, encoding="utf-8-sig") as file:
reader = csv.reader(file, delimiter=",")
for line in file:
line = line.strip().split(",")
category = line[12]

now I have no idea why the delimeter method didnt work, since it worked with my friend, but this method did work at least
grim vault
#

One last time: Because your csv file is faulty. Your friend might have the correct one.

dusky sequoia
#

well my friend used the same one cuz he has no idea what he's doing so I dont think he'd find what the right csv file would look like, but I guess there's a problem

shut tiger
dusky sequoia
#

Im using it for the rest

shut tiger
dusky sequoia
#

maybe because I didnt need to show my whole code idk

narrow prawn
#

does anyone know of a thing like tinydb but it better supports nested arrays

#

like if you wanna store a dict as a value of a key in tinydb, you might as well not use it and make something with raw json and open()

harsh pulsar
#

sqlite with json columns can be surprisingly effective for that kind of thing

narrow prawn
harsh pulsar
#

have you considered just using postgres jsonb?

#

or heck, actual json files in cloud storage with uri/url stored in postgres

#

if you want in-memory data, then maybe an actual dict or list of dicts could be good enough

narrow prawn
#

oh tbh i was just wondering if there's any package recommendation like tinydb. any more complex than that and id be better off finishing my postgres module, ive just been slacking cause concurrency issues grief

paper flower
narrow prawn
paper flower
#

Is it a python or postgres issue?

narrow prawn
#

i think both idk

paper flower
#

Or using atomic operations, e.g. incrementing your counter only on db side (I think that should work)

narrow prawn
#

that feels so confusing

#

all of that stuff

#

i feel the need to pay someone whos done it before cause i dont trust myself to make something production ready

narrow prawn
shut tiger
#

Storing unstructured data is almost always a mistake.

narrow prawn
#

im looking for a non db solution still. cause when my postgres module is all done, im still gonna have reasons to store a plain json file

#

ill show a real world example in a sec

#
async def killbot_cmd(message,args):
    with open(os.path.join(sys.path[0], 'bot_state.json'), 'r+') as f:
        state = json.load(f)

        state['just_killed'] = True
        state['killed_in'] = message.channel.id
        state['killed_by'] = message.author.id

        f.seek(0)
        json.dump(state, f)
        f.truncate()

    await client.send_file(message.channel, 'images/killbot.gif', filename='killbot.gif')
    sys.exit()
#

i guess in the end to meet my needs im gonna have to make something. which will basically be a wrapper for with open and maybe subclassing dict for custom getting/setting

shut tiger
shut tiger
narrow prawn
#

its the stuff i havent coded an interface for yet

#

and dont need a whole table

#

like i write good code, and then i write code in every weird way when im debugging or testing new ideas. then sometimes i wanna store that data in a flat file for further debugging

#

like a scratch pad. is there a programming word for that

paper flower
narrow prawn
paper flower
#

Why that's a problem though?

narrow prawn
#

id rather have a vector saved as an .svg file so i can open it literally anywhere, rather than an .ai file so i it doesnt require a computer with adobe illustrator installed

#

idk maybe a poor analogy but it should be obvious esp when bringing up testing / debugging

paper flower
#

You can query a sql db from anywhere

#

I think you may be just making your life harder by working with files pithink

narrow prawn
#

why are we debating using 10+ commands to get there when i can just open 1 file and use my eyes

paper flower
#

Also in this particular case - if that "state" is global, why not simply store it in memory?

#

Or is it unique per server?

narrow prawn
#

idk why we're talking about this tbh

#

my use case is very personal to me

keen minnow
shut tiger
keen minnow
# narrow prawn both

rule of thumb is that you are going to have a hard time trying to maintain a consistent state for your data without a DB without loosing any data.
You can look at sqlite if you really want a file.

narrow prawn
#

yeah idk if anyone here gets what im doing

keen minnow
#

you are writing a bot?

narrow prawn
#

unrelated, my original question was asking for a tinydb alternative that better supported nested structures

keen minnow
narrow prawn
#

yeah

shut tiger
#

I still think you would be better off simplifying that code with using __file__ and opening the file twice, one for reading and one for writing.

keen minnow
narrow prawn
narrow prawn
#

im just seeking a library recommendation thats all

#

if there isnt one, i have zero issues writing something in house

keen minnow
narrow prawn
#

well no cause people are just telling me to use a database when the whole point is to not use a database

keen minnow
narrow prawn
#

?

shut tiger
narrow prawn
#

yeah im just gonna write something like tinydb but way more rudimentary

#

i like you can just do "db = TinyDB("name")" and then immediately do "db.search(...)"

subtle basin
shut tiger
subtle basin
shut tiger
#

There are use cases where it's ok, like historical facts where you specifically want the schema of old things to be incompatible with the new stuff. But that's rare.

wind glacier
wanton meteor
#

hello

#

would like to create a dataset of images with each image having transparent background.
id like to add #tags to each of these image files within the dataset but i dont know how to add #tags on the images.

shut tiger
spiral sand
#

what library do you use for postgresql

coral crown
#

Hey, So I made a Terminal Based Python Program which basically asks like 240 questions and take user input for each. I store the question number and user input in a dictionary (let it be dictionary1) I then use another python program to store the data (The dictionary1) using MySQL locally into a new table every time. Now I have two queries:

  1. I will be hosting the python program with PythonAnywhere. How do I store this data locally on my pc.
    Can I automate the process? Like storing that data (dictionary1) locally and also adding it to my database using the python program I made exclusively to take this dictionary and store it in a table.

  2. Since this is 240 questions long is there any way to save the user's progress? Something like how google forms work.

Note: If this query of mine belongs to some other channel please direct me to that channel so I can have solutions. Also, I am doing this as an project to improve my skills in programming as a final year highschooler.

shut tiger
shut tiger
spiral sand
#

what did you mean

coral crown
shut tiger
shut tiger
shut tiger
fleet rock
#

sqlalchemy.exc.ArgumentError: Could not parse SQLAlchemy URL from string 'SQLALCHEMY_DATABASE_URL'
I get that error when I try to put in the correct login info using sqlalchemy.
SQLALCHEMY_DATABASE_URL = 'mysql+mysqlconnector://root:PASSWORD@localhost/api'

paper flower
torn sphinx
#

guys i have a project coming up soon in which i have to use SQL and python , i have basic knowledge of both of them if you guys have an idea which is not a game and unique please share it with me

waxen finch
#

#1214594873366282250 message ah, sqlite does flush your writes automatically during the transaction: ```py
import contextlib, sqlite3

def setup_tables(conn):
conn.execute("CREATE TABLE IF NOT EXISTS person (name, eye_color)")

def insert_rows(conn):
for _ in range(1000000):
conn.execute(
"INSERT INTO person VALUES (?, ?)",
("Foobardian Hamspamming", "lavender")
)

with contextlib.closing(sqlite3.connect("test.db")) as conn:
with conn:
setup_tables(conn)

with conn:
    insert_rows(conn)
    input("Press enter to commit")```
#

speaking of which, @torn sphinx will there be more than one connection to the database happening concurrently? with sqlite's rollback journal (the default), a write transaction blocks all other read transactions, while using a write-ahead log (PRAGMA journal_mode = WAL) allows one writer to run concurrently with other readers

#

if it matters that inserting your data is atomic (all or nothing), and/or you don't care about concurrency with other connections, then yes

slim pewter
#

I'm reading up on DynamoDB and trying to get started with it. I'm a little confused as to how broad or granular my table needs to be
Like, do i create a "customers" table and create a Partition key called customerID and Sort key called inventory::carts?
my table is to store configuration data about these carts

tacit acorn
#

Question: What's the best way to select user.id where user.id not in guild.owner_id? (the users who do not own a guild) I have tried different ways to do join, but they all take forever.

waxen finch
paper flower
torn sphinx
#

i need help with something

#

pls

paper flower
torn sphinx
#

sorry

waxen finch
torn sphinx
#

so i have had this code for awhile its sherlock and its just now saying error raw file or something

paper flower
torn sphinx
#

how do i fix it

paper flower
#

I guess this? @waxen finch

select id from user
left join guild on guild.owner_id = user.id
where guild.id is null
paper flower
torn sphinx
#

ok thanks

waxen finch
# paper flower I guess this? <@153551102443257856> ```sql select id from user left join guild ...

interestingly sqlite does create different query plans for it: sql sqlite> CREATE TABLE user (id INTEGER PRIMARY KEY); sqlite> CREATE TABLE guild (id INTEGER PRIMARY KEY, owner_id INTEGER REFERENCES user (id)); sqlite> EXPLAIN QUERY PLAN SELECT id FROM user EXCEPT SELECT owner_id FROM guild; QUERY PLAN `--COMPOUND QUERY |--LEFT-MOST SUBQUERY | `--SCAN user `--EXCEPT USING TEMP B-TREE `--SCAN guild sqlite> EXPLAIN QUERY PLAN SELECT user.id FROM user LEFT JOIN guild ON user.id = guild.owner_id WHERE guild.id IS NULL; QUERY PLAN |--SCAN user |--BLOOM FILTER ON guild (owner_id=?) `--SEARCH guild USING AUTOMATIC COVERING INDEX (owner_id=?) LEFT-JOIN

paper flower
#

Postgres generates a different plan too, I guess that's expected πŸ€”

#

Do you have index on owner_id?

waxen finch
#

just tried it, only the latter query changes with its presence: sql sqlite> CREATE INDEX ix_guild_owner_id ON guild (owner_id); sqlite> EXPLAIN QUERY PLAN SELECT user.id FROM user LEFT JOIN guild ON user.id = guild.owner_id WHERE guild.id IS NULL; QUERY PLAN |--SCAN user `--SEARCH guild USING COVERING INDEX ix_guild_owner_id (owner_id=?) LEFT-JOIN

paper flower
#
select id from "user" except select created_by_id from "group";
HashSetOp Except  (cost=0.00..26.60 rows=110 width=20)
  ->  Append  (cost=0.00..26.00 rows=240 width=20)
"        ->  Subquery Scan on ""*SELECT* 1""  (cost=0.00..12.20 rows=110 width=20)"
"              ->  Seq Scan on ""user""  (cost=0.00..11.10 rows=110 width=16)"
"        ->  Subquery Scan on ""*SELECT* 2""  (cost=0.00..12.60 rows=130 width=20)"
"              ->  Seq Scan on ""group""  (cost=0.00..11.30 rows=130 width=16)"
select "user".id from "user"
left join public."group" g on "user".id = g.created_by_id
where g.created_by_id is null;
Hash Anti Join  (cost=12.93..24.44 rows=1 width=16)
"  Hash Cond: (""user"".id = g.created_by_id)"
"  ->  Seq Scan on ""user""  (cost=0.00..11.10 rows=110 width=16)"
  ->  Hash  (cost=11.30..11.30 rows=130 width=16)
"        ->  Seq Scan on ""group"" g  (cost=0.00..11.30 rows=130 width=16)"
waxen finch
#

huh, looks pretty similar...

paper flower
#
SELECT "user".id
  FROM "user"
  WHERE NOT EXISTS
    (SELECT "group".id FROM "group" WHERE "group".created_by_id = "user".id);

Generated she same query as join

#

I'm curious how would that work with larger second table, because that hash set could become big πŸ€”

#

but for that particular case it's more than fine

open raven
#

Good day! I'm working with a database using SQLAlchemy. Two tables are connected, but they dont use foreign keys (just ids as plain int fields). I cannot change this database. Is it possible to still get a relationship between these two tables in SQLAlchemy? If not, do you have a suggestion on how I can otherwise work on these tables in an intuitive way?

shut tiger
open raven
shut tiger
open raven
#

It just sounded complicated, and for something that'll just be used in one script, a simple solution is fine I think.

Here is the query if you're interested.

periodi_start = aliased(Periodi)
periodi_end = aliased(Periodi)

query = (
    select(Prenota, periodi_start, periodi_end)
    .join(periodi_start, periodi_start.idperiodi == Prenota.iddatainizio)
    .join(periodi_end, periodi_end.idperiodi == Prenota.iddatafine)
    .where(or_(periodi_end.datafine > cmp_time))
)

Its in italian because the Hoteldruid database is in italian.

prenota=booking
periodi=period
inizio=start
fine=end

#

So I'm basically getting the booking table, and the start and end period.

#

But anyways, your help was helpful, even if I didn't use it 😁

shut tiger
#

I don't use SQLAlchemy but isn't that already how you do it WITH fks?

#

So it's basically the same?

open raven
#

I'm not good with SQLAlchemy myself. But I know there are some ways to avoid writing SQL like syntax like this, and do something that looks more OO like.

Maybe not in this case though, as the query is a tiny bit complicated? Idk.

shut tiger
#

that's not a complicated query

open raven
#

Alright then πŸ˜‰

oak pumice
#

Does anyone have any suggestions for books or reference material (besides the docs) for SQLAlchemy?

I'm trying to avoid writing with raw SQL to merge two tables or basically do and update if item exists or insert if it does not and struggling to get it to work (I believe it's my pk that is the issue). So if anyone has experience with it or has any reading material they recommend I'd be forever grateful! β™₯️

fading patrol
oak pumice
#

As ashamed as I am to admit it, I don't believe I have noticed the help channel prior to your post. So thank you for pointing it out!
I'll get a snippet of my code and do as suggested. Thanks again!!

distant bloom
#

hi, how can i execute read queries to sqlite3 from the main thread and execute write queries on a separate thread without getting a "database locked" error?

shut tiger
waxen finch
# distant bloom hi, how can i execute read queries to sqlite3 from the main thread and execute w...

when you're using multiple connections, you should commit your read and write transactions early so they release sqlite's locks quickly
https://sqlite.org/lang_transaction.html#read_transactions_versus_write_transactions
https://sqlite.org/whentouse.html

SQLite supports an unlimited number of simultaneous readers, but it will only allow one writer at any instant in time. For many situations, this is not a problem. Writers queue up. Each application does its database work quickly and moves on, and no lock [should last] for more than a few dozen milliseconds.

#

and/or if feasible, turn on the write-ahead log so one write transaction can run concurrently with other read transactions
https://sqlite.org/wal.html

thin valley
#

hello , i need help , i make container oracledb local , i want copy my database from my original db to container

wanton meteor
# shut tiger We need more context. Are we talking Django?

Im preparing to use a Generative Adversarial Network with my own images. Will be using pytorch (i think).

Is there a way that i can create the datasets using django or other database languages?
I want to create a database or library or dictionary to exist where each image asset in the library has any number of #tags that are associated with the image.

Doesn't have to be django
Im visualizing a table with 1 column and each square (vertical) is an image and each image is followed by their own row consisting of #tags

coral wasp
wise goblet
#

Or just straight in filesystem

wise wind
#
 between parent/child tables on relationship Users.dms - there are multiple foreign key paths linking the tables.  Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as containing a foreign key reference to the parent table.```
#
class Users(Base):
    __tablename__ = "users"
    username: Mapped[str] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column()
    password: Mapped[str] = mapped_column()
    profile: Mapped[str] = mapped_column(default="https://firebasestorage.googleapis.com/v0/b/discord-83cd2.appspot.com/o/default.png?alt=media&token=c27e7352-b75a-4468-b14b-d06b74839bd8")
    dms: Mapped[List["Dms"]] = relationship(back_populates="user")
    def __repr__(self):
        return f"Users(username={self.username},email={self.email},profile={self.profile})"```
#
class Dms(Base):
    __tablename__ = "dms"
    id: Mapped[int] = mapped_column(primary_key=True)
    sender: Mapped[str] = mapped_column(ForeignKey("users.username", ondelete="CASCADE"))
    receiver: Mapped[str] = mapped_column(ForeignKey("users.username", ondelete="CASCADE"))
    user: Mapped["Users"] = relationship(back_populates="dms")
    def __repr__(self):
        return f"Dms(id={self.id},sender={self.sender},receiver={self.receiver})"```
#

Could someone help me get a better understanding about sqlalchemy relationships?

paper flower
#

Also probably don't use username as PKs pithink

#

You should add FK explicitly in this case:

    user: Mapped["Users"] = relationship(back_populates="dms", foreign_keys=[receiver]) # Depends on what FK you want to use here (receiver or sender)
wise wind
#

it's just a fun side project

wise wind
#

dms: Mapped[List["Dms"]] = relationship(back_populates="user",foreign_keys="Dms.receiver")

paper flower
#

I mean, yes, but only for one of the foregin keys

#

I think it would be better to add FK to Dms.user

#

I assume dms relationship on User would know what FK to use since you specify it on the other side of your relationship

wise wind
#

in the Dms class

paper flower
#

I think so

#

You may need to use list for foreign_keys though, I don't remember the exact API

wise wind
paper flower
#

Yes

wise wind
# paper flower Yes

Does relationship make sqlalchemy perform a join or select statement when accessing a row from the table with a relationship?

#

I read the documentation, but I want to make sure I undertand what they're saying with relationships being somewhat confusing.

paper flower
#

By default if you don't use async I believe it has the same behavior as djang orm, it would do a query per each access to a relationship field you didn't access before

#

So N+1 problem

wise wind
paper flower
wise wind
paper flower
#

I'd say

#

If you have many to many relationship it would be better to use selectinload

#

But if you have a 1:1 join should perform the best

wise wind
paper flower
#

I wouldn't recommend that

#

I'd use default load as raise so it throws an exception πŸ˜…

wise wind
paper flower
#

No

wise wind
# paper flower No

And using async, I have to explicitely state what type of load I want?

#

or is it for both sync and async

paper flower
#

With sync if you access user.groups it will try to load that relationship, and in async it's simply not possible because it can't await that operation, so it errors out

#

You can override that behavior in sync sqlalchemy though

wise wind
#

Okay thanks for all the info. Relationships are a tough concept to grasp

maiden cedar
#

like how do i do this thing

shut tiger
#

This doesn't sound like a database question, it sounds like a pandas thing or something

maiden cedar
#

sorry i will remove this

shut temple
wise wind
#

@paper flower py friends = await db.execute(select(models.Users.profile, models.Users.username,models.Dms.id).join(models.Friends, or_(and_( models.Friends.receiver == current_user.username, models.Friends.sender == models.Users.username), and_( models.Friends.sender == current_user.username, models.Friends.receiver == models.Users.username))).outerjoin(models.Dms,or_(and_(models.Dms.sender == current_user.username,models.Dms.receiver == models.Users.username),and_(models.Dms.receiver ==current_user.username,models.Dms.sender == models.Users.username)))) any recommendations on how to improve this query?

paper flower
# wise wind <@216540595403882496> ```py friends = await db.execute(select(models.Users.profi...
  1. Would be good to know what you're trying to achieve this
  2. Let's fix the formatting so it's easier to read (and probably get rid of models. priefix as it takes too much space IMO):
friends = await db.execute(
    select(Users.profile, Users.username, Dms.id)
    .join(
        Friends,
        or_(
            and_(
                Friends.receiver == current_user.username,
                Friends.sender == Users.username,
            ),
            and_(
                Friends.sender == current_user.username,
                Friends.receiver == Users.username,
            ),
        ),
    )
    .outerjoin(
        Dms,
        or_(
            and_(
                Dms.sender == current_user.username,
                Dms.receiver == Users.username,
            ),
            and_(
                Dms.receiver == current_user.username,
                Dms.sender == Users.username,
            ),
        ),
    )
)

So are you just selecting Dms.id essentially? πŸ€”

#

I think if you use relationships on your models it could be simplified a bit

zealous marsh
#

Hi

#

How to create alembic migrations for PostgreSQL database with different schemas and ORM-based sqlalchemy client?

#

I can do it without schemas supply

#

But can’t do with them. ORM classes of tables can’t see table names in query

paper flower
#

<@&831776746206265384>

knotty ermine
#

Is anyone aware of a way to create a relationship using a hybrid property that references a relationship? On the model TaskCircuitBuildSubtask below, I've defined the relationship 'nni_list', but this is causing the below error. The relationship itself is referencing 'supplier', which is a hybrid property. The hybrid property itself also references the relationship 'task', which I think is specifically causing the issue.

AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with TaskCircuitBuildSubtask.task has an attribute 'interface'

The hybrid property itself works as expected and returns the correct value.

#

`class Nni(Base):
tablename = 'nnis'

id = Column(Integer, autoincrement=True, nullable=False, unique=True)
parent_id = Column(Integer, ForeignKey("interfaces_juniper.id", ondelete="CASCADE"), index=True, primary_key=True)
nni_type = Column(String(32))
supplier = Column(String(32))
nni_ref = Column(String(128))
description = Column(String(64))
platform = Column(String(32))`
#

`class TaskCircuitBuild(Base):
tablename = 'task_circuit_build'

id = Column(Integer, autoincrement=True, primary_key=True)
customer_code = Column(String(16), nullable=False)
support_ref = Column(String(16), nullable=False)
created_by = Column(String(255), nullable=True)
date_created = Column(DateTime, default=datetime.datetime.now)
date_completed = Column(DateTime, nullable=True)
status = Column(Integer, default=0)
family_type = Column(String(255), nullable=True)
speed = Column(Integer, nullable=False)
supplier = Column(String(255), nullable=False)
layer = Column(Integer, nullable=False)
as_target = Column(Integer)
vrf_target = Column(Integer)
config_holder_set_id = Column(Integer, ForeignKey("config_holder_set_juniper.id"), index=True)

subtasks = relationship("flex.models.TaskCircuitBuildSubtask", viewonly=True, lazy='joined', backref="task")
routes = relationship("flex.models.TaskCircuitBuildSubtaskRoute", viewonly=True, lazy='joined', backref="parentTask")
config_holder_set = relationship("juniperConfigHolderSet", lazy='joined', back_populates="task_circuit_build")
task_queue = relationship("TaskQueueDeploy", lazy='select', backref="build_task")`
#

`class TaskCircuitBuildSubtask(Base):
tablename = 'task_circuit_build_subtask_unit'

id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey("task_circuit_build.id", ondelete="CASCADE"), index=True, nullable=False)
handoff_type = Column(String(20), nullable=False)
node_id = (Column(Integer))
interface = Column(String(64))
unit_num = Column(Integer)
vlan_id = Column(Integer)
inner_vlan_id = Column(Integer)
ip_address = Column(String(20))
configuration = Column(String(4294000000))
preference = Column(Integer, nullable=True)
bfd_enabled = Column(Boolean, nullable=True)
active = Column(Boolean, nullable=True, default=True)

@hybrid_property
def supplier(self):
    if (self.handoff_type=="CUSTOMER" and self.task.layer==2):
        return self.task.customer_code
    else:
        return self.task.supplier
    
nni_list = relationship('Nni', foreign_keys=[Nni.supplier], primaryjoin="Nni.supplier == TaskCircuitBuildSubtask.supplier", uselist=True, viewonly=True)`
harsh pulsar
delicate fieldBOT
#
Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

shut temple
#

I have the user_genres and a database table with party_genres, i want to return the party with maximum intersection, how do i write the mongodb query?

narrow prawn
#

i started playing with prisma and im impressed

#

dont really know the advantages it has yet, but im impressed solely by the fact it generates a python library that's tailored to my db

#

and that you have full control over the name of the attributes for your objects too, via mapping in the prisma schema

wise wind
harsh pulsar
harsh pulsar
wise wind
narrow prawn
harsh pulsar
#

i get a little leery because the prisma project itself seems less open to the idea of a broad ecosystem built around their language. there isn't really even a mention of it in the docs. seems like a second-class citizen thing. they seem to think of themselves primarily as a javascript/typescript framework

narrow prawn
#

im only using it for its ORM features, but i think so far it provides a pretty nice interface for connecting to the db

paper flower
wise wind
#

didn't think of that

paper flower
#

You can also specify a condition in that join too

harsh pulsar
paper flower
narrow prawn
#

im ditched sqlalchemy for it being way too hard to figure out

wise wind
paper flower
#
select(User).join(User.friends.and(User.username.ilike("something")))
#

Though not sure if it will use correct alias here

#

Since User.friends are Users too

harsh pulsar
paper flower
#

Maybe there's a way to simplify it, what db are you using?

paper flower
#

Maybe tuples could work πŸ€”

narrow prawn
paper flower
#

(a, b) in ((a, b), (b, a)) You could do an operation like this

wise wind
#

@paper flower is it true that you're one of the top contributors of FastAPI?

paper flower
paper flower
narrow prawn
paper flower
narrow prawn
#

like what readability means

paper flower
narrow prawn
#

gotta disagree there

paper flower
#

This is pretty readable to me pithink

wise wind
#

I feel like I have a better understanding

paper flower
#

Sqlalchemy docs look bad at first, you get used to them in like a week or two of active reading

#

Worth it tbh

narrow prawn
#

look at how bad the search is

wise wind
#

but then again I knew the basics of sqlalchemy when I read the documentation

wise wind
harsh pulsar
# paper flower Honestly you shouldn't care about design, and their docs structure isn't bad

the search doesn't do a good job. they do a poor job of separating "very important" information from "arcane miscellany". they tend to jump right into examples without spending some time to clarify the data model of the thing you're trying to read about. the api reference pages being huge massive all-in-one affairs makes it very hard to find things.

you can absolutely learn from the docs. but sometimes it will take you a while to find what you're looking for, or you might never even realize you should be looking for it.

paper flower
# wise wind I think I may know a solution
from sqlalchemy import select, tuple_
from sqlalchemy.dialects import postgresql

from app.db.models import Manga, MangaTag

stmt = select(Manga.id).join(Manga.tags).where(
    tuple_(Manga.is_public, MangaTag.name).in_([
        (True, "name1"),
        (False, "name2")
    ])
)
print(stmt.compile(dialect=postgresql.dialect(), compile_kwargs={"literal_binds": True}))
SELECT manga.id 
FROM manga 
  JOIN manga__manga_tag__secondary AS manga__manga_tag__secondary_1 
    ON manga.id = manga__manga_tag__secondary_1.manga_id 
  JOIN manga_tag 
    ON manga_tag.id = manga__manga_tag__secondary_1.tag_id 
WHERE (manga.is_public, manga_tag.name) IN ((true, 'name1'), (false, 'name2'))
wise wind
#

but I guess it's a learning opportunity

harsh pulsar
# paper flower Here be dragons

i actually do appreciate those. i shouldn't imply that it's universally bad. for example the pandas user guides are still worse.

#

the fact that a complete and completely type-hinted reference doc exists is amazing

#

and with near 100% docstring coverage too

paper flower
#

It did go a long way since 1.3-1.4

harsh pulsar
#

and the info is all there. it's just a questionable UI and information hierarchy

#

yeah i definitely remember it being worse.

#

like matplotlib. i was pleasantly surprised when some time around the 3.x release the docs started to become useful

paper flower
#

Optional type hints were introduces in 1.4.something, started using them immediatelly

#

And in 2.0 you can declare a column with just name: Mapped[int], awesome

harsh pulsar
#

yup, i love that

#

however db.execute(select().scalar()) vs. db.scalars(select()) is confusing

paper flower
#

Maybe
execute().scalar/s()?

#

They're the same, but Session.scalar/s are just helper methods

harsh pulsar
#

wait maybe i have that backwards

harsh pulsar
#

see, i can't keep them straight πŸ˜†

paper flower
#

You can turn scalar() into
execute().scalars().one_or_none() πŸ˜…

#

They should do the same thing

harsh pulsar
#

yup. too much convenience...

paper flower
#

execute returns everything you selected in select, in case there are multiple elements (ORM models "count" as one though)
scalar returns first element and a single row
scalars returns first element and all rows

#

E.g. if you want to fetch all users you'd do

session.scalars(select(User))
harsh pulsar
#

but there's no session.scalar 😠

paper flower
#

One user:

session.scalar(select(User).where(User.username="abc"))
paper flower
harsh pulsar
#

also .where and .filter being synonymous

harsh pulsar
#

maybe i wrote saclar or something

paper flower
#

Maybe old sqlalchemy version?

harsh pulsar
#

well that's convenient

#

i would hope not but anything is possible

#

i take back all criticism, sqlalchemy is perfect

paper flower
#

stream_scalars is kinda cool

#

I used it (once) πŸ˜…

#

For a etl

harsh pulsar
#

that one is new to me

#

do we have sqlalchemy in the doc bot?

#

alas

paper flower
#

fix it brainmon

harsh pulsar
#

i don't have that power

paper flower
#

It's cool that sqlalchemy allows you to use any custom operators too

#
if filter.search_term:
    stmt = stmt.join(Manga.alt_titles, isouter=True).where(
        AltTitle.title.op("&@~")(filter.search_term),
    )
wise wind
paper flower
#

Let me google a bit, maybe there's a better way to model that relationship

narrow prawn
shut temple
fallow forum
#

Hi, I need an advice..
So am working on an app that basically needs to perform alot of inputs, the user will create and I will save his filtered data and it will affect multible tables, then in update and delete, I will have to go through those tables to update the changes and it becomes a bit complicated..

#

I want to make a non-relation table that stores everything and deal with the creating and changing in a simple way, and then make a schedule task to take this raw table's data and store them properly in other tables to make the reports, what do you think?

shut tiger
cloud bronze
#

My Instagram account was ban please help me for unban instagram account

shut tiger
cloud bronze
#

Please help me

shut tiger
silver cipher
#

alright he's using an alt

fallow forum
shut tiger
fallow forum
shut tiger
fallow forum
#

No

#

You mean like atomic?

#

Doesn't Django handel this?

shut tiger
shut tiger
fallow forum
#

And this allows me to do what?
Like undo the transaction if it fails or something?

#

This is my first big project so I still have no idea about the best practices

shut tiger
fallow forum
#

Are there any sources that I could learn this type of things from?

shut tiger
fallow forum
#

I found a really nice book πŸ“™ called the temple of django database, but thank u anyways

sand oyster
wise wind
paper flower
#

tuple looks a bit better IMO, but it's a postges specific thing

wise wind
#

It's kind of hard taking advantage of relationships when there are multiple foreign keys in one table.

foggy spade
#

hello guys im using sqlite3 for a python/pygame game project and ive ran into a problem where my score is not getting updated into my database. i have aleady debugged and found that the problem lies somewhere within the data being updated into the database. not sure if theres something specific im mssing out

shut tiger
foggy spade
#
                                        import sqlite3 
                                        from login_form import LoginForm
                                        score = round(end_time, 2)
                                        print(score)  

                                        
                                        if LoginForm.logged_in_user:
                                            db = sqlite3.connect("main.db")
                                            cursor = db.cursor()

                                            update_score = 'UPDATE user_scores SET score = ? WHERE username = ?'
                                            cursor.execute(update_score, [score, LoginForm.logged_in_user])

                                            db.commit()
                                            db.close()```
#

sorry i realise that was vague but this is my logic for updating the database it just doesnt do it

#

im new to this as well id like to get that out there

shut tiger
frank gust
#

guys anyone expert with google cloud services ?

foggy spade
foggy spade
#

oh sorry i misinterpreted no i havent i will attempt that thanks

fading patrol
tall wasp
#

I have a MySQL database setup on an external host, I have the host, user, password, and port. Do I have to use a Python package or can I manually send requests to the database?

wise wind
#
friends = select(
                case(
                    (models.Friends.sender != current_user.username, models.Friends.sender),
                    (models.Friends.receiver != current_user.username, models.Friends.receiver)
                ).label("username")
              ).select_from(models.Users).options(joinedload(models.Users.friends)).join(models.Users.friends).filter(models.Users.username == current_user.username).subquery()
    friends_information = await db.execute(select(models.Users.profile,models.Users.username,models.Dms.id).join(friends,friends.c.username == models.Users.username)
                        .outerjoin(models.Dms,or_(and_(models.Dms.sender == current_user.username,models.Dms.receiver == models.Users.username),
                                                  and_(models.Dms.receiver == current_user.username,models.Dms.sender == models.Users.username))))
    friends_information = friends_information.all()``` @paper flower I spent hours just to produce this
wise wind