#internals-and-peps
What if the iterator is not infinite, but takes more memory than your computer has available to represent?
What if the object conforms to your type hints when you get it, but it's modified after it's passed to you and no longer conforms?
Well, it doesn't actually fit the list API
yes, but python can't check that. And if you have runtime checks, they should be at least somewhat accurate
Even given something like ```py
x: list[int] = list(range(1000000)) + ["x"]
```
It's possible to validate that the type hint is incorrect, but it's _very_ expensive to do so.
And that's a simple, one dimensional structure.
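To make the cost concrete, here is a rough sketch of what a runtime check for `list[int]` would have to do. The helper name `conforms_to_list_of_int` is made up for illustration; it is not part of `typing` or any checker.

```python
def conforms_to_list_of_int(value):
    """Return True only if value is a list whose every element is an int."""
    # Checking the container type is O(1); scanning the elements is O(n).
    return isinstance(value, list) and all(isinstance(item, int) for item in value)


x = list(range(1000000)) + ["x"]
# The scan walks a million ints before it reaches the offending "x".
checked = conforms_to_list_of_int(x)
```

And the check is only valid at the moment it runs; a later `x.append("y")` would invalidate it again.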
And of course, there are cases like: ```py
x: asyncio.Future[int]
```
Which says that some other coroutine will eventually come along and set that future's result to an int value. That can't be checked when the object is passed to you... It can't be checked until after `set_result` has been called
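A small sketch of that situation, assuming nothing beyond the standard library: the future is annotated as holding an int, but the value only arrives later and nothing stops it from being a string.

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()
    # The annotation promises an int, but nothing enforces it at runtime.
    fut: "asyncio.Future[int]" = loop.create_future()
    # Some other code sets the result later, after `fut` was already handed out.
    loop.call_soon(fut.set_result, "not an int")
    return await fut


result = asyncio.run(main())
```

At the point where `fut` is created or passed around there is simply no value to inspect yet.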
Ah, even better example.
You check that with mypy, before you run the code.
Is there any way to have logic triggered by the named importing of specific variables? For example let's say I have main.py and datasets.py. In main.py I want to run `from datasets import somedata` and have it automatically download and return a CSV from someurl.com/somedata.csv, but if I do `from datasets import someOtherdata` it should instead download and return someurl.com/someOtherdata.csv
I already have this (https://github.com/adamerose/PandasGUI/blob/develop/pandasgui/datasets.py) such that all datasets get downloaded on first import, but I would like to instead have individual datasets only download when they are specifically imported
you should be able to specify `def __getattr__(name)` at the module level and fetch it there
alternatively, I think this is something import hooks can do
ok great thanks
hi guys! does anyone here use GitHub Actions for CI/CD?
this is a #tools-and-devops topic, but yes
thank you!
what is something like this called: `print(['false','true'][input()=='1'])`? i was playing clash of code in shortest mode, and the winner used a construct like this. But im having a hard time even googling what it is.
its like a conditional using only two lists
It's just regular list indexing, which works because bool is a subclass of int
Using True where an integer is expected is the same as 1, likewise for False and 0
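A quick sketch of why that works; `label_for` is just an illustrative name wrapping the same expression without the `input()` call.

```python
labels = ['false', 'true']


def label_for(text):
    # The comparison yields a bool; used as an index, False is 0 and True is 1.
    return labels[text == '1']


is_subclass = issubclass(bool, int)
```

The same trick is why `sum(x > 0 for x in values)` counts the positive values.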
ah ty soo much, i think i understand, that means it would only work for 2 values because the index can only be 0 or 1 (true or false)
Right.
dang its crazy what ppl come up with
In that particular case you gave, note that it would be shorter to do ```py
print(str(input()=='1').lower())
```
but that's a special case
here's the latest crazy thing I came up with: https://github.com/nedbat/gefilte/blob/main/src/gefilte/gefilte.py#L154-L174 it turns class methods into bare names as a DSL
No types Check !!!! For the Freedom of Bugs !!!
hello, everything ok?
Hey everyone, I am looking to refactor a large conditional (~10 if/elif statements) and am trying to figure out what design pattern makes sense for this. Each conditional checks a variable for a value, and if it is in the list it runs a method
trying to visualize this without just moving the conditional check
Smells like bad design
oh it is, thats what I am trying to fix
But without some code sample on pastebin nobody will help you with that
And secondly, for all the complicated help topics there are dedicated help channels to not spam the general talk
got a usage example of that? I'm having trouble picturing it
I don't see anything wrong with that.
definitely breaks open/closed: any time we need to add a new condition we need to come into this class and make changes to this method
... type checking can probably be implemented by some semi-formal method (like pre- and postconditions) at run time. it can probably be implemented at the metaclass level for python, so no need for changing code. and the cool thing is you can usually do this on a domain of classes ...
yes it is costly (cpu) but it works
that's not what the open/closed principle is about.
A module will be said to be open if it is still available for extension. For example, it should be possible to add fields to the data structures it contains, or new elements to the set of functions it performs.
A module will be said to be closed if [it] is available for use by other modules. This assumes that the module has been given a well-defined, stable description (the interface in the sense of information hiding).
we need to leave some bugs the pleasure of roaming around
the open/closed principle is more about: if we need to change this thing, do we also need to change a bunch of other things?
it should be possible to extend a thing without breaking all of the other things that already depend on it.
and adding a new "elif" here doesn't violate that.
I guess it all depends on which definition you read
What does Open for extension and Closed for modification mean? Open for extension means that the behavior of a software entity can be extended. This allows for making changes required to satisfy new requirements. Closed for modification means that we must not modify the code to extend it with new behavior. In other words: adding new behavior should not lead to changes in the source code.
I'd say "should not lead to unnecessary changes in the source code".
changes shouldn't have knock-on effects, you should be able to make targeted changes without breaking things.
the pithy version of the open/closed principle is "you should be able to put on a coat without needing open heart surgery"
look at the readme of the repo
My first thought was "how the heck is that working", followed by noticing
inspect.stack()[2][0].f_locals
heh, looks cute though.
@raven ridge it works
Hi all, I need help
Are Python threads native threads? If so, does the GIL prevent some native thread features? Which ones?
Assuming we're talking about CPython - the most typical Python interpreter - yes, Python threads are native threads, and yes, the GIL limits what can be done by two threads. No two threads can be simultaneously executing Python byte code or calling Python functions or manipulating Python objects.
This is a vague question, but How do I become an advanced python programmer? or an advanced programmer in general? I have my own long-term projects also I feel very comfortable with self-learning. I feel like there is very few resources in terms of becoming an advanced programmer vs the sea of beginner tutorials. For some projects, I have since thrown out hope of looking up pre-made solutions of tutorials and started just reading straight documentation. I even tried to help out some open-source PRs of my own. When I did this, all I got in response was mostly about criticism about my methods and how they are bad without giving alternatives, basically just bashing me. I hope some of you here can help me understand the ins and outs of dev work and how I may become advanced. I hope this is the right channel for this discussion, maybe #career-advice would be better? let me know.
Yep, but I need some code example in cpython, code of interpreter, can u help me?
When I start 8 threads in Python code, I can only see 1 thread in the top command. Why not 8?
if GIL limits native thread, Why don't we call the python thread a green thread?
they really are OS threads.
does top show threads?
I can see 1 thread, I'm using macos
can you show us the code you're running?
It is
u can see here https://pastebin.com/RYcySPjx
yes
when i run that, top says I have 11 threads.
Oh no
what's wrong?
where is the thread count there?
you have one process. you aren't showing the number of threads
Look at PID 91632: I show the process tree but I cannot see its children (threads)
the children of a process are more processes, not threads.
How to see threads? Can u help me?
My top looks like this. thr.py has 11 threads
the first line, python3.8 shows 11 threads
Thank you, can u show cpython code, which forks native thread? I can not find them
your code is forking native threads
I need detail implementation in cpython project
Maybe system call
i don't know what you are missing: your Python code runs in CPython, and makes native threads.
you are asking for the code in the CPython implementation?
yes, u can see here https://github.com/python/cpython/blob/master/Modules/_threadmodule.c#L1065
but i cannot find where the thread is forked
i'm sorry, i don't know the details. can you say more about why you want to find the code?
Yeah - https://github.com/python/cpython/blob/master/Python/thread_pthread.h#L288 is where it starts the thread, on a Unix system.
it does it somewhere else on Windows.
I have a discussion with my friend about it
what is the discussion? whether they are real threads?
We have a WSGI server and I said that Python threads look like green threads, but my friend does not agree
right, they are real OS threads
there are Python libraries that do green threads - asyncio and trio and gevent and greenlet, to name a few. But normal Python threads, created by the threading module or the concurrent.futures module, are real OS threads.
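One way to see this from inside Python, as a sketch: `threading.get_native_id()` (Python 3.8+) returns the thread ID assigned by the kernel, and every `threading.Thread` that is alive at the same time reports a different one. The names `worker`, `native_ids` and `THREAD_COUNT` are just for this demo.

```python
import threading

THREAD_COUNT = 4
native_ids = set()
ids_lock = threading.Lock()
barrier = threading.Barrier(THREAD_COUNT)


def worker():
    with ids_lock:
        native_ids.add(threading.get_native_id())  # ID assigned by the OS kernel
    # Keep every thread alive until all have recorded their ID,
    # so the OS cannot reuse an ID within this experiment.
    barrier.wait()


threads = [threading.Thread(target=worker) for _ in range(THREAD_COUNT)]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_native_id = threading.get_native_id()
```

Green threads, by contrast, would all run on whichever OS thread hosts their scheduler.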
thanks @spark magnet for your support
hey guys, i have a massive issue. so i made this django site and i deployed it to heroku for free and it was working fine untill i started to create my login functionality. the logins were creating sessions and saving data but what i discovered now is that the passwords just start to either get deleted or just expire
i don't know how the hell this is happening, it's definitely because of Heroku. I'm using SQLite3. is there an issue with how i set up my database, or should i use another host?
this is a channel for discussion of the language! try #web-development or #how-to-get-help
sup nerds, i hope you are having fun with this one https://www.youtube.com/watch?v=8qEnExGLZfY
[https://www.buymeacoffee.com/cogsci] In this video, I show how you can profile Python code using the cProfile module, and how you can use this information to optimize your code, resulting (sometimes) in massive performance improvements.
The Jupyter notebook is available from https://osf.io/upav8/
should utility classes be avoided in python?
Usually making a utility file or module is enough, making a class for the sake of namespacing isn't very useful
so instead a utility module and from there import all of the functions?
Yup
a "utility module" can be a disaster as well
because you tend to dump every single thing that doesn't fit the existing structure into there
yeah, a utility module should be only functions that do not use any objects or values from other code
@acoustic crater you should ask in #c-extensions
kk
re util module, i've often found it as a great "intermediate" step in refactoring
So i tend to let it become a dumpyard for common functions in my module, and then i use it as a signal for refactor if it becomes too large
re threading discussion earlier, what is a "green" thread?
yeah - do exactly the same thing - and if it's used more make it into a package
is it possible to ignore both flake 8 and pylint errors in the same line ?
They're threads of execution managed by the runtime instead of the os
ah gotcha
why run both flake8 and pylint?
i didn't think they completely overlapped, in that pylint would catch some things flake wouldn't and vice versa? Also - pylint is quite slow, so on pre-commit hook i just run flake
maybe this is a bit wonky tho
Hey. Do you know of any advanced tutorial/guides for prompt_toolkit aside from what's already in the repository and their documentation?
lmao i asked the question
no, instead i read the source code, which is pretty decent
if you can't figure out something, read their code, and you'll solve it
that's my takeaway
I know, but I thought it was worth to ask, because you probably went down the same road I'm walking now :)
thnx
guys, why is it that when i explicitly enter the unix time in the jwt args i get no error, but when i pass it as the return value of a func i get an error? this is when i directly enter the unix time for the exp key (no error): `unix = expireDate(minutes=1)` `token = jwt.encode({'user': uname, "exp": 1617126042}, app.config['KEY'], algorithm="HS256")` but when i do this: `unix = expireDate(minutes=1)` `token = jwt.encode({'user': uname, "exp": unix}, app.config['KEY'], algorithm="HS256")` i get the error: The view function did not return a valid response. The return type must be a string, dict, tuple, Response instance, or WSGI callable, but it was a int.
This channel is for discussing the Python language itself. You should see #how-to-get-help or ask in #web-development
lol isnt this python ?
The channel is for discussing the language itself -- the future of the language, what you like and don't like in Python etc., not for individual help requests
ohhhhhhhh
If its too large what is refactoring?
refactoring simply refers to rewriting/reorganizing code to make it easier to maintain. in this context, it would depend, sometimes i break it into separate files based on where the utility functions belong. Other times, they turn into methods for a class, though that's been fairly rare for me
Ah
some other times, i notice im not really reusing a function i thought i would need, and i decide to pull it back into a different py file instead and have it sit "close" to where its used
do dataclasses work well with non dataclasses in inheritance tree?
I don't know, but it would make me uneasy. why do you want to do it?
I have a dataclass with behavior; I found they work well, but not with cooperative inheritance
what did you want to inherit?
tried multiple inheritance, but dataclasses don't forward extra args to super
I don't think anything in python does anything automatically. you always have to call super, be it a normal class or dataclass
Feel like you're doing something wrong
@tacit hawk can you tell us more about what your inheritance structure looks like?
you can use init=False to stop the dataclass from generating its own __init__ at all. Or you can guarantee it's last in the MRO before object, so it doesn't need to forward any args on to another class, by carefully ordering your classes.
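For what it's worth, a small sketch of the underlying behavior: the `__init__` that `@dataclass` generates only assigns the fields and never calls `super().__init__()`, so anything later in the MRO is skipped unless you chain it yourself, for example from `__post_init__`. The class names here are invented for the demo.

```python
from dataclasses import dataclass


class Tracked:
    def __init__(self):
        self.tracked = True


@dataclass
class Plain(Tracked):
    x: int  # the generated __init__ sets x and stops there


@dataclass
class Cooperative(Tracked):
    x: int

    def __post_init__(self):
        # Chain manually; the generated __init__ will not do it for us.
        super().__init__()


plain = Plain(1)
coop = Cooperative(1)
```

This only helps when the next class in the MRO takes no extra arguments; forwarding arguments still needs a hand-written `__init__`.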
thanks, yes I set the dataclass as last in the MRO. but it seems they don't follow the MRO correctly; I had a diamond inheritance where the top and middle classes were dataclasses, like
    d0
   /  \
 d1    d2
   \  /
  child
where the d classes are dataclasses. it seems only the MRO child > d1 > d0 was followed; I expected child > d1 > d2 > d0
I was pointing out that you could make it so that, if there's only one dataclass in the MRO, you can sidestep the problem by ensuring it's last. Obviously you can't do that if there's more than 1 dataclass.
I removed that diamond
what is the impact of reading data from a list vs reading data from a list which is in a dictionary? i am talking about access time
do you mean looping over it? or do you mean searching for a specific index/key? more details please
Is there a python expression evaluator with type hints that explains how to use generics with covariance?
my 2nd plugin, neos, adding menu items to the IDE "Script" menu: https://i.imgur.com/xTjNziq.png
is the PEG on github somewhere?
I'm considering making a function that takes input for struct and partitions it (e.g. 5sP3n becomes [s,s,s,s,s,P,n,n,n]), should I just keep it private to this project or would this be worth PRing into the _struct.c module
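A rough pure-Python sketch of that partitioning, using the semantics from the message above (every count is expanded, including for `s`). Note that `struct` itself treats the count in `5s` as the length of a single bytes item rather than five items, so this is not what `struct` does internally. `expand_format` is a made-up name.

```python
import re

# An optional repeat count followed by a single format character.
_TOKEN = re.compile(r"(\d*)([a-zA-Z?])")


def expand_format(fmt):
    """Expand repeat counts, e.g. '5sP3n' -> ['s'] * 5 + ['P'] + ['n'] * 3."""
    expanded = []
    for count, code in _TOKEN.findall(fmt):
        # A missing count means 1; extending with 'sss' adds 's' three times.
        expanded.extend(code * int(count or 1))
    return expanded


parts = expand_format("5sP3n")
```

Byte-order prefixes and whitespace are simply skipped by this sketch.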
What do you mean by "the PEG"?
@weary garden https://github.com/python/cpython/blob/3.9/Grammar/python.gram is this what you're looking for?
@spark magnet the raw file version of what appears in the website docs; all I can find is a grammar containing python code rather than the source PEG
@timid orbit yeah that is no good as it contains python code
actually https://github.com/python/cpython/blob/3.9/Grammar/Grammar might be OK
Is this the best/fastest way to "reduce" a dict into one dimension? I have a JSON where the key of a sub-json should become a property of that sub-json. I do this by converting to a dict and doing the following:
data = dict()
data['2020'] = dict()
data['2020']['val1'] = "yes"
data['2020']['val2'] = 3
key = list(data.keys())[0]
flattened = {'date': key, **data[key]}
print(flattened)
run the code: https://onecompiler.com/python/3wtb86rp3
How do you type hint a literal value if you are stuck at 3.7 without typing.Literal
you can use typing_extensions
Why not?
I can't just go and pip install things at work
So if it's not in base install, I'm not going to have access
Government work, classified systems, always a few years behind
perhaps you have libraries that depend on typing-extensions?
try importing typing_extensions
well, then you can't use Literal
And you can't vendor the dependency either?
Idk what this means
Basically keeping a copy of the dependency in your own repo
You don't need the internet to do that
I can't local install either
Like I can't just like copy the package onto a flash drive and plug it in
It's completely self contained
In fact I'd be breaking DoD rules if I plugged anything into the terminal box
then you can't use Literal AFAICT
is it really that bad?
you can use an Enum or something like that
Is what that bad?
not using Literal
No, but it does suck
To clarify, I mean that the source code of the dependency will be alongside your own code. So when the app is deployed, the dependency is copied along with your code. If your policies don't allow any third party code then that's understandable. But that wasn't totally clear.
Like I'm just not going to type hint that argument
No we can have 3rd party code
It's getting the code onto the system
It's a mess and would take over a year
If you want to emulate a few select allowed values, you can use enum.Enums
Better off waiting for new python to be approved and installed
Imagine working on a classified system that contains models and deployment code for missile defense systems that shoot intercontinental ballistic missiles out of the air
It's locked down lol
Okay, I understand.
I'm really shocked that typing didn't have literal from the getgo
Apply to your superiors for python3.11. Before your upgrade gets approved, it will be already out. Just gov work things
I've literally never found a use for it... It's such an odd thing to me to type hint that it's legal to do `func("val")` but not ```py
x = "val"
func(x)
```
Lemme show u what I am doing
I think it's fairly interesting
!code
@raven ridge
So I found out that you can pass callables to .loc in pandas and made a logical slice equivalent to pd.IndexSlice. my first time ever doing functional programming really and I think it's really interesting
I want the Literal for my __init__ argument
you could probably use an enum there, but it would be a few more lines depending on how you'd want it to work
from enum import Enum
import operator

class Op(Enum):
    GT = ">"

ops = {
    Op.GT: operator.gt,
}

def init(op: Op):
    ops[op] is operator.gt  # look up the function for this operator

init(op=Op(">"))
but you could also have
class Op(Enum):
    GT = operator.gt

def init(op: Op):
    op.value is operator.gt

init(op=Op.GT)
but then you cannot do op=Op(">"), which can be useful if you're eg loading some config from JSON and will have the op specified as a string
Thank you !
I hope the code works, I didnt run it
well, as you might guess, it depends on the type checker
that's legal in Pyright
because the type of x is inferred to be Literal["val"]
Huh, I thought that wasn't allowed, but https://www.python.org/dev/peps/pep-0586/#backwards-compatibility - apparently the PEP completely declined to specify whether or not that should be inferred.
Is there any web api which allows 10000 of request calls for testing purposes?
So I added some example code in the comment of this gist and I was wondering if I could get some feedback
https://gist.github.com/Melendowski/0fc0eb453cdc434b5f4a46fdaddac2ee
you don't care what API it is?
you can just host it locally/on a VPS, right?
sure, if you have code to run. Can you say more about what you need to do?
looks cool, but i think you should take a look at https://github.com/construct/construct/blob/master/construct/expr.py
it looks like that's exactly what you're trying to implement, and it already supports almost every operation. usage:
f = (this['a'] > 5) & (this['b'] + 7 > 4)
df.loc[f]
Where is this example
what example
That example you gave
it's inside my message, wdym
I thought you copied it from somewhere
ah no, i tried to do something like what you've done
That's callable?
yeah
What is this
this is a cool library, that does binary parsing, but it also has this callable trick like you want
this is a Path() object, pretty similar to your LiteralSlice, look at the link i sent
That gives me some strong ideas that's for sure
is it not exactly the same thing except without pandas?
It's very similar that's for sure
I could borrow from that to change the syntax on mine
I think it's better to create some shared library instead of having similar code that does almost the same thing between these 2 libraries
Well I actually made an issue on pandas
I can see how short sighted mine is now
Just goes to show you, you're rarely the first person to think something up lol
Has anyone found any neat use cases for typing.Annotated yet? When working with JSON responses, I usually prepare local representations that I load the desired attributes into, so that it'll fail as soon as possible if the response lacks the expected keys, and the rest of the code can access the attrs naively, and so far this looks like a beautiful declarative approach:
class Video(RemoteObject):
    snowflake: Annotated[int, "Video.Identifiers.Snowflake"]
Where RemoteObject.__init__ will loop over __annotations__ and get each attribute from the response using the annotated path.. I used to use the attribute name to match it to a key in the response, but that's annoying when a) the JSON's keys aren't snake_cased, which means the attributes cannot be either, and b) you cannot specify a nested path.. I'm just upset that operator.itemgetter doesn't allow nested lookup like attrgetter does, but maybe there's a good reason for that
I'm unsure on what the best way to specify the path would be though, because using a dotted path suffers from the issue that string keys themselves can have dots in them
I think maybe that's why operator.itemgetter won't allow a nested lookup, because there's no unambiguous way to delimit the path?
Well, it could take an iterable of keys, for example
Hmm, perhaps this would be better:
snowflake: Annotated[int, "Video", "Identifiers", "Snowflake"]
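Here is a hedged sketch of how that variadic form could be consumed. The `RemoteObject` below is a guess at the idea described above, not the original implementation: it reads the path keys from `__metadata__` and walks the response with them (Python 3.9+ for `typing.Annotated`).

```python
from typing import Annotated, get_type_hints


class RemoteObject:
    """Hypothetical base class: loads each annotated attribute from a nested path."""

    def __init__(self, response):
        # include_extras=True keeps the Annotated metadata instead of stripping it.
        hints = get_type_hints(type(self), include_extras=True)
        for name, hint in hints.items():
            path = getattr(hint, "__metadata__", None)
            if path is None:
                continue  # plain annotation, nothing to load
            value = response
            for key in path:
                value = value[key]  # KeyError right away if the response lacks a key
            setattr(self, name, value)


class Video(RemoteObject):
    snowflake: Annotated[int, "Video", "Identifiers", "Snowflake"]


video = Video({"Video": {"Identifiers": {"Snowflake": 42}}})
```

Since each key is a separate metadata item, keys that contain dots are no longer ambiguous.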
!d typing.Annotated
```py
typing.Annotated
```
A type, introduced in [**PEP 593**](https://www.python.org/dev/peps/pep-0593) (`Flexible function and variable annotations`), to decorate existing types with context-specific metadata (possibly multiple pieces of it, as `Annotated` is variadic). Specifically, a type `T` can be annotated with metadata `x` via the typehint `Annotated[T, x]`. This metadata can be used for either static analysis or at runtime. If a library (or tool) encounters a typehint `Annotated[T, x]` and has no special logic for metadata `x`, it should ignore it and simply treat the type as `T`. Unlike the `no_type_check` functionality that currently exists in the `typing` module which completely disables typechecking annotations on a function or a class, the `Annotated` type allows for both static typechecking of `T` (e.g., via mypy or Pyre, which can safely ignore `x`) together with runtime access to `x` within a specific application.... [read more](https://docs.python.org/3/library/typing.html#typing.Annotated)
huh, this looks widely useful, for everything that python types can't do yet. but it depends on what your tools support.
i'd want a way to limit a range of a number for example, something like snowflake: range(0,10) & range(20,30), but I don't think that exists, so I'd use annotated for it
also for example to annotate 2 kwargs that must be used together
anyways, to bypass whatever the limitations of python types are
I have a question:
How come when I override a method of an instance I don't automatically get the self parameter anymore? it works when I override on the class
It's like something to do with the method resolution, like it only passes the instance if it's not a direct attribute, but idk what/why
class A:
    def hello(self):
        return self

a = A()
a.hello()  # = a
A.hello = lambda self: self
a.hello()  # = a
a.hello = lambda self: self
a.hello()  # missing 1 required positional argument 'self'
@sacred tinsel pydantic uses a lot of annotations that aren't valid type hints, so it is useful there.
@neon musk every python function has a `__get__` method, which makes it a (non-data) descriptor. Descriptors are only invoked when the attribute is found on the type; a function stored directly on the instance is returned as-is, without binding self. There is a real python article on descriptors, if someone could link it please.
You can also prevent the self binding using staticmethod
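If you do want a per-instance override that still receives the instance, a sketch of two equivalent ways to bind it by hand: `types.MethodType`, or calling the function's own `__get__`.

```python
import types


class A:
    def hello(self):
        return self


a = A()

# Stored on the instance: the descriptor protocol is skipped, so no self is passed.
a.hello = lambda self: self
try:
    a.hello()
    unbound_call_failed = False
except TypeError:
    unbound_call_failed = True

# Bind explicitly so the instance is passed as self again.
a.hello = types.MethodType(lambda self: self, a)
bound_result = a.hello()


def replacement(self):
    return self


# Same effect by invoking the function's descriptor method directly.
b = A()
b.hello = replacement.__get__(b, A)
```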
Is asking about multiple inheritance behavior within the scope of this channel?
If you want to ask about mechanics like MRO, etc. then yeah it probably is in scope.
I think that and mixin techniques might be what I might be asking about in a while. Thank you. I'll probably ask after the deadline since I can probably solve this without multiple inheritance for now.
Thanks! this really helped me figure out the behaviors I saw, I found a good article by Raymond in the python docs
!d asyncio.sleep
```py
coroutine asyncio.sleep(delay, result=None, *, loop=None)
```
Block for *delay* seconds.
If *result* is provided, it is returned to the caller when the coroutine completes.
`sleep()` always suspends the current task, allowing other tasks to run.
Deprecated since version 3.8, will be removed in version 3.10: The *loop* parameter.
Example of coroutine displaying the current date every second for 5 seconds:
```py
import asyncio
import datetime

async def display_date():
    loop = asyncio.get_running_loop()
    end_time = loop.time() + 5.0
    while True:
        print(datetime.datetime.now())
        if (loop.time() + 1.0) >= end_time:
            break
        await asyncio.sleep(1)

asyncio.run(display_date())
```
So
How does the above cmd work internally?
How can it sleep for x seconds without technically sleeping
And reading it further, does asyncio.sleep(0) do anything?
Do ping on reply please
it schedules a callback on the event loop to run after `delay` seconds, then yields control to the event loop. Once that callback runs (or the sleep is cancelled), the event loop schedules the sleeping task again and resumes it
should be a good high level overview of how async works in python
http://www.dabeaz.com/coroutines/ is an excellent talk that introduces you to the principles that coroutines work on. And http://www.dabeaz.com/finalgenerator/ goes on to explain asyncio.
Yeah asyncio.sleep(0) actually does something different. It calls a function, that just yields None, and then just returns.
The source code is here if you want https://github.com/python/cpython/blob/master/Lib/asyncio/tasks.py#L597-L611
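A simplified sketch of that mechanism, not the actual CPython source: create a future, ask the loop to set its result after the delay with `call_later`, and await it. The name `sketch_sleep` is made up, and the `delay == 0` shortcut is left out.

```python
import asyncio


async def sketch_sleep(delay, result=None):
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # Timer callback: after `delay` seconds the loop sets the future's result.
    handle = loop.call_later(delay, future.set_result, result)
    try:
        # Awaiting suspends this task; the loop is free to run other tasks.
        return await future
    finally:
        handle.cancel()  # drop the timer if we were cancelled while waiting


async def main():
    loop = asyncio.get_running_loop()
    start = loop.time()
    value = await sketch_sleep(0.05, "done")
    return value, loop.time() - start


value, elapsed = asyncio.run(main())
```

Nothing blocks the thread here; the task is just parked until the timer callback wakes it up.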
would it be slow to call typing.get_type_hints() every time __setattr__ is called?
Is it expensive to use del?
I would think so
Why not use a descriptor?
Depends on how it's used, a bare name shouldn't take too much
Say I'm deleting a name that is bound to a pandas dataframe containing logicals, basically a binary mask
Basically to make the code look nicer I need to create the mask and then use it, but after I've used it I would like to delete it, because it's just a giant object that's sitting around for no reason
I am working with dataclasses, I want to catch when some fields changes, they could be a descriptor yes
It'd be a lot less overhead that way
__setattr__ is going to catch everything, even stuff u don't care about
only if I manage to declare dataclass fields as properties
my understanding is that a del should be cheap, it just removes the name from the namespace and decrements the object's ref count
but it doesn't guarantee that the memory is released
I don't find myself using it
I would naively think it's better to delete the binary mask data frame instead of leaving it around
Dataframes allegedly use 10x the amount of memory for a given size
So like 5 mb dataset is roughly 50 mb in a dataframe
yea, but del only directly affects the name, not the object
the object can remain in memory
until gc decides to clear it
Well that's shit
I heard del will just decrease the reference counter
So it is a crazy cheap operation
Assuming that's the only reference to the dataframe it would get garbage collected immediately or no?
Well it's gotta be better than not don't it though
maybe
it doesn't collect immediately specifically because that would be expensive
unless you cannot afford to have the object in memory, I wouldn't worry about it, but I also don't have much experience working with massive data
I mean, look this guy cited 202mb csv is over a gig in a dataframe https://link.medium.com/errHvZWh8eb
but do you have a binary mask that takes a gigabyte?
Having these binary masks to slice a dataframe lying around just seems crazy
Well if the raw data set that I make a binary mask of is 1.2gb I'm betting the binary mask, while smaller, is still significant
Don't we all do
CPython's reference counting collector should clean up as soon as something is unreachable. gc is the interface to the optional gc for cyclical references
For attrs it may also call a bunch of methods so it could be expensive there, but in general I'd say that it being cheap is a good assumption if you do have a reason to use it
@swift imp running a quick experiment holding a 1 GB dataframe of int64 dummies, making a binary mask of them with df == value, then calling del df, it does seem to release the memory immediately
sorry for providing false information
however the same experiment reveals that the binary mask is not nearly as large
I also thought it did something different than "just free the memory immediately"
How much smaller
Well, it does: it removes one reference to that object. But, if that reference was the only reference to that object, then CPython immediately destroys that object.
Not all Python implementations use reference counting, so in a non reference counting implementation, removing that reference would just make it eligible to be collected the next time the GC runs.
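A small sketch of that behavior using a weak reference as a probe. The immediate cleanup shown here is CPython-specific, as noted above; `Mask` is just a stand-in for something large like a boolean DataFrame.

```python
import weakref


class Mask:
    """Stand-in for a large object such as a boolean DataFrame."""


mask = Mask()
alias = mask               # a second reference to the same object
probe = weakref.ref(mask)  # a weak reference does not keep the object alive

del mask                   # removes one name; the object is still referenced
alive_after_first_del = probe() is not None

del alias                  # last reference gone: CPython destroys the object now
alive_after_second_del = probe() is not None
```

On an implementation without reference counting, the second check could still report the object as alive until the next collection.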
Yeah
I thought it just marked it as ready to be freed, but actually freed it at some later time
there are implementations where it works that way, like PyPy
About maybe a 10th of the int64 data, but you should benchmark your case
I was just looking at htop 
whats a 3 way tcp handshake
it's the way that TCP connections are negotiated between the client and the server. But it's not on-topic for this channel, which is about the Python language itself. Try #networks
https://docs.hpyproject.org/en/latest/ is this something actually supported by Python itself? are they moving towards this over the traditional C API for CPython?
saw it on the ideas mailing list like a week ago and wasn't sure what to make of it
but it seems like it has its own mailing list (according to the github readme)
It's not yet part of CPython, or necessarily aiming to be - it can be implemented without that. It's indeed aimed to be an alternative to the CPython API, which when fully complete might be moved into there. There's a PyPy branch which has direct support for it though.
Is it true google uses python ? ?
They probably do use it in quite a few places, but it's not really relevant to this channel
Hello.
news_date = dateparser.parse(analysis.date_modified)
analysis.date_modified = self._check_published_date(news_date)
In the second line. mypy gives warning
"Argument 1 to "_check_published_date" has incompatible type "Optional[datetime]"; expected "datetime"".
Since I wrote `analysis`, I'm pretty sure `news_date` is never going to be None. But mypy expects it to be None or datetime. What should I do here? Adding `if news_date is None: continue` seems redundant.
Sorry if this is a wrong place to ask this.
It looks like it's complaining that the argument that gets passed to _check_published_date could be None
Which seems to be true based on this: https://dateparser.readthedocs.io/en/v0.3.4/#dateparser.parse
Oh, I see what you're asking
@terse orchid I believe you can do assert news_date is not None
Yea that suppressed the warning
I don't believe that mypy has any other way to narrow away the Optional, short of typing.cast or a # type: ignore comment
Thank you. I just wanted to see if this could be done without additional code. If this is the wae, it's ok
If you do this, I'm guessing it will now warn about an incorrect annotation?
news_date: datetime = dateparser.parse(...)
guys. people pro at coding can you help me. How do i write an algorithm. How do i convert it to code
Adding the assert, or making a function that does the assert seems the best way to suppress the warning while being clear
Yea. Thank you both guys
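Both options mentioned above could look roughly like this. This is only a sketch: `parse_date`, `check_published_date` and `require` are hypothetical stand-ins (the real code uses dateparser and a method on the poster's class), chosen so the snippet runs with the standard library alone.

```python
from datetime import datetime
from typing import Optional


def parse_date(text: str) -> Optional[datetime]:
    """Hypothetical stand-in for a parser that returns None on failure."""
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        return None


def check_published_date(value: datetime) -> datetime:
    """Hypothetical stand-in for a function that requires a real datetime."""
    return value


def require(value: Optional[datetime]) -> datetime:
    """Helper that does the None check once, so callers stay short."""
    if value is None:
        raise ValueError("expected a datetime, got None")
    return value


# Option 1: an inline assert narrows Optional[datetime] to datetime for mypy.
news_date = parse_date("2020-10-01")
assert news_date is not None
checked = check_published_date(news_date)

# Option 2: a small helper that does the same narrowing.
also_checked = check_published_date(require(parse_date("2020-10-02")))
```

The helper has the side benefit of still raising when Python runs with `-O`, which strips assert statements.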
think this would be a more relevant question in #algos-and-data-structs in the help section
also i think u should be a lil bit more specific with ur question too
That's a very broad question. Simply, sorting a list is an algorithm. You can start with any sorting algorithm to get used to this topic or a programming language
ok the fact that PyPy explicitly supports it is a good sign
An object is also iterable if it has a __getitem__, even without an __iter__. In that case, to iterate through it, Python will try indices starting from 0, and increment until it gets IndexError
"overloading unpacking" is just making your object iterable.
is there a way to overload how the operator functions or does that have to be done in the __getitem__?
which operator? * for unpacking?
ya say I want to unpack obj[:-1] with the operator, id still want obj[-1] to be accessible with __getitem__
If the syntax *expression appears in the function call, expression must evaluate to an iterable. Elements from these iterables are treated as if they were additional positional arguments. For the call f(x1, x2, *y, x3, x4), if y evaluates to a sequence y1, ..., yM, this is equivalent to a call with M+4 positional arguments x1, x2, y1, ..., yM, x3, x4.
so, the only thing you can do that would change the behavior is change what __iter__ returns for your object, as far as I can see.
like numpy arrays?
It seemed like the goal was to make it so that there's an element that is accessible via indexing, but not by unpacking/iteration
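A rough sketch of that goal, assuming a made-up `Record` class: `__iter__` decides what `*` unpacking sees, while `__getitem__` keeps serving every index.

```python
class Record:
    """Hypothetical container: the last element is index-only."""

    def __init__(self, *values):
        self._values = list(values)

    def __getitem__(self, index):
        # Indexing sees every element, including the last one.
        return self._values[index]

    def __iter__(self):
        # Iteration (and therefore * unpacking) skips the last element.
        return iter(self._values[:-1])


def collect(*args):
    return args


rec = Record(1, 2, 3)
unpacked = collect(*rec)   # uses __iter__
last = rec[-1]             # uses __getitem__
print(unpacked, last)
```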
If you define __iter__ = None, the __getitem__ fallback will be skipped.
Whoa, I did not know that.
Somewhat new, it's been allowed for __hash__ but recently it was expanded to be explicitly handled for other methods with fallbacks.
It'd technically work before, you'd just get a TypeError("NoneType isn't callable") instead of a proper one.
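For illustration, a small sketch of the `__iter__ = None` trick described above. `Indexable` and `NotIterable` are invented names.

```python
class Indexable:
    """Iterable only through the old __getitem__ protocol."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


class NotIterable(Indexable):
    # Setting __iter__ to None blocks the __getitem__ fallback.
    __iter__ = None


via_fallback = list(Indexable())

try:
    list(NotIterable())
except TypeError as exc:
    error = exc
else:
    error = None

still_indexable = NotIterable()[1]
print(via_fallback, type(error).__name__, still_indexable)
```

Indexing keeps working on `NotIterable`; only iteration is switched off.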
hey guys, do you know if there exists a possibility to run a script on someone's computer / VM with significant resources (RAM) ?
I need to run code that requires about 20 GBs of RAM to finish, while I have only 8 on my laptop
on google colab there is 12.73 GB RAM available, so not enough sadly, my script crashes at 88%
You'd probably need to rent a VPS for that
You can get a spot instance on AWS for a few hours quite cheaply
Although it isn't the right channel for that
I'm gonna read up on that, thx
and sorry for the wrong channel, which one would be appropriate for that?
Hmm maybe #tools-and-devops or off-topic would work
in dataclasses, if init is False it must be false for all subclasses too?
it seems each dataclass has its own init and won't call superclass's init
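A small sketch of what that observation looks like in practice (`Base` and `Child` are invented for the example). `init=False` applies only to the class it decorates, and a subclass's generated `__init__` assigns all inherited fields itself instead of calling the parent's `__init__`.

```python
from dataclasses import dataclass


@dataclass(init=False)
class Base:
    x: int

    def __init__(self, raw: str) -> None:
        # Hand-written __init__ that converts its argument.
        self.x = int(raw)


@dataclass  # init defaults to True here, regardless of Base
class Child(Base):
    y: int = 0


base = Base("41")        # goes through the hand-written __init__
child = Child(1, 2)      # generated __init__(self, x, y=0)
unconverted = Child("5") # Base.__init__ is never called, so no int() conversion
print(base, child, unconverted)
```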
hey, little question about typing: we can have code like `class Class: def copy(self: C) -> C: ...  # some copying logic`, but what if our class is going to have generic arguments? doing self: Class[T] and -> Class[T] is not going to produce the exact same behavior as before
in case of subclassing, that is
If you want to make a generic type, you can use typing.Generic:
from __future__ import annotations
from typing import Generic, TypeVar
T = TypeVar("T")
class Class(Generic[T]):
    def extract(self) -> T:
        ...
    def copy(self) -> Class[T]:
        ...
ah, I see your question
yeah, it was more about making subclasses work
You want copy to produce the class that self belongs to?
mhm
C = TypeVar("C")
class Class(Generic[T]):
    def copy(self: C) -> C:
        ...
this should work, let me check rq
wait
ah, well
yes
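Spelled out as something runnable, the self-typed `copy` could look like the sketch below. The `bound="Class"` on the type variable, the `value` attribute and the `Sub` subclass are additions for the example, not part of the snippet above.

```python
from __future__ import annotations

from copy import copy as shallow_copy
from typing import Generic, TypeVar

T = TypeVar("T")
# Bound added for this sketch so the checker knows C is some kind of Class.
C = TypeVar("C", bound="Class")


class Class(Generic[T]):
    def __init__(self, value: T) -> None:
        self.value = value

    def extract(self) -> T:
        return self.value

    def copy(self: C) -> C:
        # Returns whatever class self actually belongs to.
        return shallow_copy(self)


class Sub(Class[int]):
    """Hypothetical subclass to show that copy() keeps the subclass type."""


duplicate = Sub(3).copy()
print(type(duplicate).__name__, duplicate.extract())
```

A type checker infers `Sub` for `duplicate`, and the generic parameter `int` comes along because it is fixed by `Sub` itself.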
aight, but what about the case where we need to alter the type variable, as in (Class[T]) -> Class[U]?
That is not possible while also preserving the subclass. That would require higher-kinded polymorphism, where the type constructor is abstracted over.
That's present in Haskell and Scala, but not here
as I assumed at first; oh well, quite unfortunate
well I basically am implementing some kind of an iterator, Iter[T], with a bunch of methods that return Iter[<other thing>]
and why do you want to subclass it?
well, the Iter itself is pretty general, as in, it is the main class of my library; and I thought it would be nice if users could extend the iterator by deriving from it, even more so if the typing worked well
in that case you're out of luck
but oh well, if we can't do that, then it stays as-is, I guess
@cloud crypt you could start a discussion on mypy or pyright's github repo if you think it'd be a worthy addition
yeah, a good idea to hover over current issues
thanks for your input :)
https://github.com/python/typing/issues/548 this seems like the related issue; one idea is to implement a custom mypy plugin, like returns did with their higher-kinded types
@cloud crypt There's an official telegram channel for returns and its siblings, if you want to ask there
alright, thanks!
can contextlib.asynccontextmanager suppress exceptions?
sure, surround the yield with a with contextlib.suppress(SomeException):
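A minimal sketch of that suggestion, with `ValueError` standing in for `SomeException` and `ignoring_value_errors` as an invented name:

```python
import asyncio
import contextlib


@contextlib.asynccontextmanager
async def ignoring_value_errors():
    # The exception raised in the body is re-raised at the yield,
    # where contextlib.suppress swallows it.
    with contextlib.suppress(ValueError):
        yield


async def main() -> str:
    async with ignoring_value_errors():
        raise ValueError("swallowed")
    return "reached"


result = asyncio.run(main())
print(result)
```

Any exception type not listed in `suppress(...)` still propagates out of the `async with` block as usual.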
I asked it wrong, wanted to know if that context manager could let some exception pass silently
is letting an exception pass silently the same as "swallowing" it?
hm yes, I had that here but not caused by that context manager, a task was raising exception on creation and my exception handler was being registered late
daemon.start()
await daemon.wait_exception()
start() spawns some subtasks and wait_exception() is triggered on any exception from those subtasks, but in this setup I was missing exceptions
it worked as expected this way
waiter = daemon.wait_exception()
daemon.start()
await waiter
should not that exception be propagated to start()?
this is how that subtasks are created in start()
def _create_subtask(self, coro: t.Coroutine, name=None):
    task = asyncio.create_task(coro, name=name)
    task.add_done_callback(self._check_subtask_result)
    self._subtasks.append(task)
    return task
it was raised inside that asynccontextmanager
Hi, I have written function decorators before. But can someone tell me how to write a class level decorator?
The purpose is to scrape some data from the function docstrings. And this data is not for documentation but to use during runtime.
I'm sorry if this is not the right place to ask.
Why decorator? That doesn't sound particularly decorator-esque, won't a normal function call suffice? Also, documentation being used as part of code sounds wrong to me.
It's not my idea. My manager wants us to implement that.
Okay, a normal function would work. But how can I achieve the same using a class level decorator?
scroll down a bit and it's basically "it's the same as a function deco but it takes a class as an arg"
but yeah your manager is probably an idiot
I can understand using docstring stuff for debugging or testing but using it for program functionality is pretty ridiculous.
that said, I've made a class decorator that makes an Exception class's default message its docstring
lemme find it
I'm in QE-automation. His idea is to pull the test id's from the docstring at runtime and apply it as decorator during runtime and then invoke the tests using one of those markers.
import functools
def docstring_message(cls):
    """Decorates an exception to make its docstring its default message."""
    cls_init = cls.__init__
    @functools.wraps(cls_init)
    def wrapped_init(self, msg=cls.__doc__):
        cls_init(self, msg)
    cls.__init__ = wrapped_init
    return cls
if you aren't editing __init__ no need for most of it
class decorator is pretty simple
I guess it sorta makes sense in QE but why not just parse stuff in the files without a decorator?
idk much about QE
(I use a subclass of Exception for docstring message defaults now, cleaner)
It does look simple. Thanks.
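For the docstring-scraping use case, a class decorator might look something like this sketch. The `test-id:` line convention, the `test_ids` attribute and the `LoginTests` class are all assumptions made up for the example.

```python
import inspect


def collect_test_ids(cls):
    """Class decorator: gather 'test-id: ...' lines from method docstrings."""
    ids = {}
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        doc = inspect.getdoc(func) or ""
        for line in doc.splitlines():
            if line.lower().startswith("test-id:"):
                ids[name] = line.split(":", 1)[1].strip()
    # Attach the result to the class so it is available at runtime.
    cls.test_ids = ids
    return cls


@collect_test_ids
class LoginTests:
    def test_valid_login(self):
        """Check that a valid user can log in.

        test-id: TC-101
        """

    def test_without_id(self):
        """This one has no id line."""


print(LoginTests.test_ids)
```

The decorator receives the finished class, mutates it, and returns it, which is all a class decorator has to do.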
And i'm not familiar with parsing stuff but i'll do my homework on that topic.
np, good luck
anyone here into UI automation testing? If so please tell me how to make tests run in parallel.
@compact scarab yes selenium with python + pytest.
it won't automatically launch multiple. also headless is causing trouble so we are using xvfb.
i tried pytest-xdist but it messes up the logs
@unkempt rock this is not a general help channel, see #how-to-get-help
When you inherit from multiple classes, no matter what the MROs for the superclasses are, super() always follows the MRO for the class of the instance, right?
Classes don't have MROs, only instances do (or, a class has an MRO only because it itself is an instance of type)
The class you're using super in will have a linearized mro which will satisfy how the whole hierarchy works
Mro is the second third argument of __new__ in your type
How does classes not have mro
Unless I'm confusing bases and mro
You are. MRO is a linearized list of all of the parent classes of an instance. You don't provide that explicitly when constructing a type, though you do provide the direct base classes.
The only sense in which a class has an MRO is the sense in which that class is itself an instance like any other, and therefore has an MRO like any other
Then what's passed in metaclass new
The direct base classes
Is that not what the MRO of an instance is?
No. The MRO is the linearized set of all parent classes, not only direct base classes
Oh fuck
I get it now
Ok
So bases is what you pass when defining the class
Got it
But mro is the full inheritance, like what those bases may inherit from
Given:
class A: ...
class B(A): ...
class C(A): ...
class D(B, C): ...
The MRO of an instance of D would be D, B, C, A, object
In fact the MRO of an instance includes the type of that instance itself, not only base classes
D bases are C and B
Yep.
!e I think I'm explaining this poorly:
class A: ...
class B(A): ...
class C(A): ...
class D(B, C): ...
print(type(D).__mro__)
print(D.__mro__)
print(D().__mro__)
@raven ridge :x: Your eval job has completed with return code 1.
001 | (<class 'type'>, <class 'object'>)
002 | (<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>)
003 | Traceback (most recent call last):
004 | File "<string>", line 8, in <module>
005 | AttributeError: 'D' object has no attribute '__mro__'
Whenever we talk about the MRO of an instance, we mean the __mro__ of that instance's type
An instance that isn't a type doesn't have its own __mro__ at all. The MRO of an instance of D is the __mro__ of the type D
or maybe I should be saying, it doesn't make sense to talk about the MRO of instances at all, and only about types
but regardless, in the context of super and @boreal umbra's original question: there aren't two different MROs that could be being looked up here, there's only one. Your instance has only one MRO, which is type(inst).__mro__
Let me know if that makes things any clearer - or less clear
I think it would have been more correct to say that instances don't have MROs, only types do, and when someone says "in the instance's MRO", they mean it as a shorthand for "in the MRO of the type of the instance"
I think your explanation is clear but it would have helped if you opened with that last paragraph
Yeah. I realized my mistake as I was trying to flesh the explanation out, heh
right, so the MRO of the class in which a given method is defined is never under consideration, only the MRO of the class of the instance.
Right, yes.
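To make that concrete, here is a small sketch using the same A/B/C/D diamond, with a made-up `who` method: the `super()` call inside `B` goes to `C` when the instance is a `D`, and to `A` when the instance is a plain `B`.

```python
class A:
    def who(self):
        return ["A"]


class B(A):
    def who(self):
        # Which class comes "next" depends on type(self).__mro__, not on B's own MRO.
        return ["B"] + super().who()


class C(A):
    def who(self):
        return ["C"] + super().who()


class D(B, C):
    def who(self):
        return ["D"] + super().who()


chain_for_d = D().who()
chain_for_b = B().who()
print(chain_for_d, chain_for_b)
```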
Alright this is not related to python but maths. If we got the velocity starting out from u = 0.3263 what the end positions will be?
So we do have a starting g and after the g is getting lower and lower as it is more close to the end point.
the g multiplies the objects weight so
how am i able to check when the g will be at the lowest point possible
This is a channel for advanced discussion about Python itself. For maths, try off topic channels
Does anyone know where I can find the implementation of open()? I tried the _io module but it seems io.open is slightly different and returns a different object than builtins.open
nvm, found it
How do you folks feel about catching potential mistakes and/or accounting for them? For instance, let's say you have a function that expects an int but someone passes a string number to it. Is it bad practice to cast inputs to int in your function? What about fuzzy matching for misspellings? Should we try to guess what's being input? Or should we hold developers/code writing to a strict standard and leave that sort of thing to user interface?
What inspired this question was me and our API developer spending a whole weekend debugging what should have been a basic function but it was failing because it was expecting an int 0-4 and he was passing "1" to it
We could split the difference and catch then warn, I suppose
generally, I would say a single python type should be interpreted in a single way, or not at all. When it comes to APIs, I would favour not coercing things, but it should be reported early on as a schema error of sorts. If you tell the caller/client what they did wrong, it is fine.
There are exceptions, e.g. dates
Oh god, don't even get me started on dates
What a nightmare
07/08/09 what day is that >.>
One of my courses is having a unit on Python, and the slides say that the worst thing about Python is the bad documentation. Is the documentation still considered to be bad, or could the instructor's knowledge of the docs represent an older state of the docs?
The Tutorial section isn't great IMO
some things aren't phrased very precisely, e.g. coroutine is used to mean "coroutine object" and "coroutine function"
but I wouldn't say that Python has horrible documentation
I don't know that I've ever used a language with more approachable documentation!
The only place where I tend to find the docs lacking is the documentation for the C API. There, I frequently need to check the code to figure out exactly how something behaves
Don't coerce inputs, you had a bad weekend because someone passed the wrong type, imagine the weekends you'll have when something silently gets coerced to an unexpected input.
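A sketch of the "fail early and say why" approach for the int 0-4 case mentioned earlier. `set_level` is a hypothetical function name.

```python
def set_level(level: int) -> int:
    """Accept only a real int in the range 0-4; never coerce."""
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError(
            f"level must be an int, got {type(level).__name__}: {level!r}"
        )
    if not 0 <= level <= 4:
        raise ValueError(f"level must be between 0 and 4, got {level}")
    return level


accepted = set_level(1)

try:
    set_level("1")  # the bug from the story: a string instead of an int
except TypeError as exc:
    complaint = str(exc)

print(accepted, complaint)
```

The error message names both the offending type and the value, which is usually enough to spot a stray `"1"` in a few minutes instead of a weekend.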
a ton of python documentation is hidden away in very verbose highly technical prose, even in cases where examples would be more useful. We do get questions quoting some part of the docs and asking for examples every once in a while
Still the best docs I've read to date, always found very nuanced information in there that's made me pleasantly surprised
ye, I like it, but I can't imagine 12 year old me understanding that
The data model docs are very technical, but the tutorial is very approachable, as are the docs for most stdlib modules
But ah well - I got a PR accepted to improve the Python docs recently. You can too, if there's something you think isn't explained well enough!
The tutorial is the part of the docs that's most approachable to a 12 year old, I think - but it moves too fast to really teach you the language if you don't already know one, I think
yeah, the tutorial does seem targeted at existing programmers
there are other sources providing less technical explanations of various concepts, even if they are sometimes plain wrong or outdated
@raven ridge guess what got accepted?
I wonder if this will get applied to the inspect functions
Pyright currently doesn't detect type narrowing for those
do any of you recommend a python tutorial book thing for more advanced users
Fluent Python
ok
Hey everyone I wanted some help regarding the KivyMD library of python, which is used to make cross-platform applications, specifically regarding the input field (MDTextField) when I compile the app for android using buildozer. Android is detecting the input field as password field and therefore the keyboard is not showing suggestions. This is a known issue, does anyone know any workaround ? thanks in advance
try #user-interfaces
so.... i want an answer to something but i dont know how to phrase the question..
!e
a, b, tuple() = 1, 2, ()
@stable grail :x: Your eval job has completed with return code 1.
001 | File "<string>", line 1
002 | a, b, tuple() = 1, 2, ()
003 | ^
004 | SyntaxError: cannot assign to function call
@stable grail :warning: Your eval job has completed with return code 0.
[No output]
this does not
!e
a, b, () = 1, 2, ()
a, b, () = 1, 2, {}
a, b, () = 1, 2, []
a, b, [] = 1, 2, []
a, b, () = 1, 2, []
a, b, () = 1, 2, []
a, b, [] = 1, 2, {}
a, b, () = 1, 2, {}
a, b, () = 1, 2, {}
@stable grail :warning: Your eval job has completed with return code 0.
[No output]
why does it work, i know its a throwaway
i was doing a, b, _ = 1, 2, () but i forgot to write the _ assignment and wrote () instead
The empty tuple on the RHS is unpacked into the empty target
but i was under the assumption that you could not do () = ()
It's like for () in [(), (), ()]: ... for example
assignment up to this point was, name on the left, value on the right
how early is the time on the east coast? too early to ping ned?
yeah, and thats crazy talk to me
() = () was not allowed back in the day
it would give the expected syntax error
when i do () = () i expect syntax error, not code that runs
https://docs.python.org/3/reference/simple_stmts.html#assignment-statements
The object must be an iterable with the same number of items as there are targets in the target list, and the items are assigned, from left to right, to the corresponding targets.
With () = () you have an empty iterable object and an empty target list
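A short sketch of that rule on a current Python 3: an empty target list accepts any empty iterable and rejects a non-empty one.

```python
# Empty target list, empty iterable: zero items to assign, so nothing happens.
() = ()
[] = ()
() = []

# The same unpacking happens on every pass of a for loop.
count = 0
for () in [(), (), ()]:
    count += 1

# A non-empty iterable does not match an empty target list.
try:
    () = [1]
except ValueError as exc:
    message = str(exc)

print(count, message)
```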
>>> a = []
>>> b = []
>>> c = []
>>> [a, b, c] = [1, 2, 3]
ok. so this would then work
and then (a, b, c) = [1, 2, 3] would work as well
ok.. but still..
Yes, it'd rebind the names to the corresponding items
so the empty value on the left hand side i still dont like that its not syntax error
no matter if its inside a tuple list set or dict
does the assignment statement talk about why it changed from py2 to 3?
>>> a = []
>>> b = []
>>> c = []
>>> (a, b, c) = [1, 2, 3]
this still works in python2
[] = () works in py2 but not () = []
guess that that might be the reason
thanks @peak spoke that dev thread cleared up my question to why
Berker's patch looks good.
It has several virtues:
* the error message is reasonable and clear
* it makes the language more consistent
* it doesn't break any existing code.
* it makes the AST a little simpler and faster
by removing a special case
for anyone wondering
TIL
!e
if 0:
    (_) = (o.o) = (_)
@grave jolt :warning: Your eval job has completed with return code 0.
[No output]
I have a curious challenge. I read a stream of frames from a websocket connection, but I expect upon establishing the connection for the first frame to have a certain format (let's call it "handshake frame"). This I need to enforce. How do I do it efficiently without running an if not handshake_received() on receiving every single frame?
You can try storing a different function to call after you've received the handshake, but I don't think it's going to be much faster than an if
or faster at all
Have you measured how much that if takes? Probably less than 2 us
it's just a boolean flag, right?
Python jumps are not much slower than any other byte code instructions. Keep in mind python doesn't really optimise. And even in C, an if that almost always has the same result is free due to branch prediction
@grave jolt yeah, pretty much just a boolean flag
@autumn nest if you are reading data from the network, that will be your bottleneck. don't worry about the cpu overhead of parsing the data.
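For completeness, the "store a different function after the handshake" idea could be sketched as below. `FrameReader` and the dict-shaped frames are invented for the example, and as noted above it is unlikely to beat a plain boolean check by any measurable amount.

```python
class FrameReader:
    """Hypothetical reader that swaps its handler once the handshake arrives."""

    def __init__(self):
        self.frames = []
        # The first frame is routed to the handshake check.
        self.handle = self._expect_handshake

    def _expect_handshake(self, frame: dict) -> None:
        if frame.get("type") != "handshake":
            raise ValueError(f"expected handshake frame, got {frame!r}")
        # From now on every frame goes straight to the data handler.
        self.handle = self._handle_data

    def _handle_data(self, frame: dict) -> None:
        self.frames.append(frame)


reader = FrameReader()
incoming = [{"type": "handshake"}, {"type": "data", "n": 1}, {"type": "data", "n": 2}]
for frame in incoming:
    reader.handle(frame)

print(reader.frames)
```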
!e
someone in #python-discussion asked about the space differences between sets and tuples, so i started playing around with stuff
from sys import getsizeof
print(getsizeof({}))
print(getsizeof(dict()))
why does this happen?
@feral cedar :white_check_mark: Your eval job has completed with return code 0.
001 | 64
002 | 232
how did you discover this?
im curious to see if the results are consistent across different architectures
!e
import sys
for _ in range(5):
    print(sys.getsizeof({}), sys.getsizeof(dict()))
@boreal umbra :white_check_mark: Your eval job has completed with return code 0.
001 | 64 232
002 | 64 232
003 | 64 232
004 | 64 232
005 | 64 232
got 40 128 on my windows machine
so yeah dict() is still greater
really dont know y
```
>>> a = {}
>>> b = dict()
>>> getsizeof(a)
64
>>> getsizeof(b)
232
>>> a['test'] = 1
>>> b['test'] = 1
>>> getsizeof(a)
232
>>> getsizeof(b)
232
```
interesting
looks like there's an optimization for empty dicts?
In [2]: import sys
In [3]: sys.getsizeof({*()})
Out[3]: 216
In [4]: sys.getsizeof(set())
Out[4]: 216
In [5]: sys.getsizeof({})
Out[5]: 64
In [6]: sys.getsizeof(dict())
Out[6]: 232
why is a Set a weird number of bytes?
what's weird about 216?
Your prediction sounds about right @raven ridge, it's like dict expects to have at least one key-value pair
the minimum size should be 8 elements i think
not a multiple of 2
i assume you meant power. but 232 isn't either
I did some testing and some research.
Dictionaries keep a fixed allocation until they reach a certain threshold, at which point they over-allocate so they don't have to request more memory on every new key-value pair. (This also works in the reverse iirc).
dict() may be assigned extra memory by default because it expects items to be given to itself instead of nothing, so the call for more memory can be delayed.
Yeah that's along the lines of what I was thinking
The only thing I can't really think of is when you would be keeping a dictionary empty, like why wouldn't the literal also be expecting elements?
!e
from timeit import timeit
print(timeit("{}"))
print(timeit("dict()"))
You are not allowed to use that command here. Please use the #bot-commands channel instead.
lol
that's what I was thinking too
literal would be faster @frosty shoal
everything else checks out, like they're the same after adding that first element, but I can't imagine a scenario when you wouldn't be adding anything
yeah that would make the most sense
!e ```py
from sys import getsizeof
print(getsizeof(dict.__new__(dict)))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
232
So it's the tp_new, not the tp_init
In [10]: {None: None}.__sizeof__()
Out[10]: 216
In [11]: dict().__sizeof__()
Out[11]: 216
How interesting lol
it's because it preallocates space
{} yields BUILD_MAP 0, which in turn calls _PyDict_NewPresized(0); that eventually calls PyDict_New(), which returns a dict with keys and values both set to empty defaults. This results in a dict with a size of dict.__basicsize__ + gc_overhead == 64 (gc_overhead is 2 pointers). dict() in turn calls PyDict_New() inside of dict_new. dict_init is then called, which calls dict_update_common(...). This eventually leads to a call of dict_merge(...), which merges the new dictionary with the keyword args from dict(), which in this case is an empty dictionary. But before merging, dict_merge checks if keys/values are initialized, and if they aren't, initializes them to new instances before attempting the merge. *edited to correct some things
So it's more of a side-effect of the implementation?
yea more or less
I thought that was it at first, but it reproduces even if you call dict.__new__(dict) directly, which bypasses dict_init - right?
But https://github.com/python/cpython/blob/master/Objects/dictobject.c#L2431 does look like a bug to me - IIUC, kwds can never be NULL, and this should be short circuiting based on an empty dictionary instead of a NULL dictionary
some of dictobject.c looks like it hasnt been touched in a while
so it is entirely possible that it goes down some unexpected code paths
This may call for a debugger, instead of browsing GitHub from a phone
```
>>> dict.__new__(dict)
dict_new called
{}
>>> dict.__init__({})
dict_init called
>>>
```
@raven ridge i have hooks that work
but neither tp_new or tp_init are called when {} is done
or when dict() is done
it may be pulling from a free list
or my hooks may be wrong
And dict() hits both dict_new and dict_init, right?
it didnt seem to hit either
Er... That's weird...
i think it pulls from the free list first maybe?
```py
# TypeObj and hook are the poster's own helpers, not shown here; only ctypes is stdlib
from ctypes import c_int, c_ssize_t, c_void_p, py_object, pythonapi
dict_struct = TypeObj.from_address(id(dict))
_FuncPtr = type(pythonapi.Py_IncRef)
dict_new = _FuncPtr(dict_struct.tp_new)
dict_init = _FuncPtr(dict_struct.tp_init)
@hook(dict_init, restype=c_int, argtypes=[py_object, c_void_p, c_void_p])
def hooked_dict_init(self, args, kwargs):
    print('dict_init called')
    return dict_init(self, args, kwargs)
@hook(dict_new, restype=py_object, argtypes=[py_object, c_void_p, c_void_p])
def hooked_dict_new(typ, args, kwargs):
    print('dict_new called')
    return dict_new(typ, args, kwargs)
@hook(pythonapi._PyDict_NewPresized, restype=py_object, argtypes=[c_ssize_t])
def hooked__PyDict_NewPresized(size):
    print('_PyDict_NewPresized called')
    return pythonapi._PyDict_NewPresized(size)
```
my current hooks
```
>>> dict()
{}
>>> {}
_PyDict_NewPresized called
{}
>>>
```
it uses my asm_hook.py code that inserts jmps into the assembly
so its not like my hooks arent being called
!e Ah, but now that you mention it, this proves it isn't dict_init that's at fault:
from sys import getsizeof
d = {}
print(getsizeof(d))
dict.__init__(d)
print(getsizeof(d))
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | 64
002 | 64
can anyone explain what it means for a programming language to be turing complete
It means the language is at least as powerful as a Turing machine - a really simple mathematical definition for a computer that is still very powerful. In practice it means that any Turing complete language can solve any computation that any computer can (ignoring memory problems, time constraints etc).
@quiet knot @waxen halo This is a discussion channel about the language itself. See #how-to-get-help and #career-advice, respectively.
So I asked a core dev about this. It's basically what we thought: {} goes through _PyDict_NewPresized which doesn't initialize the hash table for empty dicts. dict() goes through dict_new which does, because the assumption is that dict() is used with keyword arguments far more often than not.
There are contexts where empty dicts are really common - like passing an empty set of kwargs to a function call. It makes sense to have a fast path to create empty dicts. And dict() just doesn't use that fast path, because it's almost always immediately filled in with hash table entries
Ah, and! The reason I couldn't find the difference in the source is that it isn't there anymore in 3.10
https://github.com/python/cpython/commit/db6d9a50cee92c0ded7c5cb87331c5f0b1008698#diff-b08a47ddc5bc20b2e99ac2e5aa199ca24a56b994e7bc64e918513356088c20aeR3407-R3409 makes it so dict_new doesn't initialize the hash table either.
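If you want to see which behaviour your own interpreter has, a quick check along these lines should do. The exact byte counts depend on the Python version and platform, so none are shown here.

```python
import sys

literal = {}
constructed = dict()

# On older CPython versions these two numbers differ; after the commit above they should match.
empty_sizes = (sys.getsizeof(literal), sys.getsizeof(constructed))

# Once each dict holds an entry, both have a real hash table.
literal["test"] = 1
constructed["test"] = 1
filled_sizes = (sys.getsizeof(literal), sys.getsizeof(constructed))

print(sys.version_info[:2], empty_sizes, filled_sizes)
```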
import sys
print(sys.version)
This prints the full version information string. If you only want the python version number, then Bastien Léonard's solution is the best. You might want to examine the full string and see if you need it or portions of it.
huh?
I've been looking at this: https://towardsdatascience.com/how-to-shrink-numpy-scipy-pandas-and-matplotlib-for-your-data-product-4ec8d7e86ee4
It says use C Compiler Flags with Pip, but this is causing all types of failures and causing the build time to SKYROCKET to what seems like infinity.
Many cases it fails outright
@distant quartz This channel is for advanced discussion rather than advanced questions. If you're having trouble with something data science related, try asking in #data-science-and-ml.
print(f'You spent {sum(expenses[:3])} dollars in the first quarter')
vs
print("Expense for first quarter:",expenses[0]+expenses[1]+expenses[2]) # 7150
Which one would have a lower time and space complexity?
@weary minnow you should use the first one regardless. However they're both going to be O(n) for both space and time for n elements in expenses
yeah makes sense
just clarifying (i learnt bigO notation just yesterday) O(n) is the complexity where n is the input size and it increases based on the number of iterations it has to do which is n in this case
!e
import timeit
print(timeit.timeit("""a = [1, 2, 3]; sum(a[:3])"""))
print(timeit.timeit("""a = [1, 2, 3]; a[0] + a[1] + a[2]"""))
@boreal umbra :white_check_mark: Your eval job has completed with return code 0.
001 | 0.49751316802576184
002 | 0.28837722912430763
!e
import timeit
for _ in range(5):
    print(timeit.timeit("""a = [1, 2, 3]; sum(a[:3])"""), timeit.timeit("""a = [1, 2, 3]; a[0] + a[1] + a[2]"""))
@boreal umbra :white_check_mark: Your eval job has completed with return code 0.
001 | 0.4947110181674361 0.287811366841197
002 | 0.49890251411125064 0.2873172180261463
003 | 0.4799330329988152 0.2898636560421437
004 | 0.48030197201296687 0.2990665871184319
005 | 0.4802680320572108 0.28978786594234407
wait am i reading that right? is sum() slower than manually adding up the values?
@weary minnow I'm surprised by these results, but again, I would encourage you to use the less verbose solution.
yes, that's correct. though a[:3] requires copying the list.
wow, because someone here told me that sum() is written in C instead of querying values from the array and adding it up
makes sense
!e
import timeit
print(timeit.timeit("""a = [1, 2, 3]; sum(a)"""))
@boreal umbra :white_check_mark: Your eval job has completed with return code 0.
0.3281325239222497
yeah copying takes time
yeah i know, that's why a[:3] would require copying, which results in a higher complexity and time
it's still a constant amount of added time/space
so it doesn't increase the complexity
ok this is rather surprising LOL turns out i should use the second solution then?
no, you should still use the first one
you shouldn't be trying to squeeze out every last drop of computational efficiency
readability counts
both solutions have the same space and time complexities
@boreal umbra though will the manually adding up elements ever get slower than sum() like for different lengths of the array
keep in mind that these are lists and are not arrays
are you familiar with __add__?
yep my bad
no, i'm not
you can determine what the + operator does for any class that you write
sum just uses whatever the __add__ method does for each element
!e
print(sum([1, 2, 3, 'hahahahaha', 4, 5, 6]))
@boreal umbra :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 1, in <module>
003 | TypeError: unsupported operand type(s) for +: 'int' and 'str'
so it's technically always gonna be as fast except it just looks better to the eye?
so it's still going to be n calls to __add__, no matter how you cut it.
it's always going to be O(n), unless you do some really hacky stuff with the __add__ implementation
so if you destructure it the first and second are basically the exact same
in that sense yes, but then you're doing what the computer would have done for you
except the first one is slower because of copying list elements
hmm makes sense thank you :D
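To see the "n calls to `__add__`" point directly, here is a sketch with a made-up `Vec` class that counts how often its `__add__` runs when passed through `sum`.

```python
class Vec:
    """Hypothetical 2D vector that counts its own additions."""

    add_calls = 0

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        if not isinstance(other, Vec):
            return NotImplemented
        Vec.add_calls += 1
        return Vec(self.x + other.x, self.y + other.y)


vectors = [Vec(1, 2), Vec(3, 4), Vec(5, 6)]
# sum() needs a start value of the right type; the default 0 would not add to a Vec.
total = sum(vectors, Vec(0, 0))
print(total.x, total.y, Vec.add_calls)
```

Three elements mean three `__add__` calls, the same work as writing `start + v[0] + v[1] + v[2]` by hand.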
also 1 more question lol
sure
(sorry i just started data structures and algorithms yesterday)
this might be a very silly question but because the sum() thingy surprised me, just making sure.
expenses = [100, 200, 1000]
print(1000 in expenses)
vs
for value in expenses:
    if value == 1000:
        print('Value In List!')
first one is better right
?
or under the hood do they just do the same thing?
@weary minnow so when the in operator is used in this way, it's calling the __contains__ method. And yes, what it does for lists is along the same lines.
ah...
Sets have a __contains__ method that is O(1) on average
so it's constant?
yes. it uses hashing to do it instead of iteration
you can think of it this way
a list is like a deck of cards; if you want to know if any one card is in the deck, you need to go through them one by one
yes but that would be iteration right?
yes, i did say that under the assumption that you were thinking about a variable amount of elements to sum
oh
but a set is kind of like a physical dictionary; if you're looking for a word, you can just go to the section for the first letter of that word
and if you extend the analogy...the more words there are that start with the same letter, the more "manual" searching you will need to do
(hash collisions)
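To make the deck-of-cards vs dictionary analogy concrete, here is a rough sketch that counts equality comparisons during `in` on a list and on a set. `Probe` is a hypothetical class, and the exact counts are a CPython detail rather than a language guarantee:

```python
class Probe:
    """Hypothetical value type that counts __eq__ calls."""
    eq_calls = 0

    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        Probe.eq_calls += 1
        return isinstance(other, Probe) and self.n == other.n

    def __hash__(self):
        return hash(self.n)


items = [Probe(i) for i in range(1000)]
target = Probe(999)            # equal to the last element, but a different object

Probe.eq_calls = 0
found_in_list = target in items        # walks the list one element at a time
list_comparisons = Probe.eq_calls

as_set = set(items)
Probe.eq_calls = 0
found_in_set = target in as_set        # hashes first, then compares candidates
set_comparisons = Probe.eq_calls

print(list_comparisons, set_comparisons)
```

The list lookup compares against every element until it finds a match, while the set lookup only compares against entries whose hash matches.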
also 1 more question related to bigO
would n^2 be when there is a nested loop (double iteration)? (and also what is the name for this kinda function)?
that would be quadratic time
generally, yes
that is the most common case
but it's only O(n^2) if you're iterating over two lists, the lengths of which are O(n) for the same n
which, if it's the same list, is that
yes
nested loops iterating on the same list
but remember that this is also a specific case
also, to generalise, O(n^k), where k is a constant, is termed polynomial time
though usually you don't see anything above n^3
yes, unless there's some reason why it's always going to break early. though that in itself wouldn't guarantee being below n^2
yes, that is the most common case:
l = [1, 2, 3, 4, 5]
# O(n^2)
for i in l:
    for j in l:
        print(i + j)
thanks!
@weary minnow have you heard of factorial time?
no, i only started yesterday night
would love to know tho
how does += compare to manual var = var + 1?
i reckon the same?
i'm speaking about time and space complexity in this case
in this case += is more efficient because it only has to look up var in the symbol table once. It's one function call and an assignment either way.
but you should pick += because it's easier to refactor if you ever need to change the name of var
var += 1 evaluates to var = var + 1 anyway, right?
it's also easier to parse as a human
var + 1 evaluates to two, whereas var += 1 doesn't evaluate to anything.
I see.
var += 1 is a statement, but it's not an expression.
someone correct me if I've got the terminology wrong.
indeed
indeed I'm wrong?
I'll be the last one to hold spelling against you
!e Here's excellent proof of the fact that a += b works differently from a = a + b:
vals = ([], [])
try:
    vals[0] = vals[0] + ["a", "b", "c"]
except Exception as exc:
    print(f"{exc=}")
print(f"{vals=}")
try:
    vals[0] += ["a", "b", "c"]
except Exception as exc:
    print(f"{exc=}")
print(f"{vals=}")
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | exc=TypeError("'tuple' object does not support item assignment")
002 | vals=([], [])
003 | exc=TypeError("'tuple' object does not support item assignment")
004 | vals=(['a', 'b', 'c'], [])
Ooh, that's cool. Thanks for the example!
so it has the intended effect (ie extending the list) but then raises an exception in another dunder method?
That's unintuitive. I would expect that to be a bug.
vals[0] isn't even a tuple.
"object which you're not even assigning to does not support item assignment"
Oops, it says "item assignment"
Second one works fine though if it's x = vals[0]; x += ...
I think the actual problem is that += is misleading for lists, since it doesn't really need to do an assignment to extend the list.
But it would make sense for an int, for example.
other way around - the dunder method, list.__iadd__, succeeds at extending the list. But __iadd__ is allowed to either modify an object in-place or return a new object (like int.__iadd__ does). So, the return from __iadd__ gets assigned to the assignment target. And that step fails.
so the failure occurs when it tries to do *vals[0] = *vals[0] (to loosely apply pointer syntax to Python--replacing the reference in vals[0] with the value that's already there)
yep.
looool
yeah, to use Python syntax, `vals[0] += ["a", "b", "c"]` is roughly the same as
```py
temp = vals[0]
temp += ["a", "b", "c"]
vals[0] = temp
```
and if you think about
```py
lst = [0, 1, 2]
temp = lst[0]
temp += 5
lst[0] = temp
```
In the tuple of lists case that last assignment looks pointless, and all it does is fail. But in the list of ints case, that last assignment is the thing that actually makes the list contain the new integer instead of the old one.
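A small sketch of the same distinction using object identity, plus the `x = vals[0]; x += ...` workaround mentioned earlier. The variable names are only for illustration:

```python
a = [1, 2]
alias_a = a
a += [3]                      # list.__iadd__ extends in place and returns the same list
same_after_iadd = a is alias_a

b = [1, 2]
alias_b = b
b = b + [3]                   # list.__add__ builds a brand-new list
same_after_add = b is alias_b

n = 5
n += 1                        # ints are immutable, so the name is rebound to a new object

vals = ([], [])
x = vals[0]
x += ["a", "b", "c"]          # no tuple item assignment happens here, so no TypeError

print(same_after_iadd, same_after_add, n, vals)
```

The in-place version keeps the original list object, which is why the tuple still sees the extended list even when the final assignment fails.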
I'm finding mypy kinda awkward to use, something such as: pathlib.Path( _dict['key'] ) where this dictionary contains a path as a str will return Argument 1 to "Path" has incompatible type "Union[Dict[str, str], Sequence[str], None, List[List[str]]]"; expected "Union[str, _PathLike[str]]", I can fix this with :
if isinstance(_dict['key'], str):
    x = pathlib.Path(_dict['key'])
having these everywhere seems pretty ridiculous though, so I must be doing something wrong? I'm not sure what though - and am having difficulty making a reproducible example
ok this isn't exactly the same error, but hopefully it helps the former:
import pathlib
demo = {
    "section": "data/nb_output/section_1",
}
demo["x"] = pathlib.Path(demo["section"])
which returns: check.py:18: error: Incompatible types in assignment (expression has type "Path", target has type "str") [assignment], and I don't really understand this.
@magic python I could very well be wrong, but perhaps it's because mypy infers demo to be dict[str, str] and the assignment makes it dict[str, Union[str, pathlib.Path]], that is, not the same type
What do you get when you typehint demo to demo: dict[str, Union[str, pathlib.Path]] on assignment?
@unkempt rock i don't know where i would type hint this in a simple script
demo: dict[str, Union[str, pathlib.Path]] = {
    "section": "data/nb_output/section_1",
}
Union being typing.Union, and you might need typing.Dict instead of dict if not on 3.9
i'm using future annotations so that's ok - and that runs without error
There you go
thank you, hrm, I don't have just pathlib objects in the other place where the original error was from, so I guess I'll just try and use Any here
What does _dict look like?
ah that fixed it all @unkempt rock, using Any is probably frowned at, but it's a dict that's meant to have any type
maybe i should find all the types and do a Union?
_dict is a dictionary with a lot of metadata in it - so there are floats / lists / dicts / etc in there
yeah it has types:
<class 'list'>
<class 'str'>
<class 'NoneType'>
<class 'float'>
<class 'dict'>
<class 'int'>
<class 'tuple'>
in it - would it be more typical to use Any or Union here?
do the same keys always refer to values of the same type?
they should do I think
The problem you're facing is that mypy tries to prevent you from passing a value that's potentially of an unexpected type. If your dictionary in total allows you to assign values of incompatible types to all keys, that possibility is there.
However, if your dict has a clear type structure, you could use a TypedDict to type specific keys
@wide shuttle hrm ok - i'm pretty new to typing, Any fixes this - but it feels as though I'm basically saying - ignore this ( to mypy )
not sure at what point one just doesn't use a dict and uses a dataclass or something as well really
It basically says "this can be anything", yes, which takes a lot of the type checking away
right, looking at typed dict i don't get the point instead of dataclass / attrs
Dataclasses are a more recent alternative to solve this use case, but there is still a lot of existing code that was written before dataclasses became available, especially in large existing codebases where type hinting and checking has proven to be helpful. Unlike dictionary objects, dataclasses don't directly support JSON serialization, though there is a third-party package that implements
i guess there isn't a point at this stage
The point is that you're still using a dict and get dict objects like you'd expect
oh right, yeah that's a good point
It's just that you now have a way of type hinting individual keys, instead of a general type hint that fits all the values.
If you have a dict typed like this:
my_dict: dict[str, Union[int, str]]
mypy has to assume that every value can be either an int or str.
ok, so i'd just create a typed dict class and build up using that instead of what i have atm
Which means that it should throw an error or warning when you pass such a potentially int/str value into a place that expects an int or str specifically (and not the other)
@wide shuttle i'm going to try typed dict, thanks - if this fails then (for now) i'll throw Any at it and run away
typing is taking a bit of getting used to - for me
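For reference, a minimal sketch of what the TypedDict suggestion could look like. The `Metadata` class and the `threshold` and `tags` keys are invented for illustration. Only the `section` key and its value come from the example above:

```python
import pathlib
from typing import List, TypedDict


class Metadata(TypedDict):
    """Hypothetical per-key typing for a metadata dict."""
    section: str
    threshold: float
    tags: List[str]


meta: Metadata = {
    "section": "data/nb_output/section_1",
    "threshold": 0.5,
    "tags": ["demo"],
}

# A type checker now knows meta["section"] is a str, so no isinstance guard is needed.
path = pathlib.Path(meta["section"])
print(type(meta), path)
```

At runtime `meta` is still a plain `dict`. The per-key types only exist for the type checker, which is the advantage over a single `Union` that has to cover every value.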
hi folks!
i have a question related to ctypes:
When using VirtualAllocEx with ctypes (ctypes.windll.kernel32.VirtualAllocEx) the function returns the memory address corresponding to the allocated memory.
But unfortunately, for x64 processes this returns a negative value, that (i assume) cannot be used "as is":
If i try to write to that "negative address" with WriteProcessMemory, i get a 998 error....
Has anyone dealt with this before?
Thank you in advance!
hey there, I've got a small problem with typing;
basically, I am implementing an object-oriented iterator that allows chaining methods;
and so I have the code as follows:
```python
class Iter(Iterator[T]):
    def flatten(self: Iter[Iterable[V]]) -> Iter[V]:
        ...
```
then, if I call it with
```python
array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
iterator = Iter(array)  # -> Iter[List[int]]
normal = iterator.flatten()  # -> should be Iter[int], but instead
# we get an error saying that "self" argument is invalid, for example:
# Invalid self argument "Iter[List[int]]" to attribute function "flatten" with type "Callable[[Iter[Iterable[V]]], Iter[V]]
```
is there any way to fix this?
though Iter.flatten(iterator) will work, interestingly enough
where do you define V?
V is essentially TypeVar("V")
it does not really matter, does it?
Iter.flatten(Iter(array)) produces expected result, so I am suspecting it is due to some bug in mypy itself
what if your code is the bug
bet, haha
@cloud crypt ah, your Iter is invariant where you likely want it to be covariant
since Iter[List] is not a subtype of Iter[Iterable] in invariant generic classes
```
mypy -c "
>> from typing import *
>> T = TypeVar('T', covariant=True)
>> class Iter(Generic[T],Iterator[T]):
>>     V = TypeVar('V')
>>     def flatten(self: Iter[Iterable[V]]) -> Iter[V]:
>>         ...
>>
>>     def __next__(self): pass
>> Iter[List[int]]().flatten()"
```
works fine
ah, I see
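A runnable sketch of the covariant version, with a `flatten` body filled in. The constructor, the `_it` attribute and the use of `itertools.chain` are assumptions, since the original class body was elided. This only demonstrates the runtime behaviour and has not been run through mypy here:

```python
from __future__ import annotations

from itertools import chain
from typing import Iterable, Iterator, TypeVar

T_co = TypeVar("T_co", covariant=True)   # covariant, as suggested above
V = TypeVar("V")


class Iter(Iterator[T_co]):
    def __init__(self, iterable: Iterable[T_co]) -> None:
        self._it = iter(iterable)        # hypothetical storage attribute

    def __next__(self) -> T_co:
        return next(self._it)

    def flatten(self: Iter[Iterable[V]]) -> Iter[V]:
        # One level of flattening: Iter[Iterable[V]] -> Iter[V]
        return Iter(chain.from_iterable(self))


array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = list(Iter(array).flatten())
print(flat)
```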
are linked lists slower than normal lists?
That's a better question for #algos-and-data-structs - but, generally speaking, yes
cool
btw: nouns don't have speeds. verbs have speeds.
@weary minnow python lists are implemented as dynamic arrays internally, which means that if you ever add more elements to the list than the underlying array can store, it has to create a new array and copy everything over. You don't get to know when this happens, and it's an O(n) operation.
However looking up elements by their index is O(1) because it can do some simple math to figure out where the ith element is in memory.
On the other hand, deleting an element from a python list is O(n) because each subsequent element has to get copied over to the previous memory. This can be especially bad if you're deleting the first element.
Let me know if you're with me so far and I'll explain how linked lists make some tradeoffs.
Or at least the deque class.
the important thing to know with dynamic arrays (and by extension python lists) is that it makes appending O(1) on average. the O(n) resizing that steleercus talked about happens infrequently, and less often as the array gets bigger
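One rough way to see the occasional resizing is to watch `sys.getsizeof` while appending. This is a CPython-specific sketch: the growth pattern is an implementation detail, and other interpreters may not support `getsizeof` at all. The last lines show `collections.deque`, which avoids the O(n) cost of removing from the front:

```python
import sys
from collections import deque

lst = []
resizes = 0
last_size = sys.getsizeof(lst)

for i in range(10_000):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last_size:        # the underlying array was reallocated
        resizes += 1
        last_size = size

print(f"{resizes} resizes for {len(lst)} appends")

queue = deque([1, 2, 3])
first = queue.popleft()          # O(1), unlike list.pop(0), which shifts every element
print(first, list(queue))
```

The number of resizes should be far smaller than the number of appends, which is what makes appending O(1) on average.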
can someone revrrse this? ΓοΏ½1Γ§Εx
I'm not sure what you're asking, but this is strictly a discussion channel. See #how-to-get-help
@bleak steeple recruiting for closed-source projects or paid/business opportunities of any kind is not allowed on this server.
@bleak steeple is it open source?
make sure it's on github before you ask anyone else to participate.
If you'd like to recruit for that open-source project, please do so in #python-discussion or whichever topical channel it most closely relates to. This is strictly a discussion channel about the language.
Hey guys, so I was wondering, Python uses call stacks, right? So each function would have its own call stack, with a pointer to it on the base stack, so are functions stored as a pointer to the call stack?
Each thread has a call stack, not each function. The call stack stores stack frames.
no...
Hm
Isn't a function a verb?(correct me if I'm wrong,)
Sorry for bad grammar I am using my phone
So say you have
```py
a = 5
def func():
    x = 10
    y = 30
```
Won't this end up in the stack looking like
```
0 | 5
-----
1 | 10
2 | 30
```
With the ----- being the separation of the two frames
Maybe
And func points to 1, and executes everything up from that and pops it off when its done
Oh wait, isn't the functions return value pushed to the stack first?
there is no call to func() in that code, so there's only ever one call on the stack - it's the call to __main__, which defines two global variables in its globals() as it runs, a and func
Oh ok
And if I call func?
If I don't call the function, are x and y never on the stack?
no - the stack is a stack of ongoing function calls. While a call to func() is in progress, x and y are on the stack as local variables of the func() call
while no calls to func() are in progress, there's no frames for that function on the stack, and so no local variables for it on the stack, either
and if a function is recursive, there might be multiple frames representing calls to that function all on the stack at once.
Hm, alright
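A small sketch of the "stack of ongoing calls" idea: walk the current frames and count how many belong to a given function. `count_active_frames` and `countdown` are hypothetical helpers, and frame objects are a CPython-flavoured introspection feature:

```python
import inspect


def count_active_frames(name):
    """Count frames on the current call stack whose function is called `name`."""
    frame = inspect.currentframe()
    count = 0
    while frame is not None:
        if frame.f_code.co_name == name:
            count += 1
        frame = frame.f_back
    return count


def countdown(n):
    if n == 0:
        # At the deepest point, every unfinished call still has its own frame.
        return count_active_frames("countdown")
    return countdown(n - 1)


frames_while_idle = count_active_frames("countdown")   # no call in progress
frames_at_bottom = countdown(3)                        # calls for n = 3, 2, 1, 0

print(frames_while_idle, frames_at_bottom)
```

Defining `countdown` puts nothing on the stack. Frames, and the local variables in them, only exist while a call is in progress, and a recursive function has one frame per unfinished call.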
the python runtime stack of execution frames is allocated on the heap, not the program stack
yes, but I don't think that's really relevant to the questions that were being asked.
ah i thought they were asking if python allocates frames on the stack for function calls
"functions" and "verbs" are unrelated concepts, though some people prefer to name functions using verbs.
(I mean I guess you could draw some connections between functions in programming and verbs in certain languages as they pertain to argument structures, though suffice to say that "verb" doesn't have a formal definition in the context of programming. At least not one that I'm aware of.)
It's better suited for #software-architecture than #internals-and-peps, but it's a common rule of thumb that in an OOP design, classes should be nouns (list) and methods should be verbs (append). Mostly. More or less.
Could anyone shed some light as to why __init__ is only called if __new__ returns an instance of its class
I kind of expected it to just be called after __new__ and passed whatever it returned
But that doesn't appear to be the case
So is it just to stop people from doing weird things or is there something else going on?
!e
class Test:
    def __new__(cls):
        return 'test'
    def __init__(self):
        print(self)
Test()
@amber nexus :warning: Your eval job has completed with return code 0.
[No output]
For example here I was expecting it to print test, but doesn't seem to call init unless it's returning an instance of the Test class
Probably because if you return a class that's not a subclass, you'll most likely use Class() which will run that class' init
The object gets made by the new function, so the Test object doesn't even get made in this way.
!e
class Test:
    def __init__(self, word):
        print(word)
Test("test")
@zenith topaz :white_check_mark: Your eval job has completed with return code 0.
test
@amber nexus
I suppose I found it weird because I was unsure where the __init__ was actually being called. I knew of course that __new__ was creating the instance, but then I guess I assumed it just passed whatever it returned to the __init__ of the class. But I suppose it's just calling the init of the instance it's created so long as the instance returned is one of the class.
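A side-by-side sketch of the behaviour described above: `__init__` only runs when `__new__` returns an instance of the class being called. The class names and the `calls` list are just for illustration:

```python
calls = []


class ReturnsStr:
    def __new__(cls):
        return "test"                    # not an instance of ReturnsStr

    def __init__(self):
        calls.append("ReturnsStr.__init__")


class Normal:
    def __new__(cls):
        return super().__new__(cls)      # an instance of Normal

    def __init__(self):
        calls.append("Normal.__init__")


a = ReturnsStr()    # __init__ is skipped, a is just the str
b = Normal()        # __init__ runs as usual

print(repr(a), type(b).__name__, calls)
```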
tbh, __new__ is rarely written.
and i don't know what the use-case is for __new__ returning something other than an instance of the class.
so we're talking about the .001% case of the .01% case
this is interesting, I think i found an example somewhere that did return something else, but if only i could find it
it is exceptionally rare
__new__ is useful for singletons is one case I know, where new will return the previously created instance
there's some weird stuff to do when subclassing numpy arrays
numpy has ways to make arrays take on a subclasses type though
people should stop making singletons
but what about loggers and event loops ?!?
mostly, they should stop making classes that lie. why use __new__ when you can just make one instance?
that's true
even in the Java world, a class that lies seems like a silly trick that has somehow been anointed as an approved technique.
yeah 99.9% of cases where __new__ is used are just abusing syntax

also annoyed cause idk what happened to the original emote, cause i had to manually add it to one of my servers now 
I've abused that pattern by using it in Python and I regret that decision. We have top-level statements that we can use to create a single instance that can be shared, if necessary.
i believe reading somewhere (or watching somewhere, i dont remember) that new is useful if you wanted to subclass immutable builtins
that's another trap. it rarely works out, and there are probably better ways to solve those problems.
These are probably immutable for a reason
the real problem is then you want to perform operations on them. But MyInt + MyInt will produce int, not MyInt.
then you start implementing dunders to make it work, and you've built a whole mess of a thing.
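The `MyInt + MyInt` point in a few lines. `MyInt` and `StickyInt` are throwaway subclasses for illustration, and the second one shows the kind of dunder override you end up writing for every operator:

```python
class MyInt(int):
    pass


plain_result = MyInt(2) + MyInt(3)       # int.__add__ returns a plain int


class StickyInt(int):
    def __add__(self, other):
        # Re-wrap the result so the subclass survives addition.
        return StickyInt(int(self) + int(other))


sticky_result = StickyInt(2) + StickyInt(3)

print(type(plain_result).__name__, type(sticky_result).__name__)
```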
i always think "what if someone else needs to change something" which makes like 80% of the things i could do just a stupid idea
One use is what pathlib does, where Path gives you the appropriate subclass for your OS. But is still the base class in case you manually construct the other subclasses.
@prime estuary why not use a function to do that though? I'm not sure why Path chose that approach.
Well one reason is a function doesn't let you isinstance(), subclass etc.
but at least you get a new object every time, and it is an instance of the class you made.
the function would return the same object, which would do isinstance, etc.
pathlib would tell people: use this function to make them, and then they are instances of Path
I mean you'd need to import BasePath and make_path or whatever.
you'd import Path like you do now. but this is the best use of __new__, it's true.
Bc it's a lot easier with __new__
This way you can kinda ignore the os-specific implementation unless you need one.
yep, this is the rare good use of __new__
Though actually looking at the interpreter it would mean it still calls init like normal.
yes, as it should
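The pathlib behaviour being discussed can be checked directly. This sketch only relies on the public `Path`, `PosixPath` and `WindowsPath` names, and the path string is arbitrary:

```python
import pathlib

p = pathlib.Path("some/dir/file.txt")

# Calling Path() hands back the OS-appropriate subclass...
concrete_type = type(p)

# ...which is still a Path, so isinstance checks keep working.
is_path = isinstance(p, pathlib.Path)

print(concrete_type.__name__, is_path)
```

On Linux and macOS the concrete type is `PosixPath`, and on Windows it is `WindowsPath`.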
At work we use abc's that implement __init_subclass__ with required keyword arguments. The kwargs are used as a key to hash the subclass. At run time we use __new__ (which takes those same required kwargs) to pick the correct subclass based on how they defined it in the class type
There is another use for new, though in the opposite direction - stuff like pickle can use it to skip construction when you know you're throwing that out anyway.
Imagine there's stuff you want to happen during instantiation without having to call super
That would be a way
Like so that anyone that subclasses doesn't have to use super() in init
i can't tell if we agree that __new__ is rare or not?
Depends who you talk to
My colleague at work, is like you, thinks there should just be functions for that stuff
He's also pretty against oop for some reason
I'm of the opinion, that if it's in the data model we should just utilize it
i'm curious: of the classes you write, how often do you write a __new__?
I've written exactly one
Our classes at work are essentially callables that determine how to do step[X] of the simulation post processing. We have like 20 teams whose step implementations may or may not be different depending on the simulation
So we use that simulation meta data as the arguments to __init_subclass__, so teams just subclass the parent class with abcs
And then we use __new__ to pick the correct subclass
Since we use this so much, we made a custom metaclass
And use __prepare__
So I've written, technically, a lot of classes with __new__ but depending on how you look at it, I've only written one technically
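A guess at the general shape of that pattern, heavily simplified: subclasses register themselves through `__init_subclass__` keyword arguments, and the base class's `__new__` picks one. Every name here (`Step`, `simulation`, `FluidStep`, `ThermalStep`) is invented, and the real code described above also involves ABCs and a metaclass that this sketch leaves out:

```python
class Step:
    _registry = {}                        # maps a simulation key to a subclass

    def __init_subclass__(cls, *, simulation, **kwargs):
        super().__init_subclass__(**kwargs)
        Step._registry[simulation] = cls

    def __new__(cls, simulation):
        if cls is Step:                   # only dispatch when the base class is called
            cls = Step._registry[simulation]
        return super().__new__(cls)

    def __init__(self, simulation):
        self.simulation = simulation


class FluidStep(Step, simulation="fluid"):
    pass


class ThermalStep(Step, simulation="thermal"):
    pass


step = Step("fluid")                      # caller never names the subclass
print(type(step).__name__, step.simulation)
```

Because the object returned by `__new__` is still an instance of `Step`, `__init__` runs on it as usual.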
ok, one __new__ out of roughly how many classes?
ok, sounds rare
The way I think of __new__ is that if you have a need or desire to abstract class instantiation, it's what you want
But that's a rare need
ok, we agree
Like metaclasses themselves, it's a tool important to how the object model works, but rarely needs to be messed with directly. But handy to have when you do...
it sounds like we've reached an #internals-and-peps consensus
Every class should have a unique metaclass, but they're not important to the object model.
Sorry, just wanted to be provocative.
the key is to make your statement just a little outside the realm of reasonable. Yours is too outlandish
ot!help
Credits: otiopo (editing commands) tokito (scripting) +CC (Embeds) Amir (sending me memes)
^ we are aware of this selfbot, we will have to keep the message up to report them
@swift imp didn't know y'all like dunder methods that much, tbh they are confusing to me
They can be very useful. Start with simple ones like __str__
!e
import re
email_pattern = re.compile(r'([a-z]+)@([a-z]+\.com)')
match = email_pattern.match('stelercus@gmail.com')
print(list(match))
@boreal umbra :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 4, in <module>
003 | TypeError: 're.Match' object is not iterable
It appears that they did __iter__ = None so that the getitem fallback wouldn't work for Match objects? Why would they do that?
yeah, not sure. maybe some odd edge case?
!e
import re
email_pattern = re.compile(r'([a-z]+)@([a-z]+\.com)')
match = email_pattern.match('stelercus@gmail.com')
print(match.groups())
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
('stelercus', 'gmail.com')
not sure
I guess they wanted one way to get the groups, i.e. match.groups() vs list(match) | tuple(match)
If they were to use the fallback, you'd get 'stelercus@gmail.com', 'stelercus', 'gmail.com' these elements - which is different to what .groups gives back
The 0th index of a match object is the entire match itself
ah, right
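The difference between indexing a match and calling `.groups()`, using the same pattern as above:

```python
import re

email_pattern = re.compile(r'([a-z]+)@([a-z]+\.com)')
match = email_pattern.match('stelercus@gmail.com')

whole = match[0]                 # index 0 is the entire match
groups = match.groups()          # only the capture groups, starting at group 1

# What iterating by index would have produced if Match were iterable:
by_index = [match[i] for i in range(len(groups) + 1)]

print(whole, groups, by_index)
```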
can someone link me the declaration of a PyObject type / how PyObjects work using native types?
i'm not sure what you are looking for. What do you mean by "work using native types"?
well im basically just looking for how the PyObjects work under the hood being the basis of python's dynamic typing
essentially, the first few fields (in the macro PyObject_HEAD) are the same for every object, so you can unsafely cast them in a way that doesn't break.
and the type field means that you can safely cast them - every object has a type field at the same offset, so you can find out what type it is in a generic way, and then you can cast it to that type in order to do any type-specific things.
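A peek at that shared header from Python itself, using ctypes. This is a sketch under strong assumptions: it relies on `id()` being the object's address and on the header being a pointer-sized reference count followed by the type pointer, which holds for standard CPython builds but not for debug builds with extra header fields, free-threaded builds, or other interpreters. `PyObjectHead` is a hand-written mirror of the layout, not an official API:

```python
import ctypes


class PyObjectHead(ctypes.Structure):
    """Assumed layout of the common object header on a standard CPython build."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
    ]


def type_address(obj):
    # Read the header of any object the same way, whatever its actual type.
    return PyObjectHead.from_address(id(obj)).ob_type


samples = [[1, 2, 3], "text", 3.14, {"k": "v"}]
matches = [type_address(obj) == id(type(obj)) for obj in samples]
print(matches)
```

The same read works for a list, a str, a float and a dict because the type pointer sits at the same offset in all of them.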
I've been working on a super Haskell-y combinatoric parser, it's pretty fun
even managed to implement fmap-ish functionality
Why python run its code in a virtual machine?
You mean in the sense that it's an interpreted language?
Using Turtle in Python, draw different n-gons with sides varying from 3 to 12 and
mention the names of the shapes and angles used for drawing it on the output screen.
Can someone plz help me i am new to python?
What?
I'm trying to get you to clarify what you mean by "run code in a virtual machine"
This isn't a help channel. Go to #how-to-get-help
ok srrry
I mean Python Virtual Machine
And what does that refer to? The bytecode interpreter?
Okay but why Python use PVM
instead of what?
if you mean "why does CPython run its code in a virtual machine instead of interpreting it from source", it's because the virtual machine approach is faster. If you mean "why does CPython run its code in a virtual machine instead of compiling it to machine code", it's because the virtual machine is less complex.
Are you trying to ask why it was chosen to be implemented as an interpreted language (rather than e.g. a compiled one)?
Yes
I don't have insights into specific reasons for Python but interpreted languages are more portable and that was probably a factor in the decision.
Python is a highly dynamic language. Ahead-of-time compilation doesn't add much for it, because the language's semantics require huge amounts of flexibility.
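To see what the virtual machine actually runs, the standard `dis` module can list the bytecode CPython compiled for a function. `add_one` is a throwaway example, and the exact opcode names vary between Python versions:

```python
import dis


def add_one(x):
    return x + 1


# The compiled bytecode lives on the function's code object.
raw_bytecode = add_one.__code__.co_code

# dis decodes it into named instructions for the bytecode interpreter.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]

print(len(raw_bytecode), opnames)
```

The source is compiled to this bytecode once, and the interpreter loop then executes the instructions instead of re-parsing the source text.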
