#internals-and-peps
would that actually simplify the implementation?
not sure? the main concern would be about backward incompatibility
backward compatibility. the true enemy of improvement
Well said
Agreed
I think it is better to add class cell to every class function
how do break statements interact with generators? do they work like in loops? can generators catch them and do clean up?
I don't think generators really interact with break at all, if a generator isn't exhausted when the loop is over, the generator just gets discarded anyway
beyond maybe a try...finally statement, that would always guarantee that something's done, whether the generator is exhausted or not
yeah maybe? or i wonder if there is some interaction with StopIteration. Also not even sure why or if a finally block in a generator would get called
okay the finally gets called when the generator is garbage collected which should happen after a break
the generator may eventually get GeneratorExit thrown onto it
but as you say that may happen only during GC which could be at any time
or never if GC happens to be disabled
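A minimal sketch of the behaviour described above: breaking out of the loop does not by itself run the generator's finally block; it runs once the generator is closed (explicitly here, or implicitly whenever the generator is finalized). The names `numbers` and `log` are made up for illustration.

```python
def numbers(log):
    """Yield 0..4 and record when the cleanup code runs."""
    try:
        for i in range(5):
            yield i
    finally:
        # Runs on exhaustion, or when GeneratorExit is raised at the paused yield.
        log.append("cleanup")


log = []
gen = numbers(log)
for n in gen:
    if n == 1:
        break  # the generator is left suspended, not exhausted

# The generator is still referenced, so nothing has cleaned it up yet.
pending = list(log)

# close() raises GeneratorExit inside the generator, which triggers the finally block.
gen.close()
```

Calling `close()` (or wrapping the generator in `contextlib.closing`) makes the cleanup deterministic instead of depending on when, or whether, the garbage collector gets to it.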
Just a question: why did you decide to use beartype?
over something like typeguard
I don't quite get what the point of beartype is to be honest. When it checks a collection type it only checks one random element
I thought it was more than 1
It's not a complete check but it does a random sampling
in the readme it says it only checks exactly one element (if it's not empty)
which seems to be the case from some brief testing
there is no other way to get O(1) runtime checking than to select a fixed number of elements
That's weak
Like what's the point
that is indeed my question
I'm still fairly new to typing, as in it's something I avoided for a very long time. It's just the last few months that I've been really getting into it. And so I don't know about many of the packages out there (beyond MyPy). Even doing a search the only ones I came across were pytype, pyannotate and monkeytype.
I just saw beartype being used in a project that I'm using and decided to try it. But will take a look at typeguard as a possible option for another project.
It's times like these that I wish there was a more complete categorized list than even awesome-python. Lots of decent packages still out there to be discovered.
The idea is that it ensures the time taken to check things is only related to how nested your types are, not the size of input data. Though it only checks one element, if you have an issue it's likely to affect a lot of elements, and occur repeatedly - so over multiple calls it'll catch the issue.
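To make the trade-off concrete, here is a rough sketch of the idea only. It is not beartype's implementation or API; `check_list_sample` is a hypothetical helper that checks one randomly chosen element, so the cost per call does not depend on the list length.

```python
import random


def check_list_sample(values, expected_type, rng=random):
    """Hypothetical O(1) check: test a single randomly chosen element."""
    if not values:
        return True  # nothing to sample from an empty list
    return isinstance(rng.choice(values), expected_type)


# A list where half of the elements have the wrong type.
mixed = [1, "x"] * 50
rng = random.Random(0)

# A single call may miss the problem, but repeated calls are very likely to hit it.
caught = any(not check_list_sample(mixed, int, rng) for _ in range(200))
```

A single call gives no guarantee, which is the objection raised above; the argument for it is only that a bad value that shows up often will be caught after a few calls.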
because more fast benchmarks
honestly that would make so much more sense considering how much magic is behind it
Seems to me making it a keyword would also make it much more difficult to patch, at least from a user dev's perspective. No way to override those without putting a new language on top I think
why not just put class cell in every function defined in class body? it will solve this problem
I think the original idea that methods are "just functions in a class namespace" didn't work out quite well
required a lot of new mechanics
the instance method binding stuff using a descriptor is pretty clever
it does turn out to be pretty elegant (other than super) for most purposes
until you try using a class-based decorator on a method
also, typing this stuff is very hard
You mean a decorator returning a non-function object? It should be relatively straightforward to write a decorator for that class that adds the required dunders. But yeah, I guess it's not consistent with other places where you just need a callable.
huh, does that break? i feel like i've done it and haven't had problems
You'd need to implement __get__ yourself, but it should work fine?
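A small sketch of what implementing `__get__` yourself could look like for a class-based decorator. `CountCalls` and `Greeter` are illustrative names, not from any library.

```python
import functools
import types


class CountCalls:
    """Class-based decorator that also works on methods."""

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)

    def __get__(self, instance, owner=None):
        # Without __get__, attribute access on an instance would return the
        # CountCalls object unbound and `self` would never be passed to func.
        if instance is None:
            return self
        return types.MethodType(self, instance)


class Greeter:
    @CountCalls
    def greet(self, name):
        return f"hello {name}"


g = Greeter()
message = g.greet("world")
```

Plain functions get this binding behaviour for free because they already define `__get__`; an arbitrary callable object does not.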
Can we construct bound-method instances from python?
If yes, does it support any callable object, or it supports only functions and builtin-functions?
f = obj.method; f() f is a bound-method instance. or is that not what you are asking?
I suspect they're asking if it's possible to create one without using the accessor descriptor protocol.
yes, considering the channel, i thought that might be the case, but not sure.
!e
import types
method = types.MethodDescriptorType('arbitrary value')
@boreal umbra :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 2, in <module>
003 | TypeError: cannot create 'method_descriptor' instances
Apologies for the screenshot, but the amount of text in the relevant part of the results exceeds the Discord limit.
I have a mostly working super patch, based on this[0]. I updated it so it works fine when given no args, but injecting it into builtins has been problematic. There's an issue with adding attributes to a super object. Given that the main tests passed and Pytest does its own patching, I'm wondering how concerned I should be about the exceptions that I'm bypassing:
def __setattr__(self, name, value):
    osuper = object.__getattribute__(self, 'osuper')
    desc = object.__getattribute__(self, '_find')(name)
    if hasattr(desc, '__set__'):
        return desc.__set__(osuper.__self__, value)
    try:
        return setattr(osuper, name, value)
    except AttributeError as exc:
        warn(f"{repr(exc)} was ignored")
        pass
no, i mean we can explicitly create bound methods for any callable
it is not constrained to only function and builtin-function-or-method objects, and we can create it without messing with C-code (some classes are not creatable from python) - i was asking about this
>>> f = lambda: None
>>> bm = f.__get__(f,f).__class__; bm
<class 'method'>
>>> bm(print, 'Hello World')()
Hello World
>>> bm(print, 'Hello World')
<bound method print of 'Hello World'>
>>> bm(1, 2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: first argument must be callable
>>> class CallableThing:
...     __call__ = print
...
>>> bm(CallableThing(), 1)(2, 3, sep=',')
1,2,3
code
You can also nest bm. Also, you can use it instead of functools.partial
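A short sketch of both claims, using `types.MethodType` (the same class as `bm` above) and a throwaway `collect` function made up for the example:

```python
import functools
import types


def collect(*args):
    """Return the positional arguments so the binding order is visible."""
    return args


# Binding one leading argument, like functools.partial(collect, "a").
bound_once = types.MethodType(collect, "a")

# Nesting: the outer object passes "b" to the inner one, which prepends "a".
bound_twice = types.MethodType(types.MethodType(collect, "a"), "b")

same_as_partial = functools.partial(collect, "a", "b")
```

Unlike `functools.partial`, each `MethodType` layer binds exactly one positional argument and cannot pre-fill keyword arguments, so it only replaces `partial` in the simplest cases.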
is bm faster than functools.partial?
I'm not sure if these values are correct, but anyway here they are:
```py
# tested code (atm is a timing helper that is not shown in the chat)
import functools
import types

with atm('plain call') as _:
    for _ in _:
        min(1, 2, 3)

a = lambda *args: min(1, *args)
with atm('lambda *args: min(1, *args)') as _:
    for _ in _:
        a(2, 3)

a = types.MethodType(min, 1)
with atm('types.MethodType') as _:
    for _ in _:
        a(2, 3)

a = functools.partial(min, 1)
with atm('functools.partial') as _:
    for _ in _:
        a(2, 3)
```
Results:
```
plain call                    126 ns ± 25 ns [959 ms / 6993843]
lambda *args: min(1, *args)   234 ns ± 30 ns [967 ms / 3942366]
types.MethodType              149 ns ± 22 ns [1.3 s / 8024117]
functools.partial             183 ns ± 14 ns [1.2 s / 6404689]
```
plain call == types.MethodType
functools.partial is a bit slower, because it accepts an arbitrary set of args and kwargs
lambda *args: min(1, *args) is a lot slower, because it is slow
nice :D
>>> dis('f(*(1,2))')
0 0 RESUME 0
1 2 PUSH_NULL
4 LOAD_NAME 0 (f)
6 LOAD_CONST 0 ((1, 2))
8 CALL_FUNCTION_EX 0
10 RETURN_VALUE
>>> dis('f(1,2)')
0 0 RESUME 0
1 2 PUSH_NULL
4 LOAD_NAME 0 (f)
6 LOAD_CONST 0 (1)
8 LOAD_CONST 1 (2)
10 PRECALL 2
14 CALL 2
24 RETURN_VALUE
why it is not optimized?
that specific pattern hasn't been implemented, probably because it's rare enough not to make a difference in practical code
in general I would assume "why isn't this simple bytecode optimization made" is answered with "it didn't seem important enough to go out of our way to do"
ok, thanks
where are these optimizations implemented?
in https://github.com/python/cpython/blob/main/Python/ast_opt.c?
is there a reason that python doesnt do constant inlining?
it does sometimes
what?
wdym by constant inlining
that's constant folding
i think he meant constant folding
^^ my bad
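For reference, a quick way to see the difference between the two terms. This is a sketch that assumes CPython, where the compiler folds constant expressions but does not propagate the value of a variable into later code.

```python
import dis

# Constant folding: `2 + 3` is computed at compile time and stored as 5.
folded = compile("x = 2 + 3", "<example>", "exec")

# No inlining of variables: `num` is still looked up by name in the `if`.
not_inlined = compile("num = 2\nif num == 2: print(2)", "<example>", "exec")
names_loaded = [
    ins.argval
    for ins in dis.get_instructions(not_inlined)
    if ins.opname == "LOAD_NAME"
]
```

`folded.co_consts` contains the already computed `5`, while `names_loaded` still includes `num`, since the compiler cannot assume the name keeps its value.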
Wait no
I mean like
```py
num = 2
if num == 2: print(2)
```
Which should compile to either: print(2) or if 2==2: print(2)
that does require semantic analysis
Yes
(and the inlined version would still assign num)
Mm I guess it is hard for python to implement a proper analysis to see if num is used anywhere else
Yup
in general it seems there's more effort in optimizing actual pain points in execution, like with e.g. attribute accesses using adaptive bytecode
it is not equivalent code
yes, it's optimized code
python doesn't have variable inlining
it's something the core devs disallow for some reason
for these reasons I guess
no, it is wrong code
class NS(dict[str, object]):
    def __setitem__(self, key: str, value: object) -> None:
        if isinstance(value, int):
            super().__setitem__(key, value + 1)
        else:
            super().__setitem__(key, value)
class Meta(type):
    @staticmethod
    def __prepare__(name: str, bases: tuple[type, ...], **kwds: object) -> NS:  # type: ignore[override]
        return NS()
class X(metaclass=Meta):
    x = 1
    if x == 1:
        print('x == 1')
    print(x)
    x = x
    print(x)
output
also it doesnt work if you have non-dict globals mapping (is it even possible without ctypes?)
a bigger reason to stay away is probably debuggability
haha
if you access some var, it disappears and var_ becomes defined
if you assign var, it is assigned to var_
mypy is very angry
Is there a way to join strings without copy on the stdlib?
Like slices without copy with memoryviews
Why does super fail here when only given the class, but not when given class+object or no args?
def __init__(self):
> super(*make_args(self)).__init__(42, 13, more=False)
E TypeError: super() takes no keyword arguments
...
-> super(*make_args(self)).__init__(42, 13, more=False)
...
(Pdb++) make_args(self)
(<class 'tests.test_super_duper.test_duper_init_args.<locals>.InitNoArgs'>,)
(Pdb++) l 227
222
223 class InitArgs:
224 def __init__(self, arg1, arg2, a_kwd=None, **rest):
225 return
226
227 class InitNoArgs(InitArgs):
228 def __init__(self):
229 -> super(*make_args(self)).__init__(42, 13, more=False)
230 return
231
232 assert InitNoArgs()
You can use argument-less form of super in dunder init.
dis('nuts')
>>> from dis import dis
>>> dis('nuts')
0 RESUME 0
1 2 LOAD_NAME 0 (nuts)
4 RETURN_VALUE
omaga
I get that, but I'm finding this error message strange. Based on stuff I've seen explaining super, what I think I should be getting here is an error from InitArgs.__init__ about a missing positional argument. Like this:
(Pdb++) InitArgs.__init__(42, 3, more=False)
*** TypeError: __init__() missing 1 required positional argument: 'arg2'
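One possible reading of the traceback above, sketched with made-up classes `A` and `B`: the one-argument form `super(cls)` returns an unbound super object, and attribute lookups on an unbound super object are not proxied to the parent class. So `.__init__` there is the super object's own initializer, which is what rejects the keyword argument.

```python
class A:
    def __init__(self, x, more=None):
        self.x = x


class B(A):
    pass


unbound = super(B)  # one-argument form: not bound to any instance

raised = False
try:
    # This calls super.__init__ on the super object itself, not A.__init__.
    unbound.__init__(42, more=False)
except TypeError:
    raised = True

# The two-argument form is bound and does proxy to A.__init__.
b = B.__new__(B)
super(B, b).__init__(42, more=False)
```

With the bound form the call reaches `A.__init__` and any argument errors would come from there, which matches the expectation described above.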
Where would I go to start a discussion about stdlib documentation?
I was looking over the logging docs and while there is a lot of great info there
I think it's very very beginner unfriendly
this is true for several sections of the docs, not just logging. maybe the discourse forum? it's just hard because writing docs is hard, so it's a big commitment to fix, and people might be skeptical and/or think you're just complaining.
I don't think the needed fix is that big
They just need to put the actual elements of the basic use case right at the start
what adjustments did you have in mind? note that there's also the "logging howto" doc, which frankly also is not the easiest to read either
It just needs to start with a 20 line code block showing all the elements of typical usage
maybe post somewhere on the discourse forum or maybe one of the mailing lists with a rough outline of your intentions?
recall that i was a big proponent of adding a logger= kwarg to basicConfig which was shot down in favor of dictConfig
There's no code block with logger = logging.getLogger(name) until the advanced section of the tutorial
i think that's considered "advanced", even though it really isn't
Which itself is a small link off the main page that many people will miss
I mean it's not advanced
It's just standard usage
it seems to be a split in opinion. at least some people seem to think that it's advanced 🤷‍♂️ the docs go to some pains to avoid bringing it up until that section
Instead the first code example is like logging.warning(...) Which is really not great because you shouldn't actually write that almost ever
I mean those people should present some really good arguments then
Afaics they're just wrong
i agree with you. but the people who don't agree might also be the people who need to approve your PR
Yeah, I mean I guess I see what they'll say. Is there any prior art?
like, a paper trail of emails or github discussion of someone who tried to make changes there before
# myapp.py
import logging
import mylib
logger = logging.getLogger(__name__)
def main():
    logging.basicConfig(filename='myapp.log', level=logging.INFO)
    logger.info('Started')
    mylib.do_something()
    logger.info('Finished')
if __name__ == '__main__':
    main()
# mylib.py
import logging
logger = logging.getLogger(__name__)
def do_something():
    logger.info('Doing something')
I feel like this sums up, effectively, 100% of the basic use case of logger. And even working in a pretty substantial project this still covers like 98% of my usage of it.
Beginners need to know:
- to create a package level logger
- to log messages to that logger using logger methods .warning, .info, .debug, etc
- to call basicConfig somewhere in main to setup log level, where the logs go (filename) etc
Just 3 things, something that one can absorb from the < 20 lines of code above, and even a beginner can easily write logging in a way that they will not regret a couple months later
this is the closest thing to a discussion I can find
it's considered advanced because that's typical for library code, and most people aren't writing libraries. The "basic" stuff is for people writing applications, and they can get away with only ever using the root logger.
sure, but then you end up with people writing their first library and not even realizing that they are in an "advanced" use case
I mean not really, even for decent sized application code, you shouldn't be writing that
also true
it's literally one extra line, there's no reason to ever be writing logging.warning to be completely honest
indeed. It is a bit tricky that the typical library code doesn't match the typical, simple application code.
it would be a net positive if those functions didn't exist
i think in general splitting docs into "beginner" and "advanced" isn't a useful distinction, at least not without a detailed table of contents for both categories that you can browse
it might also be nice if LoggerAdapter inherited from Logger, and if the entire logging mechanism didn't operate by dumping arbitrary instance attributes into an object then letting users extract arbitrary instance attributes from the same object in a %-formatted string
I think that in the case of logging it's actually pretty easy to draw the line of what to show "up front"
the code block I gave before is a complete use case. It works well for applications, it works well for libraries, etc
the logging config of course might not be sophisticated enough for one person, but there's nothing wrong with it in principle
what's the point of using a non-root logger in the examples you gave above?
audit trail basically
over a dozen+ modules and 1000s of lines of code you really want to know what generated which log message
the point is that you're doing it correctly so that later on, if you want to get into advanced features, you can, and it's not an issue
that's what I mean by there not being anything wrong with the code above.
in what sense? No logger other than the root logger is ever configured, and the formatter is never configured to include the logger name - right?
in a typical file that needs to do some logging but not otherwise interact with the logger, that's literally all that you need to know
I'm confused: if we can show people how to use the library completely correctly, for the basic use case, in 20 lines, why wouldn't we want to do that at the very start?
the formatter is never configured to include the logger name
the formatter usually is configured to include the logger name, at least that's how i do it. that's why it's useful to use loggers named __name__. in addition to a logging hierarchy "for free", you also get the fully-qualified module name mapping 1:1 with the logger name, which ends up in your log messages, and would be literally unobtainable otherwise.
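A small sketch of what including the logger name looks like. The logger name "myapp.mylib" stands in for what `getLogger(__name__)` would produce inside a module of that name, and the output goes to an in-memory stream only to keep the example self-contained.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
# %(name)s is the logger's name, i.e. the module path when __name__ is used.
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("myapp.mylib")  # stand-in for getLogger(__name__)
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example from touching the root logger

logger.info("Doing something")
output = stream.getvalue()
```

For what it's worth, the format string used here is the same as `logging.BASIC_FORMAT`, the format `basicConfig()` falls back to when none is given, so the logger name does appear in output from a bare `basicConfig()` call.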
I literally just had a talk with multiple people in a row that were discussing these docs being confusing, and how they used loguru instead
And the big difference is basically just that with loguru, they simplify the "correct" thing by one line, and then show the correct thing at the very start of their docs
the formatter usually is configured to include the logger name
But it isn't in the code snippets quicknir proposed, right?
doesn't loguru just dump everything into the root logger though? or do i misunderstand how it works?
it might be in the default format? i don't recall. but it can easily be in that snippet, a bit of custom formatting is not advanced imo.
loguru just has a single global logger, and makes more use of things like filters on handlers to handle some of the use cases of logging.
I don't like it, I honestly think that the author of loguru doesn't understand hierarchical logging and that's half the reason why it exists.
that's been my impression as well, but i hesitate to say it because it seems outlandish
I believe the default format doesn't include the logger name.
but like, the whole advantage of hierarchical logging is that doing the correct thing is very easy
so why not show the correct thing?
I think I literally learned the logger = logging.getLogger(__name__) dance from a blog post
frankly i got the same impression of structlog. i struggled with it for a couple of weeks until i ripped it out of my application and replaced it with a custom formatter + extra= and some logging adapters
Because in the logging docs, it's buried away.
i don't remember where i first saw the __name__ trick, it might be in the "howto"
it's in the advanced section of the tutorial
the tutorial itself is a small link that I think I basically ignored when I first read the logging docs
fwiw the python tutorial itself is an absolute mess w/ respect to the issues you're raising
IMHO the code I gave above should literally be the first thing on that page, preceded by a couple of sentences
users can start with that and then pick up most things they need afterwards
this is a bigger job than you're giving it credit for, i think
writing good docs is really goddamn slow and hard
Well, I'm not saying that I know how to fix all the logging docs, I'm not the person for that job anyway
I'm just saying, this change, which amounts to adding about 30 lines of text at the start of logging documentation, would make it much more beginner friendly
well, if you do take this to Discourse, I'd advise you to keep Chesterton's Fence in mind. You're likely to cause yourself unnecessary conflict if you make arguments like "no one should ever use logging.info()". But I'm sure you could make some ground with "this should be organized differently" or "examples that use getLogger() should be more prominent"
that's why i suggested coming up with a rough outline of your proposed changes first
I will have to google chesterton's fence
not just "this needs to be better", but an outline of what you intend to change and how
do you think there's actually an argument for logging.info?
sure. It's simple, works in the simplest case, and automatically handles configuring a stream handler if no root handler had been configured already.
there's lots that you can't, or shouldn't, do with it - but there's plenty of applications where it's sufficient, and it's easier to get started with.
I think it's good for libraries to cater to a variety of different skill levels and a variety of different degrees of "productionization"
but having API that caters to the one file script level, by allowing users to save one line of code, in order to do something that just isn't very good for even a small python project of a few dozen files
seems like a very poor choice to me
but yes, your advice is good, there's no reason to get into an argument about that
People who've never written a package have a poor understanding of __name__ - people's eyes definitely glaze over when they see that, and they definitely find it confusing.
I don't know if that's a good enough argument for the convenience functions that use the root logger or not, but - yeah, it's an argument you probably don't want to step into.
yeah.
i mean the nice thing IMHO is that people don't have to understand __name__, it's literally a piece of code they copy and paste.
They don't understand how logging.warning() works either
but without understanding what __name__ does/is, they won't understand why that magic line of code is useful.
it's funny how python's abstraction facilities work pretty well, but still force you to write that one line logger = ... and that somehow becomes an issue
I don't think they need to understand why. You just learn the correct thing so that you're doing the correct thing, because it's just one extra line to do it correctly.
Later, when you are like "oh, I need my logger to do such and such" you read the docs more and rejoice that you're already using your loggers correctly and all doors are open to you
again this is really a huge part of the point of the design of hierarchical logging: you get (in python) 99% of the convenience of globals, and much of the flexibility of explicitly passing around loggers to classes/packages/etc
there was some discussion on the mailing lists a year or so ago that proposed adding a new builtin called __main__ that was set to __name__ == "__main__", so that people could write if __main__: instead of if __name__ == "__main__":
Didn't wind up happening, but as I recall there were a decent number of supporters, exactly because of how poorly understood __name__ is by beginners.
I would jokingly be in favor just because I've made typos while writing out that boilerplate
i'd be in favor of def __main__(argv):
i also wished that getLogger() (maybe using some C extension magic?) would default to getLogger(__name__), and that you'd have to write '' or None for the root logger specifically. that would solve a lot of problems if it existed.
you could make something that uses inspect to find the calling module, but it'd be backwards-incompatible, so it ain't gonna happen
and it doesn't work when the calling code isn't a Python frame.
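A rough sketch of the inspect-based idea, purely as an illustration. `get_module_logger` is a hypothetical helper, not part of the logging module, and it shows the limitation just mentioned: it relies on the caller being a Python frame.

```python
import inspect
import logging


def get_module_logger():
    """Hypothetical helper: return a logger named after the calling module."""
    frame = inspect.currentframe()
    caller = frame.f_back if frame is not None else None
    # Falls back to the root logger (name=None) when no Python caller is found.
    name = caller.f_globals.get("__name__") if caller is not None else None
    return logging.getLogger(name)


# Simulate a call made from inside a module called "fake.module".
namespace = {"__name__": "fake.module", "get_module_logger": get_module_logger}
exec("log = get_module_logger()", namespace)
found_name = namespace["log"].name
```

This is the kind of magic the discussion is wary of: it is one line shorter at the call site, but behaves differently from `logging.getLogger()` with no arguments, which returns the root logger.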
I mean I like languages that have good ways to support a way to do hierarchical logging without creating an object explicitly. But python doesn't really so I'd rather be explicit than use magic
It would be ideal if one could do logging.info and it was handled hierarchically rather than globally
I actually realized there's a nice parallel here that perfectly sums up how I feel about logging.info
It's appropriate in about the same situations where it's appropriate to just dump your code at top level, without an if name == main guard
We tell folks to write their scripts that way even though for a one file script you don't really need to. Simply because it's a good habit, clearly the way to go for even slightly larger projects (not necessarily libraries)
And it only takes one line, even if that line does have the scary dunder name in it
I think logger = ... Is pretty much in the same position and should be encouraged to the same degree
if '__name__' == '__main__':
And regarded as "advanced" to the same degree (not at all)
That hurts... Surely not because it's happened to me
i think i've also done 'main'
I give people the opposite advice, actually. I think you should pretty much never use if __name__ == "__main__":
I definitely don't think it's a good practice or something that should be encouraged. The point of it is to allow different behavior when a file is run as a script than when it is imported as a module. I don't think you should put both of those things in the same file, generally. It's a rarely needed thing to have a module that is both designed to be imported as a library and to be run as a main script, and if you do need it you'd usually be better served by making a package and using a separate __init__.py and __main__.py
hm
lmfao
i have a file where i protected the entire thing with an if main
because if i don't do that it would kill the interpreter if i ever accidently import it :^)
so what replaces imghdr?
An opinion you're entitled to for sure but not one I agree with, and not one the stdlib documentation seems to agree with
Many of the examples use main guards, including examples in logging
textwrap
~ python -m textwrap
Hello there.
This is indented.
asyncio no wait, that's a module
Fwiw I write many many small scripts, which might also be used programmatically from another script so main guards work fine for that, definitely adding another level of nesting doesn't seem like it would buy me anything
just curious, what should be the determining factor between whether I try to start a discussion on the discourse:
https://discuss.python.org/c/documentation/26
or the github
https://github.com/python/cpython/issues
the github should only be used for issues and pull requests iirc
Issues are for more concrete proposals, discuss is for more open-ended discussions
hmm I guess I have a concrete idea of what I want to do (make changes to logging documentation), but I suspect that it will lead to broader discussion
but seems like still a better fit for github though, since I at least have a pretty good idea of what should change and why? Then obviously we'll see what other folks think
I guess this is about the long discussion above? I only skimmed it but yeah I think starting with an issue is fine
yeah, the long discussion above. btw if you have any thoughts, would be happy to hear them, sure that it would help the quality of the initial issue I open
Btw @raven ridge @paper echo thank you, that was very helpful
i feel like people just say to put that in your code without any explanation as to what it actually does as with most idioms told to beginners, and half the time imo it's not even needed like if you're never importing that file i don't see why you'd include that idiom
because it's super valuable to teach people the right thing when they're thinking about that thing
people think about how they should write a script, when they are learning to write a script
if you tell them just to put their code at top level, there most likely isn't going to be any light going off in their head when they then decide they need to import that file to use a function that was defined there
They're just going to import it, and have bugs they don't understand.
the if name == main idiom literally just adds one line of boilerplate; a single line of boilerplate is so low cost that it seems best to just teach beginners that from day 1.
And they just avoid having any surprises later at all. Realistically there just isn't much of a downside. They can always learn exactly what's going on with that later on.
Really the situation has a ton of parallels with logger = logging.getLogger(__name__)
people learn logging.info("...") when they visit the logging page, and then they just keep doing that as their project grows and grows, and it pretty quickly becomes a problem; they're not going to stop and say "oh, I better switch to logger = ...; logger.info("...")" because to do that they'd need to understand logging, which they don't
you're trying to frame it as "the right thing" for writing scripts, but it's not - if __name__ == "__main__": is totally unnecessary for writing scripts (as opposed to importable modules). It's also unnecessary when writing importable modules (as opposed to scripts). It's a pattern for allowing you to create something that is both importable as a library and runnable as a script. That's a pretty rare use case, and it's not one that beginners ought to have thrown in their face on day 1 - in fact, the entire idea that you can split your code across multiple files isn't something that people learning their first programming language usually learn until months in.
And, in the few cases where it's reasonable to have something that is both importable as a library and runnable as a script, almost always it will have been designed as a library from day 1, and having the ability to run it as your Python entry point is an afterthought or a minor convenience. And again, beginners don't start off thinking about how to organize their code as a library usable from other modules.
on the other hand, it basically costs nothing to do it, and it has a lot of value for organizing your code even if you don't strictly need it for any functional purpose
this is more getting into #pedagogy , but i also hate the attitude of "ignore what these things mean, just use it"
it costs even less to not do it, though.
it's really not that complicated to explain what __name__ == '__main__' does at a high level
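For example, a minimal sketch of the idiom (the function names are made up): `__name__` is "__main__" only when the file is the one being run, so the guarded call is skipped when the file is imported for its functions.

```python
def double(x):
    """Reusable part: safe to import from another script."""
    return 2 * x


def main():
    # Script part: only meant to run when this file is executed directly.
    print(double(21))


if __name__ == "__main__":
    # True for `python thisfile.py`; False when another module imports this file,
    # because __name__ is then the module's own name.
    main()
```

Without the guard, the `main()` call at the bottom would also run on import, which is the surprise described earlier in the discussion.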
also true. i'm not the type to insist on it for small scripts
however i think it's useful to introduce it in a course at some point
for me, and I'm guessing for a lot of people writing python professionally, I don't think this is a particularly rare pattern
however i think it's useful to introduce it in a course at some point
If I had the choice between answering "what does this if __name__ == '__main__': thing that I keep seeing do?" from a beginner, or answering "how do I import this function without the print() calls below running?", I'd much rather answer the latter. It's a concrete question that shows that they understand importing, understand scripts, and are looking for a solution to importing part of a script without running the rest - which lays the groundwork for explaining __name__
I'm not against teaching it - you absolutely do need to learn it at some point on your journey, since it is a common pattern. I just don't think it should be taught as the default. It's a solution to a particular problem, and it should be taught in the context of the problem that it solves.
I think tbh this is a very principled and not very practical objection to this pattern.
in reality, people end up writing code that has both scripts and importable functions... pretty often, in my experience.
not teaching people this pattern early and as the default would just result in worse code in the wild
which in the end, is what I care about
I think we'd legitimately have better code in the wild if we got rid of the if __name__ == "__main__": idiom entirely, and encouraged people who need it to switch from a foo.py to a foo/__init__.py and a foo/__main__.py - it solves exactly the same problem, more elegantly, in a way that keeps the imports required by the script portion separate from the imports required by the library portion without relying on tricks like conditional imports.
and I think it's easier to explain.
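To show what that layout amounts to, here is a self-contained sketch that writes a tiny `foo` package to a temporary directory and runs it with `python -m foo`. The package contents and the `build_and_run` helper are invented for the example.

```python
import pathlib
import subprocess
import sys
import tempfile


def build_and_run():
    """Create foo/__init__.py and foo/__main__.py, then run `python -m foo`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = pathlib.Path(tmp) / "foo"
        pkg.mkdir()
        # The importable part of the code.
        (pkg / "__init__.py").write_text(
            "def greet():\n    return 'hello from foo'\n"
        )
        # The runnable part: executed by `python -m foo`, never on `import foo`.
        (pkg / "__main__.py").write_text(
            "from foo import greet\nprint(greet())\n"
        )
        result = subprocess.run(
            [sys.executable, "-m", "foo"],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


output = build_and_run()
```

Imports needed only by the script side stay in `__main__.py`, which is the import hygiene argument; the cost is a directory and two files where the guard needs one line.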
I mean as someone who often has a directory with e.g. a dozen files that can be run as scripts, and also as functions... I don't see anything more elegant about turning every one of those files into a directory with two files inside of it.
And that "more elegant" is entirely an aesthetic thing.
Teaching foo/init + foo/main is far more cumbersome, and the reality is that nobody does it, and nobody's going to do it.
If we don't teach if name == main, what will happen instead in a huge fraction of cases is not a directory with two files inside of it but just top level code
And that "more elegant" is entirely an aesthetic thing.
It's not - as I said, it provides organizational advantages as well, like import hygiene.
Perhaps - but that's fine, too. If we didn't have the if __name__ == "__main__" idiom, people would be forced to factor the library-ish part and the script-ish part into separate files, and that would be... totally fine.
The organizational advantages are very much a trade-off with the extra nesting, and complexity. You're replacing a single file with 3 entities.
this single file is often just 100 lines in my case.
but they wouldn't? People wouldn't magically start doing that just because we don't tell them about if name == main.
I'm honestly confused how somebody could think that. You're operating under some kind of constraint like "people's code will be correct, if they don't know if name == main, they have to make it correct another way"
i wrote a lot of python code in between where I learned about if name == main, and where I learned enough about python packages to understand foo/main and foo/init
that code would just be a mess if we stopped teaching name == main. And even now, knowing both techniques I still use name == main extensively, and many experienced python devs do.
It's a very short, and decent technique that people still use even with a lot of experience; there's nothing really wrong with it. It may not be the best ultimate choice in the limit of scalability, but that's okay.
"magically"? No, someone would have to teach them, of course.
well, yes - if the code doesn't work, the coder will keep at it until they get something that does.
that's the point, teaching them foo/init and foo/main is definitely more involved than teaching name == main. And it's also a directory and two files, people may well just skip over the technique if they see that.
name==main is a one liner so people are generally willing to just paste it into their script
along the way, potentially creating a worse mess, and wasting lots of time...
or, you know, they could just do if name == main and everything would be fine?
there's nothing terrible about it; your preference for foo/main + foo/init is a preference, and it's a fairly 🤷‍♂️ kind of situation
I'm pretty sure it isn't. I think teaching what __name__ means, and teaching the "__main__" magic value, is harder than explaining foo/__init__.py ("the importable part of your code") and foo/__main__.py ("the runnable part of your code")
but you don't need to teach that
the python stdlib explains all this using like 10 lines of code and two paragraphs
sure. You can just say "here's this magic incantation that can do this thing that you may one day need to do". I just don't like that, as a pedagogical technique.
i mean if your preference is people writing broken code until they hit the problems that force them to learn a more elaborate way to do it, then ok, I guess
I mean - isn't that the entire way that people learn to code?
we're supposed to try to teach people things that let them avoid headaches and make coding more convenient, and make it easier to learn things in pieces...
by doing if name == main, people can quickly have python code that behaves correctly as both an import, and a script, which is incredibly useful and practical
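For reference, a minimal sketch of the single-file pattern under discussion. The names `count_words` and `main` (and the file name in the comment) are made up for illustration; the only fixed part is the `__name__` check.

```python
import sys


def count_words(text: str) -> int:
    """Library part: importable and testable without running the script."""
    return len(text.split())


def main(argv: list[str]) -> int:
    """Script part: glue between the command line and the library function."""
    print(count_words(" ".join(argv)))
    return 0  # conventional "success" exit status


# True only when the file is run directly (e.g. `python wordcount.py some words`),
# not when another module imports it.
if __name__ == "__main__":
    main(sys.argv[1:])
```

In the package layout argued for above, `count_words` would live in foo/__init__.py and the body of the guard in foo/__main__.py, run with `python -m foo`.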
sure, that's a reasonable way to approach it. however i find that "trust me just do it" is hard to get students to accept. a 5-minute aside in lecture that at least demonstrates that there's some logic to it, is worth the time imo. so if you're already telling students to do it, you might as well at least give them a taste of why it works, so it doesn't feel like a magic incantation.
pedagogically and yes, even professionally
and for me, it's definitely less overhead to add a single line to a script, than to start creating a directory and two files, and then have those two files start to import things from each other
like, much much less
i also usually start with if __name__ == '__main__' first, then upgrade from module to package later if needed
yep, me too
honestly that's the case for almost everyone I know
folks are aware of foo/main and foo/init but that's usually something you "upgrade" to if things get bigger.
Hy has a defmain macro that generates code something like this:
def _main(*argv):
    ...
if __name__ == '__main__':
    sys.exit(_main(*sys.argv))
in hy that's just:
(require hyrule.control [defmain])
(defmain [argv]
  ...)
that's pretty similar to what I do, I just don't think I bother passing argv
also, why not just def main out of curiosity?
also, def main() -> int:
idk, that's just how they did it
you can write defmain [] too and it will ignore the argv for you
a draft of the github issue, if folks (esp salt/godly) are interested: https://paste.pythondiscord.com/olugeqetin
makes me wish paste had automatic line wrapping
but I like it
The browser wraps the raw version - https://paste.pythondiscord.com/raw/olugeqetin
hy rule, I'm dad
!e @feral island sorry for the ping but is optimizing {*()} to just BUILD_SET 0 (aka getting rid of the empty tuple, maybe in the AST optimizer) feasible here
```py
from dis import dis
dis("{*()}")
```
@rose schooner :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 0 0 RESUME 0
002 |
003 | 1 2 BUILD_SET 0
004 | 4 LOAD_CONST 0 (())
005 | 6 SET_UPDATE 1
006 | 8 RETURN_VALUE
I think it's feasible. No guarantees you'll be able to convince someone it's worth the implementation effort
ok i'll probably try to implement that in a PR after school
I suppose you could generalize it to {*(1, 2)}
ok
that compiles similarly to a SET_UPDATE
but the main argument against is going to be that nobody writes code like that
so my problem with it is that (constants...,) gets optimized to a constant itself
and i'm not sure how i'd convert that into set elements
so i may just check for the empty tuple with kind == Constant_kind and Py_SIZE(constant) == 0
maybe you could do this optimization before constant folding?
(caveat that I know very little about the AST optimizer)
https://github.com/python/cpython/issues/98731
@feral island @raven ridge @paper echo if you're interested
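For anyone unfamiliar with the expression: `{*()}` unpacks an empty tuple into a set display, so it evaluates to an empty set without looking up the name `set`. A small sketch to inspect what the running interpreter emits for it; the exact opcode sequence is a CPython implementation detail and may differ between versions.

```python
import dis

empty = {*()}  # unpack an empty tuple into a set display
print(type(empty).__name__, len(empty))

# Collect the opcode names the current interpreter emits for the expression.
# No particular sequence is assumed here, since it varies by version.
opnames = [ins.opname for ins in dis.get_instructions("{*()}")]
print(opnames)
```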
was anything else planned for the concurrent package or was it just some future proofing when futures was created?
I believe there was an expectation that something would be added, but there may never have been any concrete plans
what could be that something? I think for example asyncio could've fit there going just by the name, but then asyncio and futures that was there already don't exactly feel like they'd fall under the same package
Where would be appropriate to ask a PyPI-related question? I'm trying to publish a package, but get an error pointing here[0] that I'd like more information on
you can try here or in a help channel, but there's also a pypa discord (https://discord.com/invite/pypa) where a lot of the packaging people hang out
OK I'll start here, particularly since it's related to what you've been helping with. I made super-duper to help with the super issue, but it's violating PyPI's naming policies. My main question is, is there a way to determine if a name will cause issue before publishing? And - I'm not very hopeful but - is there some kind of process to potentially get an exception?
um, I didn't even know there is a naming policy. what is the violation?
This is the message:
Publishing super-duper (0.1.0) to PyPI
- Uploading super_duper-0.1.0-py3-none-any.whl FAILED
HTTP Error 400: The name 'super-duper' is too similar to an existing project. See https://pypi.org/help/#project-name for more information. | b"<html>\n <head>\n <title>400 The name 'super-duper' is too similar to an existing project. See https://pypi.org/help/#project-name for more information.\n \n <body>\n <h1>400 The name 'super-duper' is too similar to an existing project. See https://pypi.org/help/#project-name for more information.\n The server could not comply with the request since it is either malformed or otherwise incorrect.<br/><br/>\nThe name 'super-duper' is too similar to an existing project. See https://pypi.org/help/#project-name for more information.\n\n\n \n"
I guess because of https://pypi.org/project/superduper/
you could open an issue at https://github.com/pypa/pypi-support/issues, similar to the PEP 541 transfer procedure
though you're not exactly in PEP 541 territory because you don't want to take over the exact name
Ahh I see. I did a search for "super duper", but that package didn't show in the first 2 result sets
Will give this a try, thanks
so when does the cpython documentation redirect docs.python.org/3/ to the 3.11 documentation instead of 3.10?
its been 2-3 days now
I'm quite accustomed to concurrent.futures now hah
wdym
okay, it's fixed now
?
cpython documentation redirects docs.python.org/3/ to the 3.11 documentation
i don't know about that but i do know it reads "3.10.8 Documentation" to me
funny, if you put index.html on the end of that url it seems to show the 3.11 version
broken links?
"Whats new" page for 2.7
it says that 3.9 is prerelease and 3.10 is in development
3.9, 3.8, 3.7 shows 3.11 as "pre"
3.10, 3.11, 3.12 are ok
it also contains only changelog of 3.10-
https://docs.python.org/3/whatsnew/index.html - ok
https://docs.python.org/3/whatsnew/ - changelog of 3.10.8
#pypa and #python on Libera Chat might have answers too. in this server, #tools-and-devops seems to be where people post packaging questions
Who do I have to bribe/seduce/murder to get ++ to be an integer increment operator in python?
i think it's just not that compelling when you can += 1
in C++ it does a double duty on pointers/iterators
and you have two forms of ++, which is a terrible idea but still, might serve as some motivation to have it as a separate operator
I do a lot of scientific computing/algorithm development and wind up typing += 1 all the time and I'd much rather ++
🤷‍♂️
that has to be weighed against the cost of adding something to the language
Are you doing += 1 in loops a lot?
obviously there are exceptions but often there is a more idiomatic alternative to doing += 1 in a loop; if you have a small example of the last time you did += 1 in a loop might be possible to show you a refactoring
Sure, one sec
It would be challenging to grab an example atm. What did you have in mind?
Let's use a toy example
evens = 0
for i in range(50):
    if i%2 == 0:
        evens += 1
for a toy example like that you could do
evens = sum(1 for i in range(50) if i % 2 == 0)
for things that can be structured that way, sum() will be much faster than calling += in a loop.
obviously that will not scale up well. counting occurrences is a place where += 1 is often fine. I was thinking more of cases where you were doing += 1 on indices
it could scale up if you factor the predicate out to a function
not on 3.9.
```
In [5]: %timeit loop()
2.83 µs ± 7.01 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [6]: %timeit callsum()
3.25 µs ± 5.06 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```
wow, really? TIL
I'm not so much thinking about that, as having other things going on in the loop
I expected that because the sum() version has to repeatedly yield back and forth between the generator and the sum() code
out of curiosity, how does passing sum() a list comp compare?
if you have
for i in range(50):
    # more work here
    if i % 2 == 0:
        evens += 1
    # more work here
just tried that
```
In [9]: %timeit callsum()
3.07 µs ± 2.56 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```
then it's very quickly going to get ugly trying to factor it out like that
so faster than a genexp but not as fast as the for loop
I like writing things that way a lot in other languages but in python it's very common IME that the most straightforward thing is to write a for loop and mutate something
interesting. well, my intuition was totally off there, then.
thanks for the correction
i don't even try to have an intuition for python performance, other than "executing less python and more C is faster"
well, we're seeing the opposite of that intuition here
the loop is more Python and less C than the sum() call
I think Mark Shannon's vision is that simple builtins like sum() can be implemented purely in bytecode. Then you wouldn't pay the cost of calling back into the interpreter repeatedly, and the sum() version should get faster.
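A sketch for reproducing the comparison outside IPython, using the stdlib `timeit` module. The function names are assumptions (the chat only shows `loop()` and `callsum()`), and the numbers will differ by machine and Python version, so no expected timings are given.

```python
import timeit


def loop():
    evens = 0
    for i in range(50):
        if i % 2 == 0:
            evens += 1
    return evens


def genexp_sum():
    return sum(1 for i in range(50) if i % 2 == 0)


def listcomp_sum():
    return sum([1 for i in range(50) if i % 2 == 0])


CALLS = 10_000
for func in (loop, genexp_sum, listcomp_sum):
    # Best of 3 repeats, reported per call.
    best = min(timeit.repeat(func, number=CALLS, repeat=3))
    print(f"{func.__name__:>13}: {best / CALLS * 1e6:.2f} µs per call")
```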
I tend to not worry about these kinds of things because python is just so slow to start, if I found myself caring about this sort of thing I'd be wallowing in deep regret for having written that particular code in python to start with
funny way to do that (assuming you have ++):
```py
evens = 0
{i%2 == 0 == evens++ for i in range(50)}
# less esoteric:
{evens++ for i in range(50) if not i % 2}
```
```py
evens = sum(map(1 .__sub__, map(2 .__rmod__, range(50))))
evens = 50 - sum(map(2 .__rmod__, range(50)))
```
Jelle, can you please measure speed of this code?
It has no python code, entire calculation is done in C
slower!
```
In [6]: %timeit loop()
2.83 µs ± 14.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [7]: %timeit callsum()
4.05 µs ± 130 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```
loop() is the original loop, callsum() is your code
evens = sum(map(2 .__rmod__, range(50))) + 50 % 2
this should be faster
I guess the two map() calls + slot wrappers for rmod/sub kill it
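In case the `2 .__rmod__` spelling is unclear: it is the bound method that computes `i % 2` for its argument, and `1 .__sub__` computes `1 - x`. A sketch that spells out the three variants with named helpers (the helper names are mine) and checks that they agree.

```python
n = 50
is_odd = 2 .__rmod__   # bound method: is_odd(i) computes i % 2
is_even = 1 .__sub__   # bound method: is_even(x) computes 1 - x

two_maps = sum(map(is_even, map(is_odd, range(n))))  # sums 1 - (i % 2)
subtract = n - sum(map(is_odd, range(n)))            # total minus the odd count
one_map = sum(map(is_odd, range(n))) + n % 2         # odd count, plus 1 when n is odd
print(two_maps, subtract, one_map)
```

The space in `2 .__rmod__` is needed so the tokenizer does not read `2.` as a float literal.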
that one beats the loop
```
In [9]: %timeit callsum2()
2.46 µs ± 24.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```
🥳
how are you expected to resolve this kinda situation
```py
class One:
    def __init__(self, foo):
        super().__init__(foo)

class Two:
    def __init__(self, foo):
        super().__init__(foo)

class Three(One, Two):
    def __init__(self, foo):
        super().__init__(foo)
```
such that all the `__init__` run for `Three` with the same `foo`, while the `object.__init__` doesn't error out
```py
class Base:
    def __init__(self, foo):
        pass

class One(Base):
    def __init__(self, foo):
        super().__init__(foo)
        print(1, foo)

class Two(Base):
    def __init__(self, foo):
        super().__init__(foo)
        print(2, foo)

class Three(One, Two, Base):
    def __init__(self, foo):
        super().__init__(foo)
        print(3, foo)
```
I have arrived at this kinda solution, but it feels a little heavy-handed
what do you mean by error out
ah, will the super() call eventually resolve to object
definitely feels wrong that One and Two call the super init when they don't have any bases
What's actually wrong with Three simply calling init twice?
it makes Three not work as part of another multiple inheritance chain
and I kind of assume that I shouldn't have to do multiple inheritance by hand C++ style if the linearization exists
it's not, that's the way that cooperative multiple inheritance is supposed to behave in Python.
that's exactly why super() can delegate to a sibling class, as opposed to a base class.
I think that's fine if you are custom designing a hierarchy to allow diamonds
every use of multiple inheritance creates a diamond, and like... it is a useful feature
so far, the answer seems to be to use the popping args from **kwargs structure
and just have one class which is the only one canonically allowed to accept a given argument
not just technically - object.__init__() does get called, and does need to be happy with the arguments it receives.
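A minimal sketch of the failure being described: a class with no explicit base forwards an argument via `super()`, the call lands on `object.__init__`, and that rejects the extra argument.

```python
class One:
    def __init__(self, foo):
        # One has no explicit base, so this call reaches object.__init__,
        # which does not accept the extra positional argument.
        super().__init__(foo)


try:
    One("spam")
except TypeError as exc:
    error = exc
    print(type(exc).__name__, exc)
```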
this is what I would do, if I was in this situation.
I have always found the python take on MI extremely bizarre. For all this to work, all of your inits need to accept all the same arguments
(for it to work "in general")
they need to be designed to cooperate, or wrapped by a class that enforces the cooperation
well, the convention is that each class removes the arguments it uses from kwargs and passes on the rest
yeah. The unusual thing in this example is one argument that multiple classes want to see.
!e
```py
class NeedsFoo:
    def __init__(self, *, foo, **kwargs):
        super().__init__(**kwargs)
        self.foo = foo

class One(NeedsFoo):
    def __init__(self, one=0, **kwargs):
        super().__init__(**kwargs)
        self.one = one
        print(self.foo)

class Two:
    def __init__(self, two='', **kwargs):
        super().__init__(**kwargs)
        self.two = two

class Three(One, Two, NeedsFoo):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        print(3, self.foo)

print(vars(Three(foo=1, two=3)))
```
this does work exactly as it should with no hacks, but sharing arguments seems to be a non-option
@flat gazelle :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 1
002 | 3 1
003 | {'foo': 1, 'two': 3, 'one': 0}
which I guess was more natural back when python was purely dynamically typed
yeah, this isn't exactly type-checkable
especially once the relationships get more complex and you go from just using arguments to ifs in kwargs.pop
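One possible way to let several classes see the same argument, as a sketch rather than an established convention: every class that wants `foo` takes it as a keyword and forwards it explicitly, and a single root class (here the hypothetical `Root`) is the only one that consumes it before `object.__init__` is reached.

```python
class Root:
    def __init__(self, *, foo, **kwargs):
        # The only class that stops forwarding foo, so object.__init__ never sees it.
        super().__init__(**kwargs)
        self.foo = foo


class One(Root):
    def __init__(self, *, foo, **kwargs):
        super().__init__(foo=foo, **kwargs)  # read foo, then pass it along
        self.one_saw = foo


class Two(Root):
    def __init__(self, *, foo, **kwargs):
        super().__init__(foo=foo, **kwargs)
        self.two_saw = foo


class Three(One, Two):
    def __init__(self, *, foo, **kwargs):
        super().__init__(foo=foo, **kwargs)
        self.three_saw = foo


t = Three(foo=1)
print([cls.__name__ for cls in Three.__mro__])
print(vars(t))
```

The cost is that every class wanting `foo` has to inherit from `Root` so that it sits last in the MRO before `object`.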
what happens if One and Two simply don't call super().init?
after all, most classes do not
well, one of the two simply doesn't get initialized despite being in the mro
which is a problem
ah, I see
and potentially other classes unrelated to them that are also linearized after them in the MRO.
in this concrete example, Two wouldn't get initialized - but if there were a subclass of Three, it's possible other classes wouldn't be initialized as well.
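A small sketch of that failure mode, with the classes stripped down to one attribute each (the attribute names are made up):

```python
class One:
    def __init__(self):
        self.one_ready = True  # deliberately no super().__init__() call


class Two:
    def __init__(self):
        self.two_ready = True


class Three(One, Two):
    def __init__(self):
        super().__init__()  # reaches One.__init__, and the chain stops there


t = Three()
print(vars(t))
print(hasattr(t, "two_ready"))
```

`Two` is in `Three.__mro__`, but its `__init__` never runs because `One` does not delegate onward.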
what a mess
yeah, this is a very dynamic, other programmers aren't morons kinda API
I'm trying to decide which is worse, this, or virtual inheritance in C++
multiple inheritance is a quagmire in most languages. It seems to be less error-prone in Python than most, honestly.
well, the problem is with multiple data inheritance
just stacking interfaces is fine
yeah, that's why I said most don't have it
mixins tend to work pretty well in Python
I wouldn't count satisfying multiple interfaces as MI and I don't think most people would
they work great in C++, and their constructors are even type checked
but mixins in C++ are not done with virtual inheritance
typically
usually when you're doing mixins you don't have diamonds
if you don't have a supertype, then simply disallowing diamonds really removes a lot of the headaches in MI
although I don't know of a language that has taken this approach (MI being so rare to start)
well, no, you always have object.__init__, that never goes away. A mixin doesn't have init, so it just ends up skipped when initializing
in both Python and C++, the situation is about the same: inheriting from a class is unwise unless you know that class was designed to be inherited from. In C++ that's signaled by a virtual destructor, and in Python it's signaled by the class's __init__ calling super().__init__()
there are a couple languages which do do multiple inheritance this way, of which python is by far the most popular https://en.wikipedia.org/wiki/C3_linearization
I mean kinda yes, kinda no. Like I said, you'd be able to initialize your parents "normally" in C++ in the equivalent situation
You would be assuming that there are no diamonds
class Foo : Mixin1<Foo>, Mixin2<Foo> {
    Foo(double x1, double x2) : Mixin1(x1), Mixin2(x2) { ... }
};
if you're assuming no diamonds, you can initialize your parents "normally" in Python as well, by just calling their __init__ methods directly
in python, it seems that you both have to design your hierarchy for "cooperative MI", and you have this loose passing of constructor/init arguments around
e.g. Three.__init__ could call One.__init__(self, foo) and Two.__init__(self, foo)
Well, when I said that, you told me that every MI hierarchy in python is "substantively" a diamond
so which is it
from the PoV of super() it is, since object eventually gets delegated to
well, you can sort of do it, if you also do it in all base classes.
but well, bypassing C3 leaves a bad taste in my mouth
if you want to try to avoid super(), you can, and it will work as long as everyone cooperates in not using cooperative MI, and no base class (other than object) is inherited twice
or all the classes which are inherited multiple times are at the end of mro
even then, their __init__ would be called multiple times, by each of their direct child classes
which might be fine, but might not
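A sketch of the "called multiple times" case when every class initializes its parents directly instead of using `super()`. The counter attribute is only there to make the repeated call visible.

```python
class Base:
    def __init__(self):
        # Count how often this initializer runs on the same instance.
        self.base_inits = getattr(self, "base_inits", 0) + 1


class Left(Base):
    def __init__(self):
        Base.__init__(self)


class Right(Base):
    def __init__(self):
        Base.__init__(self)


class Child(Left, Right):
    def __init__(self):
        Left.__init__(self)
        Right.__init__(self)


print(Child().base_inits)
```

Whether the second `Base.__init__` call is harmless depends on what it does: re-assigning attributes is usually fine, while acquiring a resource twice is not.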
ah yeah, the edge there would need some workarounds to make it behave the way I imagined
It seems vastly simpler to ban diamonds to me, and have every child responsible for initializing their parents, personally.
in python where super() is idiomatic this is riskier, so I see an issue there.
it was possible to do MI this way, mostly. Python didn't always have super, and the old way to do things was directly calling the __init__ methods of your base classes. And when no base class is a base of more than one type in the MRO, that works.
coming from C++ where the overwhelming strategy is just to avoid diamonds, and that overall works pretty well, this seems messy by comparison. idk.
I can count on one hand the number of times I've ever seen virtual inheritance in C++ (for which I'm grateful)
I actually barely remember the rules for initializing classes that use virtual inheritance
iirc, it simply keeps delegating it "downward"
import gc
import sys
def _get_mappingproxy_dict(mp):
    referents = gc.get_referents(mp)
    assert len(referents) == 1, referents
    dct = referents[0]
    assert isinstance(dct, dict), dct
    return dct
def __init__(self, *args, **kwargs):
    pass
_get_mappingproxy_dict(object.__dict__)['__init__'] = __init__  # mypy is very angry about this line
sys._clear_type_cache()
# after this line object.__init__ supports *args and **kwargs:
object.__init__('', 5, 8, 13, foo=42, bar=997)  # no errors
class X:
    def __init__(self, foo):
        super().__init__(foo)
        print(__class__)
class Y(X):
    def __init__(self, foo):
        super().__init__(foo)
        print(__class__)
class Z(X):
    def __init__(self, foo):
        super().__init__(foo)
        print(__class__)
class T(Y, Z):
    def __init__(self, foo):
        super().__init__(foo)
        print(__class__)
T('foo')
# Output:
# <class '__main__.X'>
# <class '__main__.Z'>
# <class '__main__.Y'>
# <class '__main__.T'>
can someone explain the /?
all arguments before the / are positional only, you cant call the function as int(x=0) (ie, passing x as a kwarg)
thx
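The same marker works in your own function definitions (Python 3.8+). A short sketch with a made-up `clamp` function:

```python
def clamp(value, low, high, /):
    """All three parameters are positional-only because they precede the `/`."""
    return max(low, min(value, high))


print(clamp(5, 0, 3))

try:
    clamp(value=5, low=0, high=3)  # keywords are rejected for these parameters
except TypeError as exc:
    keyword_error = exc
    print(exc)
```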
!e
import pandas as pd
s = pd.Series(['a', 'b', 'c'])
print(s.str.upper())
@boreal umbra :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 0 A
002 | 1 B
003 | 2 C
004 | dtype: object
Pandas objects have several properties like the .str. one, which are like namespaces for additional methods. Does anyone know what these are called? Looking up "python accessors" only gives me results for @property.
descriptors is the more general term. I don't know of a term specific for this pandas feature
the code looks like it's a normal class
yes, it's an instance of pandas.core.strings.StringMethods
that would work. thanks
though StringMethods doesn't actually appear to be a descriptor. The Series class has a str attribute of type StringMethods, and Series instance have an instance of StringMethods in their __dict__, but I couldn't find where that gets set
oh actually, a newly constructed Series object doesn't have str in its __dict__. Probably there's some __getattr__ magic going on that later caches the attribute
pandas/core/series.py line 6150
```py
str = CachedAccessor("str", StringMethods)
```
That's exactly what it is
They have an entire accessor api they call it
yes, @dapper lily just linked the code
the "accessor" system is a great design imo
it makes it easy to extend a framework while giving each extension its own namespace
there are accessors for strings, datetimes, categoricals, et alia
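A rough pure-Python sketch of how such a cached accessor can be built with a non-data descriptor. This is not pandas' actual implementation; the `Series` and `StringMethods` classes here are toy stand-ins that only borrow the names from the discussion.

```python
class CachedAccessor:
    """Non-data descriptor that builds the accessor on first use and caches it."""

    def __init__(self, name, accessor_cls):
        self._name = name
        self._accessor_cls = accessor_cls

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self._accessor_cls  # accessed on the class itself
        accessor = self._accessor_cls(obj)
        # Store it in the instance __dict__; since this descriptor defines no
        # __set__, the instance attribute wins on every later lookup.
        obj.__dict__[self._name] = accessor
        return accessor


class StringMethods:
    def __init__(self, series):
        self._series = series

    def upper(self):
        return Series(value.upper() for value in self._series.values)


class Series:
    str = CachedAccessor("str", StringMethods)

    def __init__(self, values):
        self.values = list(values)


s = Series(["a", "b", "c"])
print("str" in vars(s))       # nothing cached yet
print(s.str.upper().values)
print("str" in vars(s))       # cached after first access
```

This matches the observations above: the class attribute resolves to the accessor class, and a fresh instance only gains `str` in its `__dict__` after the first access.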
does anyone know/have a reference for the code object linetable format?
tho it's not required except for debugging, right?
https://github.com/python/cpython/blob/main/Objects/lnotab_notes.txt this is what I used to make sense of it
sorry I should have been specific.
I meant the 3.11 linetable
lakmatiol's link still applies
well technically the link is for 3.12 but I don't think it's changed
🤔 Oh, 3.11 code objects have co_linetable and co_lnotab
there's not a chance there's some updated documentation
Until you register an accessor globally which ends up being a column name in a 3rd party library and that said 3rd party library uses dot access to access the column instead of get item
Had that happen to us at work with seaborn. We made a units accessor and seaborn constructed a DataFrame internally that had a "units" column and they were using dot access instead of getitem
That sucked
Imo, you shouldn't be able to access columns as an attribute if u r allowing custom accessors, specially for that reason
Too easy of a name conflict
U could say we made a mistake by using such a common name, and seaborn was at fault for using dot access
heh, this is one of many reasons why i don't use the attribute accessor for columns!
in pep 673 this example is used:
class MyMetaclass(type):
    def __new__(cls, *args: Any) -> Self:  # Rejected
        return super().__new__(cls, *args)
    def __mul__(cls, count: int) -> list[Self]:  # Rejected
        return [cls()] * count
class Foo(metaclass=MyMetaclass): ...
and they say that the return type of MyMetaclass.__mul__ would be Foo and not MyMetaclass.
that's news to me! is that __mul__ method going to be "inherited" by Foo? will MyMetaclass appear in the mro?
i thought i understood metaclasses, but apparently not
Foo * 3 would return [Foo(), Foo(), Foo()] yes
this means that type(Foo) would be MyMetaclass rather than type. foo = Foo(); foo * 3 would cause a type error, but Foo * 3 would not.
slightly misleading, since each element would be the same object
i see, __mul__ is just an instance method on the Foo object
i guess an instance method on the class itself is not much different from a classmethod
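A runnable version of the behaviour described, without the `Self` annotations from the PEP (those are the part the PEP rejects):

```python
class MyMetaclass(type):
    def __mul__(cls, count):
        # `cls` is the class being multiplied (e.g. Foo), not MyMetaclass.
        return [cls()] * count


class Foo(metaclass=MyMetaclass):
    pass


items = Foo * 3
print(items)
print(items[0] is items[1])  # list repetition reuses the same object

try:
    Foo() * 3  # instances do not get __mul__ from the metaclass
except TypeError:
    instance_supports_mul = False
else:
    instance_supports_mul = True
```

`MyMetaclass` is `type(Foo)` but does not appear in `Foo.__mro__`, which is why the method is available on the class and not on its instances.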
anyone knows when https://peps.python.org/pep-0582/ is going to be approved ??
when will python have private variables
like ones that actually give error when used outside the class
Why? There is no reason to have private vars in python. They are almost useless
@gray galleon in Python the norm is to use double-underscore prefix.
This denotes a private member.
And is obfuscated so it isn't easy to use by accident.
Not really, that triggers mangling and it's not recommended for general use
You normally just prefix with a single underscore to indicate that something isn't part of the public API
gonna create a class that does that
i tried to do that with __new__
why isnt that recommended for general use when its the closest thing to truly private variables/methods
because you generally don't need that. It only makes sense if your class is meant to be subclassed, and even then, _attr is the better one to use a lot of the time, since the subclass often also wants to access the attribute.
> the subclass often also wants to access the attribute.
aka protected
here's an implementation of protected attributes
```py
from sys import _getframe
from types import FunctionType

class Private:
    def __new__(meta, name, bases, ns, **kwargs):
        if "__annotations__" not in ns:
            return type.__new__(type, name, bases, ns)
        attributes = {k for k, v in ns["__annotations__"].items() if v is meta}
        all_code = {f.__code__ for f in ns.values() if isinstance(f, FunctionType)}
        old___getattribute__ = ns.get("__getattribute__")
        old___setattr__ = ns.get("__setattr__")
        def __getattribute___wrap(self, name):
            if name in attributes:
                frame = _getframe(1)
                while frame and frame.f_code not in all_code:
                    frame = frame.f_back
                if frame:
                    return old___getattribute__(self, name) if old___getattribute__ else super(type(self), self).__getattribute__(name)
                raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
            else:
                return old___getattribute__(self, name) if old___getattribute__ else super(type(self), self).__getattribute__(name)
        def __setattr___wrap(self, name, value):
            if name in attributes:
                frame = _getframe(1)
                while frame and frame.f_code not in all_code:
                    frame = frame.f_back
                if frame:
                    return old___setattr__(self, name, value) if old___setattr__ else super(type(self), self).__setattr__(name, value)
                raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
            else:
                return old___setattr__(self, name, value) if old___setattr__ else super(type(self), self).__setattr__(name, value)
        ns["__getattribute__"], ns["__setattr__"] = __getattribute___wrap, __setattr___wrap
        return type.__new__(type, name, bases, ns)

class Foo(metaclass=Private):
    attr1: Private
    attr2: Private
    ...
```
works like this
holy cow
ok wait i need to fix something
is that a metaclass?
yes
metaclasses be like:
ok i fixed it https://paste.pythondiscord.com/ovohisacof.py
should work with property now
yeah, and protected generally makes more sense, very few attributes need to be private.
why is your code so badly formatted lol
that's usually how i write
probably because i'm an #esoteric-python member
if i want to keep it neat i'd probably make it neat
aren't these lines redundant
```py
if isinstance(f, FunctionType):
    all_code.add(f)
```
you already have
```py
all_code = {f.__code__ for f in ns.values() if isinstance(f, FunctionType)}
```
@rose schooner
I mean I wouldn't say that, most languages with protected recommend rarely using it
I think it's just in python, the view is mostly against using mangling as a form of protecting privacy
Because it doesn't really, and it's somewhat obfuscating
The reason mangling is recommended more in the context of inheritance is to prevent collisions, that's it
Usually in situations where the base and derived class are written by different people.
Btw in Java, C++ etc, the situation is the reverse: the language solves the collision problem automatically via shadowing
So protected/private are only for access control, they aren't needed for collisions
for a class which isn't meant to be subclassed, it makes no difference whether you use protected or private. For classes that are meant to be subclassed, I have seen protected way more often than private
both in java and C++
Then you've seen some really bad C++. The core guidelines even recommends against ever using protected data
But I mean that's common, overuse of OO in the wild is common in both languages
mostly with methods in C++, yeah.
At any rate the most common case by far are classes that aren't specifically designed to be part of an inheritance hierarchy, and you aren't advised to use protected in that case
yeah, there python just does protected to save on the mangling, since it doesn't make a difference.
But we've diverged more than I intended. The point is mostly just to view mangling as a way to avoid collisions with children, because that's all it is
I just wouldn't view _ as protected and __ as private
_ is not for use outside the "unit" which could be the class but could also be at the library level really. __ is for avoiding collisions, usually when different folks control base and derived. Thinking of it that way will lead to more accurate usage
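A small sketch of mangling as collision avoidance, with made-up class and attribute names. Both classes use `__token`, and both values survive on the same instance because each name is rewritten with its own class prefix; the single-underscore name is shared as usual.

```python
class Base:
    def __init__(self):
        self.__token = "base"   # stored as _Base__token
        self._hint = "shared"   # plain name, visible to subclasses as-is


class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__token = "derived"  # stored as _Derived__token, no clash with Base


d = Derived()
print(vars(d))
print(d._Base__token)  # still reachable from outside, so this is not access control
```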
indeed, that's a better way to look at it
hmm yeah maybe that should be set() instead
does python have some sort of test suite that implementations use to test compliance with cpython?
whether that be for specific parts or the entire thing
Isnt it yourpython/lib/test?
the CPython test suite can also be run with other implementations
CPython-specific tests are marked specifically
i'm not entirely familiar with cpython's structure, hence the question
thanks! that's exactly what i was looking for.
it's Lib/test/ in the repo
It is, IIRC those get shipped with Python in the standard library
Is there a difference between a singleton and a sentinel? Specifically talking about pep661
a sentinel is some value with special meaning, e.g. -1 for find
A sentinel could be a singleton though correct?
singletons are classes that only give you one instance, so the two aren't really related
yes, for example None
Interesting.
So if I made a singleton pattern, created 4 subclasses PassSingleton, FailSingleton, NoFaultSingleton, NoTestSingleton. The instances of those singletons when passed to int return 1, 0, -1, -3.
@hazy pawn
It would be correct to say they're also sentinel value?
whether something is a "sentinel value" or not is whether it represents something entirely different from other possible values. For instance, str.find returns the index of the first occurrence of a substring in a given string, and -1 if the string doesn't contain that substring. In that context, -1 is a sentinel - a special value that means something different from the other values that could be returned.
and something can be a sentinel value in one context and not in another. Within the context of the return value of str.find, -1 is a sentinel. Within the context of the return value of max(), it isn't.
all of the return values of max() are conceptually part of the same domain - the return will always be one of the values that were being compared to find the maximum. The return value of str.find() could be from either of two domains - an index into the string, or a special value that means "not found". -1 being a special value that means something distinct from all other possible returns is what makes it a "sentinel"
and that can apply to parameters or to return values. For instance, None is a sentinel within the context of the function parameter to the filter() builtin - the function is either a callable, or None - and if it's None, the filter() function does something special instead of calling a callable.
and None is not a sentinel within the context of the iterable parameter of the filter() builtin - if you pass None for the iterable parameter, filter() doesn't do anything special - it tries to call iter(None) and just fails.
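A minimal sketch of the sentinel idea described above. The `str.find` and `filter` lines use the real builtins; `_MISSING` and `get_setting` are hypothetical names added only for illustration, showing a dedicated sentinel object for the case where `None` is itself a legitimate value.

```python
# str.find: -1 is a sentinel meaning "not found", not a real index.
index = "hello".find("z")

# filter: None in the function slot is a sentinel meaning "keep truthy items".
truthy = list(filter(None, [0, 1, "", "a", None]))

# Hypothetical sentinel for "no default was supplied".
_MISSING = object()


def get_setting(settings, key, default=_MISSING):
    """Return settings[key]; raise KeyError unless a default was supplied."""
    if key in settings:
        return settings[key]
    if default is _MISSING:
        raise KeyError(key)
    return default
```

Because `_MISSING` is compared with `is`, a caller can still pass `None` as a real default.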
the grammar has the following block. i'm struggling to understand what it means (especially the . in ';'.simple_stmt+). could someone explain this to me?
```
statement: compound_stmt | simple_stmts
# --snip--
simple_stmts:
    | simple_stmt !';' NEWLINE  # Not needed, there for speedup
    | ';'.simple_stmt+ [';'] NEWLINE
```
(https://docs.python.org/3/reference/grammar.html)
you can write a = 1; b = 2; c = 3; in one line
is that what the dot means?
i also don't really get the !';' and how that's a speedup
okay, my real question is, can that be shortened to this?
```
statement: compound_stmt | simple_stmt
simple_stmt: stuff stmt_terminator
stmt_terminator: ';' | '\n' | (';' '\n')
```
(ignoring whether CPython's parser can understand that)
I don't think that would allow multiple statements in one line
https://devguide.python.org/internals/parser/#s-e it means multiple simple_stmt separated by semicolons
oh! i was looking at the key at the top of the grammar, completely missed the guide.
I think the speedup is so that if there's only a single simple_stmt on a line (99% of the cases in real code probably) it doesn't have to go into the more complicated second branch
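One way to see what `';'.simple_stmt+ [';'] NEWLINE` accepts is to ask the real parser through the standard `ast` module. This is only an illustration of the rule's effect, not of how the PEG parser is implemented.

```python
import ast

# Three simple statements separated by ';', with the optional trailing ';'.
module = ast.parse("a = 1; b = 2; c = 3;\n")
statement_types = [type(node).__name__ for node in module.body]

# The common case covered by the first alternative: one statement, then NEWLINE.
single = ast.parse("a = 1\n")
```

Both inputs parse; the first produces three separate statements in `module.body`, the second produces one.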
assuming a file is statement* (i know that's not exact), i don't see how that couldn't take a whole bunch of stuff ';' stuff ';' stuff ';'
(assuming stuff doesn't deal in newlines or semicolons)
oh yeah, I think you're right. but simple_stmts and simple_stmt must be different because in cases like if x: something the something can only be a single simple_stmt
i think that should be fine for my case, then
given that i'm trying to specify the structure of a CST, not generate a parser, and thus ambiguities and some correctness errors are fine
thank you!
Thank you. I think I understand now. I was originally going to reply to the PEP 661 thread because I dislike that they rejected the class definition route. However, I don't think the PEP's purpose matches what we are doing at work, which is singletons that happen to represent specific numerical values.
what controls the default text encoding for subprocess functions with text=True?
i know that you can control it explicitly with encoding=, but i am specifically asking about text= without specifying encoding=
i also just asked this on SO, if anybody wants a couple of rep points for answering: https://stackoverflow.com/q/74257943/2954547
oh i see, it's whatever io.TextIOWrapper uses
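A small sketch of that conclusion. It assumes the documented behaviour that `io.TextIOWrapper` with `encoding=None` falls back to the locale encoding, which `locale.getpreferredencoding(False)` reports; the child command is just a placeholder.

```python
import locale
import subprocess
import sys

# The encoding io.TextIOWrapper falls back to when none is given.
fallback_encoding = locale.getpreferredencoding(False)

# text=True without encoding= relies on that same fallback for stdout/stderr.
result = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    capture_output=True,
    text=True,
)
```

Passing `encoding="utf-8"` explicitly avoids depending on the machine's locale.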
cool multiline lambda workaround in python:
https://billmill.org/multi_line_lambdas.html
A multiline lambda is just a normal function
which reminds me: when asked what a lambda is, like clockwork, someone will say "an a n o n y m o u s function", but in the context of Python, I don't think those who don't already know what a lambda is will understand why that's remarkable. I think it's more useful to describe them as in-place functions.
function expression is probably how I'd put it, "in-place" isn't super clear IMHO
in-place does sound a bit like you can't reuse it
idk, in-place could be interpreted in many ways. in-place for me mostly makes me thinking of mutating a value, rather than creating a new one.
statements and expressions have pretty agreed upon definitions and statement vs expression is one of the key differences between local function and lambda
i've been telling people that functions are objects that know their own name. def creates a function with a name and assigns the function object to a variable of the same name. lambda creates a function with no name, and does not perform any assignment on its own. nameless, and therefore "anonymous".
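A quick sketch of the "functions know their own name" point, using only the standard `__name__` attribute:

```python
def named():
    return 1


# lambda builds a function object but performs no assignment by itself;
# binding it to a variable is a separate step and does not rename it.
anonymous = lambda: 1

# Binding an existing function to another variable does not rename it either.
alias = named
```

`named.__name__` and `alias.__name__` are both `"named"`, while `anonymous.__name__` is the placeholder `"<lambda>"`.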
the problem is that this way of thinking about things is never used anywhere else
I've never heard anyone describe 5 as an anonymous integer or "hello" as an anonymous string
functions "knowing" their own name is also more of a reflection/debugging thing
maybe "function literal" is the best since it draws the direct connection with other literals.
Especially useful in Python, since Python has more literals than some other languages (container literals)
sure, that's why functions knowing their name is a special case that is worth calling out
"function literal" is good for someone coming from another language, but then they probably already know what an anonymous function is
now that i think about it, most usages of "anonymous function" should probably be "function literal"
Hey I have a question about Python OOP and design. Do any of the PEPs say anything about this?
Trying to implement a Window class that has the attributes width and height. I've used the @property decorator to implement access to these attributes, but now I'd like to include a way to modify both of them together.
window.width = 500
window.height = 500
# Option 1
window.size = (500, 500)
# Option 2
window.set_size(width=500, height=500)
window.set_size(500, 500) # or this without keyword arguments
It might seem like a useless design feature, but what are your opinions on these 2 options? which one is more Pythonic?
Should @property be used to create getters & setters that modify multiple class attributes? or should methods be used instead?
What happens if we introduce another function that modifies even more attributes?
# Option 1
window.size_limits = (300, 300, 1000, 1000)
window.minimum_size = (300, 300)
window.maximum_size = (1000, 1000)
# Option 2
window.set_size_limits(min_width=300, min_height=300, max_width=1000, max_height=1000)
window.set_size_limits(300, 300, 1000, 1000) # or simply this
window.set_minimum_size(300, 300)
window.set_maximum_size(1000, 1000)
Here, option 1 feels a lot more desirable because it's much clearer what's going on. Then should methods always be implemented when modifying multiple attributes to ensure consistency in the class?
Option 1 looks good to me. There's nothing wrong with a property modifying several attributes at once; this is often the motivator for using properties in the first place (if all the property does is set and read an attribute, why have it in the first place?).
In this case you could also just do
window.width, window.height = 800, 600
@quick snow I suppose that is a 3rd option, I might want to put that up for a speed test to see if unpacking the values makes it more inefficient
I was leaning towards option 2 in cases where there are too many attributes to edit
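A minimal sketch of option 1, assuming a hypothetical `Window` class (the validation rule is invented for illustration). The `size` setter delegates to the `width` and `height` setters so any checks live in one place.

```python
class Window:
    def __init__(self, width, height):
        self.size = (width, height)

    @staticmethod
    def _check(value):
        # Hypothetical validation shared by both dimensions.
        if value <= 0:
            raise ValueError("dimensions must be positive")
        return value

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        self._width = self._check(value)

    @property
    def height(self):
        return self._height

    @height.setter
    def height(self, value):
        self._height = self._check(value)

    @property
    def size(self):
        return (self._width, self._height)

    @size.setter
    def size(self, value):
        # One property updating two attributes at once.
        self.width, self.height = value
```

With this, `window.size = (500, 500)` and `window.width, window.height = 500, 500` end up going through the same setters.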
Perhaps you meant to say "inline"
No
I had a cursed idea a moment ago
In [1]: def singleton(cls):
   ...:     return cls()
   ...:

In [2]: @singleton
   ...: class Foobar: pass
wouldn't stop you from doing Foobar = Foobar.__class__ later if you wanted the class back, though
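The decorator trick and the escape hatch mentioned above, as a runnable sketch:

```python
def singleton(cls):
    """Replace the class with an instance of it."""
    return cls()


@singleton
class Foobar:
    pass


# The name Foobar now refers to the instance, but the class is still reachable.
recovered_class = type(Foobar)  # same object as Foobar.__class__
second_instance = recovered_class()
```

So this only hides the class; it does not actually prevent a second instance.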
not cursed enough
use a lambda decorator!
then it would have to go in #esoteric-python
then go there
@lambda _: _() 
too much whitespace
how about @type.__call__
how about
class M(type):
    def __abs__(self): return self()
@abs
class X(metaclass=M): ...
!e ```py
class A(type):
    def __trunc__(self): return self()

@__import__('math').trunc
class B(metaclass=A): ...

print(B)
```
@rose schooner :white_check_mark: Your 3.11 eval job has completed with return code 0.
<__main__.B object at 0x7f06f5bf4490>
oh it works
oh my
#community-meta message I'm sorry, Sebastiaan
Strange. hash(range(-n)) is constant, it doesn't depend on n
>>> hash(range(2))
7_853_416_581_674_910_768
>>> hash(range(1))
7_582_552_651_442_218_657
>>> hash(range(0))
-1_075_342_633_697_880_792
>>> hash(range(-1))
-1_075_342_633_697_880_792
>>> hash(range(-2))
-1_075_342_633_697_880_792
>>> hash(range(-3))
-1_075_342_633_697_880_792
because they all behave the same way? len=0
In [1]: range(-1) == range(-2) == range(-3)
Out[1]: True
ranges are compared as sequences, not 3-tuples
so every negative range is equal (and so is their hash)
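The same point as a short check: every empty range is equal to every other one regardless of how it was spelled, and equal objects must hash equally.

```python
empty_ranges = [range(0), range(-1), range(-3), range(5, 5), range(3, 0)]

all_equal = all(r == range(0) for r in empty_ranges)
same_hash = len({hash(r) for r in empty_ranges}) == 1

# Non-empty ranges also compare by the sequence they produce:
# both of these yield 0, 2.
same_sequence = range(0, 3, 2) == range(0, 4, 2)
```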
Custom displayhook, or why are your numbers underscored?
makes sense, but i didn't expect that
i just found out f-strings can be nested really deeply
and this works for some reason ```py
f"{5:{5:<02}}"
'                                                 5'
```
nvm
2 times maximum nesting
but this also works ```py
f"{5:{5}{0}}"
'                                                 5'
```
not really, it's a feature
remember how you could do
"%*s" % (3, "something") to pad
this one is weird though
doesn't quite seem intended
seems like something esoteric python could abuse ngl
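For comparison, the intended use of a nested replacement field is a computed width, which is the f-string counterpart of `%*s`. The variable names here are arbitrary.

```python
value, width = 5, 8

# The inner {width} is evaluated first and becomes part of the format spec.
nested = f"{value:{width}}"

# Same padding expressed three ways.
padded_fstring = f"{'abc':>{width}}"
padded_format = format("abc", f">{width}")
padded_percent = "%*s" % (width, "abc")
```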
pow is just a straight-up built-in function, yep, it doesn't get translated into the ** operator or anything when interpreted. Also, did you know pow can actually take up to 3 arguments?
yup, so pow(a, b, c) has the equivalent result of (a ** b) % c
however, it does some clever maths trickery to actually try to avoid directly computing a ** b when possible, which means in some situations (such as cryptography, where you're often doing such operations with really really big values of a and b), it can run much faster than the method using regular operators
```
>>> py -m timeit "(6789 ** 91011) % 12"
10 loops, best of 5: 34.6 msec per loop
>>> py -m timeit "pow(6789, 91011, 12)"
1000000 loops, best of 5: 368 nsec per loop
``` that's like what, a 94,000x increase in speed?
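The "maths trickery" is, in spirit, square-and-multiply: reduce modulo `c` after every step so the intermediate values never grow. The function below is a hand-written sketch of that idea under the hypothetical name `modpow`; it is not CPython's actual implementation.

```python
def modpow(base, exponent, modulus):
    """Square-and-multiply for non-negative integer exponents."""
    if exponent < 0:
        raise ValueError("negative exponents are not handled in this sketch")
    result = 1 % modulus  # handles modulus == 1
    base %= modulus
    while exponent:
        if exponent & 1:
            # This bit of the exponent is set: fold the current power in.
            result = result * base % modulus
        base = base * base % modulus  # square for the next bit
        exponent >>= 1
    return result
```

The loop runs once per bit of the exponent, and no intermediate value ever exceeds `modulus ** 2`.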
!e ```python
class Powerful:
    def __init__(self, x):
        self.x = x

    def __pow__(self, other):
        print(other)
        return self.x ** other

p = Powerful(2)
p ** 5
pow(p, 5)
```
@paper echo :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 5
002 | 5
!d object.__pow__
```py
object.__pow__(self, other[, modulo])
object.__lshift__(self, other)
object.__rshift__(self, other)
```
These methods are called to implement the binary arithmetic operations (`+`, `-`, `*`, `@`, `/`, `//`, `%`, [`divmod()`](https://docs.python.org/3/library/functions.html#divmod "divmod"), [`pow()`](https://docs.python.org/3/library/functions.html#pow "pow"), `**`, `<<`, `>>`, `&`, `^`, `|`). For instance, to evaluate the expression `x + y`, where *x* is an instance of a class that has an [`__add__()`](https://docs.python.org/3/reference/datamodel.html#object.__add__ "object.__add__") method, `type(x).__add__(x, y)` is called. The [`__divmod__()`](https://docs.python.org/3/reference/datamodel.html#object.__divmod__ "object.__divmod__") method should be the equivalent to using [`__floordiv__()`](https://docs.python.org/3/reference/datamodel.html#object.__floordiv__ "object.__floordiv__") and [`__mod__()`](https://docs.python.org/3/reference/datamodel.html#object.__mod__ "object.__mod__"); it should not be related to [`__truediv__()`](https://docs.python.org/3/reference/datamodel.html#object.__truediv__ "object.__truediv__"). Note that [`__pow__()`](https://docs.python.org/3/reference/datamodel.html#object.__pow__ "object.__pow__") should be defined to accept an optional third argument if the ternary version of the built-in [`pow()`](https://docs.python.org/3/library/functions.html#pow "pow") function is to be supported.
If one of those methods does not support the operation with the supplied arguments, it should return `NotImplemented`.
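A sketch of a `__pow__` that follows that note, accepting the optional third argument so both `**` and three-argument `pow()` work. `Modular` is a hypothetical class made up for this example.

```python
class Modular:
    def __init__(self, value):
        self.value = value

    def __pow__(self, exponent, modulo=None):
        if not isinstance(exponent, int):
            # Let Python try the other operand or raise TypeError.
            return NotImplemented
        # The built-in pow accepts None as "no modulus".
        return Modular(pow(self.value, exponent, modulo))


squared = Modular(3) ** 4
reduced = pow(Modular(3), 4, 5)
```

Without the `modulo=None` parameter, `pow(Modular(3), 4, 5)` would fail with a `TypeError` about the number of arguments.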
stuff = {'a': 1, 'b': 2}
print(f"{stuff!p}")
Has anyone ever proposed fstring syntax like this to use pprint-style formatting for containers?
that'd be pretty convenient
i agree it's a good idea but if you did want to add first class support for pprint
I'd want format specifiers probably
control the indent level etc
although, maybe at that point it's just "too much", I don't know
right now !s and !r are independent of format specifier, right?
yeah, !r, !s, and !a are part of f-string syntax specifically, called "conversions", and are not part of the "format specification mini-language" that works with str.format, f-strings, and format()
the conversion is applied first, then the format specification is applied
so you can't control pprint parameters specifically, but you can control the string presentation as usual
i suppose you could extend the ! syntax to allow for parameterization
the current syntax is like this: {x!r:<15} so you could put some parameter between the ! and the p: {x!4p:<15}
{x!r:<15} is equivalent to format(repr(x), '<15')
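That equivalence can be checked directly, and the same shape already works for pretty-printing today if `pprint.pformat` is called inside the replacement field:

```python
from pprint import pformat

x = "hi"

# Conversion first (!r calls repr), then the format spec is applied.
via_fstring = f"{x!r:<15}"
via_format = format(repr(x), "<15")

stuff = {"b": 2, "a": 1}
# pformat returns a string, so an ordinary string format spec can follow it.
pretty = f"{pformat(stuff, sort_dicts=False):<20}"
```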
i feel like trying to jam too much parameterization in there would be a bad idea, but maybe you can use some kind of pretty printing default context
with pprint.context(indent=120, sort_dicts=False):
    text = f'{x!p:<15}'
it would be a material improvement however if the parser allowed whitespace. currently it's not possible to have any whitespace before or inside !p: which limits what syntax you can jam in there
text = f'{ x ! indent=120, sort_dicts=False, p :_<15}'
kind of horrifying but also kind of practical
yeah I mean maybe this is all going down too deep a rabbit hole for too little benefit
right, it's not really better than ```python
f'{pprint.pformat(x, indent=120, sort_dicts=False):_<15}'
```
although supporting x= would be useful
that is:
text = f'{ x = ! indent=120, sort_dicts=False, p :_<15}'
Creating an object makes a stack frame right?
I'm mainly confused if a class object uses a stackframe or not
Yes and no. Calling a function makes one, including the one(s) used to make an object, but once the object is initialized, those stack frames are gone.
I see, thank you!
let: block when?
what would a let block do
Are builtin functions (i mean any functions implemented in C) creating stack frames?
not on the Python stack
though note that if the builtin function calls a function implemented in Python, that would show up on the Python stack.
the Python stack doesn't dead end when a builtin function is called, but that builtin function doesn't create stack frames on the Python stack. If you viewed the Python stack after a builtin function has called a non-builtin function, the builtin function is just sorta missing, since there's no Python frame for it.
(Also, in 3.11 even a function call of a Python function doesn't necessarily create a stack frame.)
3.11 made it possible for multiple Python stack frames to be evaluated by a single call to the Python eval loop - so within a single C frame.
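A sketch of the "missing frame" effect. It walks the Python stack from inside a callback that the C-implemented `map` invokes. `sys._getframe` is a CPython implementation detail, and the helper names are made up for this example.

```python
import sys


def python_stack_names():
    """Collect function names from the Python stack, innermost first."""
    names = []
    frame = sys._getframe(1)  # skip this helper's own frame
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names


recorded = []


def callback(item):
    recorded.append(python_stack_names())
    return item


def caller():
    # map() is implemented in C; it is what actually calls callback().
    return list(map(callback, [1]))


caller()
```

In `recorded[0]`, `callback` is followed directly by `caller`; there is no entry for `map` in between.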
what is the best way to see if there were native function calls in between the frames
use a tool that can investigate the C stack, I suppose. A debugger like gdb can do it.
why doesn't pow() work with a 3rd argument unless all arguments are integers?
i mean it depends on the implementation of .__pow__() but why isn't it allowed for float?
isn't the whole point of the 3rd arg to use a special algorithm for modular exponentiation?
said algo doesn't work for floats...
(which was probably a design mistake, but that's a different story)
how about just directly fmod() the double when the 3rd arg is given if there's no algorithm for it?
better to error than hide potential bugs
if you want the fmod behavior it's better to specify explicitly
i would rather see an error traceback than wonder why my code is super slow all of a sudden
what?
afaik the main use of modexp is in public key crypto
where you're usually working with large integers
if somehow, a float gets supplied as input to 3-arg pow, the choices are to either error immediately, or calculate pow and then take the modulus, which is much slower
It mostly makes no sense on floats since the float goes to inf before the perf impact to a large power matters. I do wonder if Fraction would work, for example.
ok
if you have a use case where calculating pow of a float and then taking the modulus is actually the correct path
you can still just write fmod(pow(a,b), c)
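Both halves of that argument in a few lines: three-argument `pow` refuses floats, and the explicit spelling still works when it really is what you want.

```python
import math

try:
    pow(2.0, 3, 5)
    float_modpow_error = None
except TypeError as exc:
    # Three-argument pow is only defined when all arguments are integers.
    float_modpow_error = exc

# Explicit version: compute the power, then take the floating-point remainder.
explicit = math.fmod(pow(2.0, 3), 5)
```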
wouldn't that just do pow(numerator, exp, mod)
create a new lexical scope
imagine
let for i in range(10):
    print(i)
print(i) # NameError
hmm
what would the rationale for implementing that be?
Wouldn't it achieve exactly the same result as ```py
for i in range(10):
    print(i)
del i
```
so to not override already-defined variables?
ok, but what would the rationale for implementing that be?
at best, it would allow someone to reuse the same variable name for different things within the same function, which seems like a bad thing to encourage
and - it still wouldn't prevent bugs caused by accidental reuse. At best, it would allow intentional reuse of the same variable name for two things within one function.
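For reference, this is what current Python does with the loop variable, and the existing way to get a separate scope (a function). `scoped` is a made-up name for the example.

```python
for i in range(3):
    pass
leaked = i  # the loop variable is still bound after the loop ends

del i  # the manual way to unbind it
try:
    i
except NameError:
    still_bound = False
else:
    still_bound = True


def scoped():
    # j is local to this function and invisible to the caller.
    for j in range(3):
        pass
    return j
```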
barf
Doing that at this point in python is basically admitting that python's scoping model is trash
so it's a hard sell
This idea's been floating around in my head a little, and it's probably kinda esoteric, but what if you could fetch a tuple of attributes at once from an object? e.g. ```py
class Person:
    def __init__(self, name, age, gender):
        self.name = name
        self.age = age
        self.gender = gender

p = Person("Jolyne", 28, "F")
a, n = p.(age, name)  # the theoretical syntax for getting multiple attributes at once
print(f"{n} is {a} years old.")  # result: Jolyne is 28 years old.
```
interesting idea
what if along with that you can set a tuple of attributes at once ```py
class Point:
    def __init__(self, @x, @y):  # don't wanna do self.* = * so i'm just gonna use this rejected idea
        ...

p = Point(5, 3)
p.(x, y) = 2, -7
print(p.(x, y))  # (2, -7)
```
also how would this be implemented in __getattribute__/__getattr__ and __setattr__
!e
class Person:
    def __init__(self, name: str, age: int, gender: str):
        self.name = name
        self.age = age
        self.gender = gender

    def __getattr__(self, var: str):
        if var.startswith('multi_'):
            return (self.__dict__[elem] for elem in var.split('_')[1:])
        return self.__dict__[var]

tom = Person('tom', 42, 'm')
n, a = tom.multi_name_age
print(f'the person named {n} is {a} years old')
@neat delta :white_check_mark: Your 3.11 eval job has completed with return code 0.
the person named tom is 42 years old
it's a very terrible hack, but it sorta works
yup, being able to do the inverse and use it as a setter would also be pretty neat too.
as for how it's implemented, you could have __getattr__ and __setattr__ accept a tuple of strings as well as a single string, similar to how a sequence's __getitem__ can take either an integer or a slice object. Alternatively, it could be implemented as an extension of Python's extended tuple unpacking, as part of the syntax rather than something handled by the object itself
eehhh not sure
by default I usually keep all attributes "private" (with an underscore)
this just encourages making everything public
huh
like dataclasses, I guess. but that's more in the abuse region
what are we talking about here
like, instead of self.foo = 42 I do self._foo = 42 to indicate that others shouldn't fiddle with it
ok i don't get how this is related to the current discussion
are you on about the @x, @y thing they were doing? because i don't think that's meant to be the main focus of that code snippet, i didn't even notice it myself till now
isn't the demonstration the ```py
p = Point(5, 3)
p.(x, y) = 2, -7
print(p.(x, y)) # (2, -7)
```stuff, not the class def?
yep
it seems like the @ stuff was just cereal deciding not to write that all
why did i even bother doing that if i had to write an even longer comment
Oh wait, you weren't suggesting that syntax
no
I kinda think that the p.(x, y) syntax is sorta ironic insofar as what you are saving writing is the p
but even inside class scope, where many earlier languages would save you writing the name of the object, python makes you write it out (self)
Iirc that had a reason
well, a lot of people just like being explicit
what led me to this train of thought was looking at the original suggestion and thinking which languages have facilities to save this duplication already.
And the best example is probably Kotlin; the way Kotlin does it which is pretty elegant is that you can introduce an implicit receiver outside of class scope essentially, because you have lambdas that accept receivers
val p = Point(5, 3)
with(p) {
    print("$x, $y")
}
the x and y there will refer to p.x, p.y
this is pretty limited by comparison, and it's kind of by design because python really went the opposite way here and made these things very explicit.
with the proposed syntax might as well just do ```py
def __init__(s, x, y):
    s.(x, y) = x, y
```
ok so while implementing this i stumbled across a problem
if for example, .a and .b are methods, what would inst.(a, b)() do?
TypeError: 'tuple' object is not callable
since the idea is that obj.(x, y) behaves the same as (obj.x, obj.y), it's an uncallable tuple either way
Yup
ok
Yeah, it just seems like the end result is pretty convoluted. There was already an extremely well understood solution to preventing that repetition
That solution is used by many languages to this day, including more modern ones. So is the more explicit approach.
To opt for the more explicit approach and then introduce some fairly unusual syntax to save a very small amount of repetition from the explicit approach feels funny
Would this be valid then?
class Person:
    def __init__(self, name, age, gender):
        self.name = name
        self.age = age
        self.gender = gender

p = Person("Jolyne", 28, "F")
attrs = ("age", "name")
a, n = p.(*attrs)
Further could we get implicit getattr with that?
it should fail, just like (a, b) += 5 does.
ok
so i see the problem here
i think the operators done on it should behave like map() or zip() but also there's the standard behaviour where it returns a tuple
that seems pretty unexpected to me. I'd expect it to behave like unpacking on the left hand side of an =, or to return a tuple otherwise. Which is analogous to how (a, b) already behaves, it's just generalizing that to obj.(a, b)
it would be easier to teach that way as well - it would mean that obj.(a, b) is equivalent to (obj.a, obj.b) in all contexts.
there is some precedent in that a, n = operator.attrgetter("age", "name")(p) works
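The `operator.attrgetter` precedent spelled out, reusing the `Person` example from earlier in the discussion. It also covers the "names stored in a tuple" case without any new syntax.

```python
from operator import attrgetter


class Person:
    def __init__(self, name, age, gender):
        self.name = name
        self.age = age
        self.gender = gender


p = Person("Jolyne", 28, "F")

# With several names, attrgetter returns a tuple of the attribute values.
a, n = attrgetter("age", "name")(p)

# Names held in a tuple can simply be unpacked into the call.
attrs = ("age", "name")
a2, n2 = attrgetter(*attrs)(p)
```

There is no matching multi-attribute setter in `operator`, so assignment still needs a loop over `setattr` or plain tuple assignment.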
so i think i'm just gonna make __m(gs)etattr[ibute]__ (m for multiple)
when will python have null coalescing operators
someone proposed it but it hasn't been implemented
!pep 505
TIL: super() also works with @property-decorated functions:
class X:
    @property
    def p(self) -> int:
        return 42

class Y(X):
    @property
    def p(self) -> int:
        return super().p + 1

y = Y()
print(y.p) # 43
i tried to before but it kept failing with the ? for some reason
Interesting. Seems odd for a child to override a parent property though
I have a huge hierarchy full of diamonds, and I need to calculate .size on each instance.
The .size of an instance depends on its exact type.
I have a lot of simple classes and several mixins that change behaviour a bit, and they also affect .size.
So I need to make .size a property or method that uses super(), because I can't make it a class variable
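A rough sketch of that kind of setup, with invented class names and sizes: each mixin adds its contribution and defers to the next class in the MRO through `super().size`.

```python
class Node:
    @property
    def size(self) -> int:
        return 1  # base contribution


class PaddedMixin:
    @property
    def size(self) -> int:
        return super().size + 4


class TaggedMixin:
    @property
    def size(self) -> int:
        return super().size + 2


class Leaf(TaggedMixin, PaddedMixin, Node):
    pass


class PlainLeaf(Node):
    pass
```

`Leaf().size` walks `TaggedMixin -> PaddedMixin -> Node`, so each class in the chain is consulted exactly once.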
I wonder if not could be a built-in function and not an operator
there are also a is not b and a not in b
this is a bit like what "lenses" do in functional programming languages, or more mundanely what the Glom library kind of does. however glom mostly works on dicts and lists, not "objects".
!pypi glom
Doesn't that exist in operator?
Yesh but I meant, why do we need not as a dedicated operator?
Languages like Haskell just have it as a function
Yes
Although granted, not in and is not are handy, and would look strange without a not operator
i think probably the reasoning is that the logical operators are extremely important
and and or cannot be functions
so... it sort of just makes sense for not to not be a function either
Right, because of short circuiting
yep
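Why `and` can't be an ordinary function in Python: arguments are evaluated before the call, so nothing is left to short-circuit. The helper names below are made up; `lazy_and` shows the usual workaround of passing a thunk.

```python
calls = []


def noisy(value, label):
    """Record that this operand was evaluated, then return it."""
    calls.append(label)
    return value


def and_function(left, right):
    return left and right


def lazy_and(left, right_thunk):
    # The right operand is only computed if the thunk is called.
    return left and right_thunk()


operator_result = noisy(False, "a") and noisy(True, "b")
function_result = and_function(noisy(False, "c"), noisy(True, "d"))
thunk_result = lazy_and(noisy(False, "e"), lambda: noisy(True, "f"))
```

Only the plain function version evaluates its right-hand operand (`"d"`); the operator and the thunk version both skip theirs.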
Haskell is a bad example in that sense
it already short circuits all the things
Hmmmm yeah
or maybe not, actually, I'm not sure if Haskell would work properly for that
Yeah it's lazy by default
is && a function in Haskell?
it's lazy by default but if you have a function with two arguments, it can't lazily evaluate one, partially evaluate a function
and then see if it needs to evaluate the other
if functions could inspect their arguments before them being evaluated, and and or could be functions
if it needs the function result, then afaik it will evaluate everything
(This exists in some languages, for example (IIRC) R)
Haskell delays execution until the last responsible moment. So if you have a function of two arguments and the first argument ends up never being used, its actual final expression will not be evaluated
i think there's just very low value in making these specific things non-functions because of how important they are.
however the broader question of functions (or something function-like) that doesn't evaluate its argument is very useful
that's terrifying
So you could do something like:
```hs
ekat xs n = take n xs
a = take 5 [1..]
b = ekat [1..] 5
``` Then `a` and `b` are both `[1, 2, 3, 4, 5]`
!pypi lazex shameless plug
Yeah it is kinda terrifying
the two main solutions I know of to lazy arguments are to have decent macros (like Rust), or to have really nice lambda syntax and pass lazy values as thunks (like Kotlin and Swift)
like, python's dict.setdefault is criticized because it unconditionally evaluates the default, for example, slowing it down
my_dict.setdefault("hello", list()).append(5)
in kotlin you could write something like myList.getOrPut("hello") { mutableListOf() }.add(5)
and mutableListOf() does not get evaluated unless it needs to be
list.setdefault?
that doesn't exist
they mean dict.setdefault
yeah but they also use .append() which makes it even more confusing
sorry
nvm
yes, the append call is correct though
i got it
I mean tbh the perf doesn't actually matter often but people often use this as an excuse to use defaultdict instead which I strongly dislike
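The eager-default behaviour is easy to observe with a factory that counts its calls (`make_list` is a made-up helper):

```python
from collections import defaultdict

created = []


def make_list():
    created.append(1)  # count how many lists get built
    return []


my_dict = {"hello": [1]}
# make_list() runs before setdefault is even called, although the key exists.
my_dict.setdefault("hello", make_list()).append(5)
after_setdefault = len(created)

lazy = defaultdict(make_list)
lazy["hello"].append(1)  # key missing: the factory runs
lazy["hello"].append(5)  # key present: the factory does not run again
after_defaultdict = len(created)
```

Whether that one wasted list matters is a separate question; for cheap defaults like `[]` it usually doesn't.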
i decided to just pass the tuple of attribute names replacing the first argument
and i got the implementation working ```py
>>> class P:
...     def __init__(s, x, y):
...         s.(x, y) = x, y
...
>>> p = P(2, 3)
>>> p.(x, y)
(2, 3)
>>> p.(x, y) = -2, 7
>>> p.(x, y)
(-2, 7)
>>> p.(y, x)
(7, -2)
>>> p.(a, y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'P' object has no attribute 'a'
>>> p.(x, b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'P' object has no attribute 'b'
>>> p.(a, b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'P' object has no attribute 'a'
```
How is it compiled to bytecode?
Is it lazy? For example, in case fun_with_side_effects().(a, b) is function evaluated once or twice?
once
>>> from dis import dis
>>> dis("foo().(a, b)")
0 0 RESUME 0
1 2 PUSH_NULL
4 LOAD_NAME 0 (foo)
6 CALL 0
16 LOAD_MATTR 0
18 RETURN_VALUE
are you storing the ('a', 'b') tuple in .co_consts (or a similar place), with LOAD_MATTR loading the attrs whose names are in the tuple .co_consts[i]?
yes
updated dis to actually show the constant ```py
from dis import dis
dis("foo().(a, b)")
0 0 RESUME 0
1 2 PUSH_NULL
4 LOAD_NAME 0 (foo)
6 CALL 0
16 LOAD_MATTR 0 (('a', 'b'))
18 RETURN_VALUE
cool
is x.(a,b) faster than x.a,x.b? (assuming there are no runtime bytecode optimizations happening)
it might be but i haven't tested yet
nope ```py
>>> from timeit import main
>>> class P:
...     def __init__(s, x, y):
...         s.(x, y) = x, y
...
>>> p = P(2, 3)
>>> main(['-s', "from __main__ import p", "p.(x, y)"])
5000000 loops, best of 5: 52.2 nsec per loop
>>> main(['-s', "from __main__ import p", "p.x, p.y"])
5000000 loops, best of 5: 44.2 nsec per loop
```
might be because of specializations
it is good enough, imo
it's only now that i get to look at specializations
it makes me feel powerful that i can just improve speed using customizable caches
when will python have loop labels
in python 4
```py
while True as loop:
    inp = input()
    for c in inp:
        if c == "\0":
            break loop
```
from __future__ import labels
i don't hate it, although the "associativity" of the as might be quirky
also it adds totally new syntax w/ bare words
@unkempt rock this is not the appropriate channel. see #how-to-get-help and carefully read the guide to asking good questions.
wdym by that
isn't the with <something> as <something> structure common in python
it is, but there's currently no precedent for "bare symbols" as syntax
match is kind of heading in that direction though
that is, the loop labels are not variable names, and their names would exist in a completely separate namespace from regular variables
while True as loop:
    for loop in range(10):
        if c == "\0":
            break loop
confusion reigns
hmm
i suppose one option would be to prefix the labels with some sigil
i really wish they hadn't "wasted" the @ symbol on matrix multiplication
i'm an actual data scientist and i rarely see it in the wild
it would have been perfect as a sigil for loop labels
while True as @loop:
    for loop in range(10):
        if c == "\0":
            break @loop
$
literally the most unused symbol ever
yeah, also valid. we have a couple of unused ascii symbols still
! $ ? `
and ? really needs to be reserved for null-coalescing operators
! was adopted by f-strings but that's a very specific use case
@ has mnemonic value. i suppose it's still usable here because it's not valid syntax to put an infix operator after as or break
can still do
but right now i'm specializing the multiple attributes thingy so i'll maybe do that later
Hello, I am just wondering if there has already been a PEP for special handling of comprehensions/generator expressions inside class scopes? I encountered the bug described in the SO post linked below, and am curious if there is any plan to fix it.
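The linked post isn't reproduced here, so this is a guess at the issue being described: the well-known limitation that names defined in a class body are not visible inside a comprehension in that body (apart from the outermost iterable). The class names and the lambda workaround are illustrative only.

```python
try:
    class Broken:
        factor = 2
        # range(3) is evaluated in class scope, but `factor` inside the
        # comprehension body cannot see the class-level name.
        values = [factor * i for i in range(3)]
except NameError as exc:
    class_scope_error = exc
else:
    class_scope_error = None


class Works:
    factor = 2
    # Workaround: pass the class-level value in explicitly as an argument.
    values = (lambda factor: [factor * i for i in range(3)])(factor)
```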
while True:
    for loop in range(10):
        if c == "\0":
            break @end
            jump @end  # or
            goto @end  # or
@end:
    pass
I think goto is more flexible
yeah, I would avoid putting goto into a language that's meant to be written by humans
CPython core is written by humans and it has A LOT of goto's
yeah, because C doesn't have defer, so you need to do error handling with goto. That covers the vast majority of usecases of goto in C code. In python you can just use with statements and the GC.
you modified python compiler source code just to implement that?
That's quite a controversial feature IMO
You can emulate it with with though.
# outer break
with label() as outer:
    for x in xs:
        for y in xs.ys:
            print(x, y)
            if x == y:
                outer.bail()

# outer continue
for x in xs:
    with label() as outer:
        for y in xs.ys:
            print(x, y)
            if x == y:
                outer.bail()
for the one rare case where you really would benefit from it, and a flag would be ugly
import contextlib

class _Bail(BaseException):
    def bail(self):
        raise self

@contextlib.contextmanager
def label():
    bail = _Bail()
    try:
        yield bail  # bound to the name after "as", so outer.bail() works
    except _Bail as exc:
        if exc is not bail:
            raise
why
Makes the language more complex (as any syntax addition), doesn't make anything that wasn't possible before possible now, and can lead to spaghetti quite easily
See:
!pep 3136
although it's quite old
However, I'm rejecting it on the basis that code so complicated to
require this feature is very rare. In most cases there are existing
work-arounds that produce clean code, for example using 'return'.
While I'm sure there are some (rare) real cases where clarity of the
code would suffer from a refactoring that makes it possible to use
return, this is offset by two issues:
1. The complexity added to the language, permanently. This affects not
   only all Python implementations, but also every source analysis tool,
   plus of course all documentation for the language.
2. My expectation that the feature will be abused more than it will be
   used right, leading to a net decrease in code clarity (measured across
   all Python code written henceforth). Lazy programmers are everywhere,
   and before you know it you have an incredible mess on your hands of
   unintelligible code.
so the consensus is no loop labels
few languages have them, even languages that have them don't use them that often
the classic problem of loop labels is having nested for loops and breaking out of the outer one from the inner one.
but most people nowadays would be happier to see that solved by factoring that code out into a function and using return, or even just using a local function for this purpose
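The refactoring being described, as a small sketch (`find_pair` is a made-up example function): the nested loops move into a function, and `return` leaves both of them at once.

```python
def find_pair(rows, target):
    """Return (row_index, col_index) of the first match, or None."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value == target:
                return r, c  # exits the inner and the outer loop together
    return None
```

The `None` fallthrough plays the role that a `for ... else` or a flag variable would otherwise need.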
you also can iterate over two iterables using itertools.product
in this case you have only one loop, so you can break from it easily
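A minimal sketch of the two workarounds just mentioned, using made-up data. The names `find_pair`, `xs` and `ys` are hypothetical and only serve the illustration:

```python
from itertools import product

def find_pair(xs, ys):
    """Return the first (x, y) with x == y, or None if there is none."""
    for x in xs:
        for y in ys:
            if x == y:
                return x, y  # leaves both loops at once
    return None

xs = [1, 2, 3]
ys = [5, 3, 2]

found_by_return = find_pair(xs, ys)

# The same search as a single loop, so a plain `break` is enough.
found_by_product = None
for x, y in product(xs, ys):
    if x == y:
        found_by_product = (x, y)
        break
```

Both variants visit the pairs in the same order, so they stop at the same match.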
Pretty sure this creates the entire product upfront, which could be costly.
no, it's not
it is an iterator, so it produces results lazily
It consumes the inputs immediately
If you pass iterators, it will consume the iterators immediately into lists
That can cause memory errors
hmm, you are right
didn't know that itertools.product is eager
!e
from itertools import product
i1 = (print(i) for i in range(5))
i2 = (print(i) for i in range(0, 15,5))
product(i1, i2)
@dusk comet :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 0
002 | 1
003 | 2
004 | 3
005 | 4
006 | 0
007 | 5
008 | 10
Yeah...
Some iterators can be reset but I guess it's impossible to know that
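The eval above can be made a bit more explicit by recording what gets pulled from the inputs. This is only a sketch: `tracked` and `consumed` are hypothetical helpers, and the lazy alternative assumes the inner iterable can be re-created for every outer item.

```python
from itertools import product

consumed = []

def tracked(values):
    """Yield values while recording each one as it is pulled."""
    for v in values:
        consumed.append(v)
        yield v

# product() drains both inputs when it is constructed, before any iteration.
pairs = product(tracked([1, 2, 3]), tracked("ab"))
consumed_before_iteration = list(consumed)

# A lazy alternative: the outer input is pulled one item at a time.
# The inner iterable ("ab") is iterated afresh for each outer item.
consumed.clear()
lazy_pairs = ((x, y) for x in tracked([1, 2, 3]) for y in "ab")
first = next(lazy_pairs)
consumed_after_one = list(consumed)
```

The output pairs of `product` are still produced lazily; it is only the inputs that are stored upfront.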
I liked the PEP about multiple break, for example break break would break both loops.
It leads to better code than using variables like should_exit, followed by should_inner_exit and should_inner_inner_exit.
Isn't it more idiomatic to wrap such cases of multiple loops in functions and return from them? Isn't that cleaner than any other multilevel break that could be implemented?
(function call is slow)
probably time to use a different language if function calls are too slow
yep
What does slow even mean, how slow, relative to what
relative to other python, I suppose.
function calls are significantly faster in 3.11 too
I'm always confused at how much discussion of micro-optimizations there is in python
Relative to nested loops?
Sure, function calls have some (constant time) overhead, but surely in almost all cases that overhead will be dwarfed by the (polynomial time) cost of the nested for loops.
is that just in Python? I feel like a lot of people in programming in general do that.
I was under the impression that it was common knowledge python function calls are incredibly slow, due to handling the various argument types, and minimizing calls can speed up a lot
What do you mean by "due to handling the various argument types"?
*args and **kwargs? and also optional, positional only, kw only
Arguments by default are positional and keyword
So python has to handle that, every call
It just doesn't assume positional
there is *args, **kwargs, default pos-only args, default kwarg-only args, pos-only args, kwarg-only args, kwarg-or-positional args, default kwarg-or-positional args
Like positionally you could put your first signature arg at the end by specifying its name
you can call function using f(x), f(*x), f(x=x), f(**x) and you can combine them
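To make the list of parameter kinds and call forms concrete, here is a small sketch. The function `f` and its parameter names are invented for the example, and the `/` marker needs Python 3.8 or later:

```python
def f(a, /, b, c=3, *args, d, e=5, **kwargs):
    """a: positional-only; b, c: positional-or-keyword; d, e: keyword-only."""
    return a, b, c, args, d, e, kwargs

# Plain positional arguments plus the required keyword-only one.
plain = f(1, 2, d=4)

# Unpacking a sequence and a mapping; extras land in *args and **kwargs.
unpacked = f(*[1, 2, 30, 40], **{"d": 4, "z": 9})

# A positional-or-keyword parameter passed by name, after a keyword-only one.
reordered = f(1, d=4, b=2)
```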
Hm. That's only really expensive to handle in the case where the caller passes a keyword argument. When every passed argument was positional, it just needs to check that the number of arguments is less than or equal to the number of arguments the function takes, and that each missing argument has a default value
How does it know there's no kwargs passed?
It has to check the dict is empty
The dict is NULL, not empty, when no kwargs were passed
and that each missing argument has a default value
it is also O(1), because len(f.__defaults__) exists
Still has to check it
some data
```
In [45]: def f(a, b): return a + b
In [46]: def g(a, b): return f(a, b)
In [48]: %timeit f(1, 2)
53.9 ns ± 1.07 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [49]: %timeit g(1, 2)
84.3 ns ± 1.68 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
```
(on 3.11rc2)
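For anyone without IPython, roughly the same measurement can be sketched with the standard `timeit` module. The loop count and repeat count here are arbitrary choices, and the numbers will differ by machine and Python version:

```python
import timeit

def f(a, b):
    return a + b

def g(a, b):
    return f(a, b)  # one extra Python-level call on top of f

number = 200_000  # calls per timing run (arbitrary)

# Take the best of several runs to reduce noise from other processes.
t_f = min(timeit.repeat("f(1, 2)", globals={"f": f}, number=number, repeat=5))
t_g = min(timeit.repeat("g(1, 2)", globals={"g": g}, number=number, repeat=5))

ns_f = t_f / number * 1e9
ns_g = t_g / number * 1e9
print(f"f: {ns_f:.1f} ns per call, g: {ns_g:.1f} ns per call")
```

The difference between the two figures is a rough estimate of the cost of one extra call.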
if you are passing **kwargs or a=b, it compiles to a different bytecode instruction
that instruction can handle kwargs
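One way to see this on your own interpreter is to compile the three call forms and list the call-related opcodes. This is a sketch: `call_ops` is a hypothetical helper, and the opcode names it prints vary between CPython versions, so no particular output is assumed.

```python
import dis

def call_ops(source):
    """Return call-related opcode names for an expression (compiled, not run)."""
    code = compile(source, "<example>", "eval")
    return [
        ins.opname
        for ins in dis.get_instructions(code)
        if "CALL" in ins.opname or "KW" in ins.opname
    ]

positional = call_ops("f(a, b)")
keyword = call_ops("f(a, b=b)")
unpacking = call_ops("f(*args, **kwargs)")

for label, ops in [("positional", positional), ("keyword", keyword), ("unpacking", unpacking)]:
    print(f"{label}: {ops}")
```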
Well, yes, but a check for whether a pointer is or isn't NULL is pretty much the cheapest operation a CPU can perform
Honestly, I can't think of any time where I ever fixed a performance problem in a Python program by inlining one function into another to avoid a call.
I mean yeah but python is just incredibly slow to start with, and does vastly fewer optimizations than most languages
I just don't think it's quite clear cut that adding a function call to remove a nested loop is going to be more performant in python
I know that people make decisions, have large codebases, it's not easy to change, etc
Figuring out how to break the nested loop may be faster
but if you are worrying about this in python on a semi-regular basis then I think something has gone wrong
the issue isn't performance though. the point is to avoid a situation where you need to use a loop label
Then I've clearly misread the argument sorry
The proposal was factoring a nested loop into a function, so that you can break both loops from the inner using an early return.
the way it kinda started was a discussion of loop labels, and me and some other folks pointed out that the most common use case for loop labels (breaking out of nested loops) is usually just handled by returning from a function.
And then someone was concerned about the perf of the extra function call.
I prefer writing a new iterator that turns the nested loop into a single loop (if I can)
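A sketch of that approach, assuming a simple list-of-lists grid. The generator `cells` and the sample data are hypothetical:

```python
def cells(grid):
    """Yield (row_index, col_index, value) for every cell, row by row."""
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            yield r, c, value

grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

visited = []
position = None
for r, c, value in cells(grid):
    visited.append(value)
    if value == 5:
        position = (r, c)
        break  # a single loop, so one `break` ends the whole search
```

The nesting still exists, but it lives inside the generator, and the calling code only sees one loop.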
i don't know what i did to slow it down by like 3-5 times but p.(x, y) is faster now
```py
>>> class P:
...     def __init__(self, x, y):
...         self.(x, y) = x, y
...
>>> p = P(2, 3)
>>> from timeit import main
>>> main(['-s', "from __main__ import p", "p.x, p.y"])
1000000 loops, best of 5: 205 nsec per loop
>>> main(['-s', "from __main__ import p", "p.(x, y)"])
2000000 loops, best of 5: 154 nsec per loop
>>> def a():
...     return p.x, p.y
...
>>> def b():
...     return p.(x, y)
...
>>> for _ in range(100000): a() and None
...
>>> for _ in range(100000): b() and None
...
>>> dis(b, adaptive=True)
  1           0 RESUME_QUICK                  0
  2           2 LOAD_GLOBAL_MODULE            0 (p)
             14 LOAD_MATTR_INSTANCE_VALUE     1 (('x', 'y'))
             34 RETURN_VALUE
>>> dis(a, adaptive=True)
  1           0 RESUME_QUICK                  0
  2           2 LOAD_GLOBAL_MODULE            0 (p)
             14 LOAD_ATTR_INSTANCE_VALUE      2 (x)
             34 LOAD_GLOBAL_MODULE            0 (p)
             46 LOAD_ATTR_INSTANCE_VALUE      4 (y)
             66 BUILD_TUPLE                   2
             68 RETURN_VALUE
```
this is how it's specialized now
awesome
Any docs on what your LOAD_MATTR_INSTANCE_VALUE specifically does?
exactly like LOAD_ATTR_INSTANCE_VALUE but with a tuple of attrs
TARGET(LOAD_MATTR_INSTANCE_VALUE) {
    assert(cframe.use_tracing == 0);
    PyObject *owner = TOP();
    PyObject *elem;
    PyTupleObject *res;
    PyTypeObject *tp = Py_TYPE(owner);
    _PyMAttrCache *cache = (_PyMAttrCache *)next_instr;
    uint32_t type_version = read_u32(cache->version);
    assert(type_version != 0);
    DEOPT_IF(tp->tp_version_tag != type_version, LOAD_MATTR);
    assert(tp->tp_dictoffset < 0);
    assert(tp->tp_flags & Py_TPFLAGS_MANAGED_DICT);
    PyDictOrValues dorv = *_PyObject_DictOrValuesPointer(owner);
    DEOPT_IF(!_PyDictOrValues_IsValues(dorv), LOAD_MATTR);
    PyObject **values = _PyDictOrValues_GetValues(dorv)->values;
    res = (PyTupleObject *)PyTuple_New(cache->num_indexes);
    if (res == NULL) {
        goto error;
    }
    PyObject **items = res->ob_item;
    uint16_t *indexes = cache->indexes;
    for (Py_ssize_t i = 0; i < cache->num_indexes; i++) {
        elem = values[indexes[i]];
        if (elem == NULL) {
            Py_DECREF(res);
            goto miss;
        }
        Py_INCREF(elem);
        items[i] = elem;
    }
    STAT_INC(LOAD_MATTR, hit);
    SET_TOP((PyObject *)res);
    Py_DECREF(owner);
    JUMPBY(INLINE_CACHE_ENTRIES_LOAD_MATTR);
    DISPATCH();
}
Where can I read statistics on what percentage of people use each version of CPython?
I can't find it (it was based on PyPI download rates), but I recently saw those statistics and have forgotten where
@dusk comet i'm not sure where it is either, but it can be misleading to gauge people based on downloads
yeah, for instance anything used in CI will have massively boosted download counts.
I have a coworker who loves them and often suggests them on reviews. It's horrible
Tell them they're using the wrong language
example?
like, replacing a list of numbers with a tuple of numbers somewhere, because a tuple is not recreated each time
one time they suggested replacing
```py
things = [x**2 for x in xs]
return "".join(things)
```
with
```py
things = (x**2 for x in xs)
return "".join(things)
```
it goes without saying they didn't benchmark their optimizations
or anything
why not ''.join(x**2 for x in xs). also wouldn't this cause an error?
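On the error question: `str.join` only accepts strings, so joining integers raises `TypeError` regardless of whether a list or a generator is passed. A small sketch with made-up data (`xs` is hypothetical):

```python
xs = [1, 2, 3]

try:
    "".join(x**2 for x in xs)
    error = None
except TypeError as exc:
    error = exc  # join rejects the first non-str item it sees

# Converting each item to str first makes the join valid.
joined = "".join(str(x**2) for x in xs)
```

As for performance, CPython's `str.join` first materializes any argument that is not already a list or tuple, so passing a generator does not avoid building the full sequence. Whether either form is measurably faster is something to benchmark rather than assume.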
why is the first one better?
It was a bit more complicated than that... or something, I don't remember
performance wise?
ya