#internals-and-peps
!coded
Here's how to format Python code on Discord:
```py
print('Hello world!')
```
These are backticks, not quotes. Check this out if you can't find the backtick key.
Hey there,
so I ran a small benchmark on something I've always wondered about: how Python handles this
a = True
b = True
c = False
d = True
e = False
true = True
false = False
# which is faster
def case1():
    if a & b & c & d & e:
        return
def case2():
    if a:
        if b:
            if c:
                if d:
                    if e:
                        return
def case3():
    if false & true & true & true & true:
        return
def case4():
    if false:
        if true:
            if true:
                if true:
                    if true:
                        return
def case5():
    if True & True & True & True & True & True & True & False & True & True:
        return
def case6():
    if True:
        if True:
            if True:
                if True:
                    if True:
                        if True:
                            if True:
                                if False:
                                    if True:
                                        if True:
                                            return
As you can see it's always 2 cases against each other (1v2, 3v4, 5v6) to see which is faster.
I used timeit with number=10000000 and ran it multiple times but numbers are so far almost always in the same range.
PS D:\random> py .\py_conditions.py
0.9419427
0.7143202
0.9368509999999999
0.5433409999999999
0.44758319999999996
0.44890989999999986
PS D:\random> py .\py_conditions.py
0.9518120999999999
0.7242126
0.9262693999999998
0.5291763
0.4499335000000002
0.4791475999999997
PS D:\random> py .\py_conditions.py
0.9553244
0.6448077
0.9519328999999999
0.5475068999999997
0.4734653999999998
0.4550950999999994
PS D:\random> py .\py_conditions.py
0.9648111
0.7847829
0.9934985000000001
0.5216031999999999
0.4517395999999998
0.47423090000000023
So why isn't the whole &-chained if condition skipped after the first False? At least that's my theory of what is going on: I assume it keeps checking each bool. The last two cases are interesting, though, in that sometimes case 5 is faster and sometimes case 6. Nested still seems to be generally faster.
I hope it's clear why I used three variants: several distinct variables, then just two variables, and then True/False literals directly.
the last two cases get the dead code taken out at compile time and are equal to an empty function. with the others, & is a bit slower because it's a proper operator, with all the dispatch machinery around it
They also don't short-circuit, since the bitwise ops can be overridden and skipping an operand could change behaviour. that's why you should use the boolean `and` and `or` operators
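A minimal sketch of that difference, assuming a made-up `check` helper and `calls` list that only exist to record which operands get evaluated: `&` evaluates both sides before the operator runs, while `and` stops at the first falsy operand.

```python
calls = []

def check(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value

# Bitwise &: both operands are evaluated, even though the first is False.
calls.clear()
bitwise_result = check("first", False) & check("second", True)
bitwise_calls = list(calls)

# Boolean `and`: evaluation stops at the first falsy operand.
calls.clear()
boolean_result = check("first", False) and check("second", True)
boolean_calls = list(calls)

print(bitwise_calls, boolean_calls)
```

Both expressions end up False here; only the number of operands evaluated differs.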
aight. almost forgot about and
0.9826204000000001
0.6563553
0.6486467
0.9842187
0.6138250000000003
0.5497526000000001
0.4566882999999997
0.5040013999999999
0.4553172999999999
(2 + 3 * n) are the and results. It's definitely faster than bitwise but still the nested if's are a tad faster.
I believe nested ifs and the equivalent with one if and and chaining compile to the same thing
@mystic fable If you use and instead of &, case1..case4 compile to exactly the same bytecode.
How do you learn that information? the documentation for if statements doesn't mention that kind of information
That's because it's completely useless information, and it's implementation-dependent
I just write two functions and run dis.dis on both
The documentation should explain the semantics of if, not all details of how CPython implements them.
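For anyone who wants to repeat the `dis` comparison, here is a small sketch. The three function names and the `opnames` helper are made up for illustration, and the exact instructions depend on the CPython version, so it prints the listings rather than promising what they contain.

```python
import dis

a, b, c = True, True, False

def chained_and():
    if a and b and c:
        return

def nested_ifs():
    if a:
        if b:
            if c:
                return

def chained_bitwise():
    if a & b & c:
        return

def opnames(func):
    """Return the instruction names CPython generated for func."""
    return [ins.opname for ins in dis.get_instructions(func)]

and_ops = opnames(chained_and)
nested_ops = opnames(nested_ifs)
bitwise_ops = opnames(chained_bitwise)

# Print the listings so they can be compared by eye.
for label, ops in [("and", and_ops), ("nested", nested_ops), ("&", bitwise_ops)]:
    print(f"{label:>7}: {ops}")
```

The `&` version should show a binary-operator instruction that the other two lack; the `and` and nested versions consist of loads and conditional jumps.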
@unkempt rock also because & more or less corresponds to __and__, and by default represents bitwise and. and is the short-circuiting boolean operation, and it's the correct choice here, purely from a language semantics perspective. that & happens to give the same answer in this one case isn't important.
or maybe i misunderstood your question?
but for things like what i just said above, it's all in the docs somewhere, but not necessarily that well explained or kind of obtuse
so, experience and talking to other people who already know how it works is a good way to learn!
bool supports & and | as non short circuiting boolean operators
but you very rarely need that
hello , i was parsing a table , so to get informations i need , i gotta clean the output from "html comments and tags" , how can i do that?
this is not a help channel. try #python-discussion for that question
@unkempt rock you should ask in a help channel. see #❓｜how-to-get-help
they are all dead
You should read in #❓｜how-to-get-help how our help system works. You should claim your own help channel and post your question there.
ok
```py
s = ""
for n in range(something):
    s += str(n)
```
In CPython, is the line inside the loop realistically amortized O(1)? Or is that not necessarily true?
Assuming there's only ever one reference to s
Since it uses realloc instead of always creating an entirely new array in memory
yeah
Thanks
If you append to a list, and it doesn't have enough space and needs to resize, but the realloc is successful without having to move everything to a different place in memory, is that O(1), or is it still O(n) or something else?
Like does it still need to set the extra memory to some default value? Or does it not need to do that?
Hmmm I think the operation would still be considered O(n): the O(1) is just a particular case, and the worst case means resizing, which is O(n)
But what is it when that specific thing happens?
it seems to use PyMem_Realloc to resize the list, which according to the docs is modeled after C realloc, which either resizes the memory block if there is free space immediately following it (which would be O(1)?), or allocates new memory and copies to it the contents of the initial block, which, I believe, would be O(n)
here are my sources for all of that if you want to read more
https://github.com/python/cpython/blob/main/Objects/listobject.c#L45
https://docs.python.org/3/c-api/memory.html#c.PyMem_Realloc
https://en.cppreference.com/w/c/memory/realloc
so the short answer is that it's O(n) in the worst case
as Jaime said lol
Ok, then it sounds like it's O(1) if it doesn't need to move it to a different area in memory
Thanks
Hey guys where can I read which PEPs are being discussed? I would like to read about how Python is going to evolve
It's O(1) amortized, regardless of whether it moves. If there's not enough room, it doesn't reserve space for just one element. Instead, it scales the list by some factor, say, 2.
Do python lists have capacity separate from their size?
Right
yeah, but you can't get it without hacks
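One indirect way to see the spare capacity on CPython is to watch `sys.getsizeof` while appending. This is only a sketch: `getsizeof` reports allocated bytes rather than a capacity field, and the exact growth steps are an implementation detail that varies between versions.

```python
import sys

items = []
last_size = sys.getsizeof(items)
resize_lengths = []  # list lengths at which the allocation grew

for n in range(1000):
    items.append(n)
    size = sys.getsizeof(items)
    if size != last_size:
        resize_lengths.append(len(items))
        last_size = size

# Far fewer resizes than appends: most appends fit in existing spare room.
print(len(resize_lengths), resize_lengths)
```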
Yeah, I mean for an immutable string type, unlike a mutable one, it's not a given
I think at least on say, JVM, immutable strings do not have capacity. I think it's because python quietly treats your immutable string as mutable if the refcount is 1.
Or rather, I should say, cpython does.
It tries to use realloc when that happens
That said, it's still better to never write code like that, tbh
even realloc is no guarantee of anything really, that kicks things back to implementation details of the allocator
If you're adding to the end of it
Someone earlier said you could realistically expect for it to be amortized O(1)
On common platforms, in common cases, yes, I think it will be.
Well, and certainly on CPython.
But why not just use "join", right
...(about lists) so if you insert 2^N elements one by one, you will spend 2 + 4 + 8 + ... + 2^N "time units" on resizing. Factor out 2^N and reverse the order: 2^N * (1 + 1/2 + 1/4 + 1/8 + ...), which can't be greater than 2 * (2^N). So adding M elements to a list will take O(M) time in total, i.e. amortized O(1) per append
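The same argument can be checked with a toy model. This sketch assumes a growth factor of exactly 2 and counts one "time unit" per element copied on a resize; it is not how CPython's list is actually tuned, and `copies_for_appends` is a made-up helper.

```python
def copies_for_appends(n, growth=2):
    """Count element copies made while appending n items to a toy array
    that multiplies its capacity by `growth` whenever it is full."""
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            copies += size      # a resize moves every existing element
            capacity *= growth
        size += 1
    return copies

n = 2 ** 16
total_copies = copies_for_appends(n)
copies_per_append = total_copies / n

print(total_copies, copies_per_append)
```

The total stays below 2 * n, so the average cost per append is bounded by a constant even though individual appends occasionally cost O(n).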
Thinking more about it, I'm really not sure that realloc alone would be enough to save you from getting quadratic behavior. To not get quadratic behavior, you're counting on the allocator geometrically over-allocating
yeah, join looks better and is definitely linear overall
The problem is, and this is honestly a reasonably educated guess but not certainty, I doubt that most allocators work that way
allocators aren't going to give you 10% extra memory, as your allocations get large. It's just a weird thing to do.
For small memory sizes, yes, it may happen
I can't honestly see why though, if you ask for 4KB of memory, you'd get 5K, etc. You may get 4KB rounded to a multiple of 64 or 128 or something like that
but, to get amortized behavior, you need some constant C > 0 such that whenever you ask to allocate N bytes, you always actually get at least (1+C)*N bytes. It doesn't really make sense for an allocator to do that.
if you aim for performance, also don't forget to not use a generator expression with str.join (on CPython)
because internally it will turn it into an array first anyway, but not with a list
that's pretty weird
it has to find out how much memory to allocate i guess
but surely the performance difference can't be that huge. at least, I wouldn't write things differently for performance reasons in that situation.
it's just not a language where worrying about constant performance factors makes that much sense
Wow, it doesn't just check how much memory it needs and then start over at the beginning? It's even the same space complexity?
yeah, on its own it's a pretty big difference, but the computations in the loop will take so much more time that it doesn't really matter
right
how would it check how much memory it would need?
yeah, it can't really know until it iterates through the entire thing
By generating all the values and then generating them again with the same starting point
wdym by 'generate them again'?
it can't consume the iterator twice
you can't assume that you can run through a generator twice
generator/iterator
things can have side effects
yeah, it creates an array and uses that
not all iterators can be iterated twice at all, etc
unless it is given a list, then it skips that step
that makes sense
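For reference, the two `join` spellings being compared look like this. It is a sketch with made-up variable names; both give the same string, and the only difference discussed above is whether CPython has to materialize the iterable first.

```python
numbers = range(10)

# Build the pieces as a list and join once, instead of repeated `+=`.
pieces = [str(n) for n in numbers]
from_list = "".join(pieces)

# A generator expression gives the same result; per the discussion above,
# CPython turns it into a sequence internally before joining.
from_generator = "".join(str(n) for n in numbers)

print(from_list, from_generator)
```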
the only real alternative here is to do more like what a list would do, and treat it as a series of extend calls
then you don't need to create a separate array
but then you will be doing reallocation
yeah, I am pretty sure strings don't support overallocation since they are immutable
there was just a whole discussion about this
I think that while your point makes sense, I don't think it's true
because CPython as an internal optimization will actually mutate strings if there's only one reference to them
I checked the CPython source, strings don't have an additional capacity attribute
overallocation and opportunistically extending are two different things
Then maybe they really do just call realloc. But that would still be a mutation of course.
ye, it is odd that the exact reallocations end up good enough to be faster than any other method for concating strings
ye, a += "?" allocates exactly 1 extra byte for the ? afaik
reallocates
I mean, when you say "faster than any other method", it just happens to be the method chosen in python, a very slow language, I wouldn't read too much into it
In a GC language you worry a lot about memory usage too, not just performance.
the string concat is mostly implemented in C, so I would expect it to at least somewhat match expectations
The python list growth factor of 9/8 probably isn't very optimal for performance, but it is quite nicely conservative with memory usage
that is true
realloc probably helps a lot with smaller allocations, and most strings tend to be relatively small, unlike say lists, where assuming that lists are small will eventually make your users sad
But, all said and done, doing += on a string in a for loop should probably just be avoided
agreed
aye, relying on implementation details should always be considered bad form.
and if you're creating a 50MB string in memory and then dump the entire thing somewhere, there is a chance you're better off dumping it in chunks in the first place
Yeah, a contiguous string isn't usually the right data structure at that point
if your program is CPU bound to that extent, you are better off in rust/nim/probably even pascal tbh most of the time
Wow, so it doesn't even make a new object
You'd probably use a rope at that point
yeah, rope concat does tend to be faster
Strictly speaking if the refcount is 1, there is no reason to create a new python object regardless of what happens, allocation wise
there's always overhead with the python object itself
When the refcount is 1 presumably it simply calls realloc, which may or may not do a new allocation, and uses the memory from realloc
For user code there wouldn't be any reason to care whether the realloc was serviced by expanding or a new allocation
it's an alternate data structure for backing a string
the most common and simplest implementation is just a contiguous sequence of characters, an array
ropes are not contiguous
In computer programming, a rope, or cord, is a data structure composed of smaller strings that is used to efficiently store and manipulate a very long string. For example, a text editing program may use a rope to represent the text being edited, so that operations such as insertion, deletion, and random access can be done efficiently.
what's nice about them is that they're not entirely node based either, in the sense that the nodes can contain sequences of characters. This makes a lot of common string operations super super efficient. For example, concatenation becomes O(1). You just create a new rope with the first and second string in nodes.
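A toy version of that O(1) concatenation, to make the idea concrete. The `Leaf`, `Concat`, and `to_str` names are invented for this sketch; a real rope would also rebalance and support indexing and slicing, none of which is shown here.

```python
class Leaf:
    """A rope leaf holding a contiguous chunk of text."""
    def __init__(self, text):
        self.text = text
        self.length = len(text)

    def __iter__(self):
        yield self.text


class Concat:
    """An inner rope node; building it copies no characters."""
    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.length = left.length + right.length

    def __iter__(self):
        yield from self.left
        yield from self.right


def to_str(rope):
    """Flatten the rope into a regular string (O(n), done once at the end)."""
    return "".join(rope)


rope = Concat(Leaf("once upon "), Concat(Leaf("a midnight "), Leaf("dreary")))
print(rope.length, to_str(rope))
```

Each `Concat` only stores two references and a length, which is why joining two ropes does not depend on how long they are.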
I think it's used often to hold, say, all the text in a file, in an IDE/editor
not exactly
Well, yeah, that sort of thing is called a tree
Well, just a tree, tbh
It's a type of binary tree
trees are usually implemented as nodes with pointers to other nodes, though there are exceptions
so people don't usually call them "linked trees"
i have an implementation that pretty prints
In [1]: from sacks.sequences import Rope
In [2]: r = Rope('once upon a midnight dreary')
In [3]: r
Out[3]: Rope('once upon a midnight dreary', leafsize=8, type=str)
In [4]: r.prettyprint()
14
├─7
│ ├─7 - 'once up'
│ ╰─7 - 'on a mi'
╰─7
  ├─7 - 'dnight '
  ╰─6 - 'dreary'
In [5]: r[:5] = 'twice '
In [6]: r
Out[6]: Rope('twice upon a midnight dreary', leafsize=8, type=str)
I love those lines
wondering about thoughts on
B008: Do not perform function calls in argument defaults. The call is performed only once at function definition time. All calls to your function will reuse the result of that definition-time function call. If this is intended, assign the function call to a module-level variable and use that variable as a default value
From : https://github.com/PyCQA/flake8-bugbear
which is triggered by something like:
def f(x : pathlib.Path = pathlib.Path('something.txt')):
    x.write_text('this is pointless')
which gives ./c.py:3:27: B008 Do not perform function calls... (same error above).
What would the alternative be here ? Seems safe to ignore - not sure if i'm missing something though.
Default x to None and check in the function?
that'd defeat the point of a default arg?
should be safe ignore, but while I'm not sure why I don't really like creating the default like that
It'd still work like a normal default, unless you accept None as a value
feels a bit weird writing pathlib.Path twice, unless. you're specifically talking about passing a function result as a default arg
i don't really get what's up with using the result of a function as a default arg tho, thought maybe i was missing something
the return type may be mutable, or the function's return changes depending on other state
so you may get unexpected values if you don't do it intentionally
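A short sketch of the failure mode the check is guarding against, next to the two usual alternatives. All function names here are made up; `DEFAULT_PATH` follows the module-level-variable suggestion from the B008 message.

```python
import pathlib

def append_shared(item, bucket=[]):
    # The default list is created once, when `def` runs, and then reused.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # None as a sentinel: a new list is built on every call that omits it.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

DEFAULT_PATH = pathlib.Path("something.txt")  # evaluated once, explicitly

def describe(path: pathlib.Path = DEFAULT_PATH) -> str:
    # What B008 suggests when a shared default is intended.
    return path.name

shared = (append_shared(1), append_shared(2))
fresh = (append_fresh(1), append_fresh(2))
print(shared, fresh, describe())
```

With an immutable-in-practice default like a `Path`, sharing is harmless, which is why ignoring the warning in that case is a reasonable judgment call.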
i guess the assumption is that the function is understood
It tells you how to handle it if this is intended
i'm aware - like i said - i wasn't sure if this was commonly frowned upon and i was missing some pattern or whatever
Im not sure about function calls, default arguments only evaluated once is a common source of bugs tho, maybe theyre just trying to get you to avoid it altogether
meh, instantiating a class is technically a function call but semantically probably not the "spirit" of that check
fair.
I'm wondering if the use of %s for string formatting is generally not idiomatic now? When I see:
x = '%s' % something if k else 'no'
I want to change it to
x = f"{something}" if k else 'no'
not sure if both are generally considered alright though
f-strings are the best
f-strings are pretty cool
% is pretty not idiomatic, except that logging still uses it, probably forever
yeah - i was wondering if % would be considered as such at this point (to me they are).
re logging, hadn't thought of that (i don't log), weird that it does that
you don't use the operator there directly, but pass in the arguments to the log call and it uses % internally which won't change because of compat
logging "style" seems like such a strange and ill-thought-out feature
idk what compat is
unless it's shorthand for compatibility
it seems the reason is to do with not evaluating the string until the very last moment, i don't understand enough to know why % doesn't and f does tho
it was
because if the log message is never emitted, an f-string still is evaluated and has to construct the string
yes
f strings get evaluated immediately as they're read so it can't be done with them. But it could be done with .format; it just didn't exist when logging was put in place
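To make the laziness concrete, here is a sketch with a made-up `Expensive` class that counts how often it gets converted to a string. The logger name is arbitrary.

```python
import logging

class Expensive:
    """Counts how often it is converted to a string."""
    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive value"

logger = logging.getLogger("lazy_demo")   # hypothetical logger name
logger.setLevel(logging.WARNING)          # DEBUG messages are dropped

lazy = Expensive()
logger.debug("value: %s", lazy)           # args are stored, never formatted
lazy_calls = lazy.str_calls

eager = Expensive()
logger.debug(f"value: {eager}")           # the f-string is built before the call
eager_calls = eager.str_calls

print(lazy_calls, eager_calls)
```

Note that only the formatting is deferred. Any function call written in the argument list still runs before `debug` is entered.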
.format would be much nicer imo
you can use {}/.format-style strings https://docs.python.org/3/library/logging.html#logging.Formatter
those don't apply to the log messages themselves
but you can't do "{x} ... {y}".format( x = f(a), y = g(b)) can you, in logger? or can you
no, why would you?
https://github.com/python/cpython/blob/c786988b19d7b6808d4c940f7206b95e49e06b71/Lib/logging/__init__.py#L365-L368 the message factory applies % directly
Lib/logging/__init__.py lines 365 to 368
```py
msg = str(self.msg)
if self.args:
    msg = msg % self.args
return msg
```
why wouldn't you include a result of a function in a log
oh, wait really?
i thought you could control the log parameter style too
i guess no
i remember what it was
you can override the log record factory to use something other than % but you will break existing logging code https://docs.python.org/3/howto/logging-cookbook.html#using-logrecord-factories
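A different route from the record factory, and one that does not affect other code, is to pass an object whose `__str__` does the `{}` formatting. This works because, as the snippet quoted above shows, the message is built with `str(self.msg)`. The sketch below follows the pattern described in the logging cookbook; the class and logger names here are just for illustration.

```python
import io
import logging

class BraceMessage:
    """Defers str.format until the logging machinery calls str() on it."""
    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.fmt.format(*self.args, **self.kwargs)

stream = io.StringIO()                     # captures output for the demo
logger = logging.getLogger("brace_demo")   # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

logger.info(BraceMessage("{x} ... {y}", x=1 + 1, y="two"))
output = stream.getvalue()
print(output, end="")
```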
I just use f strings
Don't really care about the non laziness for the things I log
it's a bit of an unfortunate design
there are other reasons to use the logging framework
the laziness is kind of silly anyway, string-ifying is rarely an expensive operation
control over where output goes and how it's formatted, log levels, and named loggers are the main appeal for me
to have all that be independent of the program source code and controllable within each application is extremely valuable to me
you can liberally use logging all over your application and the user can turn it on or off at their discretion
and use lots of debug logging output that you don't have to delete later
okk..nice
You guys remember in The Martian when Matt Damon says "In the face of overwhelming odds I'm faced with only one option. I'm gonna have to science the shit out of this."?
I'm feelin that
@paper echo on the topic of logging, is there a way to have a logger queue its messages until a file handler is attached? I've got a situation right now where I'm measuring some runtime data, then creating the logs directory, then creating the logger, then logging the runtime data
I'd prefer to create the logger with no handlers, measure the runtime data, log the runtime data (it being placed into a cache at this point), then make the logs directory, then the handlers, and then have the cached messages get logged
Its just a better sequence, less backtracking (in my mind)
Speaking of logging, which is something I need to get adjusted to, does anyone know a good guide on incorporating logging into a module?
The logging cookbook has never steered me wrong
What's the best way to track development of Python? In particular - track new PEPs and change in their statuses. Ideally, with an RSS feed.
The python-dev mailing list
Don't they have a RSS feed for the mailing list?
I was trying to see if there was a tool for automatically generating an RSS feed from a mailing list, but to my disbelief I can't find an obvious approach
why does
```py
a = "hey"
def foo():
    print(a)
    a = 10
foo()
```
error?
error is
`UnboundLocalError: local variable 'a' referenced before assignment`
Python assumes any names which have been assigned to inside a function to be local before the function even runs, and there is no local value for a before the print
When you assign a name in the function, it becomes local to that scope and you have to use global/nonlocal to use the name from the outer scopes https://docs.python.org/3/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value
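A small sketch of the three cases side by side. The function names are made up; `broken` reproduces the error from the question.

```python
a = "hey"

def broken():
    print(a)   # `a` is local here because of the assignment below
    a = 10

def read_only():
    return a   # no assignment in this function, so the global is used

def rebinds_global():
    global a   # opt in to rebinding the module-level name
    previous = a
    a = 10
    return previous

try:
    broken()
except UnboundLocalError as exc:
    error_message = str(exc)

before = read_only()
previous = rebinds_global()
after = read_only()
print(error_message, before, previous, after)
```

The exact wording of the error message differs between Python versions, but the exception type is the same.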
I don't think you can do this as-described. You might have to implement your own handler that can enqueue/buffer messages and then dump them when requested
I disagree that this is better sequencing. If you have a main function or something like that, set up all the logging before running the logic
There might be performance reasons to do the logging in a separate thread or enqueue log messages though, so maybe you still want your thing
But if your program crashes 🪦 ☠️
I use 1 logger per module and just log a lot. Some kind of structured logging has been on my radar but i never got around to experimenting with it
That's the thing though, I can't set up the handlers until after the logic is performed
It's all happening in a single thread though. It's just that the location of the logs directory is OS/runtime specific, and so the local environment has to be mapped before the logs directory is created
So it's measure runtime -> create logs directory -> create logger -> log runtime details
Why not create log dir, create logger, then do whatever
I'd prefer create logger -> measure runtime -> cache runtime details -> create logs directory -> create file handlers -> write cache
Okay. With the disclaimer that that work flow doesn't really align with loggers in general, whose aim is to log things as they happen, you could always do a workaround if you really wanted.
Create a temp or in-memory file to log to, make a logger. Log to this temp file.
Then at the end of the run, copy the contents out to the actual log locations and call it good
def startup():
    logger = Logger('Application')
    runtime = RuntimeAuditor()
    runtime.measureEnvironment()
    runtime.logEnvironment()  # <- message is cached
    runtime.createLogsDirectory()
    logger.createFileHandlers()  # logs directory now exists, so handlers can be created
    logger.writeCachedMessages()
    # other stuff, logged as it happens
def startup():
    runtime = RuntimeAuditor()
    runtime.measureEnvironment()
    runtime.createLogsDirectory()
    logger = Logger('Application')
    runtime.logEnvironment()
@visual shadow its not really a problem, its just sequencing. But I'm trying to set things up to 'log as I go' as seamlessly as possible. These are my potential sequences
I think always setup logger before doing anything that you want to log.
I get that, but I think I need to reiterate. I can't make the logger until I know where the logs directory will go
You can log to an intermediate place
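One possible "intermediate place" from the standard library is `logging.handlers.MemoryHandler`, which buffers records and forwards them to a target handler that can be attached later with `setTarget`. The sketch below is an assumption about how it could be wired up for this case, not something suggested in the chat: the logger name is arbitrary, and a `StreamHandler` on a `StringIO` stands in for the eventual `FileHandler`.

```python
import io
import logging
from logging.handlers import MemoryHandler

logger = logging.getLogger("startup_demo")   # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False

# No target yet: records just accumulate in the buffer.
# flushLevel is set above CRITICAL so no record triggers an early flush.
buffered = MemoryHandler(capacity=1000, flushLevel=logging.CRITICAL + 1, target=None)
logger.addHandler(buffered)

logger.info("runtime measured before any log directory exists")

# Later, once the logs directory is known, attach the real handler.
stream = io.StringIO()                       # stand-in for a FileHandler
buffered.setTarget(logging.StreamHandler(stream))
buffered.flush()                             # replay the buffered records

output = stream.getvalue()
print(output, end="")
```

Records logged after the flush keep going through the buffer, so in a real program one would either flush periodically or swap the `MemoryHandler` for the file handler once it exists.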
So to re-quote Matt Damon, in the face of overwhelming odds I'm faced with only one option - to science the shit out of it. Your suggestion is one option, a custom logger class with a delay function is another. And my original question was just to see if you guys knew of a built in way to handle this
(Its not that serious, I just like that quote XD)
"delay functions" just aren't really a thing for loggers, hence the reactions
XD Fair enough
in most cases, nothing significant happens before you know where to log
If significant stuff does happen, then you'd have to weigh your options. In C++ where it's a bit more involved getting the logging going, I have the global logger log to stdout (or is it stderr) by default. Then the python script that launches the C++ redirects stdout/stderr to a particular file.
The C++ eventually switches the logging to a proper file but catching stdout/stderr like that is useful anyway
in case of a crash etc
In python life tends to be simpler, and usually I don't need logging to parse a config file or command line options, which is the only thing happening before you know where to log
Python, my first and truest love
@static bluff i don't quite understand the need to do it like you describe. it sounds like you're still doing a lot of weird stuff just to be different or because it reminds you of how things work in other languages
the only reasons i can see for deferring logging to disk are: 1) performance, 2) thread safety
it seems like your concern is just stylistic? in which case i think maybe you should be using common lisp instead of python, where you can write your own arbitrary (and arbitrarily convoluted) abstractions
that said i am very interested in how to make logging better, thread-safe-er, and non-blocking
and i'd be really curious about how people do logging in high-concurrency applications and such
Logging is thread safe right?
what do you mean by "thread safer", it is thread safe
streamhandler just has an internal lock right?
probably
Yeah I'm guessing they use locks to achieve safety
iirc there were cases when log messages would mess with each other or conflict with the program's other outputs to stderr/stdout
that shouldn't happen with threads
but either way, you have to now stop your event loop (for example), acquire a lock, do i/o, then release the lock, then proceed
it'd be nice to dump a log message to a queue and have it eventually go to disk or whatever
i know there's a QueueHandler but i haven't messed with it
and i haven't seen anyone use it
well, this is python, so the logger is logging in the same thread, so it's already really inefficient
right
Yeap, queue logger would do it, but eh I have never needed to use it yet atleast
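For anyone who hasn't tried it, the `QueueHandler` / `QueueListener` pair looks roughly like this. It is a sketch only: the logger name is arbitrary and a `StringIO` stream stands in for whatever slow destination the real handler would write to.

```python
import io
import logging
import queue
from logging.handlers import QueueHandler, QueueListener

log_queue = queue.Queue()
stream = io.StringIO()                       # stand-in for a slow destination
slow_handler = logging.StreamHandler(stream)

# The listener runs in its own thread and does the actual I/O.
listener = QueueListener(log_queue, slow_handler)
listener.start()

logger = logging.getLogger("queue_demo")     # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(QueueHandler(log_queue))   # emit() only enqueues the record

logger.info("handled off the calling thread")

listener.stop()                              # processes what is queued, then joins
output = stream.getvalue()
print(output, end="")
```

The calling thread still pays for creating the record and putting it on the queue, but not for the handler's I/O.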
in high performance situations, the logger is in another thread
well, depending what kind of performance you need, I should say
but if latency for your main event loop is the main concern, then yeah
true, it is the kind of thing where if you need performance you might not want python
"might"
hey, pypy, graalpython, and cython all smoke cpython on even nontrivial code
I mean tons of languages smoke pypy etc
i've found that cython basically gives you a 2x runtime speedup just by copying and pasting your python code without any changes
At any rate, it really depends how far down you want to go. At its basics, you'd typically have the logger running in its own thread, and then you have a MPSC lock free queue, if you wanted
just on principle i'd be happy to have something like that
maybe even within a structured concurrency framework like trio or anyio
but even then there are still optimizations, e.g. you don't want to do any string formatting in main threads, only in the logging thread. But I've seen even C++ loggers that advertise themselves as high performance miss that optimization so eh
i also want a logger that accepts a callable
yeah, that is the nice way to do logging
unfortunate that python's syntax for it is kind of shitty
90% of the time my "expensive" pre-logging work can't be easily encapsulated in "string formatting", unless i start wrapping things in callable LoggerPrepare-er things that implement the exact __str__ method i want
that could be a nice abstraction actually
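That wrapper idea can be sketched in a few lines. `Lazy` and `expensive_summary` are made-up names; the point is only that the callable runs when something actually calls `str()` on the wrapper.

```python
import logging

class Lazy:
    """Wraps a zero-argument callable; calls it only when str() is needed."""
    def __init__(self, func):
        self.func = func

    def __str__(self):
        return str(self.func())

work_done = []

def expensive_summary():
    work_done.append(True)        # stands in for costly pre-logging work
    return "summary"

logger = logging.getLogger("callable_demo")   # hypothetical logger name
logger.setLevel(logging.WARNING)

# DEBUG is disabled, so the record is dropped and the callable never runs.
logger.debug("state: %s", Lazy(expensive_summary))
calls_when_disabled = len(work_done)

# This is the formatting step an enabled logger would perform.
rendered = "state: %s" % Lazy(expensive_summary)
print(calls_when_disabled, rendered)
```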
gotta envy Kotlin, logger.info { "Lazily evaluated string " }
it has lambdas
It seems that way and Ruby basically contorted itself back in the day to have syntax like this, "blocks"
Sorry for not responding, was having a smoke. Its not so much that I want to be different, as I have nooooooooo clue what I'm doing. I've been coding long enough to be good with the language but software architecture is new to me. That, and there isn't exactly a for-dummies book for a project like mine. I'm gonna start school in the fall! Beyond that, there is a method to my madness. I'm inching up on a critical juncture in this project and then I was going to share it here and hopefully get some feedback
but it's actually really simple
{} is a lambda. If you pass a lambda as the last argument to a function, it doesn't have to be inside parens. If its the only argument, you can drop the parens entirely. That's it.
is there no async logger? i dunno i've never logged anything
Swift does exactly the same thing.
heh, crystal went the other direction and followed ruby, no first-class functions of any kind!
but honestly i don't miss them in crystal
Kind of amazing how they achieved the same result as Ruby but with a one sentence explanation instead of endless blog posts about procs vs blocks
It's just very cool because obviously in these brace languages, keywords look like if (something) { ... }
and now, your lambda syntax can basically match that, so you can write really nice looking code and DSL's, that's still extremely easy to reason about
no, but it's not that hard to do apparently
https://docs.python.org/3/library/logging.handlers.html#queuehandler
https://github.com/innosoft-pro/python-async-logging-handler
if (something) {...} is the same as if(something, {...})?
or is it if({...}, something)?
Yeah, the logging cookbook even has a multi process recipe for logging, it's essentially async since all the IO happens in the logging "server", the handlers just send the messages to the logging server
well if is a keyword
but you can write a function my_if
my_if (something) { ... } is the same as my_if( something, { .... })
yeah
cool, julia has this too https://docs.julialang.org/en/v1/manual/functions/#Do-Block-Syntax-for-Function-Arguments
Yeah. It's just a brilliant idea, IMHO.
i always forget to read the logging cookbook when i have questions
It's crazy how much mileage you get from a handful of simple ideas like this in Kotlin, IMHO. The nice lambdas + extensions enable so much.
does if even have to be a keyword at that point?
for practical reasons it's better if it is
i can see why it'd be a "primitive" function, i guess you want it to be a reserved word so people don't overwrite the if function lol
other than that though you can write your own if complete with else
that works like the real one, just with some syntactic limitations
even in common lisp if is a primitive (a "special form")
actually, maybe not quite, there are some challenges in getting the else to work
Yeah, I mean you do need some primitives
and you need something like if to implement if anyway
so instead of making something like if the primitive, just make if the primitive
however some lisp primitives are definitely redundant
Can't you make an array of two functions and just index into it with the boolean casted to an integer? Or is it impossible in java?
e.g. you don't actually need let, or let*
you can recreate let/let* by creating a lambda and passing the arguments into the variables and then calling it immediately
@grave jolt if you can cast the boolean to an integer, then sure
although that's a pretty ugly feature, IMHO :-). If you don't have the cast then of course, you're just back to figuring out how to convert false to 0 and true to 1 without an if
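In Python terms (the conversation is about Kotlin and the JVM, so this is only an analogy), the indexing trick looks like the sketch below. `my_if` is the hypothetical function name from the messages above; the `bool()` call is exactly the "cast" being discussed, so the truth test has not disappeared, it has just moved.

```python
def my_if(condition, then_branch, else_branch):
    """Pick a branch by indexing with the condition; no `if` statement used."""
    branches = (else_branch, then_branch)
    # bool is an int subclass, so False indexes slot 0 and True slot 1.
    return branches[bool(condition)]()

result_true = my_if(3 > 2, lambda: "then", lambda: "else")
result_false = my_if([], lambda: "then", lambda: "else")
print(result_true, result_false)
```

Passing the branches as callables matters: only the selected one is executed, which is what makes this behave like a conditional rather than evaluating both sides.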
What about a hashmap? In JVM you can put booleans as keys, right?
of course
in that case you don't need a primitive if
but then that implies you can implement an entire hash map without if....
im too tired for dissssssssssssss
!ot @fresh wave If you want an off-topic conversation, we have off-topic channels
Off-topic channels
There are three off-topic channels:
• #ot0-psvm’s-eternal-disapproval
• #ot1-perplexing-regexing
• #ot2-never-nester’s-nightmare
Their names change randomly every 24 hours, but you can always find them under the OFF-TOPIC/GENERAL category in the channel list.
Please read our off-topic etiquette before participating in conversations.
whats dat
I think hashmaps are implemented natively, not in Java on the JVM
...btw, let's move that to offtopic
I guess, it would still be very bizarre to choose to make hash maps "built ins" but not if
I'd like to register implementations of an abstract base class, but something weird is happening... First, if a child class of an abc doesn't implement an abstract method it will be in the __abstractmethods__ attribute of it:
In [36]: class Spam(ABC):
    ...:     @abstractmethod
    ...:     def ham(self):
    ...:         ...
    ...:
    ...: class Nee(Spam):
    ...:     ...
    ...:
    ...: class Newt(Spam):
    ...:     def ham(self):
    ...:         ...
    ...:
In [37]: Nee.__abstractmethods__, Newt.__abstractmethods__
Out[37]: (frozenset({'ham'}), frozenset())
So, to register only implementations of Spam one might think you could just look at this attribute:
In [39]: class Spam(ABC):
    ...:     registry = { }
    ...:     def __init_subclass__(cls):
    ...:         if not cls.__abstractmethods__:
    ...:             Spam.registry[cls.__name__] = cls
    ...:
    ...:     @abstractmethod
    ...:     def ham(self):
    ...:         ...
    ...:
In [40]: class Ham(Spam):
    ...:     ...
    ...:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-40-47f12c5bcb46> in <module>
----> 1 class Ham(Spam):
2 ...
3
But this blows up.
I'm wondering if there's a nice solution, as it is I'm just going to filter out the abstract classes afterwards.
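One possible way to keep `__init_subclass__`, offered as a sketch: as far as I know, since Python 3.7 `inspect.isabstract` has a fallback for classes whose `ABCMeta.__new__` has not finished yet, which is the state a class is in while `__init_subclass__` runs. Relying on that fallback is an assumption about the interpreter version, not something established in this conversation.

```python
import inspect
from abc import ABC, abstractmethod


class Spam(ABC):
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # cls.__abstractmethods__ is not set yet at this point, so ask
        # inspect.isabstract, which inspects the abstract methods by hand.
        if not inspect.isabstract(cls):
            Spam.registry[cls.__name__] = cls

    @abstractmethod
    def ham(self):
        ...


class Nee(Spam):  # does not implement ham, stays abstract
    ...


class Newt(Spam):  # implements ham, gets registered
    def ham(self):
        return "ham"


print(Spam.registry)
```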
it does, super().__new__ is called before _abc_impl
so, possibly this could be solved with having two separate base classes
SuperSpam defines init subclass
could try
doesn't seem to help. I guess it's an issue of the relative ordering of the metaclass new
and init subclass
probably my next try here would be to use ABC meta instead
Create a new metaclass that does the ABC meta stuff first, and then decides whether to register it
mixing meta classes
yeah, could just inherit ABCMeta though
In [44]: class RegisterImp(ABCMeta):
    ...:     def __new__(mcls, name, bases, namespace, **kwargs):
    ...:         cls = super().__new__(mcls, name, bases, namespace, **kwargs)
    ...:         if not cls.__abstractmethods__:
    ...:             cls.registry[cls.__name__] = cls
    ...:         return cls
    ...:
In [45]: class Spam(metaclass=RegisterImp):
    ...:     registry = { }
    ...:     @abstractmethod
    ...:     def ham(self):
    ...:         pass
    ...:
In [46]: Spam.registry
Out[46]: {}
In [47]: class Ham(Spam):
    ...:     ...
    ...:
In [48]: Spam.registry
Out[48]: {}
In [49]: class Ham(Spam):
    ...:     def ham(self):
    ...:         pass
    ...:
In [50]: Spam.registry
Out[50]: {'Ham': __main__.Ham}
this does the trick
yeah, i did the same except I tried to create a base class for easier use
but then you have to filter it out
class RegisterABCMeta(ABCMeta):
    def __new__(cls, name, bases, dct):
        c = super().__new__(cls, name, bases, dct)
        if not c.__abstractmethods__ and c.__name__ != 'RegisterABC':
            c._registry[c.__name__] = c
        return c

class RegisterABC(metaclass=RegisterABCMeta):
    pass

class Spam(RegisterABC):
    _registry = dict()

    @abstractmethod
    def ham(self):
        ...
i remember having to do this silly "filtering" when I was doing registration, before I started using init subclass
I don't know a better way
i don't think you need to filter it -- the base class should have an empty frozenset so shouldn't be registered
it's also arguably nicer to pass the registry as an argument, I think
it got registered in my case, not sure why
wait, what do you mean, empty frozensets do get registered
we register everything that has no abstract methods
In [54]: bool(frozenset())
Out[54]: False
oh, i used the wrong metaclass
    ...:         if not cls.__abstractmethods__ and bases:
    ...:             cls.registry[cls.__name__] = cls
can check for bases instead
I tried to pass the registry as an argument but I can't quite remember how to pass it along, I use this stuff once in a million years
that's a good idea
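For passing the registry along as an argument, one option is a class keyword that the metaclass consumes. This is only a sketch: the `registry=` keyword and the `_registry` attribute are names chosen for this example, and a concrete class created without any registry in its hierarchy would fail with an AttributeError.

```python
from abc import ABCMeta, abstractmethod


class RegisterImp(ABCMeta):
    def __new__(mcls, name, bases, namespace, registry=None, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        if registry is not None:
            cls._registry = registry   # the root class supplies the dict
        if not cls.__abstractmethods__:
            cls._registry[name] = cls  # subclasses inherit _registry
        return cls

    def __init__(cls, name, bases, namespace, registry=None, **kwargs):
        # Accept the same keyword here; it is not needed after __new__.
        super().__init__(name, bases, namespace, **kwargs)


implementations = {}


class Spam(metaclass=RegisterImp, registry=implementations):
    @abstractmethod
    def ham(self):
        ...


class Nee(Spam):   # still abstract, so not registered
    ...


class Newt(Spam):  # concrete, so registered
    def ham(self):
        return "ham"


print(implementations)
```

Because `Spam` itself still has abstract methods, no extra "is this the base class" filter is needed here.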
I wanted to get you guys' thoughts...
def __init__(self, **settings: Union[None, bool, int, float, str]):
    Controllers.application = self  # globalize the main application object

    # NOTE: Initializing the application requires a 'pre-init' procedure, to startup logging and
    # the global exception handler. This has to happen before anything else so that errors
    # during startup and debug startup details can be recorded
    Controllers.console = Console('Applcation', delay=True, **settings)
    Controllers.runtime = RuntimeAuditor(**settings)  # measure runtime, file structure, etc
    Controllers.runtime.createLogsDirectory()  # logging cannot begin without the logs directory
    Controllers.console.createFileHandlers()  # begin logging
    Controllers.runtime.logRuntimeDetails()
    Controllers.excepthook = ExceptionHandler()  # override built-in sys.excepthook
    Controllers.excepthook.logInitialSettings()

    # the application's pre-init procedure is complete - execute callbacks
    self.dispatchEvent(Event('pre-init'))

    # initialize wxPython - 'internal-pre-init' and 'internal-init' callbacks are executed
    self.applicationInternal = self.ApplicationInternal(self)
    self.logApplicationInitialSettings()  # log wxPython startup details
    Controllers.settings = SettingsController(**settings)  # compute initial global settings
    Controllers.settings.logInitialSettings()

    # On OSX - create an application-level 'master' toolbar
    # On Windows/Linux - create a default toolbar that is copied and attached automatically to
    # every window (optionally, on a per-window basis)
    Controllers.toolbar = Toolbar(master=True, **settings)
    Controllers.toolbar.logInitialSettings()

    # On OSX - create an application-level 'master' accelerator table
    # On Windows/Linux - create a default accelerator table that is copied and attached
    # automatically to every window (optionally, on a per-window basis)
    Controllers.accelerators = AcceleratorTable(master=True, **settings)
    Controllers.accelerators.logInitialSettings()

    # the application's startup procedure is complete - execute callbacks and write a message
    Controllers.console.info('Initialization complete.')
    self.dispatchEvent(Event('init-complete'))
Sorry if the comments are a bit unclear, my back is killing me and I just wanted to get this up here
This is the best I can do to get around the logs-directory-not-existing issue we were talking about earlier
Not sure what you mean by "measure runtime". This code looks a lot like java though.
I'm not really sure what you need to do to "get around the issue" still. You just parse arguments/config, then create the logging directory if necessary, then initialize the logger. that's pretty much it.
The OS, and whether or not the application is bundled, among other things
If there's something specific you need to get around in the first place, it's not clear what it is
If that's the case, it means I've done my job right
It's that the logs directory goes in a different place depending on the OS. Within the app bundle on OSX, in an application support directory on Windows, and in the cwd if running as an unbundled script
All told, there are like, 10 - 15 steps that have to be taken to sort of 'measure' the local environment
Little baby steps, nothing difficult
okay, but it seems like that's the only one relevant to logging?
For logging it's: bundled or not, what OS, get the current working directory, based on those three values get the 'external' root of the application - where the logs directory will go and possibly other support directories - possibly make an application support directory, make the logs directory, and then make the master log file
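As a sketch of how small those logging-related steps could be when written as plain functions: every name and parameter below is hypothetical, and treating "bundled on anything other than macOS" like the Windows case is an assumption, since only the three cases above were described.

```python
from pathlib import Path


def choose_external_root(platform: str, bundled: bool, cwd: Path,
                         bundle_dir: Path, support_dir: Path) -> Path:
    """Pick the directory that will contain logs/ (all parameters hypothetical)."""
    if not bundled:
        return cwd          # unbundled script: current working directory
    if platform == "darwin":
        return bundle_dir   # macOS: inside the app bundle
    return support_dir      # Windows, and by assumption anything else


def ensure_logs_dir(root: Path) -> Path:
    # Creates root/logs (and root itself) if missing; safe to call repeatedly.
    logs = root / "logs"
    logs.mkdir(parents=True, exist_ok=True)
    return logs
```

With the directory in place, a file handler can be attached to the logging module as usual.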
That's like, half the steps performed by the runtime auditor, so I figure there's no point splitting its work into two
But please- I posted it so I could get a general critique. I'm trying to learn to do this stuff right
I would still split it up, not much downside
but if you feel like you don't need logging there then it doesn't matter much either way
Well, Applcation is misspelled. Overall it feels very strange to me to be working with classes and instance attributes so much in the init function, like I said before, it feels like Java
I don't know whether Controllers et al is code you wrote yourself or if it is some framework that expects to be used in a certain way, if it's the latter then that's just life
**settings: Union[None, bool, int, float, str] shouldn't that be hinted as a dict?
Only if you want your values to be dicts
doesn't ** return a dict?
ah
"When using the short form, for *args and **kwds, put 1 or 2 stars in front of the corresponding type annotation. (As with Python 3 annotations, the annotation here denotes the type of the individual argument values, not of the tuple/dict that you receive as the special argument value args or kwds.)"
nvm then
yeah I had to look that up because I don't usually hint settings
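A small illustration of that rule, as a sketch: `configure` and `Setting` are made-up names for this example.

```python
from typing import Dict, Union, get_type_hints

Setting = Union[None, bool, int, float, str]


def configure(**settings: Setting) -> Dict[str, Setting]:
    # The annotation describes each keyword *value*; inside the function,
    # `settings` itself is a plain dict keyed by the argument names.
    return settings


print(configure(debug=True, retries=3))
print(get_type_hints(configure)["settings"])
```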
I'm curious what you mean by this. I have no idea how an application would normally be built in Python so I'm just doing what feels natural. Also keep in mind that GUI applications work a bit differently than straight up programs
But in general, is Python not an object oriented language which should therefore take advantage of an object oriented approach? What would you do differently?
def main():
...
I would say that python is a mixed paradigm language. You use OO when it makes sense, use procedural when it makes sense, etc
are you just using Controllers as a namespace?
@signal tide Also, I looked it up, sadly there is no way to hint **kwargs properly, which is pretty surprising
there's no way to currently use a typed dict, in particular
yeah, to get around circular imports. Lots of these toolset objects I'm using have mutual dependencies, so I'm doing an indirect import sort of thing.
Err, if you have circular imports, more likely you want to target that problem directly
i agree it seems like you might have an issue with your project structure
There's nothing wrong with two toolkits referencing each other - assuming that's done explicitly and with the care that requires
it's not about "wrong with". Circular dependencies aren't good, all other things being equal. That doesn't mean it's never the right decision, but it rarely is.
I also don't know what you mean by a "toolkit"
For example, the RuntimeAuditor needs to be able to use the console to log the runtime details, and the console needs to access a few attributes on the auditor - just as a formatting thing
So usually it'd be better to "break out" some of that stuff so that you can avoid the circular dependency
in one or even both directions
do you need separate controllers for each object?
If the auditor needs to be able to log, don't give it the whole console, or any of the console in fact
just use the standard hierarchical logging approach, logger = logging.getLogger(__name__)
If the console just needs a few attributes from the auditor, don't make console depend on auditor; just take in the attributes you need as arguments. Then whoever is creating both auditor and console can pass those arguments along.
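A minimal sketch of that decoupling, under the assumption that the auditor only needs to log and the console only needs a value or two for formatting. The class bodies, attribute names and the `build` function are hypothetical, not the code being discussed.

```python
import logging

logger = logging.getLogger(__name__)  # one logger per module, no console object needed


class RuntimeAuditor:
    def __init__(self, os_name: str, bundled: bool):
        self.os_name = os_name
        self.bundled = bundled

    def log_runtime_details(self) -> None:
        logger.info("os=%s bundled=%s", self.os_name, self.bundled)


class Console:
    # Takes only the values it formats with, not the auditor itself.
    def __init__(self, name: str, os_name: str):
        self.name = name
        self.os_name = os_name

    def banner(self) -> str:
        return f"[{self.name} on {self.os_name}]"


def build():
    # Whoever creates both objects passes the needed attributes along.
    auditor = RuntimeAuditor(os_name="darwin", bundled=False)
    console = Console("Application", os_name=auditor.os_name)
    return auditor, console
```

Neither class imports or references the other, so there is no cycle to work around.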
Functionally, I see little difference between that route, and storing the logger in a controllers namespace and referencing within the auditor controllers.console.whatever()
Not sure what that's in response to
the first one
still not sure...
Basically, here's my thought
to follow up on this, **settings: Union[None, bool, int, float, str] becomes settings: Dict[str, Union[None, bool, int, float, str]] inside the function
I've broken my API, public and private, into small manageable pieces - utility objects I loosely call controllers. For every group of similar functions there's a controller, you know, a toolbox object. It's object oriented, clean, organized. But it makes perfect sense to me that a) different parts of the API are going to need to be able to access each other all the time and b) that a namespace for storing all these utility objects so 'the api' can be accessed from outside my program's internals seems, I dunno, useful?
I recognize that circular imports are a big warning sign, especially when they sneak up on you, but I've organized things this way on purpose. It makes perfect sense to me. That said, I'm on here to learn from you all
the "namespace" is the package system
I have to be completely honest. You say "you're here to learn from us all", but it usually feels more like you already made up your mind and you want a bullet proof thesis to convince you to do it another way
is this good code layout or spaghetti code?
I'm not going to provide you with said bullet proof arguments, it's too time consuming
someone give an opinion on my code pls!
Well I've got to be honest with you quick, I've tried to implement a lot of what you've suggested, only to come full circle after others have told me that your approach is backward- or too pure at least
:|
looks okay to me
Not sure who the "others" are, I pretty much try to stick to not-very-controversial suggestions with you (because even those become controversial)
When I say I'm here to learn, tbh my man, learn from everyone but you. You're also really, quite unpleasant. If we're being honest
Saying something like "just use logger = logging.getLogger(__name__)"
this isn't some crazy opinionated thing, this is just the 99.9% standard way of using logging
Save it quick, we've given each other enough of our time
ehm, first you have to write your code according to PEP standards. You name your functions in a completely wrong way
dude... you never gave me anything lol
Just be clear on that
Nor do you get to tell people to save it in a public channel. Be clear on that too.
Reaching out to you specifically to ask your opinion, spending a good hour talking with you, and then going away and spending a week trying your approach only to be walked back from it by others on here isn't giving you some of my time?
i got a quick question/problem that keeps biting me in the ass, does anyone know how i can get around this:
(this is just a quick example but same concept)
class Person:
    def __init__(self, name: str, friends: List[Person]):
        self.name = name
        self.friends = friends
every time i do something like this i see class Person is not defined on line 2 and i'm betting there's something in the typing library that can help or something like that
No, it's not. I have nothing to gain from our interactions. I was trying to help you, that is all. "Giving" implies that the other person is "getting" something
this belongs in #software-architecture
Like I said quick, save it
Like I already said, that's just not a thing you can say here
uh anyone know
And @unkempt rock RotCheck confusingly named, it does not check, it makes sure the rotation is correct. It should be called internally by Rotate
@quasi hound use from __future__ import annotations at the top of your file
Or use List['Person'] instead
isn't future for python2 only
No
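The quoted form, as a small runnable sketch. The other option mentioned above, `from __future__ import annotations`, is available on Python 3.7+ and has to be the first statement in the file, which is why it is not shown in this snippet.

```python
from typing import List


class Person:
    # "Person" is quoted because the name is not bound yet while the class
    # body (and therefore this signature) is being evaluated.
    def __init__(self, name: str, friends: List["Person"]):
        self.name = name
        self.friends = friends


alice = Person("alice", [])
bob = Person("bob", [alice])
print(bob.friends[0].name)
```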
For the record, my setup is quite similar to that of wxPython, a framework with a long history and a good reputation. And most of the people I talk to have no problem wrapping their minds around it. Yet, most of the people I talk to have no idea what you're talking about quick, when I relay what I've tried to learn from you. I know it's probably very rude of me to say this, but I'm pretty sure it's you who's programming from a minority perspective. And I'm certain it's you who's not trying to have an open mind. I certainly am trying. I only say this because you've actually made me feel like I'm a total idiot in the past, only to speak to, like, anyone else on here pretty much and be told that while I've got some strange habits, my approach is sound
If you talk to people that think that having a bunch of top level objects in some namespace all depending on each other in circular fashion is sound, then great, I guess you can decide to listen to those people. Not sure what I can add to that.
No, me neither.
great, let's stay on topic going forward
rotations of >360 are possible, they just look weird (at least for this implementation)
but correct is more fitting anyway, i will change it
Try asking in #software-architecture. The code shown here still isn't following PEP 8
its fully debugged now
also sry i did not know
Looking for feedback on advanced array slicing. We're debating if it should be included in the subset of python supported by py2many (transpiler). Any data on how prevalent its use is?
https://twitter.com/arundsharma/status/1415038272183488513
https://github.com/adsharma/py2many/issues/422
Advanced array indexing in https://t.co/IPWtp4wDiG as in:
a[1:4] or a[:-1]
I don't have any data. I can just offer that in my personal usage, negative indices into slices is very very common
the three argument form of slicing, I almost never use
I have only ever used the 3 arg form of slicing for verifying tic tac toe wins, but 2 arg slices with negative indices I have done quite often.
especially in strings, it's just super common. "give me the last 4 characters" s[-4:]
I'd say almost all uses of step will be reversing, but negative indices are very handy and commonly used
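For reference, the forms mentioned in these replies look like this. The sample data, including the flat 3x3 board layout, is made up for the illustration.

```python
s = "internals-and-peps"
items = [0, 1, 2, 3, 4, 5, 6, 7, 8]

last_four = s[-4:]            # "give me the last 4 characters"
all_but_last = items[:-1]     # negative stop index
middle = items[1:4]           # plain two-argument slice
reversed_items = items[::-1]  # the most common use of a step

# Three-argument slices on a flat 3x3 tic-tac-toe board (row-major order).
board = list("XOXOXOOXX")
main_diagonal = board[0::4]   # cells 0, 4, 8
anti_diagonal = board[2:7:2]  # cells 2, 4, 6
first_column = board[0::3]    # cells 0, 3, 6

print(last_four, middle, main_diagonal)
```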
there are over 136 instances of it in the django codebase
Thanks. That's a useful data point.
Not what i really meant. Let me explain:
You should name your functions in a snake_case format def rot_check:, but you use CamelCase format that is for class class Newclass:
Check this out: https://www.python.org/dev/peps/pep-0008/
If you are lazy to follow pep standards - other developers won't read your code.
Your code itself looks fine, but try to follow standards. Use flake8 and black for that.
but RotCheck() looks nice tho
It is a function, so it is supposed to be in snake_case: rot_check()
yes but Unity has functions such as MoveTowards (in cs)
is this annotation local to python?
yes, C# has a different naming convention than python
few languages have the same naming conventions
i will change it
at least python has a unified convention, C++ people can't agree about anything
yeah, it's nice that pep8 exists
same with rust, you just cargo fmt and you're done
most languages these days have auto formatters
anyone know the reason why (1,2) != (1,2,None)?
and even weirder (1,2, None) < (1,2)
=> False but (1,2, None) > (1,2) => True
None > nothing
Tuple comparison first compares the individual elements; if they're all equal, the longer tuple is "greater"
Not sure why the tuples should be equal
It's the same rules you use to sort words, "lexicographically".
because None stands for nothing?
It's still an additional element in the tuple
Same applies regardless what the 3rd element is.
if i wanted to compare length, i'd use len(tuple), but i'm trying to compare values of "parts"
not sure what the correct terminology is here
so i'd expect a comparison of nothing to None
You can filter out the Nones before comparing, but it would be odd that indexing two equal tuples might give back different results
Here's the description in the language reference:
https://docs.python.org/3/reference/expressions.html#value-comparisons
For two collections to compare equal, they must be of the same type, have the same length, and each pair of corresponding elements must compare equal (for example, [1,2] == (1,2) is false because the type is not the same).
Collections that support order comparison are ordered the same as their first unequal elements (for example, [1,2,x] <= [1,2,y] has the same value as x <= y). If a corresponding element does not exist, the shorter collection is ordered first (for example, [1,2] < [1,2,3] is true).
is there a mathematical reason for this behaviour or is that a shortcut?
Well with these rules any set of tuples will be sortable and behave consistently if their members do.
No, those behave consistently, and for versions it's what you want.
All (2, 7, X) versions are before (3, 0, 0).
yeah, but only if you have them neatly in a 0-padded tuple
No? The length doesn't matter if some elements don't compare equally
In [14]: (2, 7, 0) < (3,)
Out[14]: True
The length only comes into play when all corresponding elements are equal
If it was (2, 7, 0) < (2, ), then the corresponding elements are equal, so the second goes first.
The first would go first since it's longer
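For reference, here is what the rule quoted from the language reference gives for the cases discussed so far. When one tuple is a prefix of the other, it is the shorter one that is ordered first.

```python
# Elements are compared pairwise; the first unequal pair decides the result.
assert (2, 7, 0) < (3,)
assert (2, 7, 99) < (3, 0, 0)

# If one tuple is a prefix of the other, the shorter one is ordered first.
assert (2,) < (2, 7, 0)
assert not (2, 7, 0) < (2,)
assert (1, 2) < (1, 2, None)

# Equality needs the same length as well as equal elements.
assert (1, 2) != (1, 2, None)

# Strings and lists follow the same lexicographic rule.
assert "ab" < "abc"
assert [1, 2] < [1, 2, 3]

ordered = sorted([(3,), (2, 7, 0), (2,)])
print(ordered)
```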
well, if you parse things, you might end up with (2,3) < (2,3, None)
just as an example
Sure, in that case you'd want to avoid doing that.
you could argue that in this case one could work with 0-padding the tuples, but versions can have an infinite number of parts by definition
None is not "nothing".
If you want a different comparison behaviour, you could do it yourself, or make a class that implements something different.
so one would need to check the length of the tuples to get the padding right.. sigh
This is what tuples, lists and strings all do, it was chosen to be consistent.
yay for consistency.. if only versions were invented back then
I think this behaviour feels natural, especially when you're sorting collections
Well actually sys.version_info is a tuple of the Python version, and is used for comparisons often...
It might be meaningful to mention this comment in the tuple comparison code:
https://github.com/python/cpython/blob/main/Objects/tupleobject.c#L684-L689
Objects/tupleobject.c lines 684 to 689
/* Note: the corresponding code for lists has an "early out" test
 * here when op is EQ or NE and the lengths differ. That pays there,
 * but Tim was unable to find any real code where EQ/NE tuple
 * compares don't have the same length, so testing for it here would
 * have cost without benefit.
 */
which packaging Version tries to follow, which sucks
It usually would be the case the lengths would match for tuples.
ahhh
so it's a case of corner cutting :p
oh well.
what the comment mentions doesn't change the behaviour apart from its performance characteristics
well, it might make a difference with None
then you could indeed say None stands for "nothing" and compare true
Nothing is like any other object.
It'd be horrible if None didn't make a difference in comparisons, (None, 1, 2, None, None) is quite different from (1, 2)
In [7]: a = (1, 2)
In [8]: b = (1, 2, None)
In [11]: print(b[2])
None
In [12]: print(a[2])
---------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-12-434ad6561f14> in <module>
----> 1 print(a[2])
IndexError: tuple index out of range
also, equal tuples would not produce equal elements when you fetch the same index
Sounds like a question that'd fit in a help channel or #discord-bots if you're working with the Discord API, this channel is for discussions about the Python language itself
That would be easily solved by only looking at None-padding on the right
it's basically the same for left-0-padding on integers
001 is the same as 1, but 100 is not the same as 1
None is still a meaningful object in Python, even though it's usually used to represent "Nothing"
That's only for versioning though, isn't it? There are libraries to handle versions
packaging should handle versioning, but i'm not going into that
!e
from collections import namedtuple
Version = namedtuple('Version', 'major minor patch', defaults=[0, 0, 0])
print(Version(major=3, minor=6) == Version(3, 6, 0))
namedtuple gives you a way to do this neatly enough
@flat gazelle :white_check_mark: Your eval job has completed with return code 0.
True
It really sounds like you'd want the behaviour for a small subset of uses it gets while breaking everything else. Padding the tuple is not particularly difficult while checking for the trailing Nones would need to be done for every sequence and completely tank performance in some cases
The important idiomatic difference is that None is in fact a value and it makes sense to have it as a collection element. Unlike something like raku's slip which represents the absence of a value in a data structure, None is more often than not a meaningful result from an operation and as such it makes sense to consider it a discrete element than just padding.
With regards to numbers, math just ends up creating multiple ways to write the same number.
People complain a lot about Optional.map in java special casing null for a similar reason as to why it would be a really bad idea to have
tup == tup2 and tup[-1] != tup2[-1]
hold for some pair of native tuples.
doesn't javascript also have undefined as the true "absence of value" object, while null is like python None?
i'd still argue with the similarity to zero padding, where position and context determines meaning
well, even then [undefined, undefined] is different from [], whereas in raku [slip, slip] is indeed empty
because it slips into the array and you can construct some that actually hold elements, for example [slip(1, 2, 3), slip] is [1, 2, 3]
what purpose does that serve?
it means you don't have to [x for y in z for x in [y + 1, y - 1]] and instead do @z.map(-> $y {slip($y - 1, $y + 1)})
it is a convenient thing, but it makes things extremely complex to specify
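The Python side of that comparison, spelled out as a sketch. The list `z` is sample data, and the star-unpacking line is only a loose analogy for how a slip splices its elements into the surrounding list.

```python
from itertools import chain

z = [10, 20]

# Nested comprehension: each y contributes two elements to the flat result.
flat = [x for y in z for x in (y - 1, y + 1)]

# The same "flat map" spelled with itertools.
flat_chain = list(chain.from_iterable((y - 1, y + 1) for y in z))

# Star-unpacking inside a list display splices elements in; an empty
# iterable contributes nothing at all.
spliced = [*(1, 2, 3), *()]

print(flat, flat_chain, spliced)
```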
currently, in python (a, b) is pretty obvious in meaning, but if trailing Nones were ignored in some form, the semantics would be a bit more complex
and I mean, 0 padding is a syntax error in literals, so it is just int that considers that if it is given a specific base (10 by default), and there it is useful so that int('00001111', 2) works
Let me guess, you are a lua programmer amogorkon?
nope
What language are you getting this null padding from
ah, fair enough haha
how do i interpret this part: -> $y {slip($y - 1, $y + 1)}
i assume -> $y { } is a lambda
what language is that?
raku
oh ok, looks fairly Perl-like
it is what became of perl 6
i assume -> $y { } is a lambda <- is that equivalent to lambda y:..?
or is there a difference in what you can do?
yeah, it is about the same
I get this error please help usage: convert.py [-h] --labels LABELS [--noviz] input_dir output_dir convert.py: error: the following arguments are required: --labels
I tried this way
python convert.py input_dir C:\Users\Raghava\Downloads\cocojson output_dir C:\Users\Raghava\Downloads\cocomxl
This is the file I want to run
so what does slip($y - 1, $y + 1) do?
i think you should ask in a help channel. see #โ๏ฝhow-to-get-help
constructs a 2 element slip of $y - 1 and $y + 1
but what would a "2 element slip" be?
if slip is "absence of something"
i'd be happy to read the raku docs on this if you have a link
i've never seen this concept in any other language
I think I slightly misexplained slip. slip constructs an empty slip, which is a representation of the lack of value, but a slip may contain 0 or more values and it will slip those into whatever data structure it is part of.
I can't see how you're getting the null padding thing from maths. From a math perspective, tuples of different sizes are elements of different spaces, you can't say that they're equal without having some kind of mapping between the spaces
at least, that's the common way in which you'd discuss product spaces. If you want to have a space that includes tuples of all sizes, you can of course do that, but you'd still need to define equality
Maybe I just don't know the math you're thinking of; if you're thinking of e.g. category theory, I don't have any background in that.
so [slip(1,2), slip(3,4)] is [1,2,3,4]?
yes
that's kind of a wild feature
ikr
don't want to imagine how that's implemented
i said "similar to zero padding" like 0001 == 1 == 1,0000
nah, it's just as valid as our way, though i really wonder how we ended up doing things the opposite way lol
I think it's pretty different because numbers all live in a single space. The digits are just a representation; the actual value of the number is a summation over the digits.
With a tuple, there's no implied summation, it's a product space
although one could argue that a number is an infinitely large vector of digits with place values
yes, a vector, not a tuple
the definitions get murky there
and even for a vector, it's not really a vector, it doesn't obey the vector laws, if you consider the digits as components
Not really, these things have fairly rigorous definitions. vectors and tuples in math mean fairly specific things.
well, what you can do with either one of them depends on the context
so operations like vector multiplication don't have to be defined for all cases
Yes, but a context where a mathematician would say (1,2) = (1, 2, 0) is extremely rare
or non-existent
it's extremely different from your examples of zero padding numbers
not really, actually. sparse matrices come to mind
err, sparse matrices are usually represented by lists of 3 tuples (assuming a rank 2 matrix)
i'm just saying that "leaving stuff out and assume the rest is zero" isn't all that uncommon in maths
as a general approach, no, but that's different than talking about it in the specific context of tuples
as long as you have a well defined context
part of the definition of tuples, in math, is that they are ordered
for two tuples to be equal, the corresponding elements have to be equal
exactly, which is the point about context i made
sorry, I don't follow. But at any rate, I think the existing tuple behavior is pretty reasonable. The good news is that you can write a dataclass which works the way you want, which is better than the tuple even
since it has named fields
now you lost me. how would you go about it with a dataclass?
@dataclass
class Version:
    major: int
    minor: int = 0
    patch: int = 0
something like that?
I haven't seen that many versioning schemes with more than 3 parts, but if you wanted to support such things, then you could do it easily enough with a class that holds a list
I'm actually curious where you have seen versions with more than 3 parts
more than 3 are quite common with .postx releases
what's .postx?
well, my point is that even if i would restrict myself to 5 parts, which should cover all regular cases, i would not cover the specification
one sec
Public version identifiers are separated into up to five segments:
Epoch segment: N!
Release segment: N(.N)*
Pre-release segment: {a|b|rc}N
Post-release segment: .postN
Development release segment: .devN
interesting. I have almost never seen more than 3 used.
At any rate though, it seems to me like you'd want to have a class for this regardless, even aside from this aspect of it, there's also all the special bits of text that can appear or not
rc, etc
postN usually is used when you messed up with a detail in setup.py and only notice after hitting the enter button to upload to pypi
yeah, I'm just not sure offhand what the real benefit of it is over bumping the patch version in semantic versioning
Well, I didn't say either of those things
I just asked what the benefit was over bumping a patch
well, the distinction is that it's obvious you didn't fix a problem in the code but with the code release itself
ime, most projects not using semver, are generally using fewer numbers, and a less formal scheme (i.e. one with fewer "hard rules" about when to bump which number). Not more numbers, and more rules, which it seems like this is. Hence my curiosity.
hey, i didn't invent those rules. i really wished versioning was as simple as using tuples of integers and that's it
It's an interesting distinction. I guess I'd probably need more field experience to appreciate the trade-offs.
Well, you still have RC's
so it wouldn't quite be a tuple of integers. unless you think RC's are useless too, which I don't have strong opinions about either.
or 1.0-pre, 1.0-beta, 1.0~<git_commit> ...
packaging Version ignores those parts.. curious
Version("1.0-pre2").release
(1, 0)
pretty odd
maybe you change the docs or something but not the code
or there was a bug in your setup.py
then there's debian style epochs in the version
i noticed Version returns a plain tuple while internally they use a namedtuple, my guess is they ignore the stuff when they return the .release tuple in order to accommodate people who expect a tuple of ints?
sigh..
i suppose a better approach might be to have a class of VersionDigits that encapsulates those parts and only compares to other VersionDigits
without losing the info in the stringy parts
or plain refuse to parse to integers, but then sys.version_info becomes deprecated
it's a rock and a hard place situation
Using plain tuples for this stuff seems kinda silly, to me
I'm not sure how beholden you are to existing stuff
funnily, Version refuses to take plain tuples, so you can't even turn sys.version_info into a Version object directly
I would probably write a class from scratch that holds a list of integers + an enum (for alpha/beta/rc/final)
and put all the parsing logic inside its init, write == to work the way I wanted (in your case, if one list is shorter than the other, then the other list has to have all zeros to compare equal)
if it were up to me, i'd go for a tuple of int|tuple(int) or some other well defined, recursive structure ^^
so any additional part would really be meaningful
wait, why do you need a recursive structure
well, (1,0) > (1,) - which is weird
if you just had a class that held a list, then you could have those compare equal
also, tuple(int | tuple(int)) is not a recursive structure
i know, too lazy for a full spec
i meant that the existence of any additional sub-part would hold significance and shouldn't be discarded and must hold up to comparisons
from typing import List

class version:
    numbers: List[int]

    def __eq__(self, other) -> bool:
        common_length = min(len(self.numbers), len(other.numbers))
        if self.numbers[:common_length] != other.numbers[:common_length]:
            return False
        # Whatever extends past the common prefix must be all zeros.
        if any(x != 0 for x in self.numbers[common_length:]) or any(x != 0 for x in other.numbers[common_length:]):
            return False
        return True
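A fuller sketch of the idea from the last few messages (a list of integers plus an enum for alpha/beta/rc/final, with zero-padded comparison). The names `PaddedVersion` and `Stage` are made up for illustration, and parsing from strings is left out on purpose:

```python
from enum import IntEnum
from itertools import zip_longest
from typing import Tuple


class Stage(IntEnum):
    # Ordered so that alpha < beta < rc < final.
    ALPHA = 0
    BETA = 1
    RC = 2
    FINAL = 3


class PaddedVersion:
    """Hypothetical version type: release numbers plus a release stage."""

    def __init__(self, numbers, stage=Stage.FINAL):
        self.numbers = tuple(numbers)
        self.stage = stage

    def _keys(self, other) -> Tuple[tuple, tuple]:
        # Pad the shorter sequence with zeros so (1, 0) and (1,) line up.
        pairs = list(zip_longest(self.numbers, other.numbers, fillvalue=0))
        mine = tuple(a for a, _ in pairs) + (self.stage,)
        theirs = tuple(b for _, b in pairs) + (other.stage,)
        return mine, theirs

    def __eq__(self, other):
        if not isinstance(other, PaddedVersion):
            return NotImplemented
        mine, theirs = self._keys(other)
        return mine == theirs

    def __lt__(self, other):
        if not isinstance(other, PaddedVersion):
            return NotImplemented
        mine, theirs = self._keys(other)
        return mine < theirs


# Plain tuples: the longer one wins, which is the "weird" case above.
assert (1, 0) > (1,)
# With padding, the two spellings compare equal.
assert PaddedVersion([1, 0]) == PaddedVersion([1])
# A release candidate sorts before the final release of the same numbers.
assert PaddedVersion([1, 0], Stage.RC) < PaddedVersion([1])
```

Putting the stage last in the comparison key means it only breaks ties between otherwise equal release numbers.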
nice pictures, blabla text
And I don't think that fits here; it can be classed as advertising IMO
Is the difference between "function" and "bound method" that bound method implicitly passes the object you're accessing it from as the first argument? Or is there more to it than that?
that is the important one
but you can also assign arbitrary attributes onto functions, but not onto bound methods
Ok thanks, I didn't know about that
```python
class Test:
    def f(self):
        pass

t = Test()
Test.f.x = "hey"
print(Test.f.x)
print(t.f.x)
```
It also can't get bound to another object
This works somehow
But not if you try to assign it directly, like t.f.x = "hey", like you said
I guess because it's just looking it up through the class and not the instance
is there a reason __builtins__ would usually be a module but sometimes a dict?
you might be able to assign your attributes to method.__func__
cause the function isn't bound until the object is created
which makes sense cause you need an object to bind it to
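To make the function / bound method distinction concrete, here is a small self-contained check. It only relies on standard CPython behaviour (`__func__`, `__self__`); the class and attribute names are just the ones from the snippet above:

```python
class Test:
    def f(self):
        return self


t = Test()

# Looked up on the class: a plain function. On an instance: a bound method.
assert type(Test.f).__name__ == "function"
assert type(t.f).__name__ == "method"

# The bound method pairs the function with the instance it was accessed from.
assert t.f.__func__ is Test.f
assert t.f.__self__ is t
assert t.f() is t

# Functions accept arbitrary attributes; bound methods read them through.
Test.f.x = "hey"
assert t.f.x == "hey"

# Setting an attribute through the bound method fails...
try:
    t.f.x = "nope"
except AttributeError:
    set_failed = True
else:
    set_failed = False
assert set_failed

# ...but going through __func__ works, since that is the function itself.
t.f.__func__.x = "via __func__"
assert Test.f.x == "via __func__"

# Each attribute access builds a fresh bound-method object.
assert t.f is not t.f
```

So reading `t.f.x` works because the method object forwards attribute reads to the underlying function, while writes are rejected.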
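fun fact, i've used
```python
from types import MethodType

def aspectize(cls, decorator):
    # Snapshot the namespace first, since setattr writes back into it.
    for name, func in list(cls.__dict__.items()):
        if not name.startswith("__"):
            setattr(cls, name, MethodType(decorator(func), cls))

# `Test` and the `logged` decorator are defined elsewhere.
aspectize(Test, logged)
```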
as a first experiment with aspectizing
replacing bound methods
what does "aspectizing" mean
you know aspect-oriented programming?
I've heard of it
it seems like in this case you're basically applying a decorator to every member function of a class?
yeah
I'd probably expect such a thing to be done via a class decorator
although perhaps the implementation would not change too much. i haven't written that many class decorators.
a while ago i ran into problems with pyqt, which were very hard to debug, and running a regular debugger would've killed my GUI, so i came up with the idea to decorate methods with a call to a logger to track down where my problem happened
yeah, this is a first class feature in a lot of lisps
called advice there
I remember using it for debugging
but those classes were pretty big and i didn't want to decorate all methods, so i came up with aspectizing
emacs
track down all callables in a module, check for a pattern and replace it with a decorated version
it worked so well that i decided to make it even a feature in https://pypi.org/project/justuse/
although, i haven't really gotten the chance to put it to use (no pun intended) yet
just mentioned it because replacing bound methods is trickier than i anticipated, but easy once you know how to go about it
just don't make the mistake of trying to assign a function directly
I would probably just google and arrive at this: https://stackoverflow.com/questions/6307761/how-to-decorate-all-functions-of-a-class-without-typing-it-over-and-over-for-eac
You probably should read that, your solution doesn't do, afaics, any filtering to make sure the thing you're replacing is actually a bound method
so if there's a class variable for example, things would get weird
well, it gets tricky with properties
which you can work around with https://docs.python.org/3/library/types.html#types.MethodType
Depending I guess whether or not you want to decorate those too, at least they are indeed functions, unlike a class variable.
yeah, i remember i specifically needed those
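A rough sketch of the class-decorator variant with the filtering mentioned above, so class variables and properties are left alone. `decorate_methods`, `logged` and `Widget` are hypothetical names, and a list stands in for a real logger:

```python
import functools
import inspect

calls = []  # stands in for a real logger


def logged(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


def decorate_methods(decorator):
    """Class decorator: wrap every plain function defined on the class."""
    def apply(cls):
        for name, member in list(vars(cls).items()):
            # Skip dunders, class variables, properties, staticmethods, etc.
            if name.startswith("__") or not inspect.isfunction(member):
                continue
            setattr(cls, name, decorator(member))
        return cls
    return apply


@decorate_methods(logged)
class Widget:
    size = 3  # class variable, left untouched

    @property
    def area(self):
        return self.size ** 2

    def resize(self, size):
        self.size = size
        return self


w = Widget().resize(4)
assert w.area == 16
assert calls == ["resize"]
```

Storing the wrapped function back on the class (rather than a pre-bound `MethodType`) lets the normal descriptor machinery bind it per instance. Decorating properties as well would need extra handling of `fget`/`fset`, which this sketch skips.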
it seems like a nice trick for logging that I can imagine being useful , I can't really see myself being super excited to use this generally though
I remember advice in lisp, very double edged, very useful but also could be a big headache.
i think my first contact with AOP was with java.. aspectj?
it sounded like a cool idea back then, but never got the chance to try it, until my debugging problem
although, it's a good tool to have in the toolbox, just in case
yeah, I guess in python AOP is somewhat subsumed into the concept of decorators
if a debugger slows down your program too much and you only need to get a glimpse at a specific part of your code
although in python people immediately think @ when they hear "decorator"
well, yes, because that's what it's called
:p
yeah, but you can also decorate a function by calling a decorator with the func as argument, which got a bit buried under the pile of syntactic sugar
Sure. I'm not sure if it's buried, I think most, at least, non-beginners understand that's what @ does
but obviously @ is how decorators are used 99% of the time
naturally, and for good reason
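For reference, the two spellings side by side. `shout` is a made-up decorator; the point is only that `@` is shorthand for calling it and rebinding the name:

```python
def shout(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


# With the @ syntax.
@shout
def greet(name):
    return f"hello {name}"


# Without it: define, then call the decorator and rebind the name by hand.
def greet_plain(name):
    return f"hello {name}"

greet_plain = shout(greet_plain)

assert greet("bob") == greet_plain("bob") == "HELLO BOB"
```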
It's a pretty neat thing. At this point, there aren't a lot of python features that I like a lot that I haven't seen in some other mainstream languages, but python decorators are one of them.
Definitely more cumbersome to have an equivalent feature in a statically typed language
especially since they removed those restrictions on those, it's really cool
which restrictions?
now any expression can be after @
@a + b
def foo():
    ...
@halcyon trail https://www.python.org/dev/peps/pep-0614/
the most use i got out of that so far is with callbacks in pyqt
Oh I see
okay, so before that you couldn't even have things like [0] there
Yeah, I don't really care, although I imagine it only comes up once in a blue moon
I guess that they did this because it was no work; it basically just made the implementation simpler since there was an artificial restriction in place
according to the pep the only reason why the restriction was in the first place was because of guido's gut feeling ^^
although, i must confess i also have a weird feeling about it
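A small illustration of what PEP 614 relaxed, assuming Python 3.9 or newer. `Button` is a hypothetical stand-in for a GUI widget, not a PyQt class:

```python
class Button:
    def __init__(self):
        self.handlers = []

    def connect(self, func):
        self.handlers.append(func)
        return func


buttons = [Button(), Button()]


# The subscript in the decorator expression was a SyntaxError before 3.9.
@buttons[0].connect
def on_first_click():
    return "first"


assert buttons[0].handlers == [on_first_click]
assert buttons[1].handlers == []
```

On older versions the usual workaround was to bind `buttons[0]` to a name first and decorate with that.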
with those decorators, i run into a pattern i haven't seen before
it goes something like this
class Foo:
    def __init__(self):
        @self.button.connect
        def _():
            ...  # do button stuff
i'm torn between leaving those functions anonymous and naming them - anonymous because they never get called from anywhere except as callback, named to better have the IDE assist
yes
well, otherwise you'd need to register them as callback in one place and define them somewhere else entirely
this way, you have both in one place
you can register a local lambda without using lambda syntax
sorry, without using @ syntax
yeah, but lambdas can't be as complex as functions, and i need that complexity
sorry, I said lambda by accident
no, it's correct, i would absolutely use a lambda if it was possible
my point is that you can simply do:
def __init__(self):
    def connect_callback():
        ...
    self.button.register_connect(connect_callback)
ah, as a closure
the whole point of the @ syntax is to re-assign the result, using it purely for side-effect is dubious
similar to a list comprehension, it's supposed to be an expression, which is why people tend to frown on "throw away" list comprehensions
But personally if the callback is more than 2-3 lines I wouldn't define it locally
well, callbacks by nature are throwaway functions and rarely called by two different things
so lambdas would be the natural fit
I don't think that the only criterion to factor out something as a function is that it has to be called more than once
it's one good criterion, but not the only one
complexity, clarity of naming, testability, etc
if connect_callback is 2 lines, then it's very simple and probably the overhead of following it around to a new location is more than just understanding it at the site of init
if connect_callback is 100 lines, then I want it as a separate function, I don't really care if it's only called once
sure, but you'd also want the definition as close as possible to where you register the function usually, less jumping around in the code
Sure, that's a trade-off as well. But again, the length of the thing is pretty important here. If it's 100 lines and it has a clear name, it's better to just see the name, so you can have an idea what it does. Reading the 100 lines and understanding them is going to be a lot of work anyway, so the overhead of goto definition and back in your IDE becomes negligible
agreed
At any rate I find that decorator pattern with the function named _ kinda weird. Maybe if it caught on more widely, I'd be more accepting.
Right now it's in the same bin as [print(x) for x in my_list] for me
yeah.. as i said, i'd prefer a nice and clear lambda syntax over that def _()
it's kind of weird to me, too
Yeah, with nice lambdas you could just do
def __init__(self):
    self.button.register_connect {
        ...
    }
but alas
this is python
in a metaclass, i want to perform a certain check on subclasses of a certain base class, but not perform the check on the base class itself. this was my workaround for the fact that the class doesn't actually exist yet when i'm doing the check.
i currently have this ugly hack. is there a less ugly way to do it?
import attr
from typing import ClassVar

class _DocumentBase:
    """Base class for Document. Ugly hack to get the check in DocumentType to work."""
    pass

class DocumentType(type):
    def __new__(meta, name, bases, namespace):
        # Only Document itself should have this exact `bases`.
        is_base_document_type = bases == (_DocumentBase,)
        if not is_base_document_type and "collection_name" not in namespace:
            raise TypeError(
                "Document classes must have a 'collection_name' class attribute."
            )
        cls = super().__new__(meta, name, bases, namespace)
        return attr.s()(cls)

class Document(_DocumentBase, metaclass=DocumentType):
    """Base document class."""
    collection_name: ClassVar[str]
i must confess, i never actually worked with metaclasses like that before, usually i'd rather go for composition or token-based patterns
In your hack, you assume that there are no other base classes, going by your exact tuple comparison
Couldn't you forgo the additional sentinel base class and check if bases is an empty tuple?
couldn't you check the mro to check where you are and act on that?
that's a good point @wide shuttle , i could make DocumentType itself "private" and assume the user doesn't have access to it for subclassing
@verbal escarp bases kind of is the MRO
i guess i could create the class, then check its MRO, but given that i'm already assuming i'm at the "top" then it's no different
top? bottom?
heh
When I ran into this problem I just checked the name of the class 🤷‍♂️
probably a little worse than what you did
what's a "token based pattern" ?
well, my thought was that it might be more robust for subclassing to check the mro, but i don't know
Checking if bases is an empty tuple, and making the meta class private, isn't so great
It's good to have a convenience base class for using metaclasses but it's not composable if the user has another metaclass in mind
I went through this very recently with ABC where I needed ABCMeta
that's a good point
You could of course just say "I don't care if you want to compose with another metaclass" but that's not so hot IMHO
in this particular case it shouldn't matter much because this is supposed to be a django-like API
writing ABCs for these would be really strange
and it's for an internal tool anyway so i'd be willing to just say "that's not supported"
Well, idk, maybe they are using an ABC to do some kind of special registration so they create the classes by factory, etc etc
but it would be nice to not break things without needing to break them
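One option that did not come up in the thread but fits the "check subclasses, not the base" requirement is `__init_subclass__`, which Python calls for each subclass and never for the class that defines it. A minimal sketch, leaving out the attrs wrapping from the original and using made-up subclass names:

```python
import abc


class Document:
    """Base document class."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs for every subclass, never for Document itself.
        if "collection_name" not in vars(cls):
            raise TypeError(
                "Document classes must have a 'collection_name' class attribute."
            )


class User(Document):
    collection_name = "users"


try:
    class Broken(Document):
        pass
except TypeError as exc:
    error = str(exc)
else:
    error = None

assert error is not None


# No custom metaclass is involved, so mixing in ABCMeta still works.
class AbstractDoc(Document, abc.ABC):
    collection_name = "abstract"
```

Like the `namespace` check in the metaclass, `vars(cls)` only sees attributes defined directly on the subclass, so an inherited `collection_name` would not count.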
could do it with a decorator too like attrs
I've only used metaclasses a handful of times so I cannot claim to be an expert on their use cases
in fact in this particular case the metaclass wraps attrs anyway
maybe i'll just wrap @attr.s
My 0.02, I'd probably rather use a decorator than a metaclass
all other things being equal
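The decorator route could look roughly like this. `document` is a hypothetical name, and the attrs wrapping (plus the mypy caveat that comes with it) is left out of the sketch:

```python
def document(cls):
    """Hypothetical class decorator enforcing the collection_name rule."""
    if not isinstance(vars(cls).get("collection_name"), str):
        raise TypeError(
            "Document classes must have a 'collection_name' class attribute."
        )
    return cls


@document
class Invoice:
    collection_name = "invoices"


try:
    @document
    class Nameless:
        pass
except TypeError:
    rejected = True
else:
    rejected = False

assert rejected
```

Since the decorator is only ever applied to concrete document classes, there is no base class to special-case.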
One thing you should be aware of, you may or may not care about, but
wrapping @dataclass or @attr.s in even the most trivial way "breaks" mypy
yeah, mypy i think has attrs hard-coded
yeah
@halcyon trail token based patterns.. imagine the flyweight pattern where each instance only holds a set of trait tokens (enums) which determines whatever properties this thing has and allow methods only if this or that trait
pyright uses some kind of "dataclass protocol" that attrs now supports too
I was very very very sad when I discovered this. You can't even write your own decorator to change defaults in dataclass/attrs
that seems like a fairly odd pattern, to me
i had a lot of fun with that pattern, actually
Instead of having a token T with values A, B, C, why not just choose between mixins A, B, C?
well, benefit is you can switch tokens easily and composition is well defined (maybe use a frozenset as key)
was it here where people were talking about haskell/rust-like typeclasses/traits in python? there's a library for it... seemed kind of misguided
but there were some benefits that i don't remember
It's not just composition though, you said that the tokens are determining what public properties it has
that's affecting the public API
@verbal escarp what would this pattern look like in python?
@paper echo it does seem pretty bizarre to me. But then, customizing behavior "per type" is often surprisingly clumsy in python. I know I've used @singledispatch for this before with mixed feelings
oh that's what it was
it was like a super powered version of singledispatch that worked with protocols and had some other nice features
but it was weirdly couched in terms of "hey dont you wish you had rust traits in python"
Yeah. I can kind of understand desire for that.
Right, that is a weird way to sell it.
have an enum, a slotted class with a set, then a kind of visitor pattern and dispatch via frozenset of those enums, it's quite simple really
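One possible reading of that description, as a sketch only: an enum of traits, a slotted class holding a set of them, and a dispatch table keyed by `frozenset`. All names here (`Trait`, `Character`, `handles`, `move`) are invented for illustration:

```python
from enum import Enum, auto


class Trait(Enum):
    FLYING = auto()
    SWIMMING = auto()
    ARMORED = auto()


class Character:
    __slots__ = ("name", "traits")

    def __init__(self, name, traits=()):
        self.name = name
        self.traits = set(traits)


# Dispatch table keyed by the exact combination of traits.
HANDLERS = {}


def handles(*traits):
    def register(func):
        HANDLERS[frozenset(traits)] = func
        return func
    return register


@handles()
def walk(character):
    return f"{character.name} walks"


@handles(Trait.FLYING)
def fly(character):
    return f"{character.name} flies"


@handles(Trait.FLYING, Trait.SWIMMING)
def fly_or_dive(character):
    return f"{character.name} flies or dives"


def move(character):
    handler = HANDLERS.get(frozenset(character.traits))
    if handler is None:
        names = sorted(trait.name for trait in character.traits)
        raise LookupError(f"no movement defined for {names}")
    return handler(character)


hero = Character("hero")
assert move(hero) == "hero walks"

# Behaviour changes on the fly when the trait set changes.
hero.traits.add(Trait.FLYING)
assert move(hero) == "hero flies"
```

The exact-match lookup keeps composition well defined, at the cost of needing a handler for every combination that can occur.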
i do kind of like the idea of being able to imbue a class with extra behavior without modifying the class itself, and of grouping them together under a single name (although structural typing kind of reduces the need for grouping)
yeah, I mean the extra behavior without modifying the class itself is one of the biggest criticisms of classical inheritance
as in, the class notifies some external observer when its methods are called?
and it's genuinely a real headache in some cases
yeah
and easily my number one issue with Kotlin. By a large margin.
that seems like a lot of indirection without the benefits of typeclasses/traits
unless the observer is global and all method calls on all instances of all classes get routed there
It's a lot more dynamic than typeclasses/traits. Typically a lot more dynamic than you really need.
IMHO
Today I am going to introduce a new concept for Python developers: typeclasses. It is a concept behind our new dry-python library called classes.
If you compare the use cases where you usually see decorators and mixins, it's quite "static", which is intentional
decorators/mixins/metaclasses
well, imagine a thing that may change behaviour on the fly, like a character in a game that holds different items
the name typeclass i feel is so off for this library
Yeah, I can imagine that being useful, but at that point it's just not really an alternative for the things that were originally being discussed
from classes import typeclass

@typeclass
def greet(instance) -> str:
    ...

@greet.instance(str)
def _greet_str(instance: str) -> str:
    return 'Iterable!'

greet(1)
is greet really a "typeclass"? i'd sooner call it a generic method.
i guess my main gripe is the name
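For comparison, the standard-library "generic function" spelling of the same shape uses `functools.singledispatch`. This is not the `classes` library, just the stdlib tool mentioned earlier, with invented return strings:

```python
from functools import singledispatch


@singledispatch
def greet(instance) -> str:
    # Fallback for types without a registered implementation.
    raise NotImplementedError(f"no greet() for {type(instance).__name__}")


@greet.register
def _greet_str(instance: str) -> str:
    return f"hello, {instance}"


@greet.register
def _greet_int(instance: int) -> str:
    return f"hello, number {instance}"


assert greet("world") == "hello, world"
assert greet(1) == "hello, number 1"
```

Dispatch is on the runtime type of the first argument only, so it covers per-type behaviour without touching the classes involved.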
the thing originally being discussed is the ability to make a class a "document", which is probably always going to be a fixed, static, decision, per type
yes, this is for an in-house mongodb ORM/ODM
class MyDocument(Document):
    ...

@Document
class MyDocument:
    ...

class MyDocument(metaclass=MetaDocument):
    ...
i wanted to build it on top of attrs instead of going the descriptor route like django
in all 3 approaches, it's totally static
Out of curiosity why do you actually need to add stuff
my original approach was the 1st one, but now i'm thinking about the 2nd
why not just write free functions that process attrs classes
because reasons
it's easily the simplest approach and also the one that won't drive mypy crazy
trust me, i wish we could
Well, if you are stuck with that, as annoying as it seems, you may want to keep it as a second decorator
@Document
@attr.s
class MyDocument:
    ...
i don't think we're necessarily stuck with it. but it will be a harder sell because people are used to traditional inheritance-based OOP
well, decorators are not inheritance based, of course, but yes, they will let you use member function syntax
member function syntax though is actually a bit of a risk for a type where you're going to define arbitrary properties
but even if we do go the attrs or pydantic route, we still want some fixed connection between the underlying mongodb collection and the class definition
there is some additional logic to be had
i wished OOP worked :p
e.g. on saving to db, we want to update the instance's updated_at timestamp
Yeah, you can do that with a free function easily though
we want to be able to get a mongodb collection and as you see here i want to enforce that document classes state a collection name
yeah of course, you definitely can
but it gets a bit funky because now saving mutates an argument
Sorry not trying to argue, if you can't convince people with veto power, then you can't, that's life
yeah
Well, it mutates an argument anyway... it's just in one case, that argument may be self
no its not about veto power, but im not entirely convinced that free functions instead of traditional classes make python better
x.save() vs save(x)
right, but mutating self is a lot less "unusual"
oh, that's a pretty odd take tbh
People have been complaining about this in Java for ages
in fact thats the main reason i like traditional OO style, because you mutate self and don't fucking mutate anything else
syntax follows usage
i am probably weird in this
well, traditional OO languages don't have any safeguards against mutating any other argument. And there are also issues, what if two things need to be mutated
IMHO the better solution is that the type system simply enforces what can and cannot be mutated
i can't remember the last time i've had to mutate 2 things, but that is a great point
and then it's also evident from the function signature. Whether it's a member or a free function
i find x.save() harder for testing than save(x)
The lack of control over mutation is a pretty big issue IMHO in many mainstream languages, OO doesn't really do much to mitigate it
not harder, just more work
Hmm, why do you find it harder for testing? I actually don't see much difference.
i agree, its more just the semantic/syntactic separation
@halcyon trail you might have to do monkeypatching or mocking with x.save(), arguably less so with save(x)
well, you need to set up an instance etc. first
Maybe I'd agree, if writing a class from scratch, all other things being equal, things that mutate the object, I'd prefer as member functions
the problem is, in this case, not all things are equal
while with save(x) you could just pass in a mock
That's a fair point I suppose, though I'm usually only mocking a very small fraction of classes
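A sketch of the `save(x)` shape and how it can be tested by passing mocks straight in. The `save` signature is an assumption made for illustration; the `replace_one` call mirrors the pymongo collection method of that name, but only a `Mock` is exercised here:

```python
from datetime import datetime, timezone
from types import SimpleNamespace
from unittest.mock import Mock


def save(document, collection):
    """Free-function save: stamp the document, then hand it to the collection."""
    document.updated_at = datetime.now(timezone.utc)
    collection.replace_one({"_id": document.id}, vars(document), upsert=True)
    return document


# In a test, both arguments can be lightweight stand-ins.
doc = SimpleNamespace(id=1, title="draft")
collection = Mock()

save(doc, collection)

collection.replace_one.assert_called_once()
assert isinstance(doc.updated_at, datetime)
```

With `x.save()` the collection would typically be reached through the instance or its class, which is where monkeypatching tends to come in.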
@paper echo not all things are equal in the sense that mixins, decorators, and metaclasses are all much, much more complicated pieces of python machinery than free functions.
that is also true
So, for me personally, I'd really need to see some concrete benefits before going that way, just "mutators should be members" isn't really enough
i think there's also encapsulation and namespacing benefits
i know there are free-function equivalents of course
Yes, the encapsulation benefit is keeping it as a free function
and i wouldn't really want to "port" this reasoning to a language that isn't already OO-first
there's a reason i like Julia after all
11 March 2020 by Phillip Johnston. We've received permission from Scott Meyers to re-publish this article on our website after the version on Dr. Dobbs went down. I've taken the liberty of converting some of the bold print to code font, as well as clarifying some code example indentation/bracketing. Even if you don't completely agree with … Contin...
If you're interested
side note: specifically annotating things as mutable could be a pretty easy win for mypy/typing. imagine
@attr.define
class Point2d:
    x: float
    y: float

def copy_to(src: Point2d, dst: Mutable[Point2d]) -> None:
    dst.x = src.x
    dst.y = src.y
Scott Meyers, if it rings a bell, wrote about this topic, probably like 20 years ago, it's pretty interesting
i don't know the name but this is a good point
Encapsulation is a means, not an end.
Well, unfortunately a Mutable annotation like that for a type, doesn't give you much, in the general case
With something like a dataclass, where it's obvious what mutates and what doesn't, ok, it would help
but say you have a real class
without Mutable, assigning anything to any attribute of the class would be a type checking error
maybe it needs a better name
Yeah, but you can have member functions that perform mutations
how do you know if you can call those members?
hmm.. i'd rather have annotations for side-effects
right, maybe it needs a better name
The answer is, of course, now the members need that annotation
you need to know which member functions are mutating, which are not
i like that
congrats, now you've re-invented C++ const
and Rust's mut
that's how they both work
"behold, this function is mutating stuff" and "this function shouldn't mutate anything" - give a warning if it does
don't worry, i explicitly took inspiration from mut
i think many people would be surprised to get a heads up for side-effects if they didn't expect any
it makes me sad that const/mut hasn't made any real headway into GC languages
_Thing = TypeVar('_Thing', bound='Thing')

@attr.s
class Thing:
    foo: int

    def inc_foo(self: Mutable[_Thing]):
        self.foo += 1
i'm into it
you need better syntax though, nesting everything in Mutable like that is awful
yes
Anyhow. But in GC languages, the solution instead is basically "partial" interfaces, i.e. non-mutating ones
So you have Sequence, and MutableSequence
Mapping, MutableMapping
Python, Java, Kotlin, probably C#, do stuff like this
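In Python that looks roughly like the following, assuming Python 3.9+ for subscripting the `collections.abc` classes. The function names are made up:

```python
from collections.abc import MutableSequence, Sequence


def total(values: Sequence[int]) -> int:
    # Sequence has no append(), so a type checker rejects mutation here.
    return sum(values)


def add_one(values: MutableSequence[int]) -> None:
    # The signature advertises that the argument will be modified.
    values.append(1)


items = [1, 2, 3]
assert total(items) == 6
assert total((1, 2, 3)) == 6   # tuples satisfy the read-only interface too

add_one(items)
assert items == [1, 2, 3, 1]
```

Nothing is enforced at runtime, and the list passed to `total` is still mutable for anyone else holding a reference, which is the limitation of using interfaces for this.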
i will say, crystal breaks exactly 0 ground in this area
it doesn't even have the interface hierarchy
never heard of crystal
Yeah. It's pretty sad because it's trying to repurpose one tool to solve a different problem, a problem that's really really important and deserves its own first class feature
That said, maybe the feature is just immutable data structures anyway
i was going to say
languages that have immutable-first data structures are doing fine here
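The closest everyday Python equivalent is probably a frozen dataclass. This sketch uses the stdlib `dataclasses` module rather than attrs, reusing the `Point2d` name from earlier:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Point2d:
    x: float
    y: float


p = Point2d(1.0, 2.0)

# Assignment is rejected at runtime on a frozen instance.
try:
    p.x = 5.0
except FrozenInstanceError:
    blocked = True
else:
    blocked = False
assert blocked

# "Mutation" becomes building a new value from the old one.
moved = replace(p, x=5.0)
assert moved == Point2d(5.0, 2.0)
assert p == Point2d(1.0, 2.0)
```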
realistically there are multiple languages whose immutable data structures will beat the pants, speed wise, off python's mutable ones, in general usage
Yeah, but again, almost no mainstream languages really
for some definition of mainstream
exactly
from my pov python could be immutable-first, if there was a push for it
nim is the only non-"functional" language that i know of with this feature @halcyon trail
unless D also has it
i believe D does have something like it actually, with its pure annotation
immutable int[3] table = [6, 123, 0x87];
D Programming Language
yeah, so D has it
D tries really hard to be all things to all people
so it has GC, it has @nogc, it has const, but it also has immutable, etc
it has all things, except uptake
lol yep. too big
i think the main selling point is that C++ is still even bigger and D is better on a "per feature" basis
as in, goodness / num_features is still supposed to be higher for d than for c++
and then came rust
Yeah, it hasn't really worked out well though. C++ still has a bigger community, better tools, better libraries, and no worrying about will the GC turn on or off, etc
yep