#internals-and-peps


feral island
#

But you probably also want to provide a way to clean up the resource in a way that doesn't rely on GC (like the close method on a file)

halcyon trail
#

well, the context manager should close the file

feral island
#

but what if you don't use a context manager 🙂

halcyon trail
#

but you can provide del as a fallback

#

then you're bad and you should feel bad 😛

#

also note in the literal sense the resource will get cleaned up, the main reason (IMHO) to provide del on a file is to flush buffers

feral island
#

only when your program exits

halcyon trail
#

yes, but it will get cleaned up

#

if your problem is leaking stuff while the program runs, then you should definitely be using context managers for that

#

at global scope it's a bit less convenient sometimes, and while the OS will clean the file, you can simply lose the last N lines of your logfile
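
A minimal sketch of the pattern discussed above: an explicit `close()`, a context manager that calls it, and `__del__` only as a fallback for flushing. The `LogWriter` class and its attribute names are hypothetical, made up for illustration.

```python
import io


class LogWriter:
    """Hypothetical buffered writer; names are illustrative only."""

    def __init__(self, sink):
        self._sink = sink      # any object with a write() method
        self._buffer = []
        self.closed = False

    def write(self, line):
        if self.closed:
            raise ValueError("write to closed LogWriter")
        self._buffer.append(line)

    def close(self):
        # Explicit, deterministic cleanup; safe to call more than once.
        if not getattr(self, "closed", True):
            self._sink.write("".join(self._buffer))
            self._buffer.clear()
            self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

    def __del__(self):
        # Fallback only: runs when the object is finalized, which may be late.
        self.close()


sink = io.StringIO()
with LogWriter(sink) as log:
    log.write("first line\n")

# The buffer was flushed when the with-block exited.
flushed = sink.getvalue()
```

The `__del__` fallback here only guards against losing buffered lines; the context manager remains the primary way to release the resource.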

raven ridge
#

that depends on the resource. file descriptors and memory will be cleaned up by the OS, but other things won't (sysv shared memory, for instance, or named semaphores)

halcyon trail
#

seems like a scary position to be in, presumably if the python interpreter is suitably killed it will still leak

raven ridge
#

yep, absolutely

halcyon trail
#

what surprises me more in all this is that Kotlin works the same way as python, now I'm wondering if this is actually the normal behavior for GC languages

#

everybody knows that python's default arguments behavior is bananas, but maybe the closure behavior isn't and my intuition just isn't good

#

or rather, my intuition is being too influenced by C++/Rust

#

I'm trying to see how it works in swift now 😛

#

here's the kotlin for those interested

fun bar() {
    var x = mutableListOf(5)
    fun foo() { x.add(3) }
    foo()
    println(x)
    x = mutableListOf(5)
    println(x)
    foo()
    println(x)
}

fun main() {
    bar()
}
[5, 3]
[5]
[5, 3]
#

i just think it's funny because, from my perspective in C++, I've found that one idea I had to get used to was that in GC languages, mutation vs reassignment and mutating via a mutation method can have very different consequences.
I expected this to be a situation where they behaved differently, but actually, they behave the same 😛
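
For comparison, a rough Python translation of the Kotlin snippet above (a sketch; `outputs` is an added helper list so the results can be inspected instead of printed). The closure looks up whatever `x` is bound to at call time, so rebinding `x` is visible to `foo`.

```python
def bar():
    outputs = []
    x = [5]

    def foo():
        x.append(3)  # resolves x when called, not when foo was defined

    foo()
    outputs.append(list(x))  # [5, 3]
    x = [5]                  # rebinding: foo sees the new list from now on
    outputs.append(list(x))  # [5]
    foo()
    outputs.append(list(x))  # [5, 3]
    return outputs


result = bar()
```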

raven ridge
# halcyon trail seems like a scary position to be in, presumably if the python interpreter is su...

multiprocessing has this exciting hack in it:

On POSIX using the spawn or forkserver start methods will also start a resource tracker process which tracks the unlinked named system resources (such as named semaphores or SharedMemory objects) created by processes of the program. When all processes have exited the resource tracker unlinks any remaining tracked object. Usually there should be none, but if a process was killed by a signal there may be some “leaked” resources. (Neither leaked semaphores nor shared memory segments will be automatically unlinked until the next reboot. This is problematic for both objects because the system allows only a limited number of named semaphores, and shared memory segments occupy some space in the main memory.)

#

and it mostly works! Don't pkill -9 and things will mostly work out, heh
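
A small sketch of the kind of named resource the quoted docs are talking about, using `multiprocessing.shared_memory` (Python 3.8+). The segment size and payload are arbitrary. The point is that `unlink()` has to be called explicitly; if the process is killed before that, the named segment can outlive it.

```python
from multiprocessing import shared_memory

# Create a small named shared memory segment.
segment = shared_memory.SharedMemory(create=True, size=16)
try:
    segment.buf[:5] = b"hello"
    payload = bytes(segment.buf[:5])
finally:
    segment.close()   # drop this process's mapping
    segment.unlink()  # remove the named segment from the system
```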

halcyon trail
#

yeah, I suppose the python interpreter will have signal handlers installed for everything that allows it

#

so literally everything but sigkill, it should still clean up

raven ridge
#

I haven't looked at the implementation of that resource tracker, but that's my assumption

rose schooner
#

this stumped me for a few moments

#

then i realized it's the cell that's constant, not the object in the cell (object is changed by STORE_DEREF)

halcyon trail
#

I admit I don't know what a cell is

rose schooner
halcyon trail
#

Okay, sure

rose schooner
#

doing foo.__closure__[0] in bar() should give a cell object
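
A short sketch of what that looks like in practice: the cell object stays the same while its contents change when `x` is rebound. The variable names are illustrative.

```python
def bar():
    x = [5]

    def foo():
        return x

    cell = foo.__closure__[0]       # the cell shared by bar and foo
    first = cell.cell_contents
    x = [7]                         # compiled as STORE_DEREF: writes into that cell
    same_cell = foo.__closure__[0] is cell
    second = cell.cell_contents
    return same_cell, first, second


same_cell, first, second = bar()
```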

halcyon trail
#

That's basically what I figured was happening

#

I haven't yet decided how to feel about it

#

I want to see how more languages handle this

rose schooner
# halcyon trail I want to see how more languages handle this

lua
```lua
function table_string(tbl)
    local s = ""
    local add_comma = false
    for _, elem in next, tbl do
        if add_comma then
            s = s .. ", "
        end
        s = s .. elem
        add_comma = true
    end
    return "{" .. s .. "}"
end

local function bar()
    x = {}  -- note: x is a global here (no `local`)
    local function foo()
        x[#x + 1] = 5
    end
    foo()
    print(table_string(x))
    x = {}
    print(table_string(x))
    foo()
    print(table_string(x))
end

bar()
```
output:
```
{5}
{}
{5}
```

halcyon trail
#

For dynamic languages it's less surprising since they tend to just implement everything as hashtables from string to whatever

#

I wanted to test c# and swift

#

But I got lazy

rose schooner
#

i didn't local x

#

hold on

#

ok same thing

rose schooner
#

also for some reason "arrays" are treated differently from "hashtables"

#

one has a length, the other doesn't

halcyon trail
#

Weird

rose schooner
#

i guess it makes for a fast scripting language though

flat gazelle
halcyon trail
#

Yeah I came across this

#

It's hard to say what the right behavior is I think

#

They both have significant downsides I'd say

flat gazelle
#

I think this is just one of those cases where the language dev has to make a decision, and that decision will just suck in some cases.

halcyon trail
#

In non GC languages there's no confusion but that comes with other downsides

boreal umbra
#

Has there ever been any consideration of adding methods to generators? Something like this would be nice:

for item in generator.limit(10):
    ...

to get only the first ten items that generator will produce.

grave jolt
#

oops, sent too early

#

I think there are several problems. Mostly: why limit this to generators? All of these functions will likely work on any iterator

#

And you cannot add methods to iterators.

rose schooner
#

itertools.islice(generator, 10)?

#

it's somewhat like len() being a built-in function instead of a method

grave jolt
halcyon trail
#

On the bright side, it's only almost every other language that can do this 🙂

flat gazelle
#

Being duck-typing compatible with a generator is a moderately common thing to take advantage of, so adding any methods would be kinda sorta a breaking change. (coroutine wrappers for example, though I can't quite recall where they come into play).

raven ridge
quick snow
#

I find this method chaining weird looking in Python code. For this usecase I'd rather see a head(iterable, count=10) function. (In reality I usually just zip with a range.)
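
A sketch of the alternatives mentioned above. `itertools.islice` is the real standard-library function; `head` is a hypothetical helper with the signature suggested in the message, and `squares` is just a made-up infinite generator for the demo.

```python
import itertools


def squares():
    """Illustrative infinite generator."""
    n = 0
    while True:
        yield n * n
        n += 1


def head(iterable, count=10):
    """Hypothetical helper: lazily yield at most `count` items."""
    return itertools.islice(iterable, count)


first_three = list(head(squares(), count=3))

# The "zip with a range" variant; range goes first so the generator
# is not advanced past the last item we actually want.
via_zip = [item for _, item in zip(range(3), squares())]
```

Both forms work on any iterator, not just generators, which is the point made earlier about why a generator-only method would be an odd fit.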

rose schooner
cyan raven
#

is there any reason why cpython prefers plain makefiles over cmake?

spark magnet
feral island
halcyon trail
#

Surely python used something to generate the make files though right

#

It's not just using make

flat gazelle
#

yeah, it's autoconf automake etc I believe.

#

ah wait, no automake is used, it's just a Makefile that is then configured by autoconf

halcyon trail
#

By automake you mean autotools?

#

Oh I see

#

That's pretty scary

#

I wonder if python is the largest active codebase in the world to use pure make in that way

safe basalt
halcyon trail
cyan raven
#

I find it a bit absurd that the most popular python implementation is in c.

radiant garden
#

It doesn't seem very surprising to me

#

It's old and performant

feral island
#

at this point Rust is probably more likely than C++

#

but rewriting a huge codebase into another language is a very difficult task

halcyon trail
#

Well, that's the whole reason why C++

#

Because it's not a rewrite

feral island
#

probably the most likely way this would happen is if there's some component someone wrote in C++ or Rust that we'd want

#

true, C++ makes that easier, but I don't know that we'd gain as much from C++

#

and I think there's some portability concerns

halcyon trail
#

I mean what you'd do is get the current codebase to compile in C++, then change compilation to C++

#

And then you can use c++ features anywhere you want

#

gcc did this so it's eminently doable

halcyon trail
cyan raven
halcyon trail
#

Yep

#

And I'm pretty sure they're quite happy with it

#

They had some good talks about it with lots of specific examples

cyan raven
#

their code-base is really c-like tho 😄
not like clang

halcyon trail
cyan raven
#

the point that im trying to make is ofc it is a huge effort, but it is going to happen anyways.

urban sandal
#

I've been seeing a lot of success with using zig to build c projects (even those that aren't using zig for anything other than build system) (both my own stuff, but also being adopted by quite a few companies for it). Quite a few benefits including getting cross compilation out of the box and very portable build tools.

halcyon trail
cyan raven
halcyon trail
#

Eh it's still very niche

feral island
#

tried to search for "c++" on discuss.python but that's not a great idea

halcyon trail
#

Heh

feral island
#

I do think there were some concerns somewhere about exotic platforms we support where you can't get a C++ and/or Rust compiler, might be mixing it up with something else though

#

maybe mimalloc (the new memory allocator we're using for nogil) doesn't support all platforms we support

halcyon trail
#

C++ compiler is quite a bit easier than rust I think, since gcc still supports more targets than llvm (plus you get the union)

feral island
#

yes that's probably right

halcyon trail
#

Very embedded targets are something of a pain still in rust from what I hear, but that's like targets that don't have a heap

feral island
#

of course C++ is also not exactly a superset of C, so there might have been some concerns about that

cyan raven
feral island
halcyon trail
#

It's hard to imagine something without a heap running python and hard to imagine something with a heap not being supported by gcc 😛

#

But maybe it's my imagination that's the problem here

halcyon trail
#

I'm guessing python is mostly C89 right

#

Since windows didn't support anything past 89 until an hour ago 😛

feral island
#

probably, I think we switched the minimum support to C11 only a few years back

halcyon trail
#

If there's substantial amounts of C99 and C11 that'll make it harder

feral island
halcyon trail
#

I don't blame MS to be clear 🙂

#

But yeah I assume when gcc did it, it was basically C89, and C89 is almost a subset, and most of the non subset things are trivial fixes

#

Although there's a lot of things that are legal C99/C11, are not legal C++, but which gcc will compile happily with the right flags anyway (even by default)

hybrid relic
cyan raven
#

Is Pypy still stuck with an older version of python because of rpython(2.7)?

flat gazelle
#

Pypy has supported python 3 for a while now

spark magnet
neat delta
inner sentinel
#

Is there anyone in here who would like to help make a project.

#

I am looking for someone who is good with python who would like to do something challenging.

boreal umbra
#
In [8]: import inspect

In [9]: def my_func(a: str, b: int = 5): ...

In [10]: sig = inspect.signature(my_func)
Out[10]: <Signature (a: str, b: int = 5)>

In [12]: sig.parameters
Out[12]: mappingproxy({'a': <Parameter "a: str">, 'b': <Parameter "b: int = 5">})

In [14]: sig.parameters['b']
Out[14]: <Parameter "b: int = 5">

In [16]: sig.parameters['b'].default
Out[16]: 5

@full jay

flat gazelle
#

I believe the code object holds the information from which inspect gets its info
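
A quick sketch of where the pieces can be seen on the function itself. As far as I know, in CPython the parameter names sit on the code object, while defaults and annotations are attributes of the function object; `inspect.signature` assembles them into one view.

```python
import inspect


def my_func(a: str, b: int = 5): ...


code = my_func.__code__
names = code.co_varnames[:code.co_argcount]  # positional parameter names
defaults = my_func.__defaults__              # defaults for trailing parameters
annotations = my_func.__annotations__        # annotations by parameter name
sig = inspect.signature(my_func)
```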

full jay
#

So does it only know that in situ?

flat gazelle
full jay
#

Man, Python internals are wild

feral island
#

can I ask what triggered this conversation?

flat gazelle
merry bramble
fallen slateBOT
#

Lib/inspect.py lines 3376 to 3379

```py
def signature(obj, *, follow_wrapped=True, globals=None, locals=None, eval_str=False):
    """Get a signature object for the passed callable."""
    return Signature.from_callable(obj, follow_wrapped=follow_wrapped,
                                   globals=globals, locals=locals, eval_str=eval_str)
```
merry bramble
dusk comet
#

why is every link underlined ??

#

it looked like this recently

feral island
#

hm I see that too, might be a recent bug

merry bramble
#

It's a new accessibility thing

#

deliberate, I think

#

(The PR was merged a few months ago, but was only released as part of python-docs-theme==2024.1, which was pushed to PyPI on 27 January)

dusk comet
#

i dont like this, it looks ugly and dull

quick snow
#

I disagree, I think it looks much better with underlines.

neat delta
#

i am curious as to why it's happening at all, considering that underlining links can be done by browsers. surely that removes reason for individual websites to care?

merry bramble
#

I don't have a strong opinion on it, but if you dislike it (or think it didn't need to be done), I'd probably open an issue over at python-docs-theme

#

complaining here won't change anything 😉

sturdy timber
dusk comet
#

(attached pictures are just examples of some webpages)

underlined links are definitely not the default, i've never seen anyone with always-underlined links

imo, links should be underlined only on hover - in that way you get visual response from website that you are indeed hovering the link and you clearly see size of the link (so you can easily notice that there is adjacent link, look at pic.2)

underlines by default just add a whole bunch of visual noise without any real benefit, and they make recognizing single link even harder because color change that happens on hover is comically small

quick snow
#

Not underlining links is a relatively recent trend.

merry bramble
dusk comet
urban sandal
#

consistency or not with the rest of the world tends to be a terrible argument when discussing accessibility. If such arguments held weight, we'd never have wheelchair accessible public buildings. Sometimes it takes people dragging the world forward a little bit at a time to help those who need it, even when the initial solutions could be improved on (such as providing a toggle).

Edit: to clarify "such as providing a toggle" for those who don't read the whole discussion: I don't actually think a toggle is a net good. The good comes from making a site whose primary purpose is to convey documentation accessible and functional first, then working to improve aesthetics without compromising on function.

ruby elm
#

I think a good reason to not underline links by default is that it devalues the weight underlines have on the reader. Just look at the table of contents at the side of any page, or the actual table of contents pages. Almost every single word is underlined, and with so many so close together, underlines as a tool for text emphasis become way less useful.

urban sandal
#

Underlining for emphasis in typesetting is and has been generally frowned upon prior to the internet and hyperlinks as we know them, I'm not sure that comes in as a good reason.

sturdy timber
#

I don't think the docs really use underlines for emphasis, yeah

urban sandal
sturdy timber
halcyon trail
#

I'm curious is it that complicated to make the underlines optional

feral island
#

by adding a user setting?

halcyon trail
#

in practical terms, when these things matter for accessibility that's generally the approach. we don't make colors uglier for people in general, so that they can be more accessible to people who are color blind, instead lots of video games, websites, and so on, offer color blind or other accessibility modes

halcyon trail
# feral island by adding a user setting?

I guess? Could you have a cookie or something like that which the CSS checks to decide whether to underline or not? I'm profoundly ignorant of such things so forgive if this question is silly 🙂

feral island
halcyon trail
#

right; so if the cookie is already there anyway then adding another field to store an underline/not preference similar to dark/light seems straightforward enough

#

Seems like the best of all worlds

urban sandal
#

yes and no.

#

this can quickly turn into a pile of toggles, instead of just designing so that the default experience is both more accessible and aesthetically pleasing.

dusk comet
urban sandal
#

That's not the argument I made, and I don't appreciate the strawmanning of it

#

the very message you replied to includes me pointing out a toggle could possibly be an improvement. I don't think it actually is though. I think the solution comes from those who think the current underlines are ugly finding ways to contribute to improved designs that remain accessible by default so that we don't have a pile of hidden toggles people need to interact with before using a site.

#

Is a wheelchair accessible ramp, rather than stairs a good thing? Doesn't the ramp work for everyone? that's the parallel there. We need to find a way to make the designs more pleasing, but we shouldn't be sacrificing other users actual needs for perceived aesthetics.

ruby elm
#

I think the underlines aren't aesthetically pleasing, however even if a toggle existed I probably wouldn't use it. I even more likely wouldn't know it existed, since non-pleasing underlines are at about 0 on my list of concerns, over just reading the docs and getting on with my day.

halcyon trail
#

I don't really understand the big problem with a toggle here, especially when we already have one

#

it's kind of the classic slippery slope argument

dusk comet
halcyon trail
#

and slippery slope arguments are weak, because nobody is proposing adding a thousand more toggles, we're talking about adding one more toggle

#

if there were zero toggles currently, then that argument might be more compelling, as 0 to 1 is more involved in terms of how it changes support and such

urban sandal
halcyon trail
#

they can, and when someone finds a design that does both well, you can always shift to it. at the moment there's one design that's vastly more popular and by all indications, tends to be preferred.
And another design that's more accessible. So a toggle is reasonable.

#

Like, if there's an argument against a toggle it's not this, it's "someone has to do the work" - this is kind of just arguing for the sake of it 🤷‍♂️

halcyon trail
urban sandal
halcyon trail
#

in new buildings, I still usually see both

#

stairs are a lot more comfortable, convenient and efficient, for people who are healthy enough to easily walk up stairs, which is still a rather strong majority of people

dusk comet
#

on some websites there is a toggle that toggles all accessibility-related things: link underlines, better color contrast, bigger elements and stuff like that
if we really care about that - i think we should make a toggle that toggles all/some of that, instead of enabling this by default

rose schooner
swift imp
#

I don't know what this means

spark magnet
swift imp
#

Right, he's leading the faster CPython stuff at Microsoft

#

After reading what function inlining is, I guess I sort of get it.

jade raven
#

i love mark shannon, he's a wizard that just waves his wand and invokes his dark magic and we get faster code that no one except him can understand

wanton flame
# swift imp How would this be used to provide what benefits

from the PR itself:

Adds a tier 2 optimizer pass that converts the micro-ops for loading globals and builtins to constants.
This should have two benefits:

  • The resulting code is faster
  • It will enable further optimizations thanks to better constant and type propagation.

Benchmarking and stats show a ~1% speedup, but it is the stats that are interesting.

The tier 2 stats show that we have replaced 3 billion guards with 1.2 cheaper guards, and that all LOAD_GLOBAL_MODULE and LOAD_GLOBAL_BUILTINS have been replaced with inline constants.

There are some changes to the tier 1 stats, which I think are a result of not optimizing when the global/builtin keys version doesn't match, so we avoid poor optimization. This drops us back into tier 1 for later re-optimization to tier 2, hopefully when the set of global variables has stabilized.
This shows up in the tier 1 stats and optimization attempts.
What this suggests is that we should be de-optimizing faster in tier 1 in this case, as once the keys version has changed it will never go back to the original value. But that's for another PR.
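
To see the instructions the PR description is talking about, here is a small sketch with `dis`. It only shows the ordinary (tier 1) bytecode for a made-up function, not the tier 2 micro-ops or the optimizer pass itself.

```python
import dis


def total_length(items):
    return len(items)  # `len` is looked up as a global/builtin at run time


opnames = [ins.opname for ins in dis.get_instructions(total_length)]
# Collect the global/builtin loads that the optimizer pass targets.
global_loads = [name for name in opnames if name.startswith("LOAD_GLOBAL")]
```

Each such lookup normally has to check that the globals and builtins dictionaries have not changed, which is the guard cost the quoted stats refer to.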

cyan raven
#

Does CPython specify a custom type, or must every single entry in the symbol table use these?

typedef enum _block_type {
    FunctionBlock, ClassBlock, ModuleBlock,
    // Used for annotations if 'from __future__ import annotations' is active.
    // Annotation blocks cannot bind names and are not evaluated.
    AnnotationBlock,
    // Used for generics and type aliases. These work mostly like functions
    // (see PEP 695 for details). The three different blocks function identically;
    // they are different enum entries only so that error messages can be more
    // precise.
    TypeVarBoundBlock, TypeAliasBlock, TypeParamBlock
} _Py_block_ty;

is there a more low-level type system?
like i16 or whatever corresponds to one instruction.

feral island
#

that enum is used only in symtable.c which is primarily responsible for figuring out where each name is defined and what scope it comes from

#

it's not directly involved with bytecode generation
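
The scope analysis that `symtable.c` performs can be poked at from Python through the standard `symtable` module. A small sketch, reusing the closure example from earlier in the channel as the source text:

```python
import symtable

source = """
def bar():
    x = [5]
    def foo():
        x.append(3)
    return foo
"""

module_table = symtable.symtable(source, "<example>", "exec")
bar_table = module_table.get_children()[0]   # scope of bar
foo_table = bar_table.get_children()[0]      # scope of foo

x_in_bar = bar_table.lookup("x")   # bound in bar
x_in_foo = foo_table.lookup("x")   # free variable in foo
```

This only answers "which scope does each name belong to"; bytecode generation happens in a later compiler stage.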

cyan raven
#

well okay, I suppose.

round path
rose schooner
#

yea

earnest bear
#

yes

fallen slateBOT
#

✅ silenced current channel for 8 minute(s).

#

❌ current channel is already silenced.

winged sphinx
#

!shhh

fallen slateBOT
#

❌ current channel is already silenced.

jovial flame
#

!unhush

fallen slateBOT
#

✅ unsilenced current channel.

ivory zenith
#

Topic

misty oxide
#

When are dis.Positions attached to an instruction included, when are they null, and why are the fields optional?

#

COPY_FREE_VARS for example doesn't have any line number associated with it. But is there a place where it may make sense to say it comes from?

rose schooner
misty oxide
#

I'm aware that it's at the start of the function, yes.

#

More wondering about the general case, I've already discovered fn.__code__.co_firstlineno
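
A sketch for exploring this yourself: list each instruction of a closure with the line number from its position info, if any. The `positions` attribute on `dis.Instruction` was added in Python 3.11, so the code uses `getattr` to stay runnable on older versions; the function names are made up.

```python
import dis


def outer():
    captured = 1

    def inner():
        return captured

    return inner


inner = outer()

rows = []
for ins in dis.get_instructions(inner):
    pos = getattr(ins, "positions", None)  # None on Python < 3.11
    lineno = None if pos is None else pos.lineno
    rows.append((ins.opname, lineno))

first_line = inner.__code__.co_firstlineno
```

On versions that emit `COPY_FREE_VARS`, it should show up at the start of `rows`, and comparing its entry against `first_line` is one way to check what position it carries.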

cold storm
#

Hiiiii everyone

hybrid relic
#

Hello, I've heard cpython is considering a rewrite of its build system, as someone coming from a rather long background of maintaining the JDK's make systems, I'm intrigued, is there anywhere I can view this effort or help out with it?

flat gazelle
#

@hybrid relic ^

hybrid relic
#

Thank you!

unkempt rock
#

Good one

rain trellis
#

Just a procedures question - I've got an open PR that's received some really helpful feedback, and that I've simplified/reduced the scope of radically. Would it be better practice/courtesy to the maintainers to push all the changes to the existing PR, or open a "clean" PR with a new branch?

feral island
rain trellis
#

Makes sense, thank you!

worthy sandal
#

Guys, I really suck at backtracking. When i see a question on backtracking my mind goes blank. Could anyone please suggest some good material to master backtracking

#

I really want to learn it

inland acorn
worthy sandal
inland acorn
#

It might mean going back in a maze and picking a different path, it might mean undoing some parsing and attempting to parse it as something else, etc, etc

#

Think of it almost like creating a save point in a game before trying something, when it goes wrong just load the save and try something else

#

Most backtracking problems can be modeled as a tree, where you are trying to find a specific leaf
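
A minimal sketch of that idea on a toy problem: pick numbers that add up to a target. `find_subset` is a made-up example function and it assumes all numbers are positive, which is what makes the pruning step valid.

```python
def find_subset(numbers, target):
    """Return one list of numbers summing to target, or None."""
    chosen = []

    def explore(start, remaining):
        if remaining == 0:
            return True                    # reached the leaf we wanted
        for i in range(start, len(numbers)):
            if numbers[i] > remaining:
                continue                   # prune (assumes positive numbers)
            chosen.append(numbers[i])      # make a choice (the "save point")
            if explore(i + 1, remaining - numbers[i]):
                return True
            chosen.pop()                   # undo it and try the next branch
        return False

    return list(chosen) if explore(0, target) else None


solution = find_subset([8, 6, 7, 5, 3], 16)  # -> [8, 5, 3]
```

The `append` / recurse / `pop` shape is the part that carries over to most backtracking problems; only the choice being made and the pruning rule change.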

knotty meadow
#

Can anyone help me to learn pandas library ?

quick snow
whole trench
#

hi, to my surprise when i run a python program after a couple of hours, it takes approx 20secs to even start executing the program. attached screenshot shows the time difference between

  • time of command invocation
  • time of main invocation
  • end of main invocation
  • end of command invocation

pls let me know the reason why the time delay

quick snow
rose schooner
#

python/cpython#115480 seems exciting

neon troutBOT
whole trench
quick snow
feral island
#

Most of the time is likely taken by your imports. Just initializing Python should be much faster

#

e.g. for me this is 20 ms
% time python -c pass
python -c pass  0.02s user 0.02s system 55% cpu 0.071 total

whole trench
whole trench
feral island
#

and possibly warming up file system caches

quick snow
feral island
#

(and @quick snow is right that in general imports can run arbitrary Python code, so all kinds of things might be happening)

whole trench
feral island
#

yes, if the pyc files are already there Python will just use them

whole trench
#

not the case here. if i run the same command after an hour's time delay is clearly seen

feral island
#

could also be file system caches, in that case

whole trench
#

but can still see files under pycache folders

feral island
#

OSes often cache recently accessed files in memory, making them much faster to access

#

but if you don't use those files for some time, they might get bumped out of the cache by something else

#

that could be what's going on for you, though it's hard to say without access to your system. you should consider using a profiling tool to get more insights into where your program spends time
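
As a first step before a full profiler, a rough sketch for timing individual imports. `timed_import` is a hypothetical helper and `json` is only a stand-in for whichever modules the real program imports.

```python
import importlib
import sys
import time


def timed_import(module_name):
    """Return (module, seconds spent importing it)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


_, first = timed_import("json")    # may touch the disk if not imported yet
_, second = timed_import("json")   # served from sys.modules, so usually far quicker
```

CPython also has a built-in option for this: `python -X importtime your_script.py` (with `your_script.py` standing in for the real entry point) prints a per-module import time breakdown to stderr.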

whole trench
#

ok. let me spend time profiling to get better insights. thanks again

halcyon trail
#

is there a good summary somewhere of why the GIL-ectomy is so hard

#

not doubting it is of course, I'm just trying to grok it fully

feral island
#

!pep 703

fallen slateBOT
feral island
#

might be a good start

#

maybe Larry has a writeup of his previous project (which used the name "gilectomy"), not sure. Definitely should be able to find videos of his PyCon talks about it

halcyon trail
#

thanks!

spark magnet
halcyon trail
#

Like, you make all the refcounts atomic. okay, that has some performance cost, but incrementing an atomic is still very cheap

mild cobalt
#

Why would it be the case that the effort needs to not slow down single threaded code, or why would it slow down single threaded code?

halcyon trail
#

uhm because there's mountains of existing python code that's single threaded, so e.g. a 30% performance hit or something like that would really suck

spark magnet
halcyon trail
#

I thought mirashii was asking you for clarification 🙂

#

you said "the goal is not to slow down single threaded code" and the first part of their question was "why is that important"

mild cobalt
#

Also hi Ned, long time no see, used to bump into you all the time on Freenode… probably 15 years ago or so now. Good to see you’re still around and answering questions as always

feral island
#

And Python programs change a lot of refcounts
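
A tiny sketch of how ordinary operations touch refcounts, using `sys.getrefcount`. The absolute numbers differ between CPython versions, so only the difference is compared.

```python
import sys

obj = object()
before = sys.getrefcount(obj)

extra_refs = [obj] * 10          # ten more references to the same object
after = sys.getrefcount(obj)

delta = after - before
```

Every one of those reference changes is a write to the object's header, which is why making them atomic is not free across a whole program.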

halcyon trail
#

so I guess, a lot of times the situation may be that rather than incrementing something many times in the "GIL-ectomied" code, you now maybe want to simply for example increment it once at the start, and then do the rest of the work keeping track of the ref count in a local variable

#

i.e. write to the "true" ref count fewer times

#

and take advantage of the fact that if you +1 the refcount once, that's sufficient to ensure it won't be destroyed

#

one thing I will say is that AFAIU, in single threaded code, incrementing an atomic integer should actually be pretty much the same cost as incrementing a normal integer, on x86-64

#

well, very close to it

#

most of the "lot less cheap" of atomics vs non-atomics occurs due to extra cache invalidation due to contention

#

if there's no other threads then you will never get invalidated

feral island
feral island
#

not just slower per core, but slower overall

halcyon trail
#

gotcha

#

that's pretty crazy

#

guess there's a lot of reference counting

#

I guess this is the big advantage of tracing GC's?

#

once you start considering multithreading

mild cobalt
#

But quite insightful

halcyon trail
#

On x86-64, IIRC, acquire-release is completely free

#

and that should be sufficient ordering strength for a refcount

#

at least, that's my understanding

#

actually that may only be for a pure read or write, rather than for a fetch-add

#

at any rate thanks for the link

urban sandal
simple lake
#

Me and a friend are attempting to add a new operator to Python's syntax. We've found where the grammar file is, and have some semblance of an understanding of the lexer/parser/compiler but we still have not found where the definition of any of the keywords or operators are. Does anybody know where we could find this? I'm just looking for the place where it says "Oh yea, when you add, you do 1+1"

feral island
dusk comet
#

do we need refcounts at all? they require memory, they should be updated basically every time you touch the object, every time they are updated they should be checked if they are 0, ...

as i see it, the main benefit of having refcounts is that 99% of objects are collected by RC-GC right at the moment they become trash

if we remove RCs, trash will live longer and garbage collection will become slower because smart GC has to collect all the trash, but is this a problem?

halcyon trail
#

As far as a general approach goes (i.e. without getting into the specific python implementation and issues caused by transitions)
Pretty much all high performance GC's are not refcount + cycle detection

#

they're fancy tracing generational GC's

urban sandal
halcyon trail
#

so "do we need RC's at all?" not as such, no, but transitioning python to tracing GC is probably a pretty insane task

dusk comet
flat gazelle
#

In fact, just about no other implementation of python has reference counts

halcyon trail
#

AFAIU python's GC is very simple: it does the refcount thing, and occasionally it runs something that finds cycles

#

I've never looked at the implementation but that's how it's invariably described

#

that's not how e.g. Java's GC works

flat gazelle
#

There is a concept of generations involved as a heuristic to figure out when to run the cycle detection

halcyon trail
#

sure, but that's not, AFAIU, what is generally meant by "generational" garbage collection

#

that usually refers to putting objects in different allocator areas, that are specialized for different tasks

dusk comet
#

there is an OS called Phantom - it is a persistent OS
they use pretty cool idea for garbage collection: if you take snapshot of the state and slowly find all the trash in it, you can remove this trash from the main state, because trash cannot be resurrected

i imagine in python it could work like this: fork process, run gc in it (main process still does its thing), when gc is done, all found trash is collected in the main process, then gc's process dies

halcyon trail
#

e.g. starting objects in a nursery

#

i mean if python was going to reimplement GC, we don't need to look at relatively obscure OS's: there's a pile of insanely tested, high performance, etc GC's that are used in Java, often for very similar purposes as python code (i.e. backend web servers)

#

but I'm sure python isn't going to reimplement its GC anytime soon

urban sandal
#

I would not be so sure that removing refcounts for a tracing gc would be an uplift in performance for python

flat gazelle
halcyon trail
#

IDK what's different about python, AFAIK all high performance GC's (in basically any sense of the word "high performance") are tracing and not refcount

#

it just boils down to how much you want to rewrite

urban sandal
#

and the languages have other design choices involved as well that help guide some of that

#

it isn't just "This is always better" because that's not really the case

#

there are ways to improve the refcounting performance, I think a few of them have come up over in either the faster-cpython work, or the subinterpreters discussion, but I'd have to go look.

halcyon trail
#

without strong evidence to the contrary I'm going to assume that python isn't different to everything else

dusk comet
#

i dont understand why you say that python's gc is not tracing: it does pretty much the same things that you described and wikipedia article described

halcyon trail
flat gazelle
#

Yeah, that's how the cpython gc works, it reaps the nursery first, and anything that survives leaves the nursery

halcyon trail
#

but "reaping" is only for cycles, is it not?

urban sandal
#

python has both reference counting and a cycle detector. It's a strategy that tends to outperform mark and sweep in single threaded code, and have smaller gc pauses in general
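A minimal sketch of the two mechanisms side by side, assuming CPython (other implementations may behave differently). The `Node` class and `demo` function are made up for illustration. The automatic collector is paused during the demo so that only refcounting and the explicit `gc.collect()` call do any work.

```python
import gc
import weakref


class Node:
    """Plain object that supports weak references."""


def demo():
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running on its own
    try:
        # No cycle: the refcount drops to zero on `del`, freed immediately.
        a = Node()
        ref_a = weakref.ref(a)
        del a
        freed_by_refcount = ref_a() is None

        # Self-referencing cycle: the refcount never reaches zero by itself.
        b = Node()
        b.self_ref = b
        ref_b = weakref.ref(b)
        del b
        alive_before_collect = ref_b() is not None

        gc.collect()  # the cycle collector finds the unreachable object
        freed_after_collect = ref_b() is None
    finally:
        if was_enabled:
            gc.enable()
    return freed_by_refcount, alive_before_collect, freed_after_collect
```

On CPython all three flags should come back true: the plain object dies on `del`, the cyclic one only after a collection.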

dusk comet
#

!e ```py
import gc
print(gc.get_threshold())
```

fallen slateBOT
#

@dusk comet :white_check_mark: Your 3.12 eval job has completed with return code 0.

(700, 10, 10)
flat gazelle
#

but yes, only cycles in practice

#

A tracing, generational GC to handle the complex case, and a refcounting strategy to handle the simple case.

halcyon trail
urban sandal
#

except python isn't the only language that has a hybrid approach

#

and the slowness isn't attributable to that

dusk comet
halcyon trail
#

which language are you thinking of?

ember willow
#

Hello, I learned the basics of Python and I'm learning Django, can anyone help me with what would be the pillars that I should focus on in Django for better learning?

halcyon trail
halcyon trail
#

but they're still different things with different goals. the tracer is looking for all objects that are disconnected, the cycle detector is.... looking for cycles 🤷‍♂️

flat gazelle
#

I mean, it isn't looking for cycles, we call it the cycle detector, but it is just a tracing GC, it finds disconnected cycles by them being... disconnected, not by them being cycles

urban sandal
ember willow
halcyon trail
#

I mean you can call it what you want, it just confuses the conversation and doesn't seem to be in line with the standard usage

halcyon trail
#

(a quick google will find a dozen sources that say that the JVM does not do reference counting, but believe what you want, I've learned in the past it's not likely to be constructive to continue this with you)

urban sandal
#

cool, I didn't say what the JVM used at all in this as it isn't actually relevant to what the right choice is for python

halcyon trail
#

modern optimized garbage collection is hybrid in nature,

#

sure

#

TIL, JVM GC's (all 5) are not modern 🤷‍♂️

flat gazelle
#

JVM GCs are famously bad for latency (though I do believe modern ones are better at this), python does better here since there is less garbage to collect during pauses.

halcyon trail
#

there's two JVM GC's that are optimized for latency

flat gazelle
#

there are now

#

there weren't for decades

#

well, maybe a decade

halcyon trail
#

sure, i'm not offhand sure how that's relevant

#

to be extra clear: nobody was criticizing python for choosing refcount, 30+ years ago

#

or for choosing refcount at all

#

just saying it's not really a terribly efficient option, as it's turned out, but moving away from it probably isn't practical

urban sandal
#

I'm willing to say it's still the correct choice, and that there are gains available without changing the actual gc strategy.

It's also worth pointing out that in compiled languages, lifetime analysis can serve as a substitute for reference counts in the same strategy

#

that doesn't make those strategies non-hybrid in nature, and I don't appreciate the way you're just trying to say that "java does it, must be the only example or only point"

flat gazelle
#

The fact 0 other languages have that as an option makes it a bit hard to actually figure out if it is actually worse (it is arguably worse in multi-threaded), but it's hard to even figure out if it beats a naive mark-and-sweep style implementation.

halcyon trail
#

I'm not even sure what you're trying to say. Reference counting means something specific: objects get destroyed immediately when reference count hits 0.

dusk comet
#

afaik, JS's GC is also causing significant latencies
i might be wrong, im not a JS expert

urban sandal
#

In many compiled and functional languages, you know when nothing has a reference without explicit counting

halcyon trail
#

Java does not do that; neither does Go, etc. Swift does that, but it does not really have a full GC, since it does not have cycle detection (you manually mark references as weak)

#

....

#

k, my fault for continuing to engage

#

anyhow

flat gazelle
#

most languages with refcounts seem to not have cycle detection (swift, lobster, one of nim's modes), some cannot express ref cycles (koka).

#

Most python implementation with alternative GCs also have a JIT of sorts, which once again makes it hard to tell

halcyon trail
#

modern GC's leverage every bit of that flexibility of when to destroy an object

flat gazelle
#

IG once nuitka-python gets their PGO working, we may have an apples-to-apples-ish GC comparison.

halcyon trail
#

that's how they've managed to optimize so much, over the decades

#

refcount + cycle detector basically ruins that because it's no longer really deterministic anyway

#

like, finalizers in python are not more encouraged than they are in Java

#

so, you haven't really gained much

#

in Swift, it's different: you have to handle weak manually, but in exchange you actually get true determinism

#

determinism has a lot of value; it means you for example don't really need context managers anymore

#

Refcount just has much less scope for optimization to start with. but yes, if we actually see more mainstream languages that use reference counting GC, that aren't slower for unrelated reasons (like python)

#

we'll have a better sense

dusk comet
#

relying on finalizers in python is almost always a bad idea, they are the last resort to keep consistency in bad situations

#

cool fact: zip can reuse the tuple it yields, but that happens only if the tuple is not stored anywhere else (RC==1)
(not sure if it is a thing in modern cpython, but iirc Raymond Hettinger was talking about this in some old pycon)

urban sandal
#

If you look at the actual tradeoffs involved, there are better ways to implement reference counting for performance while retaining the benefits of fewer gc pauses. An example for this would be looking at what actually gets generated for rust's Rc vs what python does at the interpreter level.

#

Other languages have made significant improvements in their gc implementations, but attributing the issue to reference counting when there is adequate research showing the known performance issues with reference counting have strategies available to remedy them, and that's been linked, isn't helpful to people interested in making python faster.

flat gazelle
#

You can't really do the level of static analysis that (presumably) rust Rc, nim arc and koka do to elide refcount changes in most cases in python.

#

python is uniquely poorly suited to sound static analysis

urban sandal
feral island
#

unless a debugger messed with the locals

flat gazelle
#

most languages (even dynamic ones) at least give you this as an axiom, python doesn't.

def f():
    x = y = object()
    g()
    assert x is y
halcyon trail
#

If refcount is 1 then it can mutate in place

urban sandal
feral island
flat gazelle
dusk comet
halcyon trail
#

It's hard to imagine how tbh

#

Python is very dynamic

#

Not usually idiomatically, but legally

#

And optimizers have to consider any legal code

feral island
#

I don't think JavaScript is much less dynamic than Python, and there are well-understood techniques for fast JITs in JS

dusk comet
halcyon trail
feral island
halcyon trail
urban sandal
#

what's to say a debugger didn't change function code?

feral island
#

in the PEP 659 adaptive bytecodes you can see lots of DEOPT checks for that sort of thing

urban sandal
#

changing the values of locals wouldn't change the ref counts in any supported way of doing so

spark magnet
urban sandal
#

and yet it's doable with a debugger, or just with ctypes

spark magnet
dusk comet
#
if debugger_is_running(): do_slow_but_reliable_thing()
else: go_fast()
flat gazelle
#

The debugger can still jump to another line in a function in 3.12

raven ridge
flat gazelle
#

Ye, if you attach gdb, all bets are off

dusk comet
urban sandal
spark magnet
flat gazelle
raven ridge
urban sandal
raven ridge
urban sandal
#

I wasn't claiming python can use all of the things available via static analysis

halcyon trail
#

It mostly just makes it possible to avoid them, and allows you to safely use non atomic integers when safe

flat gazelle
#

I assume it does the usual increment reference+decrement reference becomes a no-op thing at least

raven ridge
halcyon trail
#

Are you talking about moves?

urban sandal
#

lifetime analysis in rust allows some reference count updates to be completely elided

flat gazelle
#

If I have a program like
```py
a = b
a = object()
```
then I do not need to actually change the refcount of the object in b (-ish)
halcyon trail
dusk comet
feral island
#

if the remaining number is nonzero, that means there are references outside of the objects the GC tracks
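A toy model of that subtraction step, as a sketch only: objects are plain string names, `refcounts` and `edges` are hypothetical dictionaries standing in for real reference counts and for what `tp_traverse` would report. It is not how the C implementation is structured, just the same idea in miniature.

```python
def find_unreachable(refcounts, edges):
    """Return the tracked objects that only other garbage refers to.

    refcounts: {name: total reference count}
    edges:     {name: [names of tracked objects it references]}
    """
    # Start from a copy of the real refcounts.
    remaining = dict(refcounts)

    # Subtract every reference that comes from another tracked object.
    for source in refcounts:
        for target in edges.get(source, ()):
            if target in remaining:
                remaining[target] -= 1

    # A nonzero remainder means something outside the tracked set
    # (a local variable, the C stack, ...) still holds a reference.
    stack = [name for name, count in remaining.items() if count > 0]

    # Everything reachable from those roots is alive as well.
    reachable = set()
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        stack.extend(t for t in edges.get(name, ()) if t in remaining)

    return [name for name in refcounts if name not in reachable]
```

For example, with `a` and `b` referencing only each other and `c` (held by one outside reference) pointing at `d`, only `a` and `b` come back as unreachable.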

raven ridge
#

you beat me to pasting it 🙂

#

> there is a tp_traverse field that provides information about what references the current object is holding
That's true, but irrelevant here. Consider references on the stack, for instance - the C stack doesn't have a tp_traverse that lets you ask what objects it holds references to.

dusk comet
dusk comet
halcyon trail
#

@flat gazelle so here's an example

raven ridge
#

your intuition is wrong 🙂

halcyon trail
#

Here's the rust code

use std::sync::Arc;

struct Foo {
    x: f64,
    y: f64,
}

#[no_mangle]
pub fn square(x: &Arc<Foo>) -> Arc<Foo> {
    let mut y = x.clone();
    y = Arc::new(Foo{x:3.0, y:5.0});
    y
}
#

I don't think it manages to elide the refcount changes on x

feral island
#

e.g. a C function could do Py_INCREF(obj), then invoke some Python code, then Py_DECREF(obj)

dusk comet
#

ok, that makes sense

dusk comet
feral island
#

if GC runs while the Python code is running, then there may be nothing that holds a reference to obj except for the C stack

halcyon trail
#

@flat gazelle

        push    r14
        push    rbx
        push    rax
        mov     r14, qword ptr [rdi]
        lock            inc     qword ptr [r14]
        jle     .LBB1_3
        mov     qword ptr [rsp], r14
        mov     rax, qword ptr [rip + __rust_no_alloc_shim_is_unstable@GOTPCREL]
        movzx   eax, byte ptr [rax]
        mov     edi, 32
        mov     esi, 8
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        test    rax, rax
        je      .LBB1_2
        mov     rbx, rax
        mov     qword ptr [rax], 1
        mov     qword ptr [rax + 8], 1
        movabs  rax, 4613937818241073152
        mov     qword ptr [rbx + 16], rax
        movabs  rax, 4617315517961601024
        mov     qword ptr [rbx + 24], rax
        lock            dec     qword ptr [r14]
        jne     .LBB1_9
        mov     rdi, rsp
        call    alloc::sync::Arc<T,A>::drop_slow

You can see the lock inc and the lock dec, I'm not an expert on assembly but I'm guessing this corresponds to the atomic increment and decrement

#

i guess @raven ridge can maybe help confirm whether that's correct or not

flat gazelle
#

it is possible it doesn't happen with Arc, ye

#

it does work with Rc if I am reading the utterly incomprehensible assembly right.

halcyon trail
#

that doesn't mean that rust is doing any clever analysis

#

with Rc, you're simply incrementing and decrementing a non-atomic integer

#

it's trivial for LLVM to remove that later

flat gazelle
#

that's very possible, I mostly used rust as the example since it is the most familiar here, Koka and Nim do actually do clever analysis.

#

(koka for sure, could be wrong about nim, but it would surprise me)

halcyon trail
#

I mean if rust did anything clever here, your and sinbad's comments notwithstanding, I'd actually be shocked

#

because that's just not the point of Rust

#

and it's also much harder to be clever when these things are not truly "baked in"

#

Swift refcounts are part of the language

#

Arc has a bit of special treatment but it's basically just another type to the compiler

#

so when it sees .clone() which is running arbitrary custom logic, it's not that easy to just magically make these increment/decrement go away

flat gazelle
#

makes sense

halcyon trail
#

rust's Arc and C++'s shared_ptr aren't fast, you just barely use them, so the language is still very fast on the whole

#

when shared_ptr first came out, some people got really excited and started using them as if it were a GC language, and in fact this made their code really, really slow 🙂

urban sandal
#

I mean, I said Rc, and you turned around and showed it doesn't work with Arc. There's a difference between those, and it matters to the optimizations possible. It's also why I linked a paper about reference counting strategies that decrease the number of atomics needed.

halcyon trail
#

I guess you still don't have a godbolt link?

flat gazelle
halcyon trail
#

I figured 🙂

#

Like I said, Rc in simple cases like these can just be optimized by LLVM; it's the same as doing

x+=1;
x-=1;

of course this gets optimized out. that doesn't imply any clever lifetime analysis

urban sandal
#

What do you think llvm is doing when it optimizes that out?

halcyon trail
#

it sees a non atomic integer incremented, then decremented... so it knows it can optimize it out?

raven ridge
#

escape analysis

halcyon trail
#

nothing to do with lifetime analysis

#

right, as long as the intermediate state cannot be observed

urban sandal
#

okay, now I want you to use your imagination here because the point I was making was about potential optimizations possible for python, and using inspiration from what other languages and compilers are capable of. Sorry if using the word lifetime with regard to rust caused you to get pedantic in a way that doesn't change the overall point

halcyon trail
#

it absolutely changes the point, because this is just a trivial low level optimization, bringing rust's lifetimes and clever static type system into it as you tried to do was totally irrelevant.

raven ridge
spark magnet
#

seems like people are getting sarcastic with each other, which isn't helping understanding. probably people are using words in slightly different ways. clarify.

urban sandal
#

To clarify: If there's no other scope something can possibly exist in, and we have any proof of that, we can elide some number of reference count updates until the scope sharing is possible

spark magnet
halcyon trail
#

I feel like you say a bunch of stuff, and then i ask for concrete examples of what you describe, and the examples never come, and you are more interested in one-upping people here than in constructive conversation 🤷‍♂️

flat gazelle
#

you can't really prove that something doesn't exist in an outside scope as far as python goes

raven ridge
#

you can if it was created in the inner scope

urban sandal
halcyon trail
#

you don't need to prove that it doesn't exist in an outside scope

#

you just need to prove that nobody can observe the intermediate state

spark magnet
halcyon trail
#

but even then, I mean, somebody could hit Ctrl-c couldn't they if the python code is running interactively?

#

forcing an exception up, in between the two lines

feral island
#

I'm also having trouble understanding what the disagreement is at this point

spark magnet
#

@urban sandal in any case, it might be instructive to read about the free-threading work happening now to remove the GIL. they are dealing with many things like this.

halcyon trail
#

well, integers in python are immutable so you'd need to change the example anyway

urban sandal
#

while I appreciate that people haven't followed the whole chat, it gets a bit frustrating to have a bunch of people hop in and miss the overall discussion that's happened, so I'm gonna step back from this.

halcyon trail
#

but in e.g. C++, if you have

mutation();
stuff();
antimutation();

then as long as stuff() doesn't involve any non-inlined function calls, and as long as nothing in stuff() observes anything touched by mutation()

#

and as long as none of it touches atomics (which can be observed by another thread)

#

the compiler can in principle optimize out mutation() and antimutation()

spark magnet
halcyon trail
#

this is actually even true for atomics btw, it's just that fusing atomic writes is kind of crazy and compilers don't really do that

#

in python, I don't think you can ever prove that stuff() doesn't jump outwards because you could for example just force an exception up

#

and probably, many other things

spark magnet
halcyon trail
#

and the whole optimization just becomes infeasible really quickly

flat gazelle
#

Yeah, I remain very unconvinced traditional techniques for improving refcounting performance can be applied to python

feral island
raven ridge
#

interestingly, exceptions contain a reference to the stack of Python frames, which contain references to their locals. So, for instance,
```py
def foo():
    a = 10
    if random.random() < 0.01:
        raise RuntimeError()
    a += 10
    return a
```
Arguably Python shouldn't optimize this down to a return of a literal `return 20`, because if it did, the runtime error would hold a reference to a frame that's missing the `a` local it ought to have
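A small sketch of how a handler can actually observe that local through the traceback. The `fail` parameter and the `local_a_seen_by_handler` helper are made up here so the example is deterministic instead of depending on `random`.

```python
def foo(fail):
    a = 10
    if fail:
        raise RuntimeError()
    a += 10
    return a


def local_a_seen_by_handler():
    try:
        foo(True)
    except RuntimeError as exc:
        # Walk to the innermost traceback entry, which is foo's frame.
        tb = exc.__traceback__
        while tb.tb_next is not None:
            tb = tb.tb_next
        # The frame still carries foo's locals at the point of the raise.
        return tb.tb_frame.f_locals["a"]
```

If the compiler folded `foo` down to `return 20`, there would be no intermediate `a == 10` for the handler to find.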

flat gazelle
#

Yeah, various forms of speculative execution can probably work

halcyon trail
worthy sandal
#

Does anyone have coursera subscription that can be shared?

quick snow
#

!rule ad

fallen slateBOT
#

6. Do not post unapproved advertising.

grave jolt
feral island
grave jolt
#

Hmmmm

#

Yes

#

You out 🤓 ed me

simple lake
#

Do the methods of the tokenizer work in such a way that non-ascii unicode characters are not viewed as one character?

grave jolt
#

!ban @quasi stag scam

fallen slateBOT
#

:x: User is already permanently banned (#94486).

grave jolt
#

welp

feral island
feral island
#

it absolutely matters for optimization in the interpreter

dusk comet
#

it is an abstract "we"
i was talking about core developers
if random is a module, and random.random is documented to be a function, isn't it ok to assume that it is a function at runtime?

#

if someone is monkey patching stdlib - they are intentionally shooting in their own leg

feral island
#

it's part of what Python allows you to do at runtime, so optimizations need to consider that it can happen

#

it's also not inconceivable that it happens in real code; I've seen quite a few tests that mock.patch random.random
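For reference, the kind of test patching being described looks roughly like this. `roll` and `roll_with_patched_random` are hypothetical names; `mock.patch` with `return_value` is the standard `unittest.mock` API.

```python
import random
from unittest import mock


def roll():
    # Looks up random.random at call time, so a patch is visible here.
    return random.random() < 0.5


def roll_with_patched_random(value):
    # Temporarily replace random.random with a mock returning `value`.
    with mock.patch("random.random", return_value=value):
        return roll()
```

Any optimization that baked in "random.random is the stdlib function" would have to notice the patch and back off while it is active.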

ruby elm
#

It's also very possible through beginner spaghetti, especially for things like max and min and list.

raven ridge
#

it doesn't necessarily know whether an import random happened, and even if someone did do import random it doesn't necessarily know that nothing has later overwritten the name random with something else - ```py
import random
random = random.randrange(5)
```

#

and that's before you even get into weird stuff like ```py
import random
globals()[input()] = 42
```

#

any changes to the compiler that would change the behavior of one of these programs aren't optimizations, they're changes to the semantics of the language itself

#

things like this are what quicknir meant yesterday when saying that the language can be very dynamic but almost never is. Though as Jelle points out, monkeypatches in unit tests are one place where Python really can be this dynamic in real world code.

sour thistle
#

database models can also be pretty dynamic in a way

iirc dataclasses uses eval() or exec() to create the init?
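A heavily simplified sketch of the technique of generating an `__init__` from source text with `exec()`. `make_init` is a made-up helper; the real `dataclasses` code handles defaults, type annotations, `__post_init__` and much more.

```python
def make_init(field_names):
    """Build an __init__ that assigns each argument to an attribute."""
    args = ", ".join(field_names)
    body = "\n".join(f"    self.{name} = {name}" for name in field_names)
    if not body:
        body = "    pass"
    source = f"def __init__(self, {args}):\n{body}\n"

    namespace = {}
    exec(source, {}, namespace)  # compile and run the generated def
    return namespace["__init__"]


class Point:
    pass


Point.__init__ = make_init(["x", "y"])
```

After this, `Point(1, 2)` sets `x` and `y` like a hand-written constructor would.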

raven ridge
#

thanks to the hell that is the Python import system, it can't even assume that import random gives it a reference to the random module from the stdlib. Maybe the user has a random.py in the current directory. Or maybe they've got an import hook doing something weird... Or maybe they've assigned to sys.modules["random"].

feral island
#

you can't even assume it's a module
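A sketch of the `sys.modules` case, assuming CPython's usual import behaviour of consulting `sys.modules` first. The helper name `import_random_after_override` is invented for the example, and it restores the previous entry afterwards.

```python
import sys


def import_random_after_override(replacement):
    """Return whatever `import random` binds after sys.modules is edited."""
    missing = object()
    saved = sys.modules.get("random", missing)
    sys.modules["random"] = replacement
    try:
        import random  # picks up the sys.modules entry, module or not
        return random
    finally:
        # Put things back so the rest of the program is unaffected.
        if saved is missing:
            del sys.modules["random"]
        else:
            sys.modules["random"] = saved
```

Passing a string or an int as `replacement` gives back that same object from the `import` statement.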

rose schooner
#

i did not know this was possible ```pycon
>>> def g(a, /, **kwargs):
...     print(a, kwargs)
...
>>> g(5, a=2) # duplicate arguments?
5 {'a': 2}
```

feral island
rose schooner
#

how come i only learned about it now

feral island
raven ridge
#

that's one of the motivations for positional-only arguments - they let you define a function where the **kwargs can use any name, a la dict.update()
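A short illustration of that motivation. `tag` is a hypothetical function: because `name` is positional-only, callers stay free to pass a keyword that is also called `name`.

```python
def tag(name, /, **attrs):
    # `name` can only be given positionally, so attrs may contain "name".
    return name, attrs
```

Without the `/`, the call `tag("input", name="email")` would fail with a "multiple values for argument" error.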

rose schooner
#

what's the point of the func_type mode when compiling? can't it be turned into a Callable/Protocol when evaluating?

unkempt rock
unkempt rock
rose schooner
unkempt rock
simple lake
#

anybody know where these names come from in pycore_ast.h

```c
typedef enum _operator { Add=1, Sub=2, Mult=3, MatMult=4, Div=5, Mod=6, Pow=7,
                         LShift=8, RShift=9, BitOr=10, BitXor=11, BitAnd=12,
                         FloorDiv=13 } operator_ty;
```
File says it's generated by `asdl_c.py` but I don't know what that works off of.
fallen slateBOT
#

Parser/Python.asdl lines 102 to 103

```
operator = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift
             | RShift | BitOr | BitXor | BitAnd | FloorDiv
```
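The same names show up from Python through the `ast` module, which is a quick way to check which ASDL constructor a given operator maps to. `operator_node_name` is just a helper name chosen for this sketch.

```python
import ast


def operator_node_name(expression):
    """Return the AST operator class name for a binary expression."""
    tree = ast.parse(expression, mode="eval")
    # tree.body is the BinOp node; its .op is an instance of Add, Sub, ...
    return type(tree.body.op).__name__
```

So `operator_node_name("a + b")` reports `Add`, and `"a @ b"` reports `MatMult`.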
simple lake
#

thanks

simple lake
#

I have successfully (sort of) made a new operator now :)

#

unfortunately despite it building finally

#

the built version does not accept the new operator
But progress is progress :)

rose schooner
simple lake
#

Nothing except throw a syntax error

#

Cannot figure out how to attach the symbol to a bytecode thingy

rose schooner
#

codegen occurs in Python/compile.c

rose schooner
simple lake
#

me and my friend have the final goal of:
reassignable custom operators with no default behavior.
Currently we're just trying to figure out how one new operator works start to finish, so my idea so far has been to just make an addition clone

#

We also wanted to be able to make operators out of unicode characters, but that's gonna take a bit more work to understand

rose schooner
#

ooh

#

interesting

simple lake
#

Alright the big sticking point I keep coming back to is that in bytecodes.c, I see all these things like BINARY_OP_ADD_INT etc. but didn't know what they meant. By my best following of the macros and whatever, I found this in opcode_ids.h

```c
#define BINARY_OP_ADD_FLOAT                    150
#define BINARY_OP_ADD_INT                      151
```
But like what do these numbers mean? How does it know?
rose schooner
#

they're autogenerated along with the actual despecialized opcodes

simple lake
#

but how does it know that number is related to the operator or keyword or whatever

#

that is to say, how does it know when it sees a plus to go to the BINARY_OP_ADD_...

feral island
#

and at runtime, the specializer may change it to BINARY_OP_ADD_FLOAT if it sees that the code is usually adding floats

#

for your new operator you probably shouldn't worry about specializations
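One way to see what the compiler emits for `+` before any specialization is to disassemble a tiny function. The helper below is a sketch; the exact opcode name depends on the CPython version (a generic `BINARY_OP` on recent versions, `BINARY_ADD` on older ones), so it only filters on the `BINARY_` prefix.

```python
import dis


def add(a, b):
    return a + b


def binary_opnames(func):
    """List the BINARY_* instructions in func's unspecialized bytecode."""
    return [
        instruction.opname
        for instruction in dis.get_instructions(func)
        if instruction.opname.startswith("BINARY_")
    ]
```

Calling `binary_opnames(add)` should give a single entry, the instruction the compiler chose for the `+`.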

worthy sandal
#

Does anyone have coursera subscription?

placid wadi
#

why is my cmd python install going like 40kb/ps

my internet has 40mbps

simple lake
#
```c
typedef struct {
    /* Number implementations must check *both*
       arguments for proper type and implement the necessary conversions
       in the slot functions themselves. */

    binaryfunc nb_add;
    binaryfunc nb_subtract;
    binaryfunc nb_multiply;
    binaryfunc nb_remainder;
    binaryfunc nb_divmod;
    ternaryfunc nb_power;
    unaryfunc nb_negative;
    unaryfunc nb_positive;
    unaryfunc nb_absolute;
    inquiry nb_bool;
    unaryfunc nb_invert;
    binaryfunc nb_lshift;
    binaryfunc nb_rshift;
    binaryfunc nb_and;
    binaryfunc nb_xor;
    binaryfunc nb_or;
    unaryfunc nb_int;
    void *nb_reserved;  /* the slot formerly known as nb_long */
    unaryfunc nb_float;

    binaryfunc nb_inplace_add;
    binaryfunc nb_inplace_subtract;
    binaryfunc nb_inplace_multiply;
    binaryfunc nb_inplace_remainder;
    ternaryfunc nb_inplace_power;
    binaryfunc nb_inplace_lshift;
    binaryfunc nb_inplace_rshift;
    binaryfunc nb_inplace_and;
    binaryfunc nb_inplace_xor;
    binaryfunc nb_inplace_or;

    binaryfunc nb_floor_divide;
    binaryfunc nb_true_divide;
    binaryfunc nb_inplace_floor_divide;
    binaryfunc nb_inplace_true_divide;

    unaryfunc nb_index;

    binaryfunc nb_matrix_multiply;
    binaryfunc nb_inplace_matrix_multiply;
} PyNumberMethods;```
The journey continues...
I found how `abstract.c` and `ceval.c` and `compile.c` eventually end up checking binary operators and it led me here. It seems like these binaryfuncs are supposed to be the operations to carry out. Where are these defined?
feral island
dusk comet
#

slots exist only for performance reasons, right? (and i guess they make it easier to define methods in CAPI)

quick snow
#

IIRC they aren't really faster than normal objects though

dusk comet
#

i am talking about method slots in type objects

dusk comet
feral island
#

It's sort of a weird system with some odd edge cases (like how some operations have multiple slots)

dusk comet
#

two operations can occupy the same slot (like __set__ and __delete__ use the same slot)

and one operation can occupy two slots (+ operator might perform a number addition operation, or a sequence concatenation operation; slots for them are different)

#

if I understand correctly, "slot" is just a pointer in the type struct
the thing i don't understand is how it is possible to sometimes store a c-function in the slot, and sometimes store a python-function
is it storing a pointer to some struct that can help distinguish between these cases? kinda a tagged union

feral island
#

I had to learn how this works to implement PEP 688

#

had to write both a wrapper to call the C slot from Python code and another wrapper to call the Python method from C code
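The two directions are visible from Python, at least on CPython: a builtin type exposes its C slot through a wrapper object, while a class written in Python just has a function. `Meters` and `describe_add` are invented for this sketch.

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Defined in Python; CPython fills the type's add slot with a
        # C helper that looks up and calls this method.
        return Meters(self.value + other.value)


def describe_add(cls):
    """Report what kind of object cls.__add__ is."""
    return type(cls.__add__).__name__
```

On CPython, `describe_add(int)` reports a `wrapper_descriptor` (Python calling into the C slot), and `describe_add(Meters)` reports a plain `function`.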

simple lake
#

I feel like i got offered the red pill or the blue pill and chose wrong after digging into how Python works

#

just put me back in the simulation lol

pliant tusk
dusk comet
#

where can i read about this helper function?

simple lake
#

yea, is there any information on slots and what-not somewhere?

feral island
cunning gulch
#

Can anyone tell me how to be a shark in algorithms?
Some good sites where i can test myself or understand in easy. I feel like my teacher is bad at formulate things so i try to Think about homestudy for that class.

winged sphinx
pastel sky
#

hi

#

anyone here?

sand kestrel
boreal umbra
rain trellis
#

@final geode Would you or another member of the Microsoft team be willing to do a benchmark run on a branch of mine with superinstructions? JeffersGlass/cpython -> justin-supernodes-onlypairs https://github.com/JeffersGlass/cpython/tree/justin-supernodes-onlypairs

Local testing suggests it's ~6% faster than main + JIT currently, but I'd be curious what a benchmark on official hardware would look like

final geode
#

Sure!

#

Should be done in 2.5 hours or so.

#

Hm, is your branch based off of CPython's main, or my old justin branch? If the latter, the benchmarking may be using a commit prior to the JIT for the base comparison...

#

Just something to be aware of.

rain trellis
#

Good to know! Have not rebased it to current main yet - I'll look at the reference commit once the results are out and take that into account.

#

If the "official" benchmark results are still promising, I'll rebase on top of current main, and see about doing a larger/smarter selection of Superinstruction pairs

boreal umbra
#

I'm not advocating for it, but has there ever been discussion about adding do-while to Python, or a documented reason for why it's never been part of the language?

feral island
final geode
# rain trellis <@762845558435217449> Would you or another member of the Microsoft team be willi...

Looks like your branch failed to build.

python3.11 ./Tools/jit/build.py x86_64-pc-linux-gnu --file ./Tools/jit/superinstructions.csv
Traceback (most recent call last):
  File "/home/benchmarking/actions-runner/_work/benchmarking/benchmarking/cpython/./Tools/jit/build.py", line 6, in <module>
    import _targets
  File "/home/benchmarking/actions-runner/_work/benchmarking/benchmarking/cpython/Tools/jit/_targets.py", line 14, in <module>
    import _jit_c
  File "/home/benchmarking/actions-runner/_work/benchmarking/benchmarking/cpython/Tools/jit/_jit_c.py", line 156
    raise ValueError(f"Wrong number of first_nodes {len(first_nodes)=}\n{'\n'.join(str(f) for f in first_nodes)}")
                                                                                                                 ^
SyntaxError: f-string expression part cannot include a backslash
feral island
#

but I'm guessing we want a slightly bigger compatibility range

final geode
rain trellis
#

Ah gotcha, didn’t realize that. I’ll tweak things to be 3.11 compatible and test locally, and ping you again when it’s ready?

final geode
#

Sounds good. Maybe do the rebase too while you're at it, if it's not too much trouble?

rain trellis
#

Can do! May not be today

dusk comet
pseudo mason
#

yall whats the use case of classes

winged sphinx
rain trellis
#

@final geode That justin-supernodes-onlypairs branch should now be Python 3.11 compatible - it was all things in f-strings that were legal in 3.12 but illegal in 3.11 (nested same-type quotes, and backslashes).

It's also rebased on top of main as of last night. Locally, I'm showing about ~4% speed increase over main+jit with 251 superinstruction pairs.
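For anyone hitting the same error, a sketch of a 3.11-compatible way to write that kind of message: build the joined text outside the f-string so no backslash appears inside a replacement field. The function name is made up, and this is not necessarily the change made on the branch.

```python
def format_first_nodes_error(first_nodes):
    # Join outside the f-string; "\n" inside {...} is a SyntaxError on 3.11.
    listing = "\n".join(str(node) for node in first_nodes)
    return f"Wrong number of first_nodes {len(first_nodes)=}\n{listing}"
```

The backslash in the literal part of the f-string is fine on every version; only the expression part was restricted before 3.12.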

final geode
#

Same as before, give it 2.5-3 hours.

#

Hm, I'm still getting a bunch of build errors (this time when compiling template_2.c). Are you able to build locally?

final geode
rain trellis
#

Let me have a look

final geode
#

Seems like you might need to redefine some macros or something.

#

Check out template.c for some examples.

rain trellis
#

I thought I had fixed this particular issue... let me see if I did a bad git again...

final geode
#

Longer error notes for a couple:

In file included from cpython/work/template_2.c:97:
cpython/Python/executor_cases.c.h:3825:17: error: use of undeclared identifier 'next_instr'
                GOTO_TIER_ONE(target);
                ^
cpython/Python/ceval_macros.h:424:5: note: expanded from macro 'GOTO_TIER_ONE'
    next_instr = target; \
    ^
In file included from cpython/work/template_2.c:97:
cpython/Python/executor_cases.c.h:3825:17: error: use of undeclared identifier 'next_instr'
cpython/Python/ceval_macros.h:425:5: note: expanded from macro 'GOTO_TIER_ONE'
    DISPATCH(); \
    ^
cpython/Python/ceval_macros.h:99:9: note: expanded from macro 'DISPATCH'
        NEXTOPARG(); \
        ^
cpython/Python/ceval_macros.h:153:30: note: expanded from macro 'NEXTOPARG'
        _Py_CODEUNIT word = *next_instr; \
                             ^
In file included from cpython/work/template_2.c:97:
cpython/Python/executor_cases.c.h:3825:17: error: use of undeclared identifier 'opcode'
cpython/Python/ceval_macros.h:425:5: note: expanded from macro 'GOTO_TIER_ONE'
    DISPATCH(); \
    ^
cpython/Python/ceval_macros.h:99:9: note: expanded from macro 'DISPATCH'
        NEXTOPARG(); \
        ^
cpython/Python/ceval_macros.h:154:9: note: expanded from macro 'NEXTOPARG'
        opcode = word.op.code; \
        ^
rain trellis
#

Ah, I had committed one of the superinstruction templates to git instead of letting it be auto-generated each time, and it was using an outdated version. I'm going to leave that checked in for now, since that branch now builds successfully on a clean pull, and maybe move those templates to a TemporaryDirectory so they don't hang around.

#

Should now (actually) be good to go. Apologies for the trouble.

final geode
final geode
rain trellis
#

Well dang, my bad again, I didn't understand fully how the jit dependencies in the makefile worked, which interacted badly since jit.c is built at compile time in this branch. I have fixed this.

This does currently work for me locally:

git clone https://github.com/jeffersglass/cpython.git
cd cpython
git switch justin-supernodes-onlypairs
./configure --enable-optimizations --with-lto=yes --enable-experimental-jit
make
make test

However since this is the fourth time I've claimed this is ready... let me try from scratch a couple more times and step through the steps in _benchmarking.yml as well, to see if there's anything else I can find that is broken.

#

Sorry for the mess, and thank you for your time

hybrid relic
#

isn't that yes in --with-lto redundant

#

--with options in autoconf are automatically set to yes when they're given in the --with form (not --without) with no argument passed

rain trellis
dapper lily
#

Your messages have been removed for being way off-topic

robust gust
#

Can anyone point me to where Python does integer arithmetic in CPython? I've been searching the source code and have been having trouble locating this. IIRC CPython uses infinite-width integers and I want to see the actual source code for this

fallen slateBOT
#

Objects/longobject.c line 3641

_PyLong_Add(PyLongObject *a, PyLongObject *b)
feral island
#

worth remembering that ints are called "long" in the C source (if you've used Python 2 you might understand why)

#

elsewhere in this file is the type definition for the long objects, where it has definitions for "slots" like nb_add that map to the arithmetic operations

#

from there you should be able to trace down to the functions where the actual arithmetic is done
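Both points can be poked at from the Python side. This is only an illustrative sketch with made-up variable names: `sys.int_info` reports how CPython splits an int into fixed-size "digits", and `int.__add__` is the Python-level face of the `nb_add` slot.

```python
import sys

# Python ints have arbitrary precision: the value grows past 64 bits without overflow.
big = 2**200 + 1
total = big + big

# CPython stores the magnitude as an array of "digits" (15 or 30 bits each, depending on the build).
bits_per_digit = sys.int_info.bits_per_digit

# The + operator dispatches to int.__add__, which is backed by the nb_add slot.
same_total = int.__add__(big, big)
```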

hybrid relic
#

is the _continuation module something Python has or is it a PyPy extension?

dusk comet
#

!e import _continuation

fallen slateBOT
#

@dusk comet :x: Your 3.12 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "/home/main.py", line 1, in <module>
003 |     import _continuation
004 | ModuleNotFoundError: No module named '_continuation'
dusk comet
#

no, there is no such module

hybrid relic
#

!e import _socket

fallen slateBOT
#

@hybrid relic :warning: Your 3.12 eval job has completed with return code 0.

[No output]
grave jolt
unkempt rock
halcyon trail
#

Does someone know the algorithm that python uses to actually salt hashes

#

for strings

#

i.e. how it combines the deterministic hash of the string itself, with the randomly generated salt?

#

seems like it's SipHash, found a link to the paper, good enough
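A rough way to observe this from Python, assuming a CPython build where `PYTHONHASHSEED` is honoured. SipHash is a keyed hash, so the random value acts as the key rather than being mixed in afterwards. The helper name `hash_with_seed` is made up for this sketch.

```python
import os
import subprocess
import sys

# Name of the string hash algorithm this build uses, e.g. "siphash13" or "siphash24".
algorithm = sys.hash_info.algorithm

def hash_with_seed(text, seed):
    """Return hash(text) as computed by a fresh interpreter started with PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(out.stdout)

# The same seed reproduces the same hash; a different seed will normally give a different one.
first = hash_with_seed("spam", 1)
second = hash_with_seed("spam", 1)
other_seed = hash_with_seed("spam", 2)
```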

boreal umbra
#
In [12]: exp = (TypeError, ValueError)

In [13]: try:
    ...:     raise ValueError('hi')
    ...: except exp as e:
    ...:     print(e)
    ...:
hi

I didn't expect this to work.

flat gazelle
#

Interesting, will it also respect __instancecheck__?

boreal umbra
#

what is __instancecheck__?

flat gazelle
boreal umbra
flat gazelle
#

Create a metaclass with this dunder, then pass its instance to the except clause

feral island
flat gazelle
#

Good to know

feral island
#
>>> class Meta(type):
...     def __instancecheck__(self, arg): return True
... 
>>> class E(Exception, metaclass=Meta): pass
... 
>>> isinstance(ValueError(), E)
True
>>> try: raise ValueError()
... except E: pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    try: raise ValueError()
         ^^^^^^^^^^^^^^^^^^
ValueError
dusk comet
#
class Meta(type):
     def __instancecheck__(self, arg): return True
     def __subclasscheck__(self, arg): return True
class E(metaclass=Meta): pass
raise E()

Traceback (most recent call last):
  File "D:\_.py", line 302, in <module>
    raise E()
TypeError: exceptions must derive from BaseException
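The two experiments above can be put side by side in one runnable sketch. The function names are invented for illustration. It shows that `isinstance` consults the metaclass hooks while `except` matching in CPython does not, and that an `except` clause accepts any expression that evaluates to an exception class or a tuple of them.

```python
class Meta(type):
    def __instancecheck__(cls, instance):
        return True

    def __subclasscheck__(cls, subclass):
        return True

class E(Exception, metaclass=Meta):
    pass

def caught_by_e():
    """Report whether `except E` catches a ValueError."""
    try:
        raise ValueError("hi")
    except E:
        return True
    except ValueError:
        return False

handled = (TypeError, ValueError)

def caught_by_tuple():
    """Report whether a tuple stored in a variable works in an except clause."""
    try:
        raise ValueError("hi")
    except handled:
        return True

claims_instance = isinstance(ValueError(), E)  # the hook says yes
```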
dusk comet
cloud crypt
gray galleon
#

are python strings stored in UTF-8 internally

hybrid relic
#

this is a little complicated but

#

If you're using PyPy, Python always stores strings as UTF-8

#

if you're using the official Python, cpython

#

If I remember correctly it switches between UTF-16 and UTF-32 depending on the context

quick snow
#

CPython stores strings as either Latin-1, UCS-2, or UCS-4, depending on what's the highest codepoint.

#

The reason it doesn't do UTF-8 is because it's variable-length, so you can't do O(1) random access.
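The per-character width can be estimated from Python by comparing the sizes of two strings of the same kind. This is a CPython-specific sketch, since `sys.getsizeof` reports implementation details, and `bytes_per_char` is a name made up for the example.

```python
import sys

def bytes_per_char(ch):
    """Estimate storage per character by growing a string by 1000 copies of ch."""
    return (sys.getsizeof(ch * 1001) - sys.getsizeof(ch)) // 1000

widths = {
    "ascii": bytes_per_char("a"),
    "latin-1": bytes_per_char("\u00e9"),    # e with acute accent, fits in one byte
    "ucs-2": bytes_per_char("\u20ac"),      # euro sign, needs two bytes
    "ucs-4": bytes_per_char("\U0001f600"),  # emoji outside the BMP, needs four bytes
}
```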

prime estuary
#

There is though a C-API function you can call to request a UTF-8 buffer. If it's all ASCII data it'll return the Latin-1 buffer (since it'll be identical), otherwise it'll allocate and cache an additional buffer for you. So you can treat them like they contain UTF-8 data.

gray galleon
#

not relevant here?

#

but really just enter the correct password lol
it's not a blue screen error

gray galleon
#

will python have piping
a |> b |> c |> d == d(c(b(a)))

dusk comet
#

feel free to submit a pep

spark magnet
half wolf
# gray galleon will python have piping``` a |> b |> c |> d == d(c(b(a))) ```

Various versions of this have been suggested, but afaik always shot down. Personally I do quite a bit of work on Elm and I vastly prefer writing all my code avoiding that stuff. Makes my Elm look more like a lisp, but it's just much simpler to refactor and for someone who doesn't know functional languages well (including me!) to work with it.

dusk comet
#

btw this is doable without new syntax
you can override __rmatmul__ on functions and other callables to achieve that
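A minimal sketch of that idea. Plain function objects don't let you add `__rmatmul__`, so this uses a small wrapper class; the `pipe` name is hypothetical and not from any library.

```python
class pipe:
    """Wrap a callable so that `value @ wrapped` means `wrapped(value)`."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __rmatmul__(self, value):
        # Reached when the left operand does not implement @ itself (true for ints).
        return self.func(value)

@pipe
def double(x):
    return x * 2

@pipe
def inc(x):
    return x + 1

# @ is left-associative, so this evaluates as str(inc(double(3))).
result = 3 @ double @ inc @ pipe(str)
```

This only works when the left operand doesn't define `@` itself; a NumPy array on the left, for instance, would try a matrix multiplication first.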

rose schooner
radiant garden
dusk comet
#

>> and << would be nice for piping in different directions

grave jolt
#

Yeah, this feature doesn't work well without currying or some shorthand syntax for partial application

halcyon trail
#

idk if by partial application, you would include lambdas or not

grave jolt
#

Though even in Haskell/Elm you'll need a lambda or flip fn if the argument is in the "wrong" position

halcyon trail
#

yeah, python lambdas sucking is definitely a big issue with this approach

#

C++ has similar issues with ranges. Not as extreme in the sense that C++ lambdas are less limited, but they are very verbose

flat gazelle
#

You can make the pipe more syntactic by making it e.g. insert its result into a call as the first/last argument, akin to clojure arrow macros or raku feed operators.

radiant garden
#

placeholder notation is nice for pipes

halcyon trail
#

yeah, you just still have a mess when it's in the wrong position

#

coincidentally i was just involved in a discussion about a placeholder proposal for C++

grave jolt
#

[&](){}()

halcyon trail
#

it's just really brutal to not have nice lambdas from day 1, what can you say. very hard to retrofit. closes a lot of doors

halcyon trail
flat gazelle
#

Honestly, pipes aren't that great if you have to spam lambdas to get them to behave.

grave jolt
#

iife 🙂

flat gazelle
#

Really, you want pipes from day 1 so that your APIs are designed with the arguments in the right places.

halcyon trail
#

eh I think you have this backwards

radiant garden
halcyon trail
#

you have nice lambdas, because "piping" isn't something you do with a handful of pre-selected functions

#

it's a general approach to data manipulation

flat gazelle
#

I guess kotlin's way works actually, that's true.

halcyon trail
#

kotlin's way is basically "the way" at this point

#

C#, Rust, Swift

#

they all do something very similar

#

because it's a pretty simple approach that's flexible and works great. extensions + concise lambdas, boom.

flat gazelle
#

I am more used to the concept in lisps and functional languages

radiant garden
halcyon trail
#

you definitely can. it just somehow doesn't feel as nice when you're writing all your piping nested inside some function call

grave jolt
radiant garden
#

order of evaluation

#

python isn't lazy

halcyon trail
#

yeah, you'd need to pass them as lambdas

flat gazelle
#

For the simple case, you could pass tuples

grave jolt
#

You'd need to pass the function and the args separately yeah

quick snow
#

The better_partial lib supplies a decorator to decorate these functions with so you don't need to pass them in some lazy way

halcyon trail
#

that's generally one of the nice things about lambdas though, and why they're such a good building block.
there are a decent number of use cases where you want to pass operations around and not have them evaluated immediately

radiant garden
quick snow
#

But yeah, I feel like anyone dabbling in esoteric Python has at some point written a shell-pipe based lib now

#

IIRC in R a function can choose whether to take an argument evaluated (as in Python) or as some kind of AST-node-like thing

halcyon trail
#

ah R

quick snow
#

(I implemented this in Python once, although obviously not as convenient as in R land)

rose schooner
rugged flower
#

makinator

worthy sandal
#

hi guys, could anyone tell me how to take a sliding window of length 3 (k=3) over nums = [1, 3, -1, -3, 5, 3, 6, 7]? I did something like this but it is not right.

for i in range(len(nums)):
    arr = nums[i:i+k]

could anyone tell me the logic behind creating a sliding window?
steel solstice
#

you can just discard the lists that aren't of len 3
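Instead of discarding short slices afterwards, the loop can stop early: a list of length n has n - k + 1 full windows of length k. A small sketch, where `sliding_windows` is just an illustrative name:

```python
def sliding_windows(nums, k):
    """Return every contiguous slice of length k, left to right."""
    return [nums[i:i + k] for i in range(len(nums) - k + 1)]

nums = [1, 3, -1, -3, 5, 3, 6, 7]
windows = sliding_windows(nums, 3)
```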

rose schooner
hasty turtle
#

is that a Python bug, or works as intended?

feral island
#

!e import dis; dis()

fallen slateBOT
#

@feral island :x: Your 3.12 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "/home/main.py", line 1, in <module>
003 |     import dis; dis()
004 |                 ^^^^^
005 | TypeError: 'module' object is not callable. Did you mean: 'dis.dis(...)'?
feral island
#

It's to decide whether to show the "Did you mean" suggestion

subtle vector
#

😮 .. that actually makes sense.

hasty turtle
dusk comet
#
on_error:
  if callable(getattr(mod, mod.__name__, None)):
    show_nice_error()
hasty turtle
#

the nice error detection has side effect.....😅

gray galleon
neat delta
#

Rejected

i do enjoy, however, that

[ TBD: Guido, amend/confirm this, please. Preferably both; this is a PEP, it should contain all the reasons for rejection and/or reconsideration, for future reference. ]
is still there

jade raven
#

i like it

gray galleon
#

yeah

raven ridge
#

it's reasonably intuitive, but so is list(range(5, 1, -1))

jade raven
faint river
#

but now ranges aren't lists so 🤷‍♂️

raven ridge
#

and newbies do learn type conversions pretty quickly - at least int(x) and str(x). It's not a big leap from there to list(x)
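For reference, here is what the existing spelling gives you today. A range is a lazy sequence rather than a list, but it still supports indexing, membership tests and `len`:

```python
r = range(5, 1, -1)

as_list = list(r)        # materialise it only when a real list is needed
second = r[1]            # indexing works without building the list
has_three = 3 in r       # membership is computed arithmetically
is_list = isinstance(r, list)
```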

jade raven
#

fair

gray galleon
#

range literals are short

#

and can be optimized at compile time

soft drum
#

I'm using Ubuntu-22.04 on wsl which comes with python 3.10. I downloaded 3.12 and the pip along with it.
Now I'm getting this error: ModuleNotFoundError: No module named 'apt_pkg'

rose schooner
quick snow
peak spoke
rose schooner
native flame
#

this is how julia handles those

julia> [5, 6, 1:6, 7, 9]
5-element Vector{Any}:
 5
 6
  1:6
 7
 9

julia> [x:y for x in (1, 2) for y in (3, 4)]
4-element Vector{UnitRange{Int64}}:
 1:3
 1:4
 2:3
 2:4

julia> [(8:14:-1)]
1-element Vector{StepRange{Int64, Int64}}:
 8:14:7
dusk comet
#

this julia looks like python

gray galleon
#

ig maybe you don’t need [ and ] to create a range literal?
0:5 == range(5)
1:2 == range(1, 2)
[1:7] == [range(1, 7)]
[*1:7] == list(range(1, 7))

umbral plume
#

[]-less syntax for range literals sounds like it'd clash with the syntax for type annotations - theoretically you'd want to be able to make ranges out of variables and not just literals (i.e. like doing x = start:end), but if you were to put that range literal by itself as a line of code, it's indistinguishable from a type annotation
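The clash is easy to confirm with the `ast` module: a bare `start: end` line already parses as an annotation, while the same colon inside brackets is a slice.

```python
import ast

# A standalone "name: expr" line is an annotated assignment without a value.
stmt = ast.parse("start: end").body[0]

# Inside a subscript, the colon produces a Slice node instead.
subscript = ast.parse("items[start:end]").body[0].value

kinds = (type(stmt).__name__, type(subscript.slice).__name__)
```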

gray galleon
#

1..7 but that’s new syntax so fair enough

peak spoke
flat gazelle
#

I would take new syntax over a:b being a slice sometimes and a range other times.

native flame
gray galleon
#

what would step range syntax be by then tho

#

0..10..2?

peak spoke
#

Ah yeah that would be ambiguous, but considering it's a no-op when interpreted as a range I think it'd be fine if it was the annotation

gray galleon
quick snow
cyan raven
#

Do PEG grammars have regex support? What does Python use for such things? Like, what would a comment rule look like?

comment: '#' what_here?

This is not allowed based on the published paper.

([\n]!)*
feral island
cyan raven
feral island
#

otherwise it'd be very annoying because comments can appear everywhere, so you'd have to add them to every single rule
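In CPython, comments are recognised by the tokenizer before the grammar rules ever run, which the standard `tokenize` module makes visible. A small sketch:

```python
import ast
import io
import tokenize

source = "x = 1  # set x\n"

# The tokenizer emits comments as their own COMMENT tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
comments = [tok.string for tok in tokens if tok.type == tokenize.COMMENT]

# The parsed tree contains only the assignment; the comment is gone.
node_count = len(ast.parse(source).body)
```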

cyan raven
#

well yes, I understand that.

rose schooner
steel solstice
#

Isn't this a breaking change?

#

Cause if I do x: int I've now made a slice?

rose schooner
#

it's like a named expression

radiant garden
#

it is a breaking change for anything that uses annotations, like dataclasses

peak spoke
#

But there's no real use in interpreting it as a range in that context

#

I guess there could be some class magic that can work with the standalone expression but that's an extreme edge case and could work with the normal constructor

radiant garden
#

it's still semantically different (x:y requires x and y to be bound variables)

#

(and of course with metaclasses nothing is safe)

rose schooner
#

if a literal was restricted to those contexts it wouldn't be breaking

#

either range or slice literal

radiant garden
#

sure

boreal umbra
#

I don't have any issues with default arguments and how they're specified wrt mutable objects. Though if they didn't currently exist as a feature, and the specification were being debated, do you think the consensus would be that the default argument should essentially be a no-argument lambda?

def my_func(a, b=[]):  # `[]` is an expression that's re-evaluated each time `my_func` is called without a value for `b`.
    ...
native flame
quick snow
#

And you wouldn't even have the same counterspell (giving the function/lambda a default argument and putting the value into the closure)

halcyon trail
#

what's the other gotcha here?

#

FWIW this is how default arguments work in pretty much all other languages I can think of.

quick snow
#

!e

funcs = []
for i in range(3):
    def func():
        print(i)
    funcs.append(func)
for func in funcs:
    func()
fallen slateBOT
#

@quick snow :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | 2
002 | 2
003 | 2
halcyon trail
#

what does that have to do with default arguments

quick snow
#

It would make nested functions always behave like this.
You couldn't solve this by giving func a i=i argument anymore.

halcyon trail
#

I'm not sure I follow

#

it wouldn't change the behavior of nested lambdas at all.

flat gazelle
#

previously doing def func(i=i) changed that behaviour to print 0 1 2
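Side by side, the two variants look like this. The functions return values instead of printing so the results are easy to compare, and the names are made up for the sketch.

```python
def late_bound():
    funcs = []
    for i in range(3):
        def func():
            return i          # looks up i when called, after the loop has finished
        funcs.append(func)
    return [f() for f in funcs]

def early_bound():
    funcs = []
    for i in range(3):
        def func(i=i):        # the default captures the current value of i
            return i
        funcs.append(func)
    return [f() for f in funcs]
```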

halcyon trail
#

but then you say: "you couldn't use (abuse) default arguments to do this thing"

#

sure. that's two separate points.

feral island
halcyon trail
#

I think I'm pretty okay with that, tbh, I think func(i=i) is pretty hacky and bad for readability

flat gazelle
#

but I do agree that having default arguments computed at call time makes more sense

feral island
#

So you couldn't use this quirk to get nested functions to behave in a sometimes more intuitive fashion

halcyon trail
feral island
halcyon trail
#

oh, for sure, no argument there.

feral island
#

Personally I think late-bound arguments are generally the better default, but it's too late in Python's evolution as a language to change the behavior now

halcyon trail
#

but "breaking change of existing behavior" was never in this discussion

#

the original comment was:

Though if they didn't currently exist as a feature, and the specification were being debated, do you think the consensus would be
And then that was followed up by a link to a proposal that would give it special syntax

feral island
#

Adding late-bound arguments as an option might be feasible but all the syntax feels ugly to me

halcyon trail
#

Agree on the first point

#

The second, don't have much opinion on. Having half a dozen

if x is None:
    x = dict()
quick snow
#

Consider this:

def foo():
    x = 1
    def bar(y=x):
        print(y)
    return bar

You'd have to scan that "lambda" for any mentions of names that occur in the defining frame's locals.
What should happen if that expression doesn't resolve (e.g. typo) at call time? NameError?

I don't know. Maybe I've just used Python so long that only the current behavior makes sense to me.

halcyon trail
#

at the start of functions is also pretty ugly

quick snow
#

x = x or {} :)

feral island
#

def f(x=types.MappingProxyType({})): 😄

halcyon trail
#

abusing truthiness, you truly have used python for too long 😛

quick snow
#

It's use, not abuse. Quack. :P

halcyon trail
#

😛

flat gazelle
#

Yeah, I wouldn't use a non is check there

halcyon trail
#

as for your example, idk, that example just isn't going to behave intuitively IMHO, in general, because it's not intuitive code.

#

it's cool to demonstrate quirks of python but in a real codebase you best believe anyone who writes that is just getting told to rewrite it

#

so whether early binding defaults make that example more or less intuitive, is for me, a total non-factor

quick snow
halcyon trail
#

what's important is that early binding makes the 99.99% common case unintuitive

flat gazelle
#

There are ways to solve it without the default argument abuse though

halcyon trail
#

it was a demonstration of how default argument hacks can help solve that problem

flat gazelle
#

for example
@default_arg(name=value)
def inner_function(a, name): pass
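One possible way such a decorator could be written. `default_arg` is hypothetical, not an existing library function, and this sketch simply leans on `functools.partial` to bind the value at decoration time.

```python
import functools

def default_arg(**bound):
    """Hypothetical decorator: pre-bind keyword arguments when the function is defined."""
    def decorate(func):
        return functools.partial(func, **bound)
    return decorate

def make_functions():
    funcs = []
    for value in range(3):
        @default_arg(name=value)     # value is evaluated here, once per iteration
        def inner_function(a, name):
            return (a, name)
        funcs.append(inner_function)
    return funcs

results = [f("x") for f in make_functions()]
```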

halcyon trail
#

but they're not the cause of the problem

#

Isn't the exact same thing that causes the headaches with default arguments, what also causes the problem with your example?

#

early binding?

#

I just tried this with kotlin fwiw, to keep it with GC languages:

fun main() {    
    val funcs = (1..3).map { {print(it)} }
        
    for (f in funcs) {
        f()
    }
}

Prints 123

quick snow
halcyon trail
#

🤷‍♂️

#

so far all your example shows is that python's behavior around this stuff is very unintuitive in two common examples

#

and you're basically saying you wouldn't want to see one fixed, because that would remove a hacky solution for the other

quick snow
#

Nobody's talking about fixing anything anyways; this isn't going to change. It's about "what if Python was designed today".

I find the current behavior very intuitive, I don't have anything else to say I haven't already.

halcyon trail
#

I've never met someone who said they found the result of

def foo(x=[]):
    x.append(5)
    print(x)

foo()
foo()

intuitive before, so that's interesting for sure
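For anyone who hasn't run it: the default list is created once, when `def` executes, and is shared by every call. A sketch of that behaviour next to the usual `None` sentinel workaround (the functions return the list instead of printing so the results can be compared):

```python
def foo(x=[]):
    x.append(5)
    return x

first = list(foo())    # copy, because foo keeps returning the same list object
second = list(foo())

def bar(x=None):
    if x is None:      # build a fresh list on every call instead
        x = []
    x.append(5)
    return x

fresh_first = bar()
fresh_second = bar()
```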

flat gazelle
# halcyon trail I just tried this with kotlin fwiw, to keep it with GC languages: ```kotlin fun ...

fwiw, kotlin does the same thing as python here if you use the same variable in all the lambdas - kotlin creates a new variable in for loops (and obviously map, python would do the same thing with map too), so it avoids the weirdness by having more scopes. e.g.

fun funcs(): List<() -> Int> {
    var res: MutableList<() -> Int> = MutableList(0){{0}}
    var i = 0;
    res.add{i}
    i++
    res.add{i}
    i++
    res.add{i}
    for (j in 1 .. 3) {
        res.add{j}
    }
    return res
}

does 2 2 2 1 2 3
halcyon trail
#

the thing is that i++ "should" create a new variable as well

#

Like, your brain would like to believe:

    var i = 0;
    val closure = i;
    i += 1;
    print(closure)
#

and closure here is still 0

#

at least, my brain would like to believe that.

#

it's like the closure is binding to the name itself, rather than to the value the name is bound to, which is how assignment works in these languages

flat gazelle
#

It's more common since it lets you mutate the variables from inside the closure, that is

def foo():
    hits = 0
    def handler():
        nonlocal hits
        hits += 1
    run_thing(handler)
will work (in languages without `nonlocal`, it is the only useful default).
(fwiw java does forbid this, you cannot mutate variables if they are also used in a closure, both from inside and outside the closure).
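A self-contained version of that pattern, with `run_thing` replaced by direct calls. `make_counter` and `current` are names invented for the sketch. Both inner functions share the same variable rather than a snapshot of its value.

```python
def make_counter():
    hits = 0

    def handler():
        nonlocal hits      # rebind the enclosing variable, not a new local
        hits += 1

    def current():
        return hits        # reads the same variable that handler updates

    return handler, current

handler, current = make_counter()
handler()
handler()
count = current()
```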
flat gazelle
#

I would argue this should be the programmers choice, like in some algol and C++, but that then gives you yet another binary choice to teach.

halcyon trail
#

It's a bit imprecise

#

Things could still be mutated from inside the closure

#

They just couldn't be reassigned

flat gazelle
#

Ye, that's what I meant

halcyon trail
#

Integers happen to be immutable

#

But I see your point

flat gazelle
#

I have had to write int[] hits = new int[1] in java exactly once for exactly this reason.

halcyon trail
#

but arguably this is no different than how functions cannot "mutate" their integer arguments

flat gazelle
#

Ye, and that is also occasionally annoying

halcyon trail
#

the solution is just to find a different way to express it, and given all the potential headaches with shared mutability in GC languages

#

is this ever really a bad thing, considering the long term?

jade raven
#
import tkinter as tk

class App():
    def __init__(self):
        self.root = tk.Tk()
        self.root.geometry('350x350') 
        self.root.title('Example')
        self.root.mainloop
        return

if __name__ == '__main__':
    App()
#

when running this snippet on the REPL, after putting in the code and then hitting enter, the GUI opens.

#

however this won't run when you do it via python main.py because it's missing the parens around mainloop, i feel this is a bug
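The missing-parentheses part is easy to isolate without tkinter. In this sketch `Window` is a stand-in class, not the real `tk.Tk`, and it says nothing about why the REPL behaves differently; it only shows that naming a method without calling it is a valid expression that does nothing.

```python
class Window:
    def __init__(self):
        self.loop_started = False

    def mainloop(self):
        self.loop_started = True

window = Window()

window.mainloop                      # only looks the bound method up; nothing is called
started_without_parens = window.loop_started

window.mainloop()                    # the parentheses are what actually call it
started_with_parens = window.loop_started
```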

#

@spark verge asking here since the thread closed

spark verge
#

It will flash open the window and close it again, it does run

jade raven
#

it doesn't even flash the window and close it for me

#

it simply runs without opening the GUI

#

as expected

spark verge
#

But if you put a time.sleep in somewhere then you'll see it

jade raven
#

before or after the mainloop?

#

i'm not seeing it

spark verge
#

If you hit newline/carriage return maybe the window will close

jade raven
#

the UI only opens when you hit enter (on repl)

spark verge
#

Right

jade raven
#

so shouldn't this be a bug regardless? not sure

spark verge
#

Does it disappear after _ gets reassigned?

jade raven
#

no

spark verge
#

Oh weird maybe that's a bug that tk doesn't cleanup on __del__

hazy jasper
#

Hi, all - I wanted to have a discussion on Python Docstrings styles, comparing two specific styles. Is this the right channel or should this go in a Help Channel? Didn't want to post full blurb until I was in the right space.

spark verge
feral island
#

or a documentation channel?

hazy jasper
#

I'll post here for now, @spark verge / @feral island , and I can always pivot if need be!

A few weeks ago, I discovered the following style of DocStrings and really liked it:

def fahrenheit_to_celsius(fahrenheit):
  """Converts Fahrenheit to Celsius.

  Args:
    fahrenheit: The temperature in Fahrenheit. (float)
  Returns:
    celsius: The temperature in Celsius. (float)
    """

However, yesterday, I discovered another style that I can sympathize with as a good style (albeit a bit more busy/cluttered):

def repeatStr(userStr, repeatTimes=5):
  """
  Accepts user input string & returns string output repeated on separate lines.

  Args:
    userStr: string
      The string to be repeated.
    repeatTimes: int (Optional, Default=5)
      The number of times to repeat the string.
  Returns:
    outputStr: string
      The string repeated on separate lines. 
  """

The 1st style to me feels more Pythonic, although I could see it quickly broaching line length limits from various formatters/linters.

I wanted to get ya'll's thoughts on each or a preference as I work to build my own understanding of best practices/personal preferences.

feral island
# hazy jasper I'll post here for now, @spark verge / @feral island , and I ca...

There are different styles used by different projects, I think "Google-style" and "numpy-style" are common terms but I haven't familiarized myself with them much. In your second style, I don't like that the default value is repeated from the signature. (And the type too, if you add type annotations, as you probably should.) That is bad because there's always a risk that the docstring and code go out of sync with future changes.
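A sketch of what that could look like: types and the default live only in the signature, and the docstring sticks to meaning. The snake_case names are an adaptation for this example, not the original code, and `inspect.signature` shows that tools can still recover the default without it being written twice.

```python
import inspect

def repeat_str(user_str: str, repeat_times: int = 5) -> str:
    """Return user_str repeated on separate lines.

    Args:
        user_str: The string to repeat.
        repeat_times: How many lines to produce.
    """
    return "\n".join([user_str] * repeat_times)

# The default and the annotations are read from the signature, not the docstring.
params = inspect.signature(repeat_str).parameters
default_times = params["repeat_times"].default
annotated_type = params["repeat_times"].annotation
```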

spark verge
#

There's always PEP 727

hazy jasper
feral island
hazy jasper
feral island
hazy jasper
grave jolt
#

the __init__ signature is more than 750 lines long

#

also, all the languages they surveyed used a microsyntax inside the docstring, so it's not clear what that survey is for 😛

grave jolt
# grave jolt here's how the author of PEP 727 uses it (via `typing-extensions`) <https://gith...

Take this:

routes: Annotated[
    Optional[List[BaseRoute]],
    Doc(
        """
        Note: you probably shouldn't use this parameter, it is inherited
        from Starlette and supported for compatibility.

        ---

        A list of routes to serve incoming HTTP and WebSocket requests.
        """
    ),
    deprecated(
        """
        You normally wouldn't use this parameter with FastAPI, it is inherited
        from Starlette and supported for compatibility.

        In FastAPI, you normally would use the *path operation methods*,
        like `app.get()`, `app.post()`, etc.
        """
    ),
] = None,

You could make it a little less verbose by just improving the writing inside the docstrings:

routes: Annotated[
    Optional[List[BaseRoute]],
    Doc("Routes to serve incoming HTTP and WebSocket requests."),
    deprecated(
        """
        It is inherited from Starlette and supported for compatibility.
        Instead use the *path operation methods*, like `app.get()`, `app.post()`, etc.
        """
    ),
] = None,

but that's still more verbose than:

routes: Optional[List[BaseRoute]] = None,

Routes to serve incoming HTTP and WebSocket requests.

deprecated: It is inherited from Starlette and supported for compatibility.

Instead use the path operation methods, like app.get(), app.post(), etc.

#

Some languages support annotations for parameters:

@doc("Routes to serve incoming HTTP and WebSocket requests.")
@deprecated(
    """
    It is inherited from Starlette and supported for compatibility. 
    Instead use the *path operation methods*, like `app.get()`, `app.post()`, etc.
    """
)
routes: Optional[List[BaseRoute]] = None,

though this is a pretty big addition to the language
peak spoke
#

It always felt like this would need an ide to nicely work with as it adds quite a bit of noise to the signature

#

typing alone is a bit too much already in some cases

grave jolt
#

Yeah, it acts as a painful roadblock when reading the code

#

With a single docstring, you can at least click a > button on the left of most editors (and on GitHub) and hide it all

hazy jasper
peak spoke
#

Yes but having to use a properly configured IDE to read some code isn't ideal

hazy jasper
#

IDE doesn't have to be the only way to read the code - it's my understanding this would flow through things like python3 -m pydoc any_module_name_here , unless I'm overly optimistic about PEP 727 implementation

hazy jasper
# hazy jasper I'll post here for now, @spark verge / @feral island , and I ca...

Tying back to this, I think I'm leaning towards Style #1 based on the above convo until a more formal definition comes out.

def repeatStr(userStr, repeatTimes=5):
  """
  Accepts user input string & returns string repeated on separate lines.

  Args:
    userStr: The string to be repeated. (string)
    repeatTimes: The number of times to repeat the string. (int) <Optional> 
  Returns:
    outputStr: The string repeated on separate lines. (string)      
  """

Sample with the Optional added in angle brackets & default removed from Args, only shown in the parameter list (where readers can know it's defaulted since it has a value set)

urban sandal
hazy jasper
#

Now I'm thinking it's redundant to state Optional when a default value is set

grave jolt
#

I feel like this style of parameters often just restates what has already been said

#
def repeatStr(userStr: str, repeatTimes: int = 5) -> str:
  """
  Repeat string on separate lines.
  """
#

annotations for individual parameters are more useful when they are more or less independent. Like if the function had a boolean flag that changed how it behaves in an edge case

quick snow
hazy jasper
grave jolt
hazy jasper
# quick snow But `Optional` doesn't mean "may be omitted", it means "...or None"

In the case of the above, a default value is set, so "Optional" to me meant "It's optional for a user to update this beyond the default value, but not required for successful execution". In your statement of "or None", my intent with a default value is the program executes whether or not that variable is supplied elsewhere as a parameter, so I think we're saying the same thing?

grave jolt
#

yeah, "optional argument" does mean that, it's the typing.Optional that's the misnomer
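The two meanings can be shown next to each other. The function names here are invented for the sketch: a parameter with a default may be omitted, whereas `typing.Optional[X]` only says the value may be `None`.

```python
from typing import Optional, Union

# Optional[X] is just shorthand for "X or None".
same_type = Optional[int] == Union[int, None]

def repeat(text: str, times: int = 5) -> str:
    """times may be omitted, but None is not a valid value for it."""
    return "\n".join([text] * times)

def find(text: str, needle: str, start: Optional[int] = None) -> int:
    """start may be None, meaning: search from the beginning."""
    return text.find(needle, 0 if start is None else start)
```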

hazy jasper
grave jolt
#

I'd rather someone put too much into a Docstring than not enough or not write one at all
I'm not quite sure about that. I think it's worth teaching to keep the right balance

#

Comments aren't free, they require maintenance. An incorrect, redundant or confusing comment makes the code harder to read, not easier

hazy jasper
#

"too much" meaning my sample from 1:44 pm MT. If that's "too much", that's way better than none at all; I can see the value of your suggestion from 2:01 PM MT, but it comes across a bit more difficult to interpret as I imagine a beginner reading it (similar to if I asked him to write a list comprehension already... that would be ill of me so soon)

hazy jasper
grave jolt
#

Yeah, especially if the docstrings are mandatory (i.e. if every function and class requires a docstring or the CI will fail)

hazy jasper
fallen slateBOT
#

bot/exts/info/codeblock/_parsing.py lines 57 to 62

class CodeBlock(NamedTuple):
    """Represents a Markdown code block."""

    content: str
    language: str
    tick: str
grave jolt
#

It's going to be a mystery what tick is, if you don't already know that the Python bot will complain at messages using the "wrong tick"
'''py
print(42)
'''

hazy jasper
#

That is indeed a good example of that. Had me turning my head and squinting my eyes until I realized the problem in your sample.

vital urchin
#

is garbage collection called between cpython frame objects within the eval loop? or when is it called?

feral island
#

more seriously, I think it can currently happen after every bytecode instruction? previous versions allowed it anywhere when an allocation was triggered
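Exactly when the collector runs is an implementation detail, but the split between reference counting and the cyclic collector can be observed through the `gc` module. A sketch that turns automatic collection off so the timing is under our control (`Node` and `make_cycle` are illustrative names):

```python
import gc
import weakref

class Node:
    pass

def make_cycle():
    """Create two objects that reference each other and return a weak reference to one."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)

was_enabled = gc.isenabled()
gc.disable()                          # no automatic cyclic collection from here on
try:
    ref = make_cycle()
    alive_before = ref() is not None  # refcounts never reach zero, so the cycle survives
    gc.collect()                      # explicit run of the cyclic collector
    alive_after = ref() is not None
finally:
    if was_enabled:
        gc.enable()
```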

uneven raptor
#

now how bad of an idea is this

import foo
bar = foo(…)
lucid snow
#

so, what are the chances of that? as far as I know, stuff like module level __getattr__ is a thing that's recognized by linters, so why not do that for other module level "magic" methods such as __call__ and so on
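The current state of things can be checked with a throwaway module object. In this sketch the module named `demo` is built in memory purely for illustration: a module-level `__getattr__` is honoured, while a module-level `__call__` is simply ignored.

```python
import types

demo = types.ModuleType("demo")
exec(
    "def __getattr__(name):\n"
    "    return name.upper()\n"
    "def __call__(*args):\n"
    "    return 'called'\n",
    demo.__dict__,
)

# Missing attributes fall back to the module-level __getattr__.
looked_up = demo.spam

# Calling the module still fails; the __call__ defined above has no effect.
try:
    demo()
    callable_module = True
except TypeError:
    callable_module = False
```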

fallen slateBOT
uneven raptor
#

could certainly save some boilerplate in smaller modules

feral island
#

was just rejected

#

and so was

#

!pep 713

fallen slateBOT
uneven raptor
#

:(

#

any reason

lucid snow
#

nooo, my hopes and dreams... crushed bread_pensive

uneven raptor
#

oh wait there’s hope

#

apparently if they come up with a good reason for it they'll consider revisiting

feral island
jade raven
#
uneven raptor
#

i saw

vital urchin
#

on an unrelated note: is there a way to add a gdb breakpoint within a .py file?