#internals-and-peps
i don't really like depending on the system python generally anyway
it's definitely not a blast
i've been meaning to write a blog post about how micromamba should, IMHO, simply be the default recommendation to people at this point, to get up and running with python
https://www.bitecode.dev/p/why-not-tell-people-to-simply-use reading this is what made me want to write it
Have you read this? https://discuss.python.org/t/wanting-a-singular-packaging-tool-vision/21141
Just wanted to second what Steve said. In whatever way it is possible, rallying towards providing the kind unified tooling experience like Rust has would have huge benefits for the Python community. The kind of unified, cross platform experience the Rust team has managed with rustup/cargo/rustc that bootstraps so well would be great. For Python...
It's a long discussion but should be pretty relevant to your blog post
I remember reading this once and I don't think I agreed with everything there
but yeah there are too many ways that's for sure
that's a lot to read
I gave up about halfway through
but still learned a lot
I want micromamba hatch to be the tool for everything
but yeah, everything I skimmed basically reinforced my view - there's no obvious technical issue with micromamba, or at least, nothing that it does worse, except that some relatively obscure packages may be available on pip, but not conda
how obscure is a big question
it's just a weird thing where a surprising number of people haven't heard of it - or just don't like it for kind of shallow reasons ("dont like the UI" was mentioned in the thread somewhere)
I honestly haven't been able to get people at work to use anything other than poetry or plain old pip
pretty obscure, but obviously I can't guarantee that everything is there. You also do have some ability to install pip packages into a conda environment. You do have to be a lot more careful when doing that though.
you are surprised people haven't heard of micromamba (I hadn't). We are surprised you hadn't heard of hatch. The ecosystem is very large.
you haven't heard of conda either?
oh, sorry, I have heard of conda.
it started as a packaging tool, but is growing to do more things.
And even the huge popularity of poetry was almost not enough to make people agree to switch to it. people really don't like changes to their workflows with tools they haven't heard of
a project manager
that's understandable, but the win of conda over every alternative I've heard of is so massive
I'm not sure what's micromamba if that helps lol
but what is a "project manager" - that sounds like a person, not software
this feels like there should be an xkcd about it
i find that when people feel like one tool is a clear winner, it's because they have unstated constraints or requirements that are different than other peoples'
micromamba is a package manager/environment creator - it creates environments that are the same as conda environments, and uses conda packages.
it's basically just a different application for creating conda environments.
mamba is a variant of conda most notably featuring a much faster solver. miniconda and micromamba are minimalist distributions of conda/mamba that don't include conda's default 100ish packages
Not really - I just honestly think that most people doing python just don't methodically approach their native dependencies
I've yet to speak to someone who doesn't use mamba/conda and actually understands what's happening with their native dependencies, to be totally honest
what is "solver"?
that sounds a bit like, "most people are not as good at this as I am"
most people doing python don't know as much about the native ecosystem as I do, yes - I'm primarily a C++ developer, who does a fair amount of python, so that's not too surprising.
I've also done a fair amount of work supporting quantitative python distributions and environments on a variety of pretty old servers - and it's pretty wild how big of a hole you can dig yourself into
conda is mostly used by people needing to install non-Python packages (science, ml, etc). Most Python devs don't need those packages.
they need packages that need those packages
my limited understanding is that it's supposed to be your one-stop-shop for developing in Python. Running scripts, creating environments, building packages, static analysis, test execution
lots of python packages depend on native packages
if you use a database - your python package is probably a wrapper around a C library
etc
everything you need for python development in one tool
dependencies and such. when an update or new package is requested, the program has to find a state of versions of all installed packages that's compatible with the change request, and occasionally has to install new packages. mamba is conda rewritten in C++ and thus makes this process much faster. it has other nice features, though, too
like, things have improved a lot in the sense that now people use docker - so at least your native packages are reproducible - if they worked once, then they'll continue to work on other servers
but your native packages still aren't solved properly
note that "native packages" includes the python interpreter itself
@neat delta @halcyon trail a quick tiny-poll, have you heard of uv?
I have
I see
astral's pip variant iirc, yes
interesting that uv made more noise than hatch
most of what uv is trying to do, micromamba already does - just better, since uv still doesn't handle what micromamba does
funny how that works
yes I'm not trying to compare, I personally don't think uv is production-ready
micromamba kind of achieves the gold standard for what you want in bootstrapping an environment
it's a statically linked executable, so it can be installed by literally just downloading it
and it installs everything for you
the python interpreter, all the python packages, all the native dependencies of all the python packages
all correctly "solved"
when you create a mamba/conda environment, your only dependency outside that environment will be libc and a handful of similar, ultra low level libraries.
sound very intriguing but I'm not sold yet haha
Send your blog post here when you get around to it!
idk, this seems like the dream to me
@halcyon trail you're talking as if the Python world is littered with poorly installed dependencies. I don't see that. What am I missing?
I care much more about this stuff than what you said
I'm not sure what you mean by that
oh, thanks
i didnt think it is such an expensive process
i said earlier that most python devs don't need native packages. You said they do because of transitive dependencies, and you are saying conda does a good job of installing them where other package managers don't.
I don't really understand why I need a "one stop shop" for, for example, static analysis and test execution - everyone is going to want potentially different static analyzers, or unit test frameworks, and you can just install the ones you want and run them.
unless i misunderstood.
Like I don't understand the value add I suppose.
yes.
that implies that people not using conda have bad installations?
I'm saying when you setup a pip environment, for example, you'll probably have packages that depend on libcurl - where does that libcurl come from? is it going to be the right version?
is it the wrong version?
No - most of the time it "just works out" - until it doesn't, and that's when people suffer and complain that things don't work, and they don't understand why
there's no guarantee it's the right version
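One way to look at the "where does that libcurl come from?" question is to ask the standard library what the system itself would hand you for a given library name. This is only a rough sketch: `locate` is a hypothetical helper name, the result depends entirely on the machine it runs on, and it says nothing about copies bundled inside wheels.

```python
import ctypes.util


def locate(name: str):
    """Return the library name the system resolves for `name`, or None.

    Hypothetical helper: a thin wrapper around ctypes.util.find_library.
    """
    return ctypes.util.find_library(name)


# Output differs per machine; a missing library shows up as None.
for lib in ("curl", "ssl", "z"):
    print(lib, "->", locate(lib))
```

Two machines with the same pip requirements can print different results here, which is the gap being described above.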
Seems to be a non issue with plain pip and wheels
native package = package written in C?
or a package that provides bindings for external (unrelated to python) project?
i don't see that many people complaining that things don't work
(things like that)
it depends just how heavy your native dependencies are - the lighter they are, the more chance you have for things to happen to work
This is a matter of "convention of configuration". 99% of the projects don't need anything special. I don't want to rethink what static analysis tool or test framework I need. I want a consensus on the best practice and an easy way to do that. I trust smarter people than me to choose what's the best one.
how do you figure that?
i think they are saying the same thing I am: 98% of the time, it works fine.
(probably more than that)
if nothing else, I can offer that I see plenty of people complaining about trying to debug their application, or setup their development environment, inside docker.
presumably, they do that to reproduce the native environment - if it was just python packages, they could simply pip install it to whatever machine and that would be sufficient.
with conda, you don't need that - it's far easier to activate a conda environment and start developing in that, than to work inside docker.
wheels allow installing the right binary version, pip uses them, seems to work fine for the vast majority of packages and I don't see too many users complaining, like before wheels. So I guess I figure by having a vague feeling?
Anyway @halcyon trail I think you should really read that thread if you want more insight on why not everyone uses conda, Paul isn't on this server I think but he specifically presents some use cases there.
the versions of the binaries are still not solved. afaics, in cases where this happens, the dependencies are simply installed privately to that pip version.
that doesn't really work unfortunately. Or rather - it will work until it doesn't.
there are reasons why conda isn't used by everyone
wheels may bundle their binary dependencies, which solves some problems at the cost of more space usage
I skimmed a good chunk trying to figure out and didn't see anything concrete - if you have something though by all means.
You also mention docker. I've been programming in Python for about 20 years, never touched docker, conda, etc.
I mean even in this convo - I'm not seeing good reasons why not. Just discussion whether the reasons to use it are so strong.
Skimming is a start but you should actually read it
well... okay? :-). What should I add to that? I'm not sure how you ensure that things will work between dev and prod, or between different servers - these are both ways of ensuring some degree of consistency. Maybe you have another way, maybe you work in a domain where you don't need it, etc
this is really the whole crux of it - you cannot really "bundle" dependencies that are .so's
like you "can" but it's just fundamentally a flawed approach
it's just question of how long you get away with it before there's a collision
pretty long, I'd say
these dependencies are not private, they are public, and two versions cannot coexist in the same process
as long as you're not touching GPUs
err what
that's clearly not the case because the entire quantitative python ecosystem uses exactly these tools - regardless of GPU's.
it just takes a handful of common native dependencies in your python ecosystem to turn this into a mess.
I just never ran into this, I don't know
I believe you when you say it happens
99% of my python packaging problem are with sdists
what are "sdists"
I guess this stuff - https://docs.python.org/3.10/distutils/sourcedist.html
basically installing from source
you download the code and run a setup.py
(at least that's what pip does for you)
that causes problems since you actually have missing dependencies very often
one last example I guess - even the python interpreter itself.
if you maintain a library, you may well want to run unit tests against multiple versions of some of your dependencies - your biggest dependency is obviously python itself.
with micromamba, python itself is just another package in your environment
so its trivial to, for example, have two lock files - one that specifies all your package dependencies along with python 3.12, and one that specifies your package dependencies along with python 3.10
you just activate each environment, the same way as you would activate a venv - except when you do, you also change python versions
run your tests with each environment active, in CI
Also re wheels -
https://discuss.python.org/t/native-dependencies-in-other-wheels-how-i-do-it-but-maybe-we-can-standardize-something/23913
the part on "why not conda" is pretty funny
I'd like to discuss mechanisms for python packages/wheels to depend on native dependencies that were installed by other python wheels. Specifically, I've already solved this problem in a custom build tool and I think the way it's implemented would be useful for other projects if this pattern + implementation details existed in some standard/stand...
why is that funny?
i think it's funny how it's similar to the discussion before, at least the parts of it I was able to skim
there's no technical discussion - just "I had a bad experience, so lets just not talk about it"
to be fair, your criticism of not using conda is "things might break eventually."
That seems pretty unfair - maybe your intent here is to be tongue in cheek?
no, my point is that you also don't have deep details about how they might break. There are tooling alternatives, and people choose different ones, but you seem to be saying there is One Right Way and if people don't choose it's because they aren't good at their jobs.
I've seen things break first hand, many many times. There's an entire sub-domain within python that uses this tool, because they've seen things break many, many times.
And I've also provided a bunch of other examples of benefits - nobody has provided any of the benefits of pip, conversely (well, I did - I said a package might be on pip, but not on conda-forge)
I've given deep details here? I addressed to Jelle, for example, why wheels approach simply isn't scalable?
i should stop. i'm not trying to convince you to switch from conda.
I also linked a thread that's trying to solve problems with wheels that they don't solve out of the box.
I mean, that's fine, I don't think you are - I just don't understand why you're saying - "you haven't said anything".
I've written a fair amount, and it's all been pretty technical and specific, and I've admitted it will not affect every python project to the same degree.
Like, let me help you - if you know good reasons not to choose micromamba/et al, over pip based solutions, other than package availability, can you please tell me what they are?
dont you, by using micromamba, have to install stuff in the mamba way? python, venvs, etc?
I have no problem believing they exist - people just seem oddly reticent to say what they actually are - hence why it's funny to me to see 2 threads in a row where people just vaguely handwave and say "I had a bad experience"
i was forced to use micromamba to reproduce an issue about 3 months and it was a nightmare.
I've never heard of micromamba, and I guess have never had a problem that it solves better than pip.
it replaces venvs
right, so it forces me to do things the mamba / conda way
if you never heard of it, how would you know that...
it takes away my freedom of choice.
err what?
if you use pip, or venv, you also have to do things the pip way, or the venv way.
ok, i've never had a problem that pip doesn't solve.
if i use pip i have alternatives
same for venv
because you're layering another thing on top of it? I don't really understand what you're saying.
Do you have a specific technical problem with how mamba worked, or is it just "freedom of choice" ?
The thing is, many of the solutions you mention (binaries, multiple python versions, etc.) have been solved one way or another to the extent users and library maintainers don't see a need to find a better tool for the job. Which conda may well be.
i did when i was forced to use it, it was a convoluted specific way of doing things, their way that i struggled with and put me off
hello
I have my code but how to execute it ?
like if I coded a bot that will do something on discord how do I execute it ?
and should I use visual studio pycharm or other ?
because it opens me the console
which users, and which library maintainers? A huge fraction of python users are using these tools as standard. For sure, there are some users that have less need of solving these issues. but then you're also back at fragmentation.
idk, obviously it's very hard for me to address this without specifics. installing micromamba, creating an environment, and activating it, overall seems pretty intuitive to me - it's about 4 lines
i can say the same for vanilla python + venv and pip.
i guess a lot of the convo was spawned off the "singular packaging tool" - maybe the overall point here is that mamba/conda could work as a singular packaging tool.
Mostly what I'm hearing is "yeah it's cool that mamba does this but it's overkill" - rather than actual problems. But for many people, the problems it solves are necessary.
it's always worked well for me.
err, great? Did I criticize the usability of python/venv/pip? No.
But for many people, the problems it solves are necessary.
so does poetry, venv, uv, etc
i think what ned was trying to say is that for us, we've always used pip because it's never lacked functionality
what problem does venv solve for you that micromamba doesn't? Like, something specific, not "my freedom of choice" or "the UI was convoluted but I don't remember how"
if it did, and micro mamba had come up while looking for missing functionality, we would've been aware of it
they both create isolated environments, but venv setup was much simpler for me
Like, I've given a very specific problem that conda/mamba solves that pip does not - would be nice if you could do the same, rather than these vague pronouncements
Most users that respond to the Python Survey: https://lp.jetbrains.com/python-developers-survey-2022/
okay, maybe im missing something
given a very specific problem that conda/mamba solves that pip does not
what problem?
micromamba sounds great. I've just never needed it, and I had never heard of it.
Yes, more people use pip than conda.... but many people use conda. That's why conda support is pretty standard in all the major IDE's, you might notice.
they're not building out that support for 5 people
agreed
yes, conda is widely used.
I mean that's literally what the entire conversation was about - micromamba handling the solving and installation of native dependencies - including even python itself
are you talking about external dependencies like idk, cuda / nvidia drivers?
not drivers - libraries
.so's, on linux, for example.
and the python interpreter itself - the python interpreter is a native dependency of any python code
those are compiled shared libraries no?
yes
how often do you encounter the case where a simple pip install wouldnt install the necessary dependencies for you?
And quite a bit of library maintainers use other tools to handle python versions: https://github.com/search?q=path%3A%2F^tox.ini%24%2F&type=code . So all I'm saying is there are lots of people for whom the issues conda solves are solved already.
it installs it, but often not correctly. I encountered that case constantly in the past - for people using things like numpy, scipy, etc
i mean, it has happened to me, but 99% of the time it's been external dependencies that are behind a proprietary wall of access or hard to track down
is that rather not an issue of the library maintainer not setting up the install correctly?
Yes - this kind of thing isn't a very good solution.
There isn't dependency resolution, for example, between the python libraries and the python interpreter
but it might be "good enough"
but I feel like we already agreed upon that - obviously many people don't use conda, and presumably most of those people use something "good enough" - ergo, for many people, solving the problem even if it's not very correct, is "good enough"
What does "dependency resolution between the python libraries and the python interpreter" mean? It'll install the correct binary wheels for each Python version, no?
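For context on the binary wheels question: a compiled extension module is built for a particular interpreter ABI, and the interpreter only imports files whose suffix it recognises. A small sketch using only the standard library to show what the running interpreter accepts; `interpreter_abi_info` is a hypothetical helper name and the values differ per platform and Python version.

```python
import importlib.machinery
import sys
import sysconfig


def interpreter_abi_info() -> dict:
    """Collect the details that tie compiled modules to this interpreter."""
    return {
        # Major/minor version of the running interpreter.
        "version": tuple(sys.version_info[:2]),
        # Suffix the build system gives extension modules for this interpreter.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        # Every suffix the import system will accept for extension modules.
        "accepted_suffixes": list(importlib.machinery.EXTENSION_SUFFIXES),
    }


for key, value in interpreter_abi_info().items():
    print(key, "=", value)
```

So picking the matching wheel per Python version does happen, but the interpreter version is chosen outside the resolver rather than being one of the things it solves for.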
if i release a library that claims that works on python 3.12, and when my users install it on 3.12 and it doesnt work, thats an issue with my library, not with pip
this just lists some python versions
but the version of python and the version of packages you want to install affect each other - it might even lead to the environment being unsolvable, right?
If you want to do this, you resolve a dependency file against different python versions, and solve the environment to produce a lock file
it's just a function of pip not being able to handle some things
but that's the maintainer responsibility to pin their requirements against working dependencies.
It's possible, I guess? You can declare specific dependencies for specific Python versions, that's how I usually see it being done in the wild.
I'm not sure how else to explain. The maintainer can't do such a thing - the issue is that you need to resolve dependencies - but things like binaries aren't represented as packages in pip - so they cannot be solved correctly.
Sure, and typically it'll work, as long as your project is relatively small, and/or you don't try too hard to maintain testing on older python versions, etc etc. It's an ad hoc solution but it can certainly work, especially if the python interpreter is your only native dependency.
but hopefully like, you can see the issue, right?
dependency resolution is this problem, it gets represented as a graph, etc etc.
python packages all declare their dependencies, the graph has to be solved correctly, this is non-trivial - pip has a proper solver now I believe, but in the past it didn't, and this was bad (if you've been programming python for a while maybe you remember it causing issues in the past)
the python interpreter is ultimately a dependency like anything else - if you depend on a python package foo, and dozens of other packages, and you want that to be resolved to a reproducible environment, it needs to be resolved simultaneously with the python interpreter version - and any other native packages that these dozens of packages depend on. Anything shy of this is just fundamentally ad hoc.
i hear what you are saying. Our point is that the failure case you are talking about is rarely seen by many devs.
You can be very specific about Python versions and package versions in your deps. Or you can be more lenient, e.g. https://github.com/igraph/python-igraph/blob/main/tox.ini
sure - just to be clear - I totally accept that.
I guess I'm not understanding how this constitutes what I said - solving all of your dependencies simultaneously.
somewhere, you want to feed in information about all the dependencies you want at the same time:
- python=3.7
- foo>=2.1
- bar>=3.2
- blub
etc. You feed these all in - you try to solve the graph. blub, foo, bar, can depend on each other, and on python, so you try to find a solution that satisfies all the dependencies, and results in the latest possible versions.
you save that solution to a lock file - now you can reproduce that exact environment.
you repeat that process for all the different sets of dependencies you want to test against - maybe you also want to test against python=3.8 holding everything else constant, or maybe you want to test where python=3.8 and foo>=3.0, etc
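The process described above can be sketched as a toy brute-force solver. Everything here is made up for illustration: the `INDEX` contents, the version numbers, and the `solve` function are hypothetical, and real solvers (conda, mamba, pip) are far more sophisticated than trying every combination. The point is only that python is resolved in the same pass as foo, bar and blub.

```python
from itertools import product

# Hypothetical package index: name -> {version: {dependency: allowed versions}}.
# None of these packages or versions are real.
INDEX = {
    "python": {(3, 7): {}, (3, 8): {}},
    "foo": {
        (2, 1): {"python": {(3, 7), (3, 8)}},
        (3, 0): {"python": {(3, 8)}},
    },
    "bar": {
        (3, 2): {"python": {(3, 7), (3, 8)}, "foo": {(2, 1), (3, 0)}},
    },
    "blub": {
        (1, 0): {"foo": {(2, 1)}},
        (2, 0): {"foo": {(3, 0)}},
    },
}


def solve(requested):
    """Return a mutually compatible version set, preferring newer versions.

    `requested` maps a package name to a predicate over version tuples.
    Returns None when no combination satisfies every constraint.
    """
    names = sorted(INDEX)
    # Candidate versions per package, newest first, filtered by the request.
    candidates = [
        sorted(
            (v for v in INDEX[n] if requested.get(n, lambda _: True)(v)),
            reverse=True,
        )
        for n in names
    ]
    for combo in product(*candidates):
        chosen = dict(zip(names, combo))
        compatible = all(
            chosen[dep] in allowed
            for name, version in chosen.items()
            for dep, allowed in INDEX[name][version].items()
        )
        if compatible:
            return chosen  # this mapping is what a lock file would record
    return None


# One "lock" per interpreter version, holding the other requests constant.
for py in [(3, 7), (3, 8)]:
    lock = solve({
        "python": lambda v, py=py: v == py,
        "foo": lambda v: v >= (2, 1),
        "bar": lambda v: v >= (3, 2),
    })
    print(py, "->", lock)
```

Asking for python 3.7 together with foo>=3.0 returns `None` in this toy index, which is the "environment being unsolvable" case mentioned earlier.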
That was post 1 of 2, "the lenient way". Here's 2 of 2, "the specific way": https://github.com/qpsolvers/qpsolvers/blob/dc00ac33254e2dc2d8838d4fe2f473088a2034a3/tox.ini
People are doing what you describe, with or without lockfiles, in tools other than conda. That's all I'm getting at.
I don't think I ever claimed that people are not testing in different native environments without conda
I simply said those environments are typically ad hoc
you can specify the environments you want to test your python code in - but the environment, and your python dependencies, are specified as two different things - even though they are tightly coupled. that's all.
Like, yes - obviously this can be made to work, especially for very simple environments. But there's a solution that solves this properly - that's an advantage. the only question is what disadvantages it has, at that point.
packages are typically distributed as wheels, which contain vendored copies of the native libraries they depend upon. People building wheels typically use a solution like cibuildwheel to build those wheels, and which does those builds in a reproducible environment
the issue is that you need to resolve dependencies - but things like binaries aren't represented as packages in pip
when you say "binaries", do you mean that in the sense of external executables looked up on $PATH, or in the sense of native libraries?
native libraries
the issue with vendoring is that linux doesn't really allow vendoring .so's - I'm less knowledgeable about windows, I think it works a bit better there
if foo and bar are both python packages and both depend on blub.so, and they both vendor it, i.e. they have their own copy
and pip is not checking that they vendor the exact same copy (or at least, one that's ABI compatible) - then you're in trouble
that is handled by auditwheel repair, more or less. Each wheel contains its own copy of the vendored library. The copy is given a filename based on a hash of its contents, and a matching SONAME. That handles the ABI compatibility issue - there's no chance of a Python library loading an ABI-incompatible native library. What happens instead is just that, if two different Python libraries need two different ABI-incompatible versions of the same native library, that native library gets loaded twice, and the two Python libraries resolve their symbols against different loaded DSOs.
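The content-hash naming idea can be sketched in a few lines. This is not auditwheel's code: `vendored_name` is a hypothetical helper, and the hash algorithm and the 8-character length are assumptions chosen for illustration. It only shows why two different builds of blub end up under different names and so never collide.

```python
import hashlib


def vendored_name(soname: str, contents: bytes) -> str:
    """Build a content-derived name such as 'libblub-<hash>.so.1'.

    Hypothetical helper; the hash choice and length are illustrative only.
    """
    base, _, ext = soname.partition(".")
    short_hash = hashlib.sha256(contents).hexdigest()[:8]
    return f"{base}-{short_hash}.{ext}" if ext else f"{base}-{short_hash}"


# Two different builds of the same library get distinct names.
print(vendored_name("libblub.so.1", b"build for foo"))
print(vendored_name("libblub.so.1", b"build for bar"))
```

Identical contents give identical names, so the scheme separates copies by what they contain rather than by which wheel shipped them.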
This does mean that Python libraries can't share global state using native libraries, though - there's awkward situations where, say, one Python library registering a handler in a native library won't let another Python library use it, since each has its own vendored copy.
I guess I'm not following - are the Python libraries using dlopen or something?
all extension modules are loaded using dlopen
the interpreter itself makes the dlopen call in import
Gotcha, yes, this gets around the collision in the global symbol table
Which is usually the first reason why I would say that elf/Linux doesn't really support vendoring these things
But yeah, you still have issues, as you mentioned
yeah. The unexpected behavior is that it's impossible to share state, instead of that state is accidentally shared
Well to be clear, without dlopen it's not about sharing state
You're literally just calling a different library than you were linked against
yeah, I'm lumping symbols and data in with "state"
And if it's not ABI compatible it explodes
I wanted at some point to install the same set of dependencies with pip and mamba and see which libraries actually come from wheels versus what it simply expects to be on the system
that comes from the manylinux spec
Although, I'm just using ldd, which could miss some things
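A complementary check to ldd, sketched under the assumption of a Linux system: after importing the packages of interest, read `/proc/self/maps` to see which shared objects the process actually loaded, including ones pulled in via dlopen. The function names are hypothetical, and on other platforms `loaded_shared_objects` just returns an empty set.

```python
from pathlib import Path


def shared_objects(maps_text: str) -> set[str]:
    """Extract paths of mapped shared objects from /proc/<pid>/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, device, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if ".so" in Path(path).name:
                paths.add(path)
    return paths


def loaded_shared_objects() -> set[str]:
    """Shared objects mapped into the current process (Linux only)."""
    maps = Path("/proc/self/maps")
    return shared_objects(maps.read_text()) if maps.exists() else set()


# Import the packages you care about first, then inspect what got loaded.
for path in sorted(loaded_shared_objects()):
    print(path)
```

Paths under site-packages point at vendored copies from wheels, while paths under system library directories are what the environment is expected to provide.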
there's only a small number of libraries that are allowed to come from the system in a manylinux compatible wheel
https://peps.python.org/pep-0513/#the-manylinux1-policy dates back to 2016
I know
I wish it had arrived ten years before that
I spent a few months at work around maybe 2014 or 2015 installing numpy et al. on various servers and suffering
got it
Also fwiw it's not just global state, it's any state that can be passed between the libraries
And there's no way to make that impossible, you'll just default at runtime
*segfault
that's basically impossible unless you go out of your way to do something weird
Only if the C libraries are "public" dependencies though
I agree it's very rare in a Python context
your extension module would need to dlopen a library that it's not linked against, and another module would need to also dlopen that same library, and they'd need to somehow get different versions.
or you'd need to smuggle a pointer to an object belonging to a library through Python code and back to C code on the other side
Im talking about the latter situation, yes
yeah. Well, fair enough. It's possible, but if you do that you're already doing something way outside of the guardrails
That's not that insane, but you would need multiple python packages doing a very rare thing, I admit
I mean, it is that insane if you're not considering ABI when you do it
you, as the extension module author, ought to know that your version of a library is whatever you've linked into your library, and other extension modules might be linked against different versions of that library. If you go out of your way to smuggle them a pointer that they can't use with their library, you're doing something quite unreasonable
Well, sure, but it's pip that's not considering abi here, really, right?
Im very impressed with what wheel is doing, but it's ultimately a workaround
in the end, it's more or less equivalent to statically linking each extension module's dependencies into its .so
it's no more of a workaround than statically linking plus -fvisibility=hidden
right - they need to be, since they're loaded dynamically when an import happens
So continuing your analogy, this is like mixing static and dynamic linking
Which is indeed... Bad?
this is like mixing static and dynamic linking
yep, aka "fat bindings"
Which is indeed... Bad?
Like, morally? Aesthetically? Technically?
Sure. I would consider this an absolute last resort - so does every C++ dev Ive ever worked with
When people depend on two libraries that need different boost versions or something - they cry and then do this
I dunno. Statically linked DSOs doesn't seem so bad to me. Python libraries are predominantly exposing Python interfaces, not C interfaces, so it tends not to make much difference that they've each socked away a different version of some common dependency
It's definitely a lot less bad in python than in actual C or C++
but I still just prefer to solve it properly
in any event, this is the context for why conda was a lot more popular a decade ago than it is today. People today mostly use pip because they're extremely unlikely to find a case where it doesn't Just Work. I definitely grant that it used to be very common a decade ago.
Do you have a basis for saying conda is less common?
just anecdotal - I used to hear people talk about it a lot, and now I almost never do
i never left conda, but yeah, i originally used it largely involuntarily - numpy (or numpy+mkl, idr) and others a long time ago were much harder to install with just pip
For people who come at python from the angle of writing web servers and django and so on - yes, they don't know about conda or mamba, and they didn't in the past either.
for people who come at python from the angle of quantitative work, numpy, pandas, tensorflow, sklearn, etc - conda/mamba are still extremely popular in my experience
all of the data science people I know are using pip or poetry, personally. 🤷
ah yes, mkl, the good old days
it would be nice if there was some reasonable source of actual data, but I can't think of a good source off the top of my head
here's the best I could do
https://www.jetbrains.com/research/python-developers-survey-2017/
https://lp.jetbrains.com/python-developers-survey-2022/#PythonPackaging
5 years apart, conda usage goes from 17% to 21%
I tend to suspect that a pycharm based survey will underestimate things - a lot of the kinds of people doing quantitative work don't use pycharm. they're the kind of people who live in ipython notebooks and just analyze data.
but not much evidence its declining
in 2017, 25% said they use Anaconda in the "What additional technology(s) do you use in addition to Python? (multiple answers)" section, in 2022, 22% say they use conda for installing packages. I dunno, teasing any trends out of the survey data isn't gonna be easy. It's a less obvious trend than I'd have wagered.
I can +1 the anecdotal evidence that [conda] questions here (in PyDis) have been declining
yeah, it's basically noise
my anecdote is that my company went from not using conda - to using conda, maybe in about 2017 - and then upgraded to micromamba in maybe 2021 or something like that
and micromamba is a significant improvement over conda - so it's not like that ecosystem is atrophying
there's also this too - https://prefix.dev/blog/introducing_rattler_conda_from_rust
micromamba in rust
A big thing with this is that since conda envs are language agnostic and already deal with binary code - you can already put together cross-language environments with it very successfully.
which might be something for which demand is increasing - hard to say. (that definitely ties heavily into why my company uses it - we use it to create cross python/C++ environments)
@dense orbit we don't allow advertisements in this server, I've gone ahead and deleted your ad
So does this solve your main issue?
Not my main issues, no, in the sense of what I need for my workplace. It does mean the chance of a pip environment for a pure Python project (with native dependencies) working properly is higher than I expected
Apologies for the ping, but I may have missed this in the conversation and was hoping for clarification: what is the "proper" solution to this in your eyes? Dependency resolution for regular installers (e.g. pip) that takes into account the ABIs of binaries within packages?
Proper solution meaning you apply the same standards to native packages as you do to Python packages
One version exists in the environment
Ah, to avoid the fat wheel issue. Gotcha.
There are 5 open pprint-related PRs, is there any way to help them move forward (he coyly asks, hoping his PR will benefit)?
https://github.com/search?q=repo%3Apython%2Fcpython+pprint++&type=pullrequests&state=open
```py
# _.py
def get_file():
    return __file__
print(f'from file: {get_file()}')
# stdout:
from file: D:\_.py
>>> print(f'from repl: {get_file()}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\_.py", line 2, in get_file
    return __file__
           ^^^^^^^^
NameError: name '__file__' is not defined. Did you mean: '__name__'?
>>>
```
why is `__file__` deleted from `__main__.__dict__` after it is imported and interactive mode is enabled?
I haven't kept up with the evolution of pip much, but at one time it was a lot more difficult to get builds of certain packages with non-Python dependencies, right? Conda was for that. I still use it at work, I actually make all of our python environments with it. I'm a fan of conda environments. I also like it because it's a general environment manager, not just python.
I'm just repeating what quicknir said, o well
yes, at one time it was difficult to distribute packages with native dependencies via pip. That's by and large no longer the case.
Do other python environment managers allow multiuser environment management? I for example have around 100 conda envs on our system. Users all share them. Bc of the way conda works, they're extremely lean, the non in house packages are all hard linked together and what's different is our weekly release of in house developed software.
Yeah, we have a thin layer over conda that creates immutable environments from lock files
The environment gets an automatic name that is based on the hash of the lock file
Then there's a shell function called ensure, you feed the lock file path and it either activates it if it exists or it creates it
Use it a lot for CI and prod
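A rough sketch of the "name the environment after the lock file hash, then ensure it exists" idea described above. The names `env_name_for_lock` and `ensure`, the `env-` prefix, and the 12-character digest are all assumptions made for illustration, and the actual environment creation step is only a placeholder directory (a real tool would call conda or micromamba there).

```python
import hashlib
from pathlib import Path


def env_name_for_lock(lock_path, prefix="env", digest_chars=12):
    """Derive a deterministic environment name from the lock file's bytes."""
    digest = hashlib.sha256(Path(lock_path).read_bytes()).hexdigest()
    return f"{prefix}-{digest[:digest_chars]}"


def ensure(lock_path, envs_root):
    """Return (env_dir, created); create the environment only if it is missing."""
    env_dir = Path(envs_root) / env_name_for_lock(lock_path)
    created = not env_dir.exists()
    if created:
        # Placeholder for the real work: a hypothetical wrapper would invoke
        # the environment manager here instead of just making a directory.
        env_dir.mkdir(parents=True)
    return env_dir, created
```

Because the name depends only on the lock file contents, two users (or two CI jobs) with the same lock file end up sharing one environment, and any change to the lock file produces a new one instead of mutating the old one.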
Interesting. You may get a DM from me at some point
For sure
I also use micromamba to setup my shell environment
It's kind of funny but it's actually the easiest way to install super recent versions of everything you need
Neovim, git, tmux, zsh, ripgrep, fd, bat, eza
fzf
I install it all from micromamba
And just add to PATH, don't even activate that environment
how do you mean?
"deleted from __main__.__dict__ after it is imported and interactive mode is enabled"
if you import a module, its namespace is not __main__, is it?
if i have variables in __main__, they are also available in repl
Guys, I have a serious problem with my Python compiler Nuitka and Python 3.12, this code right here, makes the package context inaccessible to me.
#ifdef HAVE_THREAD_LOCAL
_Py_thread_local const char *pkgcontext = NULL;
# undef PKGCONTEXT
# define PKGCONTEXT pkgcontext
#endif
Now when I load extension modules in a package, they believe they are loaded at top level, and that's vastly incompatible. This variable is not accessible by any API, only something like _PyModule_CreateInitialized will use it. But in order to load extension modules, I need to set that thread local variable, which I cannot...
My code here
static const char *NuitkaImport_SwapPackageContext(const char *new_context) {
    // TODO: The locking APIs for 3.13 give errors here that are not explained
    // yet.
#if PYTHON_VERSION >= 0x3c0 && PYTHON_VERSION < 0x3d0
#ifndef HAVE_THREAD_LOCAL
    PyThread_acquire_lock(_PyRuntime.imports.extensions.mutex, WAIT_LOCK);
#endif
    // spell-checker: ignore pkgcontext
    const char *old_context = _PyRuntime.imports.pkgcontext;
    _PyRuntime.imports.pkgcontext = new_context;
#ifndef HAVE_THREAD_LOCAL
    PyThread_release_lock(_PyRuntime.imports.extensions.mutex);
#endif
    return old_context;
#elif PYTHON_VERSION >= 0x370
    char const *old_context = _Py_PackageContext;
    _Py_PackageContext = (char *)new_context;
    return old_context;
#else
    char *old_context = _Py_PackageContext;
    _Py_PackageContext = (char *)new_context;
    return (char const *)old_context;
#endif
}
It's killed by the lack of _PyRuntime being used, so that's a huge issue. Any idea, how to overcome _Py_PackageContext removal?
Looking to be fact checked about the semantics of the -m flag:
python -m foo.bar.baz
this would run as main module whatever I would get if I did import foo.bar.baz in the current working directory (which might be something in a subdirectory of the current directory, or something that's installed)--correct?
The semantics proposed are fairly simple: if -m is used to execute a module the PEP 302 import mechanisms are used to locate the module and retrieve its compiled code, before executing the module in accordance with the semantics for a top-level module. The interpreter does this by invoking a new standard library function runpy.run_module.
sounds like a good sanity check
For python -m foo.bar.baz to work each of foo bar and baz need an __init__.py and nested appropriately
If that's what you're saying the structure is going to be then yeah to your question
That's not entirely true. You can use -m even for modules inside implicit namespace packages
Maybe start a thread on https://discuss.python.org/ ?
Ah, I'm not familiar with those at all
That's just a directory without a __init__.py
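To make the namespace-package point concrete, here is a small sketch that builds a nested directory with no `__init__.py` files and runs the innermost module through `runpy.run_module`, which is the function the PEP text quoted above names as the basis of `-m`. The package name `nsdemo_foo` and the helper name are made up for the example.

```python
import runpy
import sys
import tempfile
from pathlib import Path


def run_nested_module_without_init():
    """Run nsdemo_foo.bar.baz as __main__ with no __init__.py anywhere."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = Path(root) / "nsdemo_foo" / "bar"
        pkg_dir.mkdir(parents=True)
        # The module just records the name it was executed under.
        (pkg_dir / "baz.py").write_text("RESULT = __name__\n")
        sys.path.insert(0, root)
        try:
            module_globals = runpy.run_module("nsdemo_foo.bar.baz", run_name="__main__")
            return module_globals["RESULT"]
        finally:
            sys.path.remove(root)
            # Drop the temporary namespace packages from the import cache.
            for name in [n for n in sys.modules if n.split(".")[0] == "nsdemo_foo"]:
                del sys.modules[name]
```

The directory added to `sys.path` plays the role that the current working directory plays for `python -m foo.bar.baz` on the command line.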
should python have a frozen hashable dict type for the purpose of functools.cache?
https://peps.python.org/pep-0416/ has been rejected before, if you need one, https://pypi.org/project/immutables/ implements such a type.
!pep 603 also exists, not rejected
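Without a built-in frozen dict, one common workaround for `functools.cache` is to convert the dict into a hashable snapshot before it reaches the cached function. This is only a sketch: `freeze`, `total`, and the `calls` list are illustrative names, and it only works when the dict's values are themselves hashable.

```python
from functools import cache

calls = []  # records every real (uncached) computation, for demonstration


def freeze(mapping):
    """Turn a flat dict with hashable values into a hashable, order-free key."""
    return frozenset(mapping.items())


@cache
def _total_cached(frozen_items):
    calls.append(frozen_items)
    return sum(value for _, value in frozen_items)


def total(mapping):
    """Public wrapper: accepts a plain dict, caches on its frozen form."""
    return _total_cached(freeze(mapping))
```

Two dicts with the same items in a different insertion order map to the same cache entry, because a `frozenset` has no order.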
idk wrong channel or the correct one. #python-discussion is very crowded generally, but what is the difference between concurrent.futures threadpoolexecutor and threading? and which is better?
I've seen some decent blog posts on this
I didnt do proper tests using timeit, there exists few discussions on reddit or some medium articles on tests, but got no satisfying answers since I am sort of able to do I/O tasks with threading too
Personally I just strongly prefer the concurrent futures API
but how is their internal implementation different, such that people call concurrent.futures more modern?
hm, even I started using it recently
The other tool to consider is asyncio
it's a higher level API over the threading module, it makes it easier to use and implement a few convenience methods
It depends what you're calling though
Honestly it's been so long since I used threading Pool that I actually forget the exact benefits of concurrent futures
It is pretty nice though how you can get back a generator on the futures that yields as they complete
KRRT, can you explain in more detail pls, because as I told these type of answers exists online but were not so satisfying
that's one of the convenience APIs
So I typically submit my work and then I have a for loop over the completing futures
sure, let's move over to #async-and-concurrency
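The "submit the work, then loop over the completing futures" pattern mentioned above looks roughly like this. `square` and `squares_as_completed` are placeholder names, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(n):
    """Stand-in for some I/O-bound task."""
    return n * n


def squares_as_completed(numbers, max_workers=4):
    """Submit one task per input and collect results in completion order."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its input so results can be labelled.
        future_to_input = {pool.submit(square, n): n for n in numbers}
        for future in as_completed(future_to_input):
            # result() re-raises any exception the task raised.
            results[future_to_input[future]] = future.result()
    return results
```

With plain `threading`, the result passing, exception propagation, and "tell me which one finished first" bookkeeping would have to be written by hand.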
i ran into an issue compiling FTXUI with scikit-build-core and CMake, complaining about relocation problems and yada yada with the linker - in short, there was a symbol of type R_X86_64_PC32, which basically means that it should lie within a 32 bit offset from the instruction, but the architecture is on 64 bit, and that's why it fails. what i find odd, is that this doesn't occur when compiling FTXUI manually, it only occurs when it comes into contact with CPython. i've fixed the issue now (namely, it was to add the -mcmodel=large flag), but i'm quite curious as to why CPython causes this library to do that. does anyone have any idea as to what CPython could be doing to cause this, or if this is a common thing that happens?
that means that the library wasn't compiled with -fPIC. The R_X86_64_PC32 relocation won't appear in position-independent code.
ah, that's surprisingly simple. i spent a lot of time researching the problem, and that was not on any of the answers
hm - it ought to have literally been in the linker's error message
oh, the linker error was quite ambiguous: final link failed: bad value
hm. it should have said:
relocation R_X86_64_PC32 against symbol ... can not be used when making a shared object; recompile with -fPIC
maybe that appeared in your output, but wasn't the last line?
it might have gotten buried in the scikit-build-core logs
oh wait, it was right there - i just missed it
that was a lot of work for nothing then
looking at the build logs, -fPIC was passed anyway, so that error message isn't really useful, i still would have had to figure out the -mcmodel=large, so i guess it's not that much wasted work
I recently had a problem where mcmodel=large was the solution
It is kind of crazy how hard that stuff is to figure out. Terrible error messages, Google searches yielding dregs - I think a Stack overflow answer with one upvote was one of the more helpful things I found
always has been
for that specific error, the common SO solution was "updated my libc and now it works!" or something similar
vibes
the linker is a pretty terrifying piece of technology because everything depends on it, but it's pretty mind blowing how much this sums up most of my experiences debugging linker problems.
Hi all I need some advices which channel best to ask?
Depends on the topic, but likely #1035199133436354600 will fit.
for "zero-cost exceptions"
https://github.com/python/cpython/issues/84403
Now that the bytecodes for exception handling are regular (meaning that their stack effect can be statically determined) it is possible for the bytecode compiler to emit exception handling tables.
does that mean that when exceptions themselves are raised, is there zero cost to that or not?
no, exception raising is more expensive I believe (and superlinear in the size of the code object)
my understanding is it's zero cost when you have a try-except that doesn't actually raise
right, i got that, but i was wondering for the
it is possible for the bytecode compiler to emit exception handling tables
shouldn't this mean that it should be more efficient when the exceptions are actually raised
the way the table is encoded, it needs to scan from the beginning to find the right place I believe
ah, i was expecting something like a switch table i guess
looks like "scan from the beginning" isn't quite right, it needs to do a binary search through the table
so the cost of raising an exception is about log(n) in the size of the code object
So if you catch exception in a code that holding a pandas dataframe that's 1GB, it takes longer? Is memory allocation, what you mean by size here?
No, I mean the size of the code object
I think in terms of number of "blocks" that affect exception handling, e.g. try and with
Okay, I need to read up on code blocks
Not sure that's the formal term being used, the doc I linked above should have the correct logic and terminology
the code object is the compiled bytecode for your function. Its size is (very roughly) proportional to how much source text is in your function.
Ok, that seems easy to grasp. So a longer function that's being called in a try-except, and if an exception is raised, it will take longer than if it was shorter
i don't actually know how the exception table works, I'm just clarifying what a code object is. I think you have it right. Though also, "longer" here is still a very short time.
Not quite, what matters is how large the function is in which the try-except is.
So
try:
    really_large_function()
except FooError:
    ...
doesn't matter
oh, right, not "called in". it's the function containing the try/except.
And any function the raised exception propagates through, I think? If really_large_function() called some other function that raises the exception, we'd need to decode the exception table for really_large_function() to figure out whether the exception is being caught or needs to propagate up the stack
Right, makes sense. At least if it's not caught further down
would the time to search the exception table be proportional to the number of regions that handle exceptions differently? If a function has no try/except at all, then it will be very fast to determine that even if the function is 1000 lines long.
yes, I think so. The way I understand is that the function is divided into blocks, and for the duration of each block the same "thing" happens to exceptions (i.e., it gets handled by the same exception handler). The exception table is essentially a list of those blocks, and we do a binary search through the list to find the right handler.
right, so the time is O(log nT) where nT is the number of try/except/finally/etc regions
waves his hands frantically
and the point of "zero-cost exceptions" was zero cost to enter the try block and no overhead if an exception doesn't happen.
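One way to poke at this from Python is to look at the raw exception table stored on a code object. This sketch assumes CPython 3.11 or newer, where code objects have a `co_exceptiontable` attribute; the function names are invented for the example, and it only compares table sizes without decoding the entries.

```python
import sys


def no_handlers(x):
    # No try/with blocks, so nothing needs an exception table entry.
    return x + 1


def one_handler(x):
    try:
        return 1 / x
    except ZeroDivisionError:
        return 0


def table_size(func):
    """Length in bytes of the encoded exception table (0 on older Pythons)."""
    return len(getattr(func.__code__, "co_exceptiontable", b""))


if __name__ == "__main__":
    print(sys.version_info[:2], table_size(no_handlers), table_size(one_handler))
```

The table is what gets searched when an exception is raised; entering the `try` block itself executes no extra setup instruction, which is the "zero cost" part.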
is there any kind of special handling for stop iteration exceptions?
all of this sounds extremely reasonable in the context of exceptions being an actual error; then you typically want good happy path performance and sad path performance matters less
but obviously stop iteration exceptions aren't "really" errors
I did notice that adding a for loop makes the exception table bigger
heh
but in general StopIteration doesn't get caught the same way when iterating (i.e., it doesn't literally generate bytecode equivalent to try: next(it) except StopIteration:)
Instead, the FOR_ITER opcode directly calls _PyErr_ExceptionMatches
i guess the thing is that users can technically catch the StopIteration
and prevent it from actually exiting the loop
no?
the StopIteration doesn't get thrown in the loop body, it gets thrown when calling __next__/tp_iternext
async for loops do afaik add exception handlers directly to the bytecode.
!e
async def f(x):
    async for y in x:
        print(x)
import dis
dis.dis(f)
:white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | 1 0 RETURN_GENERATOR
002 | 2 POP_TOP
003 | 4 RESUME 0
004 |
005 | 2 6 LOAD_FAST 0 (x)
006 | 8 GET_AITER
007 | >> 10 GET_ANEXT
008 | 12 LOAD_CONST 0 (None)
009 | >> 14 SEND 3 (to 24)
010 | 18 YIELD_VALUE 3
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/CWLDEX6KS5QJJXGBTEY4C5MFBQ
if you have a coroutine, it's legal to catch the stop exception, isn't it?
I don't know what you mean by that. You can certainly catch StopIteration if you call next() directly
Sorry, I'm not sure what's unclear I guess
In a for loop, the stop iteration is effectively caught by the loop itself
it calls next, and then if that throws StopIteration, it simply exits the loop
whatever point the StopIteration is thrown from, it can be caught in between the place it originates
and where it's caught
sure
the relevant bytecode catches the exception in C directly, there is no space for the user of a for loop to catch the exception. Of course, if the iterator itself is wrapped and something catches StopIteration rather than propagating it up the loop doesn't stop.
the "iterator itself" can be implemented in any way - it could throw StopIteration from inside multiple layers of call stack
what I'm saying is that in at least some ways, stop iteration follows the same rules as any other exception, even if such usages are unnatural or unidiomatic - that probably limits the kinds of optimizations that can be done
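A small illustration of the point that `StopIteration` travels like any other exception until the `for` loop machinery sees it. The `Countdown` class and its helper methods are invented for this example; the exception is raised two calls below `__next__` and still ends the loop normally.

```python
class Countdown:
    """Iterator that raises StopIteration from a nested helper call."""

    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        return self

    def _finish(self):
        # Raised here, two frames below __next__.
        raise StopIteration

    def _step(self):
        if self.remaining == 0:
            self._finish()
        self.remaining -= 1
        return self.remaining + 1

    def __next__(self):
        return self._step()


def collect(start):
    """The for loop stops cleanly once StopIteration propagates out of __next__."""
    items = []
    for value in Countdown(start):
        items.append(value)
    return items
```

Any of the intermediate frames could have caught the exception with an ordinary `except StopIteration`, in which case the loop would never see it.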
true
if users were not allowed to catch StopIteration at all, then you could probably do more, that's all I mean
i wonder if there's ever really a good reason to actually catch StopIteration yourself
itertools.chain, for example
is that how it's actually implemented for some kind of technical reason
def chain(*iterables):
    # chain('ABC', 'DEF') → A B C D E F
    for iterable in iterables:
        yield from iterable
this is the sample implementation given
Why do lists in CPython internally use PyObject_VAR_HEAD? They are defined like this: https://github.com/python/cpython/blob/main/Include/cpython/listobject.h#L5-L6
and store the list size in the ob_size. But isn't this misusing PyVarObject? Normally ob_size tells the size of the object itself - but for lists, the size of the object itself is in fact always constant, only the size of the dynamical array they have a pointer to changes.
Include/cpython/listobject.h lines 5 to 6
```c
typedef struct {
    PyObject_VAR_HEAD
```
There are some special cases already, e.g. C next functions are allowed to just return NULL without actually setting an exception
So does try: next(it, None) except StopIteration still incur the exception overhead?
Are there severe performance implications with iteration like that? It seems pretty core to the language so i guess not?
it's probably a lot slower than a for loop
PyIter_Next being able to return NULL without an exception was a mistake :(
I suppose its just the sacrifice you make when asking to be able to iterate an iterable outside of a loop.
No free lunch
Type 'copyright', 'credits' or 'license' for more information
IPython 8.23.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: def whileloop(lst):
   ...:     it = iter(lst)
   ...:     while True:
   ...:         try:
   ...:             elt = next(it)
   ...:         except StopIteration:
   ...:             break
   ...:         +elt
   ...:
In [2]: def forloop(lst):
   ...:     for elt in lst:
   ...:         +elt
   ...:
In [3]: lst = list(range(100))
In [4]: %timeit whileloop(lst)
1.77 µs ± 1.55 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [5]: %timeit forloop(lst)
743 ns ± 1.47 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
A bit more than 2x slower
Interesting choice to do +elt why not just a ...? You wanted something you could do in both?
It's just somewhat hard to think of real use cases for "iterating an iterable outside of a loop" - given the suite of python builtins for various things (e.g. zip)
I did ctrl-f for StopIteration on the itertools page, there are a couple examples there
of course, presumably most of the itertools stuff is implemented as builtins and doesn't actually use it, and you can reuse that stuff yourself. but still instructive I guess.
def accumulate(iterable, function=operator.add, *, initial=None):
    'Return running totals'
    # accumulate([1,2,3,4,5]) → 1 3 6 10 15
    # accumulate([1,2,3,4,5], initial=100) → 100 101 103 106 110 115
    # accumulate([1,2,3,4,5], operator.mul) → 1 2 6 24 120
    iterator = iter(iterable)
    total = initial
    if initial is None:
        try:
            total = next(iterator)
        except StopIteration:
            return
    yield total
    for element in iterator:
        total = function(total, element)
        yield total
the comment explains that ob_size is len(the_list)
Yeah, I realise this, but why apply PyObject_VAR_HEAD and use ob_size, instead of using PyObject_HEAD (as would be appropriate for a constant-size object) and a custom size field?
there's been 2 examples in my 5.5 years as a python programmer where I had to. So yeah for N=1 its pretty rare. One was niche file reader I implemented that had weird semantics, the other was scanning directories and processing data. Both probably could have been worked around not having to do what I did but I felt it was more work.
makes sense
yeah, I guess I wanted to do something with the element, no strong reason
here's a real fun question, is except StopIteration for extracting a single element, faster than islice or not?
because ob_size is available right there, why not use it?
For this I always reach for next(it, default). Only because I can easily throw in a conditional and it requires no import, even if verbose to type
Ah right
yeah I've used that trick before but forgot it here. so would this be faster for accumulate? probably
No idea if its faster
rather than
    if initial is None:
        try:
            total = next(iterator)
        except StopIteration:
            return
you'd have
_sentinel = object()
...
    if initial is None:
        total = next(iterator, _sentinel)
        if total is _sentinel:
            return
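Put together, the sentinel variant of the docs' sample `accumulate` could look like the sketch below. The name `accumulate_lbyl` and the module-level `_MISSING` object are choices made for this example, not anything from the standard library, and no claim is made here about which variant is faster.

```python
import operator

_MISSING = object()  # unique marker that no caller can pass in by accident


def accumulate_lbyl(iterable, function=operator.add, *, initial=None):
    """Running totals, using next() with a default instead of try/except."""
    iterator = iter(iterable)
    total = initial
    if initial is None:
        total = next(iterator, _MISSING)
        if total is _MISSING:
            # Empty input and no initial value: yield nothing.
            return
    yield total
    for element in iterator:
        total = function(total, element)
        yield total
```

A dedicated sentinel object is used instead of `None` because `None` could be a legitimate first element of the iterable.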
Some might say its less pythonic bc its lbyl
yeah, some might say
brings me back to this note from guido on PEP 463:
I disagree with the position that EAFP is better than LBYL, or "generally recommended" by Python. (Where do you get that? From the same sources that are so obsessed with DRY they'd rather introduce a higher-order-function than repeat one line of code? :-)
we talked about this a bit in this channel i think
I wouldn't say it is less pythonic, just that you should be aware of the downsides
tuple, bytes, bytearray too
This is an extension of PyObject ... only used for objects that have some notion of length. ...
https://docs.python.org/3/c-api/structures.html#c.PyVarObject
This is a macro used when declaring new types which represent objects with a length that varies from instance to instance. ...
https://docs.python.org/3/c-api/structures.html#c.PyObject_VAR_HEAD
that would explain it then - in #c-extensions, they pointed out a comment that said PyVarObject was only to be used for variable-length object structures, there was nothing about length
i was under the assumption that it just used ob_size because it's standard throughout the C API
it seems that pytz module is required when converting a region's local time to UTC. What is the reason that this is not yet folded into the internal python libs ?
It's not required, just not as comfortable.
!e
from datetime import *
local_dt = datetime.now().replace(tzinfo=timezone(timedelta(hours=2)))
print("local:", local_dt)
print("utc:", local_dt.astimezone(timezone.utc))
:white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | local: 2024-07-05 17:37:40.138322+02:00
002 | utc: 2024-07-05 15:37:40.138322+00:00
If by timezone you mean something like Europe/Berlin instead of a UTC offset, then the reason is the update period of Python vs. third-party packages. Unfortunately it is surprisingly common for countries to change their timezone (e.g. whether to observe daylight saving time) in the last minute.
@formal ore this stuff is in the python standard library
as of 3.10 or something like that
3.9
pytz is basically not really recommended to be used anymore
the actual zones come from an IANA database, which you can update on your system independently of python
i think there's a few different approaches to how you get your IANA database
You can either get it from the system, or from a pip package - https://pypi.org/project/tzdata/
Ah, good to know
yeah, we were stuck on 3.8 forever, and then jumped to 3.11 not long ago, and then I found out about this, it came as quite a (pleasant) surprise
no more of that tz.localize(dt) weirdness
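For reference, a minimal sketch of the standard-library approach using `zoneinfo` (Python 3.9+). The helper names are made up; the second helper exists only because `ZoneInfo` raises `ZoneInfoNotFoundError` when neither a system IANA database nor the `tzdata` package is available.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def local_to_utc(naive_local, zone_name):
    """Treat a naive datetime as wall-clock time in zone_name and convert to UTC."""
    return naive_local.replace(tzinfo=ZoneInfo(zone_name)).astimezone(timezone.utc)


def try_local_to_utc(naive_local, zone_name):
    """Same conversion, but return None if no time zone database can be found."""
    try:
        return local_to_utc(naive_local, zone_name)
    except ZoneInfoNotFoundError:
        return None


if __name__ == "__main__":
    print(try_local_to_utc(datetime(2024, 7, 5, 17, 37), "Europe/Berlin"))
```

Unlike pytz, attaching the zone with `replace(tzinfo=...)` is the intended usage here, so there is no separate `localize` step.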
Ahh nice. I'll check it out. Thanks
Should co_linetable be documented in https://docs.python.org/3.14/reference/datamodel.html#index-58 ?
Perhaps, but not too well:
The attribute is public only to support creation of new code objects.
Hm, makes sense. I read that in the PEP and didn't pay attention, but now you quote it I can see a case for leaving it out.
What's the best way to just get into contributing to CPython
I'd take a look at the issues that have no comments, and check for those with no related PRs. See if any of them falls into an area of interest for you, or whether the complexity fits your skill.You can try to fix one, but just reproducing and/or diagnosing the issue is a great help already. You can also review PRs and suggest improvements (or just ask questions to better understand some details, that may also help the author to make things clear).
https://github.com/python/cpython/issues?q=is%3Aissue+is%3Aopen+sort%3Acomments-asc
You can also take a look at issues labelled "easy", but be warned that many of them aren't:
https://github.com/python/cpython/issues?q=is%3Aissue+is%3Aopen+sort%3Acomments-asc+label%3Aeasy
Another nice thing to do is to read the devguide: https://devguide.python.org/
yep, and if they don't look easy, a comment suggesting to remove the "easy" label can also help
I have some issues with building python
i would build the pythons with pyenv and call it a day.
How do you get on triage team
how does __init_subclass__ work? there's no type slot for it on PyTypeObject
i have a question:
now, we cannot use yield from in an async function.
the reason is given:
it is too hard to implement that. (https://peps.python.org/pep-0525/#asynchronous-yield-from)
how ever, https://peps.python.org/pep-0380/#formal-semantics has given an implement of yield from, you can see that it is a syntax sugar, as you can use pure python code to implement yield from, before it got added in py.
why cant they just replace yield from by the code in https://peps.python.org/pep-0380/#formal-semantics when python parser see it in an async function?
the implementation of yield from back when that PEP was written was a single bytecode instruction, in that world, it is pretty difficult. I believe it has since been changed into the code snippet you showed in bytecode, so it should be doable. ( or well, you'd need specialised instructions that work with StopAsyncIteration rather than StopIteration )
may i know the 'that PEP' you said, is 380 or 525?
I believe both, actually.
it's been a single instruction for quite a while
so they wanted to use a single bytecode instruction to do that at the beginning, which is difficult, however, later they gave up this requirement, and it is easy to do that in more than one bytecode instruction, but they forgot this task?
Yea, it's still not trivial, but I am not sure it's really worth doing
It might just do regular attribute lookup. To figure this out I'd open Objects/typeobject.c and search for init_subclass
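At the Python level, `__init_subclass__` is an ordinary method found on the parent class and called when a subclass is created, which fits the "no dedicated type slot" observation. The sketch below uses invented class names; the one detail it checks is that a `__init_subclass__` defined in a class body is stored as a `classmethod` even without the decorator, which may be relevant to the unbound-method error mentioned in this conversation (that connection is a guess, not something verified against the C source).

```python
class Plugin:
    """Base class that records every subclass under a chosen name."""

    registry = {}

    def __init_subclass__(cls, /, name=None, **kwargs):
        # Called once per subclass, with the class-definition keyword arguments.
        super().__init_subclass__(**kwargs)
        Plugin.registry[name or cls.__name__] = cls


class Csv(Plugin, name="csv"):
    pass


class Json(Plugin):
    pass


# The plain function above was implicitly wrapped when Plugin was created.
stored_as_classmethod = isinstance(Plugin.__dict__["__init_subclass__"], classmethod)
```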
I hear the implementations of async generators is already really complicated, yield from would presumably make it worse
It is pretty crazy
i tried manually adding an __init_subclass__ with METH_VARARGS, but python complained about some unbound method error
Cant they just replace every yield from with the snippet when they build the ast tree?
i don't fully remember what the error was, let me check
not quite, that snippet doesn't actually work for async generators
you need to await anext and StopAsyncIteration in places
and I'm sure some new issues would arise from that, as they would from having multiple yield froms nested in each other.
You mean the snippet cannot be used in an async function ,or the object after it cannot be an async generator object?
yeah, adding an __init_subclass__ to the methods messed with this https://github.com/python/cpython/blob/b765e4adf858ff8a8646f38933a5a355b6d72760/Objects/descrobject.c#L271
Objects/descrobject.c line 271
```c
if (funcstr != NULL) {
```
async generators don't use next (at least not in the way this snippet wants it to work), they use anext
def f():
    yield from some_sync_generator
async def f():
    yield from some_sync_generator
async def f():
    yield from some_async_generator
i think there may be 3 cases?
i guess, the snippet should work in the middle case?
They can't use async from C
the snippet works in none of the cases
you can, it just sucks
How?
async is built atop protocols, so you can manually build an object that has send and throw and manages its own state correctly
You could certainly adapt it to work for async generators
even the first case, which is already there?
oh wait sorry I misread
Could they do that for yield from then
yeah, it'd work for the middle case, but that would be kind of a mess IMO.
not really, yield from worked by the internals of send of generator explicitly checking if the generator is yielding from another afaik.
oh yeah, yield from in an async generator could in theory either take a sync or an async iterable. To me it's "obvious" it should take an async iterable, but maybe some people would want a sync iterable instead.
I meant more mess from just a user perspective, but yeah, I can actually see the perspective that yielding from sync iterators makes sense
that is what i want to say.
when it is the middle case, it uses the snippet, and when it is the last case, it uses a modified snippet (which uses athrow/asend instead)
maybe the last case should add an extra async to yield from like for/with
yeah, I could see it tbh.
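For comparison, this is what people can write today for the second and third cases: explicit loops instead of `yield from`. It is only a sketch with made-up generator names, and it is deliberately not equivalent to the PEP 380 expansion, since values passed in with `asend()` and exceptions passed in with `athrow()` are not forwarded to the inner generator.

```python
import asyncio


def sync_numbers():
    yield 1
    yield 2


async def async_numbers():
    yield 3
    yield 4


async def combined():
    # "Middle case": delegate to a sync iterable from an async generator.
    for item in sync_numbers():
        yield item
    # "Last case": delegate to an async iterable.
    async for item in async_numbers():
        yield item


async def gather_all():
    return [item async for item in combined()]


if __name__ == "__main__":
    print(asyncio.run(gather_all()))
```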
didn't we want to delete them anyway
not I
How do they yield from C
i know it is difficult to write a formal one.
but why cant they just use the snippet (from https://peps.python.org/pep-0380/#formal-semantics) anyway?
i mean, use this snippet to replace them when building ast tree.
atleast that can solve the middle case i said above....
we could put all that into the AST or the compiler. Why would we?
to let people use it before a formal one is finished?
"the middle case" is yielding from a sync generator inside an async one, right?
if it's implemented in CPython, then we're stuck with the semantics
doesn't matter whether it's implemented by an AST transformation or new bytecode, that's just details
these days, you have specialised SEND THROW bytecode instructions
yes. i m saying the middle of this 3.
IIRC, it used to be that a generator tracked which generator it was yielding from, and handled that in the send implementation, but I am not 100% confident
stuck with the semantics?
But what if they want a C function that yields
or does async
Do they add bytecodes somehow
you make an object that has a send and throw method
let me find an example
https://github.com/python/cpython/blob/main/Objects/genobject.c#L1892-L1932 you can await this object (and await is more or less spicy yield from).
But that's the async generator itself
no
What about like a function that returns one
Oh
if you want to return an async generator, you'd need to return an object with asend and athrow, which are then themselves awaitable.
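The "async is built atop protocols" idea can be shown from pure Python with a hand-written awaitable that involves no `async def` and no generator. The class `Ready` and the `drive` helper are invented for this sketch; a C implementation would expose the same behaviour through type slots instead of dunder methods, and a full implementation would normally also provide `send` and `throw`.

```python
class Ready:
    """Awaitable that completes immediately with a fixed value."""

    def __init__(self, value):
        self.value = value

    def __await__(self):
        # await needs an iterator; this object is its own iterator.
        return self

    def __iter__(self):
        return self

    def __next__(self):
        # Finishing an awaitable is signalled by StopIteration carrying the result.
        raise StopIteration(self.value)


async def main():
    return (await Ready(5)) + 1


def drive(coro):
    """Run a coroutine that never suspends, without any event loop."""
    try:
        coro.send(None)
    except StopIteration as stop:
        return stop.value
    raise RuntimeError("coroutine suspended unexpectedly")
```

An awaitable that actually suspends would return a value from `__next__` instead of raising, and keep track of its own state between calls, which is the part that gets tedious in C.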
i had a proposal for an async functions c api a little while back but it got scrapped
Are there docs for that
I am not sure tbh, I doubt anyone even does this.
partially because it's so difficult to
yup, agreed
most just resort to passing something like this
def cb():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(...)
c_func(cb)
could you please explain stuck with the semantics? i did not get it...
Could they make it easier like making the object automatically or something
If we add something to the language, then backwards compatibility guarantees mean we can't change the semantics later
(practically speaking; there are deprecation pathways and all, but it's really hard to change semantics at the core language level)
i understand.
but isn't that the behavior it's supposed to have?
(or did i miss something?)
I would expect yield from in an async generator accept an async iterable, not a sync iterable
That's a deprecated way to call get_event_loop
Plus, we don't generally add things to the language just because we can. There's got to be a more convincing argument for adding it
If you could make such an argument and write a PEP, then yes, maybe we could add yield from in async generators
But somebody needs to actually step up and make that case
i thought it was either yield from sync_gen or async yield from async_gen before...
I guess that's a possibility but that's more new syntax to add to the language. You have to make a case that that syntax is useful enough to justify the extra complexity in the language implementation and the extra thing users have to learn to know the full language
I can't type much now... something wrong in discord front end
(i was guessing it would be async yield from or yield async from when accepting an async generator, because of async for and async with, and because it is valid to use sync generators inside an async one)
and I completely agree that caution needs to be exercised here
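For reference, a minimal sketch of what works today without new syntax: inside an async generator, a plain `for` loop re-yields from a sync generator and `async for` re-yields from an async one. The function names are illustrative only.

```python
import asyncio


def sync_numbers():
    yield 1
    yield 2


async def async_numbers():
    yield 3
    yield 4


async def combined():
    # `yield from sync_numbers()` is a SyntaxError inside an async generator,
    # so each item is re-yielded by hand.
    for n in sync_numbers():
        yield n
    # There is no delegating syntax for async iterables either.
    async for n in async_numbers():
        yield n


async def main():
    return [n async for n in combined()]


result = asyncio.run(main())
print(result)  # [1, 2, 3, 4]
```

Unlike a real `yield from`, these loops do not forward `asend()` or `athrow()` to the inner generator, which is part of what a proposal would have to specify.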
oh? what's the new way?
!d asyncio.get_event_loop
asyncio.get_event_loop()
Get the current event loop.
When called from a coroutine or a callback (e.g. scheduled with call_soon or similar API), this function will always return the running event loop.
If there is no running event loop set, the function will return the result of the `get_event_loop_policy().get_event_loop()` call.
Because this function has rather complex behavior (especially when custom event loop policies are in use), using the [`get_running_loop()`](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.get_running_loop) function is preferred to [`get_event_loop()`](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.get_event_loop) in coroutines and callbacks.
As noted above, consider using the higher-level [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run) function, instead of using these lower level functions to manually create and close an event loop.
asyncio.run
that's only for run_until_complete
what about the other 100 things that the loop object can do
ah, looks like get_event_loop itself is not deprecated, just deprecated to use if there's no event loop
You just use get_running_loop() instead
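A small sketch of that pattern: `asyncio.run()` owns the loop's lifetime, and code running inside it reaches the other loop methods through `asyncio.get_running_loop()`.

```python
import asyncio


async def main():
    # Inside a coroutine there is always a running loop to fetch.
    loop = asyncio.get_running_loop()

    # Any other loop API is available from here, e.g. futures and callbacks.
    future = loop.create_future()
    loop.call_soon(future.set_result, "done")
    return await future


result = asyncio.run(main())
print(result)  # done
```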
Im delving into the C API, are these functions here things you implement or just how you call the builtins from the C level? https://docs.python.org/3.10/c-api/object.html#c.PyObject_RichCompare
functions are the things you call at the C level. Often they internally call some aspect of a user-created type
At the Python level that's often a dunder method, at the C level those map to "slots", e.g. tp_richcompare in this case
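To illustrate the Python side of that mapping, here is a sketch with a made-up `Version` class. The comparison operators below go through the type's rich comparison slot, which for a class written in Python dispatches to these dunder methods.

```python
class Version:
    """Hypothetical example type; only == and < are defined explicitly."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key


older, newer = Version(3, 12), Version(3, 13)
is_less = older < newer     # calls Version.__lt__(older, newer)
is_greater = newer > older  # no __gt__ defined, so the reflected __lt__ runs
is_equal = older == Version(3, 12)
print(is_less, is_greater, is_equal)  # True True True
```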
Ok I think I'm getting this, if I wanted to define a class with C API, I define the corresponding tp_* functions and make a pointer to them in the class struct right?
there are two ways, static types and heap types, only the first involves creating a struct directly. there is a chapter about this in the C API reference docs
!pep 749
You know what would be really evil. Publishing an annotationslib package on PyPI
let's hope nobody does that
Surely they would request it before submission of draft
what if i just do it for the funnies :p
@feral island
then the maintainer of that package would have a hard time using it on Python 3.14
(the stdlib comes before site-packages)
well I meant for the funnies, not as an actual package
well, it's not very funny, I guess
not really
PEP 749 specifies annotationlib right?
annotationslib would just be slightly different
Aren't stdlib package names banned from pypi? (except for dataclasses)
Wouldn't the steering council and coredevs take issue with this. They didn't want to use toml for example
Maybe that was special bc it was around so long and had a huge userbase
if the package is created right now because somebody thinks it's funny, I doubt it will matter
it's different for a real package that is actually in use
did something change in 3.12 about how tp_as_buffer's fields get inherited? i have a class with tp_as_buffer filled out, but its subclasses don't seem to be inheriting those fields in 3.12 and above any more
There was a pep about the buffer protocol, but I dont remember if its 3.12 or 3.13
It was 3.12 by none other than our @feral island https://peps.python.org/pep-0688/
Not that I know of. My PEP didn't change this
If I want to define a static type do I have to recompile python? I'm unsure what this paragraph under static types means
Also, since PyTypeObject is only part of the Limited API as an opaque struct, any extension modules using static types must be compiled for a specific Python minor version.
no you can add it in an extension module
why does the bot add the "awaiting core review" label when you leave review comments and request changes
It means that if you're defining static types, you can't use the limited/stable ABI, and instead need to build a separate wheel for each Python version you want to support
Ah, ty
I think it's a bug
D:\a\1\s\Modules\gcmodule.c:450: visit_decref: Assertion "!_PyObject_IsFreed(op)" failed
Memory block allocated at (most recent call first):
File "<unknown>", line 0
object address : 0000012429415790
object refcount : 8
object type : 00007FFA2FCEF9B0
object type name: dict
object repr :
what would be a way to figure out which dict this is? my code isnt even directly altering any dicts, and it seems tracemalloc isnt offering any help. tried checking it out in the visual studio debugger too but it seems the dict has already been largely deconstructed by this point, and theres no repr at the end because accessing its tp_repr caused a segfault :/
!e h
:x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 1, in <module>
003 | h
004 | NameError: name 'h' is not defined
is it worth reporting
@final geode @round path Thanks again to both of you for your assistance and suggestions! The blog post on how to try out the CPython jit is now live: https://jeff.glass/post/try-cpython-jit/
Great post! One minor nit: cold exists -> cold exits, everything else looks pretty good.
Offtopic, I'm sorry, but I envy you for having a last name that's a TLD.
This is awesome, thanks!
Changing PYTHON_JIT at any point during execution (i.e. by using os.environ['PYTHON_JIT'] = 1) will also enable and disable the JIT at runtime.
Is this true? Not at my laptop now, but I think we only check this at startup.
Also, minor, most of your headings for debug output only say LLTRACE... PYTHON_LLTRACE=5 is the only one that uses the entire env var name.
I don't understand py.typed that well. Do I just need it to tell mypy I'm type checking?
Only for packages you put on pypi
I'm trying to figure out if I need to add this to my packages at work in our internal pypi. We have typing and use pydantic for schema validations.
Probably you do need it
probably you should. I added something internally that enforces that all internal packages have py.typed
I guess I don't fully understand it. So placing a py.typed file at the base of your project (where pyproject is, or in foo/foo for example?) will add a note for something like mypy that everything must be typed? I only need one of these files bc it's recursive through the project, right?
No, it is needed inside a library and tells mypy to use the code in that library when type checking code that is using the library
What do you mean by probably need it? Need it for mypy to know that everything MUST be typed?
I don't follow. If my package is typed, mypy will use the types I create to check typing? If I don't have the py.typed file mypy will not use the types I create but only std library types? So if I have a type package.FooSchema, it will only be used in type checking if I have the typed file but if I don't then mypy will not consider the FooSchema?
if you don't have py.typed, mypy will consider all types in package as Any
Ohhh that's bad
I always wondered why isn't the default to just use the types there
Like, if someone provided types, why not use them?
The library I'm updating is almost entirely custom types. We use these to force validation of everything our team communicates with. Making sure we send what we're supposed to be sending and receiving the right stuff as well. Haven't set up mypy yet as we're in v1 but my editor is just lit the f up with type violations. Even though the package works I need to update this.
Yeah doesn't make sense at all.
I'm sure it makes sense. I just don't know why.
Yeah, true.
Off-topic: We also do these sort of *-models or *-types at work but they usually cause me trouble when not working in a monorepo.
Since a change in them almost always means a change in other packages
you need to make sure you're not breaking anyone's flow at multiple repos
We have something similar but yeah when our library changes types it bricks the rest of our pipeline. Gotta update it all or I guess use more flexible types like maybe optional fields.
I don't like too many optional fields
makes your data less predictable
more if statements
We need a few bc the external apis add fields occasionally to meet other business needs. The fields we need are strict but if other fields are present we allow it and just ignore it.
sorry i'm confused, is that the behavior even if the package is fully typed inside?
yes
that sounds counterintuitive
that sounds like i must maintain both the typing within the package, and within the py.typed
py.typed is just an empty file, there's not much to maintain there
ah
py.typed is a marker to tell the type checker that the package contains type annotations (as opposed, I think, to using annotations for non-typing purposes)
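As a sketch of where the marker goes: per PEP 561 the empty `py.typed` file sits inside the package directory next to `__init__.py` (not beside the project's pyproject file), and one marker in the top-level package covers its subpackages. It also has to be shipped as package data so it ends up in the installed package. The helper `has_py_typed` and the package name `foo` are made up for illustration.

```python
import tempfile
from pathlib import Path


def has_py_typed(package_dir: Path) -> bool:
    """Return True if the package directory carries the PEP 561 marker."""
    return (package_dir / "py.typed").is_file()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: <project>/foo/__init__.py and <project>/foo/py.typed
    package_dir = Path(tmp) / "foo"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")

    before = has_py_typed(package_dir)
    (package_dir / "py.typed").touch()  # the marker is just an empty file
    after = has_py_typed(package_dir)

print(before, after)  # False True
```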
I just wrote code like this
a: list
b: list
c: dict
for x in [a, b, c]:
    x.clear()
I think this might actually be the first time I've leveraged duck typing for methods of builtins.
broo
https://mypy-play.net/?mypy=latest&python=3.12&gist=fd93f5a92c8690152a427aa234305898
mypy doesn't like it
The mypy Playground is a web service that receives a Python program with type hints, runs mypy inside a sandbox, then returns the output.
#topic-join-v-union
https://github.com/python/mypy/issues/12056 for context
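One way to spell the duck typing so a checker can follow it, as a sketch: mypy apparently infers a common base type of `object` for the mixed list, so giving it an explicit structural type should help. The names `SupportsClear` and `clear_all` are made up here, and the claim that mypy accepts this has not been checked against every version.

```python
from collections.abc import Sequence
from typing import Protocol


class SupportsClear(Protocol):
    """Anything with a clear() method: list, dict, set, deque, ..."""

    def clear(self) -> None: ...


def clear_all(containers: Sequence[SupportsClear]) -> None:
    for container in containers:
        container.clear()


a = [1, 2, 3]
b = ["x", "y"]
c = {"key": "value"}
clear_all([a, b, c])
print(a, b, c)  # [] [] {}
```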
You're right, I was mistaken. Thank you! I've pushed a fix for that and the typos as well.
!e
import sys
import shlex
print(shlex.split(f"{sys.executable} -m pip install -r requirements.txt"))
:white_check_mark: Your 3.12 eval job has completed with return code 0.
['/lang/python/default/bin/python', '-m', 'pip', 'install', '-r', 'requirements.txt']
wait wut
Administrator in ~\Desktop via v3.11.9
sys.executable: C:\Users\NuitkaDevOps\AppData\Local\Programs\Python\Python311\python.exe
['C:UsersNuitkaDevOpsAppDataLocalProgramsPythonPython311python.exe', '-m', 'pip', 'install', '-r', 'requirements.txt']
windows bug then i guess
i guess it's Cpython PR time? something like that should check for platform rather than be posix always
I wonder if it would happen if I have a backslash in the path to my sys.executable on unix
shlex.split(s, comments=False, posix=True)
Split the string *s* using shell-like syntax. If *comments* is [`False`](https://docs.python.org/3/library/constants.html#False) (the default), the parsing of comments in the given string will be disabled (setting the [`commenters`](https://docs.python.org/3/library/shlex.html#shlex.shlex.commenters) attribute of the [`shlex`](https://docs.python.org/3/library/shlex.html#shlex.shlex) instance to the empty string). This function operates in POSIX mode by default, but uses non-POSIX mode if the *posix* argument is false.
Changed in version 3.12: Passing `None` for *s* argument now raises an exception, rather than reading [`sys.stdin`](https://docs.python.org/3/library/sys.html#sys.stdin).
!e
import sys
import shlex
print(shlex.split(f"{sys.executable} -m pip install -r requirements.txt", posix=False))
:white_check_mark: Your 3.12 eval job has completed with return code 0.
['/lang/python/default/bin/python', '-m', 'pip', 'install', '-r', 'requirements.txt']
same thing
!e
import sys
import shlex
executable = sys.executable.replace("/", "\\")
print(shlex.split(f"{executable} -m pip install -r requirements.txt"))
:white_check_mark: Your 3.12 eval job has completed with return code 0.
['langpythondefaultbinpython', '-m', 'pip', 'install', '-r', 'requirements.txt']
shlex is not supposed to be used on windows commands
I was trying to keep cross platform compatibility in one go :p
Fair enough, but unix paths can contain backslashes as well
Maybe you just need to escape the backslashes?
are they treated as slashes or as regular "letters"?
Regular characters. Unix filenames can contain anything but slash and null byte
(Actually using them in practice is pretty bad UX, I can't even call a command containing a backslash in my shell.)
It should be interpreting the backslash as a shell would
I think straight string interpolation is going to be wrong in several edge cases (even for something simple like a path with a space)
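A sketch of the alternative to interpolation: keep the command as a list and only build a string with `shlex.join()`, which quotes each argument so that `shlex.split()` in its default POSIX mode gets the same list back. The Windows-style path is a made-up example.

```python
import shlex

# Hypothetical interpreter path containing backslashes and a space.
args = [
    r"C:\Program Files\Python311\python.exe",
    "-m", "pip", "install", "-r", "requirements.txt",
]

command = shlex.join(args)           # quotes every argument that needs it
round_tripped = shlex.split(command)

print(command)
print(round_tripped == args)  # True
```

The quoting is POSIX shell quoting, so the joined string is not meant for cmd.exe. When the goal is only to run the command, passing the list itself to `subprocess.run()` avoids the quoting question on every platform.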
As in:
$ 'py\\thon'
zsh: command not found: py\\thon
But:
import os
os.system('py\\thon') # this works
(after I made a python interpreter under that name)
The shell one is calling a command with two backslashes in the name, because you combined two different forms of quoting. You needed either `$ py\\thon` or `$ 'py\thon'`.
Hey folks! New here. Are there any bytecode debuggers that can report on the "calculation stack" state? I say calculation stack - i don't mean the frame stack that represents function execution, but the stack that the core loop uses to process values when running bytecode. If anyone has better language for these constructs, would love that.
Anyway - i'd like to be able to step through a line itself and see the results of the expressions. so given:
def f():
    return 5
x = 0 + f() + 1
i'd like to see the result of f(), then the result of 0 + 5, and finally the result of 5 + 1, and then the store operation.
are there any tools out there that can do this? I was looking at trepan3k, but it looks like it can't monitor the stack/results of calculations. I'm down to tinker with the interpreter if need be. Thanks y'all!
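Not a stepping debugger, but as a starting point the `dis` module lists the instructions that push to and pop from what is usually called the value stack or evaluation stack. This sketch only shows the static sequence of operations for the example above; it does not expose the runtime values, and the exact opcode names differ between CPython versions.

```python
import dis

source = "def f():\n    return 5\n\nx = 0 + f() + 1\n"
code = compile(source, "<example>", "exec")

# Collect the module-level instructions: the call to f(), two additions,
# and the final store into x all appear in evaluation order.
opnames = []
for instruction in dis.get_instructions(code):
    opnames.append(instruction.opname)
    print(instruction.offset, instruction.opname, instruction.argrepr)
```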
I recall pycharm's debugger having something similar to this
I think playing the next step doesn't play an entire line but just the next "calculation", though I might be wrong
Iirc thonnys debugger does this
how come some PRs get stuck on "awaiting merge"?
somebody needs to actually want to click the merge button
personally for me, merging is a much scarier action than just leaving an approving review
hey can somebody help me in my project or if this wrong channel to ask for this can you tell where i can ask it
how come? i thought the point of people doing reviews was to make the merge button easier to click
git blame
reviews are helpful, but as the person clicking the merge button I still feel accountable
reviews are more helpful if they show evidence that the reviewer thought about all the things that are important, e.g. how well-tested is this, does this fit in the current structure of the code, does the discussion show evidence that there is consensus for the change?
For example, before merging something I'll often read through the associated issue to make sure people are in agreement that the change is the right direction to go
what do you mean by "current structure of the code"? as in, the style guide?
that, and also just the way the specific file being changed is structured
like is this the right place to make a change
damn this is actually true on all code and this is a great way to put it, I've felt this a lot but didn't know how to say it
are there any ways other than --no-tkinter, --no-ctypes and --no-ssl to speed up cpython compilation time e.g. for git bisect?
Maybe using -O0 instead of the normal optimization level could help. And ccache is likely to help.
More cores also help
-O0 should help a lot
every codebase will be different, but my C++ codebase at work, unoptimized builds take about half the time as optimizations. optimizations are expensive.
ccache is also great advice but it could be a bit finicky to set up, unless cpython comes with a step by step guide for it. I probably wouldn't do that unless you do this sort of thing often, for a one off I'd only do it if I were desperate
hmmm well
If it's a purely logical bug, maybe. But if e.g. something is causing undefined behavior in the interpreter and crashes, -O0 might turn off some optimizations that lead to this
don't quote me on that though, I'm not a C expert
(this is C++ rather than C but I assume it's similar there)
The "Infinite loop without side-effects" in https://en.cppreference.com/w/cpp/language/ub is particularly amusing to me
yes, if the bug is based on UB then tweaking optimization levels could hide the bug or make new bugs appear
Maybe it can be good then, find new bugs
depends on what the meaning of "good" is
if you're looking to get to the bottom of one bug, unrelated bugs that appear randomly are probably not the most helpful
That is true
that's definitely a fair point - sorry I hadn't scrolled up far enough to see that was the bug
Oh I didn't scroll up either. I'm just talking out of my arse
i suspect on average optimizations are actually more likely to make UB bugs disappear, than appear, but I could be wrong
haha okay
PEP 503 specifies that compliant package repositories must collapse runs of adjacent - or _ or . characters to a single - in the normalized name served by the index. JFrog Artifactory apparently doesn't do that. Is anyone aware of a package name on PyPI that has two adjacent - or _ or . in its (pre-normalization) name? It'd be great if I could find a real life example of a package that's not installable (at least, not by following its documented installation instructions) that I could give them in a bug report.
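For reference, the normalization rule from PEP 503 is a one-line regular expression substitution followed by lowercasing. The package names in this sketch are invented, not real PyPI projects.

```python
import re


def normalize(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into a single '-' and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Hypothetical names with adjacent separators.
print(normalize("Foo__Bar"))      # foo-bar
print(normalize("foo.-_bar"))     # foo-bar
print(normalize("foo--bar..baz")) # foo-bar-baz
```

An index that skips the run-collapsing step would serve a name like `foo--bar` under a different URL than the one a compliant installer requests.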
i don't think so? optimizations often make use of the assumption that UB won't happen
Yeah, so it could hide something that would segfault if it actually executed at runtime, like the first example here
https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html
Right, that's the kind of situation I was thinking of
but obviously it can go both ways
i'm having some trouble embedding the cpython main branch
take this simple C embedded program
#include <Python.h>
int main(void) {
    Py_Initialize();
    Py_Finalize();
    return 0;
}
i've been compiling it like this gcc a.c -o out -L./ -l:libpython3.14d.a -I./ -I./Include -I./Include/internal -I./Include/cpython -lm
but it complains about not finding platform independent libraries in prefix and exec_prefix:
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Fatal Python error: Failed to import encodings module
Python runtime state: core initialized
Exception ignored in the internal traceback machinery:
ModuleNotFoundError: No module named 'traceback'
ModuleNotFoundError: No module named 'encodings'
i didn't see anything in the devguide about embedding, what am i missing?
you may want to read through https://docs.python.org/3/c-api/init_config.html#init-config.
As a shot in the dark, does PYTHONHOME=$PWD ./out work?
nope, i'll read through that
thanks, the trick was to add the local Lib and Modules directory to config.module_search_paths
maybe this is worth putting in the devguide?
I don't think what you're doing is a normal way of doing embedding... normally you'd build the interpreter, install the built interpreter, and then link against the installed libraries and compile against the installed headers in order to produce your executable that embeds the interpreter
going straight from in-tree build to embedded executable without first installing the interpreter seems weird, and it's not too surprising to me that the devguide doesn't have anything to say about doing that...
how do you "install the built interpreter"?
sudo make install
or without sudo if you've set up the prefix to be somewhere you have privileges to write to
well that would have been helpful to know a few hours ago
the devguide definitely does cover how to build from source, and includes the make install step
hm, yeah. They do say
There is normally no need to install your built copy of Python! The interpreter will realize where it is being run from and thus use the files found in the working copy.
but that seems like bad advice when you're trying to actually use it, like by embedding it into another executable...
ah well, I'll let the core devs weigh in on that piece
The devguide doesn't seem like the right place since the devguide is about developing CPython itself, and you're not doing that, you're embedding it
But despite being a core dev I don't know much about this area
well, i was embedding it to triage #121849
hi
yo python sats my line is broken can somone help m
e
if oparation == :+
i am just a beginner
see #❓｜how-to-get-help, this channel is for #internals-and-peps
alr
Is there anything in python so far for implementing generics?
Maybe we can have that as a proposal soon
Go used to have a similar problem where they couldn't use generics
Led to a lot of code duplication
I don't think it'd be more thematic to use templates like in C++
could be a cool idea but probably too much work
The typing specification does support generics
(c)python isn't even a compiled language to start with so talking about C++ templates and monomorphization (i.e. code duplication) is... confusing
Python/compile.c would disagree
has the word "compile", checks out
python has had generics for about 9 years
just maybe not in the same way as one would anticipate, at least syntax wise
Arguably forever, lists can always hold anything. What we've had for 9 years (a little more I think?) is typing syntax to describe them
and mypy's own syntax for a few years before that
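A minimal sketch of what that typing syntax looks like, using the `TypeVar` and `Generic` spelling from PEP 484. The `Stack` class is made up for illustration. Nothing is monomorphized as with C++ templates: the same class object serves every element type at runtime, and the parameter only matters to static type checkers.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """Hypothetical container whose element type is a type parameter."""

    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


numbers: Stack[int] = Stack()
numbers.push(1)
numbers.push(2)
top = numbers.pop()
print(top)  # 2

# A checker would flag numbers.push("three"); at runtime nothing stops it.
```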
Wait how so!
!pep 484
can you give us a concrete thing you'd like to do? It will be easier to sort through the terminology conflicts.
are there any plans to stop supporting 32bit windows?
I don't believe so
i think there was a discuss.python.org thread about it recently
Right now, supporting the Windows x86 (32-bit) platform is the responsibility of all core developers: it's a Tier-1 platform. I'm not comfortable with this status: I'm not sure that "all core devs" have access to a Windows machine to build Python in 32-bit mode and debug issues (specific to this platform). Recently, I fixed a RecursionError in ...
Yes, also this from the packaging side https://discuss.python.org/t/dropping-32-bit-packages/5476
Where would be an appropriate place to discuss dropping 32-bit packages across the python ecosystem? Neither the Redhat, the windows store nor conda-forge support any 32-bit variants, and producing packages doubles CI time for all library packagers. I realize each project could have its own support policy, but a python-wide statement would be m...
Hi
Hey guys
I'm new and need your help
My mother lost her phone
Is there anyway we could track it
Because someone stole it after
It won't work
I tried
What about python?
I heard you can track
Damn
So it is useless?
Rust?
What is that
Please guys I need your help
Can you send an inv
Where do I find it
?
I did
What
How
Ye but not in te pic lol
What the heck
!ot
#ot2-never-nester's-nightmare
Please read our off-topic etiquette before participating in conversations.
this is the #internals-and-peps channel, not offtopic.
im curious about the choice to use purple in the new repl
i thought blue or something would have been a more python-y color
is ALSA worthy for stdlib?
they won't add that to the stdlib
why? not worthy?
too niche, and can be installed from pypi
pip? why prefer pip?
because the core team doesn't want to have to maintain an ALSA library, and most people don't need it.
it's harder to get rid of something once it's in the stdlib.
why that date?
someone said official ALSA support will be out in 30th Feb 2025
2nd March 2025 is just a more possible one
why would Python drop winsound just because some other thing gets support?
i think, in 3.13, stdlib have windows sound support, but no linux, that would be weird
it only has winsound because of history and not wanting to break people using it. It wouldn't be added to the stdlib today.
oh
we've tried to pick colours that work well in both dark and light terminal themes and still have enough contrast. here's an attempt at tweaking the colours, but the last comment shows it's not great on, for example, Ubuntu. the general plan is to make the colour scheme configurable
I was scrolling through What's new and I noticed a thing -
https://github.com/python/cpython/pull/118816 removed pickle support from itertools
But it left behind a test function that checked for the pickle deprecation warnings: https://github.com/python/cpython/blob/main/Lib/test/test_itertools.py#L20-L37
Would it be weird if I opened a PR to remove it?
Should theoretically be fine, pickle has been removed and won't be coming back.
But I feel awkward going "hey you missed this"... It'd also be my first code change, so there's a bit of nerves
Oh yeah, that test function looks unused. Feel free to send a PR to remove it. Won't need an issue or a NEWS entry
python/cpython#122100
Oooh even number
You know.... one of these days I should probably actually add something
Removing things is more important, we have plenty of code already
hey y'all! just downloaded 3.14 from github. I'm going to be modifying the code, so i'd like the application to be named something other than python (so as not to conflict with other python installations or be confusing). I'm just learning make and it seems the name is determined in the makefile. is there anything else i'll need to change to get a custom name for my version of python?
3.13 isn't even fully out yet... it's in beta I believe
haha i may have been overeager then
ok. that said - in general, would i only need to modify the makefile to change the app name?
You can use make altinstall but I'd probably recommend using pyenv to build pre-releases and keep them relatively isolated
This is so bizarre. I'm curious to figure out whether this is a bug or a feature
If I do:
printable = dict(zip(range(2**16), list(filter(str.isprintable, (map(chr, range(2**16)))))))
print(printable)
I get a dict ending with:
{ # ...
55528: '๏ฟค', 55529: '๏ฟฅ', 55530: '๏ฟฆ', 55531: '๏ฟจ', 55532: '๏ฟฉ', 55533: '๏ฟช', 55534: '๏ฟซ', 55535: '๏ฟฌ', 55536: '๏ฟญ', 55537: '๏ฟฎ', 55538: '๏ฟผ', 55539: '๏ฟฝ'
}
Note the index on the terminating elements. However, if I do:
printable = dict(zip(range(2**16), list(filter(str.isprintable, (map(chr, range(2**16)))))))
with open("printable.json", "w") as fp:
    import json
    fp.write(json.dumps(printable, indent=4))
I wind up with
{
// ...
"55502": "\uffeb",
"55503": "\uffec",
"55504": "\uffed",
"55505": "\uffee",
"55506": "\ufffc",
"55507": "\ufffd"
}
55507 != 55539, so... Where'd the last 32 elements go?
Python 3.12.3 (main, Apr 10 2024, 05:33:47) [GCC 13.2.0] on linux
It's just in a different order. Try sort printable.json | tail.
sort printable.json | tail
"9992": "\u2b1d",
"9993": "\u2b1e",
"9994": "\u2b1f",
"9995": "\u2b20",
"9996": "\u2b21",
"9997": "\u2b22",
"9998": "\u2b23",
"9999": "\u2b24",
"999": "\u0433",
"99": "\u00a5",
Gets me every time
sort -n then
Ooo
They are confirmed to be missing
marley@localhost:~$ grep -E '^\s*"555[0-9]{2}":' printable.json
"55500": "\uffe9",
"55501": "\uffea",
"55502": "\uffeb",
"55503": "\uffec",
"55504": "\uffed",
"55505": "\uffee",
"55506": "\ufffc",
"55507": "\ufffd"
marley@localhost:~$

Interesting, right?
Here's this if anyone wants to check for themselves
Wait... I may just be an idiot
32 elements missing...
Never mind
I'm just an idiot
I'll leave the post mortem as an exercise for the reader 
I can't even find a way to get 55539 entries, need a little hint
But the last values match, so if there are 32 missing entries they are either in the beginning or scattered in the middle
Did you run the code on two different OSes? (I get 6 more entries on Windows)
i get 55507 on both
3.12.0 on amd64-win32
although i get the reason why the numbers are mismatched
i don't get why there's 32 less elements
Same
good evening. Question from a noob. For creating virtual environments I know 2 ways on Windows: through WSL and through VS Code. What is the most "professional" and usual way used in companies?
How do you create the virtual environments via WSL & VS code respectively?
this question would be better suited to a help thread; see #❓｜how-to-get-help
!e
import sys
print(f"{sys.stdout.fileno() = }, {sys.stderr.fileno() = }")
:white_check_mark: Your 3.12 eval job has completed with return code 0.
sys.stdout.fileno() = 1, sys.stderr.fileno() = 2
is this guaranteed to always be the case?
!e
import sys
sys.stdout, sys.stderr = sys.stderr, sys.stdout
print(f"{sys.stdout.fileno() = }, {sys.stderr.fileno() = }")
:white_check_mark: Your 3.12 eval job has completed with return code 0.
sys.stdout.fileno() = 2, sys.stderr.fileno() = 1
well fuck.
well if it wasn't it'd be pretty weird
would that have the effect of all printed text being displayed as an error?
that's what I mean
so let me refine my question: are filenos 1 and 2 guaranteed to always point to what the OS considers stdout and stderr, respectively?
!e
import sys
sys.stdout.close()
f = open(__file__, "r")
print(f.fileno())
:x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 4, in <module>
003 | print(f.fileno())
004 | ValueError: I/O operation on closed file.
huh that's not what I expected
:white_check_mark: Your 3.12 eval job has completed with return code 0.
/home/main.py
was wondering what it would be in snekbox
no I was just trying to demonstrate that you can close stdout
oh wait the printing is throwing that
because it's trying to print to sys.stdout which I closed
!e
import sys
sys.stderr.close()
f = open(__file__, "r")
print(f.fileno())
:white_check_mark: Your 3.12 eval job has completed with return code 0.
3
!e
import sys
sys.stderr.close()
raise Exception('rekt')
:x: Your 3.12 eval job has completed with return code 1.
001 | object address : 0x7f68497a91e0
002 | object refcount : 3
003 | object type : 0x7f684a0f1ca0
004 | object type name: Exception
005 | object repr : Exception('rekt')
006 | lost sys.stderr
!e
import os, sys
os.close(sys.stderr.fileno())
f = open(__file__, "r")
print(f.fileno())
:white_check_mark: Your 3.12 eval job has completed with return code 0.
2
see I got fd 2 to point to some random file instead
@hearty heath I removed your message. do not post random memes in this server.
POSIX guarantees that STDIN_FILENO is 0, STDOUT_FILENO is 1, and STDERR_FILENO is 2. That leads Python to construct the sys.stdin stream wrapping fd 0, the sys.stdout stream wrapping 1, and sys.stderr wrapping 2. After they've been created, something else could always mess with them, though. Like the pytest capfd or capsys fixture, or contextlib.redirect_stdout
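A small sketch of the "something else could mess with them" case: `contextlib.redirect_stdout` rebinds `sys.stdout` for the duration of the block, so `print()` writes to a different object while file descriptor 1 itself is left alone.

```python
import contextlib
import io
import sys

buffer = io.StringIO()
original_stdout = sys.stdout

with contextlib.redirect_stdout(buffer):
    print("captured")                  # goes to the StringIO, not to fd 1
    swapped = sys.stdout is buffer

restored = sys.stdout is original_stdout
captured = buffer.getvalue()
print(swapped, restored, repr(captured))
```

Because only the Python-level object is swapped, output written straight to fd 1 by C code or a child process is not captured this way.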
!e
import os, sys
os.close(sys.stdin.fileno())
f = open(__file__, "r")
print(input())
:white_check_mark: Your 3.12 eval job has completed with return code 0.
import os, sys
you made the snake swallow its tail
I should spend more time in this channel instead of moderating or doing work.
I guess more generally - "what the OS considers stdout and stderr" isn't really a thing in the way you're imagining, at least not on POSIX systems (I'm not familiar enough with Windows to say if it's different). There's not really a continuous idea of what stdout and stderr are. The OS guarantees that the streams exist when you start your process. Once your process is running, it can do whatever it wants with those streams. There's no guarantee that, while your process is running, it has any open streams
"The OS guarantees that the streams exist when you start your process" Does it? I think that's something the shell does, but pretty sure you can spawn a process in a way that doesn't have those fds
I believe POSIX requires that the streams be open when main starts executing... Let me see if I can get a reference for that...
directly after program startup on posix platforms, yes
in a manner of speaking anyway
godlygeek is correct re: this
hm, no - the exec docs say:
If file descriptor 0, 1, or 2 would otherwise be closed after a successful call to one of the exec family of functions, implementations may open an unspecified file for the file descriptor in the new process image. If a standard utility or a conforming application is executed with file descriptor 0 not open for reading or with file descriptor 1 or 2 not open for writing, the environment in which the utility or application is executed shall be deemed non-conforming, and consequently the utility or application might not behave as described in this standard.
So: an implementation may or may not guarantee that 0/1/2 are open when main runs. If it doesn't and they're not, any POSIX utilities are allowed to fail.
Looks like Linux does allow the process to start without those FDs open
$ python -c 'import os; os.close(0); os.close(2); os.execlp("ls", "ls", "-l", "/proc/self/fd")'
total 0
lr-x------ 1 godlygeek godlygeek 64 Jul 22 20:04 0 -> /proc/5092/fd
lrwx------ 1 godlygeek godlygeek 64 Jul 22 20:04 1 -> /dev/pts/3
$
Wait... if 0 is closed, where is it outputting?
0 is stdin, 1 is stdout. I didn't close 1
oh right
but note that, when ls opened /proc/self/fd in order to list the contents, it got fd 0
!e
this is mildly amusing
import os, pathlib
pathlib.Path("foo.txt").write_text("Hello, world!\n")
os.close(0)
with open("foo.txt"):
    print(input())
:white_check_mark: Your 3.12 eval job has completed with return code 0.
Hello, world!
heh, I actually sorta like that
Anyone with access to Windows and a 3.13.0b4 or main build? Does pasting:
exec(compile("tuple()[0]", "s", "exec"))
in the new REPL exit the interpreter for you?
Hmm, seems to exit on Linux too.
crashes for me on macos too on close-to-latest main
Is the repl relevant? Does it behave differently in a script than in the repl?
yeah it crashes in repl code
    run_multiline_interactive_console(console)
  File "/Users/jelle/py/cpython/Lib/_pyrepl/simple_interact.py", line 156, in run_multiline_interactive_console
    more = console.push(_strip_final_indent(statement), filename=input_name, _symbol="single") # type: ignore[call-arg]
  File "/Users/jelle/py/cpython/Lib/code.py", line 303, in push
    more = self.runsource(source, filename, symbol=_symbol)
  File "/Users/jelle/py/cpython/Lib/_pyrepl/console.py", line 200, in runsource
    self.runcode(code)
  File "/Users/jelle/py/cpython/Lib/code.py", line 95, in runcode
    self.showtraceback()
  File "/Users/jelle/py/cpython/Lib/_pyrepl/console.py", line 168, in showtraceback
    super().showtraceback(colorize=self.can_colorize)
  File "/Users/jelle/py/cpython/Lib/code.py", line 147, in showtraceback
    lines = traceback.format_exception(ei[0], ei[1], last_tb.tb_next, colorize=colorize)
  File "/Users/jelle/py/cpython/Lib/traceback.py", line 155, in format_exception
    return list(te.format(chain=chain, colorize=colorize))
  File "/Users/jelle/py/cpython/Lib/traceback.py", line 1384, in format
    yield from _ctx.emit(exc.stack.format(colorize=colorize))
  File "/Users/jelle/py/cpython/Lib/traceback.py", line 747, in format
    formatted_frame = self.format_frame_summary(frame_summary, colorize=colorize)
  File "/Users/jelle/py/cpython/Lib/traceback.py", line 583, in format_frame_summary
    show_carets = self._should_show_carets(start_offset, end_offset, all_lines, anchors)
  File "/Users/jelle/py/cpython/Lib/traceback.py", line 701, in _should_show_carets
    statement = tree.body[0]
IndexError: list index out of range
@glass mulch please report a bug on CPython if there isn't one open already
Ah, looks like https://github.com/python/cpython/pull/122126 will fix it.
hm that doesn't look very robust
the problem seems to be that the AST is wrong, not that the REPL is
I'd be curious what all_lines is in the tree = ast.parse('\n'.join(all_lines)) call
all_lines = ["# Important: don't add things to this module, as they will end up in the REPL's"]
Due to a linecache bug, the _pyrepl source is being passed as the contents of the bogus "s" file. So the tree is empty there. Applying the fix from #122126 solves it.
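To make the "tree is empty" part concrete, here is a small sketch using the `all_lines` value quoted above: parsing source that contains only a comment gives a module with no statements, so indexing `tree.body[0]` raises the `IndexError` from the traceback. This only reproduces the failing expression, not the REPL code path.

```python
import ast

all_lines = ["# Important: don't add things to this module, as they will end up in the REPL's"]

# A comment-only source parses fine but contains no statements.
tree = ast.parse("\n".join(all_lines))
print(tree.body)

try:
    statement = tree.body[0]
except IndexError as exc:
    message = str(exc)
    print("IndexError:", message)
```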
But I still think we should be more robust against exceptions coming from _should_show_carets/format_frame_summary if they mean exiting the interpreter.
you could presumably trigger a similar bug by creating an empty file x.py and then compiling with x.py as the filename
and then that PR's code would still crash
hm. linecache is a cache, and it does stand to reason that code using it needs to be robust against the possibility that the cache is outdated... so, maybe a contextlib.suppress is the best solution...
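A quick sketch of what "the cache is outdated" can look like in practice, assuming a throwaway file named `x.py` in a temporary directory: `linecache.getlines` keeps serving the old contents after the file changes, until `linecache.checkcache` is asked to revalidate the entry.

```python
import linecache
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "x.py")

    with open(path, "w") as f:
        f.write("first = 1\n")
    before = linecache.getlines(path)      # populates the cache

    # Rewrite the file with different (and differently sized) contents.
    with open(path, "w") as f:
        f.write("second = 2  # a longer line\n")
    stale = linecache.getlines(path)       # still the cached, old contents

    # checkcache() drops entries whose size or mtime no longer match the file.
    linecache.checkcache(path)
    fresh = linecache.getlines(path)

print(before, stale, fresh)
```

So any consumer that parses lines obtained this way can end up looking at text that no longer matches the code being reported on.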
what is the default UA of asyncio.open_connection?
UA? user agent?
if that's what you mean then it opens a TCP connection, it's lower level than HTTP so it doesn't have a "default" User Agent
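A rough sketch of what that means in practice: with `asyncio.open_connection` you get a raw byte stream, so any `User-Agent` header only exists if you write it yourself. The example below talks to a throwaway local server so it needs no network access; the header value `hand-written/0.1` and the tiny server are made up for illustration.

```python
import asyncio


async def demo() -> bytes:
    seen = []

    async def handle(reader, writer):
        # Record exactly the bytes the client sent, then reply and hang up.
        seen.append(await reader.readuntil(b"\r\n\r\n"))
        writer.write(b"HTTP/1.1 204 No Content\r\n\r\n")
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    # Port 0 asks the OS for any free port on localhost.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        # Nothing is added for us: every header is spelled out by hand.
        writer.write(
            b"GET / HTTP/1.1\r\n"
            b"Host: localhost\r\n"
            b"User-Agent: hand-written/0.1\r\n"
            b"\r\n"
        )
        await writer.drain()
        await reader.read()  # read until the server closes the connection
        writer.close()
        await writer.wait_closed()

    return seen[0]


request_bytes = asyncio.run(demo())
print(request_bytes.decode())
```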
oh......
if you want async http, try httpx or aiohttp
httpx is slower, but aiohttp is more verbose
and aiohttp also comes with a server component and support for web sockets
how is aiohttp more verbose?
Both runpy.run_path() and pkgutil.get_importer() fail on Windows when given an overly long filename, because importlib._bootstrap_external._path_stat() calls os.stat() and nothing handles the ValueError it raises. Now, both functions may well be left alone. But does anyone see a way to trigger this for more interesting targets?
> .\PCbuild\amd64\python_d.exe -c "import runpy; runpy.run_path('a' * 33000)"
Traceback (most recent call last):
  File "~\PycharmProjects\cpython\Lib\pkgutil.py", line 223, in get_importer
    importer = sys.path_importer_cache[path_item]
               ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
KeyError: 'aaaaaaaaaaaaa[...]a'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    import runpy; runpy.run_path('a' * 33000)
                  ~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "~\PycharmProjects\cpython\Lib\runpy.py", line 281, in run_path
    importer = get_importer(path_name)
  File "~\PycharmProjects\cpython\Lib\pkgutil.py", line 227, in get_importer
    importer = path_hook(path_item)
  File "<frozen importlib._bootstrap_external>", line 1716, in path_hook_for_FileFinder
  File "<frozen importlib._bootstrap_external>", line 173, in _path_isdir
  File "<frozen importlib._bootstrap_external>", line 158, in _path_is_mode_type
  File "<frozen importlib._bootstrap_external>", line 152, in _path_stat
ValueError: stat: path too long for Windows
Hmm, there's another path:
>>> from importlib.machinery import SourceFileLoader
>>> s = SourceFileLoader("x" * 33000, 'x' * 33000)
>>> s.load_module("x" * 33000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<frozen importlib._bootstrap_external>", line 649, in _check_name_wrapper
  File "<frozen importlib._bootstrap_external>", line 1176, in load_module
  File "<frozen importlib._bootstrap_external>", line 1000, in load_module
  File "<frozen importlib._bootstrap>", line 537, in _load_module_shim
  File "<frozen importlib._bootstrap>", line 966, in _load
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 991, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1076, in get_code
  File "<frozen importlib._bootstrap_external>", line 549, in cache_from_source
  File "<frozen importlib._bootstrap_external>", line 104, in _path_join
ValueError: _path_splitroot: path too long for Windows
that seems to be behaving as expected?
I guess. I'm trying to figure out if there's any interesting place that doesn't expect that ValueError, as happened with linecache, which ended up exiting the interpreter:
https://github.com/python/cpython/issues/122170
Since these are import adjacent, I'm poking at the import machinery, etc.
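For context on what "expecting that ValueError" could look like, here is a hedged sketch of a stat-based check that treats `ValueError` the same as `OSError`. The helper name `is_dir_safe` is made up for this example and is not how importlib does it. Since the "path too long" `ValueError` is Windows-specific, the sketch uses a path with an embedded null byte as a stand-in trigger, which makes `os.stat()` raise `ValueError` on any platform.

```python
import os
import stat
import tempfile


def is_dir_safe(path: str) -> bool:
    """Return True if path is a directory, False if it cannot be stat'ed."""
    try:
        st = os.stat(path)
    except (OSError, ValueError):
        # OSError: missing file, name too long on POSIX, and similar.
        # ValueError: embedded null byte, or an overly long path on Windows.
        return False
    return stat.S_ISDIR(st.st_mode)


print(is_dir_safe(tempfile.gettempdir()))
print(is_dir_safe("a\0b"))
print(is_dir_safe("a" * 33000))
```

Code that only catches `OSError` around `os.stat()` would let the second case propagate, which is the kind of gap being searched for here.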
Well, I learned that a very long PYTHONPATH on Windows stops the interpreter from initializing correctly, so there's that. Not sure if it's already known, important or just an obvious "then don't do that", but might be worth an issue to figure out. What do you think?
> $Env:PYTHONPATH="a" * 33000
> py -3.13
Exception ignored in running getpath:
Traceback (most recent call last):
File "<frozen getpath>", line 668, in <module>
OSError: failed to make path absolute
Fatal Python error: error evaluating path
Python runtime state: core initialized
Current thread 0x00005a8c (most recent call first):
<no Python frame>
you need to enable the long paths when installing the python interpreter.
this is not a realistic situation, because paths that long are not possible in NTFS
and the outcome is expected, because the provided path doesn't make any sense
OSError: failed to make path absolute
good point
Hmm, yeah might not be anything. Also works in sys.path, where it's likely even less of an issue:
>>> import sys
>>> sys.path.insert(0, "a" * 33000)
>>> import email
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 1512, in _path_importer_cache
KeyError: 'aaaaaaaaaaaaaaaaaaaaaaaa[...]aaaaaaaaaa'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<python-input-6>", line 1, in <module>
    import email
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1322, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1262, in _find_spec
  File "<frozen importlib._bootstrap_external>", line 1555, in find_spec
  File "<frozen importlib._bootstrap_external>", line 1527, in _get_spec
  File "<frozen importlib._bootstrap_external>", line 1514, in _path_importer_cache
  File "<frozen importlib._bootstrap_external>", line 1490, in _path_hooks
  File "<frozen importlib._bootstrap_external>", line 1714, in path_hook_for_FileFinder
  File "<frozen importlib._bootstrap_external>", line 173, in _path_isdir
  File "<frozen importlib._bootstrap_external>", line 158, in _path_is_mode_type
  File "<frozen importlib._bootstrap_external>", line 152, in _path_stat
ValueError: stat: path too long for Windows
That error doesn't seem great, I would support a change to print a more tailored error message
though not sure if it's really worth changing, is this likely to happen without someone messing with the environment on purpose?