#off-topic
Because the CVE process has no means to mark preconditions to exploit as part of the validity of the CVE, they are right
whether or not them being right is useful to developers or will just act as noise making CVEs less useful is the issue with it
To me this sounds like they are mocking the idiots that’re knowingly filing false CVEs for clout.
we should be viewing CVEs in a more holistic way, rarely is one bug enough to cause full system compromise, but often a chain of bugs is.
we should want to fix all bugs, but saying all bugs lead to full system compromise isn't practical
and sometimes intentional behavior is accepted as a CVE, without consideration that the intentional behavior has a precondition to it that prevents the associated CVE
should we make software less useful, or should it only be a CVE if the necessary invariants to safety can be violated?
fantastic response lol https://github.com/pytest-dev/py/issues/287#issuecomment-1283567565
Instead of a page listing downloads I still want rustup for Python
yes I agree with you about having a single tool but what I mentioned earlier was that the tool must manage Python installations for you and it's a terrible experience to go through the installation/compiling process all the time so we need prebuilt distributions like those standalone builds that everybody uses now
True. IIRC the last time I asked, standalone was already a solved problem (technically) but there’s little momentum to actually put the builds into the release pipeline and shown on the website. Is that actually the case? The problem has been at the back of my head for so long I’m no longer sure if I’m hallucinating a solution.
I believe this is the main issue on the topic in CPython https://github.com/python/cpython/issues/119696
That's what rye does right?
Downloads premade portable builds
So does hatch if you want to
I'm proposing a gradual rewrite of some tooling at work and I thought I would use the cloud CLIs for comparison
even though I don't trust them to maintain stuff over long time scales, Google really is the master of UX
just like their UI is nicer than AWS, so too is gcloud compared to aws
az is somewhere in the middle but much closer to Google's, it's a very nice interface
the only characteristic for which AWS shines is the response time, they must have optimized with lazy imports and other techniques because it's way faster
responsiveness:
unless I have outdated understanding this is a fair comparison because all are written in Python
That's kind of wild? uv help is like 3ms
Python imports are slow.
There has been a fair bit of work optimising pip's start up time (without resorting to large scale lazy imports) and even then pip help > /dev/null takes 120-50 ms (and I'm on a fast laptop).
Hatch is optimized for this https://github.com/pypa/hatch/actions/runs/9830087987/job/27135868762#step:8:17
I don't test UV in the benchmark because... obviously
The next step would probably be breaking up more of pip's codebase into sections that can be lazily loaded, and working with our upstream vendored libraries to improve their import time, but that's a lot of work for not much gain at this point...
A lot of the low-hanging fruit (on pip's end) has been dealt with at this point AFAICT.
I still lament the rejection of PEP 690
pretty much everything left for tools like pip is stuck waiting on the interpreter startup speed itself to be faster
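For anyone wondering what "lazily loaded" can look like with only the stdlib, here is a minimal sketch based on the importlib.util.LazyLoader recipe from the Python docs. lazy_import is a hypothetical helper name, and nothing here is confirmed to be what pip or the AWS CLI actually do.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose body only executes on first attribute access.

    Hypothetical helper: a sketch of the stdlib LazyLoader recipe, not any
    particular tool's implementation.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    # Wrap the real loader so executing the module body is deferred.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only sets up the deferral; it does not run the module.
    loader.exec_module(module)
    return module


if __name__ == "__main__":
    colorsys = lazy_import("colorsys")  # cheap: module body not run yet
    print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))  # first attribute access triggers the real import
```

The trade-off is that import errors surface at first use rather than at startup, which is part of why doing this at large scale is a lot of work.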
@junior narwhal is there a recommended way to run commands for building - i want to play around with a poor man's integration of zig / pydust and it needs to run some commands for building the extension modules
doing a bit more research so I can finish up a PEP and omg what is AWS doing
So, I have a confession to make, I don't have a hardware security key yet. I plan on fixing that soon, does anyone have any experience with the Yubico Security Key? That's the key I'm planning to buy. I'm curious as to whether there are any "gotchas" I should be aware of.
am I hallucinating or was there a blog post about requests or urllib3 easing a transition away from a deprecated extra/optional dependency by maintaining a meta package?
I vaguely remember a blog post too, but this is a relevant issue https://github.com/urllib3/urllib3/issues/2680
I can't find any blog post, feels like something @valid rover would've written about at some point, but I dunno.
you found it, thank you! I thought it was Seth but I couldn't find anything and AI didn't help me
found it, Twitter https://x.com/sethmlarson/status/1558076150206926850
Perhaps you're thinking of the video from Anthony? https://youtu.be/_jUXdX8e9Wg?feature=shared 😛
no it was definitely your Twitter, long threads equals a blog post in my mind apparently lol
Ahhhh yeah Twitter thread too!
Haha i should go back and change all my long threads into posts now that Twitter is the way it is
I actually find Twitter/X one of the best platforms now, the Community Notes feature is so good to reduce misinformation
Yeah i don't like the walled garden aspect, or the owner.
Used to be able to link anywhere!
that policy got removed promptly unless I'm thinking of something else
oh and the walled garden policy got removed very quickly, I know because I send family cute animal videos and they don't have an account (and were indeed very annoyed when they had to have one)
I see walls, here's how the "long thread" looks if you're logged out... 🙃
I have a Yubikey Nano and it works well. I also use Touch ID on Mac as 2FA
the only Yubikey gotcha I know is accidentally pressing it and it splatting something like "cccccclvkjnnfllcrihrgdfdjglucgbeckklibdgdhgv" onto my screen! (it's technically a USB keyboard)
and I can't see anything at https://x.com/sethmlarson except a sign-in box
for user profile pages: Facebook takes a similar approach where the default privacy settings are set up as private, Instagram and Threads default is public but endless scroll is disabled and you get a message to sign in, the only popular social media app I know of that isn't limited is TikTok
I guess the problem is that Twitter used to not be that, so people with an account still treat it as something where they can just send around links to everyone and expect these links to work.
Nobody ever sends me a Facebook link, because they know that I probably don’t have an account there and can’t see the thing.
links to specific posts work just like before, long "chains" you have to have an account which like you said I guess used to work but now match other social media platforms
I think this is not a very huge problem because a person willing to view a longform thread chain of text information is very likely to already be a user, whereas the vast majority of single posts are memes, videos, news stories, etc. which are inherently consumable
for example, Seth explaining how urllib3 gradually deprecated an extra, there is nobody I know who I would send that to who isn't already on the platform
That sounds like an interesting post
Is it posted somewhere else that I could read it without signing up for Twitter?
Check out Yubiswitch: https://github.com/pallotron/yubiswitch
I forget if there's any active but there might be ongoing programmes for giving hardware keys for maintainers on PyPI?
The program between PyPI and Google was finished in 2021 or 2022?
Huh, yea.
The cost isn't actually that bad. I first looked at the Yubikey 5 which seems to be the "gold standard" and those would be ~$80 each across the pond, but the Yubico Security Key seems sufficient and is much cheaper at $40.
https://store.google.com/product/titan_security_key is what they shipped, I think.
And, that's cheaper still.
I live in Canada so you can x1.4 whatever freedom dollar pricing you got. (I don't know how much pounds are worth).
1 GBP = 1.78 CAD apparently.
Damn.
It's nicer numbers than 1 GBP = 107.5 INR
Google has a poor track record when it comes to supporting their products long-term, and while I doubt they would drop support for a security minded hardware product like this, the pricing is actually not that competitive once you factor currency conversions anyway.
All excellent reasons to not use that over a Yubikey. :)
I have money, it's just that I'm also not in the position to justify $100+ on security keys.
One for normal usage and one as backup if you lose/break the daily one
Yea... the plan is that I'm going to get another at $undetermined-later-time.
I feel you. I got a few yubikeys back when Cloudflare and Yubico made a deal and keys were like 10% the price or something
I actually store 2FA recovery codes so I'm not SOL immediately if I lose my 2nd factor.
But have yet to set them up as well
The main thing that prompted this is a) my phone is starting to fail which is ... not great given it has all of my TOTP codes, and b) I have meaningful access to things at this point...
It's funny, I started contributing to black just as a way to pass the time during the lockdowns and now I have commit access to a major project, and triage access to a few more.
10%...? thats incredible
I use Bitwarden and pay for the premium account ($10/yr). Passwords and TOTP codes stored E2E encrypted in the cloud. I really don’t ever have to worry about losing my TOTP codes since they’re both backed up and synced to all my devices. I do still download and store recovery codes in a safe place but I can’t remember the last time I’ve needed one. IMHO, Bitwarden has probably the best UX of any password manager I’ve used, has a CLI, is open source, and available on just about everything.
I use Bitwarden as well, but that's protected by TOTP 2FA which is on my (somewhat failing) phone
I have the same exact personal setup but with 1Password, for work however we must use security keys
https://peps.python.org/pep-0749/ might meet your needs
I occasionally have to enter my BW 2FA code into BW itself. Not very often, but in those cases I use a device with biometric unlock so I'm never practically in a bind relying on a single device.
I have the same exact personal setup but with 1Password, for work however we must use security keys
Same at work as well.
this goes incredibly hard 😂 https://github-roast.pages.dev
I wasn't sure if I wanted to actually click the button, but ended up being genuinely amused by the result:
https://github-roast.pages.dev/share/pradyunsg/?lang=english
Oof. It clung a bit too strongly on the don't send me emails thingie tho.
It tends to focus on the wrong things sometimes. Like it complains my repos such as FastAPI has zero stars… yeah because it’s a fork
I'm kinda happy it didn't find anything to poke that hurts?
Listing affiliations is intentional, follower count and repository counts are meh, tech support disclaimer makes my life better.
On that note... https://praise-me.fly.dev/ exists too!
Heh, I like that (even though its praise for me was a bit too scattergun to feel sincere. There really is a lot of random crap littering my public GitHub, and I haven't gotten around to archiving all the repos I should)
Hah, yeah the roaster completely failed for me as well. I think its developer didn’t understand that popular repos often become GitHub orgs.
great meme 😂
Are projects ever given access to the machines CPython uses for CI? https://github.com/python-cffi/cffi/issues/109#issuecomment-2302948967
I don't think so, the buildbots are all provided and administered by different people/groups
Yeah, the transitive trust requirements make the logistics too much of a hassle to be practical (unfortunately)
I asked the core dev Discord's off-topic channel if anyone knows of a way community orgs can get access to GitHub's ARM64 beta, though (since I assume MS would like their CoPilot+ PCs to be a good platform for AI development, not just consumption).
The ARM64 GHA runners?
The beta is open to the public now: https://github.blog/news-insights/product-news/arm64-on-github-actions-powering-faster-more-efficient-build-systems/
You should be good to use them
These runners are available to our customers on our GitHub Team and Enterprise Cloud plans.
Yeah, I only caught it on a second re-read myself.
GitHub tends to roll out to paid users first, and then open source projects a few months later
what's the general consensus on using github copilot/LLMs to write code?
I avoid it. Even if there weren't potential issues around who owns the code (open, unanswered legal questions), for the tasks it can do, it takes longer to review Copilot code to ensure it's sound than it does to just write it.
I don't use it for any OSS work but my company is experimenting with it and I'm using it and giving them feedback. I think there are some workflows it speeds up. It's good at autocompleting repetitive code. It often has good rename variable suggestions. It's good at bootstrapping test cases when you don't have any yet.
It's amazing for one off throwaway scripts, especially when you need to write it in a language or library you don't know off by heart.
for me AI has been super awesome for coding! before copilot I was using Tabnine, now I use both. it frequently provides completions that are accurate without even typing the first character and just doing a new line. I've gotten slightly weaker in the past few years before I started my current treatment and if I had to estimate, AI has got me as productive as I was in ~2018
For OSS submissions, definitely steer clear as best you can due to the murky state of copyright around AI generated code (some level of AI assistance may be hard to avoid depending on which editor you use).
For personal use, the AI-enhanced Intellicode in Visual Studio was spectacular when I was learning C# earlier this year. I also found https://nicholas.carlini.com/writing/2024/how-i-use-ai.html to be a really interesting read as to what current gen AI code generators are already good at.
I'm looking to switch to a self-hosted web analytics platform. Any good suggestions?
I prefer something simple.
I think plausible.io is the best known. I've heard good things about it
We're using it for the Python docs: https://plausible.io/docs.python.org/
I love seeing other people discover and use Next PR Number. It's a dead simple project, but folks find it useful :)
https://github.com/pandas-dev/pandas/pull/51978#discussion_r1137248410
https://github.com/aucampia/rdflib/blob/d74ccad7d95e84faf76c5fa6314a5352a5080f6e/CHANGELOG.md?plain=1#L345
And I know a few other contributors of other projects are using it when I check the internal logs I keep.
I had only heard of matomo before, but plausible definitely seems closer to what I want being much simpler. Thanks for the suggestion!
Sure thing! Glad it fits what you want
Hmm, trying to self-host plausible-ce on a VPS with only a GB of ram is definitely a tall order.
ouch yeah I didn't realize it requires clickhouse
I'm planning to migrate to a larger VPS anyway, but I won't have the possibility to do that until later.
checks out 
OK, so I did migrate to a 2 GB VPS, hopefully this suffices for now. It's a bit expensive, but I'll be migrating to an entirely different VPS provider later.
On the bright side, it is working :)
So. Many. Hyperlinks. AHHH
Our review of the account named in your report has concluded. We have determined that one or more violations of GitHub’s Terms of Service have occurred and have taken appropriate action in response.
Good job but also... sigh
consider yourself fortunate that you didn't create a popular Bitcoin library in the past and are now on a list that s[pc]ammers @ on various repos who you have to report at least once a month...
Well, since I never touched crypto with a 10 foot pole, I’m luckily safe here.
@silk jungle #general message couldn't help myself
Perhaps I should figure out how this Mastodon thing works... hmm
It's nice. It is a smaller community than twitter but I kind of like that
Yep, and lots of Python people
I've found it to be quite useless outside of some tech bubbles, I wish everyone could be on the same platform
I've been using it specifically as a place to post Python musings and links (which is how I started out with Twitter), and it's functional for that purpose (which fits into one of @ofek's tech bubbles)
Hello all! We just created a new channel under Other Projects, called #wheel-next . The idea is for folks to collaborate on the evolution of the wheel spec, variant support, symlink support, writing PEPs, reference implementations, etc. etc. The public GH is https://github.com/wheel-next with more information. While a bunch of folks from various corporations are collaborating, this is very much a community-driven fully open initiative.
Are you going to get zstd support into CPython so we can use it in the wheel spec pls
I’m too stupid to write C code correctly
@marsh kite was going to look into that. I'm a strong +1 on that.
Awesome
I looked into it at one point but C code makes my brain hurt
Need to make it possible to use pyo3 in CPython 🙂
* insert Rewrite it in Rust memes *

Yes, I have a branch that I need to refactor but I will probably do that in the next couple of weeks.
Then I "just" need to write a PEP about it 🙂
nothing makes me appreciate life more than when my computer recovers from a BSOD crash loop
I encourage all of you to use Windows in order to enable this gratitude hack
arch rolling release on prod server can give you similar rush :D
what about chromeos, nginx as a webapp
only needs a few zero days, don't worry about it 
Excellent. I'm now locked out of Twitter. I had set up TOTP 2FA and yet they seem to have disabled that and require my recovery code.
It seems like that code got rotated when I updated the TOTP application.
I'm going to make sure all of my recovery codes are up to date...
I was curious as to who linked to my post on Twitter.
that's rough
Fortunately it requires physical access and tearing the yubikey apart, but it does mean if you lose your key you should still revoke any keys on it, even if you think it's protected by a PIN
I thought a PIN protected you against this attack, as they can't force the chip to execute the code vulnerable to the side channel without one?
Perhaps, I'm not familiar enough with the details
apropos of nothing, I dislike Go profoundly. the concept of dependencies using Git is interesting but otherwise almost everything about it I hate and having to work with it is painful
Having a mutable multi-tenant "package" source seems like a nightmare, I really don't like the idea of git repositories as dependencies
they have their place at times, but as the main source of packages, I think there are too many issues
git-based dependencies are fine, so long as you actually point to a specific commit hash and not to a mutable reference (like a tag), or are intentional in pointing to a branch or tag that you trust the author of to provide the version guarantees you expect. It's really no different than people having dependencies without an upper bound with python deps and with an index in play.
There are more problems with Go as well.
it's always such a mistake to respond to an active social media tech thread about your passion, on a Sunday evening no less. don't do it folks
actually I would like to slightly alter what I said in the opposite direction, I think those of us involved in packaging need to do much more evangelism on social media otherwise nobody is aware of anything, for example https://x.com/zeeg/status/1832910845854253338
it's so bad that a person extremely knowledgeable about Python thinks that the PSF doesn't do fundraising for packaging...
so I've never actually used uv
does it do anything special besides install things
lol
maybe my impression was wrong? I thought it just copied the pip and pip-tools CLIs under a uv sub command and made them faster
it's basically Hatch in Rust with an experimental locking strategy/file and workspaces (coming soon to Hatch), a pipx-like command (coming after workspaces to Hatch) but without Hatch environments and plugin capabilities
slight high-level deviation in what we view as good UX but basically that is an accurate assessment
but yeah as I've been slowly realizing, I basically failed as an open source maintainer in the year 2024 because of my limited social media posting. even if you have great docs, people will not even know about what's possible without constant evangelism
it's why everyone thinks the ability to install arbitrary versions of Python via tool python install ... is so novel even though Hatch did that last December, and just like how tool run pep_723_script.py everyone thinks was a novel UX innovation by UV when I wrote the spec and introduced it in Hatch in the spring (although some folks realized like Will for example who changed their blog post entry, very nice of him https://textual.textualize.io/blog/2024/09/15/anatomy-of-a-textual-user-interface/#all-right-sweethearts-what-are-you-waiting-for-breakfast-in-bed)
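For reference, the inline script metadata that PEP 723 standardised is just a specially delimited comment block at the top of a script. A minimal sketch of pulling it out follows; the regex is modeled on the reference implementation in the PEP, while read_script_block, EXAMPLE, and the "rich" dependency are illustrative assumptions, not code from Hatch or uv.

```python
import re

# Modeled on the reference regex given in PEP 723.
REGEX = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"

# Illustrative script text; the dependency is an arbitrary example.
EXAMPLE = '''# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "rich",
# ]
# ///
print("hello")
'''


def read_script_block(script_text):
    """Return the TOML text of the 'script' metadata block, or None if absent."""
    matches = [
        m for m in re.finditer(REGEX, script_text) if m.group("type") == "script"
    ]
    if len(matches) > 1:
        raise ValueError("Multiple script blocks found")
    if not matches:
        return None
    # Strip the leading "# " (or bare "#") from every content line.
    return "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in matches[0].group("content").splitlines(keepends=True)
    )


if __name__ == "__main__":
    print(read_script_block(EXAMPLE))
```

A runner would then parse that text as TOML (tomllib on 3.11+) and build an environment from requires-python and dependencies before executing the script.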
@junior narwhal I totally agree with your tweet, money is what made uv possible. Of course it’s a risk to create a company for something like that and it’s a (small) risk to buy into things backed by a company (because things can always become enshittified when the company is in trouble)
uv is basically 1. identify something that’s used and grown and changed for a very long time, 2. discard the complex edge cases that few people need 3. do a clean-slate rewrite of the core and API everyone needs, but faster.
This needs time, and time costs money.
[uv is] basically Hatch in Rust
I wouldn’t describe it like that. I’d describe Hatch as (primarily) a Python project management tool whereas uv is (primarily) a direct Python venv manipulation tool like pip. In my eyes they are mostly orthogonal (indeed Hatch wouldn’t use uv as a backend if there was a bigger overlap, right?). One of the biggest overlaps is probably in that they can both download and manage Python runtimes, right?
I’d say that the PEP you wrote regarding script dependencies is the most Hatch-like thing uv implements (even though pipx did the “create venvs from spec in cache dir” first)
I wouldn’t describe it like that. I’d describe Hatch as (primarily) a Python project management tool whereas uv is (primarily) a direct Python venv manipulation tool like pip.
some of the new stuff in uv is project management, they describe it like this:
End-to-end project management: uv run, uv lock, and uv sync. uv can now generate and install from cross-platform lockfiles based on standards-compliant metadata, making it a high-performance, unified alternative to tools like Poetry, PDM, and Rye.
For what it's worth, I find it pretty hurtful that you'd say the only reason we could build this is money — we're not a big team. Charlie's done some really impressive work and attracted talented people who are excited about building things that improve the status quo at scale.
I also think that uv is not "basically Hatch", yes we have a large overlap in features but I think we've taken a different approach in our designs (and not necessarily in a better way, just different — e.g., Hatch is way more extensible and pluggable).
no one is denying the skills of your team. it is just the fact that having a team that works on a tool full-time as their job gives far better results than having the same team working on it as volunteers, after hours, when they also have to focus on their jobs that put the food on the table. Astral did something great with uv, but the money behind it played significant role in how well and how fast the tool was made
Sure the money is helpful and has a role in the speed we're able to work at, but to say it is "what made it possible" seems like a stretch.
You're welcome to your opinions though. Just know that we read these things and it's not harmless.
(I totally agree with the sentiment that way more money should be put into the Python ecosystem, esp. packaging)
I'm sorry, I'm currently sick and fuzzybrained and don't choose my words good.
I should have said something like:
What Astral is doing is a big effort, and having the ability to continue investing time into it ensures that their work has staying power.
I think if what you did were hobby projects, there is a higher chance that you'd prioritize things that keep you alive over pouring time into it.
I'm super grateful for what you do.
I can definitely vouch for the "money = time" aspect. While the project I'm working on for LMStudio (portable venv layering that actually works properly) is a dramatically more niche use case than anything Astral are doing, it's something I've thought should exist for more than a decade, but would never have cared enough about to write on my own time. Dedicating 24 hours a week to it makes that project possible. I wouldn't vouch for LMStudio's longevity (I'm just a contractor, I don't know anything about their monetisation strategy), but once the project is published that won't matter so much, as the open source license will cover a lot of risks for other folks that find it suitable for their own use cases.
What money doesn't magically make happen is the research and community engagement efforts that Zanie, Charlie, and the other Astral folks have been putting in, so they deserve all the credit for that. They started from the assumption that the existing tools worked the way they do for a reason, so they ensured that first they could replicate (most of) that behaviour before really starting to explore what could be done more effectively by approaching it differently. No amount of money could make a project like uv work as well as it has without a team that actually listened to and understood the developer community they were trying to support.
Where would I ask a gh-action-pypi-publish question? Specifically, I'm wondering if this:
- name: Generate artifact attestation for sdist and wheel
  uses: actions/attest-build-provenance@v1.4.3
  with:
    subject-path: "dist/*"
- uses: pypa/gh-action-pypi-publish@release/v1
  with:
    attestations: true
makes sense to have both an attestation for GitHub and one uploaded for PyPI? Are they unrelated?
I think they're unrelated, but let's ask @steel crane and @royal dirge
you might want both if your users want more options to verify
Oh wait, the PyPI action can do that?
time to read
it's only a couple of weeks old: https://github.com/pypa/gh-action-pypi-publish#generating-and-uploading-attestations
part of PEP 740
My question would be — does it push the attestations it creates back to GitHub’s system?
It doesn’t ask for the right permission to do that, so no. Doing both seems to be fine. Pybind11 now has both
Well, I tried it
It uh... broke
> Run pypa/gh-action-pypi-publish@8a08d616893759ef8e1aa1f2785787c0b97e20d6
Checking dist/crazylibs-0.1.2-py3-none-any.whl: PASSED
Checking dist/crazylibs-0.1.2.tar.gz: PASSED
Notice: Generating and uploading digital attestations
Error: Attestation generation failure: The following paths look like distributions but are not actually files: /github/workspace/dist/crazylibs-0.1.2.tar.gz, /github/workspace/dist/crazylibs-0.1.2-py3-none-any.whl
https://github.com/letsbuilda/crazylibs/actions/runs/10863127565

That was it
heh, I wasn't expecting to see my name in a formal attribution, but apparently people cite your top PyPI packages data frequently @dreamy hatch https://zenodo.org/records/4732473
"""
This script uses the data available on:
https://hugovk.github.io/top-pypi-packages
DOI:
Hugo van Kemenade, & Richard Si. (2021, May 1).
hugovk/top-pypi-packages: Release 2021.05 (Version 2021.05).
Zenodo. http://doi.org/10.5281/zenodo.4732473
"""
Yeah, I made a list, it gets cited quite a lot. Turns out making hard-to-get data more accessible is useful for science!
It's also easy to hook up Zenodo to make a "digital object identifier" (DOI) for package releases, so people can cite the DOI. For example, here's Pillow: https://zenodo.org/doi/10.5281/zenodo.596518
wow the new o1 OpenAI model is really good. I was struggling with a type checking issue and, whereas the other models (4o and Claude) were trying to fix the problem how I wanted, it explained why what I'm trying to do is not possible given current Mypy limitations
now I'm curious about what the problem was
https://github.com/ofek/msgspec-click/blob/v0.1.0/src/msgspec_click/_core.py#L224
I wanted the setter functions to have a signature that represented the actual type that would be passed but there is no solution currently for that (even with a bounded TypeVar or a long Union) so I had to end up asserting the type at the start of every function
Still hoping one day we will get types on the same level as TypeScript has
most unlikely unfortunately
this looks slightly like the stuff we do for pytests store objects - where the store key varies the item type correctly
but the writeup would be magnitudes uglier in the linked case
I would appreciate a link!
I tried that last night and still got errors
i suspect that one needs a dict subclass with the type declarations and then do per item assignment of the methods
I tried a bounded TypeVar with and without Generic, tried with Protocol, tried a long Union, nothing worked
you basically have a mapping where the get methods are "generic" but the mapping itself is not
so you'd have a get[T](type[T]) -> Callable[[..., T], None]: ... but T wouldn't be part of the outer type
hence the need for a custom mapping type, it's most ugly
in case others find it useful, I made a library for generating Click options from msgspec types. This is useful for plugins being configurable by users at the command line. Hatch will use this soon and overall will be going all-in on msgspec 🙂 https://github.com/ofek/msgspec-click
I tried Pydantic but unfortunately importing it takes longer than the current response time of Hatch itself. I don't know what's going on there but I can't wait until the situation improves https://github.com/pydantic/pydantic/issues/9908#issuecomment-2351090365
while i'm generally pretty happy with o1, this can be statically typed: https://mypy-play.net/?mypy=latest&python=3.12&gist=08afb9f85adf90405d0cb841c43af22c
there are some similar patterns that are slightly easier to type too, e.g. using functools.singledispatch
might need some adjustment if you have some interesting requirements around subclassing
oh awesome! I was close to that but the only difference was that for the bound I had only the base class of all types because I thought that would work
that actually will work too, but then you won't get a type error if there's a new subclass you attempt to use that doesn't have an entry in the dict (like UnknownType in the example)
(in all cases note you do need the one measly type ignore on the assignment, to paper over the fact that you'll have some false negatives around subclassing. but all the users of it will be happy / the type inference will be what you want)
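A rough sketch of the shape being discussed, for readers who can't open the playground link: a plain dict keyed by type, wrapped in a getter whose TypeVar is bound to the union of the types that have entries. Every name here (IntField, StrField, UnknownField, get_setter, and so on) is hypothetical, and this is a guess at the pattern rather than a copy of the linked example or of msgspec-click. In this version the dict values are typed loosely with Any, so no type ignore is needed.

```python
from typing import Any, Callable, Dict, Type, TypeVar, Union


class Base: ...
class IntField(Base): ...
class StrField(Base): ...
class UnknownField(Base): ...  # deliberately has no setter registered


# Bound to the union of registered types rather than Base, so a checker
# should reject get_setter(UnknownField) instead of failing only at runtime.
Known = Union[IntField, StrField]
T = TypeVar("T", bound=Known)


def set_int(opts: Dict[str, Any], field: IntField) -> None:
    opts["type"] = int


def set_str(opts: Dict[str, Any], field: StrField) -> None:
    opts["type"] = str


# The mapping itself is not generic; values are typed loosely on purpose.
_SETTERS: Dict[type, Callable[[Dict[str, Any], Any], None]] = {
    IntField: set_int,
    StrField: set_str,
}


def get_setter(kind: Type[T]) -> Callable[[Dict[str, Any], T], None]:
    """Generic accessor: callers get a setter typed for the exact field class."""
    return _SETTERS[kind]


if __name__ == "__main__":
    opts: Dict[str, Any] = {}
    get_setter(IntField)(opts, IntField())
    print(opts)
```

The setters keep precise signatures and every caller of get_setter gets the inference they want; the looseness is confined to the one dict annotation.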
@onyx spindle is this Secrus also you?: https://github.com/sdispater/pendulum/issues/844#issuecomment-2366836232
yes
You guys really are everywhere
in less places than I would want, but still barely handling all of that in the time I have 🤣
I can't test my app on the 3.13 RCs because Pendulum doesn't have wheels and I have no Rust
But I... IDK if it's "incompatible" with 3.13 in the sense that it will burn down, or if it's just needing new wheels
yeah, I was thinking more about code breakage on 3.13
as to wheels... I will see what I can do. I don't have PyPI access to the project and might take a moment to get the author to publish them
You may have heard of this new thing called "Trusted Publishing" 
yeah, still requires some work from the PyPI project admin
Yeah
Do you need help with... I can try and install Rust in my container and see if it runs sometime in the next few days
I might find some time in the near future to fix some stuff and maybe tag 3.0.1 with 3.13 in the pipelines so we can check, but no guarantees on that
Having a dependency on Pendulum for something you need to update to new versions of Python is, I would suggest, not a good idea; with the exception of the 3.0.0 release, the project has been borderline unmaintained for years.
I'm really hoping whenever becomes popular, I would love to use that as my main datetime library. But a bit wary it's a minor player right now.
Oh
Well that's good to know
It's not my choice though
It's a dependency of:
https://github.com/microsoft/kiota-serialization-form-python
https://github.com/microsoft/kiota-serialization-json-python
hi friends. 👋 I have a VERY off-topic question for you. Who set up the discord onboarding here? We (pyOpenSci) are adding discord to our platforms, and I LOVE that you have people read and provide an emoji response to the rules before they can post. I wondered if you have a bot or how that was set up. Many thanks!! 👐
@silk jungle was tweaking the settings on that IIRC
shouldn't this function also do something on macOS since that platform is also case insensitive? https://docs.python.org/3/library/os.path.html#os.path.normcase
that reminds me, that function should also be changed to account for Windows' new case sensitivity flag, unless it already does, hm, let me check
what would you recommend for macOS users? I'm writing code that finds executables satisfying certain conditions and since the default experience is case-insensitive I'm worried that users would get duplicate paths (mostly because of shell startup scripts) so I'm thinking of not using that function and treating macOS as case-insensitive in all cases
apparently macos has support for case sensitivity as well, food for thought
yes I understand that but I'm trying to think of the 99% use case
then shouldn't you be alright using it anyways?
On other operating systems, return the path unchanged.
i.e for the 99% it will return a lowercase path
I suppose, although I've seen some software use uppercase in path components but you're right that only Windows is particularly prone to stuff being a mix of upper and lower case so I'll use that function indeed, thanks!
I don't have the bandwidth to find it but I could've sworn I had to fix a bug (maybe for Hatch) that was explicitly about case sensitivity on Windows and macOS but not Linux
found it, but I can't find where macOS was mentioned https://github.com/pypa/hatch/issues/1350
macOS case sensitivity was another issue from before which is why I copied the decision to treat it as case-insensitive https://github.com/pypa/hatch/issues/1054
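A rough sketch of the "treat macOS as case-insensitive" decision when deduplicating executable paths. os.path.normcase only changes paths on Windows, so the extra case folding on darwin is an assumption layered on top; dedupe_key and unique_paths are made-up names, not Hatch's actual code:

```python
import os.path
import sys


def dedupe_key(path: str, platform: str = sys.platform) -> str:
    # normcase lowercases (and normalizes slashes) on Windows only.
    key = os.path.normcase(os.path.normpath(path))
    # Assumption: default macOS volumes are case-insensitive, so fold case
    # there too, accepting wrong results on case-sensitive volumes.
    if platform == "darwin":
        key = key.casefold()
    return key


def unique_paths(paths, platform: str = sys.platform):
    seen = set()
    result = []
    for path in paths:
        key = dedupe_key(path, platform)
        if key not in seen:
            seen.add(key)
            result.append(path)  # keep the first spelling encountered
    return result


print(unique_paths(["/usr/local/bin/Python", "/usr/local/bin/python"], platform="darwin"))
```

The platform parameter is only there so the behaviour can be exercised from any OS.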
@ionic tulip pluggy getting some more love, in case you haven't seen 🙂 https://simonwillison.net/2024/Sep/25/djp-a-plugin-system-for-django/
Looks lovely, I gotta reiterate the async support plan for pluggy
can anyone reproduce this, especially the time reduction by using the previous minor release of Pydantic? https://github.com/pydantic/pydantic/issues/9908#issuecomment-2377109363
side note: it saddens me that Windows is ubiquitously slower with everything and I wish I had more systems knowledge to understand why
For what it's worth, I can't see much of a difference when run on Windows 11 with fresh environments — ended up with around 115-130 ms on both Pydantic minor versions. My sample sizes were relatively small though.
thanks for trying! that actually is a reproduction because they announced significant improvements in 2.9 https://pydantic.dev/articles/pydantic-v2-9-release#performance-improvements
big day (not really): I decided to switch to double quotes. historically it's easier for me because I don't have to press shift but everyone expects them now plus I do a lot of work in other languages so I had to change around my keyboard hotkeys so that the apostrophe sends a double quote and the apostrophe requires shift. I thought about having two different keys but there's not much space left on my on-screen keyboard
Since Black came around, I just started typing whatever I’m used to since Black (now Ruff) will just reformat things on save anyway.
AFAIK, aside from process creation being slow compared to *nix systems (which is mostly an artifact of "win32 is big, really big, so loading the win32 API into each new process is slow"), it's no one thing, but lots of little things arising from different design decisions over the life of DOS/NT/modern Windows vs *nix (and Linux in particular). Hence even MS eventually deciding that Linux was often the best choice for headless use cases where having access to the full win32 API isn't useful.
With the real-time Linux patch set finally being mainline, it's even harder to see that performance gap ever closing.
where I see it the most is with IO, even with equivalent storage attachments. Windows is just so slow it's crazy
@nocturne swallow congrats on TOML support in tox!
I'm curious to hear others' opinions, is this a bug in the image definition or is that base directory often nonexistent and we should provide a fallback? https://github.com/tox-dev/platformdirs/issues/315
Replied on the issue - looks like it's a common enough problem to have some recommended fallback locations.
Does anyone remember where the PyPI sqlite database was from? Someone had a (semi?) regular sqlite file production (attached to github releases if memory serves?) with package and dependency information about all the PyPI packages. I still have an old copy (pypi.db, 166 MB, from March), but have forgotten where it came from.
Fantastic, thanks. Looks like it's a bit out of date, but that's it!
Me: I'll star it to make sure I can find it next time.
Also me: Oh, I've already starred it....
I do that all the time
@valid rover A new run right around the release of 3.13 would be nice, especially since classifiers are included! I've been pushing to get the 3.13 classifier in as many of my projects as possible this last week.
See also https://pyreadiness.org/3.13/
I wonder if automation for a monthly release would be beneficial?
probably, it's a case of how much I want to muck around w/ GitHub Actions debugging. But this shouldn't be too bad
FYI, I like having three levels of marks, with the only "x" being packages that declare 3.12 support but not 3.13. Some packages simply don't list classifiers at all.
And that page should include wheels, scipy does have 3.13 wheels, just no classifiers, for example.
why does SciPy have 3.13 wheels but no 3.13 classifier? it has 3.10-3.12 classifiers (and wheels)
Ah, I assumed it didn't have classifiers. Projects with per-python wheels tend to be more likely to be in the "classifiers are not needed" opinion. I guess they forgot.
ah, they have a 3.13 classifier, just not released yet
- Update pyproject.toml to include Python 3.13 in the classifiers metadata. Considering we already ship 3.13 wheels on PyPI built from pyproject.toml, I can't imagine there's a good reason to delay adding it now.
We're coming up on the initial publication date for the virtual environment layering project I've been working on, so it's that fun anticipatory mix of "yay, I finally get to share the full technical details of the project I've been alluding to for the past few months" and "ugh, I hope nobody points out something egregiously obvious that I've completely overlooked".
I guess if the latter happens, that is one of the intended benefits of working in the open... 🙂
"layering" ? is this something like having the base python of a virtualenv be another virtualenv and having its site be avaliable to the dependent virtualenv
Yup, and dealing with all the various reasons why attempting to do that can fail in the general case.
why layer them to begin with - my initial gut reaction is ouch, implied pins on the packages from the base venv or version hell
Did anyone package Python CLI app in winget?
New data available: https://github.com/sethmlarson/pypi-data/releases/tag/2024.10.08
mini-blog post (i dont have a blog lmao)?
anywho
the hash function for ints in python is defined as a modular reduction over the mersenne prime (a prime of the form 2^x - 1) 2^61 - 1 (on 64-bit builds)
i was bored, and decided to try intentionally making a hash table with such collisions
results after benching:
```py
>>> timeit.timeit("a[p + 1]", setup = "from __main__ import p, a")
0.08406147197820246
>>> timeit.timeit("b[p + 1]", setup = "from __main__ import p, b")
0.08369432506151497
```
```py
>>> timeit.timeit("b = {x * p + 1: 4 for x in range(1, 1000)}", setup = "from __main__ import p", number = 10)
0.1114881259854883
>>> timeit.timeit("a = {x * p + x: 4 for x in range(1, 1000)}", setup = "from __main__ import p", number = 10)
0.002080571954138577
```
initialization time takes 50x longer for the one intentionally made for collisions, yet they both take about the same time to lookup
tested and replicated on different machines
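A quick sanity check of why those keys collide, as a sketch: for non-negative ints, hash(n) is n modulo sys.hash_info.modulus, so every x * p + 1 hashes to 1 while x * p + x hashes to x. The variable names are just for illustration:

```python
import sys

# 2**61 - 1 on typical 64-bit CPython builds.
p = sys.hash_info.modulus

colliding_keys = [x * p + 1 for x in range(1, 1000)]
spread_keys = [x * p + x for x in range(1, 1000)]

# Every colliding key reduces to the same hash; the spread keys do not.
colliding_hashes = {hash(k) for k in colliding_keys}
spread_hashes = {hash(k) for k in spread_keys}

print(len(colliding_hashes), len(spread_hashes))
```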
It was a long time ago, but I have seen a real-world case where function name hash collisions observably slowed down a Python application
https://www.mozilla.org/en-US/security/advisories/mfsa2024-51/
Firefox CVE with critical impact
HTTPX has a public call next week about the plan for 1.0 https://github.com/encode/httpx/discussions/3344
The apparently fast lookup is because you're grabbing the first entry out of the hash bucket even in the "many collisions" case. Try adding a c variant that builds the collision-prone variant in the opposite order so you're grabbing the last value in the hash bucket:
>>> from timeit import timeit
>>> p = 2**61 - 1
>>> timeit("a = {x * p + x: 4 for x in range(1, 1000)}", setup = "from __main__ import p", number = 10)
0.000741713999559579
>>> timeit("b = {x * p + 1: 4 for x in range(1, 1000)}", setup = "from __main__ import p", number = 10)
0.04867755600025703
>>> timeit("c = {x * p + 1: 4 for x in reversed(range(1, 1000))}", setup = "from __main__ import p", number = 10)
0.048195532000136154
>>> a = {x * p + x: 4 for x in range(1, 1000)}
>>> b = {x * p + 1: 4 for x in range(1, 1000)}
>>> c = {x * p + 1: 4 for x in reversed(range(1, 1000))}
>>> timeit("a[p + 1]", setup = "from __main__ import p, a")
0.04043049199935922
>>> timeit("b[p + 1]", setup = "from __main__ import p, b")
0.039784960000361025
>>> timeit("c[p + 1]", setup = "from __main__ import p, c")
9.078608001999783
There's a reason we eventually accepted the need for container implementations to use a cryptographically secure hash function: https://peps.python.org/pep-0456/
right, i didnt realize that until 2 days after
also, technically, siphash isnt cryptographically secure
hm, wikipedia says its a non-cryptographic hash function while the github says it is
AFAICS, it's just not a hash function. It's a PRF.
a hash can be a PRF
iirc you can make a hash function from a PRF
Maybe, but then it's not the same function anymore. 🙂
ah right, a PRF has to be keyed
The PEP and the wiki article are using "cryptographically secure" in two slightly different senses. A general purpose cryptographic hash function has the extra characteristic that it offers robust collision resistance: if two things hash to the same result, you can be confident that they had the same input. Siphash doesn't give you that level of collision resistance, so there are lots of cryptographic use cases where it isn't suitable (e.g. as a password storage hash - if your password storage is collision prone, then people can get in not only with your actual password, but also with any other password that happens to collide with it). Edit: turns out it's the need for a key and the lack of work ratio tuning parameters that make Siphash not great for password storage. TIL.
Siphash is cryptographically secure in the narrower sense that even given a bunch of known inputs and their hashes, you still can't predict the hash for a novel input without knowing the hash key currently in use, and you don't have any practical way to learn that hash key. That wasn't true for the old hashing algorithm - even after hash randomisation was added, you could still theoretically examine the runtime behaviour to infer the hash secrets, and then use those to craft inputs that were highly likely to provoke a high rate of hash collisions, and hence induce quadratic behaviour in algorithms using sets and dictionaries (thus defeating the purpose of adding hash randomisation in the first place). As far as we're aware nobody ever actually created a realistic attack on the original hash randomisation algorithm, but switching to Siphash meant even that theoretical risk went away.
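To make the "hash key" part concrete, here's a small sketch showing that str hashes depend on a per-process key, which PYTHONHASHSEED pins to a known value. The helper name is made up, and the exact hash values will vary by build:

```python
import os
import subprocess
import sys


def str_hash_with_seed(text: str, seed: int) -> int:
    # Run a fresh interpreter with a fixed hash seed and report hash(text).
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    proc = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(proc.stdout)


same_a = str_hash_with_seed("abc", 1)
same_b = str_hash_with_seed("abc", 1)
other = str_hash_with_seed("abc", 2)
print(sys.hash_info.algorithm, same_a == same_b, same_a != other)
```

Without PYTHONHASHSEED set, the key is picked randomly at startup, which is what an attacker would have to recover.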
wow is it scary to release based off of an old commit without a lock file
Is there anything similar to a strict xfail for GA? "Mark this job as passed if it fails, fail if it succeeds"
(I want to ignore free-threaded builds of Python for as long as package installation fails)
Are you using a shell step for the installation? If so, you could invert the command status with !.
Does anyone know how to import multipart.py from multipart/__init__.py (replacing the module with the package)? I've now done this two ways: once with a custom .pth and loader, but it turns out Google Colab forces you to run !pip install python-multipart in the notebook, which means custom .pth files aren't run, as the process is started before the install happens. The second attempt I'm basically loading multipart.py by path, but that's not going to work if it's not loading from a file system, like in a zipapp, so if there's a better way to get this, I'd be open for suggestions! Trying to fix a long standing package name collision and calm a heated fight between multipart (now in the CPython docs as a CGI replacement) and python-multipart (fairly popular). https://github.com/Kludex/python-multipart/pull/168
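For anyone following along, the "load by path" approach looks roughly like this with importlib. This is a generic sketch, not the code from the PR; the module name and temp file are placeholders, and it has the same limitation mentioned above of needing a real file on disk:

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(name: str, path: Path):
    # Build a spec that points directly at the file, bypassing sys.path.
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before executing, like import does
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file standing in for the sibling multipart.py.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "legacy_impl.py"
    source.write_text("VALUE = 42\n", encoding="utf-8")
    legacy = load_module_from_path("_legacy_impl_demo", source)

print(legacy.VALUE)
```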
this has also been done with conda before https://github.com/asmeurer/sudoku
Direct link for anyone using a browser that declines to load the modern Twitter/X web interface (such as strict privacy mode in Firefox): https://github.com/konstin/sudoku-in-python-packaging
is there any practical way to prevent certain packages from being part of a dependency tree
i recently ran into situations where i want to stop others early from trying to use specific packages as a first starting point for a solution (for example, types-confluent-python is a 3rd party package and gets some types wrong for confluent-kafka)
im not aware of an easy way to starve them out outside of using constraints that make any version of it forbidden
I also needed that recently, and was almost certain that there must be a way, but could not find anything.
im not aware of a direct way to enforce conflicts with other packages
@junior narwhal FYI, in case you have opinions on argparse and un-"soft-deprecating" optparse and getopt: https://discuss.python.org/t/getopt-and-optparse-vs-argparse/69618
both argparse and optparse should be marked as broken by design, dont use in new code
I tend to agree, we should have something better in stdlib and just deprecate all three of them
please comment on the thread 🙂
hmmm, as i don't have a alternative to show for, i don't feel like that's going to help
having a idea for one, and showing one are different pairs of shoes
also, having 4 parsers in stdlib sounds kinda insane
also, having 3 parsers in stdlib sounds kinda insane
I use a constraint like package==9999.0.1.2.3.4.5.6.7.8.9 for this. I await the day this strategy fails 🙂
this is brilliant, I hate it so much 🤣
unfortunately its not usable for pyproject.toml -
You should also use an epoch I think
not directly, but you could write a test that resolves with the constraint
Is something powerful yet less constrained possible? Command line semantics are purely convention-driven. You’ll always find things that are either explicitly valid or work by accident in some implementation that don’t work in another.
I think it’s insane to have CLIs that allow things like having multiple arguments to an option (like cmd -f 1 2 3 evaluating this as [1,2,3] being passed to -f) or having option values that start with dashes being valid anywhere else than in cmd --long-opt-with-equals=-v-a-l- or cmd -- -v-a-l-. But apparently that’s what some people in that thread want?
I want both of these things. 🙂
If that strategy ever fails you, package===this-is-not-happening-stop-trying-to-make-it-happen may be worth a try (courtesy of the 3% of PyPI that did not comply with PEP 440 when it was written: https://packaging.python.org/en/latest/specifications/version-specifiers/#arbitrary-equality). Tools may complain about that one, though.
What’s wrong with cmd -f 1,2,3 or cmd -f 1 -f 2 -f 3, and the given ways of specifying values starting with dashes? Why do you want ambiguity?
Maybe I misinterpreted what you said. I like the support for the -f 1 2 3 syntax when -f takes a fixed number of arguments. In this case there is no ambiguity. OTOH, an option taking a variable number of arguments is madness.
As for option values starting with dashes, well... I want -f "$x" to be interpreted in the same way no matter what the value of x is. (Again, I might be misunderstanding your position here.)
I think it's pretty wild that we're trying to bring back one standard library module that is explicitly for niche purposes and the other just to avoid a deprecation/behavior change/bug fix cycle in the current most popular module
OTOH, an option taking a variable number of arguments is madness.
yes, that’s what I meant, sorry for being unclear
As for option values starting with dashes, well... I want -f "$x" to be interpreted in the same way no matter what the value of x is. (Again, I might be misunderstanding your position here.)
yeah, with a fixed number of arguments, this is well defined too of course. I’d still argue that if -f/--file’s single argument can possibly start with -, you should specify it as "--file=$x".
I think argparse can handle all these cases, so I agree with you and @ofek: If there are any real issues in argparse (i.e. issues that don’t involve ambiguity), they should be fixed instead of bringing back the old C-like stuff
I’d still argue that if -f/--file’s single argument can possibly start with -, you should specify it as "--file=$x".
You should, but the reason you should is that --file "$x" may not be interpreted correctly by argparse. And that's the issue here.
If there are any real issues in argparse (i.e. issues that don’t involve ambiguity), they should be fixed instead of bringing back the old C-like stuff
Mind you, I haven't said that. 🙂 Honestly, I've no idea what the best strategy for dealing with argparse is. I haven't personally looked at its internals, but from other people's accounts it seems pretty fundamentally broken.
As many issues I have with argparse, IMO this is not a wrong thing to do. It is also far from unique to argparse.
If it is to be rescued, it seems like it would have to be significantly reworked, with some features removed.
What's not a wrong thing to do?
There are two choices if $x starts with a dash, either treat it as an option or a parameter to the flag. Neither is wrong, and there are (non-Python) popular tools that do either.
I'd have to disagree.
You can, but that’s just how existing tools are.
What I’m saying is it’s probably not a good idea to argue on this particular thing against argparse
I'm of the view that continuing to describe the module that backs click and Typer as deprecated is a fundamentally bad idea (that would be optparse, not argparse, for the reasons given in the click docs).
I do think folks limited to just standard library modules should keep preferring argparse, though.
Even if --file has nargs=1? Then that’s an argparse bug that should be fixed.
Yes:
$ cat test.py
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--file')
args = parser.parse_args()
print(f'{args.file=}')
$ python test.py --file --foo
usage: test.py [-h] [--file FILE]
test.py: error: argument --file: expected one argument
And I do think it's a bug. It's just that from other people's statements I gather that the bug is pretty fundamental to how argparse works, so fixing it is not going to be easy.
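For comparison, a small sketch of the same parser fed both spellings: the space-separated form is rejected, while the --file=VALUE form suggested earlier gets through. parse_or_none is just a helper for this demo:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="test.py")
    parser.add_argument("--file")
    return parser


def parse_or_none(argv):
    # argparse reports errors by printing usage and raising SystemExit.
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        return None


separate = parse_or_none(["--file", "--foo"])  # value looks like an option
joined = parse_or_none(["--file=--foo"])       # value is attached explicitly
print(separate, joined)
```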
Typer depends on Click only, not optparse, and Click is getting its own parser soon for various reasons https://github.com/pallets/click/issues/2205
Depending on click still transitively depends on optparse, at least for now. And if anyone else wanted to write their own click equivalent, optparse would still be a better starting point than argparse (for the same reasons). Projects that are happily using optparse will likely gain few practical benefits from migrating to argparse instead (and may cause regressions in their command line if they try to do so).
pip is still on optparse for similar reasons
I got nerd sniped apparently so if I have time at the end of the quarter of working on pip I'll probably work on vendoring click and revamping the CLI. unlikely because that sounds like a lot of work but we'll see
is it worth it? I mean, optparse works fine and there is a lot of code around it in pip already. having to vendor yet another package and rewrite something that works just fine sounds like a lot of unnecessary work. Also, click API does not match the class-based way the pip CLI works now, so that would be another level of refactors. And personally, the decorator-based way of click is just ugly when you have to use more than 2-3 decorators
if it means using something that's maintained then I think it's worth it, of course only marginally impactful as you mentioned unless it's actually removed from the standard library (hopefully will be eventually)
I'd say that pip pretty much locks optparse in stdlib
there's no question of removing optparse. it's "soft deprecated" which means:
A soft deprecated API should not be used in new code, but it is safe for already existing code to use it. The API remains documented and tested, but will not be enhanced further.
Soft deprecation, unlike normal deprecation, does not plan on removing the API and will not emit warnings.
https://docs.python.org/3/glossary.html#term-soft-deprecated
yes I agree with this which is why, if we were to push a magic button, pip would move away from it
Quick question, if a library aims to support all non-EOL versions of Python, is Programming Language :: Python :: 3 fine as a classifier or is there a reason to actually list all of the individual version classifiers?
I'd say it's fine
some package managers (like Poetry) might add classifiers for you during package build, based on your requires-python setting
add to TOML or distribution metadata?
yes
I do set [project] requires-python but I use flit and it does not generate classifiers AFAIK. Managing them manually is annoying, I always forget to update those when updating the test matrix.
metadata
does that happen even when that field is not marked as dynamic?
it's fine to just do Programming Language :: Python :: 3. Classifiers are just some information for end users, apps use other metadata
yes. it has been like this long before Poetry supported PEP 621
sorry, yes true, I'm asking I suppose when the user does define PEP 621 metadata if that also happens or just the custom poetry config
you got me there, I would have to check in the code, but IIRC it always happens, no matter PEP 621 or legacy poetry config
oh interesting, if that's the case then it definitely violates the spec
that might actually be something we should tweak before release if that's true 🤔
ok, just checked, when classifiers are defined in [project].classifiers they are not enriched
so all is fine
I use the classifiers to record "this is what I test in CI", but I don't think there's any consumer that genuinely pays much attention to them.
Yeah, I don't know if there's much point to Python version classifiers.
I personally don’t like Click at all, and I know a few people who don’t either. Maybe reconsider switching pip to it, some people might not want to merge a PR like that.
Not saying you shouldn’t do it, just to ask first if it would be welcome
yeah definitely I will (if I actually have time for this in the end)
I think click was considered in that discussion
internally click is nice, but the API is not so great
That explains why it rankles with me. I'm an API guy. When I try using click I immediately start writing abstractions to work around it.
And then I ask myself why I use it at all and ditch it. Every time.
that's so interesting, to me Click's API (especially the decorators) is beautiful and allows for a very nice separation of CLI configuration vs business logic
that style is actually what most newer CLIs written in Rust use via the Clap library's derive feature https://ofek.dev/words/guides/2022-11-19-writing-a-cli-in-rust/
Decorators are ok, but not when you have 300+ lines of code of only decorators
I guess I think of click & Typer as Python-based DSLs for describing CLIs, that then call regular Python code to do the heavy lifting. I definitely wouldn't call https://github.com/lmstudio-ai/venvstacks/blob/main/src/venvstacks/cli.py pretty, but it lets me describe exactly what I want in a way that keeps options consistent across the different subcommands (with a bit of help from the test suite to ensure names don't get out of sync)
out of curiosity, what made you go with typer over click? I tried it a few times but never became fond
also
meanwhile Cleo:
but to be fair, click exposes their whole API on top level, cleo doesn't
While I use click too, I do think decorator-based setup is a performance problem. With lazy imports being rejected, it means when your app starts up, all those decorators have to execute, just adding to Python-based application startup woes. In a $job-2 we had a Python CLI that everybody used many times a day. Startup performance was the number 1 complaint and a lot of engineering went into old school lazy imports. It helped, but not enough to forestall the inevitable of the CLI being completely rewritten in Rust.
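A stdlib-only sketch of what "old school lazy imports" for a CLI can look like: a table of command names pointing at "module:callable" strings, resolved only when the command is actually invoked. The command table here is hypothetical and uses stdlib modules as stand-ins for real subcommand modules:

```python
import importlib

# Hypothetical command table: command name -> "module:callable".
COMMANDS = {
    "dumps": "json:dumps",
    "sqrt": "math:sqrt",
}


def resolve(name: str):
    # Import the implementing module only when the command is requested.
    module_name, _, attr = COMMANDS[name].partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


print(resolve("sqrt")(9.0))
print(resolve("dumps")([1, 2]))
```

Startup then only pays for building the table, not for importing every subcommand's dependencies.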
Mainly just that I like the annotation based syntax. For a different application I might care more about the startup time hit, but the layer archive build times are routinely measured in minutes, so a couple of hundred milliseconds is a rounding error.
Now I'm wondering if Mark's syntactic macros PEP could help with CLI app startup performance by moving interface definition work to compile time...
I like the syntax too! 😄
I love clap-derive.
Being able to define a fully typed data structure together with the CLI that produces it is wonderful.
Typer breaks the illusion of doing that enough that it isn't on the same level for me, and click doesn't have its validation integrated with typing at all.
at work I just wrote a CLI using Click that is entirely comprised of lazily loaded commands and when I have more time Hatch is going to be migrated to that new way, it's very nice and will bring some other cool features...
you might be interested in a library I wrote at work for the aforementioned CLI https://ofek.dev/msgspec-click/usage/
Using LazyGroup? TIL!
yes
Nice
I also have it set up for extensions like git supports
Oh, interesting! The lack of clear ways to define sets of options that should be available across multiple commands is one of the things that doesn't endear Typer to me, which makes your approach appealing.
msgspec could be useful in venvstacks in general... (the metadata handling at the moment is seriously clunky)
Like OptionGroup/ArgumentGroup from optparse/argparse?
Yeah, exactly that (I may have missed ways of doing it, though - weirdly enough, venvstacks is the first CLI I've ever written with multiple subcommands, so I've never needed to set up option groups before)
Oh does typer support **kwargs: Unpack[SomeTypedDict]
It should if it doesn't
Typer & typing.Unpack
https://github.com/Textualize/rich/pull/3546#issuecomment-2452281238 @junior narwhal yes please. Rich is a major part of pip's start-up time and I don't have the interest personally to reduce it from rich's end.
sigh Just came across this delightful snippet in some module activity guidelines for my Master's course: sudo pip3 install paho-mqtt (opinions may have been expressed on the class discussion forum for that activity...). It's not even a case where there's no corresponding Debian package (sudo apt-get install python3-paho-mqtt works fine).
practicality over purity :p
since when messing up your global site-packages is practical?
from a user's perspective, it accomplished their goal in the short term?
They will have to change it for new distros that prohibit changes to global site packages
that sounds like long term thinking
This is why they have feedback forms at the end of degree courses
Wait that's a thing?
At this point I thought it doesn’t even need to be new distros? Many tutees would have trouble running that command today.
Man, I do kinda regret setting my own CA for mTLS. It's cool, but an authentication proxy with GitHub OAuth would be easier to use :P
I forgot that I set this client certificate to last only a year. I don't remember very well how to generate a new one (although I think I have a script somewhere).
Yeah, at least in Arch, Debian, and Ubuntu this command luckily won't run.
I absolutely despise when installing some software on Windows wipes out half of your PATH
this time it's https://github.com/volta-cli/volta
which is probably red flag number 9000 that I should continue avoiding the JS ecosystem
Despite what Microsoft might say, Windows is second class in dev space. It might be king in casual space, but it's "meh" for dev on a good day
in some regards but not about PATH semantics as I just mentioned, modifying the environment variable literally only works as expected on Windows
other operating systems use a mix of shell configurations that are hard to debug, on Windows you have the registry which is amazing
This was for Ubuntu, and the OS does yell at you if you try it as stated in the activity notes. It probably ran without complaint when the course notes were written, though (and forcibly upgrading paho is unlikely to outright break anything).
Yeah, I've been handing out some 1s for this course. Some of the units feel like they were written 20-30 years ago (when I was still an undergrad), and barely touched in the intervening decades.
wow this is perhaps the hardest software epic fail I've seen in some time. I went looking and actually the tool is written in Rust so it's not a JS issue per se and I found the code here https://github.com/volta-cli/volta/blob/v2.0.1/src/command/setup.rs#L236-L266
they read the variable using the registry properly but then write the variable using a command in a subprocess that limits the length 🤦♂️
Some of the units feel like they were written 20-30 years ago and barely touched in the intervening decades.
Sounds like every CS university course I have seen...
Having experience developing on Windows: it is an excellent dev environment on its own. It’s a bad space for devs because the devs make it bad.
Dev experience for Python is good if you understand the OS differences because people like Steve Dower made it good. Also good for Rust because the Rust devs treat each platform mostly equally from the beginning. Some other languages… let’s say it’s easy for users to tell if devs are only pretending to understand cross-platform.
in my experience half the problems arise when people come from other OSes and assume their understanding of linux / macos carries over to windows, and then basically operate under that assumption
when you take a moment to understand how windows works, then it can be a fine dev environment, just like macos, or linux
The two that have most commonly tripped me up in my current project: cp1252 (latin-1-ish) as the default text encoding (but that caught some genuine non-UTF-8 locale bugs on other platforms too), and the lack of conventional symlink support (so venvs work differently)
part of it is engineers coming from other platforms, part of it is the shell experience being terrible (cmd is bad and PS is difficult), part of it is poor defaults like the carriage return line ending (and encoding like Alyssa just mentioned), part of it is the difficulty reproducing your ideal developer environment (I've looked and I simply don't know how people set up a Windows box automatically to their liking), etc. there's a lot
lack of good/well-known/seen-in-tutorials tools like ls is another big one
IIRC venvs work differently only partly due to symlink (you can have proper symlinks in dev mode nowadays) but how DLL is resolved against the exe. But yeah specifically how Windows needs exe shims is quite annoying
Yeah, there are extra differences in how CPython itself starts up. The differences are less opaque these days than they used to be (due to the sys.path initialisation being written as regular frozen Python code instead of a tangled mess of conditionally compiled C code), but they're still there.
Seems like a shared responsibility F after all (mostly Windows’ fault though). Sure, why shell out when there’s good Rust APIs for what you want to do? But the real headslapper is setx just arbitrarily truncating data, WTF. Either fail or succeed, never mangle data and proceed with that.
Counterpoint: Nobody should have to deal with locales ever. If some legacy POS software can’t do Unicode, pipe it through iconv before letting anything else touch its output, and file a bug with the authors to upgrade their bullshit to the 21st century, they’re two decades behind.
The bug was that I wasn't forcing the main program (edit: as in, the bit I was working on) to read the subprocess output as UTF-8, even though it was forcing the subprocess to produce UTF-8 output. (It failed on windows because the main process was running as cp1252)
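For anyone hitting the same thing, a minimal sketch of the fix described above, assuming the child is itself a Python process (so `PYTHONIOENCODING` applies) and that the parent is free to pick the decoding. The helper name `run_and_read_utf8` is made up for illustration:

```python
import os
import subprocess
import sys


def run_and_read_utf8(args):
    """Run a child process and decode its stdout as UTF-8, whatever the parent's locale."""
    # Ask a Python child to *emit* UTF-8 (this variable only affects Python children).
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    result = subprocess.run(
        args,
        env=env,
        capture_output=True,
        check=True,
        # Decode explicitly; text=True alone would use the locale encoding
        # (cp1252 on many Windows setups).
        encoding="utf-8",
    )
    return result.stdout


# The child prints non-ASCII text; escapes keep this snippet ASCII-only.
child_code = "print('caf\\u00e9 \\u2713')"
output = run_and_read_utf8([sys.executable, "-c", child_code])
```

The point is that both ends have to agree: forcing the child to write UTF-8 does nothing if the parent still decodes with its locale default.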
I see! Yeah, if that’s possible that sounds like a great idea!
In a similar vein, @mighty flower and I are happy to help out with the vendoring project @hexed briar https://github.com/pradyunsg/vendoring/issues/62#issuecomment-2452800632
I'm sure you're inundated with notifications, so you may have missed this 
I’m also looking to get more involved with being helpful around here
please give a 👍 if you use Material for MkDocs and would find this useful https://github.com/squidfunk/mkdocs-material/issues/7761
What happened here? Any ideas? https://pypistats.org/packages/zope
Someone's cache/mirror stopped being used probably
Don't mind me, just sitting here happy that our moderation rules shut down a random NFT bot that tried spamming here. 🙃
The number of times automod in other discords shut down scams for steam gift cards is disheartening when you realize that most accounts that post these have been hacked.
so relatable https://x.com/youyuxi/status/1867830650486886908
Sometimes I feel I live in a separate, parallel universe with people who vehemently complain about JS tooling.
Things have improved so much in the past few years and yet it sounds like nothing has ever worked.
Most people only recognise two states: Either it works, or it does not. If you don’t magically fix all the problems, you’ve done nothing and nothing changed.
Often those who shout loudest that something is broken, do nothing to help fix that
That world is so complex. Transpilers, bundlers, source maps and module systems most of which are in a transitory state that never seems to end. I understand why that continues to be confusing and sometimes just doesn't work.
Here, however, 2.7 is dead, there are no transpilers, bundlers, or source maps, or module system transition. So it's harder for me to see why people have problems here.
Concrete recent example: https://mastodon.social/@webology/113653520596020173
From a tooling point of view (given how Jeff eventually made the problem go away), I assume uv publish was trying to be helpful and automatically infer useful metadata based on the repo contents. From a user point of view, the end result was a mysteriously failed upload where the suggested diagnostic steps weren't actually helpful (since the incorrect metadata was being added implicitly rather than explicitly).
At any given point in time, there will probably be some transitional rake lying around for people to step on.
Heh, prompted by the #pip thread on browsing repotrends graphs, it's clear Ezio did a good job with metadata preservation on the CPython issue import: https://www.repotrends.com/python/cpython
Apparently folks also managed to keep ahead of the issue opening rate for the better part of two years (although the timing of that decline makes me wonder if the issue opening rate slowed down rather than the issue closing rate going up).
There's also a "New issues and pull requests" chart showing a steady increase of new issues per month, so I put it down to good triaging work
Bugs happen. I don’t think his “Python packaging is so frustrating” is an accurate summary of what happened there.
- the CLI reported the server response
- the server said what’s wrong
- what was wrong was something implicit that the packaging backend should handle transparently, so no need to invest into more in-depth user-friendly errors since that error message was pretty damn friendly for something that’s an internal bug
I personally went away from setuptools long ago since it’s so complex and carries around so many legacy modes of operating that I don’t think it can prune itself to simplicity in the foreseeable future.
So I think what happened is that he went with a very complex build backend without needing it. People with simple packages shouldn‘t default to setuptools, and people should feel empowered to hop into some chat room and ask for help!
That wasn't what happened. Jeff's an experienced Python dev, went to publish something the same way he had published several other things, and it didn't work (through no fault of his own).
That’s in no way incompatible with what I said
The fact we can help root cause what went wrong, doesn't eliminate that initial frustration of "Oh look, it broke, again".
Yes, read my last paragraph in the long message
The backend turned out not to be setuptools, so my initial guess was wrong (that got clarified later in the Mastodon thread)
Venting is also a different state from actively seeking help (if I hadn't already interacted with Jeff many times, I wouldn't have replied, since his post was fairly clearly just venting frustration rather than seeking assistance)
people should feel empowered to hop into some chat room and ask for help!
That’s the part of my message that addresses that
I think
- people should default to simple, well maintained build backends (there is a reason you guessed setuptools, even if you ended up guessing wrong)
- people should hop into a chat room when they feel frustration coming up instead of banging their head to the wall
Both of these can be encouraged
No worries on that front where Jeff is concerned (I don't think he's been an open source contributor longer than I have, but it's long enough that I don't know that for sure)
Even when you're experienced, "I've found a bug" isn't your first reaction - it's to try and figure out what you're doing differently from the last time you did whatever it is you're doing. (Experience actually reinforces that reaction, since you're usually right!)
Hm, if I specified standards-compliant metadata and got that error, I’d guess bug, but then again I probably spent more time learning about Python packaging than most.
Yeah, it's different in areas that we work on all the time (I'm far more likely to assume "bug" when doing something strange in venvstacks than I would when packaging a regular Python library).
Actually, why are you thinking that it’s not that setuptools bug?
I wouldn‘t guess that uv publish does anything other than invoking the build backend and uploading what it gets
Ah OK, here’s an open issue referencing both the (fixed) uv publish bug and the (still open) setuptools bug: https://github.com/astral-sh/uv/issues/9513
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
fixed my issue.
relatable
uv publish uploads the metadata as it got it from the build backend through .dist-info/METADATA, no inference involved
I don't think I'm using setuptools (I don't see it listed when I pip list) but I'm seeing similar issues.
The build backend doesn't show up with pip list; given that you implicitly get setuptools if you don't declare a [build-system] and setuptools is the only build backend that i'm aware of having this bug, this does sound like an implicit setuptools problem
The difference is likely because uv publish sets license-file in the formdata (https://github.com/astral-sh/uv/blob/8074917449ec53d238112a9b0722110d902f522e/crates/uv-publish/src/lib.rs#L699) while twine does not (yet) (https://github.com/pypa/twine/blob/75be078d2849d07c3406772cbbdb13c72139b341/twine/package.py#L199)
I thought about adding warnings when there are unknown fields in METADATA to catch such problems with the next metadata version, but this would probably lead to more problems when build backends are already using custom fields
The amount of complexity people are willing to add to their package build systems solely in order to derive their version numbers from their version control tag history continues to boggle my mind.
Yes, in-repo versions bring their own set of problems, but they're so much more manageable by comparison, since they only affect your release process, not everyone who ever tries to build your project from source (in who knows what kind of wacky environment).
(I ventured deep into this territory today, due to the whole "VCS refs can't be hashed, but most VCS tag based dynamic versioning schemes break if you try to build them from a source tree tarball instead of a published sdist" problem. My least awful workaround ended up being to do an interim fork of the project and switch it to static versioning with a local version identifier appended so it could be built from a source tree tarball)
#setuptools_scm-based projects like hatch-vcs have an option to derive the version number from the directory name in order to support e.g. GitHub tarballs
I would argue however that those should not be used and rather the source distribution on PyPI should be preferred since maintainers actually control that
for large repositories it can save a lot of bandwidth
Absolutely, this only comes up when you need to work around a release not existing yet (which means an unversioned commit archive rather than an sdist)
The directory trick only works for tag archives, not commit archives.
(if there was a tag, it would imply a release, and I wouldn't be perpetrating these shenanigans in the first place)
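To illustrate why the directory trick helps with tag archives but not commit archives, here is a toy sketch. It is not how setuptools_scm or hatch-vcs implement it; the helper `version_from_dirname`, the regex, and the `demo-` directory names are all assumptions made up for the example.

```python
import re
from typing import Optional

# Hypothetical, deliberately strict pattern: at least two dot-separated numeric parts.
VERSION_RE = re.compile(r"^v?(\d+(?:\.\d+)+)$")


def version_from_dirname(dirname: str, prefix: str) -> Optional[str]:
    """Strip a known prefix from the directory name and accept the rest only if it looks like a version."""
    if not dirname.startswith(prefix):
        return None
    match = VERSION_RE.match(dirname[len(prefix):])
    return match.group(1) if match else None


# A tag archive unpacks to a directory that carries the version.
tag_archive = version_from_dirname("demo-1.4.2", "demo-")

# A commit archive only carries a hash, so there is nothing to recover.
commit_archive = version_from_dirname(
    "demo-8074917449ec53d238112a9b0722110d902f522e", "demo-"
)
```

With a commit archive the lookup comes back empty, which is the situation that forces a workaround like a static version with a local identifier.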
I'm curious others' thoughts on this https://github.com/pypa/hatch/issues/1858
I wouldn't bother adding extra checksums, use the PEP 740 attestations instead for PyPI things and GitHub's attestations for other things
i put up the readme for an experiment I'll try to manifest in the first half of the new year - please take a look at https://github.com/cogs-of-testing/cot.config.ingest/pull/1/files and rip it to shreds 🙂
If it plays out it'll be the next gen configuration for pytest
But the example alone has quite some mess up potential
For a potentially easier path towards toml/yaml support in the future, you may want to poke around at @junior narwhal's msgspec-click: https://ofek.dev/msgspec-click/usage/#example
Using msgspec for the data model description also provides tentative answers for some of your other open questions.
the maintainer of msgspec still hasn't released a version supporting Python 3.13, people keep asking in issues and I keep trying to follow-up with an email thread I had with the person to no avail
they had personal issues IRL to handle but now there is just silence unfortunately
if there is no release and still no correspondence in a few months I'm going to try adopting/claiming the package, I forget the process but whatever our process is for that
Oh, yeah, I did an sdist-installable branch to help work around that: https://github.com/jcrist/msgspec/issues/777#issuecomment-2548219213 (I'm sufficiently paranoid that I wasn't willing to trust the alternative packages published by people I don't already know. We're in prime supply chain attack territory right now, alas)
https://peps.python.org/pep-0541/#how-to-request-a-name-transfer
There's still a big backlog but it's being actively worked on, so I recommend getting the request in the queue
This doesn't really feel appropriate — it's not an abandoned project. There was activity last month, it's the holiday season, and Jim has a family.
it wouldn't happen right now obviously, but I said in a few more months of no correspondence
I think it's entirely reasonable for let's say 6 months to go by before such a request, I don't think years should be the criteria although admittedly I haven't read the link that Hugo posted above yet
(also to be extra clear, my emails with Jim have been offering to become a maintainer and assist with releases and whatever else but no responses)
The PEP specifies >12 months and other criteria
that's for releases specifically which will meet the criteria next month
I see
I've been in touch with him recently, he's not disappeared — just busy with other things.
it's a shame Pydantic is so slow that it's unusable for CLIs and serverless scenarios currently, I'm really bummed about that
UV can probably perform the resolution of dependencies for a medium-sized project faster than import pydantic plus model definitions
That one certainly is an inspiration
Having configuration files and environment overrides properly integrated is going to be a challenge
In particular funky stuff like addopts and option overrides
how is it funny?
Zanie's day job is for Python tooling written in Rust.
and because tools are often rewritten in Rust to make them faster
I know, I was just curious to hear why they consider it to be funny
(I assumed that to mean that it’s written in a “funny” way)
(Poor reading comprehension :)
Yeah it's just ironic
Overall not so surprising though, I write a lot less Python these days
Does anyone know of a tool or script to fetch say like the timings of the last 20 GHA workflow runs? I'm trying to compare my CI times on main to when I add dev drives to it.
This feels like a common problem, and unfortunately github's own usage metrics simply aren't useful in my situation.
You might be able to find something in dask/distributed, I know there were charts for flaky and slow tests
Nothing that a custom script can't deal with. Now I just need to scope this to the windows jobs.
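For reference, a rough sketch of what such a script could look like, not the script actually written here. It assumes the GitHub REST endpoints for listing workflow runs and the jobs of a run, a token with read access, and job names that contain "windows"; the function names are made up.

```python
import json
import urllib.request
from datetime import datetime


def parse_ts(value):
    # GitHub timestamps look like "2025-01-02T03:04:05Z"; normalise "Z" for older Pythons.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def job_durations(jobs_payload, name_contains="windows"):
    """Return {job name: seconds} for finished jobs whose name contains the filter."""
    durations = {}
    for job in jobs_payload.get("jobs", []):
        if name_contains.lower() not in job["name"].lower():
            continue
        if not job.get("started_at") or not job.get("completed_at"):
            continue  # still queued or running
        delta = parse_ts(job["completed_at"]) - parse_ts(job["started_at"])
        durations[job["name"]] = delta.total_seconds()
    return durations


def fetch_json(url, token):
    headers = {"Accept": "application/vnd.github+json", "Authorization": f"Bearer {token}"}
    with urllib.request.urlopen(urllib.request.Request(url, headers=headers)) as response:
        return json.load(response)


def recent_windows_timings(owner, repo, token, count=20):
    """Fetch the last `count` runs on main and time their Windows jobs (needs network access)."""
    base = f"https://api.github.com/repos/{owner}/{repo}/actions"
    runs = fetch_json(f"{base}/runs?branch=main&per_page={count}", token)
    return [
        job_durations(fetch_json(f"{base}/runs/{run['id']}/jobs", token))
        for run in runs["workflow_runs"]
    ]


# Offline demo on a hand-written payload shaped like the jobs response.
sample = {
    "jobs": [
        {"name": "test (windows-latest, 3.12)",
         "started_at": "2025-01-02T03:00:00Z", "completed_at": "2025-01-02T03:07:30Z"},
        {"name": "test (ubuntu-latest, 3.12)",
         "started_at": "2025-01-02T03:00:00Z", "completed_at": "2025-01-02T03:02:00Z"},
    ]
}
sample_durations = job_durations(sample)
```

Keeping the duration maths separate from the fetching makes it easy to rerun the comparison on saved JSON, e.g. before and after adding dev drives.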
I probably should've used the graphQL API for this...
This is coming along nicely though. I've wanted something like this for a while.
Ah, the necessary data doesn't seem to be exposed in the GraphQL API. Also wow their new explorer is not something I can use easily.
Looks like you're already on the right path! Other things include GitHub's gh CLI and https://github.com/nedbat/watchgha
I was going to contribute to Refined GitHub but my node/npm toolchain is too old. Ugh. I'll get around to it later.
does anyone know where they announce sponsorships? I can't find any blog post or data anywhere https://x.com/Ofekmeister/status/1872434086368936220
@jaunty marlin 😍 https://github.com/jcrist/msgspec/releases/tag/0.19.0
release is out now ^
https://datatracker.ietf.org/doc/html/rfc8962
If this work is standardized, IANA shall set up a registry for criminal networks and addresses. If the IANA does not comply with these orders, the Protocol Police shall go and cry to ICANN before becoming lost in its bureaucracy.
hahahaha
How long is the wait?
I contacted my localhost but it's been months and I still haven't heard back 
I'll send them a message right now, thanks!!
"All your networks are belong to us". Now that kicked off a nostalgia trip 🙂
what does one do when they have huge backlogs on existing projects? start another one
Depends on the nature of the mental health emergency
I suffered at least 4 variants and there's a Loki involved
but you know what they say, github stars are a really good measure of how many stars a repo has
happy new years folks! ✨
i hear they're overloaded, they have a backup at 127.0.0.1 and ::1 though, try them
GitHub has been making a ton of UI changes/feature rollouts... I just got the new issues UI, and... hmm
The new features seem nice, but the UI is harder to read at first glance. Some information is straight up missing, while the rest is more muted.
Time to get involved on the GitHub preview feedback discussions, I guess
Hopefully they at least tweak the overly aggressive label truncation in the issue timeline. They'll probably ignore my other feedback, but /shrug.
You got a reply at least! I also haven't been a fan of the new UI, but I've decided it's too much to get involved in UI/UX discussions unless it completely breaks my workflow (which an early version of this UI preview did and thankfully they rolled that specific change back)
the only change that I disagree with is on PRs there is no longer a button to run CI for new contributors, it's really inconvenient
yeah, it's annoying to switch back to the old one
is it no longer an option overall or does the new UI miss this option only?
the latter, you have to go to the actions list page now and click the button for each job workflow
oh wow, TikTok will no longer be available in the US starting on Sunday. I didn't think the ban would actually happen but this is good news https://www.reuters.com/technology/tiktok-preparing-us-shut-off-sunday-information-reports-2025-01-15/
Not very optimistic tbh. I hear reports people are already flocking to xiaohongshu which is arguably worse.
Feels like a Patriot Act 2.0 tbh
Re: GH Action approvals - it's their #1 known issue, and "is coming" per https://github.com/orgs/community/discussions/143787#:~:text=Approving GitHub Actions workflows
“we in America like our disinformation like we like our corn: home-grown”
seems reasonable to me, for example we wouldn't let a broadcasted television station be owned by an adversary
Except you only ban one station while like 10 others still broadcast just fine
as far as I know the ban would apply to all large social media companies
I think if there is now an awareness in the zeitgeist that social media networks can negatively impact a populace then it makes sense to reduce the impact of networks that have the potential to deliberately propagandize
in any case, many people that are currently against the ban are going to flip in the coming weeks because the new president and allies like Elon Musk are also against the ban and so they will take the opposite opinion
eh, banning foreign media is generally only done by repressive regimes tbh
When the ACLU, EFF, and Knight Foundation are all against a thing, that feels like a pretty good sign
AFAIK RT America was never attempted to be banned (it ended up shutting down because a bunch of private companies chose to no longer air it), and that was basically just wholly a propaganda vehicle for Russia
we aren't banning foreign media however, we are banning foreign social media specifically from entities that are considered hostile to us
that's quite different
I suppose the crux of the issue is whether or not one thinks social media algorithms have the ability to negatively impact populations. if one doesn't think that is possible or the risk negligible, then I would understand why the ban would seem like a bad idea
The first amendment doesn't really have a "well if someone could do something bad" loophole
Also we're apparently perfectly fine letting foreign (and domestic) entities use social media to negatively impact the population, as long as they're headquartered here
that only applies to citizens
I'm curious your opinion on the "crux of the issue" I mentioned above
The US Courts have already conceded that the tiktok ban implicates the first amendment FWIW
so the US courts do not agree that the first amendment does not apply in this case, the lower courts have just stated that they think the ban is constitutional under the first amendment
See the opinion in TikTok Inc v Garland: https://casetext.com/case/tiktok-inc-v-garland
Having concluded that TikTok has standing, we need not separately analyze whether the User Petitioners have standing to raise the same claims. See Carpenters Indus. Council v. Zinke, 854 F.3d 1, 9 (D.C. Cir. 2017) (explaining that "if constitutional standing can be shown for at least one plaintiff, we need not consider the standing of the other plaintiffs to raise that claim" (cleaned up)).
Tiktok in the US is distributed by Tiktok LLC, which is a US company registered in Delaware, and thus is entitled to the full protections granted to any other US company, it happens to be a US company owned by (through a number of other companies) a China based company.
Or more plainly, (again from the courts opinion in TikTok Inc v Garland):
We conclude the Act implicates the First Amendment and is subject to heightened scrutiny.
Sure, social media can be used to negatively impact a population. But "well maybe someone could do something bad with it" doesn't clear the bar for the government to curtail constitutional rights.
The government thus far has provided no evidence that China has done anything bad, and they've admitted that their banning is entirely based on the fact that they could.
those were my words but there are already instances of deliberate modifications to what is shown to people on the platform, it's not people theory crafting per se
The government has made no claims that the PRC has influenced or is influencing what is shown to people on their platform, and they've admitted that the ban is based on the fact they could.
(to be clear, I'm ignoring the grandstanding done by congresspeople during the congressional hearing, because congressional hearings for things like this are like 70% just ways for congresspeople to get sound bites into the media and have no requirement that their questioning is based on fact in any way)
The government's arguments in court are much more representative (IMO) of what they are actually able to produce as factual evidence
I understand what you're saying, but I am saying that it is happening and was just reading about it earlier
So both of those things suffer from the same problem, they observe something, then they leap to a conclusion about why that observed thing happened. They admit as much in their conclusion of the Timebomb document:
Given the research above, we assess a strong possibility that content on TikTok is either amplified or suppressed based on its alignment with the interests of the Chinese Government.
That's fine for a study, but a proper free and just nation should require something more than "well there's a possibility that ...." before starting to strip away constitutional rights. They have (or at least should have) the requirement to prove both that a bad thing is happening and that it is having the impact that they believe.
The lawfare blog has an interesting take on this too
in which they more or less start out conceding the argument that tiktok is being manipulated, but then link to studies showcasing that people generally aren't that swayed by social media algorithms, and if their feed keeps serving them content that doesn't already align with their beliefs, they tend to just.... stop using that app.
https://www.science.org/doi/10.1126/science.abp9364 - Where some users were given a chronological feed and some were given an algorithmic one on facebook, and the tl;dr was that as they were exposed to a more diverse range of opinions, rather than just the stuff they already believed, they ended up using facebook less, and there was no noticeable impact on polarization or political knowledge between the two groups.
https://tnsr.org/2024/03/from-panic-to-policy-the-limits-of-foreign-propaganda-and-the-foundations-of-an-effective-response/ - a long article, but links to a lot of other studies, and the general thrust being that "propaganda is generally most effective at providing people with rationalizations to ideas that they already had, rather than at giving them new ideas on their own"
Supreme Court ruling from today https://www.supremecourt.gov/opinions/24pdf/24-656_ca7d.pdf
what's your take?
Next step: Elon Musk unveils new short video function on X with “Free Speech”*
*as long as you’re not critical of Elon or anyone helping him to make the US into his Corporatocracy
… or at least that would be the case if there were any engineers left at X.
tiktok was used to attack romanian elections
A platform’s moderation proved to be woefully inadequate to curb the platform’s use to coordinate and spread disinformation
How is that in any way unique to TikTok lol
I'm not familiar with the case, the quick summaries I've found seem to suggest that tiktok itself didn't attack Romanian elections, that third parties paid influencers to artificially boost and spread (mis)information.
which happens on every platform tbh, and would be a reasonable thing to try to prevent, but banning one random app isn't likely to do much
exactly what I just said 😄
The CIA has replaced democratically elected governments in significantly more direct ways, and the Republican party’s survival can only be attributed to the use of social media to erode the belief in truth itself.
I can’t believe the supreme court can see us down here from their mountainously high horses, pointing fingers towards a teenager platform while they decide what the law is based on what boomers on Facebook want it to be.
I think it's rather uncharitable to believe that the Supreme Court judgments are determined by "boomer" opinions on Facebook
I was being facetious. In full, educated honesty: The conservative judges on the supreme court make it have its most partisan composition in living memory (and probably quite a bit beyond). The source is of course not (directly) Facebook, but the cases they took up are clearly a coordinated mission to destroy decades-old liberal staple case laws and replace it with populist reactionary ones.
The fact that this is what it does with bipartisan support is tragic.
which rulings in particular do you disagree with?
or more precisely, which rulings do you think that the outcome was in direct opposition to a robust reading of the law e.g. a fully politically motivated ruling?
I disagree in principle with some of the rulings, for example Roe v Wade, but when I read the rationale it makes sense based on legalese
and at the end of the day that is the purpose of the Supreme Court, only (mostly) to enforce constitutionality. if modern times require an update to the Constitution then that's a different branch of government
Are you claiming that TikTok is a random app? Equivalent to literally any application on a mobile phone app store?
any app? No, but it's clearly just one out of a slew of social media apps that all suffer from this problem. It's not even the largest of these apps.
I agree that many social media platforms are problematic and I hope the EU makes good progress in limiting them robustly - or outright banning them if they do not co-operate.
As a reasonable demonstration to how silly attempting to ban a single app for a systemic issue is, people are flooding to Red Note now instead of TikTok, and on every axis that the law used to justify banning TikTok, RedNote is worse.
doesn't xiaohongshu translate to "Little Red Book" (not sure if it's actually a reference to Mao's)? I'm not sure where Red Note came from but I see that now in some circles
afaik it does translate to that, but when they translated the app into english they translated it to RedNote
Another topic... hmmmm I wonder how the logistics work for providing light lunch and refreshments for remote conference attendees.
$50 doordash coupon here we go
speaking of doordash, sometime in the past few months they changed their map service to something else and the routes are absolutely horrible now. the driver will be almost right next to my place and the path will still circle around half the town
mapbox apparently, not sure what they used before
They used to use google: https://careersatdoordash.com/blog/scaling-geospatial-innovation-with-a-location-simulator/
I suggest you just read the dissenting opinions, and compare the tone with the dissents of other supreme courts.
You'll find that they're comparatively scathing. I don't think our understanding of constitutional law compares to theirs, so I'll leave it to the pros to call out what their colleagues are doing.
Oh cool. My field. I’d guess they’re still using the Google Maps Direction API in some form since it’s mentioned on their Dasher Support page. There’s a reddit post from 3 months ago, but it’s just a commenter that’s claiming it’s still Mapbox. There’s a 2017 medium post from MapBox about it, but that careers page is more recent.
Check out that simulation architecture diagram (really cool stuff). That article is from 2020. I’ve heard a lot about what they’re trying to do with this, and my guess is that more recently they’ve started making automated, on-demand modifications to their directions, and some users are probably seeing issues with that if there are tricky traffic patterns or some other tuning/helpful data. FWIW I’m working on very similar stuff right now. So it’s probably them and not some vendor’s API, but idk for sure.
@analog oyster example, it's right outside
lol wow. Yea that's wild, but it really could be anything. Any recent traffic issues?
Something as simple as construction can cause this. Even if it's not directly in-route.
not sure as I really don't ever go out but if that's the cause then there must be permanent traffic 😂
Lol yea also in general that highway is probably more reliably higher travel speeds, so maybe something happened recently that’s causing it to favor that. I’m sure they’re baking a lot into it.
TIL: The walrus operator won't work in f-strings but sometimes also won't throw a ValueError because := is a valid format expression if followed by a digit. f"{x := 5}" will just print the current value of x with padding, or throw a NameError.
We should have a linter rule for this…
Or maybe the formatter shouldn’t allow you to have a space after the variable. If this is formatted as f"{x:=5}" it’d be a lot easier to notice
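A small demonstration of the gotcha, as a sketch: inside an f-string, the first top-level `:` starts the format spec, so `:=` is read as "format spec beginning with `=`" rather than as the walrus operator. The spaced variant is spelled with `format()` below purely to keep the parse unambiguous in the example.

```python
x = 42

# "{x:=5}" is the expression `x` with format spec "=5":
# align '=' (pad after the sign), width 5. Nothing is assigned.
padded = f"{x:=5}"

# The spaced form from the message corresponds to the spec "= 5":
# align '=', sign ' ', width 5.
spaced = format(x, "= 5")

# Parentheses turn it into a real assignment expression.
assigned = f"{(y := 5)}"
```

Here `x` is still 42 afterwards and only `y` was actually assigned. If `x` had not been defined, the first line would have raised a NameError, as noted above.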
@steel crane wow this sucks. I think your work is great 
I am aware that hacker news is particularly critical of seemingly anything Python related, but this is a new low.
I've no idea the context of that
It's the PyPI supports digital attestations HN thread.
@silk jungle thank you, i really appreciate that! and yeah, that thread was definitely a new low -- HN is always overly critical of processes that they don't understand/aren't legible to them, but that devolved right into conspiracies
it's really just 3-4 specific people on HN who seem to have a personal vendetta against anything packaging, and especially anything that involves the security stuff PyPI has done in the last 3-4 years
Is anyone else having issues with github notifications being automatically marked read despite not opening them? I think gmail is activating whatever email receipts github uses to automagically mark notifications as read.
Hm, I wonder if it's due to Google Advanced Security, and thus they're more rigorously checking my emails.
Maybe something similar to this?
https://berthub.eu/articles/posts/shifting-cyber-norms-microsoft-post/
Ugh, probably.
Ugh, it looks like unenrolling in Advanced Security resets the 2FA recovery codes. Lovely.
hey everyone, first time poster here. wasn't sure where to stick this question but wanted to do my best not to clutter other spaces.
I'm working on a monorepo of atomic python packages, there's roughly 60+ standalone atomic packages.
Does anyone have an idea of what the pypi rate limit is for creating new projects? For updating projects?
Struggling to find a resource with this info, greatly appreciate any info on the topic.
thank yall so much
what do you mean by atomic?
welcome! rate limits are mostly documented here: https://docs.pypi.org/api/#rate-limiting. there are some additional limits around creating new projects that we don't publish. if you hit these, you can try again within an hour and should be fine
also, these questions are totally appropriate for #pypi, feel free to ask there in the future
excellent, thank you so much!
small, discrete, typically a single component. our atomic components feature entry-points allowing them to plugin to our namespace pkg.
I anticipate publishing and maintaining many atomic components over time, but as of now, we just need to get away with publishing the 60 or so that currently exist.
I was able to get to around ~20 before running into a 429.
So I was just hoping for a rough ball park on the API rate limits to see what we're able to work with.
I did look over what di provided but I'm still not seeing a ballpark estimate.
I want to responsibly use pypi of course, but I also want to publish these packages haha, so if there's any specifics please let me know and I'll make sure we plan within the limits.
thank yall!
Whelp, and now my github notifications are totally broken. Lovely
Fortunately, I don't really mind using email as my GH inbox. Still frustrating, however.
quick update on my end, got everything published. My best guess despite lack of clarity in docs is that the rate limit for new projects is something like 20-25 new projects per hour or so.
After my 429 lifted, I published the remainder slowly to ensure I'm not riding the limit.
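For anyone scripting a similar first batch, here is a rough sketch of "publish slowly and wait out a 429". The `upload` callable and the `RateLimited` exception are hypothetical placeholders (not part of twine or any PyPI client), and the one-hour cooldown is only the ballpark mentioned above, not a documented limit.

```python
import time
from collections.abc import Callable, Iterable


class RateLimited(Exception):
    """Hypothetical error that `upload` raises when the server answers 429."""


def publish_all(
    packages: Iterable[str],
    upload: Callable[[str], None],
    *,
    pause: float = 5.0,
    cooldown: float = 3600.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Upload packages one at a time, waiting out rate limits."""
    published = []
    for name in packages:
        for attempt in range(max_retries + 1):
            try:
                upload(name)
            except RateLimited:
                if attempt == max_retries:
                    raise
                sleep(cooldown)  # wait for the limit to lift, then retry
            else:
                published.append(name)
                break
        sleep(pause)  # small gap between uploads to stay under the limit
    return published
```

Passing `sleep` as a parameter is just so the loop can be exercised without actually waiting an hour.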
Fortunately, I don't expect too many days where we need to publish 20+ packages now that we've completed our initial batch
yeah, it seems like something they modify as needed: https://github.com/pypi/warehouse/blob/b1424270acf634f0ecb898576f9617dcae3aa56d/warehouse/config.py#L547-L552
richard in the house!
yup, di mentioned above:
there are some additional limits around creating new projects that we don't publish. if you hit these, you can try again within an hour and should be fine
yeah, di scared me a bit because there's a big "use responsibly or be banned forever" message on there haha, but super thankful for the outstanding response time and extra insight.
As you said @dreamy hatch, took a break and tried again after a while and just published the remainder in between chores and back to business, barely felt the bump.
@silk jungle, this was precisely what I was looking for to get that peace of mind overall, thank you so much for circling back here and diving deeper. 🙏
note that the ratelimit you hit is just for creating new projects and not releasing updates to existing projects, you should be fine going forward!
right, that was my main worry, now we know the terrain better, smooth sailing!
@visual furnace we're interested in using keyring at work for a CLI but importing it is extremely slow, would you accept a PR that introduces some lazy imports?
What environment? On macOS for me, with Python >=3.12 it’s only 35ms (on 3.11, some compat code takes 50 extra ms to import)
tests on Windows show ~170
the new Gemini models are amazing. not only for regular use cases do they appear superior to ChatGPT but 2.0 Pro Experimental is actually the first model that I feel produces better code than Claude, and that is quite the feat
it's funny actually, the released emails from the early days of OpenAI spoke about Google being some kind of existential threat and that they would de facto produce AGI without competition. I thought last year that it was a bit hyperbolic but now I understand
Given the off-the-record praise I had heard from Googlers for their internal AI tooling a decade+ ago, it actually surprised me that Bard was initially so poor. The internal tools might not have been structured around the current conversational AI model, though, so maybe it just took them a while to bring the relevant expertise to bear on their public solution.
Possibly. Maybe first double-check that it hasn't already been tried. Part of the plugin-based architecture of keyring makes it ill-suited for lazy imports (backends are imported eagerly to determine if they're viable and at what priority for a given environment). It specifically breaks out of Mercurial's demand-import mechanism. I'd start with a modest draft PR to illustrate the concept, because I'll be disinclined to accept it if it's too imposing on the implementation. I'm happy to explore the problem space, however. I'm also interested in exploring systemic solutions to this problem. I'm finding it to be a repetitive problem that one implements a piece of code using standard idioms only to find that it needs manual delayed imports (in the standard library and beyond). I'd like to see something like Mercurial's demand import or perhaps something even more integrated with the core implementation. Ideally, it shouldn't be the responsibility of Python programmers to write idiosyncratic code to achieve better performance. Do feel free to ping me on GitHub (if you haven't already).
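For context on what "manual delayed imports" can look like with only the standard library, here is a sketch based on the `importlib.util.LazyLoader` recipe. It is not how keyring or Mercurial's demand import work, and `colorsys` is only a small stand-in module. It also would not help with backends that must be imported eagerly to check viability.

```python
import importlib.util
import sys
from types import ModuleType


def lazy_import(name: str) -> ModuleType:
    """Return a module whose body only executes on first attribute access."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the lazy module, does not run its body yet
    return module


colorsys = lazy_import("colorsys")  # cheap: the module body has not run
hue, _, _ = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # first access triggers the real import
```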
The botocore issue to add async support is 10 years old as of today! https://github.com/boto/botocore/issues/458#issuecomment-2652277303
Happy birthday ❤️
the roast....
I'm definitely following issues and requests on Mozilla's bug database that are over 20 years old
for any distro experts here, is there a modern equivalent to this? http://createrepo.baseurl.org
currently we vendor it but it lacks support for Python 3 and I would prefer not to use it at all rather than manually updating it
This is actually cool! https://fi-le.net/pypi/
Objective C bridge PyObjC
A little Python history: it was Python's ObjC bridge which intrigued the team at CNRI about Python in 1994. Roger Masse and myself were working on their software agents ("Knowbots" although Bob Kahn really did not like us using that term as a noun) project as part of the digital library initiative, and we were doing that in ObjC on NeXT machines. We went to the first workshop ostensibly to talk to Guido about the ObjC bridge and see how we could use and evolve that. We recognized that the two languages were a really great pair. After the workshop we were talking to folks at CNRI about all the cool things we could do and someone (Dave Ely IIRC) suggested we try to hire Guido. The timing was right and as they say, the rest is history.
Follow up: It was not long after Guido joined CNRI that he said we could just do it all in Python and forget about ObjC, which of course, is exactly what we did 😆
sqlglot's issue management blows my mind: https://github.com/tobymao/sqlglot/issues
you mean that they have almost nothing open?
Yeah, and if you go through their closed it's because they've addressed it by answering or fixing
Flask is pretty much the same
David Lord (lead maintainer of Pallets community) spent a lot of time to go "inbox zero"
Amazing, I don't think my brain can work like that
pyright another incredible example of this
They close way more things as wont-fix than we do though
It's sort of a different attitude / user-facing brand
Interesting that they'll close an issue with..
This is indeed not supported today, but it's out of scope for the core team. Well-tested PRs are welcome.
@timber sphinx I have a prototype that enables pytest-socket to work in Python subprocesses
I am not excited to write the tests though. Skimming through the pytest-socket tests, I have no idea what I'm looking at
This is not suspicious at all :P
current diff: https://github.com/miketheman/pytest-socket/compare/main...ichard26:pytest-socket:subprocess?expand=1. I still need to write the allow_hosts tests.
@timber sphinx https://github.com/miketheman/pytest-socket/pull/409
was looking into speeding up import times
came across what now seems to be a dead / no longer maintained project named oxidized-importer and I got intrigued.
my main offender has always been pytorch with its notorious 2-2.5s import, coupled with tensorrt adding another 300-350ms.
I am developing a CLI and I want it to feel as snappy as possible.
Has anyone gone down the rabbit hole of improving start-up times past lazy-importing?
A simple argparse prompt with -h, on python 3.12.9 compiled with LTO+PGO, results in approx. 1.5s of start-up from the get-go.
You might find this issue and the ideas within it interesting. oxidized-importer even comes up at one point.
I forgot I was importing pytorch from the get go
so that's why my initial -h import time was so large
doesn't fix the whole issue, just a confusion I had 🤣
seems like torch takes roughly 1.04s to import
will defo read it, seems to be slightly outdated but still worth a read
further optimizations in lazy importing have brought down import times by roughly 150ms
will see if there's any other improvement I can make
lazifying the torch import
🤔
a lot better
I only moved the problem elsewhere
but I did manage to save at least 150ms on total initial runtime which is still quite significant
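A minimal sketch of the "lazify the heavy import" idea for a CLI: the import moves into the function that needs it, so `-h` and argument errors never pay for it. `wave` is only a stdlib stand-in for something like torch, and the command layout is made up for illustration. As noted above, this moves the cost to first use rather than removing it.

```python
import argparse
from typing import Optional


def run_inference(path: str) -> str:
    # Deferred import: only paid when this command actually runs.
    # `wave` is a stdlib stand-in for a heavy dependency such as torch.
    import wave

    return f"would process {path} with {wave.__name__}"


def main(argv: Optional[list[str]] = None) -> int:
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("path")
    args = parser.parse_args(argv)  # -h exits here, before any heavy import
    print(run_inference(args.path))
    return 0
```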
Tidelift was acquired recently ... that's a bit worrying. While I have no reason to think their acquirer specifically will shut down tidelift, the general theme of acquisition seems to imply that will happen at some point.
i dont think they will terminate tidelift
you can always ask the PR person
Kinda weird question
say I do some ops with the os module
I do say some string concatenation with os.path.join
and I will never ever use the os module past that point
does the GC simply yeet out the os module from memory since it no longer gets used?
Or does it linger around until the python execution ends.
The latter
Reference to it is kept in sys.modules (and by any module that uses it)
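A quick way to see this for yourself (using `textwrap` as an arbitrary small module): deleting your own name for a module does not unload it, because `sys.modules` still holds it.

```python
import sys

import textwrap  # any module works; textwrap is just a small stdlib example

del textwrap  # removes only the local name
still_loaded = "textwrap" in sys.modules  # the module object is still cached

# Re-importing is a dictionary lookup and hands back the same object.
import textwrap as again

same_object = again is sys.modules["textwrap"]
```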
do devs typically try to remove unused modules from memory especially for long running processes?
to free up space ofc
os is probably a module that is loaded by interpreter itself (or at least so many stdlib modules that you're bound to import it through another one anyway)
os was mentioned particularly to get my point through
I don't think anyone would remove unused modules to free memory, code objects don't take that much space and most modules don't have globals that would keep a lot of data in them
🤔 not even those that load .dlls in memory?
Not in any typical usage
I can't speak to all potential edge scenarios where it maybe makes sense for some very specific reason
But like, generally, there's no reason to
the 9pm free will of over optimizing python for no point is probably kicking in.
if you want to use something once and not have it introduce a continued runtime cost run it in a separate process
oh like a separate thread and then simply kill it?
no, process
In long living apps, code objects are probably a largely insignificant part of program memory compared to whatever data they operate on, while for short living processes, you'll get the memory back soon anyway.
check out the multiprocessing module in the standard library
one thing to watch out for is that if you aren't on Linux/a COW system then you can have a large memory spike from the copying
^ This, experienced it first hand
the alternative would be subprocess.run([sys.executable, "-c", "..."])
Oh that could work
and get output from somewhere or use a shared temporary file that is known with an environment variable for example
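A small sketch of that approach, assuming the result fits on stdout as JSON: the child process does the imports and the work, prints its result, and exits, so whatever it loaded goes away with it. The child code here is a trivial placeholder.

```python
import json
import subprocess
import sys

# Code for the child; it imports what it needs, prints JSON, and exits.
CHILD_CODE = """
import json, os.path
print(json.dumps({"joined": os.path.join("a", "b")}))
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD_CODE],
    capture_output=True,
    text=True,
    check=True,  # raise if the child exits with a non-zero status
)
payload = json.loads(result.stdout)
```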
```python
cmd = [
    ffprobePath,
    "-v",
    "quiet",
    "-print_format",
    "json",
    "-show_format",
    "-show_streams",
    "-count_packets",
    inputPath,
]
result = subprocess.run(cmd, capture_output=True, text=False)
if result.returncode != 0:
    logging.error(f"ffprobe failed: {result.stderr}")
    raise Exception(f"ffprobe failed: {result.stderr}")
stdout = result.stdout.decode("utf-8", errors="replace")
if not stdout:
    raise Exception("No output received from ffprobe")
```
been using it for getting video metadata with `ffprobe`
never considered modules tho'
that sounds fun
Even C programs don’t unload most functions either, it’s generally reserved only for extreme cases
Did you know you can pass check, encoding and errors kwargs
No, please do tell me more
Oh it saves a lot of the code here. check=True raises on a bad exit code, and encoding+errors makes result.stdout text
Also you probably don't want to log an exception you raise because it will usually get logged again at the top of the stack
try:
    result = subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
except subprocess.CalledProcessError as e:
    raise Exception(f"ffprobe failed: {e.stderr}")
if not result.stdout:
    raise Exception("No output received from ffprobe")
probeData = json.loads(result.stdout)
you were absolutely right
didn't know these existed
and it still works with weird chinese characters ❤️
You can just let the subprocess.CalledProcessError raise
oh, so just remove try except?
I guess you're right again
and it's also more stylish 😎
you can also remove text=True. It's redundant with encoding.
If encoding or errors are specified, or text is true, file objects for stdin, stdout and stderr are opened in text mode using the specified encoding and errors or the io.TextIOWrapper default.
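Putting the suggestions together, a sketch of where this ends up: no try/except, no `text=True`. The command is a Python one-liner standing in for ffprobe so it runs anywhere, and `-X utf8` is only there to make the stand-in child emit UTF-8.

```python
import json
import subprocess
import sys

# Stand-in for the ffprobe command: a child that prints non-ASCII JSON.
child = r"import json; print(json.dumps({'title': '\u6d4b\u8bd5'}, ensure_ascii=False))"
cmd = [sys.executable, "-X", "utf8", "-c", child]

result = subprocess.run(
    cmd,
    capture_output=True,
    encoding="utf-8",  # implies text mode, so text=True is not needed
    errors="replace",
    check=True,  # raises subprocess.CalledProcessError on a non-zero exit
)
probe_data = json.loads(result.stdout)
```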

@jaunty marlin is there a machine-readable list of all of the rules Ruff implements? Just wondering. If not, I'll scrape the webpage, no biggie.
probably uvx ruff rule --all --output-format json
(yes I don't have Ruff installed, it's 2025!)
Thanks!
I like how I asked the same question a day later: #1070132471699607623 message
I should remember to read off-topic
can't see that message link
The plan was to review the Ruff rules, but it seems like you've picked that up instead
@kind moon we're pretty strict about the invites we allow
I see that
It was just in the "#linter" channel
Well, so far I've done a review of all categories that don't make any sense for pip, and all the B rules, so not very far, I don't mind if I continue or someone else wants to
I definitely prefer if someone does it :P
I could do it, but I have other things I'd rather work on (reviewing that resume PR for one)
Well, I'm happy to keep at it, just going to be fairly slow
it's one of the last things that needs to be done promptly
A random thought that just came to mind is that I actually haven't expanded the set of PyPI projects I know and regularly use in a long time. I spend almost all of my time on OSS projects where it's undesirable to gain dependencies so I've rarely had a reason to look for an existing project to solve a specific problem.
It's not a problem per se, but I do wonder how much I'm reinventing the wheel (unnecessarily) whenever I do write scripts/projects for personal use.
That’s what these “awesome …” lists are for, but I have the same problem. I think there are a few things I’ve caught, but probably not a lot:
- dataclasses (stdlib) for most simple data containers
- ruff instead of flake8/black for formatting and linting
- httpx instead of requests for HTTP(S) requests (and generally async libraries for IO)
- cyclopts instead of typer/click/… for CLI
- rich for ANSI formatting (links, colors, … in the terminal)
- plumbum for shell scripting (i.e. calling subprocesses and piping them)
- py-spy for profiling (might already be outdated, this has changed almost every time I researched the best choice)
For Rust there’s https://blessed.rs, would be nice to have something like that for Python as well.
https://github.com/larryhastings/appeal is another one
Why? It does what Typer does, but with less boilerplate, and actually supports obvious things like positional-only parameters or typing.Literal in an obvious way.
that assumes one likes typer ;)
I don't really like the "magic" libraries that do stuff by analyzing the function signature
call me weird, but I prefer the argparse way of declaring the CLI
I like argparse too, except for the fact that almost everything has to be specified twice.
And not only is that not DRY, but there’s still the problem that the typing isn’t actually enforced anywhere. Below, we just say that Args.foo exists, but it’s not actually tied to the parser in any way!
from __future__ import annotations

import argparse
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from collections.abc import Sequence
    from typing import Self


class Args(argparse.Namespace):
    """Some docs."""

    foo: int
    """The foo parameter."""

    @classmethod
    def parser(cls) -> argparse.ArgumentParser:
        """Construct a CLI argument parser."""
        parser = argparse.ArgumentParser(description=cls.__doc__)  # *
        parser.add_argument(
            "foo",  # 1
            type=int,  # 2
            help="The foo parameter.",  # 3
        )
        return parser

    @classmethod
    def parse(cls, argv: Sequence[str] | None = None) -> Self:
        """Parse CLI arguments."""
        return cls.parser().parse_args(argv, cls())


def main(argv: Sequence[str] | None = None) -> None:
    args = Args.parse(argv)  # args is typed, unless we accidentally lied above …
    ...


if __name__ == "__main__":
    main()
* using cls.__doc__ I don’t actually have to specify this one twice
1. the name is already specified by the attribute above
2. likewise for the type
3. the docstring has to be typed twice, since there’s no standard way to use attribute docstrings.
Compare to the cyclopts variant, which looks exactly how I would write that function anyway, plus a decorator.
import cyclopts

app = cyclopts.App()


@app.default
def main(foo: int, /) -> None:
    """Some docs.

    Parameters
    ----------
    foo
        The foo parameter.
    """
    ...


if __name__ == "__main__":
    app()
It’s not magic, if you know exactly what happens.
Any sufficiently analyzed magic is indistinguishable from science
– Corollary of Clarke’s third law, apparently from a comic called Girl Genius
the API's quite nifty, my only reservation is rich being a required dependency - I'd rather I wasn't locked into a specific formatter, especially one which is slow to import
Big fan of cyclopts, really love its API
This got me curious so I timed it
❯ hyperfine --warmup 1 --shell=none "python -c 'import cyclopts'" "python -c 'import click'" "python -c 'import typer'" "python -c 'import argparse'"
Benchmark 1: python -c 'import cyclopts'
Time (mean ± σ): 175.6 ms ± 5.8 ms [User: 104.2 ms, System: 62.5 ms]
Range (min … max): 170.8 ms … 195.2 ms 15 runs
Benchmark 2: python -c 'import click'
Time (mean ± σ): 111.8 ms ± 1.6 ms [User: 63.8 ms, System: 38.1 ms]
Range (min … max): 109.1 ms … 116.7 ms 25 runs
Benchmark 3: python -c 'import typer'
Time (mean ± σ): 360.9 ms ± 20.1 ms [User: 187.5 ms, System: 159.4 ms]
Range (min … max): 344.5 ms … 414.5 ms 10 runs
Benchmark 4: python -c 'import argparse'
Time (mean ± σ): 62.0 ms ± 2.1 ms [User: 28.1 ms, System: 25.8 ms]
Range (min … max): 58.7 ms … 67.1 ms 49 runs
Summary
python -c 'import argparse' ran
1.80 ± 0.07 times faster than python -c 'import click'
2.83 ± 0.13 times faster than python -c 'import cyclopts'
5.82 ± 0.38 times faster than python -c 'import typer'
cyclopts defers importing rich, so that's the import time without rich
$ python -X importtime -c 'import cyclopts' 2>&1 | grep rich
[nada]
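If you would rather sort that output than grep it, here is a rough sketch that runs `-X importtime` in a child interpreter and parses the report from stderr. The helper name `import_times` is made up for this example, and the parsing assumes the usual `import time: self | cumulative | module` line layout.

```python
import subprocess
import sys


def import_times(statement: str) -> list[tuple[int, str]]:
    """Return (cumulative_microseconds, module) pairs, slowest first."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line.removeprefix("import time:").split("|")
        if len(fields) != 3 or not fields[1].strip().isdigit():
            continue  # skips the header row
        rows.append((int(fields[1]), fields[2].strip()))
    return sorted(rows, reverse=True)
```

Something like `any(name.startswith("rich") for _, name in import_times("import cyclopts"))` would then be the equivalent of the grep above.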
a slightly more representative benchmark:
app = App()
@app.command
def foo(loops: int): ...
app()' --help"
Benchmark 1: python -c 'import cyclopts'
Time (mean ± σ): 63.7 ms ± 4.6 ms [User: 54.1 ms, System: 7.3 ms]
Range (min … max): 52.5 ms … 74.0 ms 54 runs
Benchmark 2: python -c 'from cyclopts import App
app = App()
@app.command
def foo(loops: int): ...
app()' --help
Time (mean ± σ): 132.9 ms ± 6.8 ms [User: 114.6 ms, System: 14.2 ms]
Range (min … max): 125.0 ms … 150.9 ms 22 runs
Summary
python -c 'import cyclopts' ran
2.09 ± 0.18 times faster than python -c 'from cyclopts import App
app = App()
@app.command
def foo(loops: int): ...
app()' --help
so it doesn’t import rich if you don’t pass --help? Then there’s no issue. --help doesn’t need every millisecond of saved runtime, only actually running the CLI needs to be as fast as possible.
? unless your CLI's doing actual work, I'd say feedback should be/feel instantaneous, and that includes --help. It's when you're "actually running the CLI" that you're least likely to notice its import penalty
I'm using Typer in my current projects, but https://ofek.dev/msgspec-click/usage/ is at the top of my "to investigate" list next time I'm starting a CLI project from scratch.
I have proposed something like this in the User Success WG
idea is to re-do the python.org website, as a hub of the python ecosystem
so that it could be the starting point for docs
it would link to documentation for beginners, use-case specific (eg. conda, pyopensci), list of popular libs, etc
on the Diversity and Inclusion we were also talking about a resource for users to find local user groups and events
also, resources for people wanting to set up user groups, and support resources like suggested guidelines on how to address CoC complaints, etc.
Love that idea!
would any Windows users know if there's a way to group or tag packages in winget? I thought I'd give it a shot after upgrading to Windows 11, but I want to be able to distinguish packages I've installed from other (corporate) packages
@buoyant flame would know (not sure how active he is on here)
I respond to pings, but I can't help with winget. You're probably best to go to their GitHub repo and post an issue or discussion (but since they aren't a package manager - rather, they're just a package install runner - they probably aren't keeping any of their own metadata for things you install, which means they aren't going to be able to tag things like that) @west basin
@valid radish do you think a PEP for platformsdir in the stdlib has any potential?
(sorry for the ping; coming out of https://discuss.python.org/t/add-cross-platform-user-directories-to-pathlib-standard-library/)
yes, Petr offered to sponsor a PEP, and Matt Milner was going to write one up, but it's not happened yet:
i offer myself as a tribute
(provided that Matt Milner decides not to continue with this)
i see that he made that comment on feb 11, 2024, should i contact him and ask him what the status is? or should i just go ahead on my own, not sure what the niceties around this are
yeah, contact him first, maybe you can help him with it. ping him on the issue?
sounds good, will touch base with him, thanks!
yes but I won't be available this month
will give me time to research and flesh it out 😆
timelines: the 3.14 feature freeze is just over a month away (beta 1: 2025-05-06), so you either have a mad rush to get the PEP written, discussed, updated, submitted, accepted, and implemented; or you have plenty of time before 3.15 feature freeze in ~13 months 🙂
oh, plenty of time then
i gotta prepare for a few pycon talks so this gives me time to prepare and not rush things
I published a PEP which doesn't currently have a packaging purpose, but may be of interest to those here https://discuss.python.org/t/pep-784-adding-zstandard-to-the-standard-library/87377
(it would be really nice if the existing standard library support for compression/archive formats had a more unified interface...)
you could make a PEP for that to add a unified interface and how to gradually deprecate the old one
seems unlikely given my previous experience, honestly. Generally one gets told to put stuff in a third-party package on PyPI first. And even if everyone liked it and used it, it'd be hard to build a case for deprecation
certainly just trying to write a PEP laying out a design, without actually doing implementation seems like a non-starter. And I have other projects
However, I suppose I could blog my design thoughts at some point, at least.
A unified interface to compression APIs would be great! Definitely out of scope for this PEP. It is something I considered, but I expect it can be introduced at a later date
I'd be interested to see what ideas you have for unifying APIs. There are quite a number of corners of compression libraries that are rather unique/specific to that library
yeah, an entirely separate idea. As is the general idea of allowing wheels and sdists to use other compression formats (it does sort of make sense IMO to have "anything explicitly supported in the standard library" as the line for what formats are supported; but maybe we want to exclude uncompressed tar?)
The tricky thing with allowing just any old compression formats is not all formats have as wide adoption
well, it would probably have to be either common-subset functionality, or have some well-defined rule for what happens if the format doesn't support a feature (e.g. storing file permissions or symlinks)
zlib is everywhere, more or less, so it is reasonable to expect users to have it
FWIW, in my testing, LZMA would help quite a bit with many popular packages, and that's in the standard library already
oh absolutely, but its decompression speed is quite slow :(
(separately, but also relevant to reducing the overall bandwidth: I would love to be able to get smaller wheels for Numpy, without all the testing/doc/other development stuff bundled in)
I've been kinda insulated from caring about decompression speed because I don't do PyTorch/CUDA kinda stuff and because my Internet is pretty slow anyway
(I guess this sort of thing also matters a lot more for people running CI/build farms etc.)
Yeah I think a PEP about compression formats should ideally allow both LZMA and eventually Zstandard to be used
does numpy bundle all that in wheels? sounds unlikely
you'd be surprised
LZMA is on the order of 3x slower than zlib to decompress, but with 1.33x better compression (both at max compression levels)
that sounds about right, yes
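For anyone who wants to check those ratios on their own data, a rough measuring sketch using only the stdlib modules. The function name and defaults are illustrative, and the numbers depend heavily on the input, so no expected figures are given here.

```python
import lzma
import time
import zlib


def compare(
    data: bytes, *, zlib_level: int = 9, lzma_preset: int = 9
) -> dict[str, tuple[int, float]]:
    """Return {codec: (compressed_size, decompress_seconds)} for zlib and LZMA."""
    results = {}
    for name, compress, decompress in (
        ("zlib", lambda d: zlib.compress(d, zlib_level), zlib.decompress),
        ("lzma", lambda d: lzma.compress(d, preset=lzma_preset), lzma.decompress),
    ):
        blob = compress(data)
        start = time.perf_counter()
        restored = decompress(blob)
        elapsed = time.perf_counter() - start
        assert restored == data  # sanity check: lossless round trip
        results[name] = (len(blob), elapsed)
    return results
```

A single timed decompression is noisy; for real comparisons you would repeat it and feed in actual wheel contents.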
another design issue: do we allow the index to store (or provide on demand!) the same build artifact in multiple compression formats? If so, do installers provide a UI to choose? (Do they get to choose?)
Yeah the details of the implementation are tricky
I think it's more about making choices and living with them (and historically that has been difficult and slow)
I'd imagine for pip, we'd default to zstandard assuming that it isn't worse than zlib on average. I guess we'd gain a flag to select the type of wheel which I'm not a huge fan of, but out of everything in the wheel 2.0 transition, this is probably one of the easier things to deal with.
Multiple compression formats are mandatory for the transition story. There are going to be tons of clients that lack support for wheel 2.0 or a wheel compressed with zstandard. They'll need the old wheel 1 zips to function.
(I haven't been following the wheelnext discussions. Presumably this has been [very briefly?] discussed at some point.)
Yep, we have been discussing that. It's not totally clear to me what that ends up looking like to be honest, there are a lot of choices to be made
fwiw, when y'all are ready to present this to the broader packaging community, it'd be nice to have a clear summary of what's expected of the various stakeholders.
While I probably should catch up on the discussions, it'd be nice to have a summary of what's expected of us for the pip project. I'm sure some of the pip maintainers are involved in the discussions, but I am not.
Well I'll say that you are very welcome to participate, though I am sure maintaining pip is time consuming as it is :)
Well yeah, the problem is that I don't have time to read 200+ posts 😅
Well, a good summary of things are contained in the recent summit summary https://wheelnext.dev/summits/2025_03/summary/
Slides etc for that are here: https://wheelnext.dev/summits/2025_03/slidedecks_and_resources/
I feel you... keeping up with DPO is hard
Regarding generic compression APIs: the best part of adding these to the stdlib is it means the shutil archiving APIs can support them by default.
I've kept up with the packaging category for the most part, but I haven't caught up on some of the huge threads that date back months/years.
I don't really participate in standards discussions intentionally. I like to know what's going on, especially as a pip maintainer, but I don't have much to contribute.
I don't really work on the packaging-side of pip, if I'm being honest. I work on everything around the core package management code, networking, error handling, etc. :P
Those parts matter too!
yeah, there's a ton of context. I've spent many hours reading through these threads
Oh yeah. It just isn't as applicable to the standardization process.
That's true
shutil.make_archive is nice and underused, though I still would love a unified interface that allowed passing compresslevel/etc. which shutil does not currently :(
Yeah, I was also thinking a dedicated compression lib could offer a chance to take a second swing at unification. The idea of shutil.make_archive is nice, but I had to copy-and-edit the code for use in venvstacks since it didn't expose all the options I wanted to tinker with for the different formats.
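As a point of comparison, going one level down to `tarfile` does let you pick the gzip level, at the cost of writing the loop that `shutil.make_archive` would otherwise do for you. A minimal sketch (the helper name is made up, and this is not the venvstacks code):

```python
import tarfile
from pathlib import Path


def make_tar_gz(src_dir: Path, dest: Path, *, compresslevel: int = 9) -> Path:
    """Archive src_dir into dest as .tar.gz with an explicit gzip level."""
    src_dir, dest = Path(src_dir), Path(dest)
    with tarfile.open(dest, "w:gz", compresslevel=compresslevel) as tar:
        # Store entries under the directory's own name, not its full path.
        tar.add(src_dir, arcname=src_dir.name)
    return dest
```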
yeah
The compression format discussion reminded me that folks here might be interested in https://github.com/python/cpython/issues/120036 (basic idea: offer shutil.make_reproducible_archive with different defaults that favour future reproducibility of the same output archive over faithfully recording every detail of the input files)
having tried to do just that, I wish I'd kept better notes...
... supposing I were to make a high-level wrapper for the standard library compression modules, and thus try to give them a unified API, and put it on PyPI. Any suggestions for a distribution name?
I actually attempted something similar
I'm not too happy with the write API but it works
This would be great! There's https://github.com/python/cpython/issues/87713, which I think would be required before adding that.
what's the benefit of having a unified compression API, and secondarily what's the benefit of having that in the standard library?
you can easily switch the compression lib used without having to do significant refactoring/rewrite
also just general ease/pleasantness of use, and lower learning curve
tbf, there could be a "unified api" and "specialized api", where specialized has features specific to the particular library
True, my current approach reverts back to manipulating the file system state in a temporary folder to make zip files work. There's a gz header timestamp that is also problematic.
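For the tar.gz side of this, a sketch of the usual tricks: pin the gzip header timestamp, normalize per-file metadata, and add entries in sorted order. This is an illustration only, not the proposed `shutil.make_reproducible_archive`, and it leaves file modes untouched.

```python
import gzip
import tarfile
from pathlib import Path


def _normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    # Drop details that vary between machines and runs.
    info.mtime = 0
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    return info


def reproducible_tar_gz(src_dir: Path, dest: Path) -> Path:
    """Write src_dir to dest so that identical inputs give identical bytes."""
    with open(dest, "wb") as raw:
        # mtime=0 pins the timestamp in the gzip header; filename="" keeps
        # the output path out of it.
        with gzip.GzipFile(filename="", mode="wb", fileobj=raw, mtime=0) as gz:
            with tarfile.open(fileobj=gz, mode="w") as tar:
                # Sorted order, one entry at a time, so directory listing
                # order does not leak into the archive.
                for path in sorted(src_dir.rglob("*")):
                    arcname = path.relative_to(src_dir).as_posix()
                    tar.add(path, arcname=arcname, recursive=False, filter=_normalize)
    return dest
```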
Every time I'm in powershell it's a nightmare to do anything simple
true, but it's relatively straightforward to read once working and committed to version control
Has anyone ever attempted dynamically downloading certain heavy dependencies such as Pytorch or TensorRT using a pyinstaller generated .exe?
what do you mean by "dynamically downloading"?
Pytorch + TensorRT shipped through pyinstaller equate to about 97% of the total storage occupied and pushing new software versions is also relatively harsh.
try:
    import torch
except ImportError:  # if import not found
    ...  # run pip install torch or anything alike
all within a pyinstaller environment.
I don't think there is much benefit in that. All the tools now cache the wheels, so downloads are once per system usually. Also, that won't work unless you do manual dependency management and not using uv/poetry/pdm or whatever
maybe a versioning system where updates could be done easier would be a nice addition as well
therefore users would only have to download the dependencies once and I could simply ship an update.exe that overwrites what has changed with the new version
The sheer enormity of PyTorch and Tensorflow is pretty much the whole reason venvstacks actually exists now, as opposed to being the vague idea percolating in the back of my brain that it was for the previous decade.
It's more for the app-written-in-something-else-embedding-Python use case than it is for native Python apps, though.
(there's also a channel for it here!)
@lapis solstice I'm curious as to whether there is a way to convince multiprocessing to avoid importing the main module during worker startup entirely. The context is that I'd like to add parallelized logic to pip. I've designed it so the parallelized logic lives entirely in its own module that (barring the entrypoint) does not depend on the rest of pip. The problem is that multiprocessing will import the main module, which may result in a large portion of pip being re-imported. This is quite slow, and also a potential security vulnerability.
I do patch sys.modules[__main__] with a lightweight module (pip) but I'd like to avoid the import altogether. Alternatively, I'm probably going to have to switch back to using multiprocessing.Pool as at least it initializes the workers immediately (addressing the security concerns).
I don't believe there is a formally supported API, but I'm pretty sure if you mutate __main__.__file__ and/or __main__.__spec__, multiprocessing will respect that.
Unfortunately, that didn't work.
I also tried replacing __main__ with builtins (:P) but that breaks in other ways.
Even if you change them before importing multiprocessing?
I've looked at the multiprocessing implementation, AFAICT it grabs __main__ right before a worker is started so that wouldn't matter.
I'm going to do the more reasonable thing and stop using concurrent.futures and do best-effort patching, but leave it at that otherwise.
(which sucks as sub-interpreters are only available through concurrent.futures, but alas, I don't want to cause the next pip security report).
Can you refactor the main module in pip so the unwanted imports are inside the if __name__ == "__main__": block? multiprocessing runs the child main as __mp_main__, so only code outside that block will run. (This is the officially supported way of keeping the code execution minimal in child processes spawned by multiprocessing)
(it's yet another point in favour of the from .cli import main implementation layout)
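The effect of the guard can be seen without starting any workers. The sketch below writes a throwaway script and runs it twice with `runpy.run_path`, once as `__main__` and once as `__mp_main__`, which is the name mentioned above for the child's copy of the main module. The script contents and the `LOADED` list are made up for the demonstration.

```python
import runpy
import tempfile
import textwrap
from pathlib import Path

# A stand-in main script: one top-level statement, one guarded statement.
SCRIPT = textwrap.dedent(
    """
    LOADED = ["top-level"]
    if __name__ == "__main__":
        LOADED.append("guarded")
    """
)


def run_as(run_name: str) -> list:
    """Execute SCRIPT under the given module name and report what ran."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "wrapper.py"
        script.write_text(SCRIPT)
        return runpy.run_path(str(script), run_name=run_name)["LOADED"]


as_parent = run_as("__main__")     # ['top-level', 'guarded']
as_child = run_as("__mp_main__")   # ['top-level']
print(as_parent, as_child)
```

Anything left at the top level, such as an eager `from package.cli import main`, still runs in the child, which is why moving imports under the guard keeps worker startup cheap.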
It's actually the console script wrapper that's the problem.
#!/home/ichard26/dev/oss/pip/venv/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
When running pip directly pip install six, the wrapper is the main module.
If I run pip indirectly via python -m pip install six, then pip._internal.cli.main is never imported.
What would break (if anything) if the wrapper template was changed to be:
if __name__ == '__main__':
    import re
    import sys
    from pip._internal.cli.main import main
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
Anyway, I've already worked around the problem. This is purely an academic question.
And I should go to bed, it's almost 1 AM 😅
Heh, I was about to disappear for a late lunch.
That would affect every tool installed by pip. I don't have the appetite for such a change even if IMO whoever is broken by such a change does not deserve sympathy.
Heh, I thought that might be it - theoretically safe, in practice, who knows?
(I must be missing something - why would the script wrappers for installed package entry points, be getting called by multiprocessing?)
multiprocessing calls the main module to initialize new workers (to better match normal startup behaviour/restore global state). When pip is run from a console script wrapper, it's the wrapper that is the main module.
It's like doing python venv/bin/pip which is a silly but functional way of using pip (except on Windows).
It's a good default (e.g., the parallelized code depends on global logging state) but it's annoying when you want to spin up a pool of wholly independent workers.
I should look into how compileall handles that.
In theory, I could work around by managing my own pool of processes, but that would be a nightmare to write and maintain (and be quite buggy).
Its parallelization? It simply passes the entire list of paths to concurrent.futures.ProcessPoolExecutor and lets it deal with the rest.
yeah
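A simplified sketch of that pattern, handing a list of paths to an executor and letting it distribute the work. It is not compileall's real implementation: `compile_all` and `executor_factory` are hypothetical names, and the factory parameter exists only so a thread pool can be substituted when trying it out.

```python
import py_compile
from concurrent.futures import Executor, ProcessPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List, Optional


def compile_all(
    paths: Iterable[Path],
    executor_factory: Callable[[], Executor] = ProcessPoolExecutor,
) -> List[Optional[str]]:
    """Byte-compile every path in a pool and return the .pyc locations."""
    files = [str(path) for path in paths]
    with executor_factory() as pool:
        # map() hands the files out to workers and keeps the input order.
        return list(pool.map(py_compile.compile, files))
```

With the default process pool each worker only needs `py_compile` and the file path, so the work itself is independent; the main-module import discussed above still happens when the workers start.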
... would that not work for parallel downloads and/or resolutions as well?
(oh wait, is unzipping the bottleneck? is that cpu-bound? but still)
Actually, distlib's script template used to guard the entrypoint import under if __name__ == "__main__". 🫰 Hopefully a patch to restore this is accepted: https://github.com/pypa/distlib/pull/242
it's hard to imagine a use case for explicitly importing a script wrapper...
If it's rejected or ignored you could update the distlib template in pip, distlib explicitly supports that
Well.. the point is that the fixed script wrapper disseminates across the ecosystem so when (if) we merge the parallelization PR, the hack is unnecessary.
I just realized, I have no idea how uv implements their console scripts though.
Yeah, they copied distlib and then accepted a PR to remove the re import
(it occurs to me that .removesuffix is available since 3.9....)
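For comparison, here is the wrapper's regex next to a `str.removesuffix` version (Python 3.9+). The function names are made up for this sketch, and the second function is only one possible way to write it, not what uv or distlib ship.

```python
import re

# Suffixes the wrapper template strips from sys.argv[0].
LAUNCHER_SUFFIXES = ("-script.pyw", ".exe")


def strip_with_regex(argv0: str) -> str:
    """The substitution used in the console script wrapper."""
    return re.sub(r"(-script\.pyw|\.exe)?$", "", argv0)


def strip_with_removesuffix(argv0: str) -> str:
    """Same idea without importing re: drop at most one known suffix."""
    for suffix in LAUNCHER_SUFFIXES:
        stripped = argv0.removesuffix(suffix)
        if stripped != argv0:
            return stripped
    return argv0


samples = ["pip", "pip.exe", "pip-script.pyw", r"C:\venv\Scripts\pip.exe"]
results = [(strip_with_regex(s), strip_with_removesuffix(s)) for s in samples]
```

The two agree on ordinary launcher names like these; odd inputs such as a trailing newline were not considered.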
(then again, those conditions are only required for windows support, yes?)
(ah, but installing into environments for older python is still supported... ? even if --python doesn't necessarily work?)
would you be willing to make a similar PR to installer?
what are some plausible explanations for package A having a higher download count than package B yet also requiring B as a dependency? example:
mkdocs-material (A) has lots of releases (~weekly), so when you upgrade, mkdocs (B) which is released ~twice/year, is already downloaded and in the cache?
ohhh nice that's a good point, I was thinking about caching but in terms of some sort of multi-stage container build/CI stuff. the release frequency aspect makes more sense. thanks!
speaking of caching, I've been puzzling over how Setuptools and Pip end up getting downloaded so much. I can see how users end up with multiple versions, and repeatedly install them into temporary environments - but the cache hit rate should be a lot higher, I would think? Is there something about CI systems, Docker containers etc. that defeats the caching perhaps?
(also: would PyPI count a separate download if, for whatever reason, Pip tries to scan for metadata with the range request technique etc. but ultimately rejects that build artifact?)
For Docker I have to assume a lot of users are running "pip install pip --upgrade" after a step that invalidates Docker's layer cache
For CI like GitHub, I think a lot of people just don't cache installing dependencies
at some point I guess I'll have to learn those technologies properly. It's kinda hard to find a motivation when I'm so accustomed to solo dev
the other thing that would be interesting to figure out, is how much of PyPI's bandwidth is actually serving packages, vs. metadata-related requests (oh, and I guess the actual pypi.org pages count for something too...)
I believe the answer to this is yes these are counted as independent downloads
But they shouldn't really happen as far as I know
yes, currently range requests are indistinguishable from full-file requests in terms of download counts
in the last week pypi.org served 1.74 PB (this includes Simple and JSON APIs, project pages, etc) while files.pythonhosted.org served 25.88 PB (can't distinguish between metadata-only and full files)
that's quite a bit more than I recall the average rate looking like...
didn't they say something like 600 PB total for 2023? is it going up o_O
wow, close to double yeah
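A back-of-the-envelope check of those figures. The weekly numbers are the ones quoted above; the 600 PB value is only the recalled 2023 total from the chat, and multiplying one week by 52 assumes that week is representative, which it may not be.

```python
# Weekly traffic quoted in the chat, in petabytes.
pypi_org_pb = 1.74   # Simple and JSON APIs, project pages, etc.
files_pb = 25.88     # files.pythonhosted.org

total_pb = pypi_org_pb + files_pb
files_share = files_pb / total_pb

# Naive extrapolation: pretend every week of the year looks like this one.
annualized_total_pb = total_pb * 52

# Recalled (unverified) total for 2023 mentioned in the chat.
recalled_2023_pb = 600
growth_ratio = annualized_total_pb / recalled_2023_pb

print(f"files share of bytes served: {files_share:.1%}")
print(f"annualized total: {annualized_total_pb:.0f} PB")
print(f"ratio to recalled 2023 figure: {growth_ratio:.2f}x")
```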
by the way, is it okay to share this image publicly (e.g. on a blog)?
sure
I guess (B) must be cached more? (edit: Discord way out of date for some reason... I see the discussion now)
Is anyone a user of https://lobste.rs/s/ntxtm8/introduction_modern_cmake that could invite me? I'd like to comment on that thread. I recognize a lot of Python people like @jaunty gust in the "users" tree 😉
I like this, more automation https://www.digicert.com/blog/tls-certificate-lifetimes-will-officially-reduce-to-47-days
Welcome to DM me with your email address to use for invite.
Sure. I'll also submit a PR to uv when I get the chance. Probably later today after work.
shoutout to whoever added this tip callout, I've gone to that page so many times just to look at the codes and always have to muddle around finding the section because the page is so long
You're welcome!
turns out
Pytorch 2.7.0 is now 1.1GB* larger than before
went from 6.1GB* to 7.28GB
diabolical
☣️
good luck packaging this in the max 2GB filesize limit github provides
@lyric quiver how can a 3338MB wheel lead to a 7.28GB folder/file? I'm confused
TensorRT and Pytorch primarily
Oh so it's not just PyT
oh, not at all
there's a couple more, but 6.9-7.0GB of those 7.28GB are just Pytorch and TensorRT

