#off-topic

lusty folio
#

I think that's an amusing consequence of the situation we find ourselves in

#

Because the CVE process has no means to mark preconditions to exploit as part of the validity of the CVE, they are right

#

whether or not them being right is useful to developers or will just act as noise making CVEs less useful is the issue with it

kind moon
#

To me this sounds like they are mocking the idiots that’re knowingly filing false CVEs for clout.

lusty folio
#

we should be viewing CVEs in a more holistic way: rarely is one bug enough to cause full system compromise, but often a chain of bugs is.

#

we should want to fix all bugs, but saying all bugs lead to full system compromise isn't practical

#

and sometimes intentional behavior is accepted as a CVE, without consideration that the intentional behavior has a precondition to it that prevents the associated CVE

#

should we make software less useful, or should it only be a CVE if the necessary invariants to safety can be violated?

junior narwhal
shadow zealot
#

Instead of a page listing downloads I still want rustup for Python

junior narwhal
#

yes I agree with you about having a single tool but what I mentioned earlier was that the tool must manage Python installations for you and it's a terrible experience to go through the installation/compiling process all the time so we need prebuilt distributions like those standalone builds that everybody uses now

shadow zealot
#

True. IIRC the last time I asked, standalone was already a solved problem (technically) but there’s little momentum to actually put the builds into the release pipeline and show them on the website. Is that actually the case? The problem has been at the back of my head for so long I’m no longer sure if I’m hallucinating a solution.

marsh kite
kind moon
clear wigeon
junior narwhal
#

I'm proposing a gradual rewrite of some tooling at work and I thought I would use the cloud CLIs for comparison

#

even though I don't trust them to maintain stuff over long time scales, Google really is the master of UX

#

just like their UI is nicer than AWS, so too is gcloud compared to aws

#

az is somewhere in the middle but much closer to Google's, it's a very nice interface

#

the only characteristic for which AWS shines is the response time, they must have optimized with lazy imports and other techniques because it's way faster

jaunty marlin
#

The gh CLI is pretty nice too

#

I've copied their styling before

junior narwhal
#

responsiveness:

#

unless I have an outdated understanding, this is a fair comparison because all are written in Python

jaunty marlin
#

That's kind of wild? uv help is like 3ms

silk jungle
#

Python imports are slow.

#

There has been a fair bit of work optimising pip's startup time (without resorting to large-scale lazy imports) and even then pip help > /dev/null takes 120-150 ms (and I'm on a fast laptop).

junior narwhal
#

I don't test UV in the benchmark because... obviously

silk jungle
#

The next step would probably be breaking up more of pip's codebase into sections that can be lazily loaded, and working with our upstream vendored libraries to improve their import time, but that's a lot of work for not much gain at this point...

#

A lot of the low-hanging fruit (on pip's end) has been dealt with at this point AFAICT.
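
A minimal sketch of the "sections that can be lazily loaded" idea, not pip's actual code: the subcommand table and function names below are hypothetical, and standard-library modules stand in for heavy subcommand implementations. The point is only that the import happens when a subcommand is dispatched rather than at startup.

```python
import importlib
from typing import Callable, Dict, List, Tuple

# Hypothetical subcommand table: name -> (module, attribute).
# Standard-library modules stand in for heavy subcommand implementations.
COMMANDS: Dict[str, Tuple[str, str]] = {
    "encode": ("base64", "b64encode"),
    "compress": ("zlib", "compress"),
}

# Records which modules were resolved, to make the laziness visible.
resolved: List[str] = []


def resolve(command: str) -> Callable[[bytes], bytes]:
    """Import the module behind a subcommand only when it is requested."""
    module_name, attribute = COMMANDS[command]
    module = importlib.import_module(module_name)  # deferred import
    resolved.append(module_name)
    return getattr(module, attribute)


def main(argv: List[str]) -> bytes:
    """Dispatch argv = [command, payload] to the lazily imported handler."""
    command, payload = argv
    return resolve(command)(payload.encode())
```

Calling `main(["encode", "hi"])` only ever touches `base64`; `zlib` is never resolved. To see where startup time actually goes in a real CLI, `python -X importtime your_cli.py` (the script name is a placeholder) prints a per-module import timing breakdown to stderr.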

frank shore
lusty folio
#

pretty much everything left for tools like pip is stuck waiting on the interpreter startup speed itself to be faster

ionic tulip
#

@junior narwhal is there a recommended way to run commands for building - i want to play around with a poor man's integration of zig / pydust and it needs to run some commands for building the extension modules

junior narwhal
#

doing a bit more research so I can finish up a PEP and omg what is AWS doing

silk jungle
#

So, I have a confession to make, I don't have a hardware security key yet. I plan on fixing that soon, does anyone have any experience with the Yubico Security Key? That's the key I'm planning to buy. I'm curious as to whether there are any "gotchas" I should be aware of.

junior narwhal
#

am I hallucinating or was there a blog post about requests or urllib3 easing a transition away from a deprecated extra/optional dependency by maintaining a meta package?

silk jungle
#

I can't find any blog post, feels like something @valid rover would've written about at some point, but I dunno.

junior narwhal
valid rover
junior narwhal
#

no it was definitely your Twitter, long threads equals a blog post in my mind apparently lol

valid rover
#

Ahhhh yeah Twitter thread too!

#

Haha i should go back and change all my long threads into posts now that Twitter is the way it is

junior narwhal
#

I actually find Twitter/X one of the best platforms now, the Community Notes feature is so good to reduce misinformation

valid rover
#

Yeah i don't like the walled garden aspect, or the owner.

#

Used to be able to link anywhere!

junior narwhal
#

that policy got removed promptly unless I'm thinking of something else

#

oh and the walled garden policy got removed very quickly, I know because I send family cute animal videos and they don't have an account (and were indeed very annoyed when they had to have one)

dreamy hatch
#

I see walls, here's how the "long thread" looks if you're logged out... 🙃

dreamy hatch
dreamy hatch
junior narwhal
fierce horizon
#

I guess the problem is that Twitter used to not be that, so people with an account still treat it as something where they can just send around links to everyone and expect these links to work.

Nobody ever sends me a Facebook link, because they know that I probably don’t have an account there and can’t see the thing.

junior narwhal
#

links to specific posts work just like before; for long "chains" you have to have an account, which like you said I guess used to work without one but now matches other social media platforms

#

I think this is not a very huge problem because a person willing to view a longform thread chain of text information is very likely to already be a user, whereas the vast majority of single posts are memes, videos, news stories, etc. which are inherently consumable

#

for example, Seth explaining how urllib3 gradually deprecated an extra, there is nobody I know who I would send that to who isn't already on the platform

kind moon
junior narwhal
hexed briar
silk jungle
#

The program between PyPI and Google was finished in 2021 or 2022?

hexed briar
#

Huh, yea.

silk jungle
#

The cost isn't actually that bad. I first looked at the Yubikey 5 which seems to be the "gold standard" and those would be ~$80 each across the pond, but the Yubico Security Key seems sufficient and is much cheaper at $40.

hexed briar
#

And, that's cheaper still.

silk jungle
#

I live in Canada so you can x1.4 whatever freedom dollar pricing you got. (I don't know how much pounds are worth).

hexed briar
#

1 GBP = 1.78 CAD apparently.

silk jungle
#

Damn.

hexed briar
#

It's nicer numbers than 1 GBP = 107.5 INR

silk jungle
hexed briar
#

All excellent reasons to not use that over a Yubikey. :)

onyx spindle
#

Yeah, security keys aren't cheap

#

What's worse, in theory you should have 2

silk jungle
#

I have money, it's just that I'm also not in the position to justify $100+ on security keys.

onyx spindle
#

One for normal usage and one as backup if you lose/break the daily one

silk jungle
onyx spindle
silk jungle
#

I actually store 2FA recovery codes so I'm not SOL immediately if I lose my 2nd factor.

onyx spindle
#

But have yet to set them up as well

silk jungle
#

The main thing that prompted this is a) my phone is starting to fail which is ... not great given it has all of my TOTP codes, and b) I have meaningful access to things at this point...

#

It's funny, I started contributing to black just as a way to pass the time during the lockdowns and now I have commit access to a major project, and triage access to a few more.

frank shore
# silk jungle The main thing that prompted this is a) my phone is starting to fail which is .....

I use Bitwarden and pay for the premium account ($10/yr). Passwords and TOTP codes stored E2E encrypted in the cloud. I really don’t ever have to worry about losing my TOTP codes since they’re both backed up and synced to all my devices. I do still download and store recovery codes in a safe place but I can’t remember the last time I’ve needed one. IMHO, Bitwarden has probably the best UX of any password manager I’ve used, has a CLI, is open source, and available on just about everything.

silk jungle
#

I use Bitwarden as well, but that's protected by TOTP 2FA which is on my (somewhat failing) phone

junior narwhal
#

I have the same exact personal setup but with 1Password, for work however we must use security keys

coral marsh
frank shore
junior narwhal
lapis solstice
hexed briar
shadow zealot
#

It tends to focus on the wrong things sometimes. Like it complains my repos such as FastAPI have zero stars… yeah because it’s a fork

hexed briar
#

I'm kinda happy it didn't find anything to poke that hurts?

Listing affiliations is intentional, follower count and repository counts are meh, tech support disclaimer makes my life better.

lapis solstice
fierce horizon
#

Hah, yeah the roaster completely failed for me as well. I think its developer didn’t understand that popular repos often become GitHub orgs.

onyx spindle
#

I mean, it just queries ChatGPT for a roast

hexed briar
#

Ah, I wonder how it'd do without my bio in there.

#

Also, wow that Hindi prompt. 😂

junior narwhal
#

great meme 😂

junior narwhal
dreamy hatch
#

I don't think so, the buildbots are all provided and administered by different people/groups

lapis solstice
#

Yeah, the transitive trust requirements make the logistics too much of a hassle to be practical (unfortunately)

lapis solstice
#

I asked the core dev Discord's off-topic channel if anyone knows of a way community orgs can get access to GitHub's ARM64 beta, though (since I assume MS would like their Copilot+ PCs to be a good platform for AI development, not just consumption).

dreamy hatch
#

These runners are available to our customers on our GitHub Team and Enterprise Cloud plans.

kind moon
#

oh

#

I missed that line

lapis solstice
#

Yeah, I only caught it on a second re-read myself.

mighty flower
#

GitHub tends to roll out to paid users first, and then open source projects a few months later

late pine
#

what's the general consensus on using github copilot/LLMs to write code?

lusty folio
#

I avoid it. Even if there weren't potential issues (open, unanswered legal questions) around who owns the code, for the tasks it can do, it takes longer to review copilot code to ensure it's sound than it does to just write it.

mighty flower
#

I don't use it for any OSS work but my company is experimenting with it and I'm using it and giving them feedback. I think there are some workflows it speeds up. It's good at autocompleting repetitive code. It often has good rename variable suggestions. It's good at bootstrapping test cases when you don't have any yet.

#

It's amazing for one off throwaway scripts, especially when you need to write it in a language or library you don't know off by heart.

junior narwhal
#

for me AI has been super awesome for coding! before copilot I was using Tabnine, now I use both. it frequently provides completions that are accurate without even typing the first character and just doing a new line. I've gotten slightly weaker in the past few years before I started my current treatment and if I had to estimate, AI has got me as productive as I was in ~2018

lapis solstice
#

For OSS submissions, definitely steer clear as best you can due to the murky state of copyright around AI generated code (some level of AI assistance may be hard to avoid depending on which editor you use).

For personal use, the AI-enhanced Intellicode in Visual Studio was spectacular when I was learning C# earlier this year. I also found https://nicholas.carlini.com/writing/2024/how-i-use-ai.html to be a really interesting read as to what current gen AI code generators are already good at.

silk jungle
#

I'm looking to switch to a self-hosted web analytics platform. Any good suggestions?

#

I prefer something simple.

marsh kite
#

I think plausible.io is the best known. I've heard good things about it

dreamy hatch
silk jungle
#

And I know a few other contributors of other projects are using it when I check the internal logs I keep.

silk jungle
marsh kite
silk jungle
#

Hmm, trying to self-host plausible-ce on a VPS with only a GB of ram is definitely a tall order.

marsh kite
#

ouch yeah I didn't realize it requires clickhouse

silk jungle
#

I'm planning to migrate to a larger VPS anyway, but I won't have the possibility to do that until later.

silk jungle
marsh kite
#

I wish clickhouse scaled down not only out

#

true of a lot of software TBH

silk jungle
#

OK, so I did migrate to a 2 GB VPS, hopefully this suffices for now. It's a bit expensive, but I'll be migrating to an entirely different VPS provider later.

#

On the bright side, it is working :)

silk jungle
#

So. Many. Hyperlinks. AHHH

hexed briar
#

Our review of the account named in your report has concluded. We have determined that one or more violations of GitHub’s Terms of Service have occurred and have taken appropriate action in response.
Good job but also... sigh

junior narwhal
#

consider yourself fortunate that you didn't create a popular Bitcoin library in the past and are now on a list that s[pc]ammers @ on various repos who you have to report at least once a month...

fierce horizon
#

Well, since I never touched crypto with a 10 foot pole, I’m luckily safe here.

onyx spindle
silk jungle
#

Perhaps I should figure out how this Mastodon thing works... hmm

marsh kite
#

It's nice. It is a smaller community than twitter but I kind of like that

dreamy hatch
#

Yep, and lots of Python people

junior narwhal
#

I've found it to be quite useless outside of some tech bubbles, I wish everyone could be on the same platform

lapis solstice
#

I've been using it specifically as a place to post Python musings and links (which is how I started out with Twitter), and it's functional for that purpose (which fits into one of @ofek's tech bubbles)

frank shore
#

Hello all! We just created a new channel under Other Projects, called #wheel-next . The idea is for folks to collaborate on the evolution of the wheel spec, variant support, symlink support, writing PEPs, reference implementations, etc. etc. The public GH is https://github.com/wheel-next with more information. While a bunch of folks from various corporations are collaborating, this is very much a community-driven fully open initiative.

vast wren
#

I’m too stupid to write C code correctly

frank shore
vast wren
#

Awesome

#

I looked into it at one point but C code makes my brain hurt

#

Need to make it possible to use pyo3 in CPython 🙂

onyx spindle
marsh kite
marsh kite
#

Then I "just" need to write a PEP about it 🙂

junior narwhal
#

nothing makes me appreciate life more than when my computer recovers from a BSOD crash loop

#

I encourage all of you to use Windows in order to enable this gratitude hack

onyx spindle
#

arch rolling release on prod server can give you similar rush :D

vast wren
#

just use all of the OSs

#

then you get all the downsides

silk jungle
#

what about chromeos, nginx as a webapp

#

only needs a few zero days, don't worry about it blob_gentilhomme

silk jungle
#

Excellent. I'm now locked out of Twitter. I had set up TOTP 2FA and yet they seem to have disabled that and require my recovery code.

#

It seems like that code got rotated when I updated the TOTP application.

#

I'm going to make sure all of my recovery codes are up to date...

#

I was curious as to who linked to my post on Twitter.

marsh kite
silk jungle
#

that's rough

marsh kite
#

Fortunately it requires physical access and tearing the yubikey apart, but it does mean if you lose your key you should still revoke any keys on it, even if you think it's protected by a PIN

onyx sphinx
marsh kite
#

Perhaps, I'm not familiar enough with the details

junior narwhal
#

apropos of nothing, I dislike Go profoundly. the concept of dependencies using Git is interesting but otherwise almost everything about it I hate and having to work with it is painful

marsh kite
#

Having a mutable multi-tenant "package" source seems like a nightmare, I really don't like the idea of git repositories as dependencies

#

they have their place at times, but as the main source of packages, I think there are too many issues

lusty folio
#

git-based dependencies are fine, so long as you actually point to a specific commit hash and not to a mutable reference (like a tag), or are intentional in pointing to a branch or tag that you trust the author of to provide the version guarantees you expect. It's really no different than people having dependencies without an upper bound with python deps and with an index in play.
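
A rough sketch of checking the "point to a specific commit hash" rule above, assuming dependencies are written as `name @ git+https://...@<ref>` direct references. The helper name and the example URLs are hypothetical, and the check is purely textual: it only looks for a full 40-character SHA-1 at the end of the URL.

```python
import re

# A full 40-character SHA-1 commit hash at the end of the URL
# (optionally followed by a #fragment).
_COMMIT_PIN = re.compile(r"@[0-9a-f]{40}(?:#.*)?$")


def is_pinned_to_commit(requirement: str) -> bool:
    """Return True if a git direct reference ends in a full commit hash."""
    _name, separator, url = requirement.partition(" @ ")
    url = url.strip()
    if not separator or not url.startswith("git+"):
        return False  # not a git direct reference at all
    return _COMMIT_PIN.search(url) is not None
```

A branch or tag that happens to be named like a 40-character hex string would fool this, and repositories using SHA-256 object names have longer hashes, so treat it as a lint-style hint rather than a guarantee.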

fierce horizon
#

There are more problems with Go as well.

junior narwhal
#

it's always such a mistake to respond to an active social media tech thread about your passion, on a Sunday evening no less. don't do it folks

junior narwhal
#

actually I would like to slightly alter what I said in the opposite direction, I think those of us involved in packaging need to do much more evangelism on social media otherwise nobody is aware of anything, for example https://x.com/zeeg/status/1832910845854253338

#

it's so bad that a person extremely knowledgeable about Python thinks that the PSF doesn't do fundraising for packaging...

vast wren
#

so I've never actually used uv

#

does it do anything special besides install things

#

lol

#

maybe my impression was wrong? I thought it just copied the pip and pip-tools CLIs under a uv sub command and made them faster

junior narwhal
#

it's basically Hatch in Rust with an experimental locking strategy/file and workspaces (coming soon to Hatch), a pipx-like command (coming after workspaces to Hatch) but without Hatch environments and plugin capabilities

#

slight high-level deviation in what we view as good UX but basically that is an accurate assessment

#

but yeah as I've been slowly realizing, I basically failed as an open source maintainer in the year 2024 because of my limited social media posting. even if you have great docs, people will not even know about what's possible without constant evangelism

#

it's why everyone thinks the ability to install arbitrary versions of Python via tool python install ... is so novel even though Hatch did that last December, and just like how tool run pep_723_script.py everyone thinks was a novel UX innovation by UV when I wrote the spec and introduced it in Hatch in the spring (although some folks realized like Will for example who changed their blog post entry, very nice of him https://textual.textualize.io/blog/2024/09/15/anatomy-of-a-textual-user-interface/#all-right-sweethearts-what-are-you-waiting-for-breakfast-in-bed)
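
For readers who haven't seen what a `pep_723_script.py` carries, here is a minimal sketch of extracting the inline metadata block. The regular expression follows the reference implementation in the PEP 723 text as best I recall it; the function name and the example script are made up for illustration, and this is not how Hatch or uv implement it.

```python
import re
from typing import Optional

# Pattern modelled on the reference implementation in the PEP 723 text.
BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def read_script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the `script` block, or None if there is none."""
    matches = [m for m in BLOCK.finditer(source) if m.group("type") == "script"]
    if len(matches) > 1:
        raise ValueError("Multiple script blocks found")
    if not matches:
        return None
    # Strip the leading "# " (or bare "#") from every line of the block.
    return "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in matches[0].group("content").splitlines(keepends=True)
    )


# Hypothetical script used only to exercise the function.
EXAMPLE = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "requests<3",
# ]
# ///
print("hello")
'''
```

The returned string is plain TOML, so on Python 3.11+ it can be handed to `tomllib.loads` to get the `dependencies` list a runner would install before executing the script.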

fierce horizon
#

@junior narwhal I totally agree with your tweet, money is what made uv possible. Of course it’s a risk to create a company for something like that and it’s a (small) risk to buy into things backed by a company (because things can always become enshittified when the company is in trouble)

uv is basically 1. identify something that’s used and grown and changed for a very long time, 2. discard the complex edge cases that few people need 3. do a clean-slate rewrite of the core and API everyone needs, but faster.

This needs time, and time costs money.

[uv is] basically Hatch in Rust
I wouldn’t describe it like that. I’d describe Hatch as (primarily) a Python project management tool whereas uv is (primarily) a direct Python venv manipulation tool like pip. In my eyes they are mostly orthogonal (indeed Hatch wouldn’t use uv as a backend if there was a bigger overlap, right?). One of the biggest overlaps is probably in that they can both download and manage Python runtimes, right?

I’d say that the PEP you wrote regarding script dependencies is the most Hatch-like thing uv implements (even though pipx did the “create venvs from spec in cache dir” first)

dreamy hatch
#

I wouldn’t describe it like that. I’d describe Hatch as (primarily) a Python project management tool whereas uv is (primarily) a direct Python venv manipulation tool like pip.

some of the new stuff in uv is project management, they describe it like this:

End-to-end project management: uv run, uv lock, and uv sync. uv can now generate and install from cross-platform lockfiles based on standards-compliant metadata, making it a high-performance, unified alternative to tools like Poetry, PDM, and Rye.

jaunty marlin
#

For what it's worth, I find it pretty hurtful that you'd say the only reason we could build this is money — we're not a big team. Charlie's done some really impressive work and attracted talented people who are excited about building things that improve the status quo at scale.

#

I also think that uv is not "basically Hatch", yes we have a large overlap in features but I think we've taken a different approach in our designs (and not necessarily in a better way, just different — e.g., Hatch is way more extensible and pluggable).

onyx spindle
# jaunty marlin For what it's worth, I find it pretty hurtful that you'd say the only reason we ...

no one is denying the skills of your team. it is just the fact that having a team that works on a tool full-time as their job gives far better results than having the same team working on it as volunteers, after hours, when they also have to focus on their jobs that put the food on the table. Astral did something great with uv, but the money behind it played a significant role in how well and how fast the tool was made

jaunty marlin
#

Sure the money is helpful and has a role in the speed we're able to work at, but to say it is "what made it possible" seems like a stretch.

#

You're welcome to your opinions though. Just know that we read these things and it's not harmless.

#

(I totally agree with the sentiment that way more money should be put into the Python ecosystem, esp. packaging)

fierce horizon
#

I'm sorry, I'm currently sick and fuzzybrained and don't choose my words well.

I should have said something like:

What Astral is doing is a big effort, and having the ability to continue investing time into it ensures that their work has staying power.

I think if what you did were hobby projects, there is a higher chance that you'd prioritize things that keep you alive over pouring time into it.

I'm super grateful for what you do.

lapis solstice
#

I can definitely vouch for the "money = time" aspect. While the project I'm working on for LMStudio (portable venv layering that actually works properly) is a dramatically more niche use case than anything Astral are doing, it's something I've thought should exist for more than a decade, but would never have cared enough about to write on my own time. Dedicating 24 hours a week to it makes that project possible. I wouldn't vouch for LMStudio's longevity (I'm just a contractor, I don't know anything about their monetisation strategy), but once the project is published that won't matter so much, as the open source license will cover a lot of risks for other folks that find it suitable for their own use cases.

What money doesn't magically make happen is the research and community engagement efforts that Zanie, Charlie, and the other Astral folks have been putting in, so they deserve all the credit for that. They started from the assumption that the existing tools worked the way they do for a reason, so they ensured that first they could replicate (most of) that behaviour before really starting to explore what could be done more effectively by approaching it differently. No amount of money could make a project like uv work as well as it has without a team that actually listened to and understood the developer community they were trying to support.

robust sandal
#

Where would I ask a gh-action-pypi-publish question? Specifically, I'm wondering if this:

- name: Generate artifact attestation for sdist and wheel
  uses: actions/attest-build-provenance@v1.4.3
  with:
    subject-path: "dist/*"
- uses: pypa/gh-action-pypi-publish@release/v1
  with:
    attestations: true

makes sense to have both an attestation for GitHub and one uploaded for PyPI? Are they unrelated?

dreamy hatch
#

I think they're unrelated, but let's ask @steel crane and @royal dirge

you might want both if your users want more options to verify

kind moon
#

Oh wait, the PyPI action can do that? firHmm time to read

dreamy hatch
kind moon
#

My question would be — does it push the attestations it creates back to GitHub’s system?

robust sandal
#

It doesn’t ask for the right permission to do that, so no. Doing both seems to be fine. Pybind11 now has both

kind moon
#

Well, I tried it
It uh... broke

> Run pypa/gh-action-pypi-publish@8a08d616893759ef8e1aa1f2785787c0b97e20d6
Checking dist/crazylibs-0.1.2-py3-none-any.whl: PASSED
Checking dist/crazylibs-0.1.2.tar.gz: PASSED
Notice: Generating and uploading digital attestations
Error: Attestation generation failure: The following paths look like distributions but are not actually files: /github/workspace/dist/crazylibs-0.1.2.tar.gz, /github/workspace/dist/crazylibs-0.1.2-py3-none-any.whl

https://github.com/letsbuilda/crazylibs/actions/runs/10863127565

firHmm

#

That was it

silk jungle
#

heh, I wasn't expecting to see my name in a formal attribution, but apparently people cite your top PyPI packages data frequently @dreamy hatch https://zenodo.org/records/4732473

dreamy hatch
#

Yeah, I made a list, it gets cited quite a lot. Turns out making hard-to-get data more accessible is useful for science!

junior narwhal
#

wow the new o1 OpenAI model is really good. I was struggling with a type checking issue and, whereas the other models (4o and Claude) were trying to fix the problem how I wanted, it explained why what I'm trying to do is not possible given current Mypy limitations

ionic tulip
junior narwhal
onyx spindle
#

Still hoping one day we will get types on the same level as TypeScript has

ionic tulip
ionic tulip
#

but the writeup would be magnitudes uglier in the linked case

junior narwhal
#

I would appreciate a link!

junior narwhal
#

I tried that last night and still got errors

ionic tulip
#

i suspect that one needs a dict subclass with the type declarations and then do per item assignment of the methods

junior narwhal
#

I tried a bounded TypeVar with and without Generic, tried with Protocol, tried a long Union, nothing worked

ionic tulip
#

you basically have a mapping where the get methods are "generic" but the mapping itself is not

#

so you'd have a get[T](type[T]) -> Callable[[..., T], None]: ... but T wouldn't be part of the outer type

#

hence the need for a custom mapping type, it's most ugly
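
A small sketch of the "custom mapping type" shape described above, under the assumption that values are callables keyed by the type they accept. The class name, the `Callable[[T], str]` handler signature, and the registered lambdas are all hypothetical, since the original problem was in a link that is not part of this log.

```python
from typing import Any, Callable, Dict, Type, TypeVar

T = TypeVar("T")


class FormatterRegistry:
    """Non-generic mapping whose accessors are generic in the key type."""

    def __init__(self) -> None:
        # Internally the argument type is erased to Any; the methods
        # below re-establish the link between key type and handler type.
        self._formatters: Dict[type, Callable[[Any], str]] = {}

    def register(self, key: Type[T], formatter: Callable[[T], str]) -> None:
        self._formatters[key] = formatter

    def get(self, key: Type[T]) -> Callable[[T], str]:
        # Unregistered types raise KeyError at runtime; the type checker
        # cannot know which keys have been registered.
        return self._formatters[key]


registry = FormatterRegistry()
registry.register(int, lambda n: f"int:{n + 1}")
registry.register(str, lambda s: f"str:{s.upper()}")
```

With this layout `registry.get(int)` is seen by callers as a function taking an `int`, even though `T` never appears in the type of the registry itself. The trade-off is the one noted in the comments: a missing entry is only discovered at runtime.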

junior narwhal
#

in case others find it useful, I made a library for generating Click options from msgspec types. This is useful for plugins being configurable by users at the command line. Hatch will use this soon and overall will be going all-in on msgspec 🙂 https://github.com/ofek/msgspec-click

normal ibex
junior narwhal
normal ibex
#

that actually will work too, but then you won't get a type error if there's a new subclass you attempt to use that doesn't have an entry in the dict (like UnknownType in the example)

#

(in all cases note you do need the one measly type ignore on the assignment, to paper over the fact that you'll have some false negatives around subclassing. but all the users of it will be happy / the type inference will be what you want)

kind moon
kind moon
#

You guys really are everywhere

onyx spindle
#

in fewer places than I would want, but still barely handling all of that in the time I have 🤣

kind moon
#

I can't test my app on the 3.13 RCs because Pendulum doesn't have wheels and I have no Rust

#

But I... IDK if it's "incompatible" with 3.13 in the sense that it will burn down, or if it's just needing new wheels

onyx spindle
#

yeah, I was thinking more about code breakage on 3.13

#

as to wheels... I will see what I can do. I don't have PyPI access to the project and might take a moment to get the author to publish them

kind moon
#

You may have heard of this new thing called "Trusted Publishing" firLick

onyx spindle
#

yeah, still requires some work from the PyPI project admin

kind moon
#

Yeah

#

Do you need help with... I can try and install Rust in my container and see if it runs sometime in the next few days

onyx spindle
#

I might find some time in the near future to fix some stuff and maybe tag 3.0.1 with 3.13 in the pipelines so we can check, but no guarantees on that

mighty flower
pure plover
#

hi friends. 👋 I have a VERY off-topic question for you. Who set up the discord onboarding here? We (pyOpenSci) are adding discord to our platforms, and I LOVE that you have people read and provide an emoji response to the rules before they can post. I wondered if you have a bot or how that was set up. Many thanks!! 👐

onyx spindle
#

@silk jungle was tweaking the settings on that IIRC

junior narwhal
clear wigeon
junior narwhal
clear wigeon
junior narwhal
#

yes I understand that but I'm trying to think of the 99% use case

clear wigeon
#

then shouldn't you be alright using it anyways?

On other operating systems, return the path unchanged.
i.e for the 99% it will return a lowercase path

junior narwhal
#

I suppose, although I've seen some software use uppercase in path components but you're right that only Windows is particularly prone to stuff being a mix of upper and lower case so I'll use that function indeed, thanks!
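
The function under discussion is never named in this part of the log; the quoted sentence matches the documentation of `os.path.normcase`, so the sketch below assumes that is the one. It uses `ntpath` and `posixpath` directly (the Windows and POSIX flavours of `os.path`) so the behaviour can be compared on any machine; `same_path` is a hypothetical helper.

```python
import ntpath
import posixpath


def same_path(a: str, b: str, flavour=ntpath) -> bool:
    """Compare two paths after case normalisation for the given flavour."""
    return flavour.normcase(a) == flavour.normcase(b)


# Windows flavour: lowercases and converts forward slashes to backslashes.
windows_form = ntpath.normcase("C:/Users/Foo")

# POSIX flavour: returns the path unchanged.
posix_form = posixpath.normcase("/Home/Foo")
```

Note that macOS uses the POSIX flavour, so `normcase` leaves paths untouched there even though its default filesystem is usually case-insensitive, which may be relevant to the Windows and macOS bug mentioned just after this.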

#

I don't have the bandwidth to find it but I could've sworn I had to fix a bug (maybe for Hatch) that was explicitly about case sensitivity on Windows and macOS but not Linux

junior narwhal
ionic tulip
junior narwhal
#

side note: it saddens me that Windows is ubiquitously slower with everything and I wish I had more systems knowledge to understand why

strong blade
junior narwhal
junior narwhal
#

big day (not really): I decided to switch to double quotes. historically it's easier for me because I don't have to press shift but everyone expects them now plus I do a lot of work in other languages so I had to change around my keyboard hotkeys so that the apostrophe sends a double quote and the apostrophe requires shift. I thought about having two different keys but there's not much space left on my on-screen keyboard

fierce horizon
#

Since Black came around, I just started typing whatever I’m used to since Black (now Ruff) will just reformat things on save anyway.

lapis solstice
# junior narwhal side note: it saddens me that Windows is ubiquitously slower with everything and...

AFAIK, aside from process creation being slow compared to *nix systems (which is mostly an artifact of "win32 is big, really big, so loading the win32 API into each new process is slow"), it's no one thing, but lots of little things arising from different design decisions over the life of DOS/NT/modern Windows vs *nix (and Linux in particular). Hence even MS eventually deciding that Linux was often the best choice for headless use cases where having access to the full win32 API isn't useful.

With the real-time Linux patch set finally being mainline, it's even harder to see that performance gap ever closing.
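
One rough way to observe the process-creation cost mentioned above on your own machines is to time how long it takes to spawn a trivial child process. This is only a sketch: the function name is made up, and the number it reports includes Python interpreter startup as well as OS process creation, so it is useful for comparing systems rather than as an absolute figure.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(runs: int = 5) -> float:
    """Average wall-clock time to start and finish a no-op Python process."""
    start = time.perf_counter()
    for _ in range(runs):
        # Spawn the current interpreter with an empty program.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs
```

Running `mean_spawn_seconds()` on a Windows box and a Linux box with comparable hardware gives a quick, if crude, feel for the gap being described.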

junior narwhal
#

where I see it the most is with IO, even with equivalent storage attachments. Windows is just so slow it's crazy

onyx spindle
#

@nocturne swallow congrats on TOML support in tox!

junior narwhal
lapis solstice
robust sandal
#

Does anyone remember where the PyPI sqlite database was from? Someone had a (semi?) regular sqlite file production (attached to github releases if memory serves?) with package and dependency information about all the PyPI packages. I still have an old copy (pypi.db , 166 MB, from March), but have forgotten where it came from.

robust sandal
#

Fantastic, thanks. Looks like it's a bit out of date, but that's it!

#

Me: I'll star it to make sure I can find it next time.
Also me: Oh, I've already starred it....

junior narwhal
#

I do that all the time

robust sandal
#

@valid rover A new run right around the release of 3.13 would be nice, especially since classifiers are included! I've been pushing to get the 3.13 classifier in as many of my projects as possible this last week.

dreamy hatch
valid rover
#

I can kick off a run!

#

(Thanks for reminding me that people do use the data!)

timber sphinx
valid rover
#

probably, it's a case of how much I want to muck around w/ GitHub Actions debugging. But this shouldn't be too bad

robust sandal
#

FYI, I like having three levels of marks, with the only "x" being packages that declare 3.12 support but not 3.13. Some packages simply don't list classifiers at all.

#

And that page should include wheels, scipy does have 3.13 wheels, just no classifiers, for example.

dreamy hatch
#

why does SciPy have 3.13 wheels but no 3.13 classifier? it has 3.10-3.12 classifiers (and wheels)

robust sandal
#

Ah, I assumed it didn't have classifiers. Projects with per-python wheels tend to be more likely to be in the "classifiers are not needed" opinion. I guess they forgot.

dreamy hatch
#

ah, they have a 3.13 classifier, just not released yet

#
  • Update pyproject.toml to include Python 3.13 in the
    classifiers metadata. Considering we already ship 3.13
    wheels on PyPI built from pyproject.toml, I can't imagine
    there's a good reason to delay adding it now.
lapis solstice
#

We're coming up on the initial publication date for the virtual environment layering project I've been working on, so it's that fun anticipatory mix of "yay, I finally get to share the full technical details of the project I've been alluding to for the past few months" and "ugh, I hope nobody points out something egregiously obvious that I've completely overlooked".

I guess if the latter happens, that is one of the intended benefits of working in the open... 🙂

ionic tulip
lapis solstice
ionic tulip
onyx spindle
#

Did anyone package Python CLI app in winget?

junior narwhal
cosmic dock
#

mini-blog post (i dont have a blog lmao)?

anywho
the hash function in python (for ints, on 64-bit builds) is defined as a modular reduction over the mersenne prime 2^61 - 1 (a mersenne prime being a prime of the form 2^x - 1)

i was bored, and decided to try intentionally making a hash table with such collisions

results after benching:

```py
>>> timeit.timeit("a[p + 1]", setup = "from __main__ import p, a")
0.08406147197820246
>>> timeit.timeit("b[p + 1]", setup = "from __main__ import p, b")
0.08369432506151497
```
```py
>>> timeit.timeit("b = {x * p + 1: 4 for x in range(1, 1000)}", setup = "from __main__ import p", number = 10)
0.1114881259854883
>>> timeit.timeit("a = {x * p + x: 4 for x in range(1, 1000)}", setup = "from __main__ import p", number = 10)
0.002080571954138577
```

initialization time takes 50x longer for the one intentionally made for collisions, yet they both take about the same time to lookup

tested and replicated on different machines
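
For anyone who wants to check the collision setup itself rather than the timings, here's a minimal sketch. It assumes CPython's int hashing (reduction modulo `sys.hash_info.modulus`, which is 2**61 - 1 on 64-bit builds); the variable names are only for illustration.

```python
import sys

# The modulus CPython uses for numeric hashes (2**61 - 1 on 64-bit builds).
p = sys.hash_info.modulus

# Keys of the form x*p + 1 all reduce to the same hash value.
colliding_keys = [x * p + 1 for x in range(1, 1000)]
# Keys of the form x*p + x reduce to x, so every hash is distinct.
spread_keys = [x * p + x for x in range(1, 1000)]

colliding_hashes = {hash(k) for k in colliding_keys}
spread_hashes = {hash(k) for k in spread_keys}

print(len(colliding_hashes), len(spread_hashes))
```

So the "b" dict really is one long collision chain, while the "a" dict spreads its keys over distinct hash values.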
mighty flower
#

It was a long time ago, but I have seen a real world case where a function name hash collisions observably slowed down a Python application

marsh kite
dreamy hatch
lapis solstice
# cosmic dock mini-blog post (i dont have a blog lmao)? anywho the `hash` function in python ...

The apparently fast lookup is because you're grabbing the first entry out of the hash bucket even in the "many collisions" case. Try adding a c variant that builds the collision-prone variant in the opposite order so you're grabbing the last value in the hash bucket:

>>> from timeit import timeit
>>> p = 2**61 - 1
>>> timeit("a = {x * p + x: 4 for x in range(1, 1000)}", setup = "from __main__ import p", number = 10)
0.000741713999559579
>>> timeit("b = {x * p + 1: 4 for x in range(1, 1000)}", setup = "from __main__ import p", number = 10)
0.04867755600025703
>>> timeit("c = {x * p + 1: 4 for x in reversed(range(1, 1000))}", setup = "from __main__ import p", number = 10)
0.048195532000136154
>>> a = {x * p + x: 4 for x in range(1, 1000)}
>>> b = {x * p + 1: 4 for x in range(1, 1000)}
>>> c = {x * p + 1: 4 for x in reversed(range(1, 1000))}
>>> timeit("a[p + 1]", setup = "from __main__ import p, a")
0.04043049199935922
>>> timeit("b[p + 1]", setup = "from __main__ import p, b")
0.039784960000361025
>>> timeit("c[p + 1]", setup = "from __main__ import p, c")
9.078608001999783

There's a reason we eventually accepted the need for container implementations to use a cryptographically secure hash function: https://peps.python.org/pep-0456/

cosmic dock
#

also, technically, siphash isnt cryptographically secure

cosmic dock
#

hm, wikipedia says its a non-cryptographic hash function while the github says it is

mild pollen
#

AFAICS, it's just not a hash function. It's a PRF.

cosmic dock
#

iirc you can make a hash function from a PRF

mild pollen
#

Maybe, but then it's not the same function anymore. 🙂

cosmic dock
#

ah right, a PRF has to be keyed

lapis solstice
#

The PEP and the wiki article are using "cryptographically secure" in two slightly different senses. A general purpose cryptographic hash function has the extra characteristic that it offers robust collision resistance: if two things hash to the same result, you can be confident that they had the same input. Siphash doesn't give you that level of collision resistance, so there are lots of cryptographic use cases where it isn't suitable (e.g. as a password storage hash - if your password storage is collision prone, then people can get in not only with your actual password, but also with any other password that happens to collide with it).

Edit: turns out it's the need for a key, and the lack of work ratio tuning parameters, that make Siphash not great for password storage. TIL.

Siphash is cryptographically secure in the narrower sense that even given a bunch of known inputs and their hashes, you still can't predict the hash for a novel input without knowing the hash key currently in use, and you don't have any practical way to learn that hash key. That wasn't true for the old hashing algorithm - even after hash randomisation was added, you could still theoretically examine the runtime behaviour to infer the hash secrets, and then use those to craft inputs that were highly likely to provoke a high rate of hash collisions, and hence induce quadratic behaviour in algorithms using sets and dictionaries (thus defeating the purpose of adding hash randomisation in the first place). As far as we're aware nobody ever actually created a realistic attack on the original hash randomisation algorithm, but switching to Siphash meant even that theoretical risk went away.
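
A rough way to see the "keyed" part in practice, as a sketch only: str hashes depend on the per-process hash key, which can be pinned with the `PYTHONHASHSEED` environment variable, while int hashes don't use that key at all. The helper name `hash_with_seed` is made up for this example, and which keyed algorithm is actually in use depends on the CPython version and build.

```python
import os
import subprocess
import sys


def hash_with_seed(expr, seed):
    """Evaluate hash(<expr>) in a fresh interpreter with PYTHONHASHSEED=<seed>."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({expr}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Same key, same str hash; a different key should give a different str hash.
str_seed_1 = hash_with_seed("'spam'", 1)
str_seed_1_again = hash_with_seed("'spam'", 1)
str_seed_2 = hash_with_seed("'spam'", 2)

# Int hashes are not keyed, so the seed makes no difference.
int_seed_1 = hash_with_seed("12345", 1)
int_seed_2 = hash_with_seed("12345", 2)

print(str_seed_1 == str_seed_1_again, str_seed_1 != str_seed_2, int_seed_1 == int_seed_2)
```

That unkeyed int hashing is also why the collision experiment above still works after hash randomisation was added.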

junior narwhal
#

wow is it scary to release based off of an old commit without a lock file

west basin
#

Is there anything similar to a strict xfail for GA? "Mark this job as passed if it fails, fail if it succeeds"

#

(I want to ignore free-threaded builds of Python for as long as package installation fails)

mild pollen
#

Are you using a shell step for the installation? If so, you could invert the command status with !.

west basin
#

the installation's performed by nox

#

it might be a good idea to split it out if I can

robust sandal
#

Does anyone know how to import multipart.py from multipart/__init__.py (replacing the module with the package)? I've now done this two ways: once with a custom .pth and loader, but it turns out Google Colab forces you to run !pip install python-multipart in the notebook, which means custom .pth files aren't run, as the process is started before the install happens. In the second attempt I'm basically loading multipart.py by path, but that's not going to work if it's not loading from a file system, like in a zipapp, so if there's a better way to get this, I'd be open to suggestions! Trying to fix a long-standing package name collision and calm a heated fight between multipart (now in the CPython docs as a CGI replacement) and python-multipart (fairly popular). https://github.com/Kludex/python-multipart/pull/168
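
For reference, the "load by path" approach looks roughly like the sketch below. This is not the code from the linked PR; `load_module_from_path` and the `_demo_multipart` module name are made up for illustration, and it has exactly the limitation mentioned above (it needs a real file on disk).

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(name, path):
    """Load a single-file module from an explicit filesystem path."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports inside the module can find it.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file standing in for multipart.py.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "multipart.py"
    target.write_text("MARKER = 'single-file module'\n", encoding="utf-8")
    loaded = load_module_from_path("_demo_multipart", target)

print(loaded.MARKER)
```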

lapis solstice
ionic tulip
#

is there any practical way to prevent certain packages from being part of a dependency tree

i recently ran into situations where i want to stop others early from trying to use specific packages as the first starting point for a solution (for example types-confluent-python is a 3rd party package and packages some wrong types for confluent-kafka)

im not aware of an easy way to starve them out, short of using constraints that make any version of it forbidden

glacial bear
#

I also needed that recently, and was almost certain that there must be a way, but could not find anything.

ionic tulip
#

im not aware of a direct way to enforce conflicts with other packages

jaunty marlin
dreamy hatch
ionic tulip
shadow zealot
#

I tend to agree, we should have something better in stdlib and just deprecate all three of them

dreamy hatch
#

please comment on the thread 🙂

ionic tulip
#

hmmm, as i don't have an alternative to show for it, i don't feel like that's going to help

having an idea for one, and showing one are different pairs of shoes

onyx spindle
#

also, having 4 parsers in stdlib sounds kinda insane

glacial bear
kind moon
normal ibex
onyx spindle
ionic tulip
onyx sphinx
normal ibex
fierce horizon
# shadow zealot I tend to agree, we should have something better in stdlib and just deprecate al...

Is something powerful yet less constrained possible? Command line semantics are purely convention-driven. You’ll always find things that are either explicitly valid or work by accident in some implementation that don’t work in another.

I think it’s insane to have CLIs that allow things like having multiple arguments to an option (like cmd -f 1 2 3 evaluating this as [1,2,3] being passed to -f) or having option values that start with dashes being valid anywhere other than in cmd --long-opt-with-equals=-v-a-l- or cmd -- -v-a-l-. But apparently that’s what some people in that thread want?
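
For context, this is roughly how the multi-value form is spelled in argparse today: a minimal sketch using `nargs="+"`, with the `cmd` program name and the `int` conversion chosen just for the example.

```python
import argparse

parser = argparse.ArgumentParser(prog="cmd")
# nargs="+" makes -f consume one or more following values into a list.
parser.add_argument("-f", nargs="+", type=int)

args = parser.parse_args(["-f", "1", "2", "3"])
print(args.f)
```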

mild pollen
#

I want both of these things. 🙂

lapis solstice
fierce horizon
mild pollen
#

As for option values starting with dashes, well... I want -f "$x" to be interpreted in the same way no matter what the value of x is. (Again, I might be misunderstanding your position here.)

junior narwhal
#

I think it's pretty wild that we're trying to bring back one standard library module that is explicitly for niche purposes and the other just to avoid a deprecation/behavior change/bug fix cycle in the current most popular module

fierce horizon
# mild pollen Maybe I misinterpreted what you said. I like the support for the `-f 1 2 3` synt...

OTOH, an option taking a variable number of arguments is madness.
yes, that’s what I meant, sorry for being unclear

As for option values starting with dashes, well... I want -f "$x" to be interpreted in the same way no matter what the value of x is. (Again, I might be misunderstanding your position here.)
yeah, with a fixed number of arguments, this is well defined too of course. I’d still argue that if -f/--file’s single argument can possibly start with -, you should specify it as "--file=$x".

I think argparse can handle all these cases, so I agree with you and @ofek: If there are any real issues in argparse (i.e. issues that don’t involve ambiguity), they should be fixed instead of bringing back the old C-like stuff

mild pollen
#

If there are any real issues in argparse (i.e. issues that don’t involve ambiguity), they should be fixed instead of bringing back the old C-like stuff
Mind you, I haven't said that. 🙂 Honestly, I've no idea what the best strategy for dealing with argparse is. I haven't personally looked at its internals, but from other people's accounts it seems pretty fundamentally broken.

shadow zealot
#

As many issues as I have with argparse, IMO this is not a wrong thing to do. It is also far from unique to argparse.

mild pollen
#

If it is to be rescued, it seems like it would have to be significantly reworked, with some features removed.

mild pollen
shadow zealot
#

There are two choices if $x starts with a dash, either treat it as an option or a parameter to the flag. Neither is wrong, and there are (non-Python) popular tools that do either.

mild pollen
#

I'd have to disagree.

shadow zealot
#

You can, but that’s just how existing tools are.

#

What I’m saying is it’s probably not a good idea to argue on this particular thing against argparse

lapis solstice
#

I'm of the view that continuing to describe the module that backs click and Typer as deprecated is a fundamentally bad idea (that would be optparse, not argparse, for the reasons given in the click docs).

#

I do think folks limited to just standard library modules should keep preferring argparse, though.

fierce horizon
mild pollen
#

Yes:

$ cat test.py
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--file')
args = parser.parse_args()
print(f'{args.file=}')

$ python test.py --file --foo
usage: test.py [-h] [--file FILE]
test.py: error: argument --file: expected one argument
#

And I do think it's a bug. It's just that from other people's statements I gather that the bug is pretty fundamental to how argparse works, so fixing it is not going to be easy.
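
A small sketch of the two spellings side by side, reusing the parser from the snippet above: the space-separated form is rejected, while attaching the value with `=` gets it through. The `build_parser` helper is only there to keep the example self-contained.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="test.py")
    parser.add_argument("--file")
    return parser


# Attached form: the value is unambiguous even though it starts with dashes.
attached = build_parser().parse_args(["--file=--foo"])

# Separate form: argparse treats "--foo" as another option and exits with an error.
try:
    build_parser().parse_args(["--file", "--foo"])
except SystemExit as exc:
    separate_exit_code = exc.code
else:
    separate_exit_code = None

print(attached.file, separate_exit_code)
```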

junior narwhal
lapis solstice
#

Depending on click still transitively depends on optparse, at least for now. And if anyone else wanted to write their own click equivalent, optparse would still be a better starting point than argparse (for the same reasons). Projects that are happily using optparse will likely gain few practical benefits from migrating to argparse instead (and may cause regressions in their command line if they try to do so).

onyx spindle
#

pip is still on optparse for similar reasons

junior narwhal
#

I got nerd sniped apparently so if I have time at the end of the quarter of working on pip I'll probably work on vendoring click and revamping the CLI. unlikely because that sounds like a lot of work but we'll see

onyx spindle
#

is it worth it? I mean, optparse works fine and there is a lot of code around it in pip already. having to vendor yet another package and rewrite something that works just fine sounds like a lot of unnecessary work. Also, click API does not match the class-based way the pip CLI works now, so that would be another level of refactors. And personally, the decorator-based way of click is just ugly when you have to use more than 2-3 decorators

junior narwhal
#

if it means using something that's maintained then I think it's worth it, of course only marginally impactful as you mentioned unless it's actually removed from the standard library (hopefully will be eventually)

onyx spindle
#

I'd say that pip pretty much locks optparse in stdlib

dreamy hatch
#

there's no question of removing optparse. it's "soft deprecated" which means:

A soft deprecated API should not be used in new code, but it is safe for already existing code to use it. The API remains documented and tested, but will not be enhanced further.

Soft deprecation, unlike normal deprecation, does not plan on removing the API and will not emit warnings.

https://docs.python.org/3/glossary.html#term-soft-deprecated

junior narwhal
glacial bear
#

Quick question, if a library aims to support all non-EOL versions of Python, is Programming Language :: Python :: 3 fine as a classifier or is there a reason to actually list all of the individual version classifiers?

onyx spindle
#

I'd say it's fine

#

some package managers (like Poetry) might add classifiers for you during package build, based on your requires-python setting

junior narwhal
kind moon
#

yes

glacial bear
#

I do set [project] requires-python but I use flit and it does not generate classifiers AFAIK. Managing them manually is annoying, I always forget to update those when updating the test matrix.
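
One way to keep the list from drifting is to generate it from the same numbers the test matrix uses. A minimal sketch follows; `version_classifiers` is a hypothetical helper, not something flit or Poetry provide.

```python
def version_classifiers(minimum_minor, latest_minor):
    """Build the Python 3.x trove classifiers for an inclusive range of minor versions."""
    base = "Programming Language :: Python :: 3"
    per_version = [f"{base}.{minor}" for minor in range(minimum_minor, latest_minor + 1)]
    return [base, *per_version]


# Example: everything from 3.9 through 3.13.
classifiers = version_classifiers(9, 13)
print("\n".join(classifiers))
```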

onyx spindle
junior narwhal
#

does that happen even when that field is not marked as dynamic?

onyx spindle
onyx spindle
junior narwhal
onyx spindle
junior narwhal
#

oh interesting, if that's the case then it definitely violates the spec

onyx spindle
#

that might actually be something we should tweak before release if that's true 🤔

#

ok, just checked, when classifiers are defined in [project].classifiers they are not enriched

#

so all is fine

lapis solstice
#

I use the classifiers to record "this is what I test in CI", but I don't think there's any consumer that genuinely pays much attention to them.

mild pollen
#

Yeah, I don't know if there's much point to Python version classifiers.

fierce horizon
junior narwhal
#

yeah definitely I will (if I actually have time for this in the end)

onyx spindle
#

I think click was considered in that discussion

onyx spindle
fierce horizon
#

That explains why it rankles with me. I'm an API guy. When I try using click I immediately start writing abstractions to work around it.

And then I ask myself why I use it at all and ditch it. Every time.

junior narwhal
#

that's so interesting, to me Click's API (especially the decorators) is beautiful and allows for a very nice separation of CLI configuration vs business logic

onyx spindle
lapis solstice
#

I guess I think of click & Typer as Python-based DSLs for describing CLIs, that then call regular Python code to do the heavy lifting. I definitely wouldn't call https://github.com/lmstudio-ai/venvstacks/blob/main/src/venvstacks/cli.py pretty, but it lets me describe exactly what I want in a way that keeps options consistent across the different subcommands (with a bit of help from the test suite to ensure names don't get out of sync)

junior narwhal
onyx spindle
#

meanwhile Cleo:

#

but to be fair, click exposes their whole API on top level, cleo doesn't

frank shore
# onyx spindle Decorators are ok, but not when you have 300+ lines of code of only decorators

While I use click too, I do think decorator-based setup is a performance problem. With lazy imports being rejected, it means when your app starts up, all those decorators have to execute, just adding to Python-based application startup woes. In a $job-2 we had a Python CLI that everybody used many times a day. Startup performance was the number 1 complaint and a lot of engineering went into old school lazy imports. It helped, but not enough to forestall the inevitable of the CLI being completely rewritten in Rust.
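
By "old school lazy imports" I mean the pattern sketched below: move the import into the function that needs it, so the cost is only paid when that code path actually runs. The `render_report` function is made up for the example, and `json` just stands in for a genuinely heavy dependency.

```python
def render_report(data):
    """Serialize a report; the import is deferred until the first call."""
    # Importing here instead of at module level keeps CLI startup cheap
    # for subcommands that never render a report.
    import json

    return json.dumps(data, sort_keys=True)


output = render_report({"b": 1, "a": 2})
print(output)
```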

lapis solstice
lapis solstice
fierce horizon
#

I love clap-derive.

Being able to define a fully typed data structure together with the CLI that produces it is wonderful.

Typer breaks the illusion of doing that enough that it isn't on the same level for me, and click doesn't have its validation integrated with typing at all.

junior narwhal
junior narwhal
junior narwhal
#

yes

frank shore
#

Nice

junior narwhal
#

I also have it set up for extensions like git supports

lapis solstice
#

msgspec could be useful in venvstacks in general... (the metadata handling at the moment is seriously clunky)

onyx spindle
lapis solstice
onyx sphinx
#

It should if it doesn't

lapis solstice
#

Typer & typing.Unpack

silk jungle
lapis solstice
#

sigh Just came across this delightful snippet in some module activity guidelines for my Master's course: sudo pip3 install paho-mqtt (opinions may have been expressed on the class discussion forum for that activity...). It's not even a case where there's no corresponding Debian package (sudo apt-get install python3-paho-mqtt works fine).

ocean hedge
#

practicality over purity :p

onyx spindle
#

since when is messing up your global site-packages practical?

ocean hedge
#

from a user's perspective, it accomplished their goal in the short term?

onyx sphinx
#

They will have to change it for new distros that prohibit changes to global site packages

ocean hedge
#

that sounds like long term thinking

onyx sphinx
#

This is why they have feedback forms at the end of degree courses

shadow zealot
#

At this point I thought it doesn’t even need to be new distros? Many tutees would have trouble running that command today.

silk jungle
#

Man, I do kinda regret setting my own CA for mTLS. It's cool, but an authentication proxy with GitHub OAuth would be easier to use :P

#

I forgot that I set this client certificate to last only a year. I don't remember very well how to generate a new one (although I think I have a script somewhere).

fierce horizon
junior narwhal
#

I absolutely despise when installing some software on Windows wipes out half of your PATH

#

which is probably red flag number 9000 that I should continue avoiding the JS ecosystem

onyx spindle
#

Despite what Microsoft might say, Windows is second class in dev space. It might be king in casual space, but it's "meh" for dev on a good day

junior narwhal
#

in some regards but not about PATH semantics as I just mentioned, modifying the environment variable literally only works as expected on Windows

#

other operating systems use a mix of shell configurations that are hard to debug, on Windows you have the registry which is amazing

lapis solstice
lapis solstice
junior narwhal
#

wow this is perhaps the hardest software epic fail I've seen in some time. I went looking and actually the tool is written in Rust so it's not a JS issue per se and I found the code here https://github.com/volta-cli/volta/blob/v2.0.1/src/command/setup.rs#L236-L266

they read the variable using the registry properly but then write the variable using a command in a subprocess that limits the length 🤦‍♂️

onyx spindle
#

Some of the units feel like they were written 20-30 years ago and barely touched in the intervening decades.
Sounds like every CS university course I have seen...

shadow zealot
#

Speaking from experience developing on Windows: it is an excellent dev environment on its own. It’s a bad space for devs because the devs make it bad.

#

Dev experience for Python is good if you understand the OS differences because people like Steve Dower made it good. Also good for Rust because the Rust devs treat each platform mostly equally from the beginning. Some other languages… let’s say it’s easy for users to tell if devs are only pretending to understand cross-platform.

clear wigeon
#

in my experience half the problems arise when people come from other OSes and assume their understanding of linux / macos carries over to windows, and then basically operate under that assumption

#

when you take a moment to understand how windows works, then it can be a fine dev environment, just like macos, or linux

lapis solstice
#

The two that have most commonly tripped me up in my current project: cp1252 (latin-1-ish) as the default text encoding (but that caught some genuine non-UTF-8 locale bugs on other platforms too), and the lack of conventional symlink support (so venvs work differently)
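
A tiny sketch of the encoding half of that: the same UTF-8 bytes read back as cp1252 turn into mojibake, which is why passing `encoding="utf-8"` explicitly is the usual defence. The file name and sample text are arbitrary.

```python
import tempfile
from pathlib import Path

text = "caf\u00e9"

# What a cp1252 default does to UTF-8 bytes: the two-byte sequence for the
# accented letter is decoded as two separate characters.
mojibake = text.encode("utf-8").decode("cp1252")

# Being explicit on both the write and the read avoids the platform default.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "note.txt"
    path.write_text(text, encoding="utf-8")
    round_tripped = path.read_text(encoding="utf-8")

print(repr(mojibake), repr(round_tripped))
```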

junior narwhal
#

part of it is engineers coming from other platforms, part of it is the shell experience being terrible (cmd is bad and PS is difficult), part of it is poor defaults like the carriage return line ending (and encoding like Alyssa just mentioned), part of it is the difficulty reproducing your ideal developer environment (I've looked and I simply don't know how people set up a Windows box automatically to their liking), etc. there's a lot

#

lack of good/well-known/seen-in-tutorials tools like ls is another big one

shadow zealot
#

IIRC venvs work differently only partly due to symlinks (you can have proper symlinks in dev mode nowadays); the rest is down to how DLLs are resolved against the exe. But yeah, specifically how Windows needs exe shims is quite annoying

lapis solstice
#

Yeah, there are extra differences in how CPython itself starts up. The differences are less opaque these days than they used to be (due to the sys.path initialisation being written as regular frozen Python code instead of a tangled mess of conditionally compiled C code), but they're still there.

fierce horizon
fierce horizon
lapis solstice
fierce horizon
#

I see! Yeah, if that’s possible that sounds like a great idea!

silk jungle
#

I'm sure you're inundated with notifications, so you may have missed this

kind moon
#

I’m also looking to get more involved with being helpful around here

junior narwhal
glacial bear
marsh kite
#

Someone's cache/mirror stopped being used probably

hexed briar
#

Don't mind me, just sitting here happy that our moderation rules shut down a random NFT bot that tried spamming here. 🙃

marsh kite
#

The number of times automod in other discords shut down scams for steam gift cards is disheartening when you realize that most accounts that post these have been hacked.

junior narwhal
shadow zealot
#

Most people only recognise two states: Either it works, or it does not. If you don’t magically fix all the problems, you’ve done nothing and nothing changed.

onyx spindle
#

Often those who shout loudest that something is broken, do nothing to help fix that

fierce horizon
#

That world is so complex. Transpilers, bundlers, source maps and module systems most of which are in a transitory state that never seems to end. I understand why that continues to be confusing and sometimes just doesn't work.

Here, however, 2.7 is dead, there are no transpilers, bundlers, or source maps, or module system transition. So it's harder for me to see why people have problems here.

lapis solstice
#

Concrete recent example: https://mastodon.social/@webology/113653520596020173

From a tooling point of view (given how Jeff eventually made the problem go away), I assume uv publish was trying to be helpful and automatically infer useful metadata based on the repo contents. From a user point of view, the end result was a mysteriously failed upload where the suggested diagnostic steps weren't actually helpful (since the incorrect metadata was being added implicitly rather than explicitly).

At any given point in time, there will probably be some transitional rake lying around for people to step on.

#

Heh, prompted by the #pip thread on browsing repotrends graphs, it's clear Ezio did a good job with metadata preservation on the CPython issue import: https://www.repotrends.com/python/cpython

Apparently folks also managed to keep ahead of the issue opening rate for the better part of two years (although the timing of that decline makes me wonder if the issue opening rate slowed down rather than the issue closing rate going up).

dreamy hatch
#

There's also a "New issues and pull requests" chart showing a steady increase of new issues per month, so I put it down to good triaging work

fierce horizon
# lapis solstice Concrete recent example: https://mastodon.social/@webology/113653520596020173 F...

Bugs happen. I don’t think his “Python packaging is so frustrating” is an accurate summary of what happened there.

  1. the CLI reported the server response
  2. the server said what’s wrong
  3. what was wrong was something implicit that the packaging backend should handle transparently, so no need to invest into more in-depth user-friendly errors since that error message was pretty damn friendly for something that’s an internal bug

I personally went away from setuptools long ago since it’s so complex and carries around so many legacy modes of operating that I don’t think it can prune itself to simplicity in the foreseeable future.

So I think what happened is that he went with a very complex build backend without needing it. People with simple packages shouldn‘t default to setuptools, and people should feel empowered to hop into some chat room and ask for help!

lapis solstice
#

That wasn't what happened. Jeff's an experienced Python dev, went to publish something the same way he had published several other things, and it didn't work (through no fault of his own).

fierce horizon
#

That’s in no way incompatible with what I said

lapis solstice
#

The fact we can help root cause what went wrong, doesn't eliminate that initial frustration of "Oh look, it broke, again".

fierce horizon
#

Yes, read my last paragraph in the long message

lapis solstice
#

The backend turned out not to be setuptools, so my initial guess was wrong (that got clarified later in the Mastodon thread)

#

Venting is also a different state from actively seeking help (if I hadn't already interacted with Jeff many times, I wouldn't have replied, since his post was fairly clearly just venting frustration rather than seeking assistance)

fierce horizon
#

people should feel empowered to hop into some chat room and ask for help!

#

That’s the part of my message that addresses that

#

I think

  1. people should default to simple, well maintained build backends (there is a reason you guessed setuptools, even if you ended up guessing wrong)
  2. people should hop into a chat room when they feel frustration coming up instead of banging their head to the wall

Both of these can be encouraged

lapis solstice
#

No worries on that front where Jeff is concerned (I don't think he's been an open source contributor longer than I have, but it's long enough that I don't know that for sure)

#

Even when you're experienced, "I've found a bug" isn't your first reaction - it's to try and figure out what you're doing differently from the last time you did whatever it is you're doing. (Experience actually reinforces that reaction, since you're usually right!)

fierce horizon
#

Hm, if I specified standards-compliant metadata and got that error, I’d guess bug, but then again I probably spent more time learning about Python packaging than most.

lapis solstice
#

Yeah, it's different in areas that we work on all the time (I'm far more likely to assume "bug" when doing something strange in venvstacks than I would when packaging a regular Python library).

fierce horizon
#

Actually, why are you thinking that it’s not that setuptools bug?

#

I wouldn‘t guess that uv publish does anything other than invoking the build backend and uploading what it gets

#

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

fixed my issue.

relatable

foggy elk
#

uv publish uploads the metadata as it got it from the build backend through .dist-info/METADATA, no inference involved

#

I don't thinking I'm using setuptools (I don't see it listed when I pip list) but I'm seeing similar issues.

The build backend doesn't show up with pip list; given that you implicitly get setuptools if you don't declare a [build-system] and setuptools is the only build backend that i'm aware of having this bug, this does sound like an implicit setuptools problem

#

I thought about adding warnings when there are unknown fields in METADATA to catch such problems with the next metadata version, but this would probably lead to more problems when build backends are already using custom fields

lapis solstice
#

The amount of complexity people are willing to add to their package build systems solely in order to derive their version numbers from their version control tag history continues to boggle my mind.

Yes, in-repo versions bring their own set of problems, but they're so much more manageable by comparison, since they only affect your release process, not everyone who ever tries to build your project from source (in who knows what kind of wacky environment).

lapis solstice
#

(I ventured deep into this territory today, due to the whole "VCS refs can't be hashed, but most VCS tag based dynamic versioning schemes break if you try to build them from a source tree tarball instead of a published sdist" problem. My least awful workaround ended up being to do an interim fork of the project and switch it to static versioning with a local version identifier appended so it could be built from a source tree tarball)

junior narwhal
#

I would argue however that those should not be used and rather the source distribution on PyPI should be preferred since maintainers actually control that

#

for large repositories it can save a lot of bandwidth

lapis solstice
#

Absolutely, this only comes up when you need to work around a release not existing yet (which means an unversioned commit archive rather than an sdist)

#

The directory trick only works for tag archives, not commit archives.

#

(if there was a tag, it would imply a release, and I wouldn't be perpetrating these shenanigans in the first place)

junior narwhal
dreamy hatch
#

I wouldn't bother adding extra checksums, use the PEP 740 attestations instead for PyPI things and GitHub's attestations for other things

ionic tulip
lapis solstice
#

For a potentially easier path towards toml/yaml support in the future, you may want to poke around at @junior narwhal's msgspec-click: https://ofek.dev/msgspec-click/usage/#example

Using msgspec for the data model description also provides tentative answers for some of your other open questions.

junior narwhal
#

the maintainer of msgspec still hasn't released a version supporting Python 3.13, people keep asking in issues and I keep trying to follow-up with an email thread I had with the person to no avail

#

they had personal issues IRL to handle but now there is just silence unfortunately

#

if there is no release and still no correspondence in a few months I'm going to try adopting/claiming the package, I forget the process but whatever our process is for that

lapis solstice
dreamy hatch
jaunty marlin
junior narwhal
#

it wouldn't happen right now obviously, but I said in a few more months of no correspondence

#

I think it's entirely reasonable for let's say 6 months to go by before such a request, I don't think years should be the criteria although admittedly I haven't read the link that Hugo posted above yet

#

(also to be extra clear, my emails with Jim have been offering to become a maintainer and assist with releases and whatever else but no responses)

jaunty marlin
#

The PEP specifies >12 months and other criteria

junior narwhal
jaunty marlin
#

I see

#

I've been in touch with him recently, he's not disappeared — just busy with other things.

junior narwhal
#

it's a shame Pydantic is so slow that it's unusable for CLIs and serverless scenarios currently, I'm really bummed about that

jaunty marlin
#

😬 yeah

#

I haven't tried it since the Rust port tbh

#

(which is sort of funny)

junior narwhal
#

UV can probably perform the resolution of dependencies for a medium-sized project faster than import pydantic plus model definitions

ionic tulip
west basin
lapis solstice
dreamy hatch
#

and because tools are often rewritten in Rust to make them faster

west basin
#

(I assumed that to mean that it’s written in a “funny” way)

#

(Poor reading comprehension :)

jaunty marlin
#

Yeah it's just ironic

#

Overall not so surprising though, I write a lot less Python these days

silk jungle
#

Does anyone know of a tool or script to fetch say like the timings of the last 20 GHA workflow runs? I'm trying to compare my CI times on main to when I add dev drives to it.

#

This feels like a common problem, and unfortunately github's own usage metrics simply aren't useful in my situation.

onyx sphinx
silk jungle
#

Nothing that a custom script can't deal with. Now I just need to scope this to the windows jobs.
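A sketch of what such a custom script might look like, using only the standard library and the GitHub REST endpoint for listing workflow runs. The function names are hypothetical, and "duration" here is approximated as updated_at minus run_started_at, which is an assumption and not an official metric.

```python
import json
import urllib.request
from datetime import datetime

RUNS_URL = (
    "https://api.github.com/repos/{owner}/{repo}/actions/runs"
    "?branch={branch}&per_page={count}"
)


def parse_time(stamp: str) -> datetime:
    # GitHub returns timestamps like "2025-01-01T12:00:00Z".
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def run_durations(payload: dict) -> list[tuple[str, float]]:
    """Return (workflow name, seconds) for each completed run in the payload."""
    durations = []
    for run in payload["workflow_runs"]:
        if run.get("status") != "completed":
            continue
        started = parse_time(run["run_started_at"])
        finished = parse_time(run["updated_at"])
        durations.append((run["name"], (finished - started).total_seconds()))
    return durations


def fetch_runs(owner, repo, branch="main", count=20, token=None) -> dict:
    """Fetch the most recent workflow runs for a branch (network call)."""
    url = RUNS_URL.format(owner=owner, repo=repo, branch=branch, count=count)
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Usage would be something like run_durations(fetch_runs("some-owner", "some-repo")). Scoping to individual jobs (the Windows ones, say) would presumably need the per-run jobs endpoint, which isn't shown here.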

silk jungle
#

I probably should've used the graphQL API for this...

#

This is coming along nicely though. I've wanted something like this for a while.

#

Ah, the necessary data doesn't seem to be exposed in the GraphQL API. Also wow their new explorer is not something I can use easily.

dreamy hatch
silk jungle
#

I was going to contribute to Refined GitHub but my node/npm toolchain is too old. Ugh. I'll get around to it later.

junior narwhal
junior narwhal
silk jungle
#

https://datatracker.ietf.org/doc/html/rfc8962

If this work is standardized, IANA shall set up a registry for criminal networks and addresses. If the IANA does not comply with these orders, the Protocol Police shall go and cry to ICANN before becoming lost in its bureaucracy.
hahahaha

kind moon
#

How long is the wait?

#

I contacted my localhost but it's been months and I still haven't heard back softFeels

silk jungle
#

have you tried asking /dev/null?

#

they never miss a message

kind moon
#

I'll send them a message right now, thanks!!

lapis solstice
#

"All your networks are belong to us". Now that kicked off a nostalgia trip 🙂

onyx spindle
#

what does one do when they have huge backlogs on existing projects? start another one

ionic tulip
dreamy hatch
#

but you know what they say, github stars are a really good measure of how many stars a repo has

silk jungle
#

happy new years folks! ✨

cosmic dock
silk jungle
#

GitHub has been making a ton of UI changes/feature rollouts... I just got the new issues UI, and... hmm

#

The new features seem nice, but the UI is harder to read at first glance. Some information is straight up missing, while the rest is more muted.

silk jungle
#

Time to get involved on the GitHub preview feedback discussions, I guess

silk jungle
#

Hopefully they at least tweak the overly aggressive label truncation in the issue timeline. They'll probably ignore my other feedback, but /shrug.

mighty flower
junior narwhal
#

the only change that I disagree with is on PRs there is no longer a button to run CI for new contributors, it's really inconvenient

dreamy hatch
#

yeah, it's annoying to switch back to the old one

onyx spindle
junior narwhal
#

the latter, you have to go to the actions list page now and click the button for each job workflow

junior narwhal
shadow zealot
#

Not very optimistic tbh. I hear reports people are already flocking to xiaohongshu which is arguably worse.

vast wren
#

Feels like a Patriot Act 2.0 tbh

timber sphinx
fierce horizon
junior narwhal
#

seems reasonable to me, for example we wouldn't let a broadcasted television station be owned by an adversary

shadow zealot
#

Except you only ban one station while like 10 others still broadcast just fine

junior narwhal
#

as far as I know the ban would apply to all large social media companies

#

I think if there is now an awareness in the zeitgeist that social media networks can negatively impact a populace then it makes sense to reduce the impact of networks that have the potential to deliberately propagandize

#

in any case, many people that are currently against the ban are going to flip in the coming weeks because the new president and allies like Elon Musk are also against the ban and so they will take the opposite opinion

vast wren
#

When the ACLU, EFF, and Knight Foundation are all against a thing, that feels like a pretty good sign

#

AFAIK RT America was never attempted to be banned (it ended up shutting down because a bunch of private companies chose to no longer air it), and that was basically just wholly a propaganda vehicle for Russia

junior narwhal
#

we aren't banning foreign media however, we are banning foreign social media specifically from entities that are considered hostile to us

#

that's quite different

#

I suppose the crux of the issue is whether or not one thinks social media algorithms have the ability to negatively impact populations. if one doesn't think that is possible or the risk negligible, then I would understand why the ban would seem like a bad idea

vast wren
#

The first amendment doesn't really have a "well if someone could do something bad" loophole

#

Also we're apparently perfectly fine letting foreign (and domestic) entities use social media to negatively impact the population, as long as they're headquartered here

junior narwhal
#

I'm curious your opinion on the "crux of the issue" I mentioned above

vast wren
#

so the US courts do not agree that the first amendment does not apply in this case; the lower courts have just stated that they think the ban is constitutional under the first amendment

#

See the opinion in TikTok Inc v Garland: https://casetext.com/case/tiktok-inc-v-garland

Having concluded that TikTok has standing, we need not separately analyze whether the User Petitioners have standing to raise the same claims. See Carpenters Indus. Council v. Zinke, 854 F.3d 1, 9 (D.C. Cir. 2017) (explaining that "if constitutional standing can be shown for at least one plaintiff, we need not consider the standing of the other plaintiffs to raise that claim" (cleaned up)).

#

Tiktok in the US is distributed by Tiktok LLC, which is a US company registered in Delaware, and thus is entitled to the full protections granted to any other US company, it happens to be a US company owned by (through a number of other companies) a China based company.

#

Or more plainly, (again from the court's opinion in TikTok Inc v Garland):

We conclude the Act implicates the First Amendment and is subject to heightened scrutiny.

vast wren
#

The government thus far has provided no evidence that China has done anything bad, and they've admitted that their banning is entirely based on the fact that they could.

junior narwhal
vast wren
#

(to be clear, I'm ignoring the grandstanding done by congresspeople during the congressional hearing, because congressional hearings for things like this are like 70% just ways for congresspeople to get sound bites into the media and have no requirement that their questioning is based on fact in any way)

#

The government's arguments in court are much more representative (IMO) of what they are actually able to produce as factual evidence

junior narwhal
vast wren
# junior narwhal I understand what you're saying, but I am saying that it is happening and was ju...

So both of those things suffer from the same problem, they observe something, then they leap to a conclusion about why that observed thing happened. They admit as much in their conclusion of the Timebomb document:

Given the research above, we assess a strong possibility that content on TikTok is either amplified or suppressed based on its alignment with the interests of the Chinese Government.

That's fine for a study, but a proper free and just nation should require something more than "well there's a possibility that ...." before starting to strip away constitutional rights. They have (or at least should have) the requirement to prove both that a bad thing is happening and that it is having the impact that they believe.

#

The lawfare blog has an interesting take on this too

#

in which they more or less start out conceding the argument that tiktok is being manipulated, but then link to studies showcasing that people generally aren't that swayed by social media algorithms, and if their feed keeps serving them content that doesn't already align with their beliefs, they tend to just.... stop using that app.

vast wren
#

https://www.science.org/doi/10.1126/science.abp9364 - Where some users were given a chronological feed and some were given an algorithmic one on facebook, and the tl;dr was that as they were exposed to a more diverse range of opinions, rather than just the stuff they already believed, they ended up using facebook less, and there was no noticeable impact on polarization or political knowledge between the two groups.

https://tnsr.org/2024/03/from-panic-to-policy-the-limits-of-foreign-propaganda-and-the-foundations-of-an-effective-response/ - a long article, but links to a lot of other studies, and the general thrust being that "propaganda is generally most effective at providing people with rationalizations to ideas that they already had, rather than at giving them new ideas on their own"

junior narwhal
fierce horizon
#

Next step: Elon Musk unveils new short video function on X with “Free Speech”*
*as long as you’re not critical of Elon or anyone helping him to make the US into his Corporatocracy
… or at least that would be the case if there were any engineers left at X.

devout elbow
fierce horizon
#

A platform’s moderation proved to be woefully inadequate to curb the platform’s use to coordinate and spread disinformation
How is that in any way unique to TikTok lol

vast wren
#

which happens on every platform tbh, and would be a reasonable thing to try to prevent, but banning one random app isn't likely to do much

fierce horizon
#

exactly what I just said 😄

#

The CIA has replaced democratically elected governments in significantly more direct ways, and the Republican party’s survival can only be attributed to the use of social media to erode the belief in truth itself.

I can’t believe the supreme court can see us down here from their mountainously high horses, pointing fingers towards a teenager platform while they decide what the law is based on what boomers on Facebook want it to be.

junior narwhal
#

I think it's rather uncharitable to believe that the Supreme Court judgments are determined by "boomer" opinions on Facebook

fierce horizon
#

I was being facetious. In full, educated honesty: The conservative judges on the supreme court make it have its most partisan composition in living memory (and probably quite a bit beyond). The source is of course not (directly) Facebook, but the cases they took up are clearly a coordinated mission to destroy decades-old liberal staple case laws and replace it with populist reactionary ones.

The fact that this is what it does with bipartisan support is tragic.

junior narwhal
#

which rulings in particular do you disagree with?

#

or more precisely, which rulings do you think that the outcome was in direct opposition to a robust reading of the law e.g. a fully politically motivated ruling?

#

I disagree in principle with some of the rulings, for example Roe v Wade, but when I read the rationale it makes sense based on legalese

#

and at the end of the day that is the purpose of the Supreme Court, only (mostly) to enforce constitutionality. if modern times require an update to the Constitution then that's a different branch of government

devout elbow
vast wren
devout elbow
#

I agree that many social media platforms are problematic and I hope the EU makes good progress in limiting them robustly - or outright banning them if they do not co-operate.

vast wren
#

As a reasonable demonstration to how silly attempting to ban a single app for a systemic issue is, people are flooding to Red Note now instead of TikTok, and on every axis that the law used to justify banning TikTok, RedNote is worse.

junior narwhal
#

doesn't xiaohongshu translate to "Little Red Book" (not sure if it's actually a reference to Mao's)? I'm not sure where Red Note came from but I see that now in some circles

vast wren
#

afaik it does translate to that, but when they translated the app into english they translated it to RedNote

devout elbow
#

Another topic... hmmmm I wonder how the logistics work for providing light lunch and refreshments for remote conference attendees.

vast wren
#

$50 doordash coupon here we go

junior narwhal
#

speaking of doordash, sometime in the past few months they changed their map service to something else and the routes are absolutely horrible now. the driver will be almost right next to my place and the path will still circle around half the town

#

mapbox apparently, not sure what they used before

devout elbow
fierce horizon
analog oyster
# devout elbow They used to use google: https://careersatdoordash.com/blog/scaling-geospatial-i...

Oh cool. My field. I’d guess they’re still using the Google Maps Direction API in some form since it’s mentioned on their Dasher Support page. There’s a reddit post from 3 months ago, but it’s just a commenter that’s claiming it’s still Mapbox. There’s a 2017 medium post from MapBox about it, but that careers page is more recent.

Check out that simulation architecture diagram (really cool stuff). That article is from 2020. I’ve heard a lot about what they’re trying to do with this, and my guess is that more recently they’ve started making automated, on-demand modifications to their directions, and some users are probably seeing issues with that if there are tricky traffic patterns or some other tuning/helpful data. FWIW I’m working on very similar stuff right now. So it’s probably them and not some vendor’s API, but idk for sure.

junior narwhal
#

@analog oyster example, it's right outside

analog oyster
#

lol wow. Yea that's wild, but it really could be anything. Any recent traffic issues?

#

Something as simple as construction can cause this. Even if it's not directly in-route.

junior narwhal
#

not sure as I really don't ever go out but if that's the cause then there must be permanent traffic 😂

analog oyster
#

Lol yea also in general that highway is probably more reliably higher travel speeds, so maybe something happened recently that’s causing it to favor that. I’m sure they’re baking a lot into it.

glacial bear
#

TIL: The walrus operator won't work in f-strings but sometimes also won't throw a ValueError because := is a valid format expression if followed by a digit. f"{x := 5}" will just print the current value of x with padding, or throw a NameError.
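A small sketch of the gotcha. Inside a replacement field, everything after the first top-level colon is the format spec, so `=5` is read as "pad after the sign, width 5". Parenthesizing is what makes it an actual assignment expression.

```python
x = 3

# Here ":=" is NOT the walrus: ":" starts the format spec, "=5" is the spec.
padded = f"{x:=5}"
same_thing = format(x, "=5")

# Parentheses turn it into a real assignment expression inside the f-string.
assigned = f"{(y := 5)}"
```

After this runs, x is still 3, padded is the value of x right-aligned in a width of 5, and y has been bound to 5.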

shadow zealot
#

We should have a linter rule for this…

#

Or maybe the formatter shouldn’t allow you to have a space after the variable. If this is formatted as f"{x:=5}" it’d be a lot easier to notice

silk jungle
#

@steel crane wow this sucks. I think your work is great heart_blob

#

I am aware that hacker news is particularly critical of seemingly anything Python related, but this is a new low.

junior narwhal
#

I've no idea the context of that

silk jungle
#

It's the PyPI supports digital attestations HN thread.

steel crane
#

@silk jungle thank you, i really appreciate that! and yeah, that thread was definitely a new low -- HN is always overly critical of processes that they don't understand/aren't legible to them, but that devolved right into conspiracies

#

it's really just 3-4 specific people on HN who seem to have a personal vendetta against anything packaging, and especially anything that involves the security stuff PyPI has done in the last 3-4 years

silk jungle
#

Is anyone else having issues with github notifications automatically marked read despite not opening it? I think gmail is activating whatever email receipts github uses to automagically mark notifications as read.

silk jungle
#

Hm, I wonder if it's due to Google Advanced Security, and thus they're more rigorously checking my emails.

silk jungle
#

Ugh, probably.

silk jungle
#

Ugh, it looks like unenrolling in Advanced Security resets the 2FA recovery codes. Lovely.

plush trench
#

hey everyone, first time poster here. wasn't sure where to stick this question but wanted to do my best not to clutter other spaces.

I'm working on a monorepo of atomic python packages, there's roughly 60+ standalone atomic packages.

Does anyone have an idea of what the pypi rate limit is for creating new projects? For updating projects?

Struggling to find a resource with this info, greatly appreciate any info on the topic.

thank yall so much

junior narwhal
stoic mural
#

also, these questions are totally appropriate for #pypi, feel free to ask there in the future

plush trench
#

excellent, thank you so much!

plush trench
# junior narwhal what do you mean by atomic?

small, discrete, typically a single component. our atomic components feature entry-points allowing them to plugin to our namespace pkg.

I anticipate publishing and maintaining many atomic components over time, but as of now, we just need to get away with publishing the 60 or so that currently exist.

I was able to get to around ~20 before running into a 429.

So I was just hoping for a rough ball park on the API rate limits to see what we're able to work with.

I did look over what di provided but I'm still not seeing a ballpark estimate.

I want to responsibly use pypi of course, but I also want to publish these packages haha, so if there's any specifics please let me know and I'll make sure we plan within the limits.

#

thank yall!

silk jungle
#

Whelp, and now my github notifications are totally broken. Lovely

#

Fortunately, I don't really mind using email as my GH inbox. Still frustrating, however.

plush trench
#

quick update on my end, got everything published. My best guess despite lack of clarity in docs is that the rate limit for new projects is something like 20-25 new projects per hour or so.

After my 429 lifted, I published the remainder slowly to ensure I'm not riding the limit.

Fortunately, I don't expect too many days where we need to publish 20+ packages now that we've completed our initial batch
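The "wait out the 429, then continue slowly" approach could be sketched like this. Everything here is hypothetical: RateLimited, publish_all and the upload callable are illustration names, and the one-hour wait is just the guess from the message above, not a documented PyPI limit.

```python
import time


class RateLimited(Exception):
    """Hypothetical error the caller's upload function raises on HTTP 429."""


def publish_all(packages, upload, wait_seconds=3600, sleep=time.sleep):
    """Upload packages one at a time, pausing whenever the index says 429."""
    published = []
    for package in packages:
        while True:
            try:
                upload(package)
            except RateLimited:
                # Back off for a long while instead of hammering the index.
                sleep(wait_seconds)
                continue
            published.append(package)
            break
    return published
```

The upload callable is whatever actually publishes a single package (for example a wrapper around the publishing tool of choice) and is responsible for translating a 429 response into RateLimited.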

dreamy hatch
plush trench
#

yeah, di scared me a bit because there's a big "use responsibly or be banned forever" message on there haha, but super thankful for the outstanding response time and extra insight.

As you said @dreamy hatch, took a break and tried again after awhile and just published the remainder in between chores and back to business, barely felt the bump.

@silk jungle, this was precisely what I was looking for to get that peace of mind overall, thank you so much for circling back here and diving deeper. 🙏

stoic mural
plush trench
#

right, that was my main worry, now we know the terrain better, smooth sailing!

junior narwhal
#

@visual furnace we're interested in using keyring at work for a CLI but importing it is extremely slow, would you accept a PR that introduces some lazy imports?

fierce horizon
#

What environment? On macOS for me, with Python >=3.12 it’s only 35ms (on 3.11, some compat code takes 50 extra ms to import)

junior narwhal
#

tests on Windows show ~170

junior narwhal
#

the new Gemini models are amazing. not only for regular use cases do they appear superior to ChatGPT but 2.0 Pro Experimental is actually the first model that I feel produces better code than Claude, and that is quite the feat

#

it's funny actually, the released emails from the early days of OpenAI spoke about Google being some kind of existential threat and that they would de facto produce AGI without competition. I thought last year that it was a bit hyperbolic but now I understand

lapis solstice
#

Given the off-the-record praise I had heard from Googlers for their internal AI tooling a decade+ ago, it actually surprised me that Bard was initially so poor. The internal tools might not have been structured around the current conversational AI model, though, so maybe it just took them a while to bring the relevant expertise to bear on their public solution.

visual furnace
# junior narwhal <@720786343097270374> we're interested in using `keyring` at work for a CLI but ...

Possibly. Maybe first double-check that it hasn't already been tried. Part of the plugin-based architecture of keyring makes it ill-suited for lazy imports (backends are imported eagerly to determine if they're viable and at what priority for a given environment). It specifically breaks out of Mercurial's demand-import mechanism. I'd start with a modest draft PR to illustrate the concept, because I'll be disinclined to accept it if it's too imposing on the implementation. I'm happy to explore the problem space, however. I'm also interested in exploring systemic solutions to this problem. I'm finding it to be a repetitive problem that one implements a piece of code using standard idioms only to find that it needs manual delayed imports (in the standard library and beyond). I'd like to see something like Mercurial's demand import or perhaps something even more integrated with the core implementation. Ideally, it shouldn't be the responsibility of Python programmers to write idiosyncratic code to achieve better performance. Do feel free to ping me on GitHub (if you haven't already).
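For reference, the standard library already offers one form of delayed import via importlib.util.LazyLoader. The sketch below follows the recipe from the importlib documentation; lazy_import is just the helper name used there, and colorsys is an arbitrary stand-in module. Whether this fits keyring's eager backend discovery is a separate question.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code only executes on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # defers the real execution
    return module


colorsys = lazy_import("colorsys")      # cheap: nothing executed yet
hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # first attribute access triggers the load
```

This is still "manual delayed imports" in the sense described above: each call site has to opt in, which is the idiosyncratic part.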

fierce horizon
mighty flower
#

I'm definitely following issues and requests on Mozilla's bug database that are over 20 years old

junior narwhal
#

for any distro experts here, is there a modern equivalent to this? http://createrepo.baseurl.org

currently we vendor it but it lacks support for Python 3 and I would prefer not to use it at all rather than manually updating it

orchid cave
frank shore
# orchid cave This is actually cool! https://fi-le.net/pypi/

Objective C bridge PyObjC
A little Python history: it was Python's ObjC bridge which intrigued the team at CNRI about Python in 1994. Roger Masse and myself were working on their software agents ("Knowbots" although Bob Kahn really did not like us using that term as a noun) project as part of the digital library initiative, and we were doing that in ObjC on NeXT machines. We went to the first workshop ostensibly to talk to Guido about the ObjC bridge and see how we could use and evolve that. We recognized that the two languages were a really great pair. After the workshop we were talking to folks at CNRI about all the cool things we could do and someone (Dave Ely IIRC) suggested we try to hire Guido. The timing was right and as they say, the rest is history.

Follow up: It was not long after Guido joined CNRI that he said we could just do it all in Python and forget about ObjC, which of course, is exactly what we did 😆

mighty flower
onyx spindle
mighty flower
#

Yeah, and if you go through their closed it's because they've addressed it by answering or fixing

onyx spindle
#

Flask is pretty much the same

#

David Lord (lead maintainer of Pallets community) spent a lot of time to go "inbox zero"

mighty flower
#

Amazing, I don't think my brain can work like that

jaunty marlin
#

pyright another incredible example of this

#

They close way more things as wont-fix than we do though

#

It's sort of a different attitude / user-facing brand

jaunty marlin
#

This is indeed not supported today, but it's out of scope for the core team. Well-tested PRs are welcome.

silk jungle
#

@timber sphinx I have a prototype that enables pytest-socket to work in Python subprocesses

#

I am not excited to write the tests though. Skimming through the pytest-socket tests, I have no idea what I'm looking at

silk jungle
#

This is not suspicious at all :P

silk jungle
lyric quiver
#

was looking into speeding up import times

#

came across now what seems to be a dead / no longer maintained project named oxidized-importer and I got intrigued.

#

my main offender has always been pytorch with its notorious 2-2.5s import, coupled with tensorrt adding another 300-350ms.

#

I am developing a CLI and I want it to feel as snappy as possible.
Has anyone gone down the rabbit hole of improving start-up times past lazy-importing?

#

A simple argparse prompt with -h, python 3.12.9 compiled with LTO+PGO results in approx 1.5s of start-up from the go.

strong blade
lyric quiver
#

I forgot I was importing pytorch from the get go

#

so that's why my initial -h import time was so large

#

doesn't fix the whole issue, just a confusion I had 🤣

#

seems like torch takes roughly 1.04s to import
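One way to get a number like that without the current process's import cache skewing it is to time the import in a fresh interpreter. A minimal sketch (import_seconds is a made-up helper name; python -X importtime gives a more detailed per-module breakdown):

```python
import subprocess
import sys


def import_seconds(module_name: str) -> float:
    """Time a single import inside a brand new Python process."""
    code = (
        "import time\n"
        "start = time.perf_counter()\n"
        f"import {module_name}\n"
        "print(time.perf_counter() - start)\n"
    )
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        check=True,
        encoding="utf-8",
    )
    return float(completed.stdout)


json_cost = import_seconds("json")
```

Swapping "json" for "torch" would give the figure being discussed, assuming torch is installed in that interpreter.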

lyric quiver
lyric quiver
#

will see if there's any other improvement I can make

#

lazifying the torch import

#

🤔

#

a lot better

#

I only moved the problem elsewhere

#

but I did manage to save at least 150ms on total initial runtime which is still quite significant
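The "-h was slow because torch was imported up front" part can be sketched like this: parse arguments first, import the heavy dependency only once a command actually needs it. fractions is a cheap stand-in for torch here, and the CLI itself is invented for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--numerator", type=int, default=1)
    parser.add_argument("--denominator", type=int, default=2)
    return parser


def main(argv=None) -> str:
    # -h/--help exits inside parse_args, before anything heavy is imported.
    args = build_parser().parse_args(argv)

    # Deferred import: stand-in for "import torch".
    import fractions

    return str(fractions.Fraction(args.numerator, args.denominator))
```

As noted above, this only moves the cost: the first real command still pays for the import, but help output and argument errors stay fast.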

silk jungle
#

Tidelift was acquired recently ... that's a bit worrying. While I have no reason to think their acquirer specifically will shut down tidelift, the general theme of acquisition seems to imply that will happen at some point.

clear wigeon
#

you can always ask the PR person

lyric quiver
#

Kinda weird question

#

say I do some ops with the os module

#

I do say some string concatenation with os.path.join

#

and I will never ever use the os module past that point

#

does the GC simply yeet out the os module from memory since it no longer gets used?

#

Or does it linger around until the python execution ends.

boreal bramble
#

The latter

#

Reference to it is kept in sys.modules (and by any module that uses it)
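A quick way to see this, using colorsys as an arbitrary small module: deleting the local name drops one reference, but the module object stays cached in sys.modules.

```python
import sys

import colorsys

del colorsys  # removes only this namespace's reference

# The module object is still alive in the import cache.
still_cached = "colorsys" in sys.modules

# A later import is just a dictionary lookup returning the same object.
import colorsys as again

same_object = again is sys.modules["colorsys"]
```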

lyric quiver
#

do devs typically try to remove unused modules from memory especially for long running processes?

#

to free up space ofc

boreal bramble
#

os is probably a module that is loaded by interpreter itself (or at least so many stdlib modules that you're bound to import it through another one anyway)

lyric quiver
#

os was mentioned particularly to get my point through

boreal bramble
#

I don't think anyone would remove unused modules to free memory, code objects don't take that much space and most modules don't have globals that would keep a lot of data in them

lyric quiver
#

🤔 not even those that load .dlls in memory?

boreal bramble
#

Not in any typical usage

#

I can't speak to all potential edge scenarios where it maybe makes sense for some very specific reason

#

But like, generally, there's no reason to

lyric quiver
#

the 9pm free will of over optimizing python for no point is probably kicking in.

junior narwhal
#

if you want to use something once and not have it introduce a continued runtime cost run it in a separate process

lyric quiver
#

oh like a separate thread and then simply kill it?

junior narwhal
#

no, process

boreal bramble
#

In long living apps, code objects are probably a largely insignificant part of program memory compared to whatever data they operate on, while for short living processes, you'll get the memory back soon anyway.

lyric quiver
#

🤔

#

thanks

#

will try out the process thing ofek mentioned, got me intrigued

junior narwhal
lyric quiver
#

I have nightmares with mp

#

Data sharing with shared memory is the biggest pain

junior narwhal
#

one thing to watch out for is that if you aren't on Linux/a COW system then you can have a large memory spike from the copying

lyric quiver
#

^ This, experienced it first hand

junior narwhal
#

the alternative would be subprocess.run([sys.executable, "-c", "..."])

lyric quiver
#

Oh that could work

junior narwhal
#

and get output from somewhere or use a shared temporary file that is known with an environment variable for example
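A sketch of that pattern, passing the result back over stdout as JSON instead of a temporary file. statistics stands in for whatever expensive module is only needed once; run_in_child and CHILD_CODE are illustrative names.

```python
import json
import subprocess
import sys

# Code executed in the child interpreter; its imports never touch the parent.
CHILD_CODE = """
import json
import statistics  # stand-in for an expensive, use-once import
print(json.dumps({"mean": statistics.mean([1, 2, 3])}))
"""


def run_in_child(code: str) -> dict:
    """Run code in a separate Python process and decode its JSON output."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        check=True,
        encoding="utf-8",
    )
    return json.loads(completed.stdout)


result = run_in_child(CHILD_CODE)
```

When the child exits, all memory it used for modules and data goes back to the OS, which is the point of the suggestion.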

lyric quiver
#
```python
        cmd = [
            ffprobePath,
            "-v",
            "quiet",
            "-print_format",
            "json",
            "-show_format",
            "-show_streams",
            "-count_packets",
            inputPath,
        ]

        result = subprocess.run(cmd, capture_output=True, text=False)
        if result.returncode != 0:
            logging.error(f"ffprobe failed: {result.stderr}")
            raise Exception(f"ffprobe failed: {result.stderr}")

        stdout = result.stdout.decode("utf-8", errors="replace")
        if not stdout:
            raise Exception("No output received from ffprobe")
```
been using it for getting video metadata with `ffprobe`
#

never considered modules tho'

#

that sounds fun

shadow zealot
#

Even C programs don’t unload most functions either, it’s generally reserved only for extreme cases

onyx sphinx
lyric quiver
onyx sphinx
#

Oh it saves a lot of the code here. check=True raises on a bad exit code, and encoding+errors makes result.stdout text

#

Also you probably don't want to log an exception you raise because it will usually get logged again at the top of the stack
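Put together, those suggestions might look roughly like this. It's a sketch with a generic run_json helper (a hypothetical name) so it can be tried without ffprobe installed; the demo command is a Python one-liner standing in for the real ffprobe invocation.

```python
import json
import subprocess
import sys


def run_json(cmd: list[str]) -> dict:
    """Run a command and parse its stdout as JSON."""
    result = subprocess.run(
        cmd,
        capture_output=True,
        check=True,          # raises subprocess.CalledProcessError on a bad exit code
        encoding="utf-8",    # stdout/stderr become str, no manual .decode()
        errors="replace",
    )
    if not result.stdout:
        raise RuntimeError("No output received")
    return json.loads(result.stdout)


# Stand-in for the ffprobe command line.
demo = run_json([sys.executable, "-c", "print('{\"streams\": []}')"])
```

No logging call before raising: the CalledProcessError carries returncode, stdout and stderr, so whatever handles it at the top of the stack can log it once.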

lyric quiver
#

Will try to add those

#

Thanks for pointing it out

lyric quiver
#

didn't know these existed

#

and it still works with weird chinese characters ❤️

onyx sphinx
#

You can just let the subprocess.CalledProcessError raise

lyric quiver
#

oh, so just remove try except?

#

I guess you're right again

#

and it's also more stylish 😎

kindred hound
#

you can also remove text=True. It's redundant with encoding.

#

If encoding or errors are specified, or text is true, file objects for stdin, stdout and stderr are opened in text mode using the specified encoding and errors or the io.TextIOWrapper default.

lyric quiver
silk jungle
#

@jaunty marlin is there a machine-readable list of all of the rules Ruff implements? Just wondering. If not, I'll scrape the webpage, no biggie.

jaunty marlin
#

(yes I don't have Ruff installed, it's 2025!)

mighty flower
#

I should remember to read off-topic

silk jungle
#

can't see that message link

silk jungle
kind moon
#

thanks

silk jungle
#

@kind moon we're pretty strict about the invites we allow

kind moon
#

I see that

mighty flower
#

It was just in the "#linter" channel

mighty flower
silk jungle
#

I definitely prefer if someone does it :P

#

I could do it, but I have other things I'd rather work on (reviewing that resume PR for one)

mighty flower
#

Well, I'm happy to keep at it, just going to be fairly slow

silk jungle
#

it's one of the last things that needs to be done promptly

silk jungle
#

A random thought that just came to mind is that I actually haven't expanded the set of PyPI projects I know and regularly use in a long time. I spend almost all of my time on OSS projects where it's undesirable to gain dependencies so I've rarely had a reason to look for an existing project to solve a specific problem.

#

It's not a problem per se, but I do wonder how much I'm reinventing the wheel (unnecessarily) whenever I do write scripts/projects for personal use.

fierce horizon
#

That’s what these “awesome …” lists are for, but I have the same problem. I think there are a few things I’ve caught, but probably not a lot:

  • dataclasses (stdlib) for most simple data containers
  • ruff instead of flake8/black for formatting and linting
  • httpx instead of requests for HTTP(S) requests (and generally async libraries for IO)
  • cyclopts instead of typer/click/… for CLI
  • rich for ANSI formatting (links, colors, … in the terminal)
  • plumbum for shell scripting (i.e. calling subprocesses and piping them)
  • py-spy for profiling (might already be outdated, this has changed almost every time I researched the best choice)

For Rust there’s https://blessed.rs, would be nice to have something like that for Python as well.

onyx spindle
#

TIL cyclopts

#

aaaand I don't like it

dreamy hatch
fierce horizon
# onyx spindle aaaand I don't like it

Why? It does what Typer does, but with less boilerplate, and actually supports obvious things like positional-only parameters or typing.Literal in an obvious way.

onyx spindle
#

I don't really like the "magic" libraries that do stuff by analyzing the function signature

#

call me weird, but I prefer the argparse way of declaring the CLI

fierce horizon
#

I like argparse too, except for the fact that almost everything has to be specified twice.

And not only is that not DRY, but there’s still the problem that the typing isn’t actually enforced anywhere. Below, we just say that Args.foo exists, but it’s not actually tied to the parser in any way!

from __future__ import annotations

import argparse
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from collections.abc import Sequence
    from typing import Self


class Args(argparse.Namespace):
    """Some docs."""

    foo: int
    """The foo parameter."""

    @classmethod
    def parser(cls) -> argparse.ArgumentParser:
        """Construct a CLI argument parser."""
        parser = argparse.ArgumentParser(description=cls.__doc__)  # *
        parser.add_argument(
            "foo",  # 1
            type=int,  # 2
            help="The foo parameter.",  # 3
        )
        return parser

    @classmethod
    def parse(cls, argv: Sequence[str] | None = None) -> Self:
        """Parse CLI arguments."""
        return cls.parser().parse_args(argv, cls())


def main(argv: Sequence[str] | None = None) -> None:
    args = Args.parse(argv)  # args is typed, unless we accidentally lied above …
    ...


if __name__ == "__main__":
    main()

* using cls.__doc__ I don’t actually have to specify this one twice

  1. the name is already specified by the attribute above
  2. likewise for the type
  3. the docstring has to be typed twice, since there’s no standard way to use attribute docstrings.

Compare to the cyclopts variant, which looks exactly how I would write that function anyway, plus a decorator.

import cyclopts


app = cyclopts.App()


@app.default
def main(foo: int, /) -> None:
    """Some docs.

    Parameters
    ----------
    foo
        The foo parameter.
    """
    ...


if __name__ == "__main__":
    app()
#

It’s not magic, if you know exactly what happens.

Any sufficiently analyzed magic is indistinguishable from science
– Corollary of Clarke’s third law, apparently from a comic called Girl Genius

west basin
#

the API's quite nifty, my only reservation is rich being a required dependency - I'd rather I wasn't locked into a specific formatter, especially one which is slow to import

kindred hound
#

Big fan of cyclopts, really love its API

kindred hound
#
❯ hyperfine --warmup 1 --shell=none "python -c 'import cyclopts'" "python -c 'import click'" "python -c 'import typer'" "python -c 'import argparse'"
Benchmark 1: python -c 'import cyclopts'
  Time (mean ± σ):     175.6 ms ±   5.8 ms    [User: 104.2 ms, System: 62.5 ms]
  Range (min … max):   170.8 ms … 195.2 ms    15 runs

Benchmark 2: python -c 'import click'
  Time (mean ± σ):     111.8 ms ±   1.6 ms    [User: 63.8 ms, System: 38.1 ms]
  Range (min … max):   109.1 ms … 116.7 ms    25 runs

Benchmark 3: python -c 'import typer'
  Time (mean ± σ):     360.9 ms ±  20.1 ms    [User: 187.5 ms, System: 159.4 ms]
  Range (min … max):   344.5 ms … 414.5 ms    10 runs

Benchmark 4: python -c 'import argparse'
  Time (mean ± σ):      62.0 ms ±   2.1 ms    [User: 28.1 ms, System: 25.8 ms]
  Range (min … max):    58.7 ms …  67.1 ms    49 runs

Summary
  python -c 'import argparse' ran
    1.80 ± 0.07 times faster than python -c 'import click'
    2.83 ± 0.13 times faster than python -c 'import cyclopts'
    5.82 ± 0.38 times faster than python -c 'import typer'
west basin
#

cyclopts defers importing rich, so that's the import time without rich

#
$ python -X importtime -c 'import cyclopts' 2>&1 | grep rich
[nada]
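For anyone unfamiliar with the pattern, deferring an import just means moving it into the code path that needs it. A minimal sketch, with `textwrap` standing in for a heavier formatter and `show_help`/`main` being hypothetical names (this is not cyclopts' actual code):

```python
def show_help(text: str) -> str:
    """Format help text, importing the formatter only when it is needed."""
    # Deferred import: the cost is paid on the --help path only.
    # textwrap is a stdlib stand-in for a heavier formatting library.
    import textwrap

    return textwrap.fill(text, width=30)


def main(argv: list[str]) -> str:
    if "--help" in argv:
        return show_help("usage: tool [--help] followed by a fairly long description")
    # The normal path never touches the formatter module.
    return "did the actual work"
```

That is why a bare `import cyclopts` timing says little about the cost of the help path.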
#

a slightly more representative benchmark:


app = App()

@app.command
def foo(loops: int): ...

app()' --help"
Benchmark 1: python -c 'import cyclopts'
  Time (mean ± σ):      63.7 ms ±   4.6 ms    [User: 54.1 ms, System: 7.3 ms]
  Range (min … max):    52.5 ms …  74.0 ms    54 runs

Benchmark 2: python -c 'from cyclopts import App

app = App()

@app.command
def foo(loops: int): ...

app()' --help
  Time (mean ± σ):     132.9 ms ±   6.8 ms    [User: 114.6 ms, System: 14.2 ms]
  Range (min … max):   125.0 ms … 150.9 ms    22 runs

Summary
  python -c 'import cyclopts' ran
    2.09 ± 0.18 times faster than python -c 'from cyclopts import App

app = App()

@app.command
def foo(loops: int): ...

app()' --help
fierce horizon
#

so it doesn’t import rich if you don’t pass --help? Then there’s no issue. --help doesn’t need every millisecond of saved runtime, only actually running the CLI needs to be as fast as possible.

west basin
#

? unless your CLI's doing actual work, I'd say feedback should be/feel instantaneous, and that includes --help. It's when you're "actually running the CLI" that you're least likely to notice its import penalty

lapis solstice
quartz yew
#

idea is to re-do the python.org website, as a hub of the python ecosystem

#

so that it could be the starting point for docs

#

it would link to documentation for beginners, use-case specific (eg. conda, pyopensci), list of popular libs, etc

#

on the Diversity and Inclusion we were also talking about a resource for users to find local user groups and events

#

also, resources for people wanting to setup user groups, and support resources like suggested guidelines on how to address CoC complaints, etc.

west basin
#

would any Windows users know if there's a way to group or tag packages in winget? I thought I'd give it a shot after upgrading to Windows 11, but I want to be able to distinguish packages I've installed from other (corporate) packages

shadow zealot
buoyant flame
#

I respond to pings, but I can't help with winget. You're probably best to go to their GitHub repo and post an issue or discussion (but since they aren't a package manager - rather, they're just a package install runner - they probably aren't keeping any of their own metadata for things you install, which means they aren't going to be able to tag things like that) @west basin

clear wigeon
dreamy hatch
clear wigeon
#

(provided that Matt Milner decides not to continue with this)

#

i see that he made that comment on feb 11, 2024, should i contact him and ask him what the status is? or should i just go ahead on my own, not sure what the niceties around this are

dreamy hatch
#

yeah, contact him first, maybe you can help him with it. ping him on the issue?

clear wigeon
valid radish
clear wigeon
dreamy hatch
#

timelines: the 3.14 feature freeze is just over a month away (beta 1: 2025-05-06), so you either have a mad rush to get the PEP written, discussed, updated, submitted, accepted, and implemented; or you have plenty of time before 3.15 feature freeze in ~13 months 🙂

clear wigeon
#

i gotta prepare for a few pycon talks so this gives me time to prepare and not rush things

marsh kite
long knoll
#

(it would be really nice if the existing standard library support for compression/archive formats had a more unified interface...)

onyx spindle
long knoll
#

seems unlikely given my previous experience, honestly. Generally one gets told to put stuff in a third-party package on PyPI first. And even if everyone liked it and used it, it'd be hard to build a case for deprecation

#

certainly just trying to write a PEP laying out a design, without actually doing implementation seems like a non-starter. And I have other projects

#

However, I suppose I could blog my design thoughts at some point, at least.

marsh kite
#

A unified interface to compression APIs would be great! Definitely out of scope for this PEP. It is something I considered, but I expect it can be introduced at a later date

#

I'd be interested to see what ideas you have for unifying APIs. There are quite a number of corners of compression libraries that are rather unique/specific to that library

long knoll
#

yeah, an entirely separate idea. As is the general idea of allowing wheels and sdists to use other compression formats (it does sort of make sense IMO to have "anything explicitly supported in the standard library" as the line for what formats are supported; but maybe we want to exclude uncompressed tar?)

marsh kite
#

The tricky thing with allowing just any old compression formats is not all formats have as wide adoption

long knoll
#

well, it would probably have to be either common-subset functionality, or have some well-defined rule for what happens if the format doesn't support a feature (e.g. storing file permissions or symlinks)

marsh kite
#

zlib is everywhere, more or less, so it is reasonable to expect users to have it

long knoll
#

FWIW, in my testing, LZMA would help quite a bit with many popular packages, and that's in the standard library already

marsh kite
#

oh absolutely, but its decompression speed is quite slow :(

long knoll
#

(separately, but also relevant to reducing the overall bandwidth: I would love to be able to get smaller wheels for Numpy, without all the testing/doc/other development stuff bundled in)

#

I've been kinda insulated from caring about decompression speed because I don't do PyTorch/CUDA kinda stuff and because my Internet is pretty slow anyway

#

(I guess this sort of thing also matters a lot more for people running CI/build farms etc.)

marsh kite
#

Yeah I think a PEP about compression formats should ideally allow both LZMA and eventually Zstandard to be used

onyx spindle
long knoll
#

you'd be surprised

marsh kite
#

LZMA is on the order of 3x slower than zlib to decompress, but with 1.33x better compression (both at max compression levels)
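If you want to check numbers like these on your own data, a rough sketch with the stdlib modules could look like this. `measure` is a made-up helper and the synthetic payload is only a placeholder, so expect different ratios on real wheels. It uses LZMA preset 6 to keep memory use modest; raise it to 9 to compare max levels.

```python
import lzma
import time
import zlib
from collections.abc import Callable


def measure(
    name: str,
    compress: Callable[[bytes], bytes],
    decompress: Callable[[bytes], bytes],
    data: bytes,
) -> dict:
    """Return the compression ratio and decompression time for one codec."""
    packed = compress(data)
    start = time.perf_counter()
    unpacked = decompress(packed)
    elapsed = time.perf_counter() - start
    if unpacked != data:
        raise ValueError(f"{name} round trip failed")
    return {"name": name, "ratio": len(data) / len(packed), "decompress_s": elapsed}


# Placeholder payload: repetitive text, which flatters every compressor.
data = b"".join(f"line {i}: the quick brown fox\n".encode() for i in range(20000))

results = [
    measure("zlib", lambda d: zlib.compress(d, 9), zlib.decompress, data),
    measure("lzma", lambda d: lzma.compress(d, preset=6), lzma.decompress, data),
]

for row in results:
    print(f"{row['name']}: ratio {row['ratio']:.2f}, decompress {row['decompress_s'] * 1000:.2f} ms")
```

A single timed run is noisy, so for anything serious repeat the decompression step and take the minimum.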

long knoll
#

that sounds about right, yes

#

another design issue: do we allow the index to store (or provide on demand!) the same build artifact in multiple compression formats? If so, do installers provide a UI to choose? (Do they get to choose?)

marsh kite
#

Yeah the details of the implementation are tricky

long knoll
#

I think it's more about making choices and living with them (and historically that has been difficult and slow)

silk jungle
#

I'd imagine for pip, we'd default to zstandard assuming that it isn't worse than zlib on average. I guess we'd gain a flag to select the type of wheel which I'm not a huge fan of, but out of everything in the wheel 2.0 transition, this is probably one of the easier things to deal with.

silk jungle
#

(I haven't been following the wheelnext discussions. Presumably this has been [very briefly?] discussed at some point.)

marsh kite
#

Yep, we have been discussing that. It's not totally clear to me what that ends up looking like to be honest, there are a lot of choices to be made

silk jungle
#

fwiw, when y'all are ready to present this to the broader packaging community, it'd be nice to have a clear summary of what's expected of the various stakeholders.

#

While I probably should catch up on the discussions, it'd be nice to have a summary of what's expected of us for the pip project. I'm sure some of the pip maintainers are involved in the discussions, but I am not.

marsh kite
#

Well I'll say that you are very welcome to participate, though I am sure maintaining pip is time consuming as it is :)

silk jungle
#

Well yeah, the problem is that I don't have time to read 200+ posts 😅

marsh kite
marsh kite
lapis solstice
#

Regarding generic compression APIs: the best part of adding these to the stdlib is it means the shutil archiving APIs can support them by default.

silk jungle
#

I don't really participate in standards discussions intentionally. I like to know what's going on, especially as a pip maintainer, but I don't have much to contribute.

#

I don't really work on the packaging-side of pip, if I'm being honest. I work on everything around the core package management code, networking, error handling, etc. :P

marsh kite
#

Those parts matter too!

marsh kite
silk jungle
marsh kite
#

That's true

marsh kite
lapis solstice
marsh kite
#

yeah

lapis solstice
#

The compression format discussion reminded me that folks here might be interested in https://github.com/python/cpython/issues/120036 (basic idea: offer shutil.make_reproducible_archive with different defaults that favour future reproducibility of the same output archive over faithfully recording every detail of the input files)
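To give a flavour of what "favouring reproducibility" means in practice, here is a small sketch with `tarfile`. It is not the API proposed in that issue; `make_reproducible_tar` and `_normalize` are hypothetical names, and the choice of which metadata to discard is an assumption.

```python
import tarfile
from pathlib import Path


def _normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Drop metadata that varies between machines and runs."""
    info.mtime = 0
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    info.mode = 0o755 if info.isdir() else 0o644
    return info


def make_reproducible_tar(src_dir: str, out_path: str) -> None:
    """Write an uncompressed tar whose bytes depend only on names and contents."""
    src = Path(src_dir)
    with tarfile.open(out_path, "w") as tar:
        # Sorted traversal so the member order does not depend on the filesystem.
        for path in sorted(src.rglob("*")):
            tar.add(
                path,
                arcname=path.relative_to(src).as_posix(),
                recursive=False,  # rglob already visits every entry
                filter=_normalize,
            )
```

The output is left uncompressed on purpose: a gzip wrapper stores its own timestamp, which would need to be pinned as well.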

long knoll
#

... supposing I were to make a high-level wrapper for the standard library compression modules, and thus try to give them a unified API, and put it on PyPI. Any suggestions for a distribution name?

kindred hound
#

I actually attempted something similar

#

I'm not too happy with the write API but it works

marsh kite
junior narwhal
#

what's the benefit of having a unified compression API, and secondarily what's the benefit of having that in the standard library?

onyx spindle
#

you can easily switch the compression lib used without having to do significant refactoring/rewrite

long knoll
#

also just general ease/pleasantness of use, and lower learning curve

onyx spindle
#

tbf, there could be a "unified api" and "specialized api", where specialized has features specific to the particular library
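As a very small illustration of the "unified" half, the stdlib modules already share one-shot `compress`/`decompress` functions, so a thin dispatcher is enough for the common subset. The `CODECS` table and function names below are made up for this sketch; anything library-specific (filters, dictionaries, window sizes) would still go through the individual modules.

```python
import bz2
import gzip
import lzma
from types import ModuleType

# Hypothetical registry: codec name -> stdlib module with compress()/decompress().
CODECS: dict[str, ModuleType] = {"gzip": gzip, "bz2": bz2, "xz": lzma}


def compress(data: bytes, codec: str = "gzip") -> bytes:
    """Compress data with the named codec using that module's defaults."""
    return CODECS[codec].compress(data)


def decompress(data: bytes, codec: str) -> bytes:
    """Decompress data that was produced by the named codec."""
    return CODECS[codec].decompress(data)
```

Switching formats then becomes a one-word change at the call site, e.g. `compress(payload, "xz")` instead of `compress(payload, "gzip")`.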

lapis solstice
mighty flower
#

Every time I'm in powershell it's a nightmare to do anything simple

junior narwhal
#

true, but it's relatively straightforward to read once working and committed to version control

lyric quiver
#

Has anyone ever attempted dynamically downloading certain heavy dependencies such as Pytorch or TensorRT using a pyinstaller generated .exe?

onyx spindle
#

what do you mean by "dynamically downloading"?

lyric quiver
#

Pytorch + TensorRT shipped through pyinstaller equate to about 97% of the total storage occupied and pushing new software versions is also relatively harsh.

lyric quiver
#

all within a pyinstaller environment.

onyx spindle
#

I don't think there is much benefit in that. All the tools now cache the wheels, so downloads are once per system usually. Also, that won't work unless you do manual dependency management and not using uv/poetry/pdm or whatever

lyric quiver
#

maybe a versioning system where updates could be done easier would be a nice addition as well

#

therefore users would only have to download the dependencies once and I could simply ship an update.exe that overwrites what has changed with the new version

lapis solstice
#

The sheer enormity of PyTorch and Tensorflow is pretty much the whole reason venvstacks actually exists now, as opposed to being the vague idea percolating in the back of my brain that it was for the previous decade.

It's more for the app-written-in-something-else-embedding-Python use case than it is for native Python apps, though.

lyric quiver
#

venvstacks seems fun

#

Just looked it up on github

long knoll
#

(there's also a channel for it here!)

silk jungle
#

@lapis solstice I'm curious as to whether there is a way to convince multiprocessing to avoid importing the main module during worker startup entirely. The context is that I'd like to add parallelized logic to pip. I've designed it so the parallelized logic lives entirely in its own module that (barring the entrypoint) does not depend on the rest of pip. The problem is that multiprocessing will import the main module, which may result in a large portion of pip being re-imported. This is quite slow, and also a potential security vulnerability.

#

I do patch sys.modules[__main__] with a lightweight module (pip) but I'd like to avoid the import altogether. Alternatively, I'm probably going to have to switch back to using multiprocessing.Pool as at least it initializes the workers immediately (addressing the security concerns).

lapis solstice
#

I don't believe there is a formally supported API, but I'm pretty sure if you mutate __main__.__file__ and/or __main__.__spec__, multiprocessing will respect that.

silk jungle
#

Unfortunately, that didn't work.

#

I also tried replacing __main__ with builtins (:P) but that breaks in other ways.

lapis solstice
#

Even if you change them before importing multiprocessing?

silk jungle
#

I've looked at the multiprocessing implementation, AFAICT it grabs __main__ right before a worker is started so that wouldn't matter.

#

I'm going to do the more reasonable thing and stop using concurrent.futures and do best-effort patching, but leave it at that otherwise.

#

(which sucks as sub-interpreters are only available through concurrent.futures, but alas, I don't want to cause the next pip security report).

lapis solstice
#

Can you refactor the main module in pip so the unwanted imports are inside the if __name__ == "__main__": block? multiprocessing runs the child main as __mp_main__, so only code outside that block will run. (This is the officially supported way of keeping the code execution minimal in child processes spawned by multiprocessing)

#

(it's yet another point in favour of the from .cli import main implementation layout)
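A quick way to see the effect without spawning anything is to run a script under both names with `runpy`, which is roughly what the spawn start method does for the child's main module. The `WRAPPER` source and `run_as` helper are invented for this sketch, and `import re` is just a stand-in for an expensive import.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical script: everything expensive sits inside the __main__ guard.
WRAPPER = '''
TOP_LEVEL_RAN = True
if __name__ == "__main__":
    import re  # stand-in for an expensive import
    GUARDED_RAN = True
'''


def run_as(run_name: str) -> dict:
    """Execute the script under the given module name and return its globals."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "wrapper.py"
        script.write_text(WRAPPER)
        return runpy.run_path(str(script), run_name=run_name)


child_globals = run_as("__mp_main__")  # how a spawned worker sees the script
parent_globals = run_as("__main__")    # how the original process sees it

print("child ran guarded block:", "GUARDED_RAN" in child_globals)
print("parent ran guarded block:", "GUARDED_RAN" in parent_globals)
```

Only the top-level statements run under `__mp_main__`, so whatever is imported inside the guard never gets loaded in the workers.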

silk jungle
#
#!/home/ichard26/dev/oss/pip/venv/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())

When running pip directly (pip install six), the wrapper is the main module.

#

If I run pip indirectly via python -m pip install six, then pip._internal.cli.main is never imported.

lapis solstice
#

What would break (if anything) if the wrapper template was changed to be:

if __name__ == '__main__':
    import re
    import sys
    from pip._internal.cli.main import main
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
silk jungle
#

Anyway, I've already worked around the problem. This is purely an academic question.

#

And I should go to bed, it's almost 1 AM 😅

lapis solstice
#

Heh, I was about to disappear for a late lunch.

silk jungle
lapis solstice
#

Heh, I thought that might be it - theoretically safe, in practice, who knows?

long knoll
#

(I must be missing something - why would the script wrappers for installed package entry points, be getting called by multiprocessing?)

silk jungle
#

It's like doing python venv/bin/pip which is a silly but functional way of using pip (except on Windows).

long knoll
#

oh, so the same as what happens with --python

#

(sort of)

silk jungle
#

It's a good default (e.g., the parallelized code depends on global logging state) but it's annoying when you want to spin up a pool of wholly independent workers.

long knoll
#

I should look into how compileall handles that.

silk jungle
#

In theory, I could work around by managing my own pool of processes, but that would be a nightmare to write and maintain (and be quite buggy).

silk jungle
long knoll
#

yeah

#

... would that not work for parallel downloads and/or resolutions as well?

#

(oh wait, is unzipping the bottleneck? is that cpu-bound? but still)

silk jungle
long knoll
#

it's hard to imagine a use case for explicitly importing a script wrapper...

mighty flower
silk jungle
#

I just realized, I have no idea how uv implements their console scripts though.

mighty flower
#

Yeah, they copied distlib and then accepted a PR to remove the re import

long knoll
#

(it occurs to me that .removesuffix is available since 3.9....)
(then again, those conditions are only required for windows support, yes?)
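For reference, a sketch of what the `re`-free version could look like on 3.9+. `strip_wrapper_suffix` is a made-up name, and it stops after the first matching suffix so that it behaves like the regex in the wrapper above, which is kept here only for comparison.

```python
import re

# Suffixes taken from the regex in the generated wrapper script.
_SUFFIXES = ("-script.pyw", ".exe")


def strip_wrapper_suffix(argv0: str) -> str:
    """Remove at most one launcher suffix from the end of argv[0]."""
    for suffix in _SUFFIXES:
        if argv0.endswith(suffix):
            return argv0.removesuffix(suffix)
    return argv0


def strip_wrapper_suffix_re(argv0: str) -> str:
    """The regex form used by the wrapper template, for comparison."""
    return re.sub(r"(-script\.pyw|\.exe)?$", "", argv0)
```

Chaining two `.removesuffix()` calls would be shorter, but it would strip both suffixes from a name like `pip.exe-script.pyw`, which the regex does not do.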

#

(ah, but installing into environments for older python is still supported... ? even if --python doesn't necessarily work?)

onyx spindle
junior narwhal
dreamy hatch
#

mkdocs-material (A) has lots of releases (~weekly), so when you upgrade, mkdocs (B) which is released ~twice/year, is already downloaded and in the cache?

junior narwhal
#

ohhh nice that's a good point, I was thinking about caching but in terms of some sort of multi-stage container build/CI stuff. the release frequency aspect makes more sense. thanks!

long knoll
#

speaking of caching, I've been puzzling over how Setuptools and Pip end up getting downloaded so much. I can see how users end up with multiple versions, and repeatedly install them into temporary environments - but the cache hit rate should be a lot higher, I would think? Is there something about CI systems, Docker containers etc. that defeats the caching perhaps?

#

(also: would PyPI count a separate download if, for whatever reason, Pip tries to scan for metadata with the range request technique etc. but ultimately rejects that build artifact?)

mighty flower
#

For Docker I have to assume a lot of users are running "pip install pip --upgrade" after a step that invalidates Docker's layer cache

#

For CI like GitHub, I think a lot of people just don't cache installing dependencies

long knoll
#

at some point I guess I'll have to learn those technologies properly. It's kinda hard to find a motivation when I'm so accustomed to solo dev

#

the other thing that would be interesting to figure out, is how much of PyPI's bandwidth is actually serving packages, vs. metadata-related requests (oh, and I guess the actual pypi.org pages count for something too...)

marsh kite
#

But they shouldn't really happen as far as I know

stoic mural
#

yes, currently range requests are indistinguishable from full-file requests in terms of download counts

#

in the last week pypi.org served 1.74 PB (this includes Simple and JSON APIs, project pages, etc) while files.pythonhosted.org served 25.88 PB (can't distinguish between metadata-only and full files)

long knoll
#

that's quite a bit more than I recall the average rate looking like...

#

didn't they say something like 600 PB total for 2023? is it going up o_O

stoic mural
#

oh yeah, it's definitely going up

#

that's jan 1 2023 to now

long knoll
#

wow, close to double yeah

long knoll
stoic mural
#

sure

jaunty marlin
#

I guess (B) must be cached more? (edit: Discord way out of date for some reason... I see the discussion now)

robust sandal
junior narwhal
jaunty gust
silk jungle
junior narwhal
#

shoutout to whoever added this tip callout, I've gone to that page so many times just to look at the codes and always have to muddle around finding the section because the page is so long

dreamy hatch
#

You're welcome!

lyric quiver
#

turns out

#

Pytorch 2.7.0 is now 1.1GB* larger than before

#

went from 6.1GB* to 7.28GB

#

diabolical

#

☣️

#

good luck packaging this in the max 2GB filesize limit github provides

lyric quiver
#

Crazy

serene fulcrum
#

@lyric quiver how can a 3338MB wheel lead to a 7.28GB folder/file? I'm confused

lyric quiver
#

TensorRT and Pytorch primarily

serene fulcrum
#

Oh so it's not just PyT

lyric quiver
#

oh, not at all

#

there's a couple more, but 6.9-7.0GB of those 7.28GB are just Pytorch and TensorRT

serene fulcrum
#

gotcha gotcha. Well I guess @torn fjord will have immense fun very fun

#

Static linking for the win 😄

lyric quiver
#

after 3h of grinding my teeth & 2 energy drinks

#

I was able to lower it to 90mb