#general


frigid jewel
#

that's a byproduct of the pep i'd say

#

do we see that in the python discord?

elder tinsel
#

not only that, but sys.executable is right there to get the right interpreter, this isn't hard for malware authors with or without it and this is a massive red herring

frigid jewel
#

i can't tell you how many times i have helped (and so have you @shen) people who have multiple interpreters on their system and weren't aware of it
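
A quick way to see which interpreter is actually running, and whether it lives in a virtual environment, is sketched below. It uses only the standard library; the helper name `describe_interpreter` is made up for illustration.

```python
import sys


def describe_interpreter() -> dict[str, object]:
    """Report which Python is running and whether it is inside a venv."""
    return {
        # Absolute path of the interpreter binary executing this code.
        "executable": sys.executable,
        "version": ".".join(map(str, sys.version_info[:3])),
        # In a venv, sys.prefix points at the venv while sys.base_prefix
        # still points at the installation the venv was created from.
        "in_venv": sys.prefix != sys.base_prefix,
        "base_prefix": sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running this with each `python` found on PATH tends to make a forgotten second interpreter obvious.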

elder tinsel
#

don't deprive yourself of good tools just cause malware authors might also use them

frigid jewel
indigo token
#

Use a Linux distribution instead. They vet the packages they re-package. Some more than others

frigid jewel
elder tinsel
#

No, but you can tell people not to run random code they don't trust

#

which is just a good general piece of advice

frigid jewel
elder tinsel
#

This isn't unique to python. it's no different from users downloading random executables and not really your responsibility or a reason to cripple good tooling

#

There are significant benefits (including security ones) to scripts inlining what they depend on in a predictable format. This makes static analysis easier when scripts do this, and for regular, non malicious cases, it ends up having the right behavior by default, while being a self documenting set of requirements

blazing lantern
#

FYI all the tools I know implementing inline script metadata are putting it behind a run subcommand and are creating ephemeral virtual environments to install into.

lusty quarry
elder tinsel
#

yeah, all I've seen so far is that the script at least gets its own venv. I don't buy this as a security issue since the worst that happens here is that malicious dependencies fetched this way are more obvious.

blazing lantern
glass sand
#

I've been working on https://github.com/pypa/packaging-problems/issues/342 and I have a proposal I'd like to circulate with the system integrators at places like Conda, Debian, Fedora, and Spack. I know some of them, but I'm not sure I'll be able to come up with a proper distribution list. I'm thinking this Discord server should have an #integrators channel where we invite stakeholders for those projects to coordinate relevant concerns. Any objections? Should it go under Lobby or Other Projects?

blazing lantern
glass sand
honest geyser
blazing lantern
elder tinsel
#

I think py script.py resulting in dependencies for scripts being fetched is in that territory of "users might be surprised", but I don't have any actual problem with it.

blazing lantern
fading veldt
#

I would absolutely hope that a script asks for confirmation before trying to download something unless I already expect the script to download something

queen hornet
elder tinsel
placid wyvern
# elder tinsel There are significant benefits (including security ones) to scripts inlining wha...

Catching up on this (and resurrecting a dead topic a bit, I see I was mentioned but I never got around to addressing this.)
My contention with in-lining requirements is actually from a code-scanning perspective, I'm not sure it's an overt security issue.
It's going to be a bit precarious (depending on how inlined requirements are handled) if we need to start resolving deps outside of the 'usual suspects'.

If you're familiar with the 'fatal funnels', it is remarkably hard to identify installation and import of malicious modules as it stands right now-- it's a computational issue. Resolution of dep trees on packages is something I'd love to do with our framework, but it's not really feasible at our scale. If packages start also having components with in-lined requirements, it's a bit inconvenient for me to also try to resolve these to the best of my ability.

At the moment, it's computationally inexpensive for me to detect things in pyproject and setup that are misbehaving or malicious. It will be slightly problematic if I need to start looking everywhere for this behavior. We're trying to out-pace this, in the coming days/weeks/months, I should have a dynamic analysis sandbox that can be run transparently alongside anti-malware scanning that is specifically instrumented to detect this behavior.

But Python is awful to sandbox. This is a known commodity, and it's taken some very creative thinking to get us to this point where this is accomplishable at scale and within the requirements of CSPs' AUPs.

tl;dr: My contention with this being a 'security issue' is mostly that it doesn't present any new or novel attack methodology, but instead drives my costs up a bit to try and mitigate. It is what it is, I've been fairly quiet about it thus far for a reason. 🙂

obsidian lily
#

wow wall of text

#

traumatic memories resurfacing

elder tinsel
# placid wyvern Catching up on this (and resurrecting a dead topic a bit, I see I was mentioned ...

Yeah... I think the script block is actually an improvement here though. Unlike import x (without a script block with it), where you don't necessarily know what x is, with script blocks only the dependencies specified in the script block or in the standard library will be in the environment.

It's possible to extract out the actual dependencies as they will be passed to a resolver from it with a parser that has predictable performance characteristics.... actually resolving those dependencies to check for malice still has all of the existing problems about package resolution though

obsidian lily
#

also pyproject malware?

elder tinsel
#

there are various non-obvious ways to smuggle malicious code in via pyproject.toml

obsidian lily
#

huh

west drift
#

For wheel tags, what's the difference between cp39-none-macosx_10_9_x86_64 and cp39-cp39-macosx_10_9_x86_64?

The only docs i found only say (https://packaging.python.org/en/latest/specifications/platform-compatibility-tags/#faq):

Why is the ABI tag (the second tag) sometimes “none” in the reference implementation?

Since Python 2 does not have an easy way to get to the SOABI (the concept comes from newer versions of Python 3) the reference implementation at the time of writing guesses “none”. Ideally it would detect “py27(d|m|u)” analogous to newer versions of Python, but in the meantime “none” is a good enough way to say “don’t know”.

limber cliff
#

Hi, for the last hour our deployments have been failing while trying to install v1.3.5 of the package "install". I just checked on PyPI and I see a 404: https://pypi.org/project/install/
Anyone else facing the same issue?

torpid veldt
fading veldt
#

It's now part of the prohibited names list

torpid veldt
#

@limber cliff do you need the module? did you add it as part of a typo? or does a third party package depend on it?

torpid veldt
#

Otherwise I'd recommend forking the repo and using that as a git:// dependency

elder tinsel
#

😐

#

This package only exists to install things in source code rather than in normal places for requirements passed to pip, and is a thin wrapper around subprocess and pip, I'd just stop relying on this by explicitly adding your dependencies

torpid veldt
elder tinsel
#

I'd look at anything depending on this as a hazard

torpid veldt
#

@humble phoenix why are you installing install? What's the output from your failing pip?

limber cliff
#

thanks, we just removed dependency on the package.

torpid veldt
#

That seemed to be how fastapi-keycloak-middleware ended up depending on it

humble phoenix
#

I've been trying to track down if it was a dep of a dep, but that does not seem to be the case

limber cliff
#

probably it was some mistake like pip install install, just removed it and letting it run the tests 🙂
Exactly, lock file doesn't have it as a dep of a dep, so mostly a typo 😢

torpid veldt
#

Like including a reason

fading veldt
#

there's no standard for the repository to declare that AFAIK

torpid veldt
#

Not in simple/ just in the html UI for now

fading veldt
#

Oh you mean pypi not pip

#

Yeah, should honestly be decently easy to change warehouse to show something

torpid veldt
#

Yes that's a horrendous error on my part

#

I'm shook

fading veldt
#

All good, I understand what you meant now

blazing lantern
west drift
#

Is there an environment that could install one, but not the other?

fading veldt
#

Yes, I believe that a free-threaded Python would work with cp313-none-macosx_10_9_x86_64

fading veldt
west drift
#

Thanks i see now!

frigid jewel
#

@blazing lantern sorry for the ping, would you be open to modifying the py launcher on unix a little so that it doesn't obscure the parent of an activated venv?

#

currently on Windows we see this

❯ py -0
  *               Active venv
 -V:3.11          Python 3.11 (64-bit)
 -V:3.10          Python 3.10 (64-bit)
 -V:3.9           Python 3.9 (64-bit)
 -V:3.8           Python 3.8 (64-bit)
 -V:3.7           Python 3.7 (64-bit)
whereas on linux we see this

❯ py --list
3.12 │ /usr/bin/python3.12
3.11 │ /usr/bin/python3.11
3.10 │ /home/azureuser/niner_test/test/venv/bin/python3.10
3.9 │ /usr/bin/python3.9
3.8 │ /usr/bin/python3.8

#

i mean i can extrapolate the path based on the last path segment,
but it would be nicer if it had /usr/bin/python3.10 and showed something like active venv, but i'd like an output closer to py -0p

#

PSS: i should've opened a github issue now that i think about it

high stone
#

👋🏻 @hardy walrus! Nice to see you here!

hardy walrus
lyric hedge
hardy walrus
uneven karma
#

Did you mean: X?

lyric hedge
#

shut—

woven yarrow
#

This is probably not the right place to ask, but I have a failure at the intersection of wheels/pip/free-threading/nightly-builds/tox and maybe someone here can direct me to the right place. Trying to run the coverage.py test suite on 3.13t (and 3.14t), but I get this failure:

% COVERAGE_ANYPY=/usr/local/pyenv/pyenv/versions/3.13t-dev/bin/python3 tox -re anypy
anypy: remove tox env folder .tox/anypy
.pkg: remove tox env folder .tox/.pkg
anypy: pip-24.2-py3-none-any.whl already present in /Users/ned/Library/Application Support/virtualenv/wheel/3.13/embed/3/pip.json
anypy: install_deps> python -m pip install -U -r requirements/pip.pip -r requirements/pytest.pip
.pkg-cpython313: remove tox env folder .tox/.pkg-cpython313
.pkg-cpython313: install_requires> python -I -m pip install setuptools
.pkg-cpython313: _optional_hooks> python /usr/local/virtualenvs/coverage/lib/python3.8/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg-cpython313: get_requires_for_build_editable> python /usr/local/virtualenvs/coverage/lib/python3.8/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg-cpython313: build_editable> python /usr/local/virtualenvs/coverage/lib/python3.8/site-packages/pyproject_api/_backend.py True setuptools.build_meta
anypy: install_package_deps> python -m pip install -U 'tomli; python_full_version <= "3.11.0a6"'
anypy: install_package> python -m pip install -U --force-reinstall --no-deps .tox/.tmp/package/3/coverage-7.6.1a0.dev1-0.editable-cp313-cp313-macosx_14_0_arm64.whl
ERROR: coverage-7.6.1a0.dev1-0.editable-cp313-cp313-macosx_14_0_arm64.whl is not a supported wheel on this platform.

anypy: exit 1 (0.35 seconds) /Users/ned/coverage/trunk> python -m pip install -U --force-reinstall --no-deps .tox/.tmp/package/3/coverage-7.6.1a0.dev1-0.editable-cp313-cp313-macosx_14_0_arm64.whl pid=97510
  anypy: FAIL code 1 (9.99 seconds)
  evaluation failed :( (10.41 seconds)
queen hornet
#

is it because the wheel is cp313 but your interpreter 3.13t so expecting cp313t?
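
One way to check that guess is to compare the wheel's ABI tag with what the running interpreter would expect. The sketch below uses only the standard library; the helper names are hypothetical, and `running_cpython_abi` is a simplification that only accounts for the free-threaded `t` suffix (it ignores debug builds and non-CPython implementations).

```python
import sys
import sysconfig


def wheel_tags(filename: str) -> tuple[str, str, str]:
    """Split a wheel filename into its (python, abi, platform) tag fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # The last three dash-separated fields are always the tags; an optional
    # build tag (here "0.editable") sits between the version and the tags.
    python_tag, abi_tag, platform_tag = filename[: -len(".whl")].split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def running_cpython_abi() -> str:
    """Approximate ABI tag of the running CPython (simplified, see lead-in)."""
    # Py_GIL_DISABLED is set to 1 on free-threaded builds.
    suffix = "t" if sysconfig.get_config_var("Py_GIL_DISABLED") else ""
    return f"cp{sys.version_info.major}{sys.version_info.minor}{suffix}"


wheel = "coverage-7.6.1a0.dev1-0.editable-cp313-cp313-macosx_14_0_arm64.whl"
python_tag, abi_tag, platform_tag = wheel_tags(wheel)
print(f"wheel ABI tag: {abi_tag}, running interpreter ABI: {running_cpython_abi()}")
```

If the second value ends in `t` while the wheel's ABI tag does not, pip on that interpreter would be expected to reject the wheel as unsupported.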

uneven karma
#

Impromptu #meta:
Re: #pip message
Links to end user support are automodded?

high stone
#

Posting links to other discords.

#

It's how we're dealing with the spam bot attacks we had a little while back (they still show up regularly and get caught by those checks).

inland prawn
#

@high stone did you consider a "honeypot channel" strategy?

#

(requires a bot though)

high stone
#

We should probably have the Python discord allow listed here tho.

high stone
lyric hedge
#

huh, are masked links not matched?

#

I'll allowlist all of the current project roles for now. I'll take a closer look sometime later *waves hand*

high stone
#

huh, are masked links not matched?

brittle token
lusty quarry
#

PEP about repository namespaces is now live and ready for review! https://discuss.python.org/t/pep-752-package-repository-namespaces/61227

celest apex
#

Question about packaging.
I have a package that can use multiple 'backend' code.
The backend codes can be put inside sub packages.
I want an easy user-friendly way to select packages
Is there anyway this can be done with pip install package[subpackage] ?

Or perhaps

pip install package # default
pip install package.subpackage # option 1
#

I use scikit build core and backends are c++ codes
I want to have a default implementation in python for example

distant briar
brittle token
#

To allow the different subpackages to be installed separately:

  1. Define a namespace subpackage (e.g. package.backends) where the different subpackages put their code
  2. Publish the backends separately (e.g. package-backend-subpackagename1, package-backend-subpackagename2)
  3. Define extras on the main project that depend on the different backends (e.g. package[subpackagename1], package[subpackagename2])

Steps 1 & 3 are generally available in all Python package build tools, but I don't know if any of them offer good support for publishing multiple different packages from one repository.
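
On the import side, the main package then needs to pick whichever backend happens to be installed. A minimal sketch of that selection, assuming hypothetical module names (`mypackage_hypothetical.backends.cpp` is not a real project, and `math` merely stands in for a pure-Python default so the snippet runs anywhere):

```python
import importlib
from types import ModuleType


def load_backend(candidates: list[str]) -> ModuleType:
    """Import and return the first backend module that is installed."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Backend not installed (its extra was not requested); try the next.
            continue
    raise ImportError(f"none of the backends are installed: {candidates}")


# Preferred backend first, fallback last. Both names are placeholders.
backend = load_backend(["mypackage_hypothetical.backends.cpp", "math"])
print(backend.__name__)
```

Ordering the candidate list from most to least preferred keeps the default behavior predictable when several extras are installed at once.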

lusty quarry
fallen shuttle
topaz pine
#

It might be a spammer, but since the user's activity is private it's hard to know if the user is a real person or not.

#

I'd suggest to block the user if it opens more empty tickets

queen hornet
#

And report them to GitHub via the "..." button

high stone
#

Blocked

fallen shuttle
# queen hornet And report them to GitHub via the "..." button

I have been frustrated with how Github handles the report.

Recently they send an automatic reply that basically tells you to do the block in your own organisations. You need to send a second email asking them to take global action:

Thanks for taking the time to let us know.
 
If you have write permissions to a repository, you can hide, edit, or delete comments by using the tools described here:
 
Managing disruptive comments
 
Additionally, users with admin permissions in a repository can permanently delete an issue from a repository by following these instructions:
 
Deleting an issue
 
Keep in mind, you may block users at any time by following the steps outlined in either of the articles below.
 
Blocking a user from your personal account
Blocking a user from your organization
 
We're happy to take a closer look at the account in question to determine if any action beyond maintainer moderation is necessary. If you have any additional details or information to share with us at this time, feel free to do so, and please let us know if we can help in any other way.
fallen shuttle
queen hornet
high stone
#

I've basically had a ~80% success rate with getting people banned.

#

ALA GitHub deleting accounts

distant briar
queen hornet
#

It's not clear, especially the "We're happy to take a closer look ... if any action beyond maintainer moderation is necessary. If etc..."

queen hornet
#

today's abuse report autoreply was much clearer:

Thank you for contacting GitHub Support. We wanted to let you know that we've received your message. We are experiencing high volumes and therefore, you may experience longer than normal wait times. In the meantime, you may find answers to commonly asked questions in our community forum or in our documentation.

Ticket ID: <snip>

If you have any additional information or would like to add anything to your initial message, now would be a great time to do so, feel free to reply to this email. If not, then rest assured your request is in the right hands 🙂
Thank you!
The GitHub Support Team

fallen shuttle
#

Not sure why packaging-problems is so targeted by these spammers/bots...

queen hornet
frigid jewel
honest geyser
#

We're getting a lot of the spam too

uneven karma
#

They’re hitting everyone

#

_And now for something completely different_
I shall regale you with my last screenshot before that one:

fallen shuttle
lyric hedge
#

Hi, I wrote a thing. About a month ago, we released pip 24.2. There are some fairly neat changes and one important deprecation included in this release (the deprecation of legacy setup.py develop editable installs) . While the changelog is an accurate summary of the changes, I felt like a deeper dive into the changes (such as the reasoning and historical context) would make for an interesting article, which now exists!

https://ichard26.github.io/blog/2024/08/whats-new-in-pip-24.2/

Please enjoy!

#

It's fair to say that we got a bunch of packaging ~~nerds~~ experts in here, so I won't include a TL;DR :P

lyric hedge
#

OK, of course, no one caught the fact that the setup.py develop fallback is going to be killed with pip 25.0. Not 25.1. Ugh. Excellent job me.

fallen shuttle
#

Thanks for the text @lyric hedge , one note regarding the paragraph:

If you stick with setuptools, one potential snag is that setuptools has introduced a new kind of editable installs while rolling out PEP 660. They’re called strict editable installs, which behave closer to a normal installation, but are implemented in an entirely different way from setup.py develop, potentially breaking certain workflows.

Please note that strict mode is not the default in setuptools. It only happens on request (i.e. editable_mode=strict), so it should not be a snag (unless the user wants it to be).

#

The default in setuptools is to be "reasonably" lenient (i.e., if your project uses a src layout, it will use a static .pth file; if your project uses a flat layout, setuptools will expose the top-level packages and everything under them)

#

Accordingly, the paragraph is imprecise:

Warning: Static analysis tools including mypy, pyright, and pylint may not function properly with strict editable installs. This is a known issue. The recommended workaround is to pass --config-settings editable_mode=compat.

The problem here is not strict editable installs. I do believe that static analysis tools work very well with strict editable installs (as they use hard/sym links).
The problem is that static analysis tools (justifiably) don't know how to handle some fundamental aspects of the import machinery, such as sys.meta_path and sys.path_hooks, which enable editable installs for flat layouts in the setuptools implementation (setuptools is careful not to expose auxiliary top-level Python scripts such as noxfile.py, conftest.py, setup.py - or whole folders like tests and docs - as if they were part of the installation - and that is the compromise I found regarding the polemic on PEP 660).
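
To illustrate why import hooks are opaque to static analysis, here is a toy `sys.meta_path` finder that maps a module name to a file stored somewhere unrelated to `sys.path`. This is only a sketch of the general mechanism, not the setuptools implementation; `MappingFinder` and `demo_editable_pkg` are made-up names.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import tempfile
from pathlib import Path


class MappingFinder(importlib.abc.MetaPathFinder):
    """Resolve selected top-level module names to arbitrary file paths."""

    def __init__(self, mapping: dict[str, str]) -> None:
        self.mapping = mapping

    def find_spec(self, fullname, path=None, target=None):
        location = self.mapping.get(fullname)
        if location is None:
            return None  # Not ours; let the regular finders handle it.
        return importlib.util.spec_from_file_location(fullname, location)


with tempfile.TemporaryDirectory() as tmp:
    # The file name deliberately differs from the importable module name.
    source = Path(tmp) / "somewhere_else.py"
    source.write_text("VALUE = 42\n")

    finder = MappingFinder({"demo_editable_pkg": str(source)})
    sys.meta_path.insert(0, finder)
    try:
        demo = importlib.import_module("demo_editable_pkg")
    finally:
        sys.meta_path.remove(finder)

print(demo.VALUE)
```

Nothing on disk tells a type checker that `demo_editable_pkg` maps to `somewhere_else.py`; that link only exists at runtime inside the finder, whereas a static .pth entry or a symlink is visible without running any code.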

lyric hedge
#

I do have to say this could've been clearer, but I accept responsibility for not catching the error. I'll fix the post either much later tonight or tomorrow, depending on how my free time works out.

#

Sorry about that. I really don't mean to paint y'all as incompetent engineers or label setuptools as a bad backend.

lyric hedge
#

Thanks, I do wish it was later so I had a chance to fix the post, but oh well.

#

I held off on posting it to reddit and DPO so I could get the story right without leaving a ton of people in the dark

#

Ah, I see it was already posted on Mastodon earlier, that's fine then.

#

Does Mastodon disable the referrer header? I'm surprised to see zero referral traffic from there in my analytics.

#

Yup, they do. So you're the reason why the post is already making the rounds even though I've been coy about it so far, haha

frigid jewel
#

is there an env variable that tells pip to install the latest dependencies onto a venv? i.e latest pip

fallen shuttle
# lyric hedge Ugh. I should've asked for a review on the setuptools text from you. I felt like...

No problems, I myself am confused sometimes by this.

But this added complexity was the only way I found out of the stalemate that we were in before implementing PEP 660: a compromise default behavior that was lenient but avoided the most criticised silly side effects (of simply adding the top-level project folder to sys.path) + an option for the user to be strict (opt-in).

Later, because of projects possibly relying on those same silly side effects (Hyrum's law) and/or some potential hiccups with namespaces, I added the compat mode.

Initially I wanted it to go away very quick, but it turned out to be very useful regarding the limitations of static analysis tools for import hooks (which are used by setuptools in the case of flat and custom layouts). So we probably need to find a way out of this problem before it can be fully gone...

fallen shuttle
fallen shuttle
fallen shuttle
#

Despite the terrible complexity, I believe setuptools implementation of PEP 660 is the most comprehensive working in the wild. We basically cover every single method mentioned in the PEP (and by this point I know by heart the flaws of all of them).

One thing that I am proud of in this implementation is having figured out a way of implementing namespaces via importlib hooks (super complicated, and difficult to do - I have an open ticket on CPython about it).

One day in the future (hopefully before the anniversary of 20 years of the PEP), I can sit down and write my feedback about 660 and how it can be improved.

lyric hedge
fallen shuttle
#

Thank you!

honest geyser
#

Does anyone know how 3.13 free-threaded builds should be tagged? https://github.com/indygreg/python-build-standalone/issues/320#issuecomment-2336723880

If we have cpython-3.13.0rc2-aarch64-apple-darwin-noopt-20240907T1906.tar.zst how are we supposed to add a free-threaded tag? e.g. 3.13.0rc2t and 3.13.0trc2 are both "invalid" versions.

Similarly, in uv we'll need a way for users to request these versions. I thought --python 3.13t made sense but the collision with pre-release tags is problematic. We can definitely strip the t when parsing the version but I'm not sure if it's "the right thing".

I feel like I saw some discussion about this previously but can't find anything in the PEP.
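
The "strip the t when parsing" idea could look roughly like the sketch below. It is purely illustrative and not how uv or any other tool actually parses requests; `PythonRequest` and `parse_python_request` are hypothetical names, and only `major.minor` with an optional trailing `t` is accepted.

```python
import re
from typing import NamedTuple


class PythonRequest(NamedTuple):
    major: int
    minor: int
    free_threaded: bool


# major.minor, optionally followed by a "t" marking a free-threaded build.
_REQUEST = re.compile(r"(?P<major>\d+)\.(?P<minor>\d+)(?P<t>t)?")


def parse_python_request(text: str) -> PythonRequest:
    """Parse requests such as "3.13" or "3.13t" (hypothetical syntax)."""
    match = _REQUEST.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported Python request: {text!r}")
    return PythonRequest(
        major=int(match["major"]),
        minor=int(match["minor"]),
        free_threaded=match["t"] is not None,
    )


print(parse_python_request("3.13t"))
```

Keeping the flag as a separate field sidesteps the question of where a `t` would sit relative to a pre-release segment such as `rc2`, which this sketch deliberately rejects.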

lusty quarry
honest geyser
#

That's cool 🙂

elder tinsel
honest geyser
#

Yeah wheels add a t to the ABI tag but we don't have one of those in the distributions

elder tinsel
#

ah

brittle token
#

I was about to reply here, but realised it made more sense to put my question on the GH issue.

honest geyser
#

👍 yeah that seems good for visibility / discussion

brittle token
#

Specifically, if noopt indicates the default build, then that's where the marker for it being a free-threaded build would go.

honest geyser
#

Yeah I think that kind of makes sense but then we need a free-threaded variant of all the optimization kinds?

brittle token
#

Oh, it's short for optimization, not option. Might need a new field independent of that one, then.

honest geyser
#

I think it actually is "no options" because it's "a regular optimized build"

#

Anyway, thanks for your reply

brittle token
honest geyser
#

Awesome thanks

toxic cypress
#

What's the simplest way to build and package a small C extension? My C extension doesn't represent a whole library, literally just a single function I need to work around a ctypes limit

hollow trout
#

Anything wrong with just using setuptools?

toxic cypress
hollow trout
toxic cypress
#

That sound kosher?

hollow trout
#

I've no experience with that kind of project, so I can't advise there, sorry.

toxic cypress
#

Fair enough. Thanks anyway. 👍

brittle token
high stone
woven plover
#

The pypi namespace debate makes me confusedly jealous of Java reverse dns package names - name ownership grants prefix ownership

brittle token
#

Having gone through multiple org name changes, acquisitions, etc, tying software component names directly to comparatively ephemeral organisation names is awful in practice. It's like someone saw Conway's Law and said "Yes, please, I'll have some more of that!" instead of going "Ugh...".

mortal shore
#

I don't really see any of the more "rigid" namespace ideas as even workable

brittle token
#

I can see the appeal in their apparent clarity and simplicity, but when reality is messy, trying to apply only clear and simple models gets complicated fast (and I believe the Zen of Python has something to say about that)

woven plover
#

Hence the confused jealousy, it's less complex and more complicated on the hidden side

brittle token
#

Ah, I missed the "confusedly" in your first comment. Yeah, it can definitely be easier to see the good parts from a distance (and now I have Maven flashbacks, so if hearing "pom.xml" doesn't induce any kind of creeping horror, be happy).

woven plover
#

My last serious contact with java was on a sun sparc workstation with netbeans on solaris back in 2008 for university classes

oak sequoia
#

I may have misunderstood the original point, but I want to use this moment to think out loud a bit: I think namespacing packages would be a great improvement for Python packaging. Using Maven's GAV coordinates as an example, a package coordinate system like this is great in that it allows multiple package maintainers to arrive at the same package name like requests (simple, intuitive, concise) while allowing all those maintainers to publish their work simultaneously thanks to the multidimensional "key" of group (usually reverse DNS name) + package name.

#

Imo right now Python packaging is hard for newcomers like myself because there are so many packages on PyPI that there's almost no chance a clear and concise name is available for use still. And, like in a recent case of mine, pypi.org doesn't necessarily expose all the information necessary for package authors to know ahead of time that a name isn't available for use so they go to publish it only to be rejected. In addition to this the existing package coordinate system seems to have put a lot of additional burden on pypi.org maintainers in the form of PEP 541 requests from people trying to revive old packages with great names that have been abandoned, or just revive a name that was used briefly and discarded.

brittle token
#

Folks can already discriminate packages they publish with a personal or organisation identifier so they don't need to worry about PyPI top-level name availability, and there are plenty that choose to do so. It's a consequence of package distribution names and import names being allowed to be different. (This relationship between implicit and explicit namespace prefixes came up in https://discuss.python.org/t/pep-752-package-repository-namespaces/61227/88 , but it's easy to miss in the volume of discussion on the prefixing proposals)

oak sequoia
#

Definitely, thanks for sharing! I'll take a look at this - seems to be a very thorough discussion.

Folks can already discriminate packages they publish with a personal or organisation identifier so they don't need to worry about PyPI top-level name availability

Do you have an example of this? After a quick read of PEP 752 I think I understand what you mean to be this sort of thing: mkdocs-material, pytest-cov, jupyter-*, is that right?

fading veldt
#

So there is discussion going on in the conda ecosystem about vNext, aka the next ABI version of MSVC compiled binaries. https://github.com/conda-forge/conda-forge.github.io/issues/2295

For the past decade we've been fortunate to not need to worry about ABI breaks on Windows, but an MSVC ABI break would effectively break the stable ABI promise for Python (i.e. a wheel tagged abi3 compiled with v14 will not work with a future vNext CPython distribution, even if the Python stable ABI is the same).

So what do folks think we should do about this? I think the status quo could lead to many headaches of people trying to make compatible wheels and having difficulty supporting newer Python releases.

brittle token
brittle token
fading veldt
lusty quarry
fading veldt
#

Ah I haven't posted yet @lusty quarry

#

I'll post it some time tonight

obsidian hedge
#

I am using Ubuntu 18.04.6. While installing the package with pip install simsimd I get this error:
Building wheels for collected packages: simsimd

ERROR: Failed building wheel for simsimd
Failed to build simsimd
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (simsimd)

fading veldt
#

Simsimd build failure

fading veldt
lusty quarry
merry rune
#

Dep groups PEP and uniform URL PEPs accepted! 🙂

merry rune
#

Is it really 753 and 735 that got accepted at the same time? I'm totally going to muddle those numbers. 🙂

shut quest
#

there isn't a way to get a "default" extra right? where the normal install installs more dependencies than a more minimal extra?

spiral urchin
#

No

merry rune
#

Yes, that's one feature I'd like in a future revision. It's in Rust, and there are lots of places I'd like to use it. Like making build depend on virtualenv, but allow it to be skipped for minimal bootstrapping

high stone
#

It's something that's got a not-yet-proposed PEP for it, in the works.

obsidian lily
#

is it better practice to use the simple api or the json api

inland prawn
#

Simple API is cross-index, json API is warehouse-only

spiral urchin
#

Technically Simple API has a JSON option so it’s not Simple vs JSON. After PEP 691 it’s always best to use the Simple API (either in HTML or JSON format), while the legacy JSON API is specific to pypi.org.

lusty quarry
#

to be even more specific, the JSON option of the simple API I believe is preferred nowadays and improvement PEPs sometimes only target that

inland prawn
#

well, PyPI "legacy" JSON API was never PEP-ified AFAIK and was never marked as something stable

bright parcel
#

I've been thinking about the PyPI JSON API a lot recently: https://github.com/pypi/warehouse/pull/16912

I'd really like to extend this to a much more robust and general REST API so more of PyPI can be scripted. One thing in particular is PEP 694, with my proposed changes here: https://github.com/python/peps/pull/3997 -- it makes sense to me to make this exclusively a JSON API.

For background, I helped build the Launchpad REST API, then took those concepts and wrote the Mailman 3 REST API: https://docs.mailman3.org/projects/mailman/en/latest/src/mailman/rest/docs/rest.html and in my $job-1 was tech lead for their REST API. I have a lot of experience... and opinions 😆

The question is, what's the best way to move forward? PEP 694 is a pretty large feature so has its own PEP (which I might update with things I learned doing the above PR). That above PR, programmatic yank, doesn't have a PEP. Maybe we need a PEP for the overall design and approach for a generalized PyPI/warehouse REST API, but I want to be careful not to include too much detail, and I'm not sure every new endpoint needs a PEP.

This is probably not the right forum to discuss the details, but given the above thread, I thought I'd at least mention my thinking. Either #pypi here or more likely a DPO thread to start the ball rolling.

inland prawn
#

hot take: all APIs should be PEPified standard so that we can have nice features not only for PyPI, but for other package indices too (kinda forcing others to follow the rules)

west drift
#

i would love a proper REST API for warehouse

#

imho the important part for new apis is buy-in from other registries (gitlab, aws, google cloud, artifactory, cloudsmith, codeberg, etc.)

#

from my "uv implementor" perspective, it's great if warehouse supports those features, but to design a matching workflow in uv around it, we need to know that (or at least, be able to test if) other registries support those features too

honest geyser
west drift
#

we shouldn't try to design a standard for others to implement without talking to them

honest geyser
#

I just don't think they're actually going to engage and implement these things. What was the process like for the other changes?

#

I agree it'd be great if they did though

#

Is the goal to make a standard for everyone or for PyPI to continue to improve?

honest geyser
#

Because they're already years behind on standards?

#

(From a tooling perspective, @west drift is right that if it's not a standard it'll be harder to justify adding support, e.g., in uv)

indigo token
#

The PyPI JSON API works too well to become legacy organically. With it, I do:

url = f"https://pypi.org/pypi/{pypi_name}/{version}/json"
resp = await http_client.get(url)
resp.raise_for_status()
metadata = cast(packaging.metadata.RawMetadata, resp.json()["info"])

The simple JSON API needs me to do this thing, which has 2 requests and I’m less sure if it’s correct

url = f"https://pypi.org/simple/{pypi_name}/"
headers = {"accept": "application/vnd.pypi.simple.v1+json"}
resp = await http_client.get(url, headers=headers)
resp.raise_for_status()
resp_json = resp.json()
url = next(  # TODO: need to filter more, like yanked and stuff
    f["url"]
    for f in resp_json["files"]
    if f["filename"].startswith(f"{pypi_name.replace('-', '_')}-{version}")
    and f["data-dist-info-metadata"]
)
resp = await http_client.get(f"{url}.metadata", headers=headers)
resp.raise_for_status()
metadata, _ = parse_email(resp.text)
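
A rough sketch of the extra filtering the TODO above hints at, written as a pure function over an already-fetched Simple API JSON page so it needs no network access. The helper names `normalize` and `pick_metadata_url` are made up for illustration, the version check is a plain string comparison (it assumes the version is already in normalized form), and the keys checked (`yanked`, `core-metadata`, `dist-info-metadata`) are the ones I understand PEP 691 and PEP 714 to describe; an index may expose others.

```python
import re
from typing import Any, Optional


def normalize(name: str) -> str:
    # Project name normalization as described in PEP 503.
    return re.sub(r"[-_.]+", "-", name).lower()


def pick_metadata_url(index_page: dict[str, Any], name: str, version: str) -> Optional[str]:
    """Return the URL of a metadata file for name==version, or None (hypothetical helper)."""
    wanted = normalize(name)
    for f in index_page.get("files", []):
        filename = f["filename"]
        if not filename.endswith(".whl"):
            continue  # only wheels are considered in this sketch
        # Wheel filenames: {name}-{version}(-{build})?-{python}-{abi}-{platform}.whl
        parts = filename[: -len(".whl")].split("-")
        if len(parts) < 5:
            continue
        if normalize(parts[0]) != wanted or parts[1] != version:
            continue
        if f.get("yanked"):  # may be False, True, or a reason string
            continue
        if f.get("core-metadata") or f.get("dist-info-metadata"):
            return f["url"] + ".metadata"
    return None


# Made-up sample page for a made-up project.
page = {
    "files": [
        {"filename": "demo_pkg-1.0.tar.gz", "url": "https://example.invalid/demo_pkg-1.0.tar.gz"},
        {
            "filename": "demo_pkg-1.0-py3-none-any.whl",
            "url": "https://example.invalid/a.whl",
            "yanked": "broken upload",
            "core-metadata": True,
        },
        {
            "filename": "demo_pkg-1.0-1-py3-none-any.whl",
            "url": "https://example.invalid/b.whl",
            "yanked": False,
            "core-metadata": {"sha256": "0" * 64},
        },
    ]
}
chosen = pick_metadata_url(page, "Demo.Pkg", "1.0")
```

This still needs two requests per lookup; it only makes the file selection step a bit less fragile than a `startswith` check.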
west drift
#

the upload api doesn't have a spec currently

honest geyser
#

What about 658 and 714? I'm sure you're more familiar than I am

honest geyser
#

I also presume things like support for range requests aren't in a spec

west drift
#

PEP 714 is a fixup for PEP 691, but yeah PEP 658 would eliminate the need for range requests

west drift
west drift
brittle token
#

We haven't heard a peep out of them in the past 10+ years, so unless someone has personal contacts, that doesn't seem likely to change anytime soon. When the client tools have things they can only do when the commercial repositories support the modern API standards (like reliable staged releases with the upload 2.0 API), then we might potentially see some movement from them, but until then, I wouldn't expect anything.

west drift
#

my last experience was that it took one email to get a vendor to change their authentication to match pypi's

mystic stag
#

My experience working inside large organizations is many developers don't realize that getting involved in standards discussions is even an option, it can be a workflow of: 1. Look at most popular implementation, 2. Copy behaviour so that internal/external client are happy, 3. Move on and never think about that code again

#

And generally the point of leverage is the client going from happy to unhappy, I've been hopeful with uv working great on PyPI and occasionally very slowly on indexes that don't support certain features, that this will push these indexes to update, though I expect timeframe will be years not months

inland prawn
#

meanwhile, github still doesn't have a python package repository

uneven karma
brittle token
#

Depending on when they tried, frustration with the lack of standardisation could legitimately have been the reason. It's been a long journey cleaning up the "seemed like a good idea at the time" legacy from when the Cheese Shop was first created (coughXML-RPCcough).

mystic stag
#

Users still complain pip search no longer works

inland prawn
#

meanwhile PDM and Poetry parsing PyPI search result page

bright parcel
#

Generally I agree, but I think it may be inconvenient. A proper REST API will likely evolve new endpoints over many years. That could mean a whole host of PEPs for each endpoint. Maybe that’s okay, but in that case I’d likely want an Informational PEP about the REST API in general, with principles and an index to PEPs for each endpoint.

bright parcel
bright parcel
#

I definitely wouldn’t block on support for a REST API from any other vendor

fading veldt
#

One alternative not mentioned is to implement the API and get a feel for what works, then standardize it once content with the API design.

bright parcel
#

We can try that strategy out with the yank API 😄

queen hornet
bright parcel
#

The tech equivalent of "spending more time with my family"

lusty quarry
#

I think it's fine that other repositories don't adopt certain standards, or at least they should not feel compelled to. I think it's perfectly reasonable for some to have a very minimal implementation e.g. just simple API HTML pages

#

maybe we could make "recommendations" with explicit rationale like faster package resolution, and we should do that sparingly for the most important of standards

frigid jewel
brittle token
brittle token
#

Admin question: are there guidelines anywhere for requesting an "Other Projects" channel? While venvstacks will be an LMStudio project rather than a PyPA one, I'd prefer if there was a Python-specific channel for folks to ask me questions about it rather than relying solely on the LMStudio Discord instance. (The contributing guide will be asking contributors to abide by the PyPA CoC regardless)

brittle token
mild dew
drowsy moss
#

Oh, finally

#

They've been promising this for the past few years

#

Thanks

mild dew
#

np!

drowsy moss
#

I wonder if I should enter PyPA into the company field

#

It's weird that the form is built around work stuff

mild dew
#

yeah

#

i put my company in and then listed a repo in a different org, so 🤷‍♂️

brittle token
frigid jewel
#

does anyone know why pip install . build artifacts don't get deleted?

#

there is a `--no-clean` flag ("Don't clean up build directories.") but no `--clean` flag it seems
#

before

ls
LICENSE        README.rst    parse.py    pyproject.toml    tests

after pip install .

ls
LICENSE        README.rst    build        parse.egg-info    parse.py    pyproject.toml    tests
#

for pip install -e . the build folder isn't present but the parse.egg-info one is

torpid veldt
frigid jewel
torpid veldt
#

You possibly want to copy the directory to a tmp folder

#

But I don't think that will work for editable installs

frigid jewel
torpid veldt
#

setuptools writes the tree of symlinks to the source directory somewhere

#

Users will be using the whl from pypi though so they're unimpacted

#

You can also make your own build backend that wraps your current build backend, but copies everything to a tmpdir first

frigid jewel
#

no, they aren't installing my tool locally, they're installing their tool locally, my tool is supposed to run on top of their projects

#

but i'm running analysis over their projects, and encountering the build artifacts causes bugs, actually, just had an idea

fluid halo
#

Hello I am new to the server and I am looking to start contributing to the project, any advice for getting started?

inland prawn
#

which project do you mean? there are many under PyPA org

fluid halo
#

preferably pip

inland prawn
#

I'd say go to pip docs/repo and checkout contributing guidelines

fluid halo
#

Ok Thanks

wicked canyon
#

Possibly silly question, does python packaging / pyproject.toml support default dependencies? In the rust ecosystem, it's possible to define optional dependencies which are enabled by default but which the user can disable, is that a thing that exists for pip?

torpid veldt
wicked canyon
wicked canyon
zealous swan
#

👋 hey folks, I'm starting work on an "SBOMs for Python packages" project that is fairly cross-functional. I was wondering what the threshold is for creating a separate channel in the Discord for projects like this? I want to avoid spamming up spaces for folks who aren't interested in this topic.

#

The individual parts of the project might have their own spaces for discussion w/ relevant people (ie, any PEPs on DPO, etc), but I suspect there may be some people interested in the "whole".

inland prawn
#

I'd say "other projects" could have a #sboms-by-seth channel

lusty quarry
zealous swan
#

Omg def not "sboms-by-seth" 😛 how about "sboms-for-python-packages"?

lusty quarry
#

is there something more specific since this server is already about Python packages?

zealous swan
#

or sboms-for-packages, or just sboms also works!

lusty quarry
#

#sboms, feel free to post more info so I can update the channel topic

queen hornet
zealous swan
#

hahaha, I definitely don't want to be "credited" with a lot of the stuff around SBOMs 😛

uneven karma
zealous swan
#

The pitch is: How can Python packages solve the phantom dependency problem and encode non-Python dependency metadata into the distributions? Secondary goal is being able to push SBOM users to self-serve using existing tools and knowing that tools are working correctly for Python projects.

uneven karma
#

That sounds nice
I’m a solo dev at a very non-technical company
I’m submitting “some code” to a Fortune 100 and while it wasn’t requested, I wanted to include an SBOM just to be “professional”, but I’m having a hard time wrapping my head around the ecosystem.
I just included the one that GitHub generates.
A clear way to say “here’s how you can get an SBOM for your distribution” would be greatly appreciated.

zealous swan
#

Glad you are interested 🙂 My read today is there are a handful of good open source tools and they are doing the best they can w/ the information they have (for the most part). Tools like Grype and Trivy. But there is a gap in the information they have to work with, this project would be trying to fill that gap and enable people who want SBOMs to contribute to projects productively, because today there is no way to "forward" that information along to a place where a tool would be able to find and consume it.

#

I posted a link in the #sboms channel to my project repo if others are interested.

#

I'll continue the conversations there, as the main reason for the channel was to avoid too much happening in channels where folks might be less interested 🙂

brittle token
# uneven karma _but I like that one_

Tired of your regular old mass-produced factory SBOMs? Well, we have a solution for you. Introducing, "sboms-by-seth", for all your handcrafted, artisanal, SBOM needs!*

*(in fast ad disclaimer voice) sboms-by-seth are neither handcrafted, nor artisanal, and are in fact the very definition of mass-produced factory SBOMs

zealous swan
#

Hand-crafted SBOMs, I'm only willing to do that once for CPython 🙂

brittle token
# torpid veldt Nope, it's not possible right now

Technically it's kinda possible, just clunky since it involves publishing two packages. typer and typer-slim are an example. A full-featured main package with the full set of dependencies, and then a lower level core package with some optional dependencies not declared.

echo ruin
#

@mild dew just in case you aren't aware and care about it, zizmor doesn't work in pre-commit.ci because it asks for too new of a rust version for the current image looks like -- (in my case I'm quite happy ignoring and waiting until probably it's updated by Anthony)

mild dew
echo ruin
mild dew
wicked canyon
#

Does anyone have resources on making codegen'd packages in modern python packaging?

I've a codebase which is largely data driven (from third party data too), the data loading is a pretty heavy hit so a while back codegen was added to the package build to provide a precompiled data baseline, however:

  • the codegen is a bit crusty as it's old setuptools era
  • because the codegen is part of the consumer package, a change in the upstream data files requires a new release to get the data updates, which leads to version inflation and release politics issues

So I was wondering if I could move out the codegen's stuff into a separate project, which would only package the codegen, and the consumer would have a dependency on that with relaxed version bounds. That way the codegen could follow the upstream's versioning scheme exactly (rather than needing a mapping), the consumer and codegen versioning would not have to be mixed, and users could independently update consumer and codegen if they need to pin one or the other.

But I'm not really up to snuff with codegen in the modern packaging landscape, so I'm not quite sure whether it's a good idea, and if so how to go at it.

echo ruin
# wicked canyon Does anyone have resources on making codegen'd packages in modern python packagi...

others may have even better advice, but the two things that occur to me immediately offhand are:

  • hatchling's build support is very good (miles better than figuring out how to do this with setuptools back in the day) -- specifically, here's how I do a bit of generation in a project of mine: https://github.com/bowtie-json-schema/bowtie/blob/main/hatch_build.py in this case for data generation but it's the same if you're generating code
  • openapi-python-client does this kind of thing if you want to look at how a codegen-focused project does it "in general"
uneven karma
#

I have a user saying that they're running
python3 setup.py bdist_wheel --plat-name=manylinux2010_x86_64 --python-tag cp39-cp39
And it's producing
dm_reverb-0.8.0-cp39-cp39-none-manylinux2010_x86_64.whl
But they need
dm_reverb-0.8.0-cp39-cp39-manylinux2010_x86_64.whl

The none just means it's pure-python, right?
So they're trying to forcibly create a platform-specific distribution for something that isn't platform-specific?
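
For reference, a rough sketch of how the two filenames above split into tags, using only string operations rather than the `packaging` library. The helper name `split_wheel_filename` is made up for illustration, and the layout assumed is the one from the wheel filename convention: {name}-{version}(-{build})?-{python}-{abi}-{platform}.whl.

```python
def split_wheel_filename(filename: str) -> dict[str, str]:
    """Split a wheel filename into its tags, reading the last three from the right."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    stem = filename[: -len(".whl")]
    # The python, abi, and platform tags are always the final three components.
    head, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    # Whatever is left is name, version, and an optional build tag.
    name, version, *build = head.split("-")
    return {
        "name": name,
        "version": version,
        "build": "-".join(build),
        "python": python_tag,
        "abi": abi_tag,  # "none" means no particular ABI is required
        "platform": platform_tag,
    }


produced = split_wheel_filename("dm_reverb-0.8.0-cp39-cp39-none-manylinux2010_x86_64.whl")
wanted = split_wheel_filename("dm_reverb-0.8.0-cp39-cp39-manylinux2010_x86_64.whl")
```

Read this way, the produced file still has `none` as its ABI tag, and the extra `cp39` lands where a build tag would go. That suggests `--python-tag cp39-cp39` was treated as a single Python tag rather than as a Python tag plus an ABI tag, though I have not checked this against the bdist_wheel source.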

#

Is there any [good] reason to do that?

#

I plan to just say "don't do that"

fading veldt
fading veldt
glossy coyote
#

I use setuptools now for my codegen project, but want to move away from it. hatchling is what I'm trying out, it's mostly straightforward except for documentation.

tidal kiln
#

I'm engaging with it as an Arch Linux and Python maintainer, and hope we can get a better history when it comes to distro patching Python stuff

indigo token
#

FHS 4.0

tidal kiln
#

@high stone @opal wolf might also be interesting to you

blazing lantern
uneven karma
#

I copy/pasted Ethan’s response and… they never said anything else, so shrugch

fallen shuttle
#

Again it seems that https://github.com/pypa/packaging.python.org is under spam attack... So many spam issues recently.

I wonder why it is such a honey pot for spammers and if there is anything we can do about it.

I wonder if adding an issue template would help to prevent the attacks, or if it would not be useful.

queen hornet
#

I heard it's happening to other repos too, there have been big spam attacks recently

fallen shuttle
#

Yeah, I was wondering why pypa/packaging.python.org seems to be suffering more than most repositories.

I noticed that it does not use issue templates, while other repos like pypa/pip do. So I am wondering whether having an issue template with mandatory fields and checkboxes would help filter out automated attacks.

queen hornet
#

it's possible to turn on temporary limits for new users:

#

we have had big spam batches recently at python/cpython too, and we have issue templates

#

you can easily bypass issue templates

fallen shuttle
#

we have had big spam batches recently at python/cpython too, and we have issue templates
Ok, so that is not useful 😢

queen hornet
indigo token
#

Having that link is a choice. You can turn that off.

queen hornet
#

oh! where's the setting?

indigo token
#

It’s blank_issues_enabled: false in .github/ISSUE_TEMPLATE/config.yml

fallen shuttle
#

Maybe if we have some evidence that having a few mandatory checkboxes and/or dropdowns helps to reduce the spam, it may be worthwhile to create a template for the "blank" case (even if it means having some dumb fields there just to confuse the bots)

abstract nebula
#

[x] I'm not a platypus

uneven karma
#

Perry: firSneak

queen hornet
fallen shuttle
drowsy moss
#

Not sure if we need to disable empty issues. I think this could be postponed as it requires people to know that it exists in the first place. And the spam seems to be low-effort

#

Also, I was going to start adding a drop-down on whether the submitter used gpt.

#

I wonder, though, if adding a ✅ I'm a robot 🤖 checkbox selected by default would let us detect automatic submissions (with a companion GHA automation checking for this).

#

Another idea could be redirecting traffic to discussions like Pradyun did in Furo

fallen shuttle
lyric hedge
#

If anyone likes to read blogs ^

#

I assume that doesn't ping, otherwise, oops... cockatiel_ping

mystic stag
#

It did not

mystic stag
obsidian lily
#

did malware scanning activity go up? i'm seeing this on a few of my unused pypi projects

queen hornet
#

what is this a graph of?

obsidian lily
torpid veldt
pallid mango
#

Download stats are all over the place sometimes, I have no idea how to explain some of the spikes for my libraries.

vital pier
#

The beauty of open source!

drowsy moss
fallen shuttle
abstract nebula
#

Would an uncheck be more effective? Something along the lines of "[x] I'm a bot" and only unchecking it would allow you to post. I admit I have no idea if that's even possible but something that came to my mind assuming these bots just check everything

drowsy moss
drowsy moss
torpid veldt
# torpid veldt Maybe there could be support for dynamic trusted publishing, so anyone can chuck...

Ah it's not possible

In other words: it would be catastrophic for PyPI to support an OIDC IdP that can’t distinguish between its users, or permitted claim malleability such that users could impersonate each other.

Ensuring that each accepted IdP meets these conditions requires a nontrivial time commitment that gets balanced against the expected real-world usage of a given IdP: an IdP with one-to-few users is not worth the tradeoff in review time.

@mild dew can you elaborate on this, I'm not looking to push back on the decision just interested in the constraints

drowsy moss
torpid veldt
#

What's claim malleability?

torpid veldt
#

@drowsy moss ^

drowsy moss
#

Oh, I missed that bit. I'm not familiar with the term.

#

If I were to assume, I'd say that this is about claims being immutable or not post creation.

mild dew
#

and also yeah, we need to handle resurrection prevention when the OIDC provider gives us the ability to (mainly by checking IDs and not just human names for users, repos, etc)

fading veldt
pallid mango
fading veldt
#

Who knows!

pallid mango
#

Ah, it is zope. It has the same unusual spike. I still don't know what changed, though 🙂

fading veldt
#

maybe it moved from an extra to the default list of dependencies?

echo ruin
#

Is there any reason inline script metadata wouldn't/shouldn't work on a script with no .py extension?

#

Glancing at PEP722 doesn't indicate this should be the case (in particular besides not excluding them, it mentions shebang lines in one place, which are mostly pointless unless it's also talking about extensionless files) so I'm just double checking I'm not missing something.

#

(uv run foo seems to not consider them, so I'm about to file a bug there assuming I'm not missing something.)

abstract nebula
echo ruin
#

I see someone just commented that (thanks)

#

Confusing.

honest geyser
#

Yeah we don't sniff for inline metadata in non-Python files

echo ruin
#

(Obviously understanding that that's not the case at the minute.)

honest geyser
#

I mean, yeah maybe we could? How do we know your shebang points to a Python interpreter?

#

It could be /foo and you would expect us to read the metadata?

echo ruin
#

No, probably I'd expect similar behavior to the "historic" pip behavior

#

Which I think just whitelisted which shebangs are assumed to be Python
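
To make the idea concrete, here is a small sketch of what "inspect the shebang, then look for a script block" could look like. It is not how uv or pip behave. The regular expression is the reference one from the inline script metadata specification (PEP 723); the allowlist pattern and the function names are assumptions made up for illustration.

```python
import re

# Reference regular expression from the inline script metadata spec (PEP 723).
BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)

# Hypothetical allowlist of interpreter names treated as "Python".
PYTHON_PROGRAM = re.compile(r"python(\d+(\.\d+)?)?|pypy(\d+)?")


def shebang_looks_like_python(first_line: str) -> bool:
    """Guess whether a shebang line points at a Python interpreter."""
    if not first_line.startswith("#!"):
        return False
    words = first_line[2:].split()
    if not words:
        return False
    program = words[0].rsplit("/", 1)[-1]
    if program == "env":
        # `#!/usr/bin/env python3`: skip env's own flags, take the next word.
        args = [w for w in words[1:] if not w.startswith("-")]
        program = args[0] if args else ""
    return PYTHON_PROGRAM.fullmatch(program) is not None


def has_script_block(source: str) -> bool:
    """Return True if the text contains a `# /// script` metadata block."""
    return any(m.group("type") == "script" for m in BLOCK.finditer(source))


example = (
    "#!/usr/bin/env python3\n"
    "# /// script\n"
    '# dependencies = ["requests"]\n'
    "# ///\n"
    'print("hi")\n'
)
first_line = example.splitlines()[0]
```

With something like this, an extensionless file would only be read for metadata when both checks pass, and a shebang such as `/foo` would be left alone.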

honest geyser
#

Interesting, hm.

#

We added the --script flag to account for ambiguous situations

echo ruin
#

(But I think what I mentioned there would be even nicer, namely having separate commands. But also I expect <1% of users to use this, it's the first time I've done so myself. So who knows. Just brainstorming.)

honest geyser
#

You're welcome to open an issue asking us to inspect shebangs though.

echo ruin
#

Cool! I may do that as well, thanks for the quick response/working on the tool of course!

lusty quarry
brittle token
#

Interesting problem with cross-platform locks and toga-gtk that the Beeware sprinters ran into at PyCon AU last week: https://github.com/beeware/toga/issues/2995 (cc @honest geyser @west drift ).

While they specifically encountered the problem with uv add, it's a general issue with cross-platform locks on projects that depend on Linux components that are outside the manylinux specs (pygobject in this case)

#

In particular, the tool.uv.dependency-metadata trick doesn't work for transitive dependencies, since the supplemental info doesn't end up in the toga-gtk dist-info folder.

brittle token
west drift
#

Setting tool.uv.dependency-metadata doesn't solve the problem, since that's only effective for application locks - the tool table metadata doesn't make it into the toga-gtk .dist-info folder, so uv add doesn't have access to it when the dependency on toga-gtk is resolved.
So the problem is that tool.uv.dependency-metadata works for you, but isn't inherited to users?

west drift
brittle token
#

I don't have a macOS machine to test it on, but I pointed to your PR on the original toga issue, so Russell should be able to provide feedback.

lusty quarry
#

do we know of a library/code that maps a packaging.metadata environment dictionary to a set of allowed wheel tags?

fading veldt
#

By environment dictionary do you mean the platform metadata? The information this mapping would want to use is on distrowatch I believe.

drowsy moss
glacial basin
#

Hi there

Facing an issue with packaging and distributing a script with pdm

I have a little python script like that using typer for cli args

import typer
from typing import Optional

app = typer.Typer()

@app.command()
def main(config: Optional[str] = typer.Option(
    None, "--config", "-c", help="Path to configuration file"
)):
    if config:
        typer.echo(f"Config: {config}")
    else:
        typer.echo("No config provided.")

if __name__ == "__main__":
    app()

in my pyproject.toml

[project.scripts]
var_report_paris = "srm.reports.var_report.var_report_paris:app"

When I run it with the python binary there is no problem. But when I try to use the .exe built by pdm (https://pdm-project.org/en/latest/reference/pep621/#console-scripts), it does not recognize the Typer arg
uneven karma
uneven karma
#

(You just copy pasted and still didn't describe your problem)

uneven karma
glacial basin
uneven karma
#

ah

#

What's the issue then?

#

It doesn't recognize the app at all, or it doesn't recognize the config?

glacial basin
#

sorry for the horrible paste i don't have choice

#

but the config arg is not detected when running the .exe

#

(with python the Typer arg is read and works well)

uneven karma
#

Have you tried that top version?

glacial basin
#

I remember I used it but not built it ( don't remember why)

#

Will change it

uneven karma
#

Are you using different Python versions or something?

glacial basin
#

nop 3.9

uneven karma
#

Did you redact your code?

glacial basin
#

And the virtualenv has been created by pdm

uneven karma
#

ah

#

so thats it

glacial basin
#

?

uneven karma
#

I'm assuming the part you took out is the part that's passing the config on to your code

#

But you said that config = typer.Option(), and now your code is trying to convert typer.Option() into a file path and failing

glacial basin
#

it's failing only using the .exe

#

the python cli works

uneven karma
#

How are you making the exe?

glacial basin
#

pdm build

uneven karma
#

... doesn't that build a wheel?

glacial basin
#

It is but also an executable if you add a section script to your toml

glacial basin
uneven karma
glacial basin
#

Annotated

#

Many many thanks @uneven karma

#

It's been more than a day that I've been struggling with this

uneven karma
#

Annotated makes it part of the type instead of setting the value equal to that thing
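
A stdlib-only sketch of that difference, assuming the function ends up being called directly with no arguments (for example by a console-script entry point that targets the plain function). `OptionInfo` here is a made-up stand-in for whatever object `typer.Option(...)` returns, not Typer's real class.

```python
from dataclasses import dataclass
from typing import Annotated, Optional


@dataclass(frozen=True)
class OptionInfo:
    """Hypothetical stand-in for the object a CLI library's Option(...) returns."""

    flag: str


def main_default(config: Optional[str] = OptionInfo("--config")):
    # The option object IS the default value: a direct call receives it as-is.
    return config


def main_annotated(config: Annotated[Optional[str], OptionInfo("--config")] = None):
    # The option object lives in the annotation; the real default is None.
    return config


direct_default = main_default()
direct_annotated = main_annotated()
option_metadata = main_annotated.__annotations__["config"].__metadata__
```

In the first form, code that treats `config` as a path gets an option object whenever the CLI library is bypassed. In the second form it gets `None`, and the library can still find the option description in the annotation metadata.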

glacial basin
#
var_report_paris = "srm.reports.var_report.var_report_paris:app"
#

But I guess if I used : main even with Annotated it would fail

glacial basin
#

@uneven karma In fact it's working 50% as var_report_paris can't take any args

#

It loads the default None args every time

west drift
queen hornet
#

we just got a very similar one on CPython ("𝐇𝐨𝐰 𝐓𝐨 𝐡𝐚𝐜𝐤 𝐅𝐚𝐜𝐞𝐛𝐨𝐨𝐤 𝐀𝐜𝐜𝐨𝐮𝐧𝐭") from a different account that also spammed rust and flutter

jovial holly
#

there was a wave of spam last week that got cleaned up, that one apparently was missed. thanks

echo ruin
#

Is anyone aware of a tool that can query license information for installed packages (and/or equivalently for dependencies) that still works now post PEP 639?

#

(Specifically pip-licenses appears to not be PEP639 aware. It has an open issue I see, so if the answer is "no nothing else exists" I'll try to see what it's blocked on.)

#

Does uv have functionality for querying package metadata

#

OK, I'm gonna just NIH parsing it myself (famous last words?)

#

(Well not parsing, but reimplementing pip-licenses via importlib.metadata)
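
A minimal sketch of what that could look like with only the standard library. The fallback order (the `License-Expression` field from PEP 639 first, then `License ::` classifiers, then the free-text `License` field) is an assumption about what is useful, not what pip-licenses does, and the function names are made up.

```python
from email.message import Message
from importlib.metadata import distributions


def license_summary(metadata) -> str:
    """Best-effort license string from a core metadata object (hypothetical helper)."""
    # PEP 639 puts an SPDX expression in License-Expression.
    expression = metadata.get("License-Expression")
    if expression:
        return expression
    # Older packages often only carry trove classifiers.
    classifiers = [
        value.split(" :: ")[-1]
        for value in (metadata.get_all("Classifier") or [])
        if value.startswith("License ::")
    ]
    if classifiers:
        return "; ".join(classifiers)
    # Last resort: the free-text License field.
    return metadata.get("License") or "UNKNOWN"


def report() -> dict[str, str]:
    """Map every installed distribution name to its license summary."""
    return {
        dist.metadata.get("Name") or "?": license_summary(dist.metadata)
        for dist in distributions()
    }


# Small in-memory examples so the helper can be tried without installing anything.
new_style = Message()
new_style["License-Expression"] = "MIT"
old_style = Message()
old_style["Classifier"] = "License :: OSI Approved :: Apache Software License"
old_style["License"] = "Apache 2.0"
```

This only covers installed packages; restricting the report to the transitive dependencies of one package would still need a walk over `Requires-Dist`.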

fading veldt
#

I think most tools that check licenses are targeted at before install time

echo ruin
#

I don't personally care much if the check happens before or after, implementation-wise.

#

The functionality I need is "tell me what licenses all the transitive deps of this package use"

#

Doing it post install seems easier to avoid needing to dep solve, but if there's something that does it before that works

#

(The use case FWIW is jsonschema has a nongpl package extra which promises to install only non-GPL deps, and I have a noxenv which checks that's the case, previously via pip-licenses and now via... we shall see.)

fading veldt
#

I am not aware of any tools that I know for sure are updated actually

vital pier
echo ruin
knotty gulch
#

Personally, I want to contribute to PyPA (any project is OK) and bring more folks here. Where should I start? Can anyone give me some guidance or recommendations?

tidal kiln
#

not exactly pypa, but if you want to contribute to importlib_*, feel free to ping me

#

or CPython

#

but on PyPA specifically, I think a UX analysis is something we are in dire need of, not sure if that's your skillset

#

but it may be an interesting project for a larger user group

tired nebula
#

Transitive license checking tools

drowsy moss
shell glen
#

Good morning. Are there "test pattern" projects on pypi.org or test.pypi.org (with known properties/dependencies) for testing package installation itself? Or does such testing just use real world packages (e.g. django, requests, ..)? I'm writing a test for something that invokes uv pip.

inland prawn
#

you mean mocking? pytest-mock or unittest.mock (from stdlib)

shell glen
#

I don't. In the world on television there are test patterns, pictures designed to exercise the camera, transmitters, screens etc, and aid with diagnostics or calibration https://en.wikipedia.org/wiki/Test_card. I'm wondering if there are any Python packages designed to exercise PyPA, pip, setuptools, etc.

queen hornet
brittle inlet
#

Hello! Is there an easy way to check if a certain PyPI package provides wheels for a given Python version + platform?

inland prawn
#

There is a list of all artifacts for a given version on PyPI

brittle inlet
#

thanks!

drowsy moss
drowsy moss
honest geyser
tired nebula
#

i forget which project a company I worked at used for their internal cache

honest geyser
#

Though the focus is on resolution not installation at this time

tired nebula
#

oh hmm ty

#

i'll need to look again later, ty for posting this

honest geyser
#

If it doesn't meet your use-case in some way lmk! You can see how we consume it in uv

sinful pecan
#

Greetings there,

Hope you are well. I need some advice. I just got a corporate threat (you know which ones I mean) that my package name is too similar to another package (mine is qickit, because you usually have to kick it until it works, and the other one is qiskit). Their lead dev just spoke to me saying that I need to change the name or they can contact PyPI to take it down, as well as GitHub.

Is that a thing? Must I change the name? I really like it, and have made logos and everything. My preference is not to change it if they can't legally force me to.

tidal kiln
#

I understand where they are coming from, but I don't think they have a valid claim, and they immediately threatening you instead of trying to resolve the issue in an amicable and more diplomatic way is a pretty bad way to handle the issue

#

I doubt either PyPI or Github would do anything about it

sinful pecan
#

Their lead called me and basically said (I will quote so as to provide a just portrayal):
"The names are too similar, so if someone accidentally installs yours instead of ours it'll cause issues for us. The IBM legal team has informed me of this, as well as the security team. If you don't change it, then we will have to get legal involved which will start with talking to pypi to take down qickit, and then GitHub to take down the repo. We've never gotten to the legal part before, so please change it now. And add headers for parts which are similar to qiskit code."

#

TLDR; I interpret that as a threat, and I feel rightfully so. It's saying change it or we will take it down. Sounds like a threat to me.

#

Also, I haven't used their code much (I have used it for three files all in all, and even if I rewrite them from scratch, it won't matter because it's implementing the same research paper). I genuinely don't understand why they want me to change my code.

lusty quarry
#

agreed, they have no standing

indigo token
#

You think so? Both libraries are quantum stuff and you’re using some of their code and the name is one letter away from theirs.

I think it’s the combination that counts here. If it was completely unrelated in field and content/code with only the name being similar, I’d agree, but like this I’m not so sure.

#

Who was first?

brittle token
#

IBM were first in this case (qiskit first release August 2017, vs qickit being new). Given that, it's likely an automated scan of PyPI's "new project upload" feed that has set off an automated alert in IBM's security team that resulted in the aggressive tone of their initial contact.

indigo token
#

And US law probably. People often interpret the “you have to defend copyright to keep it” as “you have to be maximally rude and threatening to harmless open source contributors”

brittle token
#

Trademark in this case, but yeah, same deal. That said, the real typosquatting attacks in recent years have lit a fire under the security teams at the big tech multinationals.

(Edit: oh, I missed the part about the code headers. So, yeah, copyright is involved, too)

#

Anyway short version: we're not lawyers, this is not legal advice, but given the similarity in purpose and naming of the two projects, and the relative timing of their initial releases, IBM likely wouldn't get immediately thrown out of a courtroom, and even before things reach that point, the GitHub and PyPI admins might concede that they have a valid point (I'm not saying they would, just that they might given the context of wanting to help publishers defend against real typosquatting attacks).

Undoubtedly not the response you were hoping for, sorry 😦

#

On a potentially more positive note, there are some other phonetically-but-not-lexically similar names that look like they're free (including the literal kickit), plus the option of keeping qickit as the import API name, and adding a distinguishing prefix or suffix to the PyPI package name.

sinful pecan
#

qickit is an acronym for quantum integrated circuit kit (which is the core feature of the package, it integrates existing packages into one and uses them as execution backends for circuits) and a pun on how I usually have to kick it until it works.

sinful pecan
sinful pecan
sinful pecan
sinful pecan
# indigo token And US law probably. People often interpret the “you have to defend copyright to...

It's also that I genuinely got my heart broken a bit. Like I thought someone was noticing my work, someone who is a main dev of a package I like. Only for them to basically tell me to change the name of my package otherwise legal would talk to pypi and github to take it down (it doesn't matter how nicely you say it, you're bullying me to change my package name and forcefully tell people I stole code, which I didn't. I used some code, and modified to my package which afaik is allowed in Apache 2.0 without needing to say explicitly, otherwise they should use sth like GPL imo). Like it went from me being excited to me just being embarrassed and trying to keep my poker face hehe.

sinful pecan
# lusty quarry agreed, they have no standing

I am trying to be understanding, I really am. But I genuinely feel I am being bullied in this case, and don't want to just roll over for them. Next time they'll tell me my package is derivative and that I need to take my code down or blah blah blah. I am a solo developer. I am not a threat to a multi-billion dollar company that has had an 8-year head start.

#

That's how it felt.

#

You can be "nice", doesn't change the sentiment of your demand.

#

Now watch. The moment I publish my private code, which they've been asking for for a while, they'll just copy it and say "Well, it's based on a paper you don't own, so we can use whatever we want. Yeah no, sorry, yeah no! 🙂"

brittle token
#

Yeah, it's never fun when folks start wielding IP law as a hammer.

While I doubt it is any consolation right now, I would still assume that it isn't likely to be you personally that the IBM security department are worried about, it's the kind of genuine attack described in https://www.techtarget.com/searchsecurity/news/366577455/Typosquatting-campaign-malicious-packages-slam-PyPi (I can't speak for the qiskit dev that contacted you, though - it sounds like you had some previous contact with them through the project, and it has to be disappointing that they didn't assume good faith on your part)

(for what it's worth, as far as Apache 2.0 goes, it does have an "attribution clause" in it, so you do need to specify where code using that license came from. Even the most permissive of the commonly used open source licenses still tend to require that much, including Apache, MIT, and BSD. That doesn't apply to genuinely independent derivations from the original paper, though).

indigo token
sinful pecan
sinful pecan
# indigo token You don’t need to be understanding towards a corporation. Corporations aren’t pe...

I am trying to be understanding towards Qiskit, not IBM. However, I genuinely feel they're just trying to bully me into doing sth I don't want to do. It's not a favor they're asking for. It's a demand, an ultimatum. He even said when can I expect to see the change. Like you're asking me to change my package name and rewrite some code. It's not an overnight change. And then you start to ask yourself why the fuck am I being a soldier when I don't even work at IBM?!

#

It's not even a product, it's free software. I tried to make it as free as possible. I myself always have the idea that when you publish code under Apache 2.0 it's to say this code is for the community, go nuts.

indigo token
#

I totally get it. They benefit from your stuff, you feel betrayed, and you’re right to feel that.

sinful pecan
#

Now watch. The moment I publish my MPS encoder (which genuinely, without any brags, makes their isometry encoder look like trash with its exponential scaling, whereas mine is linear) they'll copy it, and then say "Well, it's based on a paper you don't own, sooooo..."

#

Now try to get your copyright.

indigo token
#

A friend and colleague of mine was in a similar situation with NVIDIA, where he basically did work for them for free. He got hired by them recently. Not saying you should let them string you along, but there are good ways out of shit like this

sinful pecan
#

That's why I never publish code unless I am 100 percent fine with it being copied, used, and distributed without any credit to my name.

indigo token
#

You could start by publishing your stuff under Apache or MPL or so as well, so that they have to copy your header

sinful pecan
sinful pecan
#

And at this point, I prefer NVIDIA. They're employing MLIR into their compilation process.

indigo token
#

I think the only thing I have left to say is that you could try to get in touch with their engineers. Legal often doesn’t listen to engineers, but they’re often a way to talk to actual human beings who understand where you’re coming from. I hope this works out for you!

sinful pecan
#

Their lead dev told these to me.

#

That's as engineer as it gets.

#

I hope so. I'm rewriting the parts he said look similar (I told him which parts myself). I don't want to change the name. If they can't force me and can't take my repo down, and it's a matter of them asking this as a favor, I genuinely don't want to do it. I like my name. I rarely like my names.

#

I'm gonna go sleep, it's 1 am here, and I am just getting my sleep back on track. I'll probably spend tomorrow finishing the rewrite and checking for any guidelines for measuring similarity in a codebase. If I am good on that part, I'll leave it at that.

#

Thank you so much guys for being so supportive. I rarely stand up for myself in these situations, and your support, well, it means the world to me. Thank you.

sinful pecan
#

@lusty quarry Apologies for the bother. Since you mentioned what they're saying is invalid, could I ask how I can get an official statement regarding this to present to them in case they threaten me again?

#

From PyPA (for PyPI to state they won't take my repo down for this).

lusty quarry
sinful pecan
#

I see. Who do I have to contact regarding this?

#

I imagine there's someone who makes judgement regarding these matters.

lusty quarry
#

I think perhaps it's not as big of a deal as you might think, although I understand if that is the first email you've received of that kind then I'm sure it can be disconcerting

sinful pecan
#

It wasn't an email. Their lead dev DMed me on Slack, and then asked for a video call immediately after I responded.

lusty quarry
#

also, I responded to you based on your first post which turned out to be imperfect information. after reading more of what you wrote about the project I think they probably do have a valid claim

sinful pecan
#

He made it seem like a big deal. Threatening a kid (I am a kid compared to a 40 sth year old developer) solo developer with legal action and security report for a simple name similarity is definitely drastic.

#

How so?

#

Them saying because of the name similarity, people may accidentally install mine and my package could then be potentially malicious in the future is really not a valid claim. It's like saying you shouldn't get a car because you may run someone over in the future.

sinful pecan
#

It's literally punishing me for a crime I haven't even committed.

sinful pecan
#

I am importing their package, which is not an offense. It's an Apache 2.0 software, and I am just importing it.

#

And theirs is not the only package I have integrated. I have integrated cirq, pytket, tket2, quimb, and pennylane.

lusty quarry
#

I think you shouldn't worry about preemptively doing anything, if they make an official request you will be notified as part of the process and can state your case

sinful pecan
#

But I want to be prepared. I am not in a situation that I can just react to these when they arrive.

#

I am practically in and out of university and embassies for my visa process, and the rest of the time working to save money for tuition. Really not in the headspace to deal with this without being prepared or knowing my facts.

#

Also, you said because
a) I am using their package
b) My package does sth similar
That's literally the same for cirq and pytket as well. And they're companies. Actual, commercial companies.

lusty quarry
#

respectfully, I think you're causing yourself undue stress by thinking about this. if they make a request then you will be notified and admins can hear both sides, it's really not a big deal

sinful pecan
#

...I genuinely don't think I can mentally handle such an ordeal on an as-it-comes basis. It's alright, I won't take more of your time. I'll try to speak to someone from PSF to see what I should do.

#

Thank you all for your time and the kind insight. It means the world.

lusty quarry
fading veldt
sinful pecan
#

Thank you. I don't think I have violated any copyright or trademark laws, but to be safe (and as a matter of personal principle) I am rewriting any and all parts for which I took inspiration from qiskit. They do not own the IP for the algorithms, and only have objected regarding implementation similarity, which I will respect even if I don't understand and will change to provide adequate distance.

#

In case anyone from here ever serves as a jury role (sounds dramatic) for this case, I'd like to state that no matter what I do, the implementations will be similar as they are implementing the same algorithm which is permitted given the algorithm is public access and I have the cited publication in all relevant code docstrings.

tired nebula
sinful pecan
tired nebula
#

no, not quite

#

to understand a publicly traded corporation is not human and should not be expected to act with compassion or understanding.

sinful pecan
#

I see.

#

So the opposite. Expect them to dedicate a good amount of resources to screw me over.

tired nebula
#

I'm not a lawyer and not your lawyer, etc, etc but:

  1. In some countries, trademark and IP law means you're pretty much obligated to shoot first, ask questions later
sinful pecan
#

I see.

tired nebula
#
  2. Sometimes, they'll send you an angry letter but it really means change a name of something and they'll leave
sinful pecan
#

I didn't make my package anything but the most free license possible to avoid this stuff. I just want to write some useful stuff for people to use.

tired nebula
#
  3. Stop, drop, and talk to your own lawyer (if possible)
sinful pecan
sinful pecan
tired nebula
#

this may or may not have something to do with it

sinful pecan
#

With what?

tired nebula
#

hold on, rereading

sinful pecan
#

Take your time.

tired nebula
sinful pecan
tired nebula
#

"Yo, legal is angry and wants to flip, so you can prevent this situation if you change the name"

#

I get it, but it's also how the law is written as another person said.

sinful pecan
#

But why would legal even dare to be angry? I haven't done anything wrong.

tired nebula
#

Well, not angry per se in the individual level.

#

I'm anthropomorphizing the policy interpreter system of a giant, inhuman entity.

sinful pecan
#

I get what you mean, what I'm saying is they don't have any ground regarding the name.

tired nebula
#

They do per trademark law concerns.

sinful pecan
#

PyPI allowed me to register this. GitHub allowed me to register this name.

#

Right, I am not using their name.

tired nebula
#

Also, tbh, if you're in an authoritarian-leaning country, being the package maintainer for a potential typo-squat is a bad idea.

sinful pecan
#

They own Qiskit. They do not own all names close to Qiskit.

tired nebula
#

They don't own those names, but copyright and trademark law sorta obligates them to send nastygrams.

#

(that's slang for "cease and desist" letters)

sinful pecan
tired nebula
#

I said potential. I'm not saying your package is one.

#

I'm saying that if someone steals it from you, then it can become one.

#

I'm also not saying they're in the right. This whole situation just kinda sucks.

sinful pecan
#

What should I do? I am ok to change my code so they don't think I used their code, and at this point would be ok to fully remove qiskit support if I absolutely have to (I added it for the people who are used to qiskit, not to please IBM). I don't want to change my name.

tired nebula
#

my suggestion: come up with a better, cooler name

#

write better doc

#

and make a better tool than them.

#

out of spite.

sinful pecan
#

I'm a solo developer who started a year ago. They're a multi-billion dollar company with over 100 developers, half of which are PhDs with 8 years of experience. I am not gonna just outdo them.

tired nebula
#

I am not gonna just outdo them.
Not in the same areas, no.

#

But you can focus on what you're good at.

#

They're a multi-billion dollar company with over 100 developers, half of which are PhDs with 8 years of experience.
this is a team that will be awful at doc and clear code unless they specifically have someone hired to do it.

sinful pecan
#

I can guarantee the moment I publish my currently private encoder that makes their encoder look like trash, they will immediately take it and say you published it and it's based on a paper you don't own.

tired nebula
#

So making "requests, but for quantum circuits" is plausible. Requests is a urllib wrapper, more or less.

sinful pecan
#

I can't outdo them when they outnumber me, and play so dirty.

#

My package isn't just a wrapper though.

tired nebula
#

Doesn't need to be.

sinful pecan
#

Only my circuit and backend module.

tired nebula
#

I'm using that as an example.

#

The point is that the requests module makes it easy to do http.

#

it handles all the state and config management for you.

#

you don't have to think about importing the urllib and cryptography session management stuff.

#

that's added value, and the doc is good, too.

#

My suggestion: QICflip

#

it's a joke about both a skateboarding trick and bit flips

#

I don't think I have violated any copyright or trademark laws,

#

also this is really a lawyer question, especially in the US.

brittle token
#

@sinful pecan Another naming idea that's harder to confuse with qiskit, while staying closer to your preferred qickit name (and would even be pronounced the same way): qicit

sinful pecan
#

For the name, the most I will roll over for them is to do qickit[core] like what cirq does.

sinful pecan
#

Would qickit[core] work?

#

or qickit-core?

brittle token
#

The second one would (first one isn't changing the name, it's referring to an extra under the current name)

sinful pecan
#

I see. I saw cirq use it.

#

!pypi cirq

#

Aw, no pypi command here?

#

For qickit-core, do I just change it in pypi? I would like the repo itself to remain qickit. After all, all they want is regarding the install for typosquatting.

brittle token
#

You could also just stick py on the front of the distribution name: pyqickit for "Python quantum integrated circuit kit". Import name can still be import qickit even if you add a prefix or suffix to the distribution name.

sinful pecan
#

I see.

#

I'll have to learn more about how I can do that. I've only ever done qickit and qoin both of which are import it, install it, use it with the same name.

brittle token
#

To change the distribution name, you can just change the name field in pyproject.toml, while leaving everything else as is. Since GitHub is scoped by org name, they've got zero reasonable ground for claiming that the GitHub repo might be confused with the main qiskit repo.

sinful pecan
#

I see. I'll change the pyproject.toml and pypi today.

#

Thank you so much Alyssa.

sinful pecan
#

@brittle token This is what they said after trying to gaslight me into changing qickit-core as well with "your package wouldn't get noticed"

typosquatting is not the only concern here. JOSS will ask you to change it if at any point you publish there. A web domain might also be tricky.
#

The way he says it, IBM would have to own Qiskit, and every version of the word with different letters. Including only wrong letters and missing letters, there'd be a possible number of 6 x 26 (one letter difference for each letter assuming alphabet only) plus 6 more considering one missing letter (i.e., iskit, qikit, qisit, etc.), thus bringing the sum to 6 x 27 possible combinations (I am tired, and stressed, so if I made a silly mistake, I apologize).

There's no way by choosing a name you'd own 162 other names as well.

#

And that's ignoring extra letters, multi-letter misspelling, and more than one letter missing.
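
The estimate above can be checked by enumeration. This is a minimal sketch, assuming lowercase ASCII letters only; `single_edit_variants` is a hypothetical helper written for this example. It counts names that differ from the original by one substituted letter or one missing letter. Unlike the 6 x 27 figure, it does not count substituting a letter with itself, so the total comes out slightly lower.

```python
import string


def single_edit_variants(name: str) -> set:
    """Return names one substitution or one deletion away from `name`."""
    variants = set()
    for index, original in enumerate(name):
        # One missing letter (e.g. "qikit" from "qiskit").
        variants.add(name[:index] + name[index + 1:])
        # One wrong letter, skipping the letter that is already there.
        for letter in string.ascii_lowercase:
            if letter != original:
                variants.add(name[:index] + letter + name[index + 1:])
    variants.discard(name)
    return variants


variants = single_edit_variants("qiskit")
print(len(variants))          # 6 x 25 substitutions + 6 deletions
print("qickit" in variants)   # a one-letter substitution of "qiskit"
```

The point of the message still holds with the smaller number: a single name would imply a claim on well over a hundred neighbouring names.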

tired nebula
tired nebula
#

For example, look at how much boilerplate many projects have. "$PROJECT is not affiliated or endorsed by $COMPANY. It is an independent work which..."

brittle token
#

Sure, but trademark laws (at least in Australia and the US) rely on a "would a reasonable person be confused" test. The written caveats are leaning into that since it means that anyone that missed that the project name and org name were different has been explicitly informed that there is a separate project with a different name that might be what they were actually looking for. It all makes it harder to claim reasonable confusion, and lawyers are expensive for everyone (even big multinationals), so they prefer to spend their time on sure things rather than cases they might not win.
Which path a project takes depends on how attached people are to their originally chosen project name, and how genuine they judge the risk of legal or institutional action to be (attachment low, or perceived risk unacceptably high -> pick a completely different name; attachment high, perceived risk acceptably low -> tweak the name enough that they can't be accused of typosquatting, add a "Were you looking for X?" disclaimer pointing to the original project)

tired nebula
#

reasonable person
my understanding is that was more a commonwealth country thing. "Reasonable" doesn't really apply in the US, but yeah, you're definitely right about the second part.

sinful pecan
brittle token
sinful pecan
#

Copy that.

#

I'm re-reading the papers the implementation is based on and adding inline comments for exactly where the line is coming from in the code so they'd get off my back.

sinful pecan
#

I was wondering, do I also have to create a new pypi name?

#

Like qickit-core instead of qickit?

#

Or would just changing project=qickit-core in pyproject.toml suffice?

sinful pecan
#

I see. I'll make a new pypi tomorrow then.

#

And publish to that instead of qickit.

blazing wagon
#

hi friends - can anyone help direct me to where I should post with issues building the warehouse docs? i'm getting a bunch of warnings when i run docker compose up user-docs and i can't seem to get the page to load locally when i go to localhost:8000 or 80 both ports can't load the page. the output suggests it's at port 8000. this could all be user error. Is this the right place to ask questions? many thanks!!

fading veldt
#

Hi Leah! Probably best to ask in #pypi

blazing wagon
#

ok great thank you @fading veldt !! and hi!! 👋

tired nebula
#

Sure, but trademark laws (at least in

queen hornet
honest geyser
inland prawn
#

Ofek also did a lot of marketing and "transition to Hatch/hatchling" PRs

honest geyser
#

True! It's in a lot of guides.

#

It's interesting to compare the backend to the frontend, e.g., hatchling downloads are much higher than hatch (which makes sense, since it's used by anyone consuming the package too), but the ratio between hatch / hatchling and poetry / poetry-core downloads is pretty different.

#

It's a hard space to measure usage

indigo token
#

Poetry needed a long time until it implemented standard dependency metadata support. I think in that period, a lot of maintainers probably thought that they’d get the best tooling support if they went for a solution that did support the standard dependency metadata. But I might be projecting.

Currently I think if people leave Poetry, it’s because of the custom solver. Solver edge cases tend to be fixed before the average person encounters them in pip and uv, so running into them tends to happen more often when using Poetry.

Finally (and this is more of a guess than the other points), I think people are more likely to switch to Hatchling because of the plugins, independently of their workflow. I feel like few people use poetry-core without Poetry. (As said: just a wild guess)

queen hornet
#

yeah, I agree with the final point

inland prawn
#

I feel like few people use poetry-core without Poetry.
Until recently it was pretty much impossible to use poetry-core without Poetry (at least for metadata declaration)

blazing lantern
vital pier
blazing lantern
vital pier
queen hornet
merry rune
#

Standardization question: Is the dist-info package name specified as normalized anywhere? That is, should jaraco.text make jaraco.text-4.0.0.dist-info or jaraco_text-4.0.0.dist-info? hatchling and scikit-build-core normalize it, but setuptools does not, and pkg_resources.require doesn't support the normalized version, which is bothering the Fedora packaging.
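
For reference, the two candidate directory names can be produced like this. It is a minimal sketch, assuming the name normalization rule from PEP 503 (lowercase, runs of `-`, `_` and `.` collapsed) followed by replacing `-` with `_`, as wheel filenames do. `dist_info_dirname` is a hypothetical helper, not an API of any build backend, and version normalization is ignored.

```python
import re


def normalize_name(name: str) -> str:
    """PEP 503 style normalization: lowercase, collapse runs of -, _ and ."""
    return re.sub(r"[-_.]+", "-", name).lower()


def dist_info_dirname(name: str, version: str, normalized: bool = True) -> str:
    """Build a .dist-info directory name, with or without normalization."""
    if normalized:
        # Escape the normalized name the way wheel filenames do.
        name = normalize_name(name).replace("-", "_")
    return f"{name}-{version}.dist-info"


print(dist_info_dirname("jaraco.text", "4.0.0"))
print(dist_info_dirname("jaraco.text", "4.0.0", normalized=False))
```

The first call gives the form that hatchling and scikit-build-core are described as emitting, the second the form attributed to setuptools.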

spiral urchin
merry rune
#

I don't see anything about the filenames inside the archive there, there's a discussion of the wheel name only (above)? And where {distribution} is replaced with the name of the package doesn't say the normalized name; the name in the metadata is not normalized.

inland prawn
#

isn't pkg_resources deprecated?

merry rune
#

Yes

spiral urchin
#

Oh you mean the dist-info inside a wheel, not a dist-info for an installation

lusty quarry
#

isn't that the same thing e.g. it just gets unpacked?

spiral urchin
#

Yeah if you read the wheel spec in a certain way, you can probably argue dist-info in it must use the normalised name because it says the installation dist-info is generated directly from the wheel’s content.

#

The spec isn’t really clear about this though; it only says you should “unpack” the directory (it does not technically mean the unpacked file should have the same name as its entry inside the zip)

strange knot
honest geyser
vital pier
#

personal opinion: leave the spec as is, as it discourages folks from leaving trailing whitespace

#

And wouldn't allowing trailing whitespace break the example shown in the specification? Since there can be # /// lines inside the block the regex would terminate mid-block?

honest geyser
#

I don't think that's actually a problem though

#

Precedence for an ending line # /// is given when the next line is not a valid embedded content line as described above.
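
A small experiment makes both points concrete. This is a sketch that assumes the pattern below matches the reference regular expression from the inline script metadata specification (quoted from memory, so worth checking against the spec). The sample scripts are made up for this example.

```python
import re

# Assumed to be the reference regex from the inline script metadata spec.
REGEX = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"

CLEAN = '# /// script\n# dependencies = ["requests"]\n# ///\n'
# Same block, but the closing line carries one trailing space.
TRAILING = '# /// script\n# dependencies = ["requests"]\n# /// \n'
# A "# ///" line in the middle, followed by another valid content line.
NESTED = "# /// script\n# text = 'x'\n# ///\n# more = 1\n# ///\n"


def find_block_types(script: str) -> list:
    """Return the TYPE of every metadata block the regex finds."""
    return [match.group("type") for match in re.finditer(REGEX, script)]


print(find_block_types(CLEAN))     # the block is found
print(find_block_types(TRAILING))  # trailing whitespace: no block found
print(re.search(REGEX, NESTED).group("content"))
```

With this pattern the trailing space makes the block disappear silently, and the inner `# ///` in the last sample is kept as content because the match extends to the last valid closing line.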

queen hornet
#

Hmm, it doesn't seem very helpful to fail with just a bit of whitespace at the end, I'm not sure what ambiguity it could cause to allow it (although I've probably not dug as deeply into it as you all)

indigo token
#

A better error message that explains the problem would fix that without tampering with the spec

west drift
#

Is there other syntax in python or toml that doesn't allow trailing whitespace? i have my IDE set to remove trailing whitespace and every formatter i know removes trailing whitespace, but i'd go for consistency with python and toml

west drift
#

would it make sense to read data-upload-time from html simple index pages, if provided? i know PEP 700 skipped backporting upload-time to the html simple index because there were no use cases, but with --exclude-newer we now have a use case in uv where we feel the pain of the asymmetry between the json and the html output
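
To illustrate what the JSON side makes possible, here is a minimal sketch of an exclude-newer style filter over the `upload-time` field that PEP 700 added to the JSON simple API. The file entries are made up, `exclude_newer` is a hypothetical helper rather than uv's implementation, and dropping files with no upload time is an assumption of this sketch.

```python
from datetime import datetime, timezone


def parse_upload_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp ending in Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def exclude_newer(files: list, cutoff: datetime) -> list:
    """Keep only files uploaded at or before `cutoff`."""
    kept = []
    for file in files:
        uploaded = file.get("upload-time")
        if uploaded is None:
            # Assumption: without a timestamp the file cannot be vetted.
            continue
        if parse_upload_time(uploaded) <= cutoff:
            kept.append(file)
    return kept


# Hypothetical entries shaped like the "files" list of a JSON index page.
FILES = [
    {"filename": "demo-1.0-py3-none-any.whl", "upload-time": "2024-01-10T12:00:00.000000Z"},
    {"filename": "demo-2.0-py3-none-any.whl", "upload-time": "2024-06-01T08:30:00.000000Z"},
    {"filename": "demo-0.9-py3-none-any.whl"},
]
CUTOFF = datetime(2024, 3, 1, tzinfo=timezone.utc)
print([file["filename"] for file in exclude_newer(FILES, CUTOFF)])
```

An HTML index without an equivalent attribute gives the client nothing to compare against, which is the asymmetry described above.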

strange knot
mystic stag
blazing lantern
honest geyser
honest geyser
blazing lantern
blazing lantern
honest geyser
inland prawn
#

why wouldn't it be possible to host static JSONs?

fading veldt
#

You need to set the content type to text/json

hollow trout
#

I would also guess that it's hard to make a statically-hosted website to serve HTML and JSON responses for the same URL(s).

blazing lantern
honest geyser
#

We use HTML to serve our test index on packse for this reason. We version the entire index though, so each snapshot is immutable and we don't need exclude newer for it. (Pretty niche use-case)

pallid mango
#

Huh? Serving json with the correct content type is trivial with most webservers I have worked with. It's no different from images, js or css. Those all need correct content types. Do you have an example for a hosting situation in which this would be hard?

fading veldt
#

The "hard" part is that not all static serving scenarios can serve JSON or HTML for the same path with the correct content type based on what the user requests in headers. nginx can do this pretty easily, but if you are looking at the scale of CDNs and such, it becomes less straightforward
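
As a rough sketch of what "based on what the user requests in headers" means for a server: the function below picks a response content type from an `Accept` header. The content type strings are the ones I recall from PEP 691, `choose_content_type` is a hypothetical helper, and the parsing is deliberately simplified (no wildcard subtypes, no tie-breaking rules beyond header order).

```python
from typing import Optional

# Content types as recalled from PEP 691; check the spec before relying on them.
SUPPORTED = (
    "application/vnd.pypi.simple.v1+json",
    "application/vnd.pypi.simple.v1+html",
    "text/html",
)


def choose_content_type(accept_header: str, default: str = "text/html") -> Optional[str]:
    """Pick the best supported content type, or None if nothing is acceptable."""
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        media_type, _, params = item.strip().partition(";")
        media_type = media_type.strip().lower()
        if not media_type:
            continue
        quality = 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        candidates.append((-quality, position, media_type))
    if not candidates:
        return default  # no Accept header: serve the default representation
    for negative_quality, _, media_type in sorted(candidates):
        if negative_quality == 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*":
            return default
        if media_type in SUPPORTED:
            return media_type
    return None


print(choose_content_type("application/vnd.pypi.simple.v1+json, text/html;q=0.1"))
```

A plain file server or CDN that maps one path to one file has no place to run logic like this, which is why serving both formats from the same URL is the awkward part.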

#

Honestly, the maybe harder thing would be rewriting requests to normalize requested project names, which I don't think is required but is recommended IIRC?

#

Yeah

Repositories MAY redirect unnormalized URLs to the canonical normalized URL (e.g. /Foobar/ may redirect to /foobar/), however clients MUST NOT rely on this redirection and MUST request the normalized URL.

pallid mango
#

Ah, so the hard part is server-side content negotiation based on headers. I missed the 'same path' bit. Yes, that's annoyingly hard to do.

#

For the redirect part, just don't? I wonder why this is in the spec at all. If repositories don't have to do it, clients can't rely on it and should always do the normalization themselves anyways. And if valid clients don't do it, why support it on server side?

fading veldt
#

Because not all clients are well behaved 🙂
But practically speaking that isn't required so it's fine

hollow trout
#

It could be useful for human clients.

mortal shore
#

you're not required to have HTML and JSON exist at the same URL and use conneg, you can put them at separate urls, just if you do that, then selecting the right URL is up to the user who is configuring their client

inland prawn
#

tbh having to set the special content type for JSON API could be a stopper for some static indices... but I trust that whoever designed it knew what they were doing and there were good reasons to do so

west drift
blazing lantern
blazing lantern
#

Personally, based on my experience I think we should change the spec to allow for application/json, especially if we are going to have fields that don't get backported to the HTML API

mortal shore
#

is there any popular static host besides github that doesn't let you control the content type?

queen hornet
#

I think you can't with Read the Docs. (Looks like you can with Netlify and Vercel)

fading veldt
#

But I think most static hosts don't let dynamic content type based on the requests

mortal shore
#

One thing that did change was needing names to be normalized.

At the time the way it worked was if you did pip install Foo, pip would request /simple/Foo/ and if that didn’t work it would fall back to request /simple/ and would look up the correct URL from the entire list of names.

The way some indexes (PyPI included) selected the URL was using the non-normalized name, so if you registered the name as Foo, then the URL would be /simple/Foo/. But pip at the time would use whatever the user specified verbatim.

This worked because of the fallback to /simple/, but that page was getting huge so we wanted to stop it from hitting that. But there was no way for pip to know if the project was registered as Foo or foo or FOO or whatever. So we forced normalized URLs, but we still had every pip in existence (at that time) generating the URLs verbatim, so PyPI started to redirect to make it just work.

Then PEP 503 was written to codify the existing behavior. The redirect was only needed to support older versions of pip, but was still part of the API contract PyPI promised, so it got documented.

#

Hysterical Raisins (tm)

inland prawn
#

I wonder, is there any routine package deletion done on PyPI? Packages that don't have any downloads and new versions?

mortal shore
#

No

uneven karma
#

Question on behalf of a user in another server

#

Where do I actually see what metadata version, for example, setuptools uses?
That’s something I don’t quite understand yet.

vital pier
#

I think for setuptools you'd look at the vendored packaging version for what it supports?

inland prawn
#

2.4 is the SPDX licenses one? I am working on the support for it in Poetry (poetry-core) now.

queen hornet
lyric hedge
#

I'm a little confused at how the PyPA will be managed until a suitable formal relationship between the PC and the PyPA is agreed upon, but in fairness, PEP 609 isn't really that important for PyPA's day to day activities. Any sort of formal actions governed by PEP 609 (e.g. adding a project) will probably end up in limbo until the inaugural PC and the PyPA decide the details of PyPA's governance under the PC... which seems fine.

tidal kiln
#

would appreciate reviews

blazing lantern
lusty quarry
uneven karma
#

After switching to Trusted Publishing, is there any reason to keep the repo link in [project.urls]?
It ends up getting listed twice and looks funny.

lusty quarry
#

yes, for those who don't look at the PyPI project page (the vast majority of folks do not)

#

if you don't want the duplication perhaps rename the URL key to something else that means the same thing like "Source"

lyric hedge
#

is anyone aware of the general approach to configuring the number of worker threads/processes? I'm working on parallelized bytecode compilation for pip install. Given the inherent edge cases w/ parallelization, we'll need an opt-out/knob. My current design is:

  • --workers <positive integer> - sets # of workers
  • --workers auto (the default) - uses the process/system CPU count (**with other considerations)
  • --workers none - disables parallelization
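
The three forms above could be parsed along these lines. This is a minimal argparse sketch, not pip's actual option handling. `workers_type` and `resolve_workers` are hypothetical names, and "auto" is reduced to the plain CPU count without the other considerations mentioned.

```python
import argparse
import os


def workers_type(value: str):
    """Accept 'auto', 'none', or a positive integer."""
    if value in ("auto", "none"):
        return value
    try:
        count = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected auto, none or an integer, got {value!r}")
    if count < 1:
        raise argparse.ArgumentTypeError("worker count must be positive")
    return count


def resolve_workers(setting) -> int:
    """Map the parsed setting to a worker count; 0 means no pool at all."""
    if setting == "none":
        return 0
    if setting == "auto":
        return os.cpu_count() or 1  # cpu_count() may return None
    return setting


parser = argparse.ArgumentParser()
parser.add_argument("--workers", type=workers_type, default="auto")

args = parser.parse_args(["--workers", "4"])
print(resolve_workers(args.workers))
```

Keeping the parsed value and the resolved count separate leaves room to treat `none` and `1` differently, or to fold them together later.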
#

I'm also open to alternative names for the option itself. I'm just copying the name from black :P

fading veldt
#

Wouldn't --workers none be the same as --workers 1?

lyric hedge
#

Technically no, it'd still create a process pool, but with only one subprocess which is effectively useless. The difference is that if the system doesn't support multiprocessing, it'll blow up (although the pip impl. will fall back)

#

I could probably replace --workers none with --workers 1.

queen hornet
#

And no need to add none, if they give 1 then disable parallelisation. No need to spin up the process pool for 1 job.

blazing lantern
#

I've also seen 0 represent "auto" when "no arg" wasn't used

lyric hedge
blazing lantern
#

Maybe a flag name that without arguments sounds reasonable and makes sense with an argument? E.g. --parallel on its own makes sense, while, you can infer that --parallel 1 means one subprocess/thread.

lyric hedge
#

so --parallel would replace --parallel auto...?

#

I doubt there will be a strong need to enable parallelization. I'd like it to be enabled by default, so I just need a knob to adjust/disable it.

#

I wonder what uv does here.

#

And it is possible pip will gain more concurrent logic (using threads or processes) so that's why I'm aiming for a general design.

#

Although it may be better to simply ignore the technicality that concurrent != parallel in formal CS terms

#

There's also concurrent-builds (didn't scroll up high enough).

#

Multithreading I/O-bound is separate from multiprocessing CPU-bound work so I understand the distinction.

#

Hmmmmmm.

#

I'm going to think about this more later.

#

cc @honest geyser I'd be interested in hearing what your experience has been with your design for configuring concurrency. One sticking point is that pip's configuration file story is not as strong as uv's so adding a bunch of options is not ideal.

honest geyser
#

@west drift has the most relevant context

lyric hedge
#

But I also don't want to pigeon-hole us into a bad design if we expand pip's concurrency down the line.

honest geyser
#

I definitely wouldn't require opt-in unless you think it's dangerous for some reason.

honest geyser
lyric hedge
#

Oh yeah, I'm more curious as to why y'all decided to separate the concurrency knobs for build/install/download. Has that been a good/confusing thing?

#

Given I'd imagine most people won't need to touch the concurrency knobs (unless something is broken), giving them specific knobs is probably a "good thing".

blazing lantern
lyric hedge
#

Not sure about the value of that given I want parallelization to be the default.

#

(Certain parallelization features may need to start life as an optional --use-feature option, but the goal would be on by default.)

west drift
#

we should probably respect the install limit there too

#

i don't think we got any reports about the parallelism for bytecode compilation

#

we did get some though for download parallelism

lyric hedge
#

I'll keep these in mind when I land the feature. I am trying to add sufficient error-handling to avoid breakage, given that essentially every pip user will use this new code path, but it is a balancing act.

tidal kiln
#

it would be nice if one of the existing maintainers backed the request, and an org admin answered the request

#

maybe @high stone, since you're both?

high stone
#

What did I do? XD

tidal kiln
#

you're a packaging.python.org maintainer and an org owner 🤣

high stone
#

Ah, sure. Lemme comment publicly. 😅

tidal kiln
#

thanks!

high stone
#

(goes back to figuring out what to do about dinner)

tidal kiln
#

thanks!

lusty quarry
#

I have a Python package that at build time uses CFFI in the out-of-line API mode. At runtime, CFFI is a required runtime dependency because the compiled code imports the shared library _cffi_backend that is part of the CFFI package itself. I want to ship this in my own package to avoid that runtime dependency. I could copy the file at build time so it's at the root of the wheel but that would technically conflict with any environment where a user happens to depend on CFFI. Is there a way to indicate where the _cffi_backend module is looked up by my built shared binary so that it doesn't have to be at the root of the wheel (and eventually site-packages) but rather within the package directory?

lusty quarry
#

all I have to do is modify one line of the C file that I already emit right before compiling!

lapis hound
#

👋

lyric hedge
#

@mortal shore I'd like to contribute some features to scripttest for pip's test suite. Being a pip maintainer, I already have the commit bit for the repository, but it seems like only you and Ian control the PyPI project. Would it be possible to give me access?

#

(I'm surprised nothing has broken given the last release is from 2013 :P)

mortal shore
#

huh, missed this somehow

#

@lyric hedge what's your PyPI username

lyric hedge
lyric hedge
#

As PyPI and GitHub? yeah

#

Accepted PyPI invitation 👍

fervent cedar
#

hey guys can anyone help me 🙂

#

getting this error

#

how to install something with pypa

#

i am new 🙂

inland prawn
#

what are you trying to do?

fervent cedar
#

Hunyuan3D-2

fading veldt
#

What command are you running?

fervent cedar
lusty quarry
fervent cedar
# lusty quarry `pip install .`

thnx it kinda worked i forgot about that xD . i didnt have any wheel -- user . so i had to install that . after that pip install . --user --no-build-isolation . boom done

brittle token
#

While I'm not sure of its applicability outside venvstacks, the transitive upper bound dependency problem I'm pondering in #venvstacks message feels like the kind of problem that may interest others working on multi-project version consistency problems.

strange knot
brittle token
#

For anyone interested in dependency groups, I noticed a missing cross-link today (the main pyproject.toml page doesn't mention the [dependency-groups] table): https://github.com/pypa/packaging.python.org/issues/1840

Shouldn't be super complicated to add, but does need more time than I'm prepared to give it just now.

high stone
mystic stag
lusty quarry
#

I would assume so... but if not then virtual attendance for maintainers would be pretty sweet

lyric hedge
#

👍 although given that PyCon itself isn't doing any virtual things, I doubt the conference is set up to support any virtual participation.

#

This summit is in-person only, at PyCon US 2025 in Pittsburgh, PA. It will be held on Saturday, May 17th, from approximately 1:45 pm - 5:45 pm (room 319).

mystic stag
#

Well, I could physically attend, but the value proposition is very different depending on whether it requires a PyCon ticket, as that would be potentially more expensive than travel and accommodation combined

lyric hedge
#

I don't go to PyCon, and definitely won't be for the foreseeable future.

mystic stag
#

I ask, because I've been to many events that are "at a conference" but don't require a ticket to that conference

lusty quarry
#

travel is quite strenuous for me so it only makes sense if I'm giving a talk

lyric hedge
#

As long as someone takes good notes, I'll be fine. It's not like I actually do much for the packaging ecosystem :P

high stone
strange knot
brittle token
#

While US travel won't be an option for me any time soon, I must admit that seeing the packaging summit announcement was the first time I minded that I wasn't planning to go to PyCon US this year.

honest geyser
brittle token
honest geyser
#

Alright you're forgiven haha

tidal kiln
lusty quarry
#

^ Obsidian is a great product

abstract nebula
brittle token
blazing lantern
oak quiver
#

I don't think I understand what the proposal is. Is it about standardizing a name, so that tools can automatically discover venvs? Would the standard library venv default to this name, for example?

inland prawn
#

yes

brittle token
#

The two areas I could see potentially causing an extended discussion are whether pip itself should be updated to look for .venv in the current and parent folders when there is otherwise no active venv, and the fact that portable environments are deliberately out of scope.

@blazing lantern Would you be willing to consider proposing a share/venv/venv.json metadata file where we can record a bunch of info that pyvenv.cfg doesn't capture? That way venvs that are incompatible with the current platform could at least be reported nicely. (I generate a venvstacks.json file along these lines in venvstacks so the consuming app can just read relevant relative paths out of that file instead of having to know all the quirks of the way venvs are laid out)
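For reference, the ".venv in the current and parent folders" lookup mentioned above could be sketched roughly like this. This is only an illustration, not anything pip does today: the function name find_dot_venv is made up, and it assumes that the presence of pyvenv.cfg is enough to recognise an environment.

```python
from pathlib import Path
from typing import Optional


def find_dot_venv(start: Path) -> Optional[Path]:
    """Return the nearest ``.venv`` directory at or above ``start``, if any.

    Hypothetical helper for illustration; not part of pip or venv.
    """
    start = start.resolve()
    for folder in (start, *start.parents):
        candidate = folder / ".venv"
        # pyvenv.cfg is the marker file that venv writes at the environment root.
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None
```

Calling find_dot_venv(Path.cwd()) walks upward from the working directory and stops at the first match, so a nested project would shadow an outer one.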

blazing lantern
peak ocean
#

Cross PyPA issue here (that I raised in #setuptools, but it covers many things):

The latest wheel has un-bundled packaging, presumably because setuptools now has native bdist_wheel support, and we don't need to bootstrap wheel into virtualenvs any more.
This means that we can't bundle wheel 0.46.0 inside virtualenvs or Python's test.wheeldata anymore, without also bundling packaging.
That led me to try to clean this up for Debian, I've filed:

hollow trout
#

Has venv ever installed wheel? I thought it only did pip and setuptools.

inland prawn
#

I think pip, wheel and setuptools were a default set of packages installed in venvs for a long time

peak ocean
lusty quarry
#

setuptools and wheel stopped being automatically installed in virtual environments by default in 3.12 since packages are expected to be built in an isolated environment nowadays by frontends like pip (which is still installed by default in virtual environments that aren't created by uv)

inland prawn
#

virtualenv doesn't install setuptools for some time now

oak quiver
#

(I think you can also configure virtualenv to default to pip-less environments)
(Paper will definitely follow uv's lead on this one)

woven plover
#

im wondering - could a package scheme where, instead of src/root/..., one would use the toplevel directory src:root/... be feasible?
im looking for ways to make sure that the source folder is not directly importable

hollow trout
#

A directory name with a colon in it? That wouldn't work on Windows.

brittle token
#

It's kinda hacky, but putting this in the top-level __init__.py would prevent src.root and parent.src.root imports:

if "src." in __name__:
    raise ImportError("Explain the problem here")
woven plover
woven plover
brittle token
oak quiver
#

yes, this technique is also used by Pip, albeit on a file rather than a folder (__pip-runner__.py, IIRC)

#

Of course, this doesn't defeat dynamic imports that use a string path
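A rough demonstration of that caveat, assuming the guard from above sits in src/root/__init__.py. The throwaway layout, the VALUE constant and the demo function are all invented for the example. A normal import through src trips the guard, while loading the same file by path under another name does not, because __name__ is then just "root".

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Same guard as in the message above, plus a marker value (VALUE is made up).
GUARD = '''\
if "src." in __name__:
    raise ImportError("Explain the problem here")
VALUE = 42
'''


def demo() -> tuple[bool, int]:
    """Return (guard blocked 'import src.root', VALUE loaded via file path)."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "src" / "root"
        pkg.mkdir(parents=True)
        (pkg.parent / "__init__.py").write_text("")
        (pkg / "__init__.py").write_text(GUARD)

        # 1. Regular import through the src folder: __name__ is "src.root".
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            try:
                importlib.import_module("src.root")
                blocked = False
            except ImportError:
                blocked = True
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("src.root", None)
            sys.modules.pop("src", None)

        # 2. Load the same file by path under the name "root": the guard never fires.
        spec = importlib.util.spec_from_file_location("root", str(pkg / "__init__.py"))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return blocked, module.VALUE
```

So the check is a guard against accidents rather than a hard barrier.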

inland prawn
#

is there any simple way to "reverse search" the pypi for the packages that depend on my package?

fading veldt
inland prawn
#

I guess project TODO#11145 incoming 🤣

#

reverse package index

fading veldt
#

Yeah I've wanted to write a site like wheelodex, but with a goal of being aggressively cached/as static as possible

queen hornet
vital pier
high stone
#

background grumbling about build dependencies being excluded from those ecosystem dependency sites

queen hornet
#

And dependents

vital pier
tidal kiln
#

and you can also run custom queries against the dataset

vital pier
honest geyser
oak quiver
#

Thanks to the power of git the contents of PyPI takes up only 439.4 GB on disk

I thought I'd heard there are on the order of 20TB of wheels... ?

#

oh, I guess this repository aliases identical files between wheels for a project

vital pier
# inland prawn care to share the numbers?

Sure, keep in mind, these don't mean anything "real" 😉

At the end of Feb this year, the count of LOC in Python on PyPI (space-wise, this is dwarfed by compiled binary artifacts, but that's not the point of this silly exercise) was: 428,967,879,093 lines of text (428 billion, for those who don't wanna count commas).

Assuming 50 lines per page, and 250 pages of an average book (yes, lotsa assumptions!), it's ~34 million books. This is about the same size as the US Library of Congress' book count.
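The back-of-the-envelope arithmetic, for anyone who wants to plug in different assumptions (the constant names are mine, the numbers are the ones above):

```python
LINES_OF_PYTHON = 428_967_879_093  # line count quoted above
LINES_PER_PAGE = 50                # assumption stated above
PAGES_PER_BOOK = 250               # assumption stated above

books = LINES_OF_PYTHON // (LINES_PER_PAGE * PAGES_PER_BOOK)
print(f"{books:,} books")  # 34,317,430 books
```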

#

I estimate the cost of printing 1 full set at ~$160MM USD

queen hornet
#

Do you have a count of "words"? Then we can use the NaNoWriMo/NaNoGenMo definition of a novel as 50k words

vital pier
woven plover
#

i'm looking for a suggestion on approaching a "breaking the build" situation

it seems setuptools_scm is used in rather broken ways on some old pythons - and if i were to split it into vcs_versioning + a setuptools_scm shim with minimal correct dependencies a lot of old setuptools would just break

queen hornet
#

how old Pythons? is it possible to wait until they're EOL?

woven plover
queen hornet
#

wait until 3.9 is EOL and then go ahead and make the changes (give warning already if possible). people using LTS can pay their enterprise distros to patch for them, or stick with the last versions supporting the old Pythons

solemn mica
#

Hi all, I'm helping as a rep of Anaconda with thumbs up from PSF/Ee to coordinate the Python Packaging User Survey for 2025.

Want to sign up as the representative for your organizations? Short timeline on this. Gathering question suggestions until the 7th of May. Group Review 8th May. PyPI final review 9th May. Go live 14th May.

More details in the form: https://forms.gle/Xzg3HvAHKQocZUBQ7

lean lake
#

Anyone else had pip issues with python-docs-theme of late?

 INFO: pip is looking at multiple versions of python-docs-theme to determine which version is compatible with other requirements. This could take a while.
doc_build: exit 1 (3.94 seconds) /home/runner/work/bandersnatch/bandersnatch> python -I -m pip install -r requirements_docs.txt pid=1931
  doc_build: FAIL code 1 (4.30 seconds)
  evaluation failed :( (4.44 seconds)

Bandersnatch is failing like so: https://github.com/pypa/bandersnatch/actions/runs/14570894343/job/41096183820

I can't repro locally.

#

Am I trying wrong?

cooper@cooper-fedora-MJ0J8MTZ:~/repos/bandersnatch$ python3 -m venv --upgrade-deps /tmp/tb
...
cooper@cooper-fedora-MJ0J8MTZ:~/repos/bandersnatch$ /tmp/tb/bin/pip install -r requirements_docs.txt
...
Successfully installed ...
oak quiver
#

I'm not sure what you expect the pid=1931 part to do. I would expect it to try to locate version 1931 of a package named pid, and install it in addition to whatever's specified in the requirements file. As far as I can tell, there is a package with that name on PyPI, but it's only up to version 3.0.4.

#

(If you meant to set an environment variable, that would have to come at the start, before python on the command line)

#

Ah, I see, that's part of what the CI for bandersnatch is trying to run? Maybe there's something wrong with the GH workflow or something. Probably better for the #bandersnatch channel; I don't know anything about that program really

lean lake
#

It's very niche ... even I don't use it anymore ... (mirrors PyPI locally for pip etc.)

queen hornet
queen hornet
lean lake
#

O many thanks

#

I'll sit down and remove the unmaintained swift support for bandersnatch so we can unblock 3.13 doc building

#

Wonder if pip could say "package doesn't support python3.X" there more explicitly?

oak quiver
#

ah, my fault I'm unfamiliar

woven plover
#

im thinking of creating a meta build backend that can use different backend configurations/dependency sets for vcs/sdists to solve the bootstrapping problem for sdist
the idea would be to enable bootstrap by providing the metadata and a way to compose a reproducible wheel as simply as possible, but also provide more details for backends that support it

my goal would be to have something that produces a sdist, which contains a dist-info-for-source metadata folder + something that maps source artifacts to locations good enough to enable an editable install and/or a wheel in a minimal non-validated but reproducible manner

later it could be expanded to still invoke other build backends but passing along its metadata

currently the main pitfall would be that layering of get_requires_for_wheel/sdist across multiple levels is not supported - a wheel for meta-backends may be needed

queen hornet
lean lake
bold patio
#

specifically about sdist generation, like who generates them

woven plover
#

latest victim importlib_metadata vs setuptools_scm (due to a bug in python pre 3.10)

#

now what if instead of the normal backend with all deps, there was a meta backend, that if the sdist was provided with specific metadata could build the wheels with practically zero dependencies

#

then at least when going via sdist everything could build safely

bold patio
#

and the meta-backend itself wouldn't have any dependencies?

woven plover
#

correct

#

for vcs based builds it would use the hooks to get the real backend

#

and for the sdist build the metadata would need to be part of the sdist in a consumable minimal way

bold patio
#

at least for us, we have a dedicated fetch phase so any and all network access to pull resources et al have to happen there, and never during any other phase

#

so when we deal with go and rust (cargo), we literally specify every dependent go package/cargo crate for the fetch phase, then rearrange everything so the respective package managers understand the resources are already present and accounted for (before the build phase)

lyric hedge
#

does anyone know of any good resources that describe the new pylock.toml that isn't the formal specification?

uneven karma
#

Has there even been enough time for anything to be created?
Astral just released their initial support for it in uv, you might check their docs

honest geyser
#

I don't think we wrote much

lyric hedge
#

Yea, it's not anything better than what's in the pip docs/what I'm writing.

oak quiver
#

re pylock.toml, I could possibly be interested in writing something, since I want to implement installing from them anyway

#

oh, also about the meta-backend idea, I had this idea for bbbb to (optionally) build sdists that include the wheel-building code as an in-tree backend.

lyric hedge
lyric hedge
#

I'm likely going to build a script to try to build the top $number Pure Python packages on PyPI and flag those that are missing a pyproject.toml (especially those that don't build correctly with the fallback pyproject.toml pip synthesizes when PEP 517 is forced) in an effort to aid the deprecation cycle of setup.py bdist_wheel. I know that people have done similar experiments before so is there any infrastructure that's publicly available?

#

Otherwise, it can't be too hard to build. Simply pull Hugo's top PyPI list, download the sdist and figure out whether it's pure python and missing a pyproject.toml, and attempt the build. Throw that in a docker container on a VPS and let it do its thing for a little bit.
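The "is it missing a pyproject.toml" step could look something like the sketch below. It assumes a .tar.gz sdist that unpacks to a single top-level folder, and sdist_has_pyproject is a made-up name. The pure-Python check and the actual build attempt are left out.

```python
import tarfile
from pathlib import PurePosixPath


def sdist_has_pyproject(sdist_path: str) -> bool:
    """Return True if the sdist has pyproject.toml directly under its top-level folder.

    Hypothetical helper; assumes a gzip-compressed tar sdist.
    """
    with tarfile.open(sdist_path, mode="r:gz") as archive:
        for member in archive.getmembers():
            parts = PurePosixPath(member.name).parts
            # Expect "<name>-<version>/pyproject.toml"; nested copies (e.g. in tests/) don't count.
            if member.isfile() and len(parts) == 2 and parts[1] == "pyproject.toml":
                return True
    return False
```

Only the member list is inspected, so nothing from the archive gets extracted to disk.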

fading veldt
lusty quarry
#

the logical next question would be whether it's ethical to send an email to those maintainers...

woven plover
lusty quarry
#

personally I would agree but from what I've seen folks would be more likely to take issue with an automated PR rather than an email, even if the change is completely correct and passes tests

woven plover
jovial holly
#

Consider a notice on upload like PyPI does for deprecations?

#

I’m not entirely familiar with what the issue is, but if it is a breaking situation, we have that option

woven plover
jovial holly
#

again, i'm not familiar with the specifics of what breakage is anticipated, but if it is truly catastrophic enough that 1) it needs to be fixed on a horizon of less than a year and 2) it requires action on behalf of the maintainer... then why not propose blocking non-compliant uploads to PyPI?

maybe worth filing an issue on pypi/warehouse explaining the situation regardless

#

i'm trying to scroll back but it is non-obvious to me what precisely the issue is

woven plover
#

the problem is that pip will start to fail to build packages that potentially are on pypi since ages

#

aka some of them build only if a very legacy setuptools build chain is used - and the new path will fail some of those that are wonky

jovial holly
#

when is this pip expected to release?

fading veldt
#

pip 25.3, so presumably later this year

jovial holly
#

thank you emma

#

given what i've read in both cases, i really would strongly lean towards adding a warning on-upload given that there is still some amount of time. that is going to put it in front of a lot more people who are actively maintaining projects and probably move the needle a lot further.

like ultimately people who have "abandonware"'d their projects on PyPI are unlikely to come back to fix what is ostensibly a client decision.

#

if there's a simple test that PyPI can run on upload, it's practically free.

#

porting the excellent guidance for both situations to packaging.python.org and we can link that to the maintainer. kind of a no-brainer win IMHO

fading veldt
jovial holly
#

we email them post-upload! so even automated get the notice

fading veldt
#

Oh good point!

jovial holly
#

i wish we had a talk-back on upload 🙂

fading veldt
#

You could also check that the pyproject.toml file has a [build-system].build-backend key too

woven plover
#

like dont allow upload sessions to complete unless the errors are solved or acked

jovial holly
#

if someone is willing to write the issue, provide the copy, and a way of testing for when to notify i am happy to do the drudgery of implementing

jovial holly
#

i doubt upload 2.0 ships before pip 25.3

woven plover
oak quiver
#

(as long as the build is trivial enough to figure out without context)

oak quiver
#

but yeah generally I'd be more concerned about projects not uploading new patches that are compliant. If they're trying to keep using the same legacy process it's easy enough to complain about the result. but if they've abandoned the project and it's popular... ugh

#

and end users meanwhile need better ways to find out "this project hasn't seen an upload in X years (and also didn't follow new practices in the latest release, but that's usually kinda redundant)"

lyric hedge
#

If PyPI is going to go in the direction of warning on upload w/o a pyproject.toml, I'd strongly prefer if the warning doesn't just focus on the current pip deprecations. Yes, that's the main concern due to the risk of breakage, but there are other benefits associated with adding a pyproject.toml.

#

I don't have time today to look or think about this further, but I can follow up on this sometime probably tomorrow /cc @jovial holly.

#

I'm generally in favour of using PyPI to "encourage" the ecosystem to adopt pyproject.toml and therefore modernize the ecosystem, but I don't want PyPI to be caught up in a PR mess because we communicate the what/why poorly.

steady lion
# lyric hedge I'm likely going to build a script to try to build the top $number Pure Python p...

With the rough heuristic below, I took the top 10,000 packages* (actually 9,338, some don't have sdists). Of these, 8,429 are pure-Python. Of this 8,429, 8 don't have a PKG-INFO file, 4,130 don't have a pyproject.toml file, and 468 have a pyproject.toml without a [build-system] table. This gives ~54.5% (4,130 + 468 = 4,598 of 8,429) that use legacy (non PEP 517/518) building.

For this, I have a pre-processed repo of the top 10k sdists (1.5G download, ~16G on disk):

$ git clone https://github.com/AA-Turner/top-pypi-sdists-full
$ cd top-pypi-sdists-full

I then ran the below:

from pathlib import Path

# Suffixes treated as evidence of non-Python sources (a rough heuristic).
NON_PYTHON_SUFFIXES = {'.c', '.cc', '.cs', '.c++', '.cpp', '.f', '.go', '.java', '.rs', '.s'}

projects = list(Path('extracted').glob('*/*'))
count_pure = 0
no_pkg_info = set()
no_pyproject = set()
no_build_system = set()
for p in projects:
    # A single matching file anywhere in the tree marks the project as not pure-Python.
    if any(f.suffix in NON_PYTHON_SUFFIXES for f in p.rglob('*')):
        continue
    count_pure += 1
    if not p.joinpath('PKG-INFO').is_file():
        no_pkg_info.add(p.name)
    # Only a top-level pyproject.toml is considered.
    if not p.joinpath('pyproject.toml').is_file():
        no_pyproject.add(p.name)
        continue
    # Byte search for the table header rather than a full TOML parse.
    if b'[build-system]' not in p.joinpath('pyproject.toml').read_bytes():
        no_build_system.add(p.name)

print(f'Total projects: {len(projects):,}')
print(f'Pure-python projects: {count_pure:,}')

print(f'Projects with no PKG-INFO (n={len(no_pkg_info)}):')
for proj in sorted(no_pkg_info, key=str.casefold):
    print(f'    {proj}')
print()

print(f'Projects with no pyproject.toml (n={len(no_pyproject)}):')
for proj in sorted(no_pyproject, key=str.casefold):
    print(f'    {proj}')
print()

print(f'Projects with no [build-system] table (n={len(no_build_system)}):')
for proj in sorted(no_build_system, key=str.casefold):
    print(f'    {proj}')
print()
lyric hedge
#

how up to date is the pre-processed repo?

steady lion
#

This probably does overestimate the number of missing files, because it only looks for the pyproject at the top level, and it is very aggressive on pure-Python (e.g. a C++ file in tests means it isn't pure-Python)

steady lion
lyric hedge
#

Ah very awesome, this will probably be quite helpful then!! Thank you so much!

#

I'll take a look and automate the build/flagging part and throw it onto a VPS probably tomorrow.

steady lion
#

I planned on posting about it to Discourse at some point, the idea is that it's easier to update with git pull, and you can use e.g. rg with it. Git has pretty great compression, too, so the download size is much smaller than downloading each sdist individually.

lyric hedge
steady lion
#

I was surprised to see the number was so large, though, to be honest I expected more adoption of pyproject

lyric hedge
#

If you filter by packages that have seen a release in like the last year, I'd imagine (and hope) the percentage would be better.

steady lion
#

What this doesn't record is when the project was last updated, but all are in the top-10k for downloads.

lyric hedge
#

I can probably use the PyPI APIs to figure that out.
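Roughly what I have in mind, as an untested-against-production sketch. It uses the JSON API at https://pypi.org/pypi/<name>/json and reads upload_time_iso_8601 from the "urls" list, which only covers the files of the latest release, so a project whose newest upload was a backport to an older series would be misdated. The function names are placeholders.

```python
import json
from datetime import datetime
from typing import Optional
from urllib.request import urlopen


def latest_upload_time(payload: dict) -> Optional[datetime]:
    """Newest upload time among the latest release's files in a JSON API payload."""
    stamps = [
        # Normalise the trailing "Z" so older Pythons can parse the timestamp too.
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for f in payload.get("urls", [])
    ]
    return max(stamps, default=None)


def fetch_project(name: str) -> dict:
    """Fetch the JSON API document for one project (performs a network request)."""
    with urlopen(f"https://pypi.org/pypi/{name}/json") as response:
        return json.load(response)
```

Usage would be latest_upload_time(fetch_project("some-project")), compared against a cutoff such as 365 days ago.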

#

We have time, at least six months assuming the deprecations don't get pushed back. I'll collect some data and decide what's the best way forward.

oak quiver
#

I'd also be interested in projects that are pure Python but don't publish a wheel.

steady lion
#

Filtering to those with an upload in the last 365 days:

Total projects: 6,033
Pure-python projects: 5,289
Projects with no PKG-INFO: 4
Projects with no pyproject.toml: 1,837
Projects with no [build-system] table: 367
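Working out the legacy share for this filtered set from the counts above (variable names are just for the example):

```python
pure_python = 5_289      # pure-Python projects with an upload in the last 365 days
no_pyproject = 1_837     # no pyproject.toml
no_build_system = 367    # pyproject.toml without a [build-system] table

legacy = no_pyproject + no_build_system
share = legacy / pure_python
print(f"{legacy:,} of {pure_python:,} ({share:.1%})")  # 2,204 of 5,289 (41.7%)
```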