#wheelnext

lucid pike
#

👋

short epoch
#

👋

knotty elbow
#

👋

lucid pike
#

Hello all! See the above GH link for most of the details we have written down atm. A bunch of folks are collaborating on evolution of the wheel spec, variant support, and other stuff.

#

This is very much a community-driven fully open initiative to evolve the wheel spec, produce PEPs, reference implementations, etc.

sick nest
#

One thing off the top of my head: top_level.txt should be standardized. importlib.metadata.packages_distributions() relies on it. See also: https://discuss.python.org/t/record-the-top-level-names-of-a-wheel-in-metadata/29494

Is there a better place to bring that up for wheel-next?

toxic wraith
#

Reading through the thread as context, I think this is a great idea. I think I agree with Thomas, Ofek, and others that re-using Provides seems the most appealing.

toxic wraith
nova knoll
hard escarp
#

For anyone else like me that so rarely looks at Discord channel headers I barely even recognised the screenshot that mirrored the current contents of my Discord window: https://github.com/wheel-next 🙂

river linden
lucid pike
#

Ah, looks like on iOS at least, you have to first click on the > next to #wheel-next

toxic wraith
scenic pulsar
#

I wish that foo-version.dist-info folders could be shuffled to be named dist-info/foo before the next standard

short epoch
#

sounds kinda reasonable

tawdry raven
#

If that happens, it will mean that old installers will not be able to locate the WHEEL file and thus determine the version.

toxic wraith
#

Yeah, I think for that reason the format of "dist-info folder in a zip file with metadata" was one of the invariants I've listed to keep.

hard escarp
#

Depends on the compatibility story, though. If it's "emit wheel 1.x wheels if they work for you, otherwise emit wheel 2.x wheels and accept some clients won't be able to install them", then it may not matter too much how the 2.x wheels fail on older clients, so long as they do fail.

toxic wraith
#

Current resolvers will still pick a 2.0 wheel even though they can't install it, so that is something that needs to be handled before any 2.0 wheels start existing on pypi.org. Otherwise people will see CI breakages and downstream users will get error messages saying "this wheel is unsupported" with no actionable information. I don't want a Python 2->3 for wheels, and I expect most users do not want a breaking change in wheels with no clear migration path.

pulsar narwhal
#

Is that just an issue with current resolvers not knowing how to handle 2.0 wheels? Does this need more metadata exposed by index, or is this something that can just be solved by contributing what's missing to various resolvers?

short epoch
#

don't know about other options, but installer is hardcoded to fail installation if wheel version is not 1.x. And IIRC indices don't expose wheel info other than file itself and file hash, sometimes METADATA

toxic wraith
#

All installers MUST fail on installation of a wheel of a major version they do not support per the wheel 1.0 spec. Currently @short epoch is right that the version is not exposed at all unless the resolver can get the WHEEL file either through a range request or downloading the wheel. But no resolvers as far as I am aware check for wheel compatibility when resolving wheels.
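
A minimal sketch of that rule, assuming the installer has already read the text of the `.dist-info/WHEEL` file. `SUPPORTED_MAJOR`, `UnsupportedWheelError` and `check_wheel_version` are illustrative names, not taken from any real installer.

```python
from email.parser import Parser

SUPPORTED_MAJOR = 1  # assumption: this installer only understands wheel 1.x


class UnsupportedWheelError(Exception):
    """Raised when a wheel cannot be handled by this installer."""


def check_wheel_version(wheel_file_text: str) -> tuple[int, int]:
    """Parse WHEEL file contents and fail on a newer major version."""
    raw = Parser().parsestr(wheel_file_text).get("Wheel-Version")
    if raw is None:
        raise UnsupportedWheelError("WHEEL file has no Wheel-Version field")
    try:
        major, minor = (int(part) for part in raw.strip().split("."))
    except ValueError:
        raise UnsupportedWheelError(f"unparseable Wheel-Version: {raw!r}") from None
    if major > SUPPORTED_MAJOR:
        raise UnsupportedWheelError(
            f"Wheel-Version {raw} is newer than the supported {SUPPORTED_MAJOR}.x"
        )
    return major, minor


print(check_wheel_version("Wheel-Version: 1.0\nGenerator: demo\n"))
```

Note that this check only runs once the WHEEL file is in hand, which is the point made above: a resolver never sees it unless it fetches that file.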

junior tundra
short epoch
#

true

toxic wraith
#

Yeah, this is why I think it is important to maintain the outer "zip file of metadata". the actual package data (which will take up most of the size anyway) could be something completely different

scenic pulsar
#

Is there anything stopping new filenames for future proof wheels

pulsar narwhal
toxic wraith
#

I mean old installers won't recognize it, so there are issues with doing so

pulsar narwhal
#

so, do we just need installers to actually fail predictably for things outside of what they support?

#

because that seems like something they already should have been doing

scenic pulsar
#

It's slightly preferable that at least the resolvers keep trying until they find an installable one (and warn that the newer was not installable)

pulsar narwhal
#

But if they haven't already been doing that, then people would need to update their resolver anyhow, right?

toxic wraith
#

I think defining semantics around what to do with unsupported files is really hard and starts to restrict what you can do in the future. If someone does pip install somerandomfile.extension. I don't want pip assuming that .extension files are potentially some future wheel variant, but at the same time, I don't want to have older installers break if I change the extension. Somewhat of a catch-22

#

*break in confusing ways

pulsar narwhal
#

I've written change-resilient binary protocols before; the easiest thing to do here is, if anything isn't as expected, assume you aren't capable of handling it.

#

so there are currently multiple layers of that: if you get something that doesn't decompress in an expected way, it's not a wheel you support.

If you get a wheel file with unexpected content, you don't support it.
If you get to the point where you can parse a wheel file, and it's a version you don't support, you don't support it.
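
The three layers above can be sketched roughly as follows. This is an illustration only: `why_unsupported` is a made-up helper, and a real installer would validate much more than this.

```python
import io
import zipfile


def why_unsupported(data: bytes, supported_major: int = 1):
    """Return None if the wheel looks installable, else a reason string."""
    # Layer 1: the container must open as a zip archive.
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return "not a zip archive we can read"
    with archive:
        # Layer 2: the expected layout must be present.
        wheel_files = [
            name for name in archive.namelist()
            if name.endswith(".dist-info/WHEEL") and name.count("/") == 1
        ]
        if len(wheel_files) != 1:
            return "no unique top-level .dist-info/WHEEL file"
        try:
            text = archive.read(wheel_files[0]).decode("utf-8")
        except (NotImplementedError, UnicodeDecodeError, zipfile.BadZipFile):
            return "WHEEL file could not be read"
    # Layer 3: the declared version must be one we know.
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "wheel-version":
            major = value.strip().split(".")[0]
            if major.isdigit() and int(major) <= supported_major:
                return None
            return f"unsupported Wheel-Version {value.strip()}"
    return "WHEEL file has no Wheel-Version field"


print(why_unsupported(b"definitely not a wheel"))
```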

toxic wraith
#

Oh absolutely, but there is a more fundamental issue of what to do if you don't know if a file is a wheel or not

#

That is an issue if you want wheels to fail reliably on future wheel versions if the extension changes

#

so I think as a practical matter we should not change the extension

#

anyway, I have a lot written up for a wheel 2.0 PEP that goes over issues like this, that I hope to publish in a few weeks

pulsar narwhal
#

agreed on not changing the extension, but I do somewhat wonder if the choice to base the initial spec on a zip file is going to create any problems. it would have been easier here if the format had its own semantics and a version field at a fixed offset from start of file, even if the inner content was "just an embedded zip file" at first.

scenic pulsar
#

As far as I'm aware nothing forbids ensuring the first few entries in a file are the metadata, then one wouldn't even need the index to do opportunistic fetch of the metadata

#

Personally I found custom nested formats more painful, whl is super easy to create and debug with preexisting tools

#

It's literally just a zip of what would be installed

short epoch
#

How about we define the list of compression formats to try in order?

scenic pulsar
#

Please don't make it complicated, just keep the zip file

pulsar narwhal
#

the zip file, and the metadata being inside it with compression, means that the metadata can't be reliably gotten in a forward compatible manner

#

it doesn't have to be complex to change that, but there's no way to benefit from picking a better different compression to standardize on if a requirement is that resolvers give friendly error messages involving the required wheel version other than forcing the index to provide that info.

#

I'd be fine with the error message quality being sacrificed here tbh, I'm fine with the error message just being "hey, we don't know how to support this file, the assumption is that it is a newer wheel version, but it could also be corrupted"

#

but with a zip file, there's no forward compatible option here. what if 2 years from now we're all using foo_compress because of some miracle savings over zstd? tools today don't even know about foo_compress, it's a mythical future. We can't tell them they have to try every compression method, that version will still fail to extract the info needed for the metadata version.

#

(for the record, I'm fine with keeping it a zip file, I'm just pointing out where the choice of a zip file means it can't reliably get the metadata in the future unless we force the index to handle it for supported wheel versions)

toxic wraith
#

I think we'd probably want to do something similar to debian files, where in wheels the metadata is kept un-compressed in the zip file, and the actual data can be in a separate archive format (e.g. tar.zst)

#

This allows keeping the old metadata but adopting new compression formats

scenic pulsar
#

I vaguely remember that zip supports per-file compression format choice

pulsar narwhal
#

it does, but that still requires downloading a lot more to determine the version.

toxic wraith
#

i.e. inside the zip file, you'd have a .dist-info folder, then next to it package_data.tar.zst (straw man name, please don't bikeshed 😅 ) for the package data.
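
A rough sketch of that layout using only the standard library. Two assumptions are baked in: xz (via `tarfile`) stands in for zstd, and `package_data.tar.xz` and `build_nested_wheel` are straw-man names for illustration only.

```python
import io
import tarfile
import zipfile


def build_nested_wheel(metadata: str, files: dict[str, bytes]) -> bytes:
    # Inner archive: the package data, compressed with xz (stand-in for zstd).
    inner = io.BytesIO()
    with tarfile.open(fileobj=inner, mode="w:xz") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    # Outer archive: a plain zip whose members are stored uncompressed,
    # so the metadata stays readable without any new codec.
    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("demo-1.0.dist-info/METADATA", metadata)
        zf.writestr("package_data.tar.xz", inner.getvalue())
    return outer.getvalue()


wheel_bytes = build_nested_wheel(
    "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
    {"demo/__init__.py": b"print('hi')\n"},
)
with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as zf:
    print(zf.namelist())
    print(zf.read("demo-1.0.dist-info/METADATA").decode())
```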

toxic wraith
scenic pulsar
#

How about standardize to have metadata as first entries and providing the range details in index data

#

Then fetching metadata would just be a partial download

pulsar narwhal
toxic wraith
toxic wraith
pulsar narwhal
#

yeah, ship sailed on that. I do think it could have been done in a way that remained easy to work with if we all agreed that the data section of the file needed to remain easy to work with, and that the standard library would have a way to invoke python -m some_mod some.whl to get at that easy to work with inner file

#

especially cause that could have just been "header with metadata, append zip file"

toxic wraith
#

sure

pulsar narwhal
#

so I guess the questions of how to get to wheel 2.0 is

how should resolvers fail wheels the installer can't handle?
do resolvers need any metadata that isn't already in pep658?
which resolvers or installers in use need changes?

I imagine your draft covers these, but if you have an answer for the third one of those, I may be able to put some of my time into helping address that.

toxic wraith
#

Yep! These are exactly along the lines I've been thinking. The third question seems like it shouldn't be covered in the PEP to me (perhaps discussed in the compatibility section). But I expect the answer to be "all of them"

pulsar narwhal
#

yeah, I think the 3rd is more a consequence of the first two + the current state. probably something that needs addressing for the most popular resolvers before pypi can accept 2.0 wheels even if the pep doesn't end up mandating that.

scenic pulsar
#

i suppose the idea being to grab the last "chunk" of an archive to get both the zip index and the metadata

toxic wraith
hard escarp
#

On the file naming front: if the wheel major format goes in the extension (i.e. *.whl, *.whl2, *.whl3, etc), then older clients won't even see the wheel versions they don't support (since they won't be looking for that extension). With build tools emitting the oldest wheel version that provides all the features the project they're building requires, this frees up major iterations of the spec that add significant chunks of functionality to also clean up cosmetic details when doing so is deemed worthwhile.

The difference between this and the py2 -> py3 transition is that installers will be able to happily support both whl and whl2 installs. Projects would also be able to choose whether to fall back directly to sdist from the whl2 format, or whether it makes sense to offer a whl when whl2 isn't supported (for example, if whl2 has better dynamic linking support for extension modules, a statically linked whl may be a valid fallback option).
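
A toy illustration of the extension idea, assuming an installer simply filters index listings by the extensions it knows. `.whl2` is the hypothetical extension from the message above and `visible_files` is a made-up helper.

```python
OLD_INSTALLER = (".whl", ".tar.gz")
NEW_INSTALLER = (".whl2", ".whl", ".tar.gz")  # ".whl2" is hypothetical


def visible_files(index_listing, known_extensions):
    """Return only the files whose extension this installer recognises."""
    return [name for name in index_listing if name.endswith(known_extensions)]


# A project that publishes only the new format plus an sdist.
listing = ["demo-2.0-py3-none-any.whl2", "demo-2.0.tar.gz"]
print(visible_files(listing, OLD_INSTALLER))  # only the sdist is visible
print(visible_files(listing, NEW_INSTALLER))  # both files are visible
```

In this sketch the older installer never errors on the new format; it quietly falls back to the sdist, which is the trade-off the project would be choosing.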

quasi robin
#

If it's whl whel wheel wheeel etc I'm on board

tardy crown
lucid pike
#

@toxic wraith I was wondering over the weekend whether version pinning could partially solve the "you started releasing 2.0 wheels but I don't understand them" problem? Meaning, I would expect a package to start releasing 2.0 wheels after something like a major version bump (yeah, I know, can't count on proper semvering), but in any case, a project can't release both 1.0 wheels and 2.0 wheels at the same time. So if you have a dependency that uses 2.0 wheels, and your install starts to fail, you can just ceiling the package version, and then that old installer will never try to grab the 2.0 wheel. Not super convenient but workable, I think!

toxic wraith
# lucid pike @toxic wraith I was wondering over the weekend whether version pinning c...

I don't think 2.0 wheels should require a major version bump in semver of the package, that would probably hurt adoption (people don't like to do major version bumps) and seems leaky abstraction-wise.

Pinning does solve the issue of incompatible wheels, but only after running into the problem. A lot of my design thinking about this is how to make the UX of the transition as painless as possible. Part of that is to avoid a 2.0 wheel failing to install because of an old installer, pretty much ever. I do think that @hard escarp may be right that the correct path is to use an entirely different file extension so there is no chance of conflicts. I need to think about the implications of that. Another important constraint is I don't want to have to have a new file extension for each new feature, that would get very confusing quickly.

lucid pike
#

It's possible that's the way out, but agreed you don't want a new extension for every new feature. I'm also concerned about what happens for wheel 3.0 (if that is ever needed). Do we just keep bumping the extension?

It's true that you can only pin after the failure, but it's a failure of your installer's compatibility so the other solution is just to update your installer, but that does leave a window between the time 2.0 wheels are uploaded and your installer of choice gains support for the new format.

toxic wraith
#

Yep, I think the most important part of defining the transition process between wheel 1.0 and 2.0 is the gap where most users' installers don't support 2.0

lucid pike
#

Let's add support to the installers before the package managers then!

hard escarp
#

The way I've been thinking about the major version bump: if a project can already ship wheels, then they would probably prefer a minor version bump where an installer ignoring a new wheel feature doesn't break the install. That suggests the primary intended audience for a new major wheel format version would be projects that only ship sdists because wheels don't work for them.

Otherwise there would need to be some huge improvement to counterbalance the "lower levels of installer support" downside that would exist for some time, and I must admit I struggle to picture what could be that beneficial.

#

Another possibility that occurred to me is that if the major benefit was an opt-in mechanically reversible trade-off (e.g. internal xz compression at the expense of giving up direct import support), and parallel publication was supported, tools could emit and upload both formats during the transition period before eventually dropping the legacy 1.x wheel.

toxic wraith
#

There are a few conflicting considerations I see:

  • if wheel 2.0 can be published in parallel to 1.0 wheels, projects suddenly take up twice the space on pypi, which is already an issue
  • if we don't allow this, either
    a) 2.0 wheels are selected by old installers and raise an error which likely will be unactionable (older installers don't know what to recommend if they encounter a new wheel, except to try to update which may or may not help) or
    b) they aren't selected and users don't see updates if their installer is older. It would also be unfortunate if this happened silently

I think not selecting them but emitting a warning that such a choice happened is the most reasonable choice to make. I also think whether or not to allow side-by-side wheels is a question of whether or not the pypi admins think it is sustainable
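
A sketch of "don't select, but warn". It assumes the resolver somehow knows each candidate's wheel major version up front, which indexes do not expose today; `Candidate` and `pick_candidate` are illustrative names.

```python
import warnings
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    filename: str
    project_version: tuple  # simplified stand-in for a parsed version
    wheel_major: int        # hypothetical: not exposed by indexes today


def pick_candidate(candidates, supported_major: int = 1) -> Optional[Candidate]:
    """Pick the newest installable candidate, warning about skipped ones."""
    usable = [c for c in candidates if c.wheel_major <= supported_major]
    best = max(usable, key=lambda c: c.project_version, default=None)
    for c in candidates:
        newer = best is None or c.project_version > best.project_version
        if c.wheel_major > supported_major and newer:
            warnings.warn(
                f"skipping {c.filename}: wheel format {c.wheel_major}.x is not "
                f"supported by this installer (supports up to {supported_major}.x)"
            )
    return best


candidates = [
    Candidate("demo-1.0-py3-none-any.whl", (1, 0), 1),
    Candidate("demo-2.0-py3-none-any.whl", (2, 0), 2),
]
print(pick_candidate(candidates))
```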

hard escarp
#

Variation on b) for projects that publish sdists: --only-binary and --prefer-binary on older clients don't see the updates, source-allowed installs start building from source instead of using the no-longer-recognised wheels.

#

As far as the allowing-side-by-side-distribution question goes, the one carrot I could see making that worthwhile for PyPI in the long run is if the eventual promise is wheels getting much smaller on average (which implies the two-layout solution: the importable layout needed for bootstrapping use cases like ensurepip, and a more aggressively compressed one which allows for everything other than the wheel metadata to be shipped inside a nested xz compressed tarball).

toxic wraith
toxic wraith
hard escarp
#

Hah, I'd genuinely forgotten about that option. I think it came up way back when Daniel Holth first raised the prospect of wheel 2.0.

short epoch
short epoch
hard escarp
#

We've done it before. Linux wheels were defined for private use long before they were accepted on PyPI (and for comparable reasons: in private use cases, you could make sure they were only installed in the environments they were designed for, even though the pre-manylinux Linux wheel tags were hopelessly ambiguous about the system ABI they needed)

#

It's certainly not a great option, but it's an option worth considering.

toxic wraith
lucid pike
#

If installers communicated to PyPI the version of wheels they supported, PyPI could potentially adjust the Simple API view they present to such installers. It could possibly leverage some of the multiview approaches being proposed by the variants work.

Other than that, what installer support would be considered "enough"? Is it enough that say the top 3 installers in use support the new standard, or that the top X% of installers talking to PyPI support it?

toxic wraith
#

I would probably view it as the proportion of all downloads over a time period that use installers that support it, or something like that

#

A multiview would still double the size of projects pypi.org hosts, and I'm not sure if that's a reasonable ask.

#

On the other hand, while it would double the storage, it would show immediate decreases in bandwidth comparatively if 2.0 wheels could use xz or zstd

lucid pike
#

Yeah. It might be a wash. The storage doubling problem might only effectively be a problem for packages that are already bumping up against project size limits (i.e. doubtful pure-python packages would have much of an impact).

In any case, I think installers generally should start communicating wheel version compatibility to PyPI now, just in case it's super handy later (if they don't already).

gray spindle
#

someone somewhere suggested tying the transition phase to python version. this has some nice properties... you expect to have to update things for new python anyway, no one's existing setup will be broken, you don't double storage, most of the projects that would benefit from a wheel2 will probably be uploading new wheels for new python anyway. especially if there are stdlib changes

toxic wraith
#

Do you mean that wheel 2.0 wheels would only be installable on newer Python versions? I don't particularly like that, as I think it would lengthen the adoption window, and be confusing for users

#

Some people will want to adopt the new wheel format right away, and by tying it to a Python release it would mean they'd have to wait however long until the next release is

gray spindle
#

as a starting place, i meant that build tools default to wheel1 for wheels that target old python and wheel2 for wheels that only target new python versions

but i think e.g. if we use zstd and include zstd in stdlib, wheel2 being only installable on newer py versions might well make sense, otherwise to my naive mind bootstrapping pure python installers might be slow or fiddly

i also think it's good for the adoption window to be legible. most folks know what pythons they intend to support, but i don't think many folks have a sense of what installers their users use

toxic wraith
#

So part of my goals with wheel 2.0 is to not need everything changed all at once. I'd like zstd support to be a follow up PEP that won't require a new major version

icy trench
toxic wraith
#

Yeah, I have been mulling that over more. I want to think through it a bit more and weigh the pros and cons.

icy trench
#

I have an email thread somewhere with @glossy cradle and @cedar iris about exactly this... that I meant to respond to... but then it got awkwardly delayed... and now it'd be weird to respond there. 🙈

cedar iris
#

Just spotted this thread (thanks for the ping @icy trench!) I find tying this to Python version very weird, and I've not been able to think through how it would work cleanly. I feel like it might create more problems than it solves.

cedar iris
icy trench
#

It's a saner request to be like "gimme the first 1000 bytes" or so, rather than "the last 1000 bytes" but I also don't know how the zip format works. 😅

toxic wraith
#

The wheel spec currently recommends putting .dist-info at the back, since zip is designed to make mutation at the end easier

#

(basically you can add a new file/directory record and re-write the listing and be done)

#

mutating the metadata of a wheel seems like a rarish thing to do, so I feel uneasy making this recommendation going forward, but perhaps someone has more context on why it is this way

#

Also @icy trench the way zip works, you actually cannot just read the first 1000 bytes, because technically (though very unlikely) a zip file could list an entry as deleted, so unless you check the central directory record, you just don't know if what you're reading is right or not.

#

zip is a very odd file format

icy trench
#

That's good to know!

cedar iris
#

The fact that the zip format is rooted at the end rather than the beginning is what makes it possible to append an executable at the start (a zipapp script wrapper, or a self-extractor, or whatever). The format is weird, but in a useful way.
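
To make "rooted at the end" concrete, here is a small sketch that locates the end-of-central-directory record by searching backwards, even with extra bytes prepended. It is a simplification: it ignores zip64 and assumes the signature does not also appear in a trailing archive comment. `read_eocd` is an illustrative helper.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FORMAT = "<4s4H2LH"  # the fixed 22-byte end-of-central-directory record


def read_eocd(data: bytes) -> dict:
    """Find the last EOCD record and return a few of its fields."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    fields = struct.unpack_from(EOCD_FORMAT, data, pos)
    _sig, _disk, _cd_disk, _entries_here, total, cd_size, _cd_offset, _comment = fields
    return {"entries": total, "central_directory_size": cd_size}


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("demo/__init__.py", "")
    zf.writestr("demo-1.0.dist-info/METADATA", "Name: demo\n")

# Prepend a script-style header, as a zipapp wrapper would.
prefixed = b"#!/usr/bin/env python3\n" + buf.getvalue()
print(read_eocd(prefixed))
with zipfile.ZipFile(io.BytesIO(prefixed)) as zf:
    print(zf.namelist())
```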

toxic wraith
#

Oh absolutely!

frank hound
#

Is using JSON for metadata on the table? (Spent quite a while fighting EmailMessage in pyproject-metadata).

toxic wraith
#

I think it is unlikely because I would like the metadata format to be backwards compatible

#

There is a lot of tooling that assumes EmailMessage and I don't really want to boil the ocean on that

frank hound
#

The current metadata format is soo bad, though. Things like multilines and unicode are a pain to have to hack around for each tool. You'll notice most backends can't even use EmailMessage to write it, only to read it.

#

Also, is negative extras also being discussed? There was a recent issue on packaging-problems, but don't think it got into the wheel-next repo.

toxic wraith
#

Currently wheel-next is as far as I am aware focused on the wheel format and metadata changes needed for changing the format to better support various improvements to the wheel format. I think negative extras could be its own standalone PEP regardless of the wheel format, right?

frank hound
#

I think negative extras require new metadata or changed metadata, which seems like it might make sense for wheel-next?

toxic wraith
#

I mean changes to the metadata are somewhat orthogonal to the wheel format aren't they? They also touch sdists

#

(sorry early send)

#

The wheel spec only says "{distribution}-{version}.dist-info/METADATA is Metadata version 1.1 or greater format metadata."

frank hound
#

Ah, true.

toxic wraith
#

I could see an argument that a new file format makes it easier to make the transition to a new metadata format, but I also don't really want to sign up to write another PEP at this time, I already have 4 or so stewing in various states 🙂

frank hound
lucid pike
#

You could probably have a multiyear transition period from RFC 2822-ish format to JSON where both were included in the wheel. And/or automatic-ish conversion between the formats. But yeah, that's also not a PEP I'm going to write!

hard escarp
#

Metadata 2.1 defined a canonical translation from the key:value header format to JSON (after metadata 2.0's failed attempt at going JSON-only): https://peps.python.org/pep-0566/#json-compatible-metadata

So a new wheel PEP could just reference back to that and use JSON internally for the metadata files. However, any new metadata.json file would need to be shipped alongside the existing METADATA key:value file, otherwise you would lose the "just unpack the .dist-info folder" compatibility with the installed package database spec.
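
A minimal sketch of that translation, following the rules in the linked PEP 566 section as I understand them: lower-case keys with hyphens turned into underscores, multiple-use fields collected into lists, Keywords split on commas, and the message body stored as the description. `MULTIPLE_USE` is a partial, hand-picked list for illustration, and `metadata_to_json_dict` is a made-up name.

```python
import json
from email.parser import HeaderParser

# Assumption: a partial list of the fields marked "(Multiple use)" in the spec.
MULTIPLE_USE = {
    "classifier", "requires_dist", "project_url", "provides_extra",
    "platform", "dynamic", "license_file",
}


def metadata_to_json_dict(text: str) -> dict:
    """Convert key:value METADATA text into a JSON-compatible dict."""
    msg = HeaderParser().parsestr(text)
    result = {}
    for raw_key, value in msg.items():
        key = raw_key.lower().replace("-", "_")
        if key in MULTIPLE_USE:
            result.setdefault(key, []).append(value)
        elif key == "keywords":
            result[key] = value.split(",")
        else:
            result[key] = value
    body = msg.get_payload()
    if body:
        result["description"] = body
    return result


sample = (
    "Metadata-Version: 2.1\n"
    "Name: demo\n"
    "Version: 1.0\n"
    "Keywords: packaging,wheel\n"
    "Classifier: Programming Language :: Python :: 3\n"
    "Classifier: Topic :: Software Development\n"
    "\n"
    "Long description goes here.\n"
)
print(json.dumps(metadata_to_json_dict(sample), indent=2))
```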

#

Some PEP is going to have to be the first that says "Ship both files, make sure they say the same thing" if we're ever going to be able to ditch the legacy key:value format.

#

The wheel 2.0 PEP could also recommend that installers generate metadata.json from METADATA for the wheel 1.x format.

Then some future wheel 3.0 PEP could ditch METADATA entirely.

frank hound
#

Hmm, so would it make sense for #pyproject-metadata to produce both formats? (that is, at least provide the tools to do so easily?)

#

I see there doesn't seem to be a standard name for it, just a way to convert it

short epoch
#

any reason to ditch METADATA in favour of json? not sure what the gains would be

toxic wraith
#

@short epoch #wheelnext message

I think the biggest advantage is having explicitly terminated strings.

Plus it would be better for cross language tooling. If a build tool is in another language, it may not have an email parser, but it almost certainly will have a json parser.

frank hound
#

And the email parser module in Python is a mess. Riddled with TODOs, strange inheritance, etc. And the parts we use don't officially support unicode, that has to be hacked in.

#

The format is basically "RFC 822, but with unicode and indented multiline strings" - when you modify a standard, it's no longer a standard 🙂

lucid pike
lucid pike
nova knoll
#

Just take a break real quick and go create email-next
Once we've got that wrapped up we won't have any more issues here!

toxic wraith
#

oh that should take no time at all, I have the IETF on speed dial!

tawdry raven
#

You already have the JSON metadata spec as a base. Just need to extend it to email messages.

lucid pike
hard escarp
hard escarp
glossy cradle
#

project urls should be a dict 😦 I think that's the main thing I dislike

toxic wraith
#

I spent most of today re-writing a draft of a wheel evolution PEP, based on the idea of using a different file extension for wheel 2.0 (and onwards). Now that I am done and went through the rejected ideas I think I am starting to be of the opinion it would be better to use *.whl.

#

My main qualm is that unexpectedly getting an sdist is probably quite bad

hard escarp
# glossy cradle project urls should be a dict 😦 I think that's the main thing I dislike

True, we added a special case for the comma-separated list in Keywords, but missed the comma-separated key,value pair in Project-URL. Maybe we could add an optional parallel project-urls field to the JSON, and then eventually deprecate the oddly formatted project-url list? A similar trick could be used to migrate from maintainer-email and author-email JSON strings to maintainer-emails and author-emails JSON lists (those are also comma-separated lists, but they didn't get special cased the way keywords did).
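
A sketch of what producing those parallel fields could look like. The helper names are illustrative, and this is not an agreed-upon format: it only shows that both conversions are mechanical. `email.utils.getaddresses` is used for the address fields so that display names are handled by a real parser rather than a bare comma split.

```python
from email.utils import formataddr, getaddresses


def project_urls_to_dict(entries):
    """Turn 'Label, https://...' strings into a {label: url} mapping."""
    urls = {}
    for entry in entries:
        label, _, url = entry.partition(",")
        urls[label.strip()] = url.strip()
    return urls


def split_addresses(value):
    """Turn a comma-separated address header into a list of addresses."""
    return [formataddr(pair) for pair in getaddresses([value])]


print(project_urls_to_dict([
    "Homepage, https://example.org",
    "Source, https://example.org/src",
]))
print(split_addresses("Alice <alice@example.org>, Bob <bob@example.org>"))
```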

lucid pike
#

I like the general move to JSON for these kinds of largely inter-machine formats. TOML is great for humans, JSON terrible for humans, but JSON is fine for machine-to-machine.

tardy crown
short epoch
#

(while we are at that, we could consider moving all* meta files in wheels to JSON)

gray spindle
#

there are some confidence inspiring words in the spec 😉

However, email formats have been revised several times, and exactly which email RFC applies to packaging metadata is not specified. In the absence of a precise definition, the practical standard is set by what the standard library email.parser module can parse using the compat32 policy.

frank hound
bronze bear
#

from trying to figure out the correct format of METADATA files, i'd strongly favor switching from the email-ish format to json in a next wheel iteration

hard escarp
# short epoch (while we are at that, we could consider moving all* meta files in wheels to JSO...

RECORD is a well-defined CSV format, and INSTALLER is a single line of text, so they wouldn't gain much from a migration to JSON.

Similarly, while entry_points.txt and EXTERNALLY-MANAGED aren't as well defined as a JSON document would be, ini-files aren't as dire a mess as email header parsing.

That leaves METADATA as really the only file where the status quo is so bad that a migration effort might actually be worth it. (We're already trying to contain the file format problem to these 5 files by making newly standardised files JSON)

bronze bear
#

from my perspective, we'd ideally have an archive of a single json file with all metadata and a directory with the module that is being installed.

#

both from the perspective of a wheel writer and a wheel installer, we only need the information in METADATA plus maybe a wheel version

glossy cradle
#

I don't think we'd want a single json file, the different files serve different purposes and having them split out is useful I think

#

INSTALLER for instance doesn't come from the wheel, it comes from the installer, so a single json would mean that the installer has to mutate the file that comes from the archive

short epoch
bronze bear
toxic wraith
tawdry raven
#

I'm pretty sure that's possible. You just put the entire thing in quotes and double every inner quote.
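
The quoting rule described here (wrap the field in quotes, double every inner quote) is what Python's `csv` module does by default. A small round-trip sketch with a made-up RECORD-style row; the hash value is a placeholder, not a real digest.

```python
import csv
import io

# A path containing both a comma and double quotes.
rows = [['weird, "quoted" name.py', "sha256=placeholder", "42"]]

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
encoded = buf.getvalue()
print(encoded, end="")
# "weird, ""quoted"" name.py",sha256=placeholder,42

# Reading it back recovers the original fields unchanged.
decoded = list(csv.reader(io.StringIO(encoded)))
print(decoded == rows)
```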

toxic wraith
#

Oh interesting! Maybe I don't understand the CSV parser implementation well enough, I didn't realize it supported quote character escaping.

hard escarp
#

The multiple files are also there so checks for optional metadata can be done with a single stat call.

That said, entry points are potentially a decent candidate for merging into the main metadata file.

toxic wraith
#

Agreed, that may be something I add to the wheel 2 format PEP

lucid pike
#
frank hound
toxic wraith
#

You'll be happy to know that I hope to publish the wheel evolution planning PEP later this week 🙂

bronze bear
frank hound
bronze bear
#

there seem to be different indent lengths and i haven't found a reference which one is the correct one; i saw the two versions in https://github.com/pypa/pyproject-metadata/pull/150/files#diff-7d938dbc255a08c2cfab1b4f1f8d1f6519c9312dd0a39d7793fa778474f1fbd1L135-R141 and another tool (hatchling or poetry i think?) had a third style; i went with pyproject-metadata out of pragmatism but i'd prefer something specified with common library support (e.g. a serde plugin); serde or toml would be the default choices here

#

i haven't looked into attestations at all yet - do you want to file a feature request with some context on motivations and the current state of pypi support?

#

that is PEP 740, right?

frank hound
#

In the current 0.9.0 betas for pyproject-metadata, it indents to the width of the field name, based on setuptools. I think any indent causes it to be fine for the parser, and the parser keeps the indentation, so the consumer has to look for common indentation and remove it. But I'm now wondering if there's some escaping mechanism that I'm missing (hatchling seems to miss it too, if so).
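
A small sketch of the behaviour described here, using the standard library parser with its default `compat32` policy. The field and its contents are made up; the dedent step is just one way a consumer might strip the common indentation.

```python
import textwrap
from email.parser import Parser

raw = (
    "Name: demo\n"
    "License: first line\n"
    "        second line\n"
    "          indented more\n"
    "\n"
)

# The parser accepts any continuation indent and keeps it in the value.
value = Parser().parsestr(raw)["License"]
print(repr(value))

# The consumer strips the indentation shared by the continuation lines.
first, _, rest = value.partition("\n")
cleaned = first + "\n" + textwrap.dedent(rest)
print(repr(cleaned))
```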

tardy crown
frank hound
#

Ahh, good point about PyPI - yes, that happily parses it and adds a ... after some point. So I guess I should expect that test to pass. Great, that means I need actual importlib.metadata.metadata in testing.

toxic wraith
#
prime marsh
#

@toxic wraith just want to say thanks for putting up the proposal. I know the conversation is a bit derailed and frustrating, but I'm appreciative of the work you've put in.

toxic wraith
#

Thank you! I really appreciate the kind words.

tardy crown
#

And a reminder you don't have to feel obligated to reply to everyone

toxic wraith
#

True 🙂

sick nest
toxic wraith
lost nest
#

Reading a single file from a zip is quite cheap, you usually only need two read calls (one for the directory at the end, one for the actual file). Downloading the entire zip file is not necessary if the web server supports Range requests. I wrote a wheel inspector once that would extract metadata from wheels from pypi without fully downloading them. If there is version info embedded as a file-tag (an empty file with a special name) then that would be just one read call.
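
A local sketch of why this is cheap. Instead of real HTTP Range requests, an in-memory file object counts how many bytes `zipfile` actually reads when pulling one small member out of a large archive. `CountingBytesIO` is an illustrative stand-in for a remote file, and the exact byte count may vary between Python versions.

```python
import io
import os
import zipfile


class CountingBytesIO(io.BytesIO):
    """In-memory stand-in for a remote file; counts bytes actually read."""

    bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


# Build an archive with one large member and one small metadata file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("demo/blob.bin", os.urandom(1_000_000))
    zf.writestr("demo-1.0.dist-info/METADATA", "Name: demo\nVersion: 1.0\n")

remote = CountingBytesIO(buf.getvalue())
with zipfile.ZipFile(remote) as zf:
    metadata = zf.read("demo-1.0.dist-info/METADATA")

total = len(buf.getvalue())
print(metadata.decode(), end="")
print(f"read {remote.bytes_read} of {total} bytes")
```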

toxic wraith
#

Yeah, both pip and uv (and I'm pretty sure poetry too) use range requests if the metadata isn't available via PEP 658

short epoch
#

yeah, Poetry does range requests

frank hound
#

Is there a discussion thread for PEP 771? I was just going to ask if an implementation in pyproject-metadata would be useful, I could probably put that together pretty quickly.

fathom lintel
toxic wraith
#
stone lodge
#

@restive vessel - this is the Debian/Ubuntu spec for multiarch: https://wiki.ubuntu.com/MultiarchSpec
see specifically the "Binary package control fields" and "Extended semantics of per-architecture package relationships" sections. The problem space they have is a little more complicated than what wheelnext is currently pondering, in that they need to support simultaneously installing both (e.g.) i386 and x86-64 libraries. (It may be the case that this is worth pondering for Python, actually.) So there are two axes, whether a package is designed to not conflict with other architectures of itself (no overlapping files) and they can be co-installed, and whether a package can satisfy a dependency from another package of a different architecture. That second axis can be expressed as "this is always true" or "this is true if the depending package opts in to it being okay".

They only have one dimension of variant, though.

(and you might find https://wiki.debian.org/Multiarch/TheCaseForMultiarch interestingly familiar 🙂 )

short epoch
toxic wraith
#

it was renamed to wheelnext (all one word)

toxic wraith
frail portal
#

@pseudo galleon here, here

sturdy imp
#

Hello everyone, I just joined this channel by introduction from Donghee!

#

Since I'm delivering Python-based product to many air-gapped on-premise enterprise setups, the topics covered by the wheelnext community intrigued me!

toxic wraith
#

Welcome! We'd love to hear more about your pain points and which current topics are of interest

sturdy imp
#

there are many; i will summarize them later!

#

BTW, I'm currently in San Jose to attend NVIDIA GTC, and also it is possible to attend the summit on this Friday morning as announced in the website. It would be nice to get to know how things are going and share my interests if it's ok.

#

Could somebody let me know how to attend the summit? (possibly with one of my colleagues)

restive vessel
#

@sturdy imp I'm contacting you in private.
I can't guarantee it but I'm gonna do my best 😉

thick basalt
#

Long time listener, first time caller. Great sessions today, thanks to the organizers for organizing 👌

restive vessel
#

@thick basalt it was a pleasure having you !

river lynx
#

Howdy everyone! It's Eli from the wheelnext summit last week!

toxic wraith
#

Hi Eli!

midnight marsh
#

is there a plan to produce a report of the summit?

wise matrix
#

Hey everyone! It's Vyas from the summit. Nice to see everyone here.

restive vessel
#

Hi

Sorry I forgot to check Discord @midnight marsh
Yes it's in process

formal inlet
#

I didn't realize there was a summit 😅

#

was it a one time event, or is it a regular meeting?

white jacinth
#

Hello everyone! I just noticed this page https://wheelnext.dev/who_are_we/

I wasn't able to attend the summit unfortunately but am quite interested in the effort nonetheless, so you can include Hatchling to that list and I plan to support whatever we do (not sure how that page gets updated)

restive vessel
#

@white jacinth that will be with pleasure. Can you confirm you received the meeting invite for next week ?

#

Super interesting to get your feedback on the work we are doing

white jacinth
restive vessel
#

Fantastic !

knotty elbow
#

(on behalf of @restive vessel as their message got zapped by our filters)

@formal inlet @midnight marsh @thick basalt @sturdy imp @frail portal @short epoch @stone lodge @fathom lintel @frank hound @lost nest @sick nest @tardy crown @prime marsh @glossy cradle @hard escarp @nova knoll @icy trench @scenic pulsar @junior tundra @quasi robin @river linden @tawdry raven @glossy cradle

~ Sorry for the mass tagging - only once I promise ! ~

Did you all receive the monthly meeting invite by email ?

If not:

stone lodge
#

I did and I plan to attend!

formal inlet
#

thanks for the ping! subscribed now and got the invite from the archive

frank hound
#

I can see it in the Archives. Now subscribed. Where's the date/time?

sturdy imp
#

I just registered and subscribed to the mailman list!

formal inlet
sturdy imp
#

In the archive message, you could find the ics attachment file.

frank hound
#

Ahh, I see it, thanks!

formal inlet
#

probably easier that way since we don't have to mess with timezones 😅

frank hound
#

I have a 50% collision with another meeting if it's monthly on this day and time (including the one next week) but I might be able to make it sometimes and for special cases (like first meeting)

restive vessel
frank hound
#

I have a biweekly Wednesday meeting 10:30-12, this is at 11:00 for me, FYI.

restive vessel
#

We could move it to your "offweek" like one week forward or backward if that helps

#

Like "first Wednesday of every month"

frank hound
#

First Wednesday of every month would collide 50% of the time, approximately.

junior tundra
#

I haven't received an email (I'm not on the announce list) but I don't think I really followed what's going on with wheelnext (or contributed to the discussion) for there to be value in me attending 😄 Mostly just posting to let you know I got the ping :)

lucid pike
restive vessel
restive vessel
tardy crown
toxic wraith
#

We should send out the meeting invites through the mailing list, so no more work should be needed on your end. It's a loss to the community to not have you working on packaging as much :(

white jacinth
#

agreed, but Brett's new work on AI tooling/MCP for VS Code sounds equally as awesome 😄

tardy crown
tardy crown
# white jacinth agreed, but Brett's new work on AI tooling/MCP for VS Code sounds equally as awe...

For anyone wondering what Ofek is referring to because I haven't talked publicly about it yet, my work time has shifted to 70% tools for AI agents to use (i.e. like what all the new-fangled MCP servers do), 10% WASI, and 20% whatever open source I want (which will be motivated by the Python Launcher for Unix so that when my kid is old enough to use Python they don't come home from school one day and say, "Dad, why is Python code so hard to get running?" 😅)

scenic pulsar
#

I won't make the call this time: a mix of migraine and having the kids

formal inlet
#

I am running a bit late

fathom lintel
#

@restive vessel hi, I'm waiting to be let in

wise matrix
#

For anyone interested in continuing the shared library loading discussion from the meeting, please comment here! I also note that @formal inlet already has a #dynamic-library channel, so it may eventually make sense to coalesce there (but for now, starting here since this is where everyone coming from our meeting will see)

fathom lintel
#

Is the meeting over? Shall I stop waiting to join? 🙃

thick basalt
#

they're doing Q&A I think

#

I had to drop off but I wanted to know how we'll handle the overlap between variants and compatibility tags. Are we planning to have build backends create variants for things like python versions or abi3, etc?

toxic wraith
fathom lintel
#

I had tried that too, will try again!

#

now it worked, thanks!

toxic wraith
#

Absolutely! Sorry about the confusion over which meeting it is!

fathom lintel
#

np! I read "The original event is the correct one" on the cancellation so tried the first email 🙂

restive crown
#

I also got confused and joined the wrong meeting 😛

wind cosmos
#

@restive vessel -- Is it useful for me to update our implementation in uv to match the latest changes?

#

Per my question on the call, I think it does make sense to use a variants.json per version

#

Otherwise, wouldn't we need to query every provider used in any version, in perpetuity?

#

The downside is that we need an additional registry query for every package-version that we inspect

formal inlet
#

it works on the majority of systems, but it isn't inherently portable, so it's not something I'd consider adopting in a PEP

#

implementing it properly would require changes to the Python upstream, and even then, it's quite tricky

#

either that or modifying package binaries on installation

#

or sacrificing per-package dynamic dependencies, in favor of global dependencies

#

none of which is great

restive vessel
#

@wind cosmos as a highlevel summary of the workflow to integrate variants in an installer:

In return you get an ordered list of hashes (str) and you can do whatever you want with it and install the wheel that you want (respecting or not the order that variantlib provided you).

Not every hash might be available on platform X (for example, no CUDA variant on macOS), so you might have to go down the "hash priority list" a few steps to find a compatible variant that is actually on the index

wind cosmos
#

Makes sense, thanks

#

(Just confirming that we’ll likely re-implement all the variantlib logic in Rust)

restive vessel
#

@wind cosmos until the design stabilizes I recommend not doing that though. The ground will be moving under your feet for quite a while still.
We purposefully designed things so that it's easy for you to do that. There is no "forced inheritance" on variantlib even on the plugin side. However plugins might actually depend on it because it's practical

We need to find out how the plugin system could work for you; we heavily used the concept of entrypoints. Can you read them from Rust ?

wind cosmos
#

Is the plugin system coupled to entrypoints?

restive vessel
#

Yes, that's how we do "auto detection"; we didn't find any other way

wind cosmos
#

I thought the intent was to move closer to something like PEP 517

#

"Auto detection" of what? Installed providers?

restive vessel
#

Yes

wind cosmos
#

Why is that necessary, though?

#

Aren't the variant providers declared in variants.json?

restive vessel
#

We might actually be able to relax this - but we still need auto detection to build variants

wind cosmos
#

I think auto-detecting state from an existing environment is a design anti-pattern, personally

#

Can you explain why it's necessary for building?

restive vessel
#
$ variantlib make_variant \
    --file xgboost-3.1.0-py3-none-win_amd64.whl \
    --property "nvidia :: driver :: 12" \
    --property "nvidia :: arch :: sm90" \
    --property "x86_64 :: version :: 4" \
    --output-directory .

Variant Created: xgboost-3.1.0-py3-none-win_amd64-a0e2749e.whl

We don't specify plugins (we could) - variantlib looks on the system for which plugins declare the nvidia or x86_64 namespace, auto-validates the properties, and injects the dependency on the plugin into variants.json/METADATA
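To make the `namespace :: feature :: value` notation from the command above concrete, here is a minimal parsing sketch. `VariantProperty`, `parse_property`, and `namespaces` are hypothetical names for illustration; this is not variantlib's own parser or data model.

```python
from typing import Iterable, List, NamedTuple


class VariantProperty(NamedTuple):
    namespace: str
    feature: str
    value: str


def parse_property(text: str) -> VariantProperty:
    """Split 'namespace :: feature :: value' into its three parts."""
    parts = [part.strip() for part in text.split("::")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'namespace :: feature :: value', got {text!r}")
    return VariantProperty(*parts)


def namespaces(properties: Iterable[str]) -> List[str]:
    """Return the distinct namespaces, i.e. the providers that must be consulted."""
    return sorted({parse_property(p).namespace for p in properties})


requested = [
    "nvidia :: driver :: 12",
    "nvidia :: arch :: sm90",
    "x86_64 :: version :: 4",
]
print(namespaces(requested))
```

The set of namespaces is what a tool would need to map to provider plugins, whether that mapping is discovered from the environment or declared explicitly.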

#

we could add --plugin <package_name>

#

But then similarly

$ variantlib analyze-platform

variantlib.loader - INFO - Discovering Wheel Variant plugins...
variantlib.loader - INFO - Loading plugin from entry point: x86_64
variantlib.loader - INFO - Loading plugin from entry point: nvidia_variant_provider

#################### Provider Config: `nvidia` ####################
    - Variant Config [001]: driver :: ['12.8', '12.7', ... '12.0', '12']
###################################################################

#################### Provider Config: `x86_64` ####################
    - Variant Config [001]: level :: ['v4', 'v3', 'v2', 'v1']
###################################################################

or

$ variantlib get-supported-configs

variantlib.loader - INFO - Discovering Wheel Variant plugins...
variantlib.loader - INFO - Loading plugin from entry point: x86_64
variantlib.loader - INFO - Loading plugin from entry point: nvidia_variant_provider

nvidia :: driver :: 12.8
...
nvidia :: driver :: 12.0
nvidia :: driver :: 12
x86_64 :: level :: v4
x86_64 :: level :: v3
x86_64 :: level :: v2
x86_64 :: level :: v1

Would cease to work (not really critical they are more helpers)

wind cosmos
#

From an installer perspective, I'll just repeat that I really think the design should be declarative: the user or the package declares the providers it needs, and tools handle the rest. Inferring from the existing environment makes everything stateful and also couples the provider environment to the target environment unnecessarily (unlike PEP 517, where the build backend and build dependencies can be installed in an isolated environment).

#

Honestly I'd suggest the same for (e.g.) variantlib make_variant. I think I'd expect the caller to specify the providers.

restive vessel
#

We might actually be able to do that. In the original design (up to 3 weeks ago) it wasn't possible.
I believe we might have the tools to actually make it happen

#

And - as a bonus - it would probably fix our namespace clash issue as we move to default-auto-variant mode

#

Let's touch base on this. There's a very good chance that could be doable

wind cosmos
#

Ideally (IMO), in that invocation, the user specifies the providers, and variantlib creates an ephemeral environment, installs them, and queries them. Then the whole flow is stateless and the user doesn't have to think about creating the env, installing things, etc.

restive vessel
#

I quite agree with you

#

How about caching

#

It's quite expensive to install X package at every install

#

Even beyond that - some plugins might take 1-2 seconds to execute - so you really want to cache their output

#

Now caching is "sensitive" because when do you invalidate it ? Maybe when you restart your computer ? We can assume you're unlikely to hot-swap a CPU or GPU 😄 Most driver changes will require a restart. And it would need to be a cache-to-disk because it could be reused from multiple uv sync commands

#

Give me a few weeks, maybe a few weeks after PyCon, to play with this. It might actually be easier than we originally thought

#

I think we took very good steps in this direction

wind cosmos
#

Yeah I'd probably cache the variant provider outputs and we'd support invalidating it on the uv CLI
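A rough sketch of the caching idea discussed here, assuming an on-disk JSON cache keyed by provider name and guarded by an invalidation token. The function name, the cache layout, and the token values (for instance a boot identifier, or a value reset by a CLI flag) are all hypothetical; neither uv nor variantlib is claimed to work this way.

```python
import json
import tempfile
from pathlib import Path


def cached_provider_output(cache_dir, provider, token, compute):
    """Return the provider's output from disk unless the invalidation token changed."""
    path = Path(cache_dir) / f"{provider}.json"
    if path.exists():
        entry = json.loads(path.read_text())
        if entry.get("token") == token:
            return entry["output"]  # cache hit
    output = compute()  # slow path: actually run the provider
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"token": token, "output": output}))
    return output


calls = []


def slow_probe():
    """Stand-in for a provider that takes 1-2 seconds to inspect the machine."""
    calls.append(1)
    return ["nvidia :: driver :: 12.8", "nvidia :: driver :: 12"]


with tempfile.TemporaryDirectory() as tmp:
    first = cached_provider_output(tmp, "nvidia", "boot-1", slow_probe)
    second = cached_provider_output(tmp, "nvidia", "boot-1", slow_probe)  # reused
    third = cached_provider_output(tmp, "nvidia", "boot-2", slow_probe)  # recomputed

print(len(calls))
```

Writing to disk is what lets several separate installer invocations share one probe; changing the token is the explicit invalidation hook.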

#

(Separately, I also think that following the PEP 517 design is likely a good path for optimizing PEP acceptance)

#

(It's well-proven)

restive vessel
#

Indeed there seem to be some commonalities in terms of requirements

wind cosmos
restive vessel
#

For now we assume that variant plugins are already installed on the machine.
Variantlib loads them for you and invokes them

#

The function I pointed above is really the main entrypoint to the logic

#

For a first implementation you really "just need" to pass the variants.json you downloaded as a dict to the function and it returns you everything sorted by priority.

Then you look on the index which ones actually exist/don't and you stop at the first available on the index 👌

#

We are trying as we speak to change some assumptions to allow auto installation.
Thanks for the idea of following PEP 517 design 👌

That's why I was saying don't reimplement variantlib just yet. The design is far from stable.

The only parts I would consider stable are:

  • data model
  • resolver
restive vessel
#

@wind cosmos do you mind glancing at: https://github.com/wheelnext/pep_xxx_wheel_variants/issues/35
I think this new design will allow us to do everything you mentioned and talked about

Took us quite a bit of time to find a "functional recipe" - we believe that this design would check absolutely all the boxes.
And from the pyproject.toml we stayed as close as possible to PEP 517 - it was very good advice

wind cosmos
#

On first glance this looks pretty good

restive vessel
#

I think this is a major step forward in the design

#

Your comments were very useful actually 🙂 Thanks for that

tardy crown
#

FYI https://github.com/wheelnext doesn't have a direct link to the website (had to go to the website repo to find the URL)

GitHub

wheelnext has 41 repositories available. Follow their code on GitHub.

toxic wraith
#

Perhaps not clear enough that it is a link but this leads to the website

tardy crown
toxic wraith
#

Understandable 😆

fathom lintel
toxic wraith
#

Yes, I'll do that now

#

(done)

wind cosmos
#

@restive vessel -- If I want to update the uv prototype based on the latest changes, what's the best resource to look at?

restive vessel
#

Do you want to reimplement variantlib? Or for now you're okay to use it?

wind cosmos
#

I honestly think it may be easier to reimplement it, but for sake of explanation, let's assume I'll use it for now!

restive vessel
#

https://github.com/wheelnext/variantlib/blob/main/variantlib/api.py#L42

this function is virtually the "only entrypoint" you need.

  1. you resolve the version you would normally install for package X
  2. you check if {package}-{version}-variants.json exists on the index

if no => package not variant enabled proceed as usual

if yes => download it and pass the entire content as a dict to this function.

The function will do everything and return you a "sorted list" of compatible hashes.

You go one by one to see if you can find it on the index. You install the first you find. If you can't find any, install the non variant
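The steps above can be sketched as follows. Everything here is a stand-in: `select_wheel`, the dictionary used as an "index", and `fake_sorter` are hypothetical, and the real sorting function lives in variantlib (its signature is not reproduced here). Only the filenames follow the patterns quoted in this conversation.

```python
from typing import Callable, Mapping, Optional, Sequence


def select_wheel(
    stem: str,                    # e.g. "dummy_project-1.0.0-py2.py3-none-any"
    manifest_name: str,           # "{package}-{version}-variants.json"
    index: Mapping[str, object],  # filename -> payload, stands in for the index
    sort_hashes: Callable[[dict], Sequence[str]],
) -> Optional[str]:
    """Pick the first variant wheel that exists, else the non-variant wheel."""
    variants_json = index.get(manifest_name)
    if isinstance(variants_json, dict):  # package is variant-enabled
        for variant_hash in sort_hashes(variants_json):
            candidate = f"{stem}-{variant_hash}.whl"
            if candidate in index:
                return candidate
    fallback = f"{stem}.whl"  # no manifest, or no variant present on the index
    return fallback if fallback in index else None


def fake_sorter(variants_json: dict) -> Sequence[str]:
    """Hypothetical replacement for the variantlib call: hashes, best first."""
    return ["deadbeef", "36028aca"]


STEM = "dummy_project-1.0.0-py2.py3-none-any"
MANIFEST = "dummy_project-1.0.0-variants.json"
index = {
    MANIFEST: {"placeholder": True},  # real content is defined by the proposal
    f"{STEM}-36028aca.whl": b"",
    f"{STEM}.whl": b"",
}

chosen = select_wheel(STEM, MANIFEST, index, fake_sorter)
print(chosen)
```

The preferred hash `deadbeef` is not on the index, so the loop falls through to the next one, which matches the "go one by one" description.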

GitHub

Contribute to wheelnext/variantlib development by creating an account on GitHub.

#

I would recommend relying on variantlib until the design and the proposal stabilize.
It will really reduce the amount of re-engineering you have to do.

It's not exactly the most fun to re-engineer something over and over 😅

However we think that interface is reasonably stable now.

wind cosmos
#

Got it, thanks!

#

And where can I find an example variants.json file?

#

(That’s effectively post-processed wheel metadata from across the release, right?)

restive vessel
#

Use this index: https://mockhouse.wheelnext.dev/pep-xxx-variants/
You should be able to do uv pip install dummy-package and it should install

Downloading https://mockhouse.wheelnext.dev/pep-xxx-variants/dummy-project/dummy_project-1.0.0-py2.py3-none-any-36028aca.whl (1.3 kB)
Would install dummy-project-1.0.0-36028aca
#

Note: Variantlib can call either uv or pip in the background to install the plugins in an independent virtualenv. However, given that we are using a special index, I suspect I'm gonna need to make some adjustments to read the --index and --default-index from uv.

wind cosmos
#

Variantlib installs plugins? I assumed that was the responsibility of the installer

#

Is that a temporary thing for the prototype or part of the design?

restive vessel
#

We are not entirely sure about this part. If you want I can include a flag to deactivate this part. I think ultimately this will be complicated to store that in variantlib

#

Though assuming you install them yourself in a separate venv. How will variantlib know where to load them from? You're providing a path to the venv?

wind cosmos
#

I'll need to look at the design more closely, but I think it's unlikely that you'll want variantlib to be responsible for installing and managing an environment

#

I was imagining that variantlib was like packaging: a well-isolated library, small enough to be vendored, that implements the standard

#

I don't think it should rely on an installer. It's also really hard to respect user settings, etc., if you're calling uv (for example) from within variantlib.

#

I guess I was assuming that the installer would interact with the plugins, and the variantlib API would be simple enough that you're just passing in data and getting data out without having to interact with any external systems?

#

So the installer would install the plugin in an isolated environment, ask it for the enabled variants, then pass those to variantlib, etc.

#

I'll comment on the PR

restive vessel
#

I'll need to look at the design more closely, but I think it's unlikely that you'll want variantlib to be responsible for installing and managing an environment
I actually agree with you - it was just easier for now.
Good thing is that I actually managed to implement cross-environment loading. So I should be able to allow you to install on your side - and me to load them.

These types of things are not part of the "PEP" per se. For now - let's just get it to work.
In the future we can think about what would be the best integration for installers.

I'm also completely open to the idea of having many entrypoints depending on "how much do you want variantlib to do for you"

#

I think all your assumptions are reasonable - so far we are very much "playing with interfaces" and discovering - as we implement it - what we need and from where.

wind cosmos
#

Sounds good!

restive vessel
#

@wind cosmos I have an interface complete for you to be able to install & control the plugins (without us doing it).
However we don't yet have a good idea of how to load from an isolated venv - so only "current environment" is supported [for now]. We'll think after PyCon about how we can best do isolation (installation is easy, the difficulty is cross-environment execution)

#

I hope that helps

sturdy imp
#

you could think of pex as a simplified interface for managing a dependency tree + hermetic but volatile virtualenvs

sturdy imp
#

the problem here is also cross-environment (and cross-interpreter) execution, too...

#

we should rely on stringified code snippets passed to the subinterpreter API

#

for variantlib, we would need another mechanism if it needs to support pre-3.13 CPython versions where subinterpreters are not available, though

#

though this is an extremely sketchy idea, but I think we could consider "packaging an entire dependency tree and loading it in an isolated environment" for plugin usecases as a new wheelnext topic...?

#

this is what I shared with Chris Gottbrath at the summit on March 21st

thick basalt
#

Isn't that a very similar problem to build time dependencies in general? Maybe we should use the same solution

#

PEX in particular doesn't work on Windows (last time I checked)

restive vessel
#

Isn't that a very similar problem to build time dependencies in general? Maybe we should use the same solution
Yes, and pretty much solved with subprocess.run, which I think is the reasonable approach to use
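A minimal sketch of the `subprocess.run` approach to cross-environment execution: run a snippet with another interpreter and exchange data as JSON on stdout. The snippet contents and the `query_interpreter` name are assumptions for illustration; a real provider query would import the plugin inside the child process, and here the current interpreter stands in for the isolated environment's Python.

```python
import json
import subprocess
import sys

# Code executed inside the *other* environment; it must only print JSON.
SNIPPET = """
import json, platform
print(json.dumps({"machine": platform.machine(), "python": platform.python_version()}))
"""


def query_interpreter(python_executable: str, snippet: str = SNIPPET) -> dict:
    """Run `snippet` with the given interpreter and decode its JSON output."""
    completed = subprocess.run(
        [python_executable, "-c", snippet],
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # keep stdout/stderr out of the parent's streams
        text=True,
    )
    return json.loads(completed.stdout)


# For the demo, the "isolated" interpreter is simply the current one.
result = query_interpreter(sys.executable)
print(result)
```

Passing plain data across the process boundary means the parent never imports plugin code, so the plugin's dependencies cannot clash with the installer's own.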

sturdy imp
#

ah, yes, we could take the same approach like build backends

#

regardless of which venv isolation method we use, i think we also need to take care of managing (cleaning up) cached venvs, either from the user side or by automating it

restive vessel
#

So far I have ~ virtually ~ re-implemented / borrowed from PyPA/Build

#

It's actually working but realllly dirty

#

Sooo ... Let's just pretend it's not 😄

#

@sturdy imp will you be coming to PyCon ? Would be great to catch up !

sturdy imp
#

Yes, I'm coming! @restive vessel

patent vapor
#

Hi everyone! I’d love to contribute to wheelnext and wasn’t sure whether this channel is open to newcomers or not. I totally understand if it's not the case and I'll just lurk and read the discussions (that was my primary goal, get close to where innovation happens) 😁

I see that the recent discussions are about variantlib. I didn't get the chance to read the code in-depth and play with it yet, but from what I've gathered it's a toolkit to pick the most appropriate binary wheel for a given machine, taking into account the GPU backend, provider, CPU specific builds etc. right? It does resonate with me as someone who struggles continuously with GPU-aware packages (working as a machine learning engineer).

I was wondering where an extra pair of hands would be most useful right now for the project, if it even needs that 🙂 and I totally get that the project is still in its early days (I think it's not even mentioned in the wheelnext.dev proposals) so maybe you don't want a lot of people wandering around it xD

toxic wraith
restive vessel
#

Welcome @patent vapor absolutely delighted to have you onboard. Make sure you check the How to Contribute page and signup to the mailing list: https://wheelnext.dev/how_to_participate/

May I ask you to expand a little on your recent struggles, regarding "It does resonate with me as someone who struggles continuously with GPU-aware packages (working as a machine learning engineer)"? This is very useful for us to understand the problems people are facing.

Also do you wish to contribute in your free time or as part of your role in a company ? We are happy to provide credit for the work & efforts on the page: https://wheelnext.dev/who_are_we/

Will you be at PyCon ? If so let's catch up ! Happy to grab a coffee

patent vapor
# toxic wraith Welcome, we're definitely open to newcomers! The variants proposal https://wheel...

I read the document and the issue #38. I'll go through the rest as well to understand the different perspectives and the current state. I'm unfortunately having trouble reading the DPO discussions, though I'm sure they're a trove of knowledge as well. I saw that you have tutorials as well! I'll try them out and I saw that variantlib is a dependency of the project (I understand now why it's not mentioned in the wheelnext.dev proposal!).

patent vapor
# restive vessel Welcome @patent vapor absolutely delighted to have you onboard. Make sur...

Thank you! I have registered to the mailing list.

My struggles are similar to the user story "A user wants to install a version of PyTorch that is specialized for their GPU architecture."

I work in a consulting company and in general we deploy for clients using their preferred stack on their preferred infrastructure. Sometimes it's not us, the developers of the package or the solution, who deploy it, but other teams. And the issue here is how to ensure they use the appropriate versions of our dependencies. The client might be using a different infrastructure than the one we initially developed for. If it's only PyTorch, then there is a way, a bit clunky in my opinion, to communicate this information through the pyproject.toml. And, I think, this is thanks to PyTorch having an index for their wheels. It's not always the case though, and that's when issues arise because it's harder to communicate that information and enforce it.

The other issue, less common though, is when researching / testing different packages from ML papers. In many cases these packages are not mature and were developed just as a way to showcase the results of the paper or for reproducibility. So we have to manually go through the dependencies and look for which ones are GPU-aware (generally through our own knowledge of the field but that's error prone) and try replacing them with the corresponding versions for whatever we're testing on. Sometimes it works, sometimes not ^^'

I wish to contribute in my free time 🙂 but it's amazing to see such big players, both from open source and companies, contribute to this.

#

And thank you both for the welcome, I really appreciate it ^^

toxic wraith
#

DPO is probably not a good place to start, but also has a lot of past discussions on the topic which are useful to know. I don't think you need to rush to read all of the DPO threads.

patent vapor
#

I'll keep that in mind, thanks! I'll check in regularly on the DPO threads to follow along and avoid having to read 100+ long threads in the future haha

restive vessel
#

Will you be at Pycon?

As Emma said I would not get too deep in DPO. It's a rabbit hole of information.

The absolute most useful thing, and not too time consuming, is to participate in public channels / discussions and test prototypes. Try it yourself on your own projects. Have others test it and provide feedback.

Many of these projects are far from trivial and clear actionable feedback is absolutely critical to forge the best proposal we can. Especially from people who understand and are confronted with the difficulties these proposals try to address

patent vapor
#

Oh sorry I forgot to reply to that. I won't be at PyCon unfortunately :/ Hopefully next year! Hope everyone has a great time!

Thank you both for your advice and for the welcome. I'll test on as many different projects as possible and if I encounter any issues or if I have any quality feedback I'll share it on this channel or on GitHub issues.

restive vessel
languid bolt
#

@restive vessel Just landed in Pittsburgh today. Looking forward to seeing you at PyCon 🙂

restive vessel
#

@languid bolt can you ask a few XGBoost core devs to try the tutorial ?

languid bolt
#

Yes, I will ask Jiaming to try it

restive vessel
#

I changed a few things 😉 Please try again

languid bolt
#

Yes, I will try it again

#

Also I remember you saying you'd organize an engineering sprint at Pycon. Is it going to happen? Let me know the schedule

restive vessel
#

Sunday: talk
Monday/Tuesday Sprint

sturdy imp
#

it seems to report a misleading error message, saying "You can not access plugins outside of a python context", when it fails to import the plugin entrypoint because of a misspelling (NVIDIAPlugin should be NvidiaVariantPlugin, as in the successful command that followed)
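One way such a failure could be reported more precisely is to separate "module not importable" from "attribute not found". The sketch below is hypothetical and is not variantlib's loader; `load_plugin` and the `module:attribute` spec format are assumptions, and the standard-library `json` module stands in for a plugin package.

```python
import importlib


def load_plugin(spec: str):
    """Load 'package.module:Attribute', reporting which half failed."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(f"cannot import plugin module {module_name!r}") from exc
    try:
        return getattr(module, attr)
    except AttributeError:
        names = [name for name in dir(module) if not name.startswith("_")]
        raise ImportError(
            f"module {module_name!r} has no attribute {attr!r}; available: {names}"
        ) from None


# Correct spelling resolves; a mis-cased class name produces a targeted message.
decoder_cls = load_plugin("json:JSONDecoder")
try:
    load_plugin("json:JsonDecoder")
except ImportError as error:
    message = str(error)
print(message)
```

Listing the public names of the module in the error makes a misspelled class name easy to spot.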

#

another strange behavior:

#

maybe "armv8-a" is not recognized as "aarch64 :: version :: 8a"

#

(it's a Linux VM run by OrbStack on a M4 pro laptop)

#

anyway this looks like a plugin-specific issue

restive vessel
#

@sturdy imp thanks for the bug report - I'm aware of the "You can not access plugins outside of a python context" error - still need to identify the issue.

I will look into the rest 🙂

Thanks a lot

sturdy imp
#

on a development node with RTX5080 and AMD Ryzen CPU, the plugin's property detection seems to work as expected

restive vessel
#

@sturdy imp

anyway this looks like a plugin-specific issue
Yes it is - there are a few "quirks" with archspec - we are figuring them out. It's orthogonal to variants but we need to address it. archspec is really lacking on macOS - inside Docker it completely fails to report anything

#

"You can not access plugins outside of a python context"
Re-open the tutorial for this one - I rebuilt variantlib and fixed the issue, and updated the instructions 😉

#

Any success / issue on the install tutorial ? Should be pretty straightforward

patent vapor
#

Hi everyone!

I think there is a very minor issue with the pip submodule of pep_xxx_wheel_variants.

E.g., at this line: https://github.com/wheelnext/pip/blob/27bbc0d370aaf80cc8d0a64c300d8971481bddf8/src/pip/_internal/utils/variant.py#L39

I don't think it's possible to reuse the quote character that delimits an f-string inside its replacement fields on Python versions <= 3.11 (and pep_xxx_wheel_variants requires >= 3.9), since that was only introduced in 3.12 (PEP 701)

At least with Python 3.11 I get SyntaxError: f-string: unmatched '('
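For illustration, the portable spelling and a small probe for the PEP 701 behaviour are sketched below. The `props` dictionary and the `supports_quote_reuse` helper are made up for this example and are not the code at the linked line.

```python
import sys

props = {"namespace": "nvidia", "feature": "driver", "value": "12"}

# Portable on 3.9+: the inner quotes differ from the quotes delimiting the f-string.
label = f"{props['namespace']} :: {props['feature']} :: {props['value']}"


def supports_quote_reuse() -> bool:
    """Return True when the running parser accepts the PEP 701 form."""
    source = 'f"{d["k"]}"'  # same quote character inside the replacement field
    try:
        compile(source, "<pep701-probe>", "eval")
    except SyntaxError:
        return False
    return True


quote_reuse_expected = sys.version_info >= (3, 12)
print(label, supports_quote_reuse())
```

Switching the inner quotes is the smallest change that keeps the line valid across the whole supported range.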

patent vapor
#

I also have a question if you don't mind. I was trying to understand the issue #38 as an entry point to understanding variantlib. What I am not able to fully comprehend is the build isolation issue. Or at least, if the issue still holds.

From what I could gather from the issue, the preferred solution was to use something similar to PEP 517. I thought that I could try my hand at this, just as an exercise, and tried to figure out the parts of variantlib responsible for this. And it seems to me like everything is fully implemented.

Running the tutorials with build isolation (e.g., isolated=True in AutoPythonEnv) while removing the raise NotImplementedError from IsolatedPythonEnvMixin and IsolatedPythonInstallerEnv seems to be working correctly, at least on Linux x86_64 with pip as env backend.

I was wondering what I was missing in my understanding 🤔 or maybe the current code for build isolation was implemented right after the issue?

Thank you in advance!

PS: I mean by tutorials, the numpy, pytorch and xgboost tutorials.

sturdy imp
restive vessel
#

Don't worry you don't need

#

You're building a GPU variant right?

#

Build your wheel the normal way. Add this to the pyproject.toml

#

Once done transform the wheel as a variant with this command

#

This is literally how we do it for torch @river lynx 👌

#

Changing the build backend is a complicated endeavor. That's why we created this tool. It changes nothing to how you build your wheel. You just "make it a variant afterwards"

sturdy imp
#

ahha! 💡

#

but i've already almost done it...

#

and i also wanted to retire setup.py and setup.cfg anyway 🥲

#

btw, it will make the job easier a lot!

icy trench
#

They're both separate steps effectively. You can do either change, in either order that you want.

sturdy imp
#

almost there, but somehow the plugin namespace is not recognized... hmm

#

anyway, it's time for the packaging summit!

merry rapids
#

looking through the variant demos, no hatchling support yet? I might be able to hack something together though, it doesn't look hard

#

though, now that I'm looking through the repos, I'm way more interested in the native loader stuff -- I've been doing a variation of that since 2020 across a dozen or so wheels

#

I don't quite understand why it's using RTLD_LOCAL though -- I had run into issues with that on macOS so I use RTLD_GLOBAL on that platform, but maybe it's just something with the libraries I'm using
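
A minimal sketch of that kind of per-platform choice using `ctypes`. The helper names are hypothetical, and this is not the native loader's actual code, just an illustration of passing a different `dlopen` mode on macOS:

```python
import ctypes
import sys


def pick_dlopen_mode(platform: str = sys.platform) -> int:
    """Choose a dlopen mode: RTLD_GLOBAL on macOS, RTLD_LOCAL elsewhere."""
    if platform == "darwin":
        return ctypes.RTLD_GLOBAL
    return ctypes.RTLD_LOCAL


def load_native_library(path: str) -> ctypes.CDLL:
    """Load a shared library with the platform-dependent mode chosen above."""
    return ctypes.CDLL(path, mode=pick_dlopen_mode())
```

With RTLD_GLOBAL, the library's symbols become visible to libraries loaded afterwards, which is the trade-off being questioned here.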

merry rapids
# merry rapids looking through the variant demos, no hatchling support yet? I might be able to ...

if I understand the way that you patched meson-python, the idea is that you specify the variant that you're building using config setting? Unfortunately, hatchling doesn't support config settings yet (https://github.com/pypa/hatch/issues/1072) so it would have to be implemented by a build hook instead -- but I don't think it would support adding those extra metadata keys you need. Hm. I guess the wheel modification command is good enough for now.

sturdy imp
#

I've now succeeded in building the first variant-enabled wheel!

sturdy imp
#

In sprint: the variantlib design issue about how we are going to support nvidia driver version (in addition to the nvidia cuda toolkit version). e.g., cuda versions could be enumerated but nvidia versions... could be MANY, and we would need a rich version constraint expression support here.

languid bolt
sturdy imp
#

Bradley suggested that I use the "paired" cuda toolkit versions instead of the nvidia driver versions

#

but this may not work in some cases -- where the application (like Backend.AI) heavily depends on NVML, etc.

#

also in the scenario of containerized CUDA apps, the host may not have the CUDA toolkit at all but only NVIDIA drivers

#

if the variant-aware wheels are installed in the host side (as part of a workload hosting platform like Backend.AI), the variant matching should run against the NVIDIA driver version number.

#

for simplicity and manageability, we could split or fork the official nvidia-variant-provider plugin for our purpose, but still the nvidia driver versions are too complex to embrace in the current variantlib design.

river lynx
#

my only wish for nvidia-variant-provider is to best fit cuda drivers with cuda variant. i.e. 12.6 built wheel is technically compatible with a 12.4 driver (afaik) so for torch 2.7.0 it should install 12.6 if you have 12.4 installed locally (like on my test machine)

stone lodge
#

wait, is that actually true? I would assume that a wheel built against CUDA 12.4 is compatible with a prod system with installed CUDA 12.6 but not vice versa? (also this seems like something the driver should know for sure or not, so yes, if that is true, then I agree the driver should reflect that)

#

(or is the thing that building against 12.6 is fine if you don't use any new symbols since 12.4 which rarely happens, and so it turns into the same shape of problem as auditwheel/manylinux? I think in this case it's still the driver's responsibility to look at the wheel and determine that it is 12.4-compatible and label it as 12.4 instead of 12.6)

river lynx
#

I've always been under the impression that they should be compatible within one major cuda version

#

(unless it's a newly added architecture)

stone lodge
#

if that's the case, then IMO the variants should just advertise the CUDA major version (so 11/12/etc. alone), and if necessary have compute capability as another variant axis

#

it's definitely possible to have binaries compiled against an older CUDA version to work with GPUs that weren't even announced when that CUDA version came out, though they won't take full advantage of the new GPU's power

wise matrix
#

The answers to the above questions about compatibility depend on a few factors. If you build against CUDA 12.4, then yes you are pretty much always safe to run on CUDA 12.6. The converse is conditionally true (I really wish I could give you a simpler answer on this part). If you only use the runtime APIs (functions starting with cuda* like cudaMalloc) and not driver APIs (functions starting with cu* like cuMemGetInfo), then you are almost always guaranteed that things will work (the exceptional case is if you are using a brand new API that specifically requires a newer driver, e.g. if you started using cudaMallocAsync on the minor version of CUDA that introduced it). If you use the driver API, then you cannot usually build with a newer driver and run on an older driver (yes, the term driver is unfortunately very overloaded here).
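
The rules of thumb above could be encoded roughly as follows. This is only a sketch of the heuristic as described in this message, not an official compatibility check. The function and parameter names are hypothetical, and different major versions are deliberately left out of scope.

```python
from typing import Tuple

Version = Tuple[int, int]  # (major, minor), e.g. (12, 4)


def likely_runs(
    built_with: Version,
    installed: Version,
    *,
    uses_driver_api: bool = False,
    needs_newer_api: bool = False,
) -> bool:
    """Heuristic: can a binary built with one CUDA minor run on another?"""
    if built_with[0] != installed[0]:
        # Cross-major cases are not covered by the explanation above.
        raise ValueError("different CUDA major versions are outside this sketch")
    if built_with <= installed:
        # Built against older (or equal), running on newer: the safe direction.
        return True
    # Built against newer, running on older: only plausible for runtime-API-only
    # code that does not rely on an API introduced after `installed`.
    return not (uses_driver_api or needs_newer_api)


print(likely_runs((12, 4), (12, 6)))
print(likely_runs((12, 6), (12, 4), uses_driver_api=True))
```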

wise matrix
# stone lodge it's definitely possible to have binaries compiled against an older CUDA version...

Yes, it is also possible to have binaries compiled with an older CUDA version work with newer GPUs. Typically this works by embedding PTX code in your binary. PTX is effectively an IR that can be JIT-compiled for any compute architecture, so at runtime the CUDA driver will detect that you have PTX and compile that for you even if you don't have precompiled machine code specific to your architecture.

stone lodge
#

(also sometimes new hardware is the same compute capability major, I remember being surprised at things showing up in 8.x. I believe that means you don't even need PTX/JIT, the same machine code will work, but using e.g. SASS built for capability 8.0 on an 8.9 GPU is presumably suboptimal in some way)

topaz summit
sturdy imp
stone lodge
#

where is the current draft spec (or closest thing to a spec) for the thing where wheels declare in their metadata which providers they need? I don't see it in either wheelnext/pep_xxx_wheel_variants/ENGINEERING.md nor wheelnext/wheelnext/docs/proposals/pepxxx_wheel_variant_support.md

patent vapor
topaz summit
#

@sturdy imp multiple extensions will already be built in parallel. For a single .pyx -> .c -> extension-module there is no way to parallelize, since they're single file compiles.

sturdy imp
topaz summit
#

@sturdy imp that is automatically done in parallel. meson generates a build.ninja file in the build directory, and ninja will by default build with roughly n_cpu + 2 jobs in parallel. You can control build parallelism with its -j flag. You can pass that through meson-python like so when building a wheel: python -m build -Ccompile-args="-j6".

This is probably best continued in https://github.com/mesonbuild/meson-python/discussions/ or #meson-python.

wise matrix
stone lodge
#

what are the exact semantics around variants? my understanding is

  • a variant property is a triple {namespace, feature, value}
  • a wheel has a list of variant properties (with at most one property with the same namespace and feature, maybe?)
  • each provider outputs variant properties
  • if there exists at least one variant property that was outputted by a provider, and is in the list of the wheel, then the wheel is eligible
    ?
#

so if I declare a wheel that says nvidia :: cuda :: 12.8 and amd :: rocm :: something, then it will install on machines with CUDA 12, machines with ROCm, and machines with both, but not machines with neither?

and if I declare a wheel that says nvidia :: cuda :: 12.8 and x86_64 :: level :: v4, then it will install on machines with CUDA 12 or machines with AVX-512 (or both)?

or am I misunderstanding

patent vapor
# stone lodge so if I declare a wheel that says `nvidia :: cuda :: 12.8` and `amd :: rocm :: s...

From my understanding, when you build a distribution, you can build different wheels for it, using different provider plugins. So like before you get the windows wheel, the Linux wheel etc., but now there's support for specific environments encoded by the hash.

So you can't build one specific wheel for both Nvidia and amd but you'll build two wheels.

Both of these wheels will have a METADATA file that contains all the previous spec metadata, now also including which provider plugins it requires (all of them, even if the wheel is built for one specific platform).

But I don't think that's what most of us should care about, the index will have a ...-variants.json like on the mockhouse from the tutorials. I think that file will allow tools like pip to fetch the correct wheel for your environment. It'll look at what plugins it requires, it can install them (in an isolated environment I believe?), get the value for your environment, the hash, and then fetch the proper wheel and install it. Note: the provider plugins give a relative preference for different values that are compatible with a given environment as well.

On an environment that satisfies no provider plugin then the default will be the non variant wheel.

So say PyTorch requires both Nvidia and AMD, it'll build different wheels for both. If you're on CPU, you'll get the CPU wheel by default.

stone lodge
#

My reading of the current implementation is that it will pull the json file and match a wheel that has at least one variant that the local plugins are advertising (and if there are multiple wheels the precedence stuff goes into effect, yeah, but at the moment I want to know whether a wheel matches at all), meaning that I can in fact build a single wheel with both nvidia and amd support, but I cannot meaningfully build an nvidia+x86_64-v4 wheel. Specifically the patched pip calls variantlib.api.check_variant_supported() which does bool(list(filter_variants(...))), i.e., if at least one variant survives the filtering, then it returns True.

#

I'm curious if I'm reading it right, and if so, if this is intentional
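
To make the two possible readings concrete, here is a minimal sketch. It is not variantlib's code: the function names and sample properties are hypothetical, and it only contrasts "at least one property matches" with "every property matches".

```python
from typing import Iterable, Set, Tuple

VariantProperty = Tuple[str, str, str]  # (namespace, feature, value)


def eligible_if_any(wheel: Iterable[VariantProperty], supported: Set[VariantProperty]) -> bool:
    """'OR' reading: one matching property is enough."""
    return any(prop in supported for prop in wheel)


def eligible_if_all(wheel: Iterable[VariantProperty], supported: Set[VariantProperty]) -> bool:
    """'AND' reading: every declared property must be supported."""
    return all(prop in supported for prop in wheel)


# A wheel declaring both a CUDA property and an x86_64 level property.
wheel_props = [("nvidia", "cuda", "12.8"), ("x86_64", "level", "v4")]
# A machine with AVX-512 but no NVIDIA GPU.
machine = {("x86_64", "level", "v4")}

print(eligible_if_any(wheel_props, machine))  # the 'OR' reading accepts it
print(eligible_if_all(wheel_props, machine))  # the 'AND' reading rejects it
```

Under the "OR" reading, the nvidia+x86_64-v4 wheel would be installed on a machine without a GPU, which is the case in question.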

patent vapor
#

Oh sorry, I think I misunderstood what you meant by your question. I thought you meant requiring different plugins for different environments. If you require two (or more) different provider plugins for the same wheel, then the environment must satisfy all conditions. Because in the code you link to, you have this allowed_properties=supported_vprops and they're built from all plugins (I might be wrong on this one, I didn't look for which code calls this function etc., to make a proper analysis). I might be missing something though. But just conceptually I think it's better to enforce an "AND" relationship rather than an "OR". "OR" introduces some kind of uncertainty I guess. You'd be asking questions like, "does it work on this environment or on that environment or on both" (in that case just do an AND). It wouldn't make sense to have a wheel that works in one environment but not on the other, better to just publish two wheels separately. I guess 🤔

GitHub

Contribute to wheelnext/variantlib development by creating an account on GitHub.

cosmic crystal
#

Hey there 👋
Thanks for the PyConUS talk, I had a few questions that were clarified by watching it 🙂

#

Most of the problems described are things that I'm suffering from while packaging PySide and Shiboken (the Qt bindings from the Qt Company), so I'm looking forward to giving it a try

#

I also help out a bit with PyPI support by addressing the size limit issues, and it's an everyday issue for new projects to request >1 GB wheel sizes or >100 GB project sizes meow_sad, so hopefully this can be addressed with wheelnext 🎉

lethal idol
#

(I hope I never need a wheel like that...)

restive vessel
#

I hope that one day I'll play a jam session with Jimi Hendrix 😄 Until then I focus on what I can do and brush up my skills 😄
Jokes aside - happy that we are able - bit by bit - to move the needle.

@cosmic crystal do you mind detailing what was unclear before & what the talk answered? It's helpful to know how we can improve the communication.
Also what are the specific bits you are struggling with - happy to get you involved in the design phases if you're open to it (or even have you contribute to the proposals!)

lucid pike
toxic wraith
#

Wow PyConUS really sneaks up on you!

short epoch
#

@restive vessel spammers got to us? Or was someone hacked?

restive vessel
cosmic crystal
# restive vessel I hope that one day I'll play a jam session with Jimmy Hendrix πŸ˜„ Until then I f...

oh noooes, I did a long write-up and might have never hit enter :C I was on some conferences...so let me be brief:
  • I didn't know there were many companies behind it
  • I thought it was "only for having new tags"
  • when you mentioned solving the symbolic links during the talk, you got me. This IS one of the major issues we currently face.
  • With my pypi hat on: tackling the problem with other architectures, like gpu, is what we need. A very large number of projects decide to 'ship everything' just because we don't have a better way to satisfy some gpu-specific shared libraries, ending up with bloated wheels.

stone lodge
#

Wheel variants is definitely the most involved proposal and in my memory showed up around the time of the WheelNext branding (or at least I heard about them both around the same time), so that's understandable! Discussions about symbolic links, making puns about "how to reinvent the wheel," etc. happened as far back as PyCon US 2024 (maybe earlier) but I don't think either variants or the WheelNext name was around then.

restive vessel
#

We kinda had this idea in the back of our mind for a while. We decided to unify everything under a single banner to clarify communication and make it easier for people to get involved.

@cosmic crystal really glad the talk resonated with you. Let us know if there's anything you'd like to get involved in directly. Otherwise getting your feedback on designs & proposals will always be immensely appreciated

cosmic crystal
#

Thanks for sharing more info, I for sure would like to look into the project, because I will probably move PySide6 to use it the moment I can allocate some time to work on it.

Another question, which is probably orthogonal to the project, but I still wanted to ask: something I didn't see in the docs/talk is the Stable ABI. We are one of the few 'large' projects using it, and for some time we had a problem getting the cp3X-abi3 tag using pyproject.toml only: we were required to create a placeholder setup.py in order to fake an Extension and get the abi3 tag. This can now be done in pyproject.toml with the experimental option [tool.setuptools].
( Here you can see the old fake setup.py Extension: https://codereview.qt-project.org/c/pyside/pyside-setup/+/649992/1 )

So I was wondering how is the relationship of the project with the development of setuptools and the options we have within pyproject.toml. Are you planning to contribute to setuptools or try to maybe add a new backend?

hidden jewel
#

Hi all, I've just joined, looking forward to participating in discussions! 👋

toxic wraith
hidden jewel
#

Hi all, just FYI an updated draft of PEP 771 has been published, and I've opened a new DPO thread with information about the updates: https://discuss.python.org/t/pep-771-default-extras-for-python-software-packages-round-2/94905

languid bolt
#

Hi everyone, I was working on building variant wheels for XGBoost and found out that pip wheel no longer works with latest version of pep_xxx_wheel_variants. I submitted a small patch to fix it: https://github.com/wheelnext/pip/pull/9

GitHub

Running pip wheel -v --no-deps to build a wheel fails with the error
ERROR: Exception:
Traceback (most recent call last):
File &quot;/home/phcho/miniforge3/envs/wheelnext/lib/python3.13/site-...

restive vessel
#

Thanks a lot @languid bolt - unfortunately until we vendor variantlib into pip ... There will be broken APIs due to how PIP is working
Though your PR is merged 🙂

bronze bear
#

for example for the cpu microarchitecture/SIMD features, maybe the authors only publish for x86_64 (windows and linux) and aarch64 (mac), but the variant provider could support x86_64, aarch64 and other architectures, so that someone building locally gets (unless they chose otherwise) a SIMD-enabled build optimized for their CPU

#

this would reduce the configuration surface a bit as variant providers are always there, just don't do anything on platforms they don't know

wise matrix
# wise matrix For anyone interested in continuing the shared library loading discussion from t...

Hi everyone, I just posted https://discuss.python.org/t/native-lib-loader-documentation-and-best-practices-on-using-native-libraries-in-python-wheels/98111 to continue the discussions around best practices for loading shared libraries. If you know anyone who would be interested please link them to that, as well as to the #dynamic-library channel here. Thanks all! Looking forward to continuing that discussion

coarse mantle
#

Hi everyone, my name is Travis, and I'm a core maintainer for conda, and I'm looking to contribute to this effort however possible 💪 🙂

restive vessel
restive vessel
#

The team has been hard @ work ! And we just posted a major update on DPO about Wheel Variants: https://discuss.python.org/t/wheelnext-wheel-variants-an-update-and-a-request-for-feedback/102383

This update is enriched by 4 blog posts published simultaneously:

Astral: A variant-enabled build of uv: https://astral.sh/blog/wheel-variants/
NVIDIA: Streamline CUDA-Accelerated Python Install and Packaging Workflows with Wheel Variants: https://developer.nvidia.com/blog/streamline-cuda-accelerated-python-install-and-packaging-workflows-with-wheel-variants/
PyTorch Foundation: PyTorch Wheel Variants, the Frontier of Python Packaging: https://pytorch.org/blog/pytorch-wheel-variants/
Quansight: Python Wheels: from Tags to Variants: https://labs.quansight.org/blog/python-wheels-from-tags-to-variants/

Eager to hear your feedback!

In collaboration with PyTorch, NVIDIA, and Quansight, we've released an experimental build of uv with support for wheel variants.

If you've ever installed an NVIDIA GPU-accelerated Python package, you've likely encountered a familiar dance: navigating to pytorch.org, jax.dev, rapids.ai, or a similar site to find the artifact…

The story of how the Python Wheel Variant design was developed

coarse mantle
restive vessel
#

@coarse mantle are you looking to contribute as part of your free time or with Anaconda? Janis seemed to have great ambitions for WheelNext, maybe have a chat with him. Otherwise happy to give you ideas where help would be much appreciated 🙂

hard escarp
restive vessel
# hard escarp I had been vaguely following along with the wheel variant work (since CUDA stack...

Thanks Alyssa. We have all major build backends implemented (except scikit-build and maturin, which are WIP)

We didn't believe we could write a solid and functional PEP that fundamentally changes so many pieces of the ecosystem without actually putting it to the test.

So this is the last step before we can finally finish the PEP draft and submit it officially to the community.

I think we needed to convince ourselves it can work and well first with some of the most complex scenarios.

prime marsh
modern spruce
#

I have been following the discussion on discuss, and we had some very similar discussions with Henry (cibuildwheel) and NVIDIA folks including Andy Terrel at SciPy this year. I also have serious concerns about an arbitrary plugin model where plugins could be installed without user knowledge, as this creates software supply chain security risks. I think that the discussion in the post-experiment phase will need to focus on this blind installation of variant selectors that cannot be inspected for provenance. My two cents on this: for the use cases I represent with Amazon, requiring explicit declaration of the variant selector plugins is the model that I see as a feasible solution, even more so for any use case where builds and installs occur in network-isolated build environments.

With all that said, I am very excited about WheelNext and think it does solve a big problem for the community!

restive vessel
#

Thanks a lot @modern spruce. There are a lot of ways to make that "plugin system" work, and there are mitigations that can be applied (like using a static file as the "analysis of the system" instead of a "dynamic analysis", i.e. a program running the analysis).

I'm not sure it's mentioned anywhere, but we plan to enforce that plugins must NOT have any dependencies, period. If you need something, it must be vendored. That greatly reduces the exposure to supply chain attacks.

Some people and environments will probably prefer different things.

If we want to go there, it's possible that tools adopt a special security model for said plugins (like they need to be on a green list, for example). To be on the green list, your package needs to be published through trusted publishers, with attestations, and get reviewed at each release. I don't know, I'm just throwing ideas out.

If you're interested in helping us build ideas to reinforce the security model of "plugins", I'd love to have you on board or contribute ideas. Anything you can think of.

modern spruce
#

I am going to spend some time this weekend taking a look at all the work in the github repos and start to digest where things are at so I can start to reason about additional ideas to help ensure the security model of plugins.

restive vessel
#

In collaboration with PyTorch, NVIDIA, and Quansight, we've released an experimental build of uv with support for wheel variants.

If you've ever installed an NVIDIA GPU-accelerated Python package, you've likely encountered a familiar dance: navigating to pytorch.org, jax.dev, rapids.ai, or a similar site to find the artifact…

The story of how the Python Wheel Variant design was developed

GitHub

Contribute to wheelnext/pep_xxx_wheel_variants development by creating an account on GitHub.

#

And obviously if you have questions / doubts / anything, you can ask here or open a Github issue on the repo above

hard escarp
#

On the ensuring-variant-names-are-not-valid-existing-wheel-names front, did the idea of using a field-internal separator with a currently disallowed character come up? Specifically thinking of name@label or platform@label. I know @ already indicates a direct URL reference in a full dependency declaration, and we generally prefer to avoid symbols that require escaping in file names and/or URLs, but neither of those seem like deal breakers. We permitted ! and + in the less common epoch and local version identifier fields after all.

(Posting here rather than DPO because that thread is long enough and I don't think the question is high priority at all)

bronze bear
terse lintel
#

Help me please understand something about wheelnext and UV at the same time

#

Currently I have 2 requirements files, one for AMD/intel gpus and one for nvidia

#

I do a bit of guessing: if nvidia-smi is available, I just install the Nvidia requirements.txt, if not I install the one for AMD
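
A sketch of what such a check could look like, assuming the presence of `nvidia-smi` on PATH is the only signal used. The requirements file names and the helper are placeholders for illustration:

```python
import shutil
from typing import Callable, Optional


def pick_requirements(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Guess the GPU vendor from whether nvidia-smi is on PATH."""
    if which("nvidia-smi") is not None:
        return "requirements-nvidia.txt"  # placeholder file name
    return "requirements-amd.txt"  # placeholder file name


print(pick_requirements())
```

This only says that an NVIDIA driver is probably present. It says nothing about the GPU generation.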

#

Ofc with the new pytorch 2.8.0 and cu129 they dropped support for pascal GPUs like the 1000 series from what I am gathering

#

In this case my approach, that installs torch 2.8.0 cu129 by default, completely bricks my software for users who own a pascal GPU ( or just falls back to CPU )

#

Would UV pip install torch automagically install cu126 that still has support for pascal?

#

This is what I have gathered from here, but I just wanted a 2nd opinion

fringe birch
#

👋 Hi, I'm the lead for the Spack project. Here to try to keep up with wheelnext, mostly wheel variants. I am not sure how easy that will be. I put a couple comments on discourse. As I think people know:

  • we have variants in Spack;
  • they can have single or multiple values;
  • they're known to the solver; and
  • you can use them in dependency specifiers, requirements, conflicts, etc.

I can at least talk about pitfalls, combinatorial issues, package evolution issues, etc. we've encountered with variants the way we do them. Also we're working on ABI compatibility constraints in Spack. That might also be of interest to folks here.

I would eventually like Spack to rely much more on PyPI metadata for python packages than it does, as I do not think we can scale to support all the packages users want. I am not sure we will ever want to use PyPI binaries, but it would be nice to be able to consume wheel variant info and use it to update spack packages for things that require native compilation.

bronze bear
#

i'm really curious about how spack solves this, are there docs describing the variants?

fringe birch
#

See, e.g.:

Here for an example of a package with variants: https://spack.readthedocs.io/en/latest/package_fundamentals.html#spack-info

variants can exist or not exist depending on other variants (e.g., cuda_arch only exists when cuda is enabled), so spack info there is showing what variants there are for mpich and when they exist.

source for the package where this stuff is defined:

and the base package for cuda with common constraints: https://github.com/spack/spack-packages/blob/develop/repos/spack_repo/builtin/build_systems/cuda.py

I'm not sure what else to point to but can try to explain more… variants are a fundamental part of specs (package descriptors) and you can ask whether an installation of foo satisfies foo+cuda or foo cuda_arch=90. We also have a notion of microarchitectures… you can ask if a spec satisfies, say foo target=x86_64_v4: which will get you anything with avx512 (using the libc generic arch names there but you could also refer to cascadelake or zen4 specifically).

Packages can define preferred values for variants, and the solver will prefer those in the absence of other constraints if it's building, or it'll prefer what you have installed by default.

Is that what you're looking for? I could talk about how we model variants or how this is done in the solver too.

GitHub

Spack's community package recipes. Contribute to spack/spack-packages development by creating an account on GitHub.

fringe birch
#

sorry if that was TMI

languid bolt
# terse lintel Would UV pip install torch automagically install cu126 that still has support fo...

If the user runs uv pip install torch on their own machine, yes, the correct cuda126 variant will be selected.
Complexity arises if you are creating a Docker container and you are packaging torch on the user's behalf: uv will choose the variant according to the GPU of the builder machine, which can be potentially newer than the user's. You will need to explicitly instruct the nvidia plugin to choose a lower CUDA variant.
@restive vessel Can you chime in?

restive vessel
#

@terse lintel interesting name btw 😂

#

@fringe birch I'd love to chat with you! When are you available? I'm so impressed with what you guys did with archspec and we are trying to make the scientific python community build on top of archspec for CPU / SIMD.

#

I'm sure you came to realize we took heavy design inspiration from you guys. The Spack project was hitting a lot of nails right on the head!

It would be super interesting to have a chat 😊

#

Maybe I can invite you to our bi-weekly meeting - every 2 weeks - I'm sure many would appreciate your perspectives

terse lintel
#

So I am nothing but a Fake

#

But thanks for the help you bunch

bronze bear
willow sparrow
#

@restive vessel @prime marsh Hi there. I tried the example of uv wheel-variant from the post on my macbook air m2. I had expected that it would install numpy along with torch. That is the behavior that I would see with venv / pip install.

Is it expected that it will default to 3.12?

P.S. No rush in responding. I'm not feeling 100% so am logging off for the day.

prime marsh
restive vessel
#

@willow sparrow we didn't ship a macOS build of pytorch (it wasn't very interesting given that they don't have ARM optimizations for M1...4)
I believe pytorch does not have explicit dependencies on numpy.

I believe we have a variant build of numpy somewhere that is optimized for M processors - let me see if I can upload it for you 🙂

#

Actually there are many gaps in archspec around Apple M processors - I hope to be able to work with @fringe birch to address them

coarse mantle
restive vessel
# coarse mantle I would like to do this in my free time so I can learn more about what's going o...

At the moment we are deep down into the PEP writing process and probably for a while ... And it's not exactly the most exciting bit of the process (though just a beginning).

Off the top of my head, I see 3 potential code contributions that would be appreciated and interesting:

  • scikit-build-core needs to be integrated with variant logic => @frank hound is the creator and maintainer, also part of WheelNext
  • maturin needs to be integrated with variant logic => @bronze bear is the creator of the project and part of WheelNext.
  • archspec needs improved ARM support (namely Apple Silicon) => Maybe @midnight marsh would be interested to give you a hand on this one (maybe ?) (he's also part of WheelNext)
    • We use this in our X86 & ARM variant plugins
    • conda also depends on it for the same reason

Are any of them of more interest to you?

coarse mantle
restive vessel
coarse mantle
restive vessel
#

Looks probably correct. Just check with the project leads. I'm honestly not an expert. I tagged you in a Github issue

coarse mantle
#

Will do. Thanks!

lethal idol
#

I'd like to say, I'm not a big fan of the proposed changes to wheel filename schema to accommodate variant tags. Specifically I don't like the idea of having to parse the components of the name in order to try to decide which kind of component they are. It seems brittle and likely to cause problems in the longer term.

My idea is to have "optional" components of the name (aside from the build number for legacy reasons, but this could presumably be deprecated on a long enough timeline) be explicitly tagged, for example like foo-1.0-py3-abi-platform-variant=.... Alternately, leave optional components blank (with consecutive hyphens).

In the latter case one could even imagine "defaults" of -none-any- being abbreviated to ---; that would fail on older tools, but only for new packages.

(By my reasoning, the handling of build numbers was already not good, but at least it can be isolated...)
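
To show what explicitly tagged optional components would buy a parser, here is a rough sketch. The `key=value` syntax follows the example above, but the exact scheme, the `variant` key, and the sample label `abcd1234` are hypothetical. This is not a proposal text, and it ignores the blank-component alternative.

```python
from typing import Dict, NamedTuple, Optional


class ParsedName(NamedTuple):
    name: str
    version: str
    build: Optional[str]
    python: str
    abi: str
    platform: str
    extras: Dict[str, str]  # explicitly tagged optional components


def parse_tagged_wheel_name(filename: str) -> ParsedName:
    """Parse a wheel name whose optional components are written as key=value."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Tagged components identify themselves, so no guessing by position.
    extras = dict(p.split("=", 1) for p in parts if "=" in p)
    positional = [p for p in parts if "=" not in p]
    if len(positional) == 5:
        name, version, python, abi, platform = positional
        build = None
    elif len(positional) == 6:
        # The legacy build number stays positional.
        name, version, build, python, abi, platform = positional
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")
    return ParsedName(name, version, build, python, abi, platform, extras)


print(parse_tagged_wheel_name("foo-1.0-py3-none-any-variant=abcd1234.whl"))
```

A later addition to the filename would just be another key, without changing how existing components are recognised.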

#

Basically what I'm concerned about is a future where someone has another good idea for something to add to the wheel filename metadata, and packages want to use that but not a variant, etc.

leaden pelican
#

Why use the file name at all? Yes, current tools do so. Conda stores metadata inside the package, and loads it to build an index. The file name is only for ensuring that files do not clobber each other. PyPI has some capability here with the JSON API.

Relying on only the file name for all metadata causes needless conflict and terrible hacks. And no, you don't have to download the whole package to get at this metadata in conda packages.

lethal idol
#

Presumably, because the filename is visible before downloading starts?

knotty elbow
#

While it is possible to fetch metadata w/o downloading the whole distribution, you still need to download a whole file for each distribution under consideration. Whether that would be problematic in practice with variants, I'm not sure, but it doesn't scale well. It is quite fast to be able to fetch a package index page and filter out the vast majority of the distributions by simply their wheel names.
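
As a rough illustration of that filename-based filtering, here is a sketch over a PEP 691-style JSON response. The sample data, URLs, and helper name are made up, and real installers do full tag matching rather than this simplified platform check.

```python
from typing import Any, Dict, List

# Hypothetical index response in the PEP 691 shape (hashes left empty).
index_json: Dict[str, Any] = {
    "meta": {"api-version": "1.0"},
    "name": "foo",
    "files": [
        {"filename": "foo-1.0-py3-none-any.whl", "url": "https://example.invalid/1", "hashes": {}},
        {"filename": "foo-1.0-cp312-cp312-manylinux_2_28_x86_64.whl", "url": "https://example.invalid/2", "hashes": {}},
        {"filename": "foo-1.0-cp312-cp312-win_amd64.whl", "url": "https://example.invalid/3", "hashes": {}},
        {"filename": "foo-1.0.tar.gz", "url": "https://example.invalid/4", "hashes": {}},
    ],
}


def wheels_for_platform(index: Dict[str, Any], platform_tag: str) -> List[str]:
    """Keep wheels whose filename advertises `platform_tag` or `any`."""
    matches = []
    for entry in index["files"]:
        filename = entry["filename"]
        if not filename.endswith(".whl"):
            continue  # skip sdists and other files
        # The platform tag is the last dash-separated component; compressed
        # tag sets separate alternatives with dots.
        platforms = filename[: -len(".whl")].rsplit("-", 3)[-1].split(".")
        if platform_tag in platforms or "any" in platforms:
            matches.append(filename)
    return matches


print(wheels_for_platform(index_json, "win_amd64"))
```

Nothing beyond the one index page is fetched here, which is why this step is cheap compared with a per-file metadata request.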

#

We could extend the simple API though to include metadata fields, though then that becomes a balancing act of including enough but not too much metadata in the simple responses.

lethal idol
#

another consideration: how likely is it that the variant-selection process causes rejection of a candidate?

#

because if you basically already know that you want some variant of the current wheel...

leaden pelican
#

Presumably, because the filename is visible before downloading starts?
I think that you are assuming a PEP 503/691 simple index. PEP 503 is very much based around filenames, and I understand your complaints in that light. PEP 691 has more room for flexibility, with potentially new metadata in each file's dictionary.

What I'm talking about is much more like PEP 691. The main index file is "repodata.json" - https://docs.conda.io/projects/conda-build/en/stable/concepts/generating-index.html#repodata-json.

Conda filenaming is described here: https://docs.conda.io/projects/conda-build/en/stable/concepts/package-naming-conv.html#package-naming-conventions. You may notice that the filename is a key in the repodata.json, whereas PEP 691 has an array of equivalent dicts. It isn't used for anything in conda. Making it a key just enforces uniqueness.
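
A small sketch of the structural difference, assuming a simplified PEP 691-style `files` array (the URLs are placeholders and the hashes are left empty) re-keyed by filename the way repodata.json keys its packages. `key_by_filename` is a hypothetical helper, not something from either tool:

```python
def key_by_filename(files: list) -> dict:
    """Re-key an array of file dicts by filename, rejecting duplicates."""
    keyed = {}
    for entry in files:
        filename = entry["filename"]
        if filename in keyed:
            # With a mapping, a clash has to be resolved instead of coexisting.
            raise ValueError(f"duplicate filename: {filename}")
        # The filename is only the key; everything else travels as metadata.
        keyed[filename] = {k: v for k, v in entry.items() if k != "filename"}
    return keyed


# Simplified, made-up PEP 691-style entries.
pep691_files = [
    {"filename": "foo-1.0-py3-none-any.whl",
     "url": "https://example.invalid/foo-1.0-py3-none-any.whl",
     "hashes": {}},
    {"filename": "foo-1.0.tar.gz",
     "url": "https://example.invalid/foo-1.0.tar.gz",
     "hashes": {}},
]

keyed = key_by_filename(pep691_files)
print(sorted(keyed))
```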

Regarding Richard's point, scale is exceedingly important. That was a millstone for conda for a long time, because the repodata grew unbounded and never fell off. The same is true for PyPI and other simple indexes, barring other software behavior (artifactory?), but PyPI handled it much better by serving the file index for one package at a time. Conda used the entire package collection. This has since been improved (https://prefix.dev/blog/sharded_repodata), so I'd expect conda and PEP 691 indexes to have metadata that's pretty comparable in scale. Conda might have more data, but it'll be a factor of 2 or 3 difference in metadata download, rather than the extraordinarily bloated whole-system repodata.

Filenames have served really well for a long time, but I think they are at their limit. Paul Moore said as much a while back: https://discuss.python.org/t/selecting-variant-wheels-according-to-a-semi-static-specification/53446/98

#

I drank the conda kool-aid a long time ago, but I'm not trying to say "just use conda" or "conda did it right, do it that way". I'm saying that I think filenames as the bearer of metadata are too limited, and we need something better. Conda has an example of something else, but there's bound to be other good ways, too.

glossy cradle
#

Variants don’t really use filenames to convey much information. It’s basically just an arbitrary identifier. Filenames probably have to be unique between variants anyway, otherwise static hosting gets more annoying / harder.

scenic pulsar
#

btw - would it be possible to do something like shipping a pure Python version of a lib and the mypyc/cython based speedups for that wheel?

hard escarp
#

Shipping with and without accelerator libraries doesn't need variants, as the existing tags cover that (ship a none-any wheel in addition to the ones with compiled code)

scenic pulsar
river lynx
#

by the way, if people are interested in the talk @restive vessel and I gave at PyTorch Conference about WheelNext, it just got posted to YouTube!

https://youtu.be/BH4c7QrGB7E?si=gRo1dy6bdW9sccHO

Lightning Talk: Hardware-Aware Python Packages ~ PyTorch and WheelNext Grab the Wheel! - Jonathan Dekhtiar, NVIDIA & Eli Uriegas, Meta

The PyTorch ecosystem thrives on innovation and a vibrant open-source community. PyTorch’s reach continues to evolve, fueled today by specialized hardware, variations within CPU architecture families, dedicate...

sturdy imp
#

i just tried uv-wheelnext (0.9.9) to install torch on DGX Spark following https://astral.sh/blog/wheel-variants, and got this failure:

TRACE Resolver derivation tree before reduction
term root==0a0.dev0
  root==0a0.dev0 depends on intel-variant-provider>=0.0.2, <1.0.0
  intel-variant-provider not found in the package registry
TRACE Resolver derivation tree after reduction
term root==0a0.dev0
  root==0a0.dev0 depends on intel-variant-provider>=0.0.2, <1.0.0
  intel-variant-provider not found in the package registry
TRACE Error trace: Failed to resolve requirements from `variant.providers.requires`

Caused by:
    0: No solution found when resolving: `intel-variant-provider>=0.0.2, <1.0.0`
    1: Because intel-variant-provider was not found in the package registry and you require intel-variant-provider>=0.0.2,<1.0.0, we can conclude that your requirements are unsatisfiable.
error: Failed to resolve requirements from `variant.providers.requires`
  Caused by: No solution found when resolving: `intel-variant-provider>=0.0.2, <1.0.0`
TRACE Resolver derivation tree before reduction
term root==0a0.dev0
  root==0a0.dev0 depends on intel-variant-provider>=0.0.2, <1.0.0
  intel-variant-provider not found in the package registry
TRACE Resolver derivation tree after reduction
term root==0a0.dev0
  root==0a0.dev0 depends on intel-variant-provider>=0.0.2, <1.0.0
  intel-variant-provider not found in the package registry
  Caused by: Because intel-variant-provider was not found in the package registry and you require intel-variant-provider>=0.0.2,<1.0.0, we can conclude that your requirements are unsatisfiable.

Is this a configuration error (e.g., trying to fetch intel-variant-provider from PyPI) or is the torch package just missing support for aarch64 (Grace CPU)?
@restive vessel @wind cosmos

#

in the debug log, it also says:

TRACE Attempting unauthenticated request for https://download.pytorch.org/whl/variant/nvidia-variant-provider/
DEBUG Traceback (most recent call last):
DEBUG   File "<string>", line 6, in <module>
DEBUG     import priority as backend
DEBUG ModuleNotFoundError: No module named 'priority'
TRACE Request for https://download.pytorch.org/whl/variant/intel-variant-provider/ failed with 403 Forbidden, checking for credentials

maybe this is the cause?

round rune
#

Is it a goal of wheelnext to handle arbitrary types of variants besides hardware and driver-based? As a simple example, would it be possible to ship both debug and O3 builds as variants?

restive vessel
restive vessel
thorn spade
#

Hi everyone, I'm a developer from Huawei's Ascend Ecosystem team. We are reaching out to discuss adding Huawei Ascend NPU support as a new variant provider in wheelnext.

#

Do you know how to contribute a provider plugin to the WheelNext community?

restive vessel
# thorn spade Do you know how to contribute a provider plugin to the WheelNext community?

Hi @thorn spade! Absolutely 🙂

Welcome!

  1. Open a PR to this file adding Huawei: https://github.com/wheelnext/wheelnext/blob/main/docs/who_are_we.md (make sure you got proper approval internally)

  2. Send me your Github Handle, I'll add you to the WheelNext Github Organization

  3. Create a repo inside the WheelNext org for your plugin; you can use the following as examples: https://github.com/wheelnext/pep_817_wheel_variants/tree/main/plugins

  4. Do you have a package in mind for which you'd like to test that it works ?

thorn spade
#

@restive vessel Thanks for your help!
This PR adds Huawei to who we are: https://github.com/wheelnext/wheelnext/pull/115
We have two members in charge of this.
Github Handle: zhihangdeng and wjunLu.
We are pleased to join the WheelNext community!
We only made an experimental package for testing (simple, but it covers all the features we desired)

GitHub

This pull request adds Huawei to the list of organizations in the docs/who_are_we.md file.

restive vessel
#

Amazing ! I'll review the PR tomorrow!

Welcome aboard

thorn spade
hard escarp
#

Amazing work on the published PEP 817. Two errant details caught my eye so far (and hitting the second one prompted me to post them before I got distracted):

  • the up front statement of affected specifications is incomplete (only binary archives are listed, and those certainly see the biggest changes, but source trees, source archives, environment markers, pyproject.toml, pylock.toml, and the simple index API are all also affected)
  • the statement on how to handle variant environment markers is not correct, specifically this part:

If a non-variant wheel was selected or built, all variant markers evaluate to False.

That implies even variant_label == "" would be false, contradicting the preceding example. It needs to say the label marker is the empty string and the set markers are all empty (so "in" checks will be false and "not in" checks will be true)
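
Spelled out as a sketch: model the non-variant environment as an empty-string label plus empty sets, then run the checks as plain Python. Apart from `variant_label`, the marker names and the `example_ns` value are illustrative assumptions, not quotes from the PEP:

```python
# Assumed marker environment for a non-variant wheel: empty label, empty sets.
NON_VARIANT_ENV = {
    "variant_label": "",
    "variant_namespaces": frozenset(),  # illustrative set-marker name
    "variant_features": frozenset(),    # illustrative set-marker name
    "variant_properties": frozenset(),  # illustrative set-marker name
}


def label_equals(env: dict, value: str) -> bool:
    """Model of: variant_label == value."""
    return env["variant_label"] == value


def contains(env: dict, marker: str, value: str) -> bool:
    """Model of: value in <set marker>."""
    return value in env[marker]


def not_contains(env: dict, marker: str, value: str) -> bool:
    """Model of: value not in <set marker>."""
    return value not in env[marker]


print(label_equals(NON_VARIANT_ENV, ""))
print(contains(NON_VARIANT_ENV, "variant_namespaces", "example_ns"))
print(not_contains(NON_VARIANT_ENV, "variant_namespaces", "example_ns"))
```

Under this model the empty-label comparison and the "not in" check both hold for a non-variant wheel, which a blanket "all markers evaluate to False" rule cannot express.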

hard escarp
#

And after finishing the whole thing, my only further comment is that we may want to be stricter on avoiding the use of the new variant markers in non-variant wheels, to the point of having PyPI disallow them outright (a null variant would still be free to include them).

Congratulations to all involved in putting that PEP together, it's an impressive piece of work.

leaden pelican
hard escarp
# hard escarp And after finishing the whole thing, my only further comment is that we may want...

This case is actually even trickier than I thought, as I hadn't accounted for the index server metadata APIs (which are generally build independent). Expressing variant specific dependencies without confusing old clients is going to be tricky unless we do something like allowing variant-{label} as implicit extra names (I prefer the syntax in the PEP as the long term answer, so anything along those lines would just be a transitional mechanism).

glossy cradle
#

I don't think you have to care about old clients TBH. As long as the mere existence of a feature (rather than a project using it) doesn't cause problems, then it's up to each individual project when they are willing to break compatibility with old clients by using a new feature.

#

That's how basically every new, non-optional, feature works in software. If your library wants to use a new Python 3.N feature, you can't use it until you're happy breaking all your Python 3.N-1 users.

lethal idol
#

or you can make wheels that are version-specific despite having only python code

bronze bear
#

for visibility, we're adding two updates to the PEP https://github.com/wheelnext/peps/pull/45:

  1. Make value lists in variants sorted, since the values are actually sets and this makes it easy to use == without having to convert everything back into sets. This matches the current behavior of variantlib.
  2. Change the merging rule to say "result is the same irrespective of the order in which the wheels are processed". I think it's essentially a better way of wording the same goal (lack of ambiguity) than my original thought.
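
A toy illustration of both points, not the PEP's actual merging rule and not variantlib code: values are stored as sorted lists so `==` behaves like set equality, and a simple union-based merge is checked against every processing order. The keys and values are invented:

```python
from itertools import permutations


def normalize(values) -> list:
    """Store a set of values as a sorted list so == works without conversion."""
    return sorted(set(values))


def merge(configs) -> dict:
    """Toy merge: union the values per key, then normalize."""
    merged = {}
    for config in configs:
        for key, values in config.items():
            merged.setdefault(key, set()).update(values)
    return {key: normalize(values) for key, values in merged.items()}


def merge_is_order_independent(configs) -> bool:
    """Check that every processing order gives the same merged result."""
    results = [merge(order) for order in permutations(configs)]
    return all(result == results[0] for result in results)


# Made-up per-wheel data.
wheel_a = {"example_ns :: feature": ["b", "a"]}
wheel_b = {"example_ns :: feature": ["c", "a"], "other_ns :: flag": ["on"]}
wheel_c = {"other_ns :: flag": ["off"]}

print(normalize(["b", "a"]) == normalize(["a", "b", "a"]))
print(merge_is_order_independent([wheel_a, wheel_b, wheel_c]))
```
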
frank hound
stone lodge
#

Wheel 2.0 discussion is on topic here, right? Or should we make another channel so this one can focus on wheel variants?

toxic wraith
#

Yes, absolutely

upbeat elm
#

@stone lodge I updated the 'wheel greater compression' repository to support an inner .tar.zst. This is an LLM written streaming extract in Rust. With this formulation a client might unpack the inner tarball while streaming the download; store the rest of the ZIP to a sparse file or use other ZIP tricks so that the inner .tar is not saved to disk; then deal with the much smaller metadata portion. If the ZIP header does not look like .data.tar.zst then it should do whatever it was doing before. "Normal" users can still figure it out with the unzip command. My set of 63 big/popular wheels as determined from pypi bigquery (shown in repository) shrinks from 646M to 453M.

lethal idol
#

Does this work better than just allowing the wheel to use .zst directly? Regardless, that's good news (and slightly better than the results I was seeing with very rough tests using LZMA)

upbeat elm
#

The point of putting an archive inside a ZIP is that the compression algorithm works across all the files in the inner archive, instead of each file individually. The advantage of ZIP's per-file compression is that you can read the files individually. Zstd is great because it compresses quickly and decompresses extremely quickly, so the total compress + send over network + decompress time is worthwhile. (Compare to a compression algorithm that is slower than downloading the uncompressed version of the file... we want to save time and space.) So if the metadata goes in the ZIP portion but the files go in the inner archive, we get good compression, quick metadata access, and follow most of the rules of wheel 1.
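
A quick way to see the cross-file effect with only the standard library, using zlib as a stand-in for zstd and synthetic, deliberately similar files. The data is invented and this says nothing about real wheel sizes:

```python
import zlib

# Synthetic "package": many modules sharing most of their content.
COMMON = "".join(
    f"def helper_{n}(value):\n    return value * {n} + {n * 7}\n" for n in range(40)
)
files = {f"pkg/mod_{i}.py": f"# module {i}\n{COMMON}".encode() for i in range(50)}


def per_file_size(files: dict) -> int:
    """Total size when every file is compressed on its own (ZIP-style)."""
    return sum(len(zlib.compress(data)) for data in files.values())


def solid_size(files: dict) -> int:
    """Size when all files go through one compression stream (inner archive)."""
    return len(zlib.compress(b"".join(files.values())))


print("per-file:", per_file_size(files))
print("solid:   ", solid_size(files))
```

The per-file variant has to re-learn the shared content for every file, while the single stream can refer back to what it has already seen.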

#

When unpacking, the archive layer should feed the installer an iterable of (name, file object) pairs, so that the part of the installer that decides where files are placed on disk doesn't need to know whether an inner archive is used.
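
A minimal sketch of that archive layer, assuming an inner `.data.tar.gz` as a stdlib-only stand-in for `.data.tar.zst`. The file names and the helpers `build_demo_wheel` and `iter_files` are hypothetical and unrelated to the Rust code in the repository:

```python
import io
import tarfile
import zipfile

INNER_SUFFIX = ".data.tar.gz"  # stand-in for ".data.tar.zst"


def build_demo_wheel() -> bytes:
    """Build a tiny in-memory ZIP with metadata plus an inner tarball."""
    inner = io.BytesIO()
    with tarfile.open(fileobj=inner, mode="w:gz") as tar:
        data = b"print('hi')\n"
        info = tarfile.TarInfo("demo/__init__.py")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w") as zf:
        zf.writestr("demo-1.0.dist-info/METADATA", "Name: demo\n")
        zf.writestr("demo-1.0" + INNER_SUFFIX, inner.getvalue())
    return outer.getvalue()


def iter_files(wheel: zipfile.ZipFile):
    """Yield (name, file object) pairs, hiding whether an inner archive exists.

    Each file object must be read before asking for the next pair, because
    the inner tarball is consumed as a forward-only stream.
    """
    for info in wheel.infolist():
        if info.filename.endswith(INNER_SUFFIX):
            with wheel.open(info) as raw:
                with tarfile.open(fileobj=raw, mode="r|gz") as tar:
                    for member in tar:
                        if member.isfile():
                            yield member.name, tar.extractfile(member)
        else:
            yield info.filename, wheel.open(info)


with zipfile.ZipFile(io.BytesIO(build_demo_wheel())) as wheel:
    seen = [(name, fileobj.read()) for name, fileobj in iter_files(wheel)]
print([name for name, _ in seen])
```

The consumer only ever sees names and readable objects, so the placement logic would stay the same for a plain wheel with no inner archive.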