GitHub
Hook to load package-provided dynamic libraries. Contribute to FFY00/dynamic-library development by creating an account on GitHub.
this is somewhat similar to pkgconf-pypi, it allows folks to register dynamic libraries via entrypoints, which will then be loaded to the global namespace at startup
so when you try to load an extension linking to libfoo.so, for example, the extension will load instead of throwing a linker error even though the library isn't on the linker search path, since we have already loaded a libfoo.so
you can take this as an example https://github.com/FFY00/dynamic-library/tree/main/tests/packages/register-library
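A minimal sketch of the idea described above, not dynamic-library's actual implementation. The group name `example.dynamic_libraries` is made up, and the sketch assumes each entry point resolves (via `ep.load()`) to a filesystem path for the library. The `eps` and `loader` parameters exist only so the function can be exercised without real libraries.

```python
import ctypes
from importlib.metadata import entry_points

# Hypothetical group name, chosen for this sketch only.
GROUP = "example.dynamic_libraries"


def load_registered(eps=None, loader=ctypes.CDLL):
    """Load every registered library and return {entry point name: handle}."""
    if eps is None:
        # Selecting by keyword requires Python 3.10 or newer.
        eps = entry_points(group=GROUP)
    handles = {}
    for ep in eps:
        # Assumption: the entry point resolves to a path to the library.
        path = ep.load()
        # RTLD_GLOBAL mirrors the "global namespace" wording above.
        handles[ep.name] = loader(str(path), mode=ctypes.RTLD_GLOBAL)
    return handles
```

Calling `load_registered()` once at startup corresponds to the "load everything that is registered" behavior of the demo.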
hm, ok
needs work still; for a demo loading all of the dynamic libraries that are registered is fine, but if this had broader adoption you would only want to load the things you actually depend on
my solution for this is I have an _init_foo.py that lives next to the .so, and does this loading thing that you're doing, and you just add a from . import _init_foo in your package's __init__.py
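Roughly what such an `_init_foo.py` could look like. This is a sketch under assumptions: the file naming scheme per platform and the helper names (`library_filename`, `load_sibling`) are illustrative, not taken from any of the linked projects.

```python
# _init_foo.py (hypothetical file placed next to the shared library)
import ctypes
import sys
from pathlib import Path


def library_filename(stem, platform=sys.platform):
    """Guess the platform-specific file name for a library called `stem`."""
    if platform == "win32":
        return f"{stem}.dll"
    if platform == "darwin":
        return f"lib{stem}.dylib"
    return f"lib{stem}.so"


def load_sibling(stem, here=None, loader=ctypes.CDLL):
    """Load the library that sits in the same directory as this file."""
    here = Path(__file__).parent if here is None else Path(here)
    return loader(str(here / library_filename(stem)))


# A real _init_foo.py would end with a module-level call such as:
# _lib = load_sibling("foo")
```

The package's `__init__.py` then only needs `from . import _init_foo` before importing the compiled extension.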
it would be nice if you could have something magically make that happen though via a custom loader or something
you also would need to support specific load orders if you had libraries that depended on other libraries
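One way to get a valid load order is a topological sort over the "links against" relation. A small sketch using the standard library's `graphlib` (Python 3.9+); the library names are made up for illustration.

```python
from graphlib import TopologicalSorter


def load_order(deps):
    """Return library names ordered so dependencies come first.

    `deps` maps each library to the set of libraries it links against.
    A dependency cycle raises graphlib.CycleError.
    """
    return list(TopologicalSorter(deps).static_order())


# Hypothetical example: libapp needs libfoo, which needs libbar.
order = load_order({"libapp": {"libfoo"}, "libfoo": {"libbar"}, "libbar": set()})
```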
the idea I was playing with at https://github.com/virtuald/hatch-pkgconf-meson was to add a variable to the .pc file indicating that a .py file existed that needed to be imported to load the library, and then the generator that was creating the _init_xxx.py would automatically add the imports for any libraries that it was dependent on
that's this function here: https://github.com/virtuald/hatch-pkgconf-meson/blob/main/hatch-mkpkgconf/src/hatch_mkpkgconf/plugin.py#L239
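To illustrate the generator idea (this is not the linked function, just a sketch of the same shape): given the library file name and the init modules of the libraries it depends on, emit a `_init_xxx.py` that imports the dependencies before loading its own library. The function name and the module names used below are hypothetical.

```python
def render_init_module(lib_filename, dep_modules):
    """Return source text for a hypothetical generated _init_xxx.py."""
    lines = ["import ctypes", "from pathlib import Path", ""]
    for mod in dep_modules:
        # Importing a dependency's init module loads that library first.
        lines.append(f"import {mod}  # noqa: F401 (loads a dependency first)")
    lines.append("")
    lines.append(
        f"_lib = ctypes.CDLL(str(Path(__file__).parent / {lib_filename!r}))"
    )
    return "\n".join(lines) + "\n"


source = render_init_module("libfoo.so", ["bar._init_bar"])
```

Because the dependency imports come first in the generated file, the load order falls out of the normal Python import order.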
having an entrypoint for this is neat, but since there's already an entrypoint for the .pc file, it feels like binding them together is probably simpler even if it's abusing the .pc variable mechanism
unless there was a custom loader that could leverage the entrypoint mechanism to notice that "this package got imported, but it needs this library that I know about, so let me just import that real quick"
but it's probably just simpler to have something in your __init__.py do load_library_by_entrypoint('foo')
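A sketch of what a `load_library_by_entrypoint` helper might look like. The function is only named in passing above, so everything here is an assumption: the group name is invented, and each entry point is assumed to resolve to a path for the library.

```python
import ctypes
from importlib.metadata import entry_points

GROUP = "example.dynamic_libraries"  # hypothetical group name


def load_library_by_entrypoint(name, eps=None, loader=ctypes.CDLL):
    """Load only the library registered under `name`."""
    if eps is None:
        eps = entry_points(group=GROUP)  # keyword selection needs Python 3.10+
    for ep in eps:
        if ep.name == name:
            # Assumption: the entry point resolves to a path to the library.
            return loader(str(ep.load()))
    raise LookupError(f"no library registered as {name!r}")
```

Unlike loading everything at startup, this only touches the library the importing package asks for.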
I guess the strength of the entrypoint mechanism is that the publisher of the library can move it around to arbitrary places and it still works; whereas in my example the import paths can't change otherwise the package breaks... but I'm not sure that dynamic behavior is really required, and most python code already depends on package paths anyways
we moved the repo to https://github.com/pypackaging-native
I expect to work on it this month, and my main focus will be to try to delay the library loading to import time
so, I've been rolling this around in my head, and I think the focus is not quite right
specifically, I think the problem that an enduser is going to care about is "if I import foo (where foo is some compiled python extension) I would like for all of its linked-to libraries to be loaded so that it actually loads"
from that perspective, I would use the entrypoint mechanism to mark foo as a thing that something special needs to happen when it gets imported
the other thing is that using entrypoints to register the libraries that you're loading isn't a good idea on macOS -- on macOS if the library you care about is not in the correct location, then it will not resolve the symbols to it
this is why delocate exists, so that you can set the loader path to point at the right library
and so what I mean by "using entrypoints is bad" in this instance is that the location where our foo module is in the filesystem is fixed, and so the things it depends on also have to be in a fixed location relative to it
an entrypoint would allow a package author to move the library around, which would break anything that depended on it in an unintuitive way
so
awhile ago I was thinking it would be really useful if python had some builtin thing where when import compiled_extension occurs that it checks to see if _compiled_extension.py exists and if it did then import that first
but I think you could probably accomplish that with an import hook of some kind?
probably instead of a _compiled_extension.py could just look for a compiled_extension.dynlib.json and then use that metadata to load the appropriate libraries
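A rough sketch of such an import hook, assuming a sidecar file named `<module>.dynlib.json` with a `"libraries"` list of paths relative to the module's directory. Both the file layout and the class name are made up for illustration. The finder only performs the loading as a side effect and then returns `None`, so the regular import machinery still does the actual import.

```python
import ctypes
import json
import sys
from importlib.abc import MetaPathFinder
from importlib.machinery import PathFinder
from pathlib import Path


class DynlibJsonFinder(MetaPathFinder):
    """Preload libraries listed in a sidecar JSON file before a module imports."""

    def __init__(self, loader=ctypes.CDLL):
        self._loader = loader
        self.handles = {}  # keep handles alive and avoid loading twice

    def find_spec(self, fullname, path=None, target=None):
        spec = PathFinder.find_spec(fullname, path, target)
        if spec is None or not spec.has_location:
            return None
        origin = Path(spec.origin)
        # Hypothetical sidecar name: <module>.dynlib.json next to the module.
        sidecar = origin.with_name(fullname.rpartition(".")[2] + ".dynlib.json")
        if sidecar.is_file():
            meta = json.loads(sidecar.read_text())
            for rel in meta.get("libraries", []):
                lib = str(origin.parent / rel)
                if lib not in self.handles:
                    self.handles[lib] = self._loader(lib)
        return None  # defer the actual import to the regular finders


def install(loader=ctypes.CDLL):
    """Put the finder at the front of sys.meta_path and return it."""
    finder = DynlibJsonFinder(loader)
    sys.meta_path.insert(0, finder)
    return finder
```

Note that this loads the libraries when the module is located rather than when it is executed, which is good enough for a sketch but may not be what a production hook wants.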
on macOS if the library you care about is not in the correct location, then it will not resolve the symbols to it
this works the same as linux, at least when using ELF
the major difference is that on macOS land, libraries generally include their full path in DT_NEEDED
but if you remove the path, like we generally do on linux, it works as expected
this is something that needs to be documented, because users need to take special care in this situation
not really because packages might exist on different paths
can you clarify the kind of situation you have in mind here?
yes, my plan was to move from loading everything at startup to an as-needed basis using an import hook
generally shared libraries on macOS are .dylib files though, not elfs?
yeah, I wasn't remembering, I did test mach-o and it behaves like that
ok. yeah, macOS is the annoying case. though, https://discuss.python.org/t/native-dependencies-in-other-wheels-how-i-do-it-but-maybe-we-can-standardize-something/23913/30 did some experimentation to move them around, and apparently you can do it if the ID is set consistently? I haven't tried it, but they published a repo at https://github.com/amol-/wheeldeps
For your interest, on OSX I was able to find a fairly satisfying solution by setting the libaries ids consistently. This is nearly the same as what I do on Linux using GitHub - amol-/consolidatewheels: Consolidate wheels libraries included by auditwheel but as you mentioned, on linux the linker is happy as far as the library has been loaded som...
yes, you need to set it consistently
my recommendation in the docs would be to just set it to the library name without the path
if dylibs can be moved around, then my comment is invalid
@next meteor Thanks for the comment at https://github.com/lmstudio-ai/venvstacks/issues/38#issuecomment-2764812377 - even the "that looks like it should work" is good feedback, since the topic is esoteric enough that most folks would barely make it past the start of the issue description before going "Nope, that's somebody else's problem"
Even more helpfully, I like the way you've set up the dynamic-library test suite, so I'll be able to use that as inspiration when designing venvstacks test cases that don't need to rely on enormous real world examples like pytorch
thank you so much for the comment! it's honestly refreshing seeing the appreciation for this kind of work, where we usually work months on end to make the user experience better, which by design, when it works, users don't notice it. then when something breaks, or we run into an edge case, we get reports and more stuff to fix.
TBH, my dayjob has been working on these kinds of complex issues, and I have been really struggling with motivation and burnout. this work isn't very visible to users, and projects often take months, or even years, to finish, and they only make a really small dent in the UX.
even inside my dayjob, it's difficult to make the work visible to management, to justify the resource allocation
I love the technical aspect of the work (the technical challenges), but putting all the necessary infrastructure in place, and engaging with the stakeholders needed for that, sucks a whole lot of my energy
at the end of the day, I am not building anything exciting for users, like uv, I am working on the backend, and trying to shoehorn solutions in an ecosystem that was not designed for them
sorry for the vent, the past couple months have been hard
I definitely appreciate the visibility problem (I'm lucky enough not to face it in my current role, but it's definitely been a complicating factor in some previous roles)
as someone who also likes build systems and behind the scenes tooling, I can also appreciate the thanklessness of the work.
x2
circling back to this, it seems that on macOS, to get the desired behavior, each dependency needs to be specified as @rpath/name.dylib, and if it is, then it will find previously loaded libraries just like Linux and Windows
from my testing, the name just needs to match, be it a full path or something arbitrary
I got it working by just specifying them as name.dylib
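For reference, a sketch of how the names could be rewritten with macOS's `install_name_tool` (`-id` sets a library's own install name, `-change` rewrites a dependency recorded in another binary). The helper names and paths are made up, and the functions only build the argument lists so they can be inspected before being passed to `subprocess.run`.

```python
from pathlib import Path


def set_id_command(dylib):
    """Command that sets a dylib's install name to its bare file name."""
    dylib = Path(dylib)
    return ["install_name_tool", "-id", dylib.name, str(dylib)]


def change_dep_command(binary, old_install_name):
    """Command that rewrites one recorded dependency to its bare file name."""
    new_name = Path(old_install_name).name
    return ["install_name_tool", "-change", old_install_name, new_name, str(binary)]


# Hypothetical paths, for illustration only.
id_cmd = set_id_command("pkg/lib/libfoo.dylib")
dep_cmd = change_dep_command("pkg/_ext.so", "@rpath/libfoo.dylib")
```

Running `otool -L` on the result is a quick way to check which names actually ended up in the binary.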
@next meteor I recently published the first pass of the documentation I was writing at https://native-lib-loader.readthedocs.io/en/latest/reference/index.html. I'm less interested in the code at this moment and more in getting all of the information that we know documented in one place (eventually it might make sense to merge this into pypackaging-native, but that's for later). I'd love for you or anyone else to take a look and submit any corrections/improvements/etc. I think you've tested on a broader set of OSes and platforms than I have, so you will be able to add quite a bit of information.
this problem is probably not particularly new to anyone here, but here's a particularly fun/annoying example of the limitations of the current definition of rpaths and runpaths: https://github.com/astral-sh/python-build-standalone/issues/619#issuecomment-2902499473
Oh good, there's more to the python-build-standalone story. I won't be able to look at that immediately, but I'll take a look when I can. I can work with the ucxx developers to get a solution in place if we need to fix it on that end
Yeah, I think if you can get ucxx to stop linking libpython that will actually be helpful, this is hairy enough that there are no good solutions on the python-build-standalone end (but I have several bad ones....).
No rush on my end, of course....
It shouldn't be a problem, as I mentioned on the python-build-standalone issue, the CMake to do that is slightly nontrivial since it requires some manual work to extract includes, that's all. Since we build for conda too we might be better off using the patchelf approach rather than branching our build, but in any case either should produce binaries with the result that we want
FYI, kicked off a DPO thread for this as well (see #wheelnext message)
Can I use this tool to bundle some DLLs that aren't Python modules with my library, but only load them if they're explicitly imported?