#importlib-resources
@proud bloom, would you have time to review? :)
are package resources supported by hatch/hatchling?
What do you mean? If you mean accessing package files through importlib.resources, that has no relation to hatch/hatchling. Any file in the wheel will be accessible that way.
Of course you can control which files end up in your wheel using your build tool, which is where hatchling’s configuration comes in if you use that.
Does that answer your question?
yes, the options to include them in sdists/wheels respectively are a little different, but resource files are normally supported
yep, the importlib.resources API should support any importable module
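A minimal sketch of what "any file in the wheel is accessible" looks like in practice. The standard-library `json` package is used purely as a stand-in for your own package (an assumption for illustration), and `files()` needs Python 3.9+.

```python
from importlib.resources import files

# `json` stands in for your own installed package; any importable
# package name works the same way, whichever build backend made the wheel.
resource = files("json") / "__init__.py"

# Read through the Traversable API rather than assuming a filesystem path.
text = resource.read_text(encoding="utf-8")
print(resource.name, len(text))
```

Whether a given data file is present at all is decided at build time by the build tool's include settings, not by this API.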
@proud bloom I think I remembered the thing that annoyed me about the importlib.resources files API (you probably don't remember what I'm talking about, but I was trying to remember this like a month ago)
it's this operation: https://github.com/python-jsonschema/jsonschema-specifications/blob/main/jsonschema_specifications/_core.py#L20-L34
i.e. "I have a directory with files, I want to recursively read all of them"
lemme know if I am still off base and it really can do that
humm, can't you write a simple recursive function that does this?
that's what that code does
but it's annoyingish that this is trivial with pathlib and not with importlib.resources
with the former it's just a genexp
not killer obviously, but a bit frustrating given how often I get to do that operation
if you are okay with just 2 levels, itertools.chain.from_iterable(c.iterdir() for c in t.iterdir()) should work
itertools.chain(*(c.iterdir() for c in t.iterdir()))
this is smaller
but preference I guess
I wouldn't be opposed to add/extend a glob functionality in the stdlib to work with traversables
sorry, maybe I didn't stress "non-uniform" properly
this will blow up when you mix files and directories
subdirectories
but this sounds good 🙂 I think I started on adding it somewhere here
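For the non-uniform case (files and subdirectories mixed at any depth), a small recursive generator is one way to do it. This is only a sketch: `walk_files` is a hypothetical helper name, not something importlib.resources provides, and the demo uses a temporary `pathlib.Path` tree because `Path` satisfies the Traversable protocol.

```python
import pathlib
import tempfile
from typing import Iterator


def walk_files(root) -> Iterator:
    """Yield every file under `root`, descending into subdirectories.

    `root` is anything implementing the Traversable protocol, e.g. the
    result of importlib.resources.files(). Only iterdir() and is_dir()
    are used, so nothing pathlib-specific is required.
    """
    for child in root.iterdir():
        if child.is_dir():
            yield from walk_files(child)
        else:
            yield child


# Demo on a throwaway tree that mixes files and nested directories.
with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    (base / "a.json").write_text("{}")
    (base / "sub" / "deeper").mkdir(parents=True)
    (base / "sub" / "b.json").write_text("{}")
    (base / "sub" / "deeper" / "c.json").write_text("{}")
    found = sorted(f.name for f in walk_files(base))
    print(found)
```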
in Python 3.12, when a module path is passed to files, it returns the path to the parent package. Is this intentional?
$ python3.11 -c "import importlib.resources; print(importlib.resources.files('importlib.resources.readers'))"
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "/nix/store/9qhyp2qkb4s5zi1vahj4bjjjmdin6jnz-python3-3.11.8/lib/python3.11/importlib/resources/_common.py", line 22, in files
return from_package(get_package(package))
^^^^^^^^^^^^^^^^^^^^
File "/nix/store/9qhyp2qkb4s5zi1vahj4bjjjmdin6jnz-python3-3.11.8/lib/python3.11/importlib/resources/_common.py", line 55, in get_package
raise TypeError(f'{package!r} is not a package')
TypeError: 'importlib.resources.readers' is not a package
[1]$ python3.12 -c "import importlib.resources; print(importlib.resources.files('importlib.resources.readers'))"
/nix/store/g4vnngnn4rpfs5hm2p4bw394pbdgk312-python3-3.12.2/lib/python3.12/importlib/resources
to answer my q, it appears to be intentional: https://github.com/python/importlib_resources/pull/258
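If you need the same result on versions before and after that change, one option is to resolve the containing package yourself instead of passing a module name to `files()`. A sketch, assuming Python 3.9+; `package_files_for` is a hypothetical helper and `json.decoder` is just an example module.

```python
import importlib.util
from importlib.resources import files


def package_files_for(name: str):
    """Return files() for the package that contains `name`.

    `name` may be a package or a plain module. spec.parent is the
    package itself for a package and the enclosing package for a module.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.parent:
        raise ModuleNotFoundError(f"no package found for {name!r}")
    return files(spec.parent)


root = package_files_for("json.decoder")  # a module, not a package
print(root / "decoder.py")
```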
it looks like arg-less files also relies on __file__ now, which definitely won't play nice with pyoxidizer: https://github.com/python/importlib_resources/commit/e15a6b79ba1b11e21476f7274e0f2fafd5054486
@ripe fern, okay, I have opened a few bpo tackling some of the issues I found in importlib.abc
it would be really helpful if you could look at https://github.com/python/cpython/pull/26272 as some of the other PRs will conflict with that, so it needs to go in first
thanks!
I think I may have been unclear. I do prefer to resolve issues in the backports (python/importlib_resources,metadata) first, as it's slower to move in cpython, the integration between bpo and github lacks some features, and syncing the changes is harder. The one exception is the docs for importlib_resources, which are in CPython only, so they don't have an equivalent in the backport.
okay, even for importlib.abc?
Well, yes, if that code derives from importlib_resources. Only some of importlib.abc is synced with importlib_resources.
okay
anyway, most of the bpo bugs I opened are for documentation or cpython specific
Cool.
IIRC I only opened one proposing to change something in importlib.abc (the __fspath__ one)
I am on my phone right now so I am not gonna check
but yeah
btw, I am usually available after 2PM UTC+1 if you want to have a call
sometimes that's easier
so feel free to ping me
is ResourceReader deprecated? The importlib docs say it's been superseded by TraversableResources (TraversableResources is not linked) but that refers readers back to ResourceReader and mentions a "files" interface which isn't documented? I assume the "files" interface is in fact importlib.resources.abc.Traversable and a resource reader is any class which inherits from TraversableResources and implements Traversable?
I think the plan is to eventually deprecate ResourceReader in favor of TraversableResources (or better, a TraversableReader protocol -- protocol with only files())
the files() protocol is https://github.com/python/cpython/blob/09eb81711597725f853e4f3b659ce185488b0d8c/Lib/importlib/abc.py#L429
essentially, it returns a Traversable
TraversableResources takes a files() implementation and provides the legacy ResourceReader protocol
this is documented in code, but not in the docs, I wanted to do that
the idea is that you shouldn't be implementing a ResourceReader directly
you should inherit from TraversableResources and just implement files()
do you have any question? I know this is a little bit messy
I suppose, this is all a little confusing. Is importlib.resources.files meant to replace all of the older functions or is it just for resources on disk?
everything
there are still some quirks on fully replacing the legacy API with files() as a base, but they are being worked on
though, I'd say it would be very unlikely for you to run into them
if I'm loading resources from a zip file or from memory, why would I want to think of them as "files"? The docs even say, "Resources are roughly akin to files inside directories, though it’s important to keep in mind that this is just a metaphor. Resources and packages do not have to exist as physical files and directories on the file system", and then goes on to call them "files"... There's also a contradiction in terms with files() meaning "any package resource returned by a loader" and as_file() meaning "any package resource returned by a loader, copied on disk"
Also with Traversable being a protocol and files() returning a Path object for packages on disk, we're running a serious risk of people using Path methods outside of the protocol
if I'm loading resources from a zip file or from memory, why would I want to think of them as "files"?
well, they are files
they just aren't on the file system, but they are files
as_file will write the traversable out to a temporary location on disk and return that path, with an optimization for pathlib.Path that returns the path directly
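A short sketch of `as_file()` in use, for code that really needs a filesystem path. The `json` package is only a stand-in for your own package, and Python 3.9+ is assumed.

```python
from importlib.resources import as_file, files

resource = files("json") / "__init__.py"  # `json` is only a stand-in

# as_file() is a context manager; the path is only guaranteed to exist
# inside the `with` block, since it may be a temporary copy.
with as_file(resource) as path:
    size = path.stat().st_size
    on_disk = path.is_file()

print(on_disk, size)
```

When the resource is already a real file, the path handed back is the original one, so treat it as read-only.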
Also with Traversable being a protocol and files() returning a Path object for packages on disk, we're running a serious risk of people using Path methods outside of the protocol
returning pathlib.Path is an implementation detail of the loader
ziploader should return zipfile.Path instead IIRC
I'm simply following the logic that I assume went into naming importlib.resources. Why draw a distinction between "resources" and "files", if we're gonna turn around and call them "files" several years down the line?
well, it's an implementation detail only to the people who understand it as such. Returning Path also raises other questions like, should you be able to navigate into sub-packages? Is that going to work with a zip Path and even then, is it a guarantee provided by the protocol?
You can have a package foo with an __init__.py and a folder foo/bar without an __init__.py so that importlib.resources.files('foo') / 'bar' / 'baz' works but importlib.resources.files('foo.bar') / 'baz' doesn't. I don't suppose that this is intentional
alright, thinking about it a bit more I can see how that might be desirable
I think the function name is unfortunate, but it's probably too late to change that. Reducing the surface of the objects returned by files() would probably be something worth pursuing however.
(I'm sorry if I sounded too critical, I appreciate all the work that's been put in importlib.resources.)
I still do not see the problem here
Resources are roughly akin to files inside directories, though it’s important to keep in mind that this is just a metaphor. Resources and packages do not have to exist as physical files and directories on the file system
^ the relevant bit there is "files inside directories", not "files"
perhaps it could be clarified, but what that is trying to say is that resources do not need to be physical files in the file-system
they are "roughly that"
but if I am loading a module from somewhere that does not have such concepts, let's say a db, it would be whatever kinda matches the metaphor in that context
maybe files would be better named as resources, but eh, we can't really do anything about that now
anyway, resources should be whatever matches the "files inside directories" metaphor in the place the module is being loaded from
well, it sounds like you do see the problem with how the same term's being used to mean two different things in the same context
in addition to "resources", which should now mean what, with the new API?
they're traversable files returned by the api
I know what they are meant to be, what I'm saying is that the new API dilutes the distinction between files and resources. I think we're going round in circles now so let's just leave it.
this is what you want, no?
well, is_resource(...) is part of the legacy API that files() is meant to replace. What I'd like is (a) some clarity around the terminology used in importlib.resources and (b) that we might consider replacing Path with a wrapper class. There isn't a specific issue that I'm having that I'm trying to solve. I want things to be easy to understand from a pedagogical perspective and I want to prevent mistakes that could be made from exposing Path objects. For example, you can now use Path.open to overwrite a resource but this is not something that anybody should ever do. Or you could use Path.symlink_to but that's not a method supported by Traversable. There exists a new class of errors which were previously impossible but are now possible. The new and old APIs are also indistinguishable in the docs save for the "New in version 3.9." note at the bottom. The nomenclature needs to be addressed and either files() or as_file(...) should be renamed. The docs should explain that the new API was introduced to allow traversing into resource containers which are not packages, and that that is the difference between path and as_file, the latter you can use to open a resource in a resource container which is not a package. They should also explain what the status of the legacy API is and make a recommendation to use files with as_file only going forward. Traversable needs to be documented. I'm not asking you to do any of these things, I'm trying to gauge whether you agree that they should be done :)
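To make suggestion (b) concrete, here is a sketch of what such a wrapper could look like. `ReadOnlyResource` is entirely hypothetical and not part of importlib.resources; it forwards only read-oriented methods from the Traversable protocol, so calls like `symlink_to` or opening for writing are simply not available.

```python
import pathlib
import tempfile


class ReadOnlyResource:
    """Hypothetical wrapper exposing only read-oriented Traversable methods."""

    def __init__(self, inner) -> None:
        self._inner = inner

    @property
    def name(self) -> str:
        return self._inner.name

    def is_dir(self) -> bool:
        return self._inner.is_dir()

    def is_file(self) -> bool:
        return self._inner.is_file()

    def iterdir(self):
        return (ReadOnlyResource(child) for child in self._inner.iterdir())

    def joinpath(self, *parts: str) -> "ReadOnlyResource":
        return ReadOnlyResource(self._inner.joinpath(*parts))

    __truediv__ = joinpath  # lets `wrapper / "child"` work like joinpath

    def read_bytes(self) -> bytes:
        return self._inner.read_bytes()

    def read_text(self, encoding: str = "utf-8") -> str:
        return self._inner.read_text(encoding=encoding)


with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    (base / "schema.json").write_text("{}", encoding="utf-8")
    res = ReadOnlyResource(base) / "schema.json"
    content = res.read_text()
    can_symlink = hasattr(res, "symlink_to")
    print(content, can_symlink)
```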
yeah, I think documentation should be improved
is there a class of Distribution that refers to a zip imported package?
Are you referring to importlib-metadata instead? AFAIK there's no Distribution class in importlib-resources
there is not indeed
@silk marsh no, Distribution in importlib-metadata uses pathlib-like objects
like zipfile.Path
oh ok, thanks
You'll get one of those if you use pep517.meta.load for a local package. It'll build the metadata in an in-memory zip and load that into a PathDistribution object.
the meta module is not explicitly deprecated though it probably should be, see https://github.com/pypa/pep517/issues/91#issuecomment-997624684
Is there a good way to statically type importlib_resources? I normally like targeting the oldest supported Python with mypy, but importlib.resources's type hints are way better than importlib_resources's. Things like Call to untyped function "joinpath" of "Traversable" show up when I target Python 3.6.
are you restricted in terms of using the same interpreter for type checking and running the code? You could use if TYPE_CHECKING to coerce mypy into using the type stubs from the stdlib...
No, interpreter is 3.10. Hmm, I could try that, as long as they are not gated by version. IIRC, though, the interface changed after the first release, so I'm using the 3.9+ interface and the backport otherwise. so that probably is gated?
you can't use Traversable etc. from stdlib if you're targeting the oldest supported Python
I'm using them from importlib_resources if python < 3.9. I think the suggestion was to tell mypy I'm using them from the stdlib anyway, but they are probably gated on Python version. At least the new interface should be.
yes, I was responding to @green citrus, they're version gated
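For reference, the usual shape of the "stdlib on 3.9+, backport otherwise" import looks like the sketch below. mypy follows `sys.version_info` checks, so which branch it sees depends on the targeted Python version, which is why the stdlib annotations are out of reach when targeting 3.6. The `json` lookup is only a demo.

```python
import sys

if sys.version_info >= (3, 9):
    # The files() API exists in the stdlib from 3.9 onward.
    from importlib.resources import files
else:
    # Older interpreters: fall back to the PyPI backport.
    from importlib_resources import files

print(files("json").joinpath("__init__.py").name)
```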
I think the core problem is importlib_resources' type annotations are not nearly as good as typeshed.
I think it's only a few methods that are annotated
they are fine now but the last version that supports 3.6 is not
the type hints could possibly be backported
That is the main reason why I never use these interfaces 😅 (I know it's bad, but I prefer to have fewer dependencies anyway - so I am stuck with the legacy API)
I prefer the legacy API anyway 😛
3.6 is EOL now, so I would just drop support 🤷
I don't consider using a backport of a stdlib to be a real dependency - it goes away over time.
For context, this is coming up because we just dropped 2.7 & 3.5 so I'm improving the type checking... Not sure I'd be too popular for dropping 3.6 less than a week (and no release) later 🤦
Though I'd be totally fine targeting 3.7 with typechecking, I thought I tried that. Can check, will report back
that won't help if you're using the traversable API, it was added in 3.9
Yes, but it is usable from 3.7, so the type annotations should work for 3.7. And you are correct, targeting 3.7 does not get annotations for Traversable from importlib_resources.
Right, because there aren't any in importlib_resources. It's only annotated in the typeshed for the stdlib module
Plus the code you linked to was from main. 🙂
well, I can't imagine that they'll have removed the type annotations on main ;P
Fun fact: for static type checking, you can't drop a Python version nicely, since the running Python is newer than the static target. So dropping 3.6 could break 3.6 type checking even with Requires-Python. Python 3.6 is still the third most popular version of Python, and default on CentOS 7 & 8 and Ubuntu 18.04, and three times more popular than Python 3.9.
PS: I'm not against dropping 3.6, but importlib-resources is sort of a backport-like package, and core stuff dropping support before libraries drop support can cause issues for the libraries
(Jan 10 stats)
Would adding types to the Protocol be a good PR? Assuming there's not a reason they were left off?
I don't think there's a particular reason that they were left out other than the repo not being typed at the time the protocol was added
yes
you can copy from the stdlib
That's my plan 😉
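As a rough idea of what an annotated protocol could look like, here is a sketch. `TypedTraversable` is a hypothetical name, and the signatures are an approximation for illustration, not a copy of the stdlib or typeshed definitions.

```python
import pathlib
from typing import IO, Any, Iterator, Optional, Protocol, runtime_checkable


@runtime_checkable
class TypedTraversable(Protocol):
    """Hypothetical, fully annotated stand-in for Traversable."""

    @property
    def name(self) -> str: ...

    def iterdir(self) -> Iterator["TypedTraversable"]: ...

    def is_dir(self) -> bool: ...

    def is_file(self) -> bool: ...

    def joinpath(self, *descendants: str) -> "TypedTraversable": ...

    def __truediv__(self, child: str) -> "TypedTraversable": ...

    def open(self, mode: str = "r", *args: Any, **kwargs: Any) -> IO[Any]: ...

    def read_bytes(self) -> bytes: ...

    def read_text(self, encoding: Optional[str] = None) -> str: ...


# The runtime check only looks for member names, not signatures.
ok = isinstance(pathlib.Path("."), TypedTraversable)
print(ok)
```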