#importlib-resources


rigid zenith
#

@proud bloom, would you have time to review? :)

twin ore
#

are package resources supported by hatch/hatchling?

digital loom
#

What do you mean? If you mean accessing package files through importlib.resources, that has no relation to hatch/hatchling. Any file in the wheel will be accessible that way.

Of course you can control which files end up in your wheel using your build tool, which is where hatchling’s configuration comes in if you use that.

Does that answer your question?

native oasis
proud bloom
#

yep, the importlib.resources API should support any importable module

pearl storm
#

@proud bloom I think I remembered the thing that annoyed me about the importlib.resources files API (you probably don't remember what I'm talking about, but I was trying to remember this like a month ago)

#

i.e. "I have a directory with files, I want to recursively read all of them"

#

lemme know if I am still off base and it really can do that

proud bloom
#

humm, can't you write a simple recursive function that does this?

pearl storm
#

that's what that code does

#

but it's annoyingish that this is trivial with pathlib and not with importlib.resources

#

with the former it's just a genexp

#

not killer obviously, but a bit frustrating given how often I get to do that operation

proud bloom
#

if you are okay with just 2 levels, itertools.chain.from_iterable(c.iterdir() for c in t.iterdir()) should work

#
itertools.chain(*(c.iterdir() for c in t.iterdir()))
#

this is smaller

#

but preference I guess

#

I wouldn't be opposed to add/extend a glob functionality in the stdlib to work with traversables

pearl storm
#

this will blow up when you mix files and directories

#

subdirectories
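
For reference, a minimal sketch of the kind of recursive helper being discussed. It uses only `iterdir()`, `is_dir()` and `is_file()` from the Traversable protocol, so it copes with files and subdirectories mixed at any depth. The name `iter_files` is made up for this example and is not part of importlib.resources.

```python
from importlib.resources import files


def iter_files(root):
    """Yield every file below root, descending into subdirectories."""
    for entry in root.iterdir():
        if entry.is_dir():
            # Recurse instead of assuming a fixed number of levels.
            yield from iter_files(entry)
        elif entry.is_file():
            yield entry


# Usage: names of all files shipped inside the stdlib "json" package.
json_files = sorted(entry.name for entry in iter_files(files("json")))
```

With pathlib the same walk is a one-line `rglob("*")` generator expression, which is the convenience that Traversable does not offer.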

pearl storm
rigid zenith
#

in Python 3.12, when a module path is passed to files, it returns the path to the parent package. Is this intentional?

#
$ python3.11 -c "import importlib.resources; print(importlib.resources.files('importlib.resources.readers'))"
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "/nix/store/9qhyp2qkb4s5zi1vahj4bjjjmdin6jnz-python3-3.11.8/lib/python3.11/importlib/resources/_common.py", line 22, in files
    return from_package(get_package(package))
                        ^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/9qhyp2qkb4s5zi1vahj4bjjjmdin6jnz-python3-3.11.8/lib/python3.11/importlib/resources/_common.py", line 55, in get_package
    raise TypeError(f'{package!r} is not a package')
TypeError: 'importlib.resources.readers' is not a package
[1]$ python3.12 -c "import importlib.resources; print(importlib.resources.files('importlib.resources.readers'))"
/nix/store/g4vnngnn4rpfs5hm2p4bw394pbdgk312-python3-3.12.2/lib/python3.12/importlib/resources
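
Code that needs the same result on both versions could resolve the containing package itself before calling `files()`. The sketch below is one possible approach, not something the library provides; `containing_package` and `package_files` are hypothetical names. It relies on `ModuleSpec.parent`, which is the package a module lives in (or the module's own name when it is already a package).

```python
from importlib.resources import files
from importlib.util import find_spec


def containing_package(name):
    """Return the package that owns the module or package called name."""
    spec = find_spec(name)
    if spec is None or not spec.parent:
        # Unknown module, or a top-level module that is not in any package.
        raise ValueError(f"{name!r} does not belong to a package")
    return spec.parent


def package_files(name):
    """Call files() on the owning package, whatever kind of name was given."""
    return files(containing_package(name))
```

For example, `package_files("json.decoder")` and `package_files("json")` both point at the `json` package.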
rigid zenith
proud bloom
#

@ripe fern, okay, I have opened a few bpo tackling some of the issues I found in importlib.abc

#

thanks!

ripe fern
#

I think I may have been unclear. I do prefer to resolve issues in the backports (python/importlib_resources,metadata) first, as it's slower to move in cpython, the integration between bpo and github lacks some features, and syncing the changes is harder. The one exception is the docs for importlib_resources are in CPython only, so they don't have an equivalent in the backport.

proud bloom
#

okay, even for importlib.abc?

ripe fern
#

Well, yes, if that code derives from importlib_resources. Only some of importlib.abc is synced with importlib_resources.

proud bloom
#

okay

#

anyway, most of the bpo bugs I opened are for documentation or cpython specific

ripe fern
#

Cool.

proud bloom
#

IIRC I only opened one proposing to change something in importlib.abc (the __fspath__ one)

#

I am on my phone right now so I am not gonna check

#

but yeah

proud bloom
#

btw, I am usually available after 2PM UTC+1 if you want to have a call

#

sometimes that's easier

#

so feel free to ping me

rigid zenith
#

is ResourceReader deprecated? The importlib docs say it's been superseded by TraversableResources (TraversableResources is not linked) but that refers readers back to ResourceReader and mentions a "files" interface which isn't documented? I assume the "files" interface is in fact importlib.resources.abc.Traversable and a resource reader is any class which inherits from TraversableResources and implements Traversable?

proud bloom
#

I think the plan is to eventually deprecate ResourceReader in favor of TraversableResources (or better, a TraversableReader protocol -- protocol with only files())

#

essentially, it returns a Traversable

#

TraversableResources takes a files() implementation and provides the legacy ResourceReader protocol

#

this is documented in code, but not in the docs, I wanted to do that

#

the idea is that you shouldn't be implementing a ResourceReader directly

#

you should inherit from TraversableResources and just implement files()

#

do you have any question? I know this is a little bit messy

rigid zenith
#

I suppose, this is all a little confusing. Is importlib.resources.files meant to replace all of the older functions or is it just for resources on disk?

proud bloom
#

everything

#

there are still some quirks on fully replacing the legacy API with files() as a base, but they are being worked on

#

though, I'd say it would be very unlikely for you to run into them

rigid zenith
#

if I'm loading resources from a zip file or from memory, why would I want to think of them as "files"? The docs even say, "Resources are roughly akin to files inside directories, though it’s important to keep in mind that this is just a metaphor. Resources and packages do not have to exist as physical files and directories on the file system", and then goes on to call them "files"... There's also a contradiction in terms with files() meaning "any package resource returned by a loader" and as_file() meaning "any package resource returned by a loader, copied on disk"

rigid zenith
#

Also with Traversable being a protocol and files() returning a Path object for packages on disk, we're running a serious risk of people using Path methods outside of the protocol

proud bloom
#

if I'm loading resources from a zip file or from memory, why would I want to think of them as "files"?

#

well, they are files

#

they just aren't on the file system, but they are files

#

as_file will read the traversable to an empty directory and return that path, having an optimization on pathlib.Path to return the path directly

#

Also with Traversable being a protocol and files() returning a Path object for packages on disk, we're running a serious risk of people using Path methods outside of the protocol

#

returning pathlib.Path is an implementation detail of the loader

#

ziploader should return zipfile.Path instead IIRC
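
A small sketch of the two cases, using an in-memory zip as a stand-in for a resource that is not on the file system. The helper names `read_via_as_file` and `make_zip_resource` are invented for the example; `as_file`, `zipfile.Path` and `pathlib.Path` are the real stdlib pieces.

```python
import io
import zipfile
from importlib.resources import as_file


def read_via_as_file(traversable):
    """Read a resource through a real filesystem path."""
    with as_file(traversable) as path:
        # Inside the block, path is a pathlib.Path that exists on disk.
        return path.read_text(encoding="utf-8")


def make_zip_resource(name, text):
    """Build an in-memory zip and return a zipfile.Path to one member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(name, text)
    # Reopen in read mode so the member can be read back.
    return zipfile.Path(zipfile.ZipFile(buffer), name)
```

For a `zipfile.Path`, the path handed out by `as_file` is a temporary copy that is cleaned up when the block exits; for a `pathlib.Path` it is the original path.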

rigid zenith
# proud bloom well, they are files

I'm simply following the logic that I assume went into naming importlib.resources. Why draw a distinction between "resources" and "files", if we're gonna turn around and call them "files" several years down the line?

rigid zenith
#

You can have a package foo with an __init__.py and a folder foo/bar without an __init__.py so that importlib.resources.files('foo') / 'bar' / 'baz' works but importlib.resources.files('foo.bar') / 'baz' doesn't. I don't suppose that this is intentional

proud bloom
#

I think it is

#

those are just "sub-resources" of the package
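
A sketch of the layout in question, built in a scratch directory. The package name `demo_subres_pkg` and both helper names are made up; only the first access pattern (traversing from the parent package) is exercised, since what `files('foo.bar')` does for a folder without `__init__.py` may depend on the Python version.

```python
import importlib
import pathlib
import sys
from importlib.resources import files


def make_demo_package(root, package="demo_subres_pkg"):
    """Create <root>/<package>/__init__.py and <root>/<package>/bar/baz."""
    pkg_dir = pathlib.Path(root) / package
    (pkg_dir / "bar").mkdir(parents=True)  # note: no bar/__init__.py
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "bar" / "baz").write_text("sub-resource contents")
    return package


def read_sub_resource(root, package="demo_subres_pkg"):
    """Read bar/baz by traversing down from the parent package."""
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    try:
        return (files(package) / "bar" / "baz").read_text()
    finally:
        sys.path.remove(str(root))
```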

rigid zenith
#

alright, thinking about it a bit more I can see how that might be desirable

#

I think the function name is unfortunate, but it's probably too late to change that. Reducing the surface of the objects returned by files() would probably be something worth pursuing however.

#

(I'm sorry if I sounded too critical, I appreciate all the work that's been put in importlib.resources.)

proud bloom
#

I still do not see the problem here

#

Resources are roughly akin to files inside directories, though it’s important to keep in mind that this is just a metaphor. Resources and packages do not have to exist as physical files and directories on the file system

#

^ the relevant bit there is "files inside directories", not "files"

#

perhaps it could be clarified, but what that is trying to say is that resources do not need to be physical files in the file-system

#

they are "roughly that"

#

but if I am loading a module from somewhere that does not have such concepts, let's say a db, it would be whatever kinda matches the metaphor in that context

#

maybe files would be better named as resources, but eh, we can't really do anything about that now

#

anyway, resources should be whatever matches the "files inside directories" metaphor in the place the module is being loaded from

rigid zenith
#

well, it sounds like you do see the problem with how the same term's being used to mean two different things in the same context

#

in addition to "resources", which should now mean what, with the new API?

proud bloom
#

they're traversable files returned by the api

rigid zenith
#

I know what they are meant to be, what I'm saying is that the new API dilutes the distinction between files and resources. I think we're going round in circles now so let's just leave it.

proud bloom
#

this is what you want, no?

rigid zenith
#

well, is_resource(...) is part of the legacy API that files() is meant to replace. What I'd like is (a) some clarity around the terminology used in importlib.resources and (b) that we might consider replacing Path with a wrapper class. There isn't a specific issue that I'm having that I'm trying to solve. I want things to be easy to understand from a pedagogical perspective and I want to prevent mistakes that could be made from exposing Path objects. For example, you can now use Path.open to overwrite a resource but this is not something that anybody should ever do. Or you could use Path.symlink_to but that's not a method supported by Traversable. There exists a new class of errors which were previously impossible but are now possible. The new and old APIs are also indistinguishable in the docs save for the "New in version 3.9." note at the bottom. The nomenclature needs to be addressed and either files() or as_file(...) should be renamed. The docs should explain that the new API was introduced to allow traversing into resource containers which are not packages, and that that is the difference between path and as_file, the latter you can use to open a resource in a resource container which is not a package. They should also explain what the status of the legacy API is and make a recommendation to use files with as_file only going forward. Traversable needs to be documented. I'm not asking you to do any of these things, I'm trying to gauge whether you agree that they should be done :)
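
To make point (b) concrete, here is a rough sketch of what such a wrapper could look like. `ReadOnlyResource` is hypothetical and not part of importlib.resources; it forwards only the Traversable methods and refuses write modes, so calls such as `symlink_to` or `open("w")` fail early instead of silently working on disk-backed packages.

```python
class ReadOnlyResource:
    """Expose only the Traversable surface of a wrapped object."""

    def __init__(self, inner):
        self._inner = inner

    @property
    def name(self):
        return self._inner.name

    def is_dir(self):
        return self._inner.is_dir()

    def is_file(self):
        return self._inner.is_file()

    def iterdir(self):
        return (ReadOnlyResource(child) for child in self._inner.iterdir())

    def joinpath(self, *descendants):
        return ReadOnlyResource(self._inner.joinpath(*descendants))

    def __truediv__(self, child):
        return self.joinpath(child)

    def open(self, mode="r", *args, **kwargs):
        # The protocol only promises text and binary reading.
        if mode not in ("r", "rb"):
            raise ValueError(f"resources are read-only, got mode {mode!r}")
        return self._inner.open(mode, *args, **kwargs)

    def read_bytes(self):
        return self._inner.read_bytes()

    def read_text(self, encoding=None):
        return self._inner.read_text(encoding=encoding)
```

Usage would be along the lines of `ReadOnlyResource(files("json")) / "__init__.py"`.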

proud bloom
#

yeah, I think documentation should be improved

silk marsh
#

is there a class of Distribution that refers to a zip imported package?

vivid scarab
#

Are you referring to importlib-metadata instead? AFAIK there's no Distribution class in importlib-resources

proud bloom
#

there is not indeed

#

@silk marsh no, Distribution in importlib-metadata uses pathlib-like objects

#

like zipfile.Path

ripe fern
#

You'll get one of those if you use pep517.meta.load for a local package. It'll build the metadata in an in-memory zip and load that into a PathDistribution object.

rigid zenith
upbeat locust
#

Is there a good way to statically type importlib_resources? I normally like targeting oldest supported Python with mypy, but importlib.resources's type hints are way better than importlib_resources'. Things like 'Call to untyped function "joinpath" of "Traversable"' and such show up when I target Python 3.6.

green citrus
#

are you restricted in terms of using the same interpreter for type checking and running the code? You could use if TYPE_CHECKING to coerce mypy into using the type stubs from the stdlib...

upbeat locust
#

No, interpreter is 3.10. Hmm, I could try that, as long as they are not gated by version. IIRC, though, the interface changed after the first release, so I'm using the 3.9+ interface and the backport otherwise. so that probably is gated?

rigid zenith
#

you can't use Traversable etc. from stdlib if you're targeting the oldest supported Python

upbeat locust
#

I'm using them from importlib_resources if python < 3.9. I think the suggestion was to tell mypy I'm using them from the stdlib anyway, but they are probably gated on Python version. At least the new interface should be.
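
The setup described here usually looks like the sketch below: a `sys.version_info` check that both the interpreter and mypy understand, so the checker follows whichever branch matches its target version. `read_resource_text` is just an illustrative helper, and the backport branch assumes `importlib_resources` is installed on the older interpreters.

```python
import sys

if sys.version_info >= (3, 9):
    from importlib.resources import files
else:
    # Third-party backport; only needed (and installed) on older Pythons.
    from importlib_resources import files


def read_resource_text(package, name):
    """Read a text resource through whichever files() was imported."""
    return (files(package) / name).read_text(encoding="utf-8")
```

When mypy targets an older version it analyses the `else` branch, which is why the backport's own annotations matter.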

rigid zenith
#

yes, I was responding to @green citrus, they're version gated

upbeat locust
#

I think the core problem is importlib_resources' type annotations are not nearly as good as typeshed.

rigid zenith
#

I think it's only a few methods that are annotated

proud bloom
#

they are fine now but the last version that supports 3.6 is not

#

the type hints could possibly be backported

rigid zenith
#

none of this stuff is annotated

green citrus
rigid zenith
#

I prefer the legacy API anyway 😛

proud bloom
#

3.6 is EOL now, so I would just drop support 🤷

upbeat locust
#

I don't consider using a backport of a stdlib to be a real dependency - it goes away over time.

#

For context, this is coming up because we just dropped 2.7 & 3.5 so I'm improving the type checking... Not sure I'd be too popular for dropping 3.6 less than a week (and no release) later 🤦

#

Though I'd be totally fine targeting 3.7 with typechecking, I thought I tried that. Can check, will report back

rigid zenith
#

that won't help if you're using the traversable API, it was added in 3.9

upbeat locust
#

Yes, but it is usable from 3.7, so the type annotations should work for 3.7. And you are correct, targeting 3.7 does not get annotations for Traversable from importlib_resources.

rigid zenith
#

Right, because there aren't any in importlib_resources. It's only annotated in the typeshed for the stdlib module

upbeat locust
#

Plus the code you linked to was from main. 🙂

rigid zenith
#

well, I can't imagine that they'll have removed the type annotations on main ;P

upbeat locust
#

Fun fact: for static type checking, you can't drop a Python version nicely, since the running Python is newer than the static target. So dropping 3.6 could break 3.6 type checking even with Requires-Python. Python 3.6 is still the third most popular version of Python, and default on CentOS 7 & 8 and Ubuntu 18.04, and three times more popular than Python 3.9.

PS: I'm not against dropping 3.6, but importlib-resources is sort of a backport-like package, and core stuff dropping support before libraries drop support can cause issues for the libraries

#

(Jan 10 stats)

#

Would adding types to the Protocol be a good PR? Assuming there's not a reason they were left off?

rigid zenith
#

I don't think there's a particular reason that they were left out other than the repo not being typed at the time the protocol was added

proud bloom
#

you can copy from the stdlib

upbeat locust
#

That's my plan 😉
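
As a rough idea of what an annotated protocol might look like: the sketch below is an approximation written for this example, not a copy of the stdlib or typeshed definitions, and `TypedTraversable` is a hypothetical name. It uses `typing.Protocol`, which needs Python 3.8 or later (older versions would need `typing_extensions`).

```python
from typing import IO, Any, Iterator, Optional, Protocol, runtime_checkable


@runtime_checkable
class TypedTraversable(Protocol):
    """Approximate, annotated version of the Traversable surface."""

    @property
    def name(self) -> str: ...

    def iterdir(self) -> Iterator["TypedTraversable"]: ...

    def is_dir(self) -> bool: ...

    def is_file(self) -> bool: ...

    def joinpath(self, *descendants: str) -> "TypedTraversable": ...

    def __truediv__(self, child: str) -> "TypedTraversable": ...

    def open(self, mode: str = "r", *args: Any, **kwargs: Any) -> IO[Any]: ...

    def read_bytes(self) -> bytes: ...

    def read_text(self, encoding: Optional[str] = None) -> str: ...
```

With annotations like these in place, a call such as `joinpath` would no longer count as a call to an untyped function.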