I'm trying to make a library out of a Python project I don't own. The project has the following directory layout:

.
├── MANIFEST.in
├── pyproject.toml
└── src
    ├── all.py
    ├── the.py
    └── sources.py

In pyproject.toml I have:

[tool.setuptools]
packages = ["mypkg"]

[tool.setuptools.package-dir]
mypkg = "src"

The problem I'm facing is that when I build and install this package, I can't use it, because the author imports things without the mypkg prefix in the various source files.

For example, in all.py:

from the import SomeThing

Since I don't own the package, I can't go and modify all the sources, but I still want to be able to build a library from it by just adding MANIFEST.in and pyproject.toml.

Is it possible to somehow instruct setuptools to build a package that won't litter site-packages with all the sources while still allowing them to be imported without the mypkg prefix?

metatoaster
evading
  • Short answer: no, they have to be at the appropriate directory (i.e. `site-packages`) level. Longer answer: work around this limitation by implementing an [import hook](https://stackoverflow.com/questions/54456352/import-hooks-in-python) in your package that intercepts the relevant imports so that they resolve to the modules installed at the `site-packages/mypkg` location. Note that using import hooks won't necessarily fix the global namespace pollution problem, unless you can figure out a way to make those imports available only to the modules from `mypkg`. – metatoaster Oct 12 '22 at 09:01
  • Thanks @metatoaster! If you give the answer as an answer I can mark it as accepted. – evading Oct 13 '22 at 14:40

1 Answer


It isn't possible without shipping a custom import hook with the package. The hook takes the form of a module inside the package, and it must be imported before any of the unprefixed imports run (e.g. at the top of src/all.py).

src/mypkgimp.py

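import sys
import importlib
import importlib.abc   # submodules are not guaranteed to be loaded by "import importlib"
import importlib.util

class MyPkgLoader(importlib.abc.MetaPathFinder, importlib.abc.Loader):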
    def find_spec(self, name, path=None, target=None):
        # list the modules that should be importable without the mypkg prefix
        if name in ['sources', 'the']:
            return importlib.util.spec_from_loader(name, self)
        return None

    def create_module(self, spec):
        # Uncomment if "normal" imports should have precedence
        # try:
        #     sys.meta_path = [x for x in sys.meta_path[:] if x is not self]
        #     return importlib.import_module(spec.name)
        # except ImportError:
        #     pass
        # finally:
        #     sys.meta_path = [self] + sys.meta_path

        # Otherwise, this will unconditionally shadow normal imports
        module = importlib.import_module('.' + spec.name, 'mypkg')
        # Final step: inject the module to the "shortened" name
        sys.modules[spec.name] = module
        return module

    def exec_module(self, module):
        # nothing to do: the module was already executed in create_module
        pass

if not hasattr(sys, 'frozen'):
    sys.meta_path = [MyPkgLoader()] + sys.meta_path

Yes, the above uses different methods from those described in the thread I linked in my comment, because importlib deprecated those methods in Python 3.10; refer to the importlib documentation for details.
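As an aside, the hard-coded name list in find_spec could be generated instead of maintained by hand. The following is a minimal sketch, assuming the package directory contains only plain modules; submodule_names and its exclude parameter are hypothetical names, not part of setuptools or of the original project. It uses the standard pkgutil.iter_modules to enumerate the modules in a directory, and builds a throwaway copy of the layout from the question so that it can be run on its own:

```python
import pkgutil
import tempfile
from pathlib import Path


def submodule_names(package_dir, exclude=("mypkgimp",)):
    """Return the module names found directly in package_dir, minus exclusions.

    Hypothetical helper: the hook module itself is excluded by default so that
    it is never aliased to a bare top-level name.
    """
    return sorted(
        info.name
        for info in pkgutil.iter_modules([str(package_dir)])
        if info.name not in exclude
    )


# Demo against a throwaway directory that mimics the layout in the question.
demo_dir = Path(tempfile.mkdtemp()) / "src"
demo_dir.mkdir()
for filename in ("all.py", "the.py", "sources.py", "mypkgimp.py"):
    (demo_dir / filename).write_text("")

names = submodule_names(demo_dir)
print(names)
```

Inside mypkgimp.py the directory would presumably be the one containing the hook itself (e.g. the parent of __file__). Whether every module should be aliased, all included, remains a judgment call, since each extra name is one more that can shadow a "real" import.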

Anyway, for the demo, put some dummy classes in the modules:

src/the.py

class SomeThing: ...

src/sources.py

class Source: ...

Now, modify src/all.py to have the following:

import mypkg.mypkgimp
from the import SomeThing

Example usage:

>>> from sources import Source
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'sources'
>>> from mypkg import all
>>> all.SomeThing
<class 'mypkg.the.SomeThing'>
>>> from sources import Source
>>> Source
<class 'mypkg.sources.Source'>
>>> from sources import Error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Error' from 'mypkg.sources' (/tmp/mypkg/src/sources.py)

Note how the import initially didn't work, but once mypkg.all had been imported, the sources import works globally. Care may therefore be needed to avoid shadowing "real" imports; the commented-out block in create_module shows how to try the "default"[*] import mechanism first.
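To illustrate the precedence point, here is a sketch of a variant hook that asks the other finders on sys.meta_path first and only falls back to the mypkg copy when no "real" module exists, without swapping sys.meta_path in and out. AliasFinder is a hypothetical name, and the demo builds a throwaway mypkg (with a colorsys.py that deliberately clashes with the standard library) in a temporary directory purely so that it runs on its own:

```python
import importlib
import importlib.abc
import importlib.util
import sys
import tempfile
from pathlib import Path


class AliasFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Map bare names such as 'the' onto 'mypkg.the' unless a real module exists."""

    def __init__(self, package, names):
        self.package = package
        self.names = frozenset(names)

    def find_spec(self, name, path=None, target=None):
        if name not in self.names:
            return None
        # Ask every other finder first so that "real" modules keep precedence.
        for finder in list(sys.meta_path):
            if finder is self or not hasattr(finder, "find_spec"):
                continue
            if finder.find_spec(name, path, target) is not None:
                return None  # a real module exists; let the normal machinery load it
        return importlib.util.spec_from_loader(name, self)

    def create_module(self, spec):
        # Reuse the packaged module object under the short name.
        return importlib.import_module("." + spec.name, self.package)

    def exec_module(self, module):
        pass  # already executed by the import in create_module


# Demo setup: a throwaway package, assumed here only for illustration.
root = Path(tempfile.mkdtemp())
pkg = root / "mypkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "the.py").write_text("class SomeThing: ...\n")
(pkg / "colorsys.py").write_text("SHADOWED = True\n")  # clashes with the stdlib

sys.path.insert(0, str(root))
importlib.invalidate_caches()
sys.meta_path.insert(0, AliasFinder("mypkg", ["the", "colorsys"]))

import the        # no real module called "the": resolved to mypkg.the
import colorsys   # a real module exists: the standard-library one wins

print(the.SomeThing)
print(hasattr(colorsys, "SHADOWED"))
```

With this arrangement the bare name the is bound to the same module object as mypkg.the, while colorsys still refers to the standard-library module. The trade-off is that every aliased import pays for an extra lookup through the other finders.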

If you want the module names to look different (i.e. without the mypkg. prefix), that would be a separate question, as code typically doesn't check its own module name for functionality. The output above also shows that the mypkg namespace is still used implicitly. Changing the actual name is more akin to a module relocation; it can be done, but it is a bit more complicated, and this answer is long enough as it is.

[*] "default" as in not including behaviors introduced by this custom import hook; other import hooks may do their own weird shenanigans.

metatoaster