Interest in preprocessing utilities for multifile model uploads

Hugging Face Forums [Unofficial] May 25, 2026
Pitch

Hugging Face follows a ‘single file’ paradigm that can catch developers off guard. All relevant classes must be located in a single file, called huggingface.py. Some support for recursive transfer has been included, but it is not thorough, and there are bugs that make loading locally behave differently from loading remotely. Hugging Face has developed around the single-file assumption, and it cannot easily be bypassed.

This significantly lowers code quality for Hugging Face projects: everything has to live in one file, which makes development messy. In theory, a staging step that inlines relative dependencies into one file solves this, but the existing built-in solutions do not handle recursive directory traversal or comment preservation well. I have built a tool that inlines a multifile project into a single huggingface.py file instead. Is the community interested?

Details

In theory, relative imports can be inlined. As long as a project is structured so that its dependency graph is a directed acyclic graph (DAG), one can replace each relative import statement with the source of the module being imported, as a linking step. Unless a namespace collision occurs, the merged code then simply runs.
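To make the linking idea concrete, here is a minimal sketch rather than the actual inliner. It assumes a hypothetical in-memory project (a dict from module name to source text) and only recognizes single-line imports of the form `from .module import name`. The names `inline`, `REL_IMPORT`, and `sources` are made up for illustration. It inlines each dependency once and raises if the graph is not a DAG.

```python
import re

# Deliberately naive: matches only single-line `from .module import ...`.
REL_IMPORT = re.compile(r"^from \.(\w+) import .+$")

def inline(name, sources, done=None, stack=()):
    """Return the source of `name` with relative imports replaced by module bodies."""
    if done is None:
        done = set()
    if name in stack:
        # Revisiting a module that is still being expanded means a cycle.
        raise ValueError("import cycle: " + " -> ".join(stack + (name,)))
    out = []
    for line in sources[name].splitlines():
        match = REL_IMPORT.match(line)
        if match is None:
            out.append(line)
            continue
        dep = match.group(1)
        if dep in done:
            continue  # already inlined earlier in the merged file
        out.append(inline(dep, sources, done, stack + (name,)))
    done.add(name)
    return "\n".join(out)

# Hypothetical two-file project.
sources = {
    "layers": "class Linear:\n    pass",
    "model": "from .layers import Linear\n\nclass Model(Linear):\n    pass",
}
merged = inline("model", sources)
print(merged)
```

The merged text defines `Linear` before `Model`, so it runs as a single file. A namespace collision (two modules defining the same top-level name) would go undetected by this sketch.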

How the Inliner Works

Naive line-by-line import detection is defeated by multiline parenthesized imports, inline comments, docstrings containing the word “import”, and TYPE_CHECKING blocks. The inliner therefore operates through a sentinel pipeline:

  1. TYPE_CHECKING blocks are converted to comments so their imports are inert.
  2. Docstrings and comments are extracted and replaced with COMMENT_N sentinels, preventing their content from being misread as live imports.
  3. Top-level import blocks — including parenthesized multiline forms — are extracted and replaced with IMPORT_N sentinels. Inline comments are promoted above the sentinel.
  4. Each import block is standardized into canonical single-line forms. The six supported forms are:
       import module
       import module as alias
       from module import name
       from module import name as alias
       from .relative import name
       from .relative import name as alias
     Multi-name, parenthesized, and semicolon-separated imports are all expanded into these forms. Star imports raise ValueError.
  5. Each IMPORT_N sentinel is resolved: relative imports are replaced with the recursively inlined content of the target file; external imports are emitted on first encounter and commented out on recurrence.
  6. COMMENT_N sentinels are restored, returning docstrings and comments to the merged source verbatim.
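As an illustration of step 4 only, the standardization can be sketched with Python's standard `ast` module. This is an assumption on my part about one way to do it, not a description of how the inliner itself is implemented. The function names `standardize` and `_render` are hypothetical.

```python
import ast

def _render(alias):
    """Format one imported name, with its alias if present."""
    return alias.name if alias.asname is None else f"{alias.name} as {alias.asname}"

def standardize(block):
    """Expand an import block into canonical single-line import statements."""
    lines = []
    for node in ast.parse(block).body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                lines.append("import " + _render(alias))
        elif isinstance(node, ast.ImportFrom):
            # node.level counts the leading dots of a relative import.
            module = "." * node.level + (node.module or "")
            for alias in node.names:
                if alias.name == "*":
                    raise ValueError(f"star import from {module!r} is not supported")
                lines.append(f"from {module} import {_render(alias)}")
        else:
            raise ValueError("block contains a non-import statement")
    return lines

block = (
    "from .layers import (\n"
    "    Linear,  # dense layer\n"
    "    Conv as C,\n"
    ")\n"
    "import os; import sys as s\n"
)
for line in standardize(block):
    print(line)
# from .layers import Linear
# from .layers import Conv as C
# import os
# import sys as s
```

Note that `ast` discards the inline comment, which is why a comment-preserving tool needs something like the sentinel steps above to set comments aside first and restore them afterwards.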

Right now, it throws on ‘inline’ imports that are indented, since those are not scoped globally. It also does not handle importing a module as a whole. I know how to add support for that, but it is not worth doing unless it is commonly demanded. One feature I particularly needed is that it preserves comments and docstrings while inlining, which was a constraint I was operating under. It also expects every file to have a module docstring, so I would have to make that more robust before a widespread release. I want opinions before going through the extra hassle.
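If you want to check whether a file would hit either of these two limitations before trying to flatten it, a rough pre-flight check could look like the following. This is a sketch using the standard `ast` module, not part of the tool, and the function names are hypothetical. It also flags imports inside TYPE_CHECKING blocks, which the pipeline handles separately in step 1.

```python
import ast

def find_nested_imports(source):
    """Return line numbers of import statements that are not at module top level."""
    tree = ast.parse(source)
    top_level = {id(node) for node in tree.body}
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, (ast.Import, ast.ImportFrom)) and id(node) not in top_level
    )

def has_module_docstring(source):
    """Report whether the file starts with a module docstring."""
    return ast.get_docstring(ast.parse(source)) is not None

example = (
    '"""Example module."""\n'
    "import os\n"
    "\n"
    "def load():\n"
    "    import json\n"
    "    return json\n"
)
print(find_nested_imports(example))   # [5]
print(has_module_docstring(example))  # True
```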

Questions:

  • Would you like to be able to develop Hugging Face-compatible models that span multiple files and subfolders, and would you tolerate flattening them before staging to the Hub?
  • How important is it to you that comments are preserved when uploading?
  • If you find this beneficial, would you prefer a utility in Hugging Face proper, or a standalone package?
  • Is there any interest in fixing a few of the parsing bugs in huggingface dynamic utils, or is that code in a “don’t touch” state?
  • How important is it that your code can import a module rather than a class from it? Namespace support is possible, but ugly.
