{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreihvbw24p53pletpam2h7bs7nhtwrhzapztvgutbdykh5ctnfuczla",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmo7ggco7hd2"
},
"path": "/t/interest-in-preprocessing-utilities-for-multifile-model-uploads/176211#post_1",
"publishedAt": "2026-05-25T08:04:56.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "**Pitch**\nHuggingface follows a ‘single file’ paradigm that can catch developers off guard. All relevant classes must be located in a single file, called huggingface.py. While some support has been included for recursive transfer, it is not thorough, with bugs on loading locally vs remotely. Huggingface has developed around that assumption, and it cannot easily be bypassed.\n\nThis significantly decreases the quality of coding for huggingface products; one has to locate everything in one file, making development messy. In theory, staging by inlining relative dependencies into one file can solve this issue, but the existing builtin solutions do not handle recursive directory traversal or comment preservation well. I have built a solution that inlines to a single huggingface.py file instead. Is the community interested?\n\n**Details**\n\nIn theory, relative imports can be inlined. So long as a project is structured such that it’s dependency tree is a Directed _Acyclic_ Graph one can replace the import statement with the module being imported as a linking step. Unless a namespace collision happens, code then just runs successfully.\n\n**How the Inliner Works**\n\nNaive line-by-line import detection is defeated by multiline parenthesized\nimports, inline comments, docstrings containing the word “import”, and\nTYPE_CHECKING blocks. The inliner therefore operates through a sentinel\npipeline:\n\n 1. TYPE_CHECKING blocks are converted to comments so their imports are inert.\n 2. Docstrings and comments are extracted and replaced with **COMMENT_N**\nsentinels, preventing their content from being misread as live imports.\n 3. Top-level import blocks — including parenthesized multiline forms — are\nextracted and replaced with **IMPORT_N** sentinels. Inline comments are\npromoted above the sentinel.\n 4. Each import block is standardized into canonical single-line forms. The\nsix supported forms are:\nimport _module_\nimport _module_ as _alias_\nfrom _module_ import _name_\nfrom _module_ import _name_ as _alias_\nfrom ._relative_ import _name_\nfrom ._relative_ import _name_ as _alias_\nMulti-name, parenthesized, and semicolon-separated imports are all expanded\ninto these forms. Star imports raise ValueError.\n 5. Each **IMPORT_N** sentinel is resolved: relative imports are replaced with\nthe recursively inlined content of the target file; external imports are\nemitted on first encounter and commented out on recurrence.\n 6. **COMMENT_N** sentinels are restored, returning docstrings and comments to\nthe merged source verbatim.\n\n\n\nRight now, it throws on ‘inline’ imports that are indented, since those are not scoped globally. It also does not handle importing modules, though I do know how to add support for that; it is not worth it unless commonly demanded. One thing it has for my purposes that I particularly needed was it preserves comments and docstrings while inlining, which was a constriction I was operating under. It also expects everything to have module docstrings, so I would have to make that more robust before widespread release. I want opinions before going through the extra hassle.\n\n**Questions:**\n\n * Would you like to be able to develop huggingface compatible models that span multiple files and subfolders, and would tolerate flattening before staging into hub?\n * How important do you find the idea of preserving comments when uploading?\n * If you find this beneficial, would you prefer a utility in huggingface proper, or a standalone package?\n * Is there any interest in fixing a few of the parsing bugs in huggingface dynamic utils, or is that code in a “don’t touch” state?\n * How important is it that your code can import a module rather than a class from it? namespace support is possible, but ugly.\n\n",
"title": "Interest in preprocessing utilities for multifile model uploads"
}