
Interest in preprocessing utilities for multifile model uploads

Hugging Face Forums [Unofficial] May 27, 2026

I am not deeply familiar with this area, but this seems closely tied to the long-running Transformers discussion around the single-file philosophy.

You may already be aware of this context, so apologies if this is redundant, but I think this proposal might be easier to evaluate if it is connected to the broader history of the Transformers single-model-file policy and the newer Modular Transformers compromise.

My understanding is roughly this:

Transformers has historically preferred a single model file style, where the code needed to understand a model’s forward pass lives in one modeling_*.py file.
This is intentionally not very DRY. The tradeoff is that model behavior is easier to inspect, review, copy, debug, and modify locally.
More recently, Modular Transformers seems to introduce a compromise: contributors can write a more modular source file with imports/inheritance, and a converter/linter generates standalone modeling_*.py, configuration_*.py, etc. files from it.
Your proposal feels like a similar pattern, but aimed at Hub custom code / multifile model uploads rather than models contributed directly to the Transformers repository.

Some relevant background links:

Hugging Face design philosophy: “Don’t Repeat Yourself” / single model file policy
Current docs: Add a model with Modular Transformers
Modeling rules: cross-model imports violate the single-file policy
Converter implementation: utils/modular_model_converter.py
Dynamic module loading: dynamic_module_utils.py
Custom models docs: Customizing models
Auto classes / remote custom code docs: trust_remote_code

So perhaps the proposal could be framed not only as an isolated “inliner”, but as something like a Hub-side analogue of Modular Transformers:

Transformers repo:
  modular_<model>.py
    -> generated modeling_<model>.py / configuration_<model>.py / processing_<model>.py

Hub custom code:
  src/*.py or model/*.py
    -> generated flattened upload artifact

That framing might make the idea stronger, because it does not necessarily oppose the single-file philosophy. It could instead be seen as preserving the same final-user property:

authors can work with modular source code, while users/reviewers/loaders get a flattened, inspectable artifact.

In other words, the important question may not be “is flattening compatible with the single-file philosophy?” (it probably is, at least in spirit). The harder questions seem to be around source of truth, reproducibility, semantic equivalence, and integration with the existing dynamic module loader.

A few design questions that seem important to me:

What is the source of truth? Is the multifile source tree the canonical code, with the flattened file treated as generated output? Or is the flattened file itself supposed to be edited/reviewed as the canonical Hub artifact?
When is the flattened file generated? Should it be generated locally before upload, by save_pretrained, by push_to_hub, by a Hub-side build step, or by a standalone CLI?
Should the generated file be committed to the Hub repo? Committing it makes the actually executed code inspectable. But it also creates the risk that the generated file drifts from the source tree unless there is a check.
Should there be a CI / pre-publish check? For example, something like: regenerate the flattened artifact, compare it with the committed one, and fail if they differ. This seems analogous to how generated files are checked in the Modular Transformers workflow.
How do we verify semantic equivalence? Python imports are not just textual inclusion. try/except import, optional dependencies, TYPE_CHECKING, module-level side effects, circular imports, __all__, lazy imports, and dynamically imported modules can all be tricky.
How should this interact with dynamic_module_utils.py? Transformers already has logic for discovering/copying relative imports in custom code. For example, get_relative_import_files recursively follows relative imports. A pre-flattening approach might reduce reliance on that mechanism, but it also overlaps with the same responsibility area.
How should generated files be made reviewable? It may be useful for the flattened artifact to include clear source-file boundary markers, for example:

# ---------------------------------------------------------------------
# BEGIN inlined file: src/modeling/attention.py
# ---------------------------------------------------------------------

...

# ---------------------------------------------------------------------
# END inlined file: src/modeling/attention.py
# ---------------------------------------------------------------------

Should generated files include a “do not edit manually” header? Something like:

# This file is generated from the multifile source tree.
# Do not edit this file manually; edit the source files and regenerate.

Should the original multifile source be uploaded too? Uploading both the source tree and the flattened artifact may help reviewability, but it also raises the question of which one loaders should use.
Should this be an external tool, a Transformers utility, or part of the Hub upload flow? An external tool is easier to experiment with. A Transformers utility might be easier to standardize. A Hub/upload integration would provide the best UX, but probably requires the clearest contract.
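
To make the boundary-marker and do-not-edit-header questions a bit more concrete, here is a minimal sketch of how a tool might assemble the final artifact. The function name render_flattened and the constants are hypothetical (nothing like this exists in Transformers as far as I know), and it assumes the per-file chunks have already had their relative imports resolved, which is the genuinely hard part and is not shown.

```python
# Hypothetical helper: all names here are illustrative, not an existing API.
GENERATED_HEADER = (
    "# This file is generated from the multifile source tree.\n"
    "# Do not edit this file manually; edit the source files and regenerate.\n"
)
RULE = "# " + "-" * 69


def render_flattened(sources: dict[str, str]) -> str:
    """Join source chunks under a generated-file header with boundary markers.

    `sources` maps a repo-relative path to that file's text, in the order
    the chunks should appear. Import rewriting is out of scope here.
    """
    parts = [GENERATED_HEADER]
    for path, text in sources.items():
        parts.append(f"{RULE}\n# BEGIN inlined file: {path}\n{RULE}\n")
        parts.append(text.rstrip("\n") + "\n")
        parts.append(f"{RULE}\n# END inlined file: {path}\n{RULE}\n")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_flattened({"src/modeling/attention.py": "SCALE = 0.5\n"}))
```

Because the output depends only on the ordered input mapping, the same source tree should always produce the same artifact, which is what a later up-to-date check would rely on.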

I also think there is a useful distinction between two related but different problems:

Problem A:
  How should models inside the Transformers repo be authored and maintained?

Problem B:
  How should arbitrary custom model code on the Hub be packaged, loaded, cached, inspected, and trusted?

Modular Transformers seems primarily aimed at Problem A.

Your proposal seems primarily aimed at Problem B.

That distinction makes the proposal more interesting, not less. It means this may not be a duplicate of Modular Transformers. It may be the same underlying compromise applied to a different layer of the ecosystem.

There are also some existing pain points around multifile custom code and relative imports that seem relevant:

Forum thread: Relative imports are quirky and not well documented
Older issue: Relative path causes error when calling push_to_hub to upload a custom model
Recent issue: AutoModel.from_pretrained() doesn’t work for models with ‘.’ in their name when there’s a relative import
Dynamic import edge case: get_imports failing to respect conditionals on imports
Modular converter edge case: modular_model_converter cannot handle objects import from try except

These examples make me think the hard part is not merely concatenating files. It is defining a small, predictable packaging contract for custom model code.

One possible contract could be something like:

1. The multifile source tree is the source of truth.
2. The flattened artifact is generated output.
3. The generated artifact is committed/uploaded for inspectability.
4. The generated artifact contains source-boundary comments.
5. The generated artifact contains a do-not-edit header.
6. A check verifies that the generated artifact is up to date.
7. A load test verifies that AutoModel.from_pretrained(..., trust_remote_code=True) works locally.
8. The tool explicitly documents unsupported Python patterns.

That kind of contract might make the tool easier to reason about, because it would avoid silently becoming a general-purpose Python bundler.
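
Item 6 of that contract (the up-to-date check) could be as small as a text diff between the committed artifact and a fresh regeneration. The sketch below is only an illustration under that assumption: artifact_drift is a hypothetical name, and how the regenerated text is produced is left to whatever flattening tool is used.

```python
import difflib


def artifact_drift(committed_text: str, regenerated_text: str) -> list[str]:
    """Return unified-diff lines; an empty list means the artifact is current."""
    return list(
        difflib.unified_diff(
            committed_text.splitlines(keepends=True),
            regenerated_text.splitlines(keepends=True),
            fromfile="committed",
            tofile="regenerated",
        )
    )


if __name__ == "__main__":
    # Stand-in strings; a real check would read the committed file and
    # regenerate the artifact from the source tree.
    drift = artifact_drift("HIDDEN = 64\n", "HIDDEN = 128\n")
    if drift:
        print("".join(drift), end="")
        raise SystemExit("Flattened artifact is stale; regenerate before upload.")
```

A byte-for-byte comparison like this only works if generation is deterministic, so it also acts as a cheap test of the reproducibility property.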

For example, it could explicitly support a conservative subset:

Supported:
  - acyclic relative imports
  - same-repository Python source files
  - normal class/function/constant definitions
  - straightforward external imports
  - comments/docstrings preservation
  - source-file boundary markers

Possibly unsupported or warning-only:
  - circular imports
  - wildcard imports
  - importlib-based dynamic imports
  - module-level side effects that depend on import order
  - optional imports inside complex conditionals
  - runtime mutation of module globals

This would also make the safety/review story clearer. A flattened file is not automatically safer than multifile code, especially with trust_remote_code=True, but it can make the review target more explicit if the generated file is deterministic and inspectable.

So my tentative reaction is:

The idea seems useful.
It seems philosophically compatible with the single-file policy if treated as “modular authoring → flattened artifact”.
It seems especially relevant for Hub custom code, where relative imports and dynamic module loading can be fragile.
The main challenge is not the basic inlining idea, but the contract around reproducibility, reviewability, semantic equivalence, and source-of-truth.
It might be worth explicitly positioning this as a Hub/custom-code-side counterpart to Modular Transformers, rather than only as a standalone preprocessing script.

This may also help decide where it should live. If the goal is experimentation, an external package seems natural. If the goal is standardizing custom-code packaging for the Hub, then it probably needs alignment with Transformers’ existing dynamic module loading and/or Hub upload APIs. If the goal is to eventually become official, it may be useful to first define the exact supported subset and failure modes.

In short, I like the direction, but I think the strongest framing is:

not “let’s replace the single-file philosophy,” but “let’s give custom Hub model authors the same kind of modular-authoring / standalone-artifact compromise that Transformers itself is moving toward.”
