Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreihiea3nsw3x25vziifwkle2msf72io6s5tsngemzx54sorih5roni",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mnfago6erxi2"
  },
  "path": "/t/interest-in-preprocessing-utilities-for-multifile-model-uploads/176211#post_4",
  "publishedAt": "2026-06-03T11:48:38.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "15224: Copy of the custom modeling file when saving a model",
    "20884: Santacoder saved checkpoints missing required .py files",
    "21008: Make sure dynamic objects can be saved and reloaded",
    "24737: Falcon models saved with save_pretrained no longer get saved with Python files",
    "24785",
    "27688: Remote code improvements",
    "29714: push_to_hub for a trust_remote_code=True model",
    "32923",
    "33100",
    "34855: Offline mode does not work with models requiring trust_remote_code=True",
    "sentence-transformers #2613",
    "36808: Support loading custom code objects in offline mode from local",
    "37716: Fix custom code saving",
    "37751: Stop autoconverting custom code checkpoints",
    "45684: save_pretrained custom model files copied with readonly permissions",
    "45698: from_pretrained loads wrong custom module after save_pretrained",
    "dynamic_module_utils.py",
    "Modular Transformers",
    "37716",
    "36653",
    "29714",
    "34855",
    "36808",
    "45698",
    "transformers-community/support Discussions",
    "Custom generate methods discussion",
    "The Transformers Library: standardizing model definitions",
    "With the new multi-backend modular system…"
  ],
  "textContent": "Oh. My previous comment about Modular Transformers came partly from a somewhat fuzzy memory, and I’m not a Transformers maintainer, so please take this with the appropriate amount of salt. But based on things I had happened to see before, plus what I looked into further using your reply as a starting point, I think one thing has become fairly clear: regardless of the exact implementation strategy,** there does seem to be real demand for solving this broader custom-code packaging problem**:\n\n* * *\n\nAfter looking into this a bit more, I would answer the original “is there demand?” question with a fairly strong **yes** , but with one important nuance:\n\n> The recurring demand seems to be less specifically for “an inliner” as such, and more for **complete, reproducible, inspectable custom-code artifacts** for `trust_remote_code=True` models.\n\nAn inliner may be one good strategy for that broader problem, especially if the goal is to let authors develop in a normal multifile layout while still publishing something closer to the traditional Transformers single-file / standalone-artifact style.\n\nBut I would probably avoid presenting `inline` as the only correct implementation. It may be more maintainable to frame it as one possible **custom-code packaging strategy**.\n\n## Direct reaction to your points\n\nMy direct reaction to your reply is roughly this:\n\nYour point | My current read\n---|---\n`dynamic_utils` / dynamic module loading has serious deficiencies around multifile or recursive imports | This seems plausible. The surrounding issue history suggests that custom-code saving/loading has been fragile for a long time. I would still separate “current loader/save bug with an MRE” from “new opt-in packaging feature”.\nAn opt-in `save_pretrained` flag would be useful | I agree that this seems like a reasonable user-facing shape. 
I would maybe phrase it as a possible `custom_code_packaging` strategy rather than committing too early to one exact flag name.\n`push_to_hub` should receive the same behavior | That also seems reasonable if the packaging transform is part of the saved artifact contract. Ideally, `push_to_hub` would upload the same complete custom-code artifact that `save_pretrained` creates.\nRebuilding the dynamic import resolver directly may be risky | I agree. An opt-in packaging transform may be safer than changing dynamic-module resolution globally. It avoids making the loader responsible for arbitrary Python project layouts.\nThe long-term fix might still involve better multi-level import handling | Possibly, but I would treat that as a separate track. One track is “fix current loader/save behavior with concrete MREs”; another is “provide an explicit packaging transform”.\nThe generated output is human-readable and preserves comments/docstrings | That is one of the strongest arguments for the approach. Reviewability matters a lot for `trust_remote_code=True`, because the artifact is executable code.\nCI testing is unclear | I meant something narrower than proving full semantic equivalence: regenerate the artifact, compare it with the committed output, then run local/remote/offline load smoke tests.\nA DAG gives a clean boundary for inlining | I agree that acyclicity is a very useful support boundary. I would only be cautious about treating DAG-ness alone as a full semantic-equivalence proof in Python.\nThis is about uploading/saving for the Hub, not necessarily changing the Hub itself | That distinction makes sense. 
I would frame the problem as producing a complete custom-code artifact for `save_pretrained` / `push_to_hub`, rather than asking the Hub or dynamic loader to support arbitrary Python package layouts.\n\nSo my current read is:\n\n\n    Yes, the need is real.\n\n    But the strongest framing may be:\n      custom-code artifact completeness for trust_remote_code=True models\n\n    rather than:\n      a multifile inliner, specifically\n\n\n## Why I think the demand is real\n\nThere seems to be a long-running pattern of related issues and fixes around this area:\n\nYear | Issue / PR | What it suggests\n---|---|---\n2022 | #15224: Copy of the custom modeling file when saving a model | Users already needed `save_pretrained()` to copy custom modeling files for dynamic-code models.\n2022 | #20884: Santacoder saved checkpoints missing required .py files | Fine-tuned checkpoints for `trust_remote_code=True` models could be unusable because required code files were missing.\n2023 | #21008: Make sure dynamic objects can be saved and reloaded | Core Transformers already added fixes so dynamic/custom objects could be saved and reloaded with their code.\n2023 | #24737: Falcon models saved with save_pretrained no longer get saved with Python files / #24785 | Another concrete regression/fix around copying custom Python files during saving.\n2023 | #27688: Remote code improvements | Broader concerns around `trust_remote_code`, `auto_map`, downstream libraries, and documentation.\n2024 | #29714: push_to_hub for a trust_remote_code=True model | Users wanted `push_to_hub()` to push all files needed by a custom model, not only weights/config/tokenizer files.\n2024 | #32923 / #33100 | Local-vs-remote custom code behavior affected pipeline registration and AutoClass behavior.\n2024 | #34855: Offline mode does not work with models requiring trust_remote_code=True | `save_pretrained()` artifacts were not always self-contained enough for offline / fresh-machine loading.\n2024 | 
sentence-transformers #2613 | Downstream users also need hermetic/offline Docker-style deployments for models requiring remote code.\n2025 | #36808: Support loading custom code objects in offline mode from local | Ongoing work around fully saving/loading `trust_remote_code=True` custom objects in offline/local settings.\n2025 | #37716: Fix custom code saving | A major merged PR explicitly aimed at making `save_pretrained()` and `push_to_hub()` correctly save relevant custom modeling files.\n2025 | #37751: Stop autoconverting custom code checkpoints | Custom-code checkpoints may need special handling in adjacent infrastructure.\n2026 | #45684: save_pretrained custom model files copied with readonly permissions | Saved custom-code files are touched by post-save tooling, so generated/copied artifacts are a real workflow.\n2026 | #45698: from_pretrained loads wrong custom module after save_pretrained | Custom module identity/cache/local-source behavior can still be subtle after saving.\n\nSo, to me, this looks like a real problem family. 
It has appeared as:\n\n  * missing custom `.py` files after `save_pretrained()`;\n  * missing custom `.py` files after `push_to_hub()`;\n  * local-vs-remote custom-code inconsistencies;\n  * offline/hermetic deployment failures;\n  * `auto_map` / `_auto_class` fragility;\n  * pipeline registration differences;\n  * dynamic-module cache/module identity issues;\n  * relative-import limitations and under-documentation.\n\n\n\nThat is a fairly strong signal that there is real demand.\n\n## How I would frame the core problem\n\nI would probably frame the problem less as:\n\n\n    How do we support arbitrary multifile Python projects on the Hub?\n\n\nand more as:\n\n\n    How do we produce a complete, reproducible, inspectable custom-code artifact\n    for `trust_remote_code=True` models, across `save_pretrained()`,\n    `push_to_hub()`, local loading, remote loading, and offline loading?\n\n\nThat framing seems to connect better with the existing Transformers work.\n\nIt also avoids forcing the loader to become a general Python package resolver. Instead, the save/push step could produce an artifact that the dynamic loader already knows how to consume.\n\n## Why your inliner idea still seems relevant\n\nThe current custom-code machinery already appears to be somewhat artifact-oriented.\n\nThe relevant area seems to be dynamic_module_utils.py, especially functions such as:\n\n  * `custom_object_save`\n  * `get_relative_imports`\n  * `get_relative_import_files`\n  * `get_cached_module_file`\n  * `get_class_in_module`\n\n\n\nFrom the current code, `custom_object_save()` looks like it already saves custom object source files and discovered relative imports into the target folder. 
It also appears to copy files by basename, which makes the current save path feel closer to a **flat artifact** than to preserving an arbitrary nested Python package layout.\n\nSo I think your proposal can be framed as a natural extension of an existing direction:\n\n\n    Current-ish direction:\n      collect custom code files\n      copy them into the save/push artifact\n\n    Possible inline strategy:\n      collect custom code files\n      generate one deterministic, inspectable file\n      update metadata so AutoClass loading points to that generated file\n\n\nThat does not necessarily fight the single-file philosophy. It may actually align with it:\n\n\n    Authoring:\n      modular source tree\n\n    Published artifact:\n      generated standalone/flat/inspectable custom-code artifact\n\n\nThis is similar in spirit to the broader compromise behind Modular Transformers, though the target layer is different:\n\nArea | Source authoring | Published / consumed artifact\n---|---|---\nTransformers repo models | `modular_<model>.py` with imports/inheritance | generated standalone `modeling_*.py`, `configuration_*.py`, etc.\nHub custom code proposal | multifile custom source tree | generated flat or inline artifact for `trust_remote_code=True` loading\n\nI would still be cautious about saying it is “the same thing” as Modular Transformers. It is not. 
But the design pattern is similar: **modular authoring, standalone artifact**.\n\n## I would present `inline` as one packaging strategy, not the whole proposal\n\nOne useful way to make the implementation discussion less binary may be to define a small strategy space:\n\nStrategy | Output artifact | Advantages | Risks / open questions\n---|---|---|---\n`current` | Whatever current `save_pretrained()` / `push_to_hub()` produces | Maximum backward compatibility | Existing edge cases remain.\n`flat_copy` | Copy discovered `.py` files into the save directory | Close to current `custom_object_save()` behavior | Basename collisions, lost package structure, relative import quirks.\n`preserve_package` | Preserve nested package directories | Most Pythonic for authors | More work for dynamic module loading/cache; may conflict with current same-directory assumptions.\n`inline` | Generate one standalone `.py` file | Inspectable, single-file-compatible, loader-simple | Semantic equivalence, deterministic generation, source-of-truth questions.\nexternal CLI | Pre-publish generated artifact | Easy to experiment with outside core Transformers | Not standardized; users must wire it into their own publishing flow.\n\nThen your proposal becomes:\n\n\n    Add or experiment with an `inline` custom-code packaging strategy.\n\n\nrather than:\n\n\n    Replace the current custom-code loader with an inliner.\n\n\nThat seems easier to evaluate.\n\n## Possible API shape, very tentatively\n\nI do not know where maintainers would want this to live, so I would treat this as illustrative rather than prescriptive.\n\nMaybe something like:\n\n\n    model.save_pretrained(\n        save_directory,\n        custom_code_packaging=\"inline\",\n    )\n\n\nand eventually:\n\n\n    model.push_to_hub(\n        repo_id,\n        custom_code_packaging=\"inline\",\n    )\n\n\nor perhaps a lower-level utility first:\n\n\n    from transformers.utils import package_custom_code\n\n    package_custom_code(\n        
entry_file=\"modeling_my_model.py\",\n        output_file=\"modeling_my_model_generated.py\",\n        strategy=\"inline\",\n    )\n\n\nI am not saying these are the right API names. The important part is the contract:\n\n\n    Given a custom-code entrypoint and a supported subset of relative imports,\n    produce a deterministic artifact that can be saved, pushed, inspected,\n    cached, and loaded.\n\n\n## Possible responsibility boundary\n\nI would be careful here. From the outside, it is tempting to say:\n\n\n    Just add a flag to `save_pretrained()`.\n\n\nBut the recent custom-code saving work appears to touch more than one function. For example, #37716 touched custom-code saving, `_auto_map`, AutoClass behavior, multiple save/load paths, tests, and docs/docstrings.\n\nSo I would phrase the implementation boundary cautiously:\n\n\n    `dynamic_module_utils.custom_object_save()` looks like one plausible hook,\n    because it already saves custom object source files and updates config-side\n    metadata for Hub loading.\n\n    But I would not claim it is definitely the correct hook. The right abstraction\n    may need to account for AutoClass behavior, `auto_map`, local-vs-remote loading,\n    processors/tokenizers/configs, and push-to-hub behavior.\n\n\nThat keeps the proposal helpful without over-prescribing internals.\n\n## What I meant by CI / checks\n\nWhen I mentioned CI, I did not mean:\n\n\n    Prove all possible model behavior is equivalent for all inputs.\n\n\nI meant a much narrower generated-artifact consistency check:\n\n\n    1. Run the packager/inliner.\n    2. Compare the generated file with the checked-in generated file.\n    3. Fail if they differ.\n    4. Run AutoModel.from_pretrained(<local_saved_dir>, trust_remote_code=True).\n    5. If practical, also test a Hub-like or remote load path.\n    6. 
Optionally compare a tiny forward pass or at least state_dict keys\n       between the source and packaged forms.\n\n\nSo the CI would not need arbitrary user inputs as test data. It could start from a tiny toy custom model fixture.\n\nFor example:\n\n\n    toy_model/\n      configuration_toy.py\n      modeling_toy.py\n      backbone.py\n      modules.py\n\n\nwith:\n\n\n    # modeling_toy.py\n    from .backbone import ToyBackbone\n\n\nand:\n\n\n    # backbone.py\n    from .modules import ToyModule\n\n\nThen the check could be:\n\n\n    model.save_pretrained(tmpdir)\n    AutoModel.from_pretrained(tmpdir, trust_remote_code=True)\n\n\nplus, for the packaging tool specifically:\n\n\n    generate artifact\n    compare generated artifact with expected artifact\n    load from generated artifact\n\n\nThat is much narrower than full semantic verification, but still useful.\n\n## About DAGs and semantic equivalence\n\nI agree that acyclicity is probably a very good **support boundary**. If the relative-import graph is cyclic, the packager can clearly reject it.\n\nI would only be cautious about saying that DAG-ness alone proves semantic equivalence in Python.\n\nA DAG guarantees that a topological inline order exists. But Python import behavior can also depend on:\n\n  * module identity;\n  * import order;\n  * `sys.modules`;\n  * `__name__`;\n  * `__package__`;\n  * `__file__`;\n  * `__all__`;\n  * module-level side effects;\n  * optional imports;\n  * `try/except import`;\n  * `TYPE_CHECKING`;\n  * wildcard imports;\n  * duplicate names after flattening;\n  * monkey-patching;\n  * `importlib`;\n  * local-vs-remote cache behavior.\n\n\n\nSo I would phrase it as:\n\n\n    Acyclic import graph:\n      necessary / practical condition for supported inlining\n\n    Full semantic equivalence:\n      still worth checking with load tests and possibly a tiny forward pass\n\n\nThis does not make the inliner idea weaker. 
It just makes the support contract more precise.\n\n## Why `inline` might be attractive\n\nAn inline artifact could have several practical advantages:\n\nAdvantage | Why it matters\n---|---\nFewer dynamic relative imports | The loader has less dependency graph to reconstruct.\nMore inspectable artifact | Reviewers/users can inspect one generated file.\nCloser to single-file philosophy | The final artifact resembles the traditional Transformers model file style.\nBetter offline/hermetic behavior | The saved directory can contain executable custom code without needing to fetch remote code again.\nEasier upload completeness | `push_to_hub()` has fewer files to miss.\nPotentially simpler cache invalidation | One deterministic file may be easier to hash than a graph of relative imports.\n\nBut these advantages depend on the generated file being deterministic and honest about its origin.\n\nFor example, I would expect generated files to include something like:\n\n\n    # This file was automatically generated from a multifile custom-code source tree.\n    # Do not edit this file manually; edit the source files and regenerate.\n    # Source root: <source_root>\n    # Entry point: <entry_file>\n    # Packaging strategy: inline\n\n\nand source boundary markers such as:\n\n\n    # ---------------------------------------------------------------------\n    # BEGIN inlined file: layers/attention.py\n    # ---------------------------------------------------------------------\n\n    ...\n\n    # ---------------------------------------------------------------------\n    # END inlined file: layers/attention.py\n    # ---------------------------------------------------------------------\n\n\nThat would make the artifact more reviewable.\n\n## Possible initial supported subset\n\nSomething like this may be easier to maintain:\n\n\n    Supported:\n      - one custom-code entry file\n      - same-repository relative imports\n      - acyclic dependency graph\n      - normal `from .foo 
import Bar` imports\n      - normal class/function/constant definitions\n      - external imports preserved at the top\n      - comments/docstrings preserved\n      - deterministic output\n      - generated source boundary markers\n      - clear error messages for unsupported patterns\n\n    Unsupported at first:\n      - circular imports\n      - wildcard relative imports\n      - dynamic imports via `importlib`\n      - imports outside the source root\n      - namespace packages\n      - complex module-level side effects\n      - ambiguous duplicate symbols\n      - package layouts that require runtime package identity\n\n\nI would not present this as the final design, only as a possible starting point.\n\n## Possible tests / MREs\n\nIf this becomes a GitHub issue or PR, I think the most useful thing would be to split examples into small reproducible cases.\n\n### 1. Save artifact completeness\n\n\n    Goal:\n      `save_pretrained()` should produce a directory that can be loaded\n      without manually copying custom `.py` files.\n\n\nMinimal layout:\n\n\n    toy_model/\n      config.json\n      configuration_toy.py\n      modeling_toy.py\n      helper.py\n\n\nImport chain:\n\n\n    # modeling_toy.py\n    from .helper import ToyBlock\n\n\nTest:\n\n\n    model.save_pretrained(tmpdir)\n    AutoModel.from_pretrained(tmpdir, trust_remote_code=True)\n\n\n### 2. Recursive relative imports\n\n\n    Goal:\n      transitive relative imports are either supported, clearly rejected,\n      or transformed into a generated artifact.\n\n\nMinimal layout:\n\n\n    toy_model/\n      configuration_toy.py\n      modeling_toy.py\n      backbone.py\n      modules.py\n\n\nImport chain:\n\n\n    # modeling_toy.py\n    from .backbone import ToyBackbone\n\n\n\n    # backbone.py\n    from .modules import ToyModule\n\n\nThis is close to the kind of issue described in #36653.\n\n### 3. 
Nested package layout\n\n\n    Goal:\n      decide whether nested subpackages are unsupported, preserved,\n      flat-copied, or inlined.\n\n\nMinimal layout:\n\n\n    toy_model/\n      configuration_toy.py\n      modeling_toy.py\n      layers/\n        __init__.py\n        attention.py\n        rope.py\n\n\nImport chain:\n\n\n    # modeling_toy.py\n    from .layers.attention import ToyAttention\n\n\n\n    # layers/attention.py\n    from .rope import apply_rope\n\n\nThis would clarify whether the desired behavior is:\n\n\n    preserve package layout\n\n\nor:\n\n\n    generate a flat/inline artifact\n\n\n### 4. Push artifact completeness\n\n\n    Goal:\n      `push_to_hub()` should push the same complete custom-code artifact\n      that `save_pretrained()` would produce locally.\n\n\nThis is close to #29714, where the issue was that a custom model needed additional files to function properly after push.\n\n### 5. Offline/hermetic loading\n\n\n    Goal:\n      A saved model directory should be usable on a fresh machine in offline mode\n      if all required custom code was saved.\n\n\nThis connects to:\n\n  * #34855\n  * sentence-transformers #2613\n  * #36808\n\n\n\n### 6. 
Module identity / cache behavior\n\n\n    Goal:\n      A saved model should not accidentally load a different local custom module\n      with the same filename/class name.\n\n\nThis connects to #45698.\n\n## Possible issue split\n\nIf this is taken to GitHub, I would probably avoid one giant issue.\n\nMaybe split it like this:\n\nIssue type | Possible title | Purpose\n---|---|---\nBug / MRE | `Recursive relative imports are not reliably included for trust_remote_code custom models` | Show current behavior with a minimal failing repo.\nFeature request | `Add an opt-in custom-code packaging strategy for save_pretrained / push_to_hub` | Discuss `inline`, `flat_copy`, `preserve_package`, etc.\nDocs clarification | `Clarify supported relative-import layouts for Hub custom code` | Explain same-directory imports, nested packages, generated artifacts, and reload tests.\nExperimental package | `External custom-code inliner / packager` | Prove the idea before proposing core integration.\n\nThat separation may make the discussion easier for maintainers to act on.\n\n## Possible venue\n\nI am less certain about the best venue, so I would treat this only as practical guidance, not official routing.\n\nMy understanding is:\n\nPlace | Probably good for\n---|---\nThis Forum thread | Initial context, demand check, design sketch.\ntransformers-community/support Discussions | Cross-linking a broader Transformers design/API question. 
It appears to be used for some semi-official community discussions, but I would not call it guaranteed/canonical.\nGitHub Issue | Focused bug report or feature request with MRE/API sketch.\nGitHub PR | Tests, docs, or implementation once the target behavior is clear.\n\nThe `transformers-community/support` Space seems relevant because there are already broader discussions there, such as:\n\n  * Custom generate methods discussion\n  * The Transformers Library: standardizing model definitions\n  * With the new multi-backend modular system…\n\n\n\nBut I would not rely on that as the only path. For concrete bugs and feature requests, GitHub issues are probably still the most actionable place.\n\n## My tentative summary\n\nI would summarize the situation like this:\n\n\n    There is real demand, but I would name the demand carefully.\n\n    The demand is for complete, reproducible, inspectable custom-code artifacts\n    for `trust_remote_code=True` models.\n\n    Inlining is one possible packaging strategy.\n\n    It may be especially attractive because it aligns with the single-file /\n    standalone-artifact style, reduces relative-import complexity, and can make\n    the saved/pushed artifact easier to inspect.\n\n    But it should probably be presented as an opt-in strategy, not as the only\n    right design.\n\n    The exact implementation hook should be left open for maintainers, though\n    `dynamic_module_utils.custom_object_save()` looks like a plausible place to\n    start reading because it already handles saving custom code files and metadata.\n\n\nSo I think your idea is useful, but I would pitch it less as:\n\n\n    Here is a preprocessing script for multifile uploads.\n\n\nand more as:\n\n\n    Here is a possible opt-in packaging strategy for the broader custom-code\n    artifact completeness problem that Transformers has already been working on\n    for several years.\n\n\nThat framing seems both stronger and safer.",
  "title": "Interest in preprocessing utilities for multifile model uploads"
}