modules
modules provides a collection of pre-defined modules for users to implement their own architectures.
MultiMolecule is built upon the Hugging Face ecosystem, embracing a similar design philosophy: Don’t Repeat Yourself.
We follow the single model file policy where each model under the models package contains one and only one modeling.py file that describes the network design.
The modules package is intended for simple, reusable modules that are consistent across multiple models. This approach minimizes code duplication and promotes clean, maintainable code.
Key Features
- Reusability: The modules package includes components that are commonly used across different models, such as the SequencePredictionHead. This reduces redundancy and simplifies the development process.
- Consistency: By centralizing common modules, we ensure that updates and improvements are consistently applied across all models, enhancing reliability and performance.
- Flexibility: While modules such as transformer encoders are widely used, they often vary in implementation details (e.g., pre-norm vs. post-norm, different residual connection strategies). The modules package focuses on simpler components, leaving complex, model-specific variations to be defined within each model’s modeling.py.
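To make the pre-norm vs. post-norm distinction concrete, here is a minimal sketch in plain Python (no PyTorch, so it stays self-contained). The `layer_norm`, `sublayer`, and block functions are illustrative names and are not part of MultiMolecule; the sublayer is a fixed elementwise map standing in for attention or a feed-forward layer.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def sublayer(x):
    """Hypothetical stand-in for attention or a feed-forward block."""
    return [0.5 * v + 1.0 for v in x]


def post_norm_block(x):
    # Post-norm: add the residual first, then normalize the sum.
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


def pre_norm_block(x):
    # Pre-norm: normalize only the sublayer input; the residual path is left as-is.
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]


x = [1.0, 2.0, 4.0, 9.0]
post = post_norm_block(x)
pre = pre_norm_block(x)
```

The two blocks share the same parts but wire them differently, which is the kind of variation that is easier to keep inside a model's own modeling.py than to parameterize in a shared module.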
Modules
- heads: Contains various prediction heads, such as SequencePredictionHead, TokenPredictionHead, and ContactPredictionHead.
- embeddings: Contains various positional embeddings, such as SinusoidalEmbedding and RotaryEmbedding.
- model: The model layer that the Runner consumes: abstract ModelBase plus two concrete subclasses, MonoModel (single-task wrapper around a Hugging Face AutoModelFor*) and PolyModel (composition of backbone, optional neck, and one head per task).
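The backbone, optional neck, and per-task heads layout can be sketched as follows. This is a toy in plain Python under stated assumptions: `ToyPoly`, its constructor arguments, and the task names are hypothetical and do not reflect the actual PolyModel signature.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]


class ToyPoly:
    """Hypothetical stand-in for a backbone + optional neck + per-task heads layout."""

    def __init__(
        self,
        backbone: Callable[[List[int]], Vector],
        heads: Dict[str, Callable[[Vector], object]],
        neck: Optional[Callable[[Vector], Vector]] = None,
    ):
        if not heads:
            raise ValueError("at least one task head is required")
        self.backbone = backbone
        self.neck = neck
        self.heads = dict(heads)

    def forward(self, tokens: List[int]) -> Dict[str, object]:
        features = self.backbone(tokens)
        if self.neck is not None:
            # The neck transforms shared features before any head sees them.
            features = self.neck(features)
        # Every head reads the same features and produces its own task output.
        return {task: head(features) for task, head in self.heads.items()}


def backbone(tokens: List[int]) -> Vector:
    # Toy encoder: summarize the token ids by their mean and maximum.
    return [sum(tokens) / len(tokens), float(max(tokens))]


def neck(features: Vector) -> Vector:
    # Toy neck: rescale the shared features.
    return [2.0 * f for f in features]


heads = {
    "stability": lambda f: f[0] + f[1],  # regression-like task
    "is_long": lambda f: f[1] > 10.0,    # classification-like task
}

model = ToyPoly(backbone, heads, neck=neck)
outputs = model.forward([1, 2, 3, 6])
```

Here `outputs` holds one entry per task, keyed by task name, and dropping the `neck` argument passes the backbone features to the heads unchanged.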
Models
modules exposes a small model layer used by the runner package:
- ModelBase: Abstract base. Defines the forward and trainable_parameters contract every multimolecule model implements; the runner discriminates models with isinstance(model, ModelBase) rather than against any concrete subclass.
- MonoModel: Single-task wrapper around a multimolecule (or Hugging Face) AutoModelFor* prediction model. Hides the wrapper at the state_dict layer, so checkpoints round-trip with the bare Hugging Face model.
- PolyModel: Composes a backbone, optional neck, and one head per task into a single trainable module. Use when the task graph involves multiple labels, extra non-sequence features, or a neck transform.
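The abstract contract and the checkpoint round-trip can be illustrated with a small plain-Python sketch. All class names here (`ToyModelBase`, `BareModel`, `ToyMono`) are hypothetical, and the return type of `trainable_parameters` is an assumption. In PyTorch, wrapping a module as an attribute would normally prefix its `state_dict` keys with the attribute name (for example `model.`); the toy wrapper avoids that by delegating to the wrapped model, so the keys match the bare model.

```python
from abc import ABC, abstractmethod


class ToyModelBase(ABC):
    """Hypothetical contract: every model exposes forward and trainable_parameters."""

    @abstractmethod
    def forward(self, inputs): ...

    @abstractmethod
    def trainable_parameters(self): ...


class BareModel:
    """Stand-in for a bare prediction model with flat parameter names."""

    def __init__(self):
        self.params = {"encoder.weight": 1.0, "head.bias": 0.0}

    def __call__(self, inputs):
        w, b = self.params["encoder.weight"], self.params["head.bias"]
        return [w * v + b for v in inputs]

    def state_dict(self):
        return dict(self.params)

    def load_state_dict(self, state):
        self.params = dict(state)


class ToyMono(ToyModelBase):
    """Single-task wrapper that is invisible in saved checkpoints."""

    def __init__(self, model):
        self.model = model

    def forward(self, inputs):
        return self.model(inputs)

    def trainable_parameters(self):
        # Assumption: return parameter names; a real model would return tensors.
        return list(self.model.state_dict())

    def state_dict(self):
        # Delegate, so keys carry no wrapper prefix.
        return self.model.state_dict()

    def load_state_dict(self, state):
        self.model.load_state_dict(state)


wrapped = ToyMono(BareModel())
wrapped.model.params["encoder.weight"] = 3.0  # pretend training changed a weight
checkpoint = wrapped.state_dict()

restored = BareModel()                        # bare model, no wrapper involved
restored.load_state_dict(checkpoint)
is_model = isinstance(wrapped, ToyModelBase)  # the check a runner could rely on
```

Checking against the abstract base rather than a concrete class means a new subclass is accepted without changing the caller.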
Both classes are registered with [MODELS][multimolecule.MODELS] under the keys "mono" and "poly". The default network.type: auto dispatches between them based on the resolved network shape; users may set network.type: mono or network.type: poly explicitly to bypass the dispatcher.
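As a rough illustration of key-based registration and the auto dispatch, the sketch below uses a plain dictionary as a toy registry. The dispatch rule shown (choose poly when there is more than one head or a neck is configured) is an assumption made for illustration, as are the `heads` and `neck` config keys; the actual dispatcher and the real MODELS registry may behave differently.

```python
TOY_MODELS = {}  # toy registry; not the real multimolecule.MODELS object


def register(key):
    """Class decorator that stores a class in the toy registry under a key."""
    def decorator(cls):
        TOY_MODELS[key] = cls
        return cls
    return decorator


@register("mono")
class ToyMonoModel:
    pass


@register("poly")
class ToyPolyModel:
    pass


def resolve_type(network: dict) -> str:
    """Return the registry key for a network config (hypothetical rule)."""
    requested = network.get("type", "auto")
    if requested != "auto":
        # An explicit type bypasses the dispatcher entirely.
        if requested not in TOY_MODELS:
            raise KeyError(f"unknown network type: {requested!r}")
        return requested
    # Assumed rule: several heads or a neck call for the composed model.
    heads = network.get("heads", {})
    needs_poly = len(heads) > 1 or "neck" in network
    return "poly" if needs_poly else "mono"


single_task = {"type": "auto", "heads": {"label": {}}}
multi_task = {"heads": {"a": {}, "b": {}}}
forced = {"type": "poly", "heads": {"label": {}}}

chosen = [resolve_type(cfg) for cfg in (single_task, multi_task, forced)]
model_cls = TOY_MODELS[resolve_type(multi_task)]
```

In this toy, an explicit `type` always wins, which mirrors how setting network.type: mono or network.type: poly skips the automatic choice.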