MultiMolecule
What are you trying to do?
Start from the task you need: predict from a sequence, fine-tune on your data, load a pretrained model, or use a curated dataset.
Prediction
Predict from a sequence
Registered pipelines turn biological task names and input sequences into structured predictions without manual model assembly.
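As a rough illustration of what a "registered pipeline" means, the sketch below maps a task name to a function and returns structured predictions for a list of sequences. It is not MultiMolecule's API: the `PIPELINES` registry, the `register` decorator, the `predict` helper, and the toy `gc-content` task are all hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List

# Hypothetical registry: task name -> function that maps one sequence to a prediction.
PIPELINES: Dict[str, Callable[[str], dict]] = {}


def register(task: str):
    """Register a function under a task name (illustrative only)."""
    def decorator(fn: Callable[[str], dict]) -> Callable[[str], dict]:
        PIPELINES[task] = fn
        return fn
    return decorator


@register("gc-content")
def gc_content(sequence: str) -> dict:
    # Toy stand-in for a model: the fraction of G and C bases in the sequence.
    seq = sequence.upper()
    gc = sum(base in "GC" for base in seq)
    return {"task": "gc-content", "sequence": seq, "score": gc / len(seq) if seq else 0.0}


def predict(task: str, sequences: List[str]) -> List[dict]:
    """Look up the task by name and return one structured prediction per sequence."""
    if task not in PIPELINES:
        raise KeyError(f"unknown task {task!r}; registered: {sorted(PIPELINES)}")
    return [PIPELINES[task](s) for s in sequences]


results = predict("gc-content", ["GGCA", "auau"])
for row in results:
    print(row)
```

The caller only names the task and supplies sequences; model selection and output formatting stay behind the registry, which is the convenience the paragraph above describes.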
Training
Fine-tune on your data
The runner connects pretrained checkpoints with Hugging Face datasets or labelled local tables, using sequence and label columns to start supervised training.
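To make the "sequence and label columns" idea concrete, here is a minimal sketch that reads a labelled table into `(sequence, label)` pairs using only the standard library. The column names `sequence` and `label`, the numeric label type, and the `load_labelled_table` helper are assumptions for illustration; the runner's actual configuration may differ.

```python
import csv
import io
from typing import List, TextIO, Tuple


def load_labelled_table(
    handle: TextIO,
    sequence_column: str = "sequence",  # assumed column name
    label_column: str = "label",        # assumed column name
) -> List[Tuple[str, float]]:
    """Read a CSV table and return (sequence, label) pairs for supervised training."""
    reader = csv.DictReader(handle)
    missing = {sequence_column, label_column} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"table is missing columns: {sorted(missing)}")
    examples = []
    for row in reader:
        sequence = row[sequence_column].strip().upper()
        if not sequence:
            continue  # skip rows without a sequence
        examples.append((sequence, float(row[label_column])))
    return examples


# An in-memory table stands in for a local file here.
table = io.StringIO("id,sequence,label\nr1,ACGU,0.5\nr2,gguc,1\n")
examples = load_labelled_table(table)
print(examples)
```

Checking for the two columns up front gives an early, readable error instead of a failure partway through training.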
Models
Load a pretrained model
Model cards give checkpoint IDs, expected inputs, citations, and licenses, while Python APIs support direct model control beyond task pipelines.
Datasets
Use a curated dataset
Curated biological datasets include sequence and label fields, task metadata, source information, citations, and licenses for benchmarks, examples, and fine-tuning.
One stack underneath
When you need more control, the same ecosystem exposes documented resources, biological input handling, reusable model components, and execution tools for prediction, training, evaluation, and scripted use.
Execution
Pipelines, runner, and API
Pipelines provide ready task predictions, the runner manages supervised training and evaluation, and API entry points support scripts and applications.
Resources
Models and datasets with provenance
Dataset cards and model cards collect supported inputs, task names, checkpoint IDs, citations, licenses, and training metadata.
Data layer
Biological data to model-ready inputs
IO, tokenisers, and data utilities turn biological sequences, structures, and annotations into consistent inputs for pipelines, training, and evaluation.
io · tokenisers · data
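The following sketch shows, in simplified form, what turning sequences into "consistent inputs" can involve: mapping nucleotides to integer IDs, adding special tokens, and padding a batch with an attention mask. The vocabulary, the special-token names, and the T-to-U normalisation are assumptions made for this example, not the library's actual tokeniser.

```python
from typing import Dict, List

# Assumed toy vocabulary; a real tokeniser defines its own tokens and IDs.
VOCAB: Dict[str, int] = {
    "<pad>": 0, "<cls>": 1, "<eos>": 2, "<unk>": 3,
    "A": 4, "C": 5, "G": 6, "U": 7,
}


def encode(sequence: str) -> List[int]:
    """Normalise one RNA sequence and map it to token IDs with start and end tokens."""
    seq = sequence.upper().replace("T", "U")  # assumption: treat DNA-style T as U
    body = [VOCAB.get(base, VOCAB["<unk>"]) for base in seq]
    return [VOCAB["<cls>"]] + body + [VOCAB["<eos>"]]


def batch_encode(sequences: List[str]) -> Dict[str, List[List[int]]]:
    """Pad every sequence to the longest one and mark real tokens in the mask."""
    encoded = [encode(s) for s in sequences]
    width = max(len(ids) for ids in encoded)
    input_ids = [ids + [VOCAB["<pad>"]] * (width - len(ids)) for ids in encoded]
    attention_mask = [[1] * len(ids) + [0] * (width - len(ids)) for ids in encoded]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = batch_encode(["ACGU", "gt"])
print(batch["input_ids"])
print(batch["attention_mask"])
```

Because every row has the same length and the mask separates real tokens from padding, the same batch format can feed prediction, training, and evaluation code.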
Model layer
Reusable model building blocks
Models provide pretrained configs, AutoModel classes, checkpoints, and output contracts; modules provide backbones, heads, losses, and embeddings for custom architectures.
Community
- Google Group
  Receive release announcements, migration notes, and design RFCs without following every issue.
- Discourse
  Ask which pipeline, model, or dataset fits a biological problem; share configs, request models, and discuss model components.