APARENT2¶
Deep residual neural network for predicting human 3’ UTR Alternative Polyadenylation (APA) and cleavage magnitude at nucleotide resolution, and for deciphering the impact of genetic variants on polyadenylation.
Disclaimer¶
This is an UNOFFICIAL implementation of Deciphering the impact of genetic variation on human polyadenylation using APARENT2 by Johannes Linder, Samantha E. Koplik, et al.
The OFFICIAL repository of APARENT2 is at johli/aparent-resnet.
Tip
The MultiMolecule team has confirmed that the provided model and checkpoints are producing the same intermediate representations as the original implementation.
The team releasing APARENT2 did not write a model card for this model, so this model card has been written by the MultiMolecule team.
Model Details¶
APARENT2 is a residual convolutional neural network (a ResNet successor to the original APARENT) trained on a 3’ UTR massively parallel reporter assay (MPRA). Given a fixed 205 nt polyadenylation signal (PAS) sequence, it predicts a nucleotide-resolution cleavage probability distribution as well as the overall isoform abundance. It is primarily used to score the effect of genetic variants on polyadenylation by comparing the predictions for a reference and an alternate sequence.
Model Specification¶
| Num Layers | Hidden Size | Num Parameters (M) | FLOPs (G) | MACs (G) | Max Num Tokens |
|---|---|---|---|---|---|
| 28 | 32 | 0.19 | 0.08 | 0.04 | 205 |
Links¶
- Code: multimolecule.aparent2
- Data: Massively-parallel polyadenylation MPRA with variant-effect evaluation data
- Paper: Deciphering the impact of genetic variation on human polyadenylation using APARENT2
- Developed by: Johannes Linder, Samantha E. Koplik, Anshul Kundaje, Georg Seelig
- Model type: 1D residual CNN successor to APARENT for polyadenylation isoform, cleavage, and variant-effect prediction
- Original Repository: johli/aparent-resnet
Usage¶
The model file depends on the multimolecule library. You can install it using pip: `pip install multimolecule`
Direct Use¶
Polyadenylation Cleavage Prediction¶
You can use this model directly to predict the cleavage distribution of a 205 nt polyadenylation signal sequence (core hexamer starting at position 70):
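The input convention described above can be checked before any model call. The following is a minimal sketch using plain Python strings rather than the library's tokenizer; the helper name `check_pas_window` and the two constants are hypothetical and not part of multimolecule.

```python
WINDOW_LENGTH = 205  # fixed input length stated above
HEXAMER_START = 70   # 0-indexed start of the core hexamer


def check_pas_window(sequence: str, hexamer: str = "AAUAAA") -> bool:
    """Return True if `hexamer` starts at HEXAMER_START in a 205 nt window.

    Raises ValueError when the sequence is not exactly WINDOW_LENGTH long.
    DNA input is accepted by mapping T to U before comparing.
    """
    rna = sequence.upper().replace("T", "U")
    if len(rna) != WINDOW_LENGTH:
        raise ValueError(f"expected {WINDOW_LENGTH} nt, got {len(rna)}")
    return rna[HEXAMER_START:HEXAMER_START + len(hexamer)] == hexamer


# Toy window: 70 nt of upstream filler, the hexamer, then 129 nt of downstream filler.
window = "C" * 70 + "AATAAA" + "G" * 129
print(len(window), check_pas_window(window))
```

A `False` result is not necessarily an error, since AAUAAA is only one example of a core hexamer; pass a different `hexamer` argument to check for another variant.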
Variant Effect Scoring¶
Score a reference and an alternate sequence separately, then compare:
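The comparison step can be sketched as follows, assuming you already hold the 206 raw cleavage scores for the reference and the alternate sequence as plain Python lists. The log odds ratio of in-window cleavage probability used here is one possible summary, not necessarily the one used upstream; the function names and the default `[0, 205)` summation range are assumptions of this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def isoform_log_odds_ratio(ref_scores, alt_scores, start=0, end=205):
    """Log odds ratio (alt vs. ref) of the cleavage probability in [start, end)."""

    def logit_usage(scores):
        usage = sum(softmax(scores)[start:end])
        usage = min(max(usage, 1e-7), 1 - 1e-7)  # keep the logit finite
        return math.log(usage / (1 - usage))

    return logit_usage(alt_scores) - logit_usage(ref_scores)


# Toy example: the alternate allele shifts score toward the trailing
# "no cleavage in window" bucket, so the log odds ratio is negative.
ref = [0.0] * 206
alt = list(ref)
alt[-1] = 1.0
print(isoform_log_odds_ratio(ref, alt))
```

A negative value indicates that the alternate sequence is predicted to be cleaved less within the chosen range than the reference; narrowing `start` and `end` restricts the comparison to a region around the expected cleavage site.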
Interface¶
- Input length: fixed 205 nt window
- Hexamer position: core hexamer (e.g., AAUAAA) at position 70 (0-indexed) of the 205 nt window
- Output: 206-dim cleavage distribution (one score per input position + trailing “no cleavage in window” bucket)
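To illustrate the output layout, the sketch below turns 206 raw scores into a probability distribution and separates the 205 per-position entries from the trailing bucket. It operates on a plain Python list of made-up scores; the function names are hypothetical and nothing here calls the multimolecule API.

```python
import math


def cleavage_distribution(scores):
    """Numerically stable softmax over the raw cleavage scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def split_distribution(probs):
    """Split into per-position probabilities and the trailing no-cleavage bucket."""
    return probs[:-1], probs[-1]


# Toy scores: flat background with a single peak at position 100.
scores = [0.0] * 206
scores[100] = 3.0
per_position, no_cleavage = split_distribution(cleavage_distribution(scores))
best_position = max(range(len(per_position)), key=per_position.__getitem__)
print(len(per_position), best_position)
```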
Variant Effect¶
- Score reference and alternate sequences separately and compare their cleavage / isoform predictions
- There is no separate ref/alt output dataclass
Training Details¶
APARENT2 was trained to predict nucleotide-resolution cleavage and isoform abundance from 3’ UTR MPRA measurements.
Training Data¶
The model was trained on the 3’ UTR MPRA library used by the original APARENT, re-processed with additional improvements (exact cleavage positions for the Alien1 Random sublibrary and a 20 nt random barcode upstream of the upstream sequence element (USE) in the Alien1 sublibrary). The measured variant data and the processed-data repository are available from the original APARENT GitHub repository.
Training Procedure¶
Pre-training¶
The model minimizes a combination of a sigmoid KL-divergence isoform loss and a KL-divergence cleavage loss, weighted equally. The released inference model corresponds to the residual-network model trained for 5 epochs on all sublibraries (excluding ClinVar wild-type sequences), with dropout disabled for inference.
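The shape of this objective can be sketched for a single example as below. This is an interpretation, not the upstream training code: "sigmoid KL-divergence" is read here as the KL divergence between two Bernoulli distributions over the isoform proportion, the clipping constant `EPS` is an assumption, and all function names are hypothetical.

```python
import math

EPS = 1e-7  # clipping constant to avoid log(0); an assumption of this sketch


def _clip(p):
    return min(max(p, EPS), 1.0 - EPS)


def kl_divergence(target, predicted):
    """KL(target || predicted) between two discrete cleavage distributions."""
    return sum(
        _clip(t) * math.log(_clip(t) / _clip(p)) for t, p in zip(target, predicted)
    )


def bernoulli_kl(target, predicted):
    """KL divergence between two isoform proportions treated as Bernoulli rates."""
    t, p = _clip(target), _clip(predicted)
    return t * math.log(t / p) + (1 - t) * math.log((1 - t) / (1 - p))


def combined_loss(target_cleavage, predicted_cleavage, target_isoform, predicted_isoform):
    """Equally weighted sum of the isoform and cleavage terms."""
    return bernoulli_kl(target_isoform, predicted_isoform) + kl_divergence(
        target_cleavage, predicted_cleavage
    )


# Toy two-bucket example: a prediction that disagrees with the measurement.
loss = combined_loss([0.5, 0.5], [0.9, 0.1], 0.3, 0.6)
print(loss > 0)
```

The loss is zero when both the predicted cleavage distribution and the predicted isoform proportion match the measurement exactly; averaging over a batch is omitted for brevity.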
Citation¶
Note
The artifacts distributed in this repository are part of the MultiMolecule project. If MultiMolecule supports your research, please cite the MultiMolecule project.
Contact¶
Please use GitHub issues of MultiMolecule for any questions or comments on the model card.
Please contact the authors of the APARENT2 paper for questions or comments on the paper/model.
License¶
This model implementation is licensed under the GNU Affero General Public License.
For additional terms and clarifications, please refer to our License FAQ.
API Reference¶
Aparent2Config¶
Bases: PreTrainedConfig
This is the configuration class to store the configuration of an Aparent2Model. It is used to instantiate an APARENT2 model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a configuration similar to that of the APARENT2 johli/aparent-resnet architecture.
Configuration objects inherit from PreTrainedConfig and can be used to control the model outputs. Read the documentation from PreTrainedConfig for more information.
APARENT2 is a residual convolutional network that predicts human 3’ UTR Alternative Polyadenylation (APA) and cleavage magnitude at nucleotide resolution. The network is fully convolutional plus a position-wise locally-connected library-bias layer; it does not contain any flatten/dense layers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | int | Vocabulary size of the APARENT2 model. Defines the number of one-hot input channels derived from … | 5 |
| | int | The fixed length of the polyadenylation signal sequence the model was trained on. APARENT2 expects a 205 nt window with the core hexamer (e.g. AAUAAA) starting at position 70 (0-indexed). | 205 |
| | int | Number of feature channels used throughout the residual network. | 32 |
| | int | Number of residual-block groups. | 7 |
| | int | Number of residual blocks per group. | 4 |
| | int | Convolution kernel size used inside each residual block. | 3 |
| | list[int] \| None | Dilation factor for each residual-block group. Must have … | None |
| | int | Dimensionality of the one-hot training sub-library bias input. | 13 |
| | int | The training sub-library index used to construct the deterministic library-bias input. The upstream variant-effect workflow always uses index 11. | 11 |
| | str | The non-linear activation function used inside the residual blocks. | 'relu' |
| | float | The epsilon used by the batch normalization layers. | 0.001 |
| | float | The momentum used by the batch normalization layers. | 0.99 |
| | int | Number of output labels. APARENT2 predicts a cleavage distribution over 206 buckets (one per input position plus a trailing “no cleavage in window” bucket). | 206 |
| | HeadConfig \| None | The configuration of the prediction head. Defaults to a regression head ( … | None |
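The library-bias input described in the table is a one-hot vector over the training sub-libraries. A minimal sketch of building it with the documented defaults (dimensionality 13, index 11) follows; the function name `library_one_hot` is hypothetical, and how the model consumes the vector internally is not shown.

```python
def library_one_hot(index: int = 11, size: int = 13):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} is outside [0, {size})")
    return [1.0 if i == index else 0.0 for i in range(size)]


# Defaults follow the table: 13 sub-libraries, index 11 for variant-effect scoring.
bias_input = library_one_hot()
print(bias_input)
```

Because the index is fixed by the configuration, this input is deterministic and does not depend on the sequence being scored.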
Source code in multimolecule/models/aparent2/configuration_aparent2.py
Aparent2ForSequencePrediction¶
Bases: Aparent2PreTrainedModel
APARENT2 with a sequence-level prediction head.
The backbone already produces a sequence_length + 1 dimensional cleavage score (the APA cleavage distribution
before softmax), so this wrapper exposes those converted upstream scores directly and adds the shared
MultiMolecule regression loss.
Source code in multimolecule/models/aparent2/modeling_aparent2.py
Aparent2Model¶
Bases: Aparent2PreTrainedModel
The bare APARENT2 residual network.
APARENT2 predicts a nucleotide-resolution cleavage distribution for a fixed 205 nt polyadenylation signal window.
The core hexamer (e.g. AAUAAA) is expected to start at position 70 (0-indexed). Variant effect is an
input-schema concern: score a reference and an alternate sequence separately and compare their cleavage /
isoform predictions; there is no separate ref/alt output dataclass.
Source code in multimolecule/models/aparent2/modeling_aparent2.py
Aparent2ModelOutput dataclass¶
Bases: ModelOutput
Base class for outputs of the APARENT2 model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | `torch.FloatTensor` of shape `(1,)`, *optional* | Not produced by the bare model; present for API compatibility. | None |
| | `torch.FloatTensor` of shape `(batch_size, sequence_length + 1)` | APA cleavage scores (before softmax) for each position plus a trailing “no cleavage in window” bucket. | None |
| | `torch.FloatTensor` of shape `(batch_size, sequence_length + 1)` | Same content as … | None |
| | `torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)` | The residual-network feature map before the final cleavage projection. | None |
| | `tuple(torch.FloatTensor)`, *optional* | Hidden states of the model at the output of each layer. | None |
Source code in multimolecule/models/aparent2/modeling_aparent2.py
Aparent2PreTrainedModel¶
Bases: PreTrainedModel
An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.