DeepMEL¶
Convolutional and recurrent neural network for predicting melanoma-specific accessible chromatin regions and chromatin topics directly from DNA sequence.
Disclaimer¶
This is an UNOFFICIAL implementation of Cross-species analysis of enhancer logic using deep learning by Liesbeth Minnoye, Ibrahim Ihsan Taskiran, et al.
The OFFICIAL repository of DeepMEL is at aertslab/DeepMEL.
Tip
The MultiMolecule team has confirmed that the provided model and checkpoints produce the same intermediate representations as the original implementation.
The team that released DeepMEL did not write a model card for it, so this model card was written by the MultiMolecule team.
Model Details¶
DeepMEL is a hybrid convolutional / recurrent neural network trained to predict 24 melanoma chromatin topics (including topic 4, the melanocytic MEL program, topic 7, the mesenchymal-like MES program, and additional accessibility programs) directly from 500 bp DNA sequence. Each input sequence is processed by a shared encoder consisting of a 1D convolution, max pooling, a time-distributed dense projection, and a bidirectional LSTM, followed by a fully-connected layer. The same encoder is applied independently to the forward DNA strand and to its reverse complement. In each branch, a final 24-way decoder produces a sigmoid probability per topic, and the probabilities of the two branches are averaged to give the model's prediction. Please refer to the Training Details section for more information on the training process.
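The strand-averaging step described above can be sketched in plain Python. This is a minimal illustration, not the MultiMolecule implementation: `toy_branch_logits` is a hypothetical stand-in for the shared encoder and decoder, and only the reverse-complement and averaging logic follows the description.

```python
from __future__ import annotations

import math

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case ACGT string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def toy_branch_logits(seq: str, num_topics: int = 24) -> list[float]:
    """Hypothetical stand-in for the shared encoder and decoder.

    Returns one logit per topic and is deliberately strand-sensitive,
    so that the averaging step below actually matters.
    """
    return [
        sum(
            (i + 1) * (1.0 if base == "ACGT"[topic % 4] else -0.5)
            for i, base in enumerate(seq)
        )
        / (len(seq) ** 2)
        for topic in range(num_topics)
    ]


def predict_topics(seq: str) -> list[float]:
    """Average the per-topic sigmoid probabilities of both strands."""
    forward = [sigmoid(z) for z in toy_branch_logits(seq)]
    reverse = [sigmoid(z) for z in toy_branch_logits(reverse_complement(seq))]
    return [(f + r) / 2.0 for f, r in zip(forward, reverse)]


sequence = "ACGTTGCAAC" * 50  # 500 bp toy input
probabilities = predict_topics(sequence)
```

Because both strands pass through the same function and the results are averaged, `predict_topics` returns the same values whether it is given a sequence or its reverse complement.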
Model Specification¶
| Conv Filters | Conv Kernel | BiLSTM Hidden | FC Hidden | Num Topics | Num Parameters (M) | FLOPs (M) | MACs (M) | Max Num Tokens |
|---|---|---|---|---|---|---|---|---|
| 128 | 20 | 128 | 256 | 24 | 3.44 | 40.76 | 20.19 | 500 |
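As a rough sanity check on the table, the sequence lengths and parameter count can be worked out by hand. The sketch below uses the values listed in this card, but the padding mode, the flattening of all BiLSTM time steps into the fully-connected layer, and the single bias vector per LSTM gate are assumptions rather than details confirmed here. Under those assumptions the total comes out close to the reported 3.44 M.

```python
# Values taken from the specification table and the configuration defaults.
sequence_length = 500
in_channels = 5      # vocabulary size (one-hot channels)
conv_filters = 128
conv_kernel = 20
pool_size = 10
dense_dim = 128
lstm_hidden = 128    # per direction
fc_dim = 256
num_topics = 24

# ASSUMPTION: unpadded convolution with stride 1, then non-overlapping pooling.
conv_length = sequence_length - conv_kernel + 1
pooled_length = conv_length // pool_size

conv_params = in_channels * conv_kernel * conv_filters + conv_filters
dense_params = conv_filters * dense_dim + dense_dim
# ASSUMPTION: one bias vector per LSTM gate (some frameworks store two).
lstm_params = 2 * 4 * (lstm_hidden * (dense_dim + lstm_hidden) + lstm_hidden)
# ASSUMPTION: every BiLSTM time step is flattened into the fully-connected layer.
fc_params = pooled_length * 2 * lstm_hidden * fc_dim + fc_dim
head_params = fc_dim * num_topics + num_topics

total_params = conv_params + dense_params + lstm_params + fc_params + head_params
print(conv_length, pooled_length, f"{total_params / 1e6:.2f} M")
```

Most of the parameters sit in the fully-connected layer, which is why the pooled sequence length has such a large effect on model size.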
Links¶
- Code: multimolecule.deepmel
- Weights: multimolecule/deepmel
- Data: Melanoma cell-line single-cell ATAC-seq topic models
- Paper: Cross-species analysis of enhancer logic using deep learning
- Developed by: Ibrahim Ihsan Taskiran, Liesbeth Minnoye, Stein Aerts
- Model type: 1D CNN + BiLSTM over 500 bp DNA with reverse-complement averaging for multi-task chromatin-topic prediction
- Original Repository: aertslab/DeepMEL
Usage¶
The model file depends on the multimolecule library. You can install it using pip: `pip install multimolecule`
Direct Use¶
Chromatin Topic Prediction¶
You can use this model directly to predict the 24 melanoma chromatin-topic activities of a 500 bp DNA sequence:
Interface¶
- Input length: fixed 500 bp DNA window
- Alphabet: `ACGT` (one-hot encoded); the reverse complement is computed internally
- Output: 24 chromatin-topic logits (multi-label binary); `postprocess` returns the branch-averaged sigmoid probability per topic
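The input contract above can be illustrated with a small stand-alone encoder. This is only a sketch: the four-channel `ACGT` ordering and the rejection of other symbols are assumptions, and the library's own tokenizer and vocabulary (default size 5) may order or extend the channels differently. The helper name `one_hot_window` is hypothetical.

```python
from __future__ import annotations

WINDOW = 500       # fixed input length in base pairs
ALPHABET = "ACGT"  # ASSUMPTION: channel order used for this illustration


def one_hot_window(seq: str) -> list[list[int]]:
    """One-hot encode a 500 bp DNA window as a 500 x 4 matrix."""
    seq = seq.upper()
    if len(seq) != WINDOW:
        raise ValueError(f"expected {WINDOW} bp, got {len(seq)}")
    unknown = set(seq) - set(ALPHABET)
    if unknown:
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    return [[int(base == letter) for letter in ALPHABET] for base in seq]


encoded = one_hot_window("acgt" * 125)
```

Sequences of any other length need to be cropped or extended to 500 bp before encoding; how to do that depends on the application and is left out here.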
Training Details¶
DeepMEL was trained to predict cell-type-specific accessible chromatin topics derived from single-cell ATAC-seq of melanoma cell lines.
Training Data¶
DeepMEL was trained on accessible genomic intervals derived from melanoma single-cell ATAC-seq experiments and modeled as 24 chromatin topics (including the 4-MEL melanocytic-like and 7-MES mesenchymal-like programs). Each training example is a 500 bp genomic interval labelled with a binary vector indicating which topics are active. Chromosome 2 was held out for validation and testing.
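The labelling and chromosome hold-out described above can be sketched as follows. The record layout, the `chr2` naming, and the 1-based topic numbering are assumptions made for this illustration, and the example regions are invented placeholders, not real training data.

```python
from __future__ import annotations

NUM_TOPICS = 24
HELD_OUT_CHROMOSOME = "chr2"  # ASSUMPTION: UCSC-style chromosome names


def topic_labels(active_topics: set[int]) -> list[int]:
    """Binary vector with a 1 for each active topic (topics numbered 1..24)."""
    return [int(topic in active_topics) for topic in range(1, NUM_TOPICS + 1)]


def split_by_chromosome(regions: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate training regions from those on the held-out chromosome."""
    train, held_out = [], []
    for region in regions:
        # Exact match, so that chr20, chr21, ... are not held out by mistake.
        target = held_out if region["chrom"] == HELD_OUT_CHROMOSOME else train
        target.append(region)
    return train, held_out


# Hypothetical 500 bp regions with their active topics.
regions = [
    {"chrom": "chr1", "start": 1000, "end": 1500, "topics": {4}},
    {"chrom": "chr2", "start": 2000, "end": 2500, "topics": {7, 12}},
    {"chrom": "chr20", "start": 3000, "end": 3500, "topics": {4, 7}},
]
train, held_out = split_by_chromosome(regions)
labels = [topic_labels(region["topics"]) for region in train]
```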
Training Procedure¶
Pre-training¶
The model was trained to minimize a multi-label binary cross-entropy loss between the branch-averaged sigmoid probabilities and the observed topic-activity labels.
- Optimizer: Adam
- Loss: Multi-label binary cross-entropy
- Regularization: Dropout (`0.2` after pooling, `0.1` LSTM input and recurrent dropout, `0.2` after the BiLSTM, `0.4` before the prediction head)
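The training objective above can be written out for a single sequence as a short worked example. The function name, the clipping constant, and the choice to average the loss over topics are assumptions for illustration; the three-topic inputs are toy values.

```python
import math


def multilabel_bce(probabilities, labels, eps=1e-7):
    """Mean binary cross-entropy over topics for one sequence."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Toy per-branch sigmoid probabilities for three topics.
forward_probs = [0.9, 0.2, 0.6]
reverse_probs = [0.7, 0.4, 0.6]

# The loss is computed on the branch-averaged probabilities.
averaged = [(f + r) / 2 for f, r in zip(forward_probs, reverse_probs)]
loss = multilabel_bce(averaged, [1, 0, 1])
```

Each topic contributes its own binary term, so a region can be active in several topics at once without the terms competing as they would under a softmax.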
Citation¶
Note
The artifacts distributed in this repository are part of the MultiMolecule project. If you use MultiMolecule in your research, you must cite the MultiMolecule project as follows:
| BibTeX | |
|---|---|
Contact¶
Please use GitHub issues of MultiMolecule for any questions or comments on the model card.
Please contact the authors of the DeepMEL paper for questions or comments on the paper/model.
License¶
This model implementation is licensed under the GNU Affero General Public License.
For additional terms and clarifications, please refer to our License FAQ.
| Text Only | |
|---|---|
multimolecule.models.deepmel¶
DnaTokenizer¶
Bases: Tokenizer
Tokenizer for DNA sequences.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | Alphabet \| str \| List[str] \| None | Alphabet to use for tokenization. | None |
| | int | Size of kmer to tokenize. | 1 |
| | bool | Whether to tokenize into codons. | False |
| | bool | Whether to replace U with T. | True |
| | bool | Whether to convert input to uppercase. | True |
Examples:
Source code in multimolecule/tokenisers/dna/tokenization_dna.py
DeepMelConfig¶
Bases: PreTrainedConfig
This configuration class stores the configuration of a DeepMelModel. It is used to instantiate a DeepMEL model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a configuration similar to that of the original DeepMEL architecture (aertslab/DeepMEL).
Configuration objects inherit from PreTrainedConfig and can be used to control the model outputs. Read the documentation from PreTrainedConfig for more information.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | int | Vocabulary size of the DeepMEL model. Defines the number of feature channels in the one-hot encoded input fed to the first convolution. Defaults to 5. | 5 |
| | int | The fixed length (in base pairs) of the input DNA sequence. Defaults to 500. | 500 |
| | int | Number of output channels (filters) of the first convolution. Defaults to 128. | 128 |
| | int | Convolution kernel size. Defaults to 20. | 20 |
| | int | Max-pool window applied after the convolution. The convolution stride is 1 and the pool stride matches the pool size, so the effective downsampling factor equals the pool size. | 10 |
| | int | Hidden size of the time-distributed dense layer applied after pooling. Defaults to 128. | 128 |
| | int | Hidden size of each direction of the bidirectional LSTM. Defaults to 128. | 128 |
| | int | Hidden size of the fully-connected layer between the recurrent stack and the prediction head. Defaults to 256. | 256 |
| | str | The non-linear activation function (function or string) in the encoder. If string, … | 'relu' |
| | float | The dropout probability after the convolutional max-pool block. | 0.2 |
| | float | The dropout probability after the bidirectional LSTM. | 0.2 |
| | float | The dropout probability after the fully-connected layer. | 0.4 |
| | float | The dropout probability applied to the LSTM input weights during training. | 0.1 |
| | float | The dropout probability applied to the LSTM recurrent weights during training. | 0.1 |
| | int | Number of multi-label binary topics. DeepMEL predicts 24 melanoma topics (topic 4 MEL, topic 7 MES, and others). Defaults to 24. | 24 |
| | HeadConfig \| None | The configuration of the prediction head. Defaults to a multi-label binary classification head (…). | None |
Examples:
Source code in multimolecule/models/deepmel/configuration_deepmel.py
DeepMelForSequencePrediction¶
Bases: DeepMelPreTrainedModel
Examples:
Source code in multimolecule/models/deepmel/modeling_deepmel.py
DeepMelModel¶
Bases: DeepMelPreTrainedModel
Examples:
Source code in multimolecule/models/deepmel/modeling_deepmel.py
DeepMelModelOutput dataclass¶
Bases: ModelOutput
Base class for outputs of the DeepMEL model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | `torch.FloatTensor` of shape `(batch_size, fc_dim)` | Per-branch fully-connected representation of the forward DNA strand (i.e. before averaging with the reverse-complement branch). Useful for strand-specific interpretation. | None |
| | `torch.FloatTensor` of shape `(batch_size, fc_dim)` | Branch-averaged sequence-level representation for backbone use cases. The topic head consumes the forward and reverse-complement branch representations directly. | None |
| | `torch.FloatTensor` of shape `(batch_size, fc_dim)` | Fully-connected representation of the forward DNA strand, before branch averaging. | None |
| | `torch.FloatTensor` of shape `(batch_size, fc_dim)` | Fully-connected representation of the reverse-complement DNA strand, before branch averaging. | None |
| | `tuple(torch.FloatTensor)`, *optional* | Always … | None |
| | `tuple(torch.FloatTensor)`, *optional* | Always … | None |
Source code in multimolecule/models/deepmel/modeling_deepmel.py
DeepMelPreTrainedModel¶
Bases: PreTrainedModel
An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.