configuration_utils

multimolecule.models.configuration_utils

HeadConfig dataclass

Bases: BaseHeadConfig

Configuration class for a prediction head.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `num_labels` | `int` | Number of labels to use in the last layer added to the model, typically for a classification task. Head should look for `Config.num_labels` if it is `None`. | `None` |
| `problem_type` | `str` | Problem type for `XxxForYyyPrediction` models. Can be one of `"regression"`, `"single_label_classification"`, or `"multi_label_classification"`. Head should look for `Config.problem_type` if it is `None`. | `None` |
| `hidden_size` | `int \| None` | Dimensionality of the encoder layers and the pooler layer. Head should look for `Config.hidden_size` if it is `None`. | `None` |
| `dropout` | `float` | The dropout ratio for the hidden states. | `0.0` |
| `transform` | `str \| None` | The transform operation applied to hidden states. | `None` |
| `transform_act` | `str \| None` | The activation function of the transform applied to hidden states. | `'gelu'` |
| `bias` | `bool` | Whether to apply bias to the final prediction layer. | `True` |
| `act` | `str \| None` | The activation function of the final prediction output. | `None` |
| `layer_norm_eps` | `float` | The epsilon used by the layer normalization layers. | `1e-12` |
| `output_name` | `str \| None` | The name of the tensor required in model outputs. If `None`, will use the default output name of the corresponding head. | `None` |
Source code in multimolecule/module/heads/config.py
Python
@dataclass
class HeadConfig(BaseHeadConfig):
    r"""
    Configuration class for a prediction head.

    Args:
        num_labels:
            Number of labels to use in the last layer added to the model, typically for a classification task.

            Head should look for [`Config.num_labels`][multimolecule.PreTrainedConfig] if it is `None`.
        problem_type:
            Problem type for `XxxForYyyPrediction` models. Can be one of `"regression"`,
            `"single_label_classification"` or `"multi_label_classification"`.

            Head should look for [`Config.problem_type`][multimolecule.PreTrainedConfig] if it is `None`.
        hidden_size:
            Dimensionality of the encoder layers and the pooler layer.

            Head should look for [`Config.hidden_size`][multimolecule.PreTrainedConfig] if it is `None`.
        dropout:
            The dropout ratio for the hidden states.
        transform:
            The transform operation applied to hidden states.
        transform_act:
            The activation function of the transform applied to hidden states.
        bias:
            Whether to apply bias to the final prediction layer.
        act:
            The activation function of the final prediction output.
        layer_norm_eps:
            The epsilon used by the layer normalization layers.
        output_name (`str`, *optional*):
            The name of the tensor required in model outputs.

            If `None`, will use the default output name of the corresponding head.
    """

    num_labels: int = None  # type: ignore[assignment]
    problem_type: str = None  # type: ignore[assignment]
    hidden_size: int | None = None
    dropout: float = 0.0
    transform: str | None = None
    transform_act: str | None = "gelu"
    bias: bool = True
    act: str | None = None
    layer_norm_eps: float = 1e-12
    output_name: str | None = None
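
A minimal usage sketch (not from the library's own documentation): constructing a `HeadConfig` for a three-class, single-label classification head. The import path is inferred from the `Source code` note above, and the field values are illustrative.

Python
from dataclasses import asdict

# Import path inferred from "Source code in multimolecule/module/heads/config.py";
# the package may also re-export HeadConfig elsewhere.
from multimolecule.module.heads.config import HeadConfig

# Illustrative values for a three-class, single-label classification task.
head = HeadConfig(
    num_labels=3,
    problem_type="single_label_classification",
    dropout=0.1,
)

# hidden_size is left as None, so the head is expected to fall back to
# Config.hidden_size, as documented above.
assert head.hidden_size is None

# HeadConfig is a dataclass, so it serializes to a plain dict.
print(asdict(head))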

MaskedLMHeadConfig dataclass

Bases: BaseHeadConfig

Configuration class for a Masked Language Modeling head.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `hidden_size` | `int \| None` | Dimensionality of the encoder layers and the pooler layer. Head should look for `Config.hidden_size` if it is `None`. | `None` |
| `dropout` | `float` | The dropout ratio for the hidden states. | `0.0` |
| `transform` | `str \| None` | The transform operation applied to hidden states. | `'nonlinear'` |
| `transform_act` | `str \| None` | The activation function of the transform applied to hidden states. | `'gelu'` |
| `bias` | `bool` | Whether to apply bias to the final prediction layer. | `True` |
| `act` | `str \| None` | The activation function of the final prediction output. | `None` |
| `layer_norm_eps` | `float` | The epsilon used by the layer normalization layers. | `1e-12` |
| `output_name` | `str \| None` | The name of the tensor required in model outputs. If `None`, will use the default output name of the corresponding head. | `None` |
Source code in multimolecule/module/heads/config.py
Python
@dataclass
class MaskedLMHeadConfig(BaseHeadConfig):
    r"""
    Configuration class for a Masked Language Modeling head.

    Args:
        hidden_size:
            Dimensionality of the encoder layers and the pooler layer.

            Head should look for [`Config.hidden_size`][multimolecule.PreTrainedConfig] if it is `None`.
        dropout:
            The dropout ratio for the hidden states.
        transform:
            The transform operation applied to hidden states.
        transform_act:
            The activation function of the transform applied to hidden states.
        bias:
            Whether to apply bias to the final prediction layer.
        act:
            The activation function of the final prediction output.
        layer_norm_eps:
            The epsilon used by the layer normalization layers.
        output_name (`str`, *optional*):
            The name of the tensor required in model outputs.

            If `None`, will use the default output name of the corresponding head.
    """

    hidden_size: int | None = None
    dropout: float = 0.0
    transform: str | None = "nonlinear"
    transform_act: str | None = "gelu"
    bias: bool = True
    act: str | None = None
    layer_norm_eps: float = 1e-12
    output_name: str | None = None
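
Similarly, a brief sketch for the masked language modeling head; the import path is again inferred from the `Source code` note, and the overridden values are arbitrary.

Python
# Import path inferred from "Source code in multimolecule/module/heads/config.py".
from multimolecule.module.heads.config import MaskedLMHeadConfig

# Override a couple of fields; the default "nonlinear" transform with "gelu"
# activation documented above is kept.
lm_head = MaskedLMHeadConfig(dropout=0.1, layer_norm_eps=1e-5)

assert lm_head.transform == "nonlinear"
assert lm_head.transform_act == "gelu"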

PreTrainedConfig

Bases: PretrainedConfig

Base class for all model configuration classes.

Source code in multimolecule/models/configuration_utils.py
Python
class PreTrainedConfig(PretrainedConfig):
    r"""
    Base class for all model configuration classes.
    """

    head: HeadConfig

    hidden_size: int

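    # Default ids of the special tokens: pad, bos, eos, unk, mask, and null.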
    pad_token_id: int = 0
    bos_token_id: int = 1
    eos_token_id: int = 2
    unk_token_id: int = 3
    mask_token_id: int = 4
    null_token_id: int = 5

    def __init__(
        self, pad_token_id=0, bos_token_id=1, eos_token_id=2, unk_token_id=3, mask_token_id=4, null_token_id=5, **kwargs
    ):
        super().__init__(
            pad_token_id=pad_token_id,
            bos_token_id=bos_token_id,
            eos_token_id=eos_token_id,
            unk_token_id=unk_token_id,
            mask_token_id=mask_token_id,
            null_token_id=null_token_id,
            **kwargs,
        )

    def to_dict(self):
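        # Recursively serialize nested values: anything exposing a to_dict()
        # method (nested configs) and any dataclass (such as the head
        # HeadConfig) is converted to a plain dict; if a value matches both
        # checks, asdict takes precedence.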
        output = super().to_dict()
        for k, v in output.items():
            if hasattr(v, "to_dict"):
                output[k] = v.to_dict()
            if is_dataclass(v):
                output[k] = asdict(v)
        return output
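
To show how these pieces fit together, here is a hypothetical sketch of a concrete model configuration built on `PreTrainedConfig`. `DemoConfig` and its fields are invented for illustration and are not part of the library; the sketch assumes the inferred import paths used above.

Python
from __future__ import annotations

from multimolecule.models.configuration_utils import PreTrainedConfig
from multimolecule.module.heads.config import HeadConfig  # inferred import path


class DemoConfig(PreTrainedConfig):
    """Hypothetical configuration subclass, for illustration only."""

    model_type = "demo"

    def __init__(self, hidden_size: int = 768, head: HeadConfig | dict | None = None, **kwargs):
        super().__init__(**kwargs)
        self.hidden_size = hidden_size
        # Accept a ready-made HeadConfig or a plain dict (e.g. when reloading
        # from JSON); otherwise fall back to the documented defaults.
        self.head = HeadConfig(**head) if isinstance(head, dict) else head or HeadConfig()


config = DemoConfig()
assert config.pad_token_id == 0 and config.mask_token_id == 4  # defaults set in __init__

# to_dict converts the nested HeadConfig dataclass via dataclasses.asdict,
# so the result is plain, JSON-serializable data.
output = config.to_dict()
assert isinstance(output["head"], dict)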