bpRNA-new
bpRNA-new is a database of single molecule secondary structures annotated using bpRNA. It is built from RNA families in Rfam 14.2 and is designed for cross-family validation, that is, for assessing how well a model generalizes to families it has not seen. Its families are distinct from those in bpRNA-1m, so a model trained on bpRNA-1m can be evaluated on bpRNA-new as a benchmark of performance on unseen RNA families.
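The cross-family property described above can be checked mechanically: no family label should appear in both the training and the evaluation set. The following is a minimal sketch of such a check. The record layout (a `family` key) and the labels `family_A` through `family_D` are hypothetical placeholders for illustration, not the actual schema or contents of bpRNA-1m or bpRNA-new.

```python
from typing import Iterable, Mapping, Set


def families(records: Iterable[Mapping[str, str]]) -> Set[str]:
    """Collect the set of family labels found in a collection of records."""
    return {record["family"] for record in records}


def shared_families(
    train: Iterable[Mapping[str, str]],
    test: Iterable[Mapping[str, str]],
) -> Set[str]:
    """Return the family labels that occur in both collections."""
    return families(train) & families(test)


# Toy records with placeholder family labels (assumed layout, not the real schema).
train_records = [
    {"id": "seq1", "family": "family_A"},
    {"id": "seq2", "family": "family_B"},
]
test_records = [
    {"id": "seq3", "family": "family_C"},
    {"id": "seq4", "family": "family_D"},
]

# An empty overlap means the split is cross-family.
overlap = shared_families(train_records, test_records)
print(sorted(overlap))
```

A non-empty result would indicate that the evaluation set contains families the model may already have seen, so the score would overstate cross-family generalization.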
Disclaimer
This is an UNOFFICIAL release of bpRNA-new by Kengo Sato et al.
The team that released bpRNA-new did not write a dataset card for it, so this dataset card has been written by the MultiMolecule team.
Dataset Description
- Homepage: https://multimolecule.danling.org/datasets/bprna-new
- datasets: https://huggingface.co/datasets/multimolecule/bprna-new
- Point of Contact: Kengo Sato
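As a rough sketch of how the Hugging Face copy listed above might be loaded, the snippet below uses `load_dataset` from the `datasets` library. The repository name is taken from the URL above; the helper name `load_bprna_new` is hypothetical, and the available splits and columns are not assumed here, so inspect the returned object rather than relying on specific field names.

```python
REPO_ID = "multimolecule/bprna-new"  # repository name taken from the URL above


def load_bprna_new(repo_id: str = REPO_ID):
    """Download (or reuse the local cache of) the dataset from the Hugging Face Hub."""
    # Imported lazily so that defining this helper does not require `datasets`
    # to be installed; calling it does, and it also needs network access.
    from datasets import load_dataset

    return load_dataset(repo_id)
```

Calling `load_bprna_new()` and printing the result lists the splits and columns that the repository actually provides.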
Related Datasets
- bpRNA-1m: A database of single molecule secondary structures annotated using bpRNA.
- bpRNA-spot: A subset of bpRNA-1m obtained by applying CD-HIT (CD-HIT-EST) to remove sequences with more than 80% sequence similarity.
- ArchiveII: A database of RNA secondary structures with the same families as RNAStrAlign, usually used for testing.
License
This dataset is licensed under the AGPL-3.0 License.