Molcrawl Compounds Bert Small

by Kojima Lab (hf-model--kojima-lab--molcrawl-compounds-bert-small)
Free2AITools Nexus Index
FNI Score: 39.0 (Top 100%)
S: Semantic 50
A: Authority 0
P: Popularity 5
R: Recency 99
Q: Quality 65
Tech Context
Parameters: 0.09B
Context length: 512 tokens
Vital Performance
Downloads (last 30 days): 56
0.0%
GPU: 8 GB class, ~2GB estimated VRAM
Architecture: Dense, BertForMaskedLM
License: Apache-2.0 (commercial use permitted)
Model Information Summary
Entity Passport
Registry ID hf-model--kojima-lab--molcrawl-compounds-bert-small
License Apache-2.0
Provider huggingface
💾

Compute Threshold

~1.4GB VRAM


* Static estimation for 4-Bit Quantization.

📜

Cite this model

Academic & Research Attribution

BibTeX
@misc{hf_model__kojima_lab__molcrawl_compounds_bert_small,
  author = {Kojima Lab},
  title = {Molcrawl Compounds Bert Small Model},
  year = {2026},
  howpublished = {\url{https://huggingface.co/kojima-lab/molcrawl-compounds-bert-small}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
Kojima Lab. (2026). Molcrawl Compounds Bert Small [Model]. Free2AITools. https://huggingface.co/kojima-lab/molcrawl-compounds-bert-small

🔬 Technical Deep Dive

Quick Commands

🦙 Ollama Run
ollama run molcrawl-compounds-bert-small
🤗 HF Download
huggingface-cli download kojima-lab/molcrawl-compounds-bert-small

âš–ī¸ Free2AITools Nexus Index V2.0

Semantic (S) 50
Authority (A) 0
Popularity (P) 5
Recency (R) 99
Quality (Q) 65

---

Technical Deep Dive

molcrawl-compounds-bert-small

Model Description

BERT small foundation model (BertForMaskedLM, about 0.09B parameters according to the listing metadata) pre-trained on compound SMILES strings from the MolCrawl dataset.

The tokenizer is a character-level BPE tokenizer (vocab_size=612) that encodes each SMILES character as a separate token. Input SMILES strings should be passed without spaces (e.g. CC(=O)O). The [SEP] token (id=13) is used as the end-of-sequence marker.
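The character-level behaviour described above can be pictured with a small standalone sketch. It is only an illustration of "one token per SMILES character, then [SEP]" and is not the repository's tokenizer: the helper name `split_smiles` is hypothetical, and the real tokenizer may add further special tokens or apply merges that this sketch ignores.

```python
SEP_TOKEN = "[SEP]"  # end-of-sequence marker named in the model card (id=13)


def split_smiles(smiles: str) -> list[str]:
    """Toy illustration: one token per SMILES character, followed by [SEP]."""
    # The model card asks for SMILES strings without spaces.
    if any(ch.isspace() for ch in smiles):
        raise ValueError("SMILES input should not contain whitespace")
    return list(smiles) + [SEP_TOKEN]


print(split_smiles("CC(=O)O"))
# ['C', 'C', '(', '=', 'O', ')', 'O', '[SEP]']
```

To see the actual token ids, load the tokenizer from the repository as in the Usage section and inspect `tokenizer("CC(=O)O")["input_ids"]`.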

Datasets

Usage

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer
import torch

# Load the masked-language model and its tokenizer from the Hugging Face Hub
model = AutoModelForMaskedLM.from_pretrained("kojima-lab/molcrawl-compounds-bert-small")
tokenizer = AutoTokenizer.from_pretrained("kojima-lab/molcrawl-compounds-bert-small")

# Predict masked SMILES token
prompt = "CC(=O)[MASK]"
inputs = tokenizer(prompt, return_tensors="pt")

# Locate the position of the [MASK] token in the encoded sequence
mask_token_id = tokenizer.mask_token_id
mask_index = (inputs["input_ids"] == mask_token_id).nonzero(as_tuple=True)[1]

# Forward pass without gradient tracking (inference only)
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# Take the highest-scoring vocabulary entry at the masked position
predicted_token_id = logits[0, mask_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)
result = prompt.replace("[MASK]", predicted_token)
print(f"Predicted: {result}")
```
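The usage example keeps only the single highest-scoring token. When several candidates are of interest, the logits at the masked position can be turned into probabilities and ranked. The following is a minimal pure-Python sketch of that step; the helper `top_k_candidates`, the four-entry vocabulary, and the logit values are all made up for illustration and are not output from this model.

```python
import math


def top_k_candidates(logits, id_to_token, k=3):
    """Return the k most probable (token, probability) pairs for one position."""
    # Numerically stable softmax over the logit row
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank vocabulary ids by probability, highest first
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id_to_token[i], probs[i]) for i in ranked]


# Hypothetical toy vocabulary and logits, for illustration only
toy_vocab = {0: "C", 1: "O", 2: "N", 3: "="}
toy_logits = [1.0, 3.0, 2.0, -1.0]

for token, prob in top_k_candidates(toy_logits, toy_vocab, k=2):
    print(f"{token}: {prob:.3f}")
```

With the real model, the logit row would come from the masked position of `outputs.logits`, and the id-to-token mapping from the loaded tokenizer.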

Training

This model was trained with the RIKEN Foundation Model pipeline. For more details, please refer to the training configuration files included in this repository.

License

This model is released under the Apache-2.0 license.

Citation

If you use this model, please cite:

```bibtex
@misc{molcrawl_compounds_bert_small,
  title={molcrawl-compounds-bert-small},
  author={{RIKEN}},
  year={2026},
  publisher={{Hugging Face}},
  url={{https://huggingface.co/kojima-lab/molcrawl-compounds-bert-small}}
}
```

âš ī¸ Incomplete Data

Some information about this model is not available. Use with Caution - Verify details from the original source before relying on this data.


📝 Limitations & Considerations

  • Benchmark scores may vary based on evaluation methodology and hardware configuration.
  • VRAM requirements are estimates; actual usage depends on quantization and batch size.
  • FNI scores are relative rankings and may change as new models are added.
  • ⚠ License: listed as Apache-2.0; verify the licensing terms at the original source before commercial use.

Social Proof

Hugging Face Hub
56 Downloads
🔄 Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.


đŸ›Ąī¸ Model Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: hf-model--kojima-lab--molcrawl-compounds-bert-small
slug: kojima-lab--molcrawl-compounds-bert-small
source: huggingface
author: Kojima Lab
license: Apache-2.0
tags: safetensors, bert, pytorch, molecule-compound, fill-mask, license:apache-2.0, region:us

âš™ī¸ Technical Specs

architecture
BertForMaskedLM
params billions
0.09
context length
512
pipeline tag
fill-mask
vram gb
1.4
vram is estimated
true
vram formula
VRAM ≈ (params * 0.75) + 0.8GB (KV) + 0.5GB (OS)
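As a quick check, the listed formula can be evaluated for this model's 0.09B parameters. The sketch below assumes `params` is expressed in billions and the result in GB, which is how the listing appears to use it; the function name `estimate_vram_gb` is hypothetical.

```python
def estimate_vram_gb(params_billions: float, kv_gb: float = 0.8, os_gb: float = 0.5) -> float:
    """Apply the listing's formula: (params * 0.75) + 0.8GB (KV) + 0.5GB (OS)."""
    # Assumption: params is in billions and the 0.75 factor yields GB of weights
    return params_billions * 0.75 + kv_gb + os_gb


estimate = estimate_vram_gb(0.09)
print(f"{estimate:.4f} GB, listed as ~{estimate:.1f}GB")
```

For 0.09B parameters this gives 0.0675 + 0.8 + 0.5 = 1.3675 GB, which rounds to the ~1.4GB shown in the listing. The fixed 1.3 GB of overhead dominates the estimate for a model this small.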

📊 Engagement & Metrics

downloads: 56
stars: 0
forks: 0

Data indexed from public sources. Updated daily.