🧠
Model

Molcrawl Molecule Nat Lang Gpt2 Small

by Kojima Lab hf-model--kojima-lab--molcrawl-molecule-nat-lang-gpt2-small
Nexus Index
37.7 Top 100%
S: Semantic 50
A: Authority 0
P: Popularity 3
R: Recency 97
Q: Quality 65
Tech Context
Vital Performance: 24 DL / 30D, 0.0%
Audited FNI Score: 37.7
Size class: Tiny
Params: -
Context: -
Downloads: 24
License: Apache-2.0 (commercial use permitted)
Model Information Summary
Entity Passport
Registry ID hf-model--kojima-lab--molcrawl-molecule-nat-lang-gpt2-small
License Apache-2.0
Provider huggingface
📜

Cite this model

Academic & Research Attribution

BibTeX
@misc{hf_model__kojima_lab__molcrawl_molecule_nat_lang_gpt2_small,
  author = {Kojima Lab},
  title = {Molcrawl Molecule Nat Lang Gpt2 Small Model},
  year = {2026},
  howpublished = {\url{https://huggingface.co/kojima-lab/molcrawl-molecule-nat-lang-gpt2-small}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
Kojima Lab. (2026). Molcrawl Molecule Nat Lang Gpt2 Small [Model]. Free2AITools. https://huggingface.co/kojima-lab/molcrawl-molecule-nat-lang-gpt2-small

Technical Deep Dive

Quick Commands

🤗 HF Download
huggingface-cli download kojima-lab/molcrawl-molecule-nat-lang-gpt2-small
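
The same download can also be scripted from Python. The sketch below assumes the `huggingface_hub` package is installed. `download_model` and `REPO_ID` are illustrative names, not part of the model repository.

```python
from pathlib import Path
from typing import Optional

REPO_ID = "kojima-lab/molcrawl-molecule-nat-lang-gpt2-small"


def download_model(repo_id: str = REPO_ID, local_dir: Optional[str] = None) -> Path:
    """Download every file in the repository and return the local snapshot path."""
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # With local_dir=None the files go to the default Hugging Face cache.
    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))
```

Calling `download_model()` requires network access. The returned directory can be passed to `from_pretrained` in place of the repository ID.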

Technical Deep Dive

molcrawl-molecule-nat-lang-gpt2-small

Model Description

A GPT-2 small (124M parameters) foundation model, pre-trained on molecule-related natural-language text with the standard GPT-2 BPE tokenizer (vocab_size=50257).
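
The 124M figure can be checked with simple arithmetic. The sketch below assumes the standard GPT-2 small hyperparameters (12 layers, hidden size 768, 1024 positions), which this page does not state. Only `vocab_size=50257` comes from the description. `gpt2_param_count` is an illustrative helper.

```python
def gpt2_param_count(vocab_size: int = 50257, n_positions: int = 1024,
                     n_embd: int = 768, n_layer: int = 12) -> int:
    """Count parameters of a GPT-2 style decoder with tied output embeddings."""
    # Token and position embedding tables.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each LayerNorm has a weight vector and a bias vector.
    layer_norm = 2 * n_embd
    # Attention: fused QKV projection plus output projection (weights + biases).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Feed-forward: expand to 4 * n_embd, then project back (weights + biases).
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # One transformer block has two LayerNorms, attention, and the feed-forward net.
    block = 2 * layer_norm + attention + mlp
    # Final LayerNorm; the LM head shares weights with the token embedding.
    return embeddings + n_layer * block + layer_norm


print(f"{gpt2_param_count():,}")  # 124,439,808
```

Under these assumptions the count is about 124.4 million, consistent with the "124M parameters" stated in the description.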

Datasets

Usage

python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained("kojima-lab/molcrawl-molecule-nat-lang-gpt2-small")
tokenizer = AutoTokenizer.from_pretrained("kojima-lab/molcrawl-molecule-nat-lang-gpt2-small")

# Generate molecule-related text
prompt = "The compound with SMILES CC(=O)Oc1ccccc1C(=O)O represents aspirin, which"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,
        temperature=0.8,
        eos_token_id=None,  # HF config.json has legacy eos_token_id=0; disable early stop
        pad_token_id=0,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
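
Because the usage example passes `eos_token_id=None`, generation is expected to run for the full `max_new_tokens` and may end mid-sentence. One possible post-processing step, sketched here with a hypothetical helper `trim_to_last_sentence` (not part of the model repository or of `transformers`), cuts the decoded text back to the last sentence-ending punctuation mark.

```python
def trim_to_last_sentence(text: str, prompt: str = "") -> str:
    """Cut text after its last '.', '!' or '?' without trimming into the prompt."""
    # Index of the last sentence-ending mark, or -1 if none is present.
    cut = max(text.rfind(mark) for mark in ".!?")
    # Leave the text unchanged if no mark occurs after the prompt.
    if cut < len(prompt):
        return text
    return text[: cut + 1]


sample = "Aspirin is an analgesic. It also reduces fever. It is commonly"
print(trim_to_last_sentence(sample))
```

This is a simple heuristic: it treats every period as a sentence boundary, so decimal numbers and abbreviations near the end of the text can cause an early cut.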

Training

This model was trained with the RIKEN Foundation Model pipeline. For more details, please refer to the training configuration files included in this repository.

License

This model is released under the Apache-2.0 license.

Citation

If you use this model, please cite:

bibtex
@misc{molcrawl_molecule_nat_lang_gpt2_small,
  title={molcrawl-molecule-nat-lang-gpt2-small},
  author={{RIKEN}},
  year={2026},
  publisher={{Hugging Face}},
  url={{https://huggingface.co/kojima-lab/molcrawl-molecule-nat-lang-gpt2-small}}
}

Incomplete Data

Some information about this model is not available. Use with caution: verify details against the original source before relying on this data.

📝 Limitations & Considerations

  • â€ĸ Benchmark scores may vary based on evaluation methodology and hardware configuration.
  • â€ĸ VRAM requirements are estimates; actual usage depends on quantization and batch size.
  • â€ĸ FNI scores are relative rankings and may change as new models are added.
  • ⚠ License Unknown: Verify licensing terms before commercial use.

Social Proof

HuggingFace Hub
24 Downloads
Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.

📊 FNI Methodology 📚 Knowledge Baseâ„šī¸ Verify with original source

Model Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

Identity & Source

id: hf-model--kojima-lab--molcrawl-molecule-nat-lang-gpt2-small
slug: kojima-lab--molcrawl-molecule-nat-lang-gpt2-small
source: huggingface
author: Kojima Lab
license: Apache-2.0
tags: pytorch, safetensors, gpt2, molecule-nl, text-generation, license:apache-2.0, region:us

Technical Specs

architecture: null
params billions: null
context length: null
pipeline tag: text-generation

Engagement & Metrics

downloads: 24
stars: 0
forks: 0

Data indexed from public sources. Updated daily.