📊
Dataset

Sky T1 32b Preview

by Novasky Ai hf-dataset--huggingface--novasky-ai--sky-t1-32b-preview
Nexus Index
38.0 Top 100%
S: Semantic 50
A: Authority 0
P: Popularity 55
R: Recency 100
Q: Quality 50
Tech Context
Vital Performance
0 DL / 30D
0.0%
Data Integrity 38 FNI Score
- Size
- Rows
Parquet Format
- Tokens
Dataset Information Summary
Entity Passport
Registry ID hf-dataset--huggingface--novasky-ai--sky-t1-32b-preview
License Apache-2.0
Provider huggingface
📜

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__huggingface__novasky_ai__sky_t1_32b_preview,
  author = {Novasky Ai},
  title = {Sky T1 32b Preview Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/huggingface/novasky-ai/sky-t1-32b-preview}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
Novasky Ai. (2026). Sky T1 32b Preview [Dataset]. Free2AITools. https://huggingface.co/datasets/huggingface/novasky-ai/sky-t1-32b-preview

đŸ”ŦTechnical Deep Dive

Full Specifications [+]

âš–ī¸ Nexus Index V2.0

38.0
TOP 100% SYSTEM IMPACT
Semantic (S) 50
Authority (A) 0
Popularity (P) 55
Recency (R) 100
Quality (Q) 50

đŸ’Ŧ Index Insight

FNI V2.0 for Sky T1 32b Preview: Semantic (S:50), Authority (A:0), Popularity (P:55), Recency (R:100), Quality (Q:50).

Free2AITools Nexus Index

Verification Authority

Unbiased Data Node Refresh: VFS Live
âŦ‡ī¸
Downloads
377

đŸ‘ī¸ Data Preview

📊

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.

🔗 Explore Full Dataset ↗

đŸ§Ŧ Field Logic

đŸ§Ŧ

Schema not yet indexed for this dataset.

Dataset Specification

Model Details

Model Description

This is a 32B reasoning model trained from Qwen2.5-32B-Instruct with 17K data. The performance is on par with o1-preview model on both math and coding. Please see our blog post for more details.

  • Developed by: NovaSky Team from Sky Computing Lab at UC Berkeley.

Training Details

Training Data

17K verified correct responses from Qwen/QwQ-32B-Preview on coding, math. In addition, we add the science portion from the Still-2 paper.

Training Procedure

We perform supervised fine tuning on the data, with a batch size of 96.

Speeds

We use Llama-Factory for training. On 8 H100, the training takes 19 hours with DeepSpeed Zero-3 Offload.

Evaluation

Sky-T1-32B-Preview Qwen-2.5-32B-Instruct QwQ o1-preview
Math500 82.4 76.2 85.4 81.4
AIME2024 43.3 16.7 50.0 40.0
LiveCodeBench-Easy 86.3 84.6 90.7 92.9
LiveCodeBench-Medium 56.8 40.8 56.3 54.9
LiveCodeBench-Hard 17.9 9.8 17.1 16.3
GPQA-Diamond 56.8 45.5 52.5 75.2

Acknowledgement

We would like to thanks the compute resources from Lambda Lab and AnyScale. We would like to thanks the academic feedback and support from the Still-2 Team, and Junyang Lin from the Qwen Team.

Citation

Please considering citing our blog post if you found it useful for your research. Thank you!

bibtex
@misc{sky_t1_2025,
  author       = {NovaSky Team},
  title        = {Sky-T1: Fully open-source reasoning model with o1-preview performance in $450 budget},
  howpublished = {https://novasky-ai.github.io/posts/sky-t1},
  note         = {Accessed: 2025-01-09},
  year         = {2025}
}

Social Proof

HuggingFace Hub
377Downloads
🔄 Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.

📊 FNI Methodology 📚 Knowledge Baseâ„šī¸ Verify with original source

đŸ›Ąī¸ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id
hf-dataset--huggingface--novasky-ai--sky-t1-32b-preview
slug
huggingface--novasky-ai--sky-t1-32b-preview
source
huggingface
author
Novasky Ai
license
Apache-2.0
tags
transformers, safetensors, qwen2, text-generation, conversational, en, dataset:codeparrot/apps, dataset:baai/taco, dataset:ai-mo/numinamath-cot, arxiv:2412.09413, base_model:qwen/qwen2.5-32b-instruct, base_model:finetune:qwen/qwen2.5-32b-instruct, license:apache-2.0, text-generation-inference, endpoints_compatible, deploy:azure, region:us

âš™ī¸ Technical Specs

architecture
null
params billions
32
context length
4,096
pipeline tag
text-generation

📊 Engagement & Metrics

downloads
377
stars
550
forks
0

Data indexed from public sources. Updated daily.