DeepSeek-V3.2-Exp
Introduction
We are excited to announce the official release of DeepSeek-V3.2-Exp, an experimental version of our model. As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention, a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.
This experimental release represents our ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences.
DeepSeek Sparse Attention (DSA) achieves fine-grained sparse attention for the first time, delivering substantial improvements in long-context training and inference efficiency while maintaining virtually identical model output quality.
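The exact DSA formulation is not spelled out in this card. Purely as a hypothetical illustration of the general pattern behind fine-grained sparse attention (a lightweight indexer scores previously seen tokens, and full attention is then computed only over the top-k selected positions), the following toy sketch is not the DSA implementation and uses assumed shapes and names:

# Hypothetical sketch: top-k token selection followed by dense attention over the
# selected subset. Illustrative only; the released kernels implement the real path.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, index_scores, top_k):
    """Single decoding step attending only to the top-k scored past tokens.

    q: (heads, dim)          query for the current token
    k, v: (seq, heads, dim)  cached keys/values for previous tokens
    index_scores: (seq,)     cheap per-token relevance scores from an indexer
    """
    top_k = min(top_k, k.shape[0])
    sel = torch.topk(index_scores, top_k).indices        # positions to attend to
    k_sel, v_sel = k[sel], v[sel]                         # (top_k, heads, dim)
    # Standard scaled dot-product attention, restricted to the selected tokens.
    scores = torch.einsum("hd,khd->hk", q, k_sel) / q.shape[-1] ** 0.5
    weights = F.softmax(scores, dim=-1)                   # (heads, top_k)
    return torch.einsum("hk,khd->hd", weights, v_sel)     # (heads, dim)

In the actual model the indexer, selection rule, and kernels differ; see the Open-Source Kernels section below for the production implementations.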
To rigorously evaluate the impact of introducing sparse attention, we deliberately aligned the training configurations of DeepSeek-V3.2-Exp with V3.1-Terminus. Across public benchmarks in various domains, DeepSeek-V3.2-Exp demonstrates performance on par with V3.1-Terminus.
| Benchmark | DeepSeek-V3.1-Terminus | DeepSeek-V3.2-Exp |
|---|---|---|
| Reasoning Mode w/o Tool Use | | |
| MMLU-Pro | 85.0 | 85.0 |
| GPQA-Diamond | 80.7 | 79.9 |
| Humanity's Last Exam | 21.7 | 19.8 |
| LiveCodeBench | 74.9 | 74.1 |
| AIME 2025 | 88.4 | 89.3 |
| HMMT 2025 | 86.1 | 83.6 |
| Codeforces | 2046 | 2121 |
| Aider-Polyglot | 76.1 | 74.5 |
| Agentic Tool Use | | |
| BrowseComp | 38.5 | 40.1 |
| BrowseComp-zh | 45.0 | 47.9 |
| SimpleQA | 96.8 | 97.1 |
| SWE Verified | 68.4 | 67.8 |
| SWE-bench Multilingual | 57.8 | 57.9 |
| Terminal-bench | 36.7 | 37.7 |
Update
- 2025.11.17: We have identified that previous versions of the inference demo code contained an implementation discrepancy in Rotary Position Embedding (RoPE) within the indexer module, potentially leading to degraded model performance. Specifically, the input tensor to RoPE in the indexer module requires a non-interleaved layout, whereas RoPE in the MLA module expects an interleaved layout. This issue has now been resolved. Please refer to the updated version of the inference demo code and take note of this implementation detail.
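Purely as a hypothetical illustration of the layout difference described above (not the repository's code), the two conventions group feature pairs differently before applying the rotation:

# Illustrative only: interleaved vs. non-interleaved RoPE layouts.
# x: (..., d); cos, sin: rotation tables broadcastable to x[..., : d // 2].
import torch

def rope_non_interleaved(x, cos, sin):
    # Half-split layout: dimension i is paired with dimension i + d/2.
    # Per the note above, this is the layout the indexer module expects.
    d = x.shape[-1]
    x1, x2 = x[..., : d // 2], x[..., d // 2 :]
    return torch.cat((x1 * cos - x2 * sin, x2 * cos + x1 * sin), dim=-1)

def rope_interleaved(x, cos, sin):
    # Even/odd layout: dimension 2i is paired with dimension 2i + 1.
    # Per the note above, this is the layout the MLA module expects.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    rotated = torch.stack((x1 * cos - x2 * sin, x2 * cos + x1 * sin), dim=-1)
    return rotated.flatten(-2)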
How to Run Locally
HuggingFace
We provide updated inference demo code in the inference folder to help the community quickly get started with our model and understand its architectural details.
First, convert the Hugging Face model weights to the format required by our inference demo. Set MP to match your available GPU count:
cd inference
export EXPERTS=256
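export MP=8   # example value only: set MP to the number of GPUs you will use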
python convert.py --hf-ckpt-path ${HF_CKPT_PATH} --save-path ${SAVE_PATH} --n-experts ${EXPERTS} --model-parallel ${MP}
Launch the interactive chat interface and start exploring DeepSeek's capabilities:
export CONFIG=config_671B_v3.2.json
torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive
SGLang
Installation with Docker
# H200
docker pull lmsysorg/sglang:dsv32
# MI350
docker pull lmsysorg/sglang:dsv32-rocm
# NPUs
docker pull lmsysorg/sglang:dsv32-a2
docker pull lmsysorg/sglang:dsv32-a3
Launch Command
python -m sglang.launch_server --model deepseek-ai/DeepSeek-V3.2-Exp --tp 8 --dp 8 --enable-dp-attention
vLLM
vLLM provides day-0 support for DeepSeek-V3.2-Exp. See the vLLM recipes for up-to-date details.
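The recipes are the authoritative reference. As a minimal, hypothetical sketch of offline inference with vLLM's Python API (the 8-way tensor parallelism and sampling settings below are assumptions to adapt to your hardware):

# Offline-inference sketch with vLLM; consult the recipes for validated settings.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3.2-Exp",
    tensor_parallel_size=8,   # assumed: one 8-GPU node
    trust_remote_code=True,   # custom DeepSeek-V3.2 model code
)

sampling = SamplingParams(temperature=0.6, max_tokens=256)
outputs = llm.generate(["Summarize DeepSeek Sparse Attention in two sentences."], sampling)
print(outputs[0].outputs[0].text)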
Open-Source Kernels
For kernels that prioritize readability and a research-oriented design, please refer to TileLang.
For high-performance CUDA kernels, indexer logit kernels (including paged versions) are available in DeepGEMM. Sparse attention kernels are released in FlashMLA.
License
This repository and the model weights are licensed under the MIT License.
Citation
@misc{deepseekai2024deepseekv32,
title={DeepSeek-V3.2-Exp: Boosting Long-Context Efficiency with DeepSeek Sparse Attention},
author={DeepSeek-AI},
year={2025},
}
Contact
If you have any questions, please raise an issue or contact us at [email protected].