Qwen3 Coder 30b A3b Instruct

Name: Qwen3 Coder 30b A3b Instruct
Author: unsloth

Free2AITools Nexus Index

42.1 Top 100%

S: Semantic 50

A: Authority 0

P: Popularity 46

R: Recency 84

Q: Quality 50

Tech Context

30B Params

32.768K Ctx

Vital Performance

13.6K DL / 30D

H100+ ~25GB Est. VRAM

MoE Qwen3MoeForCausalLM Architecture

Commercial Apache-2.0 License

Model Information Summary
Entity Passport
Registry ID	hf-model--unsloth--qwen3-coder-30b-a3b-instruct
License	Apache-2.0
Provider	huggingface

💾

Compute Threshold

~25GB VRAM

* Static estimation for 4-Bit Quantization.

📜

Cite this model

Academic & Research Attribution

BibTeX

@misc{hf_model__unsloth__qwen3_coder_30b_a3b_instruct,
  author = {unsloth},
  title = {Qwen3 Coder 30b A3b Instruct Model},
  year = {2026},
  howpublished = {\url{https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}

APA Style

unsloth. (2026). Qwen3 Coder 30b A3b Instruct [Model]. Free2AITools. https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct

🔬Technical Deep Dive

Quick Commands

🦙 Ollama Run

ollama run qwen3-coder-30b-a3b-instruct

🤗 HF Download

huggingface-cli download unsloth/Qwen3-Coder-30B-A3B-Instruct

📦 Install Lib

pip install -U transformers

---

Technical Deep Dive

Note: Includes Unsloth chat template fixes. For llama.cpp, use --jinja.

Unsloth Dynamic 2.0 achieves superior accuracy & outperforms other leading quants.

Qwen3-Coder-30B-A3B-Instruct

Highlights

Qwen3-Coder is available in multiple sizes. Today, we're excited to introduce Qwen3-Coder-30B-A3B-Instruct. This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements:

Significant Performance among open models on Agentic Coding, Agentic Browser-Use, and other foundational coding tasks.
Long-context Capabilities with native support for 256K tokens, extendable up to 1M tokens using YaRN, optimized for repository-scale understanding.
Agentic Coding support for most platforms, such as Qwen Code and CLINE, featuring a specially designed function call format.

[Figure: Qwen3-Coder-30B-A3B-Instruct performance image]

Model Overview

Qwen3-Coder-30B-A3B-Instruct has the following features:

Type: Causal Language Models
Training Stage: Pretraining & Post-training
Number of Parameters: 30.5B in total and 3.3B activated
Number of Layers: 48
Number of Attention Heads (GQA): 32 for Q and 4 for KV
Number of Experts: 128
Number of Activated Experts: 8
Context Length: 262,144 natively.
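
To make the sparsity in the list above concrete, the following sketch derives a few ratios from the listed figures. The constant and function names are illustrative and not part of any library; the ratios are simple arithmetic on the numbers given here, not official statistics.

```python
# Figures copied from the Model Overview list above.
TOTAL_PARAMS_B = 30.5    # total parameters, in billions
ACTIVE_PARAMS_B = 3.3    # parameters activated per token, in billions
NUM_EXPERTS = 128
ACTIVE_EXPERTS = 8
Q_HEADS = 32
KV_HEADS = 4


def overview_ratios() -> dict:
    """Return derived ratios describing how sparsely the model is used per token."""
    return {
        "active_param_fraction": ACTIVE_PARAMS_B / TOTAL_PARAMS_B,
        "active_expert_fraction": ACTIVE_EXPERTS / NUM_EXPERTS,
        "q_heads_per_kv_head": Q_HEADS // KV_HEADS,
    }


ratios = overview_ratios()
print(f"active parameter fraction: {ratios['active_param_fraction']:.3f}")
print(f"active expert fraction:    {ratios['active_expert_fraction']:.4f}")
print(f"query heads per KV head:   {ratios['q_heads_per_kv_head']}")
```

In other words, roughly 11% of the parameters and 1 in 16 experts take part in each token's forward pass, and every KV head is shared by 8 query heads.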

NOTE: This model supports only non-thinking mode and does not generate <think></think> blocks in its output. As a result, specifying enable_thinking=False is no longer required.

For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation.

Quickstart

We advise you to use the latest version of transformers.

With transformers<4.51.0, you will encounter the following error:

text

KeyError: 'qwen3_moe'
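
One way to avoid this error is to check the installed version before loading the model. The helper below is a minimal sketch: `parse_version` and `supports_qwen3_moe` are hypothetical names, and the only fact encoded is the 4.51.0 threshold stated above.

```python
import re
from importlib import metadata

MIN_VERSION = (4, 51, 0)  # threshold taken from the note above


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings such as '4.51.0' or '4.52.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '4.51') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def supports_qwen3_moe(version: str) -> bool:
    """Hypothetical helper: True if this transformers version is at least 4.51.0."""
    return parse_version(version) >= MIN_VERSION


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "ok" if supports_qwen3_moe(installed) else "too old, upgrade with: pip install -U transformers"
        print(f"transformers {installed}: {status}")
```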

The following code snippet illustrates how to use the model to generate content from given inputs.

python

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-Coder-30B-A3B-Instruct"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Write a quick sort algorithm."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=65536
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

content = tokenizer.decode(output_ids, skip_special_tokens=True)

print("content:", content)

Note: If you encounter out-of-memory (OOM) issues, consider reducing the context length to a shorter value, such as 32,768.
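
To see why a shorter context relieves memory pressure, it helps to estimate the size of the key/value cache. The sketch below uses the layer and KV-head counts from the Model Overview. The head dimension of 128 and the 2-byte (bf16/fp16) cache values are assumptions not stated in this document, so treat the results as rough orders of magnitude rather than measurements.

```python
NUM_LAYERS = 48        # from the Model Overview
NUM_KV_HEADS = 4       # from the Model Overview (GQA)
HEAD_DIM = 128         # ASSUMPTION: not stated in this document
BYTES_PER_VALUE = 2    # ASSUMPTION: bf16/fp16 cache entries


def kv_cache_gib(context_tokens: int, batch_size: int = 1) -> float:
    """Estimate the KV cache size in GiB for a given context length."""
    # The factor 2 covers the separate key and value tensors in every layer.
    per_token_bytes = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token_bytes * context_tokens * batch_size / 2**30


for tokens in (262_144, 65_536, 32_768):
    print(f"{tokens:>7} tokens -> ~{kv_cache_gib(tokens):.1f} GiB of KV cache")
```

Under these assumptions the cache grows linearly with context length, from about 3 GiB at 32,768 tokens to about 24 GiB at the full 262,144 tokens, in addition to the memory needed for the model weights.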

For local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers also support Qwen3.

Agentic Coding

Qwen3-Coder excels in tool calling capabilities.

You can define or use any tools, as in the following example.

python

# Your tool implementation (the parameter name matches "input_num" in the schema below)
def square_the_number(input_num: float) -> float:
    return input_num ** 2

# Define Tools
tools=[
    {
        "type":"function",
        "function":{
            "name": "square_the_number",
            "description": "output the square of the number.",
            "parameters": {
                "type": "object",
                "required": ["input_num"],
                "properties": {
                    'input_num': {
                        'type': 'number', 
                        'description': 'input_num is a number that will be squared'
                        }
                },
            }
        }
    }
]

from openai import OpenAI
# Define LLM
client = OpenAI(
    # Use a custom endpoint compatible with OpenAI API
    base_url='http://localhost:8000/v1',  # api_base
    api_key="EMPTY"
)
 
messages = [{'role': 'user', 'content': 'square the number 1024'}]

completion = client.chat.completions.create(
    messages=messages,
    model="Qwen3-Coder-30B-A3B-Instruct",
    max_tokens=65536,
    tools=tools,
)

print(completion.choices[0])
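
When the model decides to call a tool, the response carries the function name and a JSON string of arguments, and the caller is responsible for running the function. The sketch below shows one way to dispatch such a call locally. `TOOL_REGISTRY` and `run_tool_call` are hypothetical names, and the example feeds in a hand-written name and arguments string rather than a live server response.

```python
import json


def square_the_number(input_num: float) -> float:
    return input_num ** 2


# Hypothetical registry mapping tool names to local callables.
TOOL_REGISTRY = {"square_the_number": square_the_number}


def run_tool_call(name: str, arguments_json: str):
    """Decode the JSON arguments and invoke the matching local function."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"model requested an unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**arguments)


# With a live response, these two values would come from
# completion.choices[0].message.tool_calls[0].function (.name and .arguments).
result = run_tool_call("square_the_number", '{"input_num": 1024}')
print(result)
```

Keeping the dispatch behind a registry means an unexpected tool name fails loudly instead of executing arbitrary code.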

Best Practices

To achieve optimal performance, we recommend the following settings:

Sampling Parameters: We suggest using temperature=0.7, top_p=0.8, top_k=20, repetition_penalty=1.05.
Adequate Output Length: We recommend using an output length of 65,536 tokens for most queries, which is adequate for instruct models.
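
With transformers, these settings can be passed as keyword arguments to `generate`. The following is a minimal sketch: `RECOMMENDED_SAMPLING` and `build_generation_kwargs` are hypothetical names, and `do_sample=True` is added because transformers only applies temperature, top_p, and top_k when sampling is enabled.

```python
# Values taken from the recommendations above; do_sample is an addition.
RECOMMENDED_SAMPLING = {
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "repetition_penalty": 1.05,
}


def build_generation_kwargs(max_new_tokens: int = 65536, **overrides) -> dict:
    """Merge the recommended sampling settings with an output length and any overrides."""
    kwargs = {**RECOMMENDED_SAMPLING, "max_new_tokens": max_new_tokens}
    kwargs.update(overrides)
    return kwargs


generation_kwargs = build_generation_kwargs()
print(generation_kwargs)

# Usage with the Quickstart objects (not executed here):
# generated_ids = model.generate(**model_inputs, **generation_kwargs)
```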

Citation

If you find our work helpful, feel free to cite us.

text

@misc{qwen3technicalreport,
      title={Qwen3 Technical Report}, 
      author={Qwen Team},
      year={2025},
      eprint={2505.09388},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.09388}, 
}

⚠️ Incomplete Data

Some information about this model is not available. Use with caution and verify details against the original source before relying on this data.

📝 Limitations & Considerations

• Benchmark scores may vary based on evaluation methodology and hardware configuration.
• VRAM requirements are estimates; actual usage depends on quantization and batch size.
• FNI scores are relative rankings and may change as new models are added.
• The license is listed as Apache-2.0 in the upstream metadata; verify licensing terms before commercial use.

🤗 Data Source: Hugging Face ↗

🔄 Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.

🛡️ Model Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: hf-model--unsloth--qwen3-coder-30b-a3b-instruct
slug: unsloth--qwen3-coder-30b-a3b-instruct
source: huggingface
author: unsloth
license: Apache-2.0
tags: transformers, safetensors, qwen3_moe, text-generation, unsloth, conversational, arxiv:2505.09388, base_model:qwen/qwen3-coder-30b-a3b-instruct, license:apache-2.0, endpoints_compatible, region:us

⚙️ Technical Specs

architecture: Qwen3MoeForCausalLM
params billions: 30
context length: 32,768
pipeline tag: text-generation
vram gb: 25
vram is estimated: true
vram formula: VRAM ≈ (params * 0.75) + 2GB (KV) + 0.5GB (OS)
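
The formula can be checked against the listed 25 GB figure with a few lines of arithmetic. This sketch only transcribes the formula as written; reading the 0.75 factor as gigabytes per billion parameters is an interpretation, and `estimate_vram_gb` is a hypothetical name.

```python
def estimate_vram_gb(params_billions: float,
                     gb_per_billion: float = 0.75,
                     kv_cache_gb: float = 2.0,
                     os_overhead_gb: float = 0.5) -> float:
    """Apply VRAM ≈ (params * 0.75) + 2GB (KV) + 0.5GB (OS) from the metadata above."""
    return params_billions * gb_per_billion + kv_cache_gb + os_overhead_gb


print(estimate_vram_gb(30))    # rounded parameter count from the metadata
print(estimate_vram_gb(30.5))  # total parameter count from the model card
```

With the rounded 30B parameter count the formula gives 25 GB, matching the `vram gb` field; with the 30.5B total from the model card it gives about 25.4 GB.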

📊 Engagement & Metrics

downloads: 13,574
stars: 0
forks: 0

Data indexed from public sources. Updated daily.