🧠
Model

Qwen3.5 397b A17b Mlx 3.5bit

by spicyneuron hf-model--spicyneuron--qwen3.5-397b-a17b-mlx-3.5bit
Free2AITools Nexus Index
39.8 Top 100%
S: Semantic 50
A: Authority 0
P: Popularity 22
R: Recency 97
Q: Quality 65
Tech Context
396.35 Params
32.768K Ctx
Vital Performance
614 DL / 30D
0.0%
Audited 39.8 FNI Score
Massive 396.35B Params
32k Context
614 Downloads
H100+ ~303GB Est. VRAM
Dense QWEN3_5MOEFORCONDITIONALGENERATION Architecture
Commercial APACHE License
Model Information Summary
Entity Passport
Registry ID hf-model--spicyneuron--qwen3.5-397b-a17b-mlx-3.5bit
License Apache-2.0
Provider huggingface
💾

Compute Threshold

~302.8GB VRAM

Interactive
Analyze Hardware
â–ŧ

* Static estimation for 4-Bit Quantization. [Multi-GPU / Unified Memory Required]

📜

Cite this model

Academic & Research Attribution

BibTeX
@misc{hf_model__spicyneuron__qwen3.5_397b_a17b_mlx_3.5bit,
  author = {spicyneuron},
  title = {Qwen3.5 397b A17b Mlx 3.5bit Model},
  year = {2026},
  howpublished = {\url{https://huggingface.co/spicyneuron/Qwen3.5-397B-A17B-MLX-3.5bit}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
spicyneuron. (2026). Qwen3.5 397b A17b Mlx 3.5bit [Model]. Free2AITools. https://huggingface.co/spicyneuron/Qwen3.5-397B-A17B-MLX-3.5bit

đŸ”ŦTechnical Deep Dive

Full Specifications [+]

Quick Commands

🤗 HF Download
huggingface-cli download spicyneuron/qwen3.5-397b-a17b-mlx-3.5bit

âš–ī¸ Free2AITools Nexus Index V2.0

Semantic (S) 50
Authority (A) 0
Popularity (P) 22
Recency (R) 97
Quality (Q) 65

đŸ’Ŧ Index Insight

FNI V2.0 for Qwen3.5 397b A17b Mlx 3.5bit: Semantic (S:50), Authority (A:0), Popularity (P:22), Recency (R:97), Quality (Q:65).

Free2AITools Nexus Index

Verification Authority

Unbiased Data Node Refresh: VFS Live
---

🚀 What's Next?

Technical Deep Dive

Qwen3.5-397B-A17B optimized for MLX!

  • Mixed-precision quantization balances throughput, accuracy, and memory.
  • Better quality than a 4-bit baseline but requires 20% less memory.
  • Fixed chat template allows more reliable prompt caching.
  • This version does NOT support vision (image input).

Also available as a smaller 129GB version: https://huggingface.co/spicyneuron/Qwen3.5-397B-A17B-MLX-2.6bit

Usage

sh
# Start server at http://localhost:8080/v1/chat/completions
uvx --from mlx-lm mlx_lm.server \
  --host 127.0.0.1 \
  --port 8080 \
  --model spicyneuron/Qwen3.5-397B-A17B-MLX-3.5bit

Methodology

Quantized with a mlx-lm fork, drawing inspiration from Unsloth/AesSedai/ubergarm style mixed-precision GGUFs. MLX quantization options differ than llama.cpp, but the principles are the same:

  • Sensitive layers like MoE routing, attention, and output embeddings get higher precision.
  • More tolerant layers like MoE experts get lower precision.

Benchmarks

metric lmstudio-community 4 bit 2.6 bit 3.5 bit
perplexity 3.919 Âą 0.019 3.852 Âą 0.018 3.919 Âą 0.019
hellaswag 0.594 Âą 0.022 0.598 Âą 0.022 0.622 Âą 0.022
piqa 0.798 Âą 0.018 0.802 Âą 0.018 0.804 Âą 0.018
winogrande 0.744 Âą 0.02 0.718 Âą 0.02 0.746 Âą 0.019
p1024/g512 prompt 490.702 489.545 479.453
p1024/g512 gen 39.192 38.398 35.547
p1024/g512 mem 225.095 131.523 179.842

âš ī¸ Incomplete Data

Some information about this model is not available. Use with Caution - Verify details from the original source before relying on this data.

View Original Source →

📝 Limitations & Considerations

  • â€ĸ Benchmark scores may vary based on evaluation methodology and hardware configuration.
  • â€ĸ VRAM requirements are estimates; actual usage depends on quantization and batch size.
  • â€ĸ FNI scores are relative rankings and may change as new models are added.
  • ⚠ License Unknown: Verify licensing terms before commercial use.

Social Proof

HuggingFace Hub
614Downloads
🔄 Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.

📊 FNI Methodology 📚 Knowledge Baseâ„šī¸ Verify with original source

đŸ›Ąī¸ Model Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id
hf-model--spicyneuron--qwen3.5-397b-a17b-mlx-3.5bit
slug
spicyneuron--qwen3.5-397b-a17b-mlx-3.5bit
source
huggingface
author
spicyneuron
license
Apache-2.0
tags
mlx, safetensors, qwen3_5_moe, text-generation, conversational, base_model:qwen/qwen3.5-397b-a17b, base_model:quantized:qwen/qwen3.5-397b-a17b, license:apache-2.0, 3-bit, region:us

âš™ī¸ Technical Specs

architecture
Qwen3_5MoeForConditionalGeneration
params billions
396.35
context length
32,768
pipeline tag
text-generation
vram gb
302.8
vram is estimated
true
vram formula
VRAM ≈ (params * 0.75) + 5GB (KV) + 0.5GB (OS)

📊 Engagement & Metrics

downloads
614
stars
0
forks
0

Data indexed from public sources. Updated daily.