🧠 gpt4-x-alpaca-13b-native-4bit-128g

by anon8231489123 â€ĸ Model ID: hf-model--anon8231489123--gpt4-x-alpaca-13b-native-4bit-128g
FNI 2.5
Top 59%

"Update (4/1): Added ggml for Cuda model Dataset is here (instruct): https://github.com/teknium1/GPTeacher Okay... Two different models now. One generated in the Triton branch, one generated in Cuda. Use the Cuda one for now unless the Triton branch becomes widely used. Cuda info (use this one): Comm..."

Audited 2.5 FNI Score
13B Params
4k Context
1.2K Downloads
24GB GPU â€ĸ ~12GB Est. VRAM

⚡ Quick Commands

đŸĻ™ Ollama Run
ollama run gpt4-x-alpaca-13b-native-4bit-128g
🤗 HF Download
huggingface-cli download anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g
đŸ“Ļ Install Lib
pip install -U transformers
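
For scripted setups, the same download can be done from Python with huggingface_hub (a dependency of recent transformers releases). A minimal sketch; the local_dir path is illustrative, not part of this repo:

# Programmatic equivalent of the huggingface-cli command above.
# Assumes huggingface_hub is installed (pip install -U huggingface_hub).
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g",
    local_dir="./models/gpt4-x-alpaca-13b-native-4bit-128g",  # illustrative target directory
)
print(f"Model files downloaded to: {local_path}")
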
📊 Engineering Specs

⚡ Hardware

Parameters: 13B
Architecture: LLaMAForCausalLM
Context Length: 4K
Model Size: 24.4GB

🧠 Lifecycle

Library: -
Precision: float16
Tokenizer: -

🌐 Identity

Source: HuggingFace
License: Open Access
💾 Est. VRAM Benchmark

~11.1GB


* Technical estimation for FP16/Q4 weights. Does not include OS overhead or long-context batching. For Technical Reference Only.

📈 Interest Trend

No trend data available.

* Real-time activity index across HuggingFace, GitHub and research citations.

No similar models found.


đŸ–Ĩī¸ Hardware Compatibility

Multi-Tier Validation Matrix

🎮 RTX 3060 / 4060 Ti (Entry, 8GB VRAM): Compatible
🎮 RTX 4070 Super (Mid, 12GB VRAM): Compatible
đŸ’ģ RTX 4080 / Mac M3 (High, 16GB VRAM): Compatible
🚀 RTX 3090 / 4090 (Pro, 24GB VRAM): Compatible
đŸ—ī¸ RTX 6000 Ada (Workstation, 48GB VRAM): Compatible
🏭 A100 / H100 (Datacenter, 80GB VRAM): Compatible
â„šī¸ Pro Tip: Compatibility is estimated for 4-bit quantization (Q4). High-precision (FP16) or ultra-long context windows will significantly increase VRAM requirements.
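
To make the matrix above concrete, here is a small sketch that compares each tier's VRAM against this page's ~11.1GB Q4 estimate. The tier values mirror the matrix; the headroom check itself is illustrative and not part of any official tool:

# Compare this page's ~11.1GB Q4 VRAM estimate against the GPU tiers listed above.
EST_VRAM_GB = 11.1  # page estimate for this 13B model at 4-bit

TIERS = {
    "RTX 3060 / 4060 Ti (Entry)": 8,
    "RTX 4070 Super (Mid)": 12,
    "RTX 4080 / Mac M3 (High)": 16,
    "RTX 3090 / 4090 (Pro)": 24,
    "RTX 6000 Ada (Workstation)": 48,
    "A100 / H100 (Datacenter)": 80,
}

for name, vram_gb in TIERS.items():
    headroom = vram_gb - EST_VRAM_GB
    status = "fits fully on GPU" if headroom >= 0 else "tight: expect CPU offload or a shorter context"
    print(f"{name}: {vram_gb}GB -> {status} ({headroom:+.1f}GB headroom)")

By the page's own 11.1GB estimate, the 8GB entry tier would need some offloading or a reduced context even at Q4, so treat its rating as dependent on runtime settings.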

README

Update (4/1): Added ggml for Cuda model

Dataset is here (instruct): https://github.com/teknium1/GPTeacher

Okay... Two different models now. One generated in the Triton branch, one generated in Cuda. Use the Cuda one for now unless the Triton branch becomes widely used.

Cuda info (use this one): Command:

CUDA_VISIBLE_DEVICES=0 python llama.py ./models/chavinlo-gpt4-x-alpaca --wbits 4 --true-sequential --groupsize 128 --save gpt-x-alpaca-13b-native-4bit-128g-cuda.pt

Prev. info

Quantized on GPTQ-for-LLaMa commit 5955e9c67d9bfe8a8144ffbe853c2769f1e87cdd

GPTQ 4bit quantization of: https://huggingface.co/chavinlo/gpt4-x-alpaca

Note: This was quantized with this branch of GPTQ-for-LLaMA: https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/triton

Because of this, it appears to be incompatible with Oobabooga at the moment. Stay tuned?

Command:

CUDA_VISIBLE_DEVICES=0 python llama.py ./models/chavinlo-gpt4-x-alpaca --wbits 4 --true-sequential --act-order --groupsize 128 --save gpt-x-alpaca-13b-native-4bit-128g.pt


📝 Limitations & Considerations

  • â€ĸ Benchmark scores may vary based on evaluation methodology and hardware configuration.
  • â€ĸ VRAM requirements are estimates; actual usage depends on quantization and batch size.
  • â€ĸ FNI scores are relative rankings and may change as new models are added.
  • ⚠ License Unknown: Verify licensing terms before commercial use.
  • â€ĸ Source: Unknown
📜 Cite this model

Academic & Research Attribution

BibTeX
@misc{hf_model__anon8231489123__gpt4_x_alpaca_13b_native_4bit_128g,
  author = {anon8231489123},
  title = {gpt4-x-alpaca-13b-native-4bit-128g},
  year = {2026},
  howpublished = {\url{https://huggingface.co/anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
anon8231489123. (2026). gpt4-x-alpaca-13b-native-4bit-128g [Model]. Free2AITools. https://huggingface.co/anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g
🔄 Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.

📊 FNI Methodology â€ĸ 📚 Knowledge Base â€ĸ â„šī¸ Verify with original source

đŸ›Ąī¸ Model Transparency Report

Verified data manifest for traceability and transparency.

100% Data Disclosure Active

🆔 Identity & Source

id: hf-model--anon8231489123--gpt4-x-alpaca-13b-native-4bit-128g
author: anon8231489123
tags: transformers, pytorch, llama, text-generation, text-generation-inference, endpoints_compatible, region:us

âš™ī¸ Technical Specs

architecture: LLaMAForCausalLM
params (billions): 13
context length: 4,096
vram (GB): 11.1
vram is estimated: true
vram formula: VRAM ≈ (params * 0.75) + 0.8GB (KV) + 0.5GB (OS)
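
Plugging the values above into the disclosed formula reproduces the 11.1GB figure. A quick arithmetic check, no external dependencies:

# Reproduce the disclosed estimate: (params_in_billions * 0.75) + 0.8GB (KV cache) + 0.5GB (OS)
params_billion = 13
est_vram_gb = params_billion * 0.75 + 0.8 + 0.5
print(f"Estimated VRAM: {est_vram_gb:.1f} GB")  # -> Estimated VRAM: 11.1 GB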

📊 Engagement & Metrics

likes: 733
downloads: 1,188

Free2AITools Constitutional Data Pipeline: Curated disclosure mode active. (V15.x Standard)