Muq Mulan Large
| Entity Passport | |
|---|---|
| Registry ID | hf-model--papamoth--muq-mulan-large |
| License | CC-BY-NC-4.0 |
| Provider | huggingface |
Cite this model
Academic & Research Attribution
@misc{hf_model__papamoth__muq_mulan_large,
author = {PapaMoth},
title = {Muq Mulan Large Model},
year = {2026},
howpublished = {\url{https://huggingface.co/papamoth/muq-mulan-large}},
note = {Accessed via Free2AITools Knowledge Fortress}
}
Quick Commands
huggingface-cli download papamoth/muq-mulan-large
Nexus Index V2.0
Index Insight
FNI V2.0 for Muq Mulan Large: Semantic (S:50), Authority (A:0), Popularity (P:2), Recency (R:97), Quality (Q:50).
Technical Deep Dive
MuQ & MuQ-MuLan
This is the official repository for the paper *"MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization"*. For more detailed information, we strongly recommend referring to https://github.com/tencent-ailab/MuQ and the paper.
In this repo, the following models are released:
- MuQ: A large music foundation model pre-trained via self-supervised learning (SSL), achieving state-of-the-art (SOTA) results on various music information retrieval (MIR) tasks.
- MuQ-MuLan: A music-text joint embedding model trained via contrastive learning, supporting both English and Chinese text.
Usage
First, install the official muq library with pip. Python 3.8 or later is required:
pip3 install muq
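Before installing, it can help to confirm that the environment meets the stated requirement. The snippet below is a minimal sketch, not part of the muq package: `python_ok` and `is_installed` are hypothetical helper names, and the check relies only on the standard library.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the instructions above


def python_ok(version=None):
    """Return True if the given (or running) interpreter meets MIN_PYTHON."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("muq installed:", is_installed("muq"))
```

Running this before and after `pip3 install muq` shows whether the package became visible to the interpreter that will run the examples.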
Use MuQ-MuLan to extract music and text embeddings and compute their similarity:
import torch, librosa
from muq import MuQMuLan
# This will automatically fetch checkpoints from huggingface
device = 'cuda'
mulan = MuQMuLan.from_pretrained("OpenMuQ/MuQ-MuLan-large")
mulan = mulan.to(device).eval()
# Extract music embeddings
wav, sr = librosa.load("path/to/music_audio.wav", sr = 24000)
wavs = torch.tensor(wav).unsqueeze(0).to(device)
with torch.no_grad():
    audio_embeds = mulan(wavs = wavs)
# Extract text embeddings (texts can be in English or Chinese)
# The second text means: "a violin piece suited to seaside scenery, with a lively rhythm"
texts = ["classical genres, hopeful mood, piano.", "一首适合海边风景的小提琴曲，节奏欢快"]
with torch.no_grad():
    text_embeds = mulan(texts = texts)
# Calculate dot product similarity
sim = mulan.calc_similarity(audio_embeds, text_embeds)
print(sim)
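To put the printed similarity scores to use, for example in zero-shot tagging, one option is to rank the candidate texts for a clip by score. The sketch below assumes `sim` holds one row per audio clip and one column per text, and that a row has already been converted to a plain Python list (for a PyTorch tensor, `sim[0].tolist()`). `rank_texts` is a hypothetical helper, not part of the muq API.

```python
def rank_texts(texts, scores, top_k=None):
    """Pair each text with its score and sort from most to least similar."""
    if len(texts) != len(scores):
        raise ValueError("texts and scores must have the same length")
    ranked = sorted(zip(texts, scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


if __name__ == "__main__":
    # Made-up scores for illustration only; real values come from calc_similarity.
    example_texts = ["piano, classical", "heavy metal guitar", "solo violin"]
    example_scores = [0.31, -0.05, 0.12]
    for text, score in rank_texts(example_texts, example_scores, top_k=2):
        print(f"{score:+.2f}  {text}")
```

Keeping the ranking in plain Python separates it from the model call, so the same helper works for scores from any source.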
To extract music audio features using MuQ:
import torch, librosa
from muq import MuQ
device = 'cuda'
wav, sr = librosa.load("path/to/music_audio.wav", sr = 24000)
wavs = torch.tensor(wav).unsqueeze(0).to(device)
# This will automatically fetch the checkpoint from huggingface
muq = MuQ.from_pretrained("OpenMuQ/MuQ-large-msd-iter")
muq = muq.to(device).eval()
with torch.no_grad():
    output = muq(wavs, output_hidden_states=True)
print('Total number of layers: ', len(output.hidden_states))
print('Feature shape: ', output.last_hidden_state.shape)
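Many downstream tasks need one vector per clip rather than one per frame. A common option, not something the source prescribes, is to average the frame-level features over time. The sketch below uses plain Python lists so it runs without a GPU or the model; `mean_pool` is a hypothetical helper. Assuming `output.last_hidden_state` has shape (batch, time, dim), the tensor equivalent would be `output.last_hidden_state.mean(dim=1)`.

```python
def mean_pool(frames):
    """Average a list of equal-length frame vectors into one clip-level vector."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    if any(len(frame) != dim for frame in frames):
        raise ValueError("all frames must have the same dimension")
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


if __name__ == "__main__":
    # Three made-up frames of dimension 2, standing in for real features.
    print(mean_pool([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
```

Whether mean pooling, or a particular hidden layer, works best depends on the task and is worth checking empirically.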
Model Checkpoints
| Model Name | Parameters | Data | HuggingFace 🤗 |
|---|---|---|---|
| MuQ | ~300M | MSD dataset | OpenMuQ/MuQ-large-msd-iter |
| MuQ-MuLan | ~700M | music-text pairs | OpenMuQ/MuQ-MuLan-large |
Note: The open-sourced MuQ was trained on the Million Song Dataset (MSD). Because this training set differs in size from the one used in the paper, the open-sourced model may not reach the performance reported there.
License
The code is released under the MIT license.
The model weights (MuQ-large-msd-iter, MuQ-MuLan-large) are released under the CC-BY-NC 4.0 license.
Citation
@article{zhu2025muq,
title={MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization},
author={Haina Zhu and Yizhi Zhou and Hangting Chen and Jianwei Yu and Ziyang Ma and Rongzhi Gu and Yi Luo and Wei Tan and Xie Chen},
journal={arXiv preprint arXiv:2501.01108},
year={2025}
}
Incomplete Data
Some information about this model is not available. Use with caution and verify details against the original source before relying on this data.
Limitations & Considerations
- Benchmark scores may vary based on evaluation methodology and hardware configuration.
- VRAM requirements are estimates; actual usage depends on quantization and batch size.
- FNI scores are relative rankings and may change as new models are added.
- License: the model weights are listed as CC-BY-NC-4.0 (non-commercial). Verify licensing terms before any commercial use.
Model Transparency Report
Technical metadata sourced from upstream repositories.
Identity & Source
- id: hf-model--papamoth--muq-mulan-large
- slug: papamoth--muq-mulan-large
- source: huggingface
- author: PapaMoth
- license: CC-BY-NC-4.0
- tags: pytorch, music, audio-classification, en, zh, arxiv:2501.01108, license:cc-by-nc-4.0, region:us
Technical Specs
- architecture: null
- params billions: null
- context length: null
- pipeline tag: audio-classification
Engagement & Metrics
- downloads: 14
- stars: 0
- forks: 0
Data indexed from public sources. Updated daily.