📄
Paper

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

by Dustin Podell, Zion English, Kyle Lacey, A. Blattmann, Tim Dockhorn, Jonas Muller, Joe Penna, Robin Rombach
arXiv:2307.01952
Free2AITools Nexus Index
64.3
S: Semantic 50

Query-time baseline · scored live at search

A: Authority 95
P: Popularity 77
R: Recency 100
Q: Quality 65

We present SDXL, a latent diffusion model for text-to-image synthesis. Compared to previous versions of Stable Diffusion, SDXL leverages a three times larger UNet backbone: The increase of model parameters is mainly due to more attention blocks and a larger cross-attention context as SDXL uses a second text encoder. We design multiple novel conditioning schemes and train SDXL on multiple aspect ratios. We also introduce a refinement model which is used to improve the visual fidelity of sample...
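One way to read "a larger cross-attention context" is that the per-token outputs of the two text encoders are joined along the channel axis, so each token carries a wider embedding. The sketch below illustrates only that concatenation step with placeholder data. The widths 768 and 1280 and the 77-token length are assumptions based on commonly reported sizes for the two encoders, not figures stated in the abstract, and the helper names are hypothetical.

```python
# Illustrative sketch only: toy stand-ins for the outputs of two text encoders.
# The widths (768, 1280) and token count (77) are assumptions, not values
# taken from the abstract above.
from typing import List

Matrix = List[List[float]]  # layout: [tokens][channels]


def fake_encoder_output(num_tokens: int, width: int, fill: float) -> Matrix:
    """Return a constant-valued [num_tokens x width] matrix (placeholder data)."""
    return [[fill] * width for _ in range(num_tokens)]


def concat_channels(a: Matrix, b: Matrix) -> Matrix:
    """Concatenate two per-token embedding matrices along the channel axis."""
    if len(a) != len(b):
        raise ValueError("both encoders must produce the same number of tokens")
    return [row_a + row_b for row_a, row_b in zip(a, b)]


NUM_TOKENS = 77
emb_a = fake_encoder_output(NUM_TOKENS, 768, 0.0)   # hypothetical first encoder
emb_b = fake_encoder_output(NUM_TOKENS, 1280, 1.0)  # hypothetical second encoder
context = concat_channels(emb_a, emb_b)

print(len(context), len(context[0]))  # 77 2048
```

The token count is unchanged while the channel width becomes the sum of the two encoder widths, which is the sense in which the context seen by cross-attention grows.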

Semantic Scholar 4.0K Citations
Paper Information Summary
Entity Passport
Registry ID: 2307.01952
License: ArXiv
Provider: semantic_scholar
📜

Cite this paper

Academic & Research Attribution

BibTeX
@misc{arxiv_2307_01952,
  author = {Dustin Podell and Zion English and Kyle Lacey and A. Blattmann and Tim Dockhorn and Jonas Muller and Joe Penna and Robin Rombach},
  title = {SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis},
  year = {2023},
  howpublished = {\url{https://arxiv.org/abs/2307.01952}},
  note = {Accessed via Free2AITools.}
}
APA Style
Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., Muller, J., Penna, J., & Rombach, R. (2023). SDXL: Improving latent diffusion models for high-resolution image synthesis. arXiv. https://arxiv.org/abs/2307.01952

🔬 Technical Deep Dive

💬 Index Insight

FNI V2.0 for SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis: Authority (A:95), Popularity (P:77), Recency (R:100), Quality (Q:65). Semantic (S) is a query-time baseline scored live at search.

Data Sources / Provenance

Open data Updated: Live data

❝ Cite Node

@article{Podell2023SDXL,
  title={SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis},
  author={Dustin Podell and Zion English and Kyle Lacey and A. Blattmann and Tim Dockhorn and Jonas Muller and Joe Penna and Robin Rombach},
  journal={arXiv preprint arXiv:2307.01952},
  year={2023}
}

👥 Collaborating Minds

Dustin Podell, Zion English, Kyle Lacey, A. Blattmann, Tim Dockhorn, Jonas Muller, Joe Penna, Robin Rombach

🔗 Full Paper

Free2AITools indexes the abstract and factual metadata for this paper. Read the complete, authoritative paper on the official source.

Read the full paper on arXiv: https://arxiv.org/abs/2307.01952

📊 Research Signals

📈 Citations: 3,958 (Semantic Scholar)
🏛️ Authority: 95 (FNI pillar)
⏱️ Recency: 100 (FNI pillar)
✅ Quality: 65 (FNI pillar)
🗂️ Field: vision multimedia

🏷️ Research Topics

image generation, rag retrieval, attention mechanism
📦 Data Source: semantic_scholar
🔄 Updated daily

Source summary: Based on semantic_scholar metadata. Not a recommendation.

🛡️ Paper Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: 2307.01952
slug: 2307.01952
source: semantic_scholar
author: Dustin Podell, Zion English, Kyle Lacey, A. Blattmann, Tim Dockhorn, Jonas Muller, Joe Penna, Robin Rombach
license: ArXiv
tags: paper, research, academic

⚙️ Technical Specs

architecture: null
params billions: null
context length: null
pipeline tag:

📊 Engagement & Metrics

downloads: 0
stars: 0
forks: 0
citations: 3,958

Data indexed from public sources. Updated daily.