A Comprehensive Evaluation of 26 State-of-the-Art Text-to-Image Models

We evaluate 26 recent text-to-image models spanning different types (e.g., diffusion, autoregressive, GAN), sizes (0.4B to 13B parameters), organizations, and accessibility (open or closed). Table 4 presents an overview of the models and their properties. For each model, we use the default inference configuration provided in its API, GitHub, or Hugging Face repository.

This paper is available on arXiv under the CC BY 4.0 DEED license.
