By Insights Team in AI — 01 Jul 2025

UofT, UBC, MIT, and Fudan Collaborate to Publish Comprehensive Review on Diffusion Model-Driven Anomaly Detection and Generation

A joint publication by UofT, UBC, MIT, and Fudan offers a systematic review of diffusion models in anomaly detection and generation, highlighting recent advances and future directions.

Diffusion models (DMs) have advanced rapidly in recent years, with significant progress in computer vision and natural language processing. Anomaly detection (AD), a key research area, plays a crucial role in industrial manufacturing, financial risk control, medical diagnosis, and other practical scenarios. Recently, researchers from the University of Toronto, UBC, MIT, Fudan University, Cardiff University, and other institutions collaborated on a comprehensive survey titled "Anomaly Detection and Generation with Diffusion Models: A Survey." This work systematically reviews the latest developments in image, video, time series, tabular, and multimodal anomaly detection and organizes them into a classification framework from the perspective of diffusion models. It also explores future trends and opportunities, aiming to guide researchers and practitioners in the field.

Paper Title: Anomaly Detection and Generation with Diffusion Models: A Survey
Link to Paper: https://arxiv.org/pdf/2506.09368
Project Homepage: https://github.com/fudanyliu/ADGDM

Figure 2: Analysis of research hotspots in anomaly detection, generation, and diffusion models

II. Diffusion Models and Anomaly Detection

Diffusion models use a forward diffusion process and a reverse denoising process, both Markovian, to model data distributions. The forward process gradually transforms data into Gaussian noise (in the continuous-time view it is described by a stochastic differential equation); the reverse process uses a neural network to learn denoising mappings that progressively restore the original data. This generative mechanism is well suited to capturing subtle differences in complex data distributions. Compared with earlier unsupervised AD methods built on GANs, VAEs, and Transformers, DMs demonstrate superior sample quality and diversity, which makes them promising for anomaly detection.
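
The forward process can be illustrated with a minimal sketch in its common discrete-time (DDPM-style) form. The linear variance schedule, its endpoint values, and the function names below are assumptions chosen for illustration; they are not taken from the survey.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumed, commonly used choice)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)  # fraction of signal kept up to step t
    return betas, alpha_bars

def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()
x0 = np.ones(10000)  # toy "data": a constant signal
x_early, _ = forward_diffuse(x0, 10, alpha_bars, rng)   # still close to x0
x_late, _ = forward_diffuse(x0, 999, alpha_bars, rng)   # close to pure Gaussian noise
```

In this sketch the early sample keeps almost all of the signal, while the late sample is nearly indistinguishable from standard Gaussian noise. A reverse model would be trained to undo these steps.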

Figure 3: Anomaly scoring mechanism based on diffusion models

Diffusion-based anomaly detection models define anomalies as significant deviations from normal data patterns by modeling the intrinsic data distribution. Different scoring paradigms include:

Reconstruction-based scoring: Uses the reverse denoising process to reconstruct input samples; small errors indicate normality, large errors suggest anomalies. Common in industrial quality inspection with pixel-level reconstruction errors.
Density-based scoring: Estimates data probability density; negative log-likelihood serves as anomaly score, with low likelihood indicating anomalies.
Score-based scoring: Uses the gradient of the log-density of the data (the score function) to quantify deviation from the data manifold; higher gradient norms imply anomalies.

These methods capture anomalies from different perspectives: reconstruction error in sample space, likelihood in probability space, and geometric gradients on the data manifold. Their applications vary from local image anomalies to global sequence anomalies, with each suited to specific data types.
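
The three paradigms can be compared on a toy problem where everything is available in closed form. The sketch below assumes normal data follows an isotropic Gaussian, so the likelihood, the score function, and the optimal denoiser are analytic. In a real system a trained diffusion model would supply these quantities; all names and parameter values here are illustrative.

```python
import numpy as np

MU, SIGMA = 0.0, 1.0  # assumed toy model of normal data: N(MU, SIGMA^2 * I)

def density_score(x):
    """Density-based: negative log-likelihood under the toy Gaussian."""
    d = x.size
    quad = 0.5 * np.sum((x - MU) ** 2) / SIGMA**2
    return quad + 0.5 * d * np.log(2.0 * np.pi * SIGMA**2)

def gradient_score(x):
    """Score-based: norm of grad_x log p(x) = -(x - MU) / SIGMA^2."""
    return np.linalg.norm(-(x - MU) / SIGMA**2)

def reconstruction_score(x, alpha_bar=0.5, rng=None):
    """Reconstruction-based: noise x, denoise it, and measure squared error.

    The denoiser is the posterior mean E[x_0 | x_t] for Gaussian data,
    standing in for a learned reverse process.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    noise = rng.standard_normal(x.shape)
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * noise
    gain = np.sqrt(alpha_bar) * SIGMA**2 / (alpha_bar * SIGMA**2 + 1.0 - alpha_bar)
    x_hat = MU + gain * (x_t - np.sqrt(alpha_bar) * MU)
    return np.sum((x - x_hat) ** 2)

normal_sample = np.full(8, 0.3)   # close to the mean of the normal data
anomalous_sample = np.full(8, 6.0)  # far from the normal data

for score_fn in (density_score, gradient_score, reconstruction_score):
    print(score_fn.__name__, score_fn(normal_sample), score_fn(anomalous_sample))
```

All three functions assign a higher score to the sample far from the normal data, but they measure that distance differently: as lost likelihood, as gradient magnitude, or as the error left after denoising pulls the sample back toward the normal distribution.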

III. Diffusion-Driven Anomaly Detection and Generation

3.1 Image Anomaly Detection

In image anomaly detection (IAD), DMs face two main challenges: the “identity shortcut” problem, where the model copies anomalous regions directly into its reconstruction so that the reconstruction error no longer exposes them, and the high computational cost of iterative sampling. To prevent trivial copying, recent approaches use masked reconstruction, latent space editing, and adversarial training. To improve efficiency, techniques such as model distillation, efficient ODE solvers, latent diffusion models (LDMs), and sparsification reduce the number of sampling steps and resource consumption, which makes deployment in industrial quality control and medical imaging more practical.
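
Masked reconstruction avoids the identity shortcut because the model never sees the pixels it has to reconstruct. The sketch below is a simplified illustration: `inpaint_stub` is a hypothetical stand-in for a diffusion inpainting model (it only fills the hidden patch with the mean of the visible pixels), and the patch size and toy image are assumptions.

```python
import numpy as np

def inpaint_stub(image, mask):
    """Hypothetical stand-in for a diffusion inpainter.

    Fills the masked region with the mean of the visible pixels, so the
    hidden content cannot be copied into the output.
    """
    filled = image.copy()
    filled[mask] = image[~mask].mean()
    return filled

def masked_anomaly_map(image, patch=4):
    """Hide one patch at a time, reconstruct it, and record the pixel error."""
    height, width = image.shape
    anomaly_map = np.zeros((height, width), dtype=float)
    for i in range(0, height, patch):
        for j in range(0, width, patch):
            mask = np.zeros((height, width), dtype=bool)
            mask[i:i + patch, j:j + patch] = True
            recon = inpaint_stub(image, mask)
            anomaly_map[mask] = np.abs(image[mask] - recon[mask])
    return anomaly_map

rng = np.random.default_rng(0)
image = 0.5 + 0.01 * rng.standard_normal((16, 16))  # uniform "normal" texture
image[4:8, 8:12] = 1.0                              # synthetic bright defect
amap = masked_anomaly_map(image)
```

Because the defect is hidden whenever it is being reconstructed, its error stays large, whereas a model that could see and copy the defect would report an error near zero.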

Figure 4: Illustration of image anomaly detection methods. (a) shows basic reconstruction-based methods; (b) shows variants designed to address the identity shortcut problem, aiming to improve anomaly sensitivity.

3.2 Video Anomaly Detection

Video anomaly detection (VAD) involves modeling spatiotemporal dependencies to identify abnormal motion patterns. Advanced diffusion models incorporate optical flow, motion vectors, or spatiotemporal transformers to learn how normal events evolve, enabling detection of speed, direction, or acceleration anomalies. By conditioning on past frames or motion features, these models predict future normal frames and compare them with the actual observations; a large gap between prediction and observation signals an anomaly. This improves detection accuracy and robustness in surveillance and autonomous driving scenarios.
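
The predict-and-compare idea can be sketched on motion features alone. The example below assumes each frame has already been reduced to an object centroid, and it uses a constant-velocity extrapolation as a hypothetical stand-in for a conditional diffusion predictor; the track and all names are illustrative.

```python
import numpy as np

def predict_next_stub(past_positions):
    """Hypothetical stand-in for a conditional predictor: constant velocity."""
    return past_positions[-1] + (past_positions[-1] - past_positions[-2])

def motion_scores(track):
    """Score each step by the distance between predicted and observed position."""
    scores = np.zeros(len(track))
    for t in range(2, len(track)):
        predicted = predict_next_stub(track[:t])
        scores[t] = np.linalg.norm(track[t] - predicted)
    return scores

# Toy centroid track: 1 px/frame to the right, then a sudden jump to 3 px/frame.
velocities = np.tile([1.0, 0.0], (20, 1))
velocities[12:] = [3.0, 0.0]
track = np.cumsum(velocities, axis=0)

scores = motion_scores(track)
anomalous_frame = int(np.argmax(scores))
```

The score spikes only at the frame where the speed changes, because from the next frame on the new velocity is part of the conditioning history.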

Figure 5: Spatiotemporal anomaly detection framework integrating motion features via diffusion models

3.3 Time Series Anomaly Detection

Time series anomaly detection (TSAD) faces challenges from intrinsic temporal dependencies, irregular sampling, and long-term correlations. Diffusion models in TSAD mainly follow two paradigms: reconstruction-based, which reconstructs the series and flags large errors as anomalies; and imputation-based, which masks data points and fills them in from their context, with poor imputation quality indicating anomalies. RNNs or attention mechanisms are often integrated to capture long-term dependencies. Typical applications include fraud detection and fault prediction.
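
The imputation-based paradigm can be sketched as follows. Each point is hidden in turn and re-estimated from its context; here a neighbor average is used as a hypothetical stand-in for a diffusion imputer, and the sine series with an injected spike is an assumed toy example.

```python
import numpy as np

def impute_stub(series, idx):
    """Hypothetical stand-in for a diffusion imputer: average of both neighbors."""
    return 0.5 * (series[idx - 1] + series[idx + 1])

def imputation_scores(series):
    """Hide each interior point, impute it, and use the absolute error as score."""
    scores = np.zeros(len(series))
    for i in range(1, len(series) - 1):
        scores[i] = abs(series[i] - impute_stub(series, i))
    return scores

t = np.arange(200)
series = np.sin(2.0 * np.pi * t / 50.0)  # smooth "normal" behavior
series[120] += 3.0                       # injected point anomaly

scores = imputation_scores(series)
anomalous_index = int(np.argmax(scores))
```

Points that follow the normal pattern are easy to impute from their surroundings, so their error is small; the injected spike cannot be inferred from context and receives the largest score.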

Figure 6: TSAD framework with diffusion models, showing reconstruction and imputation paths

3.4 Tabular Data Anomaly Detection

Tabular data, with mixed types and missing values, presents unique challenges for anomaly detection. Diffusion models adapted for tabular data typically embed heterogeneous features into continuous representations and then learn the joint distribution of normal data. During inference, anomalies are identified via reconstruction loss or likelihood scores. Masking strategies during training improve robustness against missing data, making these methods suitable for fraud detection in finance and healthcare.
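
The embedding and masking steps might look like the sketch below, which covers only the preprocessing around the diffusion model. The encoding choices (standardized numeric columns, one-hot categories, NaN and -1 as missing markers) and the function names are assumptions for illustration.

```python
import numpy as np

def embed_rows(numeric, categorical, num_categories):
    """Map mixed-type rows to continuous features plus an observed-entry mask.

    numeric: (n, d) float array, NaN marks a missing value.
    categorical: (n,) integer codes, -1 marks a missing value.
    """
    observed = ~np.isnan(numeric)
    mean = np.nanmean(numeric, axis=0)
    std = np.nanstd(numeric, axis=0) + 1e-8
    z = np.where(observed, (numeric - mean) / std, 0.0)  # missing values become 0

    one_hot = np.zeros((len(categorical), num_categories))
    rows = np.flatnonzero(categorical >= 0)
    one_hot[rows, categorical[rows]] = 1.0

    cat_mask = np.repeat((categorical >= 0)[:, None], num_categories, axis=1)
    return np.hstack([z, one_hot]), np.hstack([observed, cat_mask])

def masked_mse(x, x_hat, mask):
    """Reconstruction error computed over observed entries only."""
    return np.sum(((x - x_hat) ** 2) * mask) / np.sum(mask)

numeric = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 30.0]])
categorical = np.array([0, 2, -1])
features, mask = embed_rows(numeric, categorical, num_categories=3)
```

Restricting the loss to observed entries means missing values neither contribute to training nor inflate the anomaly score at inference time.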

Figure 7: Tabular anomaly detection framework, involving embedding, diffusion, and scoring

3.5 Multimodal Anomaly Detection

Multimodal anomaly detection (MAD) combines information from different sources like images, text, and sensors to improve detection accuracy. Strategies include early fusion (feature-level), late fusion (decision-level), and dynamic fusion (context-aware). Collaborative diffusion frameworks build shared embeddings and adaptive modules, effectively aligning heterogeneous data, with promising applications in industrial inspection and security monitoring.
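
The three fusion strategies differ mainly in where the modalities are combined. The sketch below is a simplified illustration: the softmax weighting over per-sample confidences is one assumed way to make fusion context-aware, and the feature and score arrays are toy values.

```python
import numpy as np

def early_fusion(features_by_modality):
    """Feature-level: concatenate per-modality features before scoring."""
    return np.concatenate(features_by_modality, axis=-1)

def late_fusion(scores_by_modality):
    """Decision-level: average the anomaly scores of separate detectors."""
    return np.mean(np.asarray(scores_by_modality, dtype=float), axis=0)

def dynamic_fusion(scores_by_modality, confidences):
    """Context-aware: per-sample softmax weights derived from confidences."""
    scores = np.asarray(scores_by_modality, dtype=float)
    conf = np.asarray(confidences, dtype=float)
    weights = np.exp(conf - conf.max(axis=0, keepdims=True))
    weights /= weights.sum(axis=0, keepdims=True)
    return np.sum(weights * scores, axis=0)

image_features = np.zeros((2, 3))
sensor_features = np.ones((2, 2))
fused_features = early_fusion([image_features, sensor_features])

image_scores = [0.2, 0.9]
sensor_scores = [0.4, 0.1]
late = late_fusion([image_scores, sensor_scores])
# Sample 0: both modalities equally trusted; sample 1: the image is trusted far more.
dynamic = dynamic_fusion([image_scores, sensor_scores], [[0.0, 5.0], [0.0, -5.0]])
```

For the second sample, late fusion averages away the strong image signal, while dynamic fusion keeps it because the image modality is given almost all of the weight.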

Figure 8: Conceptual diagram of multimodal anomaly detection with fusion strategies

3.6 Anomaly Generation

Anomaly generation (AG) addresses the scarcity of real anomalous samples by creating realistic synthetic anomalies using diffusion models. By conditioning on text, masks, or latent space manipulations, these models generate diverse anomalies for data augmentation, robustness testing, and self-supervised learning, significantly enhancing model generalization and resilience.
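
How mask conditioning turns into training data can be sketched without a generative model. In the example below, a random texture blended into the masked region is a hypothetical stand-in for the content a mask-conditioned diffusion model would generate; the blending strength and names are assumptions.

```python
import numpy as np

def synthesize_anomaly(image, mask, strength=0.8, rng=None):
    """Return an augmented image and its pixel-level anomaly label.

    The random texture is a crude stand-in for diffusion-generated content;
    only the masked region is modified.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    texture = rng.uniform(0.0, 1.0, size=image.shape)
    augmented = image.copy()
    augmented[mask] = (1.0 - strength) * image[mask] + strength * texture[mask]
    return np.clip(augmented, 0.0, 1.0), mask.astype(np.uint8)

normal_image = np.full((16, 16), 0.5)
mask = np.zeros((16, 16), dtype=bool)
mask[4:8, 4:8] = True

augmented, label = synthesize_anomaly(normal_image, mask)
```

Each synthetic sample comes with an exact pixel-level label for free, which is what makes generated anomalies useful for augmentation and for testing detectors.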

Figure 9: Illustration of anomaly generation process using diffusion models for data augmentation

IV. Challenges and Opportunities

Despite this progress, diffusion models still face challenges such as high computational costs, slow inference, and difficulty handling complex, noisy, multimodal data in real-world scenarios. Future research should focus on optimizing architectures, developing lightweight models, and improving robustness to complex environments. Combining DMs with foundation models and reinforcement learning may unlock new capabilities for anomaly detection and generation in practical applications.

V. Conclusion

This survey systematically reviews the application of diffusion models in anomaly detection and generation, covering theoretical foundations, methods, and practical scenarios. It highlights current bottlenecks, such as multi-step inference costs and limited generalization, and points to future trends like context-aware detection, rapid domain adaptation, and efficient architectures. For further reading, see the paper:

@misc{liu2025anomaly, title={Anomaly Detection and Generation with Diffusion Models: A Survey}, author={Liu, Yang and Liu, Jing and Li, Chengfang and Xi, Rui and Li, Wenchao and Cao, Liang and Wang, Jin and Yang, Laurence T. and Yuan, Junsong and Zhou, Wei}, year={2025}, primaryclass={cs.LG}, eprint={2506.09638}, doi={10.48550/arXiv.2506.09638}, url={https://arxiv.org/abs/2506.09638} }
