By Insights Team in AI — 16 Jul 2025

Up to 87% Better Performance with 70% Less Data: Tianjin University’s Chemical Toxicity Prediction Model Published in Nature Communications

Tianjin University and collaborators developed a chemical toxicity prediction model that improves performance by up to 87% on data-scarce endpoints while cutting training data requirements by about 70%. The work was published in Nature Communications.

In the chemical industry, multi-species acute toxicity assessment is fundamental for classification, labeling, and risk management. Traditional machine learning models often struggle due to scarce human toxicity data, with some endpoints having only 140 data points.

To address these limitations, Tianjin University and its collaborators developed the ToxACoL framework, which models relationships between toxicity endpoints through an endpoint association graph and learns bidirectionally from compounds and endpoints to assess toxicity more effectively.

The study, titled "ToxACoL: an endpoint-aware and task-focused compound representation learning paradigm for acute toxicity assessment", was published in Nature Communications on July 1, 2025.

Paper link: https://www.nature.com/articles/s41467-025-60989-7

Toxicity Assessment Method

Annually, over 100,000 new chemicals are introduced globally, but toxicity evaluation faces challenges like data imbalance and cross-species experimental biases.

To overcome the scarcity of data for many compounds and endpoints, researchers proposed a machine learning paradigm called Adjoint Correlation Learning (ToxACoL), which models multi-species acute toxicity using graph topology and introduces a correlation mechanism for parallel information processing.

By learning the relationships among endpoints, ToxACoL significantly improves prediction accuracy for data-scarce endpoints: performance rises by 56%, 87%, and 43% on the human, female, and male oral LD_Lo endpoints, respectively, while the amount of training data needed falls by about 70-80%.

Figure 1: High-level overview of ToxACoL.

Results Summary

By integrating the correlation mechanism, ToxACoL enables parallel learning of multi-condition labels and sample information, achieving excellent performance in multi-condition toxicity assessment.

Using Pearson correlation coefficient (PCC), the team inferred relationships among endpoints based on the number of shared compounds and their high correlation in toxicity measurements, constructing an endpoint dependency graph where nodes are endpoints and edges represent dependencies.
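As a rough illustration of the graph construction described above, the sketch below links two endpoints when they share enough compounds and their measurements on those compounds are strongly correlated. The function name `build_endpoint_graph`, the thresholds `min_shared` and `pcc_threshold`, and the toy data are all assumptions made for this example; the article does not state the cutoffs the team actually used.

```python
import numpy as np

def build_endpoint_graph(tox, min_shared=3, pcc_threshold=0.5):
    """Build a 0/1 adjacency matrix over endpoints.

    tox: (n_compounds, n_endpoints) array of toxicity values,
         with NaN where a compound was not measured on an endpoint.
    min_shared and pcc_threshold are illustrative values only.
    """
    n_endpoints = tox.shape[1]
    adj = np.zeros((n_endpoints, n_endpoints), dtype=int)
    measured = ~np.isnan(tox)
    for i in range(n_endpoints):
        for j in range(i + 1, n_endpoints):
            # Compounds measured on both endpoints.
            shared = measured[:, i] & measured[:, j]
            if shared.sum() < min_shared:
                continue
            x, y = tox[shared, i], tox[shared, j]
            if x.std() == 0 or y.std() == 0:
                continue  # PCC is undefined for constant vectors
            pcc = np.corrcoef(x, y)[0, 1]
            if pcc >= pcc_threshold:
                adj[i, j] = adj[j, i] = 1
    return adj

nan = np.nan
# Toy data: 5 compounds (rows) x 3 endpoints (columns).
toy = np.array([
    [1.0, 2.1, 5.0],
    [2.0, 4.0, nan],
    [3.0, 6.2, 1.0],
    [4.0, 7.9, nan],
    [nan, 3.0, 2.0],
])
adjacency = build_endpoint_graph(toy)
print(adjacency)
```

In this toy case only endpoints 0 and 1 end up connected: endpoints 0 and 2 share too few compounds, and endpoints 1 and 2 are negatively correlated on the compounds they share.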

Graph convolutional networks (GCN) propagate endpoint correlation information through multiple layers. In 5-fold cross-validation, ToxACoL achieved an average R² of 0.5843 and RMSE of 0.6396, outperforming previous best algorithms like DLCA.
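To make the propagation step concrete, here is a minimal NumPy sketch of the standard graph convolution rule, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W), applied to a small endpoint graph. It shows the generic GCN layer rather than ToxACoL's exact architecture, and the graph, feature sizes, and random weights are placeholders chosen for illustration.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the symmetric normalization used by GCNs."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    deg = a_hat.sum(axis=1)             # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

rng = np.random.default_rng(0)

# Hypothetical graph of 3 endpoints in a chain: 0 - 1 - 2.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
a_norm = normalize_adjacency(adj)

h0 = rng.normal(size=(3, 4))  # initial endpoint embeddings (placeholder)
w1 = rng.normal(size=(4, 8))  # layer weights (random here, learned in practice)
w2 = rng.normal(size=(8, 8))

h1 = gcn_layer(a_norm, h0, w1)  # each endpoint sees its direct neighbors
h2 = gcn_layer(a_norm, h1, w2)  # after two layers, 2-hop neighbors contribute
print(h2.shape)  # (3, 8)
```

Stacking layers is what lets information from an endpoint reach endpoints it is not directly connected to: after two layers, endpoint 0 in this chain can be influenced by endpoint 2.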

Figure 2: Performance comparison of multi-condition acute toxicity estimation on 59 endpoints.

ToxACoL is also efficient and accurate in real-world chemical toxicity evaluation, especially for data-scarce endpoints, most notably the human-related ones.

On the three LD_Lo endpoints mentioned above, ToxACoL achieved R² scores of 0.50, 0.43, and 0.40. Even when trained on only 20-30% of the data used by other methods, it still reached state-of-the-art performance.
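For readers unfamiliar with the two metrics quoted in this article, the sketch below computes R² (the coefficient of determination) and RMSE from their textbook definitions. The numbers are made up for illustration and are not taken from the paper.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: lower is better."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1.0 is a perfect fit,
    0.0 matches always predicting the mean of y_true."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up toxicity values and predictions for four compounds.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 4.0])

r2 = r2_score(y_true, y_pred)
err = rmse(y_true, y_pred)
print(f"R2 = {r2:.3f}, RMSE = {err:.3f}")
```

In a 5-fold cross-validation, these metrics would be computed on each held-out fold and then averaged.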

Further details on ToxACoL’s performance, along with visualizations of molecular structures, are available in the paper.

Effective Evaluation Method

To facilitate broader use, the team integrated ToxACoL into an online platform that also provides chemical GHS classification predictions. The platform aims to offer new validation pathways and serve as a useful resource for regulatory applications.

Platform link: https://toxacol.bioinforai.tech/

The key innovation of ToxACoL is its ability to learn bidirectionally from compound data and toxicity endpoints: its adjoint correlation mechanism embeds both compounds and endpoints.

Its robustness has been validated across multiple experimental scenarios, including multi-endpoint performance, rare species endpoints, and cross-species extrapolation, demonstrating strong resilience in unbalanced multi-task datasets.

Future work will extend ToxACoL to broader acute toxicity tasks and other chemical-related assessments.
