AI Drug Discovery Evolution: Stanford and Molecule Heart Develop SurfFlow to Overcome Peptide Surface Complementarity Challenges

Stanford and Molecule Heart introduce SurfFlow, a novel peptide design system that addresses surface complementarity issues, advancing AI-driven therapeutic peptide development.

Editor | Radish Skin

Recent advances in deep generative models have enabled scientists to design therapeutic peptides targeting difficult drug sites with greater precision. However, existing approaches have underestimated the critical role of molecular surfaces in protein-protein interactions (PPI). It is like finding the right lock but ignoring the angle at which the key must turn, and it greatly hampers peptide design and discovery.

To bridge this gap, the team from Molecule Heart, led by Xu Jinbo, collaborated with Stanford University to propose a comprehensive peptide generation paradigm called SurfFlow. This innovative surface-based generation algorithm enables joint design of peptide sequences, structures, and surfaces.

SurfFlow employs a multimodal Conditional Flow Matching (CFM) architecture to learn the distribution of surface geometries and biochemical properties, improving binding accuracy.

In extensive PepMerge benchmarks, SurfFlow outperformed all-atom baselines across all metrics. These results highlight the advantages of considering molecular surfaces in de novo peptide discovery and demonstrate the potential of integrating multiple protein modalities for more effective therapeutic peptide discovery.

This research paper has been accepted at the prestigious ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2025, the leading international conference in data mining.

Research Background

Peptides, composed of about 2 to 50 amino acids, play key roles in biological processes such as cell signaling, enzyme catalysis, and immune responses. They are essential in pharmacology due to their high affinity and specificity for cell surface receptors, with low toxicity, low immunogenicity, and easy delivery.

Traditional peptide discovery relies on frequent calculations of physical energy functions, but due to the vast design space, this approach is inefficient, prompting rapid development of computational methods.

Recently, surface interactions in protein-protein interactions (PPI) have gained attention, as PPI strength and specificity largely depend on surface complementarity, electrostatics, hydrophobicity, and geometric features like protrusions, grooves, and fissures, which are crucial for lock-and-key or induced fit mechanisms.

These surfaces act as the fundamental interfaces for protein recognition and binding. Considering all molecular modalities (sequence, structure, and surface) simultaneously during peptide generation is therefore vital for keeping the overall design consistent.
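As a rough illustration of the geometric side of surface complementarity described above, the sketch below scores how well two small surface patches face each other, using only point positions and outward unit normals. It is a toy score in the spirit of classic shape-complementarity measures, not the metric used in the paper; the function name `toy_complementarity`, the weight `w`, and the flat test patches are all assumptions made for this example.

```python
import numpy as np

def toy_complementarity(pts_a, nrm_a, pts_b, nrm_b, w=0.5):
    """Toy score in [-1, 1]; higher means patch A faces patch B more closely."""
    # Pairwise distances between every point on A and every point on B.
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)                  # closest B point for each A point
    dist = d[np.arange(len(pts_a)), nearest]
    # Outward normals of well-matched surfaces point in opposite directions,
    # so the negated dot product approaches +1 for a good fit.
    opposing = -np.einsum("ij,ij->i", nrm_a, nrm_b[nearest])
    # Down-weight pairs that are far apart, then summarize with the median.
    return float(np.median(opposing * np.exp(-w * dist**2)))

# Two flat 3x3 patches separated by a gap of 1 unit along z (made-up data).
grid = np.stack(np.meshgrid(np.arange(3.0), np.arange(3.0)), -1).reshape(-1, 2)
pts_a = np.c_[grid, np.zeros(9)]
pts_b = np.c_[grid, np.ones(9)]
up = np.tile([0.0, 0.0, 1.0], (9, 1))

score_facing = toy_complementarity(pts_a, up, pts_b, -up)  # normals oppose
score_same = toy_complementarity(pts_a, up, pts_b, up)     # normals agree
```

The facing patches score positive and the same-direction patches score negative. A real interface would also need the electrostatic and hydrophobic terms that the article lists.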

SurfFlow

To achieve this, the Stanford and Molecule Heart team introduced a new full-design peptide generation algorithm called SurfFlow.

It applies multimodal Flow Matching (FM) to internal structures and to molecular surfaces, where each surface is represented by point positions and unit normal vectors that together form rigid frames in SE(3).
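To make the flow-matching idea concrete, here is a minimal sketch of how a noisy training sample could be built for surface points and unit normals. It is a simplification under stated assumptions: positions follow a straight-line path in 3D space, normals follow a great-circle path on the unit sphere, and the two are handled separately rather than as full SE(3) frames. The function names and random data are hypothetical and do not come from the paper.

```python
import numpy as np

def interpolate_points(x0, x1, t):
    """Straight-line path from noise x0 to data x1; the target velocity is x1 - x0."""
    return (1.0 - t) * x0 + t * x1, x1 - x0

def slerp_normals(n0, n1, t, eps=1e-8):
    """Great-circle interpolation between unit vectors, row by row.

    Exactly opposite normals have no unique geodesic and are not handled here.
    """
    cos = np.clip(np.einsum("ij,ij->i", n0, n1), -1.0, 1.0)
    theta = np.arccos(cos)[:, None]
    sin = np.sin(theta)
    # Fall back to linear weights where the two normals almost coincide.
    near = sin < eps
    safe = np.where(near, 1.0, sin)
    a = np.where(near, 1.0 - t, np.sin((1.0 - t) * theta) / safe)
    b = np.where(near, t, np.sin(t * theta) / safe)
    out = a * n0 + b * n1
    return out / np.linalg.norm(out, axis=1, keepdims=True)

def random_unit_vectors(n, rng):
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

rng = np.random.default_rng(0)
x0, x1 = rng.normal(size=(5, 3)), rng.normal(size=(5, 3))           # noise, "data"
n0, n1 = random_unit_vectors(5, rng), random_unit_vectors(5, rng)

t = 0.3
x_t, v_target = interpolate_points(x0, x1, t)
n_t = slerp_normals(n0, n1, t)
# Training would regress a network's predicted velocity at (x_t, n_t, t) onto v_target.
```

Interpolating normals on the sphere keeps every intermediate sample a valid unit vector, which a plain linear blend would not guarantee.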

Surface geometry alone cannot guarantee successful binding: the interface also needs compatible placement of charge, polarity, and hydrophobicity. SurfFlow therefore incorporates these biochemical constraints.

Illustration: SurfFlow workflow for comprehensive peptide design, considering sequence, structure, and surface modality consistency during generation. (Source: Paper)

Specifically, it applies Discrete Flow Matching (DFM), built on Continuous-Time Markov Chains (CTMC), to the categorical surface features. To capture irregular surface geometries, multi-scale features, and protein interactions, the researchers proposed an Equivariant Surface Geometry Network (ESGN) that dynamically models heterogeneous surface graphs, integrating intra- and inter-surface interactions.
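The following sketch shows one common way discrete flow matching can be realized for categorical labels: a masking process in which each label is visible with probability t, and sampling runs a CTMC that unmasks labels step by step. This is an assumed, simplified schedule chosen for illustration, not the paper's exact formulation. The `MASK` state, the four feature classes, and the `oracle` stand-in for a trained classifier are all hypothetical.

```python
import numpy as np

MASK = -1  # hypothetical "not yet decided" state

def corrupt(labels, t, rng):
    """Noisy sample at time t: each label is kept with probability t, else masked."""
    keep = rng.random(labels.shape) < t
    return np.where(keep, labels, MASK)

def euler_step(x_t, probs, t, dt, rng):
    """One Euler step of the unmasking CTMC.

    probs[i] is a model's predicted class distribution for surface point i.
    Under the linear masking schedule assumed here, a masked point jumps
    during [t, t + dt] with probability dt / (1 - t).
    """
    jump_prob = min(1.0, dt / (1.0 - t))
    jump = (x_t == MASK) & (rng.random(x_t.shape) < jump_prob)
    # Draw a class for every point by inverse-CDF sampling; use it only on jumps.
    cdf = probs.cumsum(axis=1)
    u = rng.random((len(x_t), 1))
    sampled = np.minimum((u > cdf).sum(axis=1), probs.shape[1] - 1)
    return np.where(jump, sampled, x_t)

rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=20)   # 4 made-up surface feature classes
oracle = np.eye(4)[labels]             # stand-in for a trained classifier

n_steps = 10
x = np.full(labels.shape, MASK)        # start fully masked at t = 0
for i in range(n_steps):
    x = euler_step(x, oracle, i / n_steps, 1.0 / n_steps, rng)
```

With the oracle predictor every point ends up with its true label after the last step. A learned model would instead supply `probs` conditioned on the current noisy surface and the receptor.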

Because key peptide properties such as cyclicity and disulfide bonds influence stability and affinity, they are included as additional conditions to enhance SurfFlow’s capacity and generalization.

Performance Evaluation

The team comprehensively assessed SurfFlow on co-design tasks involving sequences, structures, and side-chain packing, using datasets from PepBDB and Q-BioLiP. Results show SurfFlow consistently outperforms all-atom baselines across all metrics.

Table: Comparison of different methods in sequence-structure co-design tasks, with ablation studies on key SurfFlow components. Best results are highlighted in bold and underlined. (Source: Paper)

Illustration: Distribution of binding energies for designed and natural peptides. Lower values are better, indicating stronger binding affinity. (Source: Paper)

Despite these improvements, SurfFlow leaves room for further exploration. For example, incorporating receptor surface information into the joint distribution being modeled could yield further gains. The success of RFdiffusion also suggests that pretraining on general protein structures from the PDB would be beneficial.

Nevertheless, SurfFlow is a novel model capable of generating all protein modalities—sequence, structure, and surface—simultaneously. Researchers applied SurfFlow to solve a specific peptide design challenge, integrating key features like cyclicity and disulfide bonds into the generation process.

It is expected that SurfFlow will soon be available for broader use, and interested researchers can look forward to its upcoming release.
