Machine Learning-Driven Prediction and Design of Plasmid Copy Number from Replication Origin Sequences (The Royal Society Publishing, July 2025)

Precise control over plasmid copy number (PCN) is a fundamental challenge in synthetic biology, because PCN directly governs the balance between recombinant protein yield and host metabolic burden. Despite its importance, predicting PCN computationally from the sequence of the origin of replication (ORI) alone has remained elusive until now.

We present a novel, interpretable machine learning framework designed to redefine plasmid engineering. By extracting over 6,000 biochemical and structural features, including RNA folding dynamics, promoter strength, and sigma factor recognition motifs, our models achieve high-accuracy predictions (r = 0.959 for RNAp) for ColE1-like plasmids.
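To make the idea of sequence-based feature extraction concrete, here is a minimal Python sketch. It is not the pipeline from the paper: the features shown (GC content, length, dinucleotide frequencies) are simple stand-ins for the biochemical and structural features described above, all function names are hypothetical, and it assumes the reported r is a Pearson correlation between predicted and measured values.

```python
from collections import Counter
from itertools import product
from math import sqrt


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def kmer_frequencies(seq: str, k: int = 2) -> dict:
    """Relative frequency of every possible k-mer over the ACGT alphabet."""
    seq = seq.upper()
    total = max(len(seq) - k + 1, 0)
    counts = Counter(seq[i:i + k] for i in range(total))
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def extract_features(seq: str) -> dict:
    """Toy feature vector for an ORI sequence (illustrative stand-in only)."""
    features = {"gc_content": gc_content(seq), "length": float(len(seq))}
    features.update({f"kmer_{k}": v for k, v in kmer_frequencies(seq).items()})
    return features


def pearson_r(x: list, y: list) -> float:
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)
```

In a workflow of this shape, `extract_features` would be applied to each ORI sequence, a regression model would be fitted on the resulting vectors, and `pearson_r` would compare the model's predictions against measured copy numbers.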


Key Highlights:

  • Paradigm Shift in Design: Transition from trial-and-error library screening to precise, sequence-based computational design.

  • Biophysical Insights: Discovery of a thermodynamic sweet spot for ORI stability that optimizes replication efficiency.

  • Validated Methodology: Our model’s predictive power was confirmed through a robust experimental pipeline involving ORI replacement and colony qPCR validation.

 

This work provides a critical bridge between raw DNA synthesis and successful heterologous expression, offering a scalable solution for fine-tuning genetic circuits in both academic and industrial biomanufacturing.

