OpenAI recently launched an AI model designed specifically for the life sciences, claiming that it can assist in drug discovery. Meanwhile, a team at Rockefeller University used gene-editing technology to turn the immune system into a "biofactory" that continuously produces therapeutic antibodies, promising "one treatment, lifelong benefits." Both advances deserve close scientific scrutiny: their actual effectiveness is likely to be far more modest, and their limitations far more serious, than the hype suggests.
Start with the technical foundation of the AI drug discovery model. OpenAI's life science model is essentially a large-scale protein language model whose core is statistical learning from known protein sequence and structure data. Training data for such models come mainly from public databases such as UniProt and PDB, along with a small amount of patent literature. These data carry significant biases: less than one-third of the human proteome has experimentally verified structural coverage, and a large proportion of predicted structures come from computational simulations whose accuracy has not been independently validated. Such models are extremely limited in handling rare protein folding patterns, unstructured regions, and post-translational modifications, yet these are precisely the elements that matter most in drug design. When the model designs new proteins "from scratch," the generated sequences must still undergo lengthy wet-lab validation of their thermodynamic stability, expressibility, immunogenicity, and other properties. Existing studies have shown that more than half of AI-generated high-affinity candidate molecules exhibit non-specific binding or aggregate formation in vitro, and in vivo they may fail rapidly because of metabolic instability. As a result, the claimed "substantially shortened R&D cycle" is often offset by validation bottlenecks before a candidate even reaches the preclinical stage.
A deeper issue lies in the interpretability and generalization ability of AI models. The representations that deep learning models learn in protein space depend heavily on the distribution of the training set. For targets with low sequence homology to known proteins, the models often produce outputs that carry inflated confidence but are in fact ineffective. Drug discovery involves multiple steps, including target identification, lead compound optimization, pharmacokinetics, and toxicology evaluation. Current AI models can assist in only a few of these steps, and their outputs cannot replace basic physical simulation of molecular interactions. OpenAI has not disclosed how its model performs on independent test sets, nor has it revealed whether the training data were contaminated with patent-protected molecules that are still awaiting evaluation; such leakage could lead to overfitting and inflated results. Furthermore, the model's ability to predict dynamic conformational changes in proteins is extremely weak, yet most drug targets depend on conformational equilibria rather than static structures. Given these technical flaws, AI drug discovery tools will remain, for the foreseeable future, coarse screening tools for filtering large virtual molecular libraries, and there is no large-scale statistical evidence on how often the candidates they generate directly go on to enter clinical trials.
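The dependence on training-set distribution described above can be made concrete with a minimal sketch. It assumes a toy setting in which a query sequence is compared against every training sequence and flagged when it resembles none of them. The function names, the toy sequences, and the 0.3 threshold are all hypothetical, and the similarity measure (Python's `difflib.SequenceMatcher`) is only a crude stand-in for alignment-based sequence identity; nothing here reflects how OpenAI's model or any production pipeline works.

```python
from difflib import SequenceMatcher


def max_similarity(query: str, training_seqs: list[str]) -> float:
    """Highest similarity ratio (0.0 to 1.0) between the query and any training sequence.

    SequenceMatcher.ratio() is a crude proxy for alignment-based sequence
    identity; a real pipeline would use a dedicated alignment tool.
    """
    return max(
        (SequenceMatcher(None, query, seq, autojunk=False).ratio() for seq in training_seqs),
        default=0.0,  # an empty training set means nothing is similar
    )


def trust_label(query: str, training_seqs: list[str], threshold: float = 0.3) -> str:
    """Flag queries that resemble nothing in the training set.

    The 0.3 threshold is an illustrative assumption, not a validated cutoff.
    """
    if max_similarity(query, training_seqs) < threshold:
        return "low-trust: out of distribution"
    return "in distribution"


if __name__ == "__main__":
    training = ["MKTAYIAKQR", "MKTAYIAKQL"]  # toy sequences, not real proteins
    print(trust_label("MKTAYIAKQR", training))
    print(trust_label("WWWWWWWWWW", training))
```

A check of this kind does not make a model's prediction correct; it only indicates when a confidence score is being reported far from the data the model has seen.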
The gene-edited "biofactory" technology faces challenges that are even more fundamental. The Rockefeller University team used CRISPR to modify B cells so that they continuously secrete therapeutic antibodies. This strategy must solve several long-standing problems. The first is the risk of off-target editing. The CRISPR system can produce unintended cuts in the human genome, especially when used in highly active immune cells such as B cells. Off-target events may activate proto-oncogenes or disrupt tumor suppressor genes, increasing the risk of clonal proliferative diseases such as lymphoma. Although better guide RNA design can reduce off-target rates to some extent, existing technology cannot completely eliminate low-frequency off-target events, and the irreversibility of "one treatment" means that any off-target effect will stay with the patient for life. The second is the tension between editing efficiency and durability. To cover enough of the B-cell population, CRISPR components must be delivered efficiently to a sufficient number of B cells, yet neither viral vectors nor lipid nanoparticles currently achieve satisfactory B-cell-specific targeting in vivo. Even when editing succeeds, the long-term survival and antibody secretion levels of the modified B cells remain in doubt. The human immune system has inherent homeostatic regulatory mechanisms. Abnormally expanded or persistently active B-cell clones may be suppressed by regulatory T cells, or may induce anti-idiotype antibodies that gradually eliminate the modified cells or neutralize the antibodies they secrete. The antibody expression lasting several years that was observed in animal experiments may be overstated because of differences in immune system development and metabolic rate in small animal models, and its true durability in human patients has not yet been verified.
The phrase "one treatment, lifelong benefits" itself conceals enormous uncertainty. Antibody secretion that is too high may trigger serious adverse events such as immune complex deposition disease or cytokine storm; secretion that is too low will fail to achieve a therapeutic effect. Because each patient's immune response to gene-editing vectors differs, and the distribution of B-cell subsets varies, individual variability in final antibody expression levels may be enormous, making precise regulation difficult to achieve. Moreover, if modified B cells undergo malignant transformation, or if antibodies secreted over the long term happen to cross-react with self-antigens, delayed autoimmune disease may occur. These risks are difficult to reveal fully in existing animal experiments, because most autoimmune diseases and cancers take years or even decades to develop. Without a controllable switch mechanism, using irreversible gene-editing technology for non-life-threatening diseases offers a risk-benefit profile that falls far short of any reasonable threshold for clinical application.
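To illustrate why individual variability makes a fixed therapeutic window hard to hit, here is a minimal simulation sketch. It assumes, purely for illustration, that antibody levels across patients follow a log-normal distribution around a target median; the function name, the window bounds, the units, and every number are hypothetical and are not drawn from the Rockefeller study or any clinical data.

```python
import math
import random


def window_fractions(median, sigma, low, high, n_patients=10_000, seed=0):
    """Estimate the share of simulated patients below, within, and above a window.

    Assumption: antibody levels are log-normally distributed with the given
    median; sigma is the spread of the log-level (larger means more
    patient-to-patient variability). All units are arbitrary.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    mu = math.log(median)
    below = within = above = 0
    for _ in range(n_patients):
        level = rng.lognormvariate(mu, sigma)
        if level < low:
            below += 1      # too little antibody: no therapeutic effect
        elif level > high:
            above += 1      # too much antibody: risk of adverse events
        else:
            within += 1
    return {
        "below": below / n_patients,
        "within": within / n_patients,
        "above": above / n_patients,
    }


if __name__ == "__main__":
    # Hypothetical window of 5 to 20 around a target median of 10.
    for sigma in (0.2, 0.5, 1.0):
        print(sigma, window_fractions(median=10.0, sigma=sigma, low=5.0, high=20.0))
```

Under these assumptions, widening the spread while keeping the same median moves a growing share of simulated patients outside the window in both directions, which is the regulation problem described above.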
The so-called "synergy" between the two technologies also has logical flaws. AI-designed antibody sequences delivered through gene editing may cause unexpected side effects because of their novel immunogenicity in humans, and existing AI models cannot predict how a fully human immune system will respond to them. AI-designed delivery vectors may be neutralized by natural antibodies in vivo or taken up non-specifically by the liver and spleen, resulting in low editing efficiency. The complex interactions among these technical steps mean that a "closed-loop system" relying solely on computational design often fails in real biological environments.
In summary, from a scientific perspective, AI drug discovery models and gene-edited "biofactories" remain at an extremely immature, early exploratory stage. The former suffer from data bias and a lack of interpretability, while the latter face off-target risks, uncertain durability, and safety dilemmas that stem from irreversibility. The so-called "breakthroughs" are closer to proofs of concept under highly simplified conditions, and a long, obstacle-filled road lies ahead before they can change drug discovery paradigms or achieve clinical translation. Maintaining a clear-eyed understanding of these inherent limitations matters far more than chasing visionary promises.