RE: LeoThread 2025-01-08 11:41

AI Reveals Gene Activity in Human Cells - Neuroscience News
by Neuroscience News

Artificial Intelligence / 2025-01-08 18:37



63 comments

Summary: Researchers have developed an AI model that accurately predicts gene activity in any human cell, providing insights into cellular functions and disease mechanisms.

Trained on data from over 1.3 million cells, the model can predict gene expression in unseen cell types with high accuracy.

It has already uncovered mechanisms driving a pediatric leukemia and may help explore the genome’s “dark matter,” where most cancer mutations occur.

Key Facts

AI and Gene Activity: The AI model predicts gene expression in unseen cell types using genomic and expression data, enabling insights into cellular functions.

Pediatric Cancer Discovery: The system identified how specific mutations disrupt transcription factors in inherited pediatric leukemia, confirmed by lab experiments.

Exploring Genome “Dark Matter”: The model offers tools to study non-coding genome regions, illuminating the role of unexplored mutations in cancer and disease.

Using a new artificial intelligence method, researchers at Columbia University Vagelos College of Physicians and Surgeons can accurately predict the activity of genes within any human cell, essentially revealing

The system, described in the current issue of Nature, could transform the way scientists work to understand everything from cancer to genetic diseases.

“Predictive generalizable computational models allow to uncover biological processes in a fast and accurate way. These methods can effectively conduct large-scale computational experiments, boosting and guiding traditional experimental approaches,” says Raul Rabadan, professor of systems biology and senior author of the new paper.

Traditional research methods in biology are good at revealing how cells perform their jobs or react to disturbances. But they cannot make predictions about how cells work or how cells will react to change, like a cancer-causing mutation.

“Having the ability to accurately predict a cell’s activities would transform our understanding of fundamental biological processes,” Rabadan says.

“It would turn biology from a science that describes seemingly random processes into one that can predict the underlying systems that govern cell behavior.”

In recent years, the accumulation of massive amounts of data from cells and more powerful AI models are starting to transform biology into a more predictive science.

The 2024 Nobel Prize in Chemistry was awarded to researchers for their groundbreaking work in using AI to predict protein structures.

But the use of AI methods to predict the activities of genes and proteins inside cells has proven more difficult.

New AI method predicts gene expression in any cell

In the new study, Rabadan and his colleagues tried to use AI to predict which genes are active within specific cells.

Such information about gene expression can tell researchers the identity of the cell and how the cell performs its functions.

“Previous models have been trained on data in particular cell types, usually cancer cell lines or something else that has little resemblance to normal cells,” Rabadan says.

Xi Fu, a graduate student in Rabadan’s lab, decided to take a different approach, training a machine learning model on gene expression data from millions of cells obtained from normal human tissues.

The inputs consisted of genome sequences and data showing which parts of the genome are accessible and expressed.
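
To make the idea of combining sequence with accessibility data concrete, here is a rough sketch. It is an assumption made for illustration, not the encoding used in the study: the function names `one_hot` and `encode_region`, the per-base accessibility scores, and the example values are all hypothetical. It one-hot encodes a short DNA sequence and appends an accessibility score to each position.

```python
# Hypothetical illustration only: this is NOT the encoding used in the study.
BASES = "ACGT"


def one_hot(sequence):
    """Return one row per base; unknown bases (e.g. N) become all zeros."""
    rows = []
    for base in sequence.upper():
        rows.append([1.0 if base == b else 0.0 for b in BASES])
    return rows


def encode_region(sequence, accessibility):
    """Append one (made-up) accessibility score per position to the one-hot rows."""
    if len(sequence) != len(accessibility):
        raise ValueError("sequence and accessibility must have the same length")
    return [row + [float(score)]
            for row, score in zip(one_hot(sequence), accessibility)]


# Toy input: five bases with invented accessibility scores between 0 and 1.
features = encode_region("ACGTN", [0.0, 0.2, 0.9, 0.9, 0.1])
for row in features:
    print(row)
```

Each row has five numbers: four for the base and one for accessibility. A real model would work at a far larger scale and may summarize sequence quite differently.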

The overall approach resembles the way ChatGPT and other popular “foundation” models work.

These systems use a set of training data to identify underlying rules, the grammar of language, and then apply those inferred rules to new situations.

“Here it’s exactly the same thing: we learn the grammar in many different cellular states, and then we go into a particular condition (it can be a diseased or it can be a normal cell type) and we can try to see how well we predict patterns from this information,” says Rabadan.

Fu and Rabadan soon enlisted a team of collaborators, including co-first authors Alejandro Buendia, now a Stanford PhD student formerly in the Rabadan lab, and Shentong Mo of Carnegie Mellon, to train and test the new model.

After training on data from more than 1.3 million human cells, the system became accurate enough to predict gene expression in cell types it had never seen, yielding results that agreed closely with experimental data.
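
Testing on "cell types it had never seen" can be pictured as a leave-one-cell-type-out check: fit on all cell types but one, predict the held-out one, and compare predictions with measurements. The sketch below is only an illustration under loud assumptions: the data are invented, the names `toy`, `fit_line`, `pearson` and `leave_one_out` are hypothetical, and a one-variable linear fit stands in for the actual model, which is far more complex.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# Invented toy data: cell type -> (accessibility, expression) pairs per gene.
toy = {
    "cell_type_a": [(0.1, 0.3), (0.5, 1.1), (0.9, 1.8)],
    "cell_type_b": [(0.2, 0.5), (0.6, 1.2), (0.8, 1.7)],
    "cell_type_c": [(0.0, 0.1), (0.4, 0.9), (1.0, 2.1)],
}


def leave_one_out(data, held_out):
    """Fit on every cell type except `held_out`, then score on `held_out`."""
    train = [pair for name, pairs in data.items()
             if name != held_out for pair in pairs]
    slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
    test_x = [x for x, _ in data[held_out]]
    test_y = [y for _, y in data[held_out]]
    predicted = [slope * x + intercept for x in test_x]
    return pearson(predicted, test_y)


for name in toy:
    print(name, round(leave_one_out(toy, name), 3))
```

The point of the design is that the held-out cell type contributes nothing to fitting, so the correlation measures generalization rather than memorization.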

New AI methods reveal drivers of a pediatric cancer

Next, the investigators demonstrated the power of their AI system by asking it to uncover the still-hidden biology of diseased cells, in this case an inherited form of pediatric leukemia.

“These kids inherit a gene that is mutated, and it was unclear exactly what it is these mutations are doing,” says Rabadan, who also co-directs the cancer genomics and epigenomics research program at Columbia’s Herbert Irving Comprehensive Cancer Center.

With AI, the researchers predicted that the mutations disrupt the interaction between two different transcription factors that determine the fate of leukemic cells.

Laboratory experiments confirmed the AI’s prediction. Understanding the effect of these mutations uncovers specific mechanisms that drive this disease.

AI could reveal “dark matter” in genome

The new computational methods should also allow researchers to start exploring the role of the genome’s “dark matter” (a term borrowed from cosmology that refers to the vast majority of the genome, which does not encode known genes) in cancer and other diseases.

“The vast majority of mutations found in cancer patients are in so-called dark regions of the genome. These mutations do not affect the function of a protein and have remained mostly unexplored,” says Rabadan.

“The idea is that using these models, we can look at mutations and illuminate that part of the genome.”

Already, Rabadan is working with researchers at Columbia and other universities, exploring different cancers, from brain to blood cancers, learning the grammar of regulation in normal cells, and how cells change in the process of cancer development.

The work also opens new avenues for understanding many diseases beyond cancer and potentially identifying targets for new treatments.

By presenting novel mutations to the computer model, researchers can now gain deep insights and predictions about exactly how those mutations affect a cell.

Coming on the heels of other recent advances in artificial intelligence for biology, Rabadan sees the work as part of a major trend: “It’s really a new era in biology that is extremely exciting; transforming biology into a predictive science.”

The paper, titled “A foundational model of transcription across human cell types,” was published Jan. 8 in Nature.

Authors (all from Columbia except where noted): Xi Fu, Shentong Mo (Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE, and Carnegie Mellon University, Pittsburgh, PA), Alejandro Buendia, Anouchka P. Laurent, Anqi Shao, Maria del Mar Alvarez-Torres, Tianji Yu, Jimin Tan (New York University Grossman School of Medicine, New York, NY), Jiayu Su, Romella Sagatelian, Adolfo A. Ferrando (Columbia and Regeneron, Tarrytown, NY), Alberto Ciccia, Yanyan Lan (Tsinghua University, Beijing, China), David M. Owens, Teresa Palomero, Eric P. Xing (Mohamed bin Zayed University of Artificial Intelligence and Carnegie Mellon University), and Raul Rabadan.

Transcriptional regulation, which involves a complex interplay between regulatory sequences and proteins, directs all biological processes.

Computational models of transcription lack generalizability to accurately extrapolate to unseen cell types and conditions.

Here we introduce GET (general expression transformer), an interpretable foundation model designed to uncover regulatory grammars across 213 human fetal and adult cell types.

Relying exclusively on chromatin accessibility data and sequence information, GET achieves experimental-level accuracy in predicting gene expression even in previously unseen cell types.

GET also shows remarkable adaptability across new sequencing platforms and assays, enabling regulatory inference across a broad range of cell types and conditions, and uncovers universal and cell-type-specific transcription factor interaction networks.

We evaluated its performance in prediction of regulatory activity, inference of regulatory elements and regulators, and identification of physical interactions between transcription factors and found that it outperforms current models in predicting lentivirus-based massively parallel reporter assay readout.

In fetal erythroblasts, we identified distal (greater than 1 Mbp) regulatory regions that were missed by previous models, and, in B cells, we identified a lymphocyte-specific transcription factor–transcription factor interaction that explains the functional significance of a leukaemia risk predisposing germline mutation.

In sum, we provide a generalizable and accurate model for transcription together with catalogues of gene regulation and transcription factor interactions, all with cell type specificity.
