Want a glimpse of a key problem for biology researchers in the 21st century? Imagine the United Nations with no earphones and no translators. How would diplomats communicate?
Now, imagine that it is not just speeches being conveyed, but data compiled in biology experiments by researchers working in dozens of languages. To make the situation more complex, imagine those experiments are done on computers through simulations, rather than in a lab using organisms or cells. And toss in a final complication: all those researchers compile their results in different computer formats.
Biologists have a term for it – the babelization of data. The word comes from the biblical Tower of Babel, where God sowed confusion among the inhabitants by having them speak in so many different languages that nothing could get done.
Kam Dahlquist, associate professor and the William F. McLaughlin Chair of Biology at Loyola Marymount University, has been part of a worldwide collaboration for the past eight years to solve this perplexing problem.
The answer they came up with is Biological Pathway Exchange – BioPAX – a standard computational language used by scientists to represent what is taking place at the molecular and cellular level, and to facilitate the exchange of data about these biological events or so-called pathways.
This kind of analysis belongs to the discipline called computational biology, which applies the techniques of computer science, applied mathematics and statistics to biological problems. Computational biology uses computer analysis to describe theoretically what takes place at the cellular and molecular level.
This fall, the work was published in the journal Nature Biotechnology. Dahlquist co-authored the paper, titled “The BioPAX community standard for pathway data sharing,” with more than 70 elite scientists.
“It is the culmination of work started in 2002,” Dahlquist said in a recent interview in her small Seaver Hall office. “The real impact of this paper is that it is a community effort that involves scientists from around the world.”
The work evolved via e-mails, conference calls and in-person meetings, as small groups of these scholars got together while attending larger conferences. Dahlquist made her first contributions while a postdoctoral fellow at UC San Francisco, and later while on the faculty at Vassar College.
“It is not something that you get a lot of glory for,” she said. “Usually, the image we have is the lone scientist who comes up with the exciting discovery that wins the Nobel Prize. This type of collaborative work can actually be harder. Over 70 authors came together, organized and completed something that is a significant milestone.”
In biology, there are three ways to do research – with whole, living animals and organisms, or in vivo; with parts of organisms in the lab and in test tubes, or in vitro; and most recently through computer simulations, or in silico.
In silico research is the fastest-growing field of biology and is becoming the standard. The field focuses on developing computational and statistical analysis methods, as well as mathematical modeling and computational simulation techniques. Computer simulations address scientific research topics and theoretical questions outside of a traditional laboratory.
The BioPAX standard helps deal with the exponential increase in molecular and genomics data collected in databases worldwide. Before the BioPAX standard, it was becoming increasingly challenging to collect, index, interpret and share data. With BioPAX, the information can be organized into pathways in a computable form that supports visualization, analysis and biological discovery.
“It will make the entire scientific community’s life easier,” said Dahlquist. “And in a few years, most scientists may not realize how easy things have become for them because it happened in the background. BioPAX’s impact is worldwide.”