Biology’s transformation: from observation through experiment to computation

Abstract Summary We explore the nuanced temporal and epistemological distinctions among natural sciences, particularly the contrasting treatment of time and the interplay between theory and experimentation. Physics, an exemplar of mature science, relies on theoretical models for predictability and simulations. In contrast, biology, traditionally experimental, is witnessing a computational surge, with data analytics and simulations reshaping its research paradigms. Despite these strides, a unified theoretical framework in biology remains elusive. We propose that contemporary global challenges might usher in a renewed emphasis, presenting an opportunity for the establishment of a novel theoretical underpinning for the life sciences. Availability and implementation https://github.com/ouzounis/CLS-emerges Data in Json format, Images in PNG format.

The natural sciences exhibit notable distinctions, particularly regarding the role of time in a historical context.Mathematics, physics, and chemistry possess the ability to "reverse" time, as much of their subject matter is theoretically, or even practically, reversible.On the other hand, sciences like biology, geology, and astronomy inherently incorporate historicity, where past events significantly influence our present observations and time equations (Mont� evil 2022).Another crucial epistemological distinction shaping scientific research is whether a discipline leans toward experimentation or theory.The latter finds its prime examples in fundamental mathematics and physics.Experimental sciences encompass a broad range from chemistry and biology to geology and astronomy, with the latter two sometimes categorized as observational sciences due to the constraints on conducting certain experiments at scale (Daston 2008).The progression of a scientific discipline is often delineated by its level of "maturity" (Bunge 1968), starting with observation, advancing to experimentation, and ultimately culminating in computation, with theory and applications.Therefore, physics has traditionally been seen as the pinnacle of the natural sciences, and perhaps rightly so.Meanwhile, biology has primarily been classified as an experimental science with elements of historicity, thus leaning toward observation or experiment and not being perceived as a more mature, theoretical science.Recent developments in the last few decades have been causing a shift in this perspective.
A "mature" science, like physics, is defined by the development of theoretical frameworks that enable a high degree of predictability within the systems being studied.These theories encompass everything from simple "laws," such as those governing gravitational acceleration, to intricate "simulations" exploring emergent properties (Varley and Hoel 2022).The primary function of a theory is to establish potential space-time trajectories, reducing or eliminating altogether the necessity of conducting certain experiments.Should an experimenter choose to carry out an experiment adhering to the framework's strict reproducibility criteria, the theory should accurately foresee the experimental outcome.If these predictions do not align with the experimental findings, scientists will search for modifications to the existing framework or even propose an entirely new theory (Kuhn 1962).In today's complex scientific landscape, computational methods often take on a key role either in putting theory into practice or serving as substitutes for theoretical approaches.These methods resemble virtual simulations of experimental scenarios, supported by a mix of equations, statistics, and empirical data or observations (Frick� e 2015).The ability to make generalized predictions is crucial, as the maturity of a theory is evident in its capacity to predict the behavior of intricate phenomena, systems, or processes.This predictive nature serves as a testament to the effectiveness of the more mature, theorybased sciences.
Biology, much like physics in its earlier stages, originated from observations and experiments.While there have been philosophical debates regarding whether biology strictly adheres to the laws of physics or further possesses its own, unique laws (Rosen 1987, Dhar andGiuliani 2010), the foundation of modern biology rests on two principal pillars: the study of structure as genetics (life's mathematics) and the investigation of function as biochemistry (life's chemistry).A pivotal achievement in early 20th century biology was the development of the Modern Synthesis, which integrated the theory of evolution by natural selection with the theory of inheritance, as epitomized by the iconic works of Darwin and Mendel, respectively (Mayr 1993).A generation or two ago, the manipulation of biological systems via genetic engineering marked the rise of molecular biology (Stent 1968).Subsequently, biological science underwent a shift with more systematic studies and advancements in technology that ushered in the Genome Era (Botstein 2011).Now, the question arises: What is the present-day epistemological status of biology?Will it predominantly remain an experimental science, or is it rapidly moving toward "maturity," supported by computation?
Over the course of four decades, a central question has emerged concerning the transformation of biology into a science more akin to physics than to natural history, and the role that bioinformatics has played in this process (Ouzounis 2012).With strong affirmation, which is widely held by many, biology has unquestionably attained a notable level of sophistication, where extensive data analytics enable us to make solid predictions for various outcomes without solely relying on experiments (Markowetz 2017).Research in the life sciences now includes more theoretical work than outsiders might recognize, involving substantial computational demands, algorithmic depth, massive volumes of data, and the integration of robotic automation into large-scale experiments.This evolving landscape suggests that biology is entering a digital phase where experiments are increasingly guided by computation and modeling.Thanks to the growing predictive power of simulations, they encompass a multitude of potential experiments that need not be physically conducted, with the incorporation of machine learning.
Thus, contemporary life science research increasingly relies on automated procedures and high data volumes, driven by the rapid development of sequencing technologies, and slowly moving away from traditional natural history-based observations or hypothesis-based, classical experiments.Relevant observations within the life sciences span diverse scales, ranging from the molecular level with precision down to individual atoms, through microscales utilizing advanced imaging technologies, to macroscales facilitated by planetarylevel sensors, as well as collaborative efforts like observatories, scientific networks, and crowdsourcing initiatives (Nathan et al. 2022).These efforts are often supported or inspired by other disciplines, such as the earth sciences.
While biology is not exclusively a computational science, it is undoubtedly a broad discipline that heavily relies on computation by any standard (Fig. 1).Computational methods are extensively employed to control experiments across various levels, and the intricacies of life continuously stimulate innovative, "bio-inspired" design approaches for both hardware and software engineering.However, what remains absent is a comprehensive "grand" theory that can encompass the recurrent observations and predictable behaviors observed in biological systems (Krakauer et al. 2011).Such a theory would typically rely on CPU-and data-intensive systems in the short term, as well as innovative conceptual frameworks in the long term.
Furthermore, there is an opportunity to cultivate a similar social and political momentum as when physics dominated in the past.Recent global trends such as pandemics, conflicts, the climate crisis, widespread environmental destruction, and health challenges posed by an ageing population or wealth imbalance could offer the socio-political context needed for a new era of increased funding, akin to what physics has received historically.This renewed drive could contribute to the development of a comprehensive framework for the biological sciences while simultaneously aiding in the  Ouzounis management and restoration of natural ecosystems, with a wide array of applications aimed at promoting human and ecosystem well-being in the 21st century.

Figure 1 .
Figure1.Evolution of terms in the literature associated with the term "computational" and over 3-year periods from 1988 to 2022 (last period lasts two years, 2023 is excluded).Color code: pink-"biocomputing", bluegray-toolkits to robots, seafoam-neurocomputing, olive-medical devices, purplegenome informatics, seagreen-pharmacogenomics, skyblue-proteomics, orange-immunogenomics (see GitHub README for details).Generated with VOSviewer (Van Eck and Waltman 2010).Data type: map on text data, data source: Europe PMC, search query: title þ abstract, counting: full, minimum number of occurrences: 10, relevance score percentage: 60.See also Supplementary FiguresS1 and S2.Results are reproducible and not affected significantly with change of parameters. 2