Synthetic Biology: fostering the cyber-biological revolution

Abstract
Since the description, in 2000, of two artificial gene networks, synthetic biology has emerged as a new engineering discipline that catalyzes a change of culture in the life sciences. Recombinant DNA can now be fabricated rather than cloned. Instead of focusing on the development of ad-hoc assembly strategies, molecular biologists can outsource the fabrication of synthetic DNA molecules to a network of DNA foundries. Model-driven product development cycles that clearly identify design, build, and test phases are becoming as common in the life sciences as they have been in other engineering fields. A movement of citizen scientists with roots in community labs throughout the world is trying to democratize genetic engineering. It challenges the life science establishment just as visionaries in the 70s advocated that computing should be personal at a time when access to computers was mostly the privilege of government scientists. Synthetic biology is a cultural revolution that will have far-reaching implications for the biotechnology industry. The work of synthetic biologists today prefigures a new generation of cyber-biological systems that may lead to the 5th industrial revolution. By catering to the scientific publishing needs of all members of a diverse community, Synthetic Biology hopes to do its part to support the development of this new engineering discipline, catalyze the culture changes it calls for, and foster the development of a new industry far into the twenty-first century.

On January 20, 2000, Nature published two articles reporting the design, fabrication, and characterization of two artificial gene networks. Timothy Gardner, Jim Collins, and Charles Cantor described a genetic toggle switch that could be flipped between ON and OFF states using transient environmental signals [1]. Michael Elowitz and Stanislas Leibler described the Repressilator, a genetic circuit that exhibited oscillations in the expression of a reporter gene [2].
On the face of it, these two articles looked like biology papers. They included the description of new plasmids and reported data collected with instruments commonly used by biologists. And there was nothing particularly new in these experiments. Many molecular biologists had the skills necessary to assemble and characterize these plasmids but none of them thought of designing them. It took the minds of a mechanical engineer (T. Gardner) and a physicist (M. Elowitz) to imagine these circuits. The novelty of these articles was not so much in their biological aspect as it was in the applications of engineering principles to the design of circuits encoded in DNA molecules. These two articles have been a source of inspiration for many of us. They have catalyzed the emergence of a movement of dreamers aspiring to engineer DNA like their parents engineered silicon. This movement eventually led to the emergence of synthetic biology as a new field of engineering [3][4][5].
Fifteen years later, we have come to appreciate the culture change that synthetic biology calls for. We see many indications that this specialty has the potential to support an industrial revolution fueled by the emergence of cyber-biological systems across many segments of the economy. The interplay between scientific breakthroughs and innovative industrial applications is well illustrated by the career paths of the discipline's pioneers. Gardner left academia for industry 10 years ago to join one of the first synthetic biology startups, while Elowitz stayed in academia, where his work continues to deeply renew our understanding of biological processes.

DNA is the new silicon
In the early days of genetic engineering, techniques used to assemble recombinant DNA molecules were extremely limited.
They extensively relied on the presence of motifs that could be cut by restriction enzymes. The resulting restriction fragments could be separated by physical purification processes like electrophoresis and stitched together using DNA ligases. In the 80s, the availability of instruments automating the synthesis of small single-stranded DNA molecules called oligonucleotides along with the invention of the Polymerase Chain Reaction (PCR) [6,7] led to the development of site-directed mutagenesis strategies. It became possible to locally edit natural DNA sequences by introducing new restriction sites, eliminating others, and altering the biological functions of specific DNA sequences. However, this process was expensive, time-consuming, and very constrained by the specific characteristics of individual DNA sequences. As a result, biologists have been concerned by the challenges of assembling new plasmids from existing genetic material since the early days of genetic engineering [8]. A number of new techniques have made it easier to recombine DNA fragments but little has changed in the way most biologists approach the development of new DNA molecules to express their genes of interest. Their attention is disproportionately focused on the assembly of expression vectors at the expense of other aspects of their research project [9]. Many research and development projects in academia and industry are constrained by the perceived cost and limitations of producing new recombinant DNA molecules.
One of the most transformative ideas introduced by synthetic biology pioneers like Drew Endy and Tom Knight [3] is that DNA should be "fabricated" instead of being handcrafted. Fabrication and manufacturing are words that imply that DNA molecules should be the output of industrial processes instead of the artisanal production of skilled craftsmen. The vision of an industrial production of new DNA molecules calls for generic assembly processes that can be applied to any DNA sequence instead of the ad-hoc cloning processes commonly used by molecular biologists [10]. It also anticipates the emergence of high-throughput assembly lines depending on automated instruments and factory workers, who may not need advanced degrees to perform the tasks that cannot be automated [11]. This evolution is reminiscent of the evolution of oligonucleotide synthesis, which is now mostly outsourced to a few companies like Integrated DNA Technologies.
Early on, synthetic biologists anticipated the emergence of a DNA synthesis industry providing custom fabrication services to the biotechnology industry comparable to the foundries serving the semiconductor industry. DNA synthesis, also known as "gene synthesis", is the de novo synthesis of DNA molecules entirely derived from oligonucleotides produced using a chemical process [12]. DNA synthesis is not new. It is as old as molecular biology itself, as it was instrumental in the elucidation of the genetic code in the 60s [13][14][15]. However, it was only in the 90s that oligonucleotides became cheap enough to make it affordable to order the large numbers needed for DNA synthesis projects. At the same time, the rapid development of PCR provided enzymes and protocols that could be used to assemble many oligonucleotides in a single reaction with the fidelity needed to meet the quality requirements of de novo DNA synthesis [16]. Around 2000, a number of startups like Blue Heron, GeneArt, and DNA2.0 launched gene synthesis services using techniques developed in the 90s. Many established companies serving the life science industry followed suit by offering DNA synthesis in addition to other services like oligonucleotide synthesis (Integrated DNA Technologies) or DNA sequencing (Genewiz). The cost of DNA synthesis services has steadily decreased but it is still regarded as prohibitively expensive for many projects. As a result, most projects still rely heavily on traditional cloning techniques and limit the use of DNA synthesis to the generation of specific sequences like codon-optimized open reading frames. This limited use of DNA synthesis has motivated the emergence of a second generation of DNA synthesis companies (Twist Biosciences, Gen9, SGI-DNA) hoping to disrupt the DNA synthesis market by developing new synthesis technologies that will reduce the cost, increase the throughput, and shorten the turnaround time of DNA synthesis by orders of magnitude.

Model-driven development lifecycle
Another benefit of using the word "fabrication" in relation to DNA is that it implicitly refers to the life cycle of a product (Figure 1). It places the assembly of synthetic DNA molecules in relation to other stages upstream and downstream of fabrication. When biologists are freed from the hassle of making DNA molecules, they can allocate more resources to imagining the DNA sequences that best serve their research objectives. Treating fabrication as orthogonal to the design of DNA molecules can unleash the creativity of life scientists. Instead of being constrained by the limits of what DNA molecules they could write, they can now think of what DNA molecules they should write and let someone else figure out how to make them. Ideally, fabrication should be independent of design. In practice, this clean separation does not yet exist: vendors and methods have restrictions around certain sequences (high GC content, sequence repeats, motifs), and genetic designers need to be mindful of the manufacturability of their designs.
The benefits of de novo gene synthesis cannot be overstated. It has made it possible to "resurrect" extinct viral strains [17], rationally attenuate viral genomes [18,19], refactor the genomes of bacteria [20] and yeast [21], and extend the genetic code [22]. These moonshot projects drew a lot of attention but they pale in comparison with the upcoming transformations of the biotechnology supply chain resulting from cheap DNA synthesis. One can anticipate that in a not-too-distant future it may become cheaper to resynthesize DNA molecules from scratch than to store and distribute existing plasmids. For the sake of argument, imagine that gene synthesis rates reach the symbolic threshold of 1 penny per base pair. A 5 kb plasmid could be synthesized for $50, less than the price of ordering it from a not-for-profit organization like Addgene [23]. At that rate, the biological sample that contains the DNA molecules becomes much less valuable than the information about the DNA sequence itself. Beyond the cost factor, the time to access samples is very important. In the foreseeable future, retrieving an existing sample from a freezer will be much faster than synthesizing it and could justify the biobanking expenses.
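The penny-per-base-pair thought experiment above reduces to a one-line calculation. The snippet below is only an illustrative sketch: the function name `synthesis_cost_usd` is an assumption introduced here, and the only figures taken from the text are the 1 cent per base pair rate and the 5 kb plasmid length.

```python
from decimal import Decimal


def synthesis_cost_usd(length_bp: int, price_per_bp_usd: str) -> Decimal:
    """Return the cost, in US dollars, of synthesizing `length_bp` base pairs.

    The price is passed as a string so that Decimal represents it exactly
    (hypothetical helper, not part of any vendor API).
    """
    return Decimal(length_bp) * Decimal(price_per_bp_usd)


# Figures from the text: a 5 kb plasmid at 1 cent per base pair.
plasmid_length_bp = 5_000
cost = synthesis_cost_usd(plasmid_length_bp, "0.01")
print(f"${cost:.2f}")  # prints "$50.00"
```

Changing the second argument makes it easy to see how sensitive the comparison between resynthesis and distribution of stored samples is to the per-base price.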
Upstream of fabrication, the design of DNA molecules is now often model-driven. Optimization of coding sequences to maximize expression of heterologous proteins, often referred to as "codon-optimization", is one of the most popular forms of computational design of synthetic DNA sequences [24][25][26][27]. The rational attenuation of viral sequences [18,19,28] and the computation of ribosome binding sites are other examples of model-driven design of DNA [29]. A number of other computational methods are being developed to streamline the design of longer and more complex DNA sequences [30][31][32][33], but the predictive power of mathematical models of the behaviors encoded in DNA sequences is still limited. The development of gene networks that implement user-defined specifications still requires a lot of empirical tuning to achieve the desired phenotypes [31].
This observation feeds a debate among synthetic biologists about the respective roles of rational design and evolutionary methods [34].
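To make the idea of codon optimization concrete, here is a minimal sketch of its simplest variant, in which every amino acid is back-translated to a single "preferred" codon. The codon assignments are valid under the standard genetic code, but the choice of which synonymous codon is preferred is a made-up assumption for illustration, not measured usage data for any organism. The tools cited above use organism-specific codon usage tables and additional sequence constraints that this sketch ignores.

```python
# Hypothetical preferred-codon table for an unnamed host organism.
# Each codon does encode the listed amino acid in the standard genetic
# code; the preference among synonymous codons is assumed for illustration.
PREFERRED_CODON = {
    "M": "ATG",
    "K": "AAA",
    "L": "CTG",
    "S": "AGC",
    "E": "GAA",
    "*": "TAA",  # stop codon
}

# Reverse lookup, restricted to the codons used in the table above.
CODON_TO_AA = {codon: aa for aa, codon in PREFERRED_CODON.items()}


def back_translate(protein: str) -> str:
    """Encode a protein sequence using one preferred codon per amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def translate(dna: str) -> str:
    """Translate a DNA sequence made only of codons from the table above."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


protein = "MKLSE*"
dna = back_translate(protein)
# The recoded DNA must still encode exactly the same protein.
assert translate(dna) == protein
print(dna)  # prints "ATGAAACTGAGCGAATAA"
```

The round-trip check captures the one invariant every codon optimization method must preserve: the nucleotide sequence changes, the encoded protein does not.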
Downstream of fabrication, the phenotypes encoded in synthetic DNA molecules are now analyzed using quantitative models. When gene expression was measured by looking at bands on electrophoresis gels, data analysis was limited to qualitative (the protein is present or absent) or semi-quantitative (the protein is highly expressed) statements. The development of fluorescent proteins that could be used as reporter genes has opened numerous possibilities for more rigorous mathematical analysis [35][36][37][38][39]. Fluorescent proteins made it possible to measure gene expression in live cells instead of having to measure proteins extracted from cell cultures. Fluorescent proteins also made it possible to collect data with single-cell resolution using commonly available instruments like microscopes and flow cytometers. Despite well-known limitations, the use of fluorescent proteins as reporter genes provided the data needed to develop sophisticated mathematical models of gene expression [40]. More recently, the integration of imaging and microfluidics [41][42][43][44] has greatly improved the quality of data available to modelers.
"The Times They Are a-Changin" By using recombinant DNA technologies in the context of a broader model-driven product development workflow, synthetic biology has finally brought "engineering" to "genetic engineering". This represents a major culture change that has been triggered by a new interest of engineers and quantitative scientists for DNA. This demographic trend is just beginning and will take a few decades to complete. For the most part, the generation of scientists trained as biologists will not be able to embrace this change in their lifetime. It will take a generation of young biological engineers with solid quantitative and computational skills for the biotechnology industry to complete the transition. For more than 10 years, the competition iGEM has communicated to thousands of undergraduate students an inspiring vision of a world where DNA should be simple to engineer [45,46]. Over the course of a summer, students often come to appreciate the gap between this compelling vision and today's reality. Yet, their dream lives on and will motivate them to spearhead the culture changes that will transform the biotechnology industry. In the design phase, computer models are used to generate DNA sequences and predict their properties. In the build phase, these DNA molecules are produced by manufacturing processes that assemble large DNA molecules out of chemically synthesized building blocks. Finally, in the testing phase, DNA is introduced in living cells and gene expression is measured. Experimental data is finally compared to simulation results to improve the design in the next iteration of this cycle.
The emergence of the DIYBio and citizen scientist movements also participates in this culture change in the sense that DIYBiologists are taking some research projects away from the biological establishment [47][48][49][50]. DIYBiologists have the ambition to democratize biological research by bringing it to their kitchens and garages, just as the personal computing movement in the 70s challenged the dominance of government computing infrastructures. They are excited to challenge the incumbents who dominate the biotechnology industry. The Open Insulin Project is a good illustration of this trend. However, most DIYBiologists are hobbyists working on projects of very limited scope. Even initiatives with catchy names that manage to generate short-lived hype are likely to be soon forgotten. By contrast, the development of Ginkgo Bioworks is evocative of Silicon Valley mythology. It didn't start in a garage but in a repurposed shipping container [50], which is close enough to fuel the imagination of aspiring entrepreneurs. Ten years, many federal grants and contracts, and several rounds of funding later, Ginkgo is sustaining its growth by attracting world-class talent with its very distinctive corporate culture that comes with strong flavors of biohacking.

The cyber-biological industrial revolution
This culture change has the potential to enable an industrial revolution. Recently the World Economic Forum has recognized the emergence of cyber-physical systems as catalysts of the 4th industrial revolution [51]. Cyber-physical systems are hybrid systems composed of a number of physical entities connected to software running control algorithms to direct individual devices in response to data received from various feeds. The navigation apps running on mobile devices (Google Maps, Waze) create a cyber-physical transportation system that many people use on a daily basis. Smartphones provide position and traffic information to a central server. The information provided by this network of devices is analyzed in real time, along with other data sources, to provide individualized directions to each user. Beyond transportation, the power grid, manufacturing, retail, health care, and air traffic control now include many cyber-physical systems.
With its strong emphasis on model-driven biology, synthetic biology also includes cyber-physical systems. For instance, the manufacturing of custom DNA molecules is a physical process driven by several layers of software. Virtual labs like Transcriptic or Emerald Cloud Labs are also examples of cyber-physical systems in biotechnology. However, synthetic biology goes beyond this by encoding control algorithms within DNA molecules, engineering organisms that can reproduce, communicate with each other, or leverage complex webs of interactions between hosts and pathogens, prey and predators, etc. There is an unprecedented level of complexity in these engineered biological systems that makes them different from cyber-physical systems. They may be best described as "cyber-biological" (Figure 2).
Synthetic biology is certainly not as mature as the technologies that catalyzed the emergence of cyber-physical systems. It is still mostly artisanal, but there are early indications that cyber-biological systems have the potential to catalyze the fifth industrial revolution in the second half of the twenty-first century. It is important to remember that the Internet and the Global Positioning System, two key technologies that enabled the development of today's cyber-physical systems, were developed by the US Department of Defense more than forty years ago [52]. This historical perspective helps one to appreciate the significance of the investments that the Defense Advanced Research Projects Agency (DARPA) has been making in synthetic biology over the last few years. Its "Living Foundries" [53] program articulated a vision of a new industry relying on cyber-biological systems. This frontier is so important to DARPA that it recently created a new Biological Technologies Office [54].
There is also evidence that the center of gravity of the synthetic biology community has been progressively shifting toward industry. Companies like Amyris, Synthetic Genomics, Ginkgo Bioworks, Intrexon, or Twist Biosciences have raised resources that allow them to develop industrial-scale research infrastructures beyond the reach of academic research groups. SynBERC's very successful industry program has inspired some mature companies to develop synthetic biology initiatives in-house or through collaborations with synthetic biology startups.

Editorial Policies
Synthetic biology articles are still mostly published in a broad range of interdisciplinary and specialized journals [55]. This situation can be problematic as it makes it difficult for readers to identify relevant papers and for authors to get published. In this environment, Synthetic Biology aims to be a common forum in which researchers can share research and ideas. The number of synthetic biology articles has been growing at an annual rate of 6% over the last 10 years and this rate is expected to be sustained over the next ten years as governments across the globe have been investing in synthetic biology programs (Figure 3A). The citation behavior of the field is very strong, with only 13% of 2012 papers receiving 0 citations and 12% receiving 16+ citations. Since 2005, over 40% of each year's papers have been cited more than 5 times in their first two years. This indicates a history of strong citation performance in the field (Figure 3B).
It is expected that a journal dedicated to the field will increase the visibility of synthetic biology papers, which will translate into improved citation statistics. Yet, Synthetic Biology refrains from making editorial decisions based on the editors' assessment of the anticipated impact of the work presented in the submissions it receives. The journal publishes all articles that are scientifically sound as evaluated by a rigorous peer-review process.
In order to minimize reproducibility issues, authors are requested to provide comprehensive sets of supporting data in a computer-readable format. Synthetic Biology encourages authors to use data repositories like Figshare or Dryad to deposit their data prior to submitting their manuscript in a format allowing reviewers and readers to reuse the data. DNA sequences are of particular importance to most synthetic biology articles and authors will be requested to provide the complete sequences of the plasmids and other genetic material described in the manuscript [56]. Authors are also encouraged to release both raw data generated by instruments and the reduced data sets presented in the article's figures. For example, papers using time-lapse microscopy to measure the dynamics of gene expression should release the images produced by the microscope (raw data), the scripts used to extract gene expression data from series of images, and the gene expression data (reduced data) used to generate figures [57][58][59]. The use of commonly accepted file formats and community standards like SBML [60] and SBOL [61,62] is recommended. For example, DNA sequences can be deposited as FASTA, GenBank, or SBOL files, but PDF files are not suitable to communicate DNA sequences.
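As an illustration of how little is needed to share a sequence in a reusable form, the sketch below formats a DNA sequence as a FASTA record: a header line starting with ">" followed by the sequence itself. The function name `to_fasta`, the record name, and the example sequence are hypothetical, and the 60-character line width is a common convention rather than a requirement of the format or of the journal.

```python
def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format a DNA sequence as a single FASTA record.

    Hypothetical helper for illustration: it upper-cases the sequence,
    rejects characters other than A, C, G, T, and N, and wraps the
    sequence at `width` characters per line.
    """
    seq = sequence.upper()
    invalid = set(seq) - set("ACGTN")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Made-up 90 bp example sequence; a real submission would contain the
# complete sequence of each plasmid described in the manuscript.
example_sequence = "ATGAAACTGAGCGAATAA" * 5
record = to_fasta("pExample_insert", example_sequence)
print(record, end="")
```

A plain-text record like this can be parsed directly by standard sequence analysis tools, which is not the case for a sequence embedded in a PDF.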
Reviewers are attentive to the proper use of physical units and calibration methods to ensure the reproducibility of results reported in the journal. Fluorescence data are commonly reported in relative units, which makes it impossible to compare datasets. Calibration of the instrument can lead to higher-quality data [63].
Synthetic Biology recognizes the important contributions of industry to the development of the field. Manuscripts based on commercially available resources such as strains, reagents, or software are welcome as long as authors fully disclose their conflicts of interest. Authors reporting results produced with their company's products should keep in mind that their submissions need to meet the journal's rigorous scientific standards and successfully go through peer review. Submissions that read like promotional material are rejected without review.
The journal encourages the release of computational resources using one of the Open Source licenses recognized by the Open Source Initiative. However, we also recognize that in some instances, open source release may jeopardize the long-term sustainability of important computing resources. Therefore, open source release is not a requirement for the publication of articles describing new software, databases, or web sites.
Synthetic Biology acknowledges that engineers and life scientists have different publishing practices. Engineers and physicists commonly rely on preprint servers and conferences to disseminate scientific results. Synthetic Biology supports these traditions, which provide quick access to new scientific results, while offering a new avenue to publish these results in a more polished format that readers will identify and cite more easily.
Finally, Synthetic Biology is interested in receiving submissions from students and teachers reporting educational projects. Submissions from citizen scientists discussing topics of interest to the DIYBio community are welcome. Authors of educational and DIYBio papers should send a pre-submission enquiry to the editorial office (synbio.editorialoffice@oup.com) in order to ensure that their ideas are a good fit for the journal.
By catering to the scientific publishing needs of all members of a diverse community, Synthetic Biology hopes to do its part to support the maturation of this new engineering discipline, catalyze the culture changes it calls for, and prepare for the emergence of a new industry far into the twenty-first century.

Figure 3. (A) [...] is expected that this trend will continue as the vast majority of these articles now cite one or more grants or contracts supporting the work. (B) The field of synthetic biology shows strong citation patterns, as 40% of papers receive more than 5 citations in the first two years after publication.