Automated design of hammerhead ribozymes and validation by targeting the PABPN1 gene transcript

We present a new publicly accessible web-service, RiboSoft, which implements a comprehensive hammerhead ribozyme design procedure. It accepts as input a target sequence (and some design parameters) and then generates a set of ranked hammerhead ribozymes that target the input sequence. This paper describes the implemented procedure, which takes multiple objectives into consideration, leading to a multi-objective ranking of the computer-generated ribozymes. Many ribozymes were assayed and validated, including four ribozymes targeting the transcript of a disease-causing gene (a mutant version of PABPN1). These four ribozymes were successfully tested in vitro and in vivo for their ability to cleave the targeted transcript. The positive wet-lab results of these tests are presented here, demonstrating the real-world potential of both hammerhead ribozymes and RiboSoft. RiboSoft is freely available at the website http://ribosoft.fungalgenomics.ca/ribosoft/.


Multi-objective optimization
The schematic below illustrates the principles of multi-objective optimization with "Pareto-optimal" fronts.
• The set of solutions that are not dominated by any other solution makes up the "Pareto-optimal" front; these solutions are all assigned the same rank: 1
• If the rank 1 solutions are removed, the next front comprises the solutions of rank 2
• This layered removal of solution "fronts" continues until either no more solutions are needed or no solutions are left
• The same principles apply regardless of the number of objectives (dimensions) assessed
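The layered ranking described above can be illustrated with a minimal Python sketch. It is not RiboSoft's implementation: the function names `dominates` and `pareto_ranks` are hypothetical, and the sketch assumes that every objective is to be minimized (an objective to be maximized can be negated first).

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if solution a dominates solution b.

    Assumption: all objectives are minimized. a dominates b when it is
    no worse on every objective and strictly better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_ranks(points: List[Sequence[float]]) -> List[int]:
    """Assign rank 1 to the non-dominated front, remove it, and repeat."""
    ranks = [0] * len(points)
    remaining = set(range(len(points)))
    rank = 1
    while remaining:
        # The current front: solutions not dominated by any remaining solution.
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


# Two objectives, five hypothetical candidates.
candidates = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
print(pareto_ranks(candidates))  # [1, 1, 1, 2, 3]
```

The same function works unchanged for three or more objectives, since `dominates` simply compares the tuples element by element.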

RiboSoft pseudo-code
The following pseudo-code provides brief explanations of the algorithms that evaluate each parameter of the ribozyme candidates. A candidateStructure is a possible secondary structure of an hhRz.

Ribozyme
FindContinousBasePairs will find all double-stranded sequences in the structure, such that there is at most one mismatch in each sequence. This list will be stored in ContinousPairs. For each of these double-stranded segments (or ContinuousPair), MeltingTemperature will be computed and added to sum. This sum represents the melting temperature of the whole candidateStructure.

Specificity is initialized to zero. For each cut-site, results will hold information obtained from querying BLAST on that cut-site. A cut-site is defined here as the region of the target that is complementary to the arms of the hhRz. Each BLAST result will be dealt with individually, even if two or more results are attributed to a single transcript. Weight will weigh each result based on the perfection of its match to the hhRz arms. These weighted results will be summed into Specificity. An XM or XR BLAST hit will not be used in the specificity calculation.

Target Accessibility

Fitness Evaluation
ParetoRank (

In both cases, the RNA was prepared as described for hammerhead ribozymes, except that it was purified on 6% PAGE.

Additional Ribozyme Transcripts
The hhRz designed against RFP and the 16S fragment were all prepared in the same way as the ribozymes against PABPN1 with the following oligonucleotides:

Kinetic measurements of ribozyme cleavage for RFP and 16S
To monitor the ribozyme cleavage kinetics, 5 nM of each ribozyme was added to the cleavage buffer (as described in the main text) in the presence of labelled RFP or 16S and in the absence of MgCl2. The first step was incubation at 85°C for 1 minute to allow RNA folding, followed by immediate transfer to ice for a few minutes. In the second step, 1 µl of MgCl2 was added to start the cleavage reaction; this moment was taken as time 0, after which 2 µl aliquots were taken at different times. All other steps were as described in the main text.

Ribozyme(rank): labels such as "2(4)" combine two numbers. The first is the cut-site position (the cut-site triplet is GUC by default), so "2" denotes the third GUC in the target sequence, because counting begins at 0 (the first GUC is number 0). The "4" in parentheses is the rank of the ribozyme as determined by RiboSoft (1 being the best).
Tm: melting temperature for one ribozyme arm with the target. Because ribozymes require complementarity to form two stems with the target, both the left and the right Tm are shown.
Accessibility: to hybridize with the template, the ribozyme must find an accessible sequence. The accessibility value ranges from 0 to 1 and depends on the position of the GUC and on whether the sequence on both sides of it lies in a stem (value closer to 0) or in a bulge (value closer to 1). The closer the value is to 1, the better the cleavage is expected to be.
Shape quality: each RNA folds into a specific secondary structure, but the hammerhead ribozyme has a consensus structure. The shape quality values are RiboSoft's prediction of the ability of each putative ribozyme sequence to fold into the active ribozyme structure.
Cleavage: the values, presented as - and +, are the results of testing the ability of the ribozymes to cut their template. These values reflect both maximum cleavage and speed of cleavage (from time course experiments, data not shown).
Note: the "A" written in front of some ribozyme names means "Anterior" and refers to a first set of ribozymes that were tested on RFP.

Figure S1. Some of the ribozymes designed by RiboSoft against RFP RNA were chosen to evaluate the effect of different parameters on the ribozyme activity. Thus, for many sites two ribozymes of different ranks were tested. All ribozymes were designed by RiboSoft (designated by their cut-site number with their ranking in parentheses) except Rz 5m, which was designed manually (a ribozyme with a catalytic core identical to that used by RiboSoft, but with binding arms chosen manually to include mismatches with the target). Different combinations of 2 or 3 ribozymes were made and the cleavage efficiency was calculated for each reaction.
The 6% polyacrylamide gel shows labelled RFP RNA in the last lane; the upper band corresponds to the full-length RFP (726* nt) with no ribozymes. In the following ten lanes, from 2(4) to 5(m), RFP RNA was incubated with a single ribozyme per reaction. For the ribozymes at sites 2, 4, 7 and 9, RFP was cleaved, and bands corresponding to the respective cut-sites can be observed: 153* nt (site 2), 229* nt (site 4), 393* nt (site 7) and 558* nt (site 9). In the 8th lane, it is not clear whether the ribozyme is active, because the corresponding cut-site 11 yields an RFP fragment of 702* nucleotides, which cannot be distinguished from the full-length RFP (although the band at the top does appear to be slightly lower in this lane). The absence of cleavage at cut-site 5 was expected, as the accessibility of 5(3) is 0, so the ribozyme is not able to hybridize with the target.
The other lanes are combinations of 2 or 3 ribozymes where the full length RFP is not detectable in most cases, with the corresponding bands for cleaved fragments.
Band intensity was measured by using ImageQuant to calculate the cleavage percentage in each case (every ribozyme by itself and the different combinations).
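The exact formula applied to the ImageQuant band intensities is not stated here. A common convention, assumed in the sketch below, is to express cleavage as the intensity of the cleaved-product band(s) divided by the total intensity of the lane (product plus remaining full-length substrate). The function name `cleavage_percent` and the intensity values are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Sequence


def cleavage_percent(uncleaved: float, cleaved: Sequence[float]) -> float:
    """Percentage of substrate cleaved in one lane.

    Assumptions (not taken from the source): intensities are already
    background-subtracted, and the percentage is
    100 * sum(cleaved bands) / (uncleaved band + sum(cleaved bands)).
    """
    product = sum(cleaved)
    total = uncleaved + product
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * product / total


# Hypothetical lane with one ribozyme: one full-length band, one product band.
print(cleavage_percent(250.0, [750.0]))         # 75.0

# Hypothetical lane with a combination of ribozymes: several product bands.
print(cleavage_percent(100.0, [500.0, 400.0]))  # 90.0
```

Passing the product bands as a list lets the same function cover both the single-ribozyme lanes and the lanes with combinations of 2 or 3 ribozymes.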

Sequences
The target sequences used in this study, as well as the corresponding ribozymes, can be found below. All the potential "GUC" target sites have been annotated in red. Similarly, the different regions of the ribozymes (and corresponding regions on the target RNAs) have been highlighted in different colors to help visualize the ribozymes and cleavage sites: gray, stem I; green, stem III; turquoise, stem II; yellow, conserved regions of CUGANGA and GAAA; finally, pink, nucleotides important for the tertiary interaction between stem I and stem II. For RFP, in cases where regions of complementarity with the ribozymes can vary in length depending on the ribozyme used for targeting, different shadings of the same colors are used.
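As an illustration of how the "GUC" sites and the ribozyme labels relate, the following Python sketch enumerates the GUC triplets of a target and parses a label such as "2(4)". It is not RiboSoft code: the names `find_cut_sites` and `parse_label` and the short example sequence are hypothetical. The site numbering starts at 0, as in the labels used in this study; reporting each site as a 0-based nucleotide index is an additional assumption of the sketch.

```python
import re
from typing import List, Tuple


def find_cut_sites(target: str, triplet: str = "GUC") -> List[int]:
    """Return the 0-based start index of every occurrence of the triplet.

    DNA input is accepted by converting T to U first (an assumption made
    here for convenience).
    """
    seq = target.upper().replace("T", "U")
    positions = []
    start = seq.find(triplet)
    while start != -1:
        positions.append(start)
        start = seq.find(triplet, start + 1)
    return positions


def parse_label(label: str) -> Tuple[int, int]:
    """Split a label such as '2(4)' into (cut-site number, rank)."""
    match = re.fullmatch(r"(\d+)\((\d+)\)", label.strip())
    if match is None:
        raise ValueError(f"unrecognized ribozyme label: {label!r}")
    return int(match.group(1)), int(match.group(2))


# Hypothetical 16-nt target containing three GUC triplets.
target = "AAGUCCCGUCGGGUCA"
sites = find_cut_sites(target)
print(sites)                   # [2, 7, 12]

site_number, rank = parse_label("2(4)")
print(sites[site_number])      # 12: site 2 is the third GUC
print(rank)                    # 4
```

Labels that do not follow the plain number(rank) pattern, such as the manually designed 5m, are deliberately rejected by `parse_label` rather than guessed at.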

PABPN1
Transcription product of the new target = 938 nt