BetaCavityWeb: a webserver for molecular voids and channels

Molecular cavities, which include voids and channels, are critical for molecular function. We present a webserver, BetaCavityWeb, which computes these cavities for a given molecular structure and a given spherical probe, and reports their geometrical properties: volume, boundary area, buried area, etc. The server's algorithms are based on the Voronoi diagram of atoms and its derivative construct: the beta-complex. The correctness of the computed result and computational efficiency are both mathematically guaranteed. BetaCavityWeb is freely accessible at the Voronoi Diagram Research Center (VDRC) (http://voronoi.hanyang.ac.kr/betacavityweb).


INTRODUCTION
Molecular structure determines molecular function. Of particular importance are the 'gaps' within a structure--internal voids, channels running through it, pockets in the surface--because biological molecules perform their tasks via interactions with other molecules in their environment and these interactions often take place in these spaces. This paper reports a webserver, BetaCavi-tyWeb, which recognizes voids and channels in molecular structures and measures their geometrical properties such as volume, area of cavity boundary and atoms contributing to geometric features.
Here we define a void as a cavity in a molecular interior that is not accessible to bulk solvent around the molecule and can be either hydrated or free from any solvent molecule. A channel is defined as a hole penetrating a molecular structure with two or more openings toward the exterior space through which a solvent or ligand molecule can freely pass (1,2).
The first work on the properties of voids in molecular structures was Connolly's MS program in the early 80s (3,4).
MS used a rolling ball on the surface of a molecule to define a shell corresponding to the external molecular boundary and an internal void boundary. Many methods have since been developed for identifying the empty regions, specifically for locating potential binding sites. In 1994, Kleywegt and Jones developed VOIDOO which recognized voids by embedding the molecule in a grid and checking which grid points were not within any of the atoms (5). Another method using a grid was AVP (6). The SURFNET program fitted spheres into the spaces between pairs of atoms and then identified the voids as clusters of these spheres (7). Sheffler and Baker developed RosettaHoles to recognize voids by first generating a set of void-filling balls that cover the interstitial space in a molecule and then applying the statistical learning technique of support vector machines (8). Liang et al. developed the VOLBL program using the alpha-shape which was based on the power diagram (9) and further improved into CASTp (10). Most recently the BetaVoid program was developed by Kim et al. and used the beta-complex derived from the Voronoi diagram of spherical atoms and its derivative constructs. They showed that it outperforms existing methods (11) including CASTp (10).
The earliest efforts on the recognition of molecular channels came during the mid 90s for computing the pore dimensions of ion channels. The HOLE program (12) required the user to specify an initial location within a channel and its direction. It then used Monte Carlo simulation to recognize the channel. Subsequent studies on channel recognition have been reported relatively recently. In 2006, Voss et al. showed that channel geometry is important for ribosomal polypeptides in determining biomolecular function (13). Channel geometry is also important for transmembrane proteins in selecting transport of ions, small molecules and even large molecules (14)(15)(16)(17)(18). In 2006, Damborský and colleagues reported a grid-based method for extracting cavities by tracing routes from buried active sites to the external solvent and developed CAVER (19) as a plug-in to PyMOL (20). Otyepka and colleagues developed W414 Nucleic Acids Research, 2015, Vol. 43, Web Server issue MOLE using the ordinary Voronoi diagram of atom center points (21). The MolAxis program used the alpha-shape of the molecule, where atoms were approximated by a number of identically sized balls (22). In 2009, Pellegrini-Calace et al. reported the PoreWalker server which computed channels in transmembrane proteins. The method took advantage of the knowledge that secondary structures are usually aligned with the medial axis of the channel, and additionally used to geometric information (23). An important step forward was made by Hege and colleagues in 2011 (24) who adapted an algorithm which traced the edges in the Voronoi diagram of atoms (25). In 2012, Otyepka's group reported the MOLEonline 2.0 server, which was an improved version of the original MOLE program mentioned above (26), and Damborský's group produced a new version of CAVER, as CAVER 3.0, which could analyze both static structures and molecular dynamic trajectories, now using the ordinary Voronoi diagram of points by approximating atoms with several spherical balls with identical radii (27).
There are two critical issues in cavity recognition. First, an accurate, efficient and convenient approach is necessary for recognizing and measuring cavities. Vlassi et al. have pointed out the difficulty of comparing void sizes produced from different programs due to the different computational methods and parameters used (28). They concluded that cavity volume was a poor parameter for analyzing structural responses to mutation because of the unreliable and somewhat contradictory computational results. This suggests that a more accurate and reliable mathematical method and its implementation is necessary. Second, there are various types of information to be analyzed and reported regarding cavities. The volume and area are only two of these. For example, Hubbard et al. pointed out that there were numerous types of analysis possible regarding molecular voids (29): position, size and shape, number of voids, polarity, packing of solvent in cavities, distribution of cavity volumes for solvated and empty cavities, amino acid preferences for accessible and buried protein surfaces, hydrogen bonding of polar atoms and solvent within voids, mobility of cavity atoms, etc. For this purpose, it is necessary to have a formal representation of molecular structure. These two issues are the motivation for BetaCavityWeb. Voids and channels are indeed important quality measures of computational protein design (30). BetaCavityWeb is accurate, efficient and convenient and is freely available at the VDRC (http://voronoi.hanyang.ac.kr/betacavityweb).

Voronoi diagrams, quasi-triangulations and beta-complexes
Let A = {a 1 , a 2 , . . . , a n } be a set of three-dimensional spherical atoms where a i = (c i , r i ) is an atom with center c i and vdW-radius r i . Consider A a molecule. Let VC(a i ) be the Voronoi cell for a i defined as Then, the Voronoi diagram VD for the atom set A is defined as VD = {VC(a 1 ), VC(a 2 ), · · · , VC(a n )} where the connectivity among the topological entities are appropriately represented. In the three-dimensional space, VD can be represented as the set of Voronoi faces and C V is the set of Voronoi cells. The topology among vertices, edges, faces and cells in VD are properly maintained in the radial-edge data structure (31). Suppose that A O is an offset model where a O i ∈ A O is an offset atom where its radius is increased by a constant amount δ from the atom a i ∈ A. Then, it is known that the topology of the Voronoi diagram VD O for A O is identical to that of VD. Thus, we call the Voronoi diagram of atoms VD offset-invariant. Formally, in the computational geometry community, VD is called the additively weighted Voronoi diagram. Note that VD in Figure 1(c) is different from both the ordinary Voronoi diagram of points where the points correspond to atom centers in Figure 1(a) and the power diagram in Figure 1(b). For the details of VD and its algorithm, refer to (25,32) and for the Voronoi diagram in general, refer to (33). Figure 1(c) shows the VD of a two-dimensional atom set A which consists of six circular disks. Figure 1(d) shows the two offset curves for different offset amounts where the intersections between offset circles are always placed on the Voronoi edges.
The quasi-triangulation QT is the dual structure of VD and is represented as gulation which is the dual structure of the ordinary Voronoi diagram. VD is offset-invariant and so is QT . For the details of QT , see (34)(35)(36). Figure 1(e) shows QT of A.
The beta-complex BC is defined from QT when a spherical probe with radius ␤ is given. Consider an edge e in QT and the two atoms corresponding to the vertices of e. If the probe can pass between the two atoms without any intersection, we remove e from QT . Similarly, if the probe can pass among the three atoms corresponding to the vertices of a triangular face f in QT , we remove f from QT . If we apply this operation for all the simplexes in QT , we eventually have the beta-complex BC which is therefore a subset of QT . The region of Euclidean space bounded by BC is called the beta-shape BS. Hence, each simplex on the boundary of BS determines the proximity among the atoms on the boundary of the molecule where the boundary is defined by the probe. Each simplex in BC determines the proximity among all atoms on and within the boundary of the molecular structure. We emphasize here that the beta-complex can be computed very efficiently from the quasi-triangulation. For details, see (37). Figure 1(f) shows an example of BC for β-value. Note that Figure 1 was drawn by BetaConcept (38).

The BetaCavity algorithm
Consider a vdW-molecule A and its offset model A O , offset by an amount δ ≥ 0. The boundary ∂A O of the offset model is equivalent to the Lee-Richards (solvent accessible) surface for a solvent molecule probe with the radius δ.  (39) from which we quote the following theorem fundamental for the BetaCavityWeb server. Note that the BetaVoid program is the implementation for voids (11). THEOREM 1. The Voronoi complement Vor C and the space external to an offset model are homotopy equivalent.
Therefore, a Voronoi complement and the exterior of an offset model have an identical topological property and its structure can be used to correctly recognize the voids and channels of the offset model. A Voronoi complement consists of one or more components. If there is only one component, it corresponds to the external space possibly with channels without any void. If there are two or more components, the one connected to infinity corresponds to the external space and each of the others corresponds to a void. Given Vor C = {ξ 1 , ξ 2 , . . . } where ξ i is a component, we classify its components corresponding to the unbounded external region and voids.
There are two types of Voronoi edges in the component ξ of Vor C for the unbounded region: those intersecting the boundary of an offset model and those non-intersecting. Similarly, there are two types of Voronoi faces: those intersecting and those non-intersecting Voronoi faces. Given ξ , we remove the intersecting edges and faces. Then, the contraction of each Voronoi face to one of its bounding Vornoi W416 Nucleic Acids Research, 2015, Vol. 43, Web Server issue edges reduces ξ to a graph called the Voronoi graph which consists of vertices and edges. Then, the Euler-Poincaré formula of the Voronoi graph of ξ gives the information about the topology of the channel (i.e. the existence of handles). In the case of channels, the Voronoi graph is called its spine and reveals the topology of the channel. For details about the Euler-Poincaré formula, see (31). The volume and area of each cavity can be correctly and efficiently computed by applying the beta-decomposition algorithm (40). The Supplementary material illustrates the recognition of channels and voids using two example molecules.

The functions of the BetaCavityWeb server
BetaCavityWeb has two ways of taking an input molecular structure. As the server mirrors the molecular structures in the PDB (41), users can simply enter a PDB accession code. Alternatively, users may upload their own molecular structures stored in PDB format. Users can indicate whether they wish to compute voids and/or channels with respect to a solvent probe with a specified radius, ␤.
BetaCavityWeb produces both text and graphics output as shown in Figure 2 (For 1jd0, see (42); For the role of the channel in carbonic anhydrase, see (43)). The textual output consists of three sections: the header, containing overall information about the session, and two other sections containing information on the voids and channels in the structure. Statistics such as the van der Waals volume and area and the computation time are also reported. Each of the void and channel sections contain information such as the number of cavities, their geometrical properties, a list of atoms defining their boundary, etc.
A probe radius of zero corresponds to the van der Waals surface and thus the recognized cavities are the van der Waals cavities. If the probe radius is nonzero, the recognized cavities are the Lee-Richards (solvent accessible) cavities. In the case of channels, BetaCavityWeb further reports the topological properties of each one: the number of openings (i.e. entrances and exits), the number of topological handles within the channel, the number of atoms contributing to the channel boundary, the atom triplets defining an area on the channel boundary, etc. BetaCavityWeb also reports channel spines and the bottleneck of each channel (i.e. its narrowest point). We emphasize that solution correctness and computational efficiency are mathematically guaranteed. Beta-CavityWeb allows a user to choose individual cavities for further analysis. The textual output can be customized by the user before downloading.
The graphical output, which employs JSmol (44), is shown in Figure 2. The computed voids and channels can be displayed together with a molecular structure represented by a space-filling, ball-and-stick, stick, or line model. Figure  3(a) and (b) show the computed voids in the Lee-Richards solvent accessible surface and the atoms contributing to the boundary of the largest void, respectively. Channels can be visualized in three different ways: (i) a spine, (ii) a radiusvarying ball sweeping through a spine where the radius is determined by the perpendicular distance from its center to the boundary of nearby atoms and (iii) the atoms contributing to channel boundary. Figure 4 shows channels: (a)

The components of BetaCavityWeb
BetaCavityWeb is composed of four components: (i) a geometric kernel, (ii) a trimmer of the Voronoi diagram, (iii) a classifier of the Voronoi graph and (iv) an evaluator of geometric properties. The geometric kernel computes the Voronoi diagram, transforms to a quasi-triangulation and extracts the beta-complex. The trimmer computes the Voronoi complement by trimming the Voronoi structure intersecting a van der Waals molecule or a Lee-Richards solvent accessible surface model (i.e. the offset model) and converts the Voronoi complement into the Voronoi graph. The   classifier parses the Voronoi graph to recognize voids and channels. The evaluator computes geometrical properties such as volume, boundary area, etc. of the recognized voids and channels. Figure 5 shows how these components are related: arrows denote the computational logic and data flow.

QTDB: computational acceleration
The timing for the computation of the quasi-triangulation for molecular structures in the PDB is important. In most cases, a molecule of interest might have several analyses performed. In such cases the quasi-triangulation only needs to be computed once and can be reused for subsequent analyses. Hence, it is convenient to compute the quasitriangulation in a preprocessing stage and store it in a database so it can be recalled when required. This approach is possible because the Voronoi diagram of atoms and the quasi-triangulation are offset invariant. For this purpose, we have defined a quasi-triangulation file format (QTF) (45) to store the data in a quasi-triangulation database (QTDB), available from the VDRC. Users can simply download the QTF file corresponding to the PDB file of interest and convert it to a Voronoi diagram. The QTF file takes O(m) memory for a quasi-triangulation with m simplexes and the conversion from a quasi-triangulation to the Voronoi diagram or vice versa takes O(m) time in the worst case. If it is necessary or desirable, users can build their own QTDB by running the QTFier program available from the VDRC. Figure  6 shows this approach.

CONCLUSIONS
We report the BetaCavityWeb server which recognizes molecular voids and channels and computes their geometric properties. BetaCavityWeb is based on the Voronoi diagram of atoms, its quasi-triangulation and the betacomplex whose properties are all mathematically proven. The algorithms used in the computation are correct and efficient with mathematical guarantee. With the BetaCav-ityWeb server, researchers can easily and freely access the powerful capabilities of the Voronoi diagram of atoms to analyze molecular voids and channels.