## Abstract

Primary visual cortex contains multiple maps of features of the visual scene, including visual field position, orientation, direction, ocular dominance and spatial frequency. The complex relationships between these maps provide clues to the strategies the cortex uses for representing and processing information. Here we simulate the combined development of all these map systems using a computational model, the elastic net. We show that this model robustly produces combined maps of these four variables that bear a close resemblance to experimental maps. In addition we show that the experimentally observed effects of monocular deprivation and single-orientation rearing can be reproduced in this model, and we make some testable predictions. These results provide strong support for the hypothesis that cortical representations attempt to optimize a trade-off between coverage and continuity.

## Introduction

The receptive fields of neurons in the primary visual cortex of cats, monkeys and ferrets are selective for several different features of the visual input. These include visual field position (VF), ocular dominance (OD), orientation (OR), direction of motion (DR) and spatial frequency (SF) (Hubel and Wiesel, 1977; Bonhoeffer and Grinvald, 1991; Shmuel and Grinvald, 1996; Weliky *et al.*, 1996; Shoham *et al.*, 1997; Hübener *et al.*, 1997; Kim *et al.*, 1999). The preferred stimulus for each variable tends to change smoothly as one moves across the surface of the cortex, with occasional discontinuities. This can be thought of as a ‘dimension-reducing mapping’, whereby a space of features containing multiple dimensions (VF, OD, OR, DR, SF, etc.) is compressed onto an essentially two-dimensional sheet of neurons. The result is a set of coexisting maps of visual features with a rich set of geometrical relationships. For instance, OR pinwheels tend to lie in the center of OD columns, and maps tend to intersect at right angles (Bartfeld and Grinvald, 1992; Obermayer and Blasdel, 1993; Hübener *et al.*, 1997; Kim *et al.*, 1999). It is generally believed that the source of these relationships is the mechanisms by which visual cortical maps are generated during development (e.g. Wolf and Geisel, 1998). Thus, the structure of these maps provides an important constraint on theories for how maps form.

Several computational models have made important contributions with regard to understanding the structure of OD and OR maps (reviewed in Erwin *et al.*, 1995; Swindale, 1996; Carreira-Perpiñán and Goodhill, 2002). In particular, ‘low dimensional’ or ‘feature space’ models (Durbin and Mitchison, 1990; Goodhill and Willshaw, 1990; Obermayer *et al.*, 1992; Goodhill *et al.*, 1997; Swindale and Bauer, 1998; Goodhill and Cimponeriu, 2000; Carreira-Perpiñán and Goodhill, 2004), generate maps that closely reproduce the structure of OR and OD maps (Erwin *et al.*, 1995; Swindale, 1996). However, the application of these models has been limited up to now to only one or two features (OD and/or OR) in addition to VF. This naturally raises three questions: (i) can such models still successfully reproduce map structure when multiple features are considered; (ii) what quantitative predictions do these models make in this case; and (iii) what light does the behavior of models in this case shed on the biological mechanisms underlying map formation. In this paper we address these questions by simulating the combined development of VF, OD, OR, DR and SF using the elastic net model (Durbin and Willshaw, 1987). This is a feature space model which works by minimizing an objective function that explicitly trades off coverage versus continuity constraints (Swindale, 1996). In addition, we show that in this multiple map case the elastic net can also reproduce recent experimental results regarding monocular deprivation (Crair *et al.*, 1997) and single orientation rearing (Sengpiel *et al.*, 1999). Taken together, our results extend the range of phenomena to which computational modeling of map formation in visual cortex has been applied. Most importantly, they provide strong additional support for the hypothesis that the computational principles underlying algorithms such as the elastic net capture the essential biological principles underlying map formation in the real cortex.

## Materials and Methods

### The Elastic Net Algorithm

The elastic net (Durbin and Willshaw, 1987; Durbin and Mitchison, 1990; Carreira-Perpiñán and Goodhill, 2002) produces maps that minimize a tradeoff *E* = α*C* + (β/2)*R* between coverage *C* of the stimulus space and continuity *R* of the cortical representation. The *coverage* term is defined as follows:

**x**

_{n}represents a stimulus space vector,

**y**

_{m}an elastic net centroid (stimulus preference of neuron

*m*), and

*K*a receptive field size (‖·‖ represents the Euclidean norm, or length, of a vector). Note we use the term ‘receptive field’ to refer to all features jointly (VF, OD, OR, DR, SF), and not only to the visual field. The continuity term was originally defined as

*M*centroids, representing a square array of cortical neurons (the depth dimension of the cortex is not modeled).

### The Stimulus Space

The stimulus space consists of feature points evenly spaced along orthogonal feature dimensions. For visual field (VFx, VFy) the stimulus values (training set) are a grid of *N*_{x} × *N*_{y} points in the rectangle [0, 1] × [0, 1]. Orientation preference and selectivity (OR, ORr) are represented by adding dimensions of NOR × 1 points in [(−π/2), (π/2)] × {*r*_{OR}}, where OR is conventionally represented by two variables in polar coordinates, with *N*_{OR} values uniformly arranged on a ring of radius *r*_{OR}. Direction preference and selectivity (DR, DRr) are represented by adding a dimension of points {OR − (π/2),OR + (π/2)} on a ring of radius *r*_{DR}, as in Swindale and Bauer (1998). The ocular dominance (OD) dimension has *N*_{OD} values in [−*l*_{OD}, *l*_{OD}]. Spatial frequency (SF) is represented by *N*_{SF} values in [−*l*_{SF}, *l*_{SF}]. Consistent with several experimental reports we take *N*_{SF} = 2, representing ‘high’ and ‘low’ frequencies (Hübener *et al.*, 1997; Shoham *et al.*, 1997; Kim *et al.*, 1999). [It should be noted that others have suggested the representation of SF may be more continuous (Silverman *et al.*, 1989; Everson *et al.*, 1998; Issa *et al.*, 2000)].

### Training Regime

We trained nets with the following configuration. Training set: *N*_{x} = *N*_{y} = 20, *N*_{OR} = 6, *N*_{DR} = 2, *N*_{OD} = 2, *N*_{SF} = 2 (a total of 19 200 training points); *l*_{OD} = 0.06 (*l*_{OD} = 0.08 for strabismus), *l*_{SF} = 0.06, *r*_{OR} = 0.08, and *r*_{DR} = 0.08. Elastic net: *M* = 128 × 128 = 16 384 centroids; α = 1, β = 10; nonperiodic boundary conditions. The net was started from a random initial starting configuration with some global topographic bias; where results have been averaged over multiple simulations this corresponds to taking different random initial starting configurations for the net. As is common in the elastic net, the minimization of the energy is interleaved with decreasing (annealing) *K*. We annealed *K* from a starting point of 0.2 with a rate of 0.9925 to the point at which the maps have just arisen (*K* ≈ 0.03). An efficient minimization method based on Cholesky factorization was used (Carreira-Perpiñán and Goodhill, 2004). The simulations were performed using custom software written in Matlab.

### Monocular Deprivation and Single–orientation Rearing

For the default case (no deprivation) the coverage term α was 1 for all stimulus points **x**_{n}. However, in order to implement deprivation of some stimulus (e.g. OD), α was redefined as a vector with *N* components, where the value of the component α_{n} represents the relative strength with which the stimulus point **x**_{n} is represented in the input. Thus, for monocular deprivation we took α_{n} = dep_{OD} ∈ (0, 1) for each **x**_{n} matching the deprived eye. In order to implement enhancement of OR for single orientation rearing, we took α_{n} = dep_{OR} > 1 for each **x**_{n} matching the enhanced angle of OR. In order to capture the experimental observation that SF maps consist of patches of low frequency in a broader expanse of high frequency (Shoham *et al.*, 1997), one value of SF was ‘deprived’ with dep_{SF} = 0.5 for all simulations.

### Column Width

The power of the discrete Fourier transform of the angle maps was roughly isotropic and concentrated around a ring. We summarized it by the mean wavelength (mean column width), averaged over all directions.

### Crossing Angles

We disregarded the points lying 5 pixels or less from the boundaries of the net (a thin inner stripe framing the maps in e.g. Fig. 1) to eliminate border effects. At each pixel, we computed the angle between the gradient vectors of pairs of dimensions (see below) and mapped it to [0°, 90°], thus obtaining the angle between contours for the dimensions. Each such angle counted with a weight proportional to the product of the gradient moduli of the dimensions, e.g. ‖∇OR‖ × ‖∇OD‖ for OR and OD, so that pixels lying in areas of either constant OR or constant OD (i.e. away from borders of OR or contours of OD) were effectively removed from the histogram. This is because, in an area where either map is nearly constant, the gradient vector is negligible in modulus and its direction basically arbitrary; unlike along borders, where the gradient vector is large and its direction well defined. We refer to the graphs we plot as histograms, even though in a strict sense they are not.

**Figure 1.**

**Figure 1.**

### Pinwheel Distributions

An OR singularity, or pinwheel, is defined positive if the orientation angle increases in a clockwise direction around the pinwheel and negative if anticlockwise. Pinwheels were automatically located in the OR maps as follows. First, the winding number of each pixel in the OR map was estimated by summing the increments of OR angle (in [(−π/2), (π/2)]) along a closed path (a square of radius 1 pixel centered in the pixel considered) in a clockwise direction and dividing the result by 2π; this results in 0 for non-pinwheel points and +1/2 (−1/2) for pixels at or adjacent to a positive (negative) pinwheel, respectively. The exact pinwheel location was obtained by grouping clusters of nonzero winding number and computing their centers of mass. To quantify the layout of pinwheels with respect to OD or SF columns, we computed, for every pinwheel, the distribution of distances of a given pinwheel to its closest OD or SF border. We prefer this to calculating distances to OD centers since the ‘center of an OD/SF column’ is generally not well defined, except for columns which are translations of each other, and becomes difficult to use with columns of changing width, forks, islands or other complex shapes. In contrast, the OD/SF borders are well defined in all these cases. To compute the distances between a pinwheel and its closest OD or SF border, we represented the OD/SF borders by a finely spaced collection of points (the contours generated by Matlab for the OD/SF map); the said distance was then given by the point in this collection closest to the given pinwheel. We considered that a pinwheel lay on an OD/SF border if the distance between the two was 1 pixel or less.

## Results

### Multiple Maps

In a feature space model such as the elastic net, feature dimensions are represented as spatial dimensions in a Euclidean space. Thus there are, for instance, two dimensions for VF, and one for OD. Since OR is periodic it is usually represented by a ring in two dimensions, making five dimensions. To this we added two dimensions representing DR (also periodic), and one representing SF, making eight dimensions in total. The precise distribution of feature points in each of these dimensions is described in Materials and Methods.

Figure 1 shows an initial simulation of VF and OR, and the effect of adding dimensions representing DR, OD and SF respectively in subsequent simulations. The top row corresponds to OR simulated just with VF, with subsequent rows showing the addition of DR, OD and SF respectively. It is apparent that adding dimensions does not change the qualitative character of maps, but does tend to reduce their wavelengths in the model (quantified in Fig. 2). Looking at the last row of Figure 1, it is apparent that the general structure of all four of the OR, DR, OD and SF maps have a good correspondence, at least qualitatively, with experimental data (Shmuel and Grinvald, 1996; Weliky *et al.*, 1996; Shoham *et al.*, 1997; Hübener *et al.*, 1997; Kim *et al.*, 1999). (Simulations started with different random seeds gave maps with similar global characteristics.) Also in Figure 1 is inset a joint contour map for all four columnar systems, to show the extent to which they intersect at right angles, and the relation of OR pinwheels to OD and SF columns (these properties are quantitatively analyzed in Fig. 3). OR, OD and SF all show strong orthogonality to each other, as observed experimentally (Hübener *et al.*, 1997).

**Figure 2.**

**Figure 2.**

**Figure 3.**

**Figure 3.**

Figure 2*A*–*C* looks at some specific relationships between the DR and OR maps, and between OR and SF, when all maps are simulated together. It can be seen that the DR map has fractures (lines of low DR modulus, which correlate with where DR angle reverses direction), and that these fractures connect OR pinwheels (Shmuel and Grinvald, 1996; Swindale and Bauer, 1998). Away from DR fractures, contours of OR and DR run parallel. DR fractures, on the other hand, run orthogonal to contours of OR, since they connect pinwheels. The OR and SF maps tend to intersect orthogonally (Hübener *et al.*, 1997). Figure 2*D* shows the mean wavelength for each dimension as the number of dimensions is increased. It is apparent that adding dimensions reduces the mean wavelength for all previous maps. This effect is less pronounced if the map added has strong ‘deprivation’, as seen with decreased gradient for addition of SF (dep_{SF} = 0.5 in this case; with dep_{SF} = 0.3 there is no wavelength reduction).

Figure 3 shows a quantitative analysis of the effect of adding feature dimensions on the distribution of interpinwheel distances, distances of pinwheels to OD and SF borders, and intersection angles between columns. As in Figure 1 successive rows show the cases of OR simulated alone (i.e. with VF but without DR, OD and SF); OR and DR together; OR, DR and OD; and finally all four feature dimensions (OR, DR, OD, SF). The first column shows histograms of distance between pairs of pinwheels that are nearest neighbors. As in the experimental data (Müller *et al.*, 2000), singularities repel each other compared to a random distribution, though the repulsion weakens as more maps are added. For OR/VF simulated alone, the percentage of nearest-neighbor pinwheels that are of the same sign is 18.7 ± 1.3%, which matches well with experimentally determined values of 19.7 ± 1.5% (ferret) and 21.4 ± 2.0% (cat) (Müller *et al.*, 2000). However, in the simulations this rises to 30.9 ± 1.1% when all features are simulated.

The second column of Figure 3 shows histograms of OR pinwheel distance to OD border and SF border (the OD case is shown for the simulations with and without SF). Distances are normalized by mean wavelength of the OD/SF map respectively, so that a normalized distance of 0.25 roughly corresponds to the ‘center’ of the OD/SF columns (we prefer to use distance of pinwheels to column borders rather than column ‘centers’ since borders have a less ambiguous definition than ‘centers’; see Materials and Methods). Pinwheels tend to lie away from the borders of OD and SF, and the histogram of pinwheel to border distance reliably has a strong peak at slightly less than 0.2 of mean wavelength. Adding SF reduces this distance slightly for OD, i.e. moves OR pinwheels slightly closer to OD borders. It is hard to make a precise quantitative comparison with experimental data due to ambiguities in the definition of ‘distance to column center’, but at least qualitatively there is a good match (Bartfeld and Grinvald, 1992; Obermayer and Blasdel, 1993; Hübener *et al.*, 1997).

The third column of Figure 3 shows intersection angles for map pairs. This shows the tendency to orthogonality for all combinations except OR and DR, where the intersection angles are low. It is clear that the shape of the intersection angles histogram for each pair of maps is very robust to the presence of additional feature dimensions. In order to make a direct comparison with numerical values quoted experimentally, Table 1 shows mean intersection angles (scalar average) for the simulations. It can be seen that adding features causes maps to intersect at very slightly shallower mean angles. The mean angle for OD/OR reported by Hübener *et al*. (1997) was 51.7 ± 0.8°, which appears statistically indistinguishable from our value of 52.5 ± 0.2° when all maps are simulated together. Our value for SF/OR is 52.0 ± 0.2°; that found by Hübener *et al*. (1997) was 49.7 ± 0.8°, which they did not report as significantly different from their OD/OR value. It should be noted though that such means are only a very crude way of characterizing these angle distributions, and that 90° is the modal angle of intersection.

**Table 1**

Model of OR + DR | Model of OD + DR + OD | Model of OR + DR + OD + SF | Experimental results of Hübener et al. (1997) | |
---|---|---|---|---|

OR/DR | 19.7 ± 0.2 | 16.9 ± 0.2 | 15.6 ± 0.1 | |

OD/OR | 54.6 ± 0.2 | 52.5 ± 0.2 | 51.7 ± 0.8 | |

OD/DR | 56.7 ± 0.2 | 54.1 ± 0.2 | ||

OD/SF | 47.2 ± 0.3 | |||

SF/OR | 52.0 ± 0.2 | 49.7 ± 0.8 | ||

SF/DR | 52.6 ± 0.1 |

Model of OR + DR | Model of OD + DR + OD | Model of OR + DR + OD + SF | Experimental results of Hübener et al. (1997) | |
---|---|---|---|---|

OR/DR | 19.7 ± 0.2 | 16.9 ± 0.2 | 15.6 ± 0.1 | |

OD/OR | 54.6 ± 0.2 | 52.5 ± 0.2 | 51.7 ± 0.8 | |

OD/DR | 56.7 ± 0.2 | 54.1 ± 0.2 | ||

OD/SF | 47.2 ± 0.3 | |||

SF/OR | 52.0 ± 0.2 | 49.7 ± 0.8 | ||

SF/DR | 52.6 ± 0.1 |

Another characteristic of visual cortical maps is the rate of change of features with respect to other features, particularly the relation between VF and OR. That is, what is the change in the VF preference versus OR preference between neighboring cortical locations? Figure 4 shows this relationship, and those for VF/OD and OR/OD, as maps are added. In general few consistent relationships are apparent. A specific change in OR can be associated with any change in VF. Large changes in VF tend to be associated with small changes in OR when OR and VF are simulated alone, but this trend weakens considerably when DR and SF are added. These results fit better with recent reports of a lack of consistent relationship between VF and OR gradients (Hetherington and Swindale, 1999; White *et al.*, 2001; Bosking *et al.*, 2002; Buzás *et al.*, 2003) than with the finding of Das and Gilbert (1997) of a strong positive correlation. Although early results with the elastic net (Durbin and Mitchison, 1990) were interpreted as implying a strong negative correlation (Das and Gilbert, 1997) this had not actually been quantified until now, and our results show a rather more subtle picture.

**Figure 4.**

**Figure 4.**

### Strabismus

Strabismus has previously been modeled in the elastic net by increasing the spacing of points along the OD dimension (Goodhill and Löwel, 1995). However, the effect on OR, DR ad SF maps of strabismus in the model was not investigated. We therefore repeated the simulations described above for the strabismic case (see Materials and Methods). Although as previously reported, OD columns become wider (Goodhill and Löwel, 1995), no significant differences were found in the interrelationships between maps reported above for the non-strabismic case (data not shown). This is consistent with the experimental finding that strabismus does not greatly alter the relationships between OR and OD maps (Löwel *et al.*, 1998).

### Monocular Deprivation and Single-orientation Rearing

In addition to the normal development simulated above, we were also interested to examine abnormal development in response to visual deprivation. In particular we focused on the experimentally well-characterized phenomenon of monocular deprivation (MD) (reviewed in Hubel and Wiesel, 1977; Katz and Shatz, 1996), and the more recently discovered phenomenon that single-orientation rearing (SOR) leads to an over-representation of that orientation in the OR map (Sengpiel *et al.*, 1999). In each case, deprivation was modeled by changing the strength of the coverage term in the elastic net energy function relative to the continuity term (see Materials and Methods). In effect, we decreased the influence of feature points representing inputs in the deprived eye for the cortex relative to the open eye for MD, and increased the influence of feature points representing the ‘single’ orientation for the cortex relative to other orientations for SOR.

Goodhill and Willshaw (1994) modeled MD in the elastic net using the same method of reduced influence for a feature space consisting only of VF and OD. However, they examined only constant MD existing throughout development. Figure 5 shows a more complete analysis of the effect of MD with varying-length time windows and start times for the deprivation (here OR was simulated in addition to OD — for analysis of the relationship between OR and OD see Fig. 6). It can be seen that, analogously to experimental data (reviewed in Hubel and Wiesel, 1977; Katz and Shatz, 1996), OD deprivation has significant effects only in a ‘critical period’ located shortly after the moment the OD map starts to arise but before it is fully developed, with OD deprivation either before or after having little effect.

**Figure 5.**

**Figure 5.**

**Figure 6.**

**Figure 6.**

Figures 6 and 7 examine the effect of MD on the structure of OR, DR and SF maps. Although the immediate qualitative impression is that MD has little effect on the structure of other maps, there are subtle changes. Perhaps the most noticeable effect is seen in the map of OR + OD contours: the islands of cortex still dominated by the deprived eye tend to colocalize with pinwheels (Fig. 7*B*), as seen experimentally (Crair *et al.*, 1997). Besides, column widths for the OR, DR and SF maps increase slightly as the deprivation ratio increases (Fig. 7*C*). This can be explained as follows in the coverage-continuity framework (see Materials and Methods). We know that increasing the value of the ratio β/α (within a range for which striped maps arise) increases the width of the columns for all maps (Carreira-Perpiñán and Goodhill, 2004). In MD, we decrease the value of α (more precisely, the value of α_{n} for the deprived eye stimuli) in equation (1), which has the effect of increasing β/α and so increasing the width of the columns for all maps. We also found a broadening of the histogram of distances from pinwheels to OD and SF borders for the largest deprivation ratio, as shown in Figure 7*E*,*F*. For increasing deprivation, pinwheels move even closer to the center of ocular dominance columns, and the peak becomes wider, indicating a larger dispersion. This is because when the deprivation is high, the deprived-eye OD domains shrink and the distance to pinwheels inside them decreases; but the normal-eye domains expand and the distance to pinwheels inside them increases.

**Figure 7.**

**Figure 7.**

Figure 8 shows a related influence for the structure of visual input on OR map structure via single orientation rearing (Sengpiel *et al.*, 1999). As observed biologically, the greater the ‘over-representation’ of one orientation in the input, the greater the area of cortex that is dominated by that orientation in the final map. The OD, DR and SF maps are not shown: as for MD, they were largely unaffected.

**Figure 8.**

**Figure 8.**

## Discussion

Computational theories of visual map development have been proposed at several levels of abstraction. While those towards the more realistic end of the spectrum, such as ‘high-dimensional’ models (e.g. Miller *et al.*, 1989; Goodhill, 1993), are easier to interpret biologically, they have not been very successful at reproducing the geometrical relationships described above. More successful in this regard have been self-organizing ‘feature space’ algorithms such as the elastic net and Kohonen algorithm, which were first applied to the development of visual cortical maps ∼15 years ago (Durbin and Mitchison, 1990; Goodhill and Willshaw, 1990; Obermayer *et al.*, 1990, 1992). These early applications considered only the development of one columnar system, either OR or OD, in addition to VF. Subsequently the joint development of two columnar systems in addition to VF was considered (OR and OD: Erwin *et al.*, 1995; Wolf and Geisel, 1998; Goodhill and Cimponeriu, 2000; OR and DR: Swindale and Bauer, 1998), and several detailed properties of biological maps were reproduced in these studies. Here we have presented the first application of such an algorithm to the joint development of four columnar systems in addition to VF. It was not obvious *a priori* that such algorithms would still produce biologically relevant results in such a relatively high-dimensional space. However, we have shown that the elastic net can do a remarkably good job of reproducing a wide range of experimentally observed phenomena in this case. We have also shown that it can reproduce recently observed phenomena concerning MD and SOR. The main experimental facts the model reproduces even when multiple maps are simulated together are now summarized.While it was already known that some of these map properties could be produced by feature-space models (Erwin *et al.*, 1995; Swindale, 1996; Swindale and Bauer, 1998), we have shown that these results are robust to the addition of extra feature dimensions. Further, we have shown that the relationships between maps produced by the model are generally robust to perturbations in input activity during development including strabismus, MD and SOR. In particular, the model predicts that these perturbations have little effect on the structure of SF and DR maps. This could relatively straightforwardly be tested by imaging all four of the maps OR, DR, OD and SF after each of these perturbations. The model also predicts a slight increase in the wavelengths of the maps of OR, DR and SF after MD. From the coverage-continuity point of view, this results from the fact that the monocular deprivation effectively increases the influence of the continuity term.

*Orientation and ocular dominance:*The model reproduces the tendency of OR and OD columns to intersect at steep angles (Obermayer and Blasdel, 1993), and for pinwheels to lie far from OD borders (Bartfeld and Grinvald, 1992; Obermayer and Blasdel, 1993; Hübener*et al.*, 1997).*Direction:*In the model the DR map has fractures rather than pinwheels and pinwheels tend to be connected by fractures (Shmuel and Grinvald, 1996; Weliky*et al.*, 1996).*Spatial frequency:*The model reproduces the tendency of OR and OD columns to intersect SF columns at steep angles, and of OR pinwheels to lie far from SF borders (Hübener*et al.*, 1997).*Monocular deprivation:*MD during a critical period of development produces a shrinkage of OD domains from the deprived eye, with a magnitude proportional to the strength and duration of the deprivation. Pinwheels tend to colocalize with deprived eye patches (Crair*et al.*, 1997).*Single orientation rearing:*SOR produces an expansion of OR domains for the overrepresented orientation (Sengpiel*et al.*, 1999).

The adult arrangement of maps in visual cortex has also been analyzed from a wiring optimization perspective (Koulakov and Chklovskii, 2001, 2002; Chklovskii and Koulakov, 2004). Although this approach does not address the development of maps, and also has not so far included topography as a variable to be mapped, it can successfully reproduce some of the experimental findings we have addressed here with the elastic net. In fact recent work has shown both a formal link between wiring optimization, Mexican-hat lateral interactions and the elastic net continuity term (Carreira-Perpiñán and Goodhill, 2004), and that subtle differences in patterns of intracortical connectivity can have a strong influence on the relationships between individual maps.

The only other study to have examined the development of multiple columnar systems in visual cortex using a feature space model is that of Swindale (2000). Here the development of up to nine feature dimensions in addition to VF was simulated using Kohonen's algorithm. However, the approach was more abstract than ours in that all feature dimensions were binary, and were not interpreted in terms of specific visual features. Swindale found that the structure of each map individually, including its wavelength, was fairly robust to additional maps, but that the relationships between maps (specifically intersection angles) changed in a gradual way as more maps were added. These results differ somewhat from our own results: we found wavelengths to decrease as maps are added, yet more robustness of intersection angles as maps are added. We did, however, find a shift in the location of pinwheels relative to OD columns with the addition of the SF dimension [as no periodic dimensions were simulated in Swindale (2000) the question of the location of pinwheels relative to other maps did not arise in that case]. The reason for the difference in results between these two studies is unclear, but it could be a manifestation of the fact that, although the elastic net and Kohonen algorithms are similar in flavor, they can behave differently in fine details.

What light do our results shed on the biological mechanisms underlying map formation? The elastic net represents a particular mathematical instantiation of the hypothesis that visual cortical maps are the result of an optimization process. In particular, the elastic net attempts to jointly optimize both coverage, the degree to which all input features are uniformly represented, and continuity, the degree to which the spatial representation of features is ‘smooth’ in some sense (see Materials and Methods and Carreira-Perpiñán and Goodhill, 2002). It was already known that such optimization hypotheses are sufficient to generate the detailed structure of OR and OD maps under normal conditions (Erwin *et al.*, 1995; Swindale, 1996). Our results now show that optimization principles based on the coverage–continuity trade-off are also sufficient to reproduce a large array of additional data, both when additional maps are considered, and when particular types of visual deprivation are introduced.

How might the visual cortex actually implement such optimization principles? One way the elastic net equations can be interpreted biologically is in terms of a Hebbian process: the simple gradient descent learning rule for optimizing the elastic net objective function is equivalent to an activity-dependent strengthening of connections between the input feature that is presented at each instant and the neurons that respond most strongly to that feature. In contrast, evidence is now mounting that at least some properties of visual map formation are activity-independent (e.g. Katz and Crowley, 2002). However, the elastic net equations can be interpreted in other ways, and in fact the algorithm originally grew out of an activity-independent model that relied on the matching of molecular cues (Willshaw and von der Malsburg, 1979). The fact that such optimization models work so well, and are now the only developmental models available for reproducing the wide array of map properties addressed in this paper, suggest that the optimization hypothesis is key to understanding visual cortex, independent of details of how precisely it might be implemented.

We thank David Swartz for help with some of the simulations. Supported by grants NIH EY12544 and NSF BITS0130822.