Abstract submitted to the 7th Annual Computational Neuroscience Meeting (CNS*1998)

Presented at CNS*1998
Santa Barbara (CA), July 26th-30th

The demonstration, a real-time simulation of the model, ran on a PowerBook under Unix (Tenon's MachTen).

Figure-ground segregation of coherent motion in V1:

A model based on the role of intra-cortical and extra-cortical feedbacks

W.H.A. Beaudot

McGill Vision Research, Department of Ophthalmology, McGill University, Montreal, Canada.

Supported by KyberVision Consulting, R&D.

Present Affiliation: KyberVision Consulting, R&D, Montreal, Canada.

SHORT ABSTRACT

Recent computational theories have expressed the perception of coherent visual motion as an optimization process constrained by the conflicting demands of integration and segmentation of local motion. I propose here a cortical implementation of these computational principles which accounts for the figure-ground segregation of coherent motion as early as the primary visual cortical area (V1), through the modulation of its intra-cortical connections by feedback from the middle temporal visual area (MT). This model supports the recent experimental evidence that V1 processing is influenced by contextual modulations through intra- and extra-cortical connections, and that its perceptual correlate can be expressed in terms of figure-ground segregation.

LONG ABSTRACT

Purpose. This work investigates the function of the interactions between visual areas V1 and MT in the figure-ground segregation of coherent motion, with the aim of revealing the nature of the integrative processes performed by area V1 in visual segmentation.

Methods. A neuromorphic model of the retino-cortical motion processing stream is proposed which incorporates both feedforward and feedback mechanisms. The feedforward stream follows a gradient scheme for local motion estimation and its integration for global motion estimation, while the feedback stream applies local constraints of both smoothness and discontinuity onto the local motion stage. We propose that this model correlates with the following neural organization of the visual pathway:

Retinal spatiotemporal processing provides area V1 with parvo- and magno-cellular inputs that have different band-pass characteristics. Oriented band-pass filtering of the parvo-cellular luminance signal yields the responses of V1 orientation-selective simple cells. These cells interact nonlinearly with the magno-cellular inputs to produce the responses of V1 direction-selective simple (SDS) cells. The V1 SDS cells drive the V1 direction-selective complex (CDS) cells, which are laterally connected through excitatory connections. The spatial convergence and temporal integration of V1 CDS signals provide the responses of MT direction-selective cells with large receptive fields. The reciprocal retinotopic connections from MT modulate the gain of the feedback loop formed by the excitatory intra-cortical connectivity between V1 CDS cells.

Each neural layer of the model (5 in the retina, 3 x n in V1, and n in MT, where n is the number of orientations considered) is modeled by a resistive network whose spatial characteristics reflect the coupling among neurons of the same layer, and the membrane properties of each neuron are modeled by a leaky integrator. Synaptic interactions are treated as linear when excitatory or inhibitory, and as multiplicative when modulatory. The functional properties of this model were derived analytically and illustrated by real-time simulations on image sequences representing real scenes and psychophysical stimuli.
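As a rough illustration of the layer description above, the following Python sketch combines a leaky integrator per neuron with nearest-neighbour resistive coupling in one dimension. It is a minimal sketch under stated assumptions, not the equations of the model: the function names, the 1D geometry, the Euler integration, and the parameters `dt`, `tau` and `coupling` are all hypothetical choices made for the example.

```python
import numpy as np

def resistive_leaky_step(v, drive, dt=0.1, tau=1.0, coupling=0.5):
    """One Euler step of a 1D layer of leaky integrators on a resistive grid.

    Assumed dynamics (illustrative only):
        tau * dv/dt = -v + drive + coupling * laplacian(v)
    """
    # Discrete Laplacian with replicated borders: net current from the two neighbours.
    padded = np.pad(v, 1, mode="edge")
    laplacian = padded[:-2] - 2.0 * v + padded[2:]
    return v + (dt / tau) * (-v + drive + coupling * laplacian)

def synaptic_drive(excitation, inhibition, modulation):
    """Linear excitatory/inhibitory terms, scaled multiplicatively by a modulatory input."""
    return modulation * (excitation - inhibition)

def settle(drive, steps=2000, **kwargs):
    """Iterate the layer from rest until it is close to its steady state."""
    v = np.zeros_like(drive, dtype=float)
    for _ in range(steps):
        v = resistive_leaky_step(v, drive, **kwargs)
    return v

# A point input is spread over neighbouring neurons by the resistive coupling.
impulse = np.zeros(17)
impulse[8] = 1.0
response = settle(impulse)
```

In this sketch a larger `coupling` widens the spatial spread of the response, which is one way to read "different spatial characteristics" for different layers.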

Results.

The feedforward stream alone provides a spatially "fine" representation of local motion in area V1 and a spatially "coarse" representation of global motion in area MT. It is, however, unable to deal with the aperture problem, and consequently cannot support any spatial segmentation based on coherent motion. In contrast, the feedback from MT onto V1 induces a dynamic competition between the local and global representations of motion information, which promotes the emergence of an intermediate representation of the visual flow compatible with both. Once these dynamics converge, the responses of V1 CDS cells provide a fine representation of "true" motion, simultaneously solving the aperture problem and the figure-ground segregation of coherent motion.
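The aperture problem mentioned above can be made concrete with the standard gradient constraint `ix*u + iy*v + it = 0`. The Python sketch below is an illustration under assumptions, not the model's implementation: it uses a generic least-squares integration in place of the cortical mechanism, and the function names are hypothetical. A single local measurement recovers only the velocity component along the luminance gradient, while pooling several orientations recovers the full velocity.

```python
import numpy as np

def normal_flow(ix, iy, it, eps=1e-9):
    """Velocity component along the luminance gradient (all a local detector can measure)."""
    scale = -it / (ix**2 + iy**2 + eps)
    return scale * ix, scale * iy

def integrate_constraints(ix, iy, it):
    """Least-squares velocity from several gradient constraints (generic integration step)."""
    a = np.stack([ix, iy], axis=1)
    velocity, *_ = np.linalg.lstsq(a, -it, rcond=None)
    return velocity

# A pattern translating with true velocity (1, 1), seen through three local apertures.
true_u, true_v = 1.0, 1.0
ix = np.array([1.0, 0.0, 0.6])   # spatial gradients with different orientations
iy = np.array([0.0, 1.0, 0.8])
it = -(ix * true_u + iy * true_v)  # temporal derivative implied by the constraint

local_u, local_v = normal_flow(ix[0], iy[0], it[0])  # vertical edge: the v component is lost
global_velocity = integrate_constraints(ix, iy, it)  # pooled estimate
```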

The explanation of this result rests on the specific nature of the feedback from area MT onto area V1 in the proposed model. MT feedback onto V1 does not merely transmit a re-entrant signal; it is an active signal which modulates V1 processing according to a confidence measure of motion coherence. Responses of MT cells increase in the presence of coherent motion, and thereby favour the spatial and temporal integration of local motion through the V1 CDS intra-cortical network. Conversely, in the absence of coherent motion, the weaker responses of MT cells limit such integration to the direct input from V1 SDS cells, which carry the local estimate. The enhancement or depression of V1 CDS responses according to the strength of MT responses reflects these changes in the spatiotemporal properties of V1 CDS receptive fields (gain, space and time constants). This particular scheme has three important consequences for the representation of visual motion:

1) Area MT imposes a local smoothness constraint onto the representation of the optical flow by V1 CDS cells, and the aperture problem is solved through the selection of the optimal receptive field which integrates the local motion in a way consistent with the global estimation.

2) The overall effect of this space-variant alteration of the receptive fields is an effective enhancement of V1 CDS responses at locations where the local motion is compatible with the global motion, and a dramatic decrease where it is not. This spatially nonlinear mechanism then preserves the discontinuities present across space in the "true" motion, and thus promotes a shape-from-motion segmentation as early as area V1.

3) This figure-ground segregation of coherent motion is the result of an optimization process which converges in the presence of stationary motion, and which is supported by the dynamics induced through the reciprocal connections between areas V1 and MT.
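The three consequences listed above can be illustrated with a toy recurrent loop in which a feedback gain scales the lateral excitation between CDS-like units. This Python sketch is an assumption-laden illustration, not the author's equations: the dynamics `tau * dc/dt = -c + s + g * lateral(c)`, the ring topology, the nearest-neighbour averaging and the name `v1_cds_response` are all hypothetical. For a uniform input `s` and a uniform gain `g < 1`, this loop settles at `s / (1 - g)` with an effective time constant `tau / (1 - g)`, so a stronger feedback gain raises the gain and lengthens the integration, while `g = 0` leaves only the direct input.

```python
import numpy as np

def v1_cds_response(sds_input, mt_gain, steps=5000, dt=0.05, tau=1.0):
    """Settle a ring of leaky units whose lateral excitation is scaled by mt_gain.

    mt_gain may be a scalar or an array (space-variant modulation); values must stay below 1
    for the loop to remain stable.
    """
    c = np.zeros_like(sds_input, dtype=float)
    for _ in range(steps):
        # Average of the two lateral neighbours (periodic borders).
        lateral = 0.5 * (np.roll(c, 1) + np.roll(c, -1))
        c += (dt / tau) * (-c + sds_input + mt_gain * lateral)
    return c

# Space-variant gain: high inside a "figure" (coherent motion), zero in the background.
gain = np.zeros(40)
gain[15:25] = 0.8
segmented = v1_cds_response(np.ones(40), gain)
```

With the same input everywhere, the response is enhanced only where the gain is high and stays at the direct-input level elsewhere, so the boundary of the high-gain region remains sharp in the output.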

These striking capabilities of the proposed model are illustrated on the processing of kinetic contours defined by coherent random-dot motion: a spatial shape composed of coherently moving random dots, embedded in a background of incoherently moving random dots, is successfully segmented by the V1 CDS cells in the presence of the MT feedback, while deactivating this feedback invariably makes the shape disappear. The feedback onto V1 also significantly increases the signal-to-noise ratio at the level of area MT, and speeds up its activation in the presence of coherent motion.
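A stimulus of the kind described above can be sketched as follows. This is a hypothetical generator written for illustration, not the stimulus code used in the simulations: the square figure region, the dot count, the seed and the coherence measure (length of the mean unit velocity vector) are all assumptions.

```python
import numpy as np

def kinetic_contour_dots(n_dots=2000, figure=(0.3, 0.7), direction=0.0, speed=1.0, seed=0):
    """Random dots in the unit square: one shared direction inside the figure, random outside."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_dots, 2))
    lo, hi = figure
    in_figure = np.all((pos >= lo) & (pos <= hi), axis=1)
    angles = rng.uniform(0.0, 2.0 * np.pi, n_dots)  # incoherent background directions
    angles[in_figure] = direction                   # coherent motion inside the figure
    vel = speed * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    return pos, vel, in_figure

def coherence(vel):
    """Length of the mean unit velocity vector: 1 for fully coherent motion, near 0 for random."""
    unit = vel / np.linalg.norm(vel, axis=1, keepdims=True)
    return float(np.linalg.norm(unit.mean(axis=0)))

pos, vel, in_figure = kinetic_contour_dots()
figure_coherence = coherence(vel[in_figure])
background_coherence = coherence(vel[~in_figure])
```

Note that the figure is invisible in any single frame, since dot positions are uniformly random; it is defined only by the coherence of the motion.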

Conclusions. The perception of coherent visual motion can be expressed as an optimization problem constrained by the conflicting and simultaneous demands of integration and segmentation of local motion. The proposed model is compatible with this computational scheme and with the recent anatomical and physiological evidence of contextual modulation inside area V1. Firstly, it demonstrates that the modulation of V1 processing by the MT feedback can support the figure-ground segregation of coherent motion as early as area V1. Secondly, it suggests, by the simplicity of its neural structure, that excitatory intra-cortical connectivity and its modulation by a higher cortical signal should be a critical feature of the cortical circuitry involved in the segmentation of visual information. Although initially based on a theoretical background and developed in the context of neuromorphic engineering, this model also provides some experimentally testable hypotheses about the nature of the integrative processes present in area V1.