Journal of the Optical Society of America A, 12:1208-1224, 1995.
Characterization of Spatial Frequency Cues in the Perception of Shape-from-Texture
Ko Sakai and Leif H. Finkel
Department of Bioengineering and
Institute of Neurological Sciences
University of Pennsylvania
Philadelphia, PA 19104, U. S. A.
ko@neuroengineering.upenn.edu
leif@neuroengineering.upenn.edu
Abstract
The major cue to shape-from-texture is the compression of texture as a function of surface
curvature. A number of computational models have been proposed in which compression
is measured by detecting changes in the spatial frequency spectrum. We propose
that the visual system uses a strategy of characterizing the frequency spectrum
by a simple set of measures, and tracking the changes in this characterization,
rather than determining changes in the shape of the actual spectra. Our evidence
is based on a number of psychophysical demonstrations using stimuli with specifically-tailored
frequency spectra, constructed from white noise filtered in the frequency domain.
Our evidence suggests that the visual system determines the average peak frequency
of the spectrum, and uses this measure as its characterization. Changes in the
average peak frequency are strongly correlated with the degree of surface
curvature, and over a range of stimuli, the measure takes account of the variance
in local estimates of the frequency spectrum. The average peak frequency is
computed by determining the peak frequency at each spatial location, and
then averaging these frequency values over a local spatial region. We show that
the average peak frequency is related to the second-order moment but is more biologically plausible and
shows superior ability to function in the presence of noise.
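
The two-step computation described above (a peak frequency at each location, then a local average of those peaks) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-dimensional signal in place of an image, a plain windowed DFT as the local spectrum estimate, and hypothetical names and parameters (`peak_frequency`, `average_peak_frequency`, `window_size`, `step`, `region`) chosen only for illustration.

```python
import cmath
import math

def peak_frequency(window):
    """Frequency (cycles/sample) with the largest DFT magnitude, ignoring DC."""
    n = len(window)
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(window[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k / n

def average_peak_frequency(signal, window_size=32, step=8, region=3):
    """Peak frequency per window, averaged over `region` neighbouring windows."""
    # Step 1: one peak-frequency estimate per spatial location (window start).
    peaks = [peak_frequency(signal[s:s + window_size])
             for s in range(0, len(signal) - window_size + 1, step)]
    # Step 2: average the peak values over a local spatial neighbourhood.
    half = region // 2
    averaged = []
    for i in range(len(peaks)):
        lo, hi = max(0, i - half), min(len(peaks), i + half + 1)
        averaged.append(sum(peaks[lo:hi]) / (hi - lo))
    return averaged

# Toy "texture" (an assumption for illustration): an uncompressed stretch at
# 4 cycles per 32 samples followed by a compressed stretch at 8 cycles per 32.
flat = [math.sin(2 * math.pi * 4 * t / 32) for t in range(128)]
compressed = [math.sin(2 * math.pi * 8 * t / 32) for t in range(128)]
profile = average_peak_frequency(flat + compressed)
```

In this toy case the averaged peak frequency rises from the uncompressed stretch to the compressed one, which is the kind of change the measure is meant to track. The local averaging step smooths the transition between the two stretches.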
As a test of this model, we have constructed a neural network architecture for
computing shape-from-texture. Our model is limited to orthographically-projected,
homogeneous textures without in-surface rotation. The early stages of the model
consist of multiple simple-cell units tuned to different orientations and spatial
frequencies. We show that these simple cells are inadequate to determine compression,
but that the outputs of complex-cell-like units, after normalization, generate
estimates of surface slant and tilt. The network shows qualitative agreement
with human perception of shape-from-texture over a wide range of real and artificial
stimuli.
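
The distinction between simple-cell units and normalized complex-cell-like units can be sketched as follows. The abstract does not specify the unit equations, so this sketch swaps in two standard stand-ins: a quadrature-pair energy model for the complex-cell-like units and divisive normalization across frequency channels. It works on one-dimensional patches, and every name and parameter (`gabor_pair`, `complex_responses`, `grating`, `sigma`, the channel frequencies) is hypothetical. It shows one property that can make single simple-cell outputs unreliable as a frequency measure, namely phase sensitivity; whether this is the paper's own argument is not stated in the abstract.

```python
import math

def gabor_pair(freq, size=33, sigma=6.0):
    """Even (cosine) and odd (sine) Gabor kernels tuned to `freq` cycles/sample."""
    half = size // 2
    even, odd = [], []
    for x in range(-half, half + 1):
        envelope = math.exp(-x * x / (2 * sigma * sigma))
        even.append(envelope * math.cos(2 * math.pi * freq * x))
        odd.append(envelope * math.sin(2 * math.pi * freq * x))
    return even, odd

def respond(kernel, patch):
    """Linear (simple-cell-like) response: dot product of kernel and patch."""
    return sum(k * p for k, p in zip(kernel, patch))

def complex_responses(patch, freqs):
    """Phase-invariant energy per channel, divided by the total across channels."""
    energies = []
    for f in freqs:
        even, odd = gabor_pair(f, size=len(patch))
        energies.append(math.hypot(respond(even, patch), respond(odd, patch)))
    total = sum(energies) or 1.0  # guard against an all-zero patch
    return [e / total for e in energies]

def grating(freq, phase, contrast=1.0, size=33):
    """Cosine grating patch centred on x = 0."""
    half = size // 2
    return [contrast * math.cos(2 * math.pi * freq * x + phase)
            for x in range(-half, half + 1)]

freqs = [1 / 16, 1 / 8, 1 / 4]          # assumed channel tunings
even_kernel, _ = gabor_pair(1 / 8)

# The same grating at two phases: the single simple-cell response changes a lot,
# while the normalized energy profile across channels barely changes.
simple_in_phase = respond(even_kernel, grating(1 / 8, 0.0))
simple_quadrature = respond(even_kernel, grating(1 / 8, math.pi / 2))
energy_in_phase = complex_responses(grating(1 / 8, 0.0), freqs)
energy_quadrature = complex_responses(grating(1 / 8, math.pi / 2), freqs)
energy_low_contrast = complex_responses(grating(1 / 8, 0.0, contrast=0.2), freqs)
```

Under these assumptions the normalized channel profile peaks at the channel matching the grating frequency regardless of phase, and it is unchanged when the contrast is scaled, since the scale factor cancels in the division.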