The acoustic environment poses at least two important challenges. First, animals must localise sound sources using a variety of binaural and monaural cues; and second they must separate sources into distinct auditory streams (the "cocktail party problem"). Binaural cues include intra-aural intensity and phase disparity. The primary monaural cue is the spectral filtering introduced by the head and pinnae via the head-related transfer function (HRTF), which imposes different linear filters upon sources arising at different spatial locations.
Here we address the second challenge, source separation. We propose an algorithm for exploiting the monaural HRTF to separate spatially localised acoustic sources in a noisy environment. We assume that each source has a unique position in space, and is therefore subject to preprocessing by a different linear filter. We also assume prior knowledge of weak statistical regularities present in the sources. This framework can incorporate various aspects of acoustic transfer functions (echos, delays, multiple sensors, frequency-dependent attenuation) in a uniform fashion, treating them as cues for, rather than obstacles to, separation. To accomplish this, sources are represented sparsely in an overcomplete basis. This framework can be extended to make predictions about the neural representations required to separate acoustic sources.