edited Sep 4, 2012 at 20:36

Edit 1

This edit is to answer the questions raised in the comments.

The basic idea of delay-and-sum beamforming is to apply delays to the different acquisition channels such that sounds that originate from one point in space align and are "amplified" when the signals from the different channels are added. Sounds that originate from other regions of space do not align and are therefore not "amplified".
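As a rough illustration of the idea above, here is a minimal Python/NumPy sketch. It is not a production implementation: the function names, the speed of sound of 343 m/s, and the rounding of delays to whole samples are all assumptions made for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def focus_delays(mic_positions, focus_point, fs, c=SPEED_OF_SOUND):
    """Per-channel delays (in whole samples) that align sound coming from focus_point."""
    mic_positions = np.asarray(mic_positions, dtype=float)
    focus_point = np.asarray(focus_point, dtype=float)
    distances = np.linalg.norm(mic_positions - focus_point, axis=1)
    # The farthest microphone hears the sound last, so the other channels
    # are delayed until they line up with it.
    return np.round((distances.max() - distances) / c * fs).astype(int)


def delay_and_sum(signals, delays):
    """Shift each channel by its (non-negative) delay and average the channels."""
    signals = np.asarray(signals, dtype=float)
    n_channels, n_samples = signals.shape
    out = np.zeros(n_samples)
    for channel, delay in zip(signals, delays):
        out[delay:] += channel[:n_samples - delay]
    return out / n_channels
```

With the delays computed for the focus, a pulse emitted there adds up coherently across channels; with any other set of delays the pulse is smeared over several samples and the summed peak is lower.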

The point in space for which the sounds align using a certain set of delays is called the focus of the microphone array (or focal spot). In reality however, the focus is not an ideal point but rather a small(ish) (depending on the array) region of space for which the sounds align well. The size of this region is called the size of the focal spot.

The geometry of the focal spot (size, shape, etc.) depends on the exact details of the array: number of microphones, microphone spacing, and frequency content of the signals of interest. See e.g. this article.

For more information look for texts on focusing "phased arrays" or "linear arrays" in ultrasonics. Beamforming can be used on reception (to amplify signals from a certain point in space) or on emission (to create a "loud" spot in a room). The principles are identical: just replace "microphone" by "loudspeaker" in your thinking.

Regarding the calibration procedure: you are correct. The procedure I outlined is too simplistic. It only works well if you can create the calibration clap at a distance from the array that is much larger than the region of space you are interested in (i.e. far enough away to ensure a plane wave).

If this is not possible, you have to take the position of the clap into account. In this case, the simplest procedure is to correct the delays by cross-correlation as described, but then add the curvature of the wavefront back onto the signal by applying an "inverse beamforming" set of delays calculated from the position of the origin of the clap. (I.e. if you use a depth variable +t0 (or +z0) in your "normal" beamforming algorithm, you need to use -t0 (or -z0) for the inverse beamforming algorithm.)
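One possible way to express this correction in code, as a sketch under stated assumptions: the lags measured by cross-correlation contain both the sound card start offsets and the geometric propagation differences from the clap, so subtracting the geometric part (which is equivalent to adding the wavefront curvature back) leaves only the sound card offsets. The function names, the reference-channel convention, and the 343 m/s speed of sound are hypothetical choices for this example.

```python
import numpy as np


def geometric_lags(mic_positions, source_position, fs, c=343.0, reference=0):
    """Arrival-time differences (in samples) caused purely by geometry,
    relative to the reference microphone. c is an assumed speed of sound."""
    mic_positions = np.asarray(mic_positions, dtype=float)
    source_position = np.asarray(source_position, dtype=float)
    distances = np.linalg.norm(mic_positions - source_position, axis=1)
    return (distances - distances[reference]) / c * fs


def soundcard_offsets(measured_lags, mic_positions, clap_position, fs,
                      c=343.0, reference=0):
    """Remove the wavefront curvature from the cross-correlation lags so that
    only the start-time offsets of the sound cards remain."""
    geometry = geometric_lags(mic_positions, clap_position, fs, c, reference)
    return np.asarray(measured_lags, dtype=float) - geometry
```

The result is the set of offsets to apply before beamforming; the beamforming delays for the actual focus are then computed separately from the array geometry.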

The point of this calibration is to eliminate any errors caused by the different sound cards starting their recordings at slightly different times. Such errors would normally prevent the signals from aligning properly even with correct delays, and would thus prevent the amplification effect you are looking for.

answered Aug 25, 2012 at 19:55

ARF

Having worked extensively in adaptive beamforming, I would really shy away from hacking something myself for this until I had some experience. (Note: Professional solutions with about 60 channels cost about 100k€. With many channels your spatial resolution becomes much better, but you only get a limited amount of information through a USB port...)

For reliable beamforming it is essential that all microphones use the same time base. The easiest way to achieve this is with an external USB soundcard with multiple input channels. Those are not really cheap though. Have you had a look at what can be found on eBay?

An alternative is to sacrifice the common time base by using a number of USB soundcards with e.g. two channels each. You will however need to calibrate your acquisition system. This is really not as difficult as it sounds:

To calibrate, you set up your array and produce a short sound (e.g. a crack/clap/etc.) at a distance from your array that is of the order of the extent of your array. You then record this sound and use Matlab or similar to calculate the cross-correlation between the clap/crack/etc. on the different channels. This will give you a list of time offsets you need to apply to your channels to align them before feeding the data to your beamforming algorithm.
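The cross-correlation step can be sketched in Python/NumPy instead of Matlab. This is only an illustration: the function names are made up for this example, lags are resolved to whole samples only, and the recordings are assumed to be equal-length channels stacked in one 2-D array.

```python
import numpy as np


def channel_offsets(recordings, reference=0):
    """Estimate how many samples each channel lags the reference channel."""
    recordings = np.asarray(recordings, dtype=float)
    ref = recordings[reference]
    n_samples = recordings.shape[1]
    offsets = []
    for channel in recordings:
        xcorr = np.correlate(channel, ref, mode="full")
        # In "full" mode, index n_samples - 1 corresponds to zero lag.
        offsets.append(int(np.argmax(xcorr)) - (n_samples - 1))
    return np.array(offsets)


def align(recordings, offsets):
    """Shift every channel by its estimated lag, padding with zeros."""
    recordings = np.asarray(recordings, dtype=float)
    n_samples = recordings.shape[1]
    aligned = np.zeros_like(recordings)
    for i, (channel, k) in enumerate(zip(recordings, offsets)):
        if k >= 0:
            aligned[i, :n_samples - k] = channel[k:]   # channel is late: advance it
        else:
            aligned[i, -k:] = channel[:n_samples + k]  # channel is early: delay it
    return aligned
```

A positive offset means the clap shows up later on that channel than on the reference. The same offsets, measured once, would be applied to all later recordings made in the same session before they are passed to the beamforming algorithm.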

To explore adaptive beamforming, this is probably the way to go unless you can make a bargain on a multi-channel soundcard.