Reverb Hardware
Reverberators are a bit different from most DSP applications. Primarily, they consist of lots of multitap delay lines. The 'modulo' addressing mode that some DSPs have is quite a useful code-writing tool for this, but it's not the most efficient way hardware-wise. Having the entire memory appear as a shift register (i.e. using a round-robin counter and adding an offset) is more efficient, as a separate hardware address generator and modulo logic are not really required. So a dedicated hardware reverb chip (AL3201, Lexichip) typically implements things this way.
A generic DSP will only sometimes be called on to do reverb. The TI TMS320VC33 and Freescale DSP56366 offer powerful modulus address arithmetic, which is useful for general FIR filtering and also for reverb processing. However, multitap output arrangements are somewhat less efficient on a DSP56366 because you have to move address registers around to accomplish this - the maximum modulus value is 32,767, which means that you need to keep several sets of independent memory areas. The TI part allows large modulus values, and if the entire digital audio RAM is considered one huge array, this becomes trivial.
The Freescale part is nicer in the peripheral mix, however. The host port is a very useful feature that the TI part lacks. If the TI part is used with port I/O registers or a dual-port RAM to facilitate the interface to a host processor, it becomes more attractive. Dual-port RAM makes more sense with a host, whereas adding I/O makes it possible to run the UI as a background process and the reverb as an interrupt. There are nicer TI parts, but the development tools are a bit expensive. The Freescale parts and the TMS320C3x both have freeware assemblers, which makes them very attractive for DIY projects. I immediately ruled out Analog Devices' DSPs because the cost of the development environment prevents their use.
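As a rough sketch of the "memory as a shift register" addressing described above, the C fragment below models a round-robin counter plus a per-tap offset. It is only an illustration, not how the AL3201 or Lexichip are actually built internally: the names (delay_mem_t, mem_read, mem_write, mem_tick) and the 32k-word power-of-two size are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 32768u              /* assumed size: 32k words, power of two */
#define MEM_MASK  (MEM_WORDS - 1u)

typedef struct {
    int32_t  mem[MEM_WORDS];  /* whole audio RAM treated as one long delay line */
    uint32_t counter;         /* round-robin counter, stepped once per sample */
} delay_mem_t;

/* Clear the memory and reset the counter. */
static void mem_init(delay_mem_t *d)
{
    assert((MEM_WORDS & MEM_MASK) == 0u);   /* masking only works for 2^n sizes */
    for (uint32_t i = 0; i < MEM_WORDS; i++)
        d->mem[i] = 0;
    d->counter = 0;
}

/* Physical address = (counter + offset), wrapped by a simple mask. */
static int32_t mem_read(const delay_mem_t *d, uint32_t offset)
{
    return d->mem[(d->counter + offset) & MEM_MASK];
}

static void mem_write(delay_mem_t *d, uint32_t offset, int32_t value)
{
    d->mem[(d->counter + offset) & MEM_MASK] = value;
}

/* Call once per sample period. Decrementing the counter makes every stored
 * word appear one offset further along, as if the memory had shifted. */
static void mem_tick(delay_mem_t *d)
{
    d->counter = (d->counter - 1u) & MEM_MASK;
}
```

With this arrangement a sample written at offset 0 shows up at offset n after n sample periods, so a multitap delay is just several mem_read calls with different fixed offsets. Separate delay lines can share the one memory by giving each a different base offset, and no per-line modulo registers are needed.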
Newer TI parts (the TMS320C6xxx) are scary. The development environment is a bit crazy, both in cost and complexity. The VLIW DSP is neat from a performance point of view, but according to one of my knowledgeable friends, is somewhat difficult to work with and debug. This is not what I want in a hobby project.
Atmel offers the ATSAM3703, which might be useful for reverbs. I have not investigated this part yet, but I should; it looks promising.
The ARU-based Lexicon reverbs and the AL3201 offer similar performance, but the ARU-based verbs allow 64k of RAM rather than 32k. The Lexichip allows a large amount of DRAM, whereas the AL3201 has a limited amount (32k words) of RAM and it's not expandable. Double the RAM (or more) would be very nice. At 48k, the AL3201 offers the same amount of audio storage as a PCM-60 - note that the PCM-60 sampled at 24 kHz, whereas most codecs and the AL3201 will be run at 48 kHz. The PCM-91 offers 256k words per reverb chip, which helps it achieve the relatively impressive sounds that it does. Some of the higher-end Lexicons have even more processing units in parallel. Note that the Lexichip-3 actually executes 256 program steps per loop, whereas the AL3201, Lexichip 1 and 2, and the ARU-based Lexicons use 128 steps per program loop.
For a long time, the Lexicon approach (a dedicated reverb engine) was superior to a general-purpose DSP. In terms of silicon area, clock speed, and reverb efficiency, that is still the case. But advances in DSP technology make a generic DSP more cost-effective for lower-volume manufacturers - at high volumes, a custom processor may make sense. FPGAs offer another solution; however, the cost/performance trade-off for FPGAs is usually biased towards speed at the expense of cost. The multiplier precision required for reverberators is not very high - even three bits plus sign are sufficient for most tasks, and a seven-bit plus sign multiplier is adequate. Most presynthesized multipliers are symmetrical, so a 24-bit processor has a 24-by-24 multiplier followed by a 48-bit (or more) accumulator. A reverberator can easily deal with a 24-by-7-bit fractional multiplier with a 24-bit result - six bits are dropped and the seventh can be used for rounding if desired. Interestingly enough, my day job (engine management systems) frequently uses 16-bit by 8-bit fractional multiplies with a 16-bit result.
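To make the narrow fractional multiply concrete, here is a minimal C sketch of a 24-bit sample times a seven-bit-plus-sign coefficient with a rounded 24-bit result. The function name frac_mul_24x8 and the coefficient scaling (coef / 128) are assumptions made for the example, and it relies on the compiler doing an arithmetic right shift on negative values, which common compilers do but the C standard leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* sample: 24-bit signed value held in an int32_t
 * coef:   seven bits plus sign, read as a fraction coef / 128
 *         (so +64 is 0.5 and -128 is -1.0)
 * Returns sample * coef / 128, rounded, at the same scale as the input. */
static int32_t frac_mul_24x8(int32_t sample, int8_t coef)
{
    assert(sample >= -8388608 && sample <= 8388607);   /* 24-bit range check */

    /* Largest magnitude is 2^23 * 2^7 = 2^30, which fits in 32 bits. */
    int32_t product = sample * (int32_t)coef;

    /* The product carries seven extra low-order bits. Adding 64 (half of 128)
     * lets the top one of those bits round the result; the six bits below it
     * are simply discarded by the shift. */
    product += 64;

    /* Assumes arithmetic (sign-preserving) right shift for negative values.
     * Note: -1.0 times full-scale negative gives +2^23, one step beyond the
     * 24-bit range, so real hardware would saturate or avoid that case. */
    return product >> 7;
}
```

For example, frac_mul_24x8(1000, 64) scales 1000 by 0.5 and returns 500, and frac_mul_24x8(3, 64) returns 2 because 1.5 rounds up. Dropping the "+= 64" line gives the plain truncating version.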
Reverb memory.... DRAM is cheap. Unfortunately, the widespread acceptance of SDRAM has made the easier-to-interface DRAM obsolete, or 'not recommended for new design'. That's not nice. Also, SDRAM works well in burst mode, where you are reading many sequential words at once. That may be the case with an FIR filter or a convolution algorithm. It is not the case with a multitap delay structure, where a new address has to be transferred on every cycle. For best performance (but highest cost), SRAM is much nicer to work with. A typical (non-convolution) reverb does not require that much RAM - maybe 256 to 1024 k-words is sufficient for the most awesome reverb tricks. Convincing reverb can be done with 32k... at least the PCM-70 and the 224XL did it this way. The SRAM cost of this is not that bad for a one-off or low-volume unit - probably around $50 for a 1024 k-word SRAM. Interfacing with the DSP is a snap. But it's not as cheap as SDRAM. I have been wondering if the timing for an SDRAM can be obtained with a normal DRAM controller and an ultra-fast GAL; however, I am not 100% sure on that. The read-write performance of an SDRAM in a random read/write cycle mode is on the order of 50 to 80 ns, which seriously slows down processing. A lot of people cannot believe that these ultra-fast DDR3 SDRAMs really are as slow as RAM chips from two decades ago, but they are! You need to look at all of the timings given in the data sheets. Now, for burst access, they are much faster, but to do a random read-modify-write cycle without the help of caching.... There's also the argument that a cache could hold the whole delay line. Well, why bother with the SDRAM at all?
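To put the 50 to 80 ns random-access figure in perspective, the small C helper below works out how many such accesses fit into one sample period. It is back-of-the-envelope arithmetic only: the function name accesses_per_sample is made up for the example, and it ignores refresh and any overlap a real memory controller might manage.

```c
#include <assert.h>
#include <stdint.h>

/* Whole number of back-to-back random accesses that fit in one sample period.
 * sample_rate_hz: audio sample rate, e.g. 48000
 * access_ns:      time for one random read or write cycle, in nanoseconds */
static uint32_t accesses_per_sample(uint32_t sample_rate_hz, uint32_t access_ns)
{
    assert(sample_rate_hz > 0 && access_ns > 0);

    /* Work in picoseconds so the sample period keeps enough resolution. */
    uint64_t period_ps = 1000000000000ull / sample_rate_hz;
    uint64_t access_ps = (uint64_t)access_ns * 1000u;

    return (uint32_t)(period_ps / access_ps);
}
```

At 48 kHz the sample period is about 20.8 microseconds, so accesses_per_sample(48000, 50) gives 416 and accesses_per_sample(48000, 80) gives 260. That is the entire memory budget per sample for every delay read and write in the program. For comparison, a hypothetical 10 ns SRAM (an assumed figure, not one from a data sheet) would allow 2083.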
I just got a new effects box to actually work. It uses a Freescale MC9S08AW60 for the host processor, a Newmarket VFD (20x2), a Grayhill rotary encoder, and two of the AL3201 single-chip digital reverb engines in series. So far I have a multitap delay line written up on it, several reverb algorithms, and a few presets set up for it. I probably will finish writing up the DSP and host code for each of the other programs, then start working on the editor and MIDI interface. I haven't built up the front keypad yet. It interfaces via I2C to the AW60, and contains both the headroom indicator (bargraph) and a 14-switch keypad. It's in a 4" deep, 1U rack enclosure. I am planning on building several of these, although I would like to redesign the front panel so I can use the same front panel for this box as the next one. That may mean a few more keys. The only issue I have with this front panel is that the Newmarket VFD is slightly oversized for a 1U box.
And I just got the second effects box to work. Here it is... DSP56366, MCF51AC256, 2x40 LCD display, two encoders, eight pushbutton switches, stereo in and out. So far, I don't have any parameter editing yet, but all but two of the reverb building blocks work. Multi-tap delays, all-pass filters, and a two-input summing block all work. I need to get the stereo multi-tap delay working, and the chorus/interpolator. I only want to do the stereo multi-tap because of DSP pipeline stalls - if I do two in parallel, I can use a two-cycle pipeline stall to do useful work.