I'm working on a project where I need to control a large number of shift register outputs updated at a 44.1 kHz audio rate. I previously built a board that worked well for controlling 64 outputs (8 shift registers), and I'm now expanding it to more outputs for a different application.
To control more outputs I'm designing a daisy-chainable PCB. The master PCB will have a Teensy 4.1 microcontroller populated, sending SPI data to the shift registers. This PCB will send its output to a series of daisy-chained PCBs that do not have the Teensy 4.1 populated, using differential signaling between each PCB.
The intention for this project is to control a very large number of shift register outputs (somewhere in the low thousands). To this end I'm planning to run SPI at a clock speed of around 50-100 MHz (100 MHz / 44.1 kHz ≈ 2267 bits per sample period).
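To make the arithmetic explicit, here is a rough sketch of the bit budget per audio sample at both ends of that clock range. The function names are just for illustration, and it assumes an idealized transfer with no latch pulse, chip-select, or inter-transfer overhead, so the numbers are an upper bound rather than what the hardware will actually achieve.

```python
SAMPLE_RATE_HZ = 44_100  # audio-rate update frequency

def bits_per_sample(spi_clock_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Whole SPI clock cycles (one bit each) that fit in one sample period."""
    return spi_clock_hz // sample_rate_hz

def registers_per_sample(spi_clock_hz, bits_per_register=8):
    """Whole 8-bit shift registers that can be fully reloaded per sample."""
    return bits_per_sample(spi_clock_hz) // bits_per_register

for clock_mhz in (50, 100):
    clock_hz = clock_mhz * 1_000_000
    print(
        f"{clock_mhz} MHz: {bits_per_sample(clock_hz)} bits, "
        f"{registers_per_sample(clock_hz)} registers per sample"
    )
```

So the idealized ceiling is roughly 1133 outputs at 50 MHz and 2267 outputs at 100 MHz, before any overhead is subtracted.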
For sending SPI between the PCBs I'm using SN65LVDS31DR differential transmitters and SN65LVDS32DR differential receivers with shielded CAT6 cable. For the shift registers I'm using 74LVC595 high-speed shift registers. I'm also using SN74LVC2G17 Schmitt trigger buffers for signal cleanup on the input and output of the PCB. I expect most of the CAT6 cables to be under 12 inches, and all of them to be under 36 inches.
Please find my schematic attached. Does this seem like a reasonable approach to the stated application? I'm trying to make my solution as modular as possible for different numbers of outputs while using a single PCB design.
Signal flow is as follows:
[Differential Receiver / Teensy] -> [Schmitt Buffer] -> [Shift Registers] -> [Schmitt Buffer] -> [Differential Transmitter] -> [Shielded CAT6 to Next PCB]
In particular, there are two questions I'd like answered regarding this design:
Is the combination of SN65LVDS31DR and SN65LVDS32DR differential transmitters and receivers, shielded CAT6 cable, and SN74LVC2G17 Schmitt trigger buffers the best approach for sending a high-frequency SPI signal between daisy-chained PCBs and maintaining a clean signal on the boards themselves? Is there a different approach to buffering and board-to-board connection that would work better for this application?
If there is a way in which this approach fails, what is the likely point of breakdown? Is there a certain SPI clock rate, a certain number of daisy-chained boards, or a certain cable length at which this will cease to work? What is the mechanism causing this failure? In particular, I'm worried about phase misalignment between clock and data due to rise times and propagation delays. However, both signals pass through the same buffers at each point in the signal chain, which in theory should keep them aligned.
