A robust and parameterizable Asynchronous FIFO implementation in SystemVerilog for safe clock domain crossing. This design synchronizes Gray-coded pointers through multi-stage synchronizers, which avoids multi-bit sampling errors and mitigates metastability when transferring data between different clock domains.
- Features
- Architecture
- Module Hierarchy
- Parameters
- Interface Signals
- Usage Example
- Simulation
- Verification
- Design Highlights
- File Structure
- Requirements
- Getting Started
- Contributing
- License
- ✅ Parameterizable design: Configurable data width, FIFO depth, and synchronizer stages
- ✅ Gray code synchronization: Eliminates multi-bit synchronization issues
- ✅ Metastability protection: Multi-stage synchronizers for clock domain crossing
- ✅ Full/Empty flag generation: Reliable status flags in respective clock domains
- ✅ Independent clock domains: Supports different frequencies for read and write clocks
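- ✅ Low latency: Written data becomes readable once the write pointer has passed through the read-domain synchronizer, which takes a few read clock cycles (set by SYNC_STAGES) rather than strictly zero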
- ✅ Synthesizable: FPGA and ASIC ready
- ✅ Comprehensive testbench: 12 test cases with 100% pass rate
- ✅ Phase-complete verification: Tested with coprime clocks (10:17 and 17:10) for complete phase coverage
- ✅ Bidirectional testing: Verified for both fast-to-slow and slow-to-fast clock domain transfers
The Asynchronous FIFO consists of the following key components:
```
┌─────────────────────────────────────────────────────────────────┐
│                         Async FIFO Top                          │
│  ┌──────────────┐                              ┌──────────────┐ │
│  │              │     Gray Code Pointers       │              │ │
│  │  Write PTR   │─────────────────────────────▶│   Sync R2W   │ │
│  │   & Full     │                              │              │ │
│  │              │                              └──────────────┘ │
│  └──────────────┘                                               │
│         │                                                       │
│         ▼                                                       │
│  ┌──────────────┐                                               │
│  │              │                                               │
│  │  Dual-Port   │                                               │
│  │     RAM      │                                               │
│  │  (FIFOMEM)   │                                               │
│  │              │                                               │
│  └──────────────┘                                               │
│         │                                                       │
│         ▼                                                       │
│  ┌──────────────┐                              ┌──────────────┐ │
│  │              │     Gray Code Pointers       │              │ │
│  │  Read PTR    │◀─────────────────────────────│   Sync W2R   │ │
│  │   & Empty    │                              │              │ │
│  │              │                              └──────────────┘ │
│  └──────────────┘                                               │
└─────────────────────────────────────────────────────────────────┘
   Write Clock Domain                         Read Clock Domain
```

- Gray Code Encoding: Pointers are converted to Gray code before crossing clock domains, ensuring only one bit changes at a time
- Multi-Stage Synchronizers: 2-stage (default) flip-flop synchronizers mitigate metastability
- Separate Full/Empty Logic: Status flags are generated independently in their respective clock domains
- Dual-Port RAM: True dual-port memory allows simultaneous read and write operations
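
The Gray code property described above can be checked in simulation. The following is a minimal self-checking sketch, not code from this repository: the module name `gray_code_demo` and the helper functions `bin2gray` and `gray2bin` are hypothetical, and the pointer width assumes the default `ADDR_WIDTH = 4` plus one wrap bit.

```systemverilog
// Illustrative sketch only: names below are hypothetical, not part of the RTL.
module gray_code_demo;
    localparam int ADDR_WIDTH = 4;              // README default
    localparam int PTR_WIDTH  = ADDR_WIDTH + 1; // extra bit tracks wrap-around

    typedef logic [PTR_WIDTH-1:0] ptr_t;

    // Binary to Gray: XOR each bit with its more significant neighbour.
    function automatic ptr_t bin2gray(ptr_t bin);
        return (bin >> 1) ^ bin;
    endfunction

    // Gray to binary: bit i is the XOR of all Gray bits at position i and above.
    function automatic ptr_t gray2bin(ptr_t gray);
        ptr_t bin;
        bin = '0;
        for (int i = 0; i < PTR_WIDTH; i++)
            bin[i] = ^(gray >> i);
        return bin;
    endfunction

    initial begin
        ptr_t cur, nxt;
        for (int n = 0; n < (1 << PTR_WIDTH); n++) begin
            cur = bin2gray(ptr_t'(n));
            nxt = bin2gray(ptr_t'(n + 1)); // the cast wraps 2^PTR_WIDTH back to 0
            // Consecutive pointer values differ in exactly one Gray bit.
            assert ($countones(cur ^ nxt) == 1)
                else $error("more than one bit changed at n=%0d", n);
            // The conversion is reversible.
            assert (gray2bin(cur) == ptr_t'(n))
                else $error("round trip failed at n=%0d", n);
        end
        $display("Gray code checks finished");
        $finish;
    end
endmodule
```

Because only one bit changes per increment, a synchronizer that samples the pointer mid-transition sees either the old or the new value, never an unrelated one.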
```
async_fifo (Top Module)
├── wptr_full   - Write pointer and full flag generation
├── rptr_empty  - Read pointer and empty flag generation
├── sync_r2w    - Synchronize read pointer to write domain
├── sync_w2r    - Synchronize write pointer to read domain
└── fifomem     - Dual-port RAM storage
```

| Parameter | Default | Description |
|---|---|---|
| `DATA_WIDTH` | 8 | Width of data bus (bits) |
| `ADDR_WIDTH` | 4 | Address width (FIFO depth = 2^ADDR_WIDTH) |
| `SYNC_STAGES` | 2 | Number of synchronizer flip-flop stages |
Example Configurations:
- 8-bit data, 16-deep FIFO: `DATA_WIDTH=8`, `ADDR_WIDTH=4`
- 32-bit data, 256-deep FIFO: `DATA_WIDTH=32`, `ADDR_WIDTH=8`
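
Since the depth is always a power of two, a required depth has to be rounded up when choosing `ADDR_WIDTH`. The sketch below shows one way to derive it with `$clog2`. The module name `fifo_depth_demo` and the value `REQUIRED_DEPTH = 200` are assumptions made for illustration and do not come from this repository.

```systemverilog
// Illustrative sketch only: REQUIRED_DEPTH is a made-up requirement.
module fifo_depth_demo;
    localparam int REQUIRED_DEPTH = 200;

    // Round the requirement up to the next power of two.
    localparam int ADDR_WIDTH = $clog2(REQUIRED_DEPTH); // 8
    localparam int FIFO_DEPTH = 1 << ADDR_WIDTH;        // 256
    // Pointers carry one extra wrap bit on top of the address.
    localparam int PTR_WIDTH  = ADDR_WIDTH + 1;         // 9

    initial begin
        assert (FIFO_DEPTH >= REQUIRED_DEPTH)
            else $error("FIFO depth %0d is below the requirement", FIFO_DEPTH);
        $display("ADDR_WIDTH=%0d FIFO_DEPTH=%0d PTR_WIDTH=%0d",
                 ADDR_WIDTH, FIFO_DEPTH, PTR_WIDTH);
        $finish;
    end
endmodule
```

The resulting `ADDR_WIDTH` can then be passed to the `async_fifo` parameter list alongside the desired `DATA_WIDTH`.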
Write side:

| Signal | Direction | Width | Description |
|---|---|---|---|
| `wclk` | Input | 1 | Write clock |
| `wrst_n` | Input | 1 | Write domain reset (active low) |
| `winc` | Input | 1 | Write increment enable |
| `wdata` | Input | DATA_WIDTH | Write data |
| `wfull` | Output | 1 | FIFO full flag |
| `waddr` | Output | ADDR_WIDTH+1 | Write pointer (debug) |

Read side:

| Signal | Direction | Width | Description |
|---|---|---|---|
| `rclk` | Input | 1 | Read clock |
| `rrst_n` | Input | 1 | Read domain reset (active low) |
| `rinc` | Input | 1 | Read increment enable |
| `rdata` | Output | DATA_WIDTH | Read data |
| `rempty` | Output | 1 | FIFO empty flag |
| `raddr` | Output | ADDR_WIDTH+1 | Read pointer (debug) |
Basic instantiation:

```systemverilog
async_fifo #(
    .DATA_WIDTH (8),
    .ADDR_WIDTH (4),
    .SYNC_STAGES(2)
) u_async_fifo (
    // Write side
    .wclk   (write_clk),
    .wrst_n (write_rst_n),
    .winc   (write_enable),
    .wdata  (write_data),
    .wfull  (fifo_full),
    .waddr  (write_addr),
    // Read side
    .rclk   (read_clk),
    .rrst_n (read_rst_n),
    .rinc   (read_enable),
    .rdata  (read_data),
    .rempty (fifo_empty),
    .raddr  (read_addr)
);
```

Write operation:

```systemverilog
always_ff @(posedge wclk or negedge wrst_n) begin
    if (!wrst_n) begin
        winc <= 1'b0;
    end else begin
        if (!wfull && write_request) begin
            winc  <= 1'b1;
            wdata <= data_to_write;
        end else begin
            winc <= 1'b0;
        end
    end
end
```

Read operation:

```systemverilog
always_ff @(posedge rclk or negedge rrst_n) begin
    if (!rrst_n) begin
        rinc <= 1'b0;
    end else begin
        if (!rempty && read_request) begin
            rinc <= 1'b1;
            // rdata is valid on next clock cycle
        end else begin
            rinc <= 1'b0;
        end
    end
end
```

Supported simulators:

- ModelSim/QuestaSim: Mentor Graphics simulator
- Vivado Simulator: Xilinx XSim
- VCS: Synopsys simulator
- Any SystemVerilog compatible simulator
Using ModelSim/QuestaSim:

```bash
# Navigate to sim directory
cd sim

# Compile source files
vlog -work work ../rtl/*.sv
vlog -work work async_fifo_tb.sv

# Run simulation
vsim -c work.async_fifo_tb -do "run -all"

# Or with GUI and waveforms
vsim work.async_fifo_tb
add wave -r /*
run -all
```

Using Vivado:

```bash
# Create project and add files
vivado -mode batch -source compile.tcl

# Or use Vivado GUI
xvlog --sv ../rtl/*.sv sim/async_fifo_tb.sv
xelab async_fifo_tb -debug typical
xsim work.async_fifo_tb -gui
```

Save this as run_sim.do in the sim/ directory:

```tcl
# Clean up
if {[file exists work]} {
    vdel -all
}

# Create work library
vlib work

# Compile RTL files
vlog -work work -sv ../rtl/fifomem.sv
vlog -work work -sv ../rtl/sync_r2w.sv
vlog -work work -sv ../rtl/sync_w2r.sv
vlog -work work -sv ../rtl/wptr_full.sv
vlog -work work -sv ../rtl/rptr_empty.sv
vlog -work work -sv ../rtl/async_fifo.sv

# Compile testbench
vlog -work work -sv async_fifo_tb.sv

# Run simulation
vsim -voptargs=+acc work.async_fifo_tb

# Add waves
add wave -r /*

# Run
run -all
```

Then execute:

```bash
vsim -do run_sim.do
```

The testbench includes 12 comprehensive test cases:
- Basic Write and Read - Simple data transfer verification
- Fill and Empty FIFO - Full capacity testing
- Full Flag Test - Verify full flag assertion and write blocking
- Empty Flag Test - Verify empty flag assertion and read blocking
- Wrap Around Test - Multiple fill/empty cycles
- Random Write and Read - Randomized concurrent operations
- Back-to-Back Operations - Continuous read/write stress
- Fast Write, Slow Read - Clock rate difference testing
- Slow Write, Fast Read - Reverse clock rate testing
- Burst Operations - Multiple burst transfers
- Corner Cases - Special data patterns (0x00, 0xFF, 0xAA, 0x55, etc.)
- Stress Test - Heavy randomized concurrent traffic
The testbench employs coprime clock periods to ensure comprehensive phase relationship coverage:
Test Configuration 1: Fast-to-Slow Transfer

```systemverilog
WCLK_PERIOD = 10ns  // 100 MHz (Write Clock)
RCLK_PERIOD = 17ns  // 58.8 MHz (Read Clock)
```

Test Configuration 2: Slow-to-Fast Transfer

```systemverilog
WCLK_PERIOD = 17ns  // 58.8 MHz (Write Clock)
RCLK_PERIOD = 10ns  // 100 MHz (Read Clock)
```

Why Coprime Clock Periods?
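- GCD(10, 17) = 1: The two clock periods are coprime (no common divisor other than 1)
- Phase Coverage: With ideal clocks on a 1ns grid, the write edges land at every one of the 17 possible offsets within a read period, and the read edges at all 10 offsets within a write period
- LCM(10, 17) = 170ns: The phase relationship repeats every 170ns, so each 170ns window traverses all of these offsets once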
- Realistic Testing: Avoids the pitfall of integer-ratio clocks where certain phase combinations never occur
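
The phase-coverage argument above is plain modular arithmetic, so it can be checked directly. The sketch below assumes ideal clocks that both start at time 0 with integer-nanosecond periods. The module name `phase_coverage_demo` and the function `distinct_offsets` are hypothetical and not part of the testbench.

```systemverilog
// Illustrative sketch only: models edge times as integers, no real clocks.
module phase_coverage_demo;
    // Count how many distinct offsets (in ns) a write edge can have
    // relative to the preceding read edge.
    function automatic int distinct_offsets(int wclk_period, int rclk_period);
        bit seen[];
        int count = 0;
        seen = new[rclk_period];
        // After rclk_period write edges the pattern repeats (LCM reached).
        for (int k = 0; k < rclk_period; k++)
            seen[(k * wclk_period) % rclk_period] = 1'b1;
        foreach (seen[p])
            if (seen[p]) count++;
        return count;
    endfunction

    initial begin
        // Coprime periods: every offset 0..16 ns is visited.
        assert (distinct_offsets(10, 17) == 17)
            else $error("10:17 did not cover all offsets");
        // Integer-ratio periods: only offsets 0 and 10 ns ever occur.
        assert (distinct_offsets(10, 20) == 2)
            else $error("unexpected coverage for 10:20");
        $display("10:17 -> %0d offsets, 10:20 -> %0d offsets",
                 distinct_offsets(10, 17), distinct_offsets(10, 20));
        $finish;
    end
endmodule
```

The contrast with the 10:20 case illustrates the integer-ratio pitfall: the same two alignments repeat forever, so most phase relationships are never exercised.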
Both configurations have been verified with 100% test pass rate, confirming robust operation in:
- ✅ Fast-to-slow clock domain crossing
- ✅ Slow-to-fast clock domain crossing
- ✅ Every relative phase offset of the two test clocks on the 1ns simulation grid
- ✅ Different clock frequency ratios
Configuration 1 (Fast Write, Slow Read - 10:17):

```
================================================================================
FINAL TEST REPORT
================================================================================
Total Tests: 12
Tests Passed: 12
Tests Failed: 0
================================================================================
*** ALL TESTS PASSED ***
================================================================================
```

Configuration 2 (Slow Write, Fast Read - 17:10):

```
================================================================================
FINAL TEST REPORT
================================================================================
Total Tests: 12
Tests Passed: 12
Tests Failed: 0
================================================================================
*** ALL TESTS PASSED ***
================================================================================
```

- ✅ Functional coverage: 100%
- ✅ Full and Empty conditions: Verified
- ✅ Clock domain crossing scenarios: Both fast-to-slow and slow-to-fast
- ✅ Phase relationship coverage: Complete phase traversal via coprime clock periods
- ✅ Different clock frequency ratios: 10:17 and 17:10 (1.7:1 bidirectional)
- ✅ Data integrity verification: All data correctly transferred across clock domains
- ✅ Pointer wrap-around: Multiple cycles tested
- ✅ Concurrent read/write operations: Random and stress testing
- ✅ Metastability protection: Multi-stage synchronizers exercised functionally across the tested phase offsets (RTL simulation does not model metastability itself)
The design implements industry-standard CDC techniques with comprehensive verification:
```systemverilog
// Gray code conversion (wptr_full.sv)
assign wptr_gray = (wptr >> 1) ^ wptr;

// Multi-stage synchronizer (sync_r2w.sv)
always_ff @(posedge wclk or negedge wrst_n) begin
    if (!wrst_n) begin
        sync_reg <= '0;
    end else begin
        sync_reg[0] <= rptr_gray;
        for (int i = 1; i < SYNC_STAGES; i++) begin
            sync_reg[i] <= sync_reg[i-1];
        end
    end
end
```

Verification Strategy:
- Coprime Clock Periods: Using GCD(10, 17) = 1 ensures all phase relationships are tested
- Bidirectional Testing: Both fast-to-slow and slow-to-fast transfers verified
- Phase Traversal: Complete phase coverage achieved through non-integer clock ratios
- Real-world Scenarios: Avoids artificial synchronization that occurs with integer-ratio clocks
Full and empty flag logic:

```systemverilog
// Full condition: write pointer catches up to read pointer
assign wfull = (wptr_gray == {~rptr_gray_sync[ADDR_WIDTH:ADDR_WIDTH-1],
                              rptr_gray_sync[ADDR_WIDTH-2:0]});

// Empty condition: read pointer equals write pointer
assign rempty = (rptr_gray == wptr_gray_sync);
```

Memory:

- Uses true dual-port RAM for simultaneous access
- No additional buffering required
- Optimal area utilization
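
To see why the full comparison inverts the top two Gray bits, it helps to work through concrete pointer values. The sketch below applies the same two comparisons to binary pointers directly, ignoring synchronizer delay, and checks them against the fill level for the default `ADDR_WIDTH = 4`. The module name `full_empty_demo` and the functions `bin2gray`, `is_full` and `is_empty` are hypothetical and not taken from the RTL.

```systemverilog
// Illustrative sketch only: pointers are compared without synchronizer delay.
module full_empty_demo;
    localparam int ADDR_WIDTH = 4;               // must be at least 2 here
    localparam int DEPTH      = 1 << ADDR_WIDTH; // 16 entries

    typedef logic [ADDR_WIDTH:0] ptr_t;          // address plus wrap bit

    function automatic ptr_t bin2gray(ptr_t bin);
        return (bin >> 1) ^ bin;
    endfunction

    // Full: top two Gray bits inverted, remaining bits equal.
    function automatic logic is_full(ptr_t wptr, ptr_t rptr);
        ptr_t wg, rg;
        wg = bin2gray(wptr);
        rg = bin2gray(rptr);
        return wg == {~rg[ADDR_WIDTH:ADDR_WIDTH-1], rg[ADDR_WIDTH-2:0]};
    endfunction

    // Empty: Gray pointers are identical.
    function automatic logic is_empty(ptr_t wptr, ptr_t rptr);
        return bin2gray(rptr) == bin2gray(wptr);
    endfunction

    initial begin
        int fill;
        // Worked case: rptr = 3 (Gray 00010), wptr = 19 (Gray 11010).
        assert (is_full(ptr_t'(19), ptr_t'(3)))   else $error("expected full");
        assert (!is_full(ptr_t'(18), ptr_t'(3)))  else $error("expected not full");
        assert (is_empty(ptr_t'(3), ptr_t'(3)))   else $error("expected empty");

        // Exhaustive check against the fill level (wptr - rptr, modulo 2*DEPTH).
        for (int w = 0; w < 2 * DEPTH; w++) begin
            for (int r = 0; r < 2 * DEPTH; r++) begin
                fill = (w - r + 2 * DEPTH) % (2 * DEPTH);
                assert (is_full(ptr_t'(w), ptr_t'(r)) == (fill == DEPTH))
                    else $error("full mismatch at w=%0d r=%0d", w, r);
                assert (is_empty(ptr_t'(w), ptr_t'(r)) == (fill == 0))
                    else $error("empty mismatch at w=%0d r=%0d", w, r);
            end
        end
        $display("Full/empty checks finished");
        $finish;
    end
endmodule
```

In the worked case the write pointer is exactly `DEPTH` entries ahead of the read pointer, so the binary pointers differ only in the wrap bit, and that shows up in Gray code as the two most significant bits being inverted.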
```
async-fifo-design/
│
├── README.md                 # This file
├── LICENSE                   # MIT License
├── .gitignore                # Git ignore rules
│
├── docs/
│   └── architecture.md       # Detailed design documentation
│
├── rtl/                      # RTL source files
│   ├── async_fifo.sv         # Top-level module
│   ├── fifomem.sv            # Dual-port RAM
│   ├── rptr_empty.sv         # Read pointer and empty logic
│   ├── sync_r2w.sv           # Read-to-write synchronizer
│   ├── sync_w2r.sv           # Write-to-read synchronizer
│   └── wptr_full.sv          # Write pointer and full logic
│
└── sim/                      # Simulation files
    ├── async_fifo_tb.sv      # SystemVerilog testbench
    └── run_sim.do            # ModelSim simulation script
```

- Language: SystemVerilog (IEEE 1800-2017)
- Tools: Any synthesis tool supporting SystemVerilog
  - Xilinx Vivado
  - Intel Quartus Prime
  - Synopsys Design Compiler
  - Cadence Genus
- Simulators:
  - ModelSim/QuestaSim (Recommended)
  - Xilinx Vivado Simulator
  - Synopsys VCS
  - Cadence Xcelium
Clone the repository:

```bash
git clone https://github.com/dianluniuniu/async-fifo.git
cd async-fifo
```

Run the simulation:

```bash
cd sim

# Using ModelSim
vsim -do run_sim.do

# Or compile and run manually
vlog -work work ../rtl/*.sv async_fifo_tb.sv
vsim work.async_fifo_tb -do "run -all"
```

Synthesize:

```bash
# Add RTL files to your synthesis tool
# Apply appropriate timing constraints
# Synthesize and analyze timing
```

Integrate into your design:

```systemverilog
// Instantiate in your top-level module
async_fifo #(
    .DATA_WIDTH (YOUR_DATA_WIDTH),
    .ADDR_WIDTH (YOUR_ADDR_WIDTH)
) u_cdc_fifo (
    .wclk   (your_wclk),
    .wrst_n (your_wrst_n),
    // ... connect other signals
);
```

References:

- Cliff Cummings Papers:
  - "Simulation and Synthesis Techniques for Asynchronous FIFO Design"
  - "Clock Domain Crossing (CDC) Design & Verification Techniques"
- Books:
  - "Digital Design and Computer Architecture" - Harris & Harris
  - "Advanced FPGA Design" - Steve Kilts
Contributions are welcome! Please feel free to submit a Pull Request. For major changes:
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
- Follow SystemVerilog coding standards
- Add comprehensive comments
- Include testbench for new features
- Update documentation
This project is licensed under the MIT License - see the LICENSE file for details.
Project Maintainer: CHIUKUEI HUANG
Project Link: https://github.com/dianluniuniu/async-fifo
- Based on Cliff Cummings' asynchronous FIFO design methodology
- Inspired by industry-standard CDC practices
- Thanks to the open-source hardware community
- RTL Design Complete
- Testbench Complete
- Functional Verification Complete
- Formal Verification
- Silicon Proven
- FPGA Deployment Examples
If you find this project helpful, please consider giving it a star! ⭐
Made with ❤️ for the Digital Design Community