Parameterizable Asynchronous FIFO with Gray Code Synchronization - A robust clock domain crossing solution in SystemVerilog


Asynchronous FIFO Design

A robust, parameterizable asynchronous FIFO in SystemVerilog for safe clock domain crossing. Read and write pointers cross between clock domains as Gray code, so only one bit changes per increment, and multi-stage synchronizers reduce the chance that metastability propagates into the receiving domain.

✨ Features

  • Parameterizable design: Configurable data width, FIFO depth, and synchronizer stages
  • Gray code synchronization: Only one pointer bit changes per increment, which avoids incoherent multi-bit samples across clock domains
  • Metastability protection: Multi-stage synchronizers for clock domain crossing
  • Full/Empty flag generation: Reliable status flags in respective clock domains
  • Independent clock domains: Supports different frequencies for read and write clocks
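  • Low latency: A written word becomes readable once the write pointer has passed through the read-domain synchronizer (about SYNC_STAGES read clock cycles, 2 by default), with no extra buffering stages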
  • Synthesizable: FPGA and ASIC ready
  • Comprehensive testbench: 12 test cases with 100% pass rate
  • Coprime-clock verification: Tested with coprime clock periods (10:17 and 17:10) so that clock edges sweep through many relative phase offsets
  • Bidirectional testing: Verified for both fast-to-slow and slow-to-fast clock domain transfers

🏗️ Architecture

The Asynchronous FIFO consists of the following key components:

    ┌─────────────────────────────────────────────────────────────────┐
    │                         Async FIFO Top                          │
    │  ┌──────────────┐                              ┌──────────────┐ │
    │  │              │     Gray Code Pointers       │              │ │
    │  │  Write PTR   │─────────────────────────────▶│   Sync R2W   │ │
    │  │   & Full     │                              │              │ │
    │  │              │                              └──────────────┘ │
    │  └──────────────┘                                               │
    │         │                                                       │
    │         ▼                                                       │
    │  ┌──────────────┐                                               │
    │  │              │                                               │
    │  │  Dual-Port   │                                               │
    │  │     RAM      │                                               │
    │  │  (FIFOMEM)   │                                               │
    │  │              │                                               │
    │  └──────────────┘                                               │
    │         │                                                       │
    │         ▼                                                       │
    │  ┌──────────────┐                              ┌──────────────┐ │
    │  │              │     Gray Code Pointers       │              │ │
    │  │  Read PTR    │◀─────────────────────────────│   Sync W2R   │ │
    │  │   & Empty    │                              │              │ │
    │  │              │                              └──────────────┘ │
    │  └──────────────┘                                               │
    └─────────────────────────────────────────────────────────────────┘
         Write Clock Domain                      Read Clock Domain

Key Design Concepts

  1. Gray Code Encoding: Pointers are converted to Gray code before crossing clock domains, ensuring only one bit changes at a time
  2. Multi-Stage Synchronizers: 2-stage (default) flip-flop synchronizers mitigate metastability
  3. Separate Full/Empty Logic: Status flags are generated independently in their respective clock domains
  4. Dual-Port RAM: A memory with one write port and one read port allows simultaneous read and write operations
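
To make the one-bit-change property of concept 1 concrete, here is a minimal self-checking sketch. The module and function names (`gray_demo`, `bin2gray`, `gray2bin`) are hypothetical and not taken from the repository; the pointer width assumes the default ADDR_WIDTH of 4 plus one wrap bit.

```systemverilog
// Illustrative sketch only: gray_demo, bin2gray and gray2bin are hypothetical names.
module gray_demo;
  localparam int W = 5;  // ADDR_WIDTH + 1 for a 16-deep FIFO (assumed default)

  // Binary to Gray: XOR each bit with its more-significant neighbour.
  function automatic logic [W-1:0] bin2gray(input logic [W-1:0] b);
    return b ^ (b >> 1);
  endfunction

  // Gray to binary: each binary bit is the XOR of all Gray bits at or above it.
  function automatic logic [W-1:0] gray2bin(input logic [W-1:0] g);
    logic [W-1:0] b;
    for (int i = 0; i < W; i++) b[i] = ^(g >> i);
    return b;
  endfunction

  initial begin
    for (int n = 0; n < (1 << W); n++) begin
      logic [W-1:0] cur, nxt;
      cur = bin2gray(W'(n));
      nxt = bin2gray(W'(n + 1));  // the cast wraps the last value back to 0
      // Exactly one bit differs between consecutive codes, including the wrap.
      assert ($countones(cur ^ nxt) == 1)
        else $fatal(1, "multi-bit change at %0d", n);
      // Converting back recovers the original binary value.
      assert (gray2bin(cur) == W'(n))
        else $fatal(1, "round trip failed at %0d", n);
    end
    $display("Gray code checks passed for %0d-bit pointers", W);
    $finish;
  end
endmodule
```

Because only one bit differs between neighbouring codes, a synchronizer that samples the pointer mid-transition sees either the old or the new value, never an unrelated one.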

📦 Module Hierarchy

    async_fifo (Top Module)
    ├── wptr_full    - Write pointer and full flag generation
    ├── rptr_empty   - Read pointer and empty flag generation
    ├── sync_r2w     - Synchronize read pointer to write domain
    ├── sync_w2r     - Synchronize write pointer to read domain
    └── fifomem      - Dual-port RAM storage

⚙️ Parameters

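| Parameter   | Default | Description                                 |
|-------------|---------|---------------------------------------------|
| DATA_WIDTH  | 8       | Width of data bus (bits)                    |
| ADDR_WIDTH  | 4       | Address width (FIFO depth = 2^ADDR_WIDTH)   |
| SYNC_STAGES | 2       | Number of synchronizer flip-flop stages     |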

Example Configurations:

  • 8-bit data, 16-deep FIFO: DATA_WIDTH=8, ADDR_WIDTH=4
  • 32-bit data, 256-deep FIFO: DATA_WIDTH=32, ADDR_WIDTH=8

🔌 Interface Signals

Write Clock Domain

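| Signal | Direction | Width        | Description                     |
|--------|-----------|--------------|---------------------------------|
| wclk   | Input     | 1            | Write clock                     |
| wrst_n | Input     | 1            | Write domain reset (active low) |
| winc   | Input     | 1            | Write increment enable          |
| wdata  | Input     | DATA_WIDTH   | Write data                      |
| wfull  | Output    | 1            | FIFO full flag                  |
| waddr  | Output    | ADDR_WIDTH+1 | Write pointer (debug)           |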

Read Clock Domain

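| Signal | Direction | Width        | Description                    |
|--------|-----------|--------------|--------------------------------|
| rclk   | Input     | 1            | Read clock                     |
| rrst_n | Input     | 1            | Read domain reset (active low) |
| rinc   | Input     | 1            | Read increment enable          |
| rdata  | Output    | DATA_WIDTH   | Read data                      |
| rempty | Output    | 1            | FIFO empty flag                |
| raddr  | Output    | ADDR_WIDTH+1 | Read pointer (debug)           |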

💡 Usage Example

Basic Instantiation

    async_fifo #(
      .DATA_WIDTH (8),
      .ADDR_WIDTH (4),
      .SYNC_STAGES(2)
    ) u_async_fifo (
      // Write side
      .wclk   (write_clk),
      .wrst_n (write_rst_n),
      .winc   (write_enable),
      .wdata  (write_data),
      .wfull  (fifo_full),
      .waddr  (write_addr),
      // Read side
      .rclk   (read_clk),
      .rrst_n (read_rst_n),
      .rinc   (read_enable),
      .rdata  (read_data),
      .rempty (fifo_empty),
      .raddr  (read_addr)
    );

Write Operation

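    // Note: winc is registered here, so wfull is sampled one cycle before
    // the write actually reaches the FIFO.
    always_ff @(posedge wclk or negedge wrst_n) begin
      if (!wrst_n) begin
        winc <= 1'b0;
      end else begin
        if (!wfull && write_request) begin
          winc  <= 1'b1;
          wdata <= data_to_write;  // presented together with winc
        end else begin
          winc <= 1'b0;
        end
      end
    end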

Read Operation

    always_ff @(posedge rclk or negedge rrst_n) begin
      if (!rrst_n) begin
        rinc <= 1'b0;
      end else begin
        if (!rempty && read_request) begin
          rinc <= 1'b1;
          // rdata is valid on next clock cycle
        end else begin
          rinc <= 1'b0;
        end
      end
    end

🧪 Simulation

Requirements

  • ModelSim/QuestaSim: Siemens EDA (formerly Mentor Graphics) simulator
  • Vivado Simulator: Xilinx XSim
  • VCS: Synopsys simulator
  • Any SystemVerilog compatible simulator

Running the Testbench

Using ModelSim

    # Navigate to sim directory
    cd sim

    # Create the work library (needed once before compiling)
    vlib work

    # Compile source files
    vlog -work work ../rtl/*.sv
    vlog -work work async_fifo_tb.sv

    # Run simulation in batch mode
    vsim -c work.async_fifo_tb -do "run -all"

    # Or with GUI and waveforms
    vsim work.async_fifo_tb
    # Then, at the vsim prompt:
    add wave -r /*
    run -all

Using Vivado

    # Create project and add files
    vivado -mode batch -source compile.tcl

    # Or use the XSim command-line tools (run from the sim/ directory)
    xvlog --sv ../rtl/*.sv async_fifo_tb.sv
    xelab async_fifo_tb -debug typical
    xsim work.async_fifo_tb -gui

ModelSim DO Script

Save this as run_sim.do in the sim/ directory:

    # Clean up
    if {[file exists work]} {
        vdel -all
    }

    # Create work library
    vlib work

    # Compile RTL files
    vlog -work work -sv ../rtl/fifomem.sv
    vlog -work work -sv ../rtl/sync_r2w.sv
    vlog -work work -sv ../rtl/sync_w2r.sv
    vlog -work work -sv ../rtl/wptr_full.sv
    vlog -work work -sv ../rtl/rptr_empty.sv
    vlog -work work -sv ../rtl/async_fifo.sv

    # Compile testbench
    vlog -work work -sv async_fifo_tb.sv

    # Run simulation
    vsim -voptargs=+acc work.async_fifo_tb

    # Add waves
    add wave -r /*

    # Run
    run -all

Then execute:

    vsim -do run_sim.do

✅ Verification

The testbench includes 12 comprehensive test cases:

  1. Basic Write and Read - Simple data transfer verification
  2. Fill and Empty FIFO - Full capacity testing
  3. Full Flag Test - Verify full flag assertion and write blocking
  4. Empty Flag Test - Verify empty flag assertion and read blocking
  5. Wrap Around Test - Multiple fill/empty cycles
  6. Random Write and Read - Randomized concurrent operations
  7. Back-to-Back Operations - Continuous read/write stress
  8. Fast Write, Slow Read - Clock rate difference testing
  9. Slow Write, Fast Read - Reverse clock rate testing
  10. Burst Operations - Multiple burst transfers
  11. Corner Cases - Special data patterns (0x00, 0xFF, 0xAA, 0x55, etc.)
  12. Stress Test - Heavy randomized concurrent traffic
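
Data-integrity tests such as cases 1, 6 and 12 are commonly built around a queue-based scoreboard. The sketch below shows that general idea only; it is not the repository's testbench, and `scoreboard_demo`, `on_write` and `on_read` are hypothetical names. It assumes the default DATA_WIDTH of 8.

```systemverilog
// Illustrative sketch only: not the repository's testbench.
module scoreboard_demo;
  localparam int DATA_WIDTH = 8;  // assumed default

  logic [DATA_WIDTH-1:0] expected_q [$];  // words written but not yet read
  int errors;

  // Call when a write is accepted (winc high while wfull is low).
  function automatic void on_write(input logic [DATA_WIDTH-1:0] d);
    expected_q.push_back(d);
  endfunction

  // Call when a read is accepted; compares against the oldest outstanding word.
  function automatic void on_read(input logic [DATA_WIDTH-1:0] d);
    logic [DATA_WIDTH-1:0] want;
    if (expected_q.size() == 0) begin
      errors++;
      $error("read with no outstanding write");
    end else begin
      want = expected_q.pop_front();
      if (d !== want) begin
        errors++;
        $error("expected %h, got %h", want, d);
      end
    end
  endfunction

  initial begin
    errors = 0;
    // Self-check with two of the corner-case patterns, read back in order.
    on_write(8'hAA);
    on_write(8'h55);
    on_read(8'hAA);
    on_read(8'h55);
    assert (errors == 0 && expected_q.size() == 0)
      else $fatal(1, "scoreboard self-check failed");
    $display("scoreboard self-check passed");
    $finish;
  end
endmodule
```

In a real testbench `on_write` would be called from the write clock domain and `on_read` from the read clock domain, so the queue also confirms that ordering is preserved across the crossing.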

Clock Domain Crossing Verification

The testbench uses coprime clock periods so that the relative phase between the two clocks keeps shifting instead of locking into a fixed pattern:

Test Configuration 1: Fast-to-Slow Transfer

    WCLK_PERIOD = 10ns   // 100 MHz (Write Clock)
    RCLK_PERIOD = 17ns   // 58.8 MHz (Read Clock)

Test Configuration 2: Slow-to-Fast Transfer

    WCLK_PERIOD = 17ns   // 58.8 MHz (Write Clock)
    RCLK_PERIOD = 10ns   // 100 MHz (Read Clock)
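
One way such clocks might be generated in a testbench is sketched below. This is an assumption about the general approach, not a copy of `async_fifo_tb.sv`; the module name `clkgen_demo` and the use of `real` parameters are illustrative choices.

```systemverilog
`timescale 1ns/1ps
// Illustrative sketch only: clkgen_demo is a hypothetical module.
module clkgen_demo;
  parameter real WCLK_PERIOD = 10.0;  // ns; swap the two values for Configuration 2
  parameter real RCLK_PERIOD = 17.0;  // ns

  logic wclk = 1'b0;
  logic rclk = 1'b0;

  // Free-running clocks; each toggles every half period.
  always #(WCLK_PERIOD / 2.0) wclk = ~wclk;
  always #(RCLK_PERIOD / 2.0) rclk = ~rclk;

  initial begin
    // Run for two full 170 ns repeat intervals, then stop.
    #(2 * WCLK_PERIOD * RCLK_PERIOD);
    $display("stopping at %0t", $time);
    $finish;
  end
endmodule
```

The 1 ps time precision is chosen so that the 8.5 ns half period of the 17 ns clock is represented exactly.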

Why Coprime Clock Periods?

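  • GCD(10, 17) = 1: The clock periods are coprime (no common divisor other than 1)
  • Phase Coverage: Edges of the 10ns clock fall on all 17 distinct 1ns offsets within the 17ns period, and edges of the 17ns clock on all 10 offsets within the 10ns period
  • LCM(10, 17) = 170ns: The phase relationship repeats every 170ns, so every offset is revisited many times during a run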
  • Realistic Testing: Avoids the pitfall of integer-ratio clocks where certain phase combinations never occur
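
The offset argument above can be checked with a few lines of arithmetic. The sketch below assumes both clocks start aligned at time zero and uses a hypothetical module name, `phase_demo`; it counts how many distinct positions the 10 ns clock's edges take inside the 17 ns period over one 170 ns repeat interval.

```systemverilog
// Illustrative sketch only: phase_demo is a hypothetical module.
module phase_demo;
  localparam int WCLK_PERIOD = 10;  // ns
  localparam int RCLK_PERIOD = 17;  // ns

  bit seen [RCLK_PERIOD];  // one flag per 1 ns offset within the 17 ns period
  int distinct;

  initial begin
    distinct = 0;
    // One repeat interval (170 ns) contains RCLK_PERIOD edges of the 10 ns clock.
    for (int k = 0; k < RCLK_PERIOD; k++) begin
      int offset;
      offset = (k * WCLK_PERIOD) % RCLK_PERIOD;  // edge position inside the 17 ns period
      if (!seen[offset]) begin
        seen[offset] = 1'b1;
        distinct++;
      end
    end
    $display("distinct offsets: %0d of %0d", distinct, RCLK_PERIOD);
    // Coprime periods mean no offset repeats before the interval ends.
    assert (distinct == RCLK_PERIOD) else $fatal(1, "offsets repeat early");
    $finish;
  end
endmodule
```

With periods that share a factor, for example 10 ns and 20 ns, the same loop would report only two distinct offsets, which is the integer-ratio pitfall mentioned above.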

Both configurations have been verified with 100% test pass rate, confirming robust operation in:

  • ✅ Fast-to-slow clock domain crossing
  • ✅ Slow-to-fast clock domain crossing
  • ✅ Every phase offset that the 10ns / 17ns clock pair produces (1ns granularity)
  • ✅ Different clock frequency ratios

Test Results

Configuration 1 (Fast Write, Slow Read - 10:17):

    ================================================================================
    FINAL TEST REPORT
    ================================================================================
    Total Tests:  12
    Tests Passed: 12
    Tests Failed: 0
    ================================================================================
    *** ALL TESTS PASSED ***
    ================================================================================

Configuration 2 (Slow Write, Fast Read - 17:10):

    ================================================================================
    FINAL TEST REPORT
    ================================================================================
    Total Tests:  12
    Tests Passed: 12
    Tests Failed: 0
    ================================================================================
    *** ALL TESTS PASSED ***
    ================================================================================

Coverage

  • Functional coverage: 100%
  • Full and Empty conditions: Verified
  • Clock domain crossing scenarios: Both fast-to-slow and slow-to-fast
  • Phase relationship coverage: Phase offsets swept at 1ns granularity via coprime clock periods
  • Different clock frequency ratios: 10:17 and 17:10 (1.7:1 bidirectional)
  • Data integrity verification: All data correctly transferred across clock domains
  • Pointer wrap-around: Multiple cycles tested
  • Concurrent read/write operations: Random and stress testing
  • Metastability protection: Multi-stage synchronizer paths exercised across the simulated phase offsets (RTL simulation does not model metastability itself)

🎯 Design Highlights

Clock Domain Crossing Safety

The design implements industry-standard CDC techniques with comprehensive verification:

    // Gray code conversion (wptr_full.sv)
    assign wptr_gray = (wptr >> 1) ^ wptr;

    // Multi-stage synchronizer (sync_r2w.sv)
    always_ff @(posedge wclk or negedge wrst_n) begin
      if (!wrst_n) begin
        sync_reg <= '0;
      end else begin
        sync_reg[0] <= rptr_gray;
        for (int i = 1; i < SYNC_STAGES; i++) begin
          sync_reg[i] <= sync_reg[i-1];
        end
      end
    end
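
The excerpt above omits the declarations around `sync_reg`. A self-contained version could look like the sketch below; the module name `sync_ptr`, the port names, and the packed two-dimensional declaration are assumptions made for illustration, and the repository's `sync_r2w.sv` and `sync_w2r.sv` may be written differently.

```systemverilog
// Illustrative sketch only: sync_ptr is a hypothetical module.
module sync_ptr #(
  parameter int ADDR_WIDTH  = 4,
  parameter int SYNC_STAGES = 2
) (
  input  logic                clk,           // destination-domain clock
  input  logic                rst_n,         // destination-domain reset, active low
  input  logic [ADDR_WIDTH:0] ptr_gray,      // Gray-coded pointer from the source domain
  output logic [ADDR_WIDTH:0] ptr_gray_sync  // pointer as seen in the destination domain
);
  // Packed 2-D register chain, so a single '0 clears every stage.
  logic [SYNC_STAGES-1:0][ADDR_WIDTH:0] sync_reg;

  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      sync_reg <= '0;
    end else begin
      sync_reg[0] <= ptr_gray;  // first stage may go metastable
      for (int i = 1; i < SYNC_STAGES; i++) begin
        sync_reg[i] <= sync_reg[i-1];  // later stages give it time to settle
      end
    end
  end

  // Only the last stage is used by downstream logic.
  assign ptr_gray_sync = sync_reg[SYNC_STAGES-1];
endmodule
```

The same module could serve both directions by connecting the write clock with the read pointer, or the read clock with the write pointer.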

Verification Strategy:

  • Coprime Clock Periods: GCD(10, 17) = 1 makes the relative clock phase step through every 1ns offset before repeating
  • Bidirectional Testing: Both fast-to-slow and slow-to-fast transfers verified
  • Phase Traversal: The non-integer clock ratio sweeps the phase offset across the whole clock period every 170ns
  • Real-world Scenarios: Avoids artificial synchronization that occurs with integer-ratio clocks

Full/Empty Generation

    // Full condition: the write pointer is exactly one FIFO depth ahead of the
    // synchronized read pointer (top two Gray bits inverted, remaining bits equal)
    assign wfull = (wptr_gray == {~rptr_gray_sync[ADDR_WIDTH:ADDR_WIDTH-1],
                                  rptr_gray_sync[ADDR_WIDTH-2:0]});

    // Empty condition: read pointer equals the synchronized write pointer
    assign rempty = (rptr_gray == wptr_gray_sync);
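
The following self-checking sketch shows why the top-two-bits comparison detects a full FIFO. It sweeps every read pointer value and every fill level from 0 to the FIFO depth. The names `flag_demo`, `is_full` and `is_empty` are hypothetical, and the sketch assumes ADDR_WIDTH = 4; it ignores synchronizer delay and compares the pointers directly.

```systemverilog
// Illustrative sketch only: flag_demo, is_full and is_empty are hypothetical names.
module flag_demo;
  localparam int ADDR_WIDTH = 4;               // 16-entry FIFO (assumed default)
  localparam int PW         = ADDR_WIDTH + 1;  // pointer width including the wrap bit
  localparam int DEPTH      = 1 << ADDR_WIDTH;

  function automatic logic [PW-1:0] bin2gray(input logic [PW-1:0] b);
    return b ^ (b >> 1);
  endfunction

  // Full: Gray pointers differ in the top two bits and match in the rest.
  function automatic logic is_full(input logic [PW-1:0] wg, input logic [PW-1:0] rg);
    return wg == {~rg[ADDR_WIDTH:ADDR_WIDTH-1], rg[ADDR_WIDTH-2:0]};
  endfunction

  // Empty: Gray pointers are identical.
  function automatic logic is_empty(input logic [PW-1:0] wg, input logic [PW-1:0] rg);
    return rg == wg;
  endfunction

  initial begin
    for (int r = 0; r < (1 << PW); r++) begin
      for (int n = 0; n <= DEPTH; n++) begin
        logic [PW-1:0] wg, rg;
        rg = bin2gray(PW'(r));
        wg = bin2gray(PW'(r + n));  // write pointer is n entries ahead, modulo 2^PW
        // Full must be flagged only when n equals the depth.
        assert (is_full(wg, rg) == (n == DEPTH))
          else $fatal(1, "full mismatch r=%0d n=%0d", r, n);
        // Empty must be flagged only when the pointers coincide.
        assert (is_empty(wg, rg) == (n == 0))
          else $fatal(1, "empty mismatch r=%0d n=%0d", r, n);
      end
    end
    $display("full/empty checks passed for depth %0d", DEPTH);
    $finish;
  end
endmodule
```

Adding the depth to a binary pointer flips only its wrap bit, and after Gray conversion that flip shows up in the top two bits, which is why those two bits are inverted in the comparison.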

Memory Efficiency

  • Uses a dual-port RAM (one write port, one read port) for simultaneous access
  • No additional buffering required
  • Optimal area utilization

📁 File Structure

    async-fifo-design/
    │
    ├── README.md                 # This file
    ├── LICENSE                   # MIT License
    ├── .gitignore                # Git ignore rules
    │
    ├── docs/
    │   └── architecture.md       # Detailed design documentation
    │
    ├── rtl/                      # RTL source files
    │   ├── async_fifo.sv         # Top-level module
    │   ├── fifomem.sv            # Dual-port RAM
    │   ├── rptr_empty.sv         # Read pointer and empty logic
    │   ├── sync_r2w.sv           # Read-to-write synchronizer
    │   ├── sync_w2r.sv           # Write-to-read synchronizer
    │   └── wptr_full.sv          # Write pointer and full logic
    │
    └── sim/                      # Simulation files
        ├── async_fifo_tb.sv      # SystemVerilog testbench
        └── run_sim.do            # ModelSim simulation script

🔧 Requirements

RTL Synthesis

  • Language: SystemVerilog (IEEE 1800-2017)
  • Tools: Any synthesis tool supporting SystemVerilog
    • Xilinx Vivado
    • Intel Quartus Prime
    • Synopsys Design Compiler
    • Cadence Genus

Simulation

  • ModelSim/QuestaSim (Recommended)
  • Xilinx Vivado Simulator
  • Synopsys VCS
  • Cadence Xcelium

🚀 Getting Started

1. Clone the Repository

    git clone https://github.com/dianluniuniu/async-fifo.git
    cd async-fifo

2. Run Simulation

    cd sim

    # Using ModelSim
    vsim -do run_sim.do

    # Or compile and run manually
    vlog -work work ../rtl/*.sv async_fifo_tb.sv
    vsim work.async_fifo_tb -do "run -all"

3. Synthesize

    # Add RTL files to your synthesis tool
    # Apply appropriate timing constraints
    # Synthesize and analyze timing

4. Integrate into Your Design

    // Instantiate in your top-level module
    async_fifo #(
      .DATA_WIDTH (YOUR_DATA_WIDTH),
      .ADDR_WIDTH (YOUR_ADDR_WIDTH)
    ) u_cdc_fifo (
      .wclk   (your_wclk),
      .wrst_n (your_wrst_n),
      // ... connect other signals
    );

🎓 Learning Resources

Recommended Reading

  1. Cliff Cummings Papers:

    • "Simulation and Synthesis Techniques for Asynchronous FIFO Design"
    • "Clock Domain Crossing (CDC) Design & Verification Techniques"
  2. Books:

    • "Digital Design and Computer Architecture" - Harris & Harris
    • "Advanced FPGA Design" - Steve Kilts
  3. Online Resources:

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Development Guidelines

  • Follow SystemVerilog coding standards
  • Add comprehensive comments
  • Include testbench for new features
  • Update documentation

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

📧 Contact

Project Maintainer: CHIUKUEI HUANG

Project Link: https://github.com/dianluniuniu/async-fifo

🙏 Acknowledgments

  • Based on Cliff Cummings' asynchronous FIFO design methodology
  • Inspired by industry-standard CDC practices
  • Thanks to the open-source hardware community

📊 Project Status

  • RTL Design Complete
  • Testbench Complete
  • Functional Verification Complete
  • Formal Verification
  • Silicon Proven
  • FPGA Deployment Examples

⭐ Star History

If you find this project helpful, please consider giving it a star! ⭐


Made with ❤️ for the Digital Design Community
