I've been looking at "single-cycle" processors such as PicoRV32, and I've noticed that, apart from textbook examples with magical, fully combinational, separate instruction and data memories (such as the one in "Digital Design and Computer Architecture" by Harris & Harris), it doesn't seem possible to actually build a true single-cycle processor.
Looking at the state machines of these processors, such as PicoRV32's, I've been trying to wrap my head around how a more realistic processor would work.
My current understanding is that, assuming every memory access returns in one cycle, the absolute fastest a RISC CPU with a Von Neumann style memory (one memory shared by instructions and data) could run is two cycles per instruction.
- Cycle 1: Fetch the current instruction from memory. Since it takes one cycle to get the instruction from memory, we can only wait during this cycle.
- Cycle 2: Decode, execute, and writeback.
Some instructions, such as loads, would require three cycles.
- Cycle 1: Fetch the current instruction from memory.
- Cycle 2: Decode and request the data from memory
- Cycle 3: Execute/writeback to register file
Stores would also have to take three cycles.
- Cycle 1: Fetch the current instruction from memory.
- Cycle 2: Decode/execute and pull the data from the register file
- Cycle 3: Write data to the memory
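
To make the cycle counts above concrete, here is a minimal Python sketch of the state machine I have in mind. It is only a behavioural model under my own assumptions (a single-port memory that always answers in exactly one cycle, and no overlap between instructions); the names `State`, `cycles_for`, and `average_cpi` are made up for illustration and do not come from PicoRV32 or any other core.

```python
from enum import Enum, auto


class State(Enum):
    FETCH = auto()    # instruction request is in flight; the core just waits
    EXECUTE = auto()  # decode, read registers, ALU; ALU results written back here
    MEM = auto()      # load data arrives and is written back, or store data is written


def cycles_for(kind: str) -> int:
    """Count the cycles one instruction spends in the state machine.

    Assumes a single shared memory port with a one-cycle response,
    so a data access can never overlap with the next fetch.
    """
    state = State.FETCH
    cycles = 0
    while True:
        cycles += 1
        if state is State.FETCH:
            state = State.EXECUTE
        elif state is State.EXECUTE:
            if kind in ("load", "store"):
                state = State.MEM  # memory port is busy for one more cycle
            else:
                return cycles      # ALU-style instruction finishes here
        else:  # State.MEM
            return cycles


def average_cpi(mix: dict) -> float:
    """Average cycles per instruction for a mix of {kind: count}."""
    total = sum(mix.values())
    return sum(cycles_for(kind) * count for kind, count in mix.items()) / total


if __name__ == "__main__":
    for kind in ("alu", "load", "store"):
        print(kind, cycles_for(kind))
    # Hypothetical instruction mix, chosen only to show the arithmetic.
    print(average_cpi({"alu": 3, "load": 1}))
```

With this model an ALU instruction takes two cycles and a load or store takes three, so a mix of three ALU instructions per load would average 2.25 cycles per instruction.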
I'm not entirely sure whether my understanding is correct. The state machine for the PicoRV32 processor seems to require a minimum of four cycles (fetch, decode, read from register file, then execute/store/load/shift), which is probably a more straightforward state machine, but I was wondering whether my approach is possible.

