Debug system compliant with the RISC-V External Debug Support spec version 0.13.2, written in Verilog.
The system conforms to the spec: it supports all the required features and some optional ones, such as the program buffer. This repo contains the code for the Debug Transport Module, or DTM (rtl/dtm_*.v), and the Debug Module, or DM (rtl/dm_*.v). The Debug Module Interface (DMI) is implicit between those modules.
The Debug Module implements 7 registers from the spec:
- 0x04: data0
- 0x10: dmcontrol
- 0x11: dmstatus
- 0x16: abstractcs
- 0x17: command
- 0x20: progbuf0
- 0x30: authdata
Those 7 registers are internally addressed with a 3-bit address space: data0 is 0b000, up to authdata at 0b110, and the 0b111 address is left for a hardwired zero. When the debugger requests any address in the 7-bit space that does not correspond to one of those 7 supported registers, the DTM converts it to 0b111. This is useful because a debugger that wants to know whether a register is supported writes ones to it and reads it back to check for those ones; an unsupported register always reads back as zero.
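The address compression described above could be sketched as a small combinational module. This is only an illustration: the module and port names (dmi_addr_map, addr7_i, addr3_o) are hypothetical and not taken from rtl/dtm_*.v, and the codes between 0b000 and 0b110 are assumed to follow the order of the register list.

```verilog
// Hypothetical sketch: map a 7-bit DMI address to the 3-bit internal one.
module dmi_addr_map (
    input  wire [6:0] addr7_i,  // address requested by the debugger
    output reg  [2:0] addr3_o   // internal DM register address
);
    always @(*) begin
        case (addr7_i)
            7'h04:   addr3_o = 3'b000; // data0
            7'h10:   addr3_o = 3'b001; // dmcontrol  (assumed code)
            7'h11:   addr3_o = 3'b010; // dmstatus   (assumed code)
            7'h16:   addr3_o = 3'b011; // abstractcs (assumed code)
            7'h17:   addr3_o = 3'b100; // command    (assumed code)
            7'h20:   addr3_o = 3'b101; // progbuf0   (assumed code)
            7'h30:   addr3_o = 3'b110; // authdata
            default: addr3_o = 3'b111; // unsupported: hardwired zero
        endcase
    end
endmodule
```

With this mapping, a write to any unsupported address lands on 0b111 and is dropped, so the debugger's read-back probe returns zero.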
You can see the state machine flow in the image below; ignore all the quick access related states, since quick access is not implemented here.
From the Running state (green circle), when the debugger writes 1 to haltreq, the module moves to the Halting state. Here it sends a halt request to the hart (more specifically, to the IF stage) by asserting the appropriate signals. IF then halts and asserts its halted signal, and the signal passes to the next stages one by one. When all the pipeline stages are halted (i.e. the AND of their halted signals is 1), the DM can move to the Halted: Waiting state. From Halted: Waiting the DM can go either to Resuming, if the debugger writes 1 to resumereq, or to Command: Start, if the command register contains an AccessReg command type. The Resuming state is the exact opposite of Halting: the DM deasserts the hart halting signals and waits until IF negates its halted signal (i.e. the AND of all the pipeline stages' halted signals is 0) to move to Running. From Command: Start the DM can move to Command: Transfer, Command: ProgBuf or Command: Done, as shown in the image above. How it moves between those states and what it does in each should be clear from the image.
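A reduced sketch of the halt/resume part of this flow, leaving out all the Command states. It is not the code in rtl/dm_*.v: the module name, the port names and the synchronous active-high reset are assumptions made for illustration.

```verilog
// Hypothetical sketch of the Running/Halting/Halted/Resuming transitions.
module dm_halt_fsm (
    input  wire       clk_i,
    input  wire       rst_i,         // synchronous, active high (assumption)
    input  wire       haltreq_i,     // debugger wrote 1 to haltreq
    input  wire       resumereq_i,   // debugger wrote 1 to resumereq
    input  wire       all_halted_i,  // AND of every pipeline stage's halted signal
    output reg        halt_o,        // halt request towards the IF stage
    output reg  [1:0] state_o
);
    localparam RUNNING  = 2'd0,
               HALTING  = 2'd1,
               HALTED   = 2'd2,      // "Halted: Waiting"
               RESUMING = 2'd3;

    always @(posedge clk_i) begin
        if (rst_i) begin
            state_o <= RUNNING;
            halt_o  <= 1'b0;
        end else begin
            case (state_o)
                RUNNING: if (haltreq_i) begin
                    state_o <= HALTING;
                    halt_o  <= 1'b1;       // ask IF to halt
                end
                HALTING: if (all_halted_i)
                    state_o <= HALTED;     // whole pipeline is halted
                HALTED: if (resumereq_i) begin
                    state_o <= RESUMING;
                    halt_o  <= 1'b0;       // release the hart
                end
                RESUMING: if (~all_halted_i)
                    state_o <= RUNNING;    // IF has negated its halted signal
            endcase
        end
    end
endmodule
```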
When the hart is in debug mode, the IF stage automatically selects the program buffer input as the output instruction instead of the instruction memory. To execute that instruction while in debug mode, the DM negates the halt signal for one cycle (that's what it does in Command: ProgBuf).
The DM FSM implementation may add some states that are not present in the above image. Specifically, this implementation adds one state before Command: ProgBuf. That state is reached from Command: Start or Command: Transfer when postexec is high, and it waits for hrt_halted_i before transitioning to Command: ProgBuf. This is done because the Command: ProgBuf state checks the hart halted signal to move to Command: Done, but also sets the signals required for the hart to resume and execute the program buffer, so the FSM could otherwise move to Command: Done before the hart is really done with it.
The spec states that if the DM supports only one program buffer register (i.e. only one instruction at a time), it should automatically append an ebreak instruction to it. This DM does not write ebreak instructions to the IF stage or to any internal register; it simply asserts the halt signal again one cycle after negating it. Since the halt signal flows down the pipeline stage by stage, the hart is halted again the cycle after the instruction completes.
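The "negate halt for one cycle, then assert it again" behaviour can be pictured with a tiny module like the one below. This is a sketch under assumptions, not the actual DM logic: halt_pulse, exec_i and the reset value are hypothetical, and exec_i is assumed to be a one-cycle strobe issued while the hart is halted.

```verilog
// Hypothetical sketch: release the hart for exactly one cycle per strobe.
module halt_pulse (
    input  wire clk_i,
    input  wire rst_i,    // synchronous, active high (assumption)
    input  wire exec_i,   // one-cycle strobe: execute the program buffer
    output reg  halt_o    // halt request towards the IF stage
);
    always @(posedge clk_i) begin
        if (rst_i)
            halt_o <= 1'b1;      // assume the hart is already halted
        else
            halt_o <= ~exec_i;   // low for one cycle after the strobe, high again after
    end
endmodule
```

Because halt_o is high again on the following cycle, the halt request follows the single instruction down the pipeline, which plays the role of the implicit ebreak.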
This DM only supports the Access Register abstract command. It takes place during the Command: Transfer state, which is reached only if the transfer field in the command register is set during the Command: Start state. The command register also has a regno field (the lowest 16 bits) that selects the target hart register among all integer, floating point and control and status registers. How the 16-bit regno field is converted to the target register is shown in the image below, along with the command register fields. N.B. This DM actually only supports integer registers (CSRs and floating point registers are read via the program buffer), which is the minimum required by the spec. However, it does so by ignoring the upper 11 bits of the regno field (bits 15:5), while it should check that bit 12, and only that bit among them, is set, since the spec places the integer registers at 0x1000 to 0x101f. This will be addressed soon.
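For reference, the missing regno check could look like the sketch below. It relies on the spec's abstract register numbering, where the integer registers occupy 0x1000 to 0x101f; the module and port names are hypothetical and this is not the planned fix itself.

```verilog
// Hypothetical sketch: accept only regno values that name an integer register.
module regno_decode (
    input  wire [15:0] regno_i,    // regno field of the command register
    output wire        is_gpr_o,   // 1 when regno is within 0x1000..0x101f
    output wire [4:0]  gpr_idx_o   // x0..x31 index
);
    // Bits 15:5 must equal 0x1000 >> 5, i.e. only bit 12 set among them.
    assign is_gpr_o  = (regno_i[15:5] == 11'b00010000000);
    assign gpr_idx_o = regno_i[4:0];
endmodule
```

A command whose regno fails this check would then be reported as an error rather than silently aliased onto an integer register.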
The DTM uses JTAG as transport, implementing section 6 of the spec. There's nothing special about it; it simply conforms to the specification.
While designing this module, I first thought about how to solve clock domain issues. The DM is driven by the system clock, while the DTM is driven by the JTAG test clock (tck), so we have to handle the crossing. The DTM issues requests to the DM through a custom interface (DMI), so all the signals of this interface must be synchronized. To talk about CDC we first need to know the DMI protocol.
The interface protocol is as follows:

1. When the DTM wants to start a request, it asserts req_o and cyc. As per the spec, it can do so only during Update-DR and if the previous operation's result isn't sticky, so we check (update & ~sticky). We also check ^op, since the only valid operations are 0b01 and 0b10. Here sticky indicates that the previous operation's result is sticky (0b10 or 0b11) and dmireset is low, thus allowing the user to retry any request by writing one to dmireset (more on this at point 4).
2. When req_o is high, the DTM waits for the DM to assert one of ack_i or err_i (only one of the two can be high at any moment). When the DM acknowledges the request (asserting either ack_i or err_i), the DTM negates req_o. Negating req_o serves as an acknowledge of the DM's acknowledge, so the latter can lower its signal. If the request has been ack'd with ack_i, dm_err (indicating the last operation's result) is 0b00, while it is 0b10 if the DM ack'd us with err_i.
3. However, we can't move back to idle right after req_o is negated; that's the whole purpose of cyc. We want to support any kind of relationship between the DM clock and the DTM one (JTAG), even if it's highly likely that the DM one is much faster. Imagine the following:
   - The DTM asserts req_o during Update-DR to start a new request.
   - During the next Capture-DR the DM hasn't replied yet (~ack_i & ~err_i) because its clock is much slower than the DTM one. We set busy and go on.
   - Then, any time before the next Capture-DR, the DM replies with ack_i and we immediately lower req_o.
   - We then go to Update-DR once more and the DM still asserts ack_i, meaning that it hasn't seen that req_o has been negated yet. However, since we don't check for ack_i and err_i, we start the new request.
   - On the next clock edge the DTM sees ack_i high, thus terminating the request without it having really been performed.
   - That's a big problem!

   To solve it, when req_o is low and cyc is high we enter a state where we wait for the DM to negate ack_i (or err_i), so we negate cyc only when ~ack_i & ~err_i. After that we go back to idle and can start new requests in Update-DR.
4. The result of an operation is written to the DTM data register dmi during Capture-DR. Thus, if the operation started in Update-DR hasn't been acknowledged when the FSM goes to Capture-DR, dm_err is 0b11 and the data register is not written. This also happens if the request is acknowledged during Capture-DR. This module prioritizes correctness over performance, so we don't mind wasting a TAP cycle (FSM cycle, not clock cycle) in such cases.

   Now think about sticky: we said that it's 1 if the result of the last operation (dm_err) is sticky (0b10 or 0b11) and dmireset is low. What would happen if the DM acknowledged our request between Capture-DR and the following Update-DR? The debugger would lose that response. We write to the data register only during Capture-DR, so if our request isn't ack'd by then, we write busy to the data register. Then the DM acks it before Update-DR, we set dm_err to 0b00 (no error), and in Update-DR we start that operation again without the debugger being able to read the previous response. The right behaviour is to start a new request in Update-DR not if the current value of dm_err isn't sticky, but if its value in Capture-DR wasn't! So we evaluate sticky only during Capture-DR, on reset, or when dmireset is high.

   Also, dm_err is evaluated based on cyc instead of req_o. This means that even if the request is ack'd before Capture-DR, we consider the status not busy only if cyc is low during Capture-DR. That's necessary because if we cleared sticky in Capture-DR based on req_o, the debugger would issue a new request that we'd perform during Update-DR, yet nothing guarantees that the DM will have seen req_o negated before that (which is why we introduced cyc, as stated in point 3). This costs performance, but we value correctness over it. Also, dtmcs[14:12] (the idle field) is a hint to the debugger about how many cycles it should spend in the Run-Test/Idle JTAG state to minimize the probability of getting a sticky status. We set it to its maximum value, 0b111, to mitigate these issues.
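The four points above can be condensed into a sketch like the following. It is one possible reading of the protocol, not the code in rtl/dtm_*.v. The module name, the capture_i/update_i/op_i ports, the r_result register and the synchronous reset are assumptions, as is clearing r_result on dmireset. ack_i and err_i are assumed to be already synchronized to tck_i.

```verilog
// Hypothetical sketch of the DTM side of the DMI handshake.
module dtm_dmi_req (
    input  wire       tck_i,
    input  wire       rst_i,       // synchronous, active high (assumption)
    input  wire       capture_i,   // TAP in Capture-DR with dmi selected
    input  wire       update_i,    // TAP in Update-DR with dmi selected
    input  wire       dmireset_i,
    input  wire [1:0] op_i,        // op field shifted in by the debugger
    input  wire       ack_i,       // from DM, already synchronized
    input  wire       err_i,       // from DM, already synchronized
    output reg        req_o,
    output reg        cyc,
    output wire [1:0] dm_err,
    output reg        sticky
);
    reg [1:0] r_result;  // result of the last acknowledged request

    // Point 4: the status is busy (0b11) for as long as cyc is high.
    assign dm_err = cyc ? 2'b11 : r_result;

    always @(posedge tck_i) begin
        if (rst_i) begin
            req_o <= 1'b0;
            cyc   <= 1'b0;
        end else if (req_o) begin
            // Point 2: wait for the DM, then drop req_o.
            if (ack_i | err_i) req_o <= 1'b0;
        end else if (cyc) begin
            // Point 3: wait until the DM has lowered ack_i/err_i.
            if (~ack_i & ~err_i) cyc <= 1'b0;
        end else begin
            // Point 1: start a request for a valid op when not sticky.
            if (update_i & ~sticky & (^op_i)) begin
                req_o <= 1'b1;
                cyc   <= 1'b1;
            end
        end
    end

    // Record how the DM answered: 0b00 for ack_i, 0b10 for err_i.
    always @(posedge tck_i) begin
        if (rst_i | dmireset_i)
            r_result <= 2'b00;            // clearing on dmireset is an assumption
        else if (req_o & (ack_i | err_i))
            r_result <= err_i ? 2'b10 : 2'b00;
    end

    // Point 4: sticky is sampled in Capture-DR only (bit 1 is set for 0b10 and 0b11).
    always @(posedge tck_i) begin
        if (rst_i | dmireset_i) sticky <= 1'b0;
        else if (capture_i)     sticky <= dm_err[1];
    end
endmodule
```

In this sketch a request acknowledged between Capture-DR and Update-DR leaves sticky set from the earlier capture, so the new request is blocked until the debugger has had a chance to see the busy status.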
Now that we've clarified the protocol, we can talk about CDC. To avoid metastability problems we use two 2FF synchronizers for all DMI signals crossing from one clock domain to the other. One of the two synchronizers is clocked by the DM clock, while the other is driven by the DTM JTAG clock (tck_i or tck). Signals going from the DTM to the DM pass through the first synchronizer (clocked by the DM clock), and those going from the DM to the DTM pass through the second. The signals described in the previous section (req_o, ack_i, etc.) go through the synchronizers, as do the others (like we_o, which indicates whether the issued operation is a read or a write).
Synchronizers by themselves are not enough to avoid metastability-related issues; only in combination with the handshaking protocol above can we be sure.
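For readers unfamiliar with the structure, a generic 2FF synchronizer looks roughly like this. It is a textbook sketch rather than the module used in this repo: the name sync_2ff, the WIDTH parameter and the synchronous reset are assumptions.

```verilog
// Generic two-flip-flop synchronizer (illustrative sketch).
module sync_2ff #(
    parameter WIDTH = 1
) (
    input  wire             clk_i,  // destination-domain clock
    input  wire             rst_i,  // synchronous, active high (assumption)
    input  wire [WIDTH-1:0] d_i,    // signal coming from the other clock domain
    output wire [WIDTH-1:0] q_o     // same signal, safe to use in the clk_i domain
);
    reg [WIDTH-1:0] r_meta;  // first stage, may go metastable
    reg [WIDTH-1:0] r_sync;  // second stage, gives r_meta a cycle to settle

    always @(posedge clk_i) begin
        if (rst_i) begin
            r_meta <= {WIDTH{1'b0}};
            r_sync <= {WIDTH{1'b0}};
        end else begin
            r_meta <= d_i;
            r_sync <= r_meta;
        end
    end

    assign q_o = r_sync;
endmodule
```

Passing a multi-bit value through such a synchronizer is only safe if the bits are held stable while they are sampled, which is what the req_o/ack_i handshake provides for the signals that accompany a request.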
dmireset in dtmcs is defined as follows by the RISC-V External Debug Support spec version 0.13.2:
Writing 1 to this bit clears the sticky error state and allows the DTM to retry or complete the previous transaction.
As already stated in the DMI protocol, we clear sticky whenever dmireset is set, allowing the debugger to retry the last operation or issue a new one. However, look at the following scenario:
- Debugger issued an operation that starts during Update-DR.
- The operation is ack'd with err_i immediately after Update-DR, so we set dm_err to 0b10.
- During Capture-DR we assert sticky.
- During Shift-DR the debugger sets dmireset and we negate sticky.
- During Update-DR we retry the operation, which is immediately ack'd with err_i as before, and we set dm_err.
- During Capture-DR we should assert sticky, but dmireset is still set, so we don't, and the debugger will see the request as succeeded.
That's bad. Also, during Shift-DR, dmireset may be set temporarily without the debugger intending it. What we want is to limit dmireset. First, we want to ignore it during shift. Then, the purpose of dmireset is to clear sticky, which is set in Capture-DR, before Update-DR so that a new request can start. So we just have to make sure that we clear sticky before Update-DR. To do that, we add a new register r_dmireset that's set to dtmcs[16] during Shift-DR and is always negated otherwise. Then we set dmireset = r_dmireset & ~shift, so dmireset can be high only for one cycle after leaving Shift-DR. This solves our problem, even if it may not be the best solution.
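The gating just described might be written as follows. Only r_dmireset, dmireset, shift and dtmcs[16] come from the text; the module wrapper, the port names and the synchronous reset are assumptions.

```verilog
// Hypothetical sketch of the dmireset gating.
module dmireset_gate (
    input  wire tck_i,
    input  wire rst_i,       // synchronous, active high (assumption)
    input  wire shift_i,     // TAP in Shift-DR with dtmcs selected
    input  wire dtmcs16_i,   // current value of dtmcs[16] in the shift register
    output wire dmireset_o
);
    reg r_dmireset;

    always @(posedge tck_i) begin
        if (rst_i)        r_dmireset <= 1'b0;
        else if (shift_i) r_dmireset <= dtmcs16_i;  // track bit 16 while shifting
        else              r_dmireset <= 1'b0;       // always negated otherwise
    end

    // Ignored during shift; high for at most one cycle after leaving Shift-DR.
    assign dmireset_o = r_dmireset & ~shift_i;
endmodule
```

On the first cycle after Shift-DR, r_dmireset still holds the last shifted value while shift_i is already low, so dmireset_o pulses once and sticky can be cleared before Update-DR.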
Now we should consider dm_err == 0b11, which indicates a busy condition. If we're busy, it means that cyc is high. So if the debugger sets dmireset we still negate sticky, but that does not allow a new request to start in Update-DR, since our FSM is locked in this state:
```verilog
end else if (cyc) begin
    if (~ack_i & ~err_i) cyc <= 1'b0;
end else begin
```

So even if sticky has been negated, the DTM won't start a new request. To do that, dmihardreset should be used instead:
Writing 1 to this bit does a hard reset of the DTM, causing the DTM to forget about any outstanding DMI transactions. In general this should only be used when the Debugger has reason to expect that the outstanding DMI transaction will never complete (e.g. a reset condition caused an inflight DMI transaction to be cancelled).
The debugger should set dmihardreset to successfully start a new request during the next Update-DR if dm_err currently reports busy. N.B. This DTM doesn't actually support dmihardreset yet, but we're going to fix that soon.


