Is there a more efficient/better way to prepare the History-State using poly(n)-depth circuit?

Question

The paper mentions that the following state can be prepared efficiently (in terms of parameter $L$). (Page 7 of this paper)

$$|{\eta}\rangle = \frac{1}{\sqrt{L+1}}\sum_{j=0}^{L} U_j\cdots U_1\Big(|{\psi} \rangle \otimes |{0\cdots 0} \rangle \Big)\otimes |{2^j -1} \rangle $$

Here, each $U_j$ is either a one-qubit or a two-qubit gate. Assume the state $|{\psi} \rangle $ is known to be efficiently preparable. [Remark: It is called history-state, but this detail is not relevant for this post.]

The paper mentions the following steps to implement it with a $\mathsf{poly(L)}$ sized quantum circuit.

Step I: Implement the following state using $\mathsf{poly(L)}$ circuit. $$|{\eta} \rangle = \frac{1}{\sqrt{L+1}}\sum_{j=0}^{L} |{2^j -1} \rangle$$ Although the details of preparation are not provided, I am convinced that the following paper by Grover and Rudolph prescribes a method to implement it. (Link: Grover-Rudolph state preparation algorithm. )

Step II: Apply $L+1$-fold controlled $U_j$-gates to get the desired state. (This requires implementing multi-control multi-target controlled gate operations. The target gate is promised to be at most a two-qubit gate.)

My query: I am looking for an efficient implementation of step II.

My attempt: I am thinking of composing an efficient implementation of-

(i) $L+1$-fold controlled unitaries when the target unitary is a single-qubit gate (Given in fig 4.10 of Nielsen and Chueng textbook) and

(ii) the decomposition of multi-qubit (target) unitaries into a bunch of single-qubit (target) unitaries.

Is a more compact implementation possible?

Remark: One can deploy a gates from any fixed universal gate set of choice.

Thanks for your time and attention!

Using a unary clock encoding would be more efficient. Or is |eta> just an awkward notation for a unary clock? In that case, there is no multi-control. — Norbert Schuch
– Norbert Schuch, Commented Sep 4 at 13:46
Dr. @NorbertSchuch, thanks for the comment. The encoding scheme for the clock register is not entirely clear to me. They have reserved $L$ qubits for computation that occurs over $T=L$ steps. The pattern that I could see is as follows. For j=0, 1, 2, 3, 4, ... the state evaluates to $|0\rangle$, $|1\rangle$, $|3\rangle$, $|7\rangle$, ..., $|15\rangle$. In binary, this decimal number corresponds to $|0\rangle$, $|1\rangle$, $|11\rangle$, $|111\rangle$, $|1111\rangle$, .... It seems like unary clock encoding. — Manish Kumar
– Manish Kumar, Commented Sep 5 at 8:06

Bram · Accepted Answer · 2025-09-05 14:17:22Z

History states are standard in the Feynman–Kitaev construction: you have a work register (with $|\psi\rangle$ and any ancillas) and a clock register encoding the progress $j=0,\dots,L$. How to prepare the history state:

Step I — prepare $\frac{1}{\sqrt{L+1}}\sum_{j=0}^L |2^j-1\rangle$ with $O(L)$ gates

You can efficiently build this with a cascade of $R_y$ rotations, where each rotation is applied only if the previous clock qubit is 1. This implements a “go on or stop” process that yields a uniform distribution over $j\in\{0,\dots,L\}$.

Take clock qubits $c_0,\dots,c_{L-1}$ with $c_0$ the LSB. For step $k=1,\dots,L$ define the “continue-probability”

$$ c_k = \frac{L-k+1}{L-k+2}. $$

Then $\theta_k=\arcsin\sqrt{c_k}$, and apply $R_y(2\theta_k)$.

On $c_0$: $R_y(2\theta_1)$.
For $m=1,\dots,L-1$: on $c_m$ do controlled-$R_y(2\theta_{m+1})$, controlled by $c_{m-1}$.

This ensures that the amplitude weight of each string $|2^j-1\rangle = |\,1^j0^{L-j}\rangle$ is exactly $1/\sqrt{L+1}$.

Step II — attach the history component $U_j\cdots U_1|\psi\rangle$

Let $U_1,\dots,U_L$ be 1- or 2-qubit gates on the work register. With the unary clock (domain-wall encoding), you simply:

Apply $U_k$ controlled on clock qubit $c_{k-1}$.

In branch $j$, that qubit is 1 iff $j\ge k$. So the work register evolves as $U_j\cdots U_1$. No multi-controls needed — just one control per gate.

Step I: $O(L)$ ladder of $R_y$/CRY gives the uniform unary clock.
Step II: add single-controlled $U_k$’s — no multi-controls.
Combined: poly($L$) circuit for $|\eta\rangle$.

Qiskit example Helper functions

from math import asin, sqrt from qiskit import QuantumCircuit, QuantumRegister from qiskit.circuit.library import RYGate def prepare_uniform_prefix_clock(clock: QuantumRegister) -> QuantumCircuit: """Maak (1/sqrt(L+1)) sum_{j=0}^L |2^j - 1> op klokqubits (c_0 is LSB).""" L = len(clock) qc = QuantumCircuit(clock, name="clock_uniform_prefix") # k = 1..L; m = k-1 is het index in 0..L-1, controle altijd op de 'vorige' qubit for m in range(L): # c_{k} = (L-k+1)/(L-k+2) met k=m+1 p_continue = (L - m) / (L - m + 1) # = c_{m+1} theta = 2 * asin(sqrt(p_continue)) ry = RYGate(theta) if m == 0: qc.append(ry, [clock[m]]) else: cry = ry.control(1) qc.append(cry, [clock[m-1], clock[m]]) return qc def add_history_propagation(qc: QuantumCircuit, work: QuantumRegister, clock: QuantumRegister, U_spec: list): """ Voeg gecontroleerde U_k toe. U_spec is een lijst van tuples (gate_or_circ, targets), waarbij targets indices zijn in `work`. - Als gate_or_circ een QuantumCircuit is, wordt `.to_gate()` gebruikt. - targets heeft lengte 1 of 2 (1- of 2-qubit gate). """ assert len(U_spec) == len(clock), "U_spec moet L items hebben (één per tijdstap)." for k, (op, targets) in enumerate(U_spec, start=1): # normaliseer naar Gate if hasattr(op, "to_gate"): gate = op.to_gate(label=f"U{k}") else: gate = op # al een Gate cgate = gate.control(1) # controle op clock qubit c_{k-1} ctrl = clock[k-1] targs = [work[t] for t in targets] qc.append(cgate, [ctrl] + targs) def build_history_state(psi_prep: QuantumCircuit, U_spec: list, L: int) -> QuantumCircuit: """ Bouw de volledige |η>: (1/sqrt(L+1)) sum_j (U_j...U_1 |psi,0...0>) ⊗ |2^j -1>. - psi_prep: circuit dat |psi>⊗|0...0> op 'work' voorbereidt. - U_spec: lijst (gate_or_circ, targets) met lengte L (targets indexeren in 'work'). - L: aantal klokqubits (en #tijdstappen). """ work = QuantumRegister(psi_prep.num_qubits, "work") clock = QuantumRegister(L, "clock") qc = QuantumCircuit(work, clock, name="history_state") # |psi> op work qc.compose(psi_prep, qubits=work, inplace=True) # Stap I: klok-superpositie clk_circ = prepare_uniform_prefix_clock(clock) qc.compose(clk_circ, qubits=clock, inplace=True) # Stap II: gecontroleerde voortgang U_k add_history_propagation(qc, work, clock, U_spec) return qc

How to use

from qiskit.circuit.library import HGate, CXGate # Voorbeeld: |psi> op 2 werkqubits (bv. |+0>) psi = QuantumCircuit(2) psi.h(0) L = 3 # drie tijdstappen # Definieer U1, U2, U3 (elk 1- of 2-qubit) en op welke werkqubits ze inwerken U_spec = [ (HGate(), [1]), # U1: H op work[1] (CXGate(), [0, 1]), # U2: CX work[0]->work[1] (HGate(), [0]), # U3: H op work[0] ] hist = build_history_state(psi, U_spec, L) print(hist)

Output

 Clock probabilities: {np.str_('000'): np.float64(0.2500000000000001), np.str_('001'): np.float64(0.2499999999999999), np.str_('011'): np.float64(0.24999999999999994), np.str_('111'): np.float64(0.24999999999999983)}

thanks! I am going through the details. I will let you know if I have any doubts. — Manish Kumar
– Manish Kumar, Commented Sep 6 at 8:19

Stack Exchange Network

Is there a more efficient/better way to prepare the History-State using poly(n)-depth circuit?

1 Answer 1

Step I — prepare $\frac{1}{\sqrt{L+1}}\sum_{j=0}^L |2^j-1\rangle$ with $O(L)$ gates

Step II — attach the history component $U_j\cdots U_1|\psi\rangle$

Hot Network Questions

Is there a more efficient/better way to prepare the History-State using poly(n)-depth circuit?

1 Answer 1

Step I — prepare $\frac{1}{\sqrt{L+1}}\sum_{j=0}^L |2^j-1\rangle$ with $O(L)$ gates

Step II — attach the history component $U_j\cdots U_1|\psi\rangle$

Related

Hot Network Questions