quantum compiler | NotionNext BLOG

酉矩阵分解 Quantum circuit synthesis Enabling Full-Stack Quantum Computing with Changeable Error-Corrected Qubits Towards Optimal Topology Aware Quantum Circuit Synthesis Tackling the Qubit Mapping Problem with Permutation-Aware Synthesis(PAS)Tackling the Qubit Mapping Problem for NISQ-Era Quantum Devices （SABRE）Quantum Transpilation

In order to evaluate a decomposition of a quantum circuit, we consider quantum circuit costs. The space cost of a circuit, i.e., the number of qubits (or qutrits), is referred to as circuit width. Requiring ancilla increases the circuit width and, therefore, the space cost of a circuit. The time cost for a circuit is the depth of a circuit. The depth is given as the length of the critical path from input to output.

酉矩阵分解

For 2x2 unitary, it is just a U3-gate.

For 4x4 unitary, TwoQubitBasisDecomposer is used. TwoQubitBasisDecomposer implements KAK decomposition method described in arXiv:0806.4015 by Drury and Love. This method uses optimal number of CNOT gates.

For larger unitaries, Isometry class is used. This class implements the method introduced by Iten et al. in arXiv:1501.06911. This method achieves the theoretical lower bound on the number of used CNOT gates.

Update: In Qiskit version 0.37 a new module qiskit.quantum_info.synthesis.qsd is added to apply Quantum Shannon Decomposition of arbitrary unitaries. This functionality replaces the previous isometry-based approach.

The Quantum Shannon Decomposition (arXiv:quant-ph/0406176) uses about half the CNOT gates as the isometry implementation when decomposing unitary matrices of greater than two qubits.

qiskit：

The method used depends on the size of the gate. For single qubit gates the transpiler uses Euler decomposition. For two-qubit gates see appendix B of this paper: https://arxiv.org/pdf/1811.12926.pdf, and for more qubits the transpiler uses column-by-column decomposition (see: https://arxiv.org/pdf/1501.06911.pdf).

Quantum circuit synthesis

Quantum circuit synthesis is the process of taking a description of a desired unitary matrix and decomposing it into a sequence of smaller unitaries, representing the gates in a circuit that implements the target unitary.

大矩阵一系列小矩阵

量子电路综合是将所需的幺正矩阵描述分解成一系列较小的幺正矩阵的过程，这些较小的矩阵表示电路中实现目标幺正变换的门。该过程对于设计和优化特定任务的量子电路以及在物理量子计算机上实现量子算法非常重要。

量子电路综合涉及查找实现所需幺正变换的有效门操作序列。这可能是一个具有挑战性的问题，尤其是对于大型和复杂的幺正矩阵，因此已经开发了各种算法来解决这个问题。一种常见的方法是使用Solovay-Kitaev算法，该算法涉及使用固定门集合中的门序列来逼近目标幺正矩阵。其他方法包括使用进化算法、优化技术和启发式算法。

Solovay-Kitaev 算法的问题：

This method served as a proof-of-concept for the field of synthesis, but the circuits it produces are very long.

综合的主要目标之一：

to produce “short” circuits, minimizing metrics such as CNOT count, total gate count, and critical path length.

CNOT count is of particular importance on NISQ devices, since they contribute significantly more to the overall noise and runtime of circuits than single-qubit gates.

为什么要先于Quantum circuit synthesis 进行qubit mapping?

Qubit mapping会引入swap gates，而swap gates会导致cnot gates增多，从而降低保真度。

如果在进行Quantum circuit synthesis之前进行qubit mapping，可以在原始电路中添加swap gates，然后在之后的Quantum circuit synthesis过程中尽可能地减少cnot gates的数量。

此外，直接对原始电路进行qubit mapping也可以减少swap gates的数量，因为cnot的模式是有规律的。如果先进行Quantum circuit synthesis，会打乱量子应用中cnot的规律，使得qubit mapping算法无法减少swap gates的数量。

Enabling Full-Stack Quantum Computing with Changeable Error-Corrected Qubits

Eastin-Knill theorem

The Eastin-Knill theorem is a fundamental result in quantum error correction and fault-tolerant quantum computing.

Quantum error correction is necessary for practical quantum computing due to the inherent susceptibility of quantum systems to noise and errors. Error correction codes allow us to correct for such errors and protect the integrity of quantum information.

One of the desirable properties in quantum error correction is "transversality". A quantum operation (or gate) is said to be transversal if it acts independently on each qubit of the encoded state. Transversal operations are inherently fault-tolerant because an error on a single physical qubit can only affect one qubit in the logical (encoded) state.

However, the Eastin-Knill theorem (published by Bryan Eastin and Emanuel Knill in 2009) shows that there is a fundamental tradeoff between error correction and computational power. Specifically, the theorem states that no quantum error-correcting code can have a transversal implementation of a complete set of quantum gates. In simpler terms, you cannot have a full set of quantum gates that all operate transversally within an error correcting code. This is a significant restriction because it implies that we cannot perform universal quantum computation solely using transversal gates, which are highly desirable for their error-resilience.

The result doesn't mean fault-tolerant quantum computing is impossible. It simply means we have to be more clever about how we implement our gate sets in a fault-tolerant way. Many current schemes, such as the surface code, get around the Eastin-Knill theorem by using a small set of transversal gates combined with other techniques like magic state distillation to achieve universality.

The Eastin–Knill theorem does not prohibit protocols that provide fault tolerant quantum computation. Some methods of getting around Eastin–Knill are:

Code switching

Code drift

Using continuous variables

Shor's fault tolerant toffoli gate

Teleportation of gates

Magic state distillation

Multiple partitions

Pieceable fault tolerance

Universal braiding

Code Switching

For constructing transversal gates in a universal set, we can change a QEC code A into another code B , we execute the transversal gate which are not transversal in code A, then we

change the code B back into code A.

flag-bridge circuit

attempt to resolve the scenario in which the parity qubit is not directly connected to the data qubits.

L.Lao and C.G.Almudéver.Fault-tolerant quantum error correction on near-term quantum processors using flag and bridge qubits. Physical Review A, 101:032333, 2020.

Remote CX gate by a GHZ-based approach

attempt to resolve the scenario in which direct physical CNOT gates are performed between data qubits that are not physically coupled.

we prepare a GHZ state using the qubits rather than SWAP for low cost.

Observation 1: a good data qubit layout for a QEC-CS logical qubit should first ensure the reliability of error detection, then improve the logical CX, and finally optimize overhead of code switching by fine-tuning the data qubit layout.

The placement of logical qubits affects the performance of logical CX gates and the ‘logical’ connectivity. By co-designing the logical qubit layout design and the compiler design, we may further improve the fidelity or reduce the latency of specific programs.

Observation 3: Most logical gates should be executed in the Steane mode of QEC-CS logical qubits while the code switching should be applied in a context-aware way to promote the computational benefit of the QEC-CS architecture.

Towards Optimal Topology Aware Quantum Circuit Synthesis

本文介绍了一种算法，用于将任意的酉操作编译成一个原生于量子处理器的门序列。

该算法的应用可以优化在嘈杂中间规模量子设备上的量子程序的编译，最小化使用容易出错的CNOT门的数量，同时考虑到量子处理器的连通性。该算法的优秀性能可以帮助量子处理器设计师发现新的门集并优化量子处理器的性能。此外，该算法还可以在量子编译工具链中发挥作用，优化量子程序的编译过程。

该算法为量子计算领域提供了一种优化量子程序编译的有效方法。

Quantum circuit synthesis is an approach to derive a circuit that implements a given unitary and can thus facilitate advances in all these directions: hardware, software and algorithmic exploration.

近似酉矩阵、态（列向量形式）的区分、酉矩阵（方块矩阵形式）的区分

Thus, it is often the case where we want to perform a particular quantum operation A and because of external constraints we end up performing an approximation B, where B ≠ A. Deciding which algorithm has executed is often referred to as distinguishability and several metrics with op- erational motivation have been proposed.

Trace distance and fidelity [29]–[31] have been proposed for distinguishing states.

Metrics such as the diamond norm [32] have been introduced to distinguish processes (algorithms).

minimize . choose an error threshold , use it for convergence, .

Early synthesis algorithms use the diamond norm, while more recent efforts [14], [33] use the Hilbert-Schmidt inner product of and .

While topology is important for superconducting qubits, implementations using trapped ion [35] qubits provide all-to- all connectivity through Mølmer-Sørensen [36] gates.

Tackling the Qubit Mapping Problem with Permutation-Aware Synthesis(PAS)

Qubit mapping and routing

mapping: finding the initial logical-to-physical qubit mapping

映射：找到初始的逻辑qubit到物理qubit映射

routing: applying SWAP gates to move the qubits to physically connected qubits

路由：应用SWAP门将量子比特移动到物理上连接的量子比特

Both are NP-hard problems

There are two types of qubit mapping algorithms: heuristic and optimal ones.

已知的：一类是基于启发式的算法，另一类则是寻求最优映射的算法。

1）启发式算法：SABRE[20]是一个被广泛使用的启发式算法，已被Qiskit编译器[7]以及许多其他的路由算法[22]，[28]所采纳。在SABRE中，电路被划分为多个层次，算法在最前端的层次进行门的路由，并通过一种启发式代价函数选择路径，这种函数是基于映射到物理量子位之间的距离来计算的。该启发式代价函数具有前瞻性地对最前端的层次进行路由，使得在最前端的层次和扩展层次（也就是未来将被路由的门所在的层次）的门的路由代价之间达到平衡。初始的映射是根据电路的反向遍历进行更新的。受到SABRE的启发，研究者们提出了许多不同的启发式算法。比如，Niu等人基于校准数据，提出了一种分层的硬件感知启发式算法[28]。Liu等人[22]则提出了一种优化感知的启发式算法，旨在在电路优化后最小化两量子位门的数量。此外，还有许多其他的启发式算法，包括TKET[35]，基于换位的路由[17]，基于模拟退火的路由[48]，动态前瞻[49]，以及时间最优映射[47]等。

2）最优算法：另一类路由算法是最优量子位映射器。最优映射器通过将映射问题转换为一系列的约束条件，并通过最优求解器找到具有最优的SWAP门数量或最优的深度的电路，从而解决映射问题。例如，OLSQ[37]方法将此问题构造为一个满足性模理论（SMT）优化问题，然后使用Z3 SMT求解器[26]来找到最优的电路。在Qiskit中的BIP映射器[27]通过解决一个二进制整数规划（BIP）问题来找到最优的映射和路由。然而，由于搜索空间的指数增长，这些基于约束的求解器都面临着显著的可扩展性问题。除了可扩展性问题之外，综合算法可能生成比最优映射器生成的电路更小的电路。

In contrast to this work, QGo is an optimization framework and is applied after qubit mapping and routing. It relies on the qubit mapping algorithm to find the logical-to-physical qubit mapping and cannot change the placement of the blocks.

与此工作相反，QGo是一个优化框架，应该在量子比特映射和路由之后应用。它依赖于量子比特映射算法来找到逻辑到物理比特的映射，无法改变块的放置。

Synthesis algorithms can map small circuits to physical devices with fewer gates compared with routing algorithms.

综合算法可以将小电路映射到物理设备上，相比路由算法，使用的门数量更少。

3 insights:

Insight 1: Creating and preserving complex operations (blocks) may be beneficial for qubit mapping and routing.

Insight 2: Introducing input and output permutation for the circuit block may reduce the circuit size.

Insight 3: Permutation-aware synthesis can directly map a circuit to a physical device with higher quality than routing algorithms can achieve.

The quantum compiler can benefit from preserving complex operations, input/output permutation, and permutation-aware synthesis for circuit mapping and routing to reduce circuit size and improve quality on physical devices.

A permutation-aware mapping (PAM) framework based on the aforementioned insights is proposed.

PAM partitions a circuit into blocks

It performs permutation-aware synthesis for each block

It runs a block-based mapping for the blocks.

基于上述洞见，提出了一个置换感知映射（PAM）框架。

PAM将电路划分为块，

对每个块进行置换感知综合，

并针对块，运行基于块的映射。

Tackling the Qubit Mapping Problem for NISQ-Era Quantum Devices （SABRE）

Abstract: Due to little consideration in the hardware constraints, e.g., limited connections between physical qubits to enable twoqubit gates, most quantum algorithms cannot be directly executed on the Noisy Intermediate-Scale Quantum (NISQ) devices. Dynamically remapping logical qubits to physical qubits in the compiler is needed to enable the two-qubit gates in the algorithm, which introduces additional operations and inevitably reduces the fidelity of the algorithm. Previous solutions in finding such remapping suffer from high complexity, poor initial mapping quality, and limited flexibility and controllability. To address these drawbacks mentioned above, this paper proposes a SWAP-based BidiREctional heuristic search algorithm (SABRE), which is applicable to NISQ devices with arbitrary connections between qubits. By optimizing every search attempt, globally optimizing the initial mapping using a novel reverse traversal technique, introducing the decay effect to enable the trade-off between the depth and the number of gates of the entire algorithm, SABRE outperforms the best known algorithm with exponential speedup and comparable or better results on various benchmarks.

摘要：由于在硬件约束方面考虑不足，例如，物理量子比特之间的连接有限以使得两量子比特门难以实现，大多数量子算法不能在噪声中间规模量子（NISQ）设备上直接执行。编译器中需要动态重新映射逻辑量子比特到物理量子比特以使算法中的两量子比特门生效，这引入了额外的操作并不可避免地降低了算法的保真度。以往的解决方案在找到这种重新映射方面存在高复杂性、初始映射质量差和灵活性和可控性受限等问题。为了解决上述缺点，本文提出了一种基于SWAP的双向启发式搜索算法（SABRE），适用于具有任意量子比特之间连接的NISQ设备。通过优化每次搜索尝试，使用一种新颖的反向遍历技术全局优化初始映射，引入衰减效应以实现整个算法深度和门数之间的权衡，SABRE在各种基准测试中超越了已知最佳算法，具有指数加速和可比较或更好的结果。

Based on a given quantum circuit and the coupling information of the device, we need 1) an initial logical-to-physical qubit mapping and 2) the intermediate mapping transition which is able to remap the two logical qubits in a two-qubit gate to two coupled physical qubits.

根据给定的量子电路和设备的耦合信息，我们需要：

初始的逻辑到物理量子比特映射

中间映射转换，能够将双量子比特门中的两个逻辑量子比特，重新映射到两个耦合的物理量子比特中。

Previous qubit mapping solutions lacked the ability to control the quality of the generated circuit among multiple optimization objectives to adapt to NISQ devices with diverse characteristics.

之前的量子比特映射解决方案缺乏控制生成电路质量的能力，无法适应具有不同特征的NISQ设备的多个优化目标。

We present a novel reserve traversal search technique in SABRE to naturally generate a high-quality initial mapping through traversing a reverse circuit, in which more consideration is given to those gates at the beginning of the circuit without completely ignoring the rest of the circuit. Moreover, we introduce a decay effect, which will slightly increase our heuristic cost function values when evaluating overlapped SWAPs, to let SABRE tend to select non-overlapped SWAPs.

我们在SABRE中提出了一种新颖的保留遍历搜索技术，通过遍历反向电路自然地生成高质量的初始映射，在此过程中更多地考虑电路开头的门，而不完全忽略电路的其余部分。此外，我们引入了一个衰减效应，当评估重叠的SWAP时，略微增加启发式成本函数值，以便SABRE倾向于选择非重叠的SWAP。

qubit- mapping问题是NP-Complete 问题

NP-Complete 问题和 NP-Hard 问题都是计算机科学中的重要概念，但是它们之间有一些区别。

NP-Complete 问题是 NP（非确定性多项式时间）问题的一个特殊子集，通常指的是那些难以在多项式时间内求解的计算问题。NP-Complete 问题是 NP 问题中最困难的一类问题，因此所有的 NP 问题都可以被归约到 NP-Complete 问题。

NP-Hard 问题是指那些至少和 NP-Complete 问题一样难以求解的计算问题。与 NP-Complete 问题不同的是，NP-Hard 问题可以是任何计算问题，而不仅仅是 NP 问题，因此它们可能不具有非确定性多项式时间算法，也可能不是决定性问题。

因此，NP-Complete 问题和 NP-Hard 问题之间的区别在于，NP-Complete 问题是 NP 问题的一个子集，而 NP-Hard 问题则是一个更广泛的概念，包括了所有和 NP-Complete 问题同样难以求解的计算问题。

需要注意的是，在计算机科学中，NP-Complete 问题和 NP-Hard 问题通常被认为是等价的，因为它们都代表了那些难以在多项式时间内求解的计算问题。因此，在实际应用中，这两个概念通常可以互换使用。

总之，NP-Complete 问题和 NP-Hard 问题都是计算机科学中重要的概念，它们之间的区别在于 NP-Complete 问题是 NP 问题的一个子集，而 NP-Hard 问题则是一个更广泛的概念，包括了所有和 NP-Complete 问题同样难以求解的计算问题。

从直觉上说，P<=NP<=NP-Complete<=NP-Hard，问题的难度递增。但目前只能证明 P 属于 NP，究竟 P=NP 还是 P 真包含于 NP 还未知。

We perform a comprehensive analysis on the shortcomings of previous solutions, and then summarize the objectives and metrics that should be considered when designing a heuristic solution for the qubit mapping problem.

•We propose a SWAP-based search scheme which can produce comparable results with exponential speedup in the search complexity compared with previous exhaustive mappingsearch algorithms. This fast search scheme ensures the scalability of SABRE to accommodate larger-size quantum devices in the NISQ era.

We present a reverse traversal technique to enable global optimization in the initial mapping solution by leveraging the intrinsic reversibility in qubit mapping problem. Our high-quality initial mapping can significantly reduce the overhead in the generated circuit.

By introducing a decay effect in the heuristic cost function, we are able to generate different hardwarecompliant circuits by trading the number of gates in the circuit against the circuit depth. This makes SABRE applicable for NISQ devices with different characteristics and optimization objectives.

我们对先前解决方案的缺点进行全面分析，然后总结了在设计启发式解决方案时应考虑的目标和指标，以解决量子比特映射问题。

我们提出了一种基于SWAP的搜索方案，该方案在搜索复杂度方面具有指数加速，可以产生与先前的详尽映射搜索算法相当的结果。这种快速搜索方案确保了SABRE的可扩展性，以适应NISQ时代更大的量子设备。

我们提出了一种反向遍历技术，通过利用量子比特映射问题中的内在可逆性，实现初始映射解决方案的全局优化。我们的高质量初始映射可以显著减少生成电路中的开销。

通过在启发式成本函数中引入衰减效应，我们能够通过在电路深度和电路中门的数量之间进行权衡，生成不同的硬件兼容电路。这使得SABRE适用于具有不同特征和优化目标的NISQ设备。

Objectives and Metrics

Since qubit mapping problem is NP-Complete , it is hard to directly find the optimal solution. We will design a heuristic algorithm trying to find a solution to this problem with the following objectives:

1. Flexibility. NISQ devices may have an irregular coupling design which can evolve over time. Our algorithm should be able to deal with arbitrary symmetric coupling cases for various benchmarks. 2. Fidelity. This objective comes from the imperfect quantum operations. The error rate of a CNOT gate is high and one SWAP even requires 3 CNOT gates. We target to improve the overall fidelity by reducing the number of quantum gates, especially two-qubit gates, of the final hardware compliant circuit.

3. Parallelism. This objective comes from the limited qubit lifetime. Inserting SWAPs may increase the depth of the circuit. If our algorithm can insert SWAPs that can be executed in parallel and control the final circuit depth, a deeper circuit will be allowed to execute on hardware.

4. Scalability. Our algorithm targets to be scalable with an acceptable execution time for NISQ devices which contain dozens to hundreds of qubits. As the number of qubits continues to increase beyond the scope of NISQ in the future, QEC might be used, and the problem addressed in the paper turns into another one, as discussed in other papers [16, 20, 28, 35].

灵活性。NISQ设备可能具有不规则的耦合设计，可以随时间演化。我们的算法应该能够处理各种基准测试的任意对称耦合情况。

保真度。这个目标来自不完美的量子操作。CNOT门的误差率很高，一个SWAP甚至需要3个CNOT门。我们的目标是通过减少最终硬件兼容电路中的量子门数量，特别是双量子门，来提高整体保真度。

并行性。这个目标来自有限的量子比特寿命。插入SWAP可能会增加电路的深度。如果我们的算法可以插入可以并行执行并控制最终电路深度的SWAP，则可以在硬件上执行更深的电路。

可扩展性。我们的算法旨在针对包含数十到数百个量子比特的NISQ设备具有可接受的执行时间。随着量子比特数量在未来继续增加超出NISQ的范围，可能会使用QEC，并且本文讨论的问题会变成另一个问题，如其他论文所讨论的[16, 20, 28, 35]。

Metrics

Our algorithm is evaluated by a set of benchmarks of various sizes on IBM’s latest public superconducting chip model [18] to test the flexibility and scalability. The metrics are the total number of gates and the circuit depth in the generated hardware-compliant circuit.

我们的算法通过一组基准测试来评估其在IBM最新的公共超导芯片模型[18]上的灵活性和可扩展性。指标是生成的硬件符合电路中门的总数和电路深度。

注意逻辑qubit和物理qubit的区别

Distance matrix computing.

Given the coupling graph G(V , E) of a quantum device, we will first compute the AllPairs Shortest Path (APSP) by Floyd-Warshall algorithm [12] to obtain the distance matrix D[ ][ ]. Each edge in the coupling graph has distance of 1 because one SWAP is required to exchange the two qubits of an edge. So that D[i][j] represents the minimum number of SWAPs required to move a logical qubit from physical qubit Qi to Qj . The complexity of this step is O(), which is acceptable for NISQ devices with hundreds of qubits.

给定一个量子设备的耦合图G(V, E)，我们首先使用Floyd-Warshall算法[12]计算全点对最短路径（APSP）以获得距离矩阵D[][]。耦合图中的每条边的距离都是1，因为需要一个SWAP来交换边上的两个量子比特。因此，D[i][j]表示将逻辑量子比特从物理量子比特Qi移动到Qj所需的最小SWAP数。这一步的复杂度为O()，对于拥有数百个量子比特的NISQ设备是可接受的。

A front layer (denoted as F ) in this paper is defined as the set of all the two-qubit gates which have no unexecuted predecessors in the DAG.

本文中的前层（表示为 F ）被定义为在有向无环图中没有未执行的前驱的所有双量子比特门的集合。

：就是说F的所有前驱节点都被执行了。

The mapping has to encode the positions of the electrons and do so with a guarantee of anti-symmetry of the fermionic wave functions, to respect fermion algebra. There are thus two pieces of information: the electron occupancy on each site and the overall parity. The Jordan-Wigner mapping fully localizes the occupation number (each qubit represents the occupation on a site), which then causes the parity to be completely non-localized: whenever an electron "hops" sites, the parity update requires an update to all qubits, propagated through a chain of controlled gates. The Bravyi-Kitaev transform encodes both occupancy and parity information non-locally, leading to a more balanced result.

映射必须对电子的位置进行编码，并保证费米子波函数的反对称性，以遵守费米代数。因此有两个信息：每个站点上的电子占据和整体奇偶性。Jordan-Wigner映射完全定位了占据数（每个量子比特代表一个站点上的占据），这导致奇偶性完全无法定位：每当电子“跳跃”站点时，奇偶性更新需要更新所有量子比特，通过一系列控制门进行传播。Bravyi-Kitaev变换非局部地编码了占用和奇偶性信息，从而导致更平衡的结果。

dag可视化安装包

model指pass data, can be public for passes

没有用model的passes

只是读取model中的数据的passes

set.model←compiler.model

读取model，并在运行pass之后会改变model的passes

set.model←compiler.model
get.model → compiler.model

passes workflow: list[pass]

compiler = Compiler(workflow, initial_model)

compiled_circuit = compiler.compile(input_circuit)

dag.add_instruction_node_end

将dag.node一个个加到尾部。实现类似circuit一个个加gate的功能。

dag.nodes_labels_resorted

update nodes' labels. This is only for convenience and does not affect the DAGCircuit function.

pass workflow in current QASMTrans:

QIR - Azure Quantum | Microsoft Learn

Quantum Transpilation

Quantum transpiler plays a crucial role in quantum computing by translating high-level quantum algorithms into a series of low-level hardware-specific instructions that quantum hardware can execute. Qiskit is a widely used quantum software development package developed by IBM. The Qiskit transpiler provides a flexible and extensible framework, offering a wide array of compilation passes that can be combined in different ways to create customized and hardware-tailored transpilation pipelines.

In addition to Qiskit, there are various transpilers aiming at different purposes: 1) application-oriented transpilation: These transpilers focus on specific domain applications. For example, Paulihedral [25] focuses on VQE, Twoqan [19] concentrates on QAOA circuits. 2) hardware-oriented transpilation: These transpilers focus on supporting the new features of a particular quantum platform. For instance, CaQR emphasizes the support for dynamic circuit generation and the opportunities from qubit reset [15]. Pulse transpilers delve into the nuances of low-level pulse scheduling, optimizing quantum operations at the physical layer [7], [12], [40]. AutoComm [51] and QuComm [50] present transpiler optimization techniques for distributive quantum devices. 3) Optimization for mapping/routing: there are a bunch of works aiming at improving general transpilation performance, like Sabre [24] and Zulhner [53] attempt to minimize the number of additional gates in mapping/routing. TOQM [52] aims at shrinking the depth of the transpiled circuit. Shi et al. [40] presents the complete transpilation and optimization flow, including gate aggregation and cancellation. QASMTrans falls into the third category, aiming at improving the transpilation performance of large and deep QASM circuits.

量子转译器在量子计算中扮演着至关重要的角色，将高级量子算法转化为一系列低级别的硬件特定指令，以便于量子硬件执行。Qiskit 是由 IBM 开发的广泛使用的量子软件开发工具包。Qiskit 转译器提供了一个灵活且可扩展的框架，提供多种编译传递，可按不同方式组合，创建定制的硬件定制转译流水线。

除了 Qiskit 外，还有各种不同目的的转译器：1) 应用定向转译：这些转译器专注于特定领域的应用。例如，Paulihedral[25]专注于 VQE，Twoqan[19]专注于 QAOA 电路。2) 硬件定向转译：这些转译器专注于支持特定量子平台的新功能。例如，CaQR 强调支持动态电路生成和来自量子比特重置的机会[15]。Pulse 转译器深入研究低级脉冲调度的细微差别，优化物理层面的量子操作[7]，[12]，[40]。AutoComm[51]和 QuComm[50]为分布式量子设备提供转译器优化技术。3) 映射/路由优化：有一堆工作旨在提高一般转译性能，例如 Sabre[24]和 Zulhner[53]试图将映射/路由中的附加门数量最小化。TOQM[52]旨在缩小转译电路的深度。Shi 等人[40]提供了完整的转译和优化流程，包括门聚合和取消。QASMTrans 属于第三类，旨在改善大型和深度 QASM 电路的转译性能。

[25]:G. Li, A. Wu, Y. Shi, A. Javadi-Abhari, Y. Ding, and Y. Xie, “Paulihedral: a generalized block-wise compiler optimization framework for quantum simulation kernels,” in Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2022, pp. 554–569.

[19]: L. Lao and D. E. Browne, “2qan: A quantum compiler for 2-local qubit hamiltonian simulation algorithms,” 2021. [Online]. Available: https://arxiv.org/abs/2108.02099

[15]: F. Hua, Y. Jin, Y. Chen, S. Vittal, K. Krsulich, L. S. Bishop, J. Lapeyre, A. Javadi-Abhari, and E. Z. Zhang, “Caqr: A compiler-assisted approach for qubit reuse through dynamic circuit,” in Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, 2023, pp. 59–71.

[7]: Y. Chen, Y. Jin, F. Hua, A. Hayes, A. Li, Y. Shi, and E. Z. Zhang, “A pulse generation framework with augmented program-aware basis gates and criticality analysis,” in 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 2023, pp. 773–786.

[12]: P. Gokhale, A. Javadi-Abhari, N. Earnest, Y. Shi, and F. T. Chong, “Optimized quantum compilation for near-term algorithms with openpulse,” in 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 2020, pp. 186–200.

[40]:Y. Shi, N. Leung, P. Gokhale, Z. Rossi, D. I. Schuster, H. Hoffmann, and F. T. Chong, “Optimized compilation of aggregated instructions for realistic quantum computers,” in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS ’19. New York, NY, USA: ACM, 2019, pp. 1031–1044. [Online]. Available: http: //doi.acm.org/10.1145/3297858.3304018

[51]: A. Wu, H. Zhang, G. Li, A. Shabani, Y. Xie, and Y. Ding, “Autocomm: A framework for enabling efficient communication in distributed quantum programs,” in 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 2022, pp. 1027–1041.

[50]: A. Wu, Y. Ding, and A. Li, “Collcomm: Enabling efficient collective quantum communication based on epr buffering,” arXiv preprint arXiv:2208.06724, 2022.

[24]: G. Li, Y. Ding, and Y. Xie, “Tackling the qubit mapping problem for nisq-era quantum devices,” in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 2019, pp. 1001–1014.

[53]: A. Zulehner, A. Paler, and R. Wille, “Efficient mapping of quantum circuits to the ibm qx architectures,” in 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 2018, pp. 1135–1138.

[52]: C. Zhang, A. Hayes, L. Qiu, Y. Jin, Y. Chen, and E. Z. Zhang, “Time-optimal qubit mapping,” in Proceedings of the Twenty-Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS ’21. Virtual: ACM, 2021.

QPU 分配

检测置换中的循环找到最少的swap数目