Lightning-AMDGPU Advances Quantum Simulation on AMD GPUs with PennyLane

Lightning-AMDGPU enables fast, scalable quantum simulation on AMD Instinct GPUs, bridging PennyLane quantum software with exascale HPC systems.

AMD hardware is a key component of high-performance computing (HPC), and PennyLane’s quantum software ecosystem is integrating closely with this infrastructure. AMD accelerators power some of the most powerful computers in the world, including today’s Top500 leaders Frontier and El Capitan and upcoming systems such as Alice Recoque, Discovery, and Lux, so the environment for accelerated scientific discovery is evolving quickly. PennyLane aims to support quantum research on any of this hardware, whether through large simulations on flagship supercomputers, access via the AMD Developer Cloud, or prototypes on a personal workstation.

Seamless Access via Lightning-AMDGPU

PennyLane has developed Lightning-AMDGPU to simplify quantum simulation on AMD GPUs. The release consists of precompiled binaries of the existing Lightning-Kokkos simulator, built for AMD GPUs. With a single installation command on a Developer Cloud instance, developers can put AMD Instinct GPUs to work; the release is tuned for both ease of use and performance.

Under the hood, Lightning-AMDGPU uses the highly portable Lightning-Kokkos simulator backend, which is built on the Kokkos portability framework. Kokkos lets the same C++ code run on both CPUs and GPUs. On AMD GPUs, this code is lowered to HIP, AMD’s native programming model, which is what allows it to reach native performance. Precompiled Lightning-AMDGPU wheels are currently available for ROCm 7.0 and MI300-series GPUs.

High-end AMD GPUs such as the MI300X are accessible through the AMD Developer Cloud, with no need for direct access to a supercomputer. The service works much like a standard cloud provider: developers create a GPU Droplet and choose the MI300X plan with the AMD ROCm 7.1 Software image preinstalled.

Validated on Frontier: Massive Scalability with MPI

Accessibility matters, but pushing the boundaries of research also requires scaling quantum workloads as far as possible. In partnership with Oak Ridge National Laboratory (ORNL), PennyLane has demonstrated its scalability by running PennyLane and the Lightning simulator on Frontier, the world’s first exascale supercomputer.

Exascale computing is often assumed to require intricate, custom code. In practice, Lightning was easy to install and use on Frontier, and detailed instructions are available to help users make the most of the system’s high-bandwidth interconnect.

PennyLane Lightning is “HPC-friendly” by design, and that is the basis of this high-performance capability. Lightning-Kokkos has supported the Message Passing Interface (MPI) since version 0.42. MPI is the “secret sauce” that makes large-scale runs possible: it lets the state vector of a single circuit be distributed across multiple nodes and GPUs. This allows researchers to simulate more qubits than fit in a single GPU’s memory and to run large simulations faster.
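
To see why distributing the state vector matters, a rough sizing exercise helps. The sketch below is plain Python arithmetic rather than Lightning code, and it rests on assumptions that do not come from the article: double-precision complex amplitudes (16 bytes each), 192 GB of memory per MI300X, an even split across GPUs, and no allowance for communication buffers or other overhead.

```python
# Back-of-the-envelope state-vector sizing (illustrative assumptions only):
# complex128 amplitudes, 192 GB per GPU, even split, zero overhead.
BYTES_PER_AMPLITUDE = 16
GPU_MEMORY_BYTES = 192 * 10**9


def state_vector_bytes(num_qubits: int) -> int:
    """Memory needed for a full state vector of 2**num_qubits amplitudes."""
    return (2 ** num_qubits) * BYTES_PER_AMPLITUDE


def max_qubits(num_gpus: int) -> int:
    """Largest qubit count whose state vector fits across num_gpus GPUs."""
    n = 0
    while state_vector_bytes(n + 1) <= num_gpus * GPU_MEMORY_BYTES:
        n += 1
    return n


for gpus in (1, 2, 1024):
    n = max_qubits(gpus)
    per_gpu_gb = state_vector_bytes(n) / gpus / 10**9
    print(f"{gpus:>5} GPU(s): up to {n} qubits, about {per_gpu_gb:.0f} GB per GPU")
```

Under these assumptions a single GPU tops out at 33 qubits, while 1024 GPUs reach 43. Because the state vector doubles with every added qubit, each doubling of the GPU count buys roughly one more qubit; real limits will be lower once buffers and overhead are counted.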

For example, strong-scaling plots for the Quantum Fourier Transform (QFT), a key subroutine in many quantum algorithms, show that Lightning-Kokkos can simulate circuits on more than 1000 AMD GPUs, with additional hardware translating directly into shorter run times.

Catalyst: Supercharging Hybrid Compilation

Beyond scaling the simulator, optimizing complex hybrid quantum-classical workflows is also crucial. PennyLane’s quantum just-in-time (QJIT) compiler, Catalyst, addresses this need. By compiling hybrid programs efficiently, Catalyst unlocks significant speedups and enables advanced features such as AutoGraph and optimized dynamic quantum circuits.

The Lightning-AMDGPU (and Lightning-Kokkos) backend is designed to integrate with Catalyst. This lets users combine the raw throughput of Lightning running on an AMD ROCm device with Catalyst’s compilation capabilities. With this combination, the entire workflow, not just the quantum circuit, can be just-in-time compiled and automatically offloaded to AMD GPUs.

Catalyst also speeds up circuit optimization itself. When PennyLane applies passes such as merging rotation gates and cancelling adjacent adjoint gates to a sample circuit, the optimization time grows with the circuit’s gate depth. Catalyst’s compilation time, by contrast, stays constant regardless of gate depth, which shows the benefit of preserving control-flow structure during quantum optimization.
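
The intuition can be shown with a toy model. The sketch below is not Catalyst’s implementation and does not use PennyLane. It is a deliberately simplified peephole optimizer, with hypothetical names throughout, that only looks at directly adjacent gates and counts how many gates it visits as a crude stand-in for compile time. Optimizing an unrolled circuit touches every repeated gate, whereas optimizing the loop body once does the same work no matter how many times the loop runs.

```python
# Toy model only: adjacent-gate peephole passes on (name, wires, angle) tuples.
SELF_INVERSE = {"H", "X", "CNOT"}
ROTATIONS = {"RX", "RY", "RZ"}

# One layer of a sample circuit; it ends in a CNOT so that neighbouring
# layers do not interact in this simplified model.
LAYER = [
    ("RX", (0,), 0.1),
    ("RX", (0,), 0.2),
    ("H", (1,), None),
    ("H", (1,), None),
    ("CNOT", (0, 1), None),
]


def peephole(gates):
    """Merge adjacent same-axis rotations and cancel adjacent self-inverse pairs.

    Returns (optimized_gates, gates_visited).
    """
    out = []
    visited = 0
    for name, wires, angle in gates:
        visited += 1
        if out and out[-1][0] == name and out[-1][1] == wires:
            if name in SELF_INVERSE:
                out.pop()  # G followed by G is the identity
                continue
            if name in ROTATIONS:
                out[-1] = (name, wires, out[-1][2] + angle)  # add the angles
                continue
        out.append((name, wires, angle))
    return out, visited


def optimize_unrolled(layer, repeats):
    """Flatten the loop first, then optimize: work grows with depth."""
    return peephole(layer * repeats)


def optimize_structured(layer, repeats):
    """Keep the loop and optimize its body once: work is independent of depth."""
    body, visited = peephole(layer)
    return ("for", repeats, body), visited


for depth in (10, 100, 1000):
    _, flat_work = optimize_unrolled(LAYER, depth)
    _, loop_work = optimize_structured(LAYER, depth)
    print(f"depth {depth:>4}: unrolled visits {flat_work} gates, structured visits {loop_work}")
```

A production compiler also has to reason about gates that commute and about interactions across loop iterations, which this toy ignores.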

Getting Started with PennyLane and AMD

There are several easy ways for developers to start using PennyLane on AMD hardware right now:

Pip Install on AMD Developer Cloud: The easiest way to run on MI300X GPUs is to use pip to install both PennyLane and pennylane-lightning-amdgpu.

Docker Images: Pre-built images are provided. Before running the designated Docker command, users must follow AMD’s quick start instructions for the Container Toolkit.

Building from Source: For users who need maximum performance, custom configurations (like those required on Frontier), or MPI scaling, building Lightning-AMDGPU or Lightning-Kokkos from source with tools such as CMake, Ninja, and hipcc is recommended.

These methods let users run PennyLane programs such as the Quantum Fourier Transform (QFT), a Bell state circuit, and a Variational Quantum Circuit (VQC) that uses JAX for efficient gradient computation. VQC and QFT benchmarks show notable speedups when executed on the GPU (lightning.amdgpu) compared with the CPU (lightning.qubit).
