Quantum classifiers use enhanced feature spaces
Source: nature.com
TL;DR
- Quantum Classifiers Proposed: Researchers introduce quantum variational classifier and quantum kernel estimator using quantum feature spaces for supervised classification.[[1]](https://www.nature.com/articles/s41586-019-0980-2)[[2]](https://arxiv.org/abs/1804.11326)
- 2-Qubit Experiments: Implemented on a superconducting processor with 2 qubits, achieving up to 100% accuracy on artificial 2D datasets.[[3]](https://arxiv.org/pdf/1804.11326)
- NISQ Applications: Methods work on noisy hardware with error mitigation, enabling exploration of quantum machine learning tools.[[1]](https://www.nature.com/articles/s41586-019-0980-2)
The story at a glance
IBM researchers Vojtěch Havlíček and colleagues propose two quantum algorithms for supervised classification that map data into high-dimensional quantum feature spaces. They implement these—a variational quantum classifier and a quantum kernel estimator—on a 5-qubit superconducting transmon processor using 2 qubits. The work demonstrates feasibility on noisy intermediate-scale quantum (NISQ) devices amid rising interest in quantum machine learning. It builds on kernel methods like support vector machines (SVMs), which struggle with large feature spaces classically.
Key points
- Proposes quantum variational classifier: data encoded via feature map circuit into quantum state, variational circuit optimizes separating hyperplane, trained with classical optimizer like SPSA.
- Introduces quantum kernel estimator: quantum computer computes kernel matrix entries as state overlaps, classical SVM optimizes on that matrix.
- Experiments use artificial 2D datasets (20 training points per class, domain (0, 2π]×(0, 2π]), perfectly separable in quantum feature space with gap Δ=0.3.
- Feature map: U_Φ(x̃) applies Hadamards and diagonal phase gates with φ_1 = x_1, φ_2 = x_2 and two-qubit phase φ_{1,2} = (π − x_1)(π − x_2), creating a 16-dimensional effective feature space (the 4×4 density matrices) from 2 qubits.
- Variational circuits up to depth l=4 (8 CNOTs), with error mitigation via zero-noise extrapolation on slowed gates.
- Kernel estimation: 100% success on Sets I/II, 94.75% average on Set III over 10 test sets.
- Processor details: qubits Q0/Q1, T1 ~46-55 μs, T2 ~16-43 μs, readout fidelity 95-96.56%, single-qubit gates ~83 ns.
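The feature-map circuit described in the bullets above can be sketched as a tiny statevector simulation. This is an illustrative reconstruction, not the authors' code; the two-qubit phase uses the (π − x1)(π − x2) form from the arXiv version of the paper, and the repeated-layer depth is an assumption taken from the paper's U_Φ = U_φ H^⊗2 U_φ H^⊗2 structure.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)

def expm_diag(d):
    # d is a real diagonal matrix; return exp(i*d) applied on the diagonal
    return np.diag(np.exp(1j * np.diag(d)))

def u_phi(x):
    """One layer of the feature-map circuit: Hadamards followed by
    diagonal phases exp(i * sum_S phi_S(x) * Z_S) on 2 qubits."""
    x1, x2 = x
    phases = (x1 * np.kron(Z, I2)                      # phi_1 = x1
              + x2 * np.kron(I2, Z)                    # phi_2 = x2
              + (np.pi - x1) * (np.pi - x2) * np.kron(Z, Z))  # phi_{1,2}
    return expm_diag(phases) @ np.kron(H, H)

def feature_state(x, depth=2):
    """|Phi(x)> obtained by repeating the layer `depth` times on |00>."""
    psi = np.zeros(4, dtype=complex)
    psi[0] = 1.0
    for _ in range(depth):
        psi = u_phi(x) @ psi
    return psi

psi = feature_state((0.5, 1.2))
print(np.round(np.abs(psi) ** 2, 4))  # measurement probabilities, summing to 1
```

The diagonal-phase layer commutes with Z-basis measurements on its own; the surrounding Hadamards are what make the encoded phases observable and the map non-trivial.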
Details and context
Classical kernel methods like SVMs map data to high-dimensional spaces but face limits when kernels become expensive to compute in large dimensions. Quantum methods exploit the exponential size of quantum state space (a 2^n-dimensional Hilbert space, and a 4^n-dimensional space of density matrices, for n qubits) via feature maps whose kernels may be hard to estimate classically; for some circuit families this is conjectured to be #P-hard.
Both algorithms encode classical data x̃ into |Φ(x̃)⟩ using a unitary U_Φ(x̃), leveraging entanglement for non-linear separability. The variational classifier minimizes the empirical risk R_emp(θ), the average misclassification probability over the training set, via an error-function cost on measured expectation values, analogous to an SVM soft margin. The kernel method estimates K(x̃_i, x̃_j) = |⟨Φ(x̃_i)|Φ(x̃_j)⟩|^2 from measurement probabilities.
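The kernel-estimation step can be illustrated with a toy stand-in: any map x ↦ |Φ(x)⟩ yields a Gram matrix K_ij = |⟨Φ(x_i)|Φ(x_j)⟩|² that a classical SVM can consume as a precomputed kernel. The encoding below is a simplified single-layer phase map, not the paper's full circuit, and the sample points are made up.

```python
import numpy as np

def phi(x):
    """Toy 2-qubit feature state: Hadamards then diagonal phases.
    A simplified stand-in for the paper's U_Phi(x), single layer only."""
    x1, x2 = x
    plus = np.full(4, 0.5, dtype=complex)  # H|0> tensor H|0>
    # diagonal of x1*(Z x I) + x2*(I x Z) on basis states 00,01,10,11
    diag = np.exp(1j * np.array([x1 + x2, x1 - x2, -x1 + x2, -x1 - x2]))
    return diag * plus

def kernel_matrix(xs):
    """K_ij = |<Phi(x_i)|Phi(x_j)>|^2; on hardware this is estimated from
    the frequency of the all-zero outcome after applying
    U_Phi(x_j)^dagger U_Phi(x_i) to |00>."""
    states = [phi(x) for x in xs]
    n = len(xs)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = abs(np.vdot(states[i], states[j])) ** 2
    return K

xs = [(0.3, 1.0), (2.0, 0.7), (1.1, 1.1)]
K = kernel_matrix(xs)
print(np.round(K, 3))  # symmetric, ones on the diagonal
```

Because K is built from state overlaps, it is automatically symmetric with unit diagonal, exactly the properties a classical SVM solver expects from a precomputed kernel.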
Tests used synthetic data designed for quantum separability, with random unitaries V and parity checks. Deeper circuits improved performance but needed mitigation for noise; shallow ones converged faster. No real-world datasets tested, focusing on proof-of-principle for NISQ hardware.
Key quotes
"Both methods represent the feature space of a classification problem by a quantum state, taking advantage of the large dimensionality of quantum Hilbert space to obtain an enhanced solution."[[2]](https://arxiv.org/abs/1804.11326)
Why it matters
Quantum-enhanced kernels could handle feature spaces too vast for classical simulation, potentially accelerating classification in high-dimensional tasks like chemistry or finance. For researchers, it provides practical NISQ algorithms with demonstrated near-perfect accuracy on hardware despite noise. Watch for extensions to real datasets and proofs of quantum advantage over classical kernels.
What changed
These feature-map-based quantum classifiers had not previously been demonstrated on superconducting hardware; the variational and kernel methods now achieve 94-100% accuracy on a 2-qubit processor; published March 13, 2019.[[1]](https://www.nature.com/articles/s41586-019-0980-2)[[3]](https://arxiv.org/pdf/1804.11326)
FAQ
Q: What feature map do the algorithms use?
A: The feature map U_Φ(x̃) applies Hadamards, diagonal phase unitaries exp(i ∑_S φ_S(x̃) ∏_{i∈S} Z_i) with φ_1 = x_1, φ_2 = x_2 and φ_{1,2} = (π − x_1)(π − x_2), and Hadamards again to encode 2D data into a 2-qubit state. The interaction phase supplies the nonlinearity needed for separability. Experiments confirm it enables perfect classification on the designed datasets.[[3]](https://arxiv.org/pdf/1804.11326)
Q: How is the variational classifier trained?
A: Parameters θ of the variational circuit W(θ) are optimized classically via SPSA to minimize the empirical risk R_emp(θ), approximated with a sigmoid cost on Z-basis measurement outcomes. Circuits use depth l up to 4 layers of single-qubit rotations interleaved with entangling CNOTs. Error mitigation via zero-noise extrapolation improves convergence.[[3]](https://arxiv.org/pdf/1804.11326)
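SPSA's appeal for noisy hardware is that it estimates the full gradient from only two loss evaluations per step, regardless of the number of parameters. A minimal sketch on a toy quadratic standing in for the noisy empirical risk; hyperparameters and the decay schedule below are illustrative, not the paper's settings.

```python
import numpy as np

def spsa_minimize(loss, theta0, iters=300, a=0.2, c=0.1, seed=0):
    """Minimal SPSA: perturb all parameters simultaneously along a random
    +/-1 direction and estimate the gradient from two loss evaluations.
    Decay exponents 0.602 and 0.101 are the standard SPSA defaults."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for k in range(1, iters + 1):
        ak = a / k ** 0.602   # step-size schedule
        ck = c / k ** 0.101   # perturbation-size schedule
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        # two-sided finite difference along delta; multiplying elementwise
        # by delta equals dividing by it, since every entry is +/-1
        g = (loss(theta + ck * delta) - loss(theta - ck * delta)) / (2 * ck) * delta
        theta -= ak * g
    return theta

# toy objective standing in for the noisy empirical risk R_emp(theta)
loss = lambda t: np.sum((t - 1.0) ** 2)
theta = spsa_minimize(loss, np.zeros(4))
print(np.round(theta, 2))  # should approach the minimizer at all-ones
```

The simultaneous perturbation is what keeps the cost flat in the parameter count: a coordinate-wise finite-difference gradient would need 2p circuit evaluations per step for p parameters.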
Q: What accuracy did the kernel estimator achieve?
A: 100% success classifying test sets for data Sets I and II, averaging 94.75% for Set III across 10 runs. Kernels computed as |⟨Φ(x̃_i)|Φ(x̃_j)⟩|^2 from all-zero outcomes, fed to classical SVM solver.[[3]](https://arxiv.org/pdf/1804.11326)
Q: What hardware limitations were addressed?
A: Used 2 of 5 transmon qubits on superconducting chip with ~50 μs coherence, sub-μs gates; readout ~96% fidelity. Noise mitigated by slowing gates 1.5x for extrapolation. Deeper circuits (l=4, 8 CNOTs) still reached ~100% test accuracy.[[3]](https://arxiv.org/pdf/1804.11326)
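The mitigation step in the answer above amounts to Richardson extrapolation: run the circuit at stretched pulse durations (noise scale c = 1, 1.5, ...) and extrapolate the measured expectation value back to c = 0. A sketch with a made-up linear noise model standing in for hardware data:

```python
import numpy as np

def zero_noise_extrapolate(cs, values):
    """First-order Richardson extrapolation: fit a line to expectation
    values measured at noise stretch factors cs, evaluate the fit at c = 0."""
    slope, intercept = np.polyfit(cs, values, deg=1)
    return intercept

true_val = 0.8                       # hypothetical noiseless expectation value
cs = np.array([1.0, 1.5])            # paper stretches pulses by 1.5x
measured = true_val - 0.12 * cs      # toy linear noise model, made up
print(zero_noise_extrapolate(cs, measured))  # -> approximately 0.8
```

With only two noise scales the linear fit is exact through both points, so the scheme recovers the true value whenever the noise really is linear in the stretch factor; deviations from linearity are what limit the method in practice.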