“Talking quantum circuits”

Interpretable and scalable quantum natural language processing

September 18, 2024
The central question preoccupying our team has been:

“How can quantum structures and quantum computers contribute to the effectiveness of AI?”

In previous work we have made notable advances in answering this question. This article is based on our most recent work in two new papers [arXiv:2406.17583, arXiv:2408.06061], and most notably on the experiment reported in [arXiv:2409.08777].

This article is one of a series that we will be publishing alongside further advances – advances that are accelerated by access to the most powerful quantum computers available.

Large Language Models (LLMs) such as ChatGPT are having an impact on society across many walks of life. However, as users have become more familiar with this new technology, they have also become increasingly aware of deep-seated and systemic problems that come with AI systems built around LLMs.

The primary problem with LLMs is that nobody knows how they work. As inscrutable “black boxes” they aren’t “interpretable”, meaning we can’t reliably or efficiently control or predict their behavior. This is unacceptable in many situations. In addition, modern LLMs are incredibly expensive to build and run, consuming serious – and potentially unsustainable – amounts of power to train and use. This is why more and more organizations, governments, and regulators are insisting on solutions.

But how can we find these solutions, when we don’t fully understand what we are dealing with now?1

At Quantinuum, we have been working on natural language processing (NLP) using quantum computers for some time now. We are excited to have recently carried out experiments [arXiv:2409.08777] which demonstrate not only that it is possible to train a model for a quantum computer in a scalable manner, but also that this can be done in a way that is interpretable for us. Moreover, we have promising theoretical indications of the usefulness of quantum computers for interpretable NLP [arXiv:2408.06061].

To better understand why this could be the case, one needs to understand the ways in which meanings compose together throughout a story or narrative. Our work towards capturing this compositional structure in a new model of language, which we call DisCoCirc, is reported on extensively in this previous blog post from 2023.

In the new work referred to in this article, we embrace “compositional interpretability”, as proposed in [arXiv:2406.17583], as a solution to the problems that plague current AI. In brief, compositional interpretability boils down to being able to assign a human-friendly meaning, such as natural language, to the components of a model, and then being able to understand how they fit together2.

A problem currently inherent to quantum machine learning is training at scale. We avoid this by making use of “compositional generalization”: we train small, on classical computers, and then at test time evaluate much larger examples on a quantum computer. There now exist quantum computers which are impossible to simulate classically. To train models for such computers, compositional generalization currently seems to provide the only credible path.

1. Text as circuits

DisCoCirc is a circuit-based model for natural language that turns arbitrary text into “text circuits” [arXiv:1904.03478, arXiv:2301.10595, arXiv:2311.17892]. Converting text into text circuits takes lines of text, which live in one dimension, into circuits, which live in two dimensions: one for the entities of the text, the other for events in time.

To see how that works, consider the following story. In the beginning there are Alex and Beau. Alex meets Beau. Later, Chris shows up, and Beau marries Chris. Alex then kicks Beau.

The content of this story can be represented as the following circuit:

Figure 1. A text circuit for a simple story, involving three actors Alex, Beau and Chris, who have a number of interactions with one another, making up a story – the circuit is to be read from top to bottom.
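To make the two-dimensional structure concrete, here is a minimal sketch of how such a story could be encoded: one wire per actor, and each event a box applied to the wires of the actors it involves. This is our own illustrative data structure, not the DisCoCirc implementation itself.

```python
# Illustrative encoding of a text circuit: one wire per actor,
# events applied in narrative order (a hypothetical structure,
# not the actual DisCoCirc library).

story_actors = ["Alex", "Beau", "Chris"]

# Each event is a "box" acting on the wires of the actors involved.
story_events = [
    ("meets",   ["Alex", "Beau"]),
    ("marries", ["Beau", "Chris"]),
    ("kicks",   ["Alex", "Beau"]),
]

# Reading the circuit top to bottom: each box updates the joint state
# of exactly the actors on its wires, leaving all other wires untouched.
for verb, (a, b) in story_events:
    print(f"{a} --[{verb}]--> {b}")
```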
2. From text circuits to quantum circuits

Such a text circuit represents how the ‘actors’ in it interact with each other, and how their states evolve as they do so. Initially, we know nothing about Alex and Beau. Once Alex meets Beau, we know something about their interaction; then Beau marries Chris, and then Alex kicks Beau. By the end we know quite a bit more about all three, and in particular, how they relate to each other.

Let’s now take those circuits to be quantum circuits.

In the last section we will elaborate on why this could be a very good choice. For now, it suffices to note that we simply follow the current paradigm of using vectors for meanings, in exactly the same way that this works in LLMs. Moreover, if we then also want to faithfully represent the compositional structure in language3, we can rely on Theorem 5.49 from our book Picturing Quantum Processes, which can be stated informally as follows:

If the manner in which meanings of words (represented by vectors) compose obeys linguistic structure, then those vectors compose in exactly the same way as quantum systems compose.4

In short, a quantum implementation enables us to embrace compositional interpretability, as defined in our recent paper [arXiv:2406.17583].
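As a toy illustration of what the theorem says (a sketch of our own, with made-up numbers): if each actor’s meaning is a vector, then the joint meaning of several actors lives in the tensor product of their spaces, and an interaction is a linear map on that product space, exactly as for quantum systems.

```python
import numpy as np

# Toy illustration: meanings as vectors, composition as tensor product.
# One qubit (a 2-dimensional vector) per actor -- illustrative only.
alex = np.array([1.0, 0.0])   # |0>: nothing known yet
beau = np.array([1.0, 0.0])

# Put Alex in a superposition, then form the joint "Alex and Beau"
# state, which lives in the tensor product space (dimension 2 x 2 = 4).
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
joint = np.kron(H @ alex, beau)

# A two-actor interaction such as "meets" is a map on the joint space;
# here a CNOT-like entangler stands in for a trained word gate.
meets = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]], dtype=float)

joint = meets @ joint
print(joint)  # (|00> + |11>)/sqrt(2): the two actors are now correlated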

3. Text circuits on our quantum computer

So, what have we done? And what does it mean?

We implemented a “question-answering” experiment on our Quantinuum quantum computers, for text circuits as described above. We know from our new paper [arXiv:2408.06061] that this is very hard to do on a classical computer: as texts get bigger, the task quickly becomes unrealistic to even attempt, however powerful the classical computer might be. This is worth emphasizing. The experiment we completed would scale exponentially using classical computers – to the point where the approach becomes intractable.

The experiment consisted of teaching (or training) the quantum computer to answer a question about a story, where both the story and the question are presented as text circuits. To test our model, we created longer stories in the same style as those used in training and asked questions about these. In our experiment, the stories were about people moving around, and we questioned the quantum computer about who was moving in the same direction at the end of the stories. A harder alternative one could imagine would be presenting a murder mystery and then asking the computer who the murderer was.

And remember: the training in our experiment consists of assigning quantum states and gates to the words that occur in the text.

Figure 2. The question-answering task for the language of text circuits as implementable on a quantum computer from [arXiv:2409.08777]. Above the dotted line is the text we consider. Below are upside-down text circuits which constitute the question we ask. The boxes with words are parameterized as quantum gates. The diagram on the left constitutes one possible answer to the question, and the one on the right the other. Can you figure out what the text is and what the questions are?
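In other words, the pattern in Figure 2 – a text circuit met by an upside-down question circuit – amounts to comparing the state prepared by the story with the state prepared by each candidate answer, and choosing the answer with the larger overlap. Here is a deliberately oversimplified single-qubit sketch of that idea; the words, angles and single-wire setup are illustrative, not the trained model from the paper.

```python
import numpy as np

def ry(theta):
    """Y-rotation; stands in for a parameterized word gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# "Training" assigns a parameter to each word (made-up values).
params = {"north": 0.8, "south": -0.8, "follows": 0.3}

# The story circuit applies word gates, in order, to an initial state.
story = np.array([1.0, 0.0])
for word in ["north", "follows"]:
    story = ry(params[word]) @ story

# Each candidate answer is itself a small circuit; question answering
# compares the overlaps |<answer|story>|^2 and picks the larger.
answers = {
    "same direction":     ry(params["north"]) @ np.array([1.0, 0.0]),
    "opposite direction": ry(params["south"]) @ np.array([1.0, 0.0]),
}
scores = {a: abs(v @ story) ** 2 for a, v in answers.items()}
print(max(scores, key=scores.get))  # -> "same direction"
```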
4. Compositional generalization

The major reason for our excitement is that the training of our circuits enjoys compositional generalization. That is, we can do the training on small-scale ordinary computers, and do the testing – that is, ask the important questions – on quantum computers that can operate in ways not possible classically. Figure 3 shows how, despite only being trained on stories with up to 8 actors, the test accuracy remains high even for much longer stories involving up to 30 actors.

Training large circuits directly in quantum machine learning leads to difficulties which in many cases undo the potential advantage. Critically, compositional generalization allows us to bypass these issues.

Figure 3. A simplified plot from [arXiv:2409.08777] showing that increasing the sizes of circuits when testing doesn’t affect the accuracy, after training small-scale on ordinary computers. The number of actors correlates with the text size. H1-1 is the name of the Quantinuum quantum computer that was used.
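The mechanism behind this is simple to state: the trained parameters belong to words and gates, not to whole stories, so a model trained on short texts can be run unchanged on arbitrarily long ones. A toy sketch, continuing the single-qubit illustration above:

```python
import numpy as np

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# Word-level parameters fixed by training on short stories (made up here).
params = {"north": 0.8, "south": -0.8, "follows": 0.3}

def evaluate(story_words):
    """Run a story of *any* length with the same trained word parameters."""
    state = np.array([1.0, 0.0])
    for word in story_words:
        state = ry(params[word]) @ state
    return state

short_story = ["north", "follows"]               # training-sized
long_story = ["north", "south", "follows"] * 10  # far beyond training size
print(evaluate(short_story), evaluate(long_story))
```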
5. Real-world comparison: ChatGPT

We can compare the results of our experiment on a quantum computer to those of a classical LLM, ChatGPT (GPT-4), when asked the same questions.

What we are considering here is a story about a collection of characters that walk in a number of different directions, and sometimes follow each other. These are just some initial test examples, but they do show that this kind of reasoning is not particularly easy for LLMs.

Figure 4. The input given to ChatGPT (top) and the response we got back (bottom).

Can you see where ChatGPT went wrong?

ChatGPT’s score (in terms of accuracy) oscillated around 50%, equivalent to random guessing. Our text circuits consistently outperformed ChatGPT on these tasks. Future work in this area would involve looking at prompt engineering – for example, how the phrasing of the instructions can affect the output, and therefore the overall score.

Of course, we note that ChatGPT and other LLMs will issue new versions that may or may not be marginally better at question-answering tasks, and we also note that our own work may become far more effective as quantum computers rapidly become more powerful.

6. What’s next?

We have now turned our attention to work that will show that using vectors to represent meaning, together with requiring compositional interpretability for natural language, takes us natively into the quantum formalism. This does not mean that no efficient classical method can exist for specific tasks, and it may be hard to prove traditional hardness results whenever machine learning is involved. This is something we may have to come to terms with, just as in classical machine learning.

At Quantinuum we possess the most powerful quantum computers currently available. Our recently published roadmap will deliver more computationally powerful quantum computers in the short and medium term, as we extend our lead and push towards universal, fault-tolerant quantum computers by the end of the decade. We expect to show even better (and larger-scale) results when implementing our work on those machines. In short, we foresee a period of rapid innovation as powerful quantum computers that cannot be classically simulated become more readily available. This will likely be disruptive, as more and more use cases, including ones that we might not currently be thinking about, come into play.

Interestingly and intriguingly, we are also pioneering the use of powerful quantum computers in a hybrid system that has been described as a ‘quantum supercomputer’, in which quantum computers, HPC and AI work together in an integrated fashion. We look forward to using these systems to advance our work in language processing, helping to solve the problems with LLMs that we highlighted at the start of this article.

1 And where do we go next, when we don’t even understand what we are dealing with now? On previous occasions in the history of science and technology, when efficient models without a clear interpretation have been developed, such as the Babylonian lunar theory or Ptolemy’s model of epicycles, these initially highly successful technologies vanished, making way for something else.

2 Note that our conception of compositionality is more general than the usual one adopted in linguistics, which is due to Frege. A discussion can be found in [arXiv:2110.05327].

3 For example, using pregroups as the linguistic structure here; these correspond to the cups and caps of PQP.

4 That is, using the tensor product of the corresponding vector spaces.

About Quantinuum

Quantinuum, the world’s largest integrated quantum company, pioneers powerful quantum computers and advanced software solutions. Quantinuum’s technology drives breakthroughs in materials discovery, cybersecurity, and next-gen quantum AI. With over 500 employees, including 370+ scientists and engineers, Quantinuum leads the quantum computing revolution across continents. 

Blog
December 9, 2024
Q2B 2024: The Roadmap to Quantum Value

At this year’s Q2B Silicon Valley conference from December 10th – 12th in Santa Clara, California, the Quantinuum team will be participating in plenary and case study sessions to showcase our quantum computing technologies. 

Schedule a meeting with us at Q2B

Meet our team at Booth #G9 to discover how Quantinuum is charting the path to universal, fully fault-tolerant quantum computing. 

Join our sessions: 

Tuesday, Dec 10, 10:00 - 10:20am PT

Plenary: Advancements in Fault-Tolerant Quantum Computation: Demonstrations and Results

There is industry-wide consensus on the need for fault-tolerant QPUs, but demonstrations of these abilities are less common. In this talk, Dr. Hayes will review Quantinuum’s long list of meaningful demonstrations in fault-tolerance, including real-time error correction, a variety of codes from the surface code to exotic qLDPC codes, logical benchmarking, and beyond-break-even behavior on multiple codes and circuit families.

View the presentation

Wednesday, Dec 11, 4:30 – 4:50pm PT

Keynote: Quantum Tokens: Securing Digital Assets with Quantum Physics

Mitsui’s Deputy General Manager, Quantum Innovation Dept., Corporate Development Div., Koji Naniwada, and Quantinuum’s Head of Cybersecurity, Duncan Jones, will deliver a keynote presentation on a case study for quantum in cybersecurity. Together, our organizations demonstrated the first implementation of quantum tokens over a commercial QKD network. Quantum tokens enable three previously incompatible properties: unforgeability guaranteed by physics, fast settlement without centralized validation, and user privacy until redemption. We present results from our successful Tokyo trial using NEC's QKD commercial hardware and discuss potential applications in financial services.

Details on the case study

Wednesday, Dec 11, 5:10 – 6:10pm PT

Quantinuum and Mitsui Sponsored Happy Hour

Join the Quantinuum and Mitsui teams in the expo hall for a networking happy hour. 

Blog
December 5, 2024
Quantum computing is accelerating

Particle accelerator projects like the Large Hadron Collider (LHC) don’t just smash particles – they also power the invention of some of the world’s most impactful technologies. A favorite example is the World Wide Web, which was developed for particle physics experiments at CERN.

Tech designed to unlock the mysteries of the universe has brutally exacting requirements – and it is this boundary-pushing, plus billion-dollar budgets, that has led to so much innovation.

For example, X-rays are used in accelerators to measure the chemical composition of the accelerator products and to monitor radiation. The understanding developed to create those technologies was then applied to help us build better CT scanners, reducing the X-ray dosage while improving the image quality.

Stories like this are common in accelerator physics, or High Energy Physics (HEP). Scientists and engineers working in HEP have been early adopters and/or key drivers of innovations in advanced cancer treatments (using proton beams), machine learning techniques, robots, new materials, cryogenics, data handling and analysis, and more. 

A key strand of HEP research aims to make accelerators simpler and cheaper. One piece of infrastructure that could clearly be improved is their computing environments.

CERN itself has said: “CERN is one of the most highly demanding computing environments in the research world... From software development, to data processing and storage, networks, support for the LHC and non-LHC experimental programme, automation and controls, as well as services for the accelerator complex and for the whole laboratory and its users, computing is at the heart of CERN’s infrastructure.” 

With annual data generated by accelerators in excess of exabytes (a billion gigabytes), tens of millions of lines of code written to support the experiments, and incredibly demanding hardware requirements, it’s no surprise that the HEP community is interested in quantum computing, which offers real solutions to some of their hardest problems. 

As the authors of this paper stated: “[Quantum Computing] encompasses several defining characteristics that are of particular interest to experimental HEP: the potential for quantum speed-up in processing time, sensitivity to sources of correlations in data, and increased expressivity of quantum systems... Experiments running on high-luminosity accelerators need faster algorithms; identification and reconstruction algorithms need to capture correlations in signals; simulation and inference tools need to express and calculate functions that are classically intractable.”

The HEP community’s interest in quantum computing is growing. In recent years, their scientists have been looking carefully at how quantum computing could help them, publishing a number of papers discussing the challenges and requirements for quantum technology to make a dent (here’s one example, and here’s the arXiv version). 

In the past few months, what was previously theoretical is becoming a reality. Several groups published results using quantum machines to tackle something called “Lattice Gauge Theory”, which is a type of math used to describe a broad range of phenomena in HEP (and beyond). Two papers came from academic groups using quantum simulators, one using trapped ions and one using neutral atoms. Another group, including scientists from Google, tackled Lattice Gauge Theory using a superconducting quantum computer. Taken together, these papers indicate a growing interest in using quantum computing for High Energy Physics, beyond simple one-dimensional systems which are more easily accessible with classical methods such as tensor networks.

We have been working with DESY, one of the world’s leading accelerator centers, to help make quantum computing useful for their work. DESY, short for Deutsches Elektronen-Synchrotron, is a national research center that operates, develops, and constructs particle accelerators, and is part of the worldwide computer network used to store and analyze the enormous flood of data that is produced by the LHC in Geneva.  

Our first publication from this partnership describes a quantum machine learning technique for untangling data from the LHC, finding that in some cases the quantum approach was indeed superior to the classical approach. More recently, we used Quantinuum System Model H1 to tackle Lattice Gauge Theory (LGT), as it’s a favorite contender for quantum advantage in HEP.

Lattice Gauge Theories are one approach to solving what are more broadly referred to as “quantum many-body problems”. Quantum many-body problems lie at the border of our knowledge in many different fields: from the electronic structure problem, which impacts chemistry and pharmaceuticals, and the quest to understand and engineer new material properties such as light-harvesting materials, to basic research such as high energy physics, which aims to understand the fundamental constituents of the universe, or condensed matter physics, where our understanding of things like high-temperature superconductivity is still incomplete.

The difficulty in solving problems like this – analytically or computationally – is that the problem complexity grows exponentially with the size of the system. For example, there are 36 possible configurations of two six-faced dice (1 and 1, or 1 and 2, or 1 and 3... etc.), while for ten dice there are more than sixty million configurations.

Quantum computing may be very well-suited to tackling problems like this, due to a quantum processor’s similar information density scaling – with the addition of a single qubit to a QPU, the information the system contains doubles. Our 56-qubit System Model H2, for example, can hold quantum states that require 128*(2^56) bits worth of information to describe (with double-precision numbers) on a classical supercomputer, which is more information than the biggest supercomputer in the world can hold in memory.
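Both scalings are easy to check with a back-of-envelope calculation (our own arithmetic, matching the figures quoted above):

```python
# Configurations of classical dice grow exponentially with their number...
faces = 6
print(faces ** 2)    # 36 for two dice
print(faces ** 10)   # 60,466,176 -- over sixty million for ten dice

# ...and so does the classical memory needed to store a quantum state.
amplitudes = 2 ** 56                  # one complex amplitude per basis state
bits = 128 * amplitudes               # double-precision complex = 128 bits
print(bits / 8 / 1e18, "exabytes")    # ~1.15 exabytes for 56 qubits
```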

The joint team made significant progress in approaching the Lattice Gauge Theory corresponding to Quantum Electrodynamics, the theory of light and matter. For the first time, they were able to study the full wavefunction of a two-dimensional confining system with gauge fields and dynamical matter fields on a quantum processor. They were also able to visualize the confining string and the string-breaking phenomenon at the level of the wavefunction, across a range of interaction strengths.

The team approached the problem starting with the definition of the Hamiltonian using the InQuanto software package, and utilized the reusable protocols of InQuanto to compute both projective measurements and expectation values. InQuanto allowed the easy integration of measurement reduction techniques and scalable error mitigation techniques. Moreover, the emulator and hardware experiments were orchestrated by the Nexus online platform.

In one section of the study, a circuit with 24 qubits and more than 250 two-qubit gates was reduced to a smaller width of 15 qubits, thanks to the unique qubit-reuse and mid-circuit-measurement compilation implemented automatically in TKET.
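As a flavor of the mechanism (a minimal sketch of our own, not the compiled circuit from the study): measuring a qubit mid-circuit and resetting it frees that physical qubit to serve as a fresh logical wire, which is how a wider circuit can run on fewer qubits.

```python
from pytket import Circuit

# Minimal illustration of qubit reuse via mid-circuit measurement
# and reset (not the actual 24-to-15-qubit compilation from the paper).
c = Circuit(2, 2)

c.H(0)
c.CX(0, 1)
c.Measure(1, 0)   # measure qubit 1 mid-circuit, result into bit 0
c.Reset(1)        # reset it to |0>: it can now act as a fresh wire

# Reuse the same physical qubit as a new logical qubit.
c.H(1)
c.CX(1, 0)
c.Measure(1, 1)

print(c.get_commands())
```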

This work paves the way towards using quantum computers to study lattice gauge theories in higher dimensions, with the goal of one day simulating the full three-dimensional Quantum Chromodynamics theory underlying the nuclear sector of the Standard Model of particle physics. Being able to simulate full 3D quantum chromodynamics will undoubtedly unlock many of Nature’s mysteries, from the Big Bang to the interior of neutron stars, and is likely to lead to applications we haven’t yet dreamed of. 
