“Talking quantum circuits”

Interpretable and scalable quantum natural language processing

September 18, 2024
The central question that pre-occupies our team has been:

“How can quantum structures and quantum computers contribute to the effectiveness of AI?”

In previous work we have made notable advances in answering this question, and this article is based on our most recent work in the new papers [arXiv:2406.17583, arXiv:2408.06061], and most notably the experiment in [arXiv:2409.08777].

This article is one of a series that we will be publishing alongside further advances – advances that are accelerated by access to the most powerful quantum computers available.

Large language Models (LLMs) such as ChatGPT are having an impact on society across many walks of life. However, as users have become more familiar with this new technology, they have also become increasingly aware of deep-seated and systemic problems that come with AI systems built around LLM’s.

The primary problem with LLMs is that nobody knows how they work - as inscrutable “black boxes” they aren’t “interpretable”, meaning we can’t reliably or efficiently control or predict their behavior. This is unacceptable in many situations. In addition, Modern LLMs are incredibly expensive to build and run, costing serious – and potentially unsustainable –amounts of power to train and use. This is why more and more organizations, governments, and regulators are insisting on solutions.  

But how can we find these solutions, when we don’t fully understand what we are dealing with now?1

At Quantinuum, we have been working on natural language processing (NLP) using quantum computers for some time now. We are excited to have recently carried out experiments [arXiv: 2409.08777] which demonstrate not only how it is possible to train a model for a quantum computer in a scalable manner, but also how to do this in a way that is interpretable for us. Moreover, we have promising theoretical indications of the usefulness of quantum computers for interpretable NLP [arXiv:2408.06061].

In order to better understand why this could be the case, one needs to understand the ways in which meanings compose together throughout a story or narrative. Our work towards capturing them in a new model of language, which we call DisCoCirc, is reported on extensively in this previous blog post from 2023.

In new work referred to in this article, we embrace “compositional interpretability” as proposed in [arXiv:2406.17583] as a solution to the problems that plague current AI. In brief, compositional interpretability boils down to being able to assign a human friendly meaning, such as natural language, to the components of a model, and then being able to understand how they fit together2.

A problem currently inherent to quantum machine learning is that of being able to train at scale. We avoid this by making use of “compositional generalization”. This means we train small, on classical computers, and then at test time evaluate much larger examples on a quantum computer. There now exist quantum computers which are impossible to simulate classically. To train models for such computers, it seems that compositional generalization currently provides the only credible path.

1. Text as circuits

DisCoCirc is a circuit-based model for natural language that turns arbitrary text into “text circuits” [arXiv:1904.03478, arXiv:2301.10595, arXiv:2311.17892]. When we say that arbitrary text becomes ‘text-circuits’ we are converting the lines of text, which live in one dimension, into text-circuits which live in two-dimensions. These dimensions are the entities of the text versus the events in time.

To see how that works, consider the following story. In the beginning there is Alex and Beau. Alex meets Beau. Later, Chris shows up, and Beau marries Chris. Alex then kicks Beau.

The content of this story can be represented as the following circuit:

Figure 1. A text circuit for a simple story, involving three actors Alex, Beau andChris, who have a number of interactions with one another, making up a story –the circuit is to be read from top to bottom.
2. From text circuits to quantum circuits

Such a text circuit represents how the ‘actors’ in it interact with each other, and how their states evolve by doing so. Initially, we know nothing about Alex and Beau. Once Alex meets Beau, we know something about Alex and Beau’s interaction, then Beau marries Chris, and then Alex kicks Beau, so we know quite a bit more about all three, and in particular, how they relate to each other.

Let’s now take those circuits to be quantum circuits.

In the last section we will elaborate more why this could be a very good choice. For now it’s ok to understand that we simply follow the current paradigm of using vectors for meanings, in exactly the same way that this works in LLMs. Moreover, if we then also want to faithfully represent the compositional structure in language3, we can rely on theorem 5.49 from our book Picturing Quantum Processes, which informally can be stated as follows:

If the manner in which meanings of words (represented by vectors) compose obeys linguistic structure, then those vectors compose in exactly the same way as quantum systems compose.4

In short, a quantum implementation enables us to embrace compositional interpretability, as defined in our recent paper [arXiv:2406.17583].

3. Text circuits on our quantum computer

So, what have we done? And what does it mean?

We implemented a “question-answering” experiment on our Quantinuum quantum computers, for text circuits as described above. We know from our new paper [arXiv:2408.06061] that this is very hard to do on a classical computer due to the fact that as the size of the texts get bigger they very quickly become unrealistic to even try to do this on a classical computer, however powerful it might be. This is worth emphasizing. The experiment we have completed would scale exponentially using classical computers – to the point where the approach becomes intractable.

The experiment consisted of teaching (or training) the quantum computer to answer a question about a story, where both the story and question are presented as text-circuits. To test our model, we created longer stories in the same style as those used in training and questioned these. In our experiment, our stories were about people moving around, and we questioned the quantum computer about who was moving in the same direction at the end of the stories. A harder alternative one could imagine, would be having a murder mystery story and then asking the computer who was the murderer.

And remember - the training in our experiment constitutes the assigning of quantum states and gates to words that occur in the text.

Figure 2. The question-answering task for the language of text circuits as implementable on a quantum computer from [arXiv: 2409.08777]. Above the dotted line is the text we consider. Below are upside-down text circuits which constitute the question we ask. The boxes with words are parameterized as quantum gates. The diagram on the left constitutes one possible answer to the question, and the one on the right the other. Can you figure out what the text is and what the questions are?
4. Compositional generalization

The major reason for our excitement is that the training of our circuits enjoys compositional generalization. That is, we can do the training on small-scale ordinary computers, and do the testing, or asking the important questions, on quantum computers that can operate in ways not possible classically. Figure 4 shows how, despite only being trained on stories with up to 8 actors, the test accuracy remains high, even for much longer stories involving up to 30 actors.

Training large circuits directly in quantum machine learning, leads to difficulties which in many cases undo the potential advantage. Critically - compositional generalization allows us to bypass these issues.

Figure 3. A simplified plot from [arXiv:2409.08777] showing that increasing the sizes of circuits when testing doesn’t affect the accuracy, after training small-scale on ordinary computers. The number of actors correlates with the text size. H1-1 is the name of the Quantinuum quantum computer that was used.
5. Real-world comparison: ChatGPT

We can compare the results of our experiment on a quantum computer, to the success of a classical LLM ChatGPT (GPT-4) when asked the same questions.

What we are considering here is a story about a collection of characters that walk in a number of different directions, and sometimes follow each other. These are just some initial test examples, but it does show that this kind of reasoning is not particularly easy for LLMs.

The input to ChatGPT was:

What we got from ChatGPT:

Can you see where ChatGPT went wrong?

ChatGPT’s score (in terms of accuracy) oscillated around 50% (equivalent to random guessing). Our text circuits consistently outperformed ChatGPT on these tasks. Future work in this area would involve looking at prompt engineering – for example how the phrasing of the instructions can affect the output, and therefore the overall score.

Of course, we note that ChatGPT and other LLM’s will issue new versions that may or may not be marginally better with ‘question-answering’ tasks, and we also note that our own work may become far more effective as quantum computers rapidly become more powerful.

6. What’s next?

We have now turned our attention to work that will show that using vectors to represent meaning and requiring compositional interpretability for natural language takes us mathematically natively into the quantum formalism. This does not mean that there doesn't exist an efficient classical method for solving specific tasks, and it may be hard to prove traditional hardness results whenever there is some machine learning involved. This could be something we might have to come to terms with, just as in classical machine learning.

At Quantinuum we possess the most powerful quantum computers currently available. Our recently published roadmap is going to deliver more computationally powerful quantum computers in the short and medium term, as we extend our lead and push towards universal, fault tolerant quantum computers by the end of the decade. We expect to show even better (and larger scale) results when implementing our work on those machines. In short, we foresee a period of rapid innovation as powerful quantum computers that cannot be classically simulated become more readily available. This will likely be disruptive, as more and more use cases, including ones that we might not be currently thinking about, come into play.

Interestingly and intriguingly, we are also pioneering the use of powerful quantum computers in a hybrid system that has been described as a ‘quantum supercomputer’ where quantum computers, HPC and AI work together in an integrated fashion and look forward to using these systems to advance our work in language processing that can help solve the problem with LLM’s that we highlighted at the start of this article. 

1 And where do we go next, when we don’t even understand what we are dealing with now? On previous occasions in the history of science and technology, when efficient models without a clear interpretation have been developed, such as the Babylonian lunar theory or Ptolemy’s model of epicycles, these initially highly successful technologies vanished, making way for something else.

2 Note that our conception of compositionality is more general than the usual one adopted in linguistics, which is due to Frege. A discussion can be found in [arXiv: 2110.05327].

3 For example, using pregroups here as linguistic structure, which are the cups and caps of PQP.

4 That is, using the tensor product of the corresponding vector spaces.

About Quantinuum

Quantinuum, the world’s largest integrated quantum company, pioneers powerful quantum computers and advanced software solutions. Quantinuum’s technology drives breakthroughs in materials discovery, cybersecurity, and next-gen quantum AI. With over 500 employees, including 370+ scientists and engineers, Quantinuum leads the quantum computing revolution across continents. 

Blog
January 22, 2025
Quantum Computers Will Make AI Better
Today’s LLMs are often impressive by past standards – but they are far from perfect

Quietly, and determinedly since 2019, we’ve been working on Generative Quantum AI. Our early focus on building natively quantum systems for machine learning has benefitted from and been accelerated by access to the world’s most powerful quantum computers, and quantum computers that cannot be classically simulated.

Our work additionally benefits from being very close to our Helios generation quantum computer, built in Colorado, USA. Helios is 1 trillion times more powerful than our H2 System, which is already significantly more advanced than all other quantum computers available.

While tools like ChatGPT have already made a profound impact on society, a critical limitation to their broader industrial and enterprise use has become clear. Classical large language models (LLMs) are computational behemoths, prohibitively huge and expensive to train, and prone to errors that damage their credibility.

Training models like ChatGPT requires processing vast datasets with billions, even trillions, of parameters. This demands immense computational power, often spread across thousands of GPUs or specialized hardware accelerators. The environmental cost is staggering—simply training GPT-3, for instance, consumed nearly 1,300 megawatt-hours of electricity, equivalent to the annual energy use of 130 average U.S. homes.

This doesn’t account for the ongoing operational costs of running these models, which remain high with every query. 

Despite these challenges, the push to develop ever-larger models shows no signs of slowing down.

Enter quantum computing. Quantum technology offers a more sustainable, efficient, and high-performance solution—one that will fundamentally reshape AI, dramatically lowering costs and increasing scalability, while overcoming the limitations of today's classical systems. 

Quantum Natural Language Processing: A New Frontier

At Quantinuum we have been maniacally focused on “rebuilding” machine learning (ML) techniques for Natural Language Processing (NLP) using quantum computers. 

Our research team has worked on translating key innovations in natural language processing — such as word embeddings, recurrent neural networks, and transformers — into the quantum realm. The ultimate goal is not merely to port existing classical techniques onto quantum computers but to reimagine these methods in ways that take full advantage of the unique features of quantum computers.

We have a deep bench working on this. Our Head of AI, Dr. Steve Clark, previously spent 14 years as a faculty member at Oxford and Cambridge, and over 4 years as a Senior Staff Research Scientist at DeepMind in London. He works closely with Dr. Konstantinos Meichanetzidis, who is our Head of Scientific Product Development and who has been working for years at the intersection of quantum many-body physics, quantum computing, theoretical computer science, and artificial intelligence.

A critical element of the team’s approach to this project is avoiding the temptation to simply “copy-paste”, i.e. taking the math from a classical version and directly implementing that on a quantum computer. 

This is motivated by the fact that quantum systems are fundamentally different from classical systems: their ability to leverage quantum phenomena like entanglement and interference ultimately changes the rules of computation. By ensuring these new models are properly mapped onto the quantum architecture, we are best poised to benefit from quantum computing’s unique advantages. 

These advantages are not so far in the future as we once imagined – partially driven by our accelerating pace of development in hardware and quantum error correction.

Making computers “talk”- a short history

The ultimate problem of making a computer understand a human language isn’t unlike trying to learn a new language yourself – you must hear/read/speak lots of examples, memorize lots of rules and their exceptions, memorize words and their meanings, and so on. However, it’s more complicated than that when the “brain” is a computer. Computers naturally speak their native languages very well, where everything from machine code to Python has a meaningful structure and set of rules. 

In contrast, “natural” (human) language is very different from the strict compliance of computer languages: things like idioms confound any sense of structure, humor and poetry play with semantics in creative ways, and the language itself is always evolving. Still, people have been considering this problem since the 1950’s (Turing’s original “test” of intelligence involves the automated interpretation and generation of natural language).

Up until the 1980s, most natural language processing systems were based on complex sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of machine learning algorithms for language processing. 

Initial ML approaches were largely “statistical”: by analyzing large amounts of text data, one can identify patterns and probabilities. There were notable successes in translation (like translating French into English), and the birth of the web led to more innovations in learning from and handling big data.

What many consider “modern” NLP was born in the late 2000’s, when expanded compute power and larger datasets enabled practical use of neural networks. Being mathematical models, neural networks are “built” out of the tools of mathematics; specifically linear algebra and calculus. 

Building a neural network, then, means finding ways to manipulate language using the tools of linear algebra and calculus. This means representing words and sentences as vectors and matrices, developing tools to manipulate them, and so on. This is precisely the path that researchers in classical NLP have been following for the past 15 years, and the path that our team is now speedrunning in the quantum case.

Quantum Word Embeddings: A Complex Twist

The first major breakthrough in neural NLP came roughly a decade ago, when vector representations of words were developed, using the frameworks known as Word2Vec and GloVe (Global Vectors for Word Representation). In a recent paper, our team, including Carys Harvey and Douglas Brown, demonstrated how to do this in quantum NLP models – with a crucial twist. Instead of embedding words as real-valued vectors (as in the classical case), the team built it to work with complex-valued vectors.

In quantum mechanics, the state of a physical system is represented by a vector residing in a complex vector space, called a Hilbert space. By embedding words as complex vectors, we are able to map language into parameterized quantum circuits, and ultimately the qubits in our processor. This is a major advance that was largely under appreciated by the AI community but which is now rapidly gaining interest.

Using complex-valued word embeddings for QNLP means that from the bottom-up we are working with something fundamentally different. This different “geometry” may provide advantage in any number of areas: natural language has a rich probabilistic and hierarchical structure that may very well benefit from the richer representation of complex numbers.

The Quantum Recurrent Neural Network (RNN)

Another breakthrough comes from the development of quantum recurrent neural networks (RNNs). RNNs are commonly used in classical NLP to handle tasks such as text classification and language modeling. 

Our team, including Dr. Wenduan Xu, Douglas Brown, and Dr. Gabriel Matos, implemented a quantum version of the RNN using parameterized quantum circuits (PQCs). PQCs allow for hybrid quantum-classical computation, where quantum circuits process information and classical computers optimize the parameters controlling the quantum system.

In a recent experiment, the team used their quantum RNN to perform a standard NLP task: classifying movie reviews from Rotten Tomatoes as positive or negative. Remarkably, the quantum RNN performed as well as classical RNNs, GRUs, and LSTMs, using only four qubits. This result is notable for two reasons: it shows that quantum models can achieve competitive performance using a much smaller vector space, and it demonstrates the potential for significant energy savings in the future of AI.

In a similar experiment, our team partnered with Amgen to use PQCs for peptide classification, which is a standard task in computational biology. Working on the Quantinuum System Model H1, the joint team performed sequence classification (used in the design of therapeutic proteins), and they found competitive performance with classical baselines of a similar scale. This work was our first proof-of-concept application of near-term quantum computing to a task critical to the design of therapeutic proteins, and helped us to elucidate the route toward larger-scale applications in this and related fields, in line with our hardware development roadmap.

Quantum Transformers - The Next Big Leap

Transformers, the architecture behind models like GPT-3, have revolutionized NLP by enabling massive parallelism and state-of-the-art performance in tasks such as language modeling and translation. However, transformers are designed to take advantage of the parallelism provided by GPUs, something quantum computers do not yet do in the same way.

In response, our team, including Nikhil Khatri and Dr. Gabriel Matos, introduced “Quixer”, a quantum transformer model tailored specifically for quantum architectures. 

By using quantum algorithmic primitives, Quixer is optimized for quantum hardware, making it highly qubit efficient. In a recent study, the team applied Quixer to a realistic language modeling task and achieved results competitive with classical transformer models trained on the same data. 

This is an incredible milestone achievement in and of itself. 

This paper also marks the first quantum machine learning model applied to language on a realistic rather than toy dataset. 

This is a truly exciting advance for anyone interested in the union of quantum computing and artificial intelligence, and is in danger of being lost in the increased ‘noise’ from the quantum computing sector where organizations who are trying to raise capital will try to highlight somewhat trivial advances that are often duplicative.

Quantum Tensor Networks. A Scalable Approach

Carys Harvey and Richie Yeung from Quantinuum in the UK worked with a broader team that explored the use of quantum tensor networks for NLP. Tensor networks are mathematical structures that efficiently represent high-dimensional data, and they have found applications in everything from quantum physics to image recognition. In the context of NLP, tensor networks can be used to perform tasks like sequence classification, where the goal is to classify sequences of words or symbols based on their meaning.

The team performed experiments on our System Model H1, finding comparable performance to classical baselines. This marked the first time a scalable NLP model was run on quantum hardware – a remarkable advance. 

The tree-like structure of quantum tensor models lends itself incredibly well to specific features inherent to our architecture such as mid-circuit measurement and qubit re-use, allowing us to squeeze big problems onto few qubits.

Since quantum theory is inherently described by tensor networks, this is another example of how fundamentally different quantum machine learning approaches can look – again, there is a sort of “intuitive” mapping of the tensor networks used to describe the NLP problem onto the tensor networks used to describe the operation of our quantum processors.

What we’ve learned so far

While it is still very early days, we have good indications that running AI on quantum hardware will be more energy efficient. 

We recently published a result in “random circuit sampling”, a task used to compare quantum to classical computers. We beat the classical supercomputer in time to solution as well as energy use – our quantum computer cost 30,000x less energy to complete the task than Frontier, the classical supercomputer we compared against. 

We may see, as our quantum AI models grow in power and size, that there is a similar scaling in energy use: it’s generally more efficient to use ~100 qubits than it is to use ~10^18 classical bits.

Another major insight so far is that quantum models tend to require significantly fewer parameters to train than their classical counterparts. In classical machine learning, particularly in large neural networks, the number of parameters can grow into the billions, leading to massive computational demands. 

Quantum models, by contrast, leverage the unique properties of quantum mechanics to achieve comparable performance with a much smaller number of parameters. This could drastically reduce the energy and computational resources required to run these models.

The Path Ahead

As quantum computing hardware continues to improve, quantum AI models may increasingly complement or even replace classical systems. By leveraging quantum superposition, entanglement, and interference, these models offer the potential for significant reductions in both computational cost and energy consumption. With fewer parameters required, quantum models could make AI more sustainable, tackling one of the biggest challenges facing the industry today.

The work being done by Quantinuum reflects the start of the next chapter in AI, and one that is transformative. As quantum computing matures, its integration with AI has the potential to unlock entirely new approaches that are not only more efficient and performant but can also handle the full complexities of natural language. The fact that Quantinuum’s quantum computers are the most advanced in the world, and cannot be simulated classically, gives us a unique glimpse into a future. 

The future of AI now looks very much to be quantum and Quantinuum’s Gen QAI system will usher in the era in which our work will have meaningful societal impact.

technical
All
Blog
December 9, 2024
Q2B 2024: The Roadmap to Quantum Value

At this year’s Q2B Silicon Valley conference from December 10th – 12th in Santa Clara, California, the Quantinuum team will be participating in plenary and case study sessions to showcase our quantum computing technologies. 

Schedule a meeting with us at Q2B

Meet our team at Booth #G9 to discover how Quantinuum is charting the path to universal, fully fault-tolerant quantum computing. 

Join our sessions: 

Tuesday, Dec 10, 10:00 - 10:20am PT

Plenary: Advancements in Fault-Tolerant Quantum Computation: Demonstrations and Results

There is industry-wide consensus on the need for fault-tolerant QPU’s, but demonstrations of these abilities are less common. In this talk, Dr. Hayes will review Quantinuum’s long list of meaningful demonstrations in fault-tolerance, including real-time error correction, a variety of codes from the surface code to exotic qLDPC codes, logical benchmarking, beyond break-even behavior on multiple codes and circuit families.

View the presentation

Wednesday, Dec 11, 4:30 – 4:50pm PT

Keynote: Quantum Tokens: Securing Digital Assets with Quantum Physics

Mitsui’s Deputy General Manager, Quantum Innovation Dept., Corporate Development Div., Koji Naniwada, and Quantinuum’s Head of Cybersecurity, Duncan Jones will deliver a keynote presentation on a case study for quantum in cybersecurity. Together, our organizations demonstrated the first implementation of quantum tokens over a commercial QKD network. Quantum tokens enable three previously incompatible properties: unforgeability guaranteed by physics, fast settlement without centralized validation, and user privacy until redemption. We present results from our successful Tokyo trial using NEC's QKD commercial hardware and discuss potential applications in financial services.

Details on the case study

Wednesday, Dec 11, 5:10 – 6:10pm PT

Quantinuum and Mitsui Sponsored Happy Hour

Join the Quantinuum and Mitsui teams in the expo hall for a networking happy hour. 

events
All