This is technically the first blog in my series, Towards Quantum Machine Learning. If you are looking to follow the series from the beginning, this is it… But there is also an intro blog I posted roughly a week earlier; you can have a look at it here.
A couple of points to note: in this series, I'll assume you have some basic knowledge of Machine Learning, and maybe a bit of introductory knowledge of quantum computing… As mentioned in my previous blog, the whole discussion concerns the CQ (Classical data generation, Quantum learning device) approach to QML.
Before getting into the topic, let's have a look at the steps of a supervised quantum machine learning algorithm:
Mathematically, a (binary) quantum classifier consists of three well-defined functions:
(i) Encoding Function:
(ii) State Evolving Function:
(iii) Measurement Function:
One might wonder, like that guy who once started discussing QML with me —
Well, we can't, Patrick… You can't just take classical data and somehow feed it into a quantum model to solve your ML problem. Your world is different from the one your QML model runs in, so something has to happen at the meeting point of the two. That interface is really crucial for a QML model and deserves careful attention.
In the first few episodes of the series, we'll discuss the commonly used data encoding methods that have been proposed over the years (bear with me, Patrick; trust me, it's important). These mainly include:
The computational basis state of an n-qubit system is associated with a classical n-bit string: each classical bit is literally replaced by a qubit, and the computation acts in parallel on all bit sequences in superposition.
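A basis-encoded state is easy to write down as a plain state vector. Here is a minimal NumPy sketch (the helper name `basis_encode` is my own, not from any library):

```python
import numpy as np

def basis_encode(bits):
    """Map a classical bit string to the corresponding computational
    basis state of an n-qubit register, as a length-2^n state vector."""
    n = len(bits)
    index = int("".join(str(b) for b in bits), 2)  # bit string -> integer index
    state = np.zeros(2 ** n)
    state[index] = 1.0  # all amplitude on the matching basis state
    return state

# '101' corresponds to |101>, i.e. index 5 of an 8-dimensional vector
state = basis_encode([1, 0, 1])
```

A superposition over several bit strings is then just a (normalized) sum of such vectors.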
As the name suggests, this method associates the entries of a classical real vector (which has to be normalized) with the amplitudes of a quantum state. Essentially, the following mapping takes place (x is the normalized classical vector): |ψ_x⟩ = Σ_i x_i |i⟩.
It is, however, a rather uncommon method in quantum computing nowadays, as the literature offers many workarounds that achieve the same thing.
Anyway, this encoding method comes with severe limitations:
- We can't perform non-linear operations on the amplitudes, since quantum evolution is restricted to unitary (and therefore linear) operations.
- The normalization requirement removes one degree of freedom from the original data. (I'll leave this for you to figure out. Hint: normalized data in 2-D has coordinates (cos θ, sin θ).)
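The normalization step itself is one line of NumPy. A quick sketch (no quantum library involved; the helper name is made up):

```python
import numpy as np

def amplitude_encode(x):
    """Normalize a classical vector so its entries can serve as amplitudes.
    For a real n-qubit state the length must be a power of two
    (pad with zeros otherwise)."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)

# [3, 4] becomes [0.6, 0.8]: squared amplitudes 0.36 + 0.64 sum to 1
state = amplitude_encode([3.0, 4.0])
```

Note how the original length of the vector (here 5) is gone after encoding, which is exactly the lost degree of freedom mentioned above.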
Given a classical discrete probability distribution p_1, …, p_{2^n} over 2^n binary strings, the corresponding state of an n-qubit quantum system can be given as |ψ⟩ = Σ_i √(p_i) |i⟩, where the sum runs over all 2^n basis states.
This quantum state is sometimes referred to as a qsample. Probably more on this in the coming episodes…
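On paper, preparing a qsample just means taking square roots of the probabilities: measuring the resulting state in the computational basis reproduces the classical distribution. A tiny NumPy sketch (the helper name is my own):

```python
import numpy as np

def qsample(p):
    """Turn a classical probability distribution into qsample amplitudes.
    Measuring in the computational basis gives outcome i with
    probability |sqrt(p_i)|^2 = p_i, recovering the distribution."""
    p = np.asarray(p, dtype=float)
    assert np.isclose(p.sum(), 1.0), "input must be a probability distribution"
    return np.sqrt(p)

amps = qsample([0.5, 0.25, 0.125, 0.125])  # a 2-qubit qsample
```

Of course, actually preparing such a state on hardware is the hard part; this only shows the target amplitudes.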
Unitary operators form a restricted class of matrices, so it's useful to be able to associate a Hamiltonian H with an arbitrary square matrix A. In case A is not Hermitian, one can instead use the trick of encoding the block matrix H̃ = ((0, A), (A†, 0)) and only considering part of the output.
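This trick is easy to check numerically. A minimal sketch of the standard block embedding (function name is mine):

```python
import numpy as np

def hermitian_embedding(A):
    """Embed an arbitrary square matrix A into the larger matrix
    [[0, A], [A^dagger, 0]], which is Hermitian by construction and
    can therefore serve as a Hamiltonian."""
    A = np.asarray(A)
    n = A.shape[0]
    Z = np.zeros((n, n), dtype=A.dtype)
    return np.block([[Z, A], [A.conj().T, Z]])

A = np.array([[1, 2j], [0, 3]])   # not Hermitian
H = hermitian_embedding(A)
assert np.allclose(H, H.conj().T)  # the embedding is Hermitian
```

The original A survives as the upper-right block, which is the "part of the output" one keeps.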
Basically, the classical vector is represented in an N-qubit quantum system as rotation angles, one for each qubit of the quantum state.
This might make more sense when written mathematically:
Given a feature vector x = [x1, …, xN]T, the angle encoding mapping is given by |x⟩ = (cos x1 |0⟩ + sin x1 |1⟩) ⊗ … ⊗ (cos xN |0⟩ + sin xN |1⟩).
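Because angle encoding is a tensor product of single-qubit states, the full state vector is cheap to build classically. A minimal NumPy sketch (helper name is mine):

```python
import numpy as np

def angle_encode(x):
    """Build the N-qubit angle-encoded state: each feature x_i becomes
    one qubit in state cos(x_i)|0> + sin(x_i)|1>, and the full state
    is the tensor (Kronecker) product of all of them."""
    state = np.array([1.0])
    for xi in x:
        qubit = np.array([np.cos(xi), np.sin(xi)])  # normalized by construction
        state = np.kron(state, qubit)
    return state

s = angle_encode([0.3, 1.2])  # a 2-qubit, 4-amplitude state
```

Note that N features need N qubits here, unlike amplitude encoding, which packs 2^n values into n qubits.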
This method has been adopted in many recent papers and seems to have attracted quite a lot of attention.
Discussed above were some of the data encoding methods proposed in the literature. Although some of them are favorites in recent papers, the efficiency of a quantum learning model depends at least as much on the learning algorithm it is paired with: some encoding methods work well with certain learning algorithms, and some don't.
From here on, I'll be discussing basis encoding in more detail. Data encoding is a crucial step in quantum ML algorithms: the way classical information is represented in a quantum computer sets the context for designing quantum algorithms and the possibilities for speed-ups, so we might as well get a better insight into how to do it efficiently.
A strategy was introduced by Ventura, Martinez, and others in the papers "Quantum associative memory" and "Probabilistic quantum memories".
Okay, just to simplify things a bit (pun intended), I’ll be considering binary features, i.e., each bit represents one feature.
Right off the bat, we need a quantum system of the form |l1 … lN; a1 a2; s1 … sN⟩, with all qubits initialized to |0⟩:
Let me explain: there are three registers. The first is a loading register of N qubits. Next comes the ancilla (helper) register, whose values you can treat as trash at the end of the day. Last on the list is the N-qubit storage register. You'll see why in a minute…
Operation 1: Hadamard on 2nd qubit of ancilla register
After applying a Hadamard gate to the ancilla register's second qubit, the initial quantum state splits into two "branches" (superposition terms): the memory branch, with a2 = 0, and the processing branch, with a2 = 1.
Now, the algorithm will take place in the following way:
The patterns are loaded iteratively into the loading register, and the processing branch is cleverly split into two sub-branches to add the current pattern to the memory branch. The superposition of patterns is thus extended step by step.
To quickly explain what's going on, let's walk through the (m+1)-th iteration.
Assume the first m vectors have already been encoded during iterations 1, …, m. The state then looks like this: each stored pattern x^k sits in the memory branch with amplitude 1/√M, while the processing branch carries the remaining amplitude √((M−m)/M).
All the m vectors have been stored in the storage register of the memory branch (each pattern in superposition).
Operation 2: Write the (m+1)-th pattern into the loading register (in both branches, of course)
This can be done by applying an X gate to the qubits of the loading register where bits are nonzero in the input pattern.
Operation 3: Copy the pattern into the storage register of the processing branch using CNOT gates
The pattern gets copied into the processing branch's storage register. To ensure that only the processing branch is affected, the CNOT gates are controlled by the second ancilla qubit (a2). The state becomes:
Operation 4: Flip a1 to 1 in the processing branch
This is done using a CNOT gate on a1, controlled by a2.
Operation 5: Apply a special unitary on a2 controlled by a1 (this essentially splits up the processing branch)
Apply the following unitary to a2 whenever a1 = 1, that is, to the processing branch only (written in the {|0⟩, |1⟩} basis; rows separated by semicolons): S^μ = ( √((μ−1)/μ), 1/√μ ; −1/√μ, √((μ−1)/μ) ),
where μ = M + 1 − (m + 1).
This operation splits the processing branch into two sub-branches: one that will be added to the memory branch, and another that remains the processing branch for the next iteration.
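For concreteness, here is one common parameterization of such a splitting unitary as a plain NumPy matrix. Sign conventions vary across papers, so treat this as a sketch rather than the unique form; the function name is mine:

```python
import numpy as np

def split_unitary(mu):
    """A 2x2 real unitary that peels amplitude 1/sqrt(mu) off the
    processing branch, leaving sqrt((mu-1)/mu) of it behind."""
    return np.array([[np.sqrt((mu - 1) / mu), 1 / np.sqrt(mu)],
                     [-1 / np.sqrt(mu),       np.sqrt((mu - 1) / mu)]])

S = split_unitary(4)
assert np.allclose(S @ S.T, np.eye(2))  # real orthogonal, hence unitary
```

With μ = 1 (the last iteration), the off-diagonal entries become ±1 and the entire remaining processing amplitude is transferred, so nothing is left over at the end.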
Operation 6: Add the required branch to the memory branch
Now, all we need to do is add the first sub-branch of the processing branch to the memory branch. This can be done by flipping the a1 qubit of that sub-branch, using a CNOT gate anti-controlled on a2 (i.e., flip a1 whenever a2 = 0).
Lastly, the storage register of the processing branch and the loading registers of both branches need to be returned to the ground (zero) state before the next iteration. This can be done by simply reversing the earlier operations.
NOTE: The whole algorithm requires O(MN) steps and succeeds with certainty.
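The amplitude bookkeeping of the whole loading routine can be sketched classically. Below is a toy simulation that tracks only the branch amplitudes, not the full quantum state; all names are my own, and the splitting factors follow the μ = M + 1 − (m + 1) rule from Operation 5:

```python
from math import sqrt, isclose

def load_patterns(patterns):
    """Amplitude bookkeeping for the iterative storage routine.
    memory: dict mapping each stored pattern -> its amplitude;
    processing: amplitude still sitting in the processing branch."""
    M = len(patterns)
    memory = {}
    processing = 1.0  # all amplitude starts in the processing branch
    for m, pattern in enumerate(patterns):  # iteration m+1 stores patterns[m]
        mu = M - m  # mu = M + 1 - (m + 1)
        # the split sends amplitude c to: c/sqrt(mu) (joins memory branch)
        # and c*sqrt((mu-1)/mu) (remains as the processing branch)
        memory[pattern] = processing / sqrt(mu)
        processing = processing * sqrt((mu - 1) / mu)
    return memory, processing

mem, rest = load_patterns(["101", "010", "111"])
# each stored pattern ends up with amplitude 1/sqrt(3); nothing is left over
```

Running this for M = 3 patterns shows every pattern landing at amplitude 1/√M and the processing branch draining to zero on the last iteration, which is exactly the uniform superposition the algorithm promises.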
There have been many interesting architectural proposals for quantum random access memory. One of them was presented in the 2008 paper "Quantum random access memory". Still, the hardware to implement it remains a problem for now.
Alright, that's it for now… I'll be back soon (hopefully soon, alright?) with another episode in the series, probably on another encoding method, or maybe something different this time.
I'll just leave a little quote here from Richard Feynman:
To every man is given the key to the gates of heaven. The same key opens the gates of hell. And so it is with science.