Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Micro-simulator in C #37

Draft
wants to merge 13 commits into
base: main
Choose a base branch
from
Draft

Micro-simulator in C #37

wants to merge 13 commits into from

Conversation

stavros11
Copy link
Member

Adds a micro-state vector-simulator written in C based on the qibojit functions. It can be used to simulate only single-qubit gates controlled on arbitrary number of qubits. This is sufficient to execute the circuit in the example.

Most likely it is possible to improve some things, in particular how the elements (gate targets) vector is moved from Rust to C and how gates are mapped to matrices. Maybe also the compilation process. For now I compiled with:

cc -c microsimulator/microsim.c -o microsim.o -I./prefix/include/qibo_core_c -L./prefix/lib/x86_64-linux-gnu -lqibo_core_c
cc -c examples/circuit.c -o circuit.o -I./microsimulator -I./prefix/include/qibo_core_c -L./prefix/lib/x86_64-linux-gnu -lqibo_core_c
cc microsim.o circuit.o -o circuit -I./microsimulator -I./prefix/include/qibo_core_c -L./prefix/lib/x86_64-linux-gnu -lqibo_core_c

ran inside the crates/c directory.

The example circuit can be executed and the result agrees with qibo-numpy simulation.

Copy link
Member

@alecandido alecandido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, just for you to know.

However, I will fix everything by myself, no worries. I'd only need some input from you to understand how the simulator is working (mainly just the comment about the control index).

I will do that as well, but an update to the Makefile would be nice as well.

crates/c/microsimulator/microsim.c Outdated Show resolved Hide resolved
for (size_t j = 0; j < nqubits; j++) {
size_t const n = qubits[j];
size_t const k = 1 << n;
i = ((i >> n) << (n + 1)) + (i & (k - 1)) + k;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(i >> n) << n is essentially a floor(i / 2**n) * 2**n (as floats). I.e. you're wiping the rightmost n bits.
So (i >> n) << (n + 1) is a ceil(i / 2**n) * 2**n) (as floats).

+ (i & k-1) is adding back all the bits you removed before (since k - 1 sets to 1 all the n rightmost bits, and then it's used as a mask), and k is setting the n+1th bit in any case (0-based to 1-based conversion).

E.g. i = 1010_1101 and n = 5 results in:

  • (i >> n) << (n + 1) = 0100_0000
  • res + (i & k-1) = 0100_0000 + 0000_1101 = 0100_1101, since k = 0001_1111`
  • res + k = 0100_1101 + 0010_0000 = 0110_1101

The overall effect is that the rightmost n bits are left unchanged, the n+1th bit becomes the nth bit toggled, and the n+2th bit becomes the n+1th bit, and you're losing the last bit, i.e.:

new[:(n+1)] = old[:(n+1)]
new[n+1] = ~old[n]
new[(n+2):] = old[(n+1):-1]

(NumPy notation on an array of booleans).

If this is working, so much the better, but it should be documented. But I'm still puzzled about the meaning...

Copy link
Member Author

@stavros11 stavros11 Jun 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. i = 1010_1101 and n = 5 results in:

  • (i >> n) << (n + 1) = 0100_0000

I believe there is an error here, the correct result would be 10100_0000 (the first 1 is not removed). That would make the final result 10110_1101, which is the original i with a 1 added in its 8-5=3rd position.

An equivalent Python function that is more readable would be

def control_index_py(g, qubits, nqubits):
    n = nqubits - len(qubits)
    bits = [int(b) for b in format(g, f"0{n}b")]
    for q in sorted(nqubits - q - 1 for q in qubits):
        bits.insert(q, 1)
    # convert bits to integer
    return (2 ** np.arange(len(bits) - 1, -1, -1)).dot(bits)

In words, control_index accepts an integer g < 2 ** (nqubits - len(qubits)) and inserts 1s in its binary representation, in the places dictated by qubits. Thus the output is a number < 2 ** nqubits, or if you think in terms of arrays of booleans/bits, the output is len(qubits) longer than the input. Note that qubits here are all the elements involved in the gate (in our case that is arbitrarily many controls + one target).

I was also not sure, so I confirmed this by asserting the result of the two functions is the same for different (random) configurations of qubits.

Code for the check here
import numpy as np

def control_index(g: int, qubits: list[int]):
    i = g
    for n in sorted(qubits):
        k = 1 << n
        i = ((i >> n) << (n + 1)) + (i & (k - 1)) + k
    return i


def control_index_py(g, qubits, nqubits):
    n = nqubits - len(qubits)
    bits = [int(b) for b in format(g, f"0{n}b")]
    for q in sorted(nqubits - q - 1 for q in qubits):
        bits.insert(q, 1)
    # convert bits to integer
    return (2 ** np.arange(len(bits) - 1, -1, -1)).dot(bits)


nqubits = 12
a = np.arange(nqubits)

for ncontrols in range(nqubits - 1):    
    # for various random choices
    for _ in range(100):
        # grab `ncontrols` + 1 target qubits
        np.random.shuffle(a)
        qubits = a[:ncontrols + 1]
        # compare the two functions
        nopen = nqubits - len(qubits)
        for g in range(2 ** nopen):
            result1 = control_index(g, qubits)
            result2 = control_index_py(g, qubits, nqubits)
            assert result1 == result2

Regarding why this is needed: In order to apply a single-qubit (not controlled) gate with matrix G[i, j] to the state |b1 b2 ... bN> (where bi are bits) we need to do the following:

|b1 b2 ... 0 ... bN> --> G[0, 0] |b1 b2 ... 0 ... bN> + G[0, 1] |b1 b2 ... 1 ... bN>
|b1 b2 ... 1 ... bN> --> G[1, 0] |b1 b2 ... 0 ... bN> + G[1, 1] |b1 b2 ... 1 ... bN>

for all 2^(N-1) bitstrings (excluding the target). These are lines 67-68 in the code.

If the gate is controlled, the bitstrings for which ALL controls are 1 follow the above rule, while for all other bistrings we do nothing (since we update the state in place, otherwise we would have to copy). The bitstring for which all controls are 1 are 2^(N - Ncontrols - 1) and the way we generate them is to loop g in range(2 ** (nqubits - ncontrols - 1)) in

for (size_t g = 0; g < nstates; g++) {

and use control_index to insert 1s in all control (and the target) positions. The target is also flipped to 0 in

size_t const i1 = i2 - tk;

to apply the equation above.

If that explanation makes sense, it could be useful to introduce part of it as documentation in qibojit. I agree that this part is not well documented and even the variable names are not good. Historically, we first introduced this in C++ (and also CUDA) for qibotf, then moved to numba (Python) for qibojit. I also kind of invented it myself, without looking what other libraries are doing, so maybe is not even the best implementation, but I guess we kept it because it was working (at least noone complained so far) and performance is acceptable.

Comment on lines +60 to +70
#[no_mangle]
pub extern "C" fn qibo_core_circuit_elements(circuit: &Circuit, gid: usize, ptr: *mut *const usize, len: *mut usize) {
let elements = circuit.elements(gid);
// Complaints about what follows are to be directed to ChatGPT
let boxed_slice = elements.clone().into_boxed_slice();
unsafe {
*ptr = boxed_slice.as_ptr();
*len = elements.len();
}
std::mem::forget(boxed_slice);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you're avoiding returning in this function, resorting to a more complex solution (with mutable inputs, used to effectively return).

I will fix it, since this should be a very simple function.

Copy link
Member Author

@stavros11 stavros11 Jun 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, did not want to avoid returning and I would prefer something simpler. I was just not sure how to return a Rust vector to C, so I used chatGPT (as written in the comment), probably not with the best prompt, to get something working. Same for your next comment (about _free_elements), it came together with this.

If you are planning to fix, great! Otherwise I can also have another look. Note, that C does not need to manipulate the vector of elements, just read the values. I am not sure if that simplifies it in any way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm planning to improve over this.

I'd like to return a structured object (i.e. essentially a smart pointer, like std::vector), possibly defined within the lib.rs itself.
But I'm not sure if I will be able to access its fields from C (I should try).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, and most likely, we should focus on the C++ API rather than the pure C, encapsulating objects in std::array, std::vector and smart pointers. I don't believe we will see interest from collaborators to have a pure C backend soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I perfectly agree.

That's why I was also interested in direct usage of free() (cf. #37 (comment)), to limit the overhead in the C API definition, and postpone the proper memory handling to the C++ library.

I will try to have another go at packaging it properly, with a more advanced build tool, since now we'll need the C++ applications (including examples) to depend on the C++ library, that in turn will depend on the C library (whose dependency on qibo-core main library is handled by Cargo).

crates/c/src/lib.rs Show resolved Hide resolved
#include "microsim.h"


void print_state(complex double *state, const size_t size) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be moved in the qibo_core library (even the main Rust one), start making a state object (that could simply wrap an array of double).

}

#[no_mangle]
pub extern "C" fn qibo_core_circuit_free_elements(ptr: *const usize, len: usize) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a possible way for freeing the elements.

However, once you passed the pointer down to C, you should be able to call free(that_pointer).
Notice that the call does not contain the number of elements (the size). I don't know how the internals of the allocator, so I can just make assumptions about how it works, but I believe it keeps track of the whole block of memory requested just by the address it returned during allocation. I should check this later on.

However, the proposal would be to directly use free() in C, without reimplementing a function to free every pointer, if the returned object is just a dynamic array of non-referencing elements.
For those objects holding references, we should resort to the implementation of a dedicated destructor. This class will mainly consist of objects owning vectors or hash-maps as attributes.

Copy link
Member

@alecandido alecandido Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the time being, the proposal is to consciously leak memory, and fix this consistently at a later stage.

Cf. #38 #39

@alecandido
Copy link
Member

The last batch of comments is mainly for myself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants