Synchronization

Kernel calls are not guaranteed to have completed by the time they return, so we have to synchronize the device with the host.

device.finish();
occaDeviceFinish(device);
device.finish()
call occaDeviceFinish(device)
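
To see why the synchronization step matters, here is a minimal toy model in standard C++. It is not OCCA code: `ToyDevice` and `toyAddVectors` are hypothetical names, and the toy simply defers queued work until `finish()` is called, whereas a real device runs kernels concurrently with the host. The point it illustrates is the same: the host should not rely on results before synchronizing.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical toy model of a device (an assumption, not OCCA's design):
// launch() only queues the work, finish() runs everything still pending.
class ToyDevice {
 public:
  void launch(std::function<void()> kernel) {
    pending_.push(std::move(kernel));
  }

  void finish() {
    while (!pending_.empty()) {
      pending_.front()();
      pending_.pop();
    }
  }

 private:
  std::queue<std::function<void()>> pending_;
};

// Queues a "kernel" that sets every entry of b to 1, optionally
// synchronizes, and returns what the host sees in b at that point.
std::vector<int> toyAddVectors(int entries, bool synchronize) {
  std::vector<int> b(entries, 0);
  ToyDevice device;
  device.launch([&b]() {
    for (int& value : b) {
      value = 1;
    }
  });
  if (synchronize) {
    device.finish();  // plays the role of device.finish() in the text
  }
  return b;
}
```

Calling `toyAddVectors(4, true)` returns a vector of ones, while `toyAddVectors(4, false)` returns the untouched zeros because the queued work never ran.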

Alternatively, we can use blocking (synchronous) copies, which guarantee that all device execution has finished before the copy returns.

occa::memcpy(b, o_b);
occaCopyMemToPtr(b, o_b, occaAllBytes, 0, occaDefault);
occa.memcpy(b, o_b)
call occaCopyMemToPtr(b, o_b, occaAllBytes, 0, occaDefault)

Now we can finally test our results by looking at our host data in b.

for (int i = 0; i < entries; ++i) {
  if (b[i] != 1) {
    std::cerr << "addVectors failed\n";
  }
}
for (int i = 0; i < entries; ++i) {
  if (b[i] != 1) {
    fprintf(stderr, "addVectors failed\n");
  }
}
if not np.array_equal(b, [1] * entries):
  raise Exception('add_vectors failed')
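
The loops above print one message per failing entry. As a variation, the check can be wrapped in a helper that counts mismatches so the caller reports once. This is a minimal sketch in standard C++: `countMismatches` and its `tolerance` parameter are hypothetical and not part of OCCA, and it assumes the host data has been copied into a `std::vector<float>`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OCCA function): returns how many entries of
// `values` differ from `expected` by more than `tolerance`.
// A tolerance of 0 reproduces the exact comparison used in the text.
std::size_t countMismatches(const std::vector<float>& values,
                            float expected,
                            float tolerance = 0.0f) {
  std::size_t mismatches = 0;
  for (float value : values) {
    // Written as a negated <= so that NaN entries count as mismatches.
    if (!(std::fabs(value - expected) <= tolerance)) {
      ++mismatches;
    }
  }
  return mismatches;
}
```

A caller would then report a failure once, for example when `countMismatches(b, 1.0f)` is nonzero, instead of printing a line for every bad entry.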