# DeviceAtlas Device API and GPU/CPU interaction #

The DeviceAtlas API can bridge between the CPU and GPU, allowing significantly more hardware resources to be brought to bear on a problem.

### Technologies supported ###

#### OpenACC/OpenMP ####

OpenACC has the most support from the DeviceAtlas API's perspective, especially since it does not require an additional language, only compiler directives. The GCC toolchain has the most complete support, for both NVIDIA and AMD GPUs.

##### Requirements #####

- A compiler which supports OpenACC (GCC 7 minimum; GCC 9 or later recommended).
- One of the offload components for the related GPU (these are not mutually exclusive, since both an NVIDIA and an AMD device can be present in the same hardware).

###### Ubuntu/Debian ######

```shell
# Install gcc-offload-amdgcn and/or gcc-offload-nvptx, depending on the GPU(s) present.
% sudo apt-get install gcc gcc-offload-amdgcn gcc-offload-nvptx
```

##### Example #####

```c
#include <dac.h>

int main(void) {
    da_status_t s = da_init();
    ...
    /* Copy the atlas and the lookup inputs to the device for the loop below. */
#pragma acc data copyin(atlas, uid, ua)
#pragma acc parallel loop
    for (size_t i = 0; i < nbuas; i++) {
        da_device_t device;
        s = da_search(&atlas, &device, uid, ua, 0);
        ...
        da_close(&device);
    }
    da_fini();
    ...
    return 0;
}
```

##### Limitations #####

- This feature is under development.
- It is not yet possible to select a specific device to initialise, notably when more than one GPU is present.

#### NVIDIA CUDA/AMD HIP ####

NVIDIA CUDA and, to a lesser extent, AMD HIP require the use of an additional language, though this is easier with some familiarity with the C++ programming language. In addition, an understanding of the actual GPU hardware (memory management, device transfer bandwidth, data throughput capacity and so on) matters more in this case.

##### Requirements #####

- An nvcc or clang compiler for CUDA (version 10.x or later recommended).
- A clang compiler for HIP (version 10.x or later recommended).

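
The remark above about transfer bandwidth and throughput can be made concrete with a back-of-the-envelope estimate. The sketch below is plain C and is not part of the DeviceAtlas API: `batch_bytes`, `transfer_seconds` and `offload_pays_off` are hypothetical helpers, and every bandwidth, latency and per-lookup figure passed to them is an assumption to be replaced with measurements from the actual hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; none of this is DeviceAtlas API. */

/* Bytes needed to ship `nlookups` lookups of `bytes_per_lookup` bytes each. */
size_t batch_bytes(size_t nlookups, size_t bytes_per_lookup)
{
    return nlookups * bytes_per_lookup;
}

/* Estimated seconds to move `bytes` over a host-to-device link that sustains
 * `bytes_per_s`, plus a fixed per-transfer latency. */
double transfer_seconds(size_t bytes, double bytes_per_s, double latency_s)
{
    assert(bytes_per_s > 0.0);
    return latency_s + (double)bytes / bytes_per_s;
}

/* True when one transfer plus the GPU work is estimated to be faster than
 * doing the same number of lookups on the CPU. */
bool offload_pays_off(size_t nlookups, size_t bytes_per_lookup,
                      double bytes_per_s, double latency_s,
                      double cpu_s_per_lookup, double gpu_s_per_lookup)
{
    double cpu = (double)nlookups * cpu_s_per_lookup;
    double gpu = transfer_seconds(batch_bytes(nlookups, bytes_per_lookup),
                                  bytes_per_s, latency_s)
               + (double)nlookups * gpu_s_per_lookup;
    return gpu < cpu;
}
```

With assumed figures such as a 10 microsecond transfer latency, a single lookup is dominated by that fixed cost, whereas a large batch amortises it. Under this simple model, the batch size is the first parameter worth measuring on the target hardware.
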
###### Ubuntu/Debian ######

```shell
% sudo apt-get install nvcc nvidia-driver-bin
```

##### Example #####

```cpp
#include <memory>
#include <dagpu.h>

/// Note that HIP vocabulary has been translated to CUDA for easier porting.
/// The HIP API, however, can still be used if preferred.
auto main() -> int {
    da_status_t s = da_device_init();
    ...
    size_t ndetect = 1024;
    // The unique_ptr owns the array; ev is the raw pointer handed to the API.
    std::unique_ptr<da_evidencearr_t[]> evbuf(new da_evidencearr_t[ndetect]);
    da_evidencearr_t *ev = evbuf.get();
    ...
    /// Optional: measure the time spent on the GPU.
    float diff;
    da_device_record_start();
    ...
    // Prepare the lookup data before dispatching it.
    ev[0].arr[0].key = uid;
    ev[0].arr[0].value = ua;
    ev[0].arr[1].key = da_atlas_clientprop_evidence_id(&atlas);
    ev[0].arr[1].value = cs;
    ...
    ev[1].arr[0].key = uid;
    ev[1].arr[0].value = ua;
    ev[1].arr[1].key = chid;
    ev[1].arr[1].value = ch;
    ev[1].arr[2].key = fromid;
    ev[1].arr[2].value = from;
    ...
    da_device_searchv(&atlas, &ev, ndetect);
    ...
    da_device_record_stop(&diff);
    ...
    da_device_close(&ev, ndetect);
    ...
    da_device_fini();
    ...
    return 0;
}
```

##### Limitations #####

- This feature is under development.
- The shared memory feature, although available, is not yet fully used for storing long-lived data.

#### Apple Metal ####

Using the LLVM-backed Apple Swift and Metal technologies allows native interaction with both the CPU and GPU in a coherent and seamless manner.

##### Requirements #####

- Xcode and Swift installed (an Xcode project is generated to build from).

##### Example #####

```c
#include <stdlib.h>
#include <dac.h>

int main(void) {
    da_status_t s = da_device_init();
    ...
    size_t ndetect = 1024;
    da_evidencearr_t *ev = calloc(ndetect, sizeof(da_evidencearr_t));
    ...
    // Prepare the lookup data before dispatching it.
    ev[0].arr[0].key = uid;
    ev[0].arr[0].value = ua;
    ev[0].arr[1].key = da_atlas_clientprop_evidence_id(&atlas);
    ev[0].arr[1].value = cs;
    ...
    ev[1].arr[0].key = uid;
    ev[1].arr[0].value = ua;
    ev[1].arr[1].key = chid;
    ev[1].arr[1].value = ch;
    ev[1].arr[2].key = fromid;
    ev[1].arr[2].value = from;
    ...
    da_device_searchv(&atlas, &ev, ndetect);
    ...
    da_device_close(&ev, ndetect);
    free(ev);
    ...
    da_device_fini();
    ...
    return 0;
}
```

##### Limitations #####

- This feature is still under development.
- The shared memory feature, though available, is not yet used.
- Variadic calls are not supported yet.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

_Copyright (c) DeviceAtlas Limited 2025. All Rights Reserved._

_https://deviceatlas.com_