ModalAI Forum

Voxl2 Docker (Ubuntu 22) with OpenCL/Adreno

20 Posts 4 Posters 4.5k Views 3 Watching
    eric
    wrote on last edited by
    #1

    Hello, we have a ROS 2 Humble autonomy stack that we build in a CI pipeline and deploy to various platforms, including VOXL2.

    The pipeline selects an appropriate base image for each platform (NVIDIA Jetson, VOXL, etc.). Each base image is a custom Docker image based on ubuntu:jammy with unique layers for platform dependencies.

    We would like to be able to utilize the Adreno GPU within our voxl2 target via OpenCL. I could use some assistance understanding how I might go about this for VOXL2. I am aware of this VOXL1 example: https://gitlab.com/voxl-public/voxl-docker-images/voxl-docker-opencl. Unfortunately, this example uses a prebuilt package for OpenCL to provide Adreno support.

    Is there any documentation that could be shared regarding how this Adreno OpenCL library was built so that we can build an equivalent library in Ubuntu 22 that supports Adreno/VOXL2?

    Any guidance would be greatly appreciated.

    Thank you,
    Eric

      eric
      wrote on last edited by
      #2

      For reference, I've tried downloading the adreno-opencl-sdk-2.0 from Qualcomm, loading its libraries into the Docker container, and running privileged while also adding the devices /dev/dri and /dev/kgsl-3d0. I also added a vendor entry containing the path to Qualcomm's libOpenCL library. When I query platforms using the C++ API, the program just hangs. If I don't set the vendor, it fails with a -1001 error (CL_PLATFORM_NOT_FOUND_KHR).

      Inside the docker, I can build all the sdk examples just fine. I just can't access Adreno.

      Thanks again for any help.
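
For context on the -1001 error: in the Khronos ICD extension that value is CL_PLATFORM_NOT_FOUND_KHR, which an ICD loader returns when it finds no usable vendor entry. The following is a minimal sketch for inspecting vendor entries, assuming the container uses the standard Khronos loader layout (`/etc/OpenCL/vendors/*.icd`, with a library path or name on the first line of each file). The helper name `check_icd_dir` is hypothetical, and whether Qualcomm's libOpenCL.so is meant to be loaded through an ICD loader at all is not established in this thread.

```shell
#!/bin/sh
# Sketch: list each OpenCL vendor (.icd) file in a directory and report whether
# the library it names exists. check_icd_dir is a hypothetical helper name.
check_icd_dir() {
    dir="${1:-/etc/OpenCL/vendors}"
    found=0
    for icd in "$dir"/*.icd; do
        [ -e "$icd" ] || continue          # the glob matched nothing
        found=1
        # The first line of an .icd file is a library path or a bare library name.
        IFS= read -r lib < "$icd" || true
        case "$lib" in
            /*) if [ -e "$lib" ]; then status="ok"; else status="MISSING"; fi ;;
            *)  status="bare name (resolved by the dynamic linker)" ;;
        esac
        echo "$(basename "$icd"): $lib -> $status"
    done
    if [ "$found" -eq 0 ]; then
        echo "no .icd files in $dir"
        return 1
    fi
}
```

Running `check_icd_dir` with no argument inside the container inspects the default vendor directory. A MISSING entry or an empty directory would be consistent with a -1001 result, but it does not explain the hang.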

        eric
        wrote on last edited by
        #3

        @Eric-Katzfey I know it's been a few years, but I see your name all over the voxl-docker-opencl commit history and was wondering if you'd be able to share your thoughts on how I might approach this. Specifically, are there any steps you'd recommend I take to get OpenCL integrated into Docker for VOXL2?

        Thanks,
        Eric

          Eric Katzfey
          ModalAI Team
          wrote on last edited by
          #4

          @eric Yes, I did put that together based on some Qualcomm example code for VOXL. Not really sure how to do something similar on VOXL 2 but I'll ask around the office to see if anyone has some ideas on how to get that going.

            eric
            wrote on last edited by
            #5

            @Eric-Katzfey Really appreciate it! Thanks so much

              Alex Kushleyev
              ModalAI Team
              wrote on last edited by
              #6

              @eric I will try it out, please give me a few days.

              Alex

                Alex Kushleyev
                ModalAI Team
                wrote on last edited by
                #7

                Quick update, I did some testing and tried searching documentation and could not get it to work. There are posts online asking Qualcomm whether this is possible, but there is no response there.

                I know this would be a useful feature and I will try again next week. I want to see how this worked on VOXL1 and perhaps I am missing something.

                In my test, I am also just doing a simple device query. It works on the host VOXL2 but not inside Docker. I tried mapping various .so libraries and devices into the Docker container, with no luck yet.

                Alex

                  eric
                  wrote on last edited by
                  #8

                  @Alex-Kushleyev That aligns pretty well with my own experience so far. Thanks again for looking into this!

                    Alex Kushleyev
                    ModalAI Team
                    wrote on last edited by
                    #9

                    @eric I was able to get the GPU device query working inside an Ubuntu 22.04 Docker container using the following steps. It may be possible to reduce the number of devices and libraries mapped into the container, but I am giving you this information now so you can test. I will try to clean this up a bit later. I have only tried the device query so far, but I figured I would let you know that there is progress.

                    #run docker
                    docker run -it --rm --privileged --device=/dev/kgsl-3d0 --device=/dev/ion -v /proc:/proc -v /firmware/image:/firmware/image -v /lib/firmware:/lib/firmware -v /sys/class:/sys/class -v /sys/bus:/sys/bus -v /sys/devices:/sys/devices -v /data:/data -v /usr/lib/liblog.so.0:/usr/lib/liblog.so.0 -v /usr/lib/libOpenCL.so:/usr/lib/libOpenCL.so -v /usr/lib/libcutils.so.0:/usr/lib/libcutils.so.0 -v /usr/lib/libllvm-qcom.so:/usr/lib/libllvm-qcom.so -v /usr/lib/libion.so.0.0.0:/usr/lib/libion.so.0.0.0 -v /usr/lib/libsync.so.0.0.0:/usr/lib/libsync.so.0.0.0 -v /usr/lib/libgsl.so:/usr/lib/libgsl.so -v /usr/lib/libCB.so:/usr/lib/libCB.so -v /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0.5600.4:/usr/lib/aarch64-linux-gnu/libglib-2.0.so.0.5600.4 -v `pwd`:/opt/code -w /opt/code arm64v8/ubuntu:22.04 bash
                    
                    apt-get update
                    apt install --no-install-recommends -y pocl-opencl-icd
                    

                    Then run your test app to query the device.

                      eric
                      wrote on last edited by
                      #10

                      @Alex-Kushleyev 🙇

                        Alex Kushleyev
                        ModalAI Team
                        wrote on last edited by Alex Kushleyev
                        #11

                        OK, after a little more clean-up, this seems to be the minimal set of libraries/devices needed:

                        docker run -it --rm --privileged \
                        	-v /usr/lib/libOpenCL.so:/usr/lib/libOpenCL.so \
                        	-v /usr/lib/libCB.so:/usr/lib/libCB.so \
                        	-v /usr/lib/libgsl.so:/usr/lib/libgsl.so \
                        	-v /usr/lib/liblog.so.0:/usr/lib/liblog.so.0 \
                        	-v /usr/lib/libcutils.so.0:/usr/lib/libcutils.so.0 \
                        	-v /usr/lib/libsync.so.0.0.0:/usr/lib/libsync.so.0.0.0 \
                        	-v /usr/lib/libion.so.0.0.0:/usr/lib/libion.so.0.0.0 \
                        	-v /usr/lib/libllvm-qcom.so:/usr/lib/libllvm-qcom.so \
                        	-v /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0.5600.4:/usr/lib/aarch64-linux-gnu/libglib-2.0.so.0.5600.4 \
                        	-v `pwd`:/opt/code -w /opt/code \
                        	arm64v8/ubuntu:22.04 bash
                        

                        (--privileged mode maps all the needed devices to the docker container)

                        Then install some more packages (not sure if this can be reduced, not clear exactly what is missing):

                        apt-get update
                        apt install --no-install-recommends -y pocl-opencl-icd
                        

                        Maybe we can figure out which library is still missing so that pocl-opencl-icd does not have to be installed. At least the issue was a missing library, not a missing device mapping.

                        For testing, I used a device query script from here:

                        root@733a6d4d5fdb:/opt/code# ./simple_query 
                        1. Device: QUALCOMM Adreno(TM)
                         1.1 Hardware version: OpenCL 2.0 Adreno(TM) 650
                         1.2 Software version: OpenCL 2.0 QUALCOMM build: commit # changeid # Date: 11/10/21 Wed Local Branch:  Remote Branch:  Compiler E031.37.12.01
                         1.3 OpenCL C version: OpenCL C 2.0 Adreno(TM) 650
                         1.4 Parallel compute units: 3
                        

                        I also verified that a simple matrix multiplication app worked (not provided here).

                        @eric , can you please let me know if this works for you?

                        Alex
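
One way to look for the library that is still missing is to ask the dynamic linker which dependencies it cannot resolve inside the container. This is a minimal sketch, assuming `ldd` and `awk` are available in the container; `filter_not_found` and `missing_deps` are hypothetical helper names, not part of any ModalAI or Qualcomm tooling.

```shell
#!/bin/sh
# Sketch: report shared-library dependencies that the dynamic linker cannot
# resolve. Both function names are hypothetical helpers.

# Read `ldd` output on stdin and print the name of every unresolved library.
filter_not_found() {
    awk '/=> not found/ { print $1 }'
}

# Run ldd on each given library and print "<library>: <missing dependency>".
missing_deps() {
    for lib in "$@"; do
        ldd "$lib" 2>/dev/null | filter_not_found | while IFS= read -r dep; do
            echo "$lib: $dep"
        done
    done
}
```

For example, `missing_deps /usr/lib/libOpenCL.so /usr/lib/libgsl.so /usr/lib/libCB.so` run inside the container lists anything unresolved for those three files. Note that `ldd` only sees link-time dependencies; a library that the driver loads at runtime with `dlopen` will not show up, so an empty result does not prove that nothing is missing.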

                          eric
                          wrote on last edited by
                          #12

                          @Alex-Kushleyev

                          OMG IT WORKS!!

                          I was able to extract all these libraries from the host and install them directly inside the Docker image, and now the pocl-opencl-icd installation isn't needed.

                          This is really important for us, since it allows us to build external dependencies that rely on OpenCL in our pipeline directly without bind mounts (outside the host environment).

                          Really, really appreciate all your help! 

                          FROM arm64v8/ubuntu:22.04
                          
                          # Install necessary dependencies
                          RUN apt-get update && \
                              apt-get install -y \
                              cmake \
                              build-essential \
                              libglib2.0-0
                          
                          # Copy Adreno GPU dependencies
                          # - libcutils0_0-r1_arm64.deb
                          # - libsync_1.0-r1_arm64.deb
                          # - qti-libion_0-r1_arm64.deb
                          # - liblog0_1.0-r1_arm64.deb
                          # - qti-adreno_1.0-r0_arm64.deb
                          COPY dep /root/dep
                          
                          # Create required directory for qti-adreno install
                          RUN mkdir -p /usr/include/KHR && dpkg -i /root/dep/*.deb
                          
                          # Copy and build test script
                          COPY ./hellocl /root/hellocl
                          RUN cd /root/hellocl && mkdir build && cd build && cmake .. && make
                          
                          CMD ["bash"]
                          
                          voxl2:~/opencl$ docker run -it --rm --privileged opencl:latest ./root/hellocl/build/hellocl
                          Platform Information:
                          Platform Name: QUALCOMM Snapdragon(TM)
                          Platform Vendor: QUALCOMM
                          Platform Version: OpenCL 2.0 QUALCOMM build: commit # changeid # Date: 11/10/21 Wed Local Branch:  Remote Branch: 
                          Platform Profile: FULL_PROFILE
                          Platform Extensions:  
                          ------------------------------------
                          Device Information:
                          Device Name: QUALCOMM Adreno(TM)
                          Device Vendor: QUALCOMM
                          Driver Version: OpenCL 2.0 QUALCOMM build: commit # changeid # Date: 11/10/21 Wed Local Branch:  Remote Branch:  Compiler E031.37.12.01
                          Device Version: OpenCL 2.0 Adreno(TM) 650
                          Device OpenCL C Version: OpenCL C 2.0 Adreno(TM) 650
                          Device Max Compute Units: 3
                          This should be three: 3
                          
                            Alex Kushleyev
                            ModalAI Team
                            wrote on last edited by
                            #13

                            hi @eric ,

                            Nice! very clean.

                            Did you use dpkg-repack to create debs of installed packages, such as:

                            apt-get install dpkg-repack
                            dpkg-repack qti-adreno
                            

                            Cool trick!

                            I will test this out and add to our docs.

                            Alex

                              eric
                              wrote on last edited by eric
                              #14

                              @Alex-Kushleyev Yes. I used dpkg -S <file path> to figure out which debs installed which libraries (e.g., dpkg -S /usr/lib/libOpenCL.so), apt-cache show to see the source (Ubuntu PPA vs. ModalAI), and then dpkg-repack to repack the ModalAI debs.

                              Thanks again!
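
The lookup-then-repack steps described above could be scripted roughly as follows. This is a sketch under assumptions: it runs on the VOXL2 host, each path is owned by exactly one package, and `pkg_from_dpkg_line` and `repack_owners` are hypothetical helper names. Only `dpkg -S` and `dpkg-repack` come from the thread.

```shell
#!/bin/sh
# Sketch of the workflow above: find the package that owns each library, then
# repack it into a .deb. Function names are hypothetical helpers.

# Extract the package name from one line of `dpkg -S` output, which has the
# form "<package>: <path>" or "<package>:<arch>: <path>".
pkg_from_dpkg_line() {
    pkg="${1%%: *}"      # drop ": <path>"
    echo "${pkg%%:*}"    # drop an optional ":<arch>" qualifier
}

# For each library path, look up its owner and repack it into the current
# directory. Needs dpkg-repack (apt-get install dpkg-repack) and may need
# root or fakeroot.
repack_owners() {
    for lib in "$@"; do
        line=$(dpkg -S "$lib") || { echo "no package owns $lib" >&2; continue; }
        pkg=$(pkg_from_dpkg_line "$line")
        echo "$lib -> $pkg"
        dpkg-repack "$pkg"
    done
}
```

A call such as `repack_owners /usr/lib/libOpenCL.so /usr/lib/libgsl.so` would write the resulting .deb files to the working directory. The same package is repacked again if several paths share an owner, and `apt-cache show <package>` can still be run by hand to check where each package came from before copying it into an image.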

                                Alex Kushleyev
                                ModalAI Team
                                wrote on last edited by
                                #15

                                @eric, thanks again for your input on this. I have posted a complete tutorial on how to enable OpenCL in Docker on VOXL2: https://docs.modalai.com/voxl-2-opencl-in-docker/

                                Alex

                                  eric
                                  wrote on last edited by
                                  #16

                                  @Alex-Kushleyev Awesome! Thanks again for all your help with this!

                                    Peter Milani
                                    wrote on last edited by
                                    #17

                                    @Alex-Kushleyev @eric I've implemented your solution and get the same result.

                                    I did get a bit confused, as running clinfo only returned a single device of type CPU, without the name "Adreno".

                                    However, I added a query on the device type to your test script and it returned GPU, so I guess it's only finding the GPU. I would have expected it to return a few more devices, as the Qualcomm OpenCL guide (https://docs.qualcomm.com/bundle/publicresource/80-NB295-11_REV_C_Qualcomm_Snapdragon_Mobile_Platform_Opencl_General_Programming_and_Optimization.pdf) suggests that the DSP and CPU could have been returned as well, so I'm not sure what is happening there. I didn't have to link devices; I only shared the volumes for the relevant libraries. I would have expected the CPU to be returned as a matter of course, as that is what happens with the Intel implementation.

                                    My additional lines in the script (given for info) are:

                                      cl_device_type device_type;
                                      clGetDeviceInfo(devices[j], CL_DEVICE_TYPE, sizeof(cl_device_type), &device_type, NULL);
                                      printf("Device type: ");
                                      if (device_type & CL_DEVICE_TYPE_CPU)
                                          printf("CPU ");
                                      if (device_type & CL_DEVICE_TYPE_GPU)
                                          printf("GPU ");
                                      if (device_type & CL_DEVICE_TYPE_ACCELERATOR)
                                          printf("ACCELERATOR ");
                                      if (device_type & CL_DEVICE_TYPE_DEFAULT)
                                          printf("DEFAULT ");
                                      printf("\n");
                                    
                                    

                                    Which returns

                                    OpenCL platform count: 1
                                    OpenCL device count: 1
                                    1. Device: QUALCOMM Adreno(TM)
                                     1.1 Hardware version: OpenCL 2.0 Adreno(TM) 650
                                    Device type: GPU 
                                     1.2 Software version: OpenCL 2.0 QUALCOMM build: commit # changeid # Date: 11/10/21 Wed Local Branch:  Remote Branch:  Compiler E031.37.12.01
                                     1.3 OpenCL C version: OpenCL C 2.0 Adreno(TM) 650
                                     1.4 Parallel compute units: 3
                                    
                                    
                                    Alex KushleyevA 1 Reply Last reply
                                    0
                                    • Peter MilaniP Peter Milani

                                      @Alex-Kushleyev @eric I've implemented your solution and get the same result.

                                      I did get a bit confused as running clinfo only returned a single device of type CPU and without the name "Adreno".

                                      However I added to your test script a query on the device_type and it returned GPU so I guess its only finding the GPU. I would have expected it to return a few more devices as the [Qualcomm OpenCL guide] (https://docs.qualcomm.com/bundle/publicresource/80-NB295-11_REV_C_Qualcomm_Snapdragon_Mobile_Platform_Opencl_General_Programming_and_Optimization.pdf) suggests that the dsp and CPU could have been returned as well, so I'm not sure what is happening there. I didn't have to link devices only shared the volumes to the relevant libraries. I would have expected the CPU to be returned a a matter of course as that is what happens with the intel implementation.

                                      My additional lines to the script (given for info is):

                                        cl_device_type device_type;
                                        clGetDeviceInfo(devices[j], CL_DEVICE_TYPE, sizeof(cl_device_type), &device_type, NULL);
                                        printf("Device type: ");
                                        if (device_type & CL_DEVICE_TYPE_CPU)
                                            printf("CPU ");
                                        if (device_type & CL_DEVICE_TYPE_GPU)
                                            printf("GPU ");
                                        if (device_type & CL_DEVICE_TYPE_ACCELERATOR)
                                            printf("ACCELERATOR ");
                                        if (device_type & CL_DEVICE_TYPE_DEFAULT)
                                            printf("DEFAULT ");
                                        printf("\n");
                                      
                                      

                                      Which returns

                                      OpenCL platform count: 1
                                      OpenCL device count: 1
                                      1. Device: QUALCOMM Adreno(TM)
                                       1.1 Hardware version: OpenCL 2.0 Adreno(TM) 650
                                      Device type: GPU 
                                       1.2 Software version: OpenCL 2.0 QUALCOMM build: commit # changeid # Date: 11/10/21 Wed Local Branch:  Remote Branch:  Compiler E031.37.12.01
                                       1.3 OpenCL C version: OpenCL C 2.0 Adreno(TM) 650
                                       1.4 Parallel compute units: 3
                                      
                                      
                                      Alex Kushleyev
                                      ModalAI Team
                                      wrote on last edited by
                                      #18

                                      @Peter-Milani, it looks like the Qualcomm CPU device is not supported by the OpenCL library from Qualcomm.

                                      clinfo may be confused. Installing and running clinfo natively on VOXL2 does not return any platforms; the OpenCL libraries that apt may install are most likely not compatible with the VOXL2 GPU.

                                      Alex
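
One way to investigate why clinfo reports no platforms is to check what the ICD loader is able to discover. The sketch below assumes the usual Linux ICD convention, in which vendor drivers are registered as `*.icd` files under `/etc/OpenCL/vendors`, each naming the driver library to load. Whether the Qualcomm library on VOXL2 is registered this way is an assumption to verify, not something confirmed in this thread, and `list_icd_files` is a hypothetical helper name.

```shell
#!/bin/sh
# List the OpenCL ICD registrations found in a vendors directory.
# Each *.icd file normally holds one line: the driver library to load.
list_icd_files() {
    dir="${1:-/etc/OpenCL/vendors}"
    found=0
    for f in "$dir"/*.icd; do
        # When nothing matches, the glob stays literal; skip it.
        [ -f "$f" ] || continue
        found=1
        printf '%s -> %s\n' "$(basename "$f")" "$(head -n 1 "$f")"
    done
    if [ "$found" -eq 0 ]; then
        echo "no ICD files in $dir"
    fi
}

# Default location used by the common ICD loaders on Linux.
list_icd_files /etc/OpenCL/vendors
```

If no entry points at the vendor library, an ICD-based tool such as clinfo would have no way to find it, even when a program linked directly against that library can still enumerate the GPU.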

                                        Peter Milani
                                        wrote on last edited by
                                        #19

                                        @Alex-Kushleyev I was able to get the following when running clinfo within the Docker container:

                                         clinfo
                                        Number of platforms                               1
                                          Platform Name                                   Portable Computing Language
                                          Platform Vendor                                 The pocl project
                                          Platform Version                                OpenCL 1.2 pocl 1.4, None+Asserts, LLVM 9.0.1, RELOC, SLEEF, POCL_DEBUG
                                          Platform Profile                                FULL_PROFILE
                                          Platform Extensions                             cl_khr_icd
                                          Platform Extensions function suffix             POCL
                                        
                                          Platform Name                                   Portable Computing Language
                                        Number of devices                                 1
                                          Device Name                                     pthread-0x805
                                          Device Vendor                                   Qualcomm
                                          Device Vendor ID                                0x13b5
                                          Device Version                                  OpenCL 1.2 pocl HSTR: pthread-aarch64-unknown-linux-gnu-GENERIC
                                          Driver Version                                  1.4
                                          Device OpenCL C Version                         OpenCL C 1.2 pocl
                                          Device Type                                     CPU
                                          Device Profile                                  FULL_PROFILE
                                          Device Available                                Yes
                                          Compiler Available                              Yes
                                          Linker Available                                Yes
                                          Max compute units                               8
                                          Max clock frequency                             1804MHz
                                          Device Partition                                (core)
                                            Max number of sub-devices                     8
                                            Supported partition types                     equally, by counts
                                            Supported affinity domains                    (n/a)
                                          Max work item dimensions                        3
                                          Max work item sizes                             4096x4096x4096
                                          Max work group size                             4096
                                          Preferred work group size multiple              8
                                          Preferred / native vector sizes                 
                                            char                                                16 / 16      
                                            short                                                8 / 8       
                                            int                                                  4 / 4       
                                            long                                                 2 / 2       
                                            half                                                 0 / 0        (n/a)
                                            float                                                4 / 4       
                                            double                                               2 / 2        (cl_khr_fp64)
                                          Half-precision Floating-point support           (n/a)
                                          Single-precision Floating-point support         (core)
                                            Denormals                                     No
                                            Infinity and NANs                             Yes
                                            Round to nearest                              Yes
                                            Round to zero                                 No
                                            Round to infinity                             No
                                            IEEE754-2008 fused multiply-add               No
                                            Support is emulated in software               No
                                            Correctly-rounded divide and sqrt operations  No
                                          Double-precision Floating-point support         (cl_khr_fp64)
                                            Denormals                                     Yes
                                            Infinity and NANs                             Yes
                                            Round to nearest                              Yes
                                            Round to zero                                 Yes
                                            Round to infinity                             Yes
                                            IEEE754-2008 fused multiply-add               Yes
                                            Support is emulated in software               No
                                          Address bits                                    64, Little-Endian
                                          Global memory size                              5896568832 (5.492GiB)
                                          Error Correction support                        No
                                          Max memory allocation                           2147483648 (2GiB)
                                          Unified memory for Host and Device              Yes
                                          Minimum alignment for any data type             128 bytes
                                          Alignment of base address                       1024 bits (128 bytes)
                                          Global Memory cache type                        None
                                          Image support                                   Yes
                                            Max number of samplers per kernel             16
                                            Max size for 1D images from buffer            134217728 pixels
                                            Max 1D or 2D image array size                 2048 images
                                            Max 2D image size                             8192x8192 pixels
                                            Max 3D image size                             2048x2048x2048 pixels
                                            Max number of read image args                 128
                                            Max number of write image args                128
                                          Local memory type                               Global
                                          Local memory size                               33554432 (32MiB)
                                          Max number of constant args                     8
                                          Max constant buffer size                        33554432 (32MiB)
                                          Max size of kernel argument                     1024
                                          Queue properties                                
                                            Out-of-order execution                        Yes
                                            Profiling                                     Yes
                                          Prefer user sync for interop                    Yes
                                          Profiling timer resolution                      1ns
                                          Execution capabilities                          
                                            Run OpenCL kernels                            Yes
                                            Run native kernels                            Yes
                                          printf() buffer size                            16777216 (16MiB)
                                          Built-in kernels                                (n/a)
                                          Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_fp64
                                        
                                        NULL platform behavior
                                          clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Portable Computing Language
                                          clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [POCL]
                                          clCreateContext(NULL, ...) [default]            Success [POCL]
                                          clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
                                            Platform Name                                 Portable Computing Language
                                            Device Name                                   pthread-0x805
                                          clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  Success (1)
                                            Platform Name                                 Portable Computing Language
                                            Device Name                                   pthread-0x805
                                          clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
                                          clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
                                          clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
                                          clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
                                            Platform Name                                 Portable Computing Language
                                            Device Name                                   pthread-0x805
                                        
                                        ICD loader properties
                                          ICD loader Name                                 OpenCL ICD Loader
                                          ICD loader Vendor                               OCL Icd free software
                                          ICD loader Version                              2.2.11
                                          ICD loader Profile                              OpenCL 2.1
                                        
                                        

                                        but only after installing:

                                        apt install -y -qq pocl-opencl-icd;
                                        
                                          Alex Kushleyev
                                          ModalAI Team
                                          wrote on last edited by
                                          #20

                                          @Peter-Milani, I see. This looks like a generic implementation of OpenCL for ARM from a third party (not Qualcomm), and I think it also overwrites the proprietary OpenCL libraries, disabling GPU OpenCL support. However, you could make two separate Docker images, one for each use case (CPU and GPU).

                                          Alex
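
To check whether an apt-installed package is shadowing the proprietary library, it can help to list every `libOpenCL.so*` visible inside the container and see which real file each one resolves to. This is only a sketch: `find_opencl_libs` is a hypothetical helper, and the default search directories are assumptions, since the thread does not say where the Qualcomm libraries are mounted.

```shell
#!/bin/sh
# Print every libOpenCL.so* under the given directories together with
# the real file it resolves to, so duplicate copies are easy to spot.
find_opencl_libs() {
    for dir in "$@"; do
        # Silently skip directories that do not exist in this image.
        [ -d "$dir" ] || continue
        find "$dir" -name 'libOpenCL.so*' 2>/dev/null | sort | while read -r lib; do
            printf '%s => %s\n' "$lib" "$(readlink -f "$lib")"
        done
    done
}

# Assumed search locations; adjust to wherever the libraries are mounted.
find_opencl_libs /usr/lib /usr/lib64 /usr/local/lib
```

If both a distribution-provided loader and the vendor library show up, the copy that comes first on the dynamic linker's search path is the one an application will load, which would be consistent with the GPU disappearing after the pocl install.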
