ModalAI Forum

    GPU "hello world"

    • jamesbowman

      Hi,

      I'm trying to write a GPU performance benchmark for a VOXL 2 (voxl-suite 1.3.5).

      So far I've not been able to get a hardware-accelerated OpenGL context.

      Is there a "hello world" example that draws something on the GPU and reads the resulting image?

      This package

      https://moderngl.readthedocs.io/en/latest/techniques/headless_ubuntu_18_server.html

      looked promising, but when I run it I only get a software-rendering context.

      Or is there a similar tiny OpenCL app? (I tried "hellocl.zip", mentioned in another thread, and it doesn't find any hardware; neither does "clinfo".)

      Thanks,
      James.

      • Alex Kushleyev ModalAI Team @jamesbowman

        @jamesbowman ,

        You can use the following example to query the GPU using OpenCL: https://github.com/yell0wd0g/clDeviceQuery/blob/master/clDeviceQuery.cpp

        Download it to the VOXL 2 and build it using

        g++ -O2 clDeviceQuery.cpp -lOpenCL -o opencl-query
        

        and run

        voxl2:~/opencl$ ./opencl-query 
        clDeviceQuery Starting...
        
        1 OpenCL Platforms found
        
         CL_PLATFORM_NAME: 	QUALCOMM Snapdragon(TM)
         CL_PLATFORM_VERSION: 	OpenCL 2.0 QUALCOMM build: commit # changeid # Date: 11/10/21 Wed Local Branch:  Remote Branch: 
        OpenCL Device Info:
        
         1 devices found supporting OpenCL on: QUALCOMM Snapdragon(TM)
        
         ----------------------------------
         Device QUALCOMM Adreno(TM)
         ---------------------------------
          CL_DEVICE_NAME: 			QUALCOMM Adreno(TM)
          CL_DEVICE_VENDOR: 			QUALCOMM
          CL_DRIVER_VERSION: 			OpenCL 2.0 QUALCOMM build: commit # changeid # Date: 11/10/21 Wed Local Branch:  Remote Branch:  Compiler E031.37.12.01
          CL_DEVICE_TYPE:			CL_DEVICE_TYPE_GPU
          CL_DEVICE_MAX_COMPUTE_UNITS:		3
          CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS:	3
          CL_DEVICE_MAX_WORK_ITEM_SIZES:	1024 / 1024 / 1024 
          CL_DEVICE_MAX_WORK_GROUP_SIZE:	1024
          CL_DEVICE_MAX_CLOCK_FREQUENCY:	1 MHz
          CL_DEVICE_ADDRESS_BITS:		64
          CL_DEVICE_MAX_MEM_ALLOC_SIZE:		256 MByte
          CL_DEVICE_GLOBAL_MEM_SIZE:		1024 MByte
          CL_DEVICE_ERROR_CORRECTION_SUPPORT:	no
          CL_DEVICE_LOCAL_MEM_TYPE:		local
          CL_DEVICE_LOCAL_MEM_SIZE:		32 KByte
          CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:	64 KByte
          CL_DEVICE_QUEUE_PROPERTIES:		CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
          CL_DEVICE_QUEUE_PROPERTIES:		CL_QUEUE_PROFILING_ENABLE
          CL_DEVICE_IMAGE_SUPPORT:		1
          CL_DEVICE_MAX_READ_IMAGE_ARGS:	128
          CL_DEVICE_MAX_WRITE_IMAGE_ARGS:	64
        
          CL_DEVICE_IMAGE <dim>			2D_MAX_WIDTH	 16384
        					2D_MAX_HEIGHT	 16384
        					3D_MAX_WIDTH	 16384
        					3D_MAX_HEIGHT	 16384
        					3D_MAX_DEPTH	 2048
          CL_DEVICE_PREFERRED_VECTOR_WIDTH_<t>	CHAR 1, SHORT 1, INT 1, FLOAT 1, DOUBLE 1
        
        
        clDeviceQuery, Platform Name = QUALCOMM Snapdragon(TM), Platform Version = OpenCL 2.0 QUALCOMM build: commit # changeid # Date: 11/10/21 Wed Local Branch:  Remote Branch: , NumDevs = 1, Device = QUALCOMM Adreno(TM)
        
        System Info: 
        
         Local Time/Date =  03:55:01, 11/22/2024
         CPU Name: none
         # of CPU processors: 8
         Linux version 4.19.125 (oe-user@oe-host) (gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04), GNU ld (GNU Binutils for Ubuntu) 2.30) #1 SMP PREEMPT Sat May 18 00:10:25 UTC 2024
        
        
        TEST PASSED
        

        In a similar way, you should be able to find and build standard OpenCL tests.

        Qualcomm provides an OpenCL SDK, which you should be able to download from their website. There are also some examples here: https://github.com/willhua/QualcommOpenCLSDKNote/tree/master/src/examples . They range from basic tests such as matrix manipulation to specially optimized routines for image conversion, convolution, filtering, matching, etc.

        You mentioned drawing, but I don't have good examples for that. You should also be able to use OpenGL; if you really need an OpenGL sample app, I can find one.

        We are working on integrating GPU image processing into our SDK, so this will be coming soon!

        Alex

        • jamesbowman @Alex Kushleyev

          @Alex-Kushleyev - Thanks.

          I downloaded and compiled clDeviceQuery.cpp.
          It gives the same result as "clinfo" and "hellocl.c" - no devices found:

          $ ./opencl-query 
          clDeviceQuery Starting...
          
           Error -1001 in clGetPlatformIDs Call!
          
          
          System Info: 
          
           Local Time/Date =  13:38:42, 11/22/2024
           CPU Name: none
           # of CPU processors: 8
           Linux version 4.19.125 (oe-user@oe-host) (gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04), GNU ld (GNU Binutils for Ubuntu) 2.30) #1 SMP PREEMPT Fri May 17 23:29:23 UTC 2024
          
          
          TEST FAILED !!!
          

          Perhaps I need to load a kernel module to enable OpenCL? This is what's loaded:

          $ lsmod                                                                                
          Module                  Size  Used by
          voxl_platform_mod      16384  0
          voxl_gpio_mod          16384  0
          voxl_fsync_mod         16384  0
          machine_dlkm          159744  0
          wcd938x_slave_dlkm     16384  0
          wcd938x_dlkm          110592  1 machine_dlkm
          wcd9xxx_dlkm           49152  1 wcd938x_dlkm
          mbhc_dlkm              45056  1 wcd938x_dlkm
          tx_macro_dlkm         106496  0
          rx_macro_dlkm         102400  0
          va_macro_dlkm          98304  0
          wsa_macro_dlkm         69632  1 machine_dlkm
          swr_ctrl_dlkm          57344  4 wsa_macro_dlkm,tx_macro_dlkm,rx_macro_dlkm,va_macro_dlkm
          bolero_cdc_dlkm        57344  5 machine_dlkm,wsa_macro_dlkm,tx_macro_dlkm,rx_macro_dlkm,va_macro_dlkm
          wsa881x_dlkm           45056  1 machine_dlkm
          wcd_core_dlkm          32768  7 wsa881x_dlkm,machine_dlkm,wsa_macro_dlkm,tx_macro_dlkm,rx_macro_dlkm,va_macro_dlkm,wcd938x_dlkm
          stub_dlkm              16384  0
          hdmi_dlkm              24576  0
          swr_dlkm               24576  4 wsa881x_dlkm,wcd938x_dlkm,swr_ctrl_dlkm,wcd938x_slave_dlkm
          pinctrl_lpi_dlkm       20480  0
          pinctrl_wcd_dlkm       16384  0
          usf_dlkm               57344  0
          native_dlkm           163840  0
          platform_dlkm        2195456  1 native_dlkm
          q6_dlkm               909312  9 bolero_cdc_dlkm,machine_dlkm,pinctrl_lpi_dlkm,usf_dlkm,va_macro_dlkm,swr_ctrl_dlkm,wcd9xxx_dlkm,native_dlkm,platform_dlkm
          adsp_loader_dlkm       16384  0
          apr_dlkm              229376  4 q6_dlkm,usf_dlkm,adsp_loader_dlkm,platform_dlkm
          snd_event_dlkm         16384  5 bolero_cdc_dlkm,machine_dlkm,q6_dlkm,pinctrl_lpi_dlkm,apr_dlkm
          q6_notifier_dlkm       16384  3 q6_dlkm,pinctrl_lpi_dlkm,apr_dlkm
          q6_pdr_dlkm            16384  1 q6_notifier_dlkm
          88XXau               2342912  0
          8821cu               2465792  0
          8188eu               1200128  0
          

          @Alex-Kushleyev said in GPU "hello world":

          QUALCOMM Adreno

          • Alex Kushleyev ModalAI Team @jamesbowman

            @jamesbowman, this is strange. I have not seen this error before.

            Can you do a quick check right after booting:

            voxl2:~/opencl$ dmesg | grep gsl
            [    1.676429] arm-smmu 3da0000.kgsl-smmu: Linked as a consumer to regulator.72
            [    1.676512] arm-smmu 3da0000.kgsl-smmu: 	non-coherent table walk
            [    1.676520] arm-smmu 3da0000.kgsl-smmu: 	(IDR0.CTTW overridden by FW configuration)
            [    1.676531] arm-smmu 3da0000.kgsl-smmu: 	stream matching with 6 register groups
            [    1.818850] subsys-pil-tz soc:qcom,kgsl-hyp: for a650_zap segments only will be dumped.
            [    1.818883] subsys-pil-tz soc:qcom,kgsl-hyp: for md_a650_zap segments only will be dumped.
            [    1.827612] iommu-debug soc:kgsl_iommu_test_device: Linked as a consumer to 3da0000.kgsl-smmu
            [    1.827666] iommu: Adding device soc:kgsl_iommu_test_device to group 6
            [    1.949719] platform 3d6a000.qcom,gmu:gmu_user: Linked as a consumer to 3da0000.kgsl-smmu
            [    1.950189] platform 3d6a000.qcom,gmu:gmu_kernel: Linked as a consumer to 3da0000.kgsl-smmu
            [    1.950877] kgsl-3d 3d00000.qcom,kgsl-3d0: Linked as a consumer to regulator.72
            [    1.950895] kgsl-3d 3d00000.qcom,kgsl-3d0: Linked as a consumer to regulator.73
            [    1.951118] platform 3da0000.qcom,kgsl-iommu:gfx3d_user: Linked as a consumer to 3da0000.kgsl-smmu
            [    1.951138] iommu: Adding device 3da0000.qcom,kgsl-iommu:gfx3d_user to group 34
            [    1.951327] platform 3da0000.qcom,kgsl-iommu:gfx3d_secure: Linked as a consumer to 3da0000.kgsl-smmu
            [    1.951344] iommu: Adding device 3da0000.qcom,kgsl-iommu:gfx3d_secure to group 35
            [  352.615269] subsys-pil-tz soc:qcom,kgsl-hyp: a650_zap: loading from 0x00000000ede00000 to 0x00000000ede01000
            [  352.621129] subsys-pil-tz soc:qcom,kgsl-hyp: a650_zap: Brought out of reset
            

            kgsl (Kernel Graphics Support Layer) is the kernel driver for the GPU.

            Alex

            • jamesbowman @Alex Kushleyev

              @Alex-Kushleyev

              We recompiled with:

              g++ -O2 clDeviceQuery.cpp -L /usr/lib -l OpenCL -o opencl-query
              

              and now I have an OpenCL device. (fwiw my dmesg output looks very much like yours above.)

              Thanks, J.

              @Alex-Kushleyev said in GPU "hello world":

              dmesg | grep gsl

              • Alex Kushleyev ModalAI Team @jamesbowman

                @jamesbowman, so the only difference in compilation is the added -L /usr/lib, or was there something else? I believe this should be redundant, as that path should already be in the library search path. Hmm.
