ModalAI Forum

    Neural network inference fails on VOXL2 Adreno GPU, but works on CPU, with Qualcomm SDK

    Software Development
    • dario-pisanti

      Hi,
      I hope you could help me with the following issue.

      SUMMARY:
      I am interested in running deep neural network inference on a VOXL2 using the Qualcomm Neural Processing SDK, ideally taking advantage of the onboard GPU and NPU.
      Specifically, I'm trying to run a pre-trained VGG-16 model from the ONNX framework, following the tutorial at https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/tutorial_onnx.html

      After successfully converting the model from ONNX to DLC format with the Qualcomm SDK, everything works fine when I run inference with the vgg16.dlc model (step 7 of the tutorial) on the VOXL2 CPU by running:

      cd $SNPE_ROOT/examples/Models/VGG/data/cropped
      snpe-net-run --input_list raw_list.txt --container ../../dlc/vgg16.dlc --output_dir ../../output
      

      with the expected output:

      -------------------------------------------------------------------------------
      Model String: N/A
      SNPE v2.15.4.231013125348_62905
      -------------------------------------------------------------------------------
      Processing DNN input(s):
      /opt/qcom/aistack/snpe/2.15.4.231013/examples/Models/VGG/data/cropped/kitten.raw
      Successfully executed!
      

      However, when I enable GPU usage by running:

      snpe-net-run --input_list raw_list.txt --container ../../dlc/vgg16.dlc --output_dir ../../output --use_gpu
      

      I get the following error:

      error_code=201; error_message=Casting of tensor failed. error_code=201; error_message=Casting of tensor failed. Failed to create input tensor: vgg0_dense0_weight_permute for Op: vgg0_dense0_fwd error: 1002; error_component=Dl System; line_no=817; thread_id=547788872288; error_component=Dl System; line_no=277; thread_id=547865747472
      

      In conclusion, why does the same model inference work on the VOXL2 CPU but not on its GPU? In addition, does anyone have experience running deep learning inference on the VOXL2 NPU with the Qualcomm SDKs?

      HOW TO REPRODUCE:
      I successfully set up the Qualcomm Neural Processing SDK on the VOXL2 following the instructions at
      https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/setup.html, using the binaries in $SNPE_ROOT/bin/aarch64-ubuntu-gcc7.5, and I modified $SNPE_ROOT/bin/envsetup.sh accordingly so that the environment variables are set up correctly; a rough sketch of that setup is shown below.
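
      For reference, the environment variables I export look roughly like this (a minimal sketch assuming the default install location from the setup guide; your paths may differ):

      # Sketch of the environment setup on the VOXL2 (aarch64-ubuntu-gcc7.5 build variant).
      # The install prefix below is the default from the setup guide; adjust to your installation.
      export SNPE_ROOT=/opt/qcom/aistack/snpe/2.15.4.231013
      export PATH=$SNPE_ROOT/bin/aarch64-ubuntu-gcc7.5:$PATH
      export LD_LIBRARY_PATH=$SNPE_ROOT/lib/aarch64-ubuntu-gcc7.5:$LD_LIBRARY_PATH
      export PYTHONPATH=$SNPE_ROOT/lib/python:$PYTHONPATH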

      I followed the instructions from step 1 to step 4 of the VGG tutorial at https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/tutori..., on the VOXL2.

      I converted the VGG ONNX model into the Qualcomm SDK DLC format (step 5) on a host machine running Ubuntu 20.04 with Clang 9 installed, where I set up the Qualcomm Neural Processing SDK using the binaries in $SNPE_ROOT/bin/x86_64-linux-clang (the conversion operation is not supported on the VOXL2 architecture); the conversion command looks roughly like the sketch below.
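
      For reference, the conversion on the host used the tutorial's snpe-onnx-to-dlc tool, roughly as follows (the input/output paths are from my local layout and may differ for you):

      # On the x86 host: convert the pre-trained VGG-16 ONNX model to DLC.
      cd $SNPE_ROOT/examples/Models/VGG
      snpe-onnx-to-dlc --input_network onnx/vgg16.onnx --output_path dlc/vgg16.dlc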

      I pushed the converted VGG model in DLC format to the VOXL2 (e.g. over adb, as sketched below) and followed the remaining instructions of the tutorial up to step 7, where I got the behaviour reported in the summary above.
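
      A minimal sketch of the transfer step, assuming adb and the same SDK install path on the VOXL2 (the destination directory here is an assumption based on my layout):

      # Copy the converted model from the host to the VOXL2 over adb.
      adb push vgg16.dlc /opt/qcom/aistack/snpe/2.15.4.231013/examples/Models/VGG/dlc/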

      VOXL2 SPECS:
      Architecture: Aarch64
      OS: Ubuntu 18.04
      CPU: Qualcomm® QRB5165: 8 cores up to 3.091 GHz, 8GB LPDDR5
      GPU: Adreno 650 GPU - 1024 ALU
      NPU: 15 TOPS AI embedded Neural Processing Unit
      ONNX PYTHON PACKAGES: onnx==1.14.1, onnxruntime==1.16.1

      HOST SPECS:
      Architecture: x86
      OS: Ubuntu 20.04
      CPU: Intel(R) Xeon(R) W-2125 8 cores @ 4.00GHz
      GPU: NVIDIA Corporation GP106GL [Quadro P2000]
      ONNX PYTHON PACKAGES: onnx==1.14.1, onnxruntime==1.16.1

      FURTHER DETAILS:
      I checked the availability of the GPU runtime on the VOXL2 by executing the snpe-platform-validator tool (available with the Qualcomm Neural Processing SDK) from my host machine:

      cd /opt/qcom/aistack/snpe/2.15.4.231013/bin/x86_64-linux-clang 
      python3 snpe-platform-validator-py --runtime="all" --directory=/opt/qcom/aistack/snpe/2.15.4.231013 --buildVariant="aarch64-ubuntu-gcc7.5"
      

      The platform validator results for GPU are:

      Runtime supported: Supported
      Library Prerequisites: Found
      Library Version: Not Queried
      Runtime Core Version: Not Queried
      Unit Test: Passed
      Overall Result: Passed
      
      • Moderator (ModalAI Team) @dario-pisanti

        @dario-pisanti Our efforts have focused on voxl-tflite-server. voxl-tflite-server can take advantage of the CPU, GPU and NPU depending on the model.
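
        One way to try it on the VOXL2, assuming the voxl-tflite-server package is installed and runs as a systemd service of the same name (the service name and commands here are a sketch, not verified on every SDK image):

        # Enable and start the tflite server, then follow its log output.
        systemctl enable voxl-tflite-server
        systemctl start voxl-tflite-server
        journalctl -u voxl-tflite-server -f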

        • Manu Bhardwaj 0 @dario-pisanti

          @dario-pisanti Hi Dario,

          Thanks for sharing your issue. I'm also working with the Qualcomm Neural Processing SDK on VOXL2 Mini.

          The error you mentioned could be due to:

          • Model compatibility issues during conversion.
          • SDK configuration, especially for the GPU.

          Double-check the GPU-specific settings in $SNPE_ROOT/bin/envsetup.sh and try running a simpler model to verify the GPU setup (a sketch follows below).
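
          For a quick GPU sanity check, something like the following with a smaller, already-converted model should help isolate the issue (the model name and paths here are illustrative, not from the VGG tutorial):

          # Run a smaller DLC model on the GPU runtime to check the setup in isolation.
          snpe-net-run --input_list raw_list.txt --container inception_v3.dlc --output_dir output_gpu --use_gpu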

          Have you managed to resolve this, or do you have any additional details? I'm also trying to use the QNN SDK with ONNX runtime on VOXL2 Mini.

          Best,
          Manu
