ModalAI Forum

    Best posts made by dario-pisanti

    • Neural network inference fails on VOXL2 Adreno GPU, but works on CPU, with Qualcomm SDK

      Hi,
      I hope you can help me with the following issue.

      SUMMARY:
      I am interested in running inference of deep neural network models on a VOXL2 using the Qualcomm Neural Processing SDK, hopefully benefiting from the onboard GPU and NPU.
      Specifically, I'm trying to run a pre-trained VGG-16 model from the ONNX framework, following the tutorial at https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/tutorial_onnx.html

      After successfully converting the model from ONNX to DLC format with the Qualcomm SDK, everything works fine when I run inference of the vgg16.dlc model (step 7 of the tutorial) on the VOXL2 CPU by running:

      cd $SNPE_ROOT/examples/Models/VGG/data/cropped
      snpe-net-run --input_list raw_list.txt --container ../../dlc/vgg16.dlc --output_dir ../../output
      

      with the expected output:

      -------------------------------------------------------------------------------
      Model String: N/A
      SNPE v2.15.4.231013125348_62905
      -------------------------------------------------------------------------------
      Processing DNN input(s):
      /opt/qcom/aistack/snpe/2.15.4.231013/examples/Models/VGG/data/cropped/kitten.raw
      Successfully executed!
      
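
      For context, the .raw files consumed and produced by snpe-net-run are plain headerless float32 tensor dumps. A minimal sketch of that I/O, assuming VGG-16's 224x224x3 input shape and a 1000-class output (both assumptions; check the tutorial's preprocessing script for the exact resizing, mean subtraction, and channel order):

      ```python
      import numpy as np

      def write_raw_input(path, image_hwc):
          """Dump a preprocessed HWC float32 image as a headerless .raw file."""
          image_hwc.astype(np.float32).tofile(path)

      def read_raw_output(path, num_classes=1000):
          """Read the headerless float32 logits produced by snpe-net-run."""
          logits = np.fromfile(path, dtype=np.float32)
          assert logits.size == num_classes, "unexpected output size"
          return int(np.argmax(logits))

      # Example with synthetic data (a real pipeline would resize the image
      # and subtract the dataset mean first, per the tutorial script):
      fake_image = np.random.rand(224, 224, 3).astype(np.float32)
      write_raw_input("kitten.raw", fake_image)

      fake_logits = np.zeros(1000, dtype=np.float32)
      fake_logits[281] = 1.0  # pretend class 281 wins
      fake_logits.tofile("Result_0.raw")
      print(read_raw_output("Result_0.raw"))  # -> 281
      ```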

      However, when I enable GPU usage by running:

      snpe-net-run --input_list raw_list.txt --container ../../dlc/vgg16.dlc --output_dir ../../output --use_gpu
      

      I get the following error:

      error_code=201; error_message=Casting of tensor failed. error_code=201; error_message=Casting of tensor failed. Failed to create input tensor: vgg0_dense0_weight_permute for Op: vgg0_dense0_fwd error: 1002; error_component=Dl System; line_no=817; thread_id=547788872288; error_component=Dl System; line_no=277; thread_id=547865747472
      

      In conclusion, why does the same model inference work on the VOXL2 CPU but not on its GPU? In addition: does anyone have experience running deep learning inference on the VOXL2 NPU with the Qualcomm SDKs?
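
      On the NPU side, my understanding from the SNPE documentation is that the Hexagon DSP/HTP runtime generally requires a quantized model first, so the path would look roughly like the following. Treat this as an untested sketch: the tool and flag names are taken from the SDK docs and should be verified against your SDK version.

      ```shell
      # Quantize the float DLC, using the same raw inputs for calibration:
      snpe-dlc-quantize --input_dlc ../../dlc/vgg16.dlc \
                        --input_list raw_list.txt \
                        --output_dlc ../../dlc/vgg16_quantized.dlc

      # Run the quantized model on the DSP/NPU runtime:
      snpe-net-run --input_list raw_list.txt \
                   --container ../../dlc/vgg16_quantized.dlc \
                   --output_dir ../../output \
                   --use_dsp
      ```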

      HOW TO REPRODUCE:
      I successfully set up the Qualcomm Neural Processing SDK on the VOXL2 following the instructions at
      https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/setup.html, using the binaries in $SNPE_ROOT/bin/aarch64-ubuntu-gcc7.5, and I modified $SNPE_ROOT/bin/envsetup.sh accordingly to set up the environment variables.
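
      For reference, the relevant variables in my envsetup.sh look roughly like this (the install path matches the 2.15.4.231013 location used in this post; the exact bin/lib subdirectory names are from my setup and may differ per SDK version):

      ```shell
      export SNPE_ROOT=/opt/qcom/aistack/snpe/2.15.4.231013
      export PATH=$SNPE_ROOT/bin/aarch64-ubuntu-gcc7.5:$PATH
      export LD_LIBRARY_PATH=$SNPE_ROOT/lib/aarch64-ubuntu-gcc7.5:$LD_LIBRARY_PATH
      export PYTHONPATH=$SNPE_ROOT/lib/python:$PYTHONPATH
      ```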

      I followed the instructions from step 1 to step 4 of the VGG tutorial at https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/tutori..., on the VOXL2.

      I converted the VGG ONNX model into the Qualcomm SDK DLC format (step 5) on a host machine running Ubuntu 20.04 with Clang 9 installed, where I set up the Qualcomm Neural Processing SDK using the binaries in $SNPE_ROOT/bin/x86_64-linux-clang (the conversion step is not supported on the VOXL2 architecture).

      I pushed the converted VGG model in DLC format to the VOXL2 and followed the remaining tutorial instructions up to step 7, where I hit the situation reported in the summary above.

      VOXL2 SPECS:
      Architecture: Aarch64
      OS: Ubuntu 18.04
      CPU: Qualcomm® QRB5165: 8 cores up to 3.091 GHz, 8GB LPDDR5
      GPU: Adreno 650 GPU - 1024 ALU
      NPU: 15 TOPS AI embedded Neural Processing Unit
      ONNX PYTHON PACKAGES: onnx==1.14.1, onnxruntime==1.16.1

      HOST SPECS:
      Architecture: x86
      OS: Ubuntu 20.04
      CPU: Intel(R) Xeon(R) W-2125 8 cores @ 4.00GHz
      GPU: NVIDIA Corporation GP106GL [Quadro P2000]
      ONNX PYTHON PACKAGES: onnx==1.14.1, onnxruntime==1.16.1

      FURTHER DETAILS:
      I checked the availability of the GPU runtime on the VOXL2 by executing the snpe-platform-validator tool (shipped with the Qualcomm Neural Processing SDK) from my host machine:

      cd /opt/qcom/aistack/snpe/2.15.4.231013/bin/x86_64-linux-clang 
      python3 snpe-platform-validator-py --runtime="all" --directory=/opt/qcom/aistack/snpe/2.15.4.231013 --buildVariant="aarch64-ubuntu-gcc7.5"
      

      The platform validator results for GPU are:

      Runtime supported: Supported
      Library Prerequisites: Found
      Library Version: Not Queried
      Runtime Core Version: Not Queried
      Unit Test: Passed
      Overall Result: Passed
      
      posted in Software Development