ModalAI Forum

    Alibaba's LLM Qwen2-0.5B for VOXL 2

    VOXL Compute & Autopilot
    • Suvasis M
      last edited by

      hi,
      Has anyone tried Qwen2-0.5B for the VOXL 2?

      Here are some requirements and a compatibility assessment:

      RAM Requirements for Qwen2-0.5B

      Base model (FP16): ~1GB of RAM
      Quantized to INT8: ~500MB of RAM
      Quantized to INT4: ~250MB of RAM
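      These figures follow from parameter count times bytes per weight (weights only, ignoring activation and KV-cache overhead); a minimal sketch, assuming ~0.5B parameters:

      ```python
      def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
          """Estimate weight memory in GB: params * (bits / 8) bytes."""
          return num_params * bits_per_weight / 8 / 1e9

      PARAMS = 0.5e9  # Qwen2-0.5B, nominal parameter count

      for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
          print(f"{name}: ~{weight_footprint_gb(PARAMS, bits):.2f} GB")
      # FP16: ~1.00 GB, INT8: ~0.50 GB, INT4: ~0.25 GB
      ```

      Real runtime usage will be higher once activations, the KV cache, and runtime buffers are counted, so treat these as lower bounds.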

      VOXL 2 Specifications

      The VOXL 2 has approximately 4GB of RAM available
      It uses a Qualcomm QRB5165 processor
      Has AI acceleration capabilities via the Hexagon DSP
      Primarily designed for computer vision tasks

      Compatibility Assessment
      Qwen2-0.5B could potentially run on the VOXL 2; with INT4 or INT8 quantization, the memory footprint would be manageable.

      Inference speed would likely be slow, perhaps 1-2 seconds per token.
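      At that rate, end-to-end latency adds up quickly for autoregressive decoding; a back-of-the-envelope check, assuming the 1-2 s/token estimate above and a hypothetical 50-token reply:

      ```python
      def decode_time_s(num_tokens: int, secs_per_token: float) -> float:
          """Total decode time for an autoregressive reply of num_tokens tokens."""
          return num_tokens * secs_per_token

      # A modest 50-token reply at the estimated per-token rates:
      print(decode_time_s(50, 1.0))  # 50.0 s at 1 s/token
      print(decode_time_s(50, 2.0))  # 100.0 s at 2 s/token
      ```

      So even a short answer would take roughly a minute or two, which matters when deciding whether on-board LLM inference is usable for interactive tasks.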

      Question:

      Has ModalAI explored using TensorFlow Lite with the Hexagon delegate to run compact language models like Qwen2-0.5B on the VOXL 2 platform? I'm interested in whether you've experimented with leveraging the Qualcomm GPU for language model inference, even though I understand the VOXL 2 is primarily optimized for computer vision workloads. Have you conducted any experiments or performance testing with small LLMs on this hardware?
      Thanks.
      suvasis
