From ProventusNova DeveloperWiki
Revision as of 18:32, 26 November 2025 by Nico (talk | contribs) (→‎Set up)

How to run NVIDIA NanoOWL tutorial

Keywords: NVIDIA Jetson, NanoOWL, object detection

Description

This wiki page shows how to run the "tree prediction with a live camera" tutorial from NVIDIA NanoOWL. NanoOWL is a project that optimizes OWL-ViT to run in real time on NVIDIA Jetson Orin platforms with NVIDIA TensorRT. NanoOWL also introduces a "tree detection" pipeline that combines OWL-ViT and CLIP to enable nested detection and classification of anything, at any level, simply by providing text.
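A tree prompt is plain text in which square brackets express nesting. For example, "[a face [an eye, a nose]]" first detects faces, then detects eyes and noses inside each detected face. As a rough sketch of that nesting idea only (an illustrative toy parser, not NanoOWL's actual implementation), such a prompt can be turned into a nested structure:

```python
# Illustrative sketch only: parse a NanoOWL-style tree prompt such as
# "[a face [an eye, a nose]]" into a nested label structure. This is a
# toy parser to show the nesting concept, not NanoOWL's own code.
def parse_tree_prompt(prompt: str):
    pos = 0

    def parse_list():
        # Parses "[item, item, ...]" where an item may carry its own "[...]".
        nonlocal pos
        assert prompt[pos] == "["
        pos += 1
        items, label = [], ""
        while pos < len(prompt):
            ch = prompt[pos]
            if ch == "[":
                # Nested bracket: the accumulated label owns the sub-list.
                items.append({"label": label.strip(), "children": parse_list()})
                label = ""
            elif ch in ",]":
                if label.strip():
                    items.append({"label": label.strip(), "children": []})
                label = ""
                pos += 1
                if ch == "]":
                    return items
            else:
                label += ch
                pos += 1
        return items

    return parse_list()

print(parse_tree_prompt("[a face [an eye, a nose]]"))
```

Running this on "[a face [an eye, a nose]]" yields one top-level node with two children, mirroring how the tree demo classifies parts inside each detected object.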

Set up

Minimal requirements:

  • One of the following Jetson platforms:
    • Jetson Orin NX (16 GB)
    • Jetson Orin Nano (8 GB)
    • Jetson AGX Orin (32 GB or 64 GB)
  • Running one of the following JetPack versions:
    • JetPack 5 (L4T r35.x)
    • JetPack 6 (L4T r36.x)

This tutorial was tested with the following setup:

  • x86_64-based host machine running Ubuntu 24.04
  • Jetson Orin NX (16 GB) + CTI Boson carrier board running Jetpack 6.2.1 (L4T 36.4.4).
  • One Framos IMX464 camera
  • Ethernet cable

1. Connect to the board via ssh:

 
ssh <user>@<board_ip>

Tutorial

1. Clone and setup jetson-containers

 
git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh

2. Run the following command to pull or build a compatible container image:

 
jetson-containers run --workdir /opt/nanoowl $(autotag nanoowl)

After running the above command, you should be in the working directory (/opt/nanoowl) inside the container.

3. Verify you have a camera connected:

 
ls /dev/video*
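The same check can be done from Python with only the standard library. A small sketch (the helper name is ours, not part of NanoOWL):

```python
import glob

def list_video_devices():
    """Return the V4L2 capture nodes visible to the container,
    e.g. ['/dev/video0']; an empty list means no camera was detected."""
    return sorted(glob.glob("/dev/video*"))

print(list_video_devices())
```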

4. Install the missing aiohttp module inside the container:

 
pip install aiohttp

Note: If this takes too long or fails, try the following command:

 
pip install aiohttp --index-url https://pypi.org/simple --prefer-binary

5. Run the tree_demo.py example:

 
cd examples/tree_demo
python3 tree_demo.py --camera 0 --resolution 640x480 ../../data/owl_image_encoder_patch32.engine
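For reference, the --resolution value is a single WIDTHxHEIGHT string. A minimal sketch of how such a value splits into integers (a hypothetical helper for illustration, not code from tree_demo.py):

```python
def parse_resolution(value: str) -> tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string such as '640x480' into (640, 480)."""
    width, height = value.lower().split("x")
    return int(width), int(height)

print(parse_resolution("640x480"))  # (640, 480)
```

Pick a resolution your camera actually supports; mismatched caps are a common cause of capture failures.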

After the application starts, you should be able to view the live detection output in your browser.

Adaptations

  • If you are using a camera that does not support the GStreamer v4l2src element—for example, a CSI-connected camera—you may encounter an error similar to the following:
 
[ WARN:0@15.685] global cap_gstreamer.cpp:2829 handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src0 reported: Internal data stream error.

To enable frame capture from such a camera, modify the detection_loop method as follows:

 
        # Delete or comment out the three lines below
        # camera = cv2.VideoCapture(CAMERA_DEVICE)
        # camera.set(cv2.CAP_PROP_FRAME_WIDTH, width)
        # camera.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

        # Open the camera through a GStreamer pipeline using nvarguscamerasrc
        gst_pipeline = (
            f"nvarguscamerasrc ! "
            f"video/x-raw(memory:NVMM), width={width}, height={height}, format=NV12, framerate=30/1 ! "
            f"nvvidconv ! video/x-raw, format=BGRx ! videoconvert ! "
            f"appsink"
        )

        camera = cv2.VideoCapture(gst_pipeline, cv2.CAP_GSTREAMER)
        camera.set(cv2.CAP_PROP_BUFFERSIZE, 1)
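If you prefer to keep detection_loop tidy, the same pipeline string can come from a small helper function. This is a sketch under the same assumptions as above (nvarguscamerasrc CSI capture, NV12 frames at 30 fps); the helper name is ours:

```python
def build_argus_pipeline(width: int, height: int, fps: int = 30) -> str:
    """Build a GStreamer pipeline string for a CSI camera via nvarguscamerasrc,
    converting NV12 frames to BGRx and then system memory so OpenCV can read them."""
    return (
        f"nvarguscamerasrc ! "
        f"video/x-raw(memory:NVMM), width={width}, height={height}, "
        f"format=NV12, framerate={fps}/1 ! "
        f"nvvidconv ! video/x-raw, format=BGRx ! videoconvert ! "
        f"appsink"
    )

# Usage inside detection_loop (requires OpenCV built with GStreamer support):
# camera = cv2.VideoCapture(build_argus_pipeline(width, height), cv2.CAP_GSTREAMER)
```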