How to run the NVIDIA NanoOWL tutorial
Keywords: NVIDIA Jetson, NanoOWL, object detection
Description
This wiki page shows how to run the "tree prediction with a live camera" tutorial from NVIDIA NanoOWL. NanoOWL is a project that optimizes OWL-ViT to run in real time on NVIDIA Jetson Orin platforms with NVIDIA TensorRT. NanoOWL also introduces a new "tree detection" pipeline that combines OWL-ViT and CLIP to enable nested detection and classification of anything, at any level, simply by providing text.
Set up
- One of the following Jetson modules:
  - Jetson Orin NX (16 GB)
  - Jetson Orin Nano (8 GB)
  - Jetson AGX Orin (32 GB or 64 GB)
- Running one of the following JetPack versions:
  - JetPack 5 (L4T r35.x)
  - JetPack 6 (L4T r36.x)
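The JetPack requirement above can be checked from a script. The following is a minimal sketch, assuming the L4T release is recorded in /etc/nv_tegra_release with a first line of the form "# R35 (release), REVISION: 4.1, ...". The function names parse_l4t_release and jetpack_for are hypothetical helpers introduced for illustration, and the mapping covers only the two versions listed above.

```python
import re
from pathlib import Path

# Mapping taken from the list above: L4T r35.x -> JetPack 5, r36.x -> JetPack 6.
SUPPORTED_L4T = {35: "JetPack 5", 36: "JetPack 6"}


def parse_l4t_release(text):
    """Return (major, revision) parsed from release-file text, or None.

    Assumes a line such as "# R35 (release), REVISION: 4.1, ...".
    """
    match = re.search(r"#\s*R(\d+)\s*\(release\),\s*REVISION:\s*([\d.]+)", text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def jetpack_for(major):
    """Return the JetPack name for an L4T major version, or None if unlisted."""
    return SUPPORTED_L4T.get(major)


if __name__ == "__main__":
    release_file = Path("/etc/nv_tegra_release")  # assumed location on the Jetson
    if not release_file.exists():
        print(f"{release_file} not found; is this a Jetson device?")
    else:
        parsed = parse_l4t_release(release_file.read_text())
        if parsed is None:
            print("Could not parse the L4T release string")
        else:
            major, revision = parsed
            name = jetpack_for(major) or "a JetPack version not listed in this tutorial"
            print(f"L4T r{major}.{revision} -> {name}")
```

Run it on the Jetson host (outside the container); if the file layout differs on your image, the script reports that it could not parse the release string rather than guessing.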
Tutorial
1. Clone and set up jetson-containers:
git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh
2. Run the following to pull or build a compatible container image.
jetson-containers run --workdir /opt/nanoowl $(autotag nanoowl)
After running the command above, you should be in the working directory (/opt/nanoowl) inside the container.
3. Verify you have a camera connected:
ls /dev/video*
4. Install the missing module inside the container:
pip install aiohttp
Note: If this takes too long or fails, try the following command:
pip install aiohttp --index-url https://pypi.org/simple --prefer-binary
5. Run the tree_demo.py example.
cd examples/tree_demo
python3 tree_demo.py --camera 0 --resolution 640x480 ../../data/owl_image_encoder_patch32.engine
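Before launching the demo, the checks implied by steps 3 and 5 can be combined into a small preflight script. This is only a sketch: find_video_devices and preflight are hypothetical helper names, and the engine path is the one passed to tree_demo.py in step 5, so it resolves correctly only when the script is run from examples/tree_demo.

```python
import glob
import os


def find_video_devices(dev_dir="/dev"):
    """Return sorted paths matching <dev_dir>/video*, like `ls /dev/video*`."""
    # Sorting is lexicographic, so video10 would be listed before video2.
    return sorted(glob.glob(os.path.join(dev_dir, "video*")))


def preflight(engine_path, dev_dir="/dev"):
    """Return a list of human-readable problems; the list is empty if none are found."""
    problems = []
    if not find_video_devices(dev_dir):
        problems.append(f"no video devices found under {dev_dir}")
    if not os.path.isfile(engine_path):
        problems.append(f"engine file not found: {engine_path}")
    return problems


if __name__ == "__main__":
    # Engine path from step 5, relative to examples/tree_demo.
    issues = preflight("../../data/owl_image_encoder_patch32.engine")
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Camera and engine file found")
```

A missing camera or engine file is reported as a warning instead of surfacing later as an error inside the demo.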
Adaptations
- If your camera does not work with the GStreamer v4l2src element (for example, a CSI-connected camera), you may encounter an error similar to the following:
[ WARN:0@15.685] global cap_gstreamer.cpp:2829 handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src0 reported: Internal data stream error.
To enable frame capture from such a camera, modify the detection_loop method as follows:
# Delete or comment out the following three lines
# camera = cv2.VideoCapture(CAMERA_DEVICE)
# camera.set(cv2.CAP_PROP_FRAME_WIDTH, width)
# camera.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
# Add the following GStreamer pipeline, which captures frames with nvarguscamerasrc
gst_pipeline = (
    "nvarguscamerasrc ! "
    f"video/x-raw(memory:NVMM), width={width}, height={height}, format=NV12, framerate=30/1 ! "
    "nvvidconv ! video/x-raw, format=BGRx ! videoconvert ! "
    "appsink"
)
camera = cv2.VideoCapture(gst_pipeline, cv2.CAP_GSTREAMER)
camera.set(cv2.CAP_PROP_BUFFERSIZE, 1)
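If you need to switch between resolutions, frame rates, or CSI ports, it may help to build the pipeline string in a helper. The sketch below is one way to do that, not part of NanoOWL: parse_resolution and build_csi_pipeline are hypothetical names, the framerate and sensor_id parameters are additions for illustration, and sensor_id=0 is assumed to select the first CSI camera. The sensor must support the requested mode.

```python
def parse_resolution(text):
    """Split a string such as "640x480" into (width, height) integers."""
    width, height = (int(part) for part in text.lower().split("x"))
    return width, height


def build_csi_pipeline(width, height, framerate=30, sensor_id=0):
    """Build a GStreamer pipeline string for a CSI camera via nvarguscamerasrc.

    Same elements as the pipeline above; sensor-id and framerate are
    parameterized here as an assumption about what you may need to vary.
    """
    return (
        f"nvarguscamerasrc sensor-id={sensor_id} ! "
        f"video/x-raw(memory:NVMM), width={width}, height={height}, "
        f"format=NV12, framerate={framerate}/1 ! "
        "nvvidconv ! video/x-raw, format=BGRx ! videoconvert ! "
        "appsink"
    )


if __name__ == "__main__":
    width, height = parse_resolution("640x480")
    print(build_csi_pipeline(width, height))
    # Inside detection_loop, the string would then be used as:
    #   camera = cv2.VideoCapture(build_csi_pipeline(width, height), cv2.CAP_GSTREAMER)
```

Keeping the string construction separate from cv2.VideoCapture lets you print and inspect the pipeline before opening the camera.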
