From ProventusNova DeveloperWiki

Debug NVIDIA Jetson nvv4l2av1enc VPR Violation (EMEM decode error)

Issue description

When NVIDIA's AV1 encoder GStreamer element, nvv4l2av1enc, is given a non-standard resolution such as 2712x1538, the encoder may attempt unauthorized accesses to protected memory. The kernel log then reports EMEM address decode errors (EMEM decode error), Route Sanity errors, and VPR violations.

Replicate issue

To replicate the issue, follow the steps below.

To watch kernel messages as they arrive, run:

dmesg -wH

On another console run the following GStreamer pipeline:

gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=2712, height=1538, framerate=90/1' ! nvvidconv ! 'video/x-raw(memory:NVMM), width=2712, height=1538, framerate=90/1' ! queue ! 'video/x-raw(memory:NVMM), width=2712, height=1538, framerate=90/1' ! nvv4l2av1enc name=encoder1 bitrate=22000000 maxperf-enable=1 insert-seq-hdr=true idrinterval=15 iframeinterval=256 ! fakesink

Important: The issue does not occur every time. To trigger the error, stop and rerun the pipeline repeatedly until dmesg shows a log like the following:

arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0013, cbfrsynra=0x835, cb=2
[  +0.000462] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0003, cbfrsynra=0x835, cb=2
[  +0.000436] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0003, cbfrsynra=0x835, cb=2
[  +0.000423] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0013, cbfrsynra=0x835, cb=2
[  +0.000406] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0013, cbfrsynra=0x835, cb=2
[  +0.000407] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0003, cbfrsynra=0x835, cb=2
[  +0.000400] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0003, cbfrsynra=0x835, cb=2
[  +0.000396] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0003, cbfrsynra=0x835, cb=2
[  +0.000865] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0003, cbfrsynra=0x835, cb=2
[  +0.001541] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0x3ffded8000, fsynr=0x6f0013, cbfrsynra=0x835, cb=2
[  +0.008255] tegra-mc 2c00000.memory-controller: nvencsrd: secure read @0x000000ffffffff00: EMEM address decode error (EMEM decode error)
[  +0.009824] tegra-mc 2c00000.memory-controller: nvencswr: secure write @0x00000003ffffff00: VPR violation ((null))
[  +0.009101] tegra-mc 2c00000.memory-controller: nvencswr: secure write @0x00000003ffffff00: Route Sanity error ((null))
[  +0.016422] tegra-mc 2c00000.memory-controller: nvencsrd: secure read @0x000000ffffffff00: EMEM address decode error (EMEM decode error)
[  +0.004142] tegra-mc 2c00000.memory-controller: nvencswr: secure write @0x00000003ffffff00: VPR violation ((null))
[  +0.009097] tegra-mc 2c00000.memory-controller: nvencswr: secure write @0x00000003ffffff00: Route Sanity error ((null))
[  +0.019064] tegra-mc 2c00000.memory-controller: unknown: secure read @0x000000ffffffff00: EMEM address decode error (EMEM decode error)
[  +0.001501] tegra-mc 2c00000.memory-controller: nvencswr: secure write @0x00000003ffffff00: VPR violation ((null))
[  +0.009187] tegra-mc 2c00000.memory-controller: nvencswr: secure write @0x00000003ffffff00: Route Sanity error ((null))
[  +0.015548] tegra-mc 2c00000.memory-controller: nvencsrd: secure read @0x000000ffffffff00: EMEM address decode error (EMEM decode error)
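Since the fault is intermittent, the stop-and-rerun procedure can be scripted. The following is a minimal sketch, not a tested tool: `filter_mem_faults` and `reproduce` are hypothetical helper names, the pattern only covers the messages shown in the log above, and the script assumes `sudo`, GNU `timeout`, and a `dmesg` that supports `-C` (clear the ring buffer).

```shell
#!/bin/sh
# Messages seen in the log above (SMMU context faults and tegra-mc errors).
FAULT_PATTERN='Unhandled context fault|EMEM address decode error|VPR violation|Route Sanity error'

# Read kernel log text on stdin and print only the matching fault lines.
# Exit status is 0 when at least one line matched.
filter_mem_faults() {
    grep -E "$FAULT_PATTERN"
}

# Hypothetical helper: reproduce MAX_TRIES RUN_SECS COMMAND [ARGS...]
# Runs COMMAND for RUN_SECS seconds, interrupts it, and checks dmesg for faults.
reproduce() {
    max_tries=$1
    run_secs=$2
    shift 2
    i=1
    while [ "$i" -le "$max_tries" ]; do
        sudo dmesg -C                      # start each attempt with an empty kernel log
        timeout -s INT "$run_secs" "$@"    # run the pipeline, then send SIGINT
        if sudo dmesg | filter_mem_faults; then
            echo "fault reproduced on attempt $i"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no fault after $max_tries attempts"
    return 1
}
```

To use it, pass the pipeline from the previous step as the command, for example `reproduce 20 10 gst-launch-1.0 videotestsrc is-live=true ! ...` with the caps strings quoted as before. For live monitoring, `dmesg -wH | filter_mem_faults` narrows the output to the relevant lines.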

What may be causing this issue?

NVIDIA's nvv4l2av1enc element uses the NVENC hardware unit on Jetson platforms for hardware-accelerated encoding. NVENC has strict constraints on resolution alignment, maximum resolution, frame rate, and memory. If any of these constraints is not met, the encoder fails silently: the problem is reported only in the kernel logs, not in GStreamer.

In our case, the non-standard 2712x1538 resolution appears to violate the resolution alignment constraint of the hardware, which causes the unauthorized accesses to protected memory (EMEM address decode error, Route Sanity error, and VPR violation). NVIDIA does not explicitly document the required alignment, but extensive testing led us to conclude that the width must be aligned to 64 pixels, that is, the width must be divisible by 64. In our case:

2712 % 64 = 24 --> Not aligned
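The same check can be run before launching a pipeline. This is a small sketch in POSIX shell; `is_aligned` is a hypothetical helper name, and the value 64 is the alignment we concluded from testing, not a figure documented by NVIDIA.

```shell
#!/bin/sh
# Succeeds (exit status 0) when WIDTH is a multiple of ALIGNMENT.
# Usage: is_aligned WIDTH ALIGNMENT
is_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# Compare the problematic width against two common widths.
for width in 2712 1920 3840; do
    if is_aligned "$width" 64; then
        echo "$width: aligned"
    else
        echo "$width: not aligned (remainder $(( width % 64 )))"
    fi
done
```

Only 2712 is reported as not aligned, with the remainder of 24 computed above.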

Possible Solutions

Solution 1: Use an image resolution that aligns with 64 pixels

  • Use a lower resolution close to the original desired resolution

In this case, the closest lower resolution would be 2688x1526. Its width respects the 64-pixel alignment required by NVENC.

2688 % 64 = 0 --> Aligned
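The lower aligned width can be derived by rounding down to a multiple of 64. The sketch below covers the width only (`align_down` is a hypothetical helper name); the height is chosen separately.

```shell
#!/bin/sh
# Round VALUE down to the nearest multiple of ALIGNMENT.
# Shell integer division truncates, so dividing and multiplying back drops the remainder.
# Usage: align_down VALUE ALIGNMENT
align_down() {
    echo $(( $1 / $2 * $2 ))
}

align_down 2712 64   # prints 2688
```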

Pipeline example:

gst-launch-1.0 videotestsrc is-live=1 ! 'video/x-raw, width=2688, height=1526, framerate=90/1' ! nvvidconv ! queue ! nvv4l2av1enc name=encoder1 bitrate=22000000 maxperf-enable=1 insert-seq-hdr=true idrinterval=15 iframeinterval=256 ! fakesink

  • Use original resolution for capturing and add padding with black pixels

This approach keeps the original capture resolution and pads it to a width that satisfies the 64-pixel alignment required by the NVENC hardware unit, which avoids the VPR violation and EMEM address decode errors. To do so, increase the width to the next number divisible by 64. In this case, that number is 2752.

2752 % 64 = 0 --> Aligned

The capture and padding flow would be the following:

Camera capture @ 2712x1538 --> Add horizontal padding of 40 pixels (2752 - 2712) --> AV1 encoder

This approach satisfies the 64-pixel width alignment of the NVENC hardware unit, which avoids the issue.
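One way to prototype the padding step is the GStreamer videobox element, which adds a border when given a negative value. The sketch below is an assumption on our side and has not been validated on the target: videobox runs on the CPU and may not sustain 90 fps at this resolution, and `align_up` and `padded_pipeline` are hypothetical helper names. The script only prints the command so that it can be reviewed before running.

```shell
#!/bin/sh
# Round VALUE up to the nearest multiple of ALIGNMENT.
# Usage: align_up VALUE ALIGNMENT
align_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

SRC_W=2712
SRC_H=1538
PAD_W=$(align_up "$SRC_W" 64)   # 2752
PAD=$(( PAD_W - SRC_W ))        # 40

# Print a gst-launch-1.0 command that pads the right edge with black pixels
# in system memory before the frame is handed to nvvidconv and the encoder.
padded_pipeline() {
    echo "gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=$SRC_W, height=$SRC_H, framerate=90/1' ! videobox right=-$PAD fill=black ! 'video/x-raw, width=$PAD_W, height=$SRC_H' ! nvvidconv ! 'video/x-raw(memory:NVMM)' ! queue ! nvv4l2av1enc bitrate=22000000 maxperf-enable=1 insert-seq-hdr=true idrinterval=15 iframeinterval=256 ! fakesink"
}

padded_pipeline
```

Note that the encoded stream is 2752 pixels wide, so the 40-pixel black column remains visible after decoding unless the receiver crops it.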

Experimental section

This section explores ideas for solving the encoder width alignment issue.

Modify TEGRA_WIDTH_ALIGNMENT in the tegra_camera_core.h file

tegra_camera_core.h is a header file for the NVIDIA Tegra camera core driver. It includes constants, enums, and structures used by the Tegra video input device driver in the Linux kernel. This file contains the following line:

/* Width alignment */
#define TEGRA_WIDTH_ALIGNMENT	1

  • Proposal: Modify TEGRA_WIDTH_ALIGNMENT to 64 so that the encoder may process non-standard image resolutions.
  • Verdict: Not possible.

TEGRA_WIDTH_ALIGNMENT sets the required alignment for the image width, in this case 1 pixel. This means the width does not need to be aligned to any larger boundary, which essentially lets the camera core driver capture at any resolution. Our proposal was to change it to 64. The effect would be that the camera driver can only capture when the requested width is divisible by 64. Hence, the 2712x1538 resolution would not work, since 2712 is not divisible by 64.

  • Conclusion: Modifying TEGRA_WIDTH_ALIGNMENT only affects the camera driver, not the encoder, so it does not change how the encoder behaves when it receives a width that is not aligned to 64 pixels. It would instead break the pipeline at the capture stage rather than at the encoding stage.