SPDX-FileCopyrightText: Copyright (c) 2021-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: LicenseRef-NvidiaProprietary

NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
property and proprietary rights in and to this material, related
documentation and any modifications thereto. Any use, reproduction,
disclosure or distribution of this material and related documentation
without an express license agreement from NVIDIA CORPORATION or
its affiliates is strictly prohibited.

NvSciStream Event Loop Driven Sample App - README

---

# nvscistream_event_sample - NvSciStream Sample App

## Description

This directory contains an NvSciStream sample application that
supports a variety of use cases, using an event-loop driven model.
Once the stream is fully connected, all further setup and streaming
operations are triggered by events, processed either by a single
NvSciEvent-driven thread or separate threads which wait for events
on each block. The former is the preferred approach for implementing
NvSciStream applications. In addition to those events which NvSci
itself generates, any other event which can be bound to an NvSciEvent
can be added to the event loop. This allows for robust applications
which can handle events regardless of the order in which they occur.

To use this sample for writing your own applications:

* See main.c for examples of how to do top level application setup and
  how to select the blocks needed for your use case and connect them
  all together.
* See the descriptions in the usecase*.h files to determine which use cases
  involve the producer and consumer engines that you are interested in.
* See the appropriate block_*.c files for examples of creating the
  necessary blocks and handling the events that they encounter.
  See the block_producer_*.c and block_consumer_*.c files for examples of how
  to map the relevant engines to and from NvSci.
* See the appropriate event_loop_*.c file for your chosen event handling
  method.

## Build the application

The NvSciStream sample includes source code and a Makefile.
Navigate to the sample application directory to build the application:

    make clean
    make

## Examples of how to run the sample application:

* NOTE:
  * Inter-process and inter-chip test cases must be run with sudo.
  * NvMedia/CUDA stream (use case 2) of the sample application is not
    supported on x86 and Jetson Linux devices.
  * Inter-chip use cases are not supported on Jetson Linux devices.
  * Update the NvIpc/PCIe endpoint accordingly.

Single-process, single-consumer CUDA/CUDA stream that uses the default event
service:

    ./nvscistream_event_sample

Single-process, single-consumer stream that uses the threaded event handling:

    ./nvscistream_event_sample -e t

Single-process NvMedia/CUDA stream with yuv format:

    ./nvscistream_event_sample -u 2 -s y

Single-process NvMedia/CUDA stream with three consumers, and the second uses
the mailbox mode:

    ./nvscistream_event_sample -u 2 -m 3 -q 1 m

Multi-process CUDA/CUDA stream with three consumers, one in the same
process as the producer, and the other two in separate processes. The
first and the third consumers use the mailbox mode:

    ./nvscistream_event_sample -m 3 -p -c 0 -q 0 m &
    ./nvscistream_event_sample -c 1 -c 2 -q 2 m

Multi-process CUDA/CUDA stream with three consumers, one in the same
process as the producer, and the other two in separate processes.
To simulate the case with a less trusted consumer, one of the consumer
processes is set with lower priority. A limiter block is used to restrict
this consumer to hold at most one packet. The total number of packets is
increased to five.

Linux example:

    ./nvscistream_event_sample -m 3 -f 5 -p -c 0 -l 2 1 &
    ./nvscistream_event_sample -c 1 &
    nice -n 19 ./nvscistream_event_sample -c 2 &
    # Makes the third process as nice as possible.

QNX example:

    ./nvscistream_event_sample -m 3 -f 5 -p -c 0 -l 2 1 &
    ./nvscistream_event_sample -c 1 &
    nice -n 1 ./nvscistream_event_sample -c 2 &
    # Reduces the priority level of the third process by 1.

Multi-process CUDA/CUDA stream with two consumers, one in the same
process as the producer, and the other in a separate process. Both
processes enable the endpoint information option:

    ./nvscistream_event_sample -m 2 -p -c 0 -i &
    ./nvscistream_event_sample -c 1 -i

Multi-process CUDA/CUDA stream with extra validation steps for the ASIL-D
process (not supported on x86 or Jetson Linux devices):

    ./nvscistream_event_sample -u 3 -p &
    ./nvscistream_event_sample -u 3 -c 0

Multi-process CUDA/CUDA stream using an external event service to handle
internal I/O messages across the process boundary:

    ./nvscistream_event_sample -p -E &
    ./nvscistream_event_sample -c 0 -E

Multi-process CUDA/CUDA stream with one consumer on another SoC.
The consumer has the FIFO queue attached to the C2C IpcSrc block, and
a three-packet pool attached to the C2C IpcDst block. It uses IPC channel
nvscic2c_pcie_s0_c5_1 <-> nvscic2c_pcie_s0_c6_1 for C2C communication.

    ./nvscistream_event_sample -P 0 nvscic2c_pcie_s0_c5_1 -Q 0 f
    # Run the command below on another OS running on the peer SoC.
    ./nvscistream_event_sample -C 0 nvscic2c_pcie_s0_c6_1 -F 0 3

Multi-process CUDA/CUDA stream with four consumers, one in the same
process as the producer, one in another process but in the same OS as the
producer, and two in another process on another OS running in a peer SoC.
The third and fourth consumers have a mailbox queue attached to the C2C
IpcSrc block, and a five-packet pool attached to the C2C IpcDst block.
The third consumer uses nvscic2c_pcie_s0_c5_1 <-> nvscic2c_pcie_s0_c6_1 for
C2C communication. The fourth consumer uses
nvscic2c_pcie_s0_c5_2 <-> nvscic2c_pcie_s0_c6_2 for C2C communication.

    ./nvscistream_event_sample -m 4 -c 0 -q 0 m -Q 2 m -Q 3 m -P 2 nvscic2c_pcie_s0_c5_1 -P 3 nvscic2c_pcie_s0_c5_2 &
    ./nvscistream_event_sample -c 1 -q 1 m
    # Run the command below on another OS running on the peer SoC.
    ./nvscistream_event_sample -C 2 nvscic2c_pcie_s0_c6_1 -q 2 f -F 2 5 -C 3 nvscic2c_pcie_s0_c6_2 -q 3 m -F 3 5

### Example commands for the inter-process late-attach use case

Multi-process CUDA/CUDA stream with one early consumer and one late-attached
consumer. The producer and early consumer processes are configured to stream
100000 frames, whereas the late-attached consumer process is configured to
receive 10000 frames.

    # Run the commands below to launch the producer and early consumer processes.
    ./nvscistream_event_sample -m 2 -r 1 -p &
    ./nvscistream_event_sample -c 0 -k 0 100000 &
    # Run the command below after some delay to launch the late-attached consumer process.
    sleep 1; # This 1s delay lets the producer and consumer enter the streaming phase.
    ./nvscistream_event_sample -L -c 1 -k 1 10000 &

Multi-process CUDA/CUDA stream with one early consumer and two late-attached
consumers. The producer and early consumer processes are configured to stream
100000 frames, whereas late-attached consumer process one is configured to
receive 10000 frames and late-attached consumer process two is configured to
receive 50000 frames.

    # Run the commands below to launch the producer and early consumer processes.
    ./nvscistream_event_sample -m 3 -r 2 -p &
    ./nvscistream_event_sample -c 0 -k 0 100000 &
    # Run the command below after some delay to launch late-attached consumer process one.
    sleep 1; # This 1s delay lets the producer and consumer enter the streaming phase.
    ./nvscistream_event_sample -L -c 1 -k 1 10000 &
    # Run the command below after some delay to launch late-attached consumer process two.
    sleep 1; # This 1s delay lets the producer and consumer enter the streaming phase.
    ./nvscistream_event_sample -L -c 2 -k 2 50000 &

### Example commands for the inter-process re-attach use case

Multi-process CUDA/CUDA stream with one early consumer and two late-attached
consumers. The producer and early consumer processes are configured to stream
100000 frames, whereas late-attached consumer process one is configured to
receive 10000 frames and late-attached consumer process two is configured to
receive 50000 frames. Once late-attached consumer process one completes
streaming, re-attach it to receive 5000 frames.

    # Run the commands below to launch the producer and early consumer processes.
    ./nvscistream_event_sample -m 3 -r 2 -p &
    ./nvscistream_event_sample -c 0 -k 0 100000 &
    # Run the command below after some delay to launch late-attached consumer process one.
    sleep 1; # This 1s delay lets the producer and consumer enter the streaming phase.
    ./nvscistream_event_sample -L -c 1 -k 1 10000 &
    # Run the command below after some delay to launch late-attached consumer process two.
    sleep 1;
    ./nvscistream_event_sample -L -c 2 -k 2 50000 &
    # After late-attached consumer process one completes, re-attach it.
    ./nvscistream_event_sample -L -c 1 -k 1 5000 &

Limitations with C2C late/re-attach:

This sample app has the following limitations.

1. For C2C late/re-attach, this sample app does not support a configuration
   in which an IPC consumer is the only early consumer and all remaining
   consumers are C2C late-attached consumers. This is because the logic for
   setting static attributes for late-attach has not been added.
2. A C2C consumer can act as an IPC consumer during late/re-attach, but an
   IPC consumer cannot be made a C2C consumer during late/re-attach.

### Example commands for the inter-chip late-attach use case

Multi-process CUDA/CUDA stream with one early C2C consumer and one C2C
late-attached consumer. The producer and early C2C consumer processes are
configured to stream 100000 frames, whereas the late-attached C2C consumer
process is configured to receive 10000 frames.
The early consumer uses nvscic2c_pcie_s0_c5_1 <-> nvscic2c_pcie_s0_c6_1 for
C2C communication. The late-attached consumer uses
nvscic2c_pcie_s0_c5_2 <-> nvscic2c_pcie_s0_c6_2 for C2C communication.

    # Run the command below to launch the producer on SoC1.
    ./nvscistream_event_sample -m 2 -r 1 -P 0 nvscic2c_pcie_s0_c5_1 -P 1 nvscic2c_pcie_s0_c5_2 &
    # Run the command below to launch the early consumer process on SoC2.
    ./nvscistream_event_sample -C 0 nvscic2c_pcie_s0_c6_1 -k 0 100000 &
    # Run the command below after some delay to launch the late-attached consumer process on SoC2.
    sleep 1; # This 1s delay lets the producer and consumer enter the streaming phase.
    ./nvscistream_event_sample -L -C 1 nvscic2c_pcie_s0_c6_2 -k 1 10000 &

Multi-process CUDA/CUDA stream with one early C2C consumer and two C2C
late-attached consumers. The producer and early C2C consumer processes are
configured to stream 100000 frames, whereas late-attached C2C consumer process
one is configured to receive 10000 frames and late-attached C2C consumer
process two is configured to receive 10000 frames.
The early consumer uses nvscic2c_pcie_s0_c5_1 <-> nvscic2c_pcie_s0_c6_1 for
C2C communication. Late-attached consumer one uses
nvscic2c_pcie_s0_c5_2 <-> nvscic2c_pcie_s0_c6_2 for C2C communication, and
late-attached consumer two uses
nvscic2c_pcie_s0_c5_3 <-> nvscic2c_pcie_s0_c6_3 for C2C communication.

    # Run the command below to launch the producer on SoC1.
    ./nvscistream_event_sample -m 3 -r 2 -P 0 nvscic2c_pcie_s0_c5_1 -P 1 nvscic2c_pcie_s0_c5_2 -P 2 nvscic2c_pcie_s0_c5_3 &
    # Run the command below to launch the early consumer process on SoC2.
    ./nvscistream_event_sample -C 0 nvscic2c_pcie_s0_c6_1 -k 0 100000 &
    # Run the command below after some delay to launch the late-attached consumer process.
    sleep 1; # This 1s delay lets the producer and consumer enter the streaming phase.
    ./nvscistream_event_sample -L -C 1 nvscic2c_pcie_s0_c6_2 -k 1 10000 &
    # Run the command below after some delay to launch the late-attached consumer process.
    sleep 1;
    ./nvscistream_event_sample -L -C 2 nvscic2c_pcie_s0_c6_3 -k 2 10000 &

### Example commands for the inter-chip/process re-attach use case

Multi-process CUDA/CUDA stream with one early consumer and two late-attached
consumers. The producer and early consumer processes are configured to stream
100000 frames, whereas late-attached consumer process one is configured to
receive 10000 frames and late-attached consumer process two is configured to
receive 50000 frames. Once late-attached consumer process one completes
streaming, re-attach it to receive 5000 frames. Once late-attached consumer
process two completes streaming, re-attach it as an IPC consumer to receive
5000 frames.

    # Run the command below to launch the producer on SoC1.
    ./nvscistream_event_sample -m 3 -r 2 -P 0 nvscic2c_pcie_s0_c5_1 -P 1 nvscic2c_pcie_s0_c5_2 -P 2 nvscic2c_pcie_s0_c5_3 &
    # Run the command below to launch the early consumer process on SoC2.
    ./nvscistream_event_sample -C 0 nvscic2c_pcie_s0_c6_1 -k 0 100000 &
    # Run the command below after some delay to launch the late-attached consumer process.
    sleep 1; # This 1s delay lets the producer and consumer enter the streaming phase.
    ./nvscistream_event_sample -L -C 1 nvscic2c_pcie_s0_c6_2 -k 1 10000 &
    # Run the command below after some delay to launch the late-attached consumer process.
    sleep 1;
    ./nvscistream_event_sample -L -C 2 nvscic2c_pcie_s0_c6_3 -k 2 50000 &
    # Once late-attached consumer process one completes streaming,
    # re-attach it to receive 5000 frames.
    ./nvscistream_event_sample -L -C 1 nvscic2c_pcie_s0_c6_2 -k 1 5000 &
    # Once late-attached consumer process two completes streaming,
    # re-attach it as an IPC consumer on SoC1 to receive 5000 frames.
    ./nvscistream_event_sample -L -c 2 -k 2 5000 &
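
The late-attach sequences above depend on launching processes in a fixed order with a delay in between, so it may help to wrap one of them in a small script. The following is a minimal sketch of the single-SoC, one-late-consumer case only. The names `APP`, `DRY_RUN`, `run_bg`, and `late_attach` are hypothetical helpers introduced for this illustration and are not part of the sample. The command-line flags are copied unchanged from the inter-process late-attach example. By default the script only prints the commands it would run.

```sh
#!/bin/sh
# Sketch only: APP, DRY_RUN, run_bg and late_attach are hypothetical names.
# The flags passed to the sample are copied from the inter-process
# late-attach example in this README.

APP=${APP:-./nvscistream_event_sample}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = also launch them

# Print a command line; when DRY_RUN is 0, also start it in the background.
run_bg() {
    printf '%s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@" &
    fi
}

# Usage: late_attach [early_frames] [late_frames]
late_attach() {
    frames_early=${1:-100000}
    frames_late=${2:-10000}

    # Producer and early consumer processes.
    run_bg "$APP" -m 2 -r 1 -p
    run_bg "$APP" -c 0 -k 0 "$frames_early"

    # The README uses a 1s delay so that the producer and early consumer
    # reach the streaming phase before the late consumer starts.
    if [ "$DRY_RUN" = "0" ]; then sleep 1; fi

    # Late-attached consumer process.
    run_bg "$APP" -L -c 1 -k 1 "$frames_late"

    # Block until every launched process has exited.
    if [ "$DRY_RUN" = "0" ]; then wait; fi
}
```

Sourcing the file and calling `late_attach` prints the three command lines in launch order. Setting `DRY_RUN=0` before the call launches them instead. Because inter-process test cases must be run with sudo, a real run would need to be started with the corresponding privileges.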