NCP-AIN NVIDIA-Certified Professional AI Networking Questions and Answers
You are optimizing an AI workload that involves multiple GPUs across different nodes in a data center. The application requires both high-bandwidth GPU-to-GPU communication within nodes and efficient communication between nodes.
Which combination of NVIDIA technologies would best support this multi-node, multi-GPU AI workload?
You are tasked with troubleshooting a link flapping issue in an InfiniBand AI fabric. You would like to start troubleshooting from the physical layer.
Which NVIDIA tool should be used for this task?
When designing a multi-tenancy East/West (E/W) fabric using Unified Fabric Manager (UFM), which method should be used?
You are optimizing an InfiniBand network for AI workloads that require low-latency and high-throughput data transfers. Which feature of InfiniBand networks minimizes CPU overhead during data transfers?
You are troubleshooting a Spectrum-X network and need to validate the fabric configuration. Which feature of Spectrum-X allows for automated fabric validation?
What are the two general user account types in MLNX-OS?
Pick the 2 correct responses below:
You are tasked with configuring multi-tenancy using partition key (PKey) for a high-performance storage fabric running on InfiniBand. Each tenant’s GPU server is allowed to access the shared storage system but cannot communicate with another tenant’s GPU server.
Which of the following partition key membership configurations would you implement to set up multi-tenancy in this environment?
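As background for the scenario above, here is a minimal sketch of how InfiniBand PKey membership is evaluated. In a 16-bit PKey, the top bit marks full membership and the low 15 bits identify the partition. Two ports can exchange traffic only if they are in the same partition and at least one of them is a full member. The partition number `0x0010` and the names `STORAGE`, `TENANT_A`, and `TENANT_B` are hypothetical values chosen for illustration, not taken from the question.

```python
FULL_MEMBER_BIT = 0x8000  # top bit set: full member; clear: limited member
BASE_MASK = 0x7FFF        # low 15 bits: partition identifier


def can_communicate(pkey_a: int, pkey_b: int) -> bool:
    """Return True if two ports with these PKeys may exchange traffic."""
    same_partition = (pkey_a & BASE_MASK) == (pkey_b & BASE_MASK)
    at_least_one_full = bool((pkey_a | pkey_b) & FULL_MEMBER_BIT)
    return same_partition and at_least_one_full


# Hypothetical layout: one shared partition (0x0010), with the storage
# system as a full member and each tenant GPU server as a limited member.
STORAGE = 0x8010
TENANT_A = 0x0010
TENANT_B = 0x0010

print(can_communicate(STORAGE, TENANT_A))   # storage <-> tenant A
print(can_communicate(TENANT_A, TENANT_B))  # tenant A <-> tenant B
```

With this layout, each tenant can reach the storage system, while two limited members of the same partition cannot reach each other.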
A financial services company is planning to implement an AI infrastructure to support real-time fraud detection and risk assessment. They need a solution that can handle both training and inference workloads while maintaining data privacy and security.
Which NVIDIA reference architecture component would be most appropriate to address the data privacy and security concerns in this AI networking setup?
You have recently implemented NVIDIA Spectrum-X in your data center to optimize AI workloads. You need to verify the performance improvements and create a baseline for future comparisons.
Which tool would be most appropriate for creating performance baseline results in this Spectrum-X environment?
What are two methods for accessing the operating system on a BlueField DPU?
Pick the 2 correct responses below:
You're troubleshooting a Spectrum-X network and notice that the System Status LED on a switch has been blinking for more than 5 minutes. What is the most likely cause of this issue?
In an AI cluster using NVIDIA GPUs, which configuration parameter in the NicClusterPolicy custom resource is crucial for enabling high-speed GPU-to-GPU communication across nodes?
Which of the following NCCL environment variables enable SHARP aggregation with NCCL when using the NCCL-SHARP plugin?
Pick the 2 correct responses below:
You're designing a multi-GPU system for AI training using NVIDIA GPUs with NVLink connections. You need to maximize inter-GPU communication bandwidth. Which feature included in NCCL allows for improved communication between GPUs and NICs?
You are troubleshooting InfiniBand connectivity issues in a cluster managed by the NVIDIA Network Operator. You need to verify the status of the InfiniBand interfaces. Which command should you use to check the state and link layer of InfiniBand interfaces on a node?
In Cumulus Linux, which technology provides active-active redundancy to servers without requiring direct inter-switch links?