NCA-GENM Exam Questions: Free Online Access

Exam code:	NCA-GENM
Exam name:	NVIDIA Generative AI Multimodal
Certification:	NVIDIA
Free questions:	403
Updated:	2025-09-04
Rating: 100%

Question 1

Which of the following are potential benefits of using multi-modal learning compared to single-modal learning? (Select all that apply)

A. Improved robustness to noisy or incomplete data.
B. Reduced risk of overfitting to spurious correlations in a single modality.
C. The ability to learn more comprehensive and nuanced representations.
D. Guaranteed higher accuracy across all tasks.
E. Increased computational complexity and data requirements.

Question 2

You are working on a multimodal model for video captioning, where the model needs to generate captions describing the actions and events happening in a video. You notice that the model tends to focus only on the most salient objects in the scene and ignores subtle but important actions. Which of the following techniques can help the model attend to these subtle actions and generate more comprehensive captions?

A. Adding more layers to the LSTM or GRU used for sequence modeling.
B. Implementing a hierarchical attention mechanism that first attends to relevant time steps and then to relevant regions within those time steps.
C. Decreasing the regularization strength.
D. Increasing the learning rate during training.
E. Using a larger batch size.
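
For readers unfamiliar with the term, a hierarchical attention mechanism of the kind described in option B can be sketched in plain Python. This is a toy illustration under stated assumptions, not a production captioning component: the `video[t][r]` layout, the dot-product scoring, the mean-pooled frame summaries, and the name `hierarchical_attention` are all hypothetical choices made for the example.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(vectors, weights):
    # Combine equal-length vectors using the given weights.
    return [sum(w * v[i] for v, w in zip(vectors, weights))
            for i in range(len(vectors[0]))]

def mean(vectors):
    return weighted_sum(vectors, [1.0 / len(vectors)] * len(vectors))

def hierarchical_attention(video, query):
    """video[t][r] is the feature vector of region r in time step t (assumed layout)."""
    # Level 1 (temporal): score each time step by its mean-pooled summary.
    summaries = [mean(regions) for regions in video]
    time_weights = softmax([dot(query, s) for s in summaries])

    # Level 2 (spatial): inside each time step, attend over its regions.
    frame_contexts = []
    for regions in video:
        region_weights = softmax([dot(query, r) for r in regions])
        frame_contexts.append(weighted_sum(regions, region_weights))

    # Final context: region-level contexts mixed by the temporal weights.
    context = weighted_sum(frame_contexts, time_weights)
    return context, time_weights

# Two time steps, two regions each, 2-dimensional features (made-up numbers).
video = [
    [[0.0, 1.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 1.0]],
]
context, time_weights = hierarchical_attention(video, query=[1.0, 0.0])
print(time_weights)
```

In a real model the query would typically come from the decoder state and the scores from learned projections; the two-level structure is the part this sketch is meant to show.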

Question 3

Consider the following code snippet used within a U-Net architecture. What is its purpose?
torch.cat([up, skip], dim=1)

A. It performs an element-wise addition of the 'up' and 'skip' tensors.
B. It performs a matrix multiplication between the 'up' and 'skip' tensors.
C. It multiplies the 'up' and 'skip' tensors element-wise.
D. It concatenates the 'up' and 'skip' tensors along the channel dimension.
E. It subtracts the 'skip' tensor from the 'up' tensor.
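
The shape behaviour of joining two tensors along axis 1 can be checked with a small stand-in. The sketch below deliberately avoids PyTorch and models tensors as nested lists in the common (batch, channels, height, width) layout, which is an assumption about how `up` and `skip` are shaped; `shape`, `filled`, and `cat_dim1` are hypothetical helpers, and the sizes are made up.

```python
def shape(t):
    """Return the shape of a rectangular nested list."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)

def filled(n, c, h, w, value):
    # Build an [n][c][h][w] nested list filled with a constant.
    return [[[[value] * w for _ in range(h)] for _ in range(c)] for _ in range(n)]

def cat_dim1(a, b):
    """Join two [N][C][H][W] nested lists along axis 1 (the channel axis)."""
    # All axes except axis 1 must match, as with tensor concatenation.
    assert shape(a)[0] == shape(b)[0] and shape(a)[2:] == shape(b)[2:]
    return [chans_a + chans_b for chans_a, chans_b in zip(a, b)]

up = filled(1, 4, 5, 5, 1.0)    # hypothetical upsampled decoder features
skip = filled(1, 4, 5, 5, 2.0)  # hypothetical encoder skip-connection features
merged = cat_dim1(up, skip)
print(shape(merged))  # (1, 8, 5, 5): channel counts add, other axes are unchanged
```

No values are added, multiplied, or subtracted here: the first four channels of `merged` are the `up` channels and the last four are the `skip` channels.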

Question 4

You are building a multimodal application that analyzes images and generates descriptive captions. The application needs to handle noisy images and maintain caption consistency. Which of the following techniques would be MOST effective in achieving this?

A. Using a smaller, less complex captioning model to avoid overfitting to the noise.
B. Preprocessing the images using a simple Gaussian blur before feeding them into the captioning model.
C. Increasing the learning rate of the captioning model during training to compensate for the noise.
D. Employing a denoising autoencoder to clean the images followed by a transformer-based captioning model and using beam search with consistency constraints during caption generation.
E. Directly feeding noisy images into a standard image captioning model.

Question 5

You're tasked with building a generative AI model for music composition. You have a large dataset of MIDI files, but the data is inconsistent in terms of tempo, key, and instrumentation. What are the crucial data transformation steps needed before training the model?

A. Converting all MIDI files to MP3 format.
B. Normalizing the tempo of all MIDI files to a standard BPM.
C. Transposing all MIDI files to the same key (e.g., C major/A minor).
D. Standardizing the instrumentation by mapping different instrument patches to a predefined set.
E. Rescaling the MIDI note velocities to a uniform range.
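
To show what the symbolic transformations named in the options look like in practice, here is a minimal sketch on a toy note representation. It does not parse real MIDI files: the `Note` dataclass, the function names, the 120 BPM target, the 40 to 100 velocity range, and the `mapping` dictionary are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Note:
    start: float     # onset in seconds
    duration: float  # length in seconds
    pitch: int       # MIDI note number, 0-127
    velocity: int    # MIDI velocity, 1-127
    program: int     # instrument patch (program number), 0-127

def normalize_tempo(notes, source_bpm, target_bpm=120.0):
    # Playing the same beats at target_bpm scales every time value.
    scale = source_bpm / target_bpm
    return [replace(n, start=n.start * scale, duration=n.duration * scale)
            for n in notes]

def transpose_to_c(notes, tonic_pitch_class):
    # Shift by the smallest interval that moves the tonic (0=C ... 11=B) to C.
    shift = -tonic_pitch_class if tonic_pitch_class <= 6 else 12 - tonic_pitch_class
    return [replace(n, pitch=min(127, max(0, n.pitch + shift))) for n in notes]

def map_instruments(notes, mapping, default=0):
    # Collapse many patches onto a small predefined set.
    return [replace(n, program=mapping.get(n.program, default)) for n in notes]

def rescale_velocity(notes, low=40, high=100):
    # Linearly map the observed velocity range onto [low, high].
    lo = min(n.velocity for n in notes)
    hi = max(n.velocity for n in notes)
    if hi == lo:
        return [replace(n, velocity=(low + high) // 2) for n in notes]
    return [replace(n, velocity=round(low + (n.velocity - lo) * (high - low) / (hi - lo)))
            for n in notes]

# A two-note fragment in D major recorded at 60 BPM (made-up data).
notes = [Note(0.0, 1.0, 62, 30, 1), Note(1.0, 1.0, 66, 120, 5)]
notes = normalize_tempo(notes, source_bpm=60.0)
notes = transpose_to_c(notes, tonic_pitch_class=2)
notes = map_instruments(notes, mapping={1: 0, 5: 0})
notes = rescale_velocity(notes)
print(notes)
```

Each step returns new `Note` objects rather than mutating the input, so the transformations can be applied in any order or skipped individually while experimenting.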
