Free Online Access to DSA-C03 Exam Questions
Exam Code: DSA-C03
Exam Name: SnowPro Advanced: Data Scientist Certification Exam
Certification: Snowflake
Free Questions: 289
Last Updated: 2025-09-05
Which of the following statements about Z-tests and T-tests are generally true? Select all that apply.
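For context, the core distinction is that a t-test estimates the population standard deviation from the sample (appropriate for small samples), while a z-test assumes it is known, or that the sample is large enough for the central limit theorem to apply. A minimal sketch on hypothetical data, assuming scipy and statsmodels are installed:

```python
# Minimal illustration of the z-test vs. t-test distinction on made-up data.
import numpy as np
from scipy import stats
from statsmodels.stats.weightstats import ztest

rng = np.random.default_rng(42)
sample = rng.normal(loc=102, scale=15, size=25)  # small sample, sigma unknown

# t-test: estimates the population std. dev. from the sample (n-1 dof);
# the appropriate choice for small samples with unknown variance.
t_stat, t_p = stats.ttest_1samp(sample, popmean=100)

# z-test: valid when sigma is known or n is large enough for the CLT;
# with n=25 and unknown sigma this is shown only for comparison.
z_stat, z_p = ztest(sample, value=100)

print(f"t = {t_stat:.3f} (p = {t_p:.3f}), z = {z_stat:.3f} (p = {z_p:.3f})")
```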
You've trained a model using Snowflake ML and want to deploy it for real-time predictions using a Snowflake UDF. To ensure minimal latency, you need to optimize the UDF's performance. Which of the following strategies and considerations are most important when creating and deploying a UDF for model inference in Snowflake to minimize latency, especially when the model is large (e.g., > 100MB)?
Select all that apply.
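For context, the usual latency levers are loading the model once per process rather than once per row, shipping the model file as a stage import, and registering a vectorized (batch) UDF so serialization overhead is amortized across rows. A minimal Snowpark sketch under those assumptions; the stage path `@model_stage/model.pkl`, the UDF name `predict_churn`, the two FLOAT input features, and the pre-existing `session` object are all illustrative, not part of the question:

```python
# Sketch of a low-latency Snowpark Python UDF for model inference.
# Assumes a pickled model at @model_stage/model.pkl and an existing
# snowflake.snowpark.Session named `session`.
import sys
import pickle
import cachetools
import pandas as pd
from snowflake.snowpark.types import (
    FloatType, PandasSeriesType, PandasDataFrameType,
)

@cachetools.cached(cache={})
def load_model(path: str):
    # Loading a >100MB model once per warehouse process, not once per
    # row, is the main latency win; cachetools memoizes the load.
    with open(path, "rb") as f:
        return pickle.load(f)

def predict_batch(features: pd.DataFrame) -> pd.Series:
    # Imported stage files land in the directory exposed via _xoptions.
    import_dir = sys._xoptions.get("snowflake_import_directory")
    model = load_model(import_dir + "model.pkl")
    return pd.Series(model.predict(features.to_numpy()))

# PandasDataFrameType input marks this as a vectorized (batch) UDF.
predict_udf = session.udf.register(
    predict_batch,
    name="predict_churn",
    input_types=[PandasDataFrameType([FloatType(), FloatType()])],
    return_type=PandasSeriesType(FloatType()),
    imports=["@model_stage/model.pkl"],
    packages=["pandas", "scikit-learn", "cachetools"],
    max_batch_size=1000,
    replace=True,
)
```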
You are building a churn prediction model for a telecommunications company using Snowflake and Snowpark ML. You have trained a Gradient Boosting Machine (GBM) model and want to understand the feature importance to identify key drivers of churn. You've used SHAP (SHapley Additive exPlanations) values to explain individual predictions. Given a customer with a high churn risk, you observe that the 'monthly_charges' feature has a significantly large negative SHAP value for that specific prediction. Which of the following statements best interprets this observation in the context of feature impact?
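For context, SHAP values are additive per-prediction attributions: each feature's value is its push away from the model's baseline output, so a large negative value for 'monthly_charges' means that feature lowered this particular customer's predicted churn score. A minimal sketch on synthetic data, assuming the shap and scikit-learn packages; the second feature name is illustrative:

```python
# Per-prediction SHAP attribution for a GBM on synthetic churn data.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "monthly_charges": rng.uniform(20, 120, 500),
    "tenure_months": rng.integers(1, 72, 500),
})
# Synthetic label loosely tied to both features.
y = ((X["monthly_charges"] / 120 - X["tenure_months"] / 72
      + rng.normal(0, 0.3, 500)) > 0).astype(int)

model = GradientBoostingClassifier().fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Each SHAP value is that feature's additive push away from the baseline
# (expected_value). A negative value for 'monthly_charges' means the
# feature pulled this prediction's churn score DOWN, not up.
i = 0
print("baseline:", explainer.expected_value)
print("monthly_charges SHAP:",
      shap_values[i, X.columns.get_loc("monthly_charges")])
```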
You are analyzing customer transaction data in Snowflake to identify fraudulent activities. The 'TRANSACTION_AMOUNT' column exhibits a right-skewed distribution. Which of the following Snowflake queries is MOST effective in identifying outliers based on the Interquartile Range (IQR) method, specifically targeting unusually large transaction amounts? Assume IQR, Q1, and Q3 have already been calculated and are available as variables in the Snowflake session.
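For context, the IQR method for large values flags rows above the upper fence Q3 + 1.5*IQR; the lower fence is irrelevant when only unusually large amounts are targeted. A pandas sketch of the same logic on made-up data (in Snowflake, Q1 and Q3 would typically come from PERCENTILE_CONT):

```python
# Upper-fence IQR outlier detection on a right-skewed column.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Right-skewed amounts (lognormal), plus a few injected large outliers.
amounts = np.concatenate([rng.lognormal(3, 0.5, 1000), [500.0, 800.0, 1200.0]])
df = pd.DataFrame({"TRANSACTION_AMOUNT": amounts})

q1 = df["TRANSACTION_AMOUNT"].quantile(0.25)
q3 = df["TRANSACTION_AMOUNT"].quantile(0.75)
iqr = q3 - q1

# Targeting only unusually LARGE amounts means filtering on the upper
# fence; the lower fence (Q1 - 1.5*IQR) is not needed here.
upper_fence = q3 + 1.5 * iqr
outliers = df[df["TRANSACTION_AMOUNT"] > upper_fence]
print(f"upper fence = {upper_fence:.2f}, outliers flagged = {len(outliers)}")
```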
A marketing team uses Snowflake to store customer purchase data. They want to segment customers based on their spending habits using a derived feature representing each customer's average monthly spend. The 'PURCHASES' table has columns 'customer_id' (INT), 'purchase_date' (DATE), and 'purchase_amount' (NUMBER). The team needs a way to handle situations where a customer might have missing months (no purchases in a particular month): they want to impute a 0 spend for those months before calculating the average. Which approach provides the most accurate and robust calculation, especially for customers with sparse purchase history?
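For context, the robust pattern is to place each customer onto a complete month spine, impute 0 for the missing months, and only then average, so sparse customers are not averaged over just their active months. A pandas sketch on made-up data (in Snowflake this typically maps to a generated month spine LEFT JOINed to the aggregated purchases):

```python
# Impute 0 for missing months before averaging monthly spend.
import pandas as pd

purchases = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "purchase_date": pd.to_datetime(["2024-01-15", "2024-03-02", "2024-02-20"]),
    "purchase_amount": [50.0, 30.0, 75.0],
})

# Aggregate to customer-month totals.
monthly = (purchases
           .assign(month=purchases["purchase_date"].dt.to_period("M"))
           .groupby(["customer_id", "month"])["purchase_amount"].sum())

# Reindex every customer onto the full month spine, imputing 0 for gaps.
# Without this step, customer 1's average would ignore the empty February.
spine = pd.period_range("2024-01", "2024-03", freq="M")
full = monthly.unstack("month").reindex(columns=spine).fillna(0.0)

avg_monthly_spend = full.mean(axis=1)
print(avg_monthly_spend)
```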