Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?
正解: B
Explanation Exploratory data analysis (EDA) is a process of examining and summarizing a dataset using various techniques, such as descriptive statistics, visualizations, correlations, outliers detection, and hypothesis testing. EDA can help reveal the main characteristics, patterns, trends, and insights from the data, as well as identify any problems or issues with the data quality or structure. EDA is usually performed after understanding a business requirement for a data analysis report and before building a mock dashboard/presentation layout. Therefore, the correct answer is B. References: [What is Exploratory Data Analysis? | Definition and Examples], [Exploratory Data Analysis in Python]
Which of the following best describes a difference between JSON and XML?
正解: A
The best answer is A) JSON is quicker to read and write. JSON (JavaScript Object Notation) is a lightweight data-interchange format that is based on the JavaScript programming language and easy to understand and generate. JSON uses a simple syntax that consists of name-value pairs and arrays, and does not require any end tags or attributes. JSON is quicker to read and write than XML (Extensible Markup Language), which is a markup language that uses a tag structure to represent data items. XML has a more complex and verbose syntax that requires end tags, attributes, and namespaces123
DA0-001 試験問題 129
A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer The analyst must choose an appropriate chart to include in the dashboard. The following data is available: Which of the following types of charts should be considered?
正解: D
Explanation The best type of chart to display the data is D. Include a column chart using the site and sales to average sales per customer. A column chart is a good choice for comparing categorical data with numerical data, such as the site and sales to average sales per customer. A column chart can show the relative differences between the sites and highlight the site with the highest sales volume per customer. A column chart can also be easily labeled and formatted to make the data clear and understandable. A line chart is not suitable for this data, because it is used to show trends or changes over time, which is not relevant for the site and sales to average sales per customer data. A line chart would also be confusing and misleading, as it would imply a connection or correlation between the sites that does not exist. A pie chart is also not a good choice for this data, because it is used to show the proportion of a whole, not the comparison of different categories. A pie chart would also be difficult to read and interpret, as it would require labels or legends to identify the sites and their sales to average sales per customer. A pie chart would also not be able to show the exact values of the sales to average sales per customer, only their relative sizes. A scatter chart is another inappropriate option for this data, because it is used to show the relationship or correlation between two numerical variables, not between a categorical and a numerical variable. A scatter chart would also be cluttered and unclear, as it would plot each site as a point on a coordinate plane, without any labels or axes. A scatter chart would also not be able to show the differences or rankings between the sites and their sales to average sales per customer.
DA0-001 試験問題 130
An analyst modified a data set that had a number of issues. Given the original and modified versions: Which of the following data manipulation techniques did the analyst use?
正解: B
Explanation The correct answer is B. Recoding. Recoding is a data manipulation technique that involves changing the values or categories of a variable to make it more suitable for analysis. Recoding can be used to simplify or group the data, to correct errors or inconsistencies, or to create new variables from existing ones12 In the example, the analyst used recoding to change the values of Var001, Var002, Var003, and Var004 from numerical to textual form. The analyst also used recoding to assign meaningful labels to the values, such as "Absent" for 0, "Present" for 1, "Low" for 2, "Medium" for 3, and "High" for 4. This makes the data more understandable and easier to analyze.