Databricks Certified Professional Data Scientist Exam Questions and Answers

Questions 4

RMSE measures the error of a predicted:

Options:

A. Numerical value

B. Categorical value

C. Both numerical and categorical values
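
As a rough illustration of what RMSE computes, the sketch below takes the square root of the mean squared difference between actual and predicted numbers. The function name `rmse` and the sample prices are assumptions made up for this example, not part of the exam material.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical house prices (in thousands) and a model's predictions.
actual = [200.0, 250.0, 300.0]
predicted = [210.0, 240.0, 300.0]
error = rmse(actual, predicted)  # sqrt((100 + 100 + 0) / 3)
```

The subtraction inside the sum only makes sense for numbers, which is why RMSE is normally reported for models that output a numeric quantity.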

Questions 5

Select the correct statement that applies to principal component analysis (PCA).

Options:

A. Is a mathematical procedure that transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables.

B. Is a mathematical procedure that transforms a number of (possibly) correlated variables into a (higher) number of uncorrelated variables.

C. Increases the dimensionality of the data set.

D. A and C are correct

E. A and B are correct
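
A minimal sketch of the idea, assuming a tiny made-up 2-D data set: the two correlated columns are rotated onto their principal axes, after which the new variables are uncorrelated and the first one alone carries almost all of the variance. The closed-form angle used here only works for the 2x2 case; the variable names are illustrative.

```python
import math

# Hypothetical 2-D data with strongly correlated columns.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]

n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n
centered = [(x - mean_x, y - mean_y) for x, y in points]

def cov(us, vs):
    """Sample covariance of two already-centered sequences."""
    return sum(u * v for u, v in zip(us, vs)) / (len(us) - 1)

xs = [c[0] for c in centered]
ys = [c[1] for c in centered]
sxx, syy, sxy = cov(xs, xs), cov(ys, ys), cov(xs, ys)

# For a 2x2 covariance matrix, the first principal axis makes
# this angle with the x-axis.
theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
pc1 = (math.cos(theta), math.sin(theta))
pc2 = (-math.sin(theta), math.cos(theta))

# Project the centered data onto the two principal axes.
z1 = [x * pc1[0] + y * pc1[1] for x, y in centered]
z2 = [x * pc2[0] + y * pc2[1] for x, y in centered]

# Share of total variance kept if only the first component is retained.
explained = cov(z1, z1) / (cov(z1, z1) + cov(z2, z2))
```

Keeping only `z1` replaces two correlated variables with one, which is the dimensionality reduction PCA is typically used for.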

Questions 6

Which of the following problems can you solve using the binomial distribution?

Options:

A. A manufacturer of metal pistons finds that, on average, 12% of his pistons are rejected because they are either oversize or undersize. What is the probability that a batch of 10 pistons will contain no more than 2 rejects?

B. A life insurance salesman sells on average 3 life insurance policies per week. Use Poisson's law to calculate the probability that in a given week he will sell some policies.

C. Vehicles pass through a junction on a busy road at an average rate of 300 per hour. Find the probability that none passes in a given minute.

D. It was found that the mean length of 100 parts produced by a lathe was 20.05 mm with a standard deviation of 0.02 mm. Find the probability that a part selected at random would have a length between 20.03 mm and 20.08 mm.
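
The piston scenario has a fixed number of independent trials (10 pistons) and a constant rejection probability (0.12), which is the setting the binomial distribution describes. The sketch below works that calculation through; the helper names `binomial_pmf` and `binomial_cdf` are assumptions chosen for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """Probability of at most k successes in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Batch of 10 pistons, each rejected with probability 0.12:
# probability that the batch contains no more than 2 rejects.
prob_at_most_2 = binomial_cdf(2, 10, 0.12)
```

Under these assumptions the result comes out at roughly 0.89.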

Questions 7

In unsupervised learning, which statements correctly apply?

Options:

A. It does not have a target variable

B. Instead of telling the machine "Predict Y for our data X", we're asking "What can you tell me about X?"

C. Telling the machine "Predict Y for our data X"

Questions 8

RMSE is a useful metric for evaluating which types of models?

Options:

A. Logistic regression

B. Naive Bayes classifier

C. Linear regression

D. All of the above

Questions 9

A fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this fruit is an apple, regardless of the

Options:

A. Presence of the other features.

B. Absence of the other features.

C. Presence or absence of the other features.

D. None of the above

Questions 10

Suppose you have been given a relatively high-dimensional set of independent variables and are asked to come up with a model that predicts one of two possible outcomes, such as "YES" or "NO". Which of the following techniques is the best fit?

Options:

A. Support vector machines

B. Naive Bayes

C. Logistic regression

D. Random decision forests

E. All of the above

Questions 11

In which of the following scenarios can you use regression to predict values?

Options:

A. Samsung can use it for mobile sales forecasts

B. Mobile companies can use it to forecast manufacturing defects

C. Probability of a celebrity divorce

D. Only A and B

E. All of A, B and C

Questions 12

You are studying the behavior of a population, and you are provided with multidimensional data at the individual level. You have identified four specific individuals who are valuable to your study, and would like to find all users who are most similar to each individual. Which algorithm is the most appropriate for this study?

Options:

A. Association rules

B. Decision trees

C. Linear regression

D. K-means clustering

Questions 13

The method based on principal component analysis (PCA) evaluates the features according to:

Options:

A. The projection of the largest eigenvector of the correlation matrix on the initial dimensions

B. The magnitude of the components of the discriminant vector

C. The projection of the smallest eigenvector of the correlation matrix on the initial dimensions

D. None of the above

Questions 14

Under which circumstance do you need to implement N-fold cross-validation after creating a regression model?

Options:

A. The data is unformatted.

B. There is not enough data to create a test set.

C. There are missing values in the data.

D. There are categorical variables in the model.
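
For readers unfamiliar with the mechanics, here is a minimal sketch of N-fold cross-validation around a one-variable least-squares regression. Each fold is held out once while the model is fitted on the remaining points, so every observation is used for both training and evaluation. The helpers `fit_line` and `k_fold_rmse`, the interleaved fold assignment, and the data are all assumptions made for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_rmse(xs, ys, k):
    """Average held-out RMSE over k folds (fold i takes every k-th point)."""
    scores = []
    for fold in range(k):
        test_idx = [i for i in range(len(xs)) if i % k == fold]
        train_idx = [i for i in range(len(xs)) if i % k != fold]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        sq = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(sq) / len(sq)))
    return sum(scores) / k

# Hypothetical small data set: too few points to set aside a separate test set.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.2, 3.9, 6.1, 8.0, 9.8, 12.1, 14.2, 15.9]
cv_rmse = k_fold_rmse(xs, ys, k=4)
```

The interleaved split keeps the example short; in practice the rows are usually shuffled before being assigned to folds.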

Questions 15

You are creating a classification process where the input is the income, education and current debt of a customer. What could be a possible output of this process?

Options:

A. Probability of the customer defaulting on loan repayment

B. Percentage of the customer's loan repayment capability

C. Percentage indicating whether the customer should be given a loan or not

D. The output might be a risk class, such as "good", "acceptable", "average", or "unacceptable".

Questions 16

Find the classifier that assumes independence among all its features.

Options:

A. Neural networks

B. Linear regression

C. Naive Bayes

D. Random forests

Questions 17

The feature hashing approach is described as: "SGD-based classifiers avoid the need to predetermine vector size by simply picking a reasonable size and shoehorning the training data into vectors of that size". With large vectors, or with multiple locations per feature, feature hashing:

Options:

A. Is a problem with accuracy

B. Makes it hard to understand what the classifier is doing

C. Makes it easy to understand what the classifier is doing

D. Is a problem with accuracy and also makes it hard to understand what the classifier is doing
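
To make the quoted description concrete, the sketch below hashes string features into a fixed-size vector, optionally writing each feature to more than one location. The helpers `hash_index` and `hashed_vector`, the `probes` parameter, and the use of MD5 as the hash are assumptions for this example rather than the behaviour of any particular library.

```python
import hashlib

def hash_index(feature, size, seed=0):
    """Map a feature name to a slot in a vector of the given size."""
    digest = hashlib.md5(f"{seed}:{feature}".encode("utf-8")).hexdigest()
    return int(digest, 16) % size

def hashed_vector(features, size, probes=1):
    """Encode features into a fixed-size vector, using `probes` slots each."""
    vec = [0.0] * size
    for feature in features:
        for seed in range(probes):
            vec[hash_index(feature, size, seed)] += 1.0
    return vec

# Hypothetical tokens; the vector size is picked up front, not derived from the data.
tokens = ["red", "round", "apple", "red"]
vec = hashed_vector(tokens, size=16, probes=2)
```

Because several features can land in the same slot and one feature can occupy several slots, a learned weight at a given index no longer maps back to a single named feature.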

Questions 18

Let's say you have the two cases below for movie ratings:

1. You recommend a movie to a user with four stars, and he really doesn't like it (he'd rate it two stars).

2. You recommend a movie with three stars, but the user loves it (he'd rate it five stars).

Which statement correctly applies?

Options:

A. In both cases, the contribution to the RMSE is the same

B. In both cases, the contribution to the RMSE is different

C. In both cases, the contribution to the RMSE could vary

D. None of the above
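
The arithmetic for the two cases can be checked directly. In this small sketch each case is written as a (predicted, actual) pair; the variable names are chosen only for illustration.

```python
import math

# (predicted stars, stars the user would actually give)
cases = [(4, 2), (3, 5)]

# Squaring removes the sign, so over- and under-prediction by the
# same amount contribute equally to the sum inside the RMSE.
squared_errors = [(predicted - actual) ** 2 for predicted, actual in cases]
rmse_both = math.sqrt(sum(squared_errors) / len(squared_errors))
```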

Questions 19

In which of the following scenarios should you apply Bayes' theorem?

Options:

A. The sample space is partitioned into a set of mutually exclusive events {A1, A2, ..., An}.

B. Within the sample space, there exists an event B, for which P(B) > 0.

C. The analytical goal is to compute a conditional probability of the form P(Ak | B).

D. In all of the above cases
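
The conditions listed in the options fit together in a single calculation: given priors P(Ai) over a partition and likelihoods P(B | Ai), Bayes' theorem gives P(Ak | B) = P(Ak) P(B | Ak) / sum over i of P(Ai) P(B | Ai). The sketch below works through a made-up example; the function name `bayes_posterior` and the machine and defect figures are assumptions.

```python
def bayes_posterior(priors, likelihoods, k):
    """P(A_k | B) for mutually exclusive, exhaustive events A_1..A_n."""
    # Law of total probability gives P(B) over the partition.
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    if evidence <= 0:
        raise ValueError("P(B) must be greater than 0")
    return priors[k] * likelihoods[k] / evidence

# Hypothetical: three machines make 50%, 30% and 20% of all parts,
# with defect rates of 1%, 2% and 3% respectively.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.01, 0.02, 0.03]

# Probability that a defective part came from the third machine.
posterior_machine_3 = bayes_posterior(priors, likelihoods, 2)
```

The guard on `evidence` mirrors the requirement that P(B) > 0, since the posterior is undefined otherwise.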

Questions 20

Select the choice where regression algorithms are not the best fit.

Options:

A. When the dimension of the object is given

B. When the weight of the person is given

C. Temperature in the atmosphere

D. Employee status

Exam Name: Databricks Certified Professional Data Scientist Exam
Last Update: May 2, 2024
Questions: 138
