Oracle Data Science Professional Practice Exam

51 Terms

1

A team wants to use CPU utilization as a metric to trigger autoscaling. Which type of autoscaling policy should they configure?

Pre-defined metric

2

A data scientist is working on a fraud detection model. They need to store the trained model so that it can be versioned, tracked, and later deployed without modification. Which feature should they use?

Model Catalog

3

A data scientist wants to develop a PySpark application iteratively using a sample of their dataset. Which environment is recommended for this purpose?

Data Science notebook session

4

Arrange the following steps in the correct Git repository workflow order. 1 - Install, configure, and authenticate Git. 2 - Create a local and remote Git repository. 3 - Configure SSH keys for the Git repository. 4 - Commit files to the local Git repo. 5 - Push files to the remote Git repo.

1, 3, 2, 4, 5

5

A data scientist is using an AI model to predict fraudulent transactions. A financial regulator asks why a specific transaction was flagged as fraud. Which technique should the data scientist use?

Local Explanation

6

Which statement is true about overriding pipeline defaults?

Pipeline defaults can be overridden before starting the pipeline execution.

7

You are using Oracle AutoML to produce a model and are evaluating the score metric for the model. Which two prevailing metrics would you use for evaluating a multiclass classification model?

F1, Recall
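As a rough illustration of how these two metrics extend to more than two classes, the sketch below computes macro-averaged recall and F1 in plain Python. The function name `macro_recall_f1` and the sample labels are made up for this example; this is not Oracle AutoML's implementation, and AutoML may use a different averaging scheme.

```python
from collections import Counter

def macro_recall_f1(y_true, y_pred):
    """Return (macro recall, macro F1): per-class scores averaged with equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    recalls, f1s = [], []
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        recalls.append(recall)
        f1s.append(f1)
    return sum(recalls) / len(labels), sum(f1s) / len(labels)

# Hypothetical three-class example
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
macro_recall, macro_f1 = macro_recall_f1(y_true, y_pred)
```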

8

Which is NOT a supported encryption algorithm in OCI Vault?

RSA (Rivest-Shamir-Adleman)

9

What can AI Quick Actions help you do?

Fine-tune the model and deploy

10

What triggers the automation of the MLOps pipeline?

Changes in data, monitoring events, or calendar intervals

11

Which statement is true about changing the autoscaling policy configuration of an existing model deployment that is in an active state?

Changes to non-infrastructure-related aspects can be specified for an active model deployment.

12

A data scientist is given data containing sensitive information, such as customer emails and contact numbers, that must be redacted before the data is used further. Which OCI Data Science operator can be used for this purpose?

PII operator

13

A data scientist is working on a set of projects via the OCI Console UI (manual selection) or programmatically. They want to collaborate in a structured manner. Which option is available to create a Data Science Project in OCI?

Can be created through either the OCI Console UI or the ADS SDK

14

What is the primary goal of the loss function in model training?

To compare predicted values with true target values and quantify their difference
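One common loss function for regression is mean squared error. The sketch below is a minimal plain-Python version with made-up values, shown only to illustrate how a loss turns the gap between predictions and targets into a single number; it is not tied to any OCI service.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical targets and predictions
loss = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
```

A smaller value means the predictions are closer to the targets, which is what training tries to achieve.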

15

Which statement is incorrect regarding the benefits of autoscaling for model deployment in Oracle Data Science?

Autoscaling, in conjunction with load balancers, enhances availability by rerouting traffic to healthy instances in case of instance failure.

16

How can a team ensure that data processing occurs before model training in a pipeline?

By setting dependencies between steps
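The idea of step dependencies can be sketched with Python's standard `graphlib` module (Python 3.9+). The step names below are hypothetical, and this is not the OCI Data Science Pipelines API; it only shows how declared dependencies determine execution order.

```python
from graphlib import TopologicalSorter

# Map each (hypothetical) step to the set of steps it depends on.
depends_on = {
    "data_processing": set(),
    "model_training": {"data_processing"},
    "model_evaluation": {"model_training"},
}

# static_order() yields every step only after all of its dependencies.
execution_order = list(TopologicalSorter(depends_on).static_order())
```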

17

Which correlation method is used to measure the relationship between two categorical variables in ADS?

Cramer's V method
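For intuition, Cramér's V is the chi-squared statistic of the contingency table, normalized to the range 0 to 1. The sketch below computes the uncorrected form in plain Python; the function name `cramers_v` is chosen for this example, and ADS may apply a bias correction that this version omits.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row, col = Counter(xs), Counter(ys)
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical data: perfectly associated vs. independent variables
v_strong = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
v_none = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
```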

18

What is the difference between a job and a job run in OCI Data Science Jobs?

A job is a template, while a job run is a single execution of that template.

19

Which two statements are true about Oracle Cloud Infrastructure (OCI) Open Data?

Open Data includes text and image data repositories for AI and ML. Audio and video formats are not available.
Each dataset in Open Data consists of code and tooling usage examples for consumption and reproducibility.

20

Which statement best describes Oracle Cloud Infrastructure Data Science Jobs?

Data Science Jobs lets you define and run repeatable tasks on fully-managed infrastructure.

21

Model taxonomy allows you to describe the metadata you are saving to the model catalog. You have been asked to analyze and improve a deep neural network model that was built on the electrocardiogram records of patients. There is no description of the framework used to build the model. Where is the best place to find more details about a machine learning model inside the model catalog?

Check for Model Taxonomy details.

22

What is the primary reason for performing feature scaling in machine learning models?

To bring features onto a comparable range so that all features contribute equally to the model
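Feature scaling rescales numeric features so that none dominates simply because of its units. The sketch below shows min-max scaling in plain Python; the feature names and values are made up for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical features on very different scales
ages = [20, 35, 50]
incomes = [20_000, 60_000, 100_000]
scaled_ages = min_max_scale(ages)
scaled_incomes = min_max_scale(incomes)
```

After scaling, both features span the same range, so distance-based and gradient-based models treat them comparably.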

23

What is the purpose of a dynamic group in OCI?

To manage API access for resources such as notebook sessions

24

Which resource types are included in the default matching rules of the dynamic group in the Data Science service template, allowing resources such as model deployments and job runs to interact with OCI services securely?

datasciencemodeldeployment, datasciencenotebooksession, datasciencejobrun

25

Which model parameter value are you most likely to use if you are not sure of your selection while configuring the Forecasting operator?

prophet

26

The 'read' verb means the users will be able to access and view the details of data science resources, but they won't have the permission to modify, delete or create them. This verb is commonly used in access control policies to grant read-only access to users or groups. You want to create a user group for a team of external data science consultants. The consultants should only have the ability to view data science resource details but not the ability to create, delete, or update data science resources. Which verb should you write in the policy?

read

27

Data labeling is the process of annotating data with properties or labels, which is necessary before training a machine learning model. A data scientist is working on a project to train a machine learning model to identify tigers in images. What is the first step they need to take before training the model?

Label the images with "tiger" or "not tiger".

28

Which type of data is NOT available in Oracle Open Data?

Financial transaction data

29

Oracle Cloud Infrastructure (OCI) Data Labeling supports annotating documents, images, and text. You have been given a collection of files in different formats that you would like to annotate using OCI Data Labeling. Which three types of files could this tool annotate?

Images of computer server racks
A type-written document that details an annual budget
A collection of purchase orders for office supplies

30

The PII Detection Operator helps detect and redact personal identifiable information (PII) from datasets, ensuring data privacy compliance before sharing sensitive information. A national healthcare company needs to redact personal information (such as names, emails, and phone numbers) from patient records before sharing them with a research institute. Which Data Science operator is best suited for this task?

PII Detection Operator
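To make the idea of redaction concrete, the sketch below masks emails and phone numbers with simple regular expressions. This is only an illustration with hypothetical patterns and a made-up record; it is not how the PII Detection Operator works internally, and it does not detect names.

```python
import re

# Deliberately simple patterns; real PII detection is far more thorough.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text):
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

# Hypothetical patient record
record = "Contact Jane at jane.doe@example.com or 555-123-4567."
redacted = redact(record)
```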

31

Storing credentials in OCI Vault provides secure access control and prevents leaks, because credentials are retrieved only when needed and are not stored in code or configuration files. What is the best way to store the credentials?

Store the credentials in OCI Vault and retrieve them programmatically when needed.

32

What is the key difference between PDP and ICE (Individual Conditional Expectation) in ADS?

PDP provides feature-level insights, while ICE provides sample-level insights.
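The relationship between the two can be sketched in a few lines: each ICE curve varies one feature for a single sample, and the PDP is the average of those curves. The `model` function, rows, and grid below are hypothetical stand-ins, not the ADS explainer API.

```python
def model(age, income):
    # Hypothetical scoring function with an interaction between the features
    bonus = 5.0 if age > 40 and income > 50_000 else 0.0
    return 0.5 * age + 0.001 * income + bonus

rows = [(30, 40_000), (45, 60_000), (50, 30_000)]  # (age, income) samples
grid = [25, 35, 45]                                # age values to sweep

# ICE: one curve per sample, sweeping age while keeping that sample's income.
ice = [[model(a, income) for a in grid] for _, income in rows]

# PDP: average the ICE curves at each grid point.
pdp = [sum(curve[i] for curve in ice) / len(ice) for i in range(len(grid))]
```

Here only the second sample's ICE curve jumps at age 45, a per-sample effect that the averaged PDP curve smooths out.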

33

You want to fetch data from an Autonomous Database in OCI without using a database wallet. What must you do?

Provide the hostname and port number in the connection_parameters dictionary.

34

During the fine-tuning process, the checkpoints and final outputs are automatically stored in an OCI Object Storage bucket, ensuring persistence and accessibility. Where are the training job outputs stored after fine-tuning is completed?

In an OCI Object Storage bucket

35

A data scientist wants to create a sophisticated autoscaling query that combines multiple metrics using logical operators. Which option should they use?

Custom scaling metric with MQL (Monitoring Query Language) expressions

36

OCI Data Science AI Quick Actions provide a fast and easy way to deploy pre-trained LLMs without requiring extensive coding or configuration, making it the ideal choice for quickly integrating a model into a chatbot. What is the fastest way to integrate an LLM into its customer support chatbot?

Using AI Quick Actions to quickly deploy a pretrained LLM

37

The machine learning lifecycle follows a fixed order: Data Access, Data Exploration, Feature Exploration, Feature Engineering, and Modeling. A consulting firm has collected user data for the past three years. To increase profitability and draw useful inferences, a machine learning model needs to be built from the accumulated data. Which option has the correct order of the required machine learning tasks for building a model?

Data Access, Data Exploration, Feature Exploration, Feature Engineering, Modeling

38

You have trained a binary classifier for a loan application and saved this model to the model catalog. A colleague wants to examine the model and you need to share the model with your colleague. What are the model artifacts which can be shared?

Models, model metadata, hyperparameters, and metrics

39

You are a data scientist for a hardware company. The workflow follows a logical order, starting with preparing the model and ending with making predictions on new data. What is the correct sequence of steps to predict the revenue demand values for the upcoming quarter?

Prepare model, verify, save, deploy, predict.

40

A company is running a job in OCI Data Science Jobs and wants to ensure that the infrastructure is deprovisioned immediately after the job completes to avoid unnecessary costs. What happens when the job ends?

The infrastructure is automatically deprovisioned.

41

You are a data scientist working on the 'Education' dataset. You have decided to use Oracle AutoML Pipeline to automate your machine learning task and want to ensure that the two features 'Age' and 'Education' are part of the final model. To do so, set 'min_features' to a list containing these feature names; the other options offered are a range, a fraction, and a logical AND operator. To make sure these features are not dropped during the feature selection phase, what is the best way to define the min_features argument in your code?

min_features = ['Age', 'Education']

42

A data scientist is running a long-term experiment in an OCI notebook session. They need to save all the results even if they deactivate the session to reduce costs. What should they do?

Store results only in the block volume, as it is retained indefinitely.

43

You are developing a RAG application on OCI Data Science. What is the correct sequence of steps you would need to follow?

1. Load documents. 2. Split documents. 3. Embed documents. 4. Create vector database from documents. 5. Create retriever. 6. Create chain. 7. Prepare model artifacts. 8. Create model.

44

After running the Resource Manager stack, the user group is created automatically, but you must manually add users to this group to enable access to Data Science resources. What is the final step after running the Oracle Resource Manager stack for Data Science configuration?

Adding users to the automatically created user group

45

A data scientist updates an IAM policy to grant their notebook session access to an Object Storage bucket. However, the notebook still cannot access the bucket. What is the likely reason?

The resource principal token is still cached.

46

What is the primary advantage of using Conda environments in Data Science?

They enable isolated software configurations for different projects.

47

A company has trained a machine learning model and wants to fine-tune it by experimenting with hyperparameter values based on prior experience. Which approach should they take?

Define a custom search space with specific hyperparameter values.
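A custom search space can be pictured as a dictionary of hand-picked values per hyperparameter, expanded into candidate configurations. The sketch below uses only the standard library; the parameter names and values are hypothetical, and this is not the ADS tuner API.

```python
from itertools import product

# Hypothetical values chosen from prior experience
search_space = {
    "learning_rate": [0.01, 0.1],
    "max_depth": [3, 5, 7],
}

def candidates(space):
    """Yield every combination of the listed hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))

configs = list(candidates(search_space))
```

Each resulting dictionary is one configuration that a tuner could train and score.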

48

While working with Git on Oracle Cloud Infrastructure (OCI) Data Science, you notice that two of the operations are taking more time than the others due to your slow internet speed. Which two operations would take the most time?

Updating the local repo to match the remote repo
Pushing changes to a remote repository

49

When a model deployment is deactivated, its associated VM instances and load balancer are shut down, stopping billing. However, the model's metadata remains saved, allowing it to be reactivated later with the same endpoint. What happens when a model deployment in OCI Data Science is deactivated?

The model's HTTP endpoint becomes unavailable, but metadata is preserved.

50

After using AI Quick Actions, the deployed model can be invoked using API and CLI. How can you invoke your model?

Through API and CLI

51