(True/False) Machine Learning is a subset of Artificial Intelligence
A: False
B: True
True
(True/False) Deep Learning is a subset of Machine Learning
A: False
B: True
True
(True/False) Machine Learning consists of programming computers to learn from real-time human interactions
A: False
B: True
False
(True/False) AI Winters happened mostly due to the lack of understanding behind the theory of neural networks
A: True
B: False
True
Most modern applications that use computer vision, use models that were trained using this discipline:
A: Machine Learning
B: Artificial Intelligence
C: Deep Learning
Deep Learning
In the Machine Learning Workflow, the main goal of the Data Exploration and Preprocessing step is to:
A: Identify what data is best suited to find a solution to your business problem
B: Determine how to clean your data such that you can use it to train a model
Determine how to clean your data such that you can use it to train a model
What is the goal of supervised learning?
A: Predict the labels.
B: Find the target.
C: Find an underlying structure of the dataset without any labels.
D: Predict the features.
Predict the labels.
What is deep learning?
A: Deep learning is machine learning that involves deep neural networks.
B: Deep learning is another name for artificial intelligence.
C: Deep learning includes artificial intelligence and machine learning.
D: None of the above are correct.
Deep learning is machine learning that involves deep neural networks.
When is a standard machine learning algorithm usually a better choice than using deep learning to get the job done?
A: When working with small data sets.
B: When the data is steady over time.
C: When working with large data sets.
D: None of the above are correct.
When working with small data sets.
What is a Turing test?
A: It tests images.
B: It tests and cleans the dataset.
C: It tests the dataset.
D: It tests a machine's ability to exhibit intelligent behavior.
It tests a machine's ability to exhibit intelligent behavior.
What are some of the different milestones in deep learning history?
A: Geoffrey Hinton's work, AlexNet, and TensorFlow
B: Deep Blue defeats a world champion chess player and TensorFlow is released
C: Deep Blue defeats a world champion chess player, and AlexNet is created.
D: Deep Blue defeats a world champion chess player, and Keras is released.
Geoffrey Hinton's work, AlexNet, and TensorFlow
What is artificial intelligence?
A: A subset of deep learning.
B: Any program that can sense, reason, act, and adapt.
C: A subset of machine learning
D: None of the above.
Any program that can sense, reason, act, and adapt.
What are two spaces within AI that are going through drastic growth and innovation?
A: Language processing and deep learning.
B: Deep learning and machine learning.
C: Computer vision and natural language processing.
D: Computer vision and deep learning.
Computer vision and natural language processing.
Why has AI flourished so much in recent years?
A: Faster and inexpensive computers and data storage
B: Access to hardware for cleaning data
C: Stylish designed computers
D: Data storage in the cloud is much more expensive
Faster and inexpensive computers and data storage
How does Alexa use artificial intelligence?
A: Recognizes faces and pictures.
B: Recognizes our voice and answers questions.
C: Suggests who a person on a photo is.
D: None of the above answers are correct.
Recognizes our voice and answers questions.
What are the first two steps of a typical machine learning workflow?
A: Problem statement and data cleaning.
B: Problem statement and data collection.
C: Data collection and data transformation.
D: None of the above answers is correct.
Problem statement and data collection.
Which statement about the Pandas read_csv function is TRUE?
A: It reads data into a 2-dimensional NumPy array.
B: It can read both tab-delimited and space-delimited data.
C: It can only read comma-delimited data.
D: It allows only one argument: the name of the file.
It can read both tab-delimited and space-delimited data.
Which of the following is a reason to use JavaScript Object Notation (JSON) files for storing data?
A: Because the data is stored in a matrix format.
B: Because they can store NA values.
C: Because they can store NULL values.
D: Because they are cross-platform compatible.
Because they are cross-platform compatible.
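The portability claim is easy to see with Python's standard json module; the record fields below are made up for illustration:

```python
import json

# A JSON document is plain, language-agnostic text, which is what makes it
# cross-platform: any language with a JSON parser can read it back.
record = {'customer_id': 42, 'churned': False, 'score': 0.87}

text = json.dumps(record)    # serialize to a JSON string
restored = json.loads(text)  # parse it back into a Python dict

print(restored == record)  # True
```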
The data below appears in 'data.txt', and Pandas has been imported. Which Python command will read it correctly into a Pandas DataFrame?
63.03 22.55 39.61 40.48 98.67 -0.25 AB
39.06 10.06 25.02 29 114.41 4.56 AB
68.83 22.22 50.09 46.61 105.99 -3.53 AB
A: pandas.read_csv('data.txt')
B: pandas.read_csv('data.txt', header=None, sep=' ')
C: pandas.read_csv('data.txt', delim_whitespace=True)
D: pandas.read_csv('data.txt', header=0, delim_whitespace=True)
pandas.read_csv('data.txt', header=None, sep=' ')
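To see why that call parses the sample correctly, here is a minimal sketch that substitutes an in-memory string for data.txt:

```python
import io
import pandas as pd

# The three sample rows from the question, with no header row
raw = ("63.03 22.55 39.61 40.48 98.67 -0.25 AB\n"
       "39.06 10.06 25.02 29 114.41 4.56 AB\n"
       "68.83 22.22 50.09 46.61 105.99 -3.53 AB\n")

# header=None stops pandas from treating the first data row as column names;
# sep=' ' splits each row on the single-space delimiter
df = pd.read_csv(io.StringIO(raw), header=None, sep=' ')

print(df.shape)  # (3, 7): three rows, seven columns
```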
(True/False) Outliers must be very extreme to noticeably impact the fit of a statistical model.
A: True
B: False
False
(True/False) Outliers should always be replaced, since they never contain useful information about the data.
A: True
B: False
False
Which residual-based approach to identifying outliers compares running a model with all data to running the same model, but dropping a single observation?
A: Standardized residuals
B: Unstandardized residuals
C: Externally-studentized residuals
D: Abnormally-studentized residuals
Externally-studentized residuals
What is a CSV file?
A: CSV is a method of JavaScript Object Notation.
B: CSV files are rows of data or values separated by commas.
C: CSV makes data readily available for analytics, dashboards, and reports.
D: CSV files are a standard way to store data across platforms.
CSV files are rows of data or values separated by commas.
What are residuals?
A: Residuals are a method for handling identified outliers.
B: Residuals are the difference between the actual values and the values predicted by a given model.
C: Residuals are data removed from the dataframe.
D: Residuals are a method to standardize data.
Residuals are the difference between the actual values and the values predicted by a given model.
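The definition can be checked in a couple of lines; the actual and predicted values below are made up:

```python
import numpy as np

# Residuals are actual values minus the model's predicted values
actual = np.array([3.0, 5.0, 7.0])
predicted = np.array([2.5, 5.5, 7.0])

residuals = actual - predicted
print(residuals)  # [ 0.5 -0.5  0. ]
```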
If removal of rows or columns of data is not an option, why must we ensure that information is assigned for missing data?
A: Information must be assigned to prevent outliers.
B: Most models will not accept blank values in our data.
C: Missing data may bias the dataset.
D: Assigning information for missing data improves the accuracy of the dataset.
Most models will not accept blank values in our data.
What are the two main data problems companies face when getting started with artificial intelligence/machine learning?
A: Outliers and duplicated data
B: Lack of relevant data and bad data
C: Lack of training and expertise
D: Data sampling and categorization
Lack of relevant data and bad data
What does SQL stand for and what does it represent?
A: SQL stands for Structured Query Language, and it represents databases that are not relational, they vary in structure.
B: SQL stands for Sequential Query Language, and it represents a set of relational databases with fixed schemas.
C: SQL stands for Structured Query Language, and it represents a set of relational databases with fixed schemas.
D: SQL stands for Sequential Query Language, and it represents a set of sequential databases with fixed schemas.
SQL stands for Structured Query Language, and it represents a set of relational databases with fixed schemas.
What does NoSQL stand for and what does it represent?
A: NoSQL stands for Non-Structured Query Language, and it represents a set of non-relational databases with varied schemas.
B: NoSQL stands for Not-only SQL, and it represents a set of databases that are not relational, therefore, they vary in structure.
C: NoSQL stands for Non-Structured Query Language, and it represents a set of relational databases with fixed schemas.
D: NoSQL stands for Not-only SQL, and it represents a set of databases that are relational, therefore, they have fixed structure.
NoSQL stands for Not-only SQL, and it represents a set of databases that are not relational, therefore, they vary in structure.
What is a JSON file?
A: JSON stands for JavaString Object Notation, and they have very similar structure to Python Dictionaries.
B: JSON stands for JavaScript Object Notation, and it is a standard way to store the data across platforms.
C: JSON stands for JavaScript Object Notation, and it is a non-standard way to store the data across platforms.
D: JSON stands for JavaString Object Notation, and it is a standard way to store the data across platforms.
JSON stands for JavaScript Object Notation, and it is a standard way to store the data across platforms.
What is meant by messy data?
A: Duplicated or unnecessary data.
B: Inconsistent text and typos.
C: Missing data.
D: All of the above.
All of the above.
What is an outlier?
A: An outlier is a data point that has the highest or lowest value in the dataset.
B: An outlier is a data point that does not belong in our dataset.
C: An outlier is a data point that is very close to the mean value of all observations.
D: An outlier is an observation in a dataset that is distant from most other observations.
An outlier is an observation in a dataset that is distant from most other observations.
How do we identify outliers in our dataset?
A: We can only identify outliers visually through building plots.
B: We can identify outliers only by calculating the minimum and maximum values in the dataset.
C: We can identify outliers both visually and with statistical calculations.
D: We can only identify outliers by using some statistical calculations.
We can identify outliers both visually and with statistical calculations.
From the options listed below, select the option that is NOT a valid exploratory data approach to visually confirm whether your data is ready for modeling or if it needs further cleaning or data processing:
A: Create a panel plot that shows distributions for the dependent variable and scatter plots for all independent variables
B: Train a model and identify the observations with the largest residuals
C: Create visualizations for scatter plots, histograms, box plots, and hexbin plots
D: Create a correlation heatmap to confirm the sign and magnitude of correlation across your features.
Create a correlation heatmap to confirm the sign and magnitude of correlation across your features.
These are two of the most common libraries for data visualization:
A: matplotlib and seaborn
B: scipy and seaborn
C: numpy and matplotlib
D: scipy and numpy
matplotlib and seaborn
(True/False) You can use the pandas library to create plots.
A: True
B: False
True
(True/False) Classification models require that input features be scaled.
A: True
B: False
False
(True/False) Feature scaling allows better interpretation of distance-based approaches.
A: True
B: False
True
(True/False) Feature scaling reduces distortions caused by variables with different scales.
A: True
B: False
True
Which scaling approach converts features to standard normal variables?
A: MinMax scaling
B: Standard scaling
C: Robust scaling
D: Nearest neighbor scaling
Standard scaling
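A minimal NumPy sketch of the formula behind standard scaling (the values are made up); sklearn's StandardScaler applies the same transformation per feature:

```python
import numpy as np

# Standard scaling: subtract the mean and divide by the standard deviation,
# so the scaled feature has mean 0 and standard deviation 1
x = np.array([10.0, 20.0, 30.0, 40.0])
scaled = (x - x.mean()) / x.std()

print(scaled.mean(), scaled.std())  # mean ~0, std ~1
```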
Which variable transformation should you use for ordinal data?
A: Min-max scaling
B: Standard scaling
C: One-hot encoding
D: Ordinal encoding
Ordinal encoding
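Ordinal encoding can be sketched in plain Python; the size categories here are a made-up example:

```python
# Ordinal encoding maps ordered categories to integers that preserve the
# order of the categories (small < medium < large).
order = ['small', 'medium', 'large']
mapping = {category: rank for rank, category in enumerate(order)}

sizes = ['medium', 'small', 'large', 'small']
encoded = [mapping[s] for s in sizes]

print(encoded)  # [1, 0, 2, 0]
```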
What are polynomial features?
A: They are higher order relationships in the data.
B: They are represented by linear relationships in the data.
C: They are logistic regression coefficients.
D: They are lower order relationships in the data.
They are higher order relationships in the data.
What does the Box-Cox transformation do?
A: It transforms categorical variables into numerical variables.
B: It makes the data more left-skewed.
C: It transforms the data distribution into a more symmetrical bell curve.
D: It makes the data more right-skewed.
It transforms the data distribution into a more symmetrical bell curve.
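A short sketch of the effect, assuming SciPy is available; the right-skewed sample is synthetic:

```python
import numpy as np
from scipy import stats

# Log-normal samples are strongly right-skewed and strictly positive,
# which the Box-Cox transformation requires
rng = np.random.default_rng(0)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=1000)

# boxcox returns the transformed data and the fitted lambda parameter
transformed, lam = stats.boxcox(skewed)

# The transformed distribution is far more symmetrical than the original
print(stats.skew(skewed), stats.skew(transformed))
```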
Select three important reasons why EDA is useful.
A: To determine if the data makes sense, to determine whether further data cleaning is needed, and to help identify patterns and trends in the data
B: To analyze data sets, to determine the main characteristics of data sets, and to use sampling to examine data
C: To examine correlations, to sample from dataframes, and to train models on random samples of data
D: To utilize summary statistics, to create visualizations, and to identify outliers
To determine if the data makes sense, to determine whether further data cleaning is needed, and to help identify patterns and trends in the data
What assumption does the linear regression model make about data?
A: This model assumes an addition of each one of the model parameters multiplied by a coefficient.
B: This model assumes that raw data in data sets is on the same scale.
C: This model assumes a transformation of each parameter to a linear relationship.
D: This model assumes a linear relationship between predictor variables and outcome variables.
This model assumes a linear relationship between predictor variables and outcome variables.
What is skewed data?
A: Data that has a normal distribution.
B: Raw data that may not have a linear relationship.
C: Raw data that has undergone log transformation.
D: Data that is distorted away from normal distribution; may be positively or negatively skewed.
Data that is distorted away from normal distribution; may be positively or negatively skewed.
Select the two primary types of categorical feature encoding.
A: Log and polynomial transformation
B: Nominal encoding and ordinal encoding
C: Encoding and scaling
D: One-hot encoding and ordinal encoding
One-hot encoding and ordinal encoding
Which scaling approach puts values between zero and one?
A: Min-max scaling
B: Robust scaling
C: Standard scaling
D: Nearest neighbor scaling
Min-max scaling
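The formula behind min-max scaling, sketched in NumPy with made-up values; sklearn's MinMaxScaler does the same per feature:

```python
import numpy as np

# Min-max scaling maps the smallest value to 0 and the largest to 1
x = np.array([5.0, 10.0, 15.0, 25.0])
scaled = (x - x.min()) / (x.max() - x.min())

print(scaled)  # [0.   0.25 0.5  1.  ]
```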
Which variable transformation should you use for nominal data with multiple different values within the feature?
A: Ordinal encoding
B: Standard scaling
C: One-hot encoding
D: Min-max scaling
One-hot encoding
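A quick sketch with pandas.get_dummies (the colour values are made up); each nominal value becomes its own 0/1 indicator column:

```python
import pandas as pd

# One-hot encoding a nominal feature with several distinct values
df = pd.DataFrame({'colour': ['red', 'green', 'blue', 'green']})
encoded = pd.get_dummies(df, columns=['colour'])

# One indicator column per category, in alphabetical order
print(list(encoded.columns))  # ['colour_blue', 'colour_green', 'colour_red']
```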
(True/False) In general, the population parameters are unknown.
A: True.
B: False.
True.
(True/False) Parametric models have a finite number of parameters.
A: True.
B: False.
True.
The most common way of estimating parameters in a parametric model is:
A: using the maximum likelihood estimation
B: using the central limit theorem
C: extrapolating a non-parametric model
D: extrapolating Bayesian statistics
using the maximum likelihood estimation
A p-value is:
A: the smallest significance level at which the null hypothesis would be rejected
B: the probability of the null hypothesis being true
C: the probability of the null hypothesis being false
D: the smallest significance level at which the null hypothesis is accepted
the smallest significance level at which the null hypothesis would be rejected
A Type 1 error is defined as:
A: Saying the null hypothesis is false, when it is actually true
B: Saying the null hypothesis is true, when it is actually false
Saying the null hypothesis is false, when it is actually true
You find through a graph that there is a strong correlation between Net Promoter Score and the time that customers spend on a website. Select the TRUE assertion:
A: There is an underlying factor that explains this correlation, but manipulating the time that customers spend on a website may not affect the Net Promoter Score they will give to the company
B: To boost the Net Promoter Score of a business, you need to increase the time that customers spend on a website.
There is an underlying factor that explains this correlation, but manipulating the time that customers spend on a website may not affect the Net Promoter Score they will give to the company
Which one of the following is common to both machine learning and statistical inference?
A: Using sample data to make inferences about a hypothesis.
B: Using population data to make inferences about a null sample.
C: Using population data to model a null hypothesis.
D: Using sample data to infer qualities of the underlying population distribution.
Using sample data to infer qualities of the underlying population distribution.
Which one of the following describes an approach to customer churn prediction stated in terms of probability?
A: Data related to churn may include the target variable for whether a certain customer has left.
B: Churn prediction is a data-generating process representing the actual joint distribution between our x and the y variable.
C: Predicting a score for individuals that estimates the probability the customer will stay.
D: Predicting a score for individuals that estimates the probability the customer will leave.
Predicting a score for individuals that estimates the probability the customer will leave.
What is customer lifetime value?
A: The total purchases over the time during which the person is a customer.
B: The total churn a customer generates in the population.
C: The total churn generated by a customer over their lifetime.
D: The total value that the customer receives during their life.
The total purchases over the time during which the person is a customer.
Which one of the following statements about the normalized histogram of a variable is true?
A: It is a non-parametric representation of the population variance.
B: It provides an estimate of the variable's probability distribution.
C: It serves as a bar chart for the null hypothesis.
D: It is a parametric representation of the population distribution.
It provides an estimate of the variable's probability distribution.
The outcome of rolling a fair die can be modelled as a _______ distribution.
A: Poisson
B: log-normal
C: uniform
D: normal
uniform
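A quick simulation of the claim: with many rolls, each face of a fair die comes up with probability close to 1/6:

```python
import numpy as np

# Simulate a fair six-sided die: a discrete uniform distribution over 1..6
rng = np.random.default_rng(42)
rolls = rng.integers(1, 7, size=60_000)

counts = np.bincount(rolls)[1:]    # occurrences of faces 1..6
proportions = counts / len(rolls)  # each should be close to 1/6

print(proportions)
```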
Which one of the following features best distinguishes the Bayesian approach to statistics from the Frequentist approach?
A: Frequentist statistics incorporates the probability of the hypothesis being true.
B: Bayesian statistics incorporate the probability of the hypothesis being true.
C: Frequentist statistics requires construction of a prior distribution.
D: Bayesian statistics is better than Frequentist.
Bayesian statistics incorporate the probability of the hypothesis being true.
Which of the following best describes what a hypothesis is?
A: A hypothesis is a statement about a posterior distribution.
B: A hypothesis is a statement about a prior distribution.
C: A hypothesis is a statement about a population.
D: A hypothesis is a statement about a sample of the population.
A hypothesis is a statement about a population.
A Type 2 error in hypothesis testing is _____________________:
A: correctly rejecting the alternative hypothesis.
B: incorrectly accepting the null hypothesis.
C: correctly rejecting the null hypothesis.
D: incorrectly accepting the alternative hypothesis.
incorrectly accepting the null hypothesis.
Which statement best describes a consequence of a type II error in the context of a churn prediction example? Assume that the null hypothesis is that customer churn is due to chance, and that the alternative hypothesis is that customers enrolled for greater than two years will not churn over the next year.
A: You correctly conclude that a customer will eventually churn
B: You correctly conclude that customer churn is by chance
C: You incorrectly conclude that there is no effect
D: You incorrectly conclude that customer churn is by chance
You incorrectly conclude that customer churn is by chance
Which of the following is a statistic used for hypothesis testing?
A: The acceptance region.
B: The standard deviation.
C: The likelihood ratio.
D: The rejection region.
The likelihood ratio.
Predicting payment default, whether a transaction is fraudulent, and whether a customer will be part of the top 5% spenders on a given year, are examples of:
A: classification
B: regression
classification
(True/False) It is less concerning to treat a Machine Learning model as a black box for prediction purposes, compared to interpretation purposes:
A: True
B: False
True
Predicting total revenue, number of customers, and percentage of returning customers are examples of:
A: classification
B: regression
regression
(True/False) The Sum of Squared Errors (SSE) can be used to select the best-fitting regression model.
A: True
B: False
True
(True/False) The R-squared value from estimating a linear regression model will almost always increase if more features are added.
A: True
B: False
True
(True/False) The Total Sum of Squares (TSS) can be used to select the best-fitting regression model.
A: True
B: False
False
You can use supervised machine learning for all of the following examples, EXCEPT:
A: Segment customers by their demographics.
B: Predict the number of customers that will visit a store on a given week.
C: Predict the probability of a customer returning to a store.
D: Interpret the main drivers that determine if a customer will return to a store.
Segment customers by their demographics.
The autocorrect on your phone is an example of:
A: Unsupervised learning
B: Supervised learning
C: Semi-supervised learning
D: Reinforcement learning
Supervised learning
This is the type of Machine Learning that uses both data with labeled outcomes and data without labeled outcomes:
A: Supervised Machine Learning
B: Unsupervised Machine Learning
C: Mixed Machine Learning
D: Semi-Supervised Machine Learning
Semi-Supervised Machine Learning
Which option describes a way of turning a regression problem into a classification problem?
A: Create a new variable that flags 1 for above a certain value and 0 otherwise
B: Use outlier treatment
C: Use missing value handling
D: Create a new variable that uses autoencoding to transform a continuous outcome into categorical
Create a new variable that flags 1 for above a certain value and 0 otherwise
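That flagging step can be sketched in a few lines; the revenue values and the threshold are made up:

```python
import numpy as np

# Turning a regression target into a classification target:
# flag 1 for values above a chosen threshold, 0 otherwise
revenue = np.array([120.0, 80.0, 300.0, 45.0])
threshold = 100.0

labels = (revenue > threshold).astype(int)
print(labels)  # [1 0 1 0]
```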
This is the syntax you need to predict new data after you have trained a linear regression model called LR:
A: LR=predict(X_test)
B: LR.predict(X_test)
C: LR.predict(LR, X_test)
D: predict(LR, X_test)
LR.predict(X_test)
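In context, a minimal sketch with a tiny made-up dataset (generated from y = 2x + 1):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny synthetic training set following y = 2x + 1
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])
X_test = np.array([[4.0]])

LR = LinearRegression()
LR.fit(X_train, y_train)          # learn coefficients from the training data
predictions = LR.predict(X_test)  # predict for new, unseen data

print(predictions)  # ~[9.]
```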
All of these options are useful error measures to compare regressions except:
A: SSE
B: R squared
C: TSS
D: ROC index
ROC index
All of the listed below are part of the Machine Learning Framework, except:
A: Observations
B: Features
C: Parameters
D: None of the above
None of the above
Select the option that is the most INACCURATE regarding the definition of Machine Learning:
A: Machine Learning allows computers to learn from data
B: Machine Learning allows computers to infer predictions for new data
C: Machine Learning is a subset of Artificial Intelligence
D: Machine Learning is automated and requires no programming
Machine Learning is automated and requires no programming
In Linear Regression, which statement about model evaluation is the most accurate?
A: Model selection involves choosing a model that minimizes the cost function.
B: Model estimation involves choosing parameters that minimize the cost function.
C: Model estimation involves choosing a cost function that can be compared across models.
D: Model selection involves choosing modeling parameters that minimize in-sample validation error.
Model estimation involves choosing parameters that minimize the cost function.
When learning about regression, we saw the outcome as a continuous number. Given the options below, which is an example of regression?
A: A fraudulent charge
B: Under certain circumstances determine if a person is a Republican or Democrat
C: Customer churn
D: Housing prices
Housing prices
What is another term for the testing data?
A: Training data
B: Unseen data
C: Corroboration data
D: Cross validation data
Unseen data
(True/False) The ShuffleSplit will ensure that there is no bias in your outcome variable.
A: True
B: False
True
Select the option with the syntax to obtain the data splits you will need to train a model, with a test split that is a third of the size of your available data.
A: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)
B: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
C: X_train, y_test = train_test_split(X, y, test_size=0.33)
D: X_train, y_test = train_test_split(X, y, test_size=0.5)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
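A runnable sketch of that split on a small made-up dataset (random_state is added here only to make the split reproducible):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 15 made-up observations with 2 features each
X = np.arange(30).reshape(15, 2)
y = np.arange(15)

# test_size=0.33 reserves roughly a third of the rows for the test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)

print(len(X_train), len(X_test))  # 10 5
```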
What is the main goal of adding polynomial features to a linear regression?
A: Remove the linearity of the regression and turn it into a polynomial model.
B: Capture the relation of the outcome with features of higher order.
C: Increase the interpretability of a black box model.
D: Ensure similar results across all folds when using K-fold cross validation.
Capture the relation of the outcome with features of higher order.
What are the most common sklearn methods to add polynomial features to your data?
Note: polyFeat = PolynomialFeatures(degree)
A: polyFeat.add and polyFeat.transform
B: polyFeat.add and polyFeat.fit
C: polyFeat.fit and polyFeat.transform
D: polyFeat.transform
polyFeat.fit and polyFeat.transform
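The fit/transform pair in action on a tiny made-up feature matrix:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0], [3.0]])

polyFeat = PolynomialFeatures(degree=2)
polyFeat.fit(X)                 # learn the output feature layout
X_poly = polyFeat.transform(X)  # expand each row into [1, x, x^2]

print(X_poly)  # [[1. 2. 4.]
               #  [1. 3. 9.]]
```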
How can you adjust the standard linear approach to regression when dealing with fundamental problems such as prediction or interpretation?
A: Create a class instance
B: Add some non-linear patterns, i.e., polynomial features
C: Import the transformation method
D: By transforming the data
Add some non-linear patterns, i.e., polynomial features
The main purpose of splitting your data into training and test sets is:
A: To improve accuracy
B: To avoid overfitting
C: To improve regularization
D: To improve crossvalidation and overfitting
To avoid overfitting
Complete the following sentence: The training data is used to fit the model, while the test data is used to:
A: measure the parameters and hyperparameters of the model
B: tweak the model hyperparameters
C: tweak the model parameters
D: measure error and performance of the model
measure error and performance of the model
What term is used if your test data leaks into the training data?
A: Test leakage
B: Training leakage
C: Data leakage
D: Historical data leakage
Data leakage
Which one of the terms below uses a linear combination of features?
A: Binomial Regression
B: Linear Regression
C: Multiple Regression
D: Polynomial Regression
Linear Regression
When splitting your data, what is the purpose of the training data?
A: Compare with the actual value
B: Fit the actual model and learn the parameters
C: Predict the label with the model
D: Measure errors
Fit the actual model and learn the parameters
Polynomial features capture what effects?
A: Non-linear effects.
B: Linear effects.
C: Multiple effects.
D: Regression effects.
Non-linear effects.
Which fundamental problems are being solved by adding non-linear patterns, such as polynomial features, to a standard linear approach?
A: Prediction.
B: Interpretation.
C: Prediction and Interpretation.
D: None of the above.
Prediction and Interpretation.
Testing data can also be referred to as:
A: Training data
B: Unseen data
C: Corroboration data
D: None of the above
Unseen data
Select the correct syntax to obtain the data split that will result in a train set that is 60% of the size of your available data.
A: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.6)
B: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
C: X_train, y_test = train_test_split(X, y, test_size=0.40)
D: X_train, y_test = train_test_split(X, y, test_size=0.6)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
What is the correct sklearn syntax to add a third degree polynomial to your model?
A: polyFeat = polyFeat.add(degree=3)
B: polyFeat = polyFeat.fit(degree=3)
C: polyFeat = PolynomialFeatures(degree=3)
D: polyFeat = polyFeat.transform(degree=3)
polyFeat = PolynomialFeatures(degree=3)
(True/False) In the model complexity versus error diagram, the model complexity increases as the training error decreases.
A: True
B: False
False
(True/False) In the model complexity versus error diagram, there is an inflection point after which, as the cross-validation error increases, so does the complexity of the model.
A: True
B: False
True
(True/False) In the model complexity versus error diagram, the right side of the curve is where the model is underfitted and the left side of the curve is where the model is overfitted.
A: True
B: False
False
In K-fold cross-validation, how will increasing k affect the variance (across subsamples) of estimated model parameters?
A: Increasing k will not affect the variance of estimated parameters.
B: Increasing k will usually reduce the variance of estimated parameters.
C: Increasing k will usually increase the variance of estimated parameters.
D: Increasing k will increase the variance of estimated parameters if models are underfit, but reduce it if models are overfit.
Increasing k will usually increase the variance of estimated parameters.