CSC 308. Exam 1 (Ch 1-4)


71 Terms

1

Chapter 1

2
  1. This code chains the head() method to the mean() method:

    df.head().mean()

    df.mean.head()()

    df.mean()df.head()

    df.mean().head()

df.mean().head()
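
As a minimal sketch of the chaining in this card: the DataFrame below is hypothetical (its column names and values are made up for illustration). mean() runs first and returns a Series of column means, and head() is then called on that Series.

```python
import pandas as pd

# Hypothetical data, not from the course files
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30], "c": [5, 5, 5]})

# mean() returns a Series with one mean per column;
# head(2) then keeps only the first two entries of that Series
first_two = df.mean().head(2)
print(first_two)
```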

3
  1. The Pandas module provides methods for 

    data visualization

    plotting

    data analysis

    data analysis and plotting

data analysis and plotting

4
  1. Data analysis includes all but one of the following:

    data mining

    descriptive analysis

    predictive analysis

    data visualization

data mining

5
  1. The whos Magic Command displays

    the names of all of the variables in the Notebook

    the names and memory use of the active variables

    the names of the active variables

    the names and data types of the active variables

the names and data types of the active variables

6
  1. This Magic Command returns the run time for the entire cell:

    $time

    %time

    $$time

    %%time

%%time

7
  1. A runtime error occurs when a Python statement 

    can’t be run because it’s out of sequence

    has valid syntax but can’t be executed

    violates one of the rules of Python coding

    can’t be run because the data is faulty

has valid syntax but can’t be executed

8
  1. This is a list comprehension for the numbers 1 through 19:

    pd.comprehension(1,20)

    (x for x in range(1,20))

    [x for y in range(1,20)]

    [x for x in range(1,20)]

[x for x in range(1,20)]
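
One detail worth checking for this card is that range() excludes its stop value. The short sketch below (variable names are only illustrative) compares range(1, 20) with range(1, 21) and shows that parentheses produce a generator rather than a list.

```python
# range(start, stop) excludes stop, so range(1, 20) ends at 19
nums_to_19 = [x for x in range(1, 20)]

# To include 20, the stop value has to be 21
nums_to_20 = [x for x in range(1, 21)]

# Parentheses instead of brackets build a generator expression, not a list
gen = (x for x in range(1, 20))

print(nums_to_19[-1], nums_to_20[-1], type(gen).__name__)
```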

9
  1. This code creates a dictionary:

    ('1-4':'01-04 Years','5-9':'05-09 Years')

    {'1-4':'01-04 Years','5-9':'05-09 Years'}

    '1-4':'01-04 Years';'5-9':'05-09 Years'

    ['1-4':'01-04 Years','5-9':'05-09 Years']

{'1-4':'01-04 Years','5-9':'05-09 Years'}

10
  1. To get the tooltip for a method, you place the cursor

    in the name of the method or the parentheses of the method and press Shift+Enter

    anywhere in the cell and press Shift+Tab

    in the name of the method or the parentheses of the method and press Shift+Tab

    in the parentheses for the method and press Shift+Tab

in the name of the method or the parentheses of the method and press Shift+Tab

11
  1. This Markdown language creates a level 2 heading for “Clean the data”:

    <h2>Clean the data</h2>

    ##Clean the data##

    ##Clean the data

    %%Clean the data%%

##Clean the data

12
  1. What kind of error would the following code generate? df[['year']

    Logic error

    Runtime error

    Syntax error

    Memory error

Syntax error
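
To make the difference between a syntax error and a runtime error concrete, here is a small sketch. It uses the built-in compile() function so the unbalanced brackets can be observed without stopping the script; the dictionary is hypothetical.

```python
# A syntax error is detected before any code runs.
# compile() lets us observe it safely: the brackets below are unbalanced.
try:
    compile("df[['year']", "<cell>", "eval")
    syntax_ok = True
except SyntaxError:
    syntax_ok = False

# A runtime error: the syntax is valid, but the statement can't be executed
record = {"year": 1900}  # hypothetical data
try:
    record["month"]      # valid syntax, but the key doesn't exist
    runtime_error = None
except KeyError as err:
    runtime_error = type(err).__name__

print(syntax_ok, runtime_error)
```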

13
  1. This code imports the pandas module with the name pd:

    from urllib import pd

    import pandas as pd

    from pandas import pd

    import pd from pandas

import pandas as pd

14
  1. To run a cell in JupyterLab, you can

    click the + button in the toolbar

    press Shift+Enter

    press Ctrl+Shift+Enter

    select the Run > Current Cell command

press Shift+Enter

15
  1. Although there is some overlap between the phases, data analysis consists of these phases:

    get, clean, prepare, and analyze

    get, clean, analyze, and visualize

    clean, prepare, analyze, and visualize

    get, clean, prepare, analyze, and visualize

get, clean, prepare, analyze, and visualize

16

Chapter 2

17
  1. This statement returns all columns in the fires DataFrame in which the row value in the Year column is equal to 1900:

    fires.Year.query(1900)

    fires.query(Year == 1900)

    fires.query(fires.Year == 1900)

    fires.query('Year == 1900')

fires.query('Year == 1900')
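
A minimal sketch of the query() call in this card, using a small hypothetical fires DataFrame (the values are made up). The condition is passed as a string, and all columns are returned for the matching rows.

```python
import pandas as pd

# Hypothetical stand-in for the fires data
fires = pd.DataFrame({"Year": [1900, 1900, 1950],
                      "fire_size": [10.0, 250.0, 40.0]})

# The condition is a string; column names are referenced inside it
fires_1900 = fires.query("Year == 1900")

# Equivalent Boolean-mask form, shown only for comparison
same_rows = fires[fires.Year == 1900]

print(fires_1900)
```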

18
  1. To apply more than one method to a group of columns in a DataFrame, you can

    chain the agg() method to the groupby() method

    chain the agg() method to the pivot() method

    use the pivot() method

    use the groupby() method

chain the agg() method to the groupby() method
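
As a sketch of chaining agg() to groupby(), the example below applies two aggregate methods to one column of a hypothetical DataFrame. The state column and all values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: fire sizes recorded by state
df = pd.DataFrame({"state": ["CA", "CA", "TX", "TX"],
                   "fire_size": [10.0, 30.0, 5.0, 15.0]})

# groupby() forms the groups; agg() applies more than one method to each group
summary = df.groupby("state")["fire_size"].agg(["mean", "max"])
print(summary)
```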

19
  1. To get data into a DataFrame, you can either import the data

    from a file or database or you can use the DataFrame() constructor to build it

    from a file or database, or you can read it from a pickle file

    from a file or you can use the DataFrame() constructor to build it

    from a file or you can read it from a pickle file

from a file or database or you can use the DataFrame() constructor to build it

20
  1. For each column in a DataFrame, the info() method returns

    the name, the number of non-null values, and the data type

    the name, the number of unique values, and the data type

    the number of non-null values, the number of unique values, and the data type

    the name, the number of non-null values, and the number of unique values

the name, the number of non-null values, and the data type

21
  1. To prepare a DataFrame for plotting by the Pandas plot() method, you can use all but one of the following to set an index. Which one is it?

    groupby() method

    pivot() method

    melt() method

    index() method

melt() method

22
  1. This statement mortality_data[['AgeGroup','DeathRate']].max()

    returns the maximum value for just the DeathRate column

    causes a syntax error because brackets are coded within brackets

    causes a runtime error because AgeGroup isn’t a numeric column

    returns the maximum value for two columns

returns the maximum value for two columns
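
The sketch below illustrates why this statement works on a string column as well as a numeric one: max() compares strings alphabetically. The rows are a small hypothetical sample, not the full mortality data.

```python
import pandas as pd

# Hypothetical sample rows
mortality_data = pd.DataFrame({
    "AgeGroup": ["01-04 Years", "05-09 Years"],
    "DeathRate": [1983.8, 466.1],
})

# The inner brackets hold a list of column names; max() is applied to each column.
# For the string column, the "maximum" is the last value in alphabetical order.
result = mortality_data[["AgeGroup", "DeathRate"]].max()
print(result)
```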

23
  1. This statement displays all of the columns but only five of the rows of a DataFrame named fires:

    A. with pd.option_context('display.max_rows', 5,

    'display.max_columns', None): display(fires)

    B. with pd.option_context('display.max_rows', 5,

    'display.max_columns', All): display(fires)

    C. with pd.option_context('display.max_rows', 5,

    'display.max_columns', Yes): display(fires)

    D. with pd.option_context('display.max_rows', None,

    'display.max_columns', None): display(fires)

A. with pd.option_context('display.max_rows', 5,

'display.max_columns', None): display(fires)
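
A small sketch of how option_context() behaves, assuming a hypothetical fires DataFrame. It uses print() instead of display() so it also runs outside a notebook, and it checks that the option is only changed inside the with block.

```python
import pandas as pd

# Hypothetical data with more than five rows
fires = pd.DataFrame({"Year": range(1900, 1920), "fire_size": range(20)})

default_max_rows = pd.get_option("display.max_rows")

# The options apply only inside the with block
with pd.option_context("display.max_rows", 5, "display.max_columns", None):
    inside = pd.get_option("display.max_rows")
    print(fires)  # truncated to five rows of data

after = pd.get_option("display.max_rows")  # back to the previous setting
```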

24
  1. This statement sorts the rows in the fires DataFrame by the fire_size variable from largest to smallest:

    fires.sort_values('fire_size', ascending=False)

    fires.sort_values('fire_size')

    fires.sort('fire_size', ascending=False)

    fires.sort_values('fire_size', descending=True)

fires.sort_values('fire_size', ascending=False)
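
A minimal sketch of the sort, assuming the column is named fire_size as in the question and using hypothetical values. Note that sort_values() returns a new DataFrame and leaves the original order unchanged.

```python
import pandas as pd

# Hypothetical data
fires = pd.DataFrame({"fire_name": ["A", "B", "C"],
                      "fire_size": [40.0, 250.0, 10.0]})

# ascending=False sorts from largest to smallest
largest_first = fires.sort_values("fire_size", ascending=False)
print(largest_first)
```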

25
  1. After this statement is executed

       mortality_wide = mortality_data.pivot(

             index='Year', columns='AgeGroup', values='DeathRate')

    the values in the AgeGroup column will be summarized by Year

    each unique value in the AgeGroup column will be a column name

    each value in the AgeGroup column will be a column name

    each value in the DeathRate column will be a column name

each unique value in the AgeGroup column will be a column name
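
To see what the pivot() call does, here is a sketch with a few hypothetical rows of long data. Each unique AgeGroup value becomes a column name, and Year becomes the index.

```python
import pandas as pd

# A few illustrative rows in long form
mortality_data = pd.DataFrame({
    "Year": [1900, 1900, 1901, 1901],
    "AgeGroup": ["01-04 Years", "05-09 Years", "01-04 Years", "05-09 Years"],
    "DeathRate": [1983.8, 466.1, 1695.0, 427.6],
})

mortality_wide = mortality_data.pivot(
    index="Year", columns="AgeGroup", values="DeathRate")
print(mortality_wide)
```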

26
  1. The statement below uses

        mortality_data.AgeGroup.replace(

                {'1-4 Years':'01-04 Years','5-9 Years':'05-09 Years'})

    a Pandas method to replace the data in one column

    a Python method to replace the data in one column

    a Python method to replace the data in one column, but doesn’t change the DataFrame

    a Pandas method to replace the data in one column, but doesn’t change the DataFrame

a Pandas method to replace the data in one column, but doesn’t change the DataFrame
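
The following sketch shows why the DataFrame isn't changed: replace() returns a new Series, so the result has to be assigned back to the column. The three rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample of the AgeGroup column
mortality_data = pd.DataFrame(
    {"AgeGroup": ["1-4 Years", "5-9 Years", "10-14 Years"]})
mapping = {"1-4 Years": "01-04 Years", "5-9 Years": "05-09 Years"}

# replace() returns a new Series; the DataFrame itself is untouched
replaced = mortality_data.AgeGroup.replace(mapping)
unchanged = mortality_data.AgeGroup.tolist()

# Assigning the result back is what actually changes the DataFrame
mortality_data["AgeGroup"] = mortality_data.AgeGroup.replace(mapping)
```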

27
  1. Assume that this URL points to a valid CSV file:

    url = 'https://www.murach.com/python_analysis/forest_fires.csv'

    Then, this statement imports a DataFrame named fires from the data in the CSV file:

    fires.read_csv(url)

    fires = pd.import_csv(url)

    fires.import_csv(url)

    fires = pd.read_csv(url)

fires = pd.read_csv(url)

28
  1. The describe() method returns these statistics for each numeric column in a DataFrame

    row count, column count, mean, minimum value, and maximum value

    row count, unique value count, mean, standard deviation, and maximum value

    row count, mean, standard deviation, minimum value, and maximum value

    row count, non-null count, mean, standard deviation, min value, and max value

row count, mean, standard deviation, minimum value, and maximum value

29
  1. A DataFrame consists of an index, column data,

    column labels, and attributes

    row labels, column data types, and attributes

    column labels, and column data types

    column labels, column data types, and attributes

column labels, column data types, and attributes

30
  1. This code accesses the fire_size column in the fires DataFrame:

    fires.'fire_size'

    fires[fire_size]

    fires.[fire_size]

    fires.fire_size

fires.fire_size

31
  1. A Series object

    does provide methods for working with the data but a DataFrame doesn’t

    doesn’t provide methods for working with the data but a DataFrame does

    has only one column but a DataFrame can have one or more

    can have one or more columns but a DataFrame has only one

has only one column but a DataFrame can have one or more

32
  1. This statement saves a DataFrame named fires in a pickle file named fires.pkl:

    fires.to_pickle('fires.pkl')

    fires.to_pkl('fires')

    fires.to_pkl('fires.pkl')

    fires.to_pickle('fires')

fires.to_pickle('fires.pkl')
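
A round-trip sketch of saving and restoring a DataFrame with a pickle file. It writes to a temporary directory (an assumption made here so the example cleans up after itself) and uses a hypothetical fires DataFrame.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data
fires = pd.DataFrame({"Year": [1900, 1950], "fire_size": [10.0, 40.0]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fires.pkl")
    fires.to_pickle(path)            # save the DataFrame
    restored = pd.read_pickle(path)  # read it back

print(restored.equals(fires))
```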

33
  1. When you melt the data in four columns of a DataFrame, you

    end up with the same number of rows

    end up with twice as many rows

    melt the data into one column

    end up with four times as many rows

end up with four times as many rows
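
The row count in this card can be checked with a small sketch. The wide DataFrame below is hypothetical: it has two rows and four value columns, so melting those four columns yields eight rows.

```python
import pandas as pd

# Hypothetical wide data: one row per year, four value columns
wide = pd.DataFrame({"Year": [1900, 1901],
                     "A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8]})

# Each of the four melted columns contributes one row per original row
long = wide.melt(id_vars="Year", value_vars=["A", "B", "C", "D"],
                 var_name="AgeGroup", value_name="DeathRate")

print(len(wide), len(long))
```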

34
  1. To access a subset of rows and columns, you can

    use a slice to get the columns and dot notation or brackets to get the rows

    use a slice to get the rows and dot notation or brackets to get the columns

    use the query() method to get the rows and dot notation or brackets to get the columns

    use the query() method to get the columns and dot notation or brackets to get the rows

use the query() method to get the rows and dot notation or brackets to get the columns

35
  1. Assume that the rows variable contains tabular data and the names variable contains the column names for the data. Then, this statement builds a DataFrame named fires:

    DataFrame(data=rows, columns=names)

    fires = pd.DataFrame(data=rows, columns=names)

    fires.to_dataframe(data=rows, columns=names)

    fires = pd.to_dataframe(data=rows, columns=names)

fires = pd.DataFrame(data=rows, columns=names)
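
A minimal sketch of the constructor call, with hypothetical contents for rows and names. The shape and columns attributes are printed to confirm the result.

```python
import pandas as pd

# Hypothetical tabular data and column names
rows = [[1900, 10.0], [1950, 40.0]]
names = ["Year", "fire_size"]

fires = pd.DataFrame(data=rows, columns=names)

print(fires.shape)          # (number of rows, number of columns)
print(list(fires.columns))  # the column names
```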

36
  1. This statement displays the first and last columns and rows of a DataFrame named fires:

    fires

    fires.head().tail()

    fires.tail()

    fires.head()

fires

37
  1. The shape attribute of a DataFrame tells you the number of

    rows

    rows and columns

    columns

    elements

rows and columns

38
  1. The columns attribute of a DataFrame tells you the

    data types of the columns

    number of columns

    names of the columns

    size of the columns

names of the columns

39
  1. When you set an index for a DataFrame,

    the index can’t have duplicate values

    you can verify that the index doesn’t have duplicate values

    the index must be based on a single column

    you can’t return to the original index

you can verify that the index doesn’t have duplicate values
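
One way to verify that an index has no duplicates is the verify_integrity parameter of set_index(), sketched below with hypothetical data that deliberately repeats a year. The sketch also shows reset_index(), which returns to the original integer index.

```python
import pandas as pd

# Hypothetical data with a duplicate Year value
df = pd.DataFrame({"Year": [1900, 1900, 1901],
                   "DeathRate": [1.0, 2.0, 3.0]})

# verify_integrity=True raises a ValueError if the new index has duplicates
try:
    df.set_index("Year", verify_integrity=True)
    duplicate_found = False
except ValueError:
    duplicate_found = True

# Without the check, duplicates are allowed, and reset_index() undoes the index
indexed = df.set_index("Year")
restored = indexed.reset_index()
```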

40

Chapter 3

41
  1. One of the following isn’t a data visualization library for Python. Which one is it?

    numpy

    altair

    ggplot

    seaborn

numpy

42
  1. When compared to long data, wide data

    has more columns and fewer rows and works better for Pandas plots

    has more columns and fewer rows and works better for Seaborn plots

    has fewer columns and more rows and works better for Pandas plots

has more columns and fewer rows and works better for Pandas plots

43
  1. If you don’t set any parameters, the Pandas plot() method creates a line plot with the index values

    on the y-axis and all numeric columns on the x-axis

    on the y-axis and all columns on the x-axis

    on the x-axis and all columns on the y-axis

    on the x-axis and all numeric columns on the y-axis

on the x-axis and all numeric columns on the y-axis

44
  1. This type of plot shows the relationships between two columns of data, often over time:

    line plot

    histogram

    bar plot

    box plot

line plot

45
  1. This type of plot shows the frequency of the datapoints:

    bar plot

    histogram

    line plot

    box plot

histogram

46
  1. This type of plot is used to chart data in categories:

    histogram

    box plot

    line plot

    bar plot

bar plot

47
  1. Bar plots

    are horizontal while histograms are vertical

    plot the values of the data while histograms plot the distribution of the values

    plot the distribution of the values while histograms plot the values of the data

    use bins for the data but histograms don’t

plot the values of the data while histograms plot the distribution of the values

48
  1. When you use Pandas to create a plot with subplots

    each subplot can have its own title

    the layout parameter determines whether grid lines are displayed

    the subplots must share the x and y axis values

    long data usually works the best

each subplot can have its own title

49
  1. Refer to the mortality DataFrames. This code creates a line plot with a different colored line for each age group:

    mortality_long.plot.line()

    mortality_wide.plot.line()

    mortality_long.plot.line

    mortality_wide.plot.line

mortality_wide.plot.line()

50
  1. Refer to the mortality DataFrames. This code creates a histogram that puts the death rates into 8 bins:

    mortality_long.plot.hist(y='DeathRate', bins=8)

    mortality_long.plot.hist(x='DeathRate', bins=8)

    mortality_wide.plot.hist(x='DeathRate', bins=8)

    mortality_wide.plot.hist(y='DeathRate', bins=8)

mortality_long.plot.hist(y='DeathRate', bins=8)

51
  1. Refer to the mortality DataFrames. This code creates a scatter plot of the death rates in each of the four age groups by year, but with all of the dots the same color:

    mortality_long.plot.scatter(x='Year', y='DeathRate')

    mortality_wide.plot.scatter(x='Year', y='AgeGroup')

    mortality_wide.plot.scatter(x='Year', x='AgeGroup')

    mortality_long.plot.scatter(x='Year', x='DeathRate')

mortality_long.plot.scatter(x='Year', y='DeathRate')

52
  1. Refer to the mortality DataFrames. This code creates a line plot with the x-axis ranging from the year 2000 through 2018 and the y-axis ranging from 0 through 100:

    mortality_wide.plot.line(xlim=(2000,2018), ylim=(0,100))

    mortality_wide.plot.line(x_limit=(2000,2018), y_limit=(0,100))

    mortality_wide.plot.line(x_limit=(2000:2018), y_limit=(0:100))

    mortality_wide.plot.line(xlim=(2000:2018), ylim=(0:100))

mortality_wide.plot.line(xlim=(2000,2018), ylim=(0,100))

53
  1. Refer to the mortality DataFrames. This code creates a line plot with four subplots in two rows that share both the x-axis and the y-axis:

    mortality_wide.plot.line(sharey=True, subplots=True, layout=(2:2))

    mortality_wide.plot.line(sharex=True, sharey=True, layout=(2,2))

    mortality_wide.plot.line(sharex=True, sharey=True, layout=(2:2))

    mortality_wide.plot.line(sharey=True, subplots=True, layout=(2,2))

mortality_wide.plot.line(sharey=True, subplots=True, layout=(2,2))
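
A rough sketch of the subplot layout, assuming matplotlib is installed and using a hypothetical mortality_wide DataFrame with four age-group columns. The non-interactive Agg backend is selected so the code runs without a display, and sharex is written out explicitly here for clarity.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import pandas as pd

# Hypothetical wide data: Year as the index, one column per age group
mortality_wide = pd.DataFrame(
    {"01-04 Years": [1983.8, 1695.0, 1655.7],
     "05-09 Years": [466.1, 427.6, 403.3],
     "10-14 Years": [298.3, 273.6, 252.5],
     "15-19 Years": [484.8, 454.4, 421.5]},
    index=pd.Index([1900, 1901, 1902], name="Year"))

# subplots=True draws one subplot per column; layout=(2,2) arranges them in two rows
axes = mortality_wide.plot.line(
    subplots=True, layout=(2, 2), sharex=True, sharey=True)

print(axes.shape)
```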

54
  1. Refer to the mortality DataFrames. This code creates a bar plot for the mean of each age group:

    mortality_wide.mean().plot.bar()

    mortality_wide.plot.bar().agg=mean

    mortality_wide.plot.bar().mean()

    mortality_wide.mean().plot()

mortality_wide.mean().plot.bar()

55
  1. Refer to the mortality DataFrames. This code creates a horizontal bar plot for each of the four age groups but for just the years 1900 and 2018:

    mortality_long.query('Year in (1900,2018)').plot.bar()

    mortality_wide.query('Year in (1900,2018)').plot.bar()

    mortality_long.query('Year in (1900,2018)').plot.barh()

    mortality_wide.query('Year in (1900,2018)').plot.barh()

mortality_wide.query('Year in (1900,2018)').plot.barh()

56

Chapter 4

57
  1. Unlike Seaborn’s specific methods for plotting, its general methods

    return Axes objects and provide for categorical plots

    return FacetGrid objects and provide for subplots

    return Axes objects and provide for subplots

    return FacetGrid objects and provide for categorical plots

return FacetGrid objects and provide for subplots

58
  1. A Seaborn bar plot is a type of

    relational plot

    distribution plot

    categorical plot

    linear model plot

categorical plot

59
  1. A Seaborn histogram is a type of

    categorical plot

    relational plot

    distribution plot

    linear model plot

distribution plot

60
  1. A Seaborn distribution plot shows

    the distribution of the data in each category

    the relative distribution of the data in each category

    how numeric data is distributed across a range of values

    how data is distributed across a range of values

how numeric data is distributed across a range of values

61
  1. The basic parameters for most Seaborn general plots are

    kind x, y, and legend

    data, kind, x, and y

    data, kind x, y, and legend

    kind, x, and y

data, kind, x, and y

62
  1. The one type of Seaborn plot that doesn’t require a y parameter is a

    relational plot

    distribution plot

    scatter plot

    categorical plot

distribution plot

63
  1. The confidence interval in a Seaborn line plot

    is 95 percent

    shows the likelihood that new datapoints will fall within the top and bottom limits that are shown

    shows the top and bottom limits of past data

    shows the top and bottom values in the data that’s plotted

shows the likelihood that new datapoints will fall within the top and bottom limits that are shown

64
  1. To create a Seaborn plot that has subplots, you need to use these parameters:

    col, col_wrap, and aspect

    col and col_wrap

    col, col_wrap, and hue

    hue and col_wrap

col and col_wrap

65
  1. To add a title and a y label to a specific Seaborn plot, you can use code like this:

    A. ax = sns.lineplot(data=mortality_data,

    x='Year', y='DeathRate', hue='AgeGroup')

    ax.set(title='Deaths by Age Group', ylabel='Death Rate')

    B. sns.lineplot(data=mortality_data,

    x='Year', y='DeathRate', hue='AgeGroup')

    ax.set(title='Deaths by Age Group', ylabel='Death Rate')

    C. sns.lineplot(data=mortality_data,

    x='Year', y='DeathRate', hue='AgeGroup')

    g.set(title='Deaths by Age Group', ylabel='Death Rate')

    D. g = sns.lineplot(data=mortality_data,

    x='Year', y='DeathRate', hue='AgeGroup')

    ax.set(title='Deaths by Age Group', ylabel='Death Rate')

A. ax = sns.lineplot(data=mortality_data,

x='Year', y='DeathRate', hue='AgeGroup')

ax.set(title='Deaths by Age Group', ylabel='Death Rate')

66
  1. To add a title and a y label to a general Seaborn plot, you can use code like this:

    A. sns.relplot(data=mortality_data, kind='line',

    x='Year', y='DeathRate', hue='AgeGroup', aspect=1.5)

    ax.set(title='Deaths by Age Group', ylabel='Death Rate')

    B. g = sns.relplot(data=mortality_data, kind='line',

    x='Year', y='DeathRate', hue='AgeGroup', aspect=1.5)

    ax.set(title='Deaths by Age Group', ylabel='Death Rate')

    C. g = sns.relplot(data=mortality_data, kind='line',

    x='Year', y='DeathRate', hue='AgeGroup', aspect=1.5)

    for ax in g.flat:

        ax.set(title='Deaths by Age Group', ylabel='Death Rate')

    D. g = sns.relplot(data=mortality_data, kind='line',

    x='Year', y='DeathRate', hue='AgeGroup', aspect=1.5)

    for ax in g.axes.flat:

        ax.set(title='Deaths by Age Group', ylabel='Death Rate')

D. g = sns.relplot(data=mortality_data, kind='line',

x='Year', y='DeathRate', hue='AgeGroup', aspect=1.5)

for ax in g.axes.flat:

    ax.set(title='Deaths by Age Group', ylabel='Death Rate')

67
  1. This code adds a super title to a Seaborn plot that has its FacetGrid object in a variable named g:

    g.ax.fig.suptitle('Deaths by Age Group (1910-1930)', y=1.025)

    g.suptitle('Deaths by Age Group (1910-1930)', y=1.025)

    g.fig.suptitle('Deaths by Age Group (1910-1930)', y=1.025)

    g.ax.suptitle('Deaths by Age Group (1910-1930)', y=1.025)

g.fig.suptitle('Deaths by Age Group (1910-1930)', y=1.025)

68
  1. This code saves a Seaborn line plot that has its FacetGrid object in a variable named g to a file in PNG format:

    g.ax.get_figure().savefig('lineChart.png')

    g.ax.savefig('lineChart.png')

    g.savefig('lineChart.png')

    g.get_figure().savefig('lineChart.png')

g.savefig('lineChart.png')

69
  1. Refer to the mortality DataFrame. This code creates a scatter plot for the death rates by year with the dots for each age group in a different color:

    A. sns.relplot(data=mortality, kind='scatter',

    x='Year', y='DeathRate', hue='AgeGroup')

    B. sns.relplot(data=mortality, kind='scatter',

    x='Year', y='DeathRate')

    C. sns.relplot(data=mortality, kind='scatter',

    x='Year', y='DeathRate', palette='AgeGroup')

    D. sns.relplot(data=mortality, kind='scatter',

    y='Year', x='DeathRate')

A. sns.relplot(data=mortality, kind='scatter',

x='Year', y='DeathRate', hue='AgeGroup')

70
  1. Refer to the mortality DataFrame. This code creates a vertical bar plot for the death rates in 1950 and 2000 with no confidence interval shown:

    A. sns.catplot(data=mortality.query('Year in (1950,2000)'),

    kind='bar', y='Year', x='DeathRate', ci=None)

    B. sns.catplot(data=mortality.query('Year in (1950,2000)'),

    kind='bar', x='Year', y='DeathRate', ci=None)

    C. sns.catplot(data=mortality.query('Year in (1950,2000)'),

    kind='bar', x='Year', y='DeathRate')

    D. sns.catplot(data=mortality.query('Year in (1950,2000)'),

    kind='bar', y='Year', x='DeathRate')

B. sns.catplot(data=mortality.query('Year in (1950,2000)'),

kind='bar', x='Year', y='DeathRate', ci=None)

71
  1. Refer to the mortality DataFrame. This code creates a KDE plot with one subplot for the death rates in each age group and two subplots in each row:

    A. sns.displot(data=mortality, kind='kde', x='DeathRate',

    col='DeathRate', row_wrap=2)

    B. sns.displot(data=mortality, kind='kde', x='DeathRate', hue='AgeGroup',

    col='AgeGroup', col_wrap=2)

    C. sns.displot(data=mortality, kind='kde', x='DeathRate', hue='AgeGroup',

    col='AgeGroup', row_wrap=2)

    D. sns.displot(data=mortality, kind='kde', x='DeathRate',

    col='DeathRate', col_wrap=2)

B. sns.displot(data=mortality, kind='kde', x='DeathRate', hue='AgeGroup',

col='AgeGroup', col_wrap=2)