Descriptive Data Analysis

Description and Tags

Flashcards on Descriptive Data Analysis, Field Techniques, and Data Analysis.

46 Terms

1

Statistics

The science that studies methods for collecting, organizing, summarizing, and analyzing data, and for drawing valid conclusions about the population under study.

2

Population

The complete set of elements that share a given characteristic and about which conclusions are to be drawn.

3

Parameter

A descriptive property of the population.

4

Sample

A subset formed by elements of the population.

5

Statistic

A descriptive property of the sample.

6

Descriptive Statistics

The set of techniques for numerically describing a set of elements (the sample). The results are not intended to generalize beyond the data set.

7

Statistical Inference

The set of techniques for drawing valid conclusions about a population from a sample of it. The results generalize beyond the collected data set.

8

Variable

A characteristic of the sample or population that is being observed and that varies among the different individuals in the study. It collects all possible values that the characteristic of interest takes.

9

Qualitative Variable

Expresses a quality. Categories are either nominal (not ordered) or ordinal (ordered).

10

Quantitative Variable

Expresses a quantity. Can be discrete (countable values, such as counts) or continuous (any value within an interval).

11

Discretization

The process of transforming a quantitative variable into a qualitative one by grouping its values into categories or intervals; the grouping loses information.
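
A minimal sketch of discretization, assuming hypothetical age cut points and labels that are not part of the card. It shows where the information is lost: distinct ages end up in the same category.

```python
from bisect import bisect_right

def discretize(value, edges, labels):
    """Map a number to a category label.

    edges are sorted cut points; a value equal to a cut point
    falls in the category to its right. len(labels) == len(edges) + 1.
    """
    return labels[bisect_right(edges, value)]

# Hypothetical cut points and labels, chosen only for illustration.
edges = [18, 65]
labels = ["minor", "adult", "senior"]

ages = [4, 17, 18, 40, 65, 80]
categories = [discretize(age, edges, labels) for age in ages]
print(categories)
```

Ages 18 and 40 both become "adult", so the original values cannot be recovered from the categories.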

12

Absolute Frequency (ni)

Number of times the value xi is repeated.

13

Relative Frequency (fi)

Proportion of times the value xi is repeated (ni / N).

14

Absolute Cumulative Frequency (Ni)

Sum of the ni of all values less than or equal to xi.

15

Relative Cumulative Frequency (Fi)

Sum of the fi of all values less than or equal to xi.
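
The four frequencies (ni, fi, Ni, Fi) can be built in one pass over the sorted distinct values. The sketch below uses a small made-up data set; the function name `frequency_table` is an assumption for illustration.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(values):
    """Return one row (x_i, n_i, f_i, N_i, F_i) per distinct value, sorted by x_i."""
    counts = Counter(values)
    total = len(values)                      # N, the number of observations
    xs = sorted(counts)
    n = [counts[x] for x in xs]              # absolute frequencies n_i
    cum_n = list(accumulate(n))              # absolute cumulative frequencies N_i
    return [(x, ni, ni / total, Ni, Ni / total)
            for x, ni, Ni in zip(xs, n, cum_n)]

# Made-up observations, for illustration only.
data = [2, 3, 3, 1, 2, 3, 4, 2, 3, 1]
table = frequency_table(data)
for row in table:
    print(row)
```

The last row always has Ni equal to N and Fi equal to 1, which is a quick check that the table is complete.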

16

Graphical Representations

Present information quickly and intuitively, but can be misleading if not constructed correctly.

17

Pie Chart (Diagrama de Sectores)

A circle with a sector for each value, with the angle proportional to its frequency; suitable for nominal qualitative variables.

18

Bar Chart (Diagrama de Barras)

A rectangle for each value of the variable, with height proportional to its frequency; used to compare categories.

19

Histogram

A rectangle for each interval, with area proportional to the fraction of data falling in the interval; used for continuous quantitative variables, or for discrete ones that take many values.

20

Map

Shows the spatial distribution of a characteristic of interest; any variable can be represented on it.

21

Arithmetic Mean

The sum of all values in the distribution divided by the total number of data points. Only applicable to quantitative variables.

22

Mode

The value that appears most frequently in a dataset.

23

Median

The value that divides the ordered distribution into two parts of equal frequency: half of the observations lie at or below it and half at or above it.
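
The three measures of central tendency can be compared on the same data with Python's standard `statistics` module. The data below are made up; note that the mode also works for qualitative data, whereas the mean does not.

```python
import statistics

# Made-up data, for illustration only.
scores = [3, 5, 5, 6, 8, 9]
colors = ["red", "blue", "red", "green"]

mean_score = statistics.mean(scores)        # sum of values / number of values
median_score = statistics.median(scores)    # average of the two middle values here
modes_score = statistics.multimode(scores)  # list of all most frequent values
mode_color = statistics.mode(colors)        # the mode is defined for categories too

print(mean_score, median_score, modes_score, mode_color)
```

With an even number of observations, `statistics.median` averages the two central values, which is one common convention.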

24

Quantiles

Values that divide the distribution into intervals of equal frequency.

25

Quartiles

Three values that divide the distribution into four parts of equal frequency.

26

Deciles

Nine values that divide the distribution into 10 parts of equal frequency.

27

Percentiles

Ninety-nine values that divide the distribution into 100 parts of equal frequency.
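
Quartiles, deciles, and percentiles are the same computation with a different number of parts. The sketch below uses `statistics.quantiles` from the standard library on made-up data; several interpolation conventions exist, so the values from a textbook formula may differ slightly from the library's default method.

```python
import statistics

# Made-up data, for illustration only.
data = [1, 2, 2, 3, 4, 5, 7, 8, 9, 12]

quartiles = statistics.quantiles(data, n=4)      # 3 cut points
deciles = statistics.quantiles(data, n=10)       # 9 cut points
percentiles = statistics.quantiles(data, n=100)  # 99 cut points

print(len(quartiles), len(deciles), len(percentiles))
print(quartiles)
```

Dividing into n parts always yields n - 1 cut points, which is why there are three quartiles, nine deciles, and ninety-nine percentiles.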

28

Range

Difference between the maximum and minimum values; sensitive to outliers.

29

Interquartile Range (IQR)

Difference between the third and first quartiles; represents the dispersion of the central 50% of the data.

30

Variance (S^2)

The mean of the squared deviations of the data from the arithmetic mean; it measures the dispersion of the data around the mean.

31

Standard Deviation (S)

Square root of the variance; expressed in the same units as the data.

32

Coefficient of Variation (CV)

A dimensionless measure of relative dispersion, computed as the standard deviation divided by the absolute value of the mean; it allows comparison of distributions with different units.
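
The dispersion measures can be computed side by side. The sketch below assumes the descriptive formulas that divide by N (`pvariance` and `pstdev` in the standard library) rather than the N - 1 sample versions; the cards do not say which is intended, and the data are made up.

```python
import statistics

# Made-up data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)           # maximum minus minimum
q1, _, q3 = statistics.quantiles(data, n=4)  # first and third quartiles
iqr = q3 - q1                                # spread of the central 50%
variance = statistics.pvariance(data)        # mean squared deviation (divides by N)
std_dev = statistics.pstdev(data)            # square root of the variance
cv = std_dev / abs(statistics.mean(data))    # unit-free relative dispersion

print(data_range, iqr, variance, std_dev, cv)
```

Changing the largest value from 9 to 90 changes the range a great deal but leaves the interquartile range almost untouched, which is the sense in which the range is sensitive to outliers.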

33

Bivariate Descriptive Analysis

The joint study of two variables to determine whether there is any relationship between them.

34

Scatter Plot

Shows the relationship between two quantitative variables using points on a Cartesian plane.

35

Marginal Distribution of Y

Expresses how many times each value yj is repeated, regardless of the X value.

36

Marginal Distribution of X

Expresses how many times each value xi is repeated, regardless of the Y value.

37

Conditional Distribution

Describes how the values of X are distributed for each value of Y (X|Y = yj) or vice versa (Y|X = xi).

38

Statistical Independence

Two variables are statistically independent when, for every pair of values (xi, yj), the joint relative frequency is equal to the product of the marginal relative frequencies.
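
Marginal distributions, conditional distributions, and the independence check can all be derived from the joint relative frequencies. The sketch below is one possible way to do it; the function names and the paired observations are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import isclose

def joint_relative(pairs):
    """Joint relative frequency of each observed (x, y) pair."""
    total = len(pairs)
    return {xy: n / total for xy, n in Counter(pairs).items()}

def marginals(joint):
    """Marginal relative frequencies of X and of Y."""
    fx, fy = defaultdict(float), defaultdict(float)
    for (x, y), f in joint.items():
        fx[x] += f   # sum over all y for this x
        fy[y] += f   # sum over all x for this y
    return dict(fx), dict(fy)

def conditional_x_given_y(joint, y):
    """Distribution of X restricted to the observations with Y == y."""
    f_y = sum(f for (_, yj), f in joint.items() if yj == y)
    return {x: f / f_y for (x, yj), f in joint.items() if yj == y}

def is_independent(joint, tol=1e-9):
    """True when every joint frequency equals the product of its marginals."""
    fx, fy = marginals(joint)
    return all(isclose(joint.get((x, y), 0.0), fx[x] * fy[y], abs_tol=tol)
               for x in fx for y in fy)

# Made-up paired observations, for illustration only.
independent_pairs = [(0, "u")] * 2 + [(0, "v")] * 4 + [(1, "u")] * 1 + [(1, "v")] * 2
dependent_pairs = [(0, "u")] * 3 + [(1, "v")] * 3

joint_a = joint_relative(independent_pairs)
joint_b = joint_relative(dependent_pairs)
print(is_independent(joint_a), is_independent(joint_b))
print(conditional_x_given_y(joint_a, "u"))
```

In the independent example, the conditional distribution of X is the same for every value of Y and matches the marginal distribution of X, which is an equivalent way of stating independence.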

39

Covariance (Sxy)

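A measure of the linear association between two quantitative variables: the mean of the products of the deviations of X and Y from their respective means. Its sign gives the direction of the relationship; its magnitude depends on the units of measurement.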

40

Statistical Relationship between Variables

Studying the degree of dependence between variables (correlation) and determining the function that best expresses the relationship (regression).

41

Linear Correlation Coefficient (r)

A dimensionless measure of the degree of linear dependence between two quantitative variables, ranging from -1 to 1; values near 0 indicate no linear relationship.
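
Covariance and the linear correlation coefficient are closely linked: r is the covariance divided by the product of the two standard deviations. The sketch below assumes the descriptive formulas that divide by N, and the data are made up.

```python
from math import sqrt

def covariance(xs, ys):
    """Mean of the products of deviations from the means (divides by N)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Linear correlation coefficient r = Sxy / (Sx * Sy)."""
    s_x = sqrt(covariance(xs, xs))   # covariance of X with itself is its variance
    s_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (s_x * s_y)

# Made-up data: ys is exactly 2 * xs, so the relationship is perfectly linear.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
ys_scaled = [100 * y for y in ys]    # same data in different units

print(covariance(xs, ys), correlation(xs, ys))
print(covariance(xs, ys_scaled), correlation(xs, ys_scaled))
```

Rescaling Y multiplies the covariance by 100 but leaves r unchanged, which is why r is preferred for judging the strength of a linear relationship.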

42

Dependent Variable

The variable whose behavior is to be explained or predicted (Y).

43

Independent or Explanatory Variable

The variable used to try to explain the behavior of the dependent variable (X).

44

Regression of Y on X

A function that gives the predicted value of variable Y for each value of X.

45

Simple Linear Regression

Regression restricted to linear fits, that is, straight lines of the form y* = f(x) = a + bx.

46

Coefficient of Determination (r^2)

Indicates the proportion of the variability of Y explained by the fitted model; a measure of how well the model fits the data.
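
A minimal sketch of fitting the line y* = a + bx and computing the coefficient of determination. It assumes an ordinary least squares fit, which the cards do not name explicitly: b = Sxy / Sx^2 and a = mean(y) - b * mean(x). The function names and data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for y* = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs) / n
    b = s_xy / s_xx              # slope: covariance over variance of X
    a = mean_y - b * mean_x      # the line passes through (mean_x, mean_y)
    return a, b

def r_squared(xs, ys, a, b):
    """Share of the variability of Y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    ss_tot = sum((y - mean_y) ** 2 for y in ys)                   # total
    return 1 - ss_res / ss_tot

# Made-up data lying close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.2, 3.8, 6.1, 8.0, 9.9]

a, b = fit_line(xs, ys)
r2 = r_squared(xs, ys, a, b)
print(a, b, r2)
```

For simple linear regression fitted this way, r^2 equals the square of the linear correlation coefficient r between X and Y.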