Bitch ass CS


Description and Tags

SLU 1030-McKenzie


79 Terms

1

Object-Oriented Programming (OOP)

A programming paradigm based on the concept of "objects," which can contain data and code to manipulate that data.

2

Encapsulation

Implementation details are hidden (encapsulated) inside objects, so outside code works with an object through its public interface.

3

Inheritance

Child classes can inherit attributes and methods from parent classes, including parent classes defined in other modules.

4

Polymorphism

Objects and names can exist in many forms: the same attribute or method name can exist in multiple classes and mean something different in each.
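
A minimal sketch of this idea. The classes `Dog` and `Robot` and the function `describe` are hypothetical names made up for illustration; they are not from the course material.

```python
class Dog:
    def speak(self):
        return "Woof"


class Robot:
    def speak(self):
        return "Beep"


def describe(thing):
    # The same method name is looked up on whatever object is passed in,
    # so each class supplies its own meaning for speak().
    return thing.speak()


sounds = [describe(obj) for obj in (Dog(), Robot())]
print(sounds)
```

Neither class inherits from the other; sharing the method name is enough for `describe` to work with both.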

5

Abstraction

Handling a concept rather than its implementation details.

6

Interpreters

Analyzes source code, compiles it to bytecode, and runs that bytecode on the Python Virtual Machine (PVM). The interpreter we use for Python is called CPython.

7

Scope

LEGB rule: Python looks up a name in the Local, Enclosing, Global, and Built-in scopes, in that order.

8

Python Lists

Ordered, changeable collections in Python, written with square brackets. A list can be homogeneous or heterogeneous. Arithmetic operators do not act element-wise on a list. A Python list is 1-dimensional by default; an N-dimensional list can be built, but it is still a 1-D list storing other 1-D lists. The elements of a list need not be contiguous in memory.
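
A short sketch of these list properties, using made-up values:

```python
mixed = [1, "two", 3.0]             # heterogeneous elements are allowed
mixed[0] = "one"                    # lists are changeable (mutable)

numbers = [1, 2, 3]
repeated = numbers * 2              # repeats the list; it does not double each element
doubled = [n * 2 for n in numbers]  # element-wise work needs a loop or comprehension

grid = [[1, 2], [3, 4]]             # a "2-D" list is a list that stores other lists
print(repeated, doubled, grid[1][0])
```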

9

Numpy Arrays


Structured lists of numbers: vectors, matrices, images, tensors (as used in ConvNets). Arrays can have any number of dimensions, including zero (a scalar). Arrays are typed: np.uint8, np.int64, np.float32, np.float64. Arrays are dense: each element of the array exists and has the same type. Arrays are faster than Python lists and consume less memory. Arrays can only be combined when their shapes match or are compatible under broadcasting.

10

Data Types in Python

Various types including str, int, float, list, dict, set, bool, bytes, and NoneType, each serving different purposes.

11

Classes

Contain three types of methods: static, class, and instance.

12

Truthiness

The evaluation of values as true or false in conditional statements. In Python, False, None, zero, and empty containers such as "", [], and {} are falsy; most other values are truthy.
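
A quick sketch that sorts a handful of example values by how Python treats them in a condition. The list `candidates` is just an illustrative sample.

```python
candidates = [0, 0.0, "", [], {}, set(), None, False, 1, "a", [0], -1]

# "if not value" is true exactly for the falsy values.
falsy = [value for value in candidates if not value]
truthy = [value for value in candidates if value]

print("falsy:", falsy)
print("truthy:", truthy)
```

Note that `[0]` is truthy: the list is non-empty even though its only element is falsy.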

13

Central Tendencies

Measures (mean, median, mode) that represent the center point or “typical” value of a dataset. As a rule, we replace null values with the mean when the data is normally distributed and replace null values with the median when the data is skewed.
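
A small sketch of the rule above using Python's `statistics` module. The sample values and the helper `fill_missing` are hypothetical, made up for illustration.

```python
from statistics import mean, median, mode

values = [2, 3, 3, 4, 100]       # hypothetical, right-skewed sample

print(mean(values))    # pulled upward by the outlier
print(median(values))  # middle value, robust to the outlier
print(mode(values))    # most frequent value


def fill_missing(data, use_median):
    """Replace None entries with the median (skewed data) or the mean."""
    known = [x for x in data if x is not None]
    fill = median(known) if use_median else mean(known)
    return [fill if x is None else x for x in data]


filled = fill_missing([2, None, 3, 3, 4, 100], use_median=True)
print(filled)
```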

14

Correlation Coefficient

A statistical measure indicating the strength and direction of a relationship between two variables, ranging from -1 to 1.

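The card does not say which coefficient is meant. Assuming the common Pearson coefficient r, a minimal sketch of the calculation (the function name `pearson_r` and the data are made up for illustration):

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])     # perfectly increasing
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # perfectly decreasing
print(r_up, r_down)
```
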
15

Random Variables

A random variable, X, is a variable whose possible values are outcomes of a random phenomenon. The probability of any event is between 0 and 1, inclusive. The probabilities of all outcomes sum to 1.

16

Random State

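A seeded source of pseudo-random numbers in computing. Fixing the seed (for example, through the random_state parameter in scikit-learn) makes the "random" results reproducible.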
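
A minimal sketch of the idea with Python's built-in `random` module (an assumption for illustration; the card may have NumPy or scikit-learn in mind). The seeds 42 and 7 are arbitrary.

```python
import random

first = random.Random(42)    # 42 is an arbitrary seed chosen for illustration
second = random.Random(42)   # same seed, so the same sequence
other = random.Random(7)     # different seed, independent sequence

draws_first = [first.randint(1, 100) for _ in range(5)]
draws_second = [second.randint(1, 100) for _ in range(5)]
draws_other = [other.randint(1, 100) for _ in range(5)]

print(draws_first == draws_second)
print(draws_other)
```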

17

Central Limit Theorem

A statistical principle stating that the distribution of sample means approaches a normal distribution regardless of the population's distribution, given a large enough sample size.

18

Discrete Data

Countable data values, such as whole-number counts.

19

Continuous Data

Data that can take any numeric value within a range, so there are infinitely many possible values.

20

Law of Large Numbers

As the sample size increases, the sample mean gets closer to the population mean.
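
A small simulation sketch of this statement, assuming draws from a uniform distribution on [0, 1), whose population mean is 0.5. The seed and sample sizes are arbitrary choices.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
population_mean = 0.5   # mean of a uniform draw on [0, 1)


def sample_mean(n):
    """Average of n pseudo-random uniform draws."""
    return sum(rng.random() for _ in range(n)) / n


# Gap between sample mean and population mean for growing sample sizes.
errors = {n: abs(sample_mean(n) - population_mean) for n in (10, 1_000, 100_000)}
for n, err in errors.items():
    print(n, round(err, 4))
```

The gap is not guaranteed to shrink at every step, but it tends toward zero as n grows.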

21

Scales of Measurement

Different levels of data categorization including nominal, ordinal, interval, and ratio, each with unique properties.

22

Stdin and Stdout

Standard input and output streams for data processing in programming.

23

Data Cleaning

The process of correcting errors and inconsistencies in data to improve quality.

24

Preprocessing

Cleaning up all null values and other data cleaning (dashes, odd characters, handling missing values and extreme outliers). One-hot encoding, or more generally converting categorical features to numerical ones (sometimes this is the same thing as one-hot encoding). Standardization/normalization. Dealing with multicollinearity, which can be caused by 3
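
A minimal sketch of two of the steps named above, one-hot encoding and standardization, in plain Python. The helper names `one_hot` and `standardize` and the sample columns are hypothetical; in practice a library such as pandas or scikit-learn would typically handle this.

```python
from statistics import mean, pstdev


def one_hot(categories):
    """Turn a list of category labels into 0/1 indicator rows."""
    levels = sorted(set(categories))
    return levels, [[1 if c == level else 0 for level in levels] for c in categories]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


levels, encoded = one_hot(["red", "blue", "red"])   # hypothetical column
scaled = standardize([10.0, 20.0, 30.0])            # hypothetical column
print(levels, encoded, scaled)
```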

25

Machine Learning

A branch of artificial intelligence where algorithms use data to improve automatically through experience, without explicit programming. These algorithms identify patterns in large datasets, encompassing numbers, words, images, and more, enabling predictions or decisions. ML allows systems to adapt to new data without human intervention.

26

Supervised Learning

A type of machine learning that uses labeled training data to predict outcomes. Classification predicts discrete categories; regression predicts continuous values.

27

Unsupervised Learning

A type of machine learning that does not use labeled data, focusing on finding patterns in the data, such as clusters.

28

Cross-Validation

Testing the performance of a machine learning model by training it on one subset of the data and testing it on a different subset, usually repeated over several splits. The subsets can be sampled with or without replacement.
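
One common variant is k-fold cross-validation, which splits without replacement. A minimal sketch of how the splits are formed, with no shuffling and a hypothetical helper name `k_fold_indices`:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first few folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=6, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each sample lands in exactly one test fold; a model would be trained and scored once per fold and the scores averaged.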

29

Overfitting

A problem that occurs when the model fits too closely to the training data and cannot generalize to new data.

30

Underfitting

A problem that occurs when the model is overgeneralized: it is too simple to capture the patterns in the training data.

31

Feature Selection

The process of selecting a subset of relevant features for model training to improve performance and reduce complexity.

32

KNN (K-Nearest Neighbors)

Predicts the group of a data point based on majority “votes” from its nearest neighbors. K is the hyperparameter that indicates how many neighboring data points any new data point must listen to in order to decide what class it is in.
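
A minimal sketch of the voting idea, assuming Euclidean distance (the card does not name a distance measure). The function `knn_predict` and the data are made up for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, new_point, k):
    """Label new_point by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], new_point),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8)]   # hypothetical data
labels = ["a", "a", "a", "b", "b"]
prediction = knn_predict(points, labels, new_point=(2, 2), k=3)
print(prediction)
```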

33

Linear Regression

Models and predicts the relationship between independent and dependent variables. Univariate linear regression predicts a dependent variable from ONE independent variable, whereas multiple linear regression predicts a dependent variable from MULTIPLE independent variables. Univariate form: y = mx + b
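
A minimal sketch of fitting y = mx + b with one independent variable, assuming an ordinary least squares fit. The function `fit_line` and the points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b with one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b = mean_y - m * mean_x
    return m, b


m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])   # hypothetical points on y = 2x + 1
prediction = m * 4 + b
print(m, b, prediction)
```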

34

Multiple Regression

Predicts a dependent variable from MULTIPLE independent variables: y = m1x1 + m2x2 + m3x3 + ... + b. “Multivariate” means the result is a vector. To choose features, we look at correlations and compare the R² values before and after a feature is added; sklearn.feature_selection has many functions to assist with feature selection.

35

Logistic Regression

Used to predict the answer to a yes/no question or any other binary question. The response follows an S-shaped curve.
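
A small sketch of the S-shaped curve, assuming it is the standard logistic (sigmoid) function. The coefficients `m` and `b` and the helper `predict_yes` are made up for illustration, not fitted values.

```python
from math import exp


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1 / (1 + exp(-z))


def predict_yes(x, m, b, threshold=0.5):
    """Answer the yes/no question by thresholding the predicted probability."""
    return sigmoid(m * x + b) >= threshold


# m and b are made-up coefficients for illustration, not fitted values.
probabilities = [round(sigmoid(z), 3) for z in (-4, 0, 4)]
print(probabilities)
print(predict_yes(x=3.0, m=1.5, b=-2.0))
```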

36

Confusion Matrix

A table used to evaluate the performance of a classification algorithm by comparing predicted and actual outcomes.

37

Accuracy

Number of correct predictions / total number of predictions.

38

Precision

True positives / total predicted positives, TP / (TP + FP), indicating the accuracy of positive predictions.

39

Recall

The ability of a classifier to identify all relevant instances, measured as the proportion of true positives among actual positives: TP / (TP + FN)
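
A minimal sketch that builds the four confusion-matrix counts for a binary classifier and derives accuracy, precision, and recall from them. The labels and the helper `confusion_counts` are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return tp, fp, fn, tn


actual = [1, 1, 1, 0, 0, 0, 0, 1]      # hypothetical labels
predicted = [1, 0, 1, 0, 1, 0, 0, 1]   # hypothetical model output

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn, tn, accuracy, precision, recall)
```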

40

Relational Databases

Databases that use tables to store and manage structured data.

41

Cloud Databases

Databases that reside on cloud computing platforms, allowing for flexible data storage and access.

42

Distributed Databases

Databases that consist of data stored across multiple locations or sites.

43

Object-Oriented Databases

Databases designed to handle complex data types and relationships efficiently.

44

NoSQL Databases

Non-relational databases that allow for statistical analysis.

45

What are the 4 scales of measurement

Nominal, Ordinal, Interval, and Ratio

46

What is Nominal

Categories that do not have a natural order. Ex. blood type, zip code, race

47

Ordinal

Categories where order matters but the difference between them is neither clear nor even. Ex. satisfaction scores, happiness level from 1-10

48

Interval

There is an order and the difference between two values is meaningful, but there is no true zero. Ex. temperature (°C and °F), credit scores, pH

49

Ratio

The same as interval except that it has a true zero, so there are no negative numbers. Ex. concentration, Kelvin, weight

50

gitignore files

A .gitignore file lists the files Git should leave untracked. Untracked files are files that have been created within your repo's working directory but have not yet been added to the repository's tracking index using the git add command. Files whose names begin with a dot, such as .gitignore, are hidden by default.

51

Text types

str

52

Numeric Types

int, float, complex

53

Sequence

list, tuple, range

54

Mapping

dict

55

Set types

set, frozenset

56

Boolean type

bool

57

Binary types

bytes, bytearray, memoryview

58

NoneType

The type of None, which represents the absence of a value. It does not fit into any of the other categories.

59

Supervised and Unsupervised greatest difference

The biggest difference between supervised and unsupervised machine learning is the type of data used. Supervised learning uses labeled training data, and unsupervised learning does not. More simply, supervised learning models have a baseline understanding of what the correct output values should be.

60

How does KNN evaluate its performance

via accuracy

61

Multiple regression evaluation of performance

Use mean squared error and R² (R-squared) to validate model performance. In practice, this depends on the shape of the data; there are other statistical models and tests, like ANOVA, that we won't discuss here.

62

How to evaluate performance of Linear regression

Linear regression is a method we can use to understand the relationship between one or more predictor variables (Xi) and a response variable (Y). Evaluate it with R-squared / adjusted R-squared, mean squared error (MSE) / root mean squared error (RMSE), and mean absolute error (MAE), and by checking that the model's residuals follow a normal distribution (bell shape). These can be obtained with OLS from statsmodels.formula.api.
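
A minimal sketch of the error metrics named above, computed by hand rather than through statsmodels. The function `regression_metrics` and the values are hypothetical.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, and R-squared for paired values."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mse = sum(r ** 2 for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mean_actual = sum(actual) / n
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # R-squared: 1 minus (residual sum of squares / total sum of squares).
    r_squared = 1 - sum(r ** 2 for r in residuals) / ss_total
    return {"mse": mse, "rmse": sqrt(mse), "mae": mae, "r2": r_squared}


metrics = regression_metrics([3, 5, 7, 9], [2, 5, 8, 9])   # hypothetical values
print(metrics)
```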

63

complete separation

Happens in logistic regression when the outcome variable separates a predictor variable or a combination of predictor variables completely, so the predictors perfectly predict the outcome.

64

class method

Can modify class state but can't modify object state. It's used for factory functions.

65

Static Method

Can't access class state or object state. Used for utility functions.

66

Instance method

Can modify class state and can modify object state.
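
A minimal sketch contrasting the three method types. The class `Account` and all of its members are hypothetical, made up for illustration.

```python
class Account:
    interest_rate = 0.02                 # class state, shared by all instances

    def __init__(self, owner, balance=0):
        self.owner = owner               # object (instance) state
        self.balance = balance

    def deposit(self, amount):           # instance method: receives self
        self.balance += amount

    @classmethod
    def set_rate(cls, rate):             # class method: receives cls, not self
        cls.interest_rate = rate

    @classmethod
    def from_string(cls, text):          # class method used as a factory
        owner, balance = text.split(",")
        return cls(owner, float(balance))

    @staticmethod
    def is_valid_amount(amount):         # static method: no self or cls; a utility check
        return amount > 0


acct = Account.from_string("Ada,100")
if Account.is_valid_amount(25):
    acct.deposit(25)
Account.set_rate(0.03)
print(acct.owner, acct.balance, acct.interest_rate)
```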

67

What are factory functions

Separate the process of creating an object from the code that depends on the object's interface.

68

Utility functions

Handle logic and checks (for example, seeing whether the person on the website is over 18, or whether inputs meet particular requirements).

69

Local Scope

Names are only available to other code in the same scope. A function, for example, only has access to the names defined in that function or passed into it via arguments.

70

Enclosing Scope

Only exists for nested functions. Inner functions have access to the names in the functions that enclose them.

71

Global Scope

Available to all your code in a module, including inside its functions and classes; names can be passed between modules by importing them.

72

Built-in Scope

All names that are created by Python when you run a script, such as print and len.
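
A short sketch of the four scopes and the order Python searches them in (local, enclosing, global, built-in). All names here are made up for illustration.

```python
label = "global"            # G: defined at module level


def outer():
    label = "enclosing"     # E: local to outer, enclosing for the inner functions

    def inner():
        label = "local"     # L: local to inner
        return label

    def inner_no_local():
        return label        # no local name, so Python finds the enclosing one

    return inner(), inner_no_local()


def uses_global():
    return label            # no local or enclosing name, so the global is used


result = (*outer(), uses_global(), len("abc"))   # len comes from the built-in scope
print(result)
```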

73

Uniform Distribution

For a uniform random variable, the expected value is (a+b)/2, where a is the maximum possible value and b is the minimum.

74

Binomial distribution

The mean is the expected value, which is equal to n (the number of trials) * p (the probability of success).

75

Standard normal distribution

The mean is the expected value, which is 0; the standard deviation is 1.

76

Unstructured

Data that has no inherent structure and is usually stored as different types of files. Ex. text docs, PDFs, images, and videos

77

Quasi-Structured

Textual data with erratic formats that can be formatted with effort and software tools Ex. clickstream data

78

Semi-structured

Textual data files with an apparent pattern, enabling analysis. Ex. spreadsheets and XML files

79

Structured

Data with a defined data model, format, and structure. Ex. a database