Final

Structured vs unstructured data

Unstructured:

Natural Language Processing (NLP): sentiment classification

ChatGPT and LLMs:

CATEGORICAL VARIABLE (qualitative)

age group, student ID, preference ranking

NUMERICAL VARIABLE

  • Discrete (finite values: # of students, purchase quantity)

  • Continuous (infinite values: height, price, weight)

Measures of central tendency: mean, median, mode
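
As a quick illustration, the three measures can be computed with Python's standard `statistics` module. This is only a minimal sketch; the quiz scores are made up for the example and are not from the course material.

```python
import statistics

# Hypothetical sample of quiz scores (made up for illustration)
scores = [70, 85, 85, 90, 100]

mean_score = statistics.mean(scores)      # arithmetic average of the values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

The mean is pulled toward extreme values, while the median and mode are not, so comparing them gives a rough hint about skew in the data.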

Measures of dispersion: range, variance, standard deviation
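
A similar sketch for the dispersion measures, again using Python's standard `statistics` module. The data values are invented for the example. Note the assumption about which formula applies: `pvariance` and `pstdev` treat the data as a whole population (divide by n), while `variance` treats it as a sample (divides by n - 1).

```python
import statistics

# Hypothetical data set (made up for illustration)
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: distance between the largest and smallest value
value_range = max(values) - min(values)

# Population variance: average squared deviation from the mean
pop_variance = statistics.pvariance(values)

# Population standard deviation: square root of the population variance
pop_std = statistics.pstdev(values)

# Sample variance: same idea, but divides by n - 1 instead of n
sample_variance = statistics.variance(values)

print("range:", value_range)
print("population variance:", pop_variance)
print("population std dev:", pop_std)
print("sample variance:", sample_variance)
```

The standard deviation is in the same units as the data, which is why it is usually easier to interpret than the variance.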

DISTRIBUTIONS AND HYPOTHESIS