Python Libraries for Data Science

1. What is NumPy used for?
👉 Provides fast N-dimensional arrays and matrices, with vectorized math operations that avoid slow Python loops.
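
A minimal sketch of what "vectorized" means in practice. The salary figures are invented for illustration; only the NumPy calls themselves are real.

```python
import numpy as np

# Hypothetical salary figures, invented for this example.
salaries = np.array([95000, 120000, 135000, 88000])

# Vectorized arithmetic: the 5% raise is applied to every element at once,
# with no explicit Python loop.
raised = salaries * 1.05

# Aggregations run over the whole array.
mean_salary = salaries.mean()

# Matrix product of a 2x2 matrix with the identity matrix.
a = np.array([[1, 2], [3, 4]])
product = a @ np.eye(2)

print(raised)
print(mean_salary)
print(product)
```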

2. What is SciPy used for?
👉 Advanced math, linear algebra, optimization, and statistics.

3. What is Pandas used for?
👉 Stores, manipulates, and analyzes table-like data (like Excel sheets) in DataFrames.

4. What is Scikit-Learn used for?
👉 Provides machine learning tools for classification, regression, and clustering.

5. What is Matplotlib used for?
👉 Creates basic graphs like line plots, bar charts, and histograms.

6. What is Seaborn used for?
👉 Makes polished statistical graphs through a high-level API built on top of Matplotlib.

7. How do you load a dataset in Pandas?
📌 df = pd.read_csv("data.csv") (Reads the CSV into a DataFrame; requires import pandas as pd)

8. How do you view the first 5 rows of a dataset?
📌 df.head() (Returns the first 5 rows by default; df.head(n) returns the first n)

9. How do you check data types in Pandas?
📌 df.dtypes (Lists the data type of each column)
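
The load, preview, and type-check steps can be tried together. This is a sketch under assumptions: an in-memory string stands in for "data.csv" so the example runs anywhere, and the column names and values are hypothetical.

```python
import io

import pandas as pd

# Stand-in for "data.csv": a small in-memory CSV with hypothetical columns.
# The fifth data row leaves salary empty on purpose.
csv_text = """rank,age,salary
Prof,52,150000
AssocProf,41,110000
AsstProf,33,90000
Prof,60,165000
AssocProf,45,
AsstProf,29,85000
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

first_rows = df.head()    # first 5 rows by default
column_types = df.dtypes  # one dtype per column

print(first_rows)
print(column_types)
```

The empty salary field is read as NaN, which is why the salary column comes back as a float type rather than an integer type.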

10. How do you filter data in Pandas?
📌 df[df["salary"] > 120000] (Selects rows where salary > 120K)
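
A short sketch of the same filter on a tiny hypothetical table, plus a combined condition. The data is invented; the point is that each condition needs its own parentheses when joined with & or |.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "rank": ["Prof", "AssocProf", "AsstProf", "Prof"],
    "salary": [150000, 110000, 90000, 165000],
})

# The comparison builds a boolean mask; indexing with it keeps the True rows.
high_paid = df[df["salary"] > 120000]

# Combine conditions with & (and) or | (or), wrapping each one in parentheses.
top_profs = df[(df["rank"] == "Prof") & (df["salary"] > 160000)]

print(high_paid)
print(top_profs)
```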

11. How do you group data in Pandas?
📌 df.groupby("rank")["salary"].mean() (Finds the average salary per job rank; selecting the column first avoids errors from non-numeric columns)
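
A sketch of grouping on hypothetical data. Selecting the salary column before aggregating keeps text columns out of the mean, and agg can compute several statistics in one pass.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "rank": ["Prof", "AssocProf", "AsstProf", "Prof", "AssocProf"],
    "salary": [150000, 110000, 90000, 165000, 120000],
})

# Average salary for each rank (returns a Series indexed by rank).
avg_by_rank = df.groupby("rank")["salary"].mean()

# Several statistics at once (returns a DataFrame, one column per statistic).
summary = df.groupby("rank")["salary"].agg(["mean", "max", "count"])

print(avg_by_rank)
print(summary)
```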

12. What is missing data in Pandas?
👉 Values that are absent from the dataset. Pandas marks them as NaN (Not a Number), None, or NaT for dates.

13. How do you remove missing data?
📌 df.dropna() (Returns a new DataFrame without the rows that contain missing values; the original df is unchanged).

14. How do you replace missing data?
📌 df.fillna(0) (Returns a new DataFrame with missing values filled with 0).
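
A sketch that counts, drops, and fills missing values on hypothetical data. Filling with the column mean is shown only as one common alternative to 0; which fill value is appropriate depends on the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing values, invented for this example.
df = pd.DataFrame({
    "age": [52, np.nan, 33],
    "salary": [150000, 110000, np.nan],
})

# Count missing values in each column.
missing_per_column = df.isna().sum()

# Both calls return new DataFrames; df itself is not modified.
dropped = df.dropna()
filled_zero = df.fillna(0)

# Alternative: fill each column with its own mean (NaN is skipped when averaging).
filled_mean = df.fillna(df.mean())

print(missing_per_column)
print(dropped)
print(filled_mean)
```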

15. How do you sort data in Pandas?
📌 df.sort_values(by="salary") (Sorts rows by salary, lowest first; pass ascending=False for highest first).
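
A sketch of the common sorting variants on hypothetical data: ascending, descending, and sorting by more than one column.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "rank": ["Prof", "AssocProf", "AsstProf", "Prof"],
    "salary": [150000, 110000, 90000, 165000],
})

# Lowest salary first (the default).
lowest_first = df.sort_values(by="salary")

# Highest salary first.
highest_first = df.sort_values(by="salary", ascending=False)

# Sort by rank alphabetically, then by salary from high to low within each rank.
by_rank_then_salary = df.sort_values(by=["rank", "salary"], ascending=[True, False])

print(lowest_first)
print(highest_first)
print(by_rank_then_salary)
```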

16. How do you visualize data in Seaborn?
📌 sns.histplot(df["age"]) (Creates a histogram of ages; requires import seaborn as sns).