Python Libraries for Data Science
1. What is NumPy used for?
→ Works efficiently with large arrays and matrices, and speeds up math by operating on whole arrays at once.
2. What is SciPy used for?
→ Advanced math, linear algebra, optimization, and statistics.
3. What is Pandas used for?
→ Helps store, manipulate, and analyze table-like data (like Excel sheets).
4. What is Scikit-Learn used for?
→ Provides machine learning tools for classification, regression, and clustering.
5. What is Matplotlib used for?
→ Creates basic graphs like line plots, bar charts, and histograms.
6. What is Seaborn used for?
→ Makes attractive statistical graphs; built on top of Matplotlib.
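To make the NumPy answer above concrete, here is a minimal sketch of whole-array math. The salary figures and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: five salaries in dollars.
salaries = np.array([90000, 120000, 150000, 80000, 130000])

# One expression applies a 10% raise to every element, with no Python loop.
raised = salaries * 1.10

# Aggregations run over the whole array at once.
mean_salary = salaries.mean()

# Matrix product of two 2x2 matrices using the @ operator.
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
product = a @ b
```

The same element-wise style carries over to Pandas columns, which are built on NumPy arrays.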
7. How do you load a dataset in Pandas?
→ df = pd.read_csv("data.csv") (Requires import pandas as pd first).
8. How do you view the first 5 rows of a dataset?
→ df.head() (Returns the first 5 rows by default; df.head(n) returns the first n).
9. How do you check data types in Pandas?
→ df.dtypes (Lists the data type of each column).
10. How do you filter data in Pandas?
→ df[df["salary"] > 120000] (Selects rows where salary is greater than 120,000).
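Questions 7 to 10 can be tried together in one short script. This is only a sketch: the CSV text below is a hypothetical stand-in for data.csv (the names and numbers are invented), read from memory so no file is needed.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv so the example runs without a file on disk.
csv_text = """name,rank,salary,age
Ana,Prof,150000,52
Ben,AsstProf,95000,34
Cara,Prof,130000,47
Dev,AssocProf,110000,41
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

first_rows = df.head()    # first 5 rows (this sample only has 4)
column_types = df.dtypes  # one data type per column

# The comparison builds a True/False mask; indexing with it keeps the True rows.
high_earners = df[df["salary"] > 120000]
```

With a real file, replacing io.StringIO(csv_text) with "data.csv" should behave the same way.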
11. How do you group data in Pandas?
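→ df.groupby("rank")["salary"].mean() (Finds the average salary per job rank. Selecting the salary column first avoids errors from text columns in recent Pandas versions).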
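A small sketch of grouping, using an invented table with the same column names as the answer above. The salary column is selected before averaging so that the text column (name) is left out of the calculation.

```python
import pandas as pd

# Hypothetical sample data; names and salaries are made up.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "rank": ["Prof", "AsstProf", "Prof", "AssocProf"],
    "salary": [150000, 95000, 130000, 110000],
})

# Split rows by rank, then average only the salary column within each group.
avg_salary = df.groupby("rank")["salary"].mean()
```

The result is a Series indexed by rank, so avg_salary["Prof"] looks up the average for one group.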
12. What is missing data in Pandas?
→ Values that are blank or absent. Pandas usually shows them as NaN (Not a Number); None and NaT (for dates) also count as missing.
13. How do you remove missing data?
→ df.dropna() (Returns a new DataFrame without the rows that contain any missing value; df itself is unchanged).
14. How do you replace missing data?
→ df.fillna(0) (Returns a new DataFrame with missing values replaced by 0; df itself is unchanged).
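The three missing-data answers above can be checked side by side. The table below is made up, with two gaps put in on purpose.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing salary and one missing age.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "salary": [150000.0, np.nan, 130000.0],
    "age": [52.0, 34.0, np.nan],
})

# Count missing values in each column.
missing_per_column = df.isna().sum()

# Keep only the rows with no missing values.
dropped = df.dropna()

# Replace every missing value with 0.
filled = df.fillna(0)
```

Both dropna and fillna return new DataFrames here, so df still contains its two NaN values afterwards.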
15. How do you sort data in Pandas?
→ df.sort_values(by="salary") (Sorts rows by salary, lowest first; add ascending=False for highest first).
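A quick sketch of sorting in both directions, again with invented sample data.

```python
import pandas as pd

# Hypothetical sample data; names and salaries are made up.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "salary": [150000, 95000, 130000, 110000],
})

# Default order is ascending (lowest salary first).
lowest_first = df.sort_values(by="salary")

# ascending=False reverses the order (highest salary first).
highest_first = df.sort_values(by="salary", ascending=False)
```

sort_values returns a sorted copy and keeps the original row labels, so df itself stays in its original order.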
16. How do you visualize data in Seaborn?
→ sns.histplot(df["age"]) (Creates a histogram of ages; requires import seaborn as sns first).