Note

0.0(0)

Take a practice test

Chat with Kai

Explore Top Notes

Cognitive psychology

Studied by 23 people

Chapter 11: The Cotton Revolution

Studied by 119 people

Studied by 11 people

APUSH UNIT 3 VOCAB

Studied by 108 people

Unit 7: Natural Selection

Studied by 27999 people

Come cambiare volo Delta? Cambia la tua rotta in pochi minuti

Studied by 3 people

Reading and Writing Data with Python using Pandas

Reading and Writing Data with Python

Working Directory and File Paths

Purpose: To locate data for loading into Python.
Checking Current Working Directory:
- Use the os module.
- import os
- os.getcwd() or os.get_current_working_directory() displays the current file system location.
Loading Data from Different Directories:
- If files are not in the current working directory, the correct directory must be specified.
- Example using os.chdir() to change the directory:
  - os.chdir('kaggle') (moves one level up from kaggle/working to kaggle).
- os.listdir() or os.listdir() lists files and folders in a directory.
Example Scenario:
- Navigating Kaggle environment directories:
  - Moving from kaggle/working to kaggle.
  - Listing directories to find the input folder.
  - Exploring input/titanic to find train.csv, gender_submission.csv, and test.csv.

Reading Comma Separated Values (CSV) and Tab Separated Values (TSV) Files

CSV and TSV Files:
- Common flat file data formats.
- Readable and editable in spreadsheet programs like Excel.
- CSV: Data values separated by commas.
- TSV: Data values separated by tabs.
Reading CSV Files with pandas:
- Use pandas.read_csv() function.
- Example:
  - import pandas as pd
  - pd.read_csv('input/titanic/train.csv')
  - Loads the Titanic training dataset into a pandas DataFrame.
  - .head(6) displays the first six rows of the DataFrame.
Reading TSV Files with pandas:
- Use pd.read_table() function.
- Default delimiter is tab.
- For other delimiters, use the sep argument.
- Example:
  - pd.read_table('file.tsv')
  - pd.read_table('file.txt', sep='-') (if data is separated by hyphens).
- Specifying tab character:
  - Use '\t' to represent a tab character in the sep argument.

Reading Data from Excel Files

Reading Excel Files with pandas:
- Use pd.read_excel() function.
- Requires specifying the file path and sheet name.
- Example:
  - draft = pd.read_excel('input/draft2015/draft.xlsx', sheet_name='draft2015')
  - Loads data from the 'draft2015' sheet in the Excel file.
  - .head() displays the first few rows of the DataFrame.

Reading Data from the Web

General Approach:
- Download data in CSV or Excel format when possible.
- Use pd.read_csv() or pd.read_excel() to load the downloaded file.
Reading Data from Clipboard:
- Use pd.read_clipboard() function.
- Copy data from a website or other source and paste it into the clipboard.
- Example:
  - df = pd.read_clipboard()
- May require cleaning and reformatting in a spreadsheet program before reading.
Reading HTML Tables:
- Use pd.read_html() function.
- Requires the html5lib package.
Handling Nested Data Structures (HTML, JSON, XML):
- Requires more complex parsing and specialized libraries.

Writing Data with pandas

Writing Data to CSV Files:
- Use the .to_csv() method on a DataFrame.
- Example:
  - draft.to_csv('draft_save.csv')
  - Saves the draft DataFrame to a CSV file named 'draft_save.csv' in the current working directory.
Verifying the Output:
- Use os.listdir() to confirm the file has been created.
Application:
- Saving prediction results in machine learning competitions.

Conclusion

Pandas for Data Handling:
- Simplifies reading and writing common data formats.
Databases and Nested Data:
- Require specialized libraries and more complex handling.
Python's Versatility:
- Libraries available for almost any data format.
- Research and documentation are key to finding the right tools.
Next Steps:
- Learning basic Python programming constructs and custom functions.
- Control flow (if/else logic, for/while loops) for custom code execution.

Note

0.0(0)

Take a practice test

Chat with Kai

Explore Top Notes

Cognitive psychology

Studied by 23 people

Chapter 11: The Cotton Revolution

Studied by 119 people

Studied by 11 people

APUSH UNIT 3 VOCAB

Studied by 108 people

Unit 7: Natural Selection

Studied by 27999 people

Come cambiare volo Delta? Cambia la tua rotta in pochi minuti

Studied by 3 people