ch 2 CLI (pipe)
Terminal and Commands Introduction
Opening Terminal
Start the terminal
Use command
pwd
to check the current working directory (should be CLI files)
Working with the iris.csv Dataset
Navigating to the Dataset
Go to lesson zero two
Examining the Contents of iris.csv
Use command
cat iris.csv
Dataset Overview:
Contains classes of three species of flowers
Three specified classes:
Iris virginica
Iris versicolor
Iris setosa
Not the focus of analysis but serves as a case study
Measuring Data Points
Counting Lines
Use command
wc iris.csv
Result:
Total of 150 data points
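Plain wc prints three numbers (lines, words, bytes), so the line count is the first one. A minimal sketch of asking for the line count alone with wc -l follows. The six-row file is a small stand-in sample created for illustration (an assumption, not the lesson's full iris.csv), and the name iris_sample.csv is hypothetical.

```shell
# Stand-in sample (assumption): six rows in the usual iris layout, not the lesson's file.
tmpdir=$(mktemp -d)
f="$tmpdir/iris_sample.csv"
printf '%s\n' \
  '5.1,3.5,1.4,0.2,Iris-setosa' \
  '4.9,3.0,1.4,0.2,Iris-setosa' \
  '7.0,3.2,4.7,1.4,Iris-versicolor' \
  '6.4,3.2,4.5,1.5,Iris-versicolor' \
  '6.3,3.3,6.0,2.5,Iris-virginica' \
  '5.8,2.7,5.1,1.9,Iris-virginica' > "$f"

# -l prints only the line count; reading from stdin suppresses the file name.
# tr strips the padding some wc implementations add.
lines=$(wc -l < "$f" | tr -d ' ')
echo "lines: $lines"

rm -r "$tmpdir"
```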
Checking for Header in the CSV file
Verifying Header Presence
Use command
head iris.csv
Output:
Displays top 10 lines, confirming no header is present
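The header check can also be narrowed to the first line with head -n 1. The sketch below uses a rough heuristic that is an assumption of this example, not part of the lesson: if the first line starts with a digit, it is treated as data rather than a header. The sample file is a made-up stand-in.

```shell
# Stand-in sample (assumption): three data rows, no header line.
tmpdir=$(mktemp -d)
f="$tmpdir/iris_sample.csv"
printf '%s\n' \
  '5.1,3.5,1.4,0.2,Iris-setosa' \
  '7.0,3.2,4.7,1.4,Iris-versicolor' \
  '6.3,3.3,6.0,2.5,Iris-virginica' > "$f"

# -n 1 limits head to the first line instead of the default 10.
first=$(head -n 1 "$f")

# Heuristic (assumption): a line that starts with a digit is data, not a header.
if printf '%s\n' "$first" | grep -Eq '^[0-9]'; then
  header=no
else
  header=yes
fi
echo "first line: $first"
echo "header present: $header"

rm -r "$tmpdir"
```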
Using Pipe Symbol with Commands
Alternative Methods of Counting Lines
Combine commands:
cat iris.csv | wc
Explanation of the Pipe Symbol (|)
Takes the output from cat and uses it as input for wc
Each command operates independently but collaboratively
Significance:
Flexible constructs without hardcoding dependencies
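To illustrate the independence point, here is a small sketch in which wc receives its input from commands other than cat. The input lines are made up for the example.

```shell
# wc only sees the text arriving on standard input; it does not know
# which command produced it.
three=$(printf 'a\nb\nc\n' | wc -l | tr -d ' ')

# Inserting another stage changes what wc receives, not how wc works.
two=$(printf 'a\nb\nc\n' | head -n 2 | wc -l | tr -d ' ')

echo "all lines: $three"
echo "after head -n 2: $two"
```

Because each stage only reads standard input and writes standard output, stages can be swapped or added without changing the others.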
Understanding grep Command
grep takes a specified string or pattern and returns the lines of text files that contain it. It is useful for searching within files and for filtering large datasets efficiently.
Searching Within Files
Use command
grep "setosa" iris.csv
Purpose: Find lines containing setosa
Output:
Only lines with the word setosa are returned
Regular Expressions in grep
setosa as a basic regex example
Encouraged to learn more about regex, outside of current topic
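As a small taste of what regex adds beyond a plain word, the sketch below uses two anchors: $ for end of line and ^ for start of line. The six-row file is a stand-in sample created for illustration (an assumption, not the lesson's iris.csv).

```shell
# Stand-in sample (assumption): six rows in the usual iris layout.
tmpdir=$(mktemp -d)
f="$tmpdir/iris_sample.csv"
printf '%s\n' \
  '5.1,3.5,1.4,0.2,Iris-setosa' \
  '4.9,3.0,1.4,0.2,Iris-setosa' \
  '7.0,3.2,4.7,1.4,Iris-versicolor' \
  '6.4,3.2,4.5,1.5,Iris-versicolor' \
  '6.3,3.3,6.0,2.5,Iris-virginica' \
  '5.8,2.7,5.1,1.9,Iris-virginica' > "$f"

# 'setosa$' matches only lines that END with setosa.
ends_setosa=$(grep 'setosa$' "$f" | wc -l | tr -d ' ')

# '^[67]' matches only lines that START with a 6 or a 7.
starts_6_or_7=$(grep '^[67]' "$f" | wc -l | tr -d ' ')

echo "lines ending in setosa: $ends_setosa"
echo "lines starting with 6 or 7: $starts_6_or_7"

rm -r "$tmpdir"
```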
Chaining Commands Effectively
Using cat and grep Together
Command:
cat iris.csv | grep "setosa"
Achieves the same outcome as the previous grep command
Counting Specific Lines
Chain to count occurrences:
Command:
cat iris.csv | grep "setosa" | wc
Outcome:
Total of 50 setosa data points
Explanation:
wc counts only the lines that grep selected
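The chained count can be compared with grep's own -c option, which reports the number of matching lines directly. This is a sketch on a small stand-in sample (an assumption, not the lesson's iris.csv), so the count here is 2 rather than 50.

```shell
# Stand-in sample (assumption): two setosa rows and two others.
tmpdir=$(mktemp -d)
f="$tmpdir/iris_sample.csv"
printf '%s\n' \
  '5.1,3.5,1.4,0.2,Iris-setosa' \
  '4.9,3.0,1.4,0.2,Iris-setosa' \
  '7.0,3.2,4.7,1.4,Iris-versicolor' \
  '6.3,3.3,6.0,2.5,Iris-virginica' > "$f"

# Chained version from the notes, with -l so wc prints the line count only.
piped=$(cat "$f" | grep "setosa" | wc -l | tr -d ' ')

# grep -c counts matching lines without a separate wc stage.
direct=$(grep -c "setosa" "$f")

echo "pipeline: $piped"
echo "grep -c:  $direct"

rm -r "$tmpdir"
```

The pipeline form stays useful when more stages are added later; grep -c is a shortcut for the single-filter case.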
Additional Chaining Examples
Searching for Numerical Values
Command:
cat iris.csv | grep "3.5"
Outputs lines containing the numerical value 3.5
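One caveat worth sketching: grep treats its pattern as a regex, and in a regex the dot matches any single character, so "3.5" can match more than the literal number. The three input lines below are made up to show the difference; -F (fixed string) and an escaped dot are two ways to match the literal text.

```shell
# Made-up input lines (assumption): only the first contains the literal 3.5.
sample=$(printf '%s\n' 'a,3.5' 'b,315' 'c,3,5')

# As a regex, the dot matches any character: all three lines match.
regex_hits=$(printf '%s\n' "$sample" | grep '3.5' | wc -l | tr -d ' ')

# -F treats the pattern as a fixed string: only the literal 3.5 matches.
fixed_hits=$(printf '%s\n' "$sample" | grep -F '3.5' | wc -l | tr -d ' ')

# Escaping the dot has the same effect while staying in regex mode.
escaped_hits=$(printf '%s\n' "$sample" | grep '3\.5' | wc -l | tr -d ' ')

echo "regex: $regex_hits, fixed: $fixed_hits, escaped: $escaped_hits"
```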
Combining Multiple Filters
Command:
cat iris.csv | grep "setosa" | grep "3.5"
Filters for setosa lines that also include 3.5
Counting Filtered Lines
Command:
cat iris.csv | grep "setosa" | grep "3.5" | wc
Counts the lines that meet both criteria, giving a quick summary of the data.
Output: 6 lines matching both filters
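Since each grep only narrows the lines passed to it, the order of the two filters should not change the final count. The sketch below checks this on a stand-in sample (an assumption, not the lesson's iris.csv, so the count is 1 rather than 6) and uses -F so that 3.5 is matched literally.

```shell
# Stand-in sample (assumption): only the first row is setosa AND contains 3.5.
tmpdir=$(mktemp -d)
f="$tmpdir/iris_sample.csv"
printf '%s\n' \
  '5.1,3.5,1.4,0.2,Iris-setosa' \
  '4.9,3.0,1.4,0.2,Iris-setosa' \
  '7.0,3.2,4.7,1.4,Iris-versicolor' \
  '6.3,3.3,6.0,2.5,Iris-virginica' > "$f"

# Species first, then value.
species_first=$(cat "$f" | grep "setosa" | grep -F "3.5" | wc -l | tr -d ' ')

# Value first, then species: same lines survive both filters.
value_first=$(cat "$f" | grep -F "3.5" | grep "setosa" | wc -l | tr -d ' ')

echo "species then value: $species_first"
echo "value then species: $value_first"

rm -r "$tmpdir"
```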
Listing Files with Specific Extensions
Finding CSV Files
Command:
ls | grep .csv
Outputs all files in the directory whose names contain .csv
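Because the dot in .csv is a regex wildcard and the pattern is not anchored, this filter can also pick up names that merely contain "csv" after some other character. A sketch of a stricter pattern follows; the file names are hypothetical and created in a temporary directory for illustration.

```shell
# Hypothetical files (assumption): one real CSV, one name that merely contains "csv".
tmpdir=$(mktemp -d)
touch "$tmpdir/iris.csv" "$tmpdir/notes.txt" "$tmpdir/mycsv_notes.txt"

# Loose pattern from the notes: also matches mycsv_notes.txt ("ycsv").
loose=$(ls "$tmpdir" | grep .csv | wc -l | tr -d ' ')

# Escaped dot plus $ anchor: matches only names that end in .csv.
strict=$(ls "$tmpdir" | grep '\.csv$' | wc -l | tr -d ' ')

echo "loose matches:  $loose"
echo "strict matches: $strict"

rm -r "$tmpdir"
```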
Exercises for Practice
Experiment with various combinations of commands:
Filter lines containing versicolor and 2.0
Practice counting these filtered results
Importance of mastering these commands:
Essential for efficiently working with large datasets
More efficient than writing custom scripts.