SQL

Common SQL data types are numeric, date/time, string, Boolean

The SELECT statement is used to retrieve data from one or more tables in a database and allows you to specify which columns and rows you want to retrieve

FROM identifies which table

Primary Key is unique identifier for each row in a table and ensures that there no duplicate values and no NULL values in its columns

Foreign Key links a column in one table to the primary key column in another table and enforces data integrity and creates relationships between tables

A FULL OUTER JOIN returns all rows from both tables

INNER JOIN returns matching rows from both tables

LEFT JOIN returns all rows from the left table and matching rows from the right table, if no matches are found then NULL values will appear in the right table columns

RIGHT JOIN returns all rows from the right table and matching rows from the left table, if no matches are found then NULL values will appear in the left table columns

FULL OUTER JOIN returns all rows from both tables

CROSS JOIN returns all possible combinations of rows from both tables

OUTER JOIN returns all rows from both tables and fills in missing values with NULL

Self-join is query in which a table in joined with itself and is useful when you need to retrieve related data from the same table like hierarchical data

UNION combines results of two SELECT statements into a single result and removes duplicates

UNION ALL includes all rows even duplicates

WHERE filters rows before aggregation (before a GROUP BY statement)

GROUP BY groups rows into summary rows based on column values

HAVING filters records after aggregation (after GROUP BY statement)

ORDER BY sorts rows in ascending or descending order, but ascending order the default in SQL

DISTINCT removes duplicates

Subquery is a nested query inside another query

LIMIT restricts the number of rows returned

RANK () assigns a unique rank to each row and skips rank numbers in rows if there are duplicates

DENSE_RANK() assigns ranks but does NOT skip numbers in rows for duplicates

BETWEEN is used to filter data within a range and is inclusive

NULL represents missing data in SQL

IS NULL finds missing data and identifies where a column has missing data, can be used to clean data

IS NOT NULL is used to filter out rows that do NOT have missing data, can be used to clean data

COALESCE () can dynamically replace NULL values in queries to precent calculation errors

AGGREGATE FUNCTIONS:

SUM() Adds up all values in a column

AVG() Calculates the average value

COUNT() Counts the number of rows

MIN() Returns the smallest value

MAX() Returns the largest value

Use DELETE when you need to remove specific rows and want rollback capability.

Use TRUNCATE when you want to remove all records quickly but keep the table structure, cannot be rolled back (undone)

Use DROP when you want to permanently delete the entire table from the database, cannot be rolled back (undone)

Common SQL Data Types

Numeric: Used for storing numbers (e.g., INT, DECIMAL, FLOAT).
Date/Time: Represents dates and times (e.g., DATE, TIMESTAMP).
String: Used to store text (e.g., CHAR, VARCHAR).
Boolean: Represents true/false values.

SELECT Statement

The SELECT statement is a fundamental SQL command used to query and retrieve data from one or more tables in a database.
This statement allows you to specify which columns and rows you want to retrieve using various clauses.

FROM Clause

The FROM clause identifies the table(s) from which to retrieve the data. You can query data from a single table or join multiple tables in this clause.

Keys in SQL

Primary Key: A primary key is a unique identifier for each row in a table, ensuring that there are no duplicate values and no NULL values in its columns. Each table should have a primary key to maintain integrity.
Foreign Key: A foreign key creates a link between a column in one table to the primary key column in another table, enforcing data integrity and creating relationships between tables (e.g., parent-child relationships).

JOINS in SQL

FULL OUTER JOIN: Returns all rows from both tables, including rows with no matches.
INNER JOIN: Retrieves only the rows that have matching values in both tables, effectively filtering out non-matching records.
LEFT JOIN: Returns all records from the left table, along with the matching records from the right table; if no match exists, NULL values are returned for the right table columns.
RIGHT JOIN: Similar to LEFT JOIN, but returns all records from the right table and matching records from the left table, filling in NULLs where there is no match.
CROSS JOIN: Produces a Cartesian product, returning all possible combinations of rows from both tables.
OUTER JOIN: This term is often used interchangeably with FULL OUTER JOIN but generally refers to any join that returns all rows from both tables and fills with NULL when there isn't a match.
Self-join: Is a query that joins a table with itself, which is useful for retrieving related data within the same table, particularly in hierarchical structures.

Set Operations

UNION: Combines the results of two or more SELECT statements into a single result set, removing duplicate records.
UNION ALL: Combines results similar to UNION but includes all rows, retaining duplicates.

Filtering and Aggregation

WHERE: This clause filters rows before any aggregation occurs (i.e., before applying GROUP BY).
GROUP BY: Groups rows sharing a specified column value into summary rows, useful for aggregation.
HAVING: Similar to WHERE but applied after aggregation has been performed.

Sorting and Uniqueness

ORDER BY: Sorts the results set in ascending (default) or descending order based on one or more columns.
DISTINCT: Removes duplicates from the result set, ensuring all returned values are unique.

Subqueries

A Subquery is a nested query within another SQL query that can be used to provide additional filtering or calculation.

Limiting Results

LIMIT: Restricts the number of rows returned by a query, which is useful for paginating results.

Ranking Functions

RANK(): Assigns a unique rank to each row within a result set while skipping ranks for duplicates (e.g., if two rows are tied for 1st, the next rank will be 3rd).
DENSE_RANK(): Similar to RANK(), but does not skip ranks for duplicates, so consecutive ranks are assigned.

Data Handling and NULL Values

BETWEEN: Filters records within a certain range, including the boundary values.
NULL: Represents absent or missing data in SQL.
IS NULL: Checks for NULL values in specified columns and can help identify missing data.
IS NOT NULL: Filters out rows that do NOT have missing values.
COALESCE(): A function that takes a list of values and returns the first non-NULL value, useful for replacing NULL in queries.

Aggregate Functions

SUM(): Calculates the total sum of a numerical column.
AVG(): Computes the average of a numerical column.
COUNT(): Counts the number of rows or non-NULL values in a column.
MIN(): Returns the smallest value in a specified column.
MAX(): Returns the largest value in a specified column.

Data Manipulation LLanguage (DML)

DELETE: Used to remove specific rows from a table while allowing for rollback if needed.
TRUNCATE: Efficiently removes all records from a table, but retains the table structure; this operation cannot be rolled back.
DROP: Permanently deletes an entire table from the database, including its structure and data; cannot be undone.