SQL
Introduction to SQL- datacamp
Relational database
Defines relationships within tables of data inside the database
Not all tables have to relate to each other
Different to spreadsheets as more powerful
More data
More secure
More people can access at once
Queries
Data doesnt change, its presented in the way the query asked
SQL
Structured query language
Most widely used programming language for databases
Used to write queries to organise data from database
Tables
Rows = records
Unlimited
Holds data on individual observation
Columns = fields
Set when database was created
Holds one piece of information about all observations
Names lowercase with no spaces, and singular, not plural
2 fields cannot have same name
Cannot share name with table so its clear whether field or table being referred to
Table Names
Lowercase
No spaces (underscores instead)
Can be collective group e.g. inventory, or plural name
Unique identifier
Identifies records so it can be distinguished from other records in same table
Unique, often numerical
Not name because eventually 2 might have same name etc.
More tables with clear structure better than fewer with info on multiple subjects
Can always use sequel to gather and connect data from multiple tables, but table topics should be separate
Data types
Data type chosen based on type of data the field would hold e.g. number, text, date
Different types of data stored differently and take up different space
Some operations only apply to certain data types
Cant multiply text but can multiply numbers
String
Sequence of characters such as letters or punctuation e.g. names
Some string data type <250 characters, saves storage spaces
VARCHAR
Flexible and popular string data type
Very long strings
Integer
Data types storing whole numbers
Few types depending on how big numbers we need to store
INT can store big numbers
Float
Store numbers including fractional parts
NUMERIC can store up to 30 digits
You can tell which type of data by looking at database schema
Schema
Blueprint of database
What tables are included in database and relationships between the tables
Storage
Physically stored on harddisk of server
Server is centralised computer that performs services via requests made over a network
Service performed = data access in this case
Any computer can be a server if it is set up to perform a service
Very powerful and large can handle high volume of requests
SQL answers questions within and across relational database tables
Best for large datasets
If data can fit in spreadsheet and isnt related to other data of interest it can be analysed in a spreadsheet
For diverse data, database is best
Keywords
Reserved words used for operations
SELECT
Indicates which fields should be selected
FROM
Indicates the table fields are located in
Query can be written
SELECT x
FROM y;
End with semicolon to show its complete
Capital query lowercase names
Result = result set
Select multiple fields
SELECT x,z
FROM y;
Does not have to match order fields are presented in table
Select all fields in table
SELECT *
FROM y;
Writing queries
Aliasing
Rename columns for clarity or brevity
SELECT x as y,z
FROM w
Produces a column y (containing the x data), z
Doesnt change the table itself
Selecting distinct records with no repeats
SELECT DISTINCT x
FROM y;
Selecting distinct with multiple fields
SELECT DISTINCT x,y
FROM z;
Views
Virtual table which is the result of a SELECT statement
When accessed, automatically updates in response to updates in the database
CREATE VIEW x AS
SELECT a,b,c
FROM d;
Then you can use view to select from
SELECT x
FROM view;
SQL Flavours
Free vs paid
All used with relational databases
Keywords are shared
Must follow universal standards
Only additional features -> different flavours
PostgreSQL
Free and opensource relational database system
Refers to system and associated flavour
SQL Server
Microsoft
free/paid
T-SQL is microsoft SQL flavour used with SQL server databases
Think of as dialects of same language
PostgreSQL vs SQL Server
LIMIT 2; selects top 2 entries
SELECT TOP (2) x,y selects top 2 entries
Limiting results is useful for testing code
Choosing a flavour
Any is a good choice, differences are small
Can’t have e.g. - - - in the code