SQL

Introduction to SQL- datacamp

  • Relational database

    • Defines relationships within tables of data inside the database

    • Not all tables have to relate to each other

  • Different to spreadsheets as more powerful

    • More data

    • More secure

    • More people can access at once

  • Queries

    • Data doesnt change, its presented in the way the query asked

  • SQL

    • Structured query language

    • Most widely used programming language for databases

    • Used to write queries to organise data from database

  • Tables

    • Rows = records

      • Unlimited

      • Holds data on individual observation

    • Columns = fields

      • Set when database was created

      • Holds one piece of information about all observations

      • Names lowercase with no spaces, and singular, not plural

        • 2 fields cannot have same name

        • Cannot share name with table so its clear whether field or table being referred to

    • Table Names

      • Lowercase

      • No spaces (underscores instead)

      • Can be collective group e.g. inventory, or plural name

    • Unique identifier

      • Identifies records so it can be distinguished from other records in same table

      • Unique, often numerical

        • Not name because eventually 2 might have same name etc.

    • More tables with clear structure better than fewer with info on multiple subjects

    • Can always use sequel to gather and connect data from multiple tables, but table topics should be separate

  • Data types

    • Data type chosen based on type of data the field would hold e.g. number, text, date

    • Different types of data stored differently and take up different space

    • Some operations only apply to certain data types

      • Cant multiply text but can multiply numbers

    • String

      • Sequence of characters such as letters or punctuation e.g. names

      • Some string data type <250 characters, saves storage spaces

      • VARCHAR

        • Flexible and popular string data type

        • Very long strings

    • Integer

      • Data types storing whole numbers

      • Few types depending on how big numbers we need to store

      • INT can store big numbers

    • Float

      • Store numbers including fractional parts

      • NUMERIC can store up to 30 digits

    • You can tell which type of data by looking at database schema

  • Schema

    • Blueprint of database

    • What tables are included in database and relationships between the tables

  • Storage

    • Physically stored on harddisk of server

    • Server is centralised computer that performs services via requests made over a network

      • Service performed = data access in this case

      • Any computer can be a server if it is set up to perform a service

      • Very powerful and large can handle high volume of requests

  • SQL answers questions within and across relational database tables

  • Best for large datasets

    • If data can fit in spreadsheet and isnt related to other data of interest it can be analysed in a spreadsheet

    • For diverse data, database is best

  • Keywords

    • Reserved words used for operations

    • SELECT

      • Indicates which fields should be selected

    • FROM

      • Indicates the table fields are located in

    • Query can be written

      • SELECT x

      • FROM y;

      • End with semicolon to show its complete

      • Capital query lowercase names

    • Result = result set

    • Select multiple fields

      • SELECT x,z

      • FROM y;

      • Does not have to match order fields are presented in table

    • Select all fields in table

      • SELECT *

      • FROM y;

  • Writing queries

    • Aliasing

      • Rename columns for clarity or brevity

      • SELECT x as y,z

      • FROM w

      • Produces a column y (containing the x data), z

      • Doesnt change the table itself

      • Selecting distinct records with no repeats

        • SELECT DISTINCT x

        • FROM y;

      • Selecting distinct with multiple fields

        • SELECT DISTINCT x,y

        • FROM z;

    • Views

      • Virtual table which is the result of a SELECT statement

      • When accessed, automatically updates in response to updates in the database

      • CREATE VIEW x AS

      • SELECT a,b,c

      • FROM d;

      • Then you can use view to select from

        • SELECT x

        • FROM view;

  • SQL Flavours

    • Free vs paid

    • All used with relational databases

    • Keywords are shared

    • Must follow universal standards

    • Only additional features -> different flavours

    • PostgreSQL

      • Free and opensource relational database system

      • Refers to system and associated flavour

    • SQL Server

      • Microsoft

      • free/paid

      • T-SQL is microsoft SQL flavour used with SQL server databases

    • Think of as dialects of same language

    • PostgreSQL vs SQL Server

      • LIMIT 2; selects top 2 entries

      • SELECT TOP (2) x,y selects top 2 entries

      • Limiting results is useful for testing code

    • Choosing a flavour

      • Any is a good choice, differences are small

  • Can’t have e.g. - - - in the code