Lec 7-9
audio moving as a sound wave
sound waves (analog) —> electric signals (digital) —> sound waves (analog)
how do microphones work
as a transducer, converting waves to electric signals
how speakers work
the reverse of mics: a transducer converting an electric voltage/current to sound waves
2 ways of creating digital audio
sampling
synthesis
anatomy of audio
sound is modelled with sine waves as a function of time; every analog sound can be broken down into a (possibly infinite) sum of sine waves with varying amplitudes and phases
amplitude equation
a = (max - min) / 2 (peak deviation from the centre line)
period equation
period = 1/f
sine audio equation
y(t) = a × sin(2π × f × t + theta), where
a = amplitude (intensity dB)
f = frequency (pitch Hz)
theta = phase (relative location of sound)
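A minimal sketch of the sine equation above, assuming time is measured in seconds and the wave is evaluated at evenly spaced points. The function name `sine_samples` and its parameters are made up for illustration.

```python
import math

def sine_samples(a, f, theta, sample_rate, n):
    """Return n values of a * sin(2*pi*f*t + theta), with t = k / sample_rate."""
    return [a * math.sin(2 * math.pi * f * (k / sample_rate) + theta)
            for k in range(n)]

# 1 Hz wave (period = 1/f = 1 s), amplitude 2, no phase shift,
# evaluated 4 times per second: zero, peak, zero, valley
samples = sine_samples(a=2.0, f=1.0, theta=0.0, sample_rate=4, n=4)
print([round(s, 3) for s in samples])
```

Changing `theta` shifts where in its cycle the wave starts; changing `a` scales the peaks and valleys.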
hearing frequency limits
upper boundary = variable (!) and decreases with age
roughly 16 - 20 kHz
how do high frequencies change the timbre
it affects the colouration of sounds
digitizing sound
takes samples (measurements) at a fixed rate (sampling rate) and records them
how sampling rate affects files
too fast = large file
too slow = inaccurate
what happens if low sample rate is taken
reproduces a sound that doesn’t match the original frequency (aliasing)
nyquist-shannon
sample interval <= half period
but to catch peaks and valleys = sampling rate (frequency) >= 2 x max frequency (for accurate reproduction)
if highest frequency of sample is 5 kHz, what sampling rates can be used?
5 × 2 = 10 kHz is the minimum sampling frequency, so any rate >= 10 kHz works
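A small sketch of the Nyquist rule above, plus what a too-low rate would do. The folding formula in `apparent_frequency` is the standard aliasing relation, not something stated in these notes, and both function names are illustrative.

```python
def min_sampling_rate(max_frequency_hz):
    """Nyquist-Shannon: sampling rate must be at least 2 x the max frequency."""
    return 2 * max_frequency_hz

def apparent_frequency(signal_hz, sample_rate_hz):
    """Frequency the sampled tone appears to have (assumed folding formula)."""
    nearest_multiple = round(signal_hz / sample_rate_hz) * sample_rate_hz
    return abs(signal_hz - nearest_multiple)

print(min_sampling_rate(5_000))           # 10000
print(apparent_frequency(5_000, 12_000))  # 5000 (rate is high enough)
print(apparent_frequency(5_000, 6_000))   # 1000 (too slow: wrong frequency)
```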
bit depth
number of bits used to encode a single sample
higher bit depth = more accurate sample (quiet sounds better preserved)
bit depth in practice —> <8 bits
not used for sound, but for recording physical processes (blood pressure, heartbeat, etc)
bit depth in practice —> 8 bits
common in telephony and quantization noise is sometimes perceptible
bit depth in practice —> 16 bits
most high quality sound (CDs, MP3, etc)
bit depth in practice —> 24+ bits
even higher quality and dynamic range, often used before or during sound processing/editing (mastering)
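To see why a higher bit depth preserves quiet sounds better, here is a rough sketch of uniform quantization. It assumes samples lie in the range -1.0 to 1.0; `quantize` and `dequantize` are hypothetical helper names, and real converters may work differently.

```python
def quantize(sample, bit_depth):
    """Map a sample in [-1.0, 1.0] to one of 2**bit_depth integer levels."""
    levels = 2 ** bit_depth
    index = round((sample + 1.0) / 2.0 * (levels - 1))
    return max(0, min(levels - 1, index))

def dequantize(index, bit_depth):
    """Map an integer level back to the range [-1.0, 1.0]."""
    levels = 2 ** bit_depth
    return index / (levels - 1) * 2.0 - 1.0

quiet = 0.01  # a very quiet sample
for bits in (2, 8, 16):
    restored = dequantize(quantize(quiet, bits), bits)
    print(bits, "bits -> error", abs(restored - quiet))
```

The error shrinks as the bit depth grows, which is the quantization noise mentioned for 8-bit audio.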
bit rate
combines bit depth and sampling rate —> higher bit rate = better quality = larger file size
bit rate = bit depth x sample frequency x channels = # bits /sec
if sampling rate is 6 and bit depth is 2, what’s uncompressed bit rate?
6 × 2 × 1 (channel) = 12 bits/sec
if bit rate is 12 and sample frequency is 6, what’s bit depth?
bit depth = bit rate / (sampling frequency x channels)
= 12 / (6 × 1) = 2 bits
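The two bit rate questions above can be checked with a short sketch. The function names are illustrative; the last line uses the usual CD values (16 bits, 44.1 kHz, stereo) as an extra example that is not from these notes.

```python
def bit_rate(bit_depth, sample_rate_hz, channels=1):
    """Uncompressed bit rate in bits per second."""
    return bit_depth * sample_rate_hz * channels

def bit_depth_from(bit_rate_bps, sample_rate_hz, channels=1):
    """Rearranged formula: bit depth = bit rate / (sample rate x channels)."""
    return bit_rate_bps / (sample_rate_hz * channels)

print(bit_rate(2, 6))           # 12
print(bit_depth_from(12, 6))    # 2.0
print(bit_rate(16, 44_100, 2))  # 1411200
```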
format of uncompressed audio
usually in .wav
audio compression technique —> lossless
similar to text compression techniques
audio compression technique —> lossy
removes imperceptible sounds (using psychoacoustic models) and reduces bit rate (less accurate reproduction of original)
audio compression technique —> codecs
combines techniques with compression and decompression algorithms for audio/video
portmanteau
codec is a word combination of coder (compressor) and decoder (decompressor)
example codec - FLAC (free lossless audio codec)
lossless
62% compression ratio
archives high quality audio
example codec - MP3 (MPEG-1 Audio Layer 3, from the Moving Picture Experts Group)
lossy
23% ratio
uses psychoacoustics and huffman encoding
lower bit rates
mobile devices
example codec - AAC (advanced audio coding)
lossy
14% ratio
uses psychoacoustics, huffman encoding
lower bit rates
better perceived quality than MP3
apple devices and modern smartphones
representing colour
human retina has 4 types of light-sensitive cells
1 = low-light night vision rods
3 = s/m/l (short/medium/long wavelength) cones for regular vision (l/m/s in ASCII order = RGB)
what happens when all 3 cones are stimulated by light
equal stimulation of all 3 is seen as grey/white; other colours are formed by stimulating the R, G, and B cones in different combinations
bits for colour range
3 integers in the range 0-255 (8 bits each for R, G, B) reproduce colour; 10 bits per channel is possible for high dynamic range
CMYK
Cyan, Magenta, Yellow, blacK: the subtractive counterparts of RGB, used for print
starts with white paper and subtracts individual components
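A rough sketch of the subtractive idea: start from white and work out how much of each ink removes the unwanted light. This uses the common naive RGB to CMYK formula as an assumption; real printing relies on colour profiles, and `rgb_to_cmyk` is an illustrative name.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion from 0-255 RGB to CMYK fractions in 0.0-1.0."""
    r_, g_, b_ = r / 255, g / 255, b / 255
    k = 1 - max(r_, g_, b_)          # black = how dark the brightest channel is
    if k == 1:                       # pure black: no coloured ink needed
        return 0.0, 0.0, 0.0, 1.0
    c = (1 - r_ - k) / (1 - k)       # cyan subtracts red
    m = (1 - g_ - k) / (1 - k)       # magenta subtracts green
    y = (1 - b_ - k) / (1 - k)       # yellow subtracts blue
    return c, m, y, k

print(rgb_to_cmyk(255, 0, 0))      # red = magenta + yellow ink
print(rgb_to_cmyk(255, 255, 255))  # white paper, no ink
```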
HSL vs YUV
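HSL = Hue, Saturation, Lightness (luminance)
YUV = Luminance (Y), blue difference (U), red difference (V)
both resemble the colour wheel and how the brain sees colour
these are alternative colour models used in specific fields (e.g. HSL in design tools, YUV in video)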
representing colour with hexadecimal
each RGB is 1 byte
256 values = 1 byte = 8 bits = 2 hexadecimal digits
hexadecimal digits to represent colour
RRGGBB
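A short sketch of the RRGGBB encoding: each channel is one byte, written as two hexadecimal digits. The helper names `rgb_to_hex` and `hex_to_rgb` are made up for this example.

```python
def rgb_to_hex(r, g, b):
    """Encode three 0-255 integers as a 6-digit RRGGBB string."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("each channel must fit in 1 byte (0-255)")
    return f"{r:02X}{g:02X}{b:02X}"

def hex_to_rgb(code):
    """Decode RRGGBB (an optional leading # is ignored) into (r, g, b)."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

print(rgb_to_hex(255, 165, 0))  # FFA500
print(hex_to_rgb("#FFA500"))    # (255, 165, 0)
```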
pixel
dots of colour in an image or display
resolution
number of pixels in an image (width × height), which determines its size
sometimes refers to pixel density (ppi = # pixels / length in inches)
vector graphics
tells how an image is displayed, defined using equations, lines, curves, and polygons
lossless b/c no pixelation of images as instructions are encoded
how vector graphics works
can be enlarged without loss of detail or file size change
images created using drawing applications or text editors
scalable vector graphics (svg)
text-based then compressed accordingly
computer generates simple images through this
raster graphics
complicated images are specified as a matrix of pixels, e.g. in a painting application
used for photographs, but enlarging loses detail because the image suffers from pixelation
indexed colour
was popular for compressing images, and only represents useful or used colours
similar to compressing text with keyword encoding
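A minimal sketch of indexed colour, assuming pixels are (r, g, b) tuples: the palette holds only the colours actually used, and each pixel becomes a small index into it. `index_colours` is an illustrative name, not a real library function.

```python
def index_colours(pixels):
    """Return (palette, indices) for a list of (r, g, b) pixels."""
    palette = []   # unique colours, in order of first appearance
    lookup = {}    # colour -> position in the palette
    indices = []   # one small integer per pixel
    for pixel in pixels:
        if pixel not in lookup:
            lookup[pixel] = len(palette)
            palette.append(pixel)
        indices.append(lookup[pixel])
    return palette, indices

red, white = (255, 0, 0), (255, 255, 255)
palette, indices = index_colours([red, red, white, red])
print(palette)  # [(255, 0, 0), (255, 255, 255)]
print(indices)  # [0, 0, 1, 0]
```

With at most 256 palette entries, each index fits in 1 byte instead of the 3 bytes of a full RGB value.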
GIF (graphics interchange format)
lossless
indexed colours (256)
LZW coding (lossless, like Huffman’s)
10% ratio
allows transparency and animation
less modern than PNG
PNG (portable network graphics)
lossless - encodes with high level of detail
indexed colours or full colour
DEFLATE compression (LZ77 + Huffman coding)
7% ratio
allows transparency
GIF vs PNG
both used for line drawings, logos, or diagrams
PNG files can be larger, but are lossless with full colour
GIF is lower quality because it’s limited to 256 colours
JPEG (joint photographic experts group)
ideal format for photos, but bad for images with text or sharp lines
lossy
1-10% ratio
divides the entire image into blocks of 8×8 pixels; each block stores its average intensity plus progressively finer detail, and the finest detail is discarded first
guide to choosing image format
video structure
comprised of frames of still images and audio
rapid succession gives appearance of motion (24 - 60 fps)
aspect ratio
16:9 = # pixels horizontally : # pixels vertically
uncompressed video size
takes up vast storage space, greater than 100 MB per second for full HD, so compression techniques are usually used
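A back-of-the-envelope sketch of why uncompressed video is so large. It assumes full HD means 1920×1080 pixels at 3 bytes per pixel (one byte each for R, G, B) and ignores audio; the function name is illustrative.

```python
def uncompressed_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Raw size of the image frames only (no audio, no container overhead)."""
    return width * height * bytes_per_pixel * fps * seconds

one_second = uncompressed_video_bytes(1920, 1080, fps=24, seconds=1)
print(one_second)                 # 149299200
print(one_second / 10**6, "MB")   # about 149 MB for a single second
```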
video compression
there’s typically little change between successive video frames, so compression techniques remove the redundant information
spatial (intraframe) compression technique
M-JPEG
lossless or lossy
uses info within same frame to reduce file size
temporal (interframe) compression technique
MPEG
lossy
uses data from nearby frames (before and after) to reduce file size
temporal (interframe) compression technique - key (I) frames
compressed using spatial techniques and can be reproduced independently
it’s inserted at scene changes and/or regular intervals to preserve quality of streaming or playback
P and B frames
Predictive and Bidirectional
the other frames, encoded by saving only differences: P frames from earlier frames (back to the previous key-frame), B frames from both earlier and later frames
what P/B frames are used for
if a keyframe’s lost during streaming or skipped during fwd/bwd, the displayed video can be distorted, but it’s good for video distribution (not editing)
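A toy sketch of interframe coding, assuming each frame is just a list of pixel values. Key (I) frames are stored whole and every other frame stores only the pixels that differ from the frame before it. This only models P-style frames (no B frames, no motion estimation), and `encode` / `decode` are hypothetical names.

```python
def encode(frames, key_interval=3):
    """Store a full key frame every key_interval frames, differences otherwise."""
    encoded = []
    for i, frame in enumerate(frames):
        if i % key_interval == 0:
            encoded.append(("I", list(frame)))
        else:
            previous = frames[i - 1]
            diff = {j: v for j, (v, p) in enumerate(zip(frame, previous)) if v != p}
            encoded.append(("P", diff))
    return encoded

def decode(encoded):
    """Rebuild every frame; a P frame needs the frame decoded just before it."""
    frames = []
    for kind, data in encoded:
        if kind == "I":
            frame = list(data)
        else:
            frame = list(frames[-1])
            for j, v in data.items():
                frame[j] = v
        frames.append(frame)
    return frames

video = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 9, 0], [5, 5, 5, 5]]
packed = encode(video)
print(packed[1])  # ('P', {1: 9})
```

Because each P frame is rebuilt from earlier ones, losing a key frame breaks every frame that follows until the next key frame.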
lossless codec
HuffYUV
lossy codecs
MPEG-2, H.264, H.265
file containers vs codecs
file containers put video/audio into a file, but codecs compress
file containers
combines video and audio streams into one file and doesn’t indicate the codecs used
some allow additional data; the container is indicated by the file extension
.avi (audio video interleave)
very old, not good for streaming
.mp4 (MPEG-4)
supports multiple audio/video streams and subtitles, but limited codecs (mobile devices)
.mkv (matroska)
like mp4, but supports unlimited streams and codecs; support on mobile devices varies
claude shannon
applied boolean logic to create digital computing machines (father of modern information age)
gate
device that performs basic operation on electric signals with transistors
circuits
gates combined to make more complicated tasks
boolean expressions
uses boolean algebra, a mathematical notation for expressing two-valued logic
logic diagrams
graphical representation of a circuit, each gate with its own symbol
truth tables
table showing all possible input values and the associated output values
functional notation
uses function name with a list of arguments in place of operands used in boolean logic
transistor
trans-conductance variable resistor
either an amplifier or binary switch (bipolar)
how transistors work
implements logic function in hardware
outputs are passed on as inputs to other transistors for complex operations
pros of transistors
faster and smaller than relays, less fragile and less power-consuming than vacuum tubes
NOT gate
outputs the inverted input value, accepts one value
*AND gate
if all 1’s, then = 1
if not then output is 0
*OR gate
if all 0, then = 0
otherwise output = 1
NAND gate
if all are 1, then = 0
or else it = 1
NOR gate
if all 0, then = 1
or else = 0
*XOR gate
if both inputs same, then = 0
or else it’s 1
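The gate rules above can be written as small functions on 0/1 values and printed as truth tables. This is a sketch for two-input gates only; the uppercase function names and `truth_table` are chosen for this example.

```python
def NOT(a):
    return 1 - a

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def NAND(a, b):
    return NOT(AND(a, b))

def NOR(a, b):
    return NOT(OR(a, b))

def XOR(a, b):
    return a ^ b

def truth_table(gate):
    """List (a, b, output) for every combination of two inputs."""
    return [(a, b, gate(a, b)) for a in (0, 1) for b in (0, 1)]

for gate in (AND, OR, NAND, NOR, XOR):
    print(gate.__name__, truth_table(gate))
```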
combinational circuits
uses output of one gate as input for another
adders
special circuits that carry out addition, which is performed in binary
half adder
circuit that computes sum of 2 bits and produces the correct carry bit
full adder
circuit that computes sum of 2 bits and a carry in bit, then produces carry out bit
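A sketch of both adders using the usual gate construction (XOR for the sum, AND for the carry, two half adders plus an OR for the full adder). Chaining full adders into `add_bits` is an extra illustration, and it assumes both bit lists have the same length and are written least significant bit first.

```python
def half_adder(a, b):
    """Return (sum, carry) for two bits."""
    return a ^ b, a & b

def full_adder(a, b, carry_in):
    """Return (sum, carry_out) for two bits plus a carry-in bit."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, carry_in)
    return total, carry1 | carry2

def add_bits(x_bits, y_bits):
    """Add two equal-length bit lists (least significant bit first)."""
    result, carry = [], 0
    for a, b in zip(x_bits, y_bits):
        total, carry = full_adder(a, b, carry)
        result.append(total)
    return result + [carry]

print(half_adder(1, 1))          # (0, 1)
print(full_adder(1, 1, 1))       # (1, 1)
print(add_bits([1, 1], [1, 0]))  # 3 + 1 = 4 -> [0, 0, 1]
```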
CPU
central processing unit
RAM
random access memory
SDRAM
synchronous dynamic RAM (double data rate)
ROM
read only memory
PCI
peripheral component interconnect
replaced by PCIe
BUS
wires connecting CPU and other components
von Neumann architecture
created a simplified computing model w/
cpu
memory
I/O devices
maybe secondary storage
data flow
hardware diagram
CPU
brains of computer
decodes instructions and carries out the corresponding arithmetic, logic, or control operations
sometimes replaceable in desktops, but usually soldered to motherboard
processing speed
computation occurs in cycles and simpler designed CPUs require more cycles/instruction
Hz = cycles/sec
storage space
data’s represented in bits
byte = 8 bits (B)
GHz and TB
base units (Hz, bytes) are combined with metric or binary prefixes to represent the larger magnitudes of contemporary computing
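A small sketch of how prefixes scale the base units. The tables list the standard metric (powers of 10) and binary (powers of 2) prefixes; the helper names are illustrative.

```python
METRIC = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

def to_base_units(value, prefix):
    """Convert e.g. 1 'T' (tera) or 1 'Ti' (tebi) into plain base units."""
    table = {**METRIC, **BINARY}
    return value * table[prefix]

def cycle_time_ns(clock_ghz):
    """Length of one clock cycle in nanoseconds (Hz = cycles/sec)."""
    return 1 / clock_ghz

print(to_base_units(1, "T"))    # 1000000000000 bytes in 1 TB
print(to_base_units(1, "Ti"))   # 1099511627776 bytes in 1 TiB
print(cycle_time_ns(2.0))       # 0.5 ns per cycle at 2 GHz
```

The gap between the two prefix families is why a drive sold as 1 TB shows up as roughly 0.91 TiB.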
Bus
connection between components or devices in a computer, often a set of parallel wires
some modern designs put the CPU and memory on a single chip