CS/ECE/ME532 Assignment 1 Notes

Assignment 1 - Binary Classifier and Polynomial Vector Space

1. Binary Classifier

a) Expressing y as an inner product

The decision rule for the binary classifier is based on the sign of $x_1 a_1 + x_2 a_2 - b$. We can express this as an inner product $y = x^T w$, where:

  • $x = \begin{bmatrix} x_1 \\ x_2 \\ 1 \end{bmatrix}$ is the feature vector.

  • $w = \begin{bmatrix} a_1 \\ a_2 \\ -b \end{bmatrix}$ is the weight vector.

Thus, $y = x_1 a_1 + x_2 a_2 - b = x^T w$.

b) Decision Boundary as a Straight Line

The decision boundary is defined by $x_1 a_1 + x_2 a_2 = b$. To show this is a straight line in the $x_1$-$x_2$ plane, we can rearrange the equation to solve for $x_2$, assuming $a_2 \neq 0$ (if $a_2 = 0$ and $a_1 \neq 0$, the boundary is the vertical line $x_1 = b/a_1$, which is also a straight line):

$$x_2 = -\frac{a_1}{a_2} x_1 + \frac{b}{a_2}$$

This is in the form of a straight line $y = mx + c$, where:

  • Slope: $m = -\frac{a_1}{a_2}$

  • Intercept with the vertical axis ($x_2$): $c = \frac{b}{a_2}$
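
The rearrangement above can be sketched as a small helper. This is a minimal Python sketch, not part of the assignment: the function name `boundary_line` and the example values are hypothetical and chosen only for illustration.

```python
def boundary_line(a1, a2, b):
    """Return (slope, intercept) of the boundary a1*x1 + a2*x2 = b,
    written as x2 = slope * x1 + intercept."""
    if a2 == 0:
        # With a2 = 0 the boundary is the vertical line x1 = b / a1,
        # which has no slope-intercept form.
        raise ValueError("a2 must be nonzero for slope-intercept form")
    return -a1 / a2, b / a2

# Hypothetical example values (not taken from the assignment).
slope, intercept = boundary_line(3.0, 4.0, 2.0)

# A point built from the line equation should satisfy a1*x1 + a2*x2 = b.
x1 = 2.0
x2 = slope * x1 + intercept
residual = 3.0 * x1 + 4.0 * x2 - 2.0
```

Here `residual` comes out as zero, which confirms that points generated from the slope-intercept form lie on the original boundary.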

c) Feature Matrix X

Given the four data samples, the feature matrix X is constructed as follows:

$$X = \begin{bmatrix} 0 & 0.4 & 1 \\ 0.2 & 0.1 & 1 \\ 0.5 & 0.6 & 1 \\ 0.9 & 0.8 & 1 \end{bmatrix}$$

Each row represents a data sample, with columns corresponding to $x_1$, $x_2$, and a 1 for the bias term.
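
As a rough illustration of this construction, the matrix can be built by appending a constant 1 to each sample. The variable names `samples` and `X` are assumptions made for this sketch; the sample values are the four listed above.

```python
# The four data samples (x1, x2) from the feature matrix above.
samples = [(0.0, 0.4), (0.2, 0.1), (0.5, 0.6), (0.9, 0.8)]

# One row per sample: [x1, x2, 1]. The trailing 1 multiplies the
# bias entry (-b) of the weight vector.
X = [[x1, x2, 1.0] for x1, x2 in samples]

n_rows, n_cols = len(X), len(X[0])
```

The result has four rows and three columns, matching the matrix written out above.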

d) Sketching the Decision Boundary and Classifying Data

Given $a_1 = 1$, $a_2 = 2$, and $b = 1$, the decision boundary equation is:

$$x_1 + 2x_2 = 1$$

Or, solving for $x_2$:

$$x_2 = -\frac{1}{2} x_1 + \frac{1}{2}$$

This is a straight line with a slope of -1/2 and an $x_2$-intercept of 1/2; it crosses the $x_1$ axis at $x_1 = 1$. To classify the data points, evaluate $x_1 + 2x_2$ for each point and compare it with $b = 1$:

  1. (0, 0.4): 0 + 2(0.4) = 0.8 < 1, Class -1

  2. (0.2, 0.1): 0.2 + 2(0.1) = 0.4 < 1, Class -1

  3. (0.5, 0.6): 0.5 + 2(0.6) = 1.7 > 1, Class 1

  4. (0.9, 0.8): 0.9 + 2(0.8) = 2.5 > 1, Class 1

The points (0, 0.4) and (0.2, 0.1) are classified as -1, and (0.5, 0.6) and (0.9, 0.8) are classified as 1.
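
The same classification can be reproduced by taking the sign of $Xw$. The sketch below is a plain-Python illustration rather than the course script; the helper name `classify` is hypothetical, and mapping a score of exactly 0 to class -1 is an assumption, since the notes do not say how ties are handled.

```python
# Feature matrix from part c and weights w = [a1, a2, -b] with
# a1 = 1, a2 = 2, b = 1.
X = [
    [0.0, 0.4, 1.0],
    [0.2, 0.1, 1.0],
    [0.5, 0.6, 1.0],
    [0.9, 0.8, 1.0],
]
w = [1.0, 2.0, -1.0]

def classify(X, w):
    """Return a +1/-1 label per row of X from the sign of x^T w."""
    labels = []
    for row in X:
        score = sum(xi * wi for xi, wi in zip(row, w))
        # Assumption: a score of exactly 0 (on the boundary) maps to -1.
        labels.append(1 if score > 0 else -1)
    return labels

labels = classify(X, w)
```

Here `labels` matches the hand calculation: the first two points fall in class -1 and the last two in class 1.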

e) Linear Classifier Script

The linear classifier script classifies 5000 data points with two features. The decision boundary observed is a straight line, which separates the two classes.

f) Changing Classifier Weights

Changing the classifier weights to $w = \begin{bmatrix} 1.6 \\ 2 \\ -1.6 \end{bmatrix}$ moves the decision boundary to $1.6 x_1 + 2 x_2 = 1.6$, that is, $x_2 = -0.8 x_1 + 0.8$, a line with slope $-0.8$ and $x_2$-intercept $0.8$. This leads to a different linear separation of the data points, so some points are classified differently.
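
To make the effect of the new weights concrete, the sketch below compares two boundaries. It assumes the baseline weights are those from part d ($a_1 = 1$, $a_2 = 2$, $b = 1$), which may differ from the defaults in the course script, and it uses a hypothetical probe point that is not part of the assignment data.

```python
def slope_intercept(w):
    """Slope and x2-intercept of the boundary w[0]*x1 + w[1]*x2 + w[2] = 0."""
    return -w[0] / w[1], -w[2] / w[1]

def classify(point, w):
    """Sign of x^T w with x = [x1, x2, 1]; a score of 0 maps to -1 (assumption)."""
    score = point[0] * w[0] + point[1] * w[1] + w[2]
    return 1 if score > 0 else -1

w_old = [1.0, 2.0, -1.0]   # assumed baseline: the weights from part d
w_new = [1.6, 2.0, -1.6]   # weights from part f

old_line = slope_intercept(w_old)
new_line = slope_intercept(w_new)

probe = (0.6, 0.25)        # hypothetical point, not from the assignment data
old_label = classify(probe, w_old)
new_label = classify(probe, w_new)
```

The probe point lies above the part d boundary but below the part f boundary, so its label flips from 1 to -1 under the new weights.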

2. Polynomial Vector Space

a) P as a Vector Space

To show that $P$ is a vector space, we need to verify the vector space axioms:

  1. Closure under addition: If $p, q \in P$, then $p + q$ is also a polynomial of degree $\leq n$, so $p + q \in P$.

  2. Closure under scalar multiplication: If $p \in P$ and $c \in \mathbb{R}$, then $cp$ is also a polynomial of degree $\leq n$, so $cp \in P$.

  3. Commutativity of addition: For all $p, q \in P$, $p + q = q + p$.

  4. Associativity of addition: For all $p, q, r \in P$, $(p + q) + r = p + (q + r)$.

  5. Existence of additive identity: The zero polynomial $0$ is in $P$ (it can be viewed as a polynomial of degree $-\infty$), and for all $p \in P$, $p + 0 = p$.

  6. Existence of additive inverse: For every $p \in P$, there exists $-p \in P$ such that $p + (-p) = 0$.

  7. Distributivity of scalar multiplication with respect to vector addition: For all $a \in \mathbb{R}$ and $p, q \in P$, $a(p + q) = ap + aq$.

  8. Distributivity of scalar multiplication with respect to scalar addition: For all $a, b \in \mathbb{R}$ and $p \in P$, $(a + b)p = ap + bp$.

  9. Associativity of scalar multiplication: For all $a, b \in \mathbb{R}$ and $p \in P$, $a(bp) = (ab)p$.

  10. Existence of multiplicative identity: For all $p \in P$, $1p = p$.
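
One way to see why these axioms hold is to represent each polynomial of degree $\leq n$ by its $n + 1$ coefficients, so that addition and scalar multiplication act entry by entry, just as for vectors in $\mathbb{R}^{n+1}$. The sketch below assumes $n = 2$ and uses hypothetical helper names `add` and `scale`; it illustrates the axioms on examples and is not a proof.

```python
# A polynomial c0 + c1*x + c2*x^2 is stored as [c0, c1, c2] (n = 2 assumed).

def add(p, q):
    """Coefficient-wise sum of two polynomials of the same length."""
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    """Multiply every coefficient of p by the scalar c."""
    return [c * a for a in p]

p = [1, 0, 3]      # 1 + 3x^2
q = [2, -1, -3]    # 2 - x - 3x^2
zero = [0, 0, 0]   # the zero polynomial

s = add(p, q)                 # leading terms cancel: 3 - x
inverse_sum = add(p, scale(-1, p))
distributed = add(scale(2, p), scale(2, q))
```

Note that `s` has degree 1, not 2. This is why $P$ is defined with degree $\leq n$ rather than degree exactly $n$: the sum still has three coefficients and stays in $P$.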

b) Inner Product Definition

To show that $p^T q = \int_{-1}^{1} p(x) q(x)\,dx$ is an inner product, we need to verify the following properties:

  1. Symmetry: $p^T q = q^T p$ since $\int_{-1}^{1} p(x) q(x)\,dx = \int_{-1}^{1} q(x) p(x)\,dx$.

  2. Linearity: $(ap + bq)^T r = a(p^T r) + b(q^T r)$ for any scalars $a$, $b$. This holds because integration is linear.

  3. Positive-definiteness: $p^T p \geq 0$ and $p^T p = 0$ if and only if $p = 0$. Since $p(x)^2 \geq 0$ for all $x$, $\int_{-1}^{1} p(x)^2\,dx \geq 0$. If $\int_{-1}^{1} p(x)^2\,dx = 0$, then, because $p(x)^2$ is continuous and non-negative, $p(x) = 0$ for all $x \in [-1, 1]$. A nonzero polynomial has only finitely many roots, so $p$ is the zero polynomial.

c) Orthogonal Polynomials

To check for orthogonality, we need to compute the inner product of each pair of polynomials:

  • $p_1(x) = x$, $p_2(x) = 1 - x$, $p_3(x) = 3x^2 - 1$

  1. $p_1^T p_2 = \int_{-1}^{1} x(1 - x)\,dx = \int_{-1}^{1} (x - x^2)\,dx = \left[\frac{x^2}{2} - \frac{x^3}{3}\right]_{-1}^{1} = \left(\frac{1}{2} - \frac{1}{3}\right) - \left(\frac{1}{2} + \frac{1}{3}\right) = -\frac{2}{3}$. Not orthogonal.

  2. $p_1^T p_3 = \int_{-1}^{1} x(3x^2 - 1)\,dx = \int_{-1}^{1} (3x^3 - x)\,dx = \left[\frac{3x^4}{4} - \frac{x^2}{2}\right]_{-1}^{1} = \left(\frac{3}{4} - \frac{1}{2}\right) - \left(\frac{3}{4} - \frac{1}{2}\right) = 0$. Orthogonal.

  3. $p_2^T p_3 = \int_{-1}^{1} (1 - x)(3x^2 - 1)\,dx = \int_{-1}^{1} (3x^2 - 1 - 3x^3 + x)\,dx = \left[x^3 - x - \frac{3x^4}{4} + \frac{x^2}{2}\right]_{-1}^{1} = \left(1 - 1 - \frac{3}{4} + \frac{1}{2}\right) - \left(-1 + 1 - \frac{3}{4} + \frac{1}{2}\right) = 0$. Orthogonal.

Thus, $p_1(x)$ and $p_3(x)$ are orthogonal, and $p_2(x)$ and $p_3(x)$ are orthogonal, while $p_1(x)$ and $p_2(x)$ are not.
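
These three integrals can be double-checked with exact rational arithmetic. The sketch below stores each polynomial as a list of coefficients (constant term first) and uses the fact that $\int_{-1}^{1} x^k\,dx$ equals $2/(k+1)$ for even $k$ and $0$ for odd $k$. The helper names `poly_mul` and `inner` are hypothetical and introduced only for this check.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Coefficients of the product p(x) * q(x), constant term first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def inner(p, q):
    """Exact value of the integral of p(x) * q(x) over [-1, 1]."""
    # Odd powers integrate to 0; x^k with k even integrates to 2 / (k + 1).
    return sum(
        c * Fraction(2, k + 1)
        for k, c in enumerate(poly_mul(p, q))
        if k % 2 == 0
    )

p1 = [0, 1]        # x
p2 = [1, -1]       # 1 - x
p3 = [-1, 0, 3]    # 3x^2 - 1

ip12 = inner(p1, p2)
ip13 = inner(p1, p3)
ip23 = inner(p2, p3)
```

The values agree with the hand calculation: `ip12` is -2/3, while `ip13` and `ip23` are both 0.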