Saturday, 17 February 2018

matplotlib pyplot tutorial

In this post we will learn some basic features of the matplotlib pyplot API through demos. This API collects functions that make matplotlib work like MATLAB. It supports creating a figure, plotting lines in a plotting area, labeling a plot, ...
1. Plot the parabola function y = (x-5)^2 with these requirements:
+ '-':  solid line style (refer to this)
+ 'r':  red
+ grid(True):  show grid
+ xlabel('x'):  x axis named 'x'
+ ylabel('(x-5)^2'):  y axis named '(x-5)^2'
+ plt.axis([0, 10, 0, 20]):  set axis limit xmin=0, xmax=10, ymin=0, ymax=20
+ show values of (x, y) at plotted points
Source code:
#import matplotlib and pyplot api
import matplotlib.pyplot as plt
#import numpy to generate (x-5)^2
import numpy as np

#just plot x values in range 0, 10 step 0.5
x = np.arange(0, 10, 0.5)

#generate y values
y = np.power(x-5, 2)

#'r-': 'r' is red, '-' is solid line style
plt.plot(x, y, 'r-')

#y axis named '(x-5)^2'
plt.ylabel('(x-5)^2')

#x axis named 'x'
plt.xlabel('x')

#set axis limit xmin=0, xmax=10, ymin=0, ymax=20
plt.axis([0, 10, 0, 20])

#show grid
plt.grid(True)

#show values of (x, y) at plotted points
for tx in x:
    ty = np.power(tx-5, 2)
    plt.text(tx, ty, '(' + str(tx) + ',' + str(ty) + ')')

#now show the graph
plt.show()
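One detail of the listing above is worth checking: np.arange(0, 10, 0.5) stops before 10, so the last plotted point is x = 9.5. The small sketch below compares it with np.linspace, which is one way (not used in the demo above) to include the endpoint. The variable names are only for illustration.

```python
import numpy as np

# np.arange excludes the stop value: 0, 0.5, ..., 9.5 (20 points)
x_arange = np.arange(0, 10, 0.5)

# np.linspace includes both endpoints: 0, 0.5, ..., 10 (21 points)
x_linspace = np.linspace(0, 10, 21)

# same parabola as in the demo, evaluated on both grids
y_arange = np.power(x_arange - 5, 2)
y_linspace = np.power(x_linspace - 5, 2)

print(len(x_arange), x_arange[-1])      # 20 9.5
print(len(x_linspace), x_linspace[-1])  # 21 10.0
```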
2. Axis points are text
The x-axis values are text labels [Pecan, Pumpkin, Chess] instead of numbers.
The chart has the form shown below:
Source code:
import matplotlib.pyplot as plt

#names and values of points
names = ['Pecan', 'Pumpkin', 'Chess']
values = [500, 300, 400]

#set up figure (1)
plt.figure(1)

#set locations of names
x = [1, 2, 3]

#plot bar chart
plt.bar(x, values)

#map the locations and labels of names
plt.xticks(x, names)

#set title
plt.title('Flavor')

#show graph
plt.show()
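As far as I know, matplotlib 2.1 and later also accept the names directly as x values, so the separate list of locations and the xticks() call are not needed. The sketch below assumes such a version. It selects the non-interactive "Agg" backend only so that it runs without a display; remove that line and call plt.show() to see the chart.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove to use plt.show()
import matplotlib.pyplot as plt

names = ['Pecan', 'Pumpkin', 'Chess']
values = [500, 300, 400]

plt.figure()

# pass the names directly; matplotlib places and labels the bars itself
bars = plt.bar(names, values)
plt.title('Flavor')

# one bar per name, with heights taken from values
heights = [float(bar.get_height()) for bar in bars]
print(heights)
```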
3. Multiple figures and subplot
We use figure(id) to select the figure with the given id that we want to operate on, and subplot(nrows, ncols, index) to divide the figure into nrows x ncols areas. The index indicates which area we want to plot on; it is counted from 1, row by row, starting at the top-left. E.g.:
subplot(2, 2, 1) or subplot(221): there are 4 areas and we focus on the top-left area.
subplot(2, 2, 4) or subplot(224): there are 4 areas and we focus on the bottom-right area.
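Here is a small sketch to check how the index maps to a position and how figure(id) switches the current figure. It uses the "Agg" backend only so that it runs without a display (that line is my assumption, not part of the demo). Positions are read in figure coordinates, where the origin is the bottom-left corner.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove to use plt.show()
import matplotlib.pyplot as plt

plt.figure(1)
top_left = plt.subplot(2, 2, 1)      # row 1, column 1
bottom_right = plt.subplot(224)      # short form of subplot(2, 2, 4)

plt.figure(2)                        # figure (2) becomes the current figure
plt.figure(1)                        # switch back to figure (1)
current = plt.gcf().number

tl = top_left.get_position()
br = bottom_right.get_position()

print(current)                       # 1
print(tl.x0 < br.x0, tl.y0 > br.y0)  # True True
```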

Let's make a demo that uses 3 figures:
+ In figure (1), there are 2 plotting areas that plot the functions y=x and y=2*x
+ In figure (2), there are 2 plotting areas that plot the functions y=3*x and y=4*x
+ In figure (3), there are 4 plotting areas that plot the functions y=x, y=2*x, y=3*x and y=4*x
Source code:
import matplotlib.pyplot as plt

#figure (1) has 2 areas (1 row, 2 cols, idx=1 or 2) 
plt.figure(1) 

#focus on the left area w index=1 of figure (1)
plt.subplot(121)         
plt.plot([1, 2, 3], [1, 2, 3])
plt.title('y=x')

#figure (2) has 2 areas (2 rows, 1 col, idx=1 or 2) 
plt.figure(2)

#focus on the top area w index=1 of figure (2)
plt.subplot(211)              
plt.plot([1, 2, 3], [3, 6, 9])
plt.title('y=3*x')

#focus on the bottom area w index=2 of figure (2)
plt.subplot(212)              
plt.plot([1, 2, 3], [4, 8, 12])
plt.title('y=4*x')

#re-activate figure (1) to continue plotting
plt.figure(1) 
 
#focus on the right area w index=2 of figure (1)
plt.subplot(122)     
plt.plot([1, 2, 3], [2, 4, 6])
plt.title('y=2*x')

# the figure (3) has 4 areas (2 rows, 2 cols, idx=1,2,3,4)
plt.figure(3) 

#focus on top-left area idx=1               
plt.subplot(221)            
plt.plot([1, 2, 3], [1, 2, 3])
plt.title('y=x')

#focus on top-right area idx=2
plt.subplot(222)     
plt.plot([1, 2, 3], [2, 4, 6])
plt.title('y=2*x')

#focus on bottom-left area idx=3
plt.subplot(223)              
plt.plot([1, 2, 3], [3, 6, 9])
plt.title('y=3*x')

#focus on bottom-right area idx=4
plt.subplot(224)              
plt.plot([1, 2, 3], [4, 8, 12])
plt.title('y=4*x')

plt.show()

Friday, 9 February 2018

Note 1: Machine learning categories 

There are 2 main machine learning categories:
- Supervised learning
- Unsupervised learning
1. What is Supervised learning?
We have a data set (labeled training data, or training examples). Each element of the data set is a pair of an input object and a desired output value. A supervised learning algorithm analyzes the data set and produces a function, which can be used to predict outputs for new inputs. Let's see 2 examples:
- The first example is predicting housing prices. Each element of the data set is a pair of the area of a house in square feet and its price. We plot this data set on a graph. The horizontal axis represents the size of the house in square feet, while the vertical axis represents the price of the house in $. The learning algorithm tries to produce a function that fits this data set. This function may be a straight line (pink line) or a quadratic function (blue line). With the pink line, if the input is 750 ft^2, the predicted output is $150k. With the blue line, the predicted output is $200k.
- The second example is predicting whether a breast tumor is malignant or benign. Each element of the data set is a pair of an input, the tumor size, and an output, yes (the tumor is malignant) or no (the tumor is benign). This is a classification problem, where the learning algorithm tries to classify the output as yes or no (there may be more than 2 classes).
- Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we try to predict a continuous output value (as in the housing price example). In a classification problem, we instead try to predict a discrete output value (such as yes or no).
The supervised learning process
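To make the housing price example concrete, here is a minimal sketch that fits a straight line and a quadratic function to a tiny training set using a least-squares polynomial fit. The numbers are made up for illustration and are not the data behind the graph described above, so the predictions differ from the $150k and $200k mentioned there.

```python
import numpy as np

# made-up training data, for illustration only:
# house size (ft^2) and price (k$)
size = np.array([500.0, 1000.0, 1500.0, 2000.0, 2500.0])
price = np.array([165.0, 260.0, 335.0, 390.0, 425.0])

# "learning" here is a least-squares polynomial fit
line = np.poly1d(np.polyfit(size, price, 1))      # straight line
parabola = np.poly1d(np.polyfit(size, price, 2))  # quadratic function

# predict the price of a house that is not in the training data
new_size = 750.0
line_pred = line(new_size)
parabola_pred = parabola(new_size)

print(round(line_pred, 1), round(parabola_pred, 1))  # 217.5 215.0
```

The two models give different predictions for the same input, just like the pink and blue lines in the example.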
2. What is Unsupervised learning?
An unsupervised learning algorithm analyzes the data set (unlabeled training data) and produces a function that describes some structure in the data set. Take Google News as an example of using unsupervised learning. On that website, the categories are listed on the left side and the news (from many websites) related to the selected category is shown in the middle. How can news from many websites be grouped together? Every day, Googlebot goes through other news websites, reads the news, and uses unsupervised learning to find the structure of the news and group stories with similar structure together (cohesive news). (Supervised learning is then used to classify these news stories into categories.)
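The sketch below shows the idea of grouping unlabeled data on a toy scale. It is not how Google News works. It is a few iterations of a k-means style loop (my choice of technique for illustration) on made-up 2-D points, with the number of groups and the starting centers chosen by hand.

```python
import numpy as np

# made-up 2-D points forming two obvious groups (no labels are given)
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])

# assumption: we look for 2 groups and start from two of the points
centers = points[[0, 3]].copy()

for _ in range(10):
    # distance from every point to every center, shape (6, 2)
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    # assign each point to its nearest center
    groups = dists.argmin(axis=1)
    # move each center to the mean of the points assigned to it
    centers = np.array([points[groups == k].mean(axis=0) for k in range(2)])

print(groups)  # [0 0 0 1 1 1]
```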

Tuesday, 11 July 2017

Python numpy for Linear Algebra and Machine Learning

1. Introduction 
- I will show you how to use Python numpy for linear algebra and matrices. We will focus on these topics:
+ Scalars, vectors and matrices
+ Vector and matrix calculations
+ Identity, inverse matrices & determinants
+ Solving simultaneous equations
2. Setup
- In order to install numpy:
+ Try this first: pip install numpy. If it is not successful, follow the steps below.
+ Download and unzip the package at: https://sourceforge.net/projects/numpy/files/NumPy/
+ Find the file setup.py and, from the command line, run the command:
python setup.py install
- In order to use numpy in python source code, just use import:
import numpy as np
3. Let's start
3.1 Scalar
- An element of a field, usually described by a real number
3.2 Vector
- Column of numbers
 Figure: vector
- The length of the vector above is calculated by: |x| = (x1^2 + x2^2 + x3^2)^(1/2)
- In numpy we use the norm() function to calculate the length of a vector:

#import numpy and the norm() function
import numpy as np
from numpy.linalg import norm

#column vector x
x = np.array([[1], [2], [3]])
print(x)
#length of vector x
print(norm(x))
Figure: calculate length of a vector
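As a quick check, the sketch below computes the length once directly from the formula and once with norm(). The variable names are only for illustration.

```python
import numpy as np
from numpy.linalg import norm

x = np.array([[1], [2], [3]])

# length from the formula: (1^2 + 2^2 + 3^2)^(1/2) = 14^(1/2)
length_by_formula = np.sqrt(np.sum(x ** 2))

# length computed by numpy
length_by_norm = norm(x)

print(np.isclose(length_by_formula, length_by_norm))  # True
```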

3.3 Matrices
- A rectangular array of numbers arranged in rows and columns. Its size is defined as rows x columns (R x C).
Figure: matrix R=2 and C=3
- In order to express a matrix, we will use array in numpy. The example below uses a 3x3 matrix.
import numpy as np

matrix = np.array([[1, 2, 3], 
                   [5, 4, 1], 
                   [6, 7, 4]])

#print matrix
print(matrix)

#print type of matrix is <class 'numpy.ndarray'>                   
print(type(matrix)) 

#print size of matrix is (3, 3)
print(matrix.shape) 

#print element row=2, col=3 but index start from 0 
#so row=2(index=1), col=3(index=2) => return 1
print(matrix[1, 2])
3.4 Transposition
Transposition changes rows to columns and columns to rows.
Figure: b^T is the transposition of b, A^T is the transposition of A
 - We use ".T" to calculate the transposition of a matrix
b = np.array([[1], 
              [1], 
              [2]])
print(b)
#Transposition
print(b.T)

A = np.array([[1, 2, 3], 
              [5, 4, 1], 
              [6, 7, 4]])

#print A
print(A)
#Transposition
print(A.T)



Figure: calculate the Transposition of matrix using numpy
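The effect of ".T" is easiest to see on a non-square matrix. The sketch below uses a hypothetical 2x3 matrix M (not from the figures above) and checks the shapes and one pair of swapped elements.

```python
import numpy as np

b = np.array([[1], [1], [2]])          # column vector, shape (3, 1)
M = np.array([[1, 2, 3], [4, 5, 6]])   # hypothetical 2x3 matrix

print(b.T.shape)   # (1, 3)
print(M.T.shape)   # (3, 2)

# element (row i, col j) of M is element (row j, col i) of M.T
print(M[0, 2] == M.T[2, 0])       # True

# transposing twice gives the original matrix back
print(np.array_equal(M.T.T, M))   # True
```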
3.5 Matrix Calculations
3.5.1. Addition
 Figure: Matrix addition

#Matrix Calculations
#Addition 
#Commutative: A+B=B+A
#Associative:  (A+B)+C=A+(B+C)
A = np.array([[2, 4], 
              [2, 5]])
B = np.array([[1, 0], 
              [3, 1]])
print(A)
print(B)
print(A+B) 
Figure: Add 2 matrices
- Commutative: A+B=B+A
- Associative:  (A+B)+C=A+(B+C)
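These two properties can be checked numerically. In the sketch below, C is a hypothetical third matrix that I added only for the associativity check.

```python
import numpy as np

A = np.array([[2, 4], [2, 5]])
B = np.array([[1, 0], [3, 1]])
C = np.array([[1, 1], [1, 1]])  # hypothetical third matrix

# Commutative: A+B=B+A
commutative = np.array_equal(A + B, B + A)

# Associative: (A+B)+C=A+(B+C)
associative = np.array_equal((A + B) + C, A + (B + C))

print(commutative, associative)  # True True
```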
3.5.2. Subtraction
Figure: Matrix subtraction

#Subtraction    
A = np.array([[2, 4], [5, 3]])
B = np.array([[1, 2], [3, 4]])         
print(A)
print(B)
print(A-B)
 Figure: Subtract 2 matrices
3.5.3. Scalar multiplication
- Scalar x matrix = scalar multiplication
Figure: Scalar multiplication
#Scalar multiplication
#Scalar x matrix = scalar multiplication
A = np.array([[1, 2], [3, 4]])             
print(A)
print(2*A)
Figure: Scalar x matrix
3.5.4. Matrix Multiplication
A is an MxN matrix and B is an RxS matrix. AxB is possible only if N=R (number of columns in A = number of rows in B). The result will be an MxS matrix.
 Figure: matrix A x matrix B
A = np.array([[1, 0], [2, 3]])
B = np.array([[2, 3], [1, 1]])

print(A)
print(B)
#Matrix Multiplication
print(A.dot(B))
Figure: Matrix Multiplication
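The shape rule can be seen with two non-square matrices. The sketch below uses matrices of ones with hypothetical sizes 2x3 and 3x4; multiplying them in the wrong order makes numpy raise a ValueError.

```python
import numpy as np

A = np.ones((2, 3))   # M x N = 2 x 3
B = np.ones((3, 4))   # R x S = 3 x 4, so N = R

# the result is M x S
print(A.dot(B).shape)   # (2, 4)

# B x A is not possible here: B has 4 columns but A has 2 rows
try:
    B.dot(A)
    product_defined = True
except ValueError:
    product_defined = False

print(product_defined)  # False
```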
- Matrix multiplication is NOT commutative: AB≠BA
- Matrix multiplication IS associative: A(BC)=(AB)C
- Matrix multiplication IS distributive: A(B+C)=AB+AC and (A+B)C=AC+BC
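The three properties can be checked with the matrices A and B from the example above. C is a hypothetical third matrix that I added for the check.

```python
import numpy as np

A = np.array([[1, 0], [2, 3]])
B = np.array([[2, 3], [1, 1]])
C = np.array([[0, 1], [1, 2]])  # hypothetical third matrix

# NOT commutative: AB and BA differ for these matrices
commutative = np.array_equal(A.dot(B), B.dot(A))

# associative: A(BC)=(AB)C
associative = np.array_equal(A.dot(B.dot(C)), A.dot(B).dot(C))

# distributive: A(B+C)=AB+AC and (A+B)C=AC+BC
left_distributive = np.array_equal(A.dot(B + C), A.dot(B) + A.dot(C))
right_distributive = np.array_equal((A + B).dot(C), A.dot(C) + B.dot(C))

print(commutative)                            # False
print(associative)                            # True
print(left_distributive, right_distributive)  # True True
```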
3.6 Vector Products
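Suppose that we have 2 column vectors x and y with the same number of elements.

 The vector product used here is the inner (dot) product x^T y, the sum of the products of corresponding elements. It is calculated as below: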
x = np.array([[1], [2], [3]])
y = np.array([[4], [5], [6]])

print(x)
print(y)
#Vector Products
print(x.T.dot(y))
Figure:Vector Products
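Note that x.T.dot(y) returns a 1x1 matrix, not a plain number. The sketch below shows one way to pull out the number with item(), and, for contrast, the product in the other order, x.dot(y.T), which gives a 3x3 matrix. The variable names are only for illustration.

```python
import numpy as np

x = np.array([[1], [2], [3]])
y = np.array([[4], [5], [6]])

# (1x3) times (3x1) gives a 1x1 matrix: 1*4 + 2*5 + 3*6 = 32
inner = x.T.dot(y)
inner_scalar = inner.item()

# (3x1) times (1x3) gives a 3x3 matrix
outer = x.dot(y.T)

print(inner.shape, inner_scalar)  # (1, 1) 32
print(outer.shape)                # (3, 3)
```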
3.7 Identity matrix
The identity matrix plays the same role as the number 1 in ordinary multiplication (e.g: 1x2 = 2).
Figure: Identity matrix
- If matrix A is n x n, we have A I_n = I_n A = A
- If matrix A is n x m, we have I_n A = A and A I_m = A
- We use the eye(size) function to create an identity matrix.
#Identity matrix
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
i = np.eye(3) 

print(x)
print(i)
#Identity matrix
print(x.dot(i))
Figure: Identity matrix
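The n x m case needs identity matrices of two different sizes. The sketch below checks it with a hypothetical 2x3 matrix A.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # hypothetical n x m = 2 x 3 matrix
I2 = np.eye(2)                         # I_n with n = 2
I3 = np.eye(3)                         # I_m with m = 3

left = I2.dot(A)    # I_n A
right = A.dot(I3)   # A I_m

print(np.array_equal(left, A), np.array_equal(right, A))  # True True
```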
3.8 Matrix inverse
- A matrix A is called invertible if there exists a matrix B such that: AB = BA = I
- The notation for the inverse of a matrix A is A^-1
- The inverse matrix is unique if it exists. And if A is invertible, then A^-1 is also invertible and (A^T)^-1 = (A^-1)^T
- Matrix division: A/B = A*B^-1
- In numpy we have to use this: from numpy.linalg import inv
#Matrix inverse
from numpy.linalg import inv

A = np.array([[1, 2], [3, 4]])

print(A)
#Matrix inverse
print(inv(A))

#(A^T)^-1 = (A^-1)^T
print(inv(A.T))
print(inv(A).T)
Figure:Matrix inverse
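The sketch below checks the definition numerically and shows matrix division written as a product with the inverse. B is a hypothetical matrix added for illustration. Results are compared with np.allclose because inv() works with floating point numbers. Keep in mind that the "/" operator on numpy arrays divides element by element, which is not matrix division.

```python
import numpy as np
from numpy.linalg import inv

A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 0], [0, 2]])  # hypothetical invertible matrix
I = np.eye(2)

# definition of the inverse: A A^-1 = A^-1 A = I
print(np.allclose(A.dot(inv(A)), I))  # True
print(np.allclose(inv(A).dot(A), I))  # True

# matrix division A/B written as A * B^-1
A_div_B = A.dot(inv(B))
print(A_div_B)
```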
3.9 Determinants
- Determinants can only be found for square matrices. 
- A matrix A has an inverse matrix A^-1 if and only if det(A)≠0, because calculating A^-1 requires dividing by det(A):
 Figure: calculate A^-1
- For a 2x2 matrix A = [[a, b], [c, d]], det(A) = ad-bc

Figure: Determinants for 2x2 matrix
- In numpy we have to use: from numpy.linalg import det
#Determinants
from numpy.linalg import det
a = np.array([[1, 2], [3, 4]])

print(a)
#Determinants
print(det(a))
Figure: Determinants
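To connect the determinant with the inverse, the sketch below compares det() with the ad-bc formula and then tries to invert a matrix whose determinant is 0. The matrix s is a hypothetical example that I added; numpy raises LinAlgError for it.

```python
import numpy as np
from numpy.linalg import det, inv, LinAlgError

a = np.array([[1, 2], [3, 4]])

# ad - bc = 1*4 - 2*3 = -2
by_formula = a[0, 0] * a[1, 1] - a[0, 1] * a[1, 0]
print(by_formula, np.isclose(det(a), by_formula))  # -2 True

# hypothetical singular matrix: the second row is 2 times the first row
s = np.array([[1, 2], [2, 4]])
print(np.isclose(det(s), 0))  # True

# a matrix with det = 0 has no inverse
try:
    inv(s)
    invertible = True
except LinAlgError:
    invertible = False

print(invertible)  # False
```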