# CS代考 Fundamentals of Image Processing – cscodehelp代写

Fundamentals of Image Processing
Book of Exercises Part 2
Master 1 Computer Science – IMAge & VCC Sorbonne Universit ́e
Year 2020-2021

Fundamentals of Image Processing – Exercises
IMAge & VCC
Contents
6. Keypoint Detection
7. Image descriptors
8. Segmentation: split and merge
9. Principal component analysis (PCA) 10. Linear Discriminant Analysis (LDA)
1 Appendix: Projection and Lagrange multipliers
3 6 8
10
11
13
1.1 Orthogonalprojectiononavectorialline………………….. 13 1.2 Optimization under constraints: Lagrange multipliers . . . . . . . . . . . . . . . 13
Practical works 14
Sorbonne Universit ́e 2 Master 1 in Computer Science

Fundamentals of Image Processing – Exercises IMAge & VCC
6. Keypoint Detection Exercise 1 — Moravec keypoint (corner) detection
Let I be an image, and W a window of size Wx × Wy. The mean squared intensity variation Eu,v(x1,y1) between the region W centered at pixel (x1,y1) and the region W centered at pixel (x1 + u, y1 + v), (u, v) ∈ Z2, is defined as (see Figure 1):

􏰒2 􏰒2 k=−Wx l=−Wy
Figure 1: Illustration of the principle of Moravec detector. The Moravec keypoint (corner) detector  is as follows:
• Using a 3×3 window W, compute Eu,v(x,y) (Equation 1) for each pixel (x,y) of the image, for all possible translations in a 8-neighbors model:
{(1, 0); (1, 1); (0, 1); (−1, 1); (−1, 0); (−1, −1); (0, −1); (1, −1)}.
• Compute a corner map as: C(x,y) = minu,v Eu,v(x,y).
• Compute the local maxima, leading to the final detection.
Sorbonne Universit ́e 3 Master 1 in Computer Science
Eu,v(x1,y1)=
[I(x1 +k+u,y1 +l+v)−I(x1 +k,y1 +l)]2 (1)
22

Fundamentals of Image Processing – Exercises Example: Let I be defined as:
 255  255 255  255
I=255  0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
255 255 255 255 255 255 255 255 255 255 0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255
IMAge & VCC
255  255 
255 255  255  255  255  255 255
255 255 255
255 255 255
255 255 255
255 255 255
255 255 255
255 0000
255 255 255 255 255 255 255 255 255 255
1. Compute Eu,v(x, y) and C(x, y) at the boxed pixels of I. 2. What do these chosen points represent?
3. Discuss the detection performance of this detector.
Exercise 2 — Harris corner detector
Harris & Stephens  proposed to modify the Moravec detector as follows:
1 −x2+y2
1. Use a Gaussian window w(x,y) = √ squared intensity variation:
u,v
e 2σ2
E1 (x,y)= 􏰒 w(x,y)×[I(x+u,y+v)−I(x,y)]2
2πσ2
as a weighting function in the mean
x,y∈W
2. Compute the Taylor series expansion of I(x + u, y + v) at order 1:
I(x+u,y+v)=I(x,y)+u· ∂I +v· ∂I +O(u2,v2)≈I(x,y)+u·Ix +v·Iy (2) ∂x ∂y
1. Prove that E1u,v(x, y) = (u, v) · M · (u, v)T, with:
 􏰑 w(x,y)·Ix2 x,y∈W
􏰑 w(x,y)·IxIy  􏰉A C􏰊 x,y∈W
M= 􏰑 w(x,y)·IxIy x,y∈W
􏰑 w(x,y)·Iy2 = C B x,y∈W
2. The matrix M, called auto-correlation matrix, represents the local structure of function E1(x,y) in a neighborhood of pixel (x,y). Harris & Stephens  proposed to compute the eigenvalues λ1 and λ2 of E1 (Figure 2(a)). To reduce the computational complexity, these authors proposed the following criterion R(x, y) (Figure 2(b)):
R(x, y) = det(M ) − k · [trace(M )]2 (3) wheredet(M)=AB−C2 =λ1·λ2,trace(M)=A+B=λ1+λ2.
Prove the equivalence between Figures 2(a) and 2(b).
Sorbonne Universit ́e 4 Master 1 in Computer Science

Fundamentals of Image Processing – Exercises IMAge & VCC
(a) Eigenvalues (b) Harris criterion Figure 2: Harris detector.
3. Prove that the Harris detector is invariant by rotation. Prove that the local maxima of R(x, y) are invariant with respect to affine intensity variations, i.e. that can be written as I′(x, y) = a · I(x, y) + b. Is the detector scale invariant?
4. Propose a simple method to detect corners at different scales, based on Harris detector.
Sorbonne Universit ́e 5 Master 1 in Computer Science

Fundamentals of Image Processing – Exercises IMAge & VCC
7. Image descriptors Exercise 3 — Integral images
An integral image Iint(x,y) is defined at pixel (x,y) as the sum of the gray levels I(x′,y′) of all pixels to the left and above pixel (x, y):
Iint(x, y) = 􏰒 I(x′, y′) (4) x′≤x, y′≤y,
Integral images have been used to compute efficiently some characteristics such as SURF  (a more compact and easier to compute variant of SIFT descriptor), or Haar wavelets for face detection .
(a) I (x,y) (b) (c) Iθ (x,y) (d) int int
Figure 3: Integral image (a) and sum in a rectangular region (b). Oriented case: (c), (d).
1. Propose an algorithm using only one scan of the image and that is linear in the number of image pixels to compute Iint(x, y).
2. Prove that it is possible to compute the sum 􏰑 I(x′,y′) of the values inside a rectan- x′,y′∈A
gular region A (Figure 3(b)) from Iint(x,y), in only four operations.
A variant of the integral image consists in computing sums of intensities in oriented regions,
e.g. rotated by π4 (Figure 3(c)):
Iθ (x,y)= 􏰒 I(x′,y′) (5)
y′≤y, y′≤y−|x−x′|,
3. Propose an algorithm using only one scan of the image to compute Iθ . What is its
complexity?
4. How can the sum of values in region A of Figure 3(d) be computed efficiently?
Exercise 4 — Blob detector
Sorbonne Universit ́e
6 Master 1 in Computer Science
int
int

Fundamentals of Image Processing – Exercises IMAge & VCC A blob detector aims at detecting image regions in which the grey levels have a Gaussian
distribution (with standard deviation σ0):
G(x0,y0,σ0)(x, y) = 2πσ e 2σ0 (6)
1 − (x−x0 )2 +(y−y0 )2 0
Reminder: the Laplacian ∆ of an image I is ∆I = ∂2I + ∂2I . ∂x2 ∂y2
1. Prove that arg minx,y ∆ 􏰃G(x0,y0,σ0)(x, y)􏰄 = {x0, y0}.
Interpretation: the minima of the Laplacian of a Gaussian image correspond to the position
of the center (x0,y0) of the Gaussian.
Generalization:
Let us now consider the convolution of an image I with the Gaussian kernel G: L =
. Recall that convolution and derivation commute in this case: ∆L = ∂2L + ∂x2
I ⋆ G
∂2L = I ⋆ ∂2G + ∂2G = I ⋆∆G. Let us admit the following result:
(x0,y0,σ0)
∂y2 ∂x2 ∂y2
• If I = G(x0,y0,σ0) and if the convolution is done by a Gaussian kernel G(x0,y0,σ) of variable size σ, then arg minx,y maxσ ∆ [σ · L(x, y, σ)] = {x0, y0, σ0} .
2. Prove that G(x0,y0,σ0) is a solution of the heat equation: ∂G = 1∆G. ∂σ 2
3. Derive that G(kσ) − G(σ) ≈ ∆G · (k − 1)σ. How is it then possible to build a multi-scale blob detector by convoluting the image by a difference of Gaussians (DoG) filter?
Sorbonne Universit ́e 7 Master 1 in Computer Science

Fundamentals of Image Processing – Exercises IMAge & VCC
8. Segmentation: split and merge Exercise 5 — Split principle
1. Based on the example detailed during the lecture, propose a recursive segmentation algorithm using splitting and producing a quadtree.
2. Illustrate the steps of the algorithm on the following 64 × 64 image (each element of the grid represents a 8 × 8 square), and build the quadtree:
3. Which splitting predicate can be used to segment correctly this image?
4. Compute the mean and variance of each leaf in the quadtree, given that the gray levels of the background, square and rectangle are 150, 50 and 100, respectively.
5. Now a Gaussian noise with 0 mean and variance equal to 25 is added to this image. Which threshold on the variance should be applied in order to get the same quadtree as in the previous questions?
Exercise 6 — Implementation
The implementation will use Python lists. The leaves of the quadtree provide the list of regions.
1. What does a leaf in the quadtree represent? Which information should it contain? Define a formal type in Python to represent this leaf, called block from now on.
2. How can the quadtree obtained in the previous exercise be represented in Python? Derive a formal type in Python.
3. Propose a simple splitting criterion. Write the splitting predicate, i.e. the function: def predsplit(I,reg,∗args):
Sorbonne Universit ́e 8 Master 1 in Computer Science

Fundamentals of Image Processing – Exercises IMAge & VCC
””” Array∗Block ∗… −> bool
return True if the block should be splitted
”””
The argument ∗args is a Python iterable, which allows passing a variable number of parameters. Here we assume that the first parameter is a threshold on the standard variation.
4. Write the split() function which implements the splitting algorithm:
def split(I,reg,pred,∗args):
””” Array∗Block∗(Array∗Block ∗… −> bool )∗… −> quadtree
Applies a quadtree splitting on I and region reg according to predicate pred and
parameters ∗args used by the predicate .
”””
5. Write the function printblocks(L,I) which applies depth-first traversal of the quadtree L and prints the coordinates and size of the leaves, as well as the statistics (mean and standard deviation) of the associated image block in I.
Exercise 7 — Merging blocks in a quadtree
1. From a quadtree, write a merging algorithm which relies only on a homogeneity criterion of the gray levels: pixels in homogeneous regions will be set to 1 and the other to 0.
2. Consider the set of pixels with label 1. Is this set connected (specify the connectivity)? If not, what is the minimal size of each connected component? Is this good or bad?
3. Propose a merging algorithm producing a segmentation with a unique homogeneous re- gion. It is assumed that a function neighbors(b, L) is available, returning the list of elements of L which are neighbors (4 closest neighbors on the Cartesian grid) of block b.
4. Apply this algorithm on the quadtree obtained in the first exercise in this session, using a similarity criterion.
5. Let R1 and R2 be two distinct regions with means μ1 and μ2, and variances σ12 and σ2. Derive the formulas for the mean and variance of R1 ∪ R2 based on |R1|, |R2|, μ1, μ2, σ12 and σ2.
6. Derive a method to compute the homogeneity of a region (whether connected or not).
Sorbonne Universit ́e 9 Master 1 in Computer Science

Fundamentals of Image Processing – Exercises
IMAge & VCC
9. Principal component analysis (PCA)
Exercise 8 — Derivation of PCA
• Le􏰑t X ∈ Rd be a random vector, and g the empirical expectation estimator of X: g = n
n1 Xi ≈ E[X] where Xi are realizations of the random variable X. i=1
→−
• Let Π be the projection operator on the vectorial line V defined by the unit vector v
→− →−d
(|| v || = 1). The coordinates of v in R are denoted by v. Then we have (see Appendix
1.1): Π=vv⊤ andv⊤v=1.
• The variance σv2 of X on V is:
1􏰒n􏰅􏰁􏰂⊤􏰆􏰌1􏰒n 􏰍 σv2(X)= n−1 Π(Xi −g) Π(Xi −g) = n−1 ||Π(Xi −g)||2 (8)
i=1 i=1
2 1􏰑n ⊤⊤ 1.Provethatσv(X)=n−1 (Xi−g) vv (Xi−g)
i=1
2⊤ 1􏰑n ⊤
2. Derive that σv(X) = v Σ v, where Σ = n−1 (Xi−g)(Xi−g) i=1
is the variance-covariance
matrix of X.
The principle of Principal Component Analysis (PCA) consists in determining the
→− 2􏰐 ⊤ unit vector v that maximizes σv(X), i.e. that maximizes v
Σ v, under the constraint
v⊤v = 1:
max v⊤Σv v
s.c. v⊤v = 1
the following function L(v, λ) over v and λ (called Lagrange multiplier):
L(v, λ) = v⊤Σv − λ(1 − v⊤v) (10)
This problem can be solved by the Lagrangian method (see Appendix 1.2), i.e. maximizing
3. By using the first order􏰋necessary conditions given in Appendix 1.2, prove that maximizing
σv2(X) for a pair of eigenvector, eigenvalue (vi,λi)?
5. Conclusion: propose a method do determine the line V maximizing the variance of the data
defined in Equation 8.
Sorbonne Universit ́e 10 Master 1 in Computer Science
Σv = λv v⊤v = 1
4. Derive that Equation 10 can be solved by diagonalizing Σ. Which is then the variance
(9)

Fundamentals of Image Processing – Exercises IMAge & VCC
10. Linear Discriminant Analysis (LDA)
Let X ∈ Rd be a random vector. Let us consider N realizations Xi of X, i ∈ {1…N}. Let Yi be a label associated with Xi, representing the membership of Xi to one of K classes Ck: Xi ∈Ck ⇔Yi =k(k∈{1…K}).
Linear Discriminant Analysis (LDA) consists in determining discriminant factors, as linear combinations of the input variables, which take values as close as possible to each other for elements in a same class, and as different as possible for elements in different classes. Math- ematically, LDA is formalized as the search of a data projection leading to:
• a minimal intra-class variance, and • a maximal inter-class variance.
Exercise 9 — Decomposition of the total variance
1 􏰒N
Xi. Let Nk be the numbe􏰒r of elements in class k (Nk = |Ck|), and gk the center (mean) of the kth class:
Let g be the empirical estimation of the expectation of X: g = N
g k = N1 X i . k Xi∈Ck
The total variance of the data writes:
1 􏰒N 1 􏰒K 􏰒
σ2(X) = N (Xi −g)⊤ (Xi −g) = N
i=1 k=1 Xi∈Ck
LetSS(k)=N1 􏰒 (Xi−g)⊤ (Xi−g) Xi ∈Ck
1. Prove that SS(k) can be decomposed as:
i=1
(Xi −g)⊤ (Xi −g) (12)
SS(k)= 1 􏰎 􏰒 (Xi −gk)⊤(Xi −gk)􏰏+ Nk(gk −g)⊤(gk −g) (13) NN
Xi ∈Ck
2. Derive that the total variance σ2(X) can be decomposed as: σ2(X) = σb2(X) + σw2 (X),
intra-class variance being defined as: σ2 = (X − g )⊤(X − g ). k,w N i k i k
where:
• σ2 = 􏰒K Nk σ2 is the weighted mean of the intra-class (“within”) variances, the
w N k,w
k=1 1􏰒
k Xi∈Ck
•σ2 =􏰒K Nkσ2 withσ2 =(g −g)⊤(g −g)isthevarianceofthemeansg of
b Nk,b k,b k k k k=1
classes (inter-class variance, “between”).
Sorbonne Universit ́e 11 Master 1 in Computer Science

Fundamentals of Image Processing – Exercises IMAge & VCC
Exercise 10 — Projected variance and solution of LDA
→− →−
Let Π be the projection operator on a vectorial line V defined by the unit vector v (|| v || = 1).
→− d ⊤ ⊤􏰑 Thecoordinatesof v inR aredenotedbyv. Wehave: Π=vv andv v=1. Remindthat
n
the projection of the total variance on V write: σ2 = v⊤Σv, with Σ = 1 (X − g)(X − g)⊤ v n−1ii
i=1
= v⊤Wv, with:
= v⊤Bv, with:
(projected points
and minimizing
with λ = v⊤Bv.
v⊤Σv 􏰁⊤ 􏰂 􏰁⊤ 􏰂
Indication: oncanadmitthat ∂ v Bv =2Bvand ∂ v Σv =2Σv. ∂v ∂v
5. Conclusion: prove that LDA consists in diagonalizing Σ−1B, and that the eigenvector associated with the largest eigenvalue defines the vectorial line, solution of LDA.
Sorbonne Universit ́e 12 Master 1 in Computer Science
(cf work on PCA).
1. Prove that the projection of the intra-class variance on V writes: σ2
1 􏰒K •W=N NkWk
k=1
• Wk = N1 􏰒 (Xi −gk)(Xi −gk)⊤
k Xi∈Ck
2. Prove that the projection of the inter-class variance on V writes: σ2
v(b)
1 􏰒K
• B = N Nk(gk − g)(gk − g)⊤
k=1
3. The projected variance can therefore be decomposed as σ2 = σ2 +σ2
v v(w) v(b) are still in Rd). LDA consists in determining the line V maximizing σ2
2 􏰇v⊤Bv􏰈 v(b) σv(w). Prove that the line verifies maxv J(v) with J(v) = v⊤Σv .
v(w)
4. Note that J(v) is invariant with respect to scaling transformations v ← αv. It is then possible to choose v such that v⊤Σv = 1. Write the Lagrangian expressing the optimization problem under constraints.
By using the Lagrange multipliers method (see work on PCA), prove that a necessary
condition to maximize J(v) = 􏰇v⊤Bv􏰈, under the constraint v⊤Σv = 1, is: Bv = λΣv, v⊤Σv

Fundamentals of Image Processing – Exercises IMAge & VCC
1 Appendix: Projection and Lagrange multipliers 1.1 Orthogonal projection on a vectorial line
In a vectorial space E of finite dimension, the orthogonal projection Π on a subspace S of E is a linear operator, which can therefore be represented by a matrix PS. Let Z be a matrix whose columns define an orthonormal basis of S. The orthogonal projection u = Π(x) of a vector x is given by:
u = Π(x) = ZZ⊤x = PSx (14) In the case of a vectorial line, we have Z = v where v is a unit vector defining the vectorial
line. The projection matrix then writes PS = vv⊤.
1.2 Optimization under constraints: Lagrange multipliers
Let X = (x1,…,xn) ∈ Rn, and consider the following optimization problem: maxX f(X) under the constraint h(X) = 0, where f is a quadratic function of X, and h is linear. The Lagrangian associated with this problem is defined as the following function L(X, λ):
L(X, λ) = f(X) + λh(X) (15)
This Lagrangian allows intoducing the constraint in the objective function, with a penalty λ ∈ R (Lagrange multiplier). The maximization is then over both X and λ.
It can be shown that the following first order conditions are necessary in order to maximize f under the constraint h:
Sorbonne Universit ́e
13
Master 1 in Computer Science
􏰋 ∂L =0 ∀i ∂xi
∂L =0 ∂λ
􏰋 ∂f =λ∂h ∀i
⇔ ∂xi ∂xi (16)
h(X)=0

Fundamentals of Image Processing – Exercises IMAge & VCC
References
 H. Bay, A. Ess, T. Tuytelaars, and L. Vangool. Speeded-up robust features (surf). Computer Vision and Image Understanding, 110(3):346–359, June 2008.
 C. Harris and M. Stephens. A combined corner and edge detector. In Proc. Fourth Alvey Vision Conference, pages 147–151, 1988.
 T. Lindeberg. Scale-Space Theory in Computer Vision. Kluwer, December 1993.
 . Obstacle avoidance and navigation in the real world by a seeing robot rover. In
tech. report CMU-RI-TR-80-03, Robotics Institute, University and doctoral dissertation, Stanford University, number CMU-RI-TR-80-03. September 1980.
 P.A. Viola and M.J. Jones. Robust real-time face detection. IJCV, 57(2):137–154, May 2004.
Practical works
Practical work materials (Jupyter notebooks and data) are to be retrieved from the course’s
Sorbonne Universit ́e 14 Master 1 in Computer Science

Practical work 6 : Detector
The goal of this pratical work is to implement the Harris-Stephen’s corners detector (C. Harris and M. Stephens. A combined corner and edge detector. In Proc. Fourth Alvey Vision Conference, pages 147–151, 1988).
Recall the Harris detector computes a map of corners from an image I: R(x, y) = det(M) − k(trace(M))2, (x, y)pixels
with k ∈ [0.04, 0.06]. M is the auto-correlation of image I:
M = 􏰇 ∑x,y∈W w(x,y)Ix2 ∑x,y∈W w(x,y)IxIy􏰈 = 􏰇A
B􏰈 ∑x,y∈W w(x, y)Ix Iy ∑x,y∈W w(x, y)Iy2 C D
1 (x−xc )2 +(y−yc )2
with w(x, y) = 2πσ2 e 2σ2 a Gaussian mask centered on the window W. Partial derivatives
[ ]:
1 0 −1 1 0 −1 0 0 0 -Prewitt:Gx =1 0 −1,Gy =GxT -Sobel:Gx =2 0 −2,Gy =GxT
1 0 −1 1 0 −1
000 T Ix and Iy are estimated by one of the following kernels : – Gradient: Gx = 1 0 −1 , Gy = Gx
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import scipy.signal
# Useful functions
def gaussianKernel(sigma):
N = np.int(np.ceil(3*sigma))
x = y = np.linspace(np.int(-3*sigma),np.int(3*sigma),2*N+1)
X,Y = np.meshgrid(x,y)
noyau = np.exp(-(X*X+Y*Y)/(2*sigma*sigma))
return noyau/noyau.sum()
Exercise 1: Harris response calculation
1) Write a function computeR(I, scale, kappa) that returns the Harris response R from an image I and a scale scale. You will use 5 steps:
• Computation of the directionnal derivate Ix and Iy. Use the Sobel kernel.
• Computation of the products Ix2, Iy2, Ix.Iy.
• Computation of the convolution of Ix2, Iy2 and Ix.Iy by a gaussian kernel of size N (use given
function gaussianKernel())
• Computation of det(M(x, y)) and trace(M(x, y)) for each pixel
• Computation of R(x, y) = det(M(x, y)) − k.(trace(M(x, y)))2. You can use k = 0.04.
You can compute the convolutions by using the scipy.signal.convolve2d function. 1

[ ]:
[ ]:
Exercise 2 : Harris corner detector
From the Harris response calculated at exercise 1, we will write all the functions needed for the Harris detector. Write the following functions:
1) A function thresholdR(R, thres) that calculates and returns the binary thresholding Rb of the response R according to the threshold thres
[ ]:
Bonus: Write a faster version of the script using Numpy function np.roll(). :
3) Write a function cornerDetector(image, scale, kappa, thresh) that returns an array of the same size as the image. The array takes two values: 1 where a corner is detected and 0 elsewhere.
[ ]:
[ ]:
def computeR(image,scale,kappa):
“”” Array[n, m]*float*float->Array[n, m] “””
2) Write a script that displays the Harris response for the image img/house2.png along with the original image. Use a gaussian window of size W = 15 pixels.
3) Write in a few lines an interpretation of the results, explaining how the Harris response allows to detect and discriminate homogeneous areas, edges and corners.
def thresholdR(R, thres):
“”” Array[n, m] * float -> Array[n, m] “””
2) A function Rnms(R, Rbin) that performs a non-maximum supression from the response R and the binarized response Rbin. It returns the image Rlocmax (same size as R) =1 where Rbin = 1 and the pixel has a greater value R than its 8 nearest neighbors.
def rnms(image_harris):
“”” Array[n, m] -> Array[n, m] “””
def cornerDetector(image, sigma, kappa, thres):
“”” Array[n, m]*float*float*float -> Array[n, m] “””
4 ) Display the detected corners on the original image for the image img/house2.png. Each de- tected corner will be displayed as a small red disk. You can use the functions np.nonzero() and plt.scatter() to that purpose.
2

5) Evaluate the performances of the corner detector. Try to find good values for Sigma and Threshold.
Exercise 3 : Properties of Harris corner detector
The goal of this exercice is to study some invariance properties of Harris detector.
1) Write a script that detects the corners on the images img/toyHorse1.png and img/toyHorse2.png with a scale of 15 and appropriate threshold value. Display the detected corners on the images.
[ ]:
2) What are the dynamic ranges of these two images ? Your answer. . .
3) What are the transformations beetween the two images ? Your answer. . .
4) Using a fixed threshold, is the detection invariant to rotation ? To affine transformation of brightness ?
3

1 TME 7 : Color quantification and search by content
In this practical work session, we will:
• Develop a color based descriptor that can be applied to every image in a database
• Use this color descriptor to create a method that searches images by content: the goal is to
find the images that are the most similar to a query.
[ ]:
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import scipy.signal
import scipy.ndimage
from skimage.color import rgb2hsv, hsv2rgb
# Usefull functions
def setColors(nH, nS, nV):
“”” int**3 -> Array[nH*nS*nV,3]*Array[nH,nS,nV,3]
computes an RGB palette from a sampling of HSV values
“””
pal1 = np.zeros((nH*nS*nV, 3))
pal2 = np.zeros((nH, nS, nV, 3))
tH, tS, tV = 1/(2*nH), 1/(2*nS), 1/(2*nV)
idx = 0
for i in range(nH):
for j in range(nS):
for k in range(nV):
HSVval = np.array([[[i/nH + tH, j/nS + tS, k/nV + tV]]])
pal1[idx, :] = hsv2rgb(HSVval)*255
pal2[i, j, k, :] = hsv2rgb(HSVval)*255
idx += 1
return pal1, pal2
def viewQuantizedImage(I,pal): “”” Array*Array -> Array
Display an indexed image with colors according to pal
“””
Iview = np.empty(I.shape)
n, m, c = I.shape
for i in range(n):
for j in range(m):
h, s, v = I[i, j, :]
Iview[i, j, :] = pal[ np.int(h), np.int(s), np.int(v), :]
print( Iview.max())
plt.imshow(Iview/255)
1

plt.show()
def display5mainColors(histo, pal): “”” Array*Array -> NoneType
Display the 5 main colors in histo
“””
idx = np.argsort(histo) idx = idx[::-1]
K=5
for i in range (K):
Ia = np.zeros((1, 1, 3), dtype=np.uint8)
Ia[0,0,0] = pal[idx[i], 0]
Ia[0,0,1] = pal[idx[i], 1]
Ia[0,0,2] = pal[idx[i], 2]
plt.subplot(1, K, i+1)
plt.imshow(Ia)
plt.axis(‘off’)
plt.show()
def display20bestMatches(S, indexQuery): “”” Array*int -> NoneType
“””
L = S[indexQuery, :]
Idx = np.argsort(L)[::-1]
cpt = 1
plt.figure(figsize=(15, 10))
for idx in Idx[:20]:
plt.subplot(5, 4, cpt)
indexQuery = idx
imageName = (pathImage+NomImageBase[indexQuery]).strip()
plt.imshow(np.array(Image.open(imageName))/255.)
plt.title(NomImageBase[indexQuery])
plt.axis(‘off’)
cpt += 1
plt.show()
1.1 Exercise 1: HSV histogram computation
Each image of the base will be represented by its color histogram in the HSV representation. We use the HSV representation rather that the RGB representation because it is a perceptual color space: two colors that look similar will have close HSV vectors.
1) Write a function iv = quantize(v,K) that returns the quantize interval of v considering a uniform quantization of values over the range [0, 1] with K evenly spaced intervals. For a image value v=1, the function will return K-1.
2

[ ]:
[ ]:
[ ]:
3) Write a function normalized_histo = NormalizeHistL2(histo) that applies a normaliza- tion on the histogram histo according to the L2 norm. The L2 norm of x can be computed using the function numpy.linalg.norm(x,2)
def quantize(I, k): “”” float -> int
“””
# You can test your function with the following lines:
h = np.zeros((8))
for i in range(256):
h[quantize(i/255.,8)] += 1
assert (h == 31*np.ones((8))).all()
2) Writeafunction[Iq, histo] = quantificationImage(I,Nh,Ns,Nv)thattakesasinputone image I of size N x M x 3 in the HSV representation and the number of quantification interval needed for H, S and V. Your function will return:
• Iq: the quantified image for each channel, of size N x M x 3
• hist: a 3D histogram of size Nh x Ns x Nv that counts the number of pixel for each quan-
tification bin (iH, iS, iV)
def quantizeImage(I, nH, nS, nV): “”” Array*int -> Array*Array “””
def normalize(H):
“”” Array -> Array”””
5) Test of the HSV histogram on an image: Complete the following code with your functions in order to apply it on one of the images of the base. The code will follow the following steps:
1. Open the image and convert it into HSV representation.
2. Compute the color palette for the display using the given setColors(nH, nS, nV)
function.
3. Compute the quantization of the image then visualize the quantized image using
viewQuantizedImage(I,pal).
4. Transform the 3D histogram into a 1D histogram, normalize it according to L2 norm
then visualize it.
5. Display the 5 most prevalent colors on the image using display5mainColors(histo,
pal).
You can try this on the image Paysages67.pngwith nH = 12, nS = 3 and nV = 8 and find a result similar to Figures 1, 2, 3, and 4.
3

Figure 1: Paysage67.png
4

Figure 2: Paysage67.png quantized
5

Figure 3: Histogram of HSV image (288 bins)
6

Figure 4: 5 main colors
[ ]:
PathImage = ‘./Base/’ # to be completed nom = ” # to be completed
# quantization parameters nH = # to be completed
nS = # to be completed
nV = # to be completed
filename= nom;
I = np.array(Image.open(PathImage + filename)) I=I/255. #Ivaluesrangein[0,1] plt.figure();
plt.imshow(I);
plt.show()
# conversion RGB->HSV
J = rgb2hsv(I)
# color palette computation
palette, palette2 = setColors( nH, nS , nV );
7

Iq, histo = quantizeImage(J, nH, nS, nV) # Visualisation of quantized image
viewQuantizedImage(Iq , palette2)
# flat a 3D histogram to 1D
histo = histo.flat
histo = normalize(histo)
plt.figure()
plt.plot(histo)
plt.show()
## Determine 5 more frequent colors
idx_most_prevalent = (-histo).argsort()[:5]
hsv_most_prevalent = [np.unravel_index(idx,( nH, nS , nV )) for idx in␣
􏰀→idx_most_prevalent] display5mainColors(histo, palette) print(hsv_most_prevalent)
6) Change the values of nH, nS and nV and analyze the results. You can try with other images in the base.
[ ]:
1.2 Exercise 2: Similarity between images:
In this exercice, we will compute a measure of similarity between two images from the normalized histogram. This measure of similarity will be used in order to find images that are the more similar to a given image.
1.2.1 2.1 : Computation of the histograms for the whole base
Complete the following script to compute the histograms for every image in the base. As the com- putation can take a lot of time, we will do it only one time and store the result in ListHisto.mat. TheresultswillbestoredasaN x MarraylistHistowithN = 1040andM = nH x nS x nV.We will also save the names of the images as listImage
Set bcomputed = False for the first run to compute the database histograms and then set it to 1. 8

[ ]:
import os
#####
pathImage = ‘./Base/’
listImage = os.listdir(pathImage)
pathDescriptors = ‘./’
# Quantization HSV
nH = nS = nV =
bcomputed = True # Set to False to compute the histogramm database
if not bcomputed:
listHisto = []
print(‘Histogram database computation … it may take a while …’)
for imageName in listImage:
if os.path.isfile(pathImage+imageName) and imageName[-4:] == ‘.png’: print( imageName)
I = np.array(Image.open(pathImage+imageName)) / 255.
# conversion RGB->HSV
J = rgb2hsv(I);
# Image quantization (your function tested in Exo 1) _,histo = quantizeImage(J, nH, nS, nV)
# flat a 3D histogram in 1D
histo = histo.flatten()
# Normalize histogramme (your function tested in Exo 1)
listHisto.append(normalize(histo))
print(len(listHisto), “histograms computed”)
nomList = pathDescriptors+’ListHisto.mat’
savemat(nomList, {‘listHisto’: np.array(listHisto),
‘listImage’: np.array(listImage)})
else:
1.2.2 2.2 : Computation of the similarity between every images in the base.
1. Write a function similarityMatrix() or a script that performs the similarity computation for every pair of images in the base from the histograms stored in listHisto and store the
9

[ ]:
[ ]:
[ ]:
result in a 1024 x 1024 matrix S. It is possible to make the operation much faster by using only one matrix operation.
listHisto = mat[‘listHisto’]
listImage = mat[‘listImage’]
### you code below
3. Assuming S is already computed and using function display20bestMatches(), test on the image Liontigre1.png (indexQuery = 349). You should obtain something similar to Figure 5.
Figure 5: 20 best matches of image ‘Liontigre1.png’
indexQuery = 349
display20bestMatches(S, indexQuery)
imageName = (pathImage+NomImageBase[indexQuery]).strip()
10

4. Assuming S is already computed, generate a random query (an integer in range [0, 1030]), and display the 20 best matches.
[ ]: import random
5. What can you say about the results ? What are the limitations and the advantages of this
method ?
11

Practical work 8: Split and Merge
In this practical work, we implement and test the split and merge algorithm.
:
### Usefull libraries
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
### Data
img_test = np.full((64,64),150,dtype=np.uint8)
img_test[32:48,16:16+32] = 100
img_test[8:24,40:56] = 50
angio = np.array(Image.open(‘img/angiogra.png’))
cam = np.array(Image.open(‘img/cameraman.png’))
muscle = np.array(Image.open(‘img/muscle.png’))
prisme = np.array(Image.open(‘img/prisme.png’))
seiche = np.array(Image.open(‘img/seiche.png’))
### Usefull functions
def neighbors(b,K):
“”” blockStat*list[blockStat]->list[blockStat]
returns the list of neighbors of b and elements of K
“””
def belongsTo(x,y,a):
“”” int*int*BlockStat -> bool
Test if pixel (x,y) belongs to block a
“””
return x>=a and y>=a and x bool
Test if a and b are neighbors
“””
if a>b and a>b:
a,b=b,a
x,y = a+a//2,a+a//2
return belongsTo(x+a,y,b) or belongsTo(x-a,y,b) or␣ 􏰀→belongsTo(x,y+a,b) or belongsTo(x,y-a,b)
N = []
for n in K:
if areNeighbors(b,n):
N.append(n)
return N
1

Exercise 1
Question 1
Write the recursive function split() discussed in tutorial work. It takes as input the image, a region, a predicate, and a variable number of arguments. The region is a Python formal type Block defined by :
type Block = tuple[int**4]
The function split() returns a quadtree, a Python formal type, recursivelly defined by:
The predicate is a Python function with the following signature:
Array*Bloc*…->bool
It can take a variable number of parameters which correspond to the parameters required by the predicate.
:
Question 2
Write the function predsplit(I,B,*args) with signature: Array*Block*… -> bool
that returns True if the standard deviation of image I computed on region B is greater than the first value of argument *args (it can be accessed simply by *args).
:
Question 3
Write the function listRegions() which applies a depth-first search on the quadtree given as parameter and returns the list of the leaves of the quadtree.
Some recalls about lists in Python; – Initialization: L = [] (empty list) – Add a element a into a list L: L.append(a)
2
# type Block = tuple[int**4]
# type 4-aire = list[(4-aire**4|Block)]
def split(I,reg,pred,*args):
“”” Array*Bloc*(Array*Block*…->bool)*… -> 4-aire
Perform a quadtree splitting of image I drived by a predicate
“””
def predsplit(I,reg,*args):
“”” Array*Block*… -> bool “””

[ ]:
Question 4
Test your codes on the synthetic image img_test seen in tutorial work. Print the value returned by split() as well as the one returned by listRegions().
[ ]:
Question 5
Write the function drawRegions(L,I) which takes as arguments a list of regions, an image, and returns a image qui prend en paramètre une liste de bloc et une image et retourne une image where the boundaries of each region have been traced with red color. Indication: the returned image is une hypermatrix of dimension 3, the third dimension is of size 3 and codes the red, green and blue components of a RGB colorspace. Test the function on the previous example.
:
Question 6
Add a Gaussian noise with standard deviation 5 to the image img_test. Apply the quadtree splitting on the noisy image by adjusting the threshold to obtain the same result as in the previous question. Which threshold value should have been chosen? Does this make sense to you?
Hint: use the Numpy function random.randn() which generates random values according to a normal distribution (Gaussian distribution of null mean and variance 1). To obtain realizations of a Gaussian distribution of standard deviation σ, it is sufficient to multiplied by σ the realizations of a normal distribution.
from numpy import random Exercise 2
Experiment the split algorithm on the 4 natural images provided. For each image try to find the threshold that seems to you visually the best. Display the number of regions obtained after splitting.
def listRegions(L):
def drawRegions(LL,I):
“”” list[Block]*Array -> Array
parcours de la liste dessin des régions
“””
[ ]:
[ ]:
3

Exercise 3
Question 1
Modify the function listRegions(L) to make it a function listRegionsStat(L,I) which com- putes the list of leaves of the quadtree L. Each element of this list will be is enriched with three scalars: the first being the size, the second the mean and the third the variance of pixel values of the block in the image I. This function then returns a list whose elements have the following formal type:
type BlockStat = tuple[int**4,int,float**2]
The first four values are those of the Block type, the fifth is the size of the block (in number of
pixels) and the last two values are the mean and variance calculated over the region.
:
Question 2
In the remainder, the formal type is considered:
type Region = list[BlocStats]
A region, as seen during the tutorial work, is therefore a list of blocks. Write the predicate predmerge(b,R,*args) as seen in tutorial work. This function returns True if the b block should merge into the R region. If a merge happens, then the first item of R will have its statistics updated to describe the statistics of the region R merged with b.
[ ]:
Question 3
Using predmerge() and neighbours() functions, given at the beginning of the notebook, write the function merge() discussed in tutorial work (exercise 3.6).
Recalls on Python lists: – Remove an element a from a list L: L.remove(a) – Test if a belongs to a listL:a in L-IteratetheelementsofalistL:for a in L:-Accesstoanelementofalist: aswith numpy arrays
:
# type BlockStat = tuple[int**4,int,float**2]
def listRegionsStat(L,I):
def predmerge(b,R,*args):
“”” BlocsStat*Region*… -> bool
If merge, R is modified
“””
def merge(S,I,pred,*args):
Merge the leaves of S in a list of regions
“””
4

Question 4
Test the previous functions using the synthetic image img_test. In particular, check that merge() returns a list of 3 elements (i.e. 3 regions).
[ ]:
Question 5
Write a function regions(LR,shape) that takes as arguments a list of regions (such as returned by the function merge()) and an image size, and returns an image of the regions. Each region will be coloured with the grey level corresponding to the average of the region. The shape parameter gives the size of the image to be produced.
Test the function on the previous example.
[ ]:
[ ]:
Question 2
The result of the merge algorithm highly depends on how you visit the regions. One can then sort the leaves of the quadtree, for example, from the smallest to the largest blocks, or the opposite (use the Python function sorted()). The same question arises when calculating the set of neighbours of the merged region. Should they be sorted? If yes, according to which criteria? their size? their proximity? Obviously there is no universal answer but adapted to each type of problem. Do some tests to see the influence of these sortings on the result of the merger.
[ ]:
[ ]:
QT = split(img_test, predsplit, ?) M = merge(QT,predmerge, ?)
assert len(M) == 3
def regions(LR,shape):
“”” list[Region]*tuple[int,int] -> Array “””
Exercise 4: experiments Question 1
Test the function merge() on the images angio, cam, muscle, prisme and seiche. Try to produce the best segmentations.
Question 3 (bonus)
Imagine and experiment alternative predicates both the split and the merge steps. It possible to use edges-based predicates, and also to combine with variance-based predicates.
5

Practical works 9 & 10 : Face recognition by Eigenfaces method
The objective of this practical work is to study the properties of the Eigenfaces face recognition method.
We propose to develop a system capable of: – identify a face from a database of faces – determine whether an image contains a face present in the database – to decide whether an image represents a face or not
Tools developed in this practical work will be applied on the Yale Faces Database.
General principle
The problem of face recognition is defined as follows: given a face image, one wishes to determine the identity of the corresponding person. To this end, it is necessary to have reference images, in the form of a database of faces of all persons known by the system. Each face is associated to a vector of characteristics. These characteristics are supposed to be invariant for the same person, and different from one person to another one. Face recognition then consists of comparing the vector of characteristics of the face to be recognized with that of each of the faces of the database. This makes it possible to find the person in the database having the most similar face.
There are several types of methods, distinguished by the type of characteristics used, see S.A. Sirohey, C.L. Wilson, and R. Chellappa. Human and machine recognition of faces: A survey. Proceedings of the IEEE, 83(5), 1995 for a state of the art:
• The approaches by face models proceed to a biometric analysis of faces. Pertinent biometrics are the distance between the eyes, length of the nose, shape of the chin. . .
• Image approaches, on the contrary, directly compare faces, considering them as images, for which measures of pre-attentive similarities (without a priori model) are defined.
1

Figure 1: General Principle of a Face Recognition System
Analysis by Eigenfaces
Face recognition by Eigenfaces is an image-based approach. Each face image is considered as a vector in a space having as many dimensions as the number of pixels in the image. The image characteristics are extracted by a method of dimensionality reduction based on principal compo- nent analysis (PCA). This approach was originally proposed in 1991, see Mr. Turk and A. Pentland. Eigenfaces for recognition. J. Cognitive Neuroscience, 3(1) :71-86, 1991_.
In the following, we will use the italic notation to designate the scalars (m, K, . . .) and vectors (x, u), as well as boldface for the matrices (X, Xc, W, . . .).
A face image is noted x and presented as a vector of d components. x[i](i = 0,··· ,d−1) is the
pixel number i of that image. A set of faces form a cloud of points in the space Rd. The database is
divided in two parts: the trainning or reference part, used to learn the faces, and the test part, used
to test the method. Faces of the training part are noted xtrain (k = 0, · · · , Ntrain − 1) and faces of test k
the test part are noted xk (k = 0, · · · , Ntest − 1)(k = 0, · · · , Ntest − 1).
We note xaverage the average of the reference faces, or average face. The principle of the Eigenfaces method is to model the difference of any face in relation to this average face by a set limited number of images uh, called Eigenfaces. One image of face x ∈ Rd is thus expressed as the average face to which is added a linear combination of eigenfaces :
2

x = xaverage +∑ahuh +ε h
where ah represents the weight of the eigenface of index h in the face x, and ε represents the error between x and its approximation by eigenfaces (error is due to the truncation of the basis of eigen- vectors). Coefficients ah play a very important role for face recognition, because they correspond to the face coordinates x in the face subspace.
The Eigenfaces method is based on the fact that the number of eigenfaces is much smaller than the total size of the space, which is what the this is called dimensional reduction. In other words, the basis of eigenfaces is truncated, keeping only the vectors coding for the most significant informa- tion. The images are therefore analysed in a sub-space of reduced dimensions, which represents more specifically faces, among all possible types of images.
The average face is always the same for a fixed reference database, each face is examined after subtraction by the average face.
Face database
We use the Yale Faces image database, http://cvc.cs.yale.edu/cvc/projects/yalefaces/yalefaces.html. In this database, all the faces have been preprocessed, in order to resize and crop them to the size
64 × 64 pixels, so the images can be compared pixel per pixel.
This database contains 120 greyscale images, representing the faces of 15 people. There are 8 images per person, each corresponding to a category of images varying according to the following criteria (see figure 2): – variation of facial expression: normal, sad, sleepy, surprised, wink, happy – variation of accessories: glasses, noglasses,
Figure 2: Illustration of the shooting categories
The database is divided into two groups: the reference group will be used as a training set, the other group as a test set: – the reference base contains n images, each with a number of pixels d = nl ×nc. There are 6 images per person in the training database, so n = 6×15 = 90. Each image is 64 × 64, hence d = 4096, – the test base contains 2 images per person so a total of 30 images. Each image is again 64 × 64.
In the following, we always manipulate face images in the form of vectors, and a set of faces in the form of a matrix where each column is a face. As we use Numpy, the images are stored in a multidimensionnal array of real (double). This array is viewed as a matrix X of size d × n:
X = [x0,…,xn−1]
.
The matrix X is split in Xtrain and Xtest respectively of size d × Ntrain and d × Ntest.
3

[ ]:
:
[ ]:
[ ]:
2. Write a function that centers the faces.
Recall: center means subtract the average face.
Vectors id and cat give informations about the images: id[k] and cat[k] are respectively the iden- tification (an index) and the cagetory of face k. Theses vectors are available for the reference and the test bases and will be useful in the following.
To load the database, we simply have to read the Matlab file YaleFaces.mat provided with this notebook: it provides the matrices and vectors Xtrain, Xtest, idtrain, idtest, cattrain, cattest.
The following code loads the database and creates the various matrices and vectors:
##### Useful libraries
import numpy.linalg
import numpy as np
import matplotlib.pyplot as plt
import scipy.io
# The training set (90 faces)
X_train = yaleFaces[‘X_train’]
cat_train = yaleFaces[‘cat_train’]
id_train = yaleFaces[‘id_train’]-1
# The test set (30 faces)
X_test = yaleFaces[‘X_test’]
cat_test = yaleFaces[‘cat_test’]
id_test = yaleFaces[‘id_test’]-1
# Additional images that don’t contain faces
X_noface = yaleFaces[‘X_noface’]
1. Write a function that computes the average face xmoy. Tip: use mean function from Numpy.
def meanFaces(X):
“”” Array[d,n] -> Vector[d] “””
def centeredFaces(X):
“”” Array[d,n]*Vector[d] -> Array[d,n] “””
4

3. Write a function deflat() that takes as argument a face, represented as a vector of 4096 elements, and returns an image of size 64 × 64.
Important: the Yale Faces database has been created in Matlab, for which the matrices are organized column by column. It may be useful to transpose the matrix.
[ ]:
[ ]:
def deflat(V):
“””” Vector -> Array[64,64] “””
4.Display the average face, as well as a few faces with the associ- ated centered faces. Here is an example of the expected result:
Figure 3: average face and centering of the database
Exercise 2: Computation of Eigenfaces (PCA)
The method developed by Turk and Pentland defines the eigenfaces as the main axes obtained by carrying out a principal component analysis (PCA) of the vectors associated with the reference faces. The eigenfaces are thus the eigenvectors of the covariance matrix XcX⊤c , of size d × d , where the matrix Xc of the same size as X represents all the centered faces:
Xc = 􏰃x0 −xmoy,···xn−1 −xmoy􏰄
Each line of Xc corresponds to a pixel p, each column of Xc corresponds to a reference face of index \$kˆ.
5

Rather than using eigenvalue decomposition, we will use singular value decomposition (SVD). The SVD decomposes the matrix Xc of size d × d into 3 matrices U, S, V such as :
Xc = USV⊤
where U and V are orthogonal matrices (UU⊤ = U⊤U = Id and VV⊤ = V⊤V = Ind) of respective sizes d × d and n × n, and S is a matrix of size d × n with null elements everywhere except on the main diagonal.
This decomposition has the following properties: – the columns of V are the eigenvectors of X⊤c Xc, -thecolumnsofUaretheeigenvectorsofXcX⊤c ,-thematrixSisdiagonal.Thediagonalrepresents the singular values of Xc, equal to the square roots of the eigenvalues λk of X⊤c Xc and XcX⊤c .
With Numpy, the SVD can be calculated by this way:
U, S, V = numpy.linalg.svd(Xc)
In our case, n < d, and the eigenvalues λk of XcX⊤c are therefore all null for k > n. We will not need the associated eigenvectors k > n. The svd function has a fast mode, which calculates only the eigenvectors corresponding to the columns of the matrix passed as argument:
U, S, V = svd(Xc, full_matrices=False)
This command returns the matrices U and V, of size d × n and n × n, and the matrix U matrix has been truncated, only the first n columns are retained:
U = [u1,··· ,un]
Finally S is a vector of size n and represents the diagonal matrix S.
1. Write a function eigenfaces(Xc) which returns a t-uple consisting of the U matrix of eigen- faces, computed from a centered database Xc, and the table of associated eigenvalues.
[ ]:
[ ]:
def eigenfaces(Xc):
“”” Array[d,n] -> Array[d,n]*Vector[n] “””
2. Use this function to calculate U and S. Normalize then the eigenvalues so that their sum is equal to 1.
3. Display the average face and the first 15 eigenfaces (see figure 4, use the plt.subplot() function). and their associated own values. Give your interpretation of the eigenfaces im- ages?
6

Figure 4: the 15 first eigenfaces
[ ]:
4. Plot the curve of the cumulative sum of the normalized eigenvalues (see figure 5 for the expected result), to see how much variation is captured by the first K eigenfaces. How many eigenfaces are needed to obtain a good reconstruction?
7

Figure 5: cumulative sum of eigenvalues
[ ]:
Exercise 3: projection in the subspace of faces
In the following, we use a reduced number of eigenfaces/eigenvectors. The vectorial space of faces, WK, is spanned by the basis formed with the K first eigenvectors:
WK =[u1,…,uK]
Note that the set of columns of WK is an orthonormal basis, so W⊤K × WK = IdK.
The projection of a face image x in the face subspace is simply done by subtracting from x the average face and applying the scalar product with each eigenvector. This gives the coordinates of the image x in the subspace of faces, which is of dimension K.
Each face therefore has several representations: – the original image, a vector x ∈ Rn – the coordi- nates of the projected image z in the basis of eigenfaces, {ah}, h ∈ {1; K} (subspace of faces):
z = W⊤K (x − xaverage) – its reconstruction in th