# 程序代写代做代考 Hive matlab scheme algorithm 2008

2008

Open Book Exam for EE4212 (Due: 27th Feb, 2017)

Your name and Matric Number:

Your pledge: I have not given or received aid from anyone else in solving the problems.

Your Signature:

Instructions:

1. Your report should be neatly written, typed up or a mix of both. You can print using recycled papers, or print on both sides! Also submit your codes to the GA via the IVLE workbin->Student Submission->CA1. You can send executables and/or just the source codes but provide a readme file on how to compile and run it.

2. Each question is marked with 0 to 3 asterisks *, with more asterisk indicating increasing level of difficulty. A question with 3 *’s means that it is a challenge question; if you can solve such challenge questions, you will get A+ and you should seriously consider becoming a vision researcher!

3. In the following questions, you can assume that the camera is calibrated. Do not bring in the intrinsic parameters. That is, you can treat all the image quantities given or asked for in the questions as quantities in terms of the camera reference frame, with the implicit assumption that you can always convert them to pixel units if needs be.

1. Projection Model; Homogeneous coordinates

(a) * Suppose you have two parallel lines in 3-space, one passing through the point (100,100,1000), the other through (200,200,1100). The lines are parallel to the vector (1,2,1). The lines are observed by a unit focal length camera at the origin (i.e. the camera reference frame and the world reference frame are identical). All coordinates are in camera coordinates. What is their point of intersection in the image? (Hint: the point of infinity along li in 3-space is given by lim(i((

, where

and

are a point on the line and its direction respectively)

(b) Given the 3D coordinates of several corresponding points Pi and Pi’ in two views, you are required to find the 3D rotation R and translation T that relate the two views (Pi’ = RPi + T). Formulate a linear least squares algorithm (of the form Ax=b) that ignores the orthogonality constraint associated with R (that is, it is ok if the solution for R returned by your formulation is not orthogonal). State the entries of the matrix A, and the vectors x and b.

(c) You are now given two sets of corresponding point clouds Pi and Pi’, in the files pts.txt and pts_prime.txt respectively. Each row is a 3D coordinate, possibly contaminated with some noise. Write a Matlab routine to estimate the optimal values of R and T in the least squares sense via SVD, using the formulation in (b). Compute the determinant of R using the Matlab function det, and comment on the resulting value.

(d) If the world homogeneous coordinates are (Xw, Yw, Zw, Ww), the image plane homogeneous coordinates are (u, v, w), and they are related by:

i. find the 3×4 matrix M as a function of (, h, according to the diagram below (in the left diagram, the axis X and Xw are pointing directly out of the paper). Assume that the only intrinsic camera parameter is the focal length f (given in pixel unit). Use Pc = RPw +t to relate the camera coordinates Pc and the world coordinates Pw. R is the orientation of the world with respect to the camera; t is the world origin expressed in the camera frame.

ii. * Find the 3×3 matrix for the linear transformation that maps points on the world plane Zw = d to the image plane (u, v, w).

Q2 Using Singular Value Decomposition for Image Compression and PCA.

Many thousands of outdoor cameras are currently connected to the Internet. They can be used to measure plant growth (e.g. in response to recent warming trends), survey animal populations (e.g. in national parks), monitor surf conditions, and security, etc. You can see some examples from the following websites:

Santa Catalina Mountains, Arizona – Dec 5 2006 (http://www.cs.arizona.edu/camera);

Yosemite: Half-dome from Glacier Point – Dec 19 2006 (http://www.halfdome.net/);

Tokyo Riverside Skyline – Jan 08 2007 (http://tokyosky.to/ ).

University of Washington AMOS archive: http://www.cse.wustl.edu/amos

In this question, you are provided with a 150-frame time-lapse video of a city scene taken between 6.30-7.30pm. Each frame is of the dimension 161×261 pixels. Watch the video, and you can see that clouds are moving, and planes take off and land from a nearby airport.

a. Using Matlab, load the first image in the video sequence provided with this question and convert it to an appropriate data type. Just type the following commands in the command window:

I= im2double(imread(‘image001.png’));

b. Do a singular value decomposition using the command ‘svd’:

[U S V]=svd(I);

This will give you the singular values and the singular vectors. The singular values in S have been sorted in descending order. Plot the singular value spectrum using the command: plot(diag(S),’b.’). Submit this plot. What do you notice in this plot?

c. Let K=20, Extract the first K singular values and their corresponding vectors in U and V:

K=20;

Sk=S(1:K,1:K);

Uk=U(:,1:K);

Vk=V(:,1:K);

Uk, Vk, Sk contain the compressed version of the image. To see this, form the compressed image:

Imk=Uk*Sk*Vk’; display it by :imshow(Imk).

Print out a copy of this compressed image and submit it.

d. Repeat question c for K=40,60,80. Submit the compressed images for the different values of K. Compare the 4 compressed images. Briefly describe what you notice.

e. Thus, in image transmission, instead of transmitting the original image you can transmit Uk, Vk, Sk, which should be much less data than the original. Is it worth transmitting when K=100 (i.e. do you save any bits in transmission when K=100)? Explain your answer.

f. ** Find an expression that bounds the per-pixel error (the difference between the original image and the compressed image for a particular pixel (i, j)) in terms of K, the singular values and the elements in U and V with the largest absolute magnitude (umax, vmax respectively).

g. ** SVD is intimately related to PCA (Principal components analysis). Some of you might have learned about PCA in the EE3731C course, but it is not a pre-requisite for this question. Basically, finding the principal components of a matrix X amounts to finding an orthonormal basis that spans the column space of X (these are the column vectors ui in the matrix U). Here we create the data matrix X by first vectorizing each of the 150 images into a column vector (by scanning in either row-major order or column-major order), and then stacking these vectors together into a matrix of size 42021×150. In effect, we have captured the entire video sequence into the 2D matrix X.

Vectorizing an image.

Before we proceed further, mean subtraction (a.k.a. “mean centering”) to center the data at the origin is necessary for performing PCA. That is, the images are mean centered by subtracting the mean image vector from each image vector. This is to ensure that the first principal component u1 really describes the direction of maximum variance. Again, if you do not have background in PCA, it is ok; just take it as a preprocessing step (or you can do some independent learning).

Apply SVD to the resultant mean-centered matrix. Take only the first 10 principal components and reconstruct the image sequence. [NB: you can use the Matlab reshape command to convert a matrix to a vector and vice versa]. Observe the dynamics in the reconstructed video, comment on what you find, and explain why. Remember to add back this mean image vector when you are displaying your reconstructed results.

[Non-mandatory]: You can also explore other possibilities, like investigating what each principal component means, and their implications for processing and editing of outdoor webcam imagery.

Question 3: Projection Model; Homogeneous coordinates

a. An affine camera is a simplification of the full perspective camera but is more complicated than the scaled orthographic model. It has a projection relationship given by the following equations:

[x,y]T = A[X,Y,Z]T + b

where A is a 2(3 matrix, and b is a 2(1 vector. If the world point (X,Y,Z) and image point (x,y) are represented by homogeneous vectors, write down the matrix representing the linear mapping between their homogeneous coordinates.

* Show that the point at infinity in space (X,Y,Z,0)T is mapped to point of infinity in the image plane. What does this result imply about the projection of parallel lines in space onto the image plane?

b. ** Given an image of a scene containing a cube as shown in the following.

Each vanishing point is the image of a point at infinity of the form (d, 0), where d is a Euclidean vector with 3 coordinates expressing the direction of a cube edge. Show that the coordinates of a vanishing point v can be expressed as v = K R d, where K is the intrinsic calibration matrix and R is the rotation matrix between the camera and world coordinate system. Hence express an edge direction d as a function of K, R and v. (Hint: Start from the 3×4 projection matrix equation in Ch 1)

** The 3 directions of the cube edges are mutually perpendicular; therefore the dot product between any two directions is zero. Show that this condition leads to an equation in terms of the vanishing points and the unknown calibration matrix K. Note that such an equation can be written for each pair of the 3 vanishing points.

*** (Note that the derived equation above can be used to solve for K, but you are not required to explicitly propose a scheme for solving this equation. However, doing so will earn you bonus point.)

c. The projection of a scene point in world coordinates to pixel coordinates in an image can be represented using a camera projection matrix P as follows:

i. * Given a set of lines in a scene that are all parallel to the world X-axis, what is the vanishing point, (u, v), of these lines in the image? Can you conclude whether an infinite line in 3D space always yield an infinite line in the 2D image plane?

ii. What is the significance of the image point (represented as a homogeneous 3-vector) given by the last column (

,

,

) of P? That is, which world point gives rise to (

,

,

)?

iii. ** Consider the 1-dimensional right null-space of P, i.e., the 4-vector C such that PC=0. In this case, the image point of C is (0, 0, 0) which is not defined. What is the point C which possesses this property? Explain.

d. The equation of a conics in inhomogeneous coordinates is given by a x2 + b xy + c y2 +d x + e y + f =0. Homogenize this equation by the replacement x ( x1 / x3, y ( x2 / x3, and write down the homogeneous form. Finally, express this homogeneous form in the matrix form xT C x = 0 where C is symmetric. Write down the elements of the symmetric matrix C.

** If C has the special form C = l mT + m lT , clearly C is symmetric and therefore represents a conic. Show that the vector x = l ( m is the null vector of C. Since a null vector exists, C is a degenerate conic. Show in this degenerate case, C contains two lines l and m, that is, show that the points on the lines l and m satisfy xT C x = 0.

e. Given a set of two-dimensional points (x,y) as follows, write a Matlab routine to find the conics best (in the least squares sense) represented by these points, in terms of the parameters a, b, c, d, e, and f, as formulated in Question 3d.

3,-8

4,-9

5,-9

6,-9

7,-9

8,-9

9,-9

10,-9

11,-8

12,-8

13,-8

14,-8

15,-7

16,-6

16,-5

17,-4

17,-3

17,-2

18,-1

18,0

18,1

18,2

18,3

18,4

18,5

18,6

18,7

18,8

18,9

17,10

17,11

16,12

16,13

15,14

15,15

14,16

13,17

12,18

11,18

10,19

9,20

8,20

7,20

6,21

5,21

4,21

3,22

2,22

1,22

0,22

-1,21

-2,21

-3,20

-4,19

-5,18

-6,17

-7,16

-7,15

-8,14

-8,13

-8,12

-8,11

-8,10

-8,9

-8,8

-8,7

-8,6

-8,5

-8,4

-7,3

-7,2

-6,1

-5,0

-4,-1

-4,-2

-3,-3

-2,-4

-2,-5

-1,-6

0,-7

1,-7

2,-8

Q4. Estimate Fundamental Matrix F using 8-Point Algorithm [you should be able to attempt this question by the end of week 4]

4.1. Introduction:

The fundamental matrix F completely describes the epipolar geometry of a pair of images. 8-Point Algorithm is the simplest method for estimating the fundamental matrix. In this assignment, you will implement the 8-Point Algorithm to estimate the fundamental matrix given a pair of images. You would also need to implement the normalization so as to improve the stability (outlined in steps 4.2.2 and 4.2.4 below). Finally, we have also provided a package to help you reconstruct and visualize the depths (see point #6 in 4.3; this step is optional but will earn you bonus point). Please read this document completely before starting work. A list of reading materials you may need to solve this problem is given in Appendix I.

4.2. Tasks:

You are given 2 pairs of images (inria1.tif, inria2.tif, frc1.tif, frc2.tif) for this assignment. The steps for this assignment are:

a. Using one of the given pairs of images, establish

point correspondences.

b. Normalize the coordinates of the correspondences.

c. Using the normalized coordinates of the correspondences, compute the fundamental matrix.

d. Perform denormalization so as to find the fundamental matrix corresponding to the original data.

Details of each step are given in Section 4.2.1, 4.2.2, 4.2.3 and 4.2.4. Some useful Matlab commands are given in Appendix II.

4.2.1 Establish Point Correspondences:

In this step, you need to establish point correspondences between the two images. Using Matlab, manually or automatically mark and record at least 8 pairs of corresponding points (preferably much more than 8 pairs for robustness!).

Note:

· Your points should not all lie on the same line, but spread out in the image.

· It is usually easier to mark prominent features in the image, e.g. corners of the buildings.

· You may use the Matlab function ‘cpselect’ for this step. It displays the pair of images side by side, allowing you to click on corresponding points manually. You can also use the zoom feature to mark the points more accurately. See 4.3 (5) if you want to automate this step.

At the end of this step: you should have marked out

point correspondences

between the pair of images. Show where these points are by plotting the correspondences in the same symbol and color in the pair of images. Submit these plots as jpg files.

4.2.2 Normalization of the data

In 8-point algorithm, the stability of the results can be greatly improved by a simple normalization (translation and scaling) of the coordinates of the correspondences (originally presented in [Hartley-1997] and also in [Hartley&Zisserman-2000]). For this normalization, you should find a similarity transformation

, consisting of a translation and scaling, that takes points

to a new set of points

EMBED Equation.DSMT4 such that the centroid of the points

is the coordinates origin, and their average distance (Root Mean Squares) from the origin is

. Similarly, find a similarity transformation

that takes points

to points

EMBED Equation.DSMT4 .

Note: Homogeneous coordinates are used in this step.

At the end of this step: You should have the normalized homogeneous coordinates of the point correspondences

, which are to be used in the next step to compute F. The transformation

and

will be used again in the denormalization step.

4.2.3 Computation of Fundamental Matrix

The computation of the fundamental matrix is described in detail in your textbook or in Section 10.1 of [Hartley&Zisserman-2000]. In summary the steps are as follows:

(1) Find solution for

From a set of n point correspondences, we obtain a set of linear equations of the form

is an

matrix

and

is the 9-vector made up of the entries of

in row-major order. If

, non-trivial solution for the system above exists, and if

, the solution is unique (up to scale). The solution for

should be the last column of

in the

.

(2) Enforce singularity constraint

Fundamental Matrix is singular, in fact of rank 2. The matrix

found by the above method will not in general have rank 2 and steps should be taken to enforce this singularity constraint. One easy way to do this is: let

be the

of

, where

is a diagonal matrix

satisfying

. Let

. According to the so-called Frobenius norm,

is the closest singular matrix to

. Thus replace

by

.

Note:

*The data used in this step is the normalized data obtained from step 4.2.2.

At the end of this step: You should’ve the estimated fundamental matrix

. Keep in mind that because you’ve performed step 4.2.2, this is actually the fundamental matrix corresponding to the normalized data; thus we will instead denote it as

.

4.2.4 Denormalization

Since you have performed step 4.2.2, the fundamental matrix

you obtained from the above corresponds to the normalized data

and need to be denormalized. Set

(

and

are obtained in the normalization step; you may easily prove this yourself). Then,

is the fundamental matrix corresponding to the original data

.

At the end of this step: You should have the estimated fundamental matrix

.

* Note: To visualize the correctness of your estimated F, use the given function displayEpipolarF in displayEpipolarF.m (This function needs another function in epipoles.m). The function displayEpipolarF asks to select an image point in image 1 and displays the corresponding epipolar line in image 2. If F is estimated correctly, the epipolar line should pass through the corresponding point in image 2. Download displayEpipolarF.m, epipoles.m (contained in zipped matlab files)

4.3 What to submit

(1) Plots showing the corresponding points. Explain briefly why you chose these points.

(2) The code that implements the computation of F Matrix, including the normalization of the data. Submit a matlab function with the following signature:

function F = estimateF(x1, x2)

where x1 and x2 are each

matrices with the columns corresponding to the coordinates in the first and the second image respectively.

(3) Print out the values of the estimated F, and also the results of your verification for a few points (using displayEpipolarF). You should repeat the whole process for different values of

and for different sets of corresponding points, or on different pair of images for thorough testing. Don’t despair if the results showed a lot of variation!

(4) ** Comment on the accuracy of the results obtained in (3). If they are not as accurate as you have hoped, attempt to offer an explanation.

(5) *** Bonus: Obtain dense correspondences using KLT tracker. Read information about KLT tracker from the OpenCV libraries and how to run it (Matlab interface routines are available on the preceding website). You can also read this book in the RBR for more information: Gary Bradski, Adrian Kaehler. Learning OpenCV: Computer Vision with the OpenCV Library. From the dense correspondence, attempt to improve the estimation of F.

(6) *** Bonus: use the package (contained in this zipped matlab package (12MB) and see the readme file for instructions) provided to reconstruct the 3d depths and visualize the 3d point cloud. All the functions for estimating 3d depths (from a given F) and visualization have been provided for you in this package. You only need to interface your codes with the package properly. Print out the final plot.

Note that since you are using only fundamental matrix, without any knowledge of the intrinsic parameters, you can only reconstruct a scene structure that is related to the true scene depths by a projective transformation (what is known as projective depths). Thus, the reconstructed plot may look distorted.

Alternatively, use the two building images here (building1.jpg, building2.jpg). If you are unable to download the building1 and building2 images through the link in the word document itself, you can also go to the class website at http://courses.nus.edu.sg/course/eleclf/ee4212/ and download through the links there (in the assignment section.) The intrinsic parameters of the two images are the same and are as follows: the focal length is 1333.3 pixel unit, while the principal point is at the center of the image. With these intrinsic parameters, you are able to do a Euclidean reconstruction instead of a projective reconstruction. There are two things to do. First, convert the image coordinates into metric coordinates. In this case, your estimateF function will automatically become a function estimating the essential matrix; second, tell the package provided that you are working on the ‘calibrated’ case by calling the function computeSMFromW with an additional parameter as follows

anim = computeSMFromW(true, samples, ‘isCalibrated’, true);

Then you will get an Euclidean 3D point cloud up to a scale factor.

NB: The toolbox is supported on Windows and Linux only. If you are a Mac user, you have 2 options:

1. you can try the computers installed with Windows and Matlab in the computer lab in E5.

2. The GA can teach you to install a virtual machine on Mac to run Windows system. This will take hours.

Appendix I. Reading Materials

*[Hartley&Zisserman-2000] R. Hartley, A. Zisserman, Multiple View Geometry in computer vision, Cambridge, 2000 (Section 10.1, 10.2)

* [Hartley-1997] R. Hartley, In Defense of the Eight-Point Algorithm, IEEE-PAMI, Vol.19, No.6, 1997 [describing the rationale behind the normalization step]

* E. Trucco, A. Verri, Introductory Technique for 3-D Computer Vision, Prentice Hall, 1998 (Section 7.3)

Appendix II. Useful Matlab Command

These are some Matlab commands that might be useful for your work. For more information please type: help

* imread, imwrite to read and write images. JPEG, TIFF, BMP formats are supported.

* imshow to display images.

* plot(x,y,’b*’) to plot points given by the coordinates (x,y). The point plotted is a blue

asterisk’*’. See the help page for more information about colors and markers.

* “hold on” causes subsequent image operations to display results on top of image, instead

of replacing image. Use this to plot the features points onto the image.

* save

* load loads back the variables saved by save.

* cpselect displays 2 images side by side, allowing you to click on corresponding points.

* [x, y]=ginput(n) captures the coordinates of n mouse clicks in the figure. If n>1, each of x and y is a n(1 vector.

* [U S V]= svd(A) computes the svd of matrix A.

* reshape(X,m,n) returns the m(n matrix whose elements are taken columnwise from vector X (i.e. 1st column of the m(n matrix comes from the first n elements of vector X).

Q5 Motion Perception [you should be able to attempt (a)-(c) by the end of week 5]

Figure. Two squares translating horizontally in the opposite directions.

(a) In the Figure above, the two squares are translating horizontally in the opposite directions as indicated by the arrows. Indicate the respective perceived motions if you are looking through the apertures 1, 2, and 3. Explain your answers.

(b) Barber-pole with Occlusion: The figure in this question illustrates various barber-pole configurations, with occluders placed along either vertical or horizontal sides of the barber-pole. The arrows indicate the perceived direction of barber-pole motion. Explain why the perceived motion tends to be biased in the direction orthogonal to the occlusion boundary.

Figure. Barber-pole with occlusion. Arrows indicate the perceived motion. Left: Barber-pole with no occluders. Middle: Occluders placed on the top and bottom cause the perceived motion to be mostly vertical. Right: Occluders placed on the left and right sides of the barber-pole bias the perceived motion toward horizontal.

(c) *Referring to the diagram below, the stimulus consists of two orthogonal bars that move sinusoidally, 90 degree out of phase (Figure a and b). When presented together within an occluding aperture (Figure c), the bars perceptually cohere and appear to move in a circle as a solid cross. However, when presented alone (Figure d), they appear to move separately (the horizontal bar translates vertically and the vertical bar translates horizontally), even though the image motion is unchanged in Figures c and d. In either stimulus condition, both percepts are legitimate interpretations of the image motion. Yet a single interpretation is predominantly seen in each case. Why?

(d) * For a purely rotational motion, the equation that relates the optical flow (u, v) to its rotational parameters ((x, (y, (z) is of the conic form explored in Questions 3d and 3e:

u = ωx xy − ωy (x2 + 1) + ωz y

v = ωx (y2 + 1) − ωy xy − ωz x

Assume you are given enough optical flow measurements at different (x, y) locations, solve ((x, (y, (z) in the least squares sense by writing down the least squares equation of the form Ax=b, b ( 0. Explain why in this case, the equation is of the non-homogeneous form, whereas that in Question 3e is of the homogeneous form Ax=0, and explain whether it is appropriate or inappropriate to change the formulation of this Question 5d to the Ax=0 form.

� EMBED Equation.DSMT4 ���

_1135275626.unknown

_1170236633.unknown

_1203544548.unknown

_1481436700.unknown

_1481436725.unknown

_1387802409.unknown

_1170236639.unknown

_1197810848.unknown

_1135275818.unknown

_1135276487.unknown

_1135762704.unknown

_1170236616.unknown

_1135276685.unknown

_1135276438.unknown

_1135275879.unknown

_1135275737.unknown

_1135275746.unknown

_1135275761.unknown

_1135275654.unknown

_1135013447.unknown

_1135015009.unknown

_1135015997.unknown

_1135016314.unknown

_1135018987.unknown

_1135018988.unknown

_1135275522.unknown

_1135016400.unknown

_1135016116.unknown

_1135016154.unknown

_1135016022.unknown

_1135015952.unknown

_1135015977.unknown

_1135015598.unknown

_1135014585.unknown

_1135014916.unknown

_1135014971.unknown

_1135013617.unknown

_1135014402.unknown

_1135014509.unknown

_1135013728.unknown

_1135013565.unknown

_1134996473.unknown

_1135008456.unknown

_1135008280.unknown

_1135008289.unknown

_1134988664.unknown

_1134988722.unknown

11121314

21222324

31323334

1

w

w

w

X

supppp

Y

svpppp

Z

spppp

æö

æöéù

ç÷

ç÷

êú

ç÷

=

ç÷

êú

ç÷

ç÷

êú

ç÷

èøëû

èø

i

i

i

X

Y

Z

éù

êú

êú

êú

ëû

a

b

c

éù

êú

êú

êú

ëû

ú

ú

ú

û

ù

ê

ê

ê

ë

é

+

ú

ú

ú

û

ù

ê

ê

ê

ë

é

=

c

b

a

Z

Y

X

i

C

i

i

i

l

i

l

14

p

24

p

34

p

14

p

24

p

34

p

n

(8)

n

³

n

(8)

n

³

‘

ii

«

xx

T

i

x

ˆ

i

x

(

)

ˆ

ii

T

=

xx

ˆ

i

x

2

‘

T

‘

i

x

ˆ

‘

i

x

(

)

ˆ

”’

ii

T

=

xx

ˆ

ˆ

‘

ii

«

xx

T

‘

T

f

”””

111111111111

”””

1

0

1

nnnnnnnnnnnn

xxxyxyxyyyxy

Aff

xxxyxyxyyyxy

éù

êú

==

êú

êú

ëû

MMMMMMMMM

A

9

n

´

(8)

n

³

f

F

()8

rankA

£

()8

rankA

=

f

V

()

T

SVDAUDV

=

F

T

FUDV

=

SVD

F

D

(,,)

diagrst

rst

³³

‘(,,0)

T

FUdiagrsV

=

‘

F

F

F

‘

F

‘

F

ˆ

‘

F

ˆ

‘

F

ˆˆ

‘

ii

«

xx

ˆ

”

T

FTFT

=

T

‘

T

F

‘

ii

«

xx

F

2

n

´

(8)

n

³

11121314

21222324

31323334

1

w

w

w

X

supppp

Y

svpppp

Z

spppp

æö

æöéù

ç÷

ç÷

êú

ç÷

=

ç÷

êú

ç÷

ç÷

êú

ç÷

èøëû

èø