# Linear Optimal Control (LQR)

Robert Platt, Northeastern University

## The linear control problem

Given:

System: $x_{t+1} = A x_t + B u_t$

Cost function: $J(X, U) = \sum_{t=1}^{T} \left( x_t^\top Q x_t + u_t^\top R u_t \right)$

where: $Q = Q^\top \succeq 0$ and $R = R^\top \succ 0$

Initial state: $x_1$

Calculate: $U = (u_1, \ldots, u_{T-1})$ that minimizes $J(X, U)$

Important problem! How do we solve it?

## One solution: least squares

Roll the dynamics out over the horizon so the whole state trajectory is a linear function of the controls:

$X = G x_1 + H U$

where $X = (x_1, \ldots, x_T)$, $U = (u_1, \ldots, u_{T-1})$, $G$ stacks the powers $A^t$, and $H$ is the block matrix whose entries are terms of the form $A^{k} B$.

Substitute X into J:

$J(U) = (G x_1 + H U)^\top \bar{Q} (G x_1 + H U) + U^\top \bar{R} U$

where $\bar{Q} = \mathrm{diag}(Q, \ldots, Q)$ and $\bar{R} = \mathrm{diag}(R, \ldots, R)$.

Minimize by setting dJ/dU = 0:

$2 H^\top \bar{Q} (G x_1 + H U) + 2 \bar{R} U = 0$

Solve for U:

$U = -\left( H^\top \bar{Q} H + \bar{R} \right)^{-1} H^\top \bar{Q} G x_1$

## What can this do?

Solve for the optimal trajectory: start here, end here at time = T.

[Figure: an optimal trajectory from the start state to the goal state. Image: van den Berg, 2015]

This is cool, but…

– only works for finite horizon problems
– doesn’t account for noise
– requires you to invert a big matrix

## Bellman solution

Cost-to-go function: V(x)

– the cost that we have yet to experience if we travel along the minimum cost path
– given the cost-to-go function, you can calculate the optimal path/policy

Example: the number in each cell describes the number of steps “to-go” before reaching the goal state.
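The grid example can be reproduced with a short dynamic-programming sweep: breadth-first search outward from the goal fills each cell with its steps-to-go. This is an illustrative sketch; the function name and the string grid encoding are mine, not from the slides.

```python
from collections import deque

def steps_to_go(grid, goal):
    """Compute, for every reachable cell, the number of steps "to-go"
    before reaching the goal state.

    grid: list of equal-length strings, '#' = obstacle, '.' = free cell.
    goal: (row, col) of the goal state.  Returns a dict cell -> steps.
    """
    rows, cols = len(grid), len(grid[0])
    V = {goal: 0}                 # cost-to-go at the goal is zero
    frontier = deque([goal])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in V):
                V[(nr, nc)] = V[(r, c)] + 1   # one step more than a neighbor
                frontier.append((nr, nc))
    return V
```

Given V, the optimal policy at any cell is simply to step to the neighbor with the smallest cost-to-go.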

## Bellman solution

Bellman optimality principle:

$V_t(x) = \min_u \left[ x^\top Q x + u^\top R u \;+\; V_{t+1}(A x + B u) \right]$

– $V_t(x)$: cost-to-go from state x at time t
– $x^\top Q x + u^\top R u$: cost incurred on this time step
– $V_{t+1}(A x + B u)$: cost-to-go from state (Ax + Bu) at time t+1, i.e. the cost incurred after this time step

## Bellman solution

For the sake of argument, suppose that the cost-to-go is always a quadratic function like this:

$V_{t+1}(x) = x^\top P_{t+1} x$

where: $P_{t+1} = P_{t+1}^\top \succeq 0$

Then:

$V_t(x) = x^\top Q x + \min_u \left[ u^\top R u + (A x + B u)^\top P_{t+1} (A x + B u) \right]$

How do we minimize this term?

– take the derivative and set it to zero.

## Bellman solution

How do we minimize this term? Take the derivative with respect to u and set it to zero:

$2 R u + 2 B^\top P_{t+1} (A x + B u) = 0$

$u_t = -\left( R + B^\top P_{t+1} B \right)^{-1} B^\top P_{t+1} A \, x_t$

This is the optimal control as a function of state – but it depends on $P_{t+1}$…

How do we solve for $P_{t+1}$?

## Bellman solution

Substitute this u into $V_t(x)$:

$V_t(x) = x^\top \left( Q + A^\top P_{t+1} A - A^\top P_{t+1} B \left( R + B^\top P_{t+1} B \right)^{-1} B^\top P_{t+1} A \right) x$

So $V_t(x) = x^\top P_t x$ is again quadratic, with

$P_t = Q + A^\top P_{t+1} A - A^\top P_{t+1} B \left( R + B^\top P_{t+1} B \right)^{-1} B^\top P_{t+1} A$

Dynamic Riccati Equation
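One backward step of this recursion is short to write down. The sketch below (the helper name `riccati_step` is mine) returns both the gain $K_t$, so that $u_t = K_t x_t$, and the new cost-to-go matrix $P_t$:

```python
import numpy as np

def riccati_step(A, B, Q, R, P_next):
    """One backward step of the dynamic Riccati equation:

    P_t = Q + A^T P_{t+1} A
            - A^T P_{t+1} B (R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A

    Returns (P_t, K_t) where u_t = K_t x_t is the optimal control.
    """
    BtP = B.T @ P_next
    # K_t = -(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A
    K = -np.linalg.solve(R + BtP @ B, BtP @ A)
    # Equivalent compact form of the Riccati update using K:
    P = Q + A.T @ P_next @ (A + B @ K)
    return P, K
```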

## Example: planar double integrator

A puck sliding on an air hockey table: m = 1, damping b = 0.1, u = applied force. The state contains the position and velocity of the puck; the goal position is the origin.

Build the LQR controller for:

– Initial state: the initial position and velocity of the puck
– Time horizon: T = 100
– Cost fn: quadratic, with weight matrices Q and R

## Example: planar double integrator

Step 1: Calculate P backward from T: P_100, P_99, P_98, … , P_1

How? Start from the terminal cost, $P_T = Q$, and apply the dynamic Riccati equation repeatedly to get $P_{T-1}, P_{T-2}, \ldots, P_1$, storing the gain $K_t$ at each step.

## Example: planar double integrator

Step 2: Calculate u starting at t = 1 and going forward to t = T−1, using the gains from Step 1: apply $u_t = K_t x_t$ and step the dynamics $x_{t+1} = A x_t + B u_t$.
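Putting the two steps together for the puck. The discretization below is an assumed Euler discretization, and dt, Q, R, and the initial state are illustrative choices of mine; the slides only specify m = 1, b = 0.1, and T = 100.

```python
import numpy as np

# Assumed Euler discretization of the damped planar double integrator
# (state = [px, py, vx, vy]); dt, Q, R, x0 are illustrative choices.
dt, b, T = 0.1, 0.1, 100
A = np.block([[np.eye(2), dt * np.eye(2)],
              [np.zeros((2, 2)), (1.0 - b * dt) * np.eye(2)]])
B = np.vstack([np.zeros((2, 2)), dt * np.eye(2)])   # force -> velocity (m = 1)
Q = np.eye(4)
R = 0.2 * np.eye(2)

# Step 1: calculate P backward from T: P_100, P_99, ..., P_1, saving gains.
P = Q.copy()                       # terminal cost-to-go: P_T = Q
gains = []
for _ in range(T - 1):
    BtP = B.T @ P
    K = -np.linalg.solve(R + BtP @ B, BtP @ A)
    gains.append(K)
    P = Q + A.T @ P @ (A + B @ K)  # dynamic Riccati equation
gains.reverse()                    # gains[t-1] is now K_t

# Step 2: calculate u starting at t = 1 and going forward to t = T-1.
x0 = np.array([2.0, 1.0, 0.0, -0.5])   # initial position and velocity
x = x0.copy()
for t in range(T - 1):
    u = gains[t] @ x
    x = A @ x + B @ u              # the puck is driven toward the origin
```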

## Example: planar double integrator

[Figures: the diagonal cost-weight matrices (with entries such as 1 and 0.2), the control inputs u_x and u_y plotted against time t, and the resulting puck trajectories on the air hockey table converging to the origin.]

## The infinite horizon case

So far, we have optimized cost over a fixed horizon, T.

– optimal if you only have T time steps to do the job

But what if time doesn’t end in T steps?

One idea:

– at each time step, assume that you always have T more time steps to go
– this is called a receding horizon controller

## The infinite horizon case

[Figure: elements of the P matrix plotted against time step, converging toward a fixed P.]

Notice that the elements of P stop changing (much) more than 20 or 30 time steps prior to the horizon.

– what does this imply about the infinite horizon case?

## The infinite horizon case

We can solve for the infinite horizon P exactly:

$P = Q + A^\top P A - A^\top P B \left( R + B^\top P B \right)^{-1} B^\top P A$

Discrete Time Algebraic Riccati Equation

So, what are we optimizing for now?

Given:

System: $x_{t+1} = A x_t + B u_t$

Cost function: $J(X, U) = \sum_{t=1}^{\infty} \left( x_t^\top Q x_t + u_t^\top R u_t \right)$

where: $Q = Q^\top \succeq 0$ and $R = R^\top \succ 0$

Initial state: $x_1$

Calculate: $U$ that minimizes $J(X, U)$
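In code, the fixed point P can be found with a dedicated solver (e.g. `scipy.linalg.solve_discrete_are`) or, as the P-vs-time plot suggests, simply by iterating the dynamic Riccati equation until P stops changing. A NumPy-only sketch of the latter (the function name is mine):

```python
import numpy as np

def solve_dare(A, B, Q, R, tol=1e-12, max_iter=100_000):
    """Find the infinite-horizon P by iterating the dynamic Riccati
    equation until its elements stop changing, i.e. until P is a fixed
    point of the Discrete Time Algebraic Riccati Equation."""
    P = Q.copy()
    for _ in range(max_iter):
        BtP = B.T @ P
        P_new = (Q + A.T @ P @ A
                 - A.T @ P @ B @ np.linalg.solve(R + BtP @ B, BtP @ A))
        if np.max(np.abs(P_new - P)) < tol:
            return P_new
        P = P_new
    raise RuntimeError("Riccati iteration did not converge")
```

The resulting stationary gain $K = -(R + B^\top P B)^{-1} B^\top P A$ gives a single time-invariant controller $u = K x$ for the infinite-horizon problem.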

## Controllability

A system is controllable if it is possible to reach any goal state from any other start state in a finite period of time.

When is a linear system controllable? It’s a property of the system dynamics…

Remember this? In the least squares solution, the matrix H built from blocks of the form $A^k B$ is what maps the controls onto the states.

## Controllability

$\begin{bmatrix} B & AB & A^2 B & \cdots & A^{n-1} B \end{bmatrix}$

What property must this matrix have? It must be full rank.

– i.e. the rank must equal the dimension of the state space, n
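The rank test is direct to implement (illustrative helper; the function name is mine):

```python
import numpy as np

def is_controllable(A, B):
    """Controllability test: the matrix [B, AB, A^2 B, ..., A^{n-1} B]
    must have rank n, the dimension of the state space."""
    n = A.shape[0]
    C = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    return np.linalg.matrix_rank(C) == n
```

For example, the double integrator is controllable through force input alone, whereas two decoupled states driven by a single input acting on only one of them are not.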