
DC SVD I: Just Can’t Let It Go


It’s been way, way, way too long since I’ve posted. I haven’t been slacking, though; I’ve merely been busy. Really.

I decided to dive back into the SVD problem and look at an alternative to the QR-based SVD computations. Namely, I’m going to give a breakdown of the divide-and-conquer approach to computing the SVD. Similar to the relationship between bidiagonal SVD and tridiagonal QR decompositions, there is a close relationship between dividing-and-conquering QR and SVD. I’m going to start from "the inside out" with the innermost task of the DC (divide-and-conquer) process: solving a particular equation known as the secular equation (secular meaning "not heavenly", i.e., earthly or planetary; check it out on Wikipedia).

One note: I feel very guilty about still using Python 2. I had intended to transition this project to Python 3 over the course of the last year. Alas, my major training clients over the last year were using Python 2 and I didn’t really want to have a mixed development environment on my laptop. Well, at least there’s something to do if I make a book out of these posts!

Enough preamble, let’s get down to business.

The Secular Equation

Divide-and-conquer SVD is built on computing the roots of the secular equation. The roots, or zeros, of an equation are the \(\lambda\)s such that \(f(\lambda) = 0\). The secular equation is (where we take \(\rho=1\)):

\[f(\lambda) = 1 + \rho \sum_{i=1}^n \frac{z_{i}^2}{d_i - \lambda} = 1 + \sum_{i=1}^n \frac{z_{i}^2}{d_i - \lambda}\]

Here is a graph of a secular equation and its roots:

In[1]:

    import numpy as np
    import numpy.linalg as nla
    import matplotlib.pyplot as plt
    %matplotlib inline

    xs = np.linspace(-1,10,10000)

    # graph secular equation (blue curve)
    # plot z,d --> y
    zs = np.array([.6,1.2,1.8])
    ds = np.array([0.0,3,5])
    with np.errstate(divide='ignore'):
        # plain vanilla to "fancy-schmancy"
        #ys1 = 1.0 + (1.0 / (0.0-xs_sq)) + (4.0 / (9-xs_sq)) + (16.0 / (25 - xs_sq))
        #ys2 = 1.0 + (zs[0]**2 / (ds[0]**2-xs_sq)) + (zs[1]**2 / (ds[1]**2-xs_sq)) + (zs[2]**2 / (ds[2]**2 - xs_sq))
        ys3 = 1.0 + ((zs**2).reshape(3,1) / np.subtract.outer(ds, xs)).sum(0)
        # assert np.allclose(ys1, ys2) and np.allclose(ys2,ys3)
    plt.plot(xs, ys3)

    # add an x-axis (yellow horizontal line)
    plt.plot(xs, np.zeros_like(xs), 'y-')

    # add poles/asymptotes (grey vertical lines)
    plt.vlines(ds, -10, 10, '.75')

    # add roots (red dots)
    # use equivalent matrix (for these z,d) and eigenvalue computation to find roots
    # ds are diagonal entries, zs are first columns (note, ds[0] is 0.0 by definition)
    ROU = np.diag(ds) + np.outer(zs, zs)
    #tm[:,0] = zs
    zeros_act = nla.eig(ROU)[0]
    plt.plot(zeros_act, np.zeros_like(zeros_act), 'r.')

    #scaling
    plt.ylim(-10,10), plt.xlim(-1, 10);

A quick note: the secular equation as written above is most directly used to compute eigenvalues, not singular values. There is a strong relationship between the two sets of values and they lead to only slightly different forms of the secular equation. I mention this because you might see slightly different forms of the secular equation depending on whether you are reading about eigenvalues or singular values. We will use the form above for solving both problems by slightly modifying the resulting \(\lambda\)s.

Back to our regularly scheduled program. We will find our desired roots by isolating \(f(\lambda)\) between each pair of adjacent poles. The poles occur at the values of \(d_i\). So, on the interval \((d_i, d_{i+1})\), the problem simplifies to finding a single root of the secular equation. Throughout this post, we’ll assume that \(d_i < d_j\) for \(i<j\) (in English, the \(d_i\)s are distinct and sorted). We’ll find a single root \(n\) times and find all \(n\) roots. The cleverest will now point out that \(n\) poles only hold \(n-1\) zeros between them. The last zero is to the right of \(d_n\).
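To make the "one root per interval" picture concrete, here is a minimal sketch that brackets each root and pins it down with plain bisection. Bisection is a stand-in chosen for simplicity; it is not the method this post builds toward. The helper names `secular` and `bisect_root` are made up for illustration, and the right-hand end of the last bracket, \(d_n + \sum_i z_i^2\), is an assumption I am adding (it is an upper bound on the largest root when \(\rho = 1\)).

```python
def secular(lam, ds, zs):
    """Evaluate f(lam) = 1 + sum_i z_i^2 / (d_i - lam)."""
    return 1.0 + sum(z * z / (d - lam) for d, z in zip(ds, zs))


def bisect_root(f, lo, hi, iters=200):
    """Bisection for an increasing f that is negative near lo, positive near hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:  # bracket has shrunk to adjacent floats
            break
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


ds = [0.0, 3.0, 5.0]   # poles: sorted and distinct (same values as the plot)
zs = [0.6, 1.2, 1.8]

# n-1 brackets between adjacent poles, plus one to the right of d_n.
brackets = list(zip(ds[:-1], ds[1:]))
brackets.append((ds[-1], ds[-1] + sum(z * z for z in zs)))

roots = [bisect_root(lambda lam: secular(lam, ds, zs), lo, hi)
         for lo, hi in brackets]
print(roots)
```

A handy sanity check: these roots are the eigenvalues of `np.diag(ds) + np.outer(zs, zs)`, so they should sum to the trace of that matrix, which is the sum of the \(d_i\) plus the sum of the \(z_i^2\).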

How do we find the single roots?

Newton’s Method

Let’s take a second and review Newton’s method for finding a root of a function \(f(x)\) near \(x_0\).

1. Approximate \(f(x)\) by a linear function \(l(x) = ax+b\).
2. Apply constraints such that \(f(x_0) = l(x_0)\) and \(f'(x_0) = l'(x_0)\). These determine the coefficients of \(l(x)\).
3. Find the root of \(l(x)\) and call it \(x_1\). In this case, the root is the \(x\)-intercept of \(l(x)\).
4. That root of \(l(x)\) is a better guess as to the root of \(f(x)\).
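The steps above can be sketched in a few lines. This is a generic illustration, not code from the SVD project: the function name `newton`, the stopping rule, and the test problem \(x^2 - 2 = 0\) are all my own choices. Solving \(l(x) = 0\) with the two constraints gives the familiar update \(x_1 = x_0 - f(x_0)/f'(x_0)\).

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Repeat the linear-approximation step until the iterates stop moving."""
    x = x0
    for _ in range(max_iter):
        # l(t) = f(x) + f'(x) * (t - x); its root is x - f(x) / f'(x)
        x_new = x - f(x) / fprime(x)
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    return x


# Example: the positive root of x^2 - 2, starting from x0 = 1
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)
```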

Newton’s method is a great technique and it is used broadly because it is conceptually and formally simple, easy to implement, and the iteration steps are reasonably fast. However, in the case of the secular equation, as the numerators get very small, the corner in the graph will go from a gentle bend to a sharp corner. In other words, the graph will go from vertical to horizontal very quickly. When this happens, a linear approximation on the nearly horizontal part (which is the majority of the pole-to-pole interval), will also be approximately horizontal and be aimed far away from the true root in this interval.
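Here is a small sketch of that failure, under assumptions I picked to exaggerate the effect: two poles at 0 and 10, a tiny first numerator (\(z_1 = 10^{-3}\)), and a starting guess at the midpoint of the interval. The names `f` and `fprime` are just for this illustration.

```python
def f(lam, ds, zs):
    """Secular function: 1 + sum_i z_i^2 / (d_i - lam)."""
    return 1.0 + sum(z * z / (d - lam) for d, z in zip(ds, zs))


def fprime(lam, ds, zs):
    """Its derivative: sum_i z_i^2 / (d_i - lam)^2."""
    return sum(z * z / (d - lam) ** 2 for d, z in zip(ds, zs))


ds = [0.0, 10.0]
zs = [1e-3, 1.0]   # tiny first numerator -> sharp corner next to the pole at 0

# One Newton step from the middle of the interval (0, 10)
x0 = 5.0
x1 = x0 - f(x0, ds, zs) / fprime(x0, ds, zs)
print(x1)  # lands near -25, far outside (0, 10)

# The root we actually want hugs the left pole:
print(f(1e-7, ds, zs), f(1e-5, ds, zs))  # sign change between 1e-7 and 1e-5
```

The tangent line at the midpoint is shallow, so its \(x\)-intercept lands on the wrong side of the pole at 0, while the true root sits within \(10^{-5}\) of that pole.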

A Modified Newton’s Method

So, we need to try something else. Fortunately, we can maintain the outline of Newton’s method while using a different approximating function. Instead of using a linear form, we will use the following rational function of \(\lambda\) which has poles at \(d_i\) and \(d_{i+1}\) :

\[h(\lambda) = \frac{C_1}{d_i-\lambda} + \frac{C_2}{d_{i+1}-\lambda} + C_3\]

So, we are approximating \(f(\lambda)\) with \(h(\lambda)\) :

\[f(\lambda) = 1 + \sum_{i=1}^n \frac{z_{i}^2}{d_i - \lambda} \approx
\frac{C_1}{d_i-\lambda} + \frac{C_2}{d_{i+1}-\lambda} + C_3 = h(\lambda)\]
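One reason this form is convenient (a side note I am adding, using only the definition of \(h\) above): finding the root of \(h\) takes no iteration at all. Setting \(h(\lambda) = 0\) and multiplying through by \((d_i-\lambda)(d_{i+1}-\lambda)\) gives

\[C_1 (d_{i+1}-\lambda) + C_2 (d_i-\lambda) + C_3 (d_i-\lambda)(d_{i+1}-\lambda) = 0\]

which, collected in powers of \(\lambda\), is the quadratic

\[C_3 \lambda^2 - \left(C_1 + C_2 + C_3 (d_i + d_{i+1})\right) \lambda + \left(C_1 d_{i+1} + C_2 d_i + C_3 d_i d_{i+1}\right) = 0\]

So the "find the root of the approximation" step of Newton’s outline stays cheap: it is a closed-form quadratic solve instead of a linear one.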

Also, since we want to avoid numerical problems with term cancellation, we will break the sum in \(f(\lambda)\) into two parts: (1) the sum up to term \(k\) and (2) the sum from term \(k+1\) on. Along with the ordering assumption on the \(d_i\), this means that the lower terms are all negative and the upper terms are all positive for \(\lambda \in (d_k, d_{k+1})\). We will also give names to those partial sums.

\[
\begin{eqnarray}
f(\lambda) &=& 1 + \sum_{i=1}^k \frac{z_{i}^2}{d_i - \lambda} + \sum_{i=k+1}^n \frac{z_{i}^2}{d_i - \lambda} = 1 + \Psi_1(\lambda) + \Psi_2(\lambda)\\
f'(\lambda) &=& \sum_{i=1}^k \frac{z_i^2}{(d_i - \lambda)^2} + \sum_{i=k+1}^n \frac{z_i^2}{(d_i - \lambda)^2} = \Psi'_1(\lambda) + \Psi'_2(\lambda)
\end{eqnarray}
\]

Note that the derivative (and the derivatives of the partial sums) is strictly positive except at \(\lambda = d_i\) (the poles of \(f\) ).
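A minimal sketch of the split, to check the sign claims numerically. The function name `psi_parts` and the sample point \(\lambda = 1.5\) are my own choices; `k` is 1-based to match the math, so `k=1` means \(\lambda \in (d_1, d_2)\).

```python
def psi_parts(lam, ds, zs, k):
    """Return (psi1, psi2, dpsi1, dpsi2) for lam in (d_k, d_{k+1}), k 1-based."""
    lower = list(zip(ds[:k], zs[:k]))   # terms 1..k:   d_i < lam
    upper = list(zip(ds[k:], zs[k:]))   # terms k+1..n: d_i > lam
    psi1 = sum(z * z / (d - lam) for d, z in lower)
    psi2 = sum(z * z / (d - lam) for d, z in upper)
    dpsi1 = sum(z * z / (d - lam) ** 2 for d, z in lower)
    dpsi2 = sum(z * z / (d - lam) ** 2 for d, z in upper)
    return psi1, psi2, dpsi1, dpsi2


ds = [0.0, 3.0, 5.0]
zs = [0.6, 1.2, 1.8]

psi1, psi2, dpsi1, dpsi2 = psi_parts(1.5, ds, zs, k=1)
print(psi1, psi2)     # lower sum is negative, upper sum is positive
print(dpsi1, dpsi2)   # both derivative pieces are positive
print(1.0 + psi1 + psi2)  # f(1.5)
```

Keeping the negative and positive pieces in separate accumulators means each sum only ever adds terms of one sign; the one subtraction that can cancel happens once, at the very end.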

Since we have broken \(f\) into pieces, we will also break \(h\) into pieces:

\[h(\lambda)=1 + h_1(\lambda) + h_2(\lambda)\]

and we will give them each the same form, which is similar to that of \(h\) :

\[h_1(\lambda) = \hat{c}_1 + \frac{C_1}{d_k - \lambda} \quad
h_2(\lambda) = \hat{c}_2 + \frac{C_2}{d_{k+1} - \lambda}\]

We will also enforce the "Newton conditions" that both the value of the approximations \(h_i\)
