Why pseudo inverse


I mean, like, you know, you have formulas for surface area, and other awful things and, you know, they do their best in calculus, but it's not elegant. And, linear algebra just is -- well, you know, linear algebra is about the nice part of calculus, where everything's, like, flat, and the formulas come out right.

And you can go into high dimensions where, in calculus, you're trying to visualize these things, well, two or three dimensions is kind of the limit. But here, we don't -- you know, I've stopped doing two-by-twos, I'm just talking about the general case.

OK, now I really will speak about the general case here. What could be the inverse -- what's a kind of reasonable inverse for a matrix for the completely general matrix where there's a rank r, but it's smaller than n, so there's some null space left, and it's smaller than m, so a transpose has some null space, and it's those null spaces that are screwing up inverses, right?

Because if a matrix takes a vector to zero, well, there's no way an inverse can, like, bring it back to life. My topic is now the pseudo-inverse, and let's just by a picture, see what's the best inverse we could have? So, here's a vector x in the row space. I multiply by A. Now, the one thing everybody knows is you take a vector, you multiply by A, and you get an output, and where is that output?

Always in the column space, right? Ax is a combination of the columns. So Ax is somewhere here. So I could take all the vectors in the row space. I could multiply them all by A. I would get a bunch of vectors in the column space and what I think is, I'd get all the vectors in the column space just right.

I think that this connection between an x in the row space and an Ax in the column space, this is one-to-one. We've got a chance, because they have the same dimension. That's an r-dimensional space, and that's an r-dimensional space. And somehow, the matrix A -- it's got these null spaces hanging around, where it's knocking vectors to zero.

And then it's got all the vectors in between. Almost all vectors have a row space component and a null space component.

And it's killing the null space component. But if I look at the vectors that are in the row space, with no null space component, just in the row space, then they all go into the column space, so if I put another vector, let's say, y, in the row space, I claim that wherever Ay is, it won't hit Ax. Do you see what I'm saying? So here's what I said. If x and y are in the row space, then Ax is not the same as Ay.

They're both in the column space, of course, but they're different. That would be a perfect question on a final exam, because that's what I'm teaching you in that material of chapter three and chapter four, especially chapter three. If x and y are in the row space, then Ax is different from Ay. So what this means -- and we'll see why -- is that, in words, from the row space to the column space, A is perfect, it's an invertible matrix.

If we, like, limited it to those spaces. And then, its inverse will be what I'll call the pseudo-inverse. So that's what the pseudo-inverse is. It's the inverse -- so A goes this way, from x to y -- sorry, x to Ax, from y to Ay, that's A, going that way. Then in the other direction, anything in the column space comes from somebody in the row space, and the reverse there is what I'll call the pseudo-inverse, and the accepted notation is A plus.

So y will be A plus x. No, y will be A plus times whatever it started with, A y. Do you see my picture there? Same, of course, for x and A x. This way, A does it, the other way is the pseudo-inverse, and the pseudo-inverse just kills this stuff, and the matrix just kills this.
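[A quick numerical sketch of that picture, assuming NumPy; the rank-2 matrix below is just an illustrative choice, not one from the lecture. The pseudo-inverse undoes A on the row space and sends anything in the null space of A transpose to zero.]

```python
import numpy as np

# A 3x4 matrix of rank 2, so both null spaces are nonzero.
A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])
A_plus = np.linalg.pinv(A, rcond=1e-10)   # cut off tiny singular values

# x in the row space (a combination of the rows of A):
x = A.T @ np.array([1., -2., 3.])
print(np.allclose(A_plus @ (A @ x), x))   # True: A+ undoes A on the row space

# z in the null space of A^T (orthogonal to the column space):
U, s, Vt = np.linalg.svd(A)
z = U[:, -1]                              # left singular vector for sigma = 0
print(np.allclose(A_plus @ z, 0))         # True: A+ kills it
```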

So everything that's really serious here is going on in the row space and the column space, and now, tell me -- this is the fundamental fact, that between those two r-dimensional spaces, our matrix is perfect. Suppose they weren't. Why do I get into trouble?

Suppose -- so, proof. I haven't written down proof very much, but I'm going to use that word once. Suppose they were the same. Suppose these are supposed to be two different vectors. Maybe I'd better make the statement correctly. If x and y are different vectors in the row space -- maybe I'd better put if x is different from y, both in the row space -- so I'm starting with two different vectors in the row space, I'm multiplying by A -- so these guys are in the column space, everybody knows that, and the point is, they're different over there.

So, suppose they weren't. Suppose, well, that's the same as saying A(x - y) is zero. So, what do I know now about x - y, what do I know about this vector? Well, I can see right away, what space is it in? It's sitting in the null space, right? So it's in the null space. But what else do I know about it? Here it was x in the row space, y in the row space, what about x - y? It's also in the row space, right? Heck, that thing is a vector space, and if the vector space is anything at all, if x is in the row space, and y is in the row space, then the difference is also, so it's also in the row space.

Now I've got a vector x - y that's in the null space, and that's also in the row space, so what vector is it? It's the zero vector. So I would conclude from that that x - y had to be the zero vector -- x equals y -- so, in other words, if I start from two different vectors in the row space, I get two different vectors in the column space.

If these vectors are the same, then those vectors had to be the same. That's the algebra proof -- which we understand completely because we really understand these subspaces -- of what I said in words, that a matrix A is really a nice, invertible mapping from row space to column space.
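[A short numerical check of the fact the proof leans on -- that the row space and the null space meet only at the zero vector. Same illustrative matrix as above, NumPy assumed.]

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])

# A vector n in the null space of A (last row of V^T in the SVD).
U, s, Vt = np.linalg.svd(A)
n = Vt[-1]
print(np.allclose(A @ n, 0))                        # True: n is in N(A)

# Its row-space component is zero: projecting onto the row space kills it,
# which is exactly why the two subspaces meet only at the zero vector.
P_row = np.linalg.pinv(A, rcond=1e-10) @ A          # projection onto the row space
print(np.allclose(P_row @ n, 0))                    # True
```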

If the null spaces keep out of the way, then we have an inverse. And that inverse is called the pseudo-inverse, and it's very, very useful in applications. Statisticians discovered, oh boy, this is the thing that we needed all our lives, and here it finally showed up, the pseudo-inverse is the right thing. Why do statisticians need it?

Because statisticians are, like, least-squares-happy. I mean they're always doing least squares. And so this, linear regression, is central to their work. Statisticians who may watch this on video, please forgive that description of your interests. One of your interests is linear regression and this problem. But this problem is only OK provided we have full column rank. And statisticians have to worry all the time about, oh, God, maybe we just repeated an experiment.

You know, you're taking all these measurements, maybe you just repeat them a few times. You know, maybe they're not independent.

Well, in that case, that A transpose A matrix that they depend on becomes singular. So then that's when they needed the pseudo-inverse, it just arrived at the right moment, and it's the right quantity.
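[A hedged sketch of that situation in NumPy, with made-up numbers: when a column of the data matrix is repeated, A transpose A is singular and the normal equations can't be solved with an ordinary inverse, but the pseudo-inverse still produces a least-squares solution -- the minimum-norm one, with no null space component.]

```python
import numpy as np

# A design matrix whose third column repeats the second:
# the columns are dependent, so A^T A is singular.
A = np.array([[1., 1., 1.],
              [1., 2., 2.],
              [1., 3., 3.],
              [1., 4., 4.]])
b = np.array([1., 2., 2., 5.])

AtA = A.T @ A
print(np.linalg.matrix_rank(AtA))        # 2, not 3: A^T A is singular

# np.linalg.inv(AtA) would fail or be numerically meaningless, but the
# pseudo-inverse still gives a least-squares solution.
x_hat = np.linalg.pinv(A, rcond=1e-10) @ b
print(x_hat)
print(np.allclose(A.T @ (b - A @ x_hat), 0))   # residual is orthogonal to the columns
```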

So now that you know what the pseudo-inverse should do, let me see what it is. Can we find it? So this is my -- to complete the lecture is -- how do I find this pseudo-inverse A plus? Well, here's one way. Everything I do today is to try to review stuff.

One way would be to start from the SVD. The Singular Value Decomposition. And you remember that that factored A into an orthogonal matrix times this diagonal matrix times this orthogonal matrix. But what did that diagonal guy look like? This diagonal guy, sigma, has some non-zeroes, and you remember, they came from A transpose A, and A A transpose, these are the good guys, and then some more zeroes, and all zeroes there, and all zeroes there.

So you can guess what the pseudo-inverse is, I just invert stuff that's nice to invert -- well, what's the pseudo-inverse of this? That's what the problem comes down to. What's the pseudo-inverse of this beautiful diagonal matrix?

But it's got a null space, right? What's the rank of this matrix? What's the rank of this diagonal matrix? It's got r non-zeroes, and then it's otherwise zero. So it's got n columns, it's got m rows, and it's got rank r. It's the best example, the simplest example we could ever have of our general setup. So what's the pseudo-inverse? What's the matrix -- so I'll erase our columns, because right below it, I want to write the pseudo-inverse. OK, you can make a pretty darn good guess.

If it was a proper diagonal matrix, invertible, if there weren't any zeroes down here, if it was sigma one to sigma n, then everybody knows what the inverse would be, the inverse would be one over sigma one, down to one over sigma n -- but of course, I'll have to stop at one over sigma r. And the rest will be zeroes again, of course. And now this one was m by n, and this one is meant to have a slightly different, you know, transpose shape, n by m.
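[Here is one way to write that recipe down in NumPy, as a sketch with the same illustrative matrix: invert the r nonzero singular values, flip the shape from m by n to n by m, and reassemble A plus as V Sigma-plus U transpose. It should agree with np.linalg.pinv.]

```python
import numpy as np

# The same illustrative matrix: m = 3, n = 4, rank r = 2.
A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])
m, n = A.shape
U, s, Vt = np.linalg.svd(A)              # A = U @ Sigma @ Vt
r = np.sum(s > 1e-10)                    # numerical rank

# Sigma is m x n with sigma_1..sigma_r on the diagonal; Sigma+ is n x m
# with 1/sigma_1..1/sigma_r on the diagonal and zeroes everywhere else.
Sigma_plus = np.zeros((n, m))
Sigma_plus[:r, :r] = np.diag(1.0 / s[:r])

A_plus = Vt.T @ Sigma_plus @ U.T         # A+ = V Sigma+ U^T
print(np.allclose(A_plus, np.linalg.pinv(A, rcond=1e-10)))   # True
```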

They both have that rank r. My idea is that the pseudo-inverse is the best -- is the closest I can come to an inverse. So what is sigma times its pseudo-inverse? Can you multiply sigma by its pseudo-inverse? Multiply that by that? What matrix do you get?

We're going to get ones, r ones, and all the rest, zeroes. And the shape of that, this whole matrix, will be m by m -- square. And suppose I did it in the other order. Suppose I did sigma plus sigma. Why don't I do it right underneath? See, this matrix hasn't got a left-inverse, it hasn't got a right-inverse, but every matrix has got a pseudo-inverse.

If I do it in the order sigma plus sigma, what do I get? Square matrix again -- this is n by m, this is m by n, so my result is going to be n by n -- and what is it? Those are diagonal matrices, so it's going to be r ones, and then zeroes. It's not the same as that one, it's a different size -- but it's a projection. One is a projection matrix onto the column space, and this one is the projection matrix onto the row space.

That's the best that pseudo-inverse can do. So what the pseudo-inverse does is, if you multiply on the left, you don't get the identity, if you multiply on the right, you don't get the identity, what you get is the projection. It brings you into the two good spaces, the row space and column space.
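[And a last sketch of that statement, again with NumPy and the same illustrative matrix: A A-plus and A-plus A are not identity matrices, they are the projections onto the column space and the row space.]

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])
A_plus = np.linalg.pinv(A, rcond=1e-10)

P_col = A @ A_plus      # m x m projection onto the column space C(A)
P_row = A_plus @ A      # n x n projection onto the row space C(A^T)

# Projections: P^2 = P and P = P^T, but neither one is the identity.
print(np.allclose(P_col @ P_col, P_col), np.allclose(P_col, P_col.T))
print(np.allclose(P_row @ P_row, P_row), np.allclose(P_row, P_row.T))

# P_col leaves column-space vectors alone; P_row leaves row-space vectors alone.
print(np.allclose(P_col @ A, A))          # projecting the columns changes nothing
print(np.allclose(A @ P_row, A))          # projecting the rows changes nothing
```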


