
Comments (12)

ericchansen commented on June 17, 2024

Right now, it's A = UsV, modify s, reform A, use Ab in the solver.

Could you explain more clearly how it should work?


peonor commented on June 17, 2024

Yes, it should be J = UsV (J, not A, not a square matrix), modify s (invert all large values, set all small values to zero, call the new one s'), determine solutions from
x = VT . s' . (UT . b)
where b are the residual vectors, and x are the solutions. This should never go back to the solver; it's an alternative to the solver.
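For concreteness, here is a minimal NumPy sketch of that procedure (the variable names and the cutoff are illustrative, not from the q2mm code; note that numpy.linalg.svd returns VT, as confirmed further down the thread):

import numpy as np

# Toy shapes matching the example below: 20 data points, 2 parameters.
J = np.random.rand(20, 2)                          # Jacobian
b = np.random.rand(20)                             # residual vector

U, s, Vt = np.linalg.svd(J, full_matrices=False)   # J = U . diag(s) . VT
cutoff = 1e-8                                      # assumed threshold
s_inv = np.where(s > cutoff, 1.0 / s, 0.0)         # invert large, zero small
x = Vt.T.dot(s_inv * U.T.dot(b))                   # parameter changes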


peonor commented on June 17, 2024

Sorry, b is the residual vector (singular), and x is the solution, the vector of parameter modifications.


ericchansen commented on June 17, 2024

Good catch! Scary how these things slip through when no one else double checks me. >_<

Will have this changed today.


peonor commented on June 17, 2024

The beauty of it is that, since you don't invert small values but instead set the inverse equal to zero (counter-intuitive), you get rid of linear dependencies between parameters, and of all parameters with only a small influence on the objective function.
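A quick illustration of that effect (a sketch, not q2mm code): with two perfectly correlated parameters, one singular value collapses to zero, and zeroing its inverse keeps the step finite.

import numpy as np

# Column 2 is a multiple of column 1, so J has rank 1 and ordinary
# least squares has no unique solution.
col = np.random.rand(20, 1)
J = np.hstack([col, 2.0 * col])                    # [20, 2], rank 1
b = np.random.rand(20)

U, s, Vt = np.linalg.svd(J, full_matrices=False)
print(s)                                           # second value is ~0
s_inv = np.where(s > 1e-10, 1.0 / s, 0.0)          # "1/0 = 0", not inf
x = Vt.T.dot(s_inv * U.T.dot(b))                   # finite, well-behaved step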


ericchansen commented on June 17, 2024

Say we have 2 parameters and 20 data points. The Jacobian would be [20, 2] and the residual vector would be [20, 1].

With full SVD, J = USV means that U would be [20, 20], S would be [20, 2] and V would be [2, 2].

I think we would have a problem.

x = VT . s' . (UT . residual)
x = [2, 2] . [20, 2] . ( [20, 20] . [20, 1] )
x = [2, 2] . [20, 2] . [20, 1]

With reduced SVD, J = USV means that U would be [20, 2], S would be [2, 2] and V would be [2, 2].

x = VT . s' . (UT . residual)
x = [2, 2] . [2, 2] . ( [2, 20] . [20, 1] )
x = [2, 2] . [2, 2] . [2, 1]
x = [2, 2] . [2, 1]
x = [2, 1]
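The two conventions are easy to check directly (a quick sketch assuming NumPy's API; note that S comes back as a vector of singular values, not a matrix):

import numpy as np

J = np.zeros((20, 2))
U_full, s, Vt = np.linalg.svd(J)                      # full SVD
U_red, s, Vt = np.linalg.svd(J, full_matrices=False)  # reduced SVD
print(U_full.shape, U_red.shape, s.shape, Vt.shape)
# (20, 20) (20, 2) (2,) (2, 2)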

Right now (see commit aac1f13) I am working with the 2nd method because the matrices align. Is this what you intended?

Here's more information on NumPy SVD methods and API: https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.linalg.svd.html


peonor commented on June 17, 2024

Hi,

Perfect example, but some misunderstandings, especially about U: it will be a [20, 2] matrix! The link you gave is confusing, since it assumes you know SVD and tries to illustrate for several matrices at the same time.

SVD is in fact very similar to eigenvalue problems, where you split the matrix into eigenvectors and eigenvalues. The major addition is that you can use it on non-square matrices (both underdetermined and overdetermined equation systems). Can you get your hands on one of the books “Numerical Recipes in …”? There's a good chapter there; we have the one for “C”, but I think it's also available in a Python version, and the principles are the same.

OK, starting with J, [20, 2]. Performing SVD yields three matrices, U, S, and V, such that J = U.S.VT (note the transpose of V here; check the function, I believe it may actually deliver VT, not V).
U is orthonormal, [20, 2]; we could call this a set of two long eigenvectors.
S is a diagonal matrix, [2, 2]. The two diagonal elements correspond to eigenvalues. As an example here, let's say that the two diagonal elements are 0.1 and 5. From Python's SVD you only get a vector with the diagonal.
V is orthonormal and square, [2, 2], basically a set of two short eigenvectors. Warning: check whether you get VT or V, I'm not sure!

Reminder: orthonormal means that the transpose functions as the inverse, such that UT.U = [1] = VT.V, where [1] is the identity matrix, [2, 2], diagonals = 1, off-diagonals = 0.
Now we construct the inverse of S, let's call it S'. It's again diagonal, with elements 10 and 0.2.
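In NumPy terms, a minimal sketch with the example values (names are illustrative; NumPy sorts singular values in descending order):

import numpy as np

s = np.array([5.0, 0.1])                     # the example diagonal of S
S = np.diag(s)                               # [2, 2] diagonal matrix
S_prime = np.diag(1.0 / s)                   # plain inverse: diag(0.2, 10)

# Orthonormality check for the U delivered by a reduced SVD:
U, _, _ = np.linalg.svd(np.random.rand(20, 2), full_matrices=False)
print(np.allclose(U.T.dot(U), np.eye(2)))    # True: UT acts as the inverse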

If we call the residual B, [20, 1], and the solution vector X, [2, 1], we find the solution from X = V.S'.(UT.B), almost as you wrote (note that V is not transposed). This solution is uninteresting: it is the same as the least-squares solution.

With 2 parameters, we can only have one interesting SVD solution that is different from LSQ. We get it if we instead construct the modified matrix S” where low values have been dropped and high values have been inverted. Note that this is counter-intuitive: it corresponds to saying that 1/0 equals 0, not inf! So S” is again diagonal, but now the new diagonal values are 0 and 0.2.

We now form the true SVD solution X” = V.S”.(UT.B). Note that because we took away the inverse of the low value (which corresponds to a direction with low curvature), the step is much shorter, which is what we want!
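To see the step shortening numerically, here is a small sketch (the Jacobian is built to have exactly the example singular values 5 and 0.1; none of this is q2mm code):

import numpy as np

Q1, _ = np.linalg.qr(np.random.rand(20, 2))        # orthonormal [20, 2]
Q2, _ = np.linalg.qr(np.random.rand(2, 2))         # orthonormal [2, 2]
J = Q1.dot(np.diag([5.0, 0.1])).dot(Q2.T)          # singular values 5, 0.1
B = np.random.rand(20)

U, s, Vt = np.linalg.svd(J, full_matrices=False)
V = Vt.T
x_lsq = V.dot((1.0 / s) * U.T.dot(B))              # S' = diag(0.2, 10)
s2 = np.where(s > 1.0, 1.0 / s, 0.0)               # S'' = diag(0.2, 0)
x_svd = V.dot(s2 * U.T.dot(B))
print(np.linalg.norm(x_lsq), np.linalg.norm(x_svd))  # SVD step is much shorter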

Just for comparison, look at the LSQ solution written in terms of eigenvectors. Let's work on the matrix A = JT.J, that is, a [2, 2] matrix. I'll use notation similar to the above, but we do a diagonalization instead of an SVD, yielding:
A = V.S.VT (note that V is identical to the above, but S here is the square of the S in the SVD case). The LSQ solution is now simply X = V.S'.VT.(JT.B), where S' is again simply the inverted S, a diagonal matrix where all diagonal elements are the inverses of the corresponding elements in S. You see the similarity with the SVD solution? (For clarity, V.S'.VT is the inverse of A; diagonalization followed by this operation is one good way of inverting a matrix.) This also highlights the major problem with LSQ: if one element of S is very close to zero, the inverse becomes huge, and the corresponding step becomes huge. If it is zero, crash! (Conditioning basically means: use Lagrange…)
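The same comparison in code (a sketch; np.linalg.eigh and np.linalg.lstsq are standard NumPy, everything else is illustrative, and rcond=None assumes a recent NumPy):

import numpy as np

J = np.random.rand(20, 2)
B = np.random.rand(20)

A = J.T.dot(J)                                 # [2, 2], from squaring J
w, V = np.linalg.eigh(A)                       # A = V . diag(w) . VT
A_inv = V.dot(np.diag(1.0 / w)).dot(V.T)       # inverse via diagonalization
x = A_inv.dot(J.T.dot(B))                      # normal-equations LSQ step

# Agrees with the standard least-squares solver:
print(np.allclose(x, np.linalg.lstsq(J, B, rcond=None)[0]))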

And, while we're at it, I could just explain Lagrange and LM in the same terms. In Lagrange, you simply add a constant to every diagonal element of S (we don't do it that way, but that is the effect), meaning that all steps will be shorter, and we take away the singularities (division by zero) in another way. This obviously requires that all diagonal elements are non-negative, but since we got the matrix A by squaring J, all eigenvalues are guaranteed to be non-negative (the matrix is positive semidefinite). Finally, the LM method instead multiplies all diagonal elements of S by a constant (1+lambda), ensuring that all become higher. But that is not really a solution to the problem: if a diagonal element is exactly zero, the matrix inversion will still crash. In fact, we might want to skip LM; I don't think it's ever better than Lagrange, but it is more published…
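A tiny numerical illustration of the difference in the eigenvalue picture (the values are made up):

import numpy as np

w = np.array([2.0, 0.5, 0.0])   # eigenvalues of A = JT.J, one exactly zero
lam = 0.1

print(w + lam)                  # Lagrange: [2.1, 0.6, 0.1], the zero is lifted
print(w * (1.0 + lam))          # LM: [2.2, 0.55, 0.0], still singular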

Added note: I think we could skip LM (for reasons above) and LSQ (it’s just Lagrange with a very low constant for conditioning).

A final note (not relevant for parameterization): if you do a similar optimization in a case where the eigenvalues can be below zero, you can do something similar by shifting all eigenvalues, basically the same as Lagrange. This is called rational function optimization (RFO). It is what you use if you want to find a transition state (TS): you shift all eigenvalues so that one is negative and all are significantly different from zero. Not relevant for our problem, where we always want a minimization.

/Per-Ola


ericchansen commented on June 17, 2024

Just for future reference, it does appear that numpy.linalg.svd gives the transpose of V (using the above notation).
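This is quick to confirm (a sketch; the reconstruction only works if the third return value is already VT):

import numpy as np

J = np.random.rand(20, 2)
U, s, Vt = np.linalg.svd(J, full_matrices=False)
print(np.allclose(J, U.dot(np.diag(s)).dot(Vt)))   # True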


ericchansen commented on June 17, 2024

A few more questions. In commit cbd1cc5, the changes for SVD without thresholds seem to be working well. Here's what I do. Is it functioning as you intended?

mu, vs, mvt = np.linalg.svd(jacob, full_matrices=False)  # reduced SVD so the shapes align
ms = np.diag(vs)                                          # singular values as a diagonal matrix

This gives us J = U . S . VT, and I can confirm this works.

reform = mu.dot(np.diag(vs).dot(mvt))  # rebuild J = U . S . VT
np.allclose(jacob, reform)             # True if the factorization is exact

That returns True, which means they're the same.

That's the same for SVD with and without thresholds. For SVD without thresholds, I go on to do this.

NumPy returns the vector vs such that larger values come first, and I do this to form the inverse.

ms = np.diag(vs)
msi = np.linalg.inv(ms)

Now, in the inverted matrix, the smaller values come first (diagonal elements increase from index [0, 0] to [n, n], where n is the length of vector vs).

I go through zeroing the values that are the largest, one at a time.

for i in range(0, len(vs) - 1):                    # leave at least one element
    msi[-(i + 1), -(i + 1)] = 0.                   # zero the largest inverse first
    changes = mvt.T.dot(msi.dot(mu.T.dot(resid)))  # x = V . S' . (UT . resid)

This seems to be generating fairly good trial parameter sets, but I would like someone to double-check my method. For further clarification, here is how msi evolves in a relatively small trial system (120 charge data points and 4 parameters).

------------------------------ ZEROED 0 ELEMENTS ------------------------------
>>> msi:
[[  6.20209407e-01   0.00000000e+00   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   7.95221576e-01   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   1.82438431e+00   0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00   1.24827979e+16]]
>>> changes:
[ -1.49218287e+16   1.63099821e+16   2.54761516e+00  -1.01361975e+16]
------------------------------ ZEROED 1 ELEMENTS ------------------------------
>>> msi:
[[ 0.62020941  0.          0.          0.        ]
 [ 0.          0.79522158  0.          0.        ]
 [ 0.          0.          1.82438431  0.        ]
 [ 0.          0.          0.          0.        ]]
>>> changes:
[ 2.98237556  2.05417715  1.1976335  -1.08511153]
------------------------------ ZEROED 2 ELEMENTS ------------------------------
>>> msi:
[[ 0.62020941  0.          0.          0.        ]
 [ 0.          0.79522158  0.          0.        ]
 [ 0.          0.          0.          0.        ]
 [ 0.          0.          0.          0.        ]]
>>> changes:
[ 2.86364141  1.94166881  1.55287877 -1.09135434]
------------------------------ ZEROED 3 ELEMENTS ------------------------------
>>> msi:
[[ 0.62020941  0.          0.          0.        ]
 [ 0.          0.          0.          0.        ]
 [ 0.          0.          0.          0.        ]
 [ 0.          0.          0.          0.        ]]
>>> changes:
[ 0.52200671 -0.37786601  0.03060935 -1.37648095]

I'm not sure exactly how you want the threshold version of SVD to function.

Imagine you have these thresholds, which are then sorted in reverse.

self.svd_factors = [0.001, 0.01, 0.1, 1.]
self.svd_factors.sort(reverse=True)

This gives self.svd_factors = [1., 0.1, 0.01, 0.001].

Here, msi evolves like so.

---------------------------------- NO FACTORS ---------------------------------
>>> msi:
[[  6.20209407e-01   0.00000000e+00   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   7.95221576e-01   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   1.82438431e+00   0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00   1.24827979e+16]]
>>> changes:
[ -1.49218287e+16   1.63099821e+16   2.54761516e+00  -1.01361975e+16]
--------------------------------- FACTOR: 1.0 ---------------------------------
>>> msi:
[[ 0.62020941  0.          0.          0.        ]
 [ 0.          0.79522158  0.          0.        ]
 [ 0.          0.          0.          0.        ]
 [ 0.          0.          0.          0.        ]]
>>> old_msi:
[[  6.20209407e-01   0.00000000e+00   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   7.95221576e-01   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   1.82438431e+00   0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00   1.24827979e+16]]
>>> changes:
[ 2.86364141  1.94166881  1.55287877 -1.09135434]
--------------------------------- FACTOR: 0.1 ---------------------------------
>>> msi:
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]
>>> old_msi:
[[ 0.62020941  0.          0.          0.        ]
 [ 0.          0.79522158  0.          0.        ]
 [ 0.          0.          0.          0.        ]
 [ 0.          0.          0.          0.        ]]
  -- Vector is all zeros. Breaking.

Any suggestions?


peonor commented on June 17, 2024

Seems to be working now. A comment on the inversion in SVD: you convert the vector of diagonal elements to a diagonal matrix and invert that. Costly, unnecessary, and somewhat dangerous. I would manually invert each element in the vector, unless the value is close to or equal to zero, in which case I'd replace it by zero, and then convert the resulting vector to a diagonal matrix. Cheaper, and much safer if the elements can go to exactly zero (which they will if we include a parameter that doesn't affect the data).
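A sketch of that elementwise version (the values are back-calculated from the msi printout above; tol is illustrative):

import numpy as np

vs = np.array([1.61, 1.26, 0.55, 8.0e-17])   # singular values, large to small
tol = 1e-8                                   # implied zero threshold
vs_inv = np.where(vs > tol, 1.0 / vs, 0.0)   # invert elementwise, zero the tiny one
msi = np.diag(vs_inv)                        # only now form the diagonal matrix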

I also think the thresholds become more obvious if you think of them as being applied before the inversion, and there should always be an implied additional threshold of 0 (since zero values should never be inverted). And then obviously only apply a threshold if it makes a difference relative to previously applied thresholds. For the technique without explicit thresholds, I want it done exactly as you did first: throw out the values one at a time. If you look at those four cases, you can also see that a cutoff would work well; throw away the first solution and only retain the three that give reasonable radii.

We could talk much more about ways of selecting smart thresholds; I don't know the final answer. If we use the one-at-a-time technique, we'll get as many solutions as we have parameters. Better to group them somehow. One possibility I thought of: start using a very low threshold (maybe 10^-8 or something). Create that solution. Look for the smallest remaining singular value. Double it and use that as the new threshold. Repeat until you reach some limit (like "1"; higher thresholds could never be relevant). Looking at your example, and back-calculating what your vs must have been (ca. [1.61, 1.26, 0.55, 8E-17], right?), this technique would have generated the solutions with one and two zeroed values, which look very reasonable to me.
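One reading of that scheme in code (a sketch only; the exact boundary handling around the cap is a guess):

import numpy as np

def svd_thresholds(vs, start=1e-8, cap=1.0):
    # One threshold per pass: zero everything below the threshold, then
    # set the next one by doubling the smallest surviving singular value.
    thresholds = [start]
    while True:
        survivors = vs[vs > thresholds[-1]]
        if survivors.size <= 1 or thresholds[-1] >= cap:
            break
        thresholds.append(2.0 * survivors.min())
    return thresholds

vs = np.array([1.61, 1.26, 0.55, 8.0e-17])
print(svd_thresholds(vs))   # [1e-08, 1.1]: zeroes one value, then two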


ericchansen commented on June 17, 2024

Good point on the excessive method for diagonal matrix inversion.

There's already a check such that thresholds that don't make a difference are excluded. If we're talking about solutions that are formally different but identical to within numerical rounding error, that is being addressed in Q2MM/issues/13.

Sounds like the method without explicit thresholds is done and done then.

We need to limit the discussion for the sake of addressing this issue. It's going to get confusing if we try to tackle selecting smart thresholds in here. That can be opened as a separate issue.

I am going to make the changes to the inversion, and suggest that this issue be closed once those changes are merged.


ericchansen commented on June 17, 2024

I think branch iss28 addresses all the concerns raised in this issue #28, and I suggest that someone review it and merge it into master.

