Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP -- added placeholder file for the nearPD function #800

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ewalsh
Copy link

@ewalsh ewalsh commented Feb 1, 2021

Adding a function to find the nearest positive definite matrix via the Higham (2002) algorithm.

@ewalsh ewalsh marked this pull request as ready for review February 1, 2021 10:32
Copy link
Member

@dlwh dlwh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for being slow here, and thank you for submitting it. I have a bunch of comments about style, and I'd really like to see a few unit tests, since I really have no idea if this algorithm is correct.

* @author ewalsh
**/

class NearPD(matA: DenseMatrix[Double], corrBool: Boolean = false, keepDiag: Boolean = false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this needs to be a class? AFAICT you're only doing it to make it top level? Could you maybe put it in the linalg package object as a def? or make it an object with an apply method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I don't think there's a reason to make this a "UFunc" fwiw)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, thank you for your comments. I found all of them very helpful, and they are very much appreciated. It may take me a hot minute but I will address everything ASAP.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed or commented on each of these points. Please feel free to add any other comments, I will defer to you to revolve each based on your approval.

I have changed to an object with an apply method

* applied to the result of the Higham algorithm.
* @param doSym Boolean indicating if X <- (X + t(X))/2 should be done,
* after X <- tcrossprod(Qd, Q); some doubt if this is necessary.
* @param doDykstra Boolean indicating if Dykstra's correction should be used;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd prefer there be an algo: NearPD.Algorithm or something, where NearPD.Algorithm is a sealed trait or an enum

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added sealed traits for both the algorithm and choice of convergence norm type. This makes more sense and I hope is easier to read.

//--- start algorithm ---
def generate: DenseMatrix[Double] = {
// choose Dykstra or direct
val matX = chooseAlgo(symA, doDykstra)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this. how about inlining?

val matX = if(algo == Dystrka) doDystrkaFunc( ) else doDirect()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed style to inline in each opportunity that I found

}
}
// keepDiag function
def keepDiagFunc(mat: DenseMatrix[Double], bool: Boolean): DenseMatrix[Double] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as with chooseAlgo i'd rather you just inline the boolean gating of this method and this method only does the keepDiag part

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
// doSym function
def doSymFunc(mat: DenseMatrix[Double], bool: Boolean): DenseMatrix[Double] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

val matQ = fliplr(eigVecVals.eigenvectors)
val vecd = reverse(eigVecVals.eigenvalues)
val p = vecd.map(x => x > eigTol)
val pcheck = breeze.linalg.sum(p.map(x => bool2Dbl(x)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

val pcheck = any(vecd > eigTol)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incorporated directly

val pcheck = breeze.linalg.sum(p.map(x => bool2Dbl(x)))
if(pcheck < 1.0){
println("Matrix seems negative semi-definite")
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have to wrap the loop you're breaking out of in breakable in Scala. It's dumb

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

matX
}

def doDirectFixedPt(mat: DenseMatrix[Double]): DenseMatrix[Double] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function shares a ton of code with the one above. Is it possible to extra out more. I know I had you inline some stuff, but I think it would be cleaner to extract out the pcheck stuff, and the eigVecVals etc stuff, or possibly to make it so that the main loop takes a function as input that does the differences between the two algorithms, though I think that might be worse.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is the only one I haven't truly addressed. The rationale is this, it would be possible to extract out a lot of the code here. However, when running the function it would require running an if statement within the loop to check which algorithm to run.

Please advise. I'd be happy to change, but I just wanted to point out the original rationale. I think in large cases where lots of iterations are needed this repetitive checking could be slower than necessary.

// ## A longer example, extended from Jens' original,
// ## showing the effects of some of the options:

// val matA = DenseMatrix(Array(Array(1, 0.477, 0.644, 0.478, 0.651, 0.826),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be good to make this into a test, and maybe add some randomized tests about properties this function should have, e.g.: for any matrix you give it (that one might want to give it), the output is a PD matrix or it errors, and it doesn't error too much, or something like that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done and I also added another simple test

}

// bool to double
def bool2Dbl(bool: Boolean): Double = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think you actually need this here, but if you do, there's a function in breeze called I for indicator function that does exactly this, except it works on vectors

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed and removed

Copy link
Member

@dlwh dlwh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for being so slow, just a few more comments

converged = (conv < convTol)
}
if(!converged){
println(s"Did not converge in ${iter} iterations")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we throw here?

//--- start algorithm ---
def generate: DenseMatrix[Double] = {
// choose Dykstra or direct
@inline val matX = if(algo == Dykstra) { doDykstraFunc(symA) } else { doDirectFixedPt(symA) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does @inline do here?

val n = mat.cols - 1
val eps = posdTol * breeze.numerics.abs(vecd(0))
if(vecd(n) < eps){
vecd = vecd.map(x => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: vecd = max(x, eps)

}

// norm type function
def normTypeFunc(matX: DenseMatrix[Double], matY: DenseMatrix[Double], normType: ConvergenceNormType): Double = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i should really fix norm for matrices...

// break if negative semi-definite
def negSemiDefCheckBreak(pcheck: Boolean): Unit = {
if(!pcheck){
println("Matrix seems negative semi-definite")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants