Inaccurate porting of covariance vs naive method #82

JamesYang007 · 2024-01-08T13:26:51Z

Lines 288 to 293 in 813c06f

    
    if X.shape[1] > X.shape[0]:
        # the glmnet docs suggest using a different algorithm for the case
        # of p >> n
        algo_flag = 2
    else:
        algo_flag = 1

glmnet actually does a slightly different check than a plain "n" vs "p" comparison like this: it invokes method 1 (the covariance method) if p <= 500. The covariance method keeps track of a matrix of covariances C(i,j) for every feature i and every active feature j. Under the hood, C is allocated as a p x p matrix (even though it usually needs much less memory than that); this was done for simplicity, because it's very hard to write clever data structures in Fortran. So even when n >> p, if p is also large, the covariance method is not a viable default on most machines.
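To put rough numbers on that allocation, here is a back-of-the-envelope sketch. It assumes 8-byte double-precision entries, which is an assumption for illustration and not something checked against the Fortran source; `dense_cov_bytes` is a hypothetical helper name.

```python
def dense_cov_bytes(p, itemsize=8):
    """Bytes needed for a dense p x p matrix.

    itemsize=8 assumes double-precision entries (an assumption,
    not verified against glmnet's Fortran code).
    """
    return p * p * itemsize


# Memory grows quadratically in p and does not depend on n at all.
for p in (500, 10_000, 100_000):
    print(f"p = {p:>7,}: {dense_cov_bytes(p) / 1e9:.3f} GB")
```

Under that assumption, p = 500 costs about 2 MB, while p = 100,000 would need about 80 GB, regardless of how many rows X has.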

Anyway, I'd suggest changing it to:

if X.shape[1] <= 500:
    algo_flag = 1
else:
    algo_flag = 2
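For comparison, a small self-contained sketch of the two rules side by side. The function names and the `cov_max_features` parameter are hypothetical, introduced only for illustration; the flag values 1 (covariance) and 2 (naive) and the 500 threshold are the ones quoted above.

```python
def current_algo_flag(n_samples, n_features):
    """Rule as currently ported: compare p against n."""
    return 2 if n_features > n_samples else 1


def suggested_algo_flag(n_features, cov_max_features=500):
    """Suggested rule: covariance method (1) only when p <= 500."""
    return 1 if n_features <= cov_max_features else 2


# (n, p) shapes; the suggested rule ignores n entirely.
for n, p in [(100, 50), (100, 5_000), (1_000_000, 5_000)]:
    print((n, p), current_algo_flag(n, p), suggested_algo_flag(p))
```

The last shape is the problematic one: with n = 1,000,000 and p = 5,000 the current rule picks the covariance method, while the suggested rule falls back to the naive method.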
