sklearn StandardScaler vs dask StandardScaler. #979
Comments
I *think* that floating point inaccuracies are just a fact of life when you’re doing things in chunks, at least with the algorithms that dask.array uses today. I don’t think there’s anything we can do in dask-ml to address that (but maybe check the source to be sure).
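To illustrate the point about chunking, here is a minimal NumPy-only sketch. It is not the reduction that dask.array actually runs; the data, the chunk count, and the helper names `chunk_stats` and `combine` are assumptions made up for this example. It computes per-chunk statistics and merges them with the standard pairwise update for count, mean, and sum of squared deviations, then compares the result with a single-pass `x.var()`.

```python
from functools import reduce

import numpy as np

# Hypothetical data standing in for a skewed count column.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=1.0, sigma=2.0, size=1_000_000)


def chunk_stats(c):
    """Return (count, mean, sum of squared deviations) for one chunk."""
    m = c.mean()
    return c.size, m, ((c - m) ** 2).sum()


def combine(a, b):
    """Merge the statistics of two chunks (pairwise update formula)."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


# Single pass over the whole array (population variance, ddof=0).
var_full = x.var()

# Same quantity computed chunk by chunk, then merged.
n, mean_chunked, m2 = reduce(combine, (chunk_stats(c) for c in np.array_split(x, 16)))
var_chunked = m2 / n

# Any gap here comes only from the order of floating point operations.
print(var_full, var_chunked, abs(var_full - var_chunked))
```

Both results are mathematically the same quantity. The operations are just performed in a different order, so the trailing digits may or may not agree depending on the data and the chunking.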
On Dec 1, 2023, at 5:35 AM, Arunesh Singh wrote:
I am getting different results from sklearn StandardScaler and dask StandardScaler.
scaler_sk = sklearn.preprocessing.StandardScaler()
scaler_d = dask_ml.preprocessing.StandardScaler()
scaler_sk.fit(df_pd[["SUMMESSAGECOUNT"]])
scaler_d.fit(df_dask[["SUMMESSAGECOUNT"]])
Dask scaler
scaler_d.mean_[0], scaler_d.var_[0]
output: (19.157653421114507, 47431.17794342375)
Sklearn Scaler
scaler_sk.mean_[0], scaler_sk.var_[0]
output: (19.157653421114507, 47431.17794342373)
I know the difference is negligible, but it is influencing my model training with Prophet. Could you please suggest a way to make them identical without using compute()?
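For a sense of scale, the two variances reported above can be compared in units of float64 spacing. This is only a sketch using the numbers quoted in this issue; the variable names and the tolerance `rtol=1e-12` are arbitrary choices for illustration, not anything dask-ml or sklearn prescribes.

```python
import numpy as np

var_dask = 47431.17794342375     # scaler_d.var_[0] as reported above
var_sklearn = 47431.17794342373  # scaler_sk.var_[0] as reported above

abs_diff = abs(var_dask - var_sklearn)
ulp = np.spacing(var_sklearn)  # gap to the next representable float64

# How many representable steps apart the two values are.
print(abs_diff / ulp)

# A tolerance-based check treats them as equal; exact equality does not.
same_within_tol = np.isclose(var_dask, var_sklearn, rtol=1e-12, atol=0.0)
print(same_within_tol, var_dask == var_sklearn)
```

The values are only a few representable steps apart, so a tolerance-based comparison is usually the appropriate check when comparing the two scalers.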