Sigma clipping is painfully slow #624

mwcraig · 2018-06-04T04:13:03Z

I don't know what the underlying performance issue is, but wow it is slow (clipping a stack of 25 4k x 4k images takes minutes).

mwcraig · 2018-06-04T04:24:48Z

About 4.5 minutes, but it feels longer..

MSeifert04 · 2018-06-04T06:22:40Z

I also noticed that it's very slow. Thanks for bringing this up (I always forgot).

Could you provide a random-dataset for profiling? I could be interesting depending on the data-types and the strides of the dataset (the actual data is probably not important except it contains lots of unusual values like NaNs and so forth.

Also because we're re-using astropys sigma-clipping it would be handy to know what astropy version you were using.

One additional thing that could be a problem is RAM. Depending on the RAM and size of the arrays there may be lots of swapping going on.

Sorry for asking so much information from you but without having the specifics it's probably hard to profile it (correctly).

bsipocz · 2018-06-04T12:01:06Z

This really brings up the issue that we need to add more benchmarks to astropy-benchmarks, currently there is nothing for the stats package. While the 4.5mins long test is not feasible, if it's not a local RAM issue I suspect it can be scaled back to be useful but still slow?

mwcraig · 2018-06-04T14:45:33Z

Sure; I can also provide the data set I was working with if you want. It is just a bunch of dark frames. It is conceivable the bottle neck is in the combining, not the clipping, I suppose.

SaOgaz · 2018-06-04T15:31:27Z

Probably related, @larrybradley is working on speeding up the astropy one: astropy/astropy#7478

mwcraig · 2018-06-04T15:41:09Z

Very related, thanks for mentioning that, @SaOgaz. And, of course, thanks to @larrybradley for working on it!

mwcraig · 2018-06-10T18:56:11Z

I haven't had a time to do much profiling, but here is an example of a set of calls that takes several minutes to run (for the combine part). Adding in some outlying pixels seems to help drive up the run time ( about 30 secs faster without that).

There are a number of things here that can lead to a speedup, I think. Even without sigma clipping, it takes close to a minute to combine the images.

Away from keyboard for a week, but will take another look when I'm back...

import numpy as np
from astropy.io import fits
from ccdproc import combine
from astropy.stats import median_absolute_deviation

size = [4096, 4109]
n_images = 25

base_name = 'test-combine-{num:03d}.fits'

for num in range(n_images):
    data = np.random.normal(size=size)
    # Now add some outlying pixels so there is something to clip
    n_bad = 50000
    bad_x = np.random.randint(0, high=size[0] - 1, size=n_bad)
    bad_y = np.random.randint(0, high=size[1] - 1, size=n_bad)
    data[bad_x, bad_y] = np.random.choice([-1, 1], size=n_bad) * (10 + np.random.rand(n_bad))
    hdu = fits.PrimaryHDU(data=data)
    hdu.header['for_prof'] = 'yes'
    hdu.header['bunit'] = 'adu'
    hdu.writeto(base_name.format(num=num), overwrite=True)

ic = ImageFileCollection('.', '*')
fils = ic.files_filtered(for_prof='yes')
combine(fils, output_file='foo12.fit', sigma_clip=True, sigma_clip_low_thresh=5, sigma_clip_high_thresh=5, sigma_clip_func=np.ma.median, sigma_clip_dev_func=median_absolute_deviation)

Edit 2018-06-23: move misplaced parentheses in hdu.writeto

larrybradley · 2018-06-15T17:48:22Z

@mwcraig For your sigma-clipping stdfunc (sigma_clip_dev_func) you should probably use mad_std (http://docs.astropy.org/en/stable/api/astropy.stats.mad_std.html) instead of median_absolute_deviation. mad_std is a robust standard deviation estimator. For reference, see e.g. https://en.wikipedia.org/wiki/Median_absolute_deviation#Relation_to_standard_deviation.
Just an FYI -- it has no effect on performance. 😄

larrybradley · 2018-06-15T17:58:13Z

One other note:
If you want to clean up the combine API a bit, you could use the SigmaClip object instead of having a bunch of keywords to control the sigma clipping. That's exactly the use case for why I added the SigmaClip class.

i.e. instead of this:

combine(fils, output_file='foo12.fit', sigma_clip=True, sigma_clip_low_thresh=5, sigma_clip_high_thresh=5, sigma_clip_func=np.ma.median, sigma_clip_dev_func=median_absolute_deviation)

you could do this:

sigclip = SigmaClip(sigma_lower=5, sigma_upper=5, cenfunc=np.ma.median, stdfunc=mad_std)
combine(fils, output_file='foo12.fit', sigma_clip=sigclip)

where sigma_clip=None would mean do not perform sigma clipping.

If interested, I could put in a PR.

MSeifert04 · 2018-07-07T16:31:14Z

One thing that I noticed during my tests is that we actually do most operations on the stacked arrays in Combiner along the first axis. For C contiguous arrays that's actually the worst axis to do operations on because elements along the first axis have the biggest "step" (in memory) between them. However stacking along the first axis is something that is faster than stacking along the last axis (which would be the fastest axis to "reduce" along).

I'm not sure if that matters much because we stack and reduce, however from the number of operations the last axis would be better in cases when we do clipping:

We stack once
We reduce once for the baseline, once for the deviation, then we reduce once for the "combined" value, once for the "combined" mask and once for the "combined" uncertainty.

However processors are typically very clever what to pre-fetch so it might not be too bad that we reduce along the "worst" (memory-layout-wise) axis.

Then I played a bit with possible optimizations, however it's really tricky to make optimizations because we have to "assume" things, for example it's easily possible with a numba function (or C extension) to speed up the clipping by a factor of 3-20 (depending on the shape of the stack), however that's assuming that the stack is 3D, that the clip-functions are either (np.ma.mean and np.ma.std) or (np.ma.median and astropy.stats.median_absolute_deviation). And do we really want to include numba (a heavy dependency if one doesn't have numba) or include code to compile (particular problematic for windows users that don't use anaconda) just for some particular combinations, even though they are probably the most common ones?

mwcraig · 2018-08-09T03:29:24Z

I've a had chance to do a bit of profiling; the notebook I used to generate the profiles is in this gist.

The notebook combines 25 images, each roughly 2k x 2k, combined by averaging, sigma clipped before combining.

The upshot is that numpy masked median is really, really, slow. Two thirds of the time in the combination is spent in sigma clip; each of the three medians it does take roughly 7 seconds. Of that, a surprising amount of time is spent in numpy MaskedArray's __getitem__ (10 seconds).

A screenshot from SnakeViz is below. Following @larrybradley's approach of using the nan... speeds things up some, and bottleneck does even better.

mwcraig added enhancement combiner labels Jun 4, 2018

mwcraig added this to the 2.0 milestone Jun 4, 2018

bsipocz mentioned this issue Jun 4, 2018

Add stats benchmarks astropy/astropy-benchmarks#52

Open

MSeifert04 mentioned this issue Jul 7, 2018

Test: Use a numba-powered function for median sigma clipping #632

Closed

mwcraig mentioned this issue Aug 9, 2018

Refactor combiner.combine to minimize the number of files open at one time #630

Merged

3 tasks

crawfordsm modified the milestones: 2.0, 2.1 Jul 25, 2019

mwcraig modified the milestones: 2.1.0, 2.1.1 Dec 24, 2019

mwcraig modified the milestones: 2.1.1, 2.2.0 Mar 15, 2021

mwcraig modified the milestones: 2.2.0, 2.2.1 May 25, 2021

mwcraig modified the milestones: 2.2.1, 2.3 Nov 21, 2021

mwcraig modified the milestones: 2.3, 3.0 Jan 19, 2022

astrofrog mentioned this issue Aug 18, 2022

Always use astropy's sigma clipping and default to string-based functions #794

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sigma clipping is painfully slow #624

Sigma clipping is painfully slow #624

mwcraig commented Jun 4, 2018

mwcraig commented Jun 4, 2018

MSeifert04 commented Jun 4, 2018 •

edited

Loading

bsipocz commented Jun 4, 2018

mwcraig commented Jun 4, 2018

SaOgaz commented Jun 4, 2018 •

edited by MSeifert04

Loading

mwcraig commented Jun 4, 2018

mwcraig commented Jun 10, 2018 •

edited

Loading

larrybradley commented Jun 15, 2018 •

edited

Loading

larrybradley commented Jun 15, 2018

MSeifert04 commented Jul 7, 2018

mwcraig commented Aug 9, 2018

Sigma clipping is painfully slow #624

Sigma clipping is painfully slow #624

Comments

mwcraig commented Jun 4, 2018

mwcraig commented Jun 4, 2018

MSeifert04 commented Jun 4, 2018 • edited Loading

bsipocz commented Jun 4, 2018

mwcraig commented Jun 4, 2018

SaOgaz commented Jun 4, 2018 • edited by MSeifert04 Loading

mwcraig commented Jun 4, 2018

mwcraig commented Jun 10, 2018 • edited Loading

larrybradley commented Jun 15, 2018 • edited Loading

larrybradley commented Jun 15, 2018

MSeifert04 commented Jul 7, 2018

mwcraig commented Aug 9, 2018

MSeifert04 commented Jun 4, 2018 •

edited

Loading

SaOgaz commented Jun 4, 2018 •

edited by MSeifert04

Loading

mwcraig commented Jun 10, 2018 •

edited

Loading

larrybradley commented Jun 15, 2018 •

edited

Loading