Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add estimate_sigma function to threshold features #395

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
48e6006
first commit to ensure tool chain correct
micha2718l Jul 30, 2018
f018bd3
Naive copy-paste estimate_sigma ripped out of scikit.
micha2718l Jul 30, 2018
3cb502e
Removed image functions. Changed test and example. Comments to match.
micha2718l Aug 1, 2018
25fab87
Removed temp and vs code files.
micha2718l Aug 1, 2018
65ce36c
Fixed variables in estimate_sigma input.
micha2718l Aug 1, 2018
1e53ece
Fixed PEP8 violations.
micha2718l Aug 3, 2018
484856b
Remove print from example in doc to get travis to pass 2.7 build.
micha2718l Aug 4, 2018
407985b
Added ability to pass in object with ppf method.
micha2718l Aug 17, 2018
557738b
Merge branch 'master' into threshold_alpha
micha2718l Mar 18, 2019
602ceb0
Merge https://github.com/PyWavelets/pywt into threshold_alpha
micha2718l Mar 18, 2019
7dab21a
Remove old tests for threshold.
micha2718l May 8, 2019
bc8b1d5
Merge branch 'PyWavelets-master' into threshold_alpha
micha2718l May 8, 2019
ed1d9ff
Fixed test wavelet leftover.
micha2718l May 8, 2019
e2aa85b
Merge pull request #2 from PyWavelets/master
micha2718l Oct 15, 2019
11282da
Add estimate_sigma to docs.
Oct 15, 2019
606fae6
Remove debug wavelet name.
Oct 15, 2019
0ba3996
Fixing docstring formatting.
Oct 15, 2019
644a1c5
Add tests for estimate_sigma
Oct 15, 2019
1f5beee
change estimate_sigma to estimate_noise
Oct 15, 2019
1f284f5
Merge remote-tracking branch 'upstream/master' into threshold_alpha
Feb 10, 2024
73fa511
Merge branch 'master' into threshold_alpha
micha2718l Mar 4, 2024
1c2a6ea
removing auto files
micha2718l Mar 4, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/ref/thresholding-functions.rst
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ Thresholding

.. autofunction:: threshold
.. autofunction:: threshold_firm
.. autofunction:: estimate_noise

The left panel of the figure below illustrates that non-negative Garotte
thresholding is intermediate between soft and hard thresholding. Firm
Expand Down
92 changes: 91 additions & 1 deletion pywt/_thresholding.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,10 @@
from __future__ import division, print_function, absolute_import
import numpy as np

__all__ = ['threshold', 'threshold_firm']
from ._multidim import dwtn


__all__ = ['threshold', 'threshold_firm', 'estimate_noise']


def soft(data, value, substitute=0):
Expand Down Expand Up @@ -248,3 +251,90 @@ def threshold_firm(data, value_low, value_high):
if np.any(large_vals[0]):
thresholded[large_vals] = data[large_vals]
return thresholded


def estimate_noise(data, distribution='Gaussian', **kwargs):
"""
Robust wavelet-based estimator of the (Gaussian) noise standard deviation.

Parameters
----------
data : ndarray
The data used to estimate sigma.
distribution : str or object with ppf method
The underlying noise distribution.
\\**kwargs : \\**kwargs
Keyword arguments to pass into distribution ppf method.

Returns
-------
sigma : float
Estimated noise standard deviation.

Notes
-----
This function assumes the noise follows a Gaussian distribution. The
estimation algorithm is based on the median absolute deviation of the
wavelet detail coefficients as described in section 4.2 of [1]_.

References
----------
.. [1] D. L. Donoho and I. M. Johnstone. "Ideal spatial adaptation
by wavelet shrinkage." Biometrika 81.3 (1994): 425-455.
DOI:10.1093/biomet/81.3.425

Examples
--------
>>> import numpy as np
>>> import pywt
>>> data = np.sin(np.linspace(0,10,100))
>>> np.random.seed(42)
>>> noise = 0.5 * np.random.normal(0,1,100)
>>> pywt.estimate_noise(data + noise)
0.45634925413327504
"""

coeffs = dwtn(data, wavelet='db2')
detail_coeffs = coeffs['d' * data.ndim]
return _sigma_est_dwt(detail_coeffs, distribution=distribution, **kwargs)


def _sigma_est_dwt(detail_coeffs, distribution='Gaussian', **kwargs):
"""Calculate the robust median estimator of the noise standard deviation.
Parameters
----------
detail_coeffs : ndarray
The detail coefficients corresponding to the discrete wavelet
transform of an image.
distribution : str or object with ppf method
The underlying noise distribution.
\\**kwargs : \\**kwargs
Keyword arguments to pass into distribution ppf method.

Returns
-------
sigma : float
The estimated noise standard deviation (see section 4.2 of [1]_).
References
----------
.. [1] D. L. Donoho and I. M. Johnstone. "Ideal spatial adaptation
by wavelet shrinkage." Biometrika 81.3 (1994): 425-455.
DOI:10.1093/biomet/81.3.425
"""
# Consider regions with detail coefficients exactly zero to be masked out
detail_coeffs = detail_coeffs[np.nonzero(detail_coeffs)]

if hasattr(distribution, 'ppf'):
if not kwargs:
kwargs = {'q': 0.75}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally thought it would be good to auto-fill the arguments with 75th quantile in the case there are no args passed in to keep up the defaults, but this may be a bad idea in case someone wants to use a ppf method which takes no arguments (as this would result in passing in an unwanted keyword argument).

denom = distribution.ppf(**kwargs)
elif str(distribution).lower() == 'gaussian':
# 75th quantile of the underlying, symmetric noise distribution
# denom = scipy.stats.norm.ppf(0.75)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to a string name, you could accept any object with a ppf method here perhaps (assuming that works, didn't read the paper)? Just check with if hasattr(distribution, 'ppf').

The bigger issue here seems to be the hardcoding of the 75th quantile. Why is that not a keyword?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like both choices originate from specific suggestions by the author, I agree that the best solution would be to have any distribution or quantile as valid inputs, as that could prove useful. The issue I saw is the use of scipy.stats in the original code, because PyWavelets doesn't rely on scipy (right?) I'm not sure what the best thing to do there is, including scipy for just this is overkill, perhaps a couple hand coded options for now? Or for just Gaussian with a variable percent input, shouldn't be too hard to get going without scipy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, nice suggestion for finding if a distribution supports a method, seeing the example it seems obvious now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we don't want to rely on scipy. That's why I suggested checking for a ppf method; users can still pass in scipy.stats distributions but we don't need to import from scipy anywhere so we don't rely on scipy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A concise explanation of the rationale for use of the 75th quantile in MAD is given on the following wikipedia page:
https://en.wikipedia.org/wiki/Median_absolute_deviation
Perhaps it is worth adding that to the References? I don't remember if it was spelled out explicitly in Donoho's paper

I also agree about not relying on scipy here. scipy was already a dependency of scikit-image and a reviewer of my PR there preferred to call the ppf method explicitly instead of hard-coding the value, but that doesn't mean we have to do the same here.

The idea to check for a ppf method is nice.

Another approach to deal with other noise distributions such as Poisson noise is to first apply a variance stabilizing transform prior to calling this Gaussian-based MAD method. An example implementation of this type of approach for use with Rician noise is available here, but it is not under compatible license terms.

# magic number to fill in because no scipy
denom = 0.6744897501960817
else:
raise ValueError("Only Gaussian noise estimation or objects with"
" ppf method currently supported")
sigma = np.median(np.abs(detail_coeffs)) / denom
return sigma
13 changes: 13 additions & 0 deletions pywt/tests/test_thresholding.py
Original file line number Diff line number Diff line change
Expand Up @@ -167,3 +167,16 @@ def test_threshold_firm():
mt_abs_firm = np.abs(d_firm[mt])
assert_(np.all(mt_abs_firm < np.abs(d_hard[mt])))
assert_(np.all(mt_abs_firm > np.abs(d_soft[mt])))


def test_estimate_noise():
data = np.sin(np.linspace(0, 10, 1000))
np.random.seed(42)
noise = 0.5 * np.random.normal(0, 1, 1000)
sigma = pywt.estimate_noise(data + noise)
assert_allclose(sigma, 0.4867884459318056)

assert_raises(ValueError, pywt.estimate_noise, data,
distribution='not_a_distribution')
assert_raises(ValueError, pywt.estimate_noise, data,
distribution=42)
Loading