Variance extractor #2543

ctoennis · 2024-04-24T15:14:41Z

This pull request is to add a new ImageExtractor subclass to generate images from the variance of waveforms that will be needed for the pointing calibration using stars. This request is related to issue #2542 .

src/ctapipe/image/extractor.py

kosack · 2024-04-24T15:52:29Z

If I understand correctly, what you are basically doing here is computing an event-wise pedestal variance. In the data model (see the original top-level data model document), there are a few places you can put pedestals:

dl1.event.telescope.pedestal: pedestals measured for each event (usually in a time window as @maxnoe mentioned, since you can for example just use time slices outside the trace integration window, even for Cherenkov events). This is also computed by some cameras directly in hardware as dl0 information.
dl1.monitoring.telescope.pedestal: is the pedestal computed from an ensemble of pedestal events, or from Cherenkov events with the shower cut out (this is also in the DL0.monitoring.telescope.pedestal, computed by the cameras) computed every few seconds.
dl1.service.telescope.pedestal: pedestals measured from some standard run that are applied for e.g. RTA

So I think what you are computing is the first one, and you should store it as pedestal information: It is in fact a second measurement of the pedestal and its variance, and could be compared to the other methods (even if the use case you are describing here is for doing star tracking, star tracking is computed from the pedestal variances, which can be over many time scales)

fwerner · 2024-04-25T06:56:32Z

In addition to what has been said I'd suggest to clarify the purpose of this extractor: it is calculating the waveform sample variance, which should be reflected in the name/docstring to communicate that it has a complicated relationship to charge.

src/ctapipe/image/extractor.py

ctoennis · 2024-04-25T08:50:55Z

The thing we want to calculate is the variance of the waveforms specifically of interleaved pedestal events and not the variance of the pedestal in units of charge.

@kosack : What we get is a sort of image based on the variance of the waveform in each pixel. What we get is not a pedestal value, so putting it in a pedestal container is not the best option i think.

I think it is best to write a container appropriate for these variance images similar to the DL1CameraContainer

ctoennis · 2024-04-26T13:38:06Z

I just added a custom continer for the variance images. I also compared the variance calculation using just a plain variance and a variance of integrated chunks of the waveforms using the FixedWindowSum extractor in the plot below.

The x axis shows the number of images used in calculating an average variance value that is shown on the y axis.

Though the methods are generally statistically equivalent the average variance becomes stable with fewer images when using a simple variance over the variance from integrated chunks.

maxnoe · 2024-04-26T14:00:11Z

Though the methods are generally statistically equivalent the average variance becomes stable with fewer images when using a simple variance over the variance from integrated chunks.

That's not surprising, since you have just more samples. But you change of what you compute the variance

ctoennis · 2024-05-27T13:26:57Z

Hi all, can we restart the discussion here? I made a small update to the extractor to contain timing information that will later be needed in the corresponding subclass of StatisticsExtractor that i am working on with Tjark.

There was a decision needed as to where the variance images will be attached to. Should this find a place somewhere in dl1.event.telescope.pedestal or something similar?

src/ctapipe/image/extractor.py

maxnoe · 2024-05-29T16:07:36Z

Hi @ctoennis the lint step in the CI is failing due to some typos and wrong formatting.

To check locallly, make sure to install the pre-commit hooks:
https://ctapipe.readthedocs.io/en/latest/developer-guide/getting-started.html#installing-ctapipe-in-development-mode

ctoennis · 2024-06-03T08:58:51Z

i fixed all the isses with the spelling, code checks etc.

src/ctapipe/containers.py

src/ctapipe/image/extractor.py

ctoennis · 2024-07-10T13:48:48Z

i can remove trigger time. But i am not sure how to proceed on the difference to the ImageExtractor API. I could make it that the variance image is stored in a DL1ImageContainer and that it complies with the API or make it a separate class that doesn't inherit from the ImageExtractor.

I think what i generate here is a type of image as i try to look for stars in the camera FOV with them. But if it makes more sense to you it can be separate from the ImageExtractors.

maxnoe · 2024-07-10T15:05:29Z

The question is if it is possible to adhere to the API that is expected from the image extractors: if yes, no problem, let it be an image extractor. But that means for example the call above should work and correctly write out the data to the HDF5 file.

If you need the different API, it should be its own thing and we need to think about how to integrate it into the process tool for example.

kosack · 2024-07-10T15:12:57Z

The question is if it is possible to adhere to the API that is expected from the image extractors: if yes, no problem, let it be an image extractor. But that means for example the call above should work and correctly write out the data to the HDF5 file.

I think what this class does is not the same as an ImageExtractor, or am I missing something? ImageExtractors take a "movie" (a set of waveforms per pixel for one event) for a single event and convert them to set of time-independent images that must include the peak time and integral over a window, but can also optionally inlcude any other parameters of the waveform like the pulse width or variabce. Generally, there is only one ImageExtractor that is run at a during processing. That doesn't seem to be what you have here.

The idea of the ImageExtractors was that low-level routines like those used to compute peak time or integral are implemeted as basic functions taking numpy inpus, and the ImageExtractor composes several of those methods into one set of algorithms. So what I would expect here is more like a function to compute the variance, and then that function is added to one or more of the existing ImageExtractors, perhaps as an option. So that then the chosen ImageExtractor generates the required peak time and integral, but also the variance

maxnoe · 2024-07-10T15:41:26Z

I think what this class does is not the same as an ImageExtractor, or am I missing something?

You could think of this as an ImageExtractor that uses the variance of each waveform to produce the image and doesn't produce a peak_time.

mexanick · 2024-07-10T17:15:07Z

I believe, it is an ImageExtractor, the __call__() signature is of course wrong here. It should copy the one of the base extractor:

@abstractmethod
    def __call__(
        self, waveforms, tel_id, selected_gain_channel, broken_pixels
    ) -> DL1CameraContainer:

Only waveforms are then used, the other parameters are simply ignored.

The only issue, that has to be resolved, is the return type. DL1CameraContainer in its current shape contains a number of fields that are not needed in this case (actually, everything except image).

I'd propose to add an intermediate container:

class CameraImageContainer(Container):
    """
    Storage of output of generic camera images (any reduction of waveforms).
    """

    image = Field(
        None,
        "Numpy array of camera image, after waveform extraction."
        "Shape: (n_pixel) if n_channels is 1 or data is gain selected"
        "else: (n_channels, n_pixel)",
    )

class DL1CameraContainer(CameraImageContainer):
    """
    Storage of output of camera calibration e.g the final calibrated
    image in intensity units and the pulse time.
    """
    peak_time = Field(
        None,
        "Numpy array containing position of the peak of the pulse as determined by "
        "the extractor."
        "Shape: (n_pixel) if n_channels is 1 or data is gain selected"
        "else: (n_channels, n_pixel)",
    )
    # rest of the DL1CameraContainer fields
    
class VarianceContainer(CameraImageContainer):
    """
    Storage of output of camera variance image.

    Stores the variance of waveform in each pixel, composed as an image.
    """
    variance_method = Field(
        VarianceType.SIMPLE,
        "Method by which the variance was calculated"
        "This can either be a plain variance"
        "or a variance of integrated samples",
        type=VarianceType,
    )
    # I don't think you need anything else here. The trigger time you extract from elsewhere in the event tree.

and change the return type of base ImageExtractor to CameraImageContainer

maxnoe · 2024-07-12T08:40:46Z

If it's just that the VarianceExtractor doesn't provide a peak_time, I think that's fine, just leave it None (the default value).

Adding an additional field to the DL1CameraContainer that's an "image_type" sounds like a good idea.

I would avoid having extractors that have different return types.

…ractors

…e container

ctoennis · 2024-07-22T08:42:50Z

So, is this request then good to merge?

maxnoe · 2024-07-22T08:43:59Z

@ctoennis From my side yes, but we require two approvals and @kosack is on vacation right now. Since he had comments / requests for changes before, I'd like to wait for his approval here.

kosack · 2024-07-29T08:54:17Z

I'll take a look tomorrow or today hopefully! Might take a bit of time since there are a lot of changes.

kosack

Looks good. but I would like to have a test that checks for correctness. It is a simple algorithm, but the test will also ensure that nobody breaks it in the future. E.g. a test that runs the extractor on some dummy data with a known variance.

ctoennis · 2024-07-31T11:02:29Z

i added a test to check that an actual variance is being calculated.

ctao-dpps-sonarqube · 2024-07-31T11:19:34Z

Analysis Details

0 Issues

0 Bugs
0 Vulnerabilities
0 Code Smells

Coverage and Duplications

100.00% Coverage (93.60% Estimated after merge)
0.00% Duplicated Code (0.60% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

src/ctapipe/image/tests/test_extractor.py

ctoennis added the calibration label Apr 24, 2024

ctoennis requested a review from fwerner April 24, 2024 15:14

ctoennis self-assigned this Apr 24, 2024

TjarkMiener reviewed Apr 24, 2024

View reviewed changes

src/ctapipe/image/extractor.py Outdated Show resolved Hide resolved

TjarkMiener mentioned this pull request Apr 24, 2024

Camera and pointing calibration API #2542

Open

8 tasks

maxnoe reviewed Apr 24, 2024

View reviewed changes

src/ctapipe/image/extractor.py Outdated Show resolved Hide resolved

maxnoe reviewed Apr 25, 2024

View reviewed changes

src/ctapipe/image/extractor.py Outdated Show resolved Hide resolved

mexanick reviewed May 27, 2024

View reviewed changes

src/ctapipe/image/extractor.py Outdated Show resolved Hide resolved

This comment has been minimized.

Sign in to view