Auto mcp w/o MC #243

Merged: 22 commits, Aug 14, 2024. The diff below shows the changes from all commits.
magicctapipe/scripts/lst1_magic/README.md (10 additions, 27 deletions)

@@ -29,30 +29,17 @@ During the analysis, some files (i.e., bash scripts, lists of sources and runs)

### DL0 to DL1

-In this step, we will convert the MAGIC Calibrated data to Data Level (DL) 1 (our goal is to reach DL3) and MC DL0 to DL1.
+In this step, we will convert the MAGIC Calibrated data to Data Level (DL) 1 (our goal is to reach DL3).

-In your working IT Container directory (e.g. /fefs/aswg/workspace/yourname/yourprojectname), open your environment with the command `conda activate {env_name}` and update the file `config_auto_MCP.yaml` according to your analysis. If you need non-standard parameters (e.g., for the cleaning), take care that the `resources/config.yaml` file gets installed when you install the pipeline, so you will have to copy it, e.g. in your workspace, modify it and put the path to this new file in the `config_auto_MCP.yaml` (this way you don't need to install again the pipeline).
+In your working IT Container directory (i.e., `workspace_dir`), open your environment with the command `conda activate {env_name}` and update the file `config_auto_MCP.yaml` according to your analysis. If you need non-standard parameters (e.g., for the cleaning), note that `resources/config.yaml` is installed together with the pipeline: copy it (e.g., into your workspace), modify the copy, and set the path to the new file in `config_auto_MCP.yaml`, so that you do not have to reinstall the pipeline.

-The file `config_auto_MCP.yaml` must contain the telescope IDs, the directories with the MC data (ignored if you set NSB_matching = true), the data selection, and some information on the night sky background (NSB) level and software versions:
+The file `config_auto_MCP.yaml` must contain parameters for data selection and some information on the night sky background (NSB) level and software versions:

```
-
-mc_tel_ids:
-    LST-1: 1
-    LST-2: 0
-    LST-3: 0
-    LST-4: 0
-    MAGIC-I: 2
-    MAGIC-II: 3
-
directories:
    workspace_dir : "/fefs/aswg/workspace/elisa.visentin/auto_MCP_PR/" # Output directory where all the data products will be saved.
-    # MC paths below are ignored if you set NSB_matching = true.
-    MC_gammas : "/fefs/aswg/data/mc/DL0/LSTProd2/TestDataset/sim_telarray" # set to "" if you don't want to process these Monte Carlo simulations.
-    MC_electrons : ""
-    MC_helium : ""
-    MC_protons : "/fefs/aswg/data/mc/DL0/LSTProd2/TrainingDataset/Protons/dec_2276/sim_telarray"
-    MC_gammadiff : "/fefs/aswg/data/mc/DL0/LSTProd2/TrainingDataset/GammaDiffuse/dec_2276/sim_telarray/"

data_selection:
    source_name_database: "CrabNebula" # MUST BE THE SAME AS IN THE DATABASE; set to null to process all sources in the given time range.

@@ -68,17 +55,13 @@
general:
    base_config_file: '' # path + name of a custom MCP config file. If not provided, the default config.yaml file will be used.
    LST_version : "v0.10" # check the `processed_lstchain_file` version in the LST database!
    LST_tailcut : "tailcut84"
-    focal_length : "effective"
    simtel_nsb : "/fefs/aswg/data/mc/DL0/LSTProd2/TestDataset/sim_telarray/node_theta_14.984_az_355.158_/output_v1.4/simtel_corsika_theta_14.984_az_355.158_run10.simtel.gz" # simtel file (DL0) to evaluate NSB
-    lstchain_modified_config : true # use_flatfield_heuristic = True to evaluate NSB
-    proton_train_fraction : 0.8 # 0.8 means that 80% of the DL1 protons will be used for training the Random Forest.
+    lstchain_modified_config : true # use_flatfield_heuristic = True to evaluate NSB
    nsb : [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
    env_name : magic-lst # name of the conda environment to be used to process data.
    cluster : "SLURM" # cluster management system on which data are processed. At the moment only SLURM is available; in the future maybe also Condor (PIC, CNAF).
-    NSB_matching : true # Set to false to process also the MCs. Set to true if adequate MC productions (DLx) are already available on the IT Container.
-    NSB_MC : 0.5 # extra noise in dim pixels used to process MCs; e.g., you could put here the average NSB value of the processed LST runs. Ignored if NSB_matching=true.

```

WARNING: Only the runs for which the `LST_version` parameter matches the `processed_lstchain_file` version in the LST database (i.e., the version used to evaluate the NSB level; generally the last available and processable version of a run) will be processed.
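
The README text above tells you to copy the installed `resources/config.yaml` rather than edit it in place, and to point `base_config_file` at the copy. A minimal sketch of that workflow, assuming PyYAML and purely illustrative paths:

```python
# Sketch only: copy the installed resources/config.yaml into your workspace,
# edit the copy, and register it in config_auto_MCP.yaml. Paths are illustrative.
import shutil
from pathlib import Path

import yaml  # PyYAML

installed_config = Path("/path/to/site-packages/magicctapipe/resources/config.yaml")  # hypothetical location
workspace = Path("/fefs/aswg/workspace/yourname/yourprojectname")

custom_config = workspace / "my_config.yaml"
shutil.copy(installed_config, custom_config)
# ... edit custom_config by hand (e.g., non-standard cleaning parameters) ...

# Point config_auto_MCP.yaml at the copy so the pipeline picks it up.
auto_mcp = workspace / "config_auto_MCP.yaml"
cfg = yaml.safe_load(auto_mcp.read_text())
cfg["general"]["base_config_file"] = str(custom_config)
# Note: a plain yaml.dump round-trip drops the comments; edit by hand if you
# want to keep them.
auto_mcp.write_text(yaml.dump(cfg, sort_keys=False))
```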

@@ -113,9 +96,9 @@

The command `dl1_production` does a series of things:

- Creates a directory with the target name within the directory `yourprojectname/{MCP_version}` and several subdirectories inside it that are necessary for the rest of the data reduction. The main directories are:
```
-/fefs/aswg/workspace/yourname/yourprojectname/VERSION/
-/fefs/aswg/workspace/yourname/yourprojectname/VERSION/{source}/DL1
-/fefs/aswg/workspace/yourname/yourprojectname/VERSION/{source}/DL1/[subdirectories]
+workspace_dir/VERSION/
+workspace_dir/VERSION/{source}/DL1
+workspace_dir/VERSION/{source}/DL1/[subdirectories]
```
where [subdirectories] stands for several subdirectories containing the MAGIC subruns in the DL1 format.
- Generates a configuration file called `config_DL0_to_DL1.yaml` with telescope ID information and adopted imaging/cleaning cuts, and puts it in the directory `[...]/yourprojectname/VERSION/{source}/` created in the previous step.
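
For orientation, a toy `pathlib` sketch of the tree shown above; the version and source names are placeholders, and this is not the script's actual code:

```python
# Toy reconstruction of the layout `dl1_production` creates (placeholder names).
from pathlib import Path

workspace_dir = Path("/path/to/your/workspace")  # `workspace_dir` from config_auto_MCP.yaml
version = "v0.5.0"   # placeholder for {MCP_version}
source = "Crabtest"  # placeholder for {source}, i.e. source_name_output

dl1 = workspace_dir / version / source / "DL1"
dl1.mkdir(parents=True, exist_ok=True)
# [subdirectories]: several directories below DL1 that will hold the MAGIC
# subruns in DL1 format once the jobs have run.
```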

@@ -131,7 +114,7 @@

or

> $ squeue -u your_user_name

-Once it is done, all of the subdirectories in `/fefs/aswg/workspace/yourname/yourprojectname/VERSION/{source}/DL1` will be filled with files of the type `dl1_MX.RunXXXXXX.0XX.h5` for each MAGIC subrun.
+Once it is done, all of the subdirectories in `workspace_dir/VERSION/{source}/DL1` will be filled with files of the type `dl1_MX.RunXXXXXX.0XX.h5` for each MAGIC subrun.

WARNING: some of these jobs could fail due to 'broken' input files. Before moving to the next step, check for failed jobs (through `job_accounting` and/or the log files) and remove the output files produced by them: these files are generally very small (less than a few kB) and cannot be read in the following steps.
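
Following the warning above, a hedged sketch of how one could spot (and, after checking the logs, remove) the undersized outputs of failed jobs; the 4 kB threshold is a guess based on the "few kB" indication:

```python
# Sketch: flag suspiciously small DL1 files left behind by failed jobs.
# The threshold is an assumption based on the "few kB" guideline above.
from pathlib import Path

dl1_dir = Path("/path/to/your/workspace/VERSION/CrabNebula/DL1")  # illustrative
threshold = 4 * 1024  # bytes; "a few kB"

for f in sorted(dl1_dir.rglob("dl1_M*.h5")):
    size = f.stat().st_size
    if size < threshold:
        print(f"suspect output: {f} ({size} B)")
        # f.unlink()  # uncomment only after confirming the job really failed
```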

semi_automatic_scripts/__init__.py

@@ -8,41 +8,34 @@
from .coincident_events import configfile_coincidence, linking_bash_lst
from .dl1_production import (
    config_file_gen,
-    directories_generator_MC,
    directories_generator_real,
    lists_and_bash_gen_MAGIC,
-    lists_and_bash_generator,
)
from .job_accounting import run_shell
from .list_from_h5 import clear_files, list_run, magic_date, split_lst_date
from .merge_stereo import MergeStereo
-from .merging_runs import merge, mergeMC, split_train_test
-from .stereo_events import bash_stereo, bash_stereoMC, configfile_stereo
+from .merging_runs import merge
+from .stereo_events import bash_stereo, configfile_stereo

__all__ = [
    "bash_stereo",
-    "bash_stereoMC",
    "clear_files",
    "configfile_coincidence",
    "configfile_stereo",
    "config_file_gen",
    "directories_generator_real",
-    "directories_generator_MC",
    "existing_files",
    "fix_lists_and_convert",
    "linking_bash_lst",
-    "lists_and_bash_generator",
    "lists_and_bash_gen_MAGIC",
    "list_run",
    "magic_date",
    "merge",
-    "mergeMC",
    "MergeStereo",
    "missing_files",
    "rc_lines",
    "run_shell",
    "slurm_lines",
    "split_lst_date",
-    "split_train_test",
    "table_magic_runs",
]
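
After this change the MC-only helpers are no longer exported; a quick smoke test of the trimmed public interface (the import path is an assumption about the package layout, the names themselves come from the updated `__all__` above):

```python
# Assumed import path; adjust to the actual package layout.
from magicctapipe.scripts.lst1_magic.semi_automatic_scripts import (
    bash_stereo,
    configfile_coincidence,
    directories_generator_real,
    merge,
)
```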

config_auto_MCP.yaml

@@ -1,32 +1,24 @@
directories:
-    workspace_dir : "/fefs/aswg/workspace/elisa.visentin/auto_MCP_PR/" # Output directory where all the data products will be saved.
-    # MC paths below are ignored if you set NSB_matching = true.
-    MC_gammas : "/fefs/aswg/data/mc/DL0/LSTProd2/TestDataset/sim_telarray" # set to "" if you don't want to process these Monte Carlo simulations.
-    MC_electrons : ""
-    MC_helium : ""
-    MC_protons : "/fefs/aswg/data/mc/DL0/LSTProd2/TrainingDataset/Protons/dec_2276/sim_telarray"
-    MC_gammadiff : "/fefs/aswg/data/mc/DL0/LSTProd2/TrainingDataset/GammaDiffuse/dec_2276/sim_telarray/"
+    workspace_dir: "/fefs/aswg/workspace/elisa.visentin/auto_MCP_PR/" # Output directory where all the data products will be saved.

data_selection:
    source_name_database: "CrabNebula" # MUST BE THE SAME AS IN THE DATABASE; set to null to process all sources in the given time range.
    source_name_output: 'Crabtest' # Name tag of your target. Used only if source_name_database != null.
-    time_range : True # Search for all runs in a LST time range (e.g., 2020_01_01 -> 2022_01_01).
-    min : "2023_11_17"
-    max : "2024_03_03"
-    date_list : ['2020_12_15','2021_03_11'] # LST list of days to be processed (only if time_range=False), format: YYYY_MM_DD.
+    time_range: True # Search for all runs in an LST time range (e.g., 2020_01_01 -> 2022_01_01).
+    min: "2023_11_17"
+    max: "2024_03_03"
+    date_list: ['2020_12_15','2021_03_11'] # LST list of days to be processed (only if time_range=False), format: YYYY_MM_DD.
    skip_LST_runs: [3216,3217] # LST runs to ignore.
    skip_MAGIC_runs: [5094658] # MAGIC runs to ignore.

general:
    base_config_file: '' # path + name of a custom MCP config file. If not provided, the default config.yaml file will be used.
-    LST_version : "v0.10" # check the `processed_lstchain_file` version in the LST database!
-    LST_tailcut : "tailcut84"
-    focal_length : "effective"
-    simtel_nsb : "/fefs/aswg/data/mc/DL0/LSTProd2/TestDataset/sim_telarray/node_theta_14.984_az_355.158_/output_v1.4/simtel_corsika_theta_14.984_az_355.158_run10.simtel.gz" # simtel file (DL0) to evaluate NSB
-    lstchain_modified_config : true # use_flatfield_heuristic = True to evaluate NSB
-    proton_train_fraction : 0.8 # 0.8 means that 80% of the DL1 protons will be used for training the Random Forest.
-    nsb : [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
-    env_name : magic-lst # name of the conda environment to be used to process data.
-    cluster : "SLURM" # cluster management system on which data are processed. At the moment we have only SLURM available, in the future maybe also condor (PIC, CNAF).
-    NSB_matching : true # Set to false to process also the MCs. Set to true if adequate MC productions (DLx) are already available on the IT Container.
-    NSB_MC : 0.5 # extra noise in dim pixels used to process MCs; e.g., you could put here the average NSB value of the processed LST runs. Ignored if NSB_matching=true.
+    LST_version: "v0.10" # check the `processed_lstchain_file` version in the LST database!
+    LST_tailcut: "tailcut84"
+    simtel_nsb: "/fefs/aswg/data/mc/DL0/LSTProd2/TestDataset/sim_telarray/node_theta_14.984_az_355.158_/output_v1.4/simtel_corsika_theta_14.984_az_355.158_run10.simtel.gz" # simtel file (DL0) to evaluate NSB
+    lstchain_modified_config: true # use_flatfield_heuristic = True to evaluate NSB
+    nsb: [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
+    env_name: magic-lst # name of the conda environment to be used to process data.
+    cluster: "SLURM" # cluster management system on which data are processed. At the moment only SLURM is available; in the future maybe also Condor (PIC, CNAF).
