Project Progression & Updates #2

Open
pbluc opened this issue Dec 14, 2022 · 9 comments

pbluc commented Dec 14, 2022

Provide self-updates on the progression of the project through a stand-up or daily scrum type format.

pbluc commented Dec 14, 2022

Week 1: November 21 – Getting set up

  • Talked with @dhruvbalwada about first project
    • Goal is to break down Ross’ paper into a comprehensive source of documentation to reference in the future
    • House this guide in a set of Jupyter notebooks
      • Potentially similar template to the L96 book
    • Particular bottlenecks and pain points to consider include converting from an HPC to a cloud setting, making the data live on the cloud, and deciding what starting point to assume; in other words, what can be assumed about the reader’s background

TODOs:

  • Join the Slack workspace
  • Create GitHub repository [@dhruvbalwada]
  • Complete LEAP Tier 2 membership application to get card access to LEAP center and access to cloud deployment
  • Begin deep read of Ross’s paper and get familiar with the associated code
  • Outline and section up Jupyter notebooks
  • Get in contact with [@ Chris] for setting up notebooks and other logistics

pbluc commented Dec 14, 2022

Week 2: November 30 – Continuing to ramp up

  • Slower progress due to Thanksgiving break and traveling to visit family back home
  • Still ramping up: joined the Slack workspace but have yet to receive building access to the LEAP center and permissions to the GitHub repository [@dhruvbalwada]

TODOs:

  • Continue reading through Ross’s paper
  • Analyze code for running simulations with pyqg
  • Write up the Introduction notebook (Introduction.ipynb)
  • Begin drafting the High Resolution Simulations notebook

@pbluc pbluc self-assigned this Dec 14, 2022
@pbluc pbluc added the documentation Improvements or additions to documentation label Dec 14, 2022

pbluc commented Dec 14, 2022

Week 4: December 14 – High-Res pyqg Simulations

  • Last week, touched base with @dhruvbalwada on the depth of the notebook content; now moving forward by laying down the essential foundations so that more information can be built on top if needed
    • Additionally, keep the notebooks focused on technical material rather than straying into the scientific aspects
    • Continue to keep the use of the cloud in mind and be careful with the details of reproducing Ross's results

TODOs:

  • Run high resolution simulations and compare against those by Ross
  • Add steps to reproduce Ross's generated graphs depicting high resolution PV snapshots
  • Begin drafting notebook on creating filters and coarsening operations to generate low resolution datasets from high resolution ones
  • Work on attaining building access

pbluc commented Dec 19, 2022

Week 5: December 19 – Reproducing Simulation Results

  • I've been able to run simulations of the high resolution pyqg models in the Jupyter notebook environment; however, the snapshots of the upper-layer PV at the final timestamp differ from Ross's, despite supposedly passing the same parameters when creating the model.
    • Comparing the model output (an xarray Dataset) from my own simulations against Ross's shows some differences in dimensions; with more inspection I should be able to pinpoint the cause
  • Moving forward, I've added a troubleshooting section to each notebook, documenting the technical issues I ran into when running certain portions of the code and the solutions I used to resolve them, to better serve future readers who try to replicate the results in their own environments
  • For the next notebook section, where filtering and coarsening operators will be applied to the generated high-res datasets, the plan is to guide the reader through how the filters work computationally (since Ross already has code in place for the operators/filters used in his paper) and how to tune any variables to their liking.
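
One way to narrow down a dimension mismatch like the one described above is to diff the dimension sizes of the two datasets. The sketch below is only an illustration: `compare_sizes` is a hypothetical helper, and the sizes shown are made up rather than taken from the actual runs. With xarray, the inputs could come from `dict(ds.sizes)`.

```python
def compare_sizes(mine, reference):
    """Return dimensions that are missing or differ in length.

    Both arguments map dimension name -> length, e.g. dict(ds.sizes)
    for an xarray Dataset. The result maps each mismatched dimension
    to a (mine, reference) pair, with None marking a missing dimension.
    """
    report = {}
    for name in sorted(set(mine) | set(reference)):
        a, b = mine.get(name), reference.get(name)
        if a != b:
            report[name] = (a, b)
    return report


# Hypothetical sizes, for illustration only (not from the real datasets).
mine = {"time": 87, "lev": 2, "y": 256, "x": 256}
reference = {"run": 1, "time": 86, "lev": 2, "y": 256, "x": 256}

diff = compare_sizes(mine, reference)
for name, (a, b) in diff.items():
    print(f"{name}: mine={a}, reference={b}")
```

An extra or missing dimension shows up with `None` on one side, which usually points at how the output was saved or concatenated rather than at the model parameters.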

TODOs:

  • Determine the underlying difference between my simulation run output and that of Ross's results

pbluc commented Dec 26, 2022

Week 6: December 26 – Filtering and Coarse Graining

  • Having reached the milestone of running high resolution simulations, the next milestone is to generate low resolution datasets by applying filtering and coarse-graining operations to the high resolution datasets
    • Continue to keep in mind the upcoming milestones beyond this, such as running the ML models on the datasets, making sense of the metrics, and so forth
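
As a rough picture of what coarse-graining does, here is a minimal sketch using real-space block averaging with NumPy. This is a stand-in for illustration and may differ from the operators in Ross's code; the function name `coarsen_block_mean`, the 256 and 64 grid sizes, and the random field are all assumptions.

```python
import numpy as np


def coarsen_block_mean(field, factor):
    """Coarse-grain a 2D field by averaging non-overlapping blocks.

    Each factor x factor block of the input becomes one output cell.
    """
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid size must be divisible by the coarsening factor")
    # Split each axis into (coarse index, position within block), then
    # average over the two within-block axes.
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))


rng = np.random.default_rng(0)
q_hires = rng.standard_normal((256, 256))  # stand-in for a high-res PV snapshot
q_lores = coarsen_block_mean(q_hires, 4)   # 256x256 -> 64x64
print(q_hires.shape, "->", q_lores.shape)
```

Block averaging preserves the domain mean, which makes it a convenient sanity check before moving on to the filters discussed in the paper.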

TODOs:

  • Finish generating diagnostic plots of high resolution output
  • Read section of Ross's paper that delves into the filtering operations employed
  • Get familiar with Ross's coarsening code
  • Begin drafting the next notebook on generating low resolution datasets, with an analysis of Ross's coarsening code and any parameters or values that can be tuned when reproducing these results

pbluc commented Jan 6, 2023

Week 7: January 5 – Filtering and Coarse-Graining

  • First notebook of the series, on running pyqg simulations, has been completed, though it is subject to small revisions as the following notebooks are created
  • Reached milestone of filtering and coarse-graining high resolution simulations into low resolution datasets
  • Touched base with @dhruvbalwada on how to approach including additional information within the notebooks (e.g., visual plots)
    • Continue moving forward keeping in mind that readers' needs will vary, and include the information needed to get the reader up and running building their own plots, beyond just obtaining the data needed for the ML training

TODOs:

  • Finish second notebook of series on filtering and coarse-graining
  • Begin reading section of Ross's paper and existing notebooks pertaining to running the neural networks and metrics that were used
  • Start organizing/outlining the next notebook on ML training with the coarsened, low-resolution datasets
  • Get access to LEAP cloud service/platform

pbluc commented Jan 16, 2023

Week 9: January 16 – Neural Network ML Training

  • Completed the notebook on filtering and coarse-graining the high-res simulations into low-res datasets, as well as obtaining and diagnosing the forcing terms
  • Within the past week or so, ran into a permission/access issue with retrieving datasets through Globus (Access denied when attempting to retrieve datasets pyqg_parameterization_benchmarks#6)
    • Working closely with the rest of the group to resolve this issue by setting up a longer-term solution for keeping Globus access open
  • Working on the next milestone of running ML training sessions using neural networks on the coarsened, low-res datasets
    • Have been annotating and digesting Sections 4-5 of the paper to form a foundation for moving forward into this next milestone
    • Intend to touch base soon with @dhruvbalwada on current progress and necessary preparations for ML training
      • Meeting Notes:
        • Clarified the language within the paper on parameterizations as well as online vs offline and the inherent relationship between online and offline testing
        • Focus on offline testing for now before getting into online tests; similarly, limit the scope to FCNNs before using symbolic regression and other hybrid models; encouraged to play around with various other parameterizations
        • Walked through setting up workspace on LEAP Jupyter Hub, transferring notebooks from local machine, and git versioning with notebooks
          • Julius is another POC for help working on the LEAP Jupyter Hub
        • Addressed the Globus issue and steps to unblock for now: avoid downloading the data, as it is very large, and work on other intermediate tasks
        • Meet again Thursday, Feb 2 @ 10:30 in LEAP

TODOs:

  • Meet 1:1 with @dhruvbalwada to discuss ML training checkpoint
  • Get in contact with Chris about moving notebooks to LEAP's Jupyter platform
  • Get familiar with Andrew's training code and measured metrics in neural_networks.py and online_metrics.py
  • Submit application to be affiliated with the project

pbluc commented Jan 30, 2023

Week 11: January 30 – Offline Testing of FCNN Parameterization

  • Made strides in annotating and analyzing code in neural_networks.py
  • Unblocked temporarily from Globus dataset access issue; awaiting permanent solution (Access denied when attempting to retrieve datasets pyqg_parameterization_benchmarks#6)
  • Attained official affiliation with the M2LInES project
  • Attended the M2LInES annual meeting online
    • Sat in on Nora's presentation of her work on using reinforcement learning (RL) in the context of Andrew's work on subgrid parameterizations; it was interesting to see how other learning models can be used to solve the same problem
    • Took in the bigger overall structure of what is being worked on in the group and how they are all connected to one another
  • Was able to get pyqg installed in the Jupyter Hub workspace with the help of @jbusecke, using conda/mamba rather than pip (https://www.anaconda.com/blog/understanding-conda-and-pip)

TODOs:

  • Test first offline FCNN parameterization
  • Continue inspecting the code in neural_networks.py
  • Awaiting more onboarding items from Laure and Johanna
  • Meet 1:1 with @dhruvbalwada to touch base

pbluc commented Feb 22, 2023

Week 14: February 22 – Continuing Offline Testing

  • To make the data from Ross' study more accessible within LEAP and in response to the Globus access issue (Access denied when attempting to retrieve datasets pyqg_parameterization_benchmarks#6), Ross' datasets have been uploaded to my persistent storage bucket on Jupyter Hub to provide a more permanent means of getting the datasets in the future within the group
    • However, currently running into issues retrieving one of the datasets (jet/forcing3.zarr) through Globus; this looks to be a potential issue with the data itself and how it's stored
  • Successfully ran an offline parameterization test with the generated datasets and found nearly identical performance numbers; will now test other parameterizations to see how performance differs
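
For context, an offline test compares the forcing a parameterization predicts against the forcing diagnosed from the data, without running the model forward. The sketch below computes two commonly used scores, R² and Pearson correlation, with NumPy. It is an assumption that these match the metrics in the benchmark code; `offline_scores` is a hypothetical helper and the arrays are synthetic.

```python
import numpy as np


def offline_scores(true, pred):
    """Return (R^2, Pearson correlation) between target and predicted fields."""
    true = np.asarray(true, dtype=float).ravel()
    pred = np.asarray(pred, dtype=float).ravel()
    ss_res = np.sum((true - pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((true - true.mean()) ** 2)   # total variance of the target
    r2 = 1.0 - ss_res / ss_tot
    corr = np.corrcoef(true, pred)[0, 1]
    return r2, corr


# Synthetic stand-ins: a "true" forcing and a noisy "prediction" of it.
rng = np.random.default_rng(0)
true = rng.standard_normal((10, 64, 64))
pred = true + 0.1 * rng.standard_normal(true.shape)

r2, corr = offline_scores(true, pred)
perfect_r2, perfect_corr = offline_scores(true, true)
print(f"R^2 = {r2:.3f}, correlation = {corr:.3f}")
```

Computing the same scores on the reference datasets and on regenerated ones gives a direct way to check whether two runs agree.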

TODOs:

  • Run other parameterization tests offline
  • Examine other metrics for analyzing offline performance
  • Resolve issue with retrieving single dataset from Globus
  • Begin looking at online testing with FCNNs
