Add wrf version #127

Open: wants to merge 36 commits into base: main

matthiasdemuzere
Owner

I get a lot of questions on the LCZ labels, and on related errors caused by using a specific WRF version.
See example here: #122

So I worked towards a more structural solution, where the user is required to provide the wrf version as an argument.

Based on this argument, the required parameters are set using this dict:

    wrf_versions_dict = {
        'v4.3': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
        'v4.3.1': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
        'v4.3.2': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
        'v4.3.3': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
        'v4.4': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
        'v4.4.1': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
        'v4.4.2': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61},
        'v4.5': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61},
        'v4.5.1': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61},
        'v4.5.2': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61}
    }

I tested the code locally, from a W2W perspective using the sample data.
As far as I can see it seems to work.

@jkittner and @andreazonato: could you have a look?

Also:

  • @jkittner: could you check if the tests still work?
  • @andreazonato: perhaps you can check whether the produced outputs actually work in relevant versions? If this is considered needed?

@matthiasdemuzere added the enhancement label Mar 12, 2024
setup.cfg Outdated
@@ -18,6 +18,7 @@ classifiers =
[options]
packages = find:
install_requires =
bottleneck>=1.3.4
Collaborator

why do we need this? I don't think we're using that anywhere?

Owner Author

pandas was complaining that this version of bottleneck was required. Hence this addition.

Collaborator

Mhm, this seems to be only an optional dependency from pandas:
https://github.com/pandas-dev/pandas/blob/f5d754d4fcaefff9ff08167a55426f3afe88b175/pyproject.toml#L67

Did you try this in a clean environment? It works fine for me without it...

Owner Author

Ha, yes, you are right ... this can indeed be removed then!

tests/w2w_test.py (resolved)
Comment on lines +42 to +56
# WRF version dict
WRF_VERSIONS_DICT = {
    'v4.3': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
    'v4.3.1': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
    'v4.3.2': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
    'v4.3.3': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
    'v4.4': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
    'v4.4.1': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
    'v4.4.2': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61},
    'v4.5': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61},
    'v4.5.1': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61},
    'v4.5.2': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61},
}


Collaborator

let's have the dict in a single location and add new versions only here. This automatically populates the argument validation.

Owner Author

I missed this comment/question. What do you mean by this? You want to have this in a separate file? Or ...?
It is probably not a bad idea, since this dict probably needs updates whenever a new WRF version is released.
Even though the next WRF release will have W2W integrated into the WPS/WRF system itself ...

Collaborator

@jkittner Mar 13, 2024

Nah, just as a module-level constant as it currently is so in one place only. This way, changing things in that dict will automatically change e.g. the help texts, the valid choices etc.
This comment was basically just me explaining why I moved that.
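
A minimal sketch of the pattern described in this thread, assuming an `argparse`-based CLI. The `--wrf-version` flag name and the `build_parser` helper are hypothetical and not taken from the w2w code; the dict is an abbreviated copy of the one in this PR.

```python
import argparse

# Abbreviated copy of the module-level constant from this PR.
WRF_VERSIONS_DICT = {
    'v4.4.1': {'ADD_LCZ_INT': 30, 'NUM_LAND_CAT': 41},
    'v4.4.2': {'ADD_LCZ_INT': 50, 'NUM_LAND_CAT': 61},
}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: valid choices and help text both come from the dict."""
    parser = argparse.ArgumentParser(prog='w2w')
    parser.add_argument(
        '--wrf-version',  # hypothetical flag name
        required=True,
        choices=list(WRF_VERSIONS_DICT),
        help='WRF version, one of: ' + ', '.join(WRF_VERSIONS_DICT),
    )
    return parser


args = build_parser().parse_args(['--wrf-version', 'v4.4.2'])
params = WRF_VERSIONS_DICT[args.wrf_version]
```

Adding a new version to the dict then extends both the accepted choices and the help text without touching the parser.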

w2w/w2w.py (resolved)
w2w/w2w.py (resolved)
@andreazonato
Collaborator

Yes! Definitely makes sense. Thanks @matthiasdemuzere!

I'll try it tomorrow.

@matthiasdemuzere
Owner Author

Thanks for reviewing @jkittner. I see now that all tests pass. That is great.
I have now also updated the README file to reflect the wrf version as a required argument.

@jkittner
Collaborator

I have now also updated the README file to reflect the wrf version as a required argument.

oh of course - I always forget about docs 🤣

@dargueso
Collaborator

Also tested locally on my end with my custom data and v4.5.2, it works fine as well. Outputs look correct.

@dargueso
Collaborator

Hmmm the truth is that the new version doesn't calculate the NoUrban well, my guess is that it is looking for the previous categories. I'll need to fix that. It probably broke in previous versions, not this PR. I'll look at it.

@andreazonato
Collaborator

Hmmm the truth is that the new version doesn't calculate the NoUrban well, my guess is that it is looking for the previous categories. I'll need to fix that. It probably broke in previous versions, not this PR. I'll look at it.

Mmmm, what is the LU_INDEX you are having in this file?

@dargueso
Collaborator

Mmmm, what is the LU_INDEX you are having in this file?

So the attribute ISURBAN is still 13 despite the fact that urban areas are actually categories 50-60. This means that the code looks for LU_INDEX values that match ISURBAN and replaces them, but there are none, so the urban areas are still there with their original category in the range 50-60. I've got this fixed now and will commit to this PR once I've tested it.

However, the resulting file does not work with my setup, in which I have 3 domains: a larger one and then 2 domains that are identical except for the urban areas. The file with LCZ_params doesn't work either, and it looks like it's related to GREENFRAC.

@dargueso
Collaborator

I've also fixed the issue with the GREENFRAC. The problem was that the urban areas were also identified with ISURBAN to replace GREENFRAC. Now it will check the number of categories and select urban areas including LCZ categories too.
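
A rough sketch of the kind of check described here, not the code committed to this PR. The `urban_mask` helper is hypothetical, and it assumes that with 41 or 61 land categories the last 11 labels (31-41 or 51-61) are the LCZ classes.

```python
import numpy as np


def urban_mask(lu_index, isurban=13, num_land_cat=21):
    """Boolean mask of urban cells.

    Assumption: with 41 or 61 land categories, the last 11 labels
    (31-41 or 51-61) are LCZ classes and count as urban too.
    """
    lu_index = np.asarray(lu_index)
    mask = lu_index == isurban
    if num_land_cat in (41, 61):
        first_lcz = num_land_cat - 10
        mask |= (lu_index >= first_lcz) & (lu_index <= num_land_cat)
    return mask


lu = np.array([[13, 5, 51], [60, 17, 2]])
old_style = urban_mask(lu, num_land_cat=21)  # only ISURBAN cells
new_style = urban_mask(lu, num_land_cat=61)  # ISURBAN plus LCZ labels
```

The same mask could then be used for both LU_INDEX and GREENFRAC, so the two replacements cannot drift apart.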

@matthiasdemuzere
Owner Author

Hi @dargueso, thanks for your feedback on this.

Thus far I did not consider the fact that users might feed W2W a geo_em file that already contains LCZ labels, numbered from 50-61, and derived from the MODIS-CGLC-LCZ dataset.

This makes things even more confusing.

What I'd like to suggest:

  • When launching W2W, it checks for the existence of LCZ labels 50..61

  • if yes, relabel these pixels to 13 => continue the tool (removing urban, adding LCZs, adding parameters, ...)

  • if no, continue the tool (removing urban, adding LCZs, adding parameters, ...)

This starts to become a bit of a circular behaviour, but at the moment I do not see another solution?
I briefly checked this with @andreazonato as well, and he also thinks this is fine.

So, I can try to add this, and then hopefully you can test again (with a WRF version that understands labels 30..41 and labels 50..61)?

M.
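
The relabelling step proposed above could look roughly like the sketch below. The `relabel_lcz_to_urban` helper and its defaults are hypothetical; the label range and the urban value 13 are taken from the comment.

```python
import numpy as np


def relabel_lcz_to_urban(lu_index, lcz_min=50, lcz_max=61, isurban=13):
    """Return (relabelled copy, flag) where labels in [lcz_min, lcz_max] become `isurban`."""
    lu_index = np.asarray(lu_index)
    has_lcz = (lu_index >= lcz_min) & (lu_index <= lcz_max)
    return np.where(has_lcz, isurban, lu_index), bool(has_lcz.any())


relabelled, found = relabel_lcz_to_urban(np.array([1, 13, 51, 61, 17]))
```

The returned flag distinguishes the "if yes" and "if no" branches; in both cases the rest of the tool would continue on the relabelled array.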

@dargueso
Collaborator

dargueso commented Mar 18, 2024 via email

@matthiasdemuzere
Owner Author

I’ve done that already, it does something similar to what you suggest, except that those changes are implemented only where the code checks ISURBAN.

I see. And is ISURBAN == True the same as LU_INDEX isin [50...61]?
Note that I am not sure what ISURBAN does? Is it used elsewhere in WRF? The reason I am asking is that I don't think W2W changes this layer, after implementing the new LCZ-based extent?

If your code does more or less what I suggested, then feel free to add it to this branch. Then I can have a look! Thanks!

@dargueso
Collaborator

dargueso commented Mar 18, 2024 via email

@matthiasdemuzere
Owner Author

ISURBAN is an attribute of the netCDF file and it is a number (13), not a boolean. So if you check which grid points match ISURBAN in the new files with CGLC-MODIS, there will be none. I will add my changes to the branch.

@dargueso: hold your horses! I was just thinking about an alternative strategy ... see email for follow-up!

w2w/w2w.py Outdated
luf[dst_data.ISURBAN-1]=np.sum(luf.values[30:41,:],axis=0)
luf[30:41,:]=0


Owner Author

@matthiasdemuzere Mar 19, 2024

I wonder whether this should be formulated more generically? In the following sense:

For the built LCZ types, W2W considers by default LCZ labels 1 to 10, see here:

w2w/w2w/w2w.py

Line 80 in 84f7d80

parser.add_argument(

This information is then stored in Info.BUILT_LCZ.

But, a user might also define eg. LCZ 15 (=LCZ E / Bare rock or Paved) as built-up, by providing this as an argument to the tool.

I am not sure if:

  1. anyone has used this yet?
  2. the tool is actually fully functional when using this argument?

But, perhaps this option should be considered to some extent, by using:

  • luse<=41 and luse<=61?
  • and using :42 and :62, instead of :41 and :61?

Does this make sense?
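
For reference when weighing the slice bounds, here is a small toy check of what `[30:41]` selects. It assumes a LANDUSEF-like array whose first axis is the land category, with index = category - 1; the array and values are made up for illustration.

```python
import numpy as np

num_land_cat = 41
isurban = 13  # value of the ISURBAN attribute (1-based category)

# Toy LANDUSEF-like array: (category, y, x) with a single grid cell.
luf = np.zeros((num_land_cat, 1, 1))
luf[30, 0, 0] = 0.25  # category 31 (index 30)
luf[40, 0, 0] = 0.75  # category 41 (index 40)

# The slice end is exclusive, so 30:41 covers indices 30..40,
# i.e. categories 31..41. With 41 categories, 30:42 is clipped
# to the same rows.
luf[isurban - 1] = luf[30:41].sum(axis=0)
luf[30:41] = 0
```

Under that indexing assumption, `:41` already reaches category 41; whether `:42` is needed depends on how the real array is indexed.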

w2w/w2w.py Outdated
    urban_cat_list = [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, urban_cat]
else:
    urban_cat_list = [urban_cat]

Owner Author

see previous comment. Perhaps here the urban_cat_list can be defined depending on Info.LCZ_BUILT, that by default has [1 .. 10], but might also be [1 .. 11]?

    urban_cat_list = LCZ_URBAN + [urban_cat]
else:
    urban_cat_list = [urban_cat]

Owner Author

same comment here as above ...

Collaborator

Yes we can clean that a little bit. Perhaps using Info.LCZ_BUILT and dst_data_orig.NUM_LAND_CAT. But not sure how to deal with those options you mention: LCZ 11 and LCZ 15
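
One possible shape for that cleanup, as a sketch only. The `urban_categories` helper is hypothetical, and it assumes each built LCZ class maps to the WRF label `lcz + ADD_LCZ_INT`. How LCZ 15 should be mapped is left open here, as in the discussion.

```python
def urban_categories(built_lcz, add_lcz_int, urban_cat=13):
    """WRF land-use labels treated as urban for a given list of built LCZ classes."""
    return [lcz + add_lcz_int for lcz in built_lcz] + [urban_cat]


# Default built classes 1..10 with the pre-v4.4.2 offset of 30.
default = urban_categories(range(1, 11), add_lcz_int=30)
# Same, but with LCZ 11 also declared as built.
with_lcz_11 = urban_categories(range(1, 12), add_lcz_int=30)
```

With the default classes this reproduces the hard-coded list `[31, ..., 40, urban_cat]` from the diff.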

@matthiasdemuzere
Owner Author

matthiasdemuzere commented Apr 19, 2024

Hi @andreazonato, @dargueso,

Here a quick summary of what we discussed at EGU24, including some tasks:

The enhancement of W2W is still an intermediate solution, to make sure it still works with WRF versions >= v4.4.2.
Andrea has been working on integrating W2W completely into WPS/WRF itself, a development that will hopefully be available in the next WRF release.

Until that time, users can use W2W in three different ways, depending on WRF version:

Case A (WRF < v4.4.2)

  • natural land use is the default MODIS map
  • LCZ information provided by the LCZ .tif file
  • LCZ information used with labels 30-41
  • set use_wudapt_lcz =1, so that also the LCZ parameter table will be used.

Case B (WRF >= v4.4.2)

  • B.1: geog_data_res = “default” (WPS namelist)

    • natural land use is the default MODIS map
    • LCZ information provided by the LCZ .tif file
    • LCZ information used with labels 50-61
    • set use_wudapt_lcz =1, so that also the LCZ parameter table will be used.
  • B.2: geog_data_res = “cglc_modis_lcz+default” (WPS namelist)

    • natural land use is the new CGLC-MODIS-LCZ map
    • LCZ extent is initially provided by CGLC-MODIS-LCZ. By using W2W, this LCZ extent will be overwritten by the LCZ extent provided in the .tif file. The use of W2W is still required in this case, as the cglc_modis_lcz product only adds the LCZ extent, and not yet the LCZ-based parameters required for WRF.
    • LCZ information used with labels 50-61
    • set use_wudapt_lcz =1, so that also the LCZ parameter table will be used.
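
The version split between Case A and Case B can be expressed as a small comparison, sketched below. Both helper names are hypothetical; the threshold v4.4.2 and the offsets 30 and 50 come from the dict in this PR.

```python
def parse_version(tag):
    """'v4.4.2' -> (4, 4, 2); missing parts are padded with zeros."""
    parts = [int(p) for p in tag.lstrip('v').split('.')]
    return tuple(parts + [0] * (3 - len(parts)))


def lcz_label_offset(wrf_version):
    """Label offset per the cases above: 30 before v4.4.2, 50 from v4.4.2 on."""
    return 50 if parse_version(wrf_version) >= (4, 4, 2) else 30
```

Comparing integer tuples avoids the pitfalls of comparing version strings directly, for example a future 'v4.10' sorting before 'v4.4.2' as text.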

Required steps to finalize this development:

  • @dargueso check if code works for the three use cases above
  • @dargueso / @matthiasdemuzere: make sure that BUILT_LCZ argument (resulting in 10 or 11 classes) is used throughout the routines.
  • @andreazonato @dargueso @jkittner: anything missing/wrong in the above description of the use cases? I think the sentence "The use of W2W is still required in this case, as the cglc_modis_lcz product only adds the LCZ extent, and not yet the LCZ-based parameters required for WRF" is at the core of all of it, but I am not sure this is clear enough? Perhaps we should have a more extended paragraph to explain this?
  • @jkittner and @matthiasdemuzere: make sure tests align with new implementations
  • @matthiasdemuzere: document these changes, both in the repo README, and perhaps also in JOSS .pdf version?
  • @matthiasdemuzere: indicate in the documentation that use_wudapt_lcz must be set to 0 when using the rural-only file generated by w2w in versions 4.4.2 and above
  • @jkittner merge final changes in new major version and push to pypi

If something is not clear or missing, please report in this space!

@dargueso
Collaborator

I'm already testing that the code works for the three cases and that the generated files work with their respective WRF versions.
@andreazonato, I mentioned this before to you but it was never clear to me to what extent you're aware of the issue. If only a few LCZ are actually represented in the input file (geo_em.d01.nc), WRF crashes. This is regardless of where the geo_em file comes from and it crashes even if the file is directly generated with WPS. For example, I've tested a domain in the Sahara desert to make sure there are no cities and I get this error message:

Climatological albedo is used instead of table values
INITIALIZE THREE Noah LSM RELATED TABLES
Skipping over LUTYPE = USGS
 LANDUSE TYPE = MODIFIED_IGBP_MODIS_NOAH FOUND          20  CATEGORIES
 INPUT SOIL TEXTURE CLASSIFICATION = STAS
 SOIL TEXTURE CLASSIFICATION = STAS FOUND          19  CATEGORIES
-------------- FATAL CALLED ---------------
USING URBPARM_LCZ.TBL WITH OLD 3 URBAN CLASSES. SET USE_WUDAPT_LCZ=0
-------------------------------------------
Abort(1) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

It is obviously not the purpose of this repo or this discussion because it is a more general issue, but it affects w2w users when using the domains with urban areas removed. It is linked to this wrf-model/WRF#1878 and this #125

@andreazonato
Collaborator

Hi guys,

@matthiasdemuzere, thank you for summarizing. I do not think there are additional issues. Maybe we will still have to reply to and help W2W users until W2W is inside WRF (I guess by the end of the year, depending on the reviews of the paper).

@dargueso Yes, this issue does not depend on w2w. I discussed it with Cenlin He, and we did not come to a solution, since there is no variable where we could set a flag to decide whether to use LCZ or not. I will try to reach him again and see if we can find a way to solve it.

Thank you for testing

@matthiasdemuzere
Owner Author

I'm already testing that the code works for the three cases and that the generated files work with their respective WRF versions.

Thanks @dargueso. I hope all works?
When you are done, we can continue finalizing the last adjustments, including treatment of 11 BUILT_LCZs and appropriate tests.

Also, @andreazonato mentioned the existence of a new matrix that was added, that contains, per pixel, the fraction of each LCZ label from the sub-grid scale information? I believe this variable should be added as a dummy, no? Should this be added only for WRF >= v4.4.2? @andreazonato ?

@andreazonato
Collaborator

@matthiasdemuzere I think there is no need to add additional variables.
@dargueso is simply taking advantage of this variable (LANDUSEF, B.2: geog_data_res = “cglc_modis_lcz+default”) , to derive the percentage of each LCZ, and therefore compute the average. So it is simply something in the pre-processing and there is no need to bring it to the simulation step.
@dargueso, correct me if I'm wrong

@matthiasdemuzere
Owner Author

@matthiasdemuzere I think there is no need to add additional variables.
@dargueso is simply taking advantage of this variable (LANDUSEF, B.2: geog_data_res = “cglc_modis_lcz+default”) , to derive the percentage of each LCZ, and therefore compute the average. So it is simply something in the pre-processing and there is no need to bring it to the simulation step.
@dargueso, correct me if I'm wrong

@andreazonato, just to make sure I 100% understand.

At EGU you mentioned that, when users use “cglc_modis_lcz+default”, a matrix was added in geo_em, that contains the LCZ fractions for every pixel. Information that is then used within WRF to get the appropriate morphological parameters.

But this functionality will only become available when your new WRF version is released?

Until that time, this matrix variable is not needed / used?

I just want to avoid that, when users use their own LCZ map from the .tif, in reality LCZ information from "cglc_modis_lcz+default" is used within the model?

@dargueso
Collaborator

I think we're running in circles.

I don't use LANDUSEF (the matrix @matthiasdemuzere was referring to all along, which contains the percentage of each land use category in each grid cell). That matrix is in the original geo_em; it is not added by @andreazonato's WRF version, but it may be altered.

Now, if I ignore LANDUSEF in W2W as before, it will be defined in a way that makes WRF reset all LCZs to some rural land use. This is because @andreazonato's code, activated with use_wudapt_lcz, looks into this variable, whereas previous WRF versions didn't. I found this in v4.5.2, but it probably happens some versions back too.

My approach was thus to add info to the LANDUSEF variable based on the category assigned by W2W to each pixel. Hence, for LCZ pixels it is 1 or 0, and it is 1 for only one LCZ. There is no sub-grid information in LANDUSEF.
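
A sketch of what such a one-hot LANDUSEF could look like, assuming a (category, y, x) layout with index = category - 1. The `one_hot_landusef` helper is hypothetical, and for simplicity it rewrites every pixel, not only the LCZ ones.

```python
import numpy as np


def one_hot_landusef(lu_index, num_land_cat):
    """Build a (num_land_cat, y, x) array with 1 at each cell's own category."""
    lu_index = np.asarray(lu_index, dtype=int)
    categories = np.arange(1, num_land_cat + 1).reshape(-1, 1, 1)
    return (categories == lu_index[np.newaxis, :, :]).astype(np.float32)


lu = np.array([[51, 5], [13, 61]])
luf = one_hot_landusef(lu, num_land_cat=61)
```

Each grid cell then sums to 1 across categories, with all of the weight on the label that LU_INDEX assigns.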

As far as I can tell, this prevents the issue below:

I just want to avoid that, when users use their own LCZ map from the .tif, in reality LCZ information from "cglc_modis_lcz+default" is used within the model?

Is this clear?
I have tested the code for all three cases and adapted it to make it work. Now I need to test that WRF ingests the w2w data well in all cases and doesn't complain.

@matthiasdemuzere
Owner Author

I think we're running in circles.

Sorry, my fault. I just find it all too confusing.

But thanks for explaining further, I believe I understand now.
Let's hope WRF does not complain in each of these three cases!

I'll await that result before working further on this (tests, documentation, ...)

matthiasdemuzere and others added 25 commits September 15, 2024 23:08
Fixes to ingest geo_em files with LCZ labels from 31 to 40 or from 50 to 60.
Setting NUM_LAND_CAT to 21 in the NoUrb output, which has cities removed.
Some formatting too.
Change order of defining new landuse
@dargueso
Collaborator

I think I can check the Shanghai case this week. I also found geo_em files with 20 categories last month.

@matthiasdemuzere
Owner Author

okay, I rebased this against the main branch and resolved a few dozen conflicts.

There is just one thing left before this is ready to go:

The test test_main_shanghai_data is failing because it only provides 20 categories, so it raises:

Number of land categories 20 in original file not supported. Only 21, 41 or 61 are supported.

Can someone check why this is the case? Have the test files been wrong all along, or did we change/break behavior? Just to make sure: this did work before this patch (see main branch).
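
For discussion, a sketch of a check that produces this message. The `check_num_land_cat` helper and the constant name are hypothetical and not copied from w2w; the supported set is passed in so that alternatives can be tried without changing the default.

```python
SUPPORTED_NUM_LAND_CAT = (21, 41, 61)  # values named in the error message


def check_num_land_cat(num_land_cat, supported=SUPPORTED_NUM_LAND_CAT):
    """Return num_land_cat if supported, otherwise raise ValueError."""
    if num_land_cat not in supported:
        allowed = ', '.join(str(n) for n in supported[:-1]) + f' or {supported[-1]}'
        raise ValueError(
            f'Number of land categories {num_land_cat} in original file '
            f'not supported. Only {allowed} are supported.'
        )
    return num_land_cat
```

Calling `check_num_land_cat(20)` raises with the message above, while `check_num_land_cat(20, supported=(20, 21, 41, 61))` passes.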

I think then, finally, this would be ready to be merged and released to pypi.

After my rebase onto main, you will likely have to reset your branch using

git fetch --all
git reset --hard origin/add_wrf_version

otherwise you will have to resolve all these conflicts locally once again.

This is really nice @jkittner, well done!
And thanks also @dargueso for looking into the Shanghai number of land use classes.

Perhaps another thing that should be done: add more information in the README on the various WRF versions, and how w2w relates to the automated use of the hybrid land cover datasets that already includes LCZs?

Is this description still valid? @andreazonato @dargueso?

@andreazonato
Collaborator

Hi all,
I think that the problem of 20 categories is not linked to LCZ. It should be linked to the 21st category, which is LAKE. Maybe there are no lakes there, so 20 categories is still OK. Maybe we can avoid the failure by adding 20 to the allowed numbers?

Andrea

@dargueso
Collaborator

dargueso commented Sep 19, 2024 via email

Labels
enhancement New feature or request
7 participants