Update to b2mn parsing #23

eldond · 2024-02-02T21:05:12Z

Intended to close #21 (b2mn.dat parser only works in some cases; not flexible enough to cover range of options for setting up b2mn.dat).
- Add more samples (@anchal-physics maybe these need to be moved into dvc?)
  - Also, maybe we should make sure the samples I rounded up aren't all redundant
- Use all the samples in the test
- Port some nuances from the section of OMFIT-source/omfit/omfit_classes/omfit_solps.py that parses b2mn or b2ag.
  - This Python parser in OMFIT has been used more extensively and is more likely to handle variants of b2mn; maybe this change alone will solve the problem.

- Hopefully this handles a wider variety of b2mn instances

eldond · 2024-02-02T21:07:00Z

@sbdepascuale If you check out this branch within SOLPS2IMAS, you can test whether SOLPS2IMAS.read_b2mn_output() now works on your file. This can be done either by importing just SOLPS2IMAS and only calling read_b2mn_output(), or by doing the whole workflow test after updating.

If you have an existing Julia session open, you will have to restart Julia unless you loaded Revise (import Revise) before loading SOLPS2IMAS.

sbdepascuale · 2024-02-02T23:04:42Z

Can confirm that the demo.ipynb workflow correctly executes all calls with the KSTAR case under the b2mn_parse branch of SOLPS2IMAS. Furthermore, interferometer data extraction from "ids.interferometer" functions as expected.

anchal-physics · 2024-02-05T18:18:07Z

A few more checks need to be added to cover all possible forms of the file. I'm adding Jeremy's email instructions here for documentation.

Empty lines are ignored

Should always start with a section like this. Here each variable is optional (Maybe not label?). You might want to grab the label but the rest can be skipped for our purposes (jump to endphy)

label (lblmn: character60)

'F57: D+Ne'

*b2cmpa basic parameters

*b2cmpb boundary conditions

*b2cmpt transport coefficients

*cfsig (0) (1) (2) (3) (4) (5) (6) (7)

'-1' 4.0e-05 0.0e+00 0.0e+00 0.0e+00 0.0e+00 0.0e+00 0.0e+00 0.0e+00

*cfalf (0) (1) (2) (3) (4) (5) (6) (7)

'-1' 4.0e-05 0.0e+00 0.0e+00 0.0e+00 0.0e+00 0.0e+00 0.0e+00 0.0e+00

*cflim (0) (1) (2) (3) (4) (5) (6) (7)

'-1' 3.00E-01 6.00E-01 5.00E-01 0.00E+00 0.00E+00 0.00E+00 0.00E+00 0.00E+00

*endphy

So this section could also just be

label (lblmn, character60; free format)

'ITER_D+He+Ne_nodrift_FPO_100MW_(target=Be,gaspuff=top)_cNe=0.8%_Dpuff=1.85e23'

*b2cmpa basic parameters

*b2cmpb boundary conditions

*b2cmpt transport coefficients

*endphy

After this there are name-value pairs
Characters #, *, and ! indicate comments. These can be after name value pairs or if at the beginning of the line the whole line is skipped
Name value pairs are matched case insensitive but exact strings. These have at least two valid representations:
    Value and name in single quotes
    'b2mwti_jxa'   '36' # optional comment
    Name in single quotes but value is not
    'b2mwti_jxa'   36 # optional comment
    I think double quotes work here as well. The comment section can also contain quotation marks
Due to exact string matching, be aware of “commented out” variables that could look like this.
    '#b2mwti_jxa'   36 # Variable unused
    '!b2mwti_jxa'   36 # Variable unused
If a name value pair is repeated the last one is used
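
The rules above can be sketched in Julia roughly as follows. This is only an illustration under stated assumptions, not the parser in src/parser.jl: the names strip_comment and parse_b2mn_pairs are hypothetical, values are left as strings, and a line that does not look like a name-value pair raises an error instead of being skipped.

```julia
# Hypothetical helpers illustrating the rules above; not the SOLPS2IMAS API.

"Drop everything from the first #, * or ! that is not inside quotes."
function strip_comment(line::AbstractString)
    quote_char = nothing  # the quote character currently open, if any
    for (i, c) in pairs(line)
        if quote_char === nothing
            if c == '\'' || c == '"'
                quote_char = c
            elseif c in ('#', '*', '!')
                return line[1:prevind(line, i)]
            end
        elseif c == quote_char
            quote_char = nothing
        end
    end
    return line
end

"Return a Dict of lowercase name => raw value string for the text of a b2mn.dat file."
function parse_b2mn_pairs(text::AbstractString)
    result = Dict{String,String}()
    in_body = false
    for raw in split(text, '\n')
        line = strip(raw)
        if !in_body
            # Everything up to and including *endphy is skipped.
            in_body = startswith(lowercase(line), "*endphy")
            continue
        end
        line = strip(strip_comment(line))
        isempty(line) && continue
        # Quoted name, then either a quoted value or a bare token.
        m = match(r"^(['\"])(.*?)\1\s+(?:(['\"])(.*?)\3|(\S+))$", line)
        m === nothing && error("cannot parse line: $line")
        value = something(m.captures[4], m.captures[5])
        # Keys are lowercased; a repeated key overwrites the earlier entry.
        result[lowercase(m.captures[2])] = String(value)
    end
    return result
end
```

Because comment characters inside quotes are kept, a "commented out" name such as '#b2mwti_jxa' ends up under its own key and never matches an exact lookup of b2mwti_jxa.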

I'll edit some sample files to incorporate all these test cases and test. I think we just need to add some more checking for special characters and also convert all keys to lowercase before adding them.

anchal-physics · 2024-02-05T18:28:30Z

Found another bug in the current implementation. It distinguishes between Float64 and Int data types by checking whether there is a '.' in the value field, but many value fields are written in '1e-10' format, so they get parsed as strings. I think we should not use hidden error catching and should instead have designated code for all cases we can think of, so that the parser fails when it is not parsing correctly.
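
A small sketch of the failure mode, assuming values arrive as plain strings: a '.' check misses '1e-10', while Base's tryparse returns nothing on failure without any try/catch. The helper name classify_value is hypothetical, and it only looks at the lexical form of the token.

```julia
# Hypothetical helper: classify a value token by its lexical form only.
function classify_value(token::AbstractString)
    as_int = tryparse(Int, token)
    as_int !== nothing && return as_int
    as_float = tryparse(Float64, token)
    as_float !== nothing && return as_float
    return String(token)  # neither an integer nor a float: keep as text
end

# The '.' test misclassifies exponent notation that has no decimal point:
has_dot = occursin('.', "1e-10")    # false, although the value is a float
value = classify_value("1e-10")     # parsed as a Float64
```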

eldond · 2024-02-05T19:38:36Z

Oh, oops. All the samples I saw were like 1.0e-10. But if this file is human writable (and it is), then I guess 1e-10 could happen, too.

jlore · 2024-02-05T20:22:27Z

The type will depend on the original Fortran, you cannot necessarily tell from the format of the value. Same goes for booleans and ints.

Removed:
    deleted: samples/b2mn.dat.sample_50dn
    deleted: samples/b2mn.dat.sample_50xd
    deleted: samples/b2mn.dat.sample_si1
    deleted: samples/b2mn.dat.sample_si2
Added:
    new file: samples/b2mn.dat.json.dvc
    new file: samples/test_b2mn.dat.dvc
    new file: samples/test_b2mn.dat.json.dvc
Adding a json file of correctly parsed version of b2mn.dat file that is already in the samples directory. Also added a new test_b2mn.dat file that is manually edited to cover all edge cases of the parser and a json file version of the same that has been correctly parsed (checked manually).

Updated parser so that it reads b2mn.dat file with the following rules:
* Strip all leading and trailing white spaces (including tabs) in a line
* Ignore all lines until *endphy is found
* Characters #, *, and ! indicate comments. These can be after name value pairs or if at the beginning of the line the whole line is skipped
* Name value pairs are matched case insensitive but exact strings. These have at least two valid representations:
    * Value and name in single quotes
        * 'b2mwti_jxa' '36' # optional comment
    * Name in single quotes but value is not
        * 'b2mwti_jxa' 36 # optional comment
    * Double quotes work here as well.
    * The comment section can also contain quotation marks
* Due to exact string matching, be aware of “commented out” variables that could look like this.
    * '#b2mwti_jxa' 36 # Variable unused
    * '!b2mwti_jxa' 36 # Variable unused
* If a name value pair is repeated the last one is used
* Read a value containing '.' or 'e' as a float and the rest as an integer.

eldond · 2024-02-06T01:56:38Z

src/parser.jl

+                end
+                # Get key and value in lowercase
+                key = lowercase(name_value[1])
+                value = lowercase(name_value[2])


Does this file have arrays in it? The python version seems to be expecting to get multiple values sometimes, but it also is supposed to work for two of the SOLPS files. Maybe b2mn doesn't have multiple values; maybe that's the other file.

anchal-physics (reply, 2024-02-06): I'm not sure. @jlore didn't mention any possibility of arrays in these fields, but we can expand it if that is a possibility.

anchal-physics · 2024-02-06T02:23:31Z

@jlore Currently, the code reads anything with a decimal point or 'e' in it as a float and everything else as an int. Is there a possibility that this might cause an issue later? Do we have a list of all the fields that Fortran keeps as float and all that Fortran keeps as int (maybe indices)? Or is there a nomenclature that we can use to determine this information? The Julia functions are all type specific, and using floats and ints interchangeably will cause errors in the future. However, right now, solps2imas only uses b2mn.dat to read the inner and outer midplane indices defined in 'b2mwti_jxi' and 'b2mwti_jxa'.

jlore · 2024-02-06T13:24:07Z

@anchal-physics

There is no requirement in SOLPS that a REAL type be initialized using 'e' or a decimal point, so I don't think you should make that assumption here. Is Julia strictly typed in that you have to know the Fortran type? We can, of course, get the type from the Fortran source but I'm not sure why there is a need.

Also, I don't think it is required to read "all" of the variables from b2mn.dat for our project. Why not just look for the small number of variables we will actually use (for which I can tell you the Fortran type) and ignore the rest? There are probably at least 50 variables that are not present in any of the b2mn.dat examples you are using, and these often change with the code base. Also, SOLPS ignores variable names that it doesn't use, which is important so that an obsolete variable definition does not cause the code to stop.

Suggest that you just make a list of the variables that are actually needed for the project and we make a robust parser that works for these. I can then also provide you the default values (and/or logic for computing the defaults) for cases where they are not present.

eldond · 2024-02-06T17:10:45Z

I just think it's awkward to parse number of steps (for example) as a float when it's clearly an int. Maybe we can parse all numbers as float64 by default and keep a list of things to force to int?

The python implementation was intended to preserve type more carefully because it was used to read as well as write, and writing 3.5 steps seems like it wouldn't do anything good.

jlore · 2024-02-06T17:12:58Z

Agreed. Suggest specific treatment of a limited number of variables. If you have a list of the ones you use then I can confirm the original type.

anchal-physics · 2024-02-06T19:03:51Z

From the discussion, we have these two possibilities:

Whitelisting solution

Our code, specifically the SOLPS2IMAS.solps2imas() function, only uses b2mwti_jxa and b2mwti_jxi, which I know must be integers as they are indices. I'm not sure if default values can be given for these indices.
But we thought that if we limit the parser to reading only these two, it will not be useful to someone who wants to pick up some other field from b2mn while using our code to do something different. That's why we were looking for a more general solution.

Handle types in code where values are used

Since Julia is type specific, we should write Julia code to force cast into integer or float whenever we use the read values. This is actually the case already in solps2imas(): https://github.com/ProjectTorreyPines/SOLPS2IMAS.jl/blob/37b04d7ca0eaba0251f3029590152808dbee8d35/src/SOLPS2IMAS.jl#L99
If we use this solution, we just have to be careful when we write code in future using values from b2mn.dat. In this case, we can merge the current parser branch as it is.
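
As a rough illustration of this second option (not the code at the linked line), casting at the point of use might look like the sketch below. The dictionary contents are made-up numbers standing in for whatever a lenient parser returned.

```julia
# Made-up values as a lenient parser might return them (one float, one int).
b2mn = Dict("b2mwti_jxa" => 36.0, "b2mwti_jxi" => 12)

# Int(x) converts exactly and throws InexactError for a non-integral float,
# so a bad index fails loudly instead of being rounded silently.
jxa = Int(b2mn["b2mwti_jxa"])
jxi = Int(b2mn["b2mwti_jxi"])
```

The trade-off is that every caller has to remember the cast; nothing in the parser enforces it.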

Let me know what you all prefer.

jlore · 2024-02-06T19:09:01Z

I can certainly provide the logic for default values of jxa and jxi. It depends on the grid topology, but it is straightforward

eldond · 2024-02-07T21:24:47Z

@jlore Yes, please. It's valuable to have a way to set defaults for anything that's used as much as jxa and jxi. Beyond that, I think the logic should be:
Try to parse as float. Fail? Parse as string and record so value can be inspected. Okay? Check list of things that are numbers but not floats and change type. Otherwise leave as float. We don't have to list all the integers in the file, just the ones that index the mesh and stuff like that where using a float as an index would mess things up. And yes, the client could force our float into an int if they need to downstream from the file parsing, but why not get some of it right if we can?

List of things to force to non-float if possible:

b2mwti_jxa: Int
b2mwti_jxi: Int
b2mndr_ntim: Int

List of things to populate using default values if they're missing:

b2mwti_jxa: <formula needed>
b2mwti_jxi: <formula needed>
b2mndr_ntim: nothing (make it obvious that this is an important value but that we didn't get it)
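
The float-first logic proposed above could be sketched as follows. The three names in FORCE_INT come from the list above; the function coerce_value and its error behaviour are assumptions for illustration, not what was eventually merged.

```julia
# Field names from the list above; the helper itself is hypothetical.
const FORCE_INT = Set(["b2mwti_jxa", "b2mwti_jxi", "b2mndr_ntim"])

function coerce_value(key::AbstractString, token::AbstractString)
    number = tryparse(Float64, token)
    # Not a number: keep the text so the value can be inspected later.
    number === nothing && return String(token)
    if key in FORCE_INT
        # Refuse to truncate: 3.5 steps or a fractional index is an error.
        isinteger(number) || error("$key should be an integer, got $token")
        return Int(number)
    end
    return number  # everything else stays a Float64
end
```

Only fields that index the mesh or count steps need to appear in the set; any other number is left as a float for the client to convert downstream if needed.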

Created a new file in src/b2mn_int_fields.txt where names of all known int fields are listed. This file is used to check if a field value should be converted to integer. Also, if there is a second value listed in this file in a row, that field is considered to be required and a default value is used if it is not found in the input file. Updated sample json files with additional int fields that get added by default.
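
A minimal sketch of how such a field list might be consumed. The row layout assumed here (a field name, optionally followed by a default value, separated by whitespace) is a guess for illustration and may not match the actual src/b2mn_int_fields.txt; read_int_fields and apply_int_fields! are hypothetical names.

```julia
# Assumed row layout: "<field name> [default]". Not verified against the real file.
function read_int_fields(io::IO)
    defaults = Dict{String,Union{Int,Nothing}}()
    for row in eachline(io)
        parts = split(row)
        isempty(parts) && continue
        # A second column marks the field as required and gives its default.
        defaults[lowercase(parts[1])] = length(parts) > 1 ? parse(Int, parts[2]) : nothing
    end
    return defaults
end

function apply_int_fields!(values::Dict{String,Any}, defaults)
    for (name, default) in defaults
        if haskey(values, name)
            values[name] = Int(values[name])   # convert a known int field
        elseif default !== nothing
            values[name] = default             # fill in a missing required field
        end
    end
    return values
end
```

Keeping the list in a data file rather than in code means new integer fields can be added without touching the parser.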

anchal-physics · 2024-02-08T20:02:23Z

I used the useful comments in b2mn.dat present in ITER_Lore_2296_00000/run_time_dep_EIRENE_jdl_to_ss_cont_sine2_2d_output to create a list of all integer fields and default values wherever they were listed in the comments. Please see b2mn_int_fields.txt to check the list.

jlore · 2024-02-08T20:27:59Z

Ah good, I'm glad my comments were useful!

anchal-physics · 2024-02-09T23:02:12Z

I realized that since @eldond opened this PR originally, and I pushed commits after that, I can't set him as a reviewer for this PR. Thus, I've set @dautt-silva as a reviewer. Please complete the review and mark approve if it looks good.

eldond · 2024-02-13T01:16:44Z

Good work.

jlore · 2024-02-15T17:04:14Z

Here is some logic for default values of jxa, jxi.

First you need to identify the topology. Use nncut and nnreg from b2fgmtry
if (Geo.nncut == 1) && (Geo.nnreg(1) == 4) % SN case
    Geo.geometry = 'SN';
elseif (Geo.nncut == 2) && (Geo.nnreg(1) == 8) && (Geo.nnreg(2) == 13) % DDN case
    Geo.geometry = 'DDN';
elseif (Geo.nncut == 2) && (Geo.nnreg(1) == 8) && (Geo.nnreg(2) == 12) % CDN case
    Geo.geometry = 'CDN';
end

Then you can guess jxa, jxi. These guesses can be pretty far off depending on the poloidal spacing, but these are the code defaults. Use leftcut and rightcut from b2fgmtry.

if strcmp(Geo.geometry,'SN')
    Geo.jxa_guess = ceil(Geo.rightcut(1) - (Geo.rightcut(1) - Geo.leftcut(1))/4);
    Geo.jxi_guess = floor(Geo.leftcut(1) + (Geo.rightcut(1) - Geo.leftcut(1))/4);
elseif strcmp(Geo.geometry,'DDN') || strcmp(Geo.geometry,'CDN')
    Geo.jxa_guess = ceil((Geo.rightcut(1) + Geo.rightcut(2))/2);
    Geo.jxi_guess = floor((Geo.leftcut(1) + Geo.leftcut(2))/2);
end
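
For reference, a direct Julia port of the logic above might look like this. It is a sketch only: the function names are hypothetical, nncut, nnreg, leftcut and rightcut are assumed to have been read from b2fgmtry already, the arithmetic is copied as written without any index-offset adjustment, and the error for an unrecognized topology is an addition not present in the original.

```julia
# Hypothetical port of the MATLAB-style logic above.
function guess_topology(nncut::Integer, nnreg::AbstractVector{<:Integer})
    if nncut == 1 && nnreg[1] == 4
        return :SN
    elseif nncut == 2 && nnreg[1] == 8 && nnreg[2] == 13
        return :DDN
    elseif nncut == 2 && nnreg[1] == 8 && nnreg[2] == 12
        return :CDN
    end
    error("unrecognized topology: nncut=$nncut, nnreg=$nnreg")
end

"Return the (jxa, jxi) guesses for a given topology."
function guess_jxa_jxi(topology::Symbol, leftcut, rightcut)
    if topology == :SN
        quarter = (rightcut[1] - leftcut[1]) / 4
        return ceil(Int, rightcut[1] - quarter), floor(Int, leftcut[1] + quarter)
    elseif topology in (:DDN, :CDN)
        return ceil(Int, (rightcut[1] + rightcut[2]) / 2),
               floor(Int, (leftcut[1] + leftcut[2]) / 2)
    end
    error("no default for topology $topology")
end
```

With made-up cuts leftcut = [24] and rightcut = [72] in a single-null case, the guesses come out as jxa = 60 and jxi = 36.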

eldond · 2024-02-15T19:12:41Z

Since this change to b2mn parsing was already merged, I opened #26 to keep track of the proposed method for guessing jxa and jxi.

eldond added 3 commits February 2, 2024 12:03

Add more b2mn sample files

c6251e2

- #21

Add file parsing test set to test the new group of b2mn samples

c7fd8c0

Adapt OMFIT's function for parsing b2mn

23346ab

- Hopefully this handles a wider variety of b2mn instances

eldond requested a review from anchal-physics February 2, 2024 21:05

eldond self-assigned this Feb 2, 2024

eldond assigned anchal-physics Feb 5, 2024

anchal-physics added 2 commits February 5, 2024 13:08

eldond commented Feb 6, 2024


anchal-physics assigned anchal-physics and unassigned eldond and anchal-physics Feb 9, 2024

anchal-physics requested a review from dautt-silva February 9, 2024 23:00

eldond merged commit b976a3e into dev Feb 13, 2024
1 check passed

eldond deleted the b2mn_parse branch February 13, 2024 01:16

eldond mentioned this pull request Feb 15, 2024

Add guesses for jxa and jxi to b2mn parsing #26

Open

eldond assigned anchal-physics Feb 22, 2024

anchal-physics mentioned this pull request Mar 15, 2024

Adapt to b2mn.dat parser to handle irregular files, with specific flags to each run ProjectTorreyPines/SOLPS2ctrl.jl#36

Closed
