Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GA features update - Having multi-objective optimization (NSGA-II ) #2326

Open
wants to merge 121 commits into
base: devel
Choose a base branch
from

Conversation

JunyungKim
Copy link
Collaborator


Pull Request Description

What issue does this change request address? (Use "#" before the issue to link it, i.e., #42.)

#2069

What are the significant changes in functionality due to this change request?

For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code.
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.) - See on the wiki for details.
  • 4. Automated Tests should pass, including run_tests, pylint, manual building and xsd tests. If there are changes to Simulation.py or JobHandler.py the qsub tests must pass.
  • 5. If significant functionality is added, there must be tests added to check this. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added setting, in XML block, the node <internalParallel> to True.
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
  • 8. If an analytic test is changed/added is the the analytic documentation updated/added?
  • 9. If any test used as a basis for documentation examples (currently found in raven/tests/framework/user_guide and raven/docs/workshop) have been changed, the associated documentation must be reviewed and assured the text matches the example.

JunyungKim and others added 30 commits February 19, 2023 12:48
@Jimmy-INL Jimmy-INL requested a review from wangcj05 July 15, 2024 15:09
Copy link
Collaborator

@wangcj05 wangcj05 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed part of the changes, and I have some general comments for you to consider.

@@ -43,7 +43,7 @@ Note all install methods after "main" take
<scikit-learn>1.0</scikit-learn>
<pandas/>
<!-- Note most versions of xarray work, but some (such as 0.20) don't -->
<xarray>2023</xarray>
<xarray>2023.12</xarray>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to change to this specific version?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @joshua-cogliati-inl suggested this as it was causing diffs in the tests

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, you may check PR #2365 (review) for library changes and try to avoid conflicts.

dependencies.xml Outdated
Comment on lines 87 to 89
<setuptools/>
<!-- This is because liblapack 3.9.0 build 21 is broken (and can probably be removed if there ever is a build 22). This can also be removed when scipy is updated to version 1.12 -->
<liblapack skip_check='True' os='linux' machine='x86_64'>3.9.0=20_linux64_openblas</liblapack>
<liblapack skip_check='True' os='windows' machine='x86_64'>3.9.0=20_win64_mkl</liblapack>
<liblapack skip_check='True' os='mac' machine='x86_64'>3.9.0=20_osx64_openblas</liblapack>
<setuptools>69</setuptools> <!-- ray 2.6 can't be installed with setuptools 70 -->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems there are some diverges between this branch and devel. The changes are already in devel branch, but still show in this pull request. @Jimmy-INL Could you try to rebase your branch with devel branch?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Junyung's branch, I tried to fix the failing tests. I will check who will address it, but thanks a lot for the comments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we actually copied this from devel.

Comment on lines 153 to 154
self._canHandleMultiObjective = False # boolean indicator whether optimization is a sinlge-objective problem or a multi-objective problem

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line can be moved to base class with default of False, while set it to True in GA. This will also apply to GradientDescent, and Simulated Annealing class.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of whether the base class has this line with default of False, the other algorithms will eventually have this line. saying True (if they support multi-objective optimization). Plus, I think user can feel less confused if we place self._canHandleMultiObjective = True (or False) in every class of optimizer.

Comment on lines 266 to 269
if len(self._minMax) != len(self._objectiveVar):
self.raiseAnError(IOError, 'The length of <type> in <Optimizers>-<GeneticAlgorithm>-<SamplerInit> and <objective> in <Optimizers>-<GeneticAlgorithm> must be of the same length!')
if list(set(self._minMax)-set(['min','max'])) != []:
self.raiseAnError(IOError, "<type> under <Optimizers>-<GeneticAlgorhtm> must be a either 'min' and/or 'max'")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error messages in Optimizer class should not associated with Genetic Algorithm.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Comment on lines +312 to +313
else:
rlz[self._objectiveVar] *= -1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the self._objectiveVar will be a list, will the else condition happen in this case?

Copy link
Collaborator Author

@JunyungKim JunyungKim Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._objectiveVar is list only in GA. Other optimization algorithms do not treat this variable as a list. This can be simplified later once everyone defines self._objectiveVar as a list.

Comment on lines +305 to +313
if 'max' in self._minMax:
if not self._canHandleMultiObjective and len(self._objectiveVar) == 1:
rlz[self._objectiveVar[0]] *= -1
elif type(self._objectiveVar) == list:
for i in range(len(self._objectiveVar)):
if self._minMax[i] == 'max':
rlz[self._objectiveVar[i]] *= -1
else:
rlz[self._objectiveVar] *= -1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these lines can be simplified since self._objectiveVar will be a list instead, am I right here?

Copy link
Collaborator Author

@JunyungKim JunyungKim Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, same answer as above.

self._objectiveVar is list only in GA. Other optimization code does not treat this variable as a list. This can be simplified later once everyone defines self._objectiveVar as a list.

Comment on lines 734 to 744
if len(self._objectiveVar) == 1: # Single Objective Optimization
objValue = rlz[self._objectiveVar[0]]
if 'max' in self._minMax:
objValue *= -1
toExport[self._objectiveVar[0]] = objValue
else: # Multi Objective Optimization
for i in range(len(self._objectiveVar)):
objValue = rlz[self._objectiveVar[i]]
if self._minMax[i] == 'max':
objValue *= -1
toExport[self._objectiveVar[i]] = objValue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if ... else ... can be combined, since self._objectiveVar is a list now, and you do not need to treat single and multi-objective separately.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct! I removed if-else statement.


# 5.1 @ n-1: fitnessCalculation(rlz)
# perform fitness calculation for newly obtained children (rlz)
if not self._canHandleMultiObjective or len(self._objectiveVar) == 1: # This is for a single-objective Optimization case.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there reason why there is no else condition here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we don't need "if not self._canHandleMultiObjective or...... "
I removed entire line of 782.

@ In, g, xr.DataArray, the constraint evaluation function
@ In, info, dict, identifying information about the realization
"""
self.raiseADebug('*'*80)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this intentional?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed self.raiseADebug('*'*80)

coords={'chromosome':np.arange(np.shape(objVal)[0]),
'obj': self._objectiveVar})
if self._writeSteps == 'every':
print("### rlz.sizes['RAVEN_sample_ID'] = {}".format(rlz.sizes['RAVEN_sample_ID']))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert print into debug statements?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! :)

@@ -1030,10 +1594,14 @@ def _addToSolutionExport(self, traj, rlz, acceptable):
# meta variables
toAdd = {'age': 0 if self.popAge is None else self.popAge,
'batchId': self.batchId,
'fitness': rlz['fitness'],
# 'fitness': rlz['fitness'],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uncomment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fitness values are no need to be placed in toAdd anymore. I removed the comment.

@@ -42,7 +47,8 @@ def rouletteWheel(population,**kwargs):
"""
# Arguments
pop = population
fitness = kwargs['fitness']
fitness = np.array([item for sublist in datasetToDataArray(kwargs['fitness'], list(kwargs['fitness'].keys())).data for item in sublist])
# fitness = kwargs['fitness'].data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove comment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! :)

selectionProb = shiftedFitness/np.sum(shiftedFitness) # Share of the pie (rouletteWheel)
sumProb = selectionProb[counter]

while sumProb < roulettePointer :
while sumProb <= roulettePointer :
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this added to properly cover discrete cases?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

newFitness = xr.DataArray(newFitness,
dims=['chromosome'],
coords={'chromosome':np.arange(np.shape(newFitness)[0])})
# newFitness = newFitness.to_dataset(name = list(kwargs['fitness'].keys())[0])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! :)

@@ -375,15 +375,6 @@
[../]
[../]

[./GAwithEnsembleModelHDSMconvergence]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this test removed?

Copy link
Collaborator Author

@JunyungKim JunyungKim Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jimmy-INL I need your comments for this question.

Copy link
Collaborator

@wangcj05 wangcj05 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JunyungKim @Jimmy-INL I have finished my first part of review. I will to finish the rest in the afternoon. Please let me know if you have any questions regarding my comments.

@@ -43,7 +43,7 @@ Note all install methods after "main" take
<scikit-learn>1.0</scikit-learn>
<pandas/>
<!-- Note most versions of xarray work, but some (such as 0.20) don't -->
<xarray>2023</xarray>
<xarray>2023.12</xarray>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, you may check PR #2365 (review) for library changes and try to avoid conflicts.

Comment on lines +342 to +343
# if not self._canHandleMultiObjective and len(self._objectiveVar) == 1:
# self._objectiveVar = self._objectiveVar[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented lines

optVal = rlz[self._objectiveVar]
# if not self._canHandleMultiObjective and len(self._objectiveVar) == 1:
# self._objectiveVar = self._objectiveVar[0]
if len(self._objectiveVar) > 1 and type(self._objectiveVar)==list:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will self._objectiveVar have different type except list? if not, I think you do not need the second condition here.

descr=r"""Name of the response variable (or ``objective function'') that should be optimized
(minimized or maximized)."""))
descr=r"""Name of the objective variable(s) (or ``objective function'') that should be optimized
(minimized or maximized). It can be a single string or a list of strings if it is a multi-objective problem. """))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add something like: "Currently, only genetic algorithm supports multi-objective ...."

@@ -103,7 +103,8 @@ def getInputSpecification(cls):
descr=r"""seed for random number generation. Note that by default RAVEN uses an internal seed,
so this seed must be changed to observe changed behavior. \default{RAVEN-determined}""")
minMaxEnum = InputTypes.makeEnumType('MinMax', 'MinMaxType', ['min', 'max'])
minMax = InputData.parameterInputFactory('type', contentType=minMaxEnum,
minMaxList = InputTypes.StringListType()
minMax = InputData.parameterInputFactory('type', contentType=minMaxList,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also update the descr to reflect the list of min, max values from users' input.

<author>Mohammad Abdo</author>
<created>2024-02-18</created>
<classesTested/>
<description>NSGA-II min-max test </description>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add more descriptions.

<author>Junyung Kim</author>
<created>2022-12-21</created>
<classesTested/>
<description>NSGA-II min-min test </description>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add more descriptions

<author>Mohammad Abdo</author>
<created>2024-04-27</created>
<classesTested/>
<description>NSGA-II min-min test </description>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update the description


# @author: Mohammad Abdo (@Jimmy-INL)

def evaluate(Inputs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add some docstring

@moosebuild
Copy link

Job Precheck on 014dae4 : invalidated by @joshua-cogliati-inl

failed in fetch and branch

@moosebuild
Copy link

Job Precheck on 014dae4 : invalidated by @joshua-cogliati-inl

civet problems

Copy link
Collaborator

@wangcj05 wangcj05 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are my second part comments. Please let me know if you have any issues. In addition, Please try to make sure you can generate the manual for multi-objective optimization, and check the manual is complete. @Jimmy-INL @JunyungKim

# Fitness
if self._survivorSelectionType not in ['ageBased','fitnessBased','rankNcrowdingBased']:
self.raiseAnError(IOError, f'Currently constrained Genetic Algorithms only support ageBased, fitnessBased, and rankNcrowdingBased as a survivorSelector, whereas provided survivorSelector is {self._survivorSelectionType}')
if len(self._objectiveVar) == 1 and self._survivorSelectionType == 'rankNcrowdingBased':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the error message, should you use if len(self._objectiveVar) <= 2 instead

Copy link
Collaborator

@Jimmy-INL Jimmy-INL Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand, I thought it works with multiobjective cases, even with two objrctives. At least that's what it should do. I will test it further.

ravenframework/Optimizers/GeneticAlgorithm.py Show resolved Hide resolved
# TODO: @mandd, please explore the possibility to convert the logistic fitness into a constrained optimization fitness.
if 'Constraint' in self.assemblerObjects and self._fitnessType not in ['invLinear','feasibleFirst']:
self.raiseAnError(IOError, f'Currently constrained Genetic Algorithms only support invLinear and feasibleFirst fitnesses, whereas provided fitness is {self._fitnessType}')
if 'Constraint' in self.assemblerObjects and self._fitnessType not in ['invLinear','logistic', 'feasibleFirst']:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to use inputSpecs to check the fitnessType instead.

ravenframework/Optimizers/GeneticAlgorithm.py Outdated Show resolved Hide resolved
Comment on lines +1246 to +1265
if len(self._objectiveVar) == 1:
convs = {}
for conv in self._convergenceCriteria:
fName = conv[:1].upper() + conv[1:]
# get function from lookup
f = getattr(self, f'_checkConv{fName}')
# check convergence function
okay = f(traj, new=new, old=old)
# store and update
convs[conv] = okay
else:
convs = {}
for conv in self._convergenceCriteria:
fName = conv[:1].upper() + conv[1:]
# get function from lookup
f = getattr(self, f'_checkConv{fName}')
# check convergence function
okay = f(traj, new=new, old=old)
# store and update
convs[conv] = okay
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see the different between if and else statements here. If so, I think you need to revert the changes.

Comment on lines +1286 to +1287
if len(self._optPointHistory[traj]) < 2:
return False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combine these lines within the if statement and move them outside.

Comment on lines +1289 to +1291
obj1 = o1[self._objectiveVar[0]]
obj2 = o1[self._objectiveVar[1]]
converged = (obj1 == self._convergenceCriteria['objective'] and obj2 == self._convergenceCriteria['objective'])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines indicate the convergence only work for 2 objectives, I think you need to extend this part to handle multi-objectives optimizations with more than 2 objectives.

ravenframework/Optimizers/GeneticAlgorithm.py Show resolved Hide resolved
Comment on lines +1597 to +1598
'rank': 0 if ((type(self._objectiveVar) == list and len(self._objectiveVar) == 1) or type(self._objectiveVar) == str) else rlz['rank'],
'CD': 0 if ((type(self._objectiveVar) == list and len(self._objectiveVar) == 1) or type(self._objectiveVar) == str) else rlz['CD'],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try to simply these two lines, since the self._objectiveVar is a list.

@moosebuild
Copy link

Job Test Ubuntu 18 PIP on 730d7a4 : invalidated by @joshua-cogliati-inl

cleared cache

1 similar comment
@moosebuild
Copy link

Job Test Ubuntu 18 PIP on 730d7a4 : invalidated by @joshua-cogliati-inl

cleared cache

@moosebuild
Copy link

Job Test Ubuntu 18 PIP on 730d7a4 : invalidated by @joshua-cogliati-inl

monitoring disk usage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants