This repository has been archived by the owner on Jun 26, 2021. It is now read-only.

New metric calculation #172

Merged
merged 75 commits into master from new_metric_calculation on Sep 11, 2019
Changes from all commits
75 commits:
1909e31  Move Metric calculation outside closure (justusschock, Jul 31, 2019)
4c53103  PEP-8 Auto-Fix (Jul 31, 2019)
0e5d816  Fix Usage of Val metrics (justusschock, Jul 31, 2019)
89061b2  Fix chainer and sklearn backends (justusschock, Jul 31, 2019)
de6c600  Fix default argument for sklearn backend (justusschock, Jul 31, 2019)
4bd9034  PEP-8 Auto-Fix (Jul 31, 2019)
aeeb170  First attempt to introduce a per-iteration callback (justusschock, Aug 1, 2019)
4909285  PEP-8 Auto-Fix (Aug 1, 2019)
04264b0  Add logging frequency and reduction types to a (now stateful) logger (justusschock, Aug 1, 2019)
39eda7c  Add memory warning (justusschock, Aug 1, 2019)
2d58cdb  Merge remote-tracking branch 'origin/logging_extension' into logging_… (justusschock, Aug 1, 2019)
ac350d4  Add Frequencies and Reduce Types to `make_logger` (justusschock, Aug 1, 2019)
bb33097  Create Logging Callback (justusschock, Aug 1, 2019)
e2382ba  Add "val_" prefix to metrics during predictions (justusschock, Aug 1, 2019)
fa6a946  Update BaseNetworkTrainer to apply iteration-based logging in callbac… (justusschock, Aug 1, 2019)
a73640b  Add TODO (justusschock, Aug 1, 2019)
ec7d7c6  PEP-8 Auto-Fix (Aug 1, 2019)
68f472a  Merge branch 'master' into new_metric_calculation (justusschock, Aug 2, 2019)
1225042  Merge branch 'master' into logging_extension (justusschock, Aug 3, 2019)
4985d36  PEP-8 Auto-Fix (Aug 3, 2019)
84b9db0  Add missing arguments to all trainers (justusschock, Aug 3, 2019)
9c23906  Merge remote-tracking branch 'origin/logging_extension' into logging_… (justusschock, Aug 3, 2019)
c33d782  Move Metric calculation outside closure (justusschock, Jul 31, 2019)
639bb73  PEP-8 Auto-Fix (Jul 31, 2019)
5b2b0ed  Fix Usage of Val metrics (justusschock, Jul 31, 2019)
c1083d8  Fix chainer and sklearn backends (justusschock, Jul 31, 2019)
6b35d4b  Fix default argument for sklearn backend (justusschock, Jul 31, 2019)
2789493  PEP-8 Auto-Fix (Jul 31, 2019)
5bfc0f8  Merge origin/new_metric_calculation into local rebased version (justusschock, Aug 3, 2019)
50fd984  Minor Bugfixes (justusschock, Aug 3, 2019)
ee1f4ab  Check for reduce_type to be None and set it to default type (justusschock, Aug 3, 2019)
266f2ca  reduce via flatten and unflatten dict (justusschock, Aug 4, 2019)
0963b94  remove check for training callback (will be readded later in separate… (justusschock, Aug 5, 2019)
26a2e8d  Update correct init order (justusschock, Aug 5, 2019)
03073c7  Remove unnecessary statement (justusschock, Aug 5, 2019)
435c851  Fix argument names (justusschock, Aug 5, 2019)
cb07c57  PEP-8 Auto-Fix (Aug 5, 2019)
eb8333c  fix sklearn arguments (justusschock, Aug 6, 2019)
c69d805  Merge remote-tracking branch 'origin/new_metric_calculation' into new… (justusschock, Aug 6, 2019)
f561793  Add already_processed flag to predict function to have same arguments… (justusschock, Aug 6, 2019)
34241a0  Add metric_keys to kfold test (justusschock, Aug 7, 2019)
42e7640  Add explicit metric keys to experiment.test test (justusschock, Aug 7, 2019)
c5f3598  sklearn tests fixed (gedoensmax, Sep 2, 2019)
555e3a4  chainer added convert to numpy and fixes (gedoensmax, Sep 2, 2019)
68442ad  change default metric calculation (gedoensmax, Sep 2, 2019)
faebcfc  Merge branch 'master' into new_metric_calculation (gedoensmax, Sep 2, 2019)
94467ba  PEP-8 Auto-Fix (Sep 2, 2019)
47bebc1  test fixed after merging master to new_metric_calculation (gedoensmax, Sep 2, 2019)
17e69ec  Merge remote-tracking branch 'origin/new_metric_calculation' into new… (gedoensmax, Sep 2, 2019)
fe8762a  Attempt to fix slack tests (justusschock, Sep 3, 2019)
d178097  Add correct callbacks for setup (mibaumgartner, Sep 3, 2019)
7e7983b  Add tests for reductions and dict (un-)flatten, moved them to utils a… (justusschock, Sep 7, 2019)
9e6d62f  Change `batch_nr` to `iter_num` and add this as an explicit argument … (justusschock, Sep 7, 2019)
3043d9f  Add check for logging frequency (justusschock, Sep 7, 2019)
d189684  Move logging args to group them and don't use default val_score_key (justusschock, Sep 7, 2019)
6484d23  Raise TypeError instead of AssertionError on typechecks (justusschock, Sep 7, 2019)
ebfae53  Remove duplicated entry in docstring (justusschock, Sep 7, 2019)
045a054  more specific docstring for reduce_types (justusschock, Sep 7, 2019)
12ed869  docstring for `register_callback` (justusschock, Sep 7, 2019)
7a95fa2  Fix duplicate tests and adjust assertion message and docstrings (justusschock, Sep 7, 2019)
a25b87c  Add docstrings containing the arguments passed to callbacks (justusschock, Sep 7, 2019)
5dc9c07  Add tag to queue id (justusschock, Sep 7, 2019)
3416186  PEP-8 Auto-Fix (Sep 8, 2019)
933109d  Merge branch 'master' into new_metric_calculation (justusschock, Sep 10, 2019)
fa79a2c  Merge branch 'master' into new_metric_calculation (mibaumgartner, Sep 11, 2019)
64e3d96  Move callback args docstring from trainer to abstract callback (mibaumgartner, Sep 11, 2019)
25bcc85  PEP-8 Auto-Fix (Sep 11, 2019)
5b832a8  Add iter callback test, trainer iter callback fix, rename epoch, pred… (mibaumgartner, Sep 11, 2019)
ecd2ca3  Merge branch 'new_metric_calculation' of github.com:delira-dev/delira… (mibaumgartner, Sep 11, 2019)
98dfbdd  cleanup (mibaumgartner, Sep 11, 2019)
9764e24  Fix encoder test (mibaumgartner, Sep 11, 2019)
a5fbd50  PEP-8 Auto-Fix (Sep 11, 2019)
150ba3a  Add at_training_begin to torch trainer (mibaumgartner, Sep 11, 2019)
9f11a95  Merge branch 'new_metric_calculation' of github.com:delira-dev/delira… (mibaumgartner, Sep 11, 2019)
22d6540  Fix args for callbacks (mibaumgartner, Sep 11, 2019)
Files changed:
158 changes: 151 additions & 7 deletions delira/logging/base_logger.py
@@ -1,7 +1,10 @@
from multiprocessing import Queue, Event
from queue import Full
from delira.logging.base_backend import BaseBackend
from delira.utils.dict_reductions import get_reduction, possible_reductions, \
    reduce_dict
import logging
from types import FunctionType


class Logger(object):
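The newly imported dict-reduction helpers are what `log()` uses further down to collapse a list of buffered messages into one. A minimal, self-contained sketch of the intended semantics, with simplified stand-ins (delira's real helpers also handle nested dicts via flatten/unflatten, per the "reduce via flatten and unflatten dict" commit):

# Simplified stand-ins for get_reduction / reduce_dict, assuming scalar
# values; not delira's actual implementations.
def get_reduction_sketch(name):
    return {"last": lambda xs: xs[-1],
            "first": lambda xs: xs[0],
            "mean": lambda xs: sum(xs) / len(xs),
            "max": max,
            "min": min}[name]

def reduce_dict_sketch(dicts, reduce_fn):
    # reduce a list of dicts with identical keys to a single dict
    return {k: reduce_fn([d[k] for d in dicts]) for k in dicts[0]}

print(reduce_dict_sketch([{"loss": 1.0}, {"loss": 3.0}],
                         get_reduction_sketch("mean")))  # {'loss': 2.0}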
@@ -11,6 +14,7 @@ class Logger(object):
"""

def __init__(self, backend: BaseBackend, max_queue_size: int = None,
logging_frequencies=None, reduce_types=None,
level=logging.NOTSET):
"""

@@ -22,15 +26,104 @@ def __init__(self, backend: BaseBackend, max_queue_size: int = None,
            the maximum size for the queue; if the queue is full, all
            additional logging tasks will be dropped until some tasks inside
            the queue were executed; per default no maximum size is applied
        logging_frequencies : int or dict
            specifies how often to log for each key.
            If int: the integer will be applied to all valid keys.
            If dict: should contain a frequency per valid key. Missing keys
            will be filled with a frequency of 1 (log every time).
            None is equivalent to an empty dict here.
        reduce_types : str or FunctionType or dict
            Values are logged in each iteration. This argument specifies
            how to reduce them to a single value if a logging frequency
            other than 1 is passed.

            If str:
                specifies the reduction type to use. Valid types are
                'last' | 'first' | 'mean' | 'median' | 'max' | 'min'.
                The given type will be mapped to all valid keys.
            If FunctionType:
                specifies the actual reduction function. Will be applied to
                all keys.
            If dict:
                should contain pairs of valid logging keys and either str
                or FunctionType. Specifies the reduction per key. Missing
                keys will be filled with a default value of 'last'.
                Valid strings are
                'last' | 'first' | 'mean' | 'median' | 'max' | 'min'.
        level : int
            the logging level to use if passing the logging message to
            python's logging module because it is not appropriate for
            logging with the assigned logging backend

        Warnings
        --------
        Since the intermediate values between two logging steps are stored
        in memory to enable reduction, this can easily cause OOM errors
        (especially if the logged items are still on GPU).
        If this occurs, consider choosing a smaller value for the logging
        frequencies.

"""

# 0 means unlimited size, but None is more readable
if max_queue_size is None:
max_queue_size = 0

# convert to empty dict if None
if logging_frequencies is None:
logging_frequencies = {}

# if int: assign int to all possible keys
if isinstance(logging_frequencies, int):
logging_frequencies = {
k: logging_frequencies
for k in backend.KEYWORD_FN_MAPPING.keys()}
# if dict: update missing keys with 1 and make sure other values
# are ints
elif isinstance(logging_frequencies, dict):
for k in backend.KEYWORD_FN_MAPPING.keys():
if k not in logging_frequencies:
logging_frequencies[k] = 1
else:
logging_frequencies[k] = int(logging_frequencies[k])
else:
raise TypeError("Invalid Type for logging frequencies: %s"
% type(logging_frequencies).__name__)

# assign frequencies and create empty queues
self._logging_frequencies = logging_frequencies
self._logging_queues = {}

default_reduce_type = "last"
if reduce_types is None:
reduce_types = default_reduce_type

# map string and function to all valid keys
if isinstance(reduce_types, (str, FunctionType)):
reduce_types = {
k: reduce_types
for k in backend.KEYWORD_FN_MAPPING.keys()}

# should be dict by now!
if isinstance(reduce_types, dict):
# check all valid keys for occurences
for k in backend.KEYWORD_FN_MAPPING.keys():
# use default reduce type if necessary
if k not in reduce_types:
reduce_types[k] = default_reduce_type
# check it is either valid string or already function type
else:
if not isinstance(reduce_types, FunctionType):
assert reduce_types[k] in possible_reductions()
reduce_types[k] = str(reduce_types[k])
# map all strings to actual functions
if isinstance(reduce_types[k], str):
reduce_types[k] = get_reduction(reduce_types[k])

else:
raise TypeError("Invalid Type for logging reductions: %s"
% type(reduce_types).__name__)

self._reduce_types = reduce_types

self._abort_event = Event()
self._flush_queue = Queue(max_queue_size)
self._backend = backend
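To illustrate the normalization above, here is how the two arguments are broadcast. The keys "scalar" and "image" are hypothetical stand-ins for whatever the backend's KEYWORD_FN_MAPPING actually contains:

valid_keys = ("scalar", "image")  # hypothetical backend keys

# an int is broadcast to every valid key
logging_frequencies = 10
logging_frequencies = {k: logging_frequencies for k in valid_keys}
print(logging_frequencies)  # {'scalar': 10, 'image': 10}

# for dicts, missing keys fall back to the default reduction 'last'
reduce_types = {"scalar": "mean"}
for k in valid_keys:
    reduce_types.setdefault(k, "last")
print(reduce_types)  # {'scalar': 'mean', 'image': 'last'}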
@@ -66,14 +159,41 @@ def log(self, log_message: dict):
        # convert tuple to dict if necessary
        if isinstance(log_message, (tuple, list)):
            if len(log_message) == 2:
                log_message = (log_message,)
            log_message = dict(log_message)

        # try logging and drop item if queue is full
        try:
            # logging appropriate message with backend
            if isinstance(log_message, dict):
                # multiple logging instances at once possible with
                # different keys
                for k, v in log_message.items():
                    # append the tag if one is given, because otherwise we
                    # would enqueue same types but different tags in the
                    # same queue
                    if "tag" in v:
                        queue_key = k + "." + v["tag"]
                    else:
                        queue_key = k

                    # create queue if necessary
                    if queue_key not in self._logging_queues:
                        self._logging_queues[queue_key] = []

                    # append current message to queue
                    self._logging_queues[queue_key].append({k: v})
                    # check if logging should be executed
                    if (len(self._logging_queues[queue_key])
                            % self._logging_frequencies[k] == 0):
                        # reduce elements inside queue
                        reduce_message = reduce_dict(
                            self._logging_queues[queue_key],
                            self._reduce_types[k])
                        # flush reduced elements
                        self._flush_queue.put_nowait(reduce_message)
                        # empty queue
                        self._logging_queues[queue_key] = []
            else:
                # logging inappropriate message with python's logging
                logging.log(self._level, log_message)
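Boiled down to a single key, the queueing logic above buffers values per (key, tag) pair and flushes a single reduced value every `logging_frequencies[k]` iterations. A self-contained toy version of that behavior (a real Logger puts the reduced message into `self._flush_queue` instead of printing):

from statistics import mean

frequency = 3          # corresponds to logging_frequencies[k]
reduce_fn = mean       # corresponds to reduce_types[k]
queue = []

for step, value in enumerate([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]):
    queue.append(value)
    if len(queue) % frequency == 0:
        print("flush at step", step, "->", reduce_fn(queue))
        queue = []
# flush at step 2 -> 2.0
# flush at step 5 -> 5.0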
@@ -110,10 +230,12 @@ def close(self):
        the abort event

        """
        if hasattr(self, "_flush_queue"):
            self._flush_queue.close()
            self._flush_queue.join_thread()

        if hasattr(self, "_abort_event"):
            self._abort_event.set()

    def __del__(self):
        """
@@ -149,6 +271,7 @@ def log(self, log_message: dict):


def make_logger(backend: BaseBackend, max_queue_size: int = None,
                logging_frequencies=None, reduce_types=None,
                level=logging.NOTSET):
"""
Function to create a logger
@@ -159,6 +282,25 @@ def make_logger(backend: BaseBackend, max_queue_size: int = None,
        the logging backend
    max_queue_size : int
        the maximum queue size
    logging_frequencies : int or dict
        specifies how often to log for each key.
        If int: the integer will be applied to all valid keys.
        If dict: should contain a frequency per valid key. Missing keys
        will be filled with a frequency of 1 (log every time).
        None is equivalent to an empty dict here.
    reduce_types : str or FunctionType or dict
        If str:
            specifies the reduction type to use. Valid types are
            'last' | 'first' | 'mean' | 'median' | 'max' | 'min'.
            The given type will be mapped to all valid keys.
        If FunctionType:
            specifies the actual reduction function. Will be applied to
            all keys.
        If dict:
            should contain pairs of valid logging keys and either str
            or FunctionType. Specifies the reduction per key. Missing
            keys will be filled with a default value of 'last'.
            Valid strings are
            'last' | 'first' | 'mean' | 'median' | 'max' | 'min'.
    level : int
        the logging level for python's internal logging module

@@ -175,4 +317,6 @@ def make_logger(backend: BaseBackend, max_queue_size: int = None,

"""

return SingleThreadedLogger(backend, max_queue_size, level)
return SingleThreadedLogger(backend=backend, max_queue_size=max_queue_size,
logging_frequencies=logging_frequencies,
reduce_types=reduce_types, level=level)
15 changes: 5 additions & 10 deletions delira/models/abstract_network.py
@@ -51,8 +51,8 @@ def __call__(self, *args, **kwargs):

    @staticmethod
    @abc.abstractmethod
    def closure(model, data_dict: dict, optimizers: dict, losses: dict,
                iter_num: int, fold=0, **kwargs):
        """
        Function which handles prediction from batch, logging, loss calculation
        and optimizer step
@@ -67,17 +67,16 @@ def closure(model, data_dict: dict, optimizers: dict, losses=None,
            dictionary containing all optimizers to perform parameter update
        losses : dict
            Functions or classes to calculate losses
        iter_num : int
            the number of the current iteration in the current epoch;
            will be restarted at zero at the beginning of every epoch
        fold : int
            Current Fold in Crossvalidation (default: 0)
        kwargs : dict
            additional keyword arguments

        Returns
        -------
        dict
            Loss values (with same keys as input dict losses)
        dict
@@ -89,10 +88,6 @@ def closure(model, data_dict: dict, optimizers: dict, losses=None,
            If not overwritten by subclass

        """
        raise NotImplementedError()

    @staticmethod
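Since the abstract signature only defines the contract, a compact pure-Python sketch of a conforming closure may help. Every name and the toy model below are illustrative, not delira code:

import numpy as np

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

def toy_closure(model, data_dict, optimizers, losses, iter_num, fold=0,
                **kwargs):
    # predict and compute losses only; metrics now live outside the closure
    preds = model(data_dict["data"])
    loss_vals = {k: fn(preds["pred"], data_dict["label"])
                 for k, fn in losses.items()}
    # a real implementation would use `optimizers` to update parameters here
    return loss_vals, preds

model = lambda x: {"pred": 2.0 * x}
data = {"data": np.ones(4), "label": np.full(4, 2.0)}
print(toy_closure(model, data, optimizers={}, losses={"mse": mse},
                  iter_num=0))
# ({'mse': 0.0}, {'pred': array([2., 2., 2., 2.])})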
55 changes: 16 additions & 39 deletions delira/models/backends/chainer/abstract_network.py
@@ -109,8 +109,8 @@ def prepare_batch(batch: dict, input_device, output_device):
        return new_batch

    @staticmethod
    def closure(model, data_dict: dict, optimizers: dict, losses: dict,
                iter_num: int, fold=0, **kwargs):
        """
        default closure method to do a single training step;
        could be overwritten for more advanced models
@@ -127,17 +127,16 @@ def closure(model, data_dict: dict, optimizers: dict, losses={},
        losses : dict
            dict holding the losses to calculate the total loss for the
            backward pass
        iter_num : int
            the number of the current iteration in the current epoch;
            will be restarted at zero at the beginning of every epoch
        fold : int
            Current Fold in Crossvalidation (default: 0)
        **kwargs:
            additional keyword arguments

        Returns
        -------
        dict
            Loss values (with same keys as input dict losses)
@@ -149,41 +148,19 @@ def closure(model, data_dict: dict, optimizers: dict, losses={},
"Criterion dict cannot be emtpy, if optimizers are passed"

loss_vals = {}
metric_vals = {}
total_loss = 0

inputs = data_dict["data"]
preds = model(inputs)

if data_dict:

for key, crit_fn in losses.items():
_loss_val = crit_fn(preds["pred"], data_dict["label"])
loss_vals[key] = _loss_val.item()
total_loss += _loss_val

with chainer.using_config("train", False):
for key, metric_fn in metrics.items():
metric_vals[key] = metric_fn(
preds["pred"], data_dict["label"]).item()

if optimizers:
model.cleargrads()
total_loss.backward()
optimizers['default'].update()

else:

# add prefix "val" in validation mode
eval_loss_vals, eval_metrics_vals = {}, {}
for key in loss_vals.keys():
eval_loss_vals["val_" + str(key)] = loss_vals[key]

for key in metric_vals:
eval_metrics_vals["val_" + str(key)] = metric_vals[key]

loss_vals = eval_loss_vals
metric_vals = eval_metrics_vals

return metric_vals, loss_vals, {k: v.unchain()
for k, v in preds.items()}
for key, crit_fn in losses.items():
_loss_val = crit_fn(preds["pred"], data_dict["label"])
loss_vals[key] = _loss_val.item()
total_loss += _loss_val

model.cleargrads()
total_loss.backward()
optimizers['default'].update()
for k, v in preds.items():
v.unchain()
return loss_vals, preds
18 changes: 6 additions & 12 deletions delira/models/backends/sklearn/abstract_network.py
@@ -103,8 +103,8 @@ def prepare_batch(batch: dict, input_device, output_device):
        return new_batch

    @staticmethod
    def closure(model, data_dict: dict, optimizers: dict, losses: dict,
                iter_num: int, fold=0, **kwargs):
        """
        default closure method to do a single training step;
        could be overwritten for more advanced models
@@ -121,17 +121,16 @@ def closure(model, data_dict: dict, optimizers: dict, losses={},
        losses : dict
            dict holding the losses to calculate errors;
            ignored here, just passed for compatibility reasons
        iter_num : int
            the number of the current iteration in the current epoch;
            will be restarted at zero at the beginning of every epoch
        fold : int
            Current Fold in Crossvalidation (default: 0)
        **kwargs:
            additional keyword arguments

        Returns
        -------
        dict
            Loss values (with same keys as input dict losses; will always
            be empty here)
@@ -156,9 +155,4 @@ def closure(model, data_dict: dict, optimizers: dict, losses={},

preds = model(data_dict["X"])

metric_vals = {}

for key, metric_fn in metrics.items():
metric_vals[key] = metric_fn(preds["pred"], data_dict["y"])

return metric_vals, {}, preds
return {}, preds
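The net effect of this PR across all backends: closures now return only (losses, predictions), and metric calculation happens outside of them. A hedged, self-contained sketch of that post-closure step (illustrative only, not delira's actual trainer code):

import numpy as np

def accuracy(pred, target):
    return float(np.mean(pred.argmax(axis=-1) == target))

metrics = {"acc": accuracy}

# values a closure like the ones above could have returned
loss_vals = {"ce": 0.42}
preds = {"pred": np.array([[0.9, 0.1], [0.2, 0.8]])}
labels = np.array([0, 1])

metric_vals = {name: fn(preds["pred"], labels) for name, fn in metrics.items()}
print(metric_vals)  # {'acc': 1.0}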