Improving extrapolation in neural mathematical reasoning using number segmentation and invariant risk minimization.

bencottier/mlp-project

Improving Extrapolation in Neural Mathematical Reasoning

Abstract

  • Recent work applying neural networks to mathematical reasoning shows that “interpolation” data (with statistics inside the range of the training data) is much easier than “extrapolation” data (with statistics far outside the training range).
  • However, the extrapolation is mathematically straightforward, e.g. increasing the magnitude of numbers, and thus seems feasible to solve. This work aims to understand and address the performance gap.
  • We train a conventional Transformer architecture on worded mathematics problems from the DeepMind mathematics dataset, restricted to the modules that have a corresponding extrapolation test set.
  • Our baseline character-level model, based directly on prior work, achieves 44% exact-match accuracy on an interpolation set and 27% on extrapolation, a gap of 17 percentage points.
  • Simply by parsing words as whole units instead of characters, we improve this to 58% and 39% respectively, though the gap increases to 18 percentage points.
  • Adding Invariant Risk Minimisation (IRM), a method to improve generalisation, reduces the gap to 15 percentage points. However, accuracy worsens to 52% and 37% respectively. It remains uncertain whether IRM operated as intended. Since IRM is a very new approach, more work is needed to establish best practice and adapt it to this domain.
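
The difference between character-level and word-level parsing, and the exact-match metric quoted above, can be illustrated with a minimal sketch. It is not the repository's preprocessing or evaluation code: the function names and the example question are made up for illustration, and word-level parsing is assumed here to mean simple whitespace splitting.

```python
def char_tokens(text):
    """Character-level tokens: every character, including spaces, is one token."""
    return list(text)


def word_tokens(text):
    """Word-level tokens (assumed: whitespace splitting).

    A multi-digit number such as "345" stays a single unit. Punctuation
    attached to a word is not separated in this simple sketch.
    """
    return text.split()


def exact_match_accuracy(predictions, targets):
    """Fraction of predictions that equal their target string exactly.

    Surrounding whitespace is ignored; there is no partial credit.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(p.strip() == t.strip() for p, t in zip(predictions, targets))
    return correct / len(targets)


if __name__ == "__main__":
    question = "What is 12 + 345?"  # made-up example question
    print(char_tokens(question))
    print(word_tokens(question))
    print(exact_match_accuracy(["357", "10"], ["357", "11"]))
```

With whitespace splitting, a longer number still occupies one position in the input sequence, whereas at character level the sequence grows with every extra digit.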

Content

This repository consists mostly of:

  • Configuration files (config folder) and scripts (scripts folder) for use with the OpenNMT-Py API.
  • Scripts and notebooks for pre- and post-processing the data and results.

Linked repositories

OpenNMT-Py. Fork of OpenNMT-Py implementing invariant risk minimization via gradient accumulation.

OpenNMT_visual-py. Implements attention visualisation.
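
For readers unfamiliar with invariant risk minimization, the sketch below shows the general shape of the IRMv1 objective: each training environment contributes its risk plus a penalty equal to the squared gradient of that risk with respect to a dummy scale `w`, evaluated at `w = 1`. This is a toy version under stated assumptions (scalar predictions, squared loss, hypothetical function names and penalty weight) and is not the code in the OpenNMT-Py fork, which works with a Transformer and a different loss. How the fork defines its environments is not stated here.

```python
def risk_and_penalty(preds, targets):
    """Return (risk, penalty) for one environment under squared loss.

    risk    = mean((w * pred - target)^2) evaluated at the dummy scale w = 1
    penalty = (d risk / d w at w = 1)^2, the IRMv1 gradient penalty
    """
    n = len(preds)
    risk = sum((p - t) ** 2 for p, t in zip(preds, targets)) / n
    # d/dw mean((w*p - t)^2) = mean(2 * (w*p - t) * p); evaluate at w = 1.
    grad_w = sum(2.0 * (p - t) * p for p, t in zip(preds, targets)) / n
    return risk, grad_w ** 2


def irm_objective(environments, penalty_weight):
    """Accumulate risk + penalty_weight * penalty over all environments.

    `environments` is a list of (preds, targets) pairs, one per environment.
    Summing the per-environment terms before a single update mirrors the
    idea of accumulating gradients across environments.
    """
    total = 0.0
    for preds, targets in environments:
        risk, penalty = risk_and_penalty(preds, targets)
        total += risk + penalty_weight * penalty
    return total


if __name__ == "__main__":
    envs = [([1.0, 2.0], [1.0, 2.0]), ([2.0], [1.0])]
    print(irm_objective(envs, penalty_weight=0.5))
```

Because gradients are linear, summing the per-environment terms and differentiating once gives the same update as accumulating each environment's gradient before an optimiser step.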
