
Exponential complexity? #3

Open

andriusstank opened this issue Jul 20, 2021 · 8 comments

@andriusstank

Simple example with fibonacci numbers:

n :: Int
n = 50

fibN :: Num a => a -> a
fibN x0 = fibs !! n
  where 
    fibs = 0 : x0 : zipWith (+) fibs (tail fibs)

fibInt :: Integer  
fibInt = fibN 1

fibAD :: (Integer, Integer)
fibAD = rad1g 0 1 fibN 1

fibInt evaluates instantly. fibAD takes a long time and then throws a stack overflow exception.
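For reference, a minimal reproducer (a sketch; it assumes rad1g is in scope as used above, which depending on the ad-delcont version may require importing an internal module):

main :: IO ()
main = do
  print fibInt   -- the 50th Fibonacci number, printed immediately
  print fibAD    -- takes a very long time and eventually overflows the stack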

@ocramz
Owner

ocramz commented Jul 20, 2021 via email

@ocramz
Owner

ocramz commented Jul 20, 2021

I don't know what to make of this though. On one hand, this example is not what I'd call a typical application of AD. But you're right, there's certainly a fundamental limitation in the code somewhere. I hope someone with more expertise in debugging laziness issues can shed some light on this.

@turion

turion commented Jul 23, 2021

Yeah, there is some potential for laziness to build up in some places, e.g. the Num instance (+) = op2Num $ \x y -> (x + y, id, id). If you have a lot of additions you might end up with fibInt = fibN 1 = 12586269025 (with n = 50) as a thunk 1 + 1 + 1 + ..., which can easily blow up memory.
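As a standalone illustration of that kind of thunk build-up (plain Haskell, nothing from ad-delcont): a lazy left fold over (+) keeps the whole ((0 + 1) + 1) + ... chain unevaluated until the result is demanded, while the strict variant forces each partial sum.

import Data.List (foldl')

lazySum, strictSum :: Integer
lazySum   = foldl  (+) 0 (replicate 1000000 1)  -- builds a million-deep thunk chain
strictSum = foldl' (+) 0 (replicate 1000000 1)  -- forces each partial sum, constant space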

@turion

turion commented Jul 23, 2021

Just replacing it by (+) = op2Num $ \x y -> let sum = x + y in sum `seq` (sum, id, id) doesn't fix it yet, though. But since thunks of function compositions build up in the other elements of the tuple, and those can't easily be normalised, I'd guess it's hard to prevent a space leak there.
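The second half of that can also be seen in isolation: even with the primal values forced, a chain of composed functions is itself a nest of closures that only collapses when it is finally applied (again plain Haskell, just to illustrate the shape of the problem).

-- A million composed functions; nothing forces the intermediate
-- compositions, so applying the result walks a deep closure chain.
composedIds :: Double -> Double
composedIds = foldr (.) id (replicate 1000000 id)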

@ocramz
Owner

ocramz commented Jul 23, 2021

Thank you @turion. It really sounds like a radical redesign is in order. In its current version, ad-delcont is mostly a straightforward port of the implementation from the Wang article, which they were able to pull off because Scala is eagerly evaluated (IIRC).

@andriusstank
Author

A somewhat simpler example:

forkJoin :: Fractional a => a -> a
forkJoin x = (x+x)/2

forkJoinChain :: Fractional a => a -> a
forkJoinChain x = iterate forkJoin x !! 20

forkJoinChainAD :: Float -> (Float, Float)
forkJoinChainAD x = rad1g 0 1 forkJoinChain x

I don't think laziness is the problem here. At worst it may make memory consumption proportional to the amount of computation, which shouldn't be an issue with only tens of variables.

My guess would be that this piece of code causes a combinatorial explosion:

op2_ zeroa plusa plusb f ioa iob = do
  ra <- ioa
  rb <- iob

ioa and iob are computations that are evaluated every time they are needed. Each of them has two parents, each of which in turn also has two parents, and so on.
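Here is a toy model of that effect outside the library (the names op2', forkJoin' and the IORef counter are made up for illustration; a node is just an IO action that re-runs its parents every time it is demanded):

import Data.IORef

type Node = IO Double

op2' :: (Double -> Double -> Double) -> Node -> Node -> Node
op2' f a b = f <$> a <*> b        -- running this node runs both parents again

forkJoin' :: Node -> Node
forkJoin' x = op2' (\u v -> (u + v) / 2) x x   -- the same parent appears twice

main :: IO ()
main = do
  runs <- newIORef (0 :: Int)
  let leaf = modifyIORef' runs (+ 1) >> pure 1   -- count evaluations of the input
  r <- iterate forkJoin' leaf !! 20
  n <- readIORef runs
  putStrLn $ "result = " ++ show r ++ ", input evaluated " ++ show n ++ " times"
  -- prints 2^20 = 1048576 evaluations for a 20-step chain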

Maybe the AD variable should also store the value and an STRef for the gradient along with the ContT computation, plus some flag to avoid repeated evaluations. Like laziness, but manual.
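A rough sketch of that idea (a hypothetical helper, not the library's actual API): cache the node's result the first time it is demanded, so a parent shared by several children only runs once.

import Data.IORef

-- Wrap a computation so it runs at most once; later demands reuse the
-- cached value. "Like laziness, but manual."
memoNode :: IO a -> IO (IO a)
memoNode act = do
  cache <- newIORef Nothing
  pure $ do
    c <- readIORef cache
    case c of
      Just v  -> pure v
      Nothing -> do
        v <- act
        writeIORef cache (Just v)
        pure v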

Anyway, I don't really grok the code, so it's all just wild guesses.

@tonyday567

Have you tried -XStrict to see if it goes away?

@ocramz
Owner

ocramz commented Jul 25, 2021

@tonyday567 thanks, but I just tried and -XStrict doesn't help.
