BAT dependencies, package load time(s) and modularization #351

Open
oschulz opened this issue May 7, 2022 · 10 comments

oschulz commented May 7, 2022

This issue is intended to keep track of modularizing (splitting up) BAT to reduce package load time(s) and increase flexibility.

The current high load time of BAT is mainly due to its dependencies. Dependency options and load-time cost, preliminary analysis:

Unavoidable expensive core deps:

  • Distributions (includes StatsBase)
  • StaticArrays (indirect through many packages)

Unavoidable non-negligible deps:

  • ArraysOfArrays
  • ValueShapes
  • RecipesBase
  • Possibly some ArrayInterface packages (ArrayInterfaceCore is now very lightweight)

Hard-to-avoid non-negligible direct/indirect deps:

  • BangBang
  • Transducers (non-negligible cost on top of BangBang)
  • TerminalLoggers
  • PrettyTables (may be avoidable)
  • DataStructures

Cheap deps:

  • StructArrays: Very cheap on top of StaticArrays
  • AbstractDifferentiation: Almost free on top of ChainRulesCore

Autodiff choices:

  • ForwardDiff: Quite cheap on top of StaticArrays and Distributions
  • FiniteDifferences: Basically free on top of StaticArrays and Distributions
  • FiniteDiff: Alternative to FiniteDifferences, but would pull in ArrayInterface
  • Zygote: Expensive
  • (Future option) Diffractor: Quite cheap on top of StaticArrays

Math deps:

  • LinearMaps: About 100 ms, no deps

Statistics deps:

  • LogarithmicNumbers: Very cheap on top of unavoidable deps
  • MeasureBase: Not that expensive on top of unavoidable deps (uses LogarithmicNumbers)

Optimizer deps:

  • NLSolversBase: Cheap itself, but depends on ArrayInterface, ChainRulesCore, SpecialFunctions and ForwardDiff
  • Optim: Cheap itself, but pulls in ForwardDiff and ArrayInterface via FiniteDiff;
    BAT would need only Nelder-Mead and L-BFGS for core functionality, though
  • LBFGSB: Cheap, but introduces binary deps
  • Manopt: Would be cheap on top of StaticArrays if it didn't pull in ColorSchemes

Sampler deps:

  • AbstractMCMC: Cheap on top of StaticArrays, Distributions, Transducers, TerminalLoggers
  • AdvancedHMC: Quite cheap on top of AbstractMCMC
  • PSIS: Cheap on top of unavoidable deps

Integrator deps:

  • CUBA: Cheap, but introduces binary deps
  • MonteCarloIntegration: Almost free on top of unavoidable deps
  • HCubature: Almost free on top of unavoidable deps
  • QuadGK: Cheap, but only univariate

Deps to avoid:

  • Folds: Significant load time cost on top of Transducers
  • SciMLBase: Really expensive
  • GalacticOptim: Not cheap even on top of SciMLBase, also load-time interplay with SciMLBase
  • Optimization: Over 1000 ms load time on top of its heavy deps (Optim, GPUArrays, RecursiveArrayTools, SciMLBase)

Expensive deps to get rid of:

  • DoubleFloats
  • Polynomials
  • ... ?

Non-negligible deps to get rid of:

  • KernelDensity
  • PrettyTables (maybe, see above)
  • AdaptiveRejectionSampling (cost due to ForwardDiff though)
  • ... ?
oschulz changed the title from "BAT package load time and modularization" to "BAT dependencies, package load time(s) and modularization" on May 7, 2022
@ChrisRackauckas

> GalacticOptim: Really expensive even on top of SciMLBase

Even with v3 splitting out the solvers?

@ChrisRackauckas

> SciMLBase: Really expensive

Interesting, how expensive, and due to what? There's not much in there 😅

oschulz commented May 7, 2022

> > SciMLBase: Really expensive
>
> Interesting, how expensive, and due to what? There's not much in there

I was very surprised as well. I had always assumed SciMLBase to be very lightweight; I just never timed it before.

> GalacticOptim: Really expensive even on top of SciMLBase

Hm, maybe I had GalacticOptim v2 due to some dep in my tests - though even with v3 it has a not insignificant load time:

Session 1:

pkg> st SciMLBase GalacticOptim
Status `/user/.julia/environments/temp/Project.toml`
  [a75be94c] GalacticOptim v3.1.1
  [0bca4576] SciMLBase v1.31.3

julia> using InverseFunctions # load a super-lightweight package to get some initial Pkg costs out of the way

julia> @time using SciMLBase
  2.356626 seconds (5.58 M allocations: 423.052 MiB, 4.17% gc time, 49.19% compilation time)

Session 2:

julia> using InverseFunctions

julia> @time using SciMLBase, GalacticOptim
  2.990720 seconds (6.56 M allocations: 475.628 MiB, 5.91% gc time, 60.39% compilation time)

Session 3:

julia> using InverseFunctions

julia> @time_imports using SciMLBase, GalacticOptim
     10.9 ms    ┌ MacroTools
     18.5 ms  ┌ ZygoteRules
      0.2 ms    ┌ IteratorInterfaceExtensions
      0.7 ms  ┌ TableTraits
      3.5 ms  ┌ Compat
      0.5 ms  ┌ Requires
    141.0 ms  ┌ FillArrays
      0.2 ms  ┌ DataValueInterfaces
    532.6 ms  ┌ StaticArrays
      3.2 ms    ┌ DocStringExtensions
      0.2 ms    ┌ IfElse
     19.4 ms    ┌ RecipesBase
     37.2 ms      ┌ Static
    718.0 ms    ┌ ArrayInterface
      0.7 ms    ┌ Adapt
     58.6 ms    ┌ ChainRulesCore
    864.2 ms  ┌ RecursiveArrayTools
      1.4 ms    ┌ DataAPI
     14.6 ms  ┌ Tables
      0.2 ms  ┌ CommonSolve
      0.8 ms  ┌ ConstructionBase
      0.2 ms  ┌ TreeViews
   1829.6 ms  SciMLBase
      2.7 ms  ┌ DiffResults
      0.3 ms  ┌ Reexport
      6.3 ms    ┌ AbstractTrees
      2.9 ms    ┌ ProgressLogging
      5.6 ms    ┌ LeftChildRightSiblingTrees
     16.7 ms  ┌ TerminalLoggers
      4.8 ms  ┌ ProgressMeter
      2.1 ms  ┌ LoggingExtras
      0.8 ms  ┌ ConsoleProgressMonitor
    321.1 ms  GalacticOptim

Also, in comparison (though Optim and GalacticOptim are not actually replacements for each other of course) - loading Optim:

julia> using InverseFunctions

julia> @time using Distributions, StaticArrays, ArrayInterface # Pretty much unavoidable for BAT
  1.918963 seconds (4.79 M allocations: 333.435 MiB, 3.12% gc time, 37.83% compilation time)

julia> @time using Optim
  0.181226 seconds (475.28 k allocations: 30.906 MiB, 14.12% gc time, 13.72% compilation time)

Loading GalacticOptim:

julia> using InverseFunctions

julia> @time using Distributions, StaticArrays, ArrayInterface # Unavoidable deps for BAT
  1.921450 seconds (4.79 M allocations: 333.419 MiB, 3.04% gc time, 37.15% compilation time)

julia> @time using GalacticOptim
  0.936281 seconds (2.32 M allocations: 181.437 MiB, 3.88% gc time, 59.31% compilation time)

julia> @time using SciMLBase
  0.595445 seconds (861.23 k allocations: 45.663 MiB, 6.73% gc time, 99.98% compilation time)

Why does loading SciMLBase after GalacticOptim take any time at all? This is really weird:

julia> using InverseFunctions

julia> @time using Distributions, StaticArrays, ArrayInterface # Unavoidable deps for BAT
  1.913109 seconds (4.79 M allocations: 333.435 MiB, 3.06% gc time, 37.17% compilation time)

julia> @time using SciMLBase, GalacticOptim
  1.513281 seconds (3.17 M allocations: 227.054 MiB, 4.97% gc time, 75.43% compilation time)

Why is loading SciMLBase and GalacticOptim more expensive than just loading GalacticOptim, which depends on SciMLBase? Some strange Requires effect?
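
Because the numbers above depend on what was loaded earlier in the same session, one way to compare load orders on an equal footing is to run each measurement in its own fresh Julia process. The following is only a minimal sketch under stated assumptions: `fresh_load_time` is a hypothetical helper (not part of BAT or any package), it assumes the packages are already installed and precompiled in the active project, and it assumes that `Base.active_project()` returns a path and that nothing else prints to stdout while loading.

```julia
# Hypothetical helper (illustration only): time a `using` statement in a
# fresh Julia process, so packages loaded earlier cannot skew the result.
function fresh_load_time(stmt::AbstractString;
                         warmup::AbstractString = "",
                         project::AbstractString = dirname(Base.active_project()))
    # Script run by the child process: optional warm-up loads first,
    # then the timed statement, then the elapsed seconds on stdout.
    code = """
    $warmup
    t0 = time_ns()
    $stmt
    print((time_ns() - t0) / 1e9)
    """
    cmd = `$(Base.julia_cmd()) --startup-file=no --project=$project -e $code`
    # The child's stdout is expected to contain only the elapsed time.
    return parse(Float64, readchomp(cmd))
end

# Possible usage, assuming the packages from this thread are installed:
# warm = "using InverseFunctions, Distributions, StaticArrays, ArrayInterface"
# fresh_load_time("using GalacticOptim"; warmup = warm)
# fresh_load_time("using SciMLBase, GalacticOptim"; warmup = warm)
```

Repeating each call a few times and comparing the spread should show whether the difference between the two load orders is larger than run-to-run noise.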

oschulz commented May 7, 2022

SciMLBase seems to have some strange load time effects depending on order of package loading in general. I don't get why ...

When timing it all in one go it's not so extreme, but around 300 ms still seems to be very high for the actual code of a "...Base" package:

julia> using InverseFunctions

julia> @time_imports using StaticArrays, ArrayInterface, RecursiveArrayTools, SciMLBase
    527.0 ms  StaticArrays
      3.3 ms  ┌ Compat
      0.4 ms  ┌ Requires
      0.1 ms  ┌ IfElse
     36.9 ms  ┌ Static
    721.1 ms  ArrayInterface
     10.4 ms    ┌ MacroTools
     11.0 ms  ┌ ZygoteRules
    149.6 ms  ┌ FillArrays
      3.3 ms  ┌ DocStringExtensions
     19.0 ms  ┌ RecipesBase
      0.6 ms  ┌ Adapt
     59.7 ms  ┌ ChainRulesCore
    309.3 ms  RecursiveArrayTools
      0.2 ms    ┌ IteratorInterfaceExtensions
      0.6 ms  ┌ TableTraits
      0.2 ms  ┌ DataValueInterfaces
      1.3 ms    ┌ DataAPI
     14.5 ms  ┌ Tables
      0.2 ms  ┌ CommonSolve
      0.8 ms  ┌ ConstructionBase
      0.2 ms  ┌ TreeViews
    307.5 ms  SciMLBase

StaticArrays and especially ArrayInterface make up more of the total load time, of course, together with RecursiveArrayTools which is also not exactly lightweight.

@ChrisRackauckas

I wonder how requires is measured. My guess is that it's triggering requires in ArrayInterface and that's measured as part of the SciMLBase time.

oschulz commented May 7, 2022

I would think so ... probably have to ask Tim or so. :-)

oschulz commented May 7, 2022

> that it's triggering requires in ArrayInterface

Oh wow, ArrayInterface has a lot of requires!

@ChrisRackauckas

Hence the idea to make it like GalacticOptim in terms of subpackages.

oschulz commented May 7, 2022

You mean JuliaArrays/ArrayInterface.jl#211? Yes, that would be great - and if there are more lightweight parts, maybe some of the requires can be turned into those packages depending on it? I feel we have quite a few dependencies in the ecosystem right now that should be the other way round, or common interface packages are missing, and lots of requires to compensate for it.

Update: We have a very lightweight ArrayInterfaceCore now.

@ChrisRackauckas

> and if there are more lightweight parts, maybe some of the requires can be turned into those packages depending on it?

Indeed, that's the dream.
