Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Optional unit rewriting #1252

Closed
wants to merge 11 commits into from
Closed

[WIP] Optional unit rewriting #1252

wants to merge 11 commits into from

Conversation

lamorton
Copy link
Contributor

@lamorton lamorton commented Sep 7, 2021

This is an alternative to #1250 that preserves the original unit validation & merely adds an optional rewrite tool for equations that can be used prior to constructing a system from the equations. Prompted by @isaacsas.

Pros:

  • Less of a breaking change b/c rewriting is opt-in
  • Maybe better performance than rewriting by default
  • May be able to transform arrays-equations prior to scalarizing

Cons:

  • Logic is largely duplicated between get_unit and constructunit (some reuse though)
  • User needs to be in the loop (eg: feed eqs to constructor, fails, rewrite eqs, call constructor again)
  • Automatic unit checking inside system constructor is redundant if the user has already rewritten the equations (but most of the work is saved by tagging the expressions with top-level units & allowing get_unit to find those.)

@lamorton
Copy link
Contributor Author

lamorton commented Sep 7, 2021

This Constant<:Symbolic I'm using is causing issues with show but more importantly with build_function and get_variables. I've tried using anonymous parameters instead. That fixed the display issues, but not the problems with build_function.

I really want to make it work b/c it gives two capabilities:

  1. putting Unitful constants into equations w/o forcing them to be parameters
  2. making the units explicitly match after the rewrite

I can work around (2) by letting get_unit try _get_unit first (which I should do anyway). But there's still this issue that if I have x with unit u"cm" and y with unit u"m", after I rewrite x+y I getx+100y which isn't manifestly balanced. As a stranger, you don't know a priori whether that 100 was there just as a literal in the first place or as a result of unit conversion. In a simple case like this, you can infer what happened, but it's much nicer to get something like x+[100 cm m⁻¹]*y, so there's no confusion as to the intent. OTOH, it might be confusing to have unit annotations inside the equations (eg: is that m the variable or the unit?)

One could argue that (1) isn't really a problem, but it's annoying/confusing to be required to make a parameter μ_0 just to host the value of Unitful.μ0. It also clutters up the parameters list that I'd rather reserve for other things that really are adjustable parameters, not fundamental constants. OTOH, this approach maybe won't play well with the Latexification of the equations (you'll end up with an ugly 1.2566370614359173e-6 H m⁻¹ instead of seeing the symbol).

Edit: actually, I can't work around (2) if the rewriting is optional & is done by the user before constructing the system. Reason being, any rearranging of the equations that gets done in an outer constructor loses the units that rewrite_units tags onto the expressions. Edit2: I suppose the user could then manually bypass unit checking, but I'd like to avoid people getting in the habit of doing that.

@lamorton
Copy link
Contributor Author

lamorton commented Sep 7, 2021

@YingboMa @shashi @ChrisRackauckas Any preference about whether:

  1. Inserting unit conversions should happen automatically vs. being an optional transformation?
  2. Conversion factors should have units on them vs. being plain numbers?

@ChrisRackauckas
Copy link
Member

Well it should always be correct by default. If you check for whether the users uses units first, and then check if the user needs conversions, there should be no extra cost from what we had before and it would be correct right?

@lamorton
Copy link
Contributor Author

lamorton commented Sep 8, 2021

it should always be correct by default.

No question there.

If you check for whether the users uses units first, and then check if the user needs conversions, there should be no extra cost from what we had before and it would be correct right?

True. The 1st question is whether to automatically insert those conversions during checking. @isaacsas was concerned about the performance of this approach, and requested I make the insertion of conversion factors a standalone transformation that the user would need to apply to the equations before they would pass the checking. I was of the opinion that it would be more convenient if the conversion was done automatically during checking.

@isaacsas
Copy link
Member

isaacsas commented Sep 8, 2021

Just to clarify, I was thinking this could be a system transformation. i.e. a user constructs a system and gets an inconsistent units warning, with the advice to try calling balanceunits if they’d like MT to try to correct the units. That in turn would try to generate a new system with corrected units, but avoid modifying the original system. In this way a user would have the choice to correct the units themselves, which many/some users might prefer since incorrect units indicates a problem in model formation in many circumstances. It also avoids silently rewriting user equations which could have performance implications (but perhaps I’ve misunderstood what is going on here behind the scenes).

@isaacsas
Copy link
Member

isaacsas commented Sep 8, 2021

But that would mean a system with incorrect units would get constructed instead of automatically being fixed as @ChrisRackauckas mentioned.

@lamorton
Copy link
Contributor Author

lamorton commented Sep 8, 2021

@isaacsas I see. I think that even with warnings, it would be better not to allow construction of a system that would be incorrect if simulated. What's the reason to prefer the transformation to be applied to the system instead of the set of equations?

Edit: for concreteness, here's the workflow for using this PR:

@variables t [unit = u"ms"] P(t) [unit = u"MW"] E(t) [unit = u"J"] 
@parameters τ [unit = u"ms"] γ 
D = Differential(t)
eqs = [D(E) ~ P - E/τ]
@test_throws MT.ValidationError ODESystem(eqs,name=:sys) # Existing unit check prevents this construction

neweqs = MT.rewrite_units(eqs) # Apply transformation to the equations
@named sys = ODESystem(neweqs) # Unit check passes

@isaacsas
Copy link
Member

isaacsas commented Sep 9, 2021

Sure, that ordering is fine. I missed your earlier edit about having rewriting before system construction not working, so thought that was the workflow.

I don't want to monopolize the discussion here too much. @ChrisRackauckas and @YingboMa are probably much more familiar with the full breadth of MT users and what approach makes the most sense as the default. I would defer to their preferred approach. I just wanted to make sure that the performance implications were at least being considered in the proposed workflow / design. (And maybe there aren't any relative to just checking the units, which would be great.)

@ChrisRackauckas
Copy link
Member

The problem is that generating the problem without adding in the conversions is just a silent wrong behavior which is really bad. That's the kind of thing that should error. But as mentioned earlier, it should only incur a cost when the conversions are actually required, so there shouldn't be anything to worry about in performance if it's only done when the user is specifically requiring this feature.

But this means it's crucial to set it up to first check if units exist, and then check if units are same vs compatible, and then do conversions only when necessary. The scan to turn off this pass should be rather quick.

@lamorton
Copy link
Contributor Author

lamorton commented Sep 9, 2021

Ahh, I hadn't thought of doing it that way. Maybe this is naive, but my expectation was that the two passes (checking-only vs rewriting) would be about the same speed (especially when nothing needed to be changed), so we might as well do the rewrite by default. OTOH, the checking won't allocate but the rewriting will, so I could be wrong. I should really try profiling these things so we can be quantitative about the performance.

@lamorton
Copy link
Contributor Author

lamorton commented Sep 9, 2021

I have a system with 15 relatively simple equations/states (18 parameters), 5 equations of which throw unit errors when doing validation.

julia> @btime MT.validate(eqs)
... #warning messages elided
  912.514 μs (2658 allocations: 149.98 KiB)

versus

julia> @btime MT.rewrite_units(eqs)
  342.558 μs (1629 allocations: 83.23 KiB)

That's ... interesting. Trying a different system that already has matching units (9 equations, 24 parameters):

julia> @btime MT.validate(gas_eqs)
  560.897 μs (4167 allocations: 212.22 KiB)

versus rewriting:

julia> @btime MT.rewrite_units(gas_eqs)
  633.992 μs (3741 allocations: 174.11 KiB)

I guess I did a bad job implementing check_units in the first place? 😅

@isaacsas
Copy link
Member

isaacsas commented Sep 9, 2021

Weird, though you should make sure to interpolate your variables into @btime:

@btime MT.validate($eqs)

@isaacsas
Copy link
Member

isaacsas commented Sep 9, 2021

@anandijain are there larger SBML models that would have units we could try this out with?

@anandijain
Copy link
Contributor

@isaacsas SciML/SBMLToolkit.jl#20 While SBML.jl does store unit information, SBMLToolkit doesn't do anything with it yet.

@lamorton
Copy link
Contributor Author

I'm testing the idea of having unit conversion factors added as new 'anonymous' parameters during the rewrite inside the system construction. This allows to make the conversion factors clear, such that check_units would pass if called on the transformed system. The downside is that the set of parameters is not the same as the set the user starts with.

using ModelingToolkit, NonlinearSolve, Unitful

@unit qe     "qe"   ElectronCharge    u"q"       false

vars = @variables(begin
v, [unit = u"m/s"], # particle velocity
ρ, [unit = u"m"], # particle gyroradius
S #ratio of device radius to particle gyroradius
end)

pars = @parameters(
    r = 1.0, [unit = u"m"], # device radius
    E = 100, [unit = u"keV"], # particle energy
    B = 1, [unit = u"T"], # magnetic field
    m = 1.0, [unit = u"u"], # particle mass
    qₑ = 1, [unit = qe], # particle charge
    μ_0 = ustrip(Unitful.μ0), [unit = unit(Unitful.μ0)]
    )

eqs = [
    S ~ r/ρ,
    ρ ~ m*v/(qₑ*B),
    E ~ 0.5*m*v^2,];

@named ns = NonlinearSystem(eqs, vars, pars) 

Now:

julia> MT.equations(ns)
3-element Vector{Equation}:
 0 ~ r*^-1) - S
 0 ~ var"1.0364269652680505e-8 qe s T u⁻¹"*m*v*(B^-1)*(qₑ^-1) - ρ
 0 ~ 0.5var"1.0364269652680506e-11 s² keV m⁻² u⁻¹"*m*(v^2) - E

julia> MT.parameters(ns)
8-element Vector{Any}:
 r
 E
 B
 m
 qₑ
 μ_0
 var"1.0364269652680505e-8 qe s T u⁻¹"
 var"1.0364269652680506e-11 s² keV m⁻² u⁻¹"

@isaacsas
Copy link
Member

How to handle extra non-user parameters was something that I was also struggling with and thinking about in #1246

The issue is that build_function would ideally have a way to provide our internal parameters independent of the user parameters (like building a functor that stores the internal parameters instead of just a function). We can of course append internal parameters onto the user parameter object in the various problems (like ODEProblem), but I worry that could cause issues if the user has code that modifies the parameter vector in callbacks or such (and hence doesn't expect extra info to be in it). It would be great to get an interface settled for this.

@lamorton
Copy link
Contributor Author

@isaacsas My first attempt was to make a new subtype Constant<:Symbolic that would have a value, with units as metadata, and would get replaced at build_function-time with its value, totally bypassing the parameter handling mechanisms. I think that would require hacking stuff in Symbolics or SymbolicUtils. I'd want to get buy-in from the maintainers over there, to go that route.

@isaacsas
Copy link
Member

Yeah, I think the conclusion in that issue I linked was that we'll ultimately need some changes to support internal parameters in build_function. @ChrisRackauckas would it even be feasible to generate a functor that wraps internal parameters there and return it, or would that just not work with RuntimeGeneratedFunctions (i.e. require users to explicitly eval expressions)?

@ChrisRackauckas
Copy link
Member

The closure would have to be a RuntimeGeneratedFunction.

@shashi
Copy link
Collaborator

shashi commented Sep 17, 2021

@test_throws MT.ValidationError ODESystem(eqs,name=:sys) # Existing unit check prevents this construction

neweqs = MT.rewrite_units(eqs) # Apply transformation to the equations
@named sys = ODESystem(neweqs) # Unit check passes

this can be automated using a throw catch inside the ODESystem constructor where you check the error is a ValidationError? And if it errors a second time in the catch, you can rethrow.

Constant(x) = Constant(x, Dict(VariableUnit => Unitful.unit(x)))
Base.:*(x::Num, y::Unitful.Quantity) = value(x) * y
Base.:*(x::Unitful.Quantity, y::Num) = x * value(y)
Base.show(io::IO, v::Constant) = Base.show(io, v.val)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably show unit if it's there. Good way to know it's not just a number.

@shashi
Copy link
Collaborator

shashi commented Sep 17, 2021

For validation I'm guess you just need a dimensionality check, that can be done faster by propagating 7 rational numbers. If you don't work with Unitful at that level, then it should be type stable.

Chain([@rule ~x*~y => set_dim(~x*~y, map(+, dimension(~x), dimension(~y))
          @rule ~x + ~y => throw(ValidationError()) where dimension(~x) != dimension(~y)
          ...]) |> Postwalk

^ this should do the validation. dimension should return like 7 integers in a tuple or vector... this is just a hunch I don't know how much faster it will be.

@shashi
Copy link
Collaborator

shashi commented Sep 17, 2021

I think the fastest would be something like this:

struct Dim
    t::NTuple{7, Rational{Int}}
end
Dim(u::Unit) = ....  # make a Dim

function dim(x)
    if istree(x)
        if operation(x) == (+)
            ds = map(dim, arguments(x))
            if !all_equal(ds)
                return ValidationError("...")
            end
            return first(ds)
        elseif operation(x) == (*)
            return add(map(dim, arguments(x)))
            # ... other cases
        else
            return Dim((0,0,0,0,0,0,0))
        end
    elseif x isa Sym
        return Dim(ModelingToolkit.get_unit(x))
    end
end

@lamorton
Copy link
Contributor Author

lamorton commented Sep 9, 2022

Closing because stale. I'll try again once I get a solution for #1802.

@lamorton lamorton closed this Sep 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants