batch_norm states "mean" and "var" never updated #546

Closed

robinmonjo opened this issue Nov 14, 2023 · 1 comment

@robinmonjo (Contributor)

I have noticed a really strange behaviour that appears to be a bug. Here is a Livebook to demonstrate the bug:

debugging-batch-norm

```elixir
Mix.install([
  {:nx, "~> 0.6.0"},
  {:axon, "~> 0.6.0"},
  {:kino, "~> 0.11.2"}
])
```

Batch norm layers not updated

So I have noticed this weird bug.

The state of the batch_norm layers (mean and var) is never updated! It stays at all 0s and all 1s, but only when the layer is on the "right" branch of the network.
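For reference, the running stats start out at zeros (mean) and ones (var), so seeing exactly those values after a training-mode forward pass means the state was never written back. Here is a minimal way to inspect the freshly initialized values (a sketch, assuming Axon ~> 0.6, where the running stats live alongside the trainable params):

```elixir
# Sketch: inspect the freshly initialized running stats.
# Assumes Axon ~> 0.6, where batch norm's "mean"/"var" sit in the params map
# next to "gamma"/"beta"; the layout may differ in other versions.
model = Axon.input("input") |> Axon.dense(10) |> Axon.batch_norm()
{init, _pred} = Axon.build(model, mode: :train)
params = init.(Nx.template({10, 10}, :f32), %{})

IO.inspect(params["batch_norm_0"]["mean"], label: "initial mean") # all 0.0
IO.inspect(params["batch_norm_0"]["var"], label: "initial var")   # all 1.0
```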

Example:

```elixir
model =
  Axon.input("input")
  |> Axon.dense(10)
  |> Axon.batch_norm()
```

```
#Axon<
  inputs: %{"input" => nil}
  outputs: "batch_norm_0"
  nodes: 3
>
```

```elixir
{init, pred} = Axon.build(model, mode: :train)
params = init.(Nx.template({10, 10}, :f32), %{})
%{state: state} = pred.(params, Nx.iota({10, 10}))
state["batch_norm_0"]
```

```
%{
  "mean" => #Nx.Tensor<
    f32[10]
    [-47.73035430908203, 50.59987258911133, -40.22572326660156, 57.21146774291992, -6.546464443206787, 5.084614276885986, -30.406084060668945, 34.05552291870117, -27.80076789855957, 16.147790908813477]
  >,
  "var" => #Nx.Tensor<
    f32[10]
    [963.2650756835938, 967.86474609375, 611.614013671875, 1129.3345947265625, 8.083409309387207, 20.26643943786621, 438.4486083984375, 393.705810546875, 288.500244140625, 81.58602142333984]
  >
}
```

So in this configuration, the batch_norm state is updated.

Now let's build an example where it's not:

```elixir
input = Axon.input("input")

l1 =
  Axon.dense(input, 10)

l2 =
  Axon.dense(input, 10)
  |> Axon.batch_norm()

model = Axon.add(l2, l1)
Axon.Display.as_graph(model, Nx.template({10, 10}, :f32))
```

(screenshot: rendered model graph)

```elixir
{init, pred} = Axon.build(model, mode: :train)
params = init.(Nx.template({10, 10}, :f32), %{})
%{state: state} = pred.(params, Nx.iota({10, 10}))
state["batch_norm_0"]
```

```
nil
```

So here the batch_norm_0 state is neither returned nor updated.
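To be precise, the layer appears to be missing from the returned state map altogether, rather than present with stale values. A quick (hypothetical) check, reusing the bindings from the snippet above:

```elixir
# Hypothetical sanity check (not in the original repro): list the keys of the
# returned state map to confirm "batch_norm_0" is absent entirely.
Map.keys(state) |> IO.inspect(label: "state keys")
```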

But when the batch_norm is on the "left" branch of the network, it works:

```elixir
model = Axon.add(l1, l2)
Axon.Display.as_graph(model, Nx.template({10, 10}, :f32))
```

(screenshot: rendered model graph)

```elixir
{init, pred} = Axon.build(model, mode: :train)
params = init.(Nx.template({10, 10}, :f32), %{})
%{state: state} = pred.(params, Nx.iota({10, 10}))
state["batch_norm_0"]
```

```
%{
  "mean" => #Nx.Tensor<
    f32[10]
    [38.54197311401367, -13.553568840026855, -53.8503532409668, -13.158158302307129, -38.54884338378906, 7.524081707000732, 29.492563247680664, 33.01778030395508, 19.012977600097656, 17.45140266418457]
  >,
  "var" => #Nx.Tensor<
    f32[10]
    [596.3408203125, 56.97801208496094, 1097.6343994140625, 151.06002807617188, 651.69677734375, 16.675113677978516, 378.416748046875, 327.9981689453125, 123.95314025878906, 161.44265747070312]
  >
}
```

I would be happy to help fix this, but I'm not yet very familiar with the internals of Axon and have limited time 😊

@seanmor5 (Contributor)

This has been fixed with the addition of model state in #553.

Thanks for pointing this out! I was having issues with RNNs in #553 and realized it was the same issue there!
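For anyone verifying later: re-running the failing snippet from the report on a release that includes #553 should yield a non-nil, updated state. A regression sketch (the exact init/state plumbing may differ on newer Axon versions):

```elixir
# Regression sketch: assumes an Axon release containing the fix from #553.
# On newer versions the plumbing around model state may look different.
input = Axon.input("input")
l1 = Axon.dense(input, 10)
l2 = Axon.dense(input, 10) |> Axon.batch_norm()
model = Axon.add(l2, l1)

{init, pred} = Axon.build(model, mode: :train)
params = init.(Nx.template({10, 10}, :f32), %{})
%{state: state} = pred.(params, Nx.iota({10, 10}))

# With the fix, the running stats are returned and updated even when the
# batch norm layer sits on the right branch of Axon.add/2.
false = is_nil(state["batch_norm_0"])
```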
