A question about the architecture of DQNAgent #473

tazhu2023 · 2024-08-02T08:06:38Z

Regarding the architecture of DQNAgent: the last layer uses "softmax". Is that meant to output a probability distribution over the actions?
The layer before it uses "sigmoid". Why is the architecture designed this way? Normally, isn't a "sigmoid" unnecessary before a "softmax"?
    
def _build_policy_network(self):
    network = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=self.observation_shape),
        tf.keras.layers.Conv1D(filters=64, kernel_size=6, padding="same", activation="tanh"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.Conv1D(filters=32, kernel_size=3, padding="same", activation="tanh"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(self.n_actions, activation="sigmoid"),
        tf.keras.layers.Dense(self.n_actions, activation="softmax")
    ])

    return network
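
To make the question concrete, here is a minimal pure-Python sketch of just the last two layers from the snippet above. It is not the project's code: the helper names (`sigmoid`, `softmax`, `dense`), the input features, and all weights are made up for illustration, and `n_actions` is assumed to be 3. It only shows what the stack computes. The "sigmoid" layer produces hidden values in (0, 1), the final Dense layer then applies its own weights to those values, and "softmax" turns the result into values that sum to 1. So the softmax acts on a linear transform of the sigmoid outputs, not on the sigmoid outputs directly.

```python
import math


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Turn a list of real numbers into positive values that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Linear part of a dense layer; weights[j] holds the weights of output unit j."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical flattened features (stand-in for the Flatten() output).
features = [0.5, -1.2, 2.0, 0.3]

# Hypothetical weights for Dense(n_actions, activation="sigmoid"), n_actions = 3.
w1 = [[0.2, -0.4, 0.1, 0.7], [-0.3, 0.8, 0.5, -0.6], [0.9, 0.1, -0.2, 0.4]]
b1 = [0.0, 0.1, -0.1]

# Hypothetical weights for Dense(n_actions, activation="softmax").
w2 = [[1.5, -2.0, 0.5], [-1.0, 0.7, 2.2], [0.3, 0.3, -1.8]]
b2 = [0.0, 0.0, 0.0]

hidden = [sigmoid(v) for v in dense(features, w1, b1)]  # each value in (0, 1)
logits = dense(hidden, w2, b2)                          # not restricted to (0, 1)
probs = softmax(logits)                                 # sums to 1

print("hidden:", hidden)
print("logits:", logits)
print("probs: ", probs)
```

Whether this arrangement is intentional, and how the softmax output is meant to be interpreted in this agent, is still the open question.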

Replies: 0 comments