I was planning to try to make a model learn to play a game as a way to teach myself some ML.
I'm trying to make a network that takes the game state as input and outputs probabilities for the desired moves.
At first this seemed possible with KotlinDL, but I've run into a problem that I can't find any documentation to solve.
For any given board state, only some moves are actually valid, so I need to mask the network's output to give those moves zero probability.
I originally looked for a way to do this inside the network with some kind of masking layer, but one doesn't seem to exist.
I then thought I'd just output the raw values, set the invalid ones to -inf, and run the result through a softmax manually afterwards.
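That manual masking step can be sketched in plain Kotlin (no KotlinDL involved; maskedSoftmax is just an illustrative helper, not part of any library):

```kotlin
import kotlin.math.exp

// Set invalid logits to -inf, then softmax: invalid moves get exactly
// 0 probability and the valid ones renormalize among themselves.
fun maskedSoftmax(logits: FloatArray, valid: BooleanArray): FloatArray {
    val masked = FloatArray(logits.size) { i ->
        if (valid[i]) logits[i] else Float.NEGATIVE_INFINITY
    }
    val max = masked.maxOrNull()!! // subtract the max for numerical stability
    val exps = FloatArray(masked.size) { i ->
        if (valid[i]) exp(masked[i] - max) else 0f
    }
    val sum = exps.sum()
    return FloatArray(exps.size) { i -> exps[i] / sum }
}
```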
However, the only way I see to run the network is the .predictSoftly() method, which seems to already run the output through a softmax.
My question is: is this possible at all? Either, can I mask some output values before the softmax, or can I get the raw activations of the last layer?
I had hoped that .predictAndGetActivations() would give me access to them, but it seems to give the activations for every layer except the last.
makeNetwork().use {
    it.compile(
        optimizer = Adam(),
        loss = Losses.SOFT_MAX_CROSS_ENTROPY_WITH_LOGITS,
        metric = Metrics.ACCURACY
    )
    it.init()
    val pred = it.predictSoftly(FloatArray(inputDim) { 1.0f })
    println(pred.contentToString()) // output is softmaxed
    it.setPredOp(false) // helper from the reflection workaround in EDIT3 below
    val pred2 = it.predictSoftly(FloatArray(inputDim) { 1.0f })
    println(pred2.contentToString()) // output is raw
}
EDIT: Digging into the KotlinDL code, I found the section responsible for this.
That makes it clear the softmax is applied because I was using Losses.SOFT_MAX_CROSS_ENTROPY_WITH_LOGITS as my loss function.
That loss function makes sense during training, when I don't need to mask anything. Is it possible, then, to remove the predictionOp after training?
EDIT2: I found a solution, but switching between the networks is very slow and inefficient.
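Why a "with logits" loss goes hand in hand with a softmax prediction op can be shown in plain Kotlin: the loss consumes raw logits directly, computing -ln(softmax(logits)[target]) via log-sum-exp, so the network itself outputs logits during training and the softmax only matters at prediction time (illustrative only; ceFromLogits is not a KotlinDL function):

```kotlin
import kotlin.math.exp
import kotlin.math.ln

// Cross-entropy computed straight from logits:
// logSumExp(logits) - logits[target] == -ln(softmax(logits)[target])
fun ceFromLogits(logits: FloatArray, target: Int): Float {
    val max = logits.maxOrNull()!! // stabilize the exponentials
    val logSumExp = max + ln(logits.sumOf { exp((it - max).toDouble()) }).toFloat()
    return logSumExp - logits[target]
}
```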
EDIT3: I have a very hacky workaround that uses reflection to change the output.
Using it as in the code above works.
EDIT4: Sorry for the long saga, but I found a less hacky way.
This adds a "raw_output" op that can be accessed by passing
predictionTensorName = "raw_output"
to predictSoftly().
You can then get either the raw output or the softmaxed output without any modifications.
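As an aside, even when only the softmaxed output is available, the same masked distribution could be recovered by zeroing the invalid entries and renormalizing, since a softmax over the valid subset equals the full softmax restricted to that subset and rescaled. A plain-Kotlin sketch (renormalize is an illustrative helper, not a KotlinDL API):

```kotlin
// Zero out invalid moves and rescale by the remaining probability mass.
// Mathematically identical to masking the logits with -inf before softmax.
fun renormalize(probs: FloatArray, valid: BooleanArray): FloatArray {
    val kept = FloatArray(probs.size) { i -> if (valid[i]) probs[i] else 0f }
    val sum = kept.sum()
    return FloatArray(kept.size) { i -> kept[i] / sum }
}
```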