BSOD when running the sample with DML runtime #1007

Open
sandrohanea opened this issue Oct 24, 2024 · 3 comments

@sandrohanea
Member

Describe the bug
When running one of the DirectML examples for OnnxRuntimeGenAI, I get a blue screen of death (BSOD) on my Surface Laptop Studio 2.

To Reproduce
Steps to reproduce the behavior:

  1. Use this sample (any would do):
using Microsoft.ML.OnnxRuntimeGenAI;
using System.Reflection.Emit;
using System.Reflection;

var modelPath = @"C:\Models\microsoft\Phi-3-medium-4k-instruct-onnx-directml\directml-int4-awq-block-128";
var model = new Model(modelPath);
var tokenizer = new Tokenizer(model);

var systemPrompt = "You are an AI assistant that helps people find information. Answer questions using a direct style. Do not share more information that the requested by the users.";

// chat start
Console.WriteLine(@"Ask your question. Type an empty string to Exit.");

// chat loop
while (true)
{
    // Get user question
    Console.WriteLine();
    Console.Write(@"Q: ");
    var userQ = Console.ReadLine();
    if (string.IsNullOrEmpty(userQ))
    {
        break;
    }

    // show phi3 response
    Console.Write("Phi3: ");
    var fullPrompt = $"<|system|>{systemPrompt}<|end|><|user|>{userQ}<|end|><|assistant|>";
    var tokens = tokenizer.Encode(fullPrompt);

    var generatorParams = new GeneratorParams(model);
    generatorParams.SetSearchOption("max_length", 2048);
    generatorParams.SetSearchOption("past_present_share_buffer", false);
    generatorParams.SetInputSequences(tokens);

    var generator = new Generator(model, generatorParams);
    while (!generator.IsDone())
    {
        generator.ComputeLogits();
        generator.GenerateNextToken();
        var outputTokens = generator.GetSequence(0);
        var newToken = outputTokens.Slice(outputTokens.Length - 1, 1);
        var output = tokenizer.Decode(newToken);
        Console.Write(output);
    }
    Console.WriteLine();
}
  2. Have this csproj:
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <PlatformTarget>x64</PlatformTarget>
    <RuntimeIdentifier>win-x64</RuntimeIdentifier>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI" Version="0.4.0" />
    <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.DirectML" Version="0.4.0" />
  </ItemGroup>

</Project>
  3. Run the sample with the model from here: https://huggingface.co/microsoft/Phi-3-medium-4k-instruct-onnx-directml/tree/main
  4. Say hi

Expected behavior
Getting a greeting or response from the model (preferably using the NPU or at least the GPU)

Screenshots
I can share a video of the BSOD recorded with my phone (contact me on Teams if needed).

Desktop (please complete the following information):

  • Device: Surface Laptop Studio 2
  • OS: Windows 11 Enterprise 24H2
  • Browser: n/a
  • Version: 0.4.0 (C# Microsoft.ML.OnnxRuntimeGenAI.DirectML and Microsoft.ML.OnnxRuntimeGenAI)
  • Intel NPU Driver: 31.0.100.2016 (latest available for the NPU)
  • Nvidia GeForce RTX 4060 Driver: 31.0.15.3878

Additional context

  • The CPU runtime works as expected
  • I will test the CUDA runtime as well and report the result in a comment.
@sandrohanea
Member Author

Image
The error shown on the BSOD screen is VIDEO_SCHEDULER_INTERNAL_ERROR.

The issue also reproduces consistently on my device.

@sandrohanea
Member Author

I confirm that CUDA also works as expected on my system. Only the DML runtime causes the BSOD.

@sandrohanea
Member Author

Another strange thing is that even after the model should be "loaded", the RAM usage is very low:
Image
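
To put a number on the low RAM reading instead of relying on a Task Manager screenshot, something like the following could be wrapped around the model load. This is only a sketch: `MemoryProbe`, `Snapshot`, `Capture` and `Measure` are hypothetical names and not part of Microsoft.ML.OnnxRuntimeGenAI. It uses only `System.Diagnostics.Process` and `GC`, so it reports host (process) memory only. One possible explanation, not confirmed anywhere in this thread, is that a GPU runtime keeps most of the weights in video memory, which these counters would not show.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of Microsoft.ML.OnnxRuntimeGenAI).
// Reports how much host memory the current process gains while an action runs.
public static class MemoryProbe
{
    // Byte counts for the current process at one point in time.
    public readonly record struct Snapshot(long WorkingSetBytes, long PrivateBytes, long ManagedBytes);

    public static Snapshot Capture()
    {
        using var process = Process.GetCurrentProcess();
        process.Refresh(); // make sure the counters are not stale
        return new Snapshot(
            process.WorkingSet64,
            process.PrivateMemorySize64,
            GC.GetTotalMemory(forceFullCollection: false));
    }

    // Runs the action, prints the change in each counter, and returns the deltas.
    public static Snapshot Measure(string label, Action action)
    {
        var before = Capture();
        action();
        var after = Capture();

        var delta = new Snapshot(
            after.WorkingSetBytes - before.WorkingSetBytes,
            after.PrivateBytes - before.PrivateBytes,
            after.ManagedBytes - before.ManagedBytes);

        Console.WriteLine(
            $"{label}: working set {ToMiB(delta.WorkingSetBytes):F1} MiB, " +
            $"private {ToMiB(delta.PrivateBytes):F1} MiB, " +
            $"managed {ToMiB(delta.ManagedBytes):F1} MiB");
        return delta;
    }

    private static double ToMiB(long bytes) => bytes / (1024.0 * 1024.0);
}

// Possible use in the sample above (assumption: replaces the plain `new Model(modelPath)` line):
//   Model? model = null;
//   MemoryProbe.Measure("model load", () => model = new Model(modelPath));
```

Running the same measurement with the CPU package and with the DirectML package would show whether the difference in host memory is specific to the DML runtime.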
