Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] delayed mtrace collection #4618

Open
aiChaoSONG opened this issue Jul 3, 2023 · 8 comments
Open

[BUG] delayed mtrace collection #4618

aiChaoSONG opened this issue Jul 3, 2023 · 8 comments
Labels
P1 Blocker bugs or important features

Comments

@aiChaoSONG
Copy link
Collaborator

aiChaoSONG commented Jul 3, 2023

Some mtrace in the test case are expanded to next test case, this is spotted in daily test.

for example, in internal daily test 28474, on any test config(model), the timestamp in mtrace for check-playback-all-formats.sh test case is started from 0, but for its next test case (check-capture-all-formats.sh), the timestamp is the continuation of the previous test case, which is unexpected.

This issue is also found in PR tests.

End of mtrace for check-playback-all-formats.sh on MTLP_RVP_SDW

[    9.588290] <inf> dma: dma_put: dma_put(), dma = 0x400ed670, sref = 0
[    9.588923] <inf> ipc: ipc_cmd: rx	: 0x12010000|0x0
[    9.589055] <inf> dma: dma_put: dma_put(), dma = 0x400ed580, sref = Terminated

Start of mtrace for check-capture-all-formats.sh on MTLP_RVP_SDW

01b0U
[    9.583648] <inf> ll_schedule: zephyr_ll_task_done: num_tasks 1 total_num_tasks 1
[    9.583660] <inf> ll_schedule: zephyr_domain_unregister: zephyr_domain_unregister domain->type 1 domain->clk 3
[    9.586620] <inf> ipc: ipc_cmd: rx	: 0x13000002|0x1
[    9.586636] <inf> pipe: pipeline_trigger: pipe:1 0x0 pipe trigger cmd 0
[    9.586658] <inf> dma_intel_adsp_gpdma: intel_adsp_gpdma_power_off: intel_adsp_gpdma_power_off: dma dma@7c000 power off
[    9.586753] <inf> pipe: pipeline_trigger: pipe:0 0x0 pipe trigger cmd 0
[    9.586788] <wrn> copier: copier_comp_trigger: comp:0 0x40000 dai is not ready
[    9.587661] <inf> ipc: ipc_cmd: rx	: 0x46000002|0x3
[    9.588271] <inf> ipc: ipc_cmd: rx	: 0x12000000|0x0
[    9.588290] <inf> dma: dma_put: dma_put(), dma = 0x400ed670, sref = 0
[    9.588923] <inf> ipc: ipc_cmd: rx	: 0x12010000|0x0
[    9.589055] <inf> dma: dma_put: dma_put(), dma = 0x400ed580, sref = 0
[   13.314338] <inf> ipc: ipc_cmd: rx	: 0x44000000|0x3060004c
[   13.314771] <inf> ipc: ipc_cmd: rx	: 0x47000000|0x0
[    0.142388] <inf> ipc: ipc_cmd: rx	: 0x44000000|0x31400008
[    0.142955] <inf> ipc: ipc_cmd: rx	: 0x44000000|0x3060004c

cc:

@marc-hb
Copy link
Collaborator

marc-hb commented Jul 5, 2023

Thanks @aiChaoSONG good catch.

mtrace-reader.py is stopped and restarted between each test and redirected to a new log file every time. So at first sight this does not look like a bug in sof-test but an issue in the logging system itself.

An unsolved mtrace flushing issue was mentioned in:

@kv2019i, @plbossart , @lyakh , @RanderWang any insight?

@marc-hb marc-hb added the P1 Blocker bugs or important features label Jul 5, 2023
@marc-hb marc-hb changed the title [BUG] Inconsistend mtrace collection [BUG] Inconsistent mtrace collection Jul 11, 2023
@kv2019i
Copy link
Collaborator

kv2019i commented Jul 12, 2023

@marc-hb The timestamp is generated by Zephyr OS. If the DSP is not powered off between test cases, the timestamp will continue and will not be reset to zero. So this is expected behaviour.

@kv2019i
Copy link
Collaborator

kv2019i commented Jul 12, 2023

I suspect there is one/many of following happening:

  • DSP crashed in a previous test case and kernel is running with CONFIG_SND_SOC_SOF_DEBUG_RETAIN_DSP_CONTEXT=y so DSP is not powered off anymore (until next system reboot)
  • DSP does not reach runtime-pm-suspended state between test cases

@aiChaoSONG
Copy link
Collaborator Author

DSP crashed in a previous test case and kernel is running with >CONFIG_SND_SOC_SOF_DEBUG_RETAIN_DSP_CONTEXT=y so DSP is not powered off anymore

It should not be the case, because this is also see in PASSED test case, just check any test case in our daily test.

DSP does not reach runtime-pm-suspended state between test cases

This is possible, the test interval between test case is 5 seconds, and in the kernel, the time for DSP to enter D3 is also 5s, let me try to make the test interval bigger.

@marc-hb
Copy link
Collaborator

marc-hb commented Jul 13, 2023

the time for DSP to enter D3 is also 5s, let me try to make the test interval bigger.

I think having the latter always longer than the former is a good idea in CI.

It should not be enforced on personal devices though :-)

@aiChaoSONG
Copy link
Collaborator Author

I have made the test case interval 10 seconds, which is far enough for DSP to enter D3, however, I still see the issue. check the internal report with ID:28982.

@marc-hb marc-hb changed the title [BUG] Inconsistent mtrace collection [BUG] delayed mtrace collection Sep 8, 2023
@marc-hb
Copy link
Collaborator

marc-hb commented Sep 28, 2023

I suspect there is one/many of following happening:

DSP crashed in a previous test case and kernel is running with CONFIG_SND_SOC_SOF_DEBUG_RETAIN_DSP_CONTEXT=y so DSP is not powered off anymore (until next system reboot)

This sounds like a https://github.com/thesofproject/linux issue

DSP does not reach runtime-pm-suspended state between test cases

This sounds like a https://github.com/thesofproject/sof issue

Either way this is not a sof-test issue. Transferring in 5..., 4..., 3...

@marc-hb
Copy link
Collaborator

marc-hb commented Oct 4, 2023

I have made the test case interval 10 seconds, which is far enough for DSP to enter D3, however, I still see the issue.

Maybe because of sof_pci_debug=0x1? (#1087)

@marc-hb marc-hb transferred this issue from thesofproject/sof-test Oct 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 Blocker bugs or important features
Projects
None yet
Development

No branches or pull requests

3 participants