We are working on a custom embedded Linux board with a PowerPC-based SoC and a display panel.
We are debugging an issue where the Linux kernel freezes during video playback; it does not even respond to sysrq-trigger.

If we enable ftrace, the issue disappears.
We also tried various kernel hacking options such as soft-IRQ debugging and hung-task detection. We haven't tried kdb or kgdb yet, and we are not sure they would help in this case.

The Lauterbach setup is not in working condition, so we can't use it :(

The chip vendor provided the kernel and platform code, but they are out of business, so there is no support from them :(

Approach 1

  • When it didn't respond to sysrq-trigger, I suspected that it was getting stuck in an interrupt handler.
  • So, in the working state (before the freeze), I monitored /proc/interrupts and identified the interrupts involved in the use case. Then I added flags in a noinit section and updated them on entry to and exit from each IRQ handler. I print their values at boot, before request_irq, so the state from the previous run shows up in dmesg.
  • After reproducing the issue and letting the h/w watchdog reboot the system, I looked at dmesg: the flags had values indicating that the IRQ handlers had exited.
  • The one IRQ I haven't looked at is the timer. I don't suspect it, but if I don't find anything else I will add a flag toggle there as well (though I am not hopeful).
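
To make the flag scheme concrete, here is a rough user-space model of what I mean, not the actual driver code. All names (crumb_region, crumb_enter, CRUMB_MAGIC and so on) are made up for illustration, and the slot count is an assumption. In the kernel the region would sit in the noinit section so that it survives the watchdog reset; here it is just a struct passed by pointer. Real hardware would also need the writes to actually reach RAM before the reset, which this sketch ignores.

```c
/* User-space model of entry/exit breadcrumbs; every name here is hypothetical. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CRUMB_MAGIC 0x43524D42u /* marks the region as initialised by us */
#define CRUMB_SLOTS 8           /* one slot per monitored IRQ (assumed count) */

struct irq_crumb {
    uint32_t entries; /* bumped on handler entry */
    uint32_t exits;   /* bumped on handler exit */
};

struct crumb_region {
    uint32_t magic;
    struct irq_crumb slot[CRUMB_SLOTS];
};

/* Clear the region and stamp it valid; done at boot after reporting. */
void crumb_reset(struct crumb_region *r)
{
    memset(r, 0, sizeof *r);
    r->magic = CRUMB_MAGIC;
}

/* First line of a monitored handler. */
void crumb_enter(struct crumb_region *r, unsigned int slot)
{
    assert(slot < CRUMB_SLOTS);
    r->slot[slot].entries++;
}

/* Last line of a monitored handler. */
void crumb_exit(struct crumb_region *r, unsigned int slot)
{
    assert(slot < CRUMB_SLOTS);
    r->slot[slot].exits++;
}

/*
 * Boot-time check, run before the handlers are registered again.
 * Returns the first slot that was entered but never exited, or -1 if
 * every handler returned or the region holds no valid data (cold boot).
 */
int crumb_find_stuck(const struct crumb_region *r)
{
    if (r->magic != CRUMB_MAGIC)
        return -1;
    for (int i = 0; i < CRUMB_SLOTS; i++) {
        if (r->slot[i].entries != r->slot[i].exits)
            return i;
    }
    return -1;
}
```

Counters instead of plain flags also show how often each handler ran before the freeze. In my runs the equivalent of crumb_find_stuck reports nothing, i.e. every monitored handler had returned.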

Question set 1
A) What could cause a freeze, apart from the kernel getting stuck in an IRQ handler?
B) Any suggestions to refine this approach?
C) Any other debugging techniques?

Approach 2

  • We tested different past firmware versions and found that two unrelated kernel changes cause this issue (one is a kernel module that has nothing to do with video playback, the other is a small change in kernel code). If we remove both changes, the kernel freeze goes away.
  • However, there is more to the story: if we keep either of the two changes, the freeze comes back.
  • We will be reviewing these changes critically to work out how these unrelated changes affect it.

Question set 2
A) From the points in approach 2, does anyone suspect anything in particular? (I feel I don't know what question to ask, hence this one.)
B) Any suggestions to refine this approach further?

    System "freezes" are often caused by running too many, too large programs and running out of available memory. Use free to see if you have swap space; read man mkswap swapon fstab fallocate to create some. Swap space must be contiguous. Use mkswap or fallocate, not dd. Traditionally, swap space of 1.5 × RAM has been recommended, but YMMV. If you don't plan to hibernate your system, you can have less than 1.0 × RAM. Commented Mar 30, 2022 at 4:20
