Opening serial port causes kernel panic

Started by Zaxter, January 05, 2017, 02:00:02 PM


Zaxter

I'm running Armbian Jessie Linux with sunxi kernel 3.4.112 on an Olimex A20 micro board.
I find that opening the serial port device `/dev/ttyS2` crashes the system. The board heartbeat LED either stops or blinks rapidly, and the board no longer responds to ping.
I can reproduce this issue with a simple `cat /dev/ttyS2` or `fd = open("/dev/ttyS2", O_RDWR | O_NOCTTY | O_NDELAY);`.
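
For anyone who wants to repeat the test from C rather than with `cat`, here is a minimal sketch built around the same `open()` flags quoted above. The function name `try_open_port` is hypothetical and only serves this example; the path is passed in so the same code can be pointed at any device node.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Open a device with the flags quoted in the post, then close it again.
 * Returns 0 on success, or the errno value reported by open() on failure.
 * try_open_port is a hypothetical helper name, not part of any library. */
int try_open_port(const char *path)
{
    int fd = open(path, O_RDWR | O_NOCTTY | O_NDELAY);
    if (fd < 0) {
        int err = errno;  /* save errno before calling anything else */
        fprintf(stderr, "open(%s) failed: %s\n", path, strerror(err));
        return err;
    }

    /* Flush right away so the message is visible even if the board
     * hangs shortly afterwards. */
    printf("opened %s as fd %d\n", path, fd);
    fflush(stdout);

    close(fd);
    return 0;
}
```

Calling `try_open_port("/dev/ttyS2")` from a small `main()` should be enough to trigger the behaviour described here, if the crash really is tied to opening the device rather than to reading from it.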

`/dev/ttyS2` is linked to UART6. In the fex file, I have configured it like so:

    [uart_para6]
    uart_used = 1
    uart_port = 6
    uart_type = 2
    uart_tx = port:PI12<3><default><default><default>
    uart_rx = port:PI13<3><default><default><default>

Note that UART6 is on MUX3, as it should be according to [sunxi's PIO documentation for the A20](http://linux-sunxi.org/A20/PIO#PI12_.28268.29).

Any ideas?

Gerrit

Check whether PI12 or PI13 is in use for something else in the fex file.

Zaxter

Thank you for your suggestion @Gerrit.

I have checked the Fex file and these pins aren't in use elsewhere.

soenke

What happens if you use one of the other UARTs as ttyS2? Disable the UART of PI12/13 and enable some other.

Zaxter

I hit the problem even if I disable UART6 and try with other UARTs.

Here are the crash logs from the serial console:

    root@micro:~# cat /dev/ttyS2
    [   63.123577] i2c-0, xfer timeout
    [   63.130520] axp_mfd 0-0034: failed reading at 0x23
    [   63.139404] [cpu_freq] ERR:try to set voltage failed!


And here's the dmesg output for the registered UARTs:

    root@micro:~# dmesg | grep uart
    [    0.996902] [uart]: used uart info.: 0xc9
    [    1.001620] [uart]: serial probe 0 irq 33 mapbase 0x01c28000
    [    1.027944] sunxi-uart.0: ttyS0 at MMIO 0x1c28000 (irq = 33) is a U6_16550A
    [    1.889704] [uart]: serial probe 3 irq 36 mapbase 0x01c28c00
    [    1.921599] sunxi-uart.3: ttyS1 at MMIO 0x1c28c00 (irq = 36) is a U6_16550A
    [    1.933439] [uart]: serial probe 6 irq 51 mapbase 0x01c29800
    [    1.965318] sunxi-uart.6: ttyS2 at MMIO 0x1c29800 (irq = 51) is a U6_16550A
    [    1.977174] [uart]: serial probe 7 irq 52 mapbase 0x01c29c00
    [    2.009053] sunxi-uart.7: ttyS3 at MMIO 0x1c29c00 (irq = 52) is a U6_16550A


Have you seen something like this before?

Gerrit


soenke

That looks more like an i2c communication problem with the axp (which is the power management chip for the A20 processor).

Try running "i2cdetect 0" several times to see whether it also causes a crash, and whether the chip is detected at its address every time.

Also try setting the cpufreq governor to "performance" so that the axp doesn't have to change the CPU voltage while you access the UART.
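
As a rough sketch of the governor suggestion: on kernels that expose the usual cpufreq sysfs interface, the governor can be changed by writing its name to `scaling_governor`. Whether the sunxi 3.4 kernel exposes it at `/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor` is an assumption here, and `set_governor` is a hypothetical helper name.

```c
#include <stdio.h>

/* Write a governor name (e.g. "performance") to a cpufreq sysfs file.
 * Returns 0 on success, -1 on failure.
 * set_governor is a hypothetical helper; the sysfs path is supplied by
 * the caller because its exact location on this kernel is an assumption. */
int set_governor(const char *sysfs_path, const char *governor)
{
    FILE *f = fopen(sysfs_path, "w");
    if (f == NULL) {
        perror(sysfs_path);
        return -1;
    }

    int ok = fputs(governor, f) >= 0;

    /* sysfs reports write errors when the buffer is flushed,
     * so the result of fclose() has to be checked as well. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Run as root, a call such as `set_governor("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor", "performance")` would be the equivalent of echoing the name into that file from a shell.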

Zaxter

No luck trying with a fresh install on new SD card @Gerrit. Still the same problem. :(

`i2cdetect 0` works multiple times without crashing @soenke.

I ran `while true; do i2cdetect -y 0; sleep 1; done;`, and it worked every time. But a single `cat /dev/ttyS2` crashes the system.

Thanks for the cpufreq governor suggestion. I will try that.

Do you think a watchdog will be helpful?

Zaxter

This is an intermittent problem, and occurs once in ~3 power cycles.

The watchdog doesn't seem to be of any help. Neither does the `performance` setting for cpu frequency governor help.
I'm out of ideas now. :(

Zaxter

After I commented out a lot of unused kernel modules from `/etc/modules`, I hit this problem again. Only this time I got some more logs out of it on the serial console:

    [   89.965690] INFO: rcu_preempt detected stalls on CPUs/tasks:
    [   89.977834] .0: (1 GPs behind) idle=83f/140000000000001/0 drain=3 . timer=-1
    [   89.985239] .(detected by 1, t=6003 jiffies)
    [   89.996648] INFO: Stall ended before state dump start
    [  121.735208] INFO: rcu_bh detected stalls on CPUs/tasks:
    [  121.747684] .0: (4294966997 GPs behind) idle=83f/140000000000001/0 drain=3 . timer=-1
    [  121.755873] .(detected by 1, t=6002 jiffies)
    [  121.767282] INFO: Stall ended before state dump start
    [   91.068481] INFO: rcu_preempt detected stalls on CPUs/tasks:
    [   91.080617] .0: (1 GPs behind) idle=83f/140000000000001/0 drain=3 . timer=-1
    [   91.088022] .(detected by 1, t=24009 jiffies)
    [   91.099606] INFO: Stall ended before state dump start
    [  122.828003] INFO: rcu_bh detected stalls on CPUs/tasks:
    [  122.840473] .0: (4294966997 GPs behind) idle=83f/140000000000001/0 drain=3 . timer=-1
    [  122.848662] .(detected by 1, t=24007 jiffies)
    [  122.860245] INFO: Stall ended before state dump start
    [   92.161276] INFO: rcu_preempt detected stalls on CPUs/tasks:
    [   92.173412] .0: (1 GPs behind) idle=83f/140000000000001/0 drain=3 . timer=-1
    [   92.180817] .(detected by 1, t=42014 jiffies)
    [   92.192399] INFO: Stall ended before state dump start
    [  123.920798] INFO: rcu_bh detected stalls on CPUs/tasks:
    [  123.933272] .0: (4294966997 GPs behind) idle=83f/140000000000001/0 drain=3 . timer=-1
    [  123.941462] .(detected by 1, t=42012 jiffies)
    [  123.953044] INFO: Stall ended before state dump start



The only module I'm loading currently is the `sunxi-gpio` one. All others have been disabled.

JohnS

With such an out-of-date kernel, I suppose your choices are:
1. debug it yourself
2. pay someone
3. move to a recent kernel then if it still fails you can report it to the appropriate ML

John

Zaxter

I was able to solve the problem. It was related to UART settings in the Fex file.
Somehow, I hadn't set the pull-up on UART6 & UART7:

    [uart_para6]
    uart_used = 1
    uart_port = 6
    uart_type = 2
    uart_tx = port:PI12<3><default><default><default>
    uart_rx = port:PI13<3><default><default><default>
    #                          ^
    #                          |
    #                         Here


It should've been:

    [uart_para6]
    uart_used = 1
    uart_port = 6
    uart_type = 2
    uart_tx = port:PI12<3><1><default><default>
    uart_rx = port:PI13<3><1><default><default>


I guess I will be moving to the mainline kernel in the future.
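
To make the fix easier to check across a whole fex file, here is a small sketch that splits one pin entry into its four bracketed fields. It assumes the field order documented in the linux-sunxi Fex Guide (mux, pull, drive, data) and that a pull value of 1 means pull-up, which matches the fix above. The struct and function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* One fex pin entry, e.g. "port:PI12<3><1><default><default>".
 * A value of -1 stands for "<default>".
 * Field meanings are an assumption based on the linux-sunxi Fex Guide. */
struct fex_pin {
    char bank;   /* port bank letter, e.g. 'I' */
    int  num;    /* pin number within the bank, e.g. 12 */
    int  mux;    /* pin function (mux) selector */
    int  pull;   /* assumed: 0 = none, 1 = pull-up, 2 = pull-down */
    int  drive;  /* drive strength level */
    int  data;   /* initial output level */
};

/* Parse one "<N>" or "<default>" field and advance *p past it. */
static int parse_field(const char **p, int *out)
{
    const char *s = *p;
    if (*s != '<')
        return -1;
    const char *end = strchr(s, '>');
    if (end == NULL)
        return -1;

    size_t len = (size_t)(end - s) - 1;  /* characters between < and > */
    if (len == 7 && strncmp(s + 1, "default", 7) == 0) {
        *out = -1;
    } else {
        int consumed = 0;
        if (sscanf(s + 1, "%d%n", out, &consumed) != 1 ||
            (size_t)consumed != len || *out < 0)
            return -1;
    }
    *p = end + 1;
    return 0;
}

/* Parse a full pin entry. Returns 0 on success, -1 on malformed input. */
int parse_fex_pin(const char *s, struct fex_pin *pin)
{
    int consumed = 0;
    if (sscanf(s, "port:P%c%d%n", &pin->bank, &pin->num, &consumed) != 2)
        return -1;
    s += consumed;

    if (parse_field(&s, &pin->mux)   != 0 ||
        parse_field(&s, &pin->pull)  != 0 ||
        parse_field(&s, &pin->drive) != 0 ||
        parse_field(&s, &pin->data)  != 0)
        return -1;

    return *s == '\0' ? 0 : -1;
}
```

With this, a UART pin whose `pull` comes back as -1 is one that was left at `<default>`, which is the condition that caused the trouble in this thread.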

soenke

At least you found the problem :)
But I would never have suspected a missing pull-up...