Unable to connect using ssh and serial cable - system freezes

Started by ricar, May 23, 2016, 06:42:39 PM


ricar

Hello,
I have a problem with an A20-Olinuxino LIME board that sometimes freezes.
The device is an A20-Olinuxino LIME board running a Debian GNU/Linux 7.6 (wheezy) image booted from NAND. I mounted the Linux rootfs read-only to avoid NAND corruption.
When it freezes I can connect neither over ssh nor over the serial cable, but the board still responds to ping requests.
My application, running on the board, writes some log files to tmpfs without exceeding the free space available.
Originally I thought the problem was caused by running out of RAM, but monitoring the system every minute with the df -h and free commands showed that the RAM is never full.
I have the same problem even when I install Asterisk on the board instead of my application. Asterisk doesn't log anything to tmpfs, but the system still freezes.
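
For anyone wanting to repeat the once-a-minute monitoring described above, a minimal sketch could look like the following. The log path and the snapshot function name are assumptions for illustration, and the script reads /proc/meminfo (the same source the free command uses) so that it also works where free is not installed.

```shell
#!/bin/sh
# Sketch: append a timestamped memory and filesystem snapshot to a log file.
# LOG is a hypothetical path; a file on tmpfs is lost on reboot, so a location
# that survives a freeze (for example a network share) may be more useful.
LOG="${LOG:-/tmp/sysmon.log}"

snapshot() {
    # One record: a timestamp header, key RAM figures, then filesystem usage
    # (tmpfs mounts are included in the df output).
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
    grep -E '^(MemTotal|MemFree|Buffers|Cached):' /proc/meminfo
    df -h
}

# Uncomment to sample once per minute until interrupted:
# while true; do snapshot >> "$LOG"; sleep 60; done
```

The last record written before a freeze then shows how much memory and tmpfs space was in use at that point.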

Has anyone had the same problem and can help me?


soenke

Have you ever witnessed a freeze while you had the serial console open? If it is the kernel that freezes the system, there will be some output on uart0 that could help with debugging.

In any case, check your power supply: use 5V with at least 2A (and maybe not the cheapest China brand).
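
Since the freeze is hard to catch live, one option is to log the uart0 output continuously on a second machine. This is only a sketch: the device name /dev/ttyUSB0, the 115200 8N1 settings, the log file name and the timestamp_lines helper are all assumptions, so adjust them to your adapter and console configuration.

```shell
#!/bin/sh
# Sketch: log everything the board prints on its serial console, with timestamps.
# Assumptions: the USB-serial adapter appears as /dev/ttyUSB0 on the logging PC
# and the console runs at 115200 8N1.
PORT="${PORT:-/dev/ttyUSB0}"
BAUD="${BAUD:-115200}"

timestamp_lines() {
    # Prefix every line read from stdin with the local time.
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Uncomment on the logging PC (stty -F is the GNU coreutils form):
# stty -F "$PORT" "$BAUD" cs8 -cstopb -parenb raw -echo
# timestamp_lines < "$PORT" | tee -a serial-console.log
```

With the log running unattended, the kernel messages from the moment of the freeze are kept even if nobody is watching the console.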

ricar

Thank you for your reply and for your suggestion.
I would like to give some information about the Debian versions installed on the boards.
The board with Asterisk runs Debian GNU/Linux 7.8; the board with my application runs Debian GNU/Linux 7.6.
The output of "uname -a" on both boards is:
Linux xxxxx 3.4.102 #1 SMP PREEMPT Thu Sep 4 12:55:36 CEST 2014 armv7l GNU/Linux
The board with Debian GNU/Linux 7.8 freezes more frequently than the board with Debian GNU/Linux 7.6.
Could the Debian version be causing the freezes?


soenke

As I said, try another power supply and check the output on uart0 when a freeze occurs.

I don't think it depends on the kernel/Debian version; it sounds more like a hardware/driver issue.

ricar

I'm watching the output on uart0 as you suggested and waiting for the next "random" freeze. It is difficult to reproduce.
What do you mean by a hardware/driver issue? I tested 6 boards and each one has the same problem.

soenke


ricar

The SD card slot is empty because I boot from NAND.
3 boards share the same power supply and the others use different power supplies.

soenke

Well, then it looks more like a driver issue. The kernel output will probably help.

ricar

Hi,
I found these messages on the serial console:
Unable to handle kernel paging request at virtual address 000103ec
Internal error: Oops: 17 [#2] PREEMPT SMP ARM
Unable to handle kernel NULL pointer dereference at virtual address 00000168
Modules linked in:[83814.799501]
LR is at __raw_spin_lock+0x20/0x94
pc : [<c001c224>]    lr : [<c0584848>]    psr: 60000013
sp : d162fc00  ip : 00000000  fp : d162fc2c
r10: c083a658  r9 : d162e000  r8 : 00000001
r7 : d162e000  r6 : 00000000  r5 : c083f940  r4 : 00000000
r3 : 00000000  r2 : d162fbf0  r1 : d162e000  r0 : 00000001
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
......
......
kernel:[83814.585336] Stack: (0xd162ff2c to 0xd1630000)
kernel:[83814.596806] ff20:    <1>Fixing recursive fault but reboot is needed!

P.S. These messages are displayed on the serial console and in the ssh console. Now I'm unable to send commands over the serial or ssh console, but the device still responds to ping requests.

JohnS

If it's always the same message (you must check), then it may be a driver or kernel bug, but if it changes, it may well be bad power or RFI/EMI.
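
One way to check whether it is always the same message is to count the fault headlines across several saved console logs. This is a rough sketch: oops_summary is a hypothetical helper, the log file names are assumptions, and the patterns only cover the kind of lines shown in the post above.

```shell
#!/bin/sh
# Sketch: count how often each fault headline occurs in saved console logs.
oops_summary() {
    # -o keeps only the matching part, which drops any "kernel:[time]" prefix
    # so that identical faults group together; reads stdin if no files given.
    grep -h -o -E 'Unable to handle kernel .*|(PC|LR) is at .*' "$@" \
        | sort | uniq -c | sort -rn
}

# Example (file names are hypothetical):
# oops_summary serial-console-*.log
```

If one line dominates the counts across freezes, that points towards a reproducible software fault; a wide spread of different addresses and functions fits the bad power or RFI/EMI explanation better.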

Try another kernel if you think it may be the kernel.

John


soenke

You need to replace the uImage-file in the boot-partition as well as the modules-directory in /lib/modules.
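
After swapping the kernel, a quick sanity check is to confirm that a modules directory exists for the release that is actually running, since modules are looked up under /lib/modules/$(uname -r). The sketch below assumes nothing about how the board boots; modules_match is a hypothetical helper, and its two parameters exist only so it can be pointed at another tree.

```shell
#!/bin/sh
# Sketch: check that a modules directory exists for a given kernel release.
# Defaults: the running kernel (uname -r) and /lib/modules.
modules_match() {
    release="${1:-$(uname -r)}"
    moddir="${2:-/lib/modules}"
    if [ -d "$moddir/$release" ]; then
        echo "ok: $moddir/$release exists"
    else
        echo "mismatch: running $release, but $moddir contains: $(ls "$moddir" 2>/dev/null | tr '\n' ' ')"
        return 1
    fi
}

# Example on the board:
# modules_match
```

A mismatch means the kernel that booted is not the one whose modules are installed, which usually indicates the bootloader is still loading the old kernel image.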

ricar

I upgraded from Wheezy to Jessie, but after a reboot "uname -a" still displays the old kernel version (3.4.102).
In the /lib/modules directory there are modules for the new version (3.16.0-4-armmp), and the /boot directory contains these files:

config-3.16.0-4-armmp      script.bin                 uImage
initrd.img-3.16.0-4-armmp  System.map-3.16.0-4-armmp  vmlinuz-3.16.0-4-armmp

Why doesn't "uname -a" display the new kernel version?
P.S. The cat /etc/debian_version command displays the new version, 8.4.


soenke

I didn't even know that mainline 3.16 can boot on an A20 :)
I thought there was a separate boot partition on the SD card containing the kernel?
Maybe you should try one of igor's kernels; he has a 4.4 or 4.5 ready to download and install, but you will also need to update U-Boot (the bootloader). There is even a small tutorial, afaik. Just search the forum for his posts; there are multiple topics about this.

ricar

My device doesn't boot from the SD card but from NAND. I think the device boots from the uImage in the /boot directory. Is that not correct?