[linux-lvm] How to fix inconsistent LV structs?
Glenn Shannon
warl0k at lvcm.com
Mon Oct 7 05:35:16 UTC 2002
Was this computer connected to a network when it went down?
Looks like a stack overflow to me, but where was the originator? Coda?
Unlikely but possible. Replaced syscalls usually (IIRC) indicate that
not only did a stack overflow occur but also that registers were
modified by whatever overflowed the stack. The kernel is noting itself
as tainted, which means that some form of non-GPLed module is running
(or has entered the stack to converse with the kernel directly, acting
as a module). After the initial oops it appears to cascade to the rest
of the network-aware daemons, finally uprooting xfs and (presumably)
b0rking the drive(s).
Just a thought I had while reading the syslog... forgive me if anything
I say is incorrect; I am by no means the kernel hacker that Heinz is :)
Glenn
--Dawn is Nature's way of telling you that it is time for bed.
-----Original Message-----
From: linux-lvm-admin at sistina.com [mailto:linux-lvm-admin at sistina.com]
On Behalf Of Raffael Herzog
Sent: Monday, October 07, 2002 2:19 AM
To: linux-lvm at sistina.com
Subject: Re: [linux-lvm] How to fix inconsistent LV structs?
Hi Heinz,
Heinz J . Mauelshagen wrote:
> Hmmm...
> Sounds like a nasty overwrite but it is hard to tell because you
> can't remember the exact details :(
Well, I can, the syslog is one of the few things that still
exists, besides the backup... :-) These are the last few
messages of the catastrophic reboot:
,----[ /var/log/syslog ]
| Oct 5 21:08:33 rumba kernel: Coda: Bye bye.
| Oct 5 21:08:33 rumba kernel: redir cleanup
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 12 [e0a01674] with [c012e408]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 106 [e0a017a0] with [c0134ad0]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 107 [e0a0184c] with [c0134bb0]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 33 [e0a018fc] with [c012e2dc]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 5 [e0a019c0] with [c012ecb4]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 85 [e0a01a9c] with [c0134ce0]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 183 [e0a01bd0] with [c013ef44]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 195 [e0a01d7c] with [c0134ebc]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 196 [e0a01e30] with [c0134f30]
| Oct 5 21:08:33 rumba kernel: replacing syscall nr. 11 [e0a01f4c] with [c0105a30]
| Oct 5 21:08:33 rumba kernel: Unable to handle kernel paging request at virtual address e0a019fb
| Oct 5 21:08:33 rumba kernel: printing eip:
| Oct 5 21:08:33 rumba kernel: e0a019fb
| Oct 5 21:08:33 rumba kernel: *pde = 01870067
| Oct 5 21:08:33 rumba kernel: *pte = 00000000
| Oct 5 21:08:33 rumba kernel: Oops: 0000
| Oct 5 21:08:33 rumba kernel: CPU: 0
| Oct 5 21:08:33 rumba kernel: EIP: 0010:[<e0a019fb>] Tainted: P
| Oct 5 21:08:33 rumba kernel: EFLAGS: 00010286
| Oct 5 21:08:33 rumba kernel: eax: 00000005 ebx: 08094482 ecx: d27ea3e0 edx: c1807ea0
| Oct 5 21:08:33 rumba kernel: esi: 00000241 edi: 08094482 ebp: dcda5fbc esp: dcda5f94
| Oct 5 21:08:33 rumba kernel: ds: 0018 es: 0018 ss: 0018
| Oct 5 21:08:33 rumba kernel: Process avfscoda (pid: 354, stackpage=dcda5000)
| Oct 5 21:08:33 rumba kernel: Stack: 08094482 00000241 000001b6 dd27e360 dcda4000 00000241 08094482 00000001
| Oct 5 21:08:33 rumba kernel: c0141df8 c0106e0c bffff6f8 c0106d1b 08094482 00000241 000001b6 00000241
| Oct 5 21:08:33 rumba kernel: 08094482 bffff6f8 00000005 0000002b 0000002b 00000005 4017b2e4 00000023
| Oct 5 21:08:33 rumba kernel: Call Trace: [sys_oldumount+12/16] [error_code+52/60] [system_call+51/56]
| Oct 5 21:08:33 rumba kernel:
| Oct 5 21:08:33 rumba kernel: Code: Bad EIP value.
| Oct 5 21:08:33 rumba kernel: <6>i8k: module unloaded
| Oct 5 21:08:35 rumba nmbd[7091]: [2002/10/05 21:08:35, 0] nmbd/nmbd.c:sig_term(63)
| Oct 5 21:08:35 rumba nmbd[7091]: Got SIGTERM: going down...
| Oct 5 21:08:35 rumba xfs[593]: terminating
| Oct 5 21:08:35 rumba xfs[594]: terminating
| Oct 5 21:08:35 rumba ntpd[604]: ntpd exiting on signal 15
| Oct 5 21:08:36 rumba usbmgr[12064]: umount /proc/bus/usb
| Oct 5 21:08:36 rumba rpc.statd[265]: Caught signal 15, un-registering and exiting.
| Oct 5 21:08:36 rumba kernel: Kernel logging (proc) stopped.
| Oct 5 21:08:36 rumba kernel: Kernel log daemon terminating.
| Oct 5 21:08:36 rumba exiting on signal 15
`----
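One way to read the log above is to compare the faulting EIP with the addresses printed in the "replacing syscall" lines. The sketch below does only that arithmetic. It assumes that the first bracketed value on each line is the handler address being swapped out, which the log itself does not state, and the function name is made up for illustration. It is a reading aid, not a diagnosis.

```python
# Addresses copied from the "replacing syscall nr. N [old] with [new]" lines.
# Assumption: the first bracketed value is the handler being swapped out.
replaced = {
    12: 0xe0a01674, 106: 0xe0a017a0, 107: 0xe0a0184c, 33: 0xe0a018fc,
    5: 0xe0a019c0, 85: 0xe0a01a9c, 183: 0xe0a01bd0, 195: 0xe0a01d7c,
    196: 0xe0a01e30, 11: 0xe0a01f4c,
}

# Faulting address from the "printing eip:" line.
fault_eip = 0xe0a019fb


def nearest_handler_below(addr, handlers):
    """Return (syscall_nr, start) for the highest start address <= addr, or None."""
    candidates = [(start, nr) for nr, start in handlers.items() if start <= addr]
    if not candidates:
        return None
    start, nr = max(candidates)
    return nr, start


nr, start = nearest_handler_below(fault_eip, replaced)
offset = fault_eip - start
print(f"EIP {fault_eip:#x} is {offset:#x} bytes past the old entry for syscall {nr}")
```

Under that assumption, the faulting address lies just past the old entry for syscall 5, and eax in the register dump also holds 00000005.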
For a very short time (that laptop is *fast* :-) I've seen a
message about a failed umount, then it went down and never
came up again.
> > But how do I clear these structs?
>
> Presuming that the metadata backups are intact, you need to "pvcreate -ff"
> the physical volumes and run vgcfgrestore on each of them.
> "vgscan ; vgchange -ay" should get you back to business afterwards.
Yes, I thought this would help, too. But it didn't. :-(
Commands always failed with "pv_read(): read" or "pv_read():
<something about creating names from kdev>". Because I
needed my laptop back up again today, I restored my backup
yesterday evening, so unfortunately I can no longer help to
find out what actually happened... :-(
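For reference, the sequence Heinz describes can be written out as an ordered command list. This is only a sketch: it builds the commands and prints them without running anything. The volume group name `vg00` and the device `/dev/hda5` are placeholders, and the `vgcfgrestore -n <vg> <pv>` form is my assumption about the LVM1 syntax, so check the man page first. Note that "pvcreate -ff" overwrites the PV metadata, so it is only safe with intact metadata backups.

```python
def lvm1_restore_plan(vg_name, pv_paths):
    """Build the recovery command sequence as lists of arguments.

    Nothing is executed here; the caller decides what to do with the plan.
    """
    plan = []
    # Step 1: re-initialise every physical volume (destructive).
    for pv in pv_paths:
        plan.append(["pvcreate", "-ff", pv])
    # Step 2: restore the volume group metadata onto each physical volume.
    # Assumed LVM1 argument form: vgcfgrestore -n <vg> <pv>
    for pv in pv_paths:
        plan.append(["vgcfgrestore", "-n", vg_name, pv])
    # Step 3: rescan and activate.
    plan.append(["vgscan"])
    plan.append(["vgchange", "-ay"])
    return plan


# Placeholder names, not taken from the thread.
for cmd in lvm1_restore_plan("vg00", ["/dev/hda5"]):
    print(" ".join(cmd))
```

Printing the plan first makes it easy to review every step before typing it on a machine where the metadata is already damaged.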
cu,
Raffi
--
=> New to Usenet? Questions? http://www.use-net.ch/usenet_intro_de.html <=
The difference between theory and practice is that in theory, there is
no difference, but in practice, there is.
Raffael Herzog - herzog at raffael.ch - http://www.raffael.ch - ICQ #67961355
_______________________________________________
linux-lvm mailing list
linux-lvm at sistina.com
http://lists.sistina.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/