
svn: E200030: sqlite: database disk image is malformed #403

@slovichon

On 2015-01-06 11:33:26 -0500, Zhihui Zhang wrote:

<zhihui> A    projects-krakatoa/compat/getmntinfo/Makefile
<zhihui> A    projects-krakatoa/compat/getmntinfo/getmntinfo_compat.c
<zhihui> A    projects-krakatoa/compat/getifaddrs
<zhihui> A    projects-krakatoa/compat/getifaddrs/getifaddrs_compat.c
<zhihui> A    projects-krakatoa/compat/getifaddrs/Makefile
<zhihui> svn: E200030: sqlite: database disk image is malformed
<zhihui> svn: E200030: sqlite: database disk image is malformed
<zhihui> svn: E200030: sqlite: database disk image is malformed
<zhihui> svn: E200030: sqlite: database disk image is malformed
<zhihui> zhihui@lime: /zzh-slash2/zhihui/projects-krakatoa$ 
<zhihui> lots of the following:
<zhihui> [1420494042:348387 slirathr0:7fffc3fff700:slvr slvr.c slvr_fsio 546] slvr@0x7fff780200c0 num=128 pw=0 pr=1 ts=0:000000000 bii=0x7fff1c002648 slab=0x7fff04020410 bmap=0x7fff1c0025e0 fid=0x048c000000011043 iocb=(nil) flgs=fp--l---- :: bad crc blks=32 off=268435456
<zhihui> [1420494042:354333 slirathr0:7fffc3fff700:slvr slvr.c slvr_fsio 546] slvr@0x7fff20001f60 num=129 pw=0 pr=1 ts=0:000000000 bii=0x7fff1c002648 slab=0x7fff500106b0 bmap=0x7fff1c0025e0 fid=0x048c000000011043 iocb=(nil) flgs=fp--l---- :: bad crc blks=32 off=269484032
<zhihui> [1420494051:405046 slirathr0:7fffc3fff700:slvr slvr.c slvr_fsio 546] slvr@0x7fffbc007f10 num=128 pw=0 pr=1 ts=0:000000000 bii=0x7fff1c002648 slab=0x7fffb40078a0 bmap=0x7fff1c0025e0 fid=0x048c000000011043 iocb=(nil) flgs=fp--l---- :: bad crc blks=32 off=268435456
<zhihui> [1420494051:409341 slirathr0:7fffc3fff700:slvr slvr.c slvr_fsio 546] slvr@0x7fffb4006d90 num=129 pw=0 pr=2 ts=0:000000000 bii=0x7fff1c002648 slab=0xa85050 bmap=0x7fff1c0025e0 fid=0x048c000000011043 iocb=(nil) flgs=fp--l---- :: bad crc blks=32 off=269484032
<zhihui> this happens with iozone -a, a kernel build, and a self build running at the same time.
<yanovich_> hmm
<yanovich_> i can do a make build without a problem
<yanovich_> that said, there are known I/O problems
<yanovich_> specifically there is a race condition with multiple threads on the same file when no csvc exists yet
<yanovich_> some threads get ETIMEDOUT immediately
<yanovich_> bad multiwait code
<yanovich_> it *might* be that
<yanovich_> can you strace it to find out?

On 2015-01-07 14:12:14 -0500, Zhihui Zhang wrote:

At 

zhihui@orange: ~/projects-orange$ svn info
Path: .
Working Copy Root Path: /home/zhihui/projects-orange
URL: svn+ssh://frodo/cluster/svn/projects
Repository Root: svn+ssh://frodo/cluster/svn
Repository UUID: 3eda493b-6a19-0410-b2e0-ec8ea4dd8fda
Revision: 25045
Node Kind: directory
Schedule: normal
Last Changed Author: zhihui
Last Changed Rev: 25045
Last Changed Date: 2014-12-19 16:08:46 -0500 (Fri, 19 Dec 2014)

[1420656909:602534 msnbrqthr0:7fff75ffb700:rpc rpcclient.c pscrpc_abort_inflight 1368] req@0x7ffea71da060 x8629185/t0 cb=0x4f6229 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 0 fl Rpc:/0/0 replyc 0 rc 0/0 to=60 sent=1420656877 :: aborted
[1420656909:602585 msnbrqthr0:7fff75ffb700:rpc rpcclient.c pscrpc_abort_inflight 1368] req@0x7ffeb3515950 x8629188/t0 cb=0x417d0a c0 o42->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 0 fl Rpc:/0/0 replyc 0 rc 0/0 to=60 sent=1420656877 :: aborted
[1420656909:602637 msnbrqthr0:7fff75ffb700:rpc rpcclient.c pscrpc_abort_inflight 1368] req@0x7ffea657ab40 x8629191/t0 cb=0x502a62 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 2 fl Rpc:EXA/1/0 replyc 0 rc -110/0 to=15 sent=1420656893 :: aborted
[1420656909:602682 msnbrqthr0:7fff75ffb700:rpc rpcclient.c pscrpc_abort_inflight 1368] req@0x7ffeb3516180 x8629192/t0 cb=0x502a62 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 1 fl Rpc:A/1/0 replyc 0 rc 0/0 to=15 sent=1420656893 :: aborted
[1420656909:602731 msnbrqthr0:7fff75ffb700:rpc rpcclient.c pscrpc_abort_inflight 1368] req@0x7ffeb324a490 x8629193/t0 cb=0x502a62 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 1 fl Rpc:A/1/0 replyc 0 rc 0/0 to=15 sent=1420656893 :: aborted
[1420656909:602790 msnbrqthr0:7fff75ffb700:def rpc_common.c sl_imp_hldrop_resm 580] req@0x7ffea71da060 x8629185/t0 cb=0x4f6229 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 0 fl Rpc:E/0/0 replyc 0 rc 0/0 to=60 sent=1420656877 :: aborted
[1420656909:602842 msnbrqthr0:7fff75ffb700:def rpc_common.c sl_imp_hldrop_resm 580] req@0x7ffeb3515950 x8629188/t0 cb=0x417d0a c0 o42->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 0 fl Rpc:E/0/0 replyc 0 rc 0/0 to=60 sent=1420656877 :: aborted
[1420656909:602894 msnbrqthr0:7fff75ffb700:def rpc_common.c sl_imp_hldrop_resm 580] req@0x7ffea657ab40 x8629191/t0 cb=0x502a62 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 2 fl Rpc:EXA/1/0 replyc 0 rc -110/0 to=15 sent=1420656893 :: aborted
[1420656909:602944 msnbrqthr0:7fff75ffb700:def rpc_common.c sl_imp_hldrop_resm 580] req@0x7ffeb3516180 x8629192/t0 cb=0x502a62 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 1 fl Rpc:EA/1/0 replyc 0 rc 0/0 to=15 sent=1420656893 :: aborted
[1420656909:602989 msnbrqthr0:7fff75ffb700:def rpc_common.c sl_imp_hldrop_resm 580] req@0x7ffeb324a490 x8629193/t0 cb=0x502a62 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 1 fl Rpc:EA/1/0 replyc 0 rc 0/0 to=15 sent=1420656893 :: aborted
[1420656909:603051 msnbrqthr0:7fff75ffb700:rpc rpcclient.c pscrpc_check_set 839] req@0x7ffea657ab40 x8629191/t0 cb=0x502a62 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 2 fl Rpc:EXA/1/0 replyc 0 rc -110/0 to=15 sent=1420656893 :: expired (resend=0)
[1420656909:603106 msnbrqthr0:7fff75ffb700:bmap io.c msl_readahead_cb 1151] bmap@0x7fff187897e0 bno:1 flg:0x83:RWT fid:0x048c000000059d2d opcnt=98 : sbd_seq=8491521
[1420656909:603200 msfsthr18:7ffefa7fc700:fcmh io.c msl_io 2111] fcmh@0xa312e0 f+g=0x048c000000059d2d:0 flg=0x44:BA ref=2 sz=268435456 bsz=3920 mode=0100640 : q=0xe76f90 bno=1 sz=0 tlen=0 off=203161600 roff=69074944 rw=read rc=-110
[1420656909:603349 msfsthr18:7ffefa7fc700:def io.c mfsrq_seterr 615] setting rqinfo q=0xe76f90 err=-110
[1420656910:000101 msnbrqthr0:7fff75ffb700:rpc rpcclient.c pscrpc_check_set 794] req@0x7ffeb3516180 x8629192/t0 cb=0x502a62 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 1 fl Rpc:EA/1/0 replyc 0 rc 0/0 to=15 sent=1420656893 :: -EIO set, rq_err = 1
[1420656910:000294 msnbrqthr0:7fff75ffb700:bmap io.c msl_readahead_cb 1151] bmap@0x7fff187897e0 bno:1 flg:0x83:RWT fid:0x048c000000059d2d opcnt=65 : sbd_seq=8491521
[1420656910:000603 msnbrqthr0:7fff75ffb700:rpc rpcclient.c pscrpc_check_set 794] req@0x7ffea71da060 x8629185/t0 cb=0x4f6229 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 0 fl Rpc:E/0/0 replyc 0 rc 0/0 to=60 sent=1420656877 :: -EIO set, rq_err = 1
[1420656910:000667 msnbrqthr0:7fff75ffb700:rpc io.c msl_read_cb 853] req@0x7ffea71da060 x8629185/t0 cb=0x4f6229 c0 o33->@54321-128.182.99.26@tcp10:30 lens 264/192 ref 1 res 0 ret 0 fl Complete:E/0/0 replyc 0 rc -5/0 to=60 sent=1420656877 :: bmap=0x7fff00003030 biorq=0x7ffedc1801f0
[1420656910:000720 msnbrqthr0:7fff75ffb700:bmap io.c msl_read_cb 858] bmap@0x7fff00003030 bno:0 flg:0x81:RT fid:0x048c0000000564e1 opcnt=2 : sbd_seq=8491357
[1420656910:000768 msnbrqthr0:7fff75ffb700:bmap io.c msl_read_cb 860] biorq@0x7ffedc1801f0 flg=0x1:r ref=2 off=0 len=1343 retry=0 buf=(nil) rqi=0xee2f90 sliod=ffffffff np=1 b=0x7fff00003030 ex=1420656877:548036830 : rc=-5
[1420656910:000831 msnbrqthr0:7fff75ffb700:def io.c mfsrq_seterr 615] setting rqinfo q=0xee2f90 err=-5

On 2015-01-07 14:13:56 -0500, Zhihui Zhang wrote:

[1420657000:001690 sliricthr06:7fff767fc700:rpc service.c pscrpc_server_handle_request 449] req@0xdd5a90 x8629189/t0 cb=(nil) c0 o33->@128.182.99.27@tcp10:-1 lens 264/192 ref 0 res 0 ret 0 fl Complete:/0/0 replyc 0 rc -110/-110 to=0 sent=0 :: timeout, processed in 122s
[1420657030:001388 sliricthr15:7fff557fa700:rpc rsx.c rsx_bulkserver 147] req@0xdbb3d0 x8629193/t0 cb=(nil) c0 o33->@128.182.99.27@tcp10:-1 lens 264/192 ref 0 res 0 ret 0 fl Interpret:/1/0 replyc 0 rc 0/0 to=0 sent=0 :: timeout on bulk GET
[1420657030:001612 sliricthr15:7fff557fa700:rpc rsx.c rsx_bulkserver 213] req@0xdbb3d0 x8629193/t0 cb=(nil) c0 o33->@128.182.99.27@tcp10:-1 lens 264/192 ref 0 res 0 ret 0 fl Interpret:/1/0 replyc 0 rc 0/0 to=0 sent=0 :: ignoring bulk I/O comm error; id U20873-128.182.99.27@tcp10 - client will retry
[1420657030:001674 sliricthr15:7fff557fa700:def ric.c sli_ric_handle_io 381] bulkserver error on read, rc=-110
[1420657030:001746 sliricthr15:7fff557fa700:rpc service.c pscrpc_target_send_reply_msg 593] req@0xdbb3d0 x8629193/t


Although I did hit the above error, the test went much better overall: I ran at least twice as long and did not see the original error.

On 2015-03-16 23:54:23 -0400, Jared Yanovich wrote:

In situations like these, the application-level errors are too high-level to mean anything to us ("image is malformed" isn't very specific).

If this is the only application that is having problems, then these reports are useful.  But if simpler IO tests with more specific failure reports can tickle bugs, these should be reported and analyzed first.
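
As a rough illustration of what a "simpler IO test with a more specific failure report" might look like, the sketch below writes a file made of deterministic blocks, reads it back, and reports the index and byte offset of every block that does not match. This is not part of the SLASH2 test suite; the function names, the 1 MiB block size, and the pattern generator are all assumptions made for the example.

```python
import hashlib
import os

BLOCK = 1 << 20  # 1 MiB per block; an arbitrary choice for this sketch


def block_pattern(index, size=BLOCK):
    """Deterministic contents for block `index`, so a bad block can be identified."""
    seed = hashlib.sha256(str(index).encode()).digest()  # 32 bytes
    return (seed * (size // len(seed) + 1))[:size]


def write_blocks(path, nblocks, size=BLOCK):
    """Write `nblocks` patterned blocks to `path` and flush them to storage."""
    with open(path, "wb") as f:
        for i in range(nblocks):
            f.write(block_pattern(i, size))
        f.flush()
        os.fsync(f.fileno())


def verify_blocks(path, nblocks, size=BLOCK):
    """Return (block index, byte offset) for every block that reads back wrong."""
    bad = []
    with open(path, "rb") as f:
        for i in range(nblocks):
            data = f.read(size)  # a short read also counts as a mismatch
            if data != block_pattern(i, size):
                bad.append((i, i * size))
    return bad
```

To use it, call `write_blocks(path, n)` with `path` on the mount under test, then `verify_blocks(path, n)`; an empty list means every block matched. Note that a read-back on the same client may be served from cache, so verifying from a second client or after a remount is likely to be more telling.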

Fortunately we are already rid of svn as a test for self-build.

Is this reproducible?  I have not encountered any such issues in HEAD for some time.

On 2015-04-08 10:34:28 -0400, Zhihui Zhang wrote:

Saw this again today, with something interesting:

[New Thread 0x7ffeae7d4700 (LWP 4985)]
[New Thread 0x7ffeadfd3700 (LWP 4986)]
[New Thread 0x7ffead7d2700 (LWP 4987)]
[1428503388:189656 sliricthr10:7fff10ff9700:slvr slvr.c slvr_fsio 500] no backing file: 0x04900000000134be:0 fd=-1
[1428503388:189770 sliricthr10:7fff10ff9700:def ric.c sli_ric_write_sliver 78] write error rc=-9
[1428503495:690413 sliricthr04:7fff13fff700:slvr slvr.c slvr_fsio 500] no backing file: 0x04900000000140dc:0 fd=-1
[1428503495:690631 sliricthr04:7fff13fff700:def ric.c sli_ric_write_sliver 78] write error rc=-9
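
For reference when reading the `rc=` and `err=` fields in the logs in this thread: assuming they are negated Linux errno values (which is consistent with the ETIMEDOUT mentioned earlier and with the "-EIO set" message), a small helper can decode them. The table below hardcodes only the three Linux values seen here, and `decode_rc` is a hypothetical name, not a SLASH2 tool.

```python
import re

# Linux errno numbers for the codes that appear in this thread.
# Assumption: the logs come from Linux hosts; other platforms number these differently.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    9: ("EBADF", "Bad file descriptor"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Matches "rc=-110", "err=-5", and the "rc -110/0" form used in the RPC lines.
RC_PATTERN = re.compile(r"\b(?:rc|err)[= ](-\d+)")


def decode_rc(line):
    """Return (value, errno name) for each negative rc/err field in a log line."""
    out = []
    for m in RC_PATTERN.finditer(line):
        value = int(m.group(1))
        name, _desc = LINUX_ERRNO.get(-value, ("UNKNOWN", ""))
        out.append((value, name))
    return out
```

Read this way, `rc=-110` on the read path is a timeout, `rc=-5` is the generic I/O error handed back to the application, and `rc=-9` in the "no backing file ... fd=-1" lines is a write attempted on an invalid file descriptor.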
