From d5a9aff764df14aa9ed89e781308eb79ea07ddf8 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 28 Jun 2010 09:16:24 +1000 Subject: [PATCH] README update and bug-ons to help catch newly identified problems. Signed-off-by: NeilBrown --- README | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++--- clean.c | 10 +++++--- cluster.c | 1 + 3 files changed, 78 insertions(+), 7 deletions(-) diff --git a/README b/README index 4ad31ad..120ad15 100644 --- a/README +++ b/README @@ -4790,13 +4790,17 @@ DONE 3/ clean up the various 'scratch' patches discarding any tracing that DONE 4/ check in this README file DONE 5/ Write rest of the TODO list - 5a/ index.c:1982. Data block with Phys and no UnincCredit +DONE 5a/ index.c:1982. Data block with Phys and no UnincCredit It is Dirty but only has *N credits. 16/1 ... - 5b/ phase_flip/pin_all_children/lafs_refile finds refcnt == 0; +DONE 5b/ phase_flip/pin_all_children/lafs_refile finds refcnt == 0; I guess we should getref/putref. + 5c/ dirty_inode might find InoIdx is allocated by datablock not + and doesn't cope well. + +DONE 5d/ At unmount, 16/1 is still pinned. 6/ soft lockup in unlink call. EIP is at lafs_hash_name+0xa5/0x10f [lafs] @@ -4821,7 +4825,44 @@ Why have I no credits? Cleaning is racing with truncate, and that cannot happen!! - 8/ looping in do_checkpoint + 7a/ block.c:507 in lafs_dirty_dblock - no credits for 0/2 + block.c:507: [cfa63c58]0/2(4348)r2F:Valid,Dirty,Writeback,PhysValid cluster(1) iblock(1) + in touch_atime. I think I know this one. + + 7b/ soft lockup in cleaner between 0x5e6, then 0x799-7f6 then 0x990 of 0x1502 + i.e. 1510, 1945-2038, 2448 of 5378 + Appear to be looping in first loop of try_clean, maybe + group_size_words == 0 ?? + + 7c/ NULL pointer deref - 000001b4 + Could be cluster_flush finds inode dblock without inode. + + 7d/ paging request at 6b6b6bfb. + invalidate_inode_buffers called, so inode_has_buffers, + so private_list is not empty. So presumably use-after-free. + But is on s_inodes list. + Probably cleaner is still active (if this is first call to + invalidate_inodes in generic_shutdown_super) so list gets broken. + We need locking or earlier flush. + + 7e/ Remove BUG block.c;273 as cleaner can cause this. + Check for Realloc too. + + 7f/ index.c:2024 no uninc credit + [ce532338]0/306(2996)r1F:Pinned,Phase0,Valid,Dirty,Writeback,SegRef,Claimed,PhysValid cluster(1) + found during checkpoint. Maybe inode credit problem. + + 7g/ inode.c:831 InoIdx 283/0 is Realloc, not dirty, and has + ->uninc blocks. This is during truncate. Need some + interlock with cleaner maybe? + + 7h/ truncate finds children - Realloc on clean-leafs + + 7j/ resolve space allocation issues. + Understand why CleanSpace can be tried and failed 1000 + times before there is any change. + +DONE 8/ looping in do_checkpoint root is still i Phase1 because 0/2 is in Phase 1 [cfa57c58]0/2(2078)r1E:Pinned,Phase1,WPhase0,Valid,Dirty,C,CI,CN,CNI,UninCredit,IOLock,PhysValid writepageflush(1) Seems to be waiting for writeback, but writeback is clear. @@ -4852,6 +4893,7 @@ Why have I no credits? [cfa5fc58]0/2(2852)r0E:Valid,Dirty,SegRef,CN,CNI,UninCredit,PhysValid [cfa53c58]0/2(3570)r0E:Valid,Dirty,CN,CNI,UninCredit,PhysValid [cfa53828]0/2(2969)r0E:Valid,Dirty,CN,CNI,UninCredit,PhysValid + [cfa57c58]0/2(579)r0E:Valid,Dirty,UninCredit,PhysValid maybe dir-orphan handling stuffed up 12/ timeout/showstate in unmount @@ -4900,7 +4942,7 @@ Why have I no credits? 23/ Rebase on 2.6.latest -24/ load/dirty block0 before dirtying any othe rblock in depth=0 file, +24/ load/dirty block0 before dirtying any other block in depth=0 file, else we might lose block0 25/ use kmem_cache for @@ -4966,6 +5008,8 @@ Why have I no credits? 48/ Review all code, improve all comments, remove all bugs. +49/ measure performance + 26June2010 Investigating 5a @@ -5059,3 +5103,25 @@ Why have I no credits? parent is finally incorporated. So we don't write quota blocks until checkpoint is done. So yes, they are like SegmentMap ]] + + + segsums.... + If there are hundreds of snapshots, then a block being cleaned (whether to + cleaner segment or new-data segment) could affect hundreds of segment + usage counters. That would be clumsy to work with. Every block in the + free table would need to hold references to hundreds of blocks. This + is do-able and might not be a big waste of space, but is still clumsy. + I could change the arrangement for accounting per-snapshot usage by having + a limited number of snapshots and having all the counters for one segment + in the one blocks. So 1024byte block could hold 512 counters (youth plus + base plus 510 snapshots). Half that if I go to 4byte counters. + In more common case of 32 snaphots, could fit counters for 8 segments in + a block. This means using space/io for all possible snapshots rather than + all active snapshots. It would also mean having a fairly fixed upper limit. + I wonder what NILFS does.... + Worry about this later. + + Still trying to get pinning of SegmentMap blocks right. + Normally we need a phase-lock when pinning a data block so that we + don't lose the pinning before we dirty. But as we phase_flip + these it doesn't matter... So just add that too the test?? diff --git a/clean.c b/clean.c index 4cbc41a..0609e32 100644 --- a/clean.c +++ b/clean.c @@ -324,10 +324,13 @@ static int try_clean(struct fs *fs, struct toclean *tc) if ((((char *)tc->gh) - (char *)tc->ch) >= le16_to_cpu(tc->ch->Hlength)) { /* Finished with that cluster, try another. */ - tc->haddr = le64_to_cpu(tc->ch->next_addr); - if (in_seg(fs, tc->dev, tc->seg, tc->haddr)) + u64 next; + next = le64_to_cpu(tc->ch->next_addr); + if (in_seg(fs, tc->dev, tc->seg, next)) { + BUG_ON(next <= tc->haddr); + tc->haddr = next; tc->ac.state = 1; - else { + } else { tc->ac.state = 0; tc->ss = 0; } @@ -341,6 +344,7 @@ static int try_clean(struct fs *fs, struct toclean *tc) /* FIXME what if group has padding at end? * this might be fixed, but need to be certain * of all possibilities. */ + BUG_ON(le16_to_cpu(tc->gh->group_size_words) == 0); tc->gh = (struct group_head *)(((char *)tc->gh) + le16_to_cpu(tc->gh->group_size_words)*4); tc->desc = tc->gh->u.desc; diff --git a/cluster.c b/cluster.c index 4be7e47..de9a453 100644 --- a/cluster.c +++ b/cluster.c @@ -1250,6 +1250,7 @@ static void cluster_flush(struct fs *fs, int cnum) if (!test_bit(B_Index, &b->flags) && LAFSI(b->inode)->type == TypeInodeFile) { struct indexblock *ib; + LAFS_BUG(!dblk(b)->my_inode, b); spin_lock(&dblk(b)->my_inode->i_data.private_lock); ib = getiref_locked(LAFSI(dblk(b)->my_inode)->iblock, MKREF(cflush)); -- 2.39.5