From: NeilBrown <neilb@suse.de>
Date: Fri, 4 Mar 2011 01:47:29 +0000 (+1100)
Subject: README update
X-Git-Url: http://git.neil.brown.name/?a=commitdiff_plain;h=617e9b586999c49fdda3287e93b82990b966d82f;p=LaFS.git

README update

After a break of 4.5 months.

Signed-off-by: NeilBrown <neilb@suse.de>
---

diff --git a/README b/README
index 44150df..a5e15f4 100644
--- a/README
+++ b/README
@@ -5292,23 +5292,22 @@ DONE 15cf/ lafs_iget_fs need to sometimes to in-kernel mounts for subset filesys
 
 36/ review roll-forward
 
-36a/  make sure files with nlink == 0 are handled well
+DONE 36a/  make sure files with nlink == 0 are handled well
 DONE 36b/  sanity check before trusting clusters
-36c/ handle miniblocks which create new inodes.
-36d/ Handle DescHole in roll_block
-36e/ When dirtying a block in roll_block, maybe use writeback rather
+DONE 36c/ handle miniblocks which create new inodes.
+DONE 36d/ Handle DescHole in roll_block
+DONE 36e/ When dirtying a block in roll_block, maybe use writeback rather
      than just iolock, for consistency...
-37f/ What to do if table becomes full when add_block_address in
+DONE 36f/ What to do if table becomes full when add_block_address in
      roll_block ??
-37g/ Write roll_mini for directories.
-37h/ In roll_one, use the cluster counting code to find block number and
+36g/ Write roll_mini for directories.
+DONE 36h/ In roll_one, use the cluster counting code to find block number and
      make sure we don't exceed the segment.
-37i/ add more general error checking to lafs_mount - 
+DONE 36i/ add more general error checking to lafs_mount - 
             lafs_iget orphans and segsum.  Check type is correct.
          errors from lafs_count_orphans or lafs_add_orphans.
          alloc_page failure for chead - maybe allocate something bigger??
 
-
 37/ Configure index block hash_table at run time base on mem size??
 
 38/ striped layout
@@ -5392,6 +5391,12 @@ DONE 52/ NFS export
    realloc or dirty rather than lafs_allocated_block doing it.?
    See also 15ad below.
 
+66/ Delay writeout of directory updates until an fsync.  If a checkpoint happens
+   first, discard the updates (and fsync waits for checkpoint to complete).
+   If a cross-directory rename happens care is needed:  either flush updates
+   first or ensure that a flush does happen before the cross-directory
+   update is flushed.
+
 26June2010
  Investigating 5a
 
@@ -6914,3 +6919,57 @@ WritePhase - what is that all about?
 
     So that is all done now, except I don't hold refs on snapshots in the cleaner
     yet.
+
+11oct2010
+ DescHole
+   - When is this used? directory etc don't need it.
+   - a regular file might, but there is no API to punch
+     a hole.... yet I guess.
+   - So we just want to allocate these blocks to 0.
+
+15oct2010 - happy birthday Daniel...
+ Looking at 36:
+  a/ files with nlink==0;
+        If we happen to find them, we hold a reference until all roll-forward
+        is done, in case a name is found - it is important not to start deletion
+        early.
+
+18oct2010
+  36g - write roll_mini for directories.
+   We get a name, an inode number, and one of:
+      LINK UNLINK REN_SOURCE REN_NEW_TARGET REN_OLD_TARGET
+
+   The REN_SOURCE is linked with a REN_*_TARGET which could be in a
+   different directory, so we need to stash the SOURCE until the TARGET
+   arrives.
+   We simply impose the implied change on the directory and update the
+   link count in the target inode.
+   So:
+     load the inode
+     possibly record REN_SOURCE for later
+
+     call prepare/pin/commit as appropriate.
+     Put the inode on orphan list if appropriate - needs care
+        as we retarget orphan list.
+     update inode link count.
+
+   (28Feb2011)
+   Just a refresh on the purpose of these updates.
+   1/ They allow us to fsync a directory without performing a full checkpoint.
+     As directory blocks are not processed in roll-forward we need the update
+     for data to be safe.  As fsync of directories is rare in some common
+     situations we could avoid actually writing these.  Simply queue them
+     internally and discard them on a checkpoint.  If an fsync comes before the
+     checkpoint, only then do we write them out.  If there are any cross-directory
+     renames then the preceding updates in both directories need to be flushed
+     before the cross-directory rename.  It might be easier to always flush on
+     a cross-directory rename.
+   2/ They ensure consistency of inode link-count wrt names in the filesystem,
+     but as link count is only updated by these (or a checkpoint) there is no
+     problem with delaying.
+
+   So: when replaying these we must update the directory content and the inode
+   link count.
+   It is OK to delay the write-out of these until an fsync, and not bother
+   if a checkpoint happens.
+   So add that to the TODO list - item 66.
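
Reviewer's sketch (not part of the patch): the delayed write-out described in
item 66 and in the 28Feb2011 note could look roughly like the user-space model
below.  Only the update kinds (LINK, UNLINK, REN_SOURCE, REN_NEW_TARGET,
REN_OLD_TARGET) come from the notes.  The structure, the function names, the
fixed-size queue and the "flush first" handling of cross-directory renames are
all assumptions made for illustration; none of this is LaFS code.

```c
#include <assert.h>
#include <stddef.h>

/* Update kinds listed in the 18oct2010 note. */
enum dir_update_type {
	LINK, UNLINK, REN_SOURCE, REN_NEW_TARGET, REN_OLD_TARGET
};

/* Hypothetical record: a name, an inode number and a kind. */
struct dir_update {
	enum dir_update_type type;
	unsigned long dir_ino;	/* directory being changed */
	unsigned long ino;	/* inode the name refers to */
	const char *name;
};

#define MAX_PENDING 64		/* arbitrary size for the sketch */

/* Hypothetical in-memory queue of updates not yet written out. */
struct update_queue {
	struct dir_update pending[MAX_PENDING];
	size_t npending;
	size_t nwritten;	/* count of updates "written to the log" */
};

/* Write out everything queued; here we only count the writes. */
void queue_flush(struct update_queue *q)
{
	q->nwritten += q->npending;
	q->npending = 0;
}

/* Queue an update instead of writing it immediately. */
void queue_add(struct update_queue *q, struct dir_update u)
{
	if (q->npending == MAX_PENDING)
		queue_flush(q);	/* assumption: a full queue forces a flush */
	assert(q->npending < MAX_PENDING);
	q->pending[q->npending++] = u;
}

/* A checkpoint makes the directories safe, so drop the queue unwritten. */
void queue_checkpoint(struct update_queue *q)
{
	q->npending = 0;
}

/* An fsync that arrives before the checkpoint writes the queue out. */
void queue_fsync(struct update_queue *q)
{
	queue_flush(q);
}

/*
 * Rename: for the cross-directory case, take the simple option from
 * the notes and flush all preceding updates before queueing the pair.
 */
void queue_rename(struct update_queue *q,
		  unsigned long src_dir, unsigned long dst_dir,
		  unsigned long ino,
		  const char *oldname, const char *newname)
{
	struct dir_update src = { REN_SOURCE, src_dir, ino, oldname };
	struct dir_update dst = { REN_NEW_TARGET, dst_dir, ino, newname };

	if (src_dir != dst_dir)
		queue_flush(q);
	queue_add(q, src);
	queue_add(q, dst);
}
```

In this model a checkpoint costs nothing for queued updates, an fsync costs one
flush, and a cross-directory rename forces earlier updates out before its own
REN_SOURCE/REN_NEW_TARGET pair is queued.  A real implementation would also
have to make fsync wait for a checkpoint that is already in progress, as item
66 says.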