Linus Torvalds [Fri, 23 Nov 2007 20:27:05 +0000 (15:27 -0500)]
Linux 2.3.15
There's a rather huge patch-set out there now, taking the 2.3.x series to
2.3.15.
This has a lot of the merge code I've been sent over the last two weeks,
but I will invariably have missed some, if for no other reason than simply
that I got absolutely _flooded_ by people sending me patches.
One of the more interesting things was the SMP pipe cleanup sent by
Richard, but try as I might it was never really stable under load on x86 -
not with the plain semaphores in 2.3.14, and not with the patches Andrea
had either. I assume Richard tested it on an alpha with the much more
well-thought-out atomic operation that the alpha provides.
I ended up rewriting the x86 semaphore code (and some of Richard's pipe
code too, for that matter, to get rid of some races in waking things up),
and it doesn't show the problems I saw before, but hey, maybe I just
exchanged one set of problems for another set that I can't trigger any
more. Give me feedback, please.
Other features that don't impact everybody, but are rather major:
- ATM support merged in
- firewalling is gone (again), replaced by an even more generic netfilter
facility.
- general networking merges and updates
- Various driver updates (ISDN, ISA PnP, sound, fbcon, usb, intelliport,
you name it)
- make system call return type "long" even if the system call only
returns valid data in the lower order bits - we use the high bits for
error handling, and some 64-bit architectures care (read: the Merced
calling conventions want this because they don't automatically extend
the return type - I bet it will be a new portability issue for other
programs than just the kernel)
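
To illustrate the return-type point above, here is a minimal user-space
sketch, not kernel code. It assumes a 64-bit return register, models the
unextended upper bits as zero, and uses an illustrative error bound; the
names is_error, reg_from_int, reg_from_long, check_return_width and
MAX_ERRNO_ASSUMED are all made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bound, for illustration only: a return value in
 * [-MAX_ERRNO_ASSUMED, -1] is treated as -errno. */
#define MAX_ERRNO_ASSUMED 4095

/* Caller-side check on the full 64-bit return register. */
int is_error(int64_t reg)
{
    return reg < 0 && reg >= -MAX_ERRNO_ASSUMED;
}

/* Callee declared "int": only the low 32 bits are meaningful. The upper
 * bits are modelled as zero here; a real ABI may leave anything there. */
int64_t reg_from_int(int32_t value)
{
    return (int64_t)(uint32_t)value;
}

/* Callee declared "long": the sign is carried through all 64 bits. */
int64_t reg_from_long(int64_t value)
{
    return value;
}

/* Hypothetical self-check: -2 stands in for an -errno value. */
void check_return_width(void)
{
    assert(is_error(reg_from_long(-2)));   /* error is still visible */
    assert(!is_error(reg_from_int(-2)));   /* error looks like success */
}
```

With the "int" return, the error code turns into a large positive number
in the wide register, so a caller checking the high bits never sees the
failure. Declaring the return type "long" avoids that.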
Linus Torvalds [Fri, 23 Nov 2007 20:25:38 +0000 (15:25 -0500)]
Linux-2.3.7.. Let's be careful out there..
The new and much improved fully page-cache based filesystem code is now
apparently stable, and works wonderfully well performance-wise. We fixed
all known issues with the IO subsystem: it scales well in SMP, and it
avoids unnecessary copies and unnecessary temporary buffers for write-out.
The shared mapping code in particular is much cleaner and also a _lot_
faster.
In short, it's perfect. And we want as many people as possible out there
testing out the new cool code, and basking in the success stories..
HOWEVER. _Just_ in case something goes wrong [ extremely unlikely of
course. Sure. Sue me ], we want to indemnify ourselves. There just might
be a bug hiding there somewhere, and it might eat your filesystem while
laughing in glee over you being naive and testing new code. So you have
been warned.
In particular, there's some indication that it might have problems on
sparc still (and/or other architectures), possibly due to the ext2fs byte
order cleanups that have also been done in order to reach the
afore-mentioned state of perfection.
I'd be especially interested in people running databases on top of Linux:
Solid server in particular is very fsync-happy, and that's one of the
operations that have been sped up by orders of magnitude.
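
For anyone who wants to try an fsync-happy workload without a database
at hand, something like the following rough sketch may do. It only
imitates the pattern (a small write followed by fsync(), repeated); the
function name fsync_heavy_write and the 128-byte record size are
assumptions for illustration, not how any particular database behaves.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Append `count` fixed-size records to `path`, forcing each one to disk
 * with fsync() before writing the next. Returns the number of records
 * written, or -1 on any error. */
int fsync_heavy_write(const char *path, int count)
{
    char record[128];   /* record size is an arbitrary choice */
    int fd, i;

    assert(path != NULL);
    memset(record, 'x', sizeof(record));

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    for (i = 0; i < count; i++) {
        if (write(fd, record, sizeof(record)) != (ssize_t)sizeof(record) ||
            fsync(fd) != 0) {
            close(fd);
            return -1;
        }
    }

    if (close(fd) != 0)
        return -1;
    return i;
}
```

Wrapping a call such as fsync_heavy_write("/tmp/fsync_test.dat", 1000) in
time(1) on an old and a new kernel should give a crude comparison.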
Linus Torvalds [Fri, 23 Nov 2007 20:25:32 +0000 (15:25 -0500)]
Linux 2.3.7pre6
Anybody who is interested in FS performance should take a look at the
latest pre-patch of 2.3.7 (only pre-6 and possibly later: do NOT get any
earlier versions. pre-5 still causes file corruption, pre-6 looks good so
far).
Careful, though: I fixed the problem that caused some corruption less than
an hour ago, and while my tests indicate it all works fine, this is a very
fundamental change. The differences from earlier kernels are:
- ext2 (and some other block device filesystems that have been taught
about it) uses write-through from the page cache instead of having a
separate buffer cache and the page cache to maintain dirty state. This
means much less memory pressure in certain situations, and it also
means that we can avoid unnecessary copies.
- the page cache has been threaded, so on SMP you can actually get
noticeable speedups from processes that do concurrent file accesses.
- lower-latency read paths, especially the cached case.
All of these are big, fundamental changes. So don't mistake me when I
say it is experimental: Ingo, David and I have been spending the last
weeks (especially Ingo, who deserves a _lot_ of credit for this all: I
designed much of it, but Ingo made it a reality. Thanks Ingo) on making it
do the right thing and be stable, but if you worry about not having
backups you might not want to play with it even so. It took us this long
just to make it work reliably enough that we can't find any obvious
problems..
The interesting areas are things like
- writes to shared mappings now go blindingly fast. We're talking mondo
cleanups here. We used to do really badly on this, now we do really
well.
- does bdflush still do the right thing? There may be a _lot_ of tweaking
to do to get everything working at full capacity.
- can people confirm that it is stable for everybody?
- if anybody has 8-way machines etc, scalability is interesting. It
should scale to 8-way no problem. We used to scale to 1-way, barely.
Numbers?
- fsync(). It doesn't work right now, but it should be easy to make it
work well on big files etc - something we've never been able to do
before (we used to lack the indexing from file to dirty blocks: now we
have access to that quite automatically thanks to having the
inode->page index in place, and the dirty blocks are right there)
and I'd really appreciate comments from people, as long as people are
aware that it _looks_ stable but we don't guarantee anything at this
point.
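
As a starting point for testing the shared-mapping writes mentioned in
the list above, here is a small user-space sketch. It dirties a
MAP_SHARED mapping, pushes it out with msync(), and reads the file back
through read() to check that the data arrived. The function name
shared_mapping_roundtrip and the 0x5a fill byte are arbitrary choices
for the example, and it says nothing about how the kernel implements
the write-out.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `len` bytes through a shared mapping of `path`, then verify the
 * contents with plain read(). Returns 0 on success, -1 on any failure. */
int shared_mapping_roundtrip(const char *path, size_t len)
{
    unsigned char buf[4096];
    unsigned char *map;
    size_t done = 0;
    int fd, ok = -1;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (len == 0 || ftruncate(fd, (off_t)len) != 0)
        goto out;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;
    memset(map, 0x5a, len);             /* dirty every page */
    if (msync(map, len, MS_SYNC) != 0) {
        munmap(map, len);
        goto out;
    }
    if (munmap(map, len) != 0)
        goto out;

    /* Read back from offset 0 and compare byte by byte. */
    while (done < len) {
        size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
        ssize_t got = read(fd, buf, want);
        size_t i;

        if (got <= 0)
            goto out;
        for (i = 0; i < (size_t)got; i++)
            if (buf[i] != 0x5a)
                goto out;
        done += (size_t)got;
    }
    ok = 0;
out:
    close(fd);
    return ok;
}
```

Running it with a large `len` under time(1), or from several processes
at once on an SMP box, would be one crude way to produce the numbers
asked for above.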