Linus Torvalds [Fri, 23 Nov 2007 20:25:38 +0000 (15:25 -0500)]
Linux-2.3.7.. Let's be careful out there..
The new and much improved fully page-cache based filesystem code is now
apparently stable, and works wonderfully well performancewise. We fixed
all known issues with the IO subsystem: it scales well in SMP, and it
avoids unnecessary copies and unnecessary temporary buffers for write-out.
The shared mapping code in particular is much cleaner and also a _lot_
faster.
In short, it's perfect. And we want as many people as possible out there
testing out the new cool code, and basking in the success stories..
HOWEVER. _Just_ in case something goes wrong [ extremely unlikely of
course. Sure. Sue me ], we want to indemnify ourselves. There just might
be a bug hiding there somewhere, and it might eat your filesystem while
laughing in glee over you being naive and testing new code. So you have
been warned.
In particular, there's some indication that it might have problems on
sparc still (and/or other architectures), possibly due to the ext2fs byte
order cleanups that have also been done in order to reach the
afore-mentioned state of perfection.
I'd be especially interested in people running databases on top of Linux:
Solid server in particular is very fsync-happy, and that's one of the
operations that have been sped up by orders of magnitude.
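For anybody who wants to exercise this without a database at hand, an fsync-heavy workload can be approximated from user space. This is only a minimal sketch: the function name, record size and file path are made up for illustration, and it uses nothing beyond the standard open()/write()/fsync() calls.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Append `count` small records to `path`, calling fsync() after each one,
 * roughly the way a commit-happy database would. Returns the number of
 * records that were written and synced, or -1 if the file could not be
 * opened. The name and the 128-byte record size are illustrative only.
 */
int fsync_heavy_writes(const char *path, int count)
{
    char record[128];
    int done = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    memset(record, 'x', sizeof(record));

    for (int i = 0; i < count; i++) {
        if (write(fd, record, sizeof(record)) != (ssize_t)sizeof(record))
            break;              /* short write or error: stop counting */
        if (fsync(fd) != 0)
            break;              /* sync failed: stop counting */
        done++;
    }
    close(fd);
    return done;
}
```

Calling it with a few thousand records on an ext2 partition and timing the run on an old and a new kernel should give a rough feel for the difference.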
Linus Torvalds [Fri, 23 Nov 2007 20:25:32 +0000 (15:25 -0500)]
Linux 2.3.7pre6
Anybody who is interested in FS performance should take a look at the
latest pre-patch of 2.3.7 (only pre-6 and possibly later: do NOT get any
earlier versions. pre-5 still causes file corruption, pre-6 looks good so
far).
Careful, though: I fixed the problem that caused some corruption less than
an hour ago, and while my tests indicate it all works fine, this is a very
fundamental change. The difference to earlier kernels is:
- ext2 (and some other block device filesystems that have been taught
about it) uses write-through from the page cache instead of having a
separate buffer cache and the page cache to maintain dirty state. This
means much less memory pressure in certain situations, and it also
means that we can avoid unnecessary copies.
- the page cache has been threaded, so on SMP you can actually get
noticeable speedups from processes that do concurrent file accesses.
- lower-latency read paths, especially the cached case.
All of these are big, fundamental changes. So don't mistake me when I
say it is experimental: Ingo, David and I have been spending the last
weeks (especially Ingo, who deserves a _lot_ of credit for this all: I
designed much of it, but Ingo made it a reality. Thanks Ingo) on making it
do the right thing and be stable, but if you worry about not having
backups you might not want to play with it even so. It took us this long
just to make it work reliably enough that we can't find any obvious
problems..
The interesting areas are things like
- writes to shared mappings now go blindingly fast. We're talking mondo
cleanups here. We used to do really badly on this, now we do really
well.
- does bdflush still do the right thing? There may be a _lot_ of tweaking
to do to get everything working at full capacity.
- can people confirm that it is stable for everybody?
- if anybody has 8-way machines etc, scalability is interesting. It
should scale to 8-way no problem. We used to scale to 1-way, barely.
Numbers?
- fsync(). It doesn't work right now, but it should be easy to make it
work well on big files etc - something we've never been able to do
before (we used to lack the indexing from file to dirty blocks: now we
have access to that quite automatically thanks to having the
inode->page index in place, and the dirty blocks are right there)
and I'd really appreciate comments from people, as long as people are
aware that it _looks_ stable but we don't guarantee anything at this
point.
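The fsync() point above is easier to see with a toy model. This is not kernel code: the struct and function names are invented for illustration, and real pages carry far more state. It only shows why a per-file page index makes it cheap to find one file's dirty data.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cached page; all names here are hypothetical. */
struct toy_page {
    unsigned long index;        /* page offset within the file */
    bool dirty;                 /* needs to be written back */
    struct toy_page *next;      /* next cached page of the same file */
};

/* Toy stand-in for an inode that indexes its own cached pages. */
struct toy_inode {
    struct toy_page *pages;
};

/* Link a page into the file's own list. */
void toy_add_page(struct toy_inode *inode, struct toy_page *page,
                  unsigned long index, bool dirty)
{
    page->index = index;
    page->dirty = dirty;
    page->next = inode->pages;
    inode->pages = page;
}

/*
 * "fsync" one file: walk only that file's pages and clean the dirty ones.
 * Returns how many pages would have been written. No other file's pages
 * are ever looked at.
 */
size_t toy_fsync(struct toy_inode *inode)
{
    size_t written = 0;

    for (struct toy_page *p = inode->pages; p != NULL; p = p->next) {
        if (p->dirty) {
            /* real code would start the disk write here */
            p->dirty = false;
            written++;
        }
    }
    return written;
}
```

Without an index from the file to its pages, the equivalent loop would have to scan every cached block in the system and check which file it belongs to, which is what hurts on big files.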
Linus Torvalds [Fri, 23 Nov 2007 20:25:25 +0000 (15:25 -0500)]
Linux 2.3.7pre1
I'd like to point out that the current pre-2.3.7 series is fairly
experimental. As amply demonstrated by the filename (the "dangerous" part
in the filename hopefully made some people go "Hmm..").
We're working on re-architecting (or rather, cleaning up so that it works
like it really was supposed to) the page cache writing, and as a result a
number of filesystems are probably going to be broken for a while unless
we get people jumping in to help.
Right now 2.3.7-1 (aka "dangerous") is not stable even with ext2, in that
swapping doesn't work. Ingo just sent me patches to fix that, and I'm
hoping to remove the "dangerous" part from 2.3.7-2, but even then a number
of filesystems will be broken.
We _may_ end up just re-introducing the "update_vm_cache()" code for
filesystems that really don't need the added performance, but it would
actually be preferable if people really wanted to make them perform well
with the new direct write-through cache code.
Linus Torvalds [Fri, 23 Nov 2007 20:25:11 +0000 (15:25 -0500)]
pre-2.3.4..
There's a pre-2.3.4-1 out there in "testing" on ftp.kernel.org, which has
the new scalable network code (well, the first cut of it, anyway). It also
updates ISDN and PPC to newer versions. Please test it out and give
feedback..
Linus Torvalds [Fri, 23 Nov 2007 20:25:09 +0000 (15:25 -0500)]
Linux-2.3.3 and a short hiatus..
There's a Linux-2.3.3 out there on ftp.kernel.org, this one hopefully
fixes pretty much all the waitqueue changes (and I'll disable waitqueue
debugging in 2.3.4 unless something comes up).
And yes, before anybody tells me, I know I forgot to increment the version
number. So "uname" is going to report 2.3.2 unless you fix that by hand.
I'm also leaving for a very quick trip to Finland in another two hours, so
don't bother emailing me - please discuss issues on the kernel list, and
I'll catch up when I get back on Friday (yes, I'll spend as much time in
airplanes as I do on the ground - fun, fun).
Have fun,
Linus Torvalds [Fri, 23 Nov 2007 20:25:02 +0000 (15:25 -0500)]
Linux 2.3.1pre3
As to 2.3.x, we're beginning with a long overdue waitqueue cleanup, which
means that a lot of small details need to get fixed in a variety of files.
A working pre-patch of this is to be found as pre-patch-2.3.1-3, but not
all drivers have been fixed - and help is appreciated (even drivers that
_have_ been fixed have not necessarily actually been tested due to lack of
hardware).
Linus Torvalds [Fri, 23 Nov 2007 20:18:57 +0000 (15:18 -0500)]
Linux 2.2.8
Most of 2.2.8 by far is just architecture updates: arm, ppc and m68k stand
out as having been pretty much synchronized to their respective devel
trees, but there are some fixes to alpha and x86 too.
The one major fix in 2.2.8 is the SMP fix for disable_irq(), courtesy of
Andrea Arcangeli (I disagreed in details and did it differently in the
end, but all the heavy lifting was done by Andrea). This is the thing that
caused silent deaths for some people with certain network adapters (3c509
and 8390-based cards in particular: the latter covers ne2000 clones which
are fairly common).
There are lots of smaller things (driver updates, filesystem cleanups and
some networking fixes), but the SMP irq thing is the one to kill for if
you happened to have any of the affected cards.
Linus Torvalds [Fri, 23 Nov 2007 20:18:41 +0000 (15:18 -0500)]
There's a pre-3 patch on ftp.kernel.org in the kernel/testing directory,
and I'd really like people to give it a good testing: especially if you've
seen slow network connections to some clients (ie Windows). David worked
in the compatibility patches to work around some of the Windows TCP stack
"features" (and Apple too, for that matter), and we want to get this well
tested. It's all fairly straightforward, but let's be careful out there..
Linus Torvalds [Fri, 23 Nov 2007 20:18:31 +0000 (15:18 -0500)]
Linux 2.2.5 - and a vacation
I made Linux-2.2.5 yesterday (as some people already have noticed: due to
popular demand I try to delay the announcement for some time in order to
let the thing percolate to mirror sites, in case anybody wondered).
The 2.2.5 release is meant to be a final cleanup release before I leave
for a two-week vacation. So please take these release notes to also mean
that it is probably a good idea to hold off emailing me stuff directly,
unless it is a major bug that you really think I should look at
immediately. I would suggest people discuss problems on the mailing list
and on the newsgroups, where other competent people are, rather than
expecting me to do much about it.
Also, note that there have been various indications that egcs potentially
miscompiles the kernel, or at least makes some problems worse. We don't
know whether that is due to one or more kernel bugs, compiler problems, or
just combinations of "features" in both. I would suggest that if you have
problems you at least verify whether the problems still exist with
gcc-2.7.2.
That said, I bet that both the kernel people and the egcs people would be
really happy the more people look into this - if somebody feels motivated
enough and sees problems with egcs, it would be extremely powerful to try
to pinpoint the particular file that seems to bring on the problems. I'm
afraid it needs a known failure mode and lots of legwork to find out what
triggers it, though.
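One way to organize that legwork is a plain bisection over the list of source files. The sketch below is an assumption about how one might do it, not an existing tool: the callback stands for "rebuild with files [lo, hi) compiled by the suspect compiler and everything else by the known-good one, then test", and it only works if a single file is responsible and the failure is reproducible.

```c
#include <stddef.h>

/*
 * Hypothetical callback: returns nonzero if the failure shows up when
 * files [lo, hi) are built with the suspect compiler.
 */
typedef int (*fails_fn)(size_t lo, size_t hi, void *ctx);

/*
 * Bisect n files down to the single one that triggers the failure.
 * Assumes exactly one culprit; needs about log2(n) rebuilds.
 */
size_t find_bad_file(size_t n, fails_fn fails, void *ctx)
{
    size_t lo = 0, hi = n;

    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;

        if (fails(lo, mid, ctx))
            hi = mid;           /* culprit is in the lower half */
        else
            lo = mid;           /* otherwise it is in the upper half */
    }
    return lo;
}

/*
 * Stand-in for the real rebuild-and-test step, for trying the search out:
 * ctx points at the index of the pretend culprit.
 */
int culprit_in_range(size_t lo, size_t hi, void *ctx)
{
    size_t bad = *(size_t *)ctx;

    return bad >= lo && bad < hi;
}
```

With a few hundred files that is under ten rebuilds, provided the failure mode is known and triggers every time.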
- compiles with accounting.
- add support for Microgate SyncLink and Synchronous HDLC
- stallion driver update
- alpha EV6 and SMP fix for bootup with newer compilers
- ptrace fix for sparc/i386
- small sparc updates
- floppy driver could oops at bootup under certain setups
- random driver updates (bw-qcam, sound driver error codes, etc oneliners)
- FIOASYNC ioctl fix
- network locking fixes
- SMP "struct user" and signal sending fixes
Linus Torvalds [Fri, 23 Nov 2007 20:18:26 +0000 (15:18 -0500)]
Linux 2.2.4
As of 2.2.4, I should be synchronized with the Sparc[64] and PPC ports,
which is the major reason why the patch is pretty huge. Apart from the
architecture synchronizations, 2.2.4 does:
- dumping core over NFS could do bad things. Core-dumping cleaned up and
fixed.
- various small TCP/IP buglets fixed. Linux got confused by hosts that
didn't report any mss, and had problems with zero-sized fragments, etc.
Linus Torvalds [Fri, 23 Nov 2007 20:18:20 +0000 (15:18 -0500)]
Linux 2.2.3pre3
There's a new pre-patch for 2.2.3, one that I was already going to make
the final 2.2.3, but I decided that I'm chicken after all, and that I
might as well let some people check that it's sane.
This pre-2.2.3 does:
- Fix some silly NFS problems. Some of them can be quite bad: lost error
notification of asynchronous writes, which can result in horrible
problems (including lost email etc). Most people wouldn't ever notice,
so don't panic, but forgetting about the error notification certainly
counts as a brown paper bag.
- Alpha should compile and work again
- Various driver updates. This is actually the bulk of the patch, with
IRDA updates, some scsi, video and sound driver updates etc.
- The "mmap forgets about the file that was mapped" bug that has been
discussed here. Only affected certain drivers.
- shaper atomicity fixes
- various minor TCP fixes
- buffer growth fix and recursive IO memory reclaim fix from Andrea
- network filter compiles ;)
- unix gc fixes
Tell me if you see problems, because I'm going to release it as 2.2.3
unless people tell me otherwise..
Linus Torvalds [Fri, 23 Nov 2007 20:18:12 +0000 (15:18 -0500)]
Linux 2.2.2pre4
In a superhuman effort to not get killed by my wife, I delayed the latest
release for a day. And in fact, it's still just a pre-release, because I
wanted to check with Ingo that I have his latest IO-APIC code with the
proper handling of ExtINT. Ingo?
Anyway, the "not quite valentine days release" (also known as the "horny
greased weasel", aka "presidents day" release ;), is right now a pre-patch
on ftp.kernel.org: /pub/linux/kernel/testing/pre-patch-2.2.2-4.gz.
Happily, I haven't heard of any new real show-stoppers, which is good
(especially considering the fact that I gave it an extra week just to hear
if somebody could come up with some new problems). The things fixed
relative to 2.2.1 are:
- the inode thing. If you don't know, don't worry.
- config scripts updated
- IO-APIC cleanups and fixes, so that people with strange motherboards
should be able to reboot cleanly and not get unexpected interrupts.
- 2kB sector media (ie mostly MO) fixes. See all the warnings on the
lists about fdisk confusion etc if you have one of these things.
- IDE disk cleanups/fixes (geometry and autodetection)
- PS/2 mouse hides ACK's again
- pty crash fix
- some network driver fixes (out-of-memory and shared interrupts)
- some sound and video updates.
- lockd cookie fixes
- nfsd readdir reply cache fix
- filesystem/VM deadlock avoidance (new daemon: kpiod)
- SMP scheduler race condition (which nobody has probably ever seen)
- TCP socket locking fix
Most of the above are really hard to see in the first place, and not
something most people would ever hit (with the possible exception of the
inode thang). But it would be good to have a really rock solid 2.2.2, so
if people could just bother to check that it works for them, I'll make
this official tomorrow.
Linus Torvalds [Fri, 23 Nov 2007 20:18:11 +0000 (15:18 -0500)]
Linux 2.2.2-pre2
This one contains various small documentation updates and updates to xconfig,
but the important parts (and the smallest part of the actual patch) are:
- shared file lockup fix by Stephen Tweedie
- my fix for the TCP bug that Ingo found
- Ingo's io-apic setup fixes, which should finally get rid of the
spurious apic interrupts with some motherboards and the ExtINT setup.
- inode leak thing
- SMP scheduler potential race condition fix
- sound driver updates
- partition and disk fixes (2kB blocksize media and some IDE disk
geometry and irq detection issues).
None of the fixes are critical to most people, but all of them _can_ be
critical to people who have seen vulnerabilities in the area. As such, if
you're happy with 2.2.1 there is no pressing reason to test this patch
out, but I hope to have the pre-patches so that the final 2.2.2 can be
left around for a while (CD-ROM manufacturers etc would certainly prefer
to not see lots of releases).
Linus Torvalds [Fri, 23 Nov 2007 20:18:08 +0000 (15:18 -0500)]
Linux 2.2.1 - the Brown Paper Bag release
The subject says it all. We did have a few paper-bag-inducing bugs in
2.2.0, so there's a 2.2.1 out there now, just a few days after 2.2.0.
Oh, well. These things happen,
Linus
- the stupid off-by-one bug 'execute a coredump' crash found by Ingo
- __down_interruptible on alpha
- move "esstype" to outside a #ifdef MODULE
- NFSD rename/rmdir fixes
- revert to old array.c
- change comment about __PAGE_OFFSET
- missing "vma = NULL" case for avl find_vma()
Linus Torvalds [Fri, 23 Nov 2007 20:18:06 +0000 (15:18 -0500)]
Linux 2.2.0
> Compile this code
>
> ---- cut here ----
> #include <fcntl.h>
> int main( int argc, char *argv[] ) {
>     if ( argc < 2 ) return 1;
>     return open( argv[ 1 ], O_WRONLY|O_CREAT|O_TRUNC, 0666 ) < 0;
> }
> ---- and here ----
>
> and run it like this
>
> strace ./a.out >(cat - )
>
> with 2.0.36 & 2.2.0-pre[67] you get:
>
> open("/dev/fd/63", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
>
> with 2.2.0-pre[89] you get:
>
> open("/dev/fd/63", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOENT (No
> such file or directory)
Ok, this seems to be due to pre9 removing some rather bogus code that
happened to hide another problem in open_namei().
I haven't actually tested this, but it looks really obvious, so does this
patch fix it for you? (This should also fix a potential performance
bogosity - there's absolutely no reason why we should get the directory
lock when we don't need to for a normal open of an existing file).
Linus Torvalds [Fri, 23 Nov 2007 20:18:05 +0000 (15:18 -0500)]
2.2.0-final
Hoya,
there's now a 2.2.0-pre9 on ftp.kernel.org, and when you compile it it
will call itself 2.2.0-final. The reason is fairly obvious: enough is
enough, and I can't make pre-kernels forever, it just dilutes the whole
idea. The only reason the tar-file is not called 2.2.0 is that I want to
avoid having any embarrassing typos that cause it to not compile under
reasonable configurations or something like that. Unreasonable
configurations I no longer care about.
Every program has bugs, and I'm sure there are still bugs in this. Get
over it - we've done our best, and nobody ever believed that there
wouldn't be 2.2.x kernels to fix problems as they come up, and delaying
2.2.0 forever is not an option.
I have a wedding anniversary and a company party coming up, so I'm taking
a few days off - when I get back I expect to take this current 2.2.0-final
and just remove the "-final" from the Makefile, and that will be it. I
suspect somebody _will_ find something embarrassing enough that I would
fix it too, but let's basically avoid planning on that.
In short, before you post a bug-report about 2.2.0-final, I'd like you to
have the following simple guidelines:
"Is this something Linus would be embarrassed enough about that he would
wear a brown paper bag over his head for a month?"
and
"Is this something that normal people would ever really care deeply
about?"
If the answer to either question is "probably not", then please consider
just politely discussing it as a curiosity on the kernel mailing lists
rather than even sending email about it to me: I've been too busy the last
few weeks, and I'd really appreciate it if I could just forget the worries
of a release for a few days..
But if you find something hilariously stupid I did, feel free to share it
with me, and we'll laugh about it together (and I'll avoid wearing the
brown paper bag on my head during the month of February). Do we have a
deal?
I've seen people working on a 2.2.0 announcement, and I'm happy - I've
been too busy to think straight, much less worry about details like that.
If everything turns out ok, I'll have a few memorable bloopers in my
mailbox but nothing worse than that, and I can sit down and actually read
the announcement texts that people have been discussing.
ObFeatures:
- m68k sync
- various minor driver fixes (irda, net drivers, scsi, video, isdn)
- SGI Visual Workstation support
- adjtimex update to the latest standards
- vfat silly buglet fix
- semaphores work on alpha again
- drop the inline strstr() that gcc got wrong whatever we did
- kswapd needed to be a bit more aggressive
- minor TCP retransmission and delack fixes
Until Monday,
Linus
Linus Torvalds [Fri, 23 Nov 2007 20:18:01 +0000 (15:18 -0500)]
Linux 2.2.0pre7
Ok, I think I now know why pre-6 looks so unbalanced. It's two issues.
Basically, trying to swap out a large number of pages from one process
context is just doomed. It basically sucks, because
- it has bad latency. This is further exacerbated by the per-process
"thrashing_memory" flag, which means that if we were unlucky enough to
be selected to be the process that frees up memory, we'll probably be
stuck with it for a long time. That can make it extremely unfair under
some circumstances - other processes may allocate the pages we free'd
up, so that we keep on being counted as a memory thrasher even if we
really aren't.
Note that this shows most under "moderate" load - the problem doesn't
tend to show itself if you have some process that is _really_
allocating a lot of pages, because then that process will be correctly
found by the thrashing logic. But if you have lots of "normal load"
processes, some of those can get really badly hurt by this.
In particular, in the worst case you have a number of processes that all
allocate memory, but not very quickly - certainly not more quickly than
we can page things out. What happens is that under these circumstances
one of them gets marked as a "scapegoat", and once that happens all the
others will just live off the pages that the scapegoat frees up, while
the scapegoat itself doesn't make much progress at all because it is
always just freeing memory for others.
The really bad behaviour tends to go away reasonably quickly, but while
it happens it's _really_ unfair.
- try_to_free_pages() just goes overboard, and starts paging stuff out
without getting back to the nice balanced behaviour. This is what
Andrea noticed.
Essentially, once it starts failing the shrink_mmap() tests, it will
just page things out crazily. Normally this is avoided by just always
starting from shrink_mmap(), but if you ask try_to_free_pages() to try
to free up a ton of pages, the balancing that it does is basically
bypassed.
So basically pre-6 works _really_ well for the kind of stress-me stuff
that it was designed for: a few processes that are extremely memory
hungry. It gets close to perfect swap-out behaviour, simply because it is
optimized for getting into a paging rut.
That makes for nice benchmarks, but it also explains why (a) sometimes
it's just not very nice for interactive behaviour and (b) why it under
normal load can easily swap much too eagerly.
Anyway, the first problem is fixed by making "thrashing" be a global flag
rather than a per-process flag. Being per-process is really nice when it
finds the right process, but it's really unfair under a lot of other
circumstances. I'd rather be fair than get the best possible page-out
speed.
Note that even a global flag helps: it still clusters the write-outs, and
means that processes that allocate more pages tend to be more likely to be
hit by it, so it still does a large part of what the per-process flag did
- without the unfairness (but admittedly being unfair sometimes gets you
better performance - you just have to be _very_ careful whom you target
with the unfairness, and that's the hard part).
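The scapegoat effect can be reproduced in a toy simulation. Everything here is an assumption made for illustration (the watermarks, the batch size, four processes allocating one page each in turn); it is not the kernel's allocator. It models a flag with hysteresis: a flagged allocator keeps reclaiming until free memory is above the high mark, everybody else only reclaims below the low mark.

```c
#include <stdbool.h>

/* Made-up parameters for the toy model. */
enum { NPROC = 4, FREE_MIN = 10, FREE_LOW = 20, RECLAIM_BATCH = 5 };

/*
 * Run `steps` single-page allocations, round-robin over NPROC processes.
 * With global_flag_mode false each process has its own flag; with true
 * there is one flag shared by all. Returns the longest run of consecutive
 * reclaim passes that landed on the same process.
 */
int longest_reclaim_streak(bool global_flag_mode, int steps)
{
    bool proc_flag[NPROC] = { false };
    bool global_flag = false;
    int free_pages = FREE_MIN + 1;
    int last = -1, streak = 0, longest = 0;

    for (int s = 0; s < steps; s++) {
        int p = s % NPROC;
        bool *flag = global_flag_mode ? &global_flag : &proc_flag[p];

        /* Enough memory again: whoever holds the flag is let off. */
        if (*flag && free_pages > FREE_LOW)
            *flag = false;

        /* Below the low mark, or still flagged: this process reclaims. */
        if (free_pages <= FREE_MIN || *flag) {
            *flag = true;
            free_pages += RECLAIM_BATCH;
            streak = (p == last) ? streak + 1 : 1;
            last = p;
            if (streak > longest)
                longest = streak;
        }
        free_pages--;           /* the allocation itself */
    }
    return longest;
}
```

In this model the per-process variant makes one process do many reclaim passes in a row while the others live off its work, and the global variant hands consecutive passes to different processes. How closely that matches a real workload is a separate question.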
The second problem actually goes away by simply just not asking
try_to_free_pages() to free too many pages - and having the global
thrashing flag makes it unnecessary to do so anyway because the flag will
essentially cluster the page-outs even without asking for them to be all
done in one large chunk (and now it's not just one process that gets hit
any more).
There's a "pre-7.gz" on ftp.kernel.org in testing, anybody interested?
It's not the real thing, as I haven't done the write semaphore deadlock
thing yet, but that one will not affect normal users anyway so for
performance testing this should be equivalent.
Linus Torvalds [Fri, 23 Nov 2007 20:17:58 +0000 (15:17 -0500)]
Linux 2.2.0pre5
Oh, well.. Based on what the arca-[678] patches did, there's now a pre-5
out there. Not very similar, but it should incorporate the basic idea:
namely much more aggressively asynchronous swap-outs from a process
context.
Comment away,
Linus
Linus Torvalds [Fri, 23 Nov 2007 20:17:56 +0000 (15:17 -0500)]
Linux 2.2.0pre4
Ok, you know the drill by now. This fixes:
- yes, people told me about the new and improved ksymoops. Much better,
no need for C++, and this one actually seems to compile and work
reliably.
- ntfs fixes
- the vfat thing _really_ works now
- NFS fix for deleting files while writebacks active.
- ppa/imm driver updated
- minor mm balancing patches
- Alan took the gauntlet and cleaned up some CONFIG_PROC_FS stuff.
More on Monday,
Linus Torvalds [Fri, 23 Nov 2007 20:17:48 +0000 (15:17 -0500)]
Linux 2.2.0pre2 (December 31 1998)
Well, some people obviously had problems with the first 2.2.0pre, so
there's a second one there. Most of it is almost purely syntactic sugar:
configuration issues and jiffies wraparound, but there were a few problems
wrt some IDE disk geometry stuff in particular that made 2.2.0pre1 not
boot for some people.
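For those who haven't run into the jiffies wraparound class of bug: a tick counter of fixed width eventually wraps to zero, and a plain "is a later than b" comparison then gives the wrong answer. The sketch below is a generic illustration, not the actual fix that went into pre2; the function names are made up, and the idea is the same one behind the kernel's time_after() macro.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive comparison: wrong as soon as the counter has wrapped. */
bool tick_after_naive(uint32_t a, uint32_t b)
{
    return a > b;
}

/*
 * Wrap-safe comparison: true if a is later than b, as long as the two
 * are less than 2^31 ticks apart. The unsigned subtraction wraps
 * harmlessly; the cast to a signed type relies on the usual
 * two's-complement behaviour, which is implementation-defined in C.
 */
bool tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}
```

With b = 0xFFFFFFF0 (just before the wrap) and a = 5 (just after), the naive version says a is not later than b, and the wrap-safe one gets it right.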
Other real changes:
- nfsd updated, and we have an official maintainer for knfsd (and I was
happy by how many people were ready to stand up for it. Good show,
guys!)
- network driver updates (tulip/eepro)
- some TCP fixes for occasional but nasty performance problems.
- fix for an attack where you could cause a complete and utter lockup of
the kernel as a normal user. Thanks to Michael Chastain for keeping the
faith on this one and reminding me to fix it.
If you haven't had problems with pre1, there should be no major cause to
look at pre2. But if you haven't even looked at pre1 yet, please consider
looking at the pre-2.2.0 kernels before it's too late. I'm going to be
extremely rude to people who knew better but didn't test out the pre-
kernels and then send me bug-reports on the released 2.2.0.