This fixes one more exit-time resource accounting issue. It is also a
speedup, and it makes the thread tree look better (for a future
thread-aware pstree).

In the current code we reparent detached threads to the init thread.
This worked but was not very nice in ps output: threads showed up as
being related to init. There was also a resource-accounting issue: upon
exit they update their parent's (i.e. init's) rusage fields,
effectively losing these statistics. E.g. 'time' under-reports CPU
usage if the threaded app is Ctrl-C-ed prematurely.

The solution is to reparent threads to the group leader - this is now
very easy since we have p->group_leader cached and it's also valid all
the time. It's also somewhat faster for applications that use
CLONE_THREAD but do not use the CLONE_DETACHED feature.
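
The accounting effect described above can be sketched with a toy
user-space model. This is only an illustration, not the kernel code:
struct toy_task, its fields, the tick counts and all toy_* helpers are
hypothetical names made up for this example. It assumes that an exiting
task folds its CPU time into its current parent's counters, and that a
tool like 'time' sees only what reaches the group leader.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a task; NOT the kernel's task_struct. */
struct toy_task {
	struct toy_task *parent;
	struct toy_task *group_leader;	/* the leader points to itself */
	unsigned long own_ticks;	/* CPU time used by this task */
	unsigned long child_ticks;	/* CPU time folded in from exited children */
};

enum toy_policy { TOY_REPARENT_TO_INIT, TOY_REPARENT_TO_GROUP_LEADER };

/* Pick the new parent of a detached thread according to the policy. */
static void toy_reparent(struct toy_task *t, struct toy_task *init_task,
			 enum toy_policy policy)
{
	assert(t->group_leader != NULL);
	if (policy == TOY_REPARENT_TO_GROUP_LEADER)
		t->parent = t->group_leader;
	else
		t->parent = init_task;
}

/* On exit, the task's usage is added to whoever its parent is now. */
static void toy_exit(struct toy_task *t)
{
	t->parent->child_ticks += t->own_ticks + t->child_ticks;
}

/* What a waiter on the leader (e.g. 'time') would see in this model. */
static unsigned long toy_reported_ticks(const struct toy_task *leader)
{
	return leader->own_ticks + leader->child_ticks;
}

/* Leader uses 10 ticks, one detached thread uses 90, then exits. */
static unsigned long toy_scenario(enum toy_policy policy)
{
	struct toy_task init_task = { NULL, &init_task, 0, 0 };
	struct toy_task leader = { &init_task, &leader, 10, 0 };
	struct toy_task thread = { &leader, &leader, 90, 0 };

	toy_reparent(&thread, &init_task, policy);
	toy_exit(&thread);
	return toy_reported_ticks(&leader);
}
```

In this model toy_scenario(TOY_REPARENT_TO_INIT) reports only the
leader's own 10 ticks, because the thread's 90 ticks end up in init's
counters, while toy_scenario(TOY_REPARENT_TO_GROUP_LEADER) reports all
100.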