From: john stultz <johnstul@us.ibm.com>
I've found a race in the mtrr ipi_handler() and set_mtrr() functions.
Basically the server, set_mtrr() does the following:
1.1 initialize a flag
1.2 send ipi
1.3 waits for all cpus to check in
1.4 toggle flag
1.5 do stuff
1.6 wait for all cpus to check out
1.7 toggle flag again
1.8 return
While the clients, running ipi_handler() do the following:
2.1 check in
2.2 wait for flag
2.3 do stuff
2.4 check out
2.5 wait for flag
2.6 return
The problem is the flag is on the servers stack! So if 1.7 and 1.8
executes before 2.5 happens, the client's pointer to the flag becomes
invalid (likely overwritten) which causes the client to never see the
flag change, hanging the box.
The patch fixes that by adding a final synchronisation step in which the
controlling CPU waits for all the IPI'ed CPUs to complete.
cpu_relax();
barrier();
}
+ atomic_dec(&data->count);
local_irq_restore(flags);
}
cpu_relax();
barrier();
}
- local_irq_restore(flags);
+ atomic_set(&data.count, num_booting_cpus() - 1);
atomic_set(&data.gate,0);
+
+ /*
+ * Wait here for everyone to have seen the gate change
+ * So we're the last ones to touch 'data'
+ */
+ while(atomic_read(&data.count)) {
+ cpu_relax();
+ barrier();
+ }
+ local_irq_restore(flags);
}
/**