February 15, 2011
-----------------
The OpenMP versions of all programs were run with up to 24 threads,
the maximum number of cores in an AMD Magny-Cours processor. In all cases,
running on 24 cores gave somewhat lower performance, probably because of
thread-coordination overhead that cannot be hidden when all cores take
part in the computation.

Problems were encountered when using the PGI Fortran compiler with:
- mod2ci: The "-fastsse" flag caused convergence problems. Without
          "-fastsse" (only "-O4") the results were slightly better but
          still unsatisfactory.
          Intel's ifort compiler with flag "-O5" gave good results in
          all cases. It was also faster than the PGI compiler in the
          cases where convergence was reached.
          With the PGI compiler the residuals were always larger than
          with ifort, sometimes much larger. The PGI runs have names
          like mod2ci.p<XX>.log; the Intel ifort runs have names like
          mod2ci.pa<XX>.log.
- mod2h:  With more than 2 cores the PGI Fortran compiler gave wrong
          results, so only two PGI runs are logged (for 1 and 2 cores).
          With the Intel ifort compiler no problems were encountered.
          The Intel runs have names like mod2h.p<XX>.intel.log.
