In the setup of the parallel version of LAPW1, the list of k-points in case.klist is split into subsets according to the weights specified in the .machines file (note: a k-list from case.in1 cannot be used for parallel calculations):
where newweighti is the number of k-points to be calculated on processor i. Ki is always set to a value greater than or equal to one.
A loop over all i processors is repeated until all k-points have been processed.
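The distribution described above can be sketched as follows. This is only an illustration, not the code used by the WIEN97 scripts: the function name split_klist, the use of integer weights, and the rounding down of each share are assumptions made for the example.

```python
def split_klist(kpoints, weights):
    """Distribute k-points over machines in proportion to their weights.

    Hypothetical helper for illustration; not part of WIEN97.
    kpoints: list of k-point labels, in case.klist order
    weights: one positive integer weight per machine (as in .machines)
    """
    total = sum(weights)
    n = len(kpoints)
    # Share of each machine per pass, proportional to its weight
    # (rounded down here, an assumption) and never smaller than one.
    chunks = [max(1, (n * w) // total) for w in weights]
    subsets = [[] for _ in weights]
    pos = 0
    # Loop over all machines repeatedly until every k-point is assigned.
    while pos < n:
        for i, size in enumerate(chunks):
            if pos >= n:
                break
            subsets[i].extend(kpoints[pos:pos + size])
            pos += size
    return subsets


if __name__ == "__main__":
    # Ten k-points, second machine weighted twice as heavily as the first.
    for machine, subset in enumerate(split_klist(list(range(1, 11)), [1, 2])):
        print(machine, subset)
```

With ten k-points and weights 1 and 2, the first pass assigns three and six k-points, and the second pass gives the remaining k-point to the first machine.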
Speedup in a parallel program depends intrinsically on the proportion of serial and parallel parts of the code, according to Amdahl's law:
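In its standard textbook form, the law can be written as below. The symbols P and N are notation assumed here and are not taken from this guide.

```latex
\[
  \text{speedup} = \frac{1}{(1 - P) + \dfrac{P}{N}}
\]
```

Here P is the fraction of the run time that can be executed in parallel and N is the number of processors. As a purely illustrative example, P = 0.95 and N = 8 give a speedup of 1/(0.05 + 0.11875), or about 5.9, well below the ideal value of 8.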
In the case of WIEN97 the serial part of the run time is spent in the programs lapw0, lcore and mixer. In most cases this time is negligible in comparison to the time spent in lapw1 and lapw2, and to the time spent waiting until all parallel lapw1 and lapw2 processes have finished. For good performance it is therefore necessary that you properly estimate the speed and availability of the machines used. We encourage the use of testpara_lapw or "Run Programs -> Other goodies -> Parallel Test" to check the k-point distribution over the machines before actually running the programs in parallel.
While running lapw1 and lapw2 in parallel mode, the scripts testpara1_lapw (see 5.2.6) and testpara2_lapw (see 5.2.7) can be used to monitor the progress of the parallel execution.