WIPL-D GPU Solver is a module that uses NVIDIA CUDA-enabled GPUs to significantly reduce EM simulation time in WIPL-D. Any number of GPUs can be used. The module accelerates three phases of EM analysis:

  • matrix fill-in,
  • matrix inversion, and
  • near-field calculations.

Acceleration by an order of magnitude can be achieved on a single personal computer while maintaining the accuracy of the final results. The greatest speed-up is achieved in matrix inversion, the most time-demanding part of an EM analysis.

In addition to running calculations on GPUs, the WIPL-D GPU Solver brings several important improvements to the out-of-core (OoC) algorithm, which is used when analyzing electrically large EM structures:

  • more than one hard disk can be used in parallel, which significantly increases the speed of hard disk I/O operations,
  • I/O operations are done in parallel with GPU calculations, so hard disk I/O time has almost no influence on the overall solving time,
  • a new GPU-accelerated OoC reduced algorithm almost halves the matrix inversion time for problems whose system matrix is symmetric.
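The overlap of disk I/O with computation described in the second point can be illustrated with a generic producer-consumer pattern. The sketch below is only an illustration under stated assumptions, not WIPL-D's implementation: `load_block` and `process_block` are hypothetical stand-ins for reading a matrix block from disk and for the GPU calculation, and `depth` is an assumed number of blocks held in memory at once.

```python
import queue
import threading


def load_block(index):
    # Hypothetical stand-in for reading one matrix block from disk.
    return [float(index)] * 4


def process_block(block):
    # Hypothetical stand-in for the GPU calculation on one block.
    return sum(x * x for x in block)


def run_overlapped(num_blocks, depth=2):
    # Bounded queue: the reader can run at most `depth` blocks ahead.
    buffers = queue.Queue(maxsize=depth)

    def reader():
        for i in range(num_blocks):
            buffers.put(load_block(i))  # blocks while the queue is full
        buffers.put(None)  # sentinel: no more blocks

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    total = 0.0
    while True:
        block = buffers.get()
        if block is None:
            break
        # While this block is processed, the reader loads the next one.
        total += process_block(block)
    thread.join()
    return total


if __name__ == "__main__":
    print(run_overlapped(5))
```

With a bounded queue, the reader thread stays a fixed number of blocks ahead of the consumer, so as long as loading a block takes no longer than processing one, the loading time is hidden behind the computation.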

The acceleration that can be expected from the GPU Solver depends on the overall hardware configuration on which EM simulations are run, but mainly on the GPU(s) used. Many parameters of a given GPU model affect calculation speed. Memory bandwidth has the greatest influence. Other important parameters are the number of CUDA cores, RAM size, and processor clock.

A model with 100,000 unknowns can be solved in less than 10 minutes on a machine with one GTX 1080 Ti GPU. On the same machine with four such GPUs, the simulation time for the same problem is less than 5 minutes.
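As a rough illustration of what these figures imply for multi-GPU scaling, the sketch below computes speed-up and parallel efficiency. It assumes the quoted upper bounds (10 and 5 minutes) are representative run times, which the text does not state, and the function names are chosen here for illustration only.

```python
def speedup(time_single, time_multi):
    # Ratio of single-GPU run time to multi-GPU run time.
    return time_single / time_multi


def parallel_efficiency(time_single, time_multi, num_gpus):
    # Fraction of ideal (linear) scaling that is actually achieved.
    return speedup(time_single, time_multi) / num_gpus


if __name__ == "__main__":
    # Assumed run times in minutes, taken from the quoted upper bounds.
    t1, t4 = 10.0, 5.0
    print("speed-up:", speedup(t1, t4))
    print("efficiency:", parallel_efficiency(t1, t4, 4))
```

Because both quoted times are only upper bounds, the result should be read as an indication of scaling rather than a measured value.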

The GPU Solver exploits the symmetry of the matrix when metallic structures are analyzed, halving the LU decomposition time in such cases. For example, a symmetric matrix with half a million unknowns can be LU decomposed in about 6 hours on the aforementioned machine with 4 GPUs.
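A back-of-the-envelope estimate shows why a problem of this size calls for out-of-core processing and why symmetry roughly halves the work. The sketch below assumes 8 bytes per matrix entry (single-precision complex, which the text does not specify) and uses the standard leading-order operation count of 2N³/3 for dense LU decomposition, counted in the arithmetic of the matrix entries. The function names are illustrative.

```python
def dense_matrix_bytes(n, bytes_per_entry=8, symmetric=False):
    # A symmetric matrix only needs its upper (or lower) triangle stored.
    entries = n * (n + 1) // 2 if symmetric else n * n
    return entries * bytes_per_entry


def lu_operations(n, symmetric=False):
    # Leading-order count for dense LU: 2*n^3/3 operations.
    # Exploiting symmetry roughly halves the count.
    full = 2 * n**3 / 3
    return full / 2 if symmetric else full


if __name__ == "__main__":
    n = 500_000
    tb = 1e12  # bytes per terabyte (decimal)
    print("full storage [TB]:", dense_matrix_bytes(n) / tb)
    print("symmetric storage [TB]:", dense_matrix_bytes(n, symmetric=True) / tb)
    print("operations (full):", lu_operations(n))
    print("operations (symmetric):", lu_operations(n, symmetric=True))
```

Under these assumptions, even the symmetric half of the matrix occupies about a terabyte, so it has to be streamed from disk in blocks rather than held in GPU memory.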

To take advantage of GPU acceleration, one or more NVIDIA CUDA-enabled GPU cards (GeForce, Tesla, or Quadro series) must be used. GPUs with compute capability 2.0 or higher are supported.

For more info on GPU configurations and ways to use the speed-up on your computer, feel free to contact us.