Compiling
SciClone offers multiple compiler suites and tools for code development and testing. In addition to the GNU Compiler Collection (GCC), supported compilers include the Intel and Portland Group (PGI) compiler suites. Because the sub-clusters differ in architecture and in the operating system they are provisioned with, each sub-cluster has its own default GCC version, which supports the compilation flags and directives specific to that sub-cluster's architecture. The default and available compiler suite versions for each sub-cluster are tabulated below:
Sub-cluster | GCC Versions | Intel Compiler Versions | PGI Compilers |
Rain | 7.2.0 | | |
Hurricane/Whirlwind | 4.7.0 (default), 4.7.3, 4.8.4, 5.2.0 | 2017 (default), 2016 | 11.10 (default), 14.3, 17.7 |
Vortex/Vortex-alpha | 4.7.3 (default), 4.8.4, 5.2.0 | 2017 | 14.3 (default), 16.3, 17.7 |
Bora/Hima | 4.9.4 (default), 7.2.0 | 2017 (default), 2016 | 16.3 (default), 17.3, 17.7 |
Storm | 4.7.3 (default), 5.2.0 | 2017 | 14.3 (default), 17.7 |
Meltemi | 6.3.0 (default) | 2016, 2017 (default) | |
Potomac & Pamunkey | 4.7.3 (default), 5.3.0 | 2016 | 14.3 (default), 15.4 |
James | 4.8.5, 6.3.0 (default) | 2016, 2017, 2018 | 15.4, 18.7 (default) |
The versions listed above are not comprehensive; only the recommended versions for each sub-cluster are shown. For a comprehensive list, refer here.
All the compiler paths, include paths, library paths
Code Optimization
Compiler-level optimization is enabled by passing optimization flags at compile time. In automated build systems, this can be done by setting the CFLAGS and CXXFLAGS environment variables. The most commonly used optimization flags that are shared across platforms are tabulated below. They can be used in addition to the CPU architecture-specific optimization flags.
GNU Compiler Chain | Intel Parallel Studio | PGI Compilers | Remarks |
--help=optimizers (-Q) | -help opt | -help=opt | Display optimization options. |
 | -help advanced | | Display advanced optimization options that allow fine tuning of compilation. |
-O0, -O1, -O2, -O3 | -O0, -O1, -O2, -O3 | -O0, -O1, -O2, -O3,-O4 | Levels of optimizations. Default -O2. Aggressive optimization option -O3 may change numerical results. |
-Ofast | -fast | -fast | Choose generally optimal flags for the target platform |
-ipa | -ipo | -Mipa=fast | InterProcedural Analysis / Inter Procedural Optimization |
-malign-data=cacheline | -align | -Mcache_align | Align long objects on cache-line boundaries |
-finline | -inline-level=<0|1|2> | -Minline | Controlling inline expansions |
-fprefetch-loop-arrays | -qopt-prefetch[=<0|1|2|3|4|5>] (default is 2) | -Mprefetch | Generate Prefetch instructions |
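As a rough illustration of the environment-variable approach described above, the sketch below sets CFLAGS and CXXFLAGS for a GCC build. The flag values are examples taken from the GNU column of the table, not a recommendation for any particular code, and `show_gcc_optimizers` is a hypothetical helper name introduced here; it simply wraps GCC's `-Q --help=optimizers` query.

```shell
# Example values only: general-purpose GCC flags from the table above.
OPT_FLAGS="-O2 -fprefetch-loop-arrays"

# Most automated build systems read these variables from the environment.
export CFLAGS="$OPT_FLAGS"
export CXXFLAGS="$OPT_FLAGS"

echo "CFLAGS=$CFLAGS"
echo "CXXFLAGS=$CXXFLAGS"

# Hypothetical helper: list which optimizations GCC enables at a given level.
# The level must come before --help=optimizers so that it affects the report.
show_gcc_optimizers() {
    level="${1:--O2}"
    if command -v gcc >/dev/null 2>&1; then
        gcc -Q "$level" --help=optimizers
    else
        echo "gcc not found in PATH" >&2
        return 1
    fi
}

# Example (requires gcc): compare the settings reported for -O2 and -O3.
#   show_gcc_optimizers -O2 > O2.txt
#   show_gcc_optimizers -O3 > O3.txt
#   diff O2.txt O3.txt
```

With the variables exported, a build that honours them (for example a project using `./configure && make`, assumed here) should pick up the flags without further changes.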
Suitable flags for each sub-cluster processor architecture and compiler suites are tabulated below. References are also provided to optimization guides where relevant. These flags are recommended with compiler versions listed above, for each sub-cluster.
Sub-cluster | GCC Flags | PGI Flags | Intel Flags | Reference Optimization Guide |
Rain | -march=k8 | -tp amd64 | -xHost | |
Hurricane | -march=westmere -march=corei7 | -tp nehalem | -msse4.2 | |
Vortex | -march=bdver2 | -tp piledriver | -msse4.2 | |
Bora/Hima | -march=haswell -mfma | -tp haswell | -xCORE-AVX2 -fma -std=c11 | Best Practice Guide - Haswell/Broadwell |
Storm (except Ice) | -march=barcelona | -tp shanghai -fastsse | -msse3 | |
Ice | -march=barcelona | -tp istanbul -fastsse | -msse3 | Compiler Options Quick Reference - Magny-Cours |
Meltemi | -march=knl -mavx512f -mavx512pf -mavx512er -mavx512cd -mfma | N/A | -xCORE-AVX512 -fma -std=c11 | KNL Best Practices Guide |
Potomac | -march=bdver1 | -tp piledriver | -mavx | |
Pamunkey | -march=bdver2 | -tp piledriver | -xCORE-AVX2 | Compiler Options Quick Reference - Abu-Dhabi |
James | -march=skylake | -tp=skylake | -mtune=skylake |
Users are encouraged to explicitly use the architecture flags (above) for their choice of sub-cluster when compiling, since the front-end may not be of the same architecture as the nodes (as is the case with Meltemi).
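One way to make the architecture flags explicit is to look them up by sub-cluster name at build time. The sketch below is only an illustration: `gcc_arch_flags` is a hypothetical helper, it covers just a few sub-clusters, and the flag strings are copied from the GCC column of the table above.

```shell
# Hypothetical helper: print the GCC architecture flags listed above
# for a (lower-case) sub-cluster name. Only a subset is covered here.
gcc_arch_flags() {
    case "$1" in
        bora|hima) echo "-march=haswell -mfma" ;;
        vortex)    echo "-march=bdver2" ;;
        meltemi)   echo "-march=knl -mavx512f -mavx512pf -mavx512er -mavx512cd -mfma" ;;
        james)     echo "-march=skylake" ;;
        *)
            echo "no flags recorded for sub-cluster: $1" >&2
            return 1
            ;;
    esac
}

# Build for Meltemi nodes even when compiling on a different front-end.
TARGET="meltemi"
ARCH_FLAGS="$(gcc_arch_flags "$TARGET")"
echo "Compiling for $TARGET with: $ARCH_FLAGS"

# Example compile line (app.c is a placeholder source file):
#   gcc -O2 $ARCH_FLAGS -o app app.c
```

Passing the flags explicitly, rather than relying on the compiler to detect the host CPU, matters most where the front-end and the compute nodes differ.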
Choosing Compilers
Three different compiler suites are available on SciClone, often in multiple versions. These include the open source GNU Compiler Collection (GCC) and the commercial Portland Group (PGI) and Intel Parallel Studio XE (Cluster or Composer) suites. In addition, packages such as MPI and CUDA provide their own compilation commands which are implemented on top of one or more of these base compiler suites.
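Because MPI compilation commands are wrappers around a base compiler, it can be useful to check which compiler a given wrapper actually invokes. The sketch below assumes the wrapper is called `mpicc` and is already in PATH; `show_mpi_base_compiler` is a hypothetical helper name. Open MPI wrappers accept `--showme`, while MPICH-derived wrappers use `-show`, so the helper tries both.

```shell
# Hypothetical helper: print the underlying compiler command line that an
# MPI wrapper (default: mpicc) would run, without compiling anything.
show_mpi_base_compiler() {
    wrapper="${1:-mpicc}"
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper not found in PATH" >&2
        return 1
    fi
    # Open MPI wrappers understand --showme; MPICH-derived wrappers use -show.
    "$wrapper" --showme 2>/dev/null || "$wrapper" -show
}

# Example (requires an MPI installation in the environment):
#   show_mpi_base_compiler mpicc
#   show_mpi_base_compiler mpif90
```

The first word of the output is the base compiler, which indicates the suite the MPI library was built on top of.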
For any given application, the choice of compiler and compiler options can have a major impact on performance. Generally speaking, the commercial compiler suites (PGI and Intel) will produce better results than GCC, and we therefore recommend their use whenever the code base permits. There are exceptions, however, so some experimentation may be in order, particularly for applications with long runtimes. Compiler optimization guidelines are provided above, tailored to SciClone's various hardware platforms.
SciClone features a large collection of third-party application software, and the compiler requirements vary from one package to another. Some packages are highly portable and have been compiled with multiple compiler suites, while others require a very specific compiler version. Where applicable, compiler information is included in the local documentation pages for individual software packages. As a rule, application software should be linked to libraries which have been compiled with the same compiler suite, and should use compatible compiler options.