This series of 45-minute webinars was presented by Colfax International in collaboration with Intel in 2016.
1. Strategies for Multi-Threading on Intel Xeon Phi Processors
Practical recipes for optimizing performance in multi-threaded computational applications on Intel Xeon Phi processors. The presentation covers common problems with thread parallelism (excessive synchronization, false sharing, and insufficient iteration space size) and methods for overcoming them: parallel reduction, data padding, strip-mining and loop collapse, and nested parallelism.
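As an illustration of two of the methods named above, here is a minimal sketch in C with OpenMP. It is not taken from the webinar; the function names `sum_reduction` and `sum_matrix_collapsed` are hypothetical. The first function replaces per-iteration synchronization with a parallel reduction, and the second uses loop collapse to enlarge an iteration space whose outer loop is too short to occupy all threads.

```c
#include <assert.h>
#include <stddef.h>

/* Parallel reduction: each thread accumulates into a private copy of
   `total`, and the private copies are combined once at the end, so no
   iteration has to synchronize on a shared variable. */
double sum_reduction(const double *x, long n) {
    assert(n >= 0 && (x != NULL || n == 0));
    double total = 0.0;
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; i++) {
        total += x[i];
    }
    return total;
}

/* Loop collapse: when `rows` is small compared to the thread count,
   collapse(2) merges both loops into a single iteration space of
   rows * cols iterations that can be divided among the threads. */
double sum_matrix_collapsed(const double *m, long rows, long cols) {
    assert(rows >= 0 && cols >= 0);
    double total = 0.0;
    #pragma omp parallel for collapse(2) reduction(+ : total)
    for (long i = 0; i < rows; i++) {
        for (long j = 0; j < cols; j++) {
            total += m[i * cols + j];
        }
    }
    return total;
}
```

Compiled without OpenMP support, the pragmas are ignored and both functions run serially with the same result. Floating-point sums may differ in the last bits between thread counts because the order of additions changes.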
2. Fine-Tuning Vectorization on Intel Xeon Phi Processors
Vectorization of computational applications on Intel Xeon Phi processors. The presentation covers the essentials of automatic vectorization and the toolkit for advanced tuning of vectorization performance, including compiler directives, data container optimization, and language extensions for expressing data parallelism.
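A compiler directive of the kind mentioned above might look like the following sketch in C, which is an assumed example rather than material from the webinar. The function name `scale_add` is hypothetical. The `restrict` qualifiers promise the compiler that the arrays do not overlap, and `#pragma omp simd` asks it to vectorize the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
   `restrict` states that x and y do not alias, which removes a common
   obstacle to automatic vectorization; `#pragma omp simd` requests
   vectorization of the loop explicitly. */
void scale_add(long n, float a, const float *restrict x, float *restrict y) {
    assert(n >= 0 && ((x != NULL && y != NULL) || n == 0));
    #pragma omp simd
    for (long i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Both arrays are accessed with unit stride, which is the access pattern vector loads and stores handle best. Whether the loop is actually vectorized depends on the compiler and its flags, so the optimization report is the place to confirm it.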
3. Controlling Memory Traffic on Intel Xeon Phi Processors
Overview of two aspects of memory traffic tuning in computational applications on Intel Xeon Phi processors: maximizing cache utilization and streamlining access to main memory. The presentation covers programming techniques for improving data locality in loops (permutation, fusion, and tiling) and recipes for optimizing memory bandwidth: unit-stride access, thread affinity settings, and allocation in high-bandwidth memory (HBM) using programmatic and automatic approaches.
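Loop tiling, one of the locality techniques listed above, can be sketched in C as a tiled matrix transpose. This is an assumed illustration, not the webinar's code; the function name `transpose_tiled` and the tile size of 32 are hypothetical choices, and the best tile size depends on the cache sizes of the target processor.

```c
#include <assert.h>
#include <stddef.h>

enum { TILE = 32 }; /* hypothetical tile size; tune for the target cache */

/* Writes the transpose of the n x n row-major matrix `a` into `b`.
   The two outer loops walk over TILE x TILE blocks, and the two inner
   loops stay inside one block, so the lines of `a` and `b` touched by
   a block can remain in cache while they are being reused. */
void transpose_tiled(long n, const double *restrict a, double *restrict b) {
    assert(n >= 0 && ((a != NULL && b != NULL) || n == 0));
    for (long ii = 0; ii < n; ii += TILE) {
        for (long jj = 0; jj < n; jj += TILE) {
            /* Clamp the block edges so n need not be a multiple of TILE. */
            long i_end = (ii + TILE < n) ? ii + TILE : n;
            long j_end = (jj + TILE < n) ? jj + TILE : n;
            for (long i = ii; i < i_end; i++) {
                for (long j = jj; j < j_end; j++) {
                    b[j * n + i] = a[i * n + j];
                }
            }
        }
    }
}
```

Without tiling, either the reads or the writes of a transpose have a stride of `n` elements, so each cache line is used for one element before it is evicted. The tiled version performs the same assignments in a different order.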