The paper studies two performance metrics of systems enabled with Intel Xeon Phi coprocessors: the ratio of performance to consumed electrical power and the ratio of performance to purchasing system cost, both under the assumption of linear parallel scalability of the application.
Performance to power values are measured for three workloads: a compute-bound workload (DGEMM), a memory bandwidth-bound workload (STREAM), and a latency-limited workload (small matrix LU decomposition). Performance to cost ratios are computed, using system configurations and prices available at Colfax International, as functions of the acceleration factor and of the number of coprocessors per system. That study considers hypothetical applications with acceleration factor from 0.35x to 2x.
In all studies, systems with Intel Xeon Phi coprocessors yield better metrics than systems with only Intel Xeon processors. That applies even with acceleration factor of 1x, as long as the application can be distributed between the CPU and the coprocessor.