KNL

Optimizing Torch Performance for Intel Xeon Phi Processors

November 18, 2016

    In this 1-hour webinar, Ryo Asai (Colfax) discusses how machine learning applications can benefit from code modernization. He begins by exploring the parallelism that gives modern computer architecture its performance, and how it can be leveraged. Then he applies code modernization techniques live on-screen to the Torch machine learning framework. Specifically, he optimizes image recognition through a deep convolutional neural network that uses the VGG-net architecture. For each code modernization technique, he explains why it works, and how to apply it in practice. What you will learn: What code modernization is, and its importance for machine learning Practical knowledge of modern computer architectures Code modernization techniques for leveraging parallelism Slides:  Colfax-Torch-VGG-Webinar.pdf (2 MB) — this file is available only to registered users. Register or Log [...]

Performance Optimization for Intel® Xeon Phi™ x200 Product Family: Video

September 29, 2016

Optimization for Intel Xeon Phi Processors x200 Colfax now offers a 2-hour Hands-On Workshop (HOW) video on the best practices for performance optimization for Intel® Xeon Phi™ processor (formerly Knights Landing). Use links below the video to navigate the 10 episodes.     Slides:   HOW-Knights-Landing.pdf (4 MB) Part 1. Meet Intel Xeon Phi processors Purpose of Intel Xeon Phi processors and their organization from the programmer’s point of view. Episode 01. ► Intel architecture: today and tomorrow (14 min) Episode 02. ► Cores in Intel Xeon Phi processors (7 min) Episode 03. ► Vector Instruction Support (14 min) Episode 04. ► High-bandwidth memory (8 min) Episode 05. ► Clustering modes (9 min) Part 2. Hands-on Demonstrations Exercises in performance optimization for Intel Xeon Phi processors. Episode 06. ► Memory bandwidth optimization (19 min) (bonus: ► with memkind) (9 min) Episode 07. ► Vectorization with AVX-512 (13 min) (bonus: ► threading) (9 min) Episode 08. ► Tuning with Intel Math Kernel Library (MKL) (20 min) Episode 09. ► [...]

Machine Learning on 2nd Generation Intel® Xeon Phi™ Processors: Image Captioning with NeuralTalk2, Torch

June 20, 2016

  In this case study, we describe a proof-of-concept implementation of a highly optimized machine learning application for Intel Architecture. Our results demonstrate the capabilities of Intel Architecture, particularly the 2nd generation Intel Xeon Phi processors (formerly codenamed Knights Landing), in the machine learning domain. Download as PDF:  Colfax-NeuralTalk2-Summary.pdf (814 KB) — this file is available only to registered users. Register or Log In. or read online below. Code: see our branch of NeuralTalk2 for instructions on reproducing our results (in Readme.md). It uses our optimized branch of Torch to run efficiently on Intel architecture. See also: colfaxresearch.com/get-ready-for-intel-knights-landing-3-papers/ 1. Case Study It is common in the machine learning (ML) domain to see applications implemented with the use of frameworks and libraries such as Torch, Caffe, TensorFlow, and similar. This approach allows the computer scientist to focus on the learning algorithm, leaving the details of performance optimization to the framework. Similarly, the ML [...]

Intel® Python* on 2nd Generation Intel® Xeon Phi™ Processors: Out-of-the-Box Performance

June 20, 2016

This paper reports on the value and performance for computational applications of the Intel® distribution for Python* 2017 Beta on 2nd generation Intel® Xeon Phi™ processors (formerly codenamed Knights Landing). Benchmarks of LU decomposition, Cholesky decomposition, singular value decomposition and double precision general matrix-matrix multiplication routines in the SciPy and NumPy libraries are presented, and tuning methodology for use with high-bandwidth memory (HBM) is laid out. Download as PDF:  Colfax-Intel-Python.pdf (1 MB) — this file is available only to registered users. Register or Log In. or read online below. Code: coming soon, check back later. See also: colfaxresearch.com/get-ready-for-intel-knights-landing-3-papers/ 1. A Case for Python in Computing Python is a popular scripting language in computational applications. Empowered with the fundamental tools for scientific computing, NumPy and SciPy libraries, Python applications can express in brief and convenient form basic linear algebra subroutines (BLAS) and linear algebra package (LAPACK) [...]

Knights Landing Webinar Slides Translated to Japanese

May 13, 2016

日XLsoft社の協力で、弊社の “Introduction to Next-Generation Intel® Xeon Phi™ Processor: Developer’s Guide to Knights Landing” で使われているスライド集が日本語に翻訳されました。 With the help of our partners at XLsoft, the slide deck for the webinar “Introduction to Next-Generation Intel® Xeon Phi™ Processor: Developer’s Guide to Knights Landing” has been translated to the Japanese language. XLsoft社のウェブサイト/XLsoft website Download here:  JP-Colfax-Programmers-Guide-to-KNL.pdf (5 MB) — this file is available only to registered users. Register or Log In. For more information, and to register for the webinar, please visit: Webinar [...]

MCDRAM as High-Bandwidth Memory (HBM) in Knights Landing Processors: Developer’s Guide

May 11, 2016

This publication is part of a developer guide focusing on the new features in 2nd generation Intel® Xeon Phi™ processors code-named Knights Landing (KNL). In this document we discuss the on-package high-bandwidth memory (HBM) based on the multi-channel dynamic random access memory (MCDRAM) technology: Three configuration modes of HBM: Flat mode, Cache mode and Hybrid mode Utilization of the HBM as addressable memory using two methods: by setting affinity policy with the numactl tool and through the usage of special allocators in the memkind library Guidelines for determining the optimal usage model for applications running on bootable Knights Landing.  Colfax_KNL_MCDRAM_Guide.pdf (255 KB) — this file is available only to registered users. Register or Log In. See also: colfaxresearch.com/get-ready-for-intel-knights-landing-3-papers/ 1. MCDRAM in KNL Memory bandwidth in computing systems is one of the common bottlenecks for performance in computational application. Bandwidth-limited applications are characterized by algorithms that have few floating point [...]

Guide to Automatic Vectorization with Intel AVX-512 Instructions in Knights Landing Processors

May 11, 2016

This publication is part of a developer guide focusing on the new features in 2nd generation Intel® Xeon Phi™processors code-named Knights Landing (KNL). In this document, we focus on the new vector instruction set introduced in Knights Landing processors, Intel® Advanced Vector Extensions 512 (Intel® AVX-512). The discussion includes: Introduction to vector instructions in general, The structure and specifics of AVX-512, and Practical usage tips: checking if a processor has support for various features, compilation process and compiler arguments, and pros and cons of explicit and automatic vectorization using the Intel® C++ Compiler and the GNU Compiler Collection.  Colfax_KNL_AVX512_Guide.pdf (195 KB) — this file is available only to registered users. Register or Log In. See also: colfaxresearch.com/get-ready-for-intel-knights-landing-3-papers/ 1. Vector Instructions Intel® Xeon Phi™products are highly parallel processors with the Intel® Many Integrated Core (MIC) architecture. Parallelism is present in these [...]

Remote Access to Intel® Xeon Phi™ Processors

January 25, 2016

Colfax’s remote access program allows you to evaluate and optimize own code, and to test computational applications optimized for Intel® architecture, on systems based on Intel® Xeon Phi™ product family x200 (formerly Knights Landing). Request Access Now Please fill out and submit the form below to request access to the Colfax Cluster. You will get additional instructions via the email address that you provide. Do you have a free account at Colfax Research? Save time by logging in – we will fill in the fields with data from your profile. The registration form should appear here in a moment. If you don’t see a registration form, it could be because your browser has JavaScript disabled is running an advertisement blocking tool is too old Please try to enable JavaScript, disable the ad block, upgrade your browser, or try a different one.If nothing helps, contact us for help. How It Works This program works by providing for 14 days remote access to a computing cluster hosted at Colfax. The program is free and open to everyone. However, to prevent abuse, the program [...]