knights landing

“HOW Series”: Webinars on Performance Optimization, June 2017

April 28, 2017

In a Nutshell: HOW Series “Deep Dive” is a free 20-hour, hands-on, in-depth training on parallel programming and performance optimization in computational applications on Intel architecture. The 6th run in 2017 begins June 19, 2017. Broadcasts start at 16:00 GMT (9:00 am in San Francisco, 12:00 noon in New York, 5:00 pm in London, 7:00 pm in Moscow, 9:30 pm in New Delhi, 1:00 am in Tokyo). Cannot attend? Register anyway for cluster access, progress updates and recorded video. [...]

HOW Series “Deep Dive”: Webinars on Performance Optimization, May 2017

April 13, 2017

In a Nutshell: HOW Series “Deep Dive” is a free 20-hour, hands-on, in-depth training on parallel programming and performance optimization in computational applications on Intel architecture. The 5th run in 2017 begins May 15, 2017. Broadcasts start at 16:00 GMT (9:00 am in San Francisco, 12:00 noon in New York, 5:00 pm in London, 7:00 pm in Moscow, 9:30 pm in New Delhi, 1:00 am in Tokyo). Registration for this training in June 2017 is also open. Cannot attend? Register anyway for cluster access, progress updates and recorded video. [...]

Get the Most out of Your Free Trial of Intel Xeon Phi Processors

April 7, 2017

Free Webinar. Abstract: Intel® Xeon Phi™ processors x200 (formerly Knights Landing) are computational beasts. Their theoretical peak performance is up to 3 TFLOP/s and measured memory bandwidth is up to 490 GB/s. This performance is available without any difference in programming models compared to general-purpose x86-like CPUs. Colfax is offering a free trial program for this technology. This program is available through Intel’s sponsorship. The Colfax Cluster has 64 compute nodes based on Intel Xeon Phi 7250 processors. Intel® Omni-Path fabric interconnects the nodes. This cluster is at your service for two weeks for testing and evaluation. In this 1-hour webinar I will describe how you can get the most out of your two weeks on the cluster: what workloads you can run to see the performance; how to prepare your own code to run on the cluster; and where to learn the best optimization practices for this and similar architectures. Slides: Colfax-Remote-Access-Webinar-2017.pdf (2 MB). Free trial: here [...]
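The "up to 3 TFLOP/s" figure can be reproduced with back-of-the-envelope arithmetic. The sketch below is not from the webinar. It assumes publicly listed Intel Xeon Phi 7250 parameters (68 cores, 1.4 GHz base clock, two 512-bit vector units per core) and counts a fused multiply-add as two floating-point operations. The function and constant names are illustrative only. Under these assumptions the double-precision peak comes out to roughly 3.05 TFLOP/s.

```python
# Assumed Xeon Phi 7250 parameters; none of these appear in the abstract above.
CORES = 68              # physical cores
CLOCK_GHZ = 1.4         # base clock; sustained AVX-512 clocks may differ
VPUS_PER_CORE = 2       # 512-bit vector processing units per core
DP_LANES = 512 // 64    # double-precision values per 512-bit vector
FLOPS_PER_FMA = 2       # one fused multiply-add = one multiply + one add


def peak_gflops(cores=CORES, clock_ghz=CLOCK_GHZ, vpus=VPUS_PER_CORE,
                lanes=DP_LANES, flops_per_fma=FLOPS_PER_FMA):
    """Theoretical double-precision peak in GFLOP/s."""
    return cores * clock_ghz * vpus * lanes * flops_per_fma


if __name__ == "__main__":
    print(f"Theoretical DP peak: {peak_gflops() / 1000:.2f} TFLOP/s")
```

Halving the element width (single precision, 16 lanes per vector) doubles the estimate. Measured performance of real applications is normally below this bound.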

HOW Series “Deep Dive”: Webinars on Performance Optimization, April 2017

March 16, 2017

In a Nutshell: HOW Series “Deep Dive” is a free 20-hour, hands-on, in-depth training on parallel programming and performance optimization in computational applications on Intel architecture. The 4th run in 2017 begins April 17, 2017. Broadcasts start at 16:00 UTC (9:00 am in San Francisco, 12:00 noon in New York, 5:00 pm in London, 7:00 pm in Moscow, 9:30 pm in New Delhi, 1:00 am in Tokyo). Registration for this workshop is closed, but you can register for the upcoming HOW Series in May or watch the recordings of all presentations below. [...]

HOW Series “Deep Dive”: Webinars on Performance Optimization, March 2017

February 15, 2017

In a Nutshell: HOW Series “Deep Dive” is a free 20-hour, hands-on, in-depth training on parallel programming and performance optimization in computational applications on Intel architecture. The 3rd run in 2017 begins March 13, 2017. Broadcasts start at 16:00 UTC (9:00 am in San Francisco, 12:00 noon in New York, 4:00 pm in London, 7:00 pm in Moscow, 9:30 pm in New Delhi, 1:00 am in Tokyo). Registration for this workshop is closed, but you can register for the April HOW Series. [...]

FALCON Library: Fast Image Convolution in Neural Networks on Intel Architecture

November 9, 2016

We describe FALCON, an original open-source implementation of image convolution with a 3×3 filter based on Winograd’s minimal filtering algorithm. Compared to direct convolution, Winograd’s algorithm reduces the number of arithmetic operations at the cost of complicating the memory access pattern. This study is carried out in the context of image analysis in convolutional neural networks. Our implementation combines C language code with BLAS function calls for general matrix-matrix multiplication. The code is optimized for Intel Xeon Phi processors x200 (formerly Knights Landing), with the Intel Math Kernel Library (MKL) providing the BLAS SGEMM function. To test the performance of FALCON in the context of machine learning, we benchmarked it for a set of image and filter sizes corresponding to the VGG Net architecture. In this test, FALCON achieves 10% greater overall performance than convolution from DNN primitives in Intel MKL. However, the result varies by layer: for some layers FALCON is faster than MKL by 1.5x, while for others it is slower by as much as 4x. This indicates a possibility of a [...]
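To make the trade-off concrete, here is a minimal sketch of Winograd's minimal filtering for a single tile, in the common F(2×2, 3×3) form. It is not FALCON's code: the tile size, the function names, and the pure-Python matrix helpers are assumptions chosen for illustration, and FALCON itself batches this work into SGEMM calls. The sketch computes a 2×2 output tile from a 4×4 input tile with 16 element-wise multiplications, where the direct method uses 36.

```python
# Standard F(2x2, 3x3) transform matrices (illustrative; not taken from FALCON).
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input transform
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]     # filter transform
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                               # output transform


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_tile(d, g):
    """2x2 output tile from a 4x4 input tile d and a 3x3 filter g."""
    u = matmul(matmul(G, g), transpose(G))      # transformed filter, 4x4
    v = matmul(matmul(BT, d), transpose(BT))    # transformed input, 4x4
    # The only data-by-filter multiplications: 16 element-wise products.
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, m), transpose(AT))  # 2x2 output


def direct_tile(d, g):
    """Reference result: direct sliding-window sum, 36 multiplications."""
    return [[sum(d[i + p][j + q] * g[p][q] for p in range(3) for q in range(3))
             for j in range(2)] for i in range(2)]
```

In a convolutional layer the transformed filter can be computed once and reused for every tile, and the element-wise products summed over input channels take the form of matrix multiplications. The extra transforms and data reshuffling are the source of the more complicated memory access pattern mentioned above.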

Training Calendar

October 4, 2016

“HOW” Series: Deep Dive. Learn Modern Code: Are you realizing the payoff of parallel processing? Are you aware that without code optimization, computational applications may perform orders of magnitude worse than they are supposed to? The Web-based HOW Series training provides extensive knowledge needed to extract more of the parallel compute performance potential found in both Intel® Xeon® and Intel® Xeon Phi™ processors and coprocessors. Practice New Skills: The HOW Series is an experiential learning program comprising instructional and hands-on self-study components. The instructional part: 10 lecture sessions with 1 hour of theory and 1 hour of practical demonstrations. The self-study part: attendees are provided with remote access over SSH to a Linux-based cluster of training servers with Intel Xeon Phi processors (KNL) and Intel [...]

Machine Learning on 2nd Generation Intel® Xeon Phi™ Processors: Image Captioning with NeuralTalk2, Torch

June 20, 2016

In this case study, we describe a proof-of-concept implementation of a highly optimized machine learning application for Intel Architecture. Our results demonstrate the capabilities of Intel Architecture, particularly the 2nd generation Intel Xeon Phi processors (formerly codenamed Knights Landing), in the machine learning domain. Download as PDF: Colfax-NeuralTalk2-Summary.pdf (814 KB), or read online below. Code: see our branch of NeuralTalk2 for instructions on reproducing our results (in Readme.md). It uses our optimized branch of Torch to run efficiently on Intel architecture. See also: colfaxresearch.com/get-ready-for-intel-knights-landing-3-papers/ 1. Case Study. It is common in the machine learning (ML) domain to see applications implemented with the use of frameworks and libraries such as Torch, Caffe, TensorFlow, and similar. This approach allows the computer scientist to focus on the learning algorithm, leaving the details of performance optimization to the framework. Similarly, the ML [...]

Intel® Python* on 2nd Generation Intel® Xeon Phi™ Processors: Out-of-the-Box Performance

June 20, 2016

This paper reports on the value and performance for computational applications of the Intel® Distribution for Python* 2017 Beta on 2nd generation Intel® Xeon Phi™ processors (formerly codenamed Knights Landing). Benchmarks of LU decomposition, Cholesky decomposition, singular value decomposition and double-precision general matrix-matrix multiplication routines in the SciPy and NumPy libraries are presented, and a tuning methodology for use with high-bandwidth memory (HBM) is laid out. Download as PDF: Colfax-Intel-Python.pdf (1 MB), or read online below. Code: coming soon, check back later. See also: colfaxresearch.com/get-ready-for-intel-knights-landing-3-papers/ 1. A Case for Python in Computing. Python is a popular scripting language in computational applications. Empowered with the fundamental tools for scientific computing, the NumPy and SciPy libraries, Python applications can express in brief and convenient form basic linear algebra subroutines (BLAS) and linear algebra package (LAPACK) [...]

Knights Landing Webinar Slides Translated to Japanese

May 13, 2016

With the help of our partners at XLsoft, the slide deck for the webinar “Introduction to Next-Generation Intel® Xeon Phi™ Processor: Developer’s Guide to Knights Landing” has been translated into Japanese. See the XLsoft website. Download here: JP-Colfax-Programmers-Guide-to-KNL.pdf (5 MB). For more information, and to register for the webinar, please visit: Webinar [...]