
Feature Of The Day

(20170314) Seyedeh Saeedeh, Meysam Golmohammadi, and Dawer Jamshed have attended the annual Columbia Workshop to learn more about Brain Circuits, Memory, and Computation.

Institute for Signal and Information Processing
Joseph Picone
on February 19, 2017 19:48
exam question on the number of processors and cores

Determining the exact hardware configuration from a Linux command is somewhat difficult. Manufacturers do all they can to make their processors look good. The note below explains some issues with doing this using lscpu and /proc/cpuinfo.

The short answer is:

N = no. physical processors: lscpu | grep "Socket(s)"
M = no. cores per socket: lscpu | grep "Core(s)"
T = no. threads per core: lscpu | grep "Thread(s)"

The number of jobs your processor can handle simultaneously is:

# Jobs = N * M * T

Of course you can run more jobs than this on a machine, but depending on the I/O and memory requirements, you might find those jobs run less efficiently.
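
The short answer above can be sketched in Python. This is a minimal sketch, not a definitive tool: the function names `parse_lscpu` and `max_simultaneous_jobs` and the sample text are illustrative assumptions, and the parser assumes English-language `lscpu` output with the field names "Socket(s)", "Core(s) per socket", and "Thread(s) per core".

```python
def parse_lscpu(text):
    """Return (N, M, T) = (sockets, cores per socket, threads per core)
    from the text printed by lscpu."""
    fields = {}
    for line in text.splitlines():
        # Each lscpu line looks like "Key:   value"; split on the first colon.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    n = int(fields["Socket(s)"])
    m = int(fields["Core(s) per socket"])
    t = int(fields["Thread(s) per core"])
    return n, m, t


def max_simultaneous_jobs(text):
    """Compute # Jobs = N * M * T."""
    n, m, t = parse_lscpu(text)
    return n * m * t


# Illustrative sample only (hypothetical output, not captured from a real machine).
SAMPLE_LSCPU = """\
Architecture:          x86_64
CPU(s):                32
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
"""

print(parse_lscpu(SAMPLE_LSCPU))
print(max_simultaneous_jobs(SAMPLE_LSCPU))
```

On a live machine, the input text could be obtained with `subprocess.run(["lscpu"], capture_output=True, text=True).stdout` instead of the sample string.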


The output of `$ lscpu` includes "Socket(s)", "Core(s) per socket", and "Thread(s) per core". Typically:

N = Socket(s) = the number of processors
M = Core(s) per socket = number of cores per processor
T = Thread(s) per core = number of threads per core

The processor in nedc_001 does not support hyperthreading, which is why it shows T=1.
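
The same three numbers can be cross-checked against /proc/cpuinfo. The sketch below is a hedged illustration, assuming an x86 Linux layout in which each logical CPU entry carries "processor", "physical id", and "core id" fields; other architectures may omit these fields, and how a given kernel labels cores that share hardware may vary. The function name `count_from_cpuinfo` and the sample text are hypothetical.

```python
def count_from_cpuinfo(text):
    """Return (sockets, physical cores, logical CPUs) from /proc/cpuinfo text."""
    sockets = set()
    cores = set()
    logical = 0
    physical_id = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1          # one entry per logical CPU
            physical_id = None    # reset for the new entry
        elif key == "physical id":
            physical_id = value
            sockets.add(value)    # distinct physical ids = sockets
        elif key == "core id":
            # A core is identified by its socket plus its core id.
            cores.add((physical_id, value))
    return len(sockets), len(cores), logical


# Illustrative sample only: one socket, two cores, two threads per core.
SAMPLE_CPUINFO = """\
processor\t: 0
physical id\t: 0
core id\t\t: 0

processor\t: 1
physical id\t: 0
core id\t\t: 1

processor\t: 2
physical id\t: 0
core id\t\t: 0

processor\t: 3
physical id\t: 0
core id\t\t: 1
"""

n_sockets, n_cores, n_logical = count_from_cpuinfo(SAMPLE_CPUINFO)
print(n_sockets, n_cores, n_logical, n_logical // n_cores)
```

Dividing the logical CPU count by the physical core count gives the threads per core, which should agree with the T reported by lscpu.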

I was a bit confused about nedc_002, since it reports 2 sockets, 8 cores per socket, and 2 threads per core, yet AMD doesn't have hyperthreading. From some reading, I found that the AMD processor on nedc_002 does indeed have 16 physical cores; however, in this family of processors (Bulldozer), AMD introduced a concept called modules. Each module consists of two cores, but each pair of cores shares a floating point unit. It appears AMD has faced some controversy for calling this series of processors "16 core".

"We asked Bernard Seite, technical advisor, AMD, whether we really should regard the two execution units within a Bulldozer Module as cores and were told, ‘If you take the overall group of applications that are running on x86, 90 per cent is integer… We look at how efficient Hyper-Threading [is]. Sometimes you have negative impact, but most of the time, you have something which is in between zero and 40. The Bulldozer Module will never be negative [in its performance gains] – you have two threads, and the two threads are not going to clash.'
The only time two threads within a Bulldozer Module could clash, we were told, was if each required 256-bit floating point precision, for example if both threads used the new 256-bit AVX capabilities of the CPU. This is because the floating point unit – as previously alluded to – is shared and comprises two 128-bit fp units which can be ganged to produce a single 256-bit unit. However, it’s very unlikely that we’ll see many 256-bit fp threads any time soon as the standard is new and will take time to adopt. Seite also pointed out that ideally the OS (or the compiler, potentially) should be aware of the capabilities of the Module and assign the second 256-bit thread to another Module, preferably one not running any hardcore fp work."

This is in contrast to Intel's hyperthreading where a single core processes two threads in parallel.

AMD Modules:

Joseph Picone
on February 17, 2017 22:10
interesting deep learning blog

Some interesting stuff (from Amir Harati):

Joseph Picone
on February 14, 2017 20:49

Joseph Picone
on February 12, 2017 08:18
Always interesting...
