When discussing new computers with our partners, it can sometimes be hard to justify why we are recommending something that costs three or four times as much as apparently similar models advertised on Amazon or for sale on the high street. Ah yes, we always say, but ours are a much higher specification. But what does that really mean? There are several variables involved, but in this article I want to focus on just one of the trickier ones: the central processing unit, or CPU. This is the one that gets mentioned on the label, e.g. you might see something like Intel i3 3.5 GHz or Intel i9 2.3 GHz in the headline description. I was prompted to tackle this subject by a quote I received from a vendor recently that looked like this:
MB PRO 16 8CI9 2.4GHZ TB UK KB/UK PSU 512GB 16GB IN £1,929.00
The key information of interest to us here is the fragment 8CI9 2.4GHZ, from which we can just about see that we are looking at a computer with an Intel i9 (I9) CPU with 8 cores (8C) running at 2.4 GHz. But what does that mean?
How fast is fast?
The GHz value refers to the processor clock speed, while a label such as i3 or i9 is the name of the processor family, with, as you would expect, higher numbers representing more capable variations of the processor design. Processor clock speed is often (incorrectly) thought of as computer speed, and computer sellers don’t always correct this. Why would they, when it gives them the opportunity to advertise blazing fast 3.0 GHz computers for £350? Now, I’m going to explain how a 2.0 GHz processor might be orders of magnitude faster than a 3.0 GHz one. The following might not get through a computer science class but, hopefully, while simplified, it is sufficiently accurate to get across the key concepts to a non-technical reader.
Meet the CPU
To understand processor (CPU) speeds, we must go back to basics and consider the CPU’s job. Computers are called computers because that is what they do: they compute, or calculate. And the CPU, as we all know, is the brain of the computer. The CPU is where the calculating or computing goes on. In fact, leaving aside specialist operations that are often housed in separate chips or extensions to the CPU, the calculations that the CPU is capable of are very simple. They don’t amount to much more than a capability to request and return very small chunks of numerical data and perform arithmetical operations on that data, such as add, subtract and multiply. The operations are not much beyond those we learned at primary school, the main difference being that the numerical data is in binary rather than decimal form. Binary numbers are limited to ones and zeroes, or, in CPU terms, a transistor in an on or an off state. Without going into any detail about binary, it is enough to know that, e.g., we can represent a number like 255 using just three digits in decimal, whereas eight are required to represent the same number in binary. By conducting huge volumes of simple operations in a very short time, computers not only appear to be very smart machines but have grown to occupy their central role in our working and leisure lives.
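For anyone who wants to see the 255 example for themselves, here is a minimal Python sketch. The variable names are mine, chosen purely for illustration.

```python
# Illustration only: how many digits does 255 need in decimal and in binary?
number = 255

decimal_digits = str(number)         # the familiar decimal form
binary_digits = format(number, "b")  # the same value written in binary

print(decimal_digits, "uses", len(decimal_digits), "digits")
print(binary_digits, "uses", len(binary_digits), "digits")
```

The same number needs three digits in decimal and eight in binary, which is why eight bits (one byte) crop up so often in what follows.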
From microchip to Netflix
OK, so I can sit in front of a box which does simple arithmetic on binary numbers at super fast speeds and somehow that means I can watch Netflix and look up information about anything while communicating in real time with my friends around the world. Hmmm, I’m not really convinced I’ve got the whole picture. Well no, there is more to it than that, but let’s focus for a moment on those simple operations and consider how they might produce what we see on the screen. Consider a standard computer screen resolution of 1024 X 768. You can think of this as a grid made up of 786,432 individual squares or pixels. We can build an imaginary processor, let’s call it the M1, which, unlike the CPU in a PC, has nothing else to do but control what is displayed on this screen. Now, imagine we have to display a finely detailed black and white image where 1 = white and 0 = black. All the M1 needs to do is set the value of each pixel appropriately, and if the M1 can perform a million operations per second (like a 1960s processor), it could produce the image in just over three quarters of a second.
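The arithmetic behind that figure can be sketched in a few lines of Python. This assumes, as the paragraph does, that setting one pixel costs the imaginary M1 exactly one operation. The names are illustrative only.

```python
# Sketch under the article's simplifying assumption: one operation per pixel.
WIDTH, HEIGHT = 1024, 768
M1_OPS_PER_SECOND = 1_000_000  # the imaginary M1 processor

pixels = WIDTH * HEIGHT
seconds_per_image = pixels / M1_OPS_PER_SECOND

print(f"{pixels:,} pixels")
print(f"{seconds_per_image:.2f} seconds per black and white image")
```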
Things get a bit more complicated with colour images. Instead of one binary digit (bit) representing white (1) or black (0) we represent 256 colours using eight bits (one byte). And, getting a bit more sophisticated, we can use another byte to represent 256 shades for each colour and one more byte to represent 256 levels of intensity. We get 256 X 256 X 256 = 16,777,216 different colours. Now, the M1 would start to struggle a little, taking almost two and a half seconds (0.75 X 3) to produce the image. What about video? Netflix transmits at just under 24 frames per second and each frame is just like a still colour image. As we have seen, the M1 needs almost 2.5 seconds to produce a single frame. We can’t watch Netflix on this computer. Time for an upgrade, we’ll try a processor, the M100, that can do 100 million operations per second (like a 1980s processor). Now we can easily keep up with Netflix, the M100 renders a full colour frame in 0.025 seconds and 24 frames in 0.6 seconds. In fact, for each frame only some pixels need to change, let’s say on average only 50% of the pixels change between frames (this is an overestimate). The M100 more than keeps up with Netflix, even if it must use say 10% of its processing power to calculate which pixels to change. So, with 100 million simple operations per second we can watch Netflix using less than half the available processing capacity.
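Here is a rough Python sketch of the video sums above. It assumes three operations per pixel (one per byte), 24 frames per second, half the pixels changing between frames and a flat 10% of capacity spent deciding which pixels to change. The function and variable names are mine, not anything standard.

```python
# Sketch of the article's Netflix arithmetic for the imaginary M100.
PIXELS = 1024 * 768
VALUES_PER_PIXEL = 3               # three bytes per pixel in the article's model
FRAMES_PER_SECOND = 24
M100_OPS_PER_SECOND = 100_000_000  # the imaginary M100 processor

colours = 256 ** VALUES_PER_PIXEL  # 256 X 256 X 256


def seconds_per_frame(ops_per_second, changed_fraction=1.0):
    """Time to set every value of the pixels that change in one frame."""
    operations = PIXELS * VALUES_PER_PIXEL * changed_fraction
    return operations / ops_per_second


full_frame = seconds_per_frame(M100_OPS_PER_SECOND)
full_video = full_frame * FRAMES_PER_SECOND
half_video = seconds_per_frame(M100_OPS_PER_SECOND, 0.5) * FRAMES_PER_SECOND
busy_fraction = half_video + 0.10  # add the assumed 10% bookkeeping overhead

print(f"{colours:,} colours")
print(f"Full frame: {full_frame:.4f} s, 24 full frames: {full_video:.2f} s")
print(f"Share of each second spent on video: {busy_fraction:.0%}")
```

On these assumptions the M100 is busy for well under half of every second, which is where the "less than half the available processing capacity" claim comes from.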
In reality, modern computers hand off the management of screen output to a subsystem known as a graphics processor, but, on examination, a graphics processor turns out to be a computer within a computer so the example above remains a good way of understanding speed issues. The use of sub processors for various specialised tasks is common and one of the variables that makes direct comparison between computers more difficult than implied by the headline processor speed.
Herr Hertz explains
OK, getting back to the CPU. How does it manage to do its simple arithmetic operations? The detail is beyond the scope of this article, but we can get a basic grasp of what is going on by understanding that the CPU is made up of microscopic transistors. We can think of transistors as being on or off, open or closed, providing resistance, or allowing a current to flow. In any case, a transistor is always in one of two states, one of which, for arithmetic purposes, represents one and the other, zero. The CPU clock speed tells us how many times per second the processor can cycle its transistors from one state to the other. To visualise this, imagine a metronome: for each tick we hear, the transistors can cycle between states, and the number of ticks per second is what we call the frequency. To talk about frequency, we use a unit named after Heinrich Rudolf Hertz, who proved the existence of electromagnetic waves, in which:
One cycle per second = | 1 Hertz | 1 Hz |
One thousand cycles per second = | 1 Kilohertz | 1 kHz |
One million cycles per second = | 1 Megahertz | 1 MHz |
One billion cycles per second = | 1 Gigahertz | 1 GHz |
As a side note, it can sometimes be difficult to grasp the difference between very big numbers. Here’s one way that I have always found useful: one million seconds is about 11.6 days, while one billion seconds is about 31.7 years.
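If you would like to check these conversions yourself, the following Python sketch does the sums. The lookup table and function name are hypothetical helpers written for this article, and a year is taken as 365.25 days.

```python
# Sketch: convert clock speeds to cycles per second, and seconds to days/years.
HERTZ_PREFIXES = {"Hz": 1, "kHz": 1_000, "MHz": 1_000_000, "GHz": 1_000_000_000}


def cycles_per_second(value, unit):
    """Turn a figure such as 2.4 GHz into a plain number of cycles per second."""
    return value * HERTZ_PREFIXES[unit]


SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # assumption: average year length

million_seconds_in_days = 1_000_000 / SECONDS_PER_DAY
billion_seconds_in_years = 1_000_000_000 / SECONDS_PER_YEAR

print(f"2.4 GHz = {cycles_per_second(2.4, 'GHz'):,.0f} cycles per second")
print(f"One million seconds is about {million_seconds_in_days:.1f} days")
print(f"One billion seconds is about {billion_seconds_in_years:.1f} years")
```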
So, our black and white computer above was running a 1 MHz processor, while our Netflix capable machine needed a 100 MHz CPU. When I first started working with PCs, the state-of-the-art processor was the Intel 386, which was succeeded by the 486 and then the Pentium. Let’s have a look at how processor clock speeds have evolved since then:
Intel CPU | Production | Clock Speed | Cycles per Second |
386 | 1985-1990 | 12 – 40 MHz | 12,000,000 – 40,000,000 |
486 | 1989-1992 | 16 – 100 MHz | 16,000,000 – 100,000,000 |
Pentium | 1993-1999 | 65 – 250 MHz | 65,000,000 – 250,000,000 |
Pentium II | 1997-1999 | 233 – 450 MHz | 233,000,000 – 450,000,000 |
Pentium III | 1999-2003 | 450 MHz – 1.4 GHz | 450,000,000 – 1,400,000,000 |
Pentium 4 | 2000-2008 | 1.3 – 3.8 GHz | 1,300,000,000 – 3,800,000,000 |
Core 2 | 2006-2011 | 1.06 – 3.33 GHz | 1,060,000,000 – 3,330,000,000 |
Core i3 | 2010-Present | 800 MHz – 4.0 GHz | 800,000,000 – 4,000,000,000 |
Core i5 | 2009-Present | 1.06 – 4.2 GHz | 1,060,000,000 – 4,200,000,000 |
Core i7 | 2008-Present | 1.1 – 5.0 GHz | 1,100,000,000 – 5,000,000,000 |
Core i9 | 2018-Present | 2.3 – 5.3 GHz | 2,300,000,000 – 5,300,000,000 |
Between 1985 and today we have gone from 12 million cycles per second to over 5 billion. But we can see some other puzzling points in the above table. Way back in 2008 the Pentium 4 was managing a clock speed of 3.8 GHz, comfortably within the range of speeds possible for the very latest i9 chip. And yet, I can buy a Pentium 4 chip on eBay for under a tenner, while an i9 would set me back close to £500. A substantial speed overlap between successive processor generations also stands out.
A new approach for a new millennium
To understand some of what is going on here, we need to pay attention to the name change, from Pentium to Core. This change signified the introduction of multi core processors. Multi core processors, for the purposes of this article, refer to a 21st Century innovation whereby more than one CPU is housed on a single chip. So why don’t they just say multi-processor or multi-CPU? Well, as mentioned above, a CPU is a little more than just an arithmetic unit: there are a lot of ancillary functions around fetching data, storing results and so on. On multi-core machines these functions are shared between the CPU cores, while on multi-processor units they are not. A dual core processor won’t be quite twice as fast as a single core processor running at the same clock speed but, because it can run two operations simultaneously, it will be about 80% faster. So, if we are going to compare clock speeds, we need to know the number of cores. Almost all versions of the i3 CPU have two cores. About half of i5s have two cores and the remainder four. Some i7s have two cores, most have four and a few have either six or eight. The i9 has either six or eight.
So, if we compare a Pentium 4 running at 3 GHz with a six core i7 running at the same speed and allowing, as above, 20% for the overhead of distributing tasks between the cores, we find that the Pentium 4 3 GHz can perform 3 billion operations per second while the i7 3 GHz can perform 14.4 billion operations in the same time. Now the price difference starts to make sense.
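The comparison above can be written out as a short Python sketch. It assumes, as the paragraph does, one operation per clock cycle per core and a flat 20% overhead applied to the combined total of a multi-core chip. The function name is mine, and real processors are far less tidy than this.

```python
# Sketch of the article's simplified model of multi-core throughput.
def effective_ops_per_second(clock_hz, cores, overhead=0.20):
    """Operations per second, with a flat overhead for sharing work across cores."""
    if cores == 1:
        return clock_hz  # nothing to distribute, so no overhead
    return clock_hz * cores * (1 - overhead)


pentium_4 = effective_ops_per_second(3_000_000_000, cores=1)
six_core_i7 = effective_ops_per_second(3_000_000_000, cores=6)

print(f"Pentium 4 at 3 GHz:   {pentium_4 / 1e9:.1f} billion operations per second")
print(f"Six core i7 at 3 GHz: {six_core_i7 / 1e9:.1f} billion operations per second")
```

Same clock speed on the label, nearly five times the work done.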
But there is more to it
We’re not there yet. Aren’t we comparing apples with oranges by pitting a single against a multi-core processor? If we compare a six core i7 against a six core i9 both running at 4.0 GHz aren’t we going to find they are pretty much equal? Simply in terms of arithmetic operations, yes they will be equal but the i9 will be able to take input and produce output at a higher rate than the i7, making the overall result faster. We’d never get to the end of this article if we were to explore variables around CPU input and output but there is one remaining area to consider to grasp the speed difference between successive processor generations.
As discussed above, CPUs are made up of microscopic transistors which can switch between states once per cycle, or tick on the metronome. But how many transistors are we talking about? Here are some sample values:
CPU/Processor | Transistors (approx.) | Transistors (in figures) |
Intel 386 | Up to 275 thousand | 275,000 |
Intel 486 | Up to 1.2 million | 1,180,235 |
Intel Pentium | Up to 3.1 million | 3,100,000 |
Intel Pentium II | Up to 7.5 million | 7,500,000 |
Intel Pentium III | Up to 9.5 million | 9,500,000 |
Intel Pentium 4 | Up to 184 million | 184,000,000 |
Intel Core 2 | Up to 411 million | 411,000,000 |
Intel Core i3 | Up to 1.4 billion | 1,400,000,000 |
Intel Core i5 | Up to 1.75 billion | 1,750,000,000 |
Intel Core i7 | Up to 3.2 billion | 3,200,000,000 |
Intel Core i9 | Up to 7 billion | 7,000,000,000 |
There is no direct correlation between the number of transistors in a CPU and the speed of the CPU. However, having more transistors allows for more sophisticated calculations in a single operation and for more jobs to be queued, input and output at the same time, and the overall result is faster computer performance.
Putting it all together
To see where we have arrived in our understanding of processor speed, we can go back to our single purpose 1 MHz CPU, the M1, which could render an image using 16 million colours in just under 2.5 seconds by setting three values for each of 786,432 pixels. If it didn’t have anything else to do, a Pentium III running at 1 GHz could do the same thing in 0.0025 seconds (i.e. 400 times a second), while an i7 running at 5 GHz could do it in 0.0005 seconds (i.e. 2,000 times a second). Now let’s consider the transistors. If we say our imaginary 1 MHz machine has the same number of transistors as a 386 CPU (275,000), it could still only update one screen every 2.5 seconds, whereas a Pentium III could display a different image on each of 12 screens 400 times a second and an i7 could display a different image on each of 4,000 screens 2,000 times per second.
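For the curious, here is a Python sketch of how these figures can be reproduced. It scales the rounded 2.5 second M1 figure by clock speed alone. It also assumes that the number of screens is the transistor count divided by the pixels on one screen, with a minimum of one. That second assumption is my reading of the sums and is an illustration, not a real measure of what a chip can drive.

```python
# Sketch only: both formulas are simplifications used for illustration.
PIXELS_PER_SCREEN = 1024 * 768  # 786,432
M1_OPS_PER_IMAGE = 2_500_000    # rounded: 2.5 seconds of work at 1 MHz


def images_per_second(clock_hz):
    """Full colour images per second, scaling by clock speed alone."""
    return clock_hz / M1_OPS_PER_IMAGE


def screens(transistors):
    """Assumed reading: one screen per 786,432 transistors, at least one."""
    return max(1, transistors // PIXELS_PER_SCREEN)


chips = {
    "M1 (imaginary)": (1_000_000, 275_000),
    "Pentium III": (1_000_000_000, 9_500_000),
    "Core i7": (5_000_000_000, 3_200_000_000),
}

for name, (clock_hz, transistors) in chips.items():
    rate = images_per_second(clock_hz)
    print(f"{name}: {rate:g} images per second on {screens(transistors):,} screen(s)")
```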
It should be clear now that the clock speed is only measuring the rate at which the metronome is ticking, i.e. the rate at which the processor cycles its transistors between one or other state. Clock speed does not measure how much the processor can accomplish for each cycle. In reality, each operation takes several cycles but, as we have seen, more powerful processors achieve a lot more per cycle than older ones. It is a question of heft rather than speed. Consider an adult with a shovel and a child with a bucket and spade. Both are asked to dig a hole and both proceed to operate their tools at exactly the same speed. It doesn’t take a computer to work out who will have dug the bigger hole after five minutes.
None of this tells you how to select the right computer for your job but, hopefully, it provides a useful illustration of the relevance of three of the key variables you will see when looking at computer specifications: processor, number of cores and clock speed. And, of course, if you do have an application that requires 4,000 screens to be updated 2,000 times per second, you now know exactly which processor to choose.
If you would like to know more please feel free to get in touch with us.