How computers really work #1 – CPU and memory
Watch the YouTube video! 👇👇👇
(or keep scrolling down if you prefer reading!)
Computers are everywhere
Whether you're reading this on a laptop, desktop or smartphone, one thing that's for sure is that it's some kind of computer.
Computers are all around us. But how many of us actually understand how they work?
Too complicated right? Well maybe not.
I'm going to explain, in simple terms, how all the computers we use are built around just two different types of microchips, and maybe we'll see that they're not all that complicated after all.
The first chip: the CPU
Here's a picture of the 8-bit retro console I designed and built myself:
The chip circled in red is the brains of the operation. It's a Zilog Z80 Central Processing Unit, or CPU.
CPUs are an example of what most of us call microchips, but the technical term is Integrated Circuits, or ICs for short, and the job of a CPU is simply to carry out instructions that we tell it to.
The Z80 CPU was launched in 1976. It's historically very interesting and it still has some applications today, but it's not exactly cutting-edge technology. Even so, the way in which CPUs are designed hasn't really changed all that much in the last 50 years.
From PCs to Macs, consoles to cash machines, iPhones to Androids - all of them are based around CPUs.
The basic idea is that each CPU has a fixed set of instructions that it understands how to carry out. We call that the "instruction set" of the CPU.
For instance, the Z80 supports around 158 different types of instructions.
So let's say we've managed to get ourselves a CPU: how do we take that and build a computer out of it?
Now this might come as a surprise, but much like your airfryer, all electronic components come with an instruction manual called a datasheet, and the datasheet for the Z80 lists all the instructions that it supports and how they work.
You can download the Z80's datasheet directly from the manufacturer's website. In it you can find, amongst many other things, information about the pins and what each one does; a list of all the instructions in the instruction set; and detailed information about how each instruction works. For example, one of the instructions (called ADD A, B) adds two numbers together.
Now I'd like to see your airfryer do that!
Different CPUs by different manufacturers have different instruction sets. Imagine, if you like, that each CPU has its own native language that only it understands. It's one of the reasons why a program written for a PC, for example, won't work if you try and copy it to an iPhone. The CPU inside the iPhone doesn't understand the instruction set that the PC program was written for.
The second chip: Memory
CPUs might sound really clever, and they are, but they're not quite enough to make a computer. On their own they're kind of useless.
To make a really useful computer we need to pair our CPU up with another chip - memory.
Memory has one simple job - to store information so we can recall it later.
Let's look at the Retcon board again; this time I've circled three types of memory chips.
To understand how they all work, we'll have to go back to basics. The simplest type of memory we can imagine is called a flip-flop (not to be confused with the shoes you wear on the beach).
A flip-flop is a device with an input for incoming data, an output, and a second input which is used to control whether the flip-flop should store data or not.
The input takes a single binary value, which means it can either be one or zero, and under normal circumstances it can change between one and zero but the output stays the same.
However, when we activate the control input, whatever value happens to be the input at that time is stored inside the flip-flop and the output changes to reflect it. After that, even if the input changes again, the stored value stays at the output. The new value stored is remembered until the next time the control input changes.
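If it helps, that behaviour can be sketched in a few lines of Python. This is just a software model to show the logic (the class and method names are my own invention, not real hardware):

```python
class FlipFlop:
    """A software model of a flip-flop: one remembered bit."""

    def __init__(self):
        self.stored = 0  # the remembered bit; starts at 0

    def update(self, data, control):
        # The stored bit only changes while the control input is active;
        # at all other times the data input is simply ignored.
        if control:
            self.stored = data
        return self.stored  # the output always reflects the stored bit


ff = FlipFlop()
ff.update(data=1, control=0)   # control inactive: the 1 is ignored
print(ff.stored)               # 0
ff.update(data=1, control=1)   # control active: the 1 is stored
ff.update(data=0, control=0)   # input changes again, but the output stays put
print(ff.stored)               # 1
```

Notice that once the control input goes inactive, the input can wiggle all it likes and the output doesn't budge - that's the "memory".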
If we could stack up more than one flip-flop on a single device and share all the control inputs together, we'd end up with something called a register, which is really just a flip-flop that can store more than one digital value at a time.
We call each individual binary value that is stored a "bit", and when we're designing registers we usually build them to store multiples of 8 bits at a time. Storing 8 bits is so common that we gave that a name too: a "byte".
You can easily buy 8-bit registers on a single chip, like this one, which is called a 74HCT273.
The name might sound scary, but it's an example of something called "standard logic", meaning a useful type of chip that lots of different manufacturers make. It's the "273" part of the name which really identifies what the chip does, and there are really only a few dozen types of standard logic chip in widespread use (and far fewer in recent years, as technology has advanced).
Anyway, the 273 is an 8-bit register - an 8-way flip-flop - just like we were talking about in the last section, and you can see from the photo that the Retcon board has two of them.
Now being able to store just one byte might be enough for some purposes, but for most computers that want to do anything useful it's nowhere near enough. Modern computers can store billions of bytes at the same time.
If you can imagine us putting billions of registers all into a single chip you would be imagining one of the other two types of memory: RAM. RAM stands for Random Access Memory.
How is RAM different from a register, apart from being able to hold a lot more data? Well consider that the 273 chip above holds exactly 1 byte of data, but has 20 pins on it. Clearly if we want to store billions of bytes we can't work with devices that have 20 pins per byte, so how can we fit all that storage into a single chip with a manageable number of pins?
Our 8-bit register chip lets us read a byte at the same time as we're writing one, but if we're okay making the compromise that we can only read or write a byte at a time then we can save eight pins for starters.
We can combine the eight input pins with the eight output pins by making them "bi-directional", so that they become "input/output" or "I/O" pins.
In order for this to work we'll also need to add two control pins which we'll use to set the direction of the I/O pins: "read" and "write".
If the read pin is active then the I/O pins will output the byte that's being stored, and if the write pin is active then they'll receive a new byte to be stored.
So we've saved 8 pins at the cost of introducing 2 more: a net saving of 6. Still a long way to go.
We're still only storing one byte. If we want to store two bytes without using any more I/O pins, we can use a technique called "multiplexing".
We'll add another control pin called A0, and we'll make it so that when A0 is 0, the I/O pins will be reading from or writing to the first of our two bytes and when A0 is 1, they'll be accessing the second byte.
If we add a second pin, A1, we'll find that we can address not two but four bytes; one for each of the combinations of A1 and A0 that's possible - that's 00, 01, 10 and 11. Adding a third pin lets us address not four, but eight bytes, and so on.
Every time we add a new A pin, we double the number of memory locations we can address.
We call these different memory locations "addresses", and so we call the group of pins which are used to select the address the "address pins". We call the eight I/O pins which read or write the data values the "data pins".
And that's how you make a RAM.
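Here's a toy Python model of that addressing scheme. The `ToyRAM` class and its read/write interface are made up for illustration - it's the idea of address pins selecting one of 2^n byte-sized locations that matters, not the API:

```python
class ToyRAM:
    """A RAM modelled as 2**n one-byte locations behind n address pins."""

    def __init__(self, address_pins):
        # Every extra address pin doubles the number of locations.
        self.cells = [0] * (2 ** address_pins)

    def write(self, address, byte):
        # "write" pin active: the data pins carry a byte into the chip.
        self.cells[address] = byte & 0xFF  # data pins are only 8 bits wide

    def read(self, address):
        # "read" pin active: the data pins carry the stored byte out.
        return self.cells[address]


ram = ToyRAM(address_pins=19)   # 2**19 addresses, like the real chip below
print(len(ram.cells))           # 524288
ram.write(5, 42)
print(ram.read(5))              # 42
```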
As an example, let's take a look at a real RAM chip from a company called Alliance: the AS6C4008. It's an example of something called "Static RAM" or "SRAM", which just describes the type of technology it's made out of. There's another type of RAM called "Dynamic RAM" or "DRAM", and that's the kind most computers use these days, because it's quite a lot cheaper to manufacture.
Anyway, if we take a look at the pinout for the AS6C4008 from its datasheet, we can see that it has 19 address pins - that's A0 through A18 - and so we can figure out how many addresses it has by counting the address pins and doubling for every pin:
| Number of pins | Number of addresses |
|----------------|---------------------|
| 11 | 2,048 (= 2 × 1,024) |
| 12 | 4,096 (= 4 × 1,024) |
| 13 | 8,192 (= 8 × 1,024) |
| 14 | 16,384 (= 16 × 1,024) |
| 15 | 32,768 (= 32 × 1,024) |
| 16 | 65,536 (= 64 × 1,024) |
| 17 | 131,072 (= 128 × 1,024) |
| 18 | 262,144 (= 256 × 1,024) |
| 19 | 524,288 (= 512 × 1,024) |
Now looking at the data pins, there are eight of those, which means that each memory location can store exactly one byte (8 bits), which means that we can store 524,288 bytes in the RAM.
Note that for numbers bigger than 1,024 I've also given the number in terms of multiples of 1,024. That's a way of making the numbers easier to manage. We call 1,024 bytes a "kilobyte" (KB), so 19 address pins and 8 data pins give us 512KB, which is equal to 524,288 bytes.
If you're wondering what happens when we get to 1,024KB, well then we call that a "megabyte" (MB). Similarly 1,024MB is a "gigabyte" (GB) and 1,024GB is a "terabyte" (TB).
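You can check all of these numbers for yourself; each extra address pin doubles the count, so n pins give 2**n addresses:

```python
# Each address pin doubles the number of addressable locations,
# so 19 address pins give 2**19 one-byte locations.
addresses = 2 ** 19
print(addresses)            # 524288

KB = 1024                   # a kilobyte is 1,024 bytes
MB = 1024 * KB              # a megabyte is 1,024 KB
GB = 1024 * MB              # a gigabyte is 1,024 MB
TB = 1024 * GB              # a terabyte is 1,024 GB
print(addresses // KB)      # 512, i.e. the chip stores 512KB
```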
The third type of memory on the Retcon board is a ROM, which stands for Read Only Memory.
ROMs have data and address pins just like RAM, but as the name suggests you can only read data from a ROM, you can't write or update it. This means that ROMs usually have their data stored on them in the manufacturing process, and later we're going to see two reasons why that might be useful.
Another difference between RAMs and ROMs is that when you take away the power supply to a RAM it usually forgets everything that it knows. ROMs, on the other hand, keep their data intact even when they're not powered up.
Of the three types of memory we've talked about it's RAM that's the most important in a computer and to understand why we're going to have to go back in time to the 1930s and do a little bit of background theory...
Before digital electronics as we know it even existed, a group of mathematicians was working on ways to understand the maths of problem solving. Two of the most prominent were Alan Turing and Alonzo Church, and from their work we get a definition called the "Church-Turing thesis", which proposes that any problem which can be solved by what we call an algorithm - that's a finite series of well-defined steps - must be solvable by a simple hypothetical device called a "Turing Machine".
A Turing Machine can be imagined as a box with a paper tape running through it.
The box has a set of rules baked into it which allow it to read symbols from the tape and make decisions based on the symbols that it reads. Using these rules it can either choose to move the tape forwards, or backwards, overwrite some of the symbols on the tape, or just stop running completely once the job is finished.
We can design a Turing Machine to solve one specific problem, like adding two numbers together, but we can also design a Turing Machine to simulate any other Turing Machine based on symbols that it reads from the tape. A Turing Machine which is capable of simulating any other Turing Machine is called a "Universal Turing Machine", and by definition it's able to solve any problem that any Turing Machine can solve which, according to the Church-Turing thesis, is any algorithmic problem.
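To make the box-and-tape idea concrete, here's a minimal Turing machine simulator in Python. The rule format is my own simplification (real formalisms differ in the details), and this particular machine just flips every bit it reads until it reaches a blank marker:

```python
def run_turing_machine(rules, tape, state="start"):
    """Run a Turing machine until it reaches the 'halt' state.

    rules maps (state, symbol) -> (new_symbol, move, new_state),
    where move is -1 (tape backwards), +1 (forwards) or 0 (stay put).
    """
    tape = list(tape)
    pos = 0
    while state != "halt":
        symbol = tape[pos]                            # read the tape
        new_symbol, move, state = rules[(state, symbol)]
        tape[pos] = new_symbol                        # overwrite the symbol
        pos += move                                   # move the tape
    return "".join(tape)


# A machine that inverts a string of bits, then stops at the blank ("_").
rules = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", "_"): ("_", 0, "halt"),   # blank: job finished, stop running
}
print(run_turing_machine(rules, "1011_"))   # 0100_
```

That's the whole mechanism: read a symbol, apply a rule, maybe overwrite, maybe move, repeat.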
Now I don't know about you, but I just find it mind-blowing to think that an idea as simple as a box with a paper tape running through it is capable of solving so many complex problems, but it really does work, and modern computers are the proof of that.
You might already have guessed that the CPU, with its instruction set, is the electronic equivalent of the box, but what about the tape? Well, that's where the RAM comes in. RAM lets us read, store and overwrite values, just like the tape in the Turing Machine, and that's why RAM is so critical in a computer.
In theory a Universal Turing Machine actually needs a tape that's infinitely long, but infinite memory doesn't exist. How much of a problem is this? Well we might technically need an infinite amount of memory to make a Universal Turing Machine, but it turns out that in practice we can solve most of the problems that we need to with computers with a finite amount of memory. Generally speaking though, bigger is better (at least when it comes to computer memory...)
Connecting CPUs and RAMs together with buses
Remember we said that RAM chips have address pins to select which byte we're interested in, and data pins to send and receive those bytes? Well the CPU also has address and data pins to match. The Z80, for example, has 16 address pins and 8 data pins, and it also has read and write pins just like the RAM does. Together these make up over half of the 40 pins that the Z80 has in total.
We can literally wire up the 16 address pins on the CPU to 16 of the address pins on the RAM and we can do the same with the data pins. We can also connect the read and write pins of the CPU and the RAM together.
Now that the address pins of both devices are connected together we call the group of wires connecting them the "address bus". Similarly we call the group of wires connecting the eight data pins of each device the "data bus".
Having connected the CPU to memory like this it's now able to manipulate the address bus, the data bus, and the control wires to write data to or read data from memory if that's something that we tell it to do. It's even possible to connect more than two devices to the buses so the CPU can talk to many different memories or even peripherals like keyboards or displays.
One important thing to note here is that because the Z80 has 16 address pins we can only address up to 64 kilobytes of memory at one time. We say that the Z80 has a 64 kilobyte "address space".
CPU registers and programs
So now the CPU is connected to RAM and it can read from it or write to it whenever we tell it to, but how do we tell the CPU to do stuff and how does it actually do the things that we tell it to?
Remember the registers from earlier? Well pretty much all CPUs are made with registers inside them and ours is no exception. The number and purpose of registers varies a lot from CPU to CPU, but the Z80 has something like 16 general-purpose registers; most are 8 bits wide, and many can be paired up to store 16 bits at a time.
These registers are important for temporarily holding data for use in calculations. For example, one of the instructions the Z80 supports adds two numbers together. Where do those numbers come from? Well, one option is that we put each of them into a register inside the CPU. For example, we could put the number 3 into register A and the number 4 into register B; then we could tell the CPU to add together those two registers. The result is stored back into register A, overwriting the number 3 which was there previously.
Addition is an example of an instruction which only uses internal registers like this, but some of the instructions that CPUs understand allow them to load values from external memory into the internal registers, and vice versa, which is how we can get values from the outside world into the CPU so they can be operated upon.
For example, another instruction the Z80 supports loads a byte from a given memory address, say address 5, and stores it into register A.
Here's a set of five instructions which, when run in sequence, will have the net effect of loading two values from memory, adding them together, and storing the result back to memory again.
We call this sequence of instructions a "program", and whatever you are doing with a computer – whether playing a game, writing a document, composing a Tweet, or even writing your own programs; under the hood the CPU is just executing sequences of instructions like this from its instruction set, albeit millions or even billions of times every second.
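Sketched in Python rather than real Z80 machine code, the sequence looks something like this; the memory addresses (5, 6 and 7) are made up for the example:

```python
# A sketch of the load/add/store program, using a Python list as the
# memory and a dict as the CPU's internal registers.
memory = [0] * 16
memory[5] = 3                  # first value waiting in memory
memory[6] = 4                  # second value waiting in memory

registers = {"A": 0, "B": 0}

registers["A"] = memory[5]     # load the byte at address 5 into register A
registers["B"] = memory[6]     # load the byte at address 6 into register B
registers["A"] = (registers["A"] + registers["B"]) & 0xFF   # ADD A, B
memory[7] = registers["A"]     # store the result back to memory, address 7
print(memory[7])               # 7
```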
Booting and the fetch-decode-execute cycle
The next question to answer is: how do we feed those instructions into the CPU in the first place?
Because the CPU already knows how to read things from memory, why don't we store the instructions that we want to run in memory too? We'd just need to make some kind of convention around how the CPU reads programs from memory.
One of the simplest ways is what the Z80 does: when you first turn it on it reads a byte from memory address 0 - that's the first memory address - into a special register called the "instruction register". This is called an "instruction fetch".
The next thing it needs to do is decode the byte that it's just read. Each instruction that the CPU understands is encoded as a unique sequence of ones and zeros that - again - we can look up in the datasheet.
For example, we can see from the datasheet that the ADD A, B instruction that we saw earlier in the Z80 is encoded as 1000 0000, so whenever the CPU sees that specific sequence of ones and zeros read into the instruction register, it knows that it has to add registers A and B together.
Some instructions need to be encoded into more than one byte, and in that situation the CPU will keep reading bytes from memory until it's decoded an entire instruction.
You may have heard people refer to computer programs as "code"; well these instruction encodings are like the original code. They're actually called "machine code", and they're the only programming language that computers really understand. All other programming languages just give us ways of generating machine code which are easier for software developers to write.
Next the CPU will "execute" the instruction, which is just a fancy way of saying that it runs it. After each instruction has finished executing, the CPU then starts the next fetch, reading a new instruction from the next address down in memory.
This fetch-decode-execute cycle continues indefinitely until either the CPU receives an instruction to stop or until someone turns the power off.
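Here's a toy version of that cycle in Python. The opcodes used (0x3E for LD A,n; 0x06 for LD B,n; 0x80 for ADD A,B; 0x76 for HALT) are genuine Z80 encodings, but this decoder only understands those four and is hugely simplified compared with a real CPU:

```python
# A toy fetch-decode-execute loop over a handful of Z80 opcodes.
memory = [
    0x3E, 3,    # LD A, 3   (a two-byte instruction: opcode, then operand)
    0x06, 4,    # LD B, 4
    0x80,       # ADD A, B  (A = A + B)
    0x76,       # HALT      (stop the cycle)
]
registers = {"A": 0, "B": 0}
pc = 0          # the "program counter": address of the next fetch

while True:
    opcode = memory[pc]                 # fetch the next instruction byte
    pc += 1
    if opcode == 0x3E:                  # decode: LD A, n
        registers["A"] = memory[pc]     # keep reading: it's a 2-byte instruction
        pc += 1
    elif opcode == 0x06:                # decode: LD B, n
        registers["B"] = memory[pc]
        pc += 1
    elif opcode == 0x80:                # decode: ADD A, B
        registers["A"] = (registers["A"] + registers["B"]) & 0xFF
    elif opcode == 0x76:                # decode: HALT
        break                           # stop fetching; the cycle ends

print(registers["A"])                   # 7
```

Note how the two-byte instructions make the loop keep reading from memory until a whole instruction has been decoded, exactly as described above.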
Jumps, branches & conditional branches
Usually instructions are run one after the other like this, but all CPUs also support an important class of so-called "jump" instructions, which can be used to break that normal convention of instructions running in order.
Jump instructions can be used to tell the CPU to look in a completely different part of memory for the next instruction to run - even in places where it's already executed instructions from.
An instruction which only jumps under certain conditions is called a "conditional jump" or "conditional branch" instruction, and it turns out that if a CPU has a conditional branch instruction then it's automatically powerful enough to be a Universal Turing Machine, which is why almost every CPU has one in its instruction set.
It's very common to see programs where the CPU conditionally branches back to a point it's already been to. For example we might want to repeat a certain block of instructions until some register counts down to zero. This example here does exactly that in order to add all the numbers from 5 to 1 together and get the result.
We call this type of programming technique a "loop".
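Written out in Python, the counting-down loop described above looks something like this (on the Z80 it would be built from a decrement instruction and a conditional jump back to the top):

```python
# Add all the numbers from 5 down to 1 using a counting-down loop.
counter = 5          # a register used as the loop counter
total = 0            # a register accumulating the running sum

while True:
    total += counter     # add the counter to the running total
    counter -= 1         # count down by one
    if counter == 0:     # the condition: only fall out when we hit zero
        break            # otherwise we'd "branch" back to the top

print(total)             # 15
```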
As we've already seen, some of the instructions that CPUs understand allow them to read values from or write values to memory, so it's important to remember that there are now two reasons CPUs have to read from memory: they do it automatically whenever a new instruction needs to be fetched from memory, and they might do it sometimes whenever the instruction they're currently executing is one that requires a memory read.
We say that computers which use the same memory to store both instructions and data have a Von Neumann or sometimes Princeton Architecture.
This is very similar to the original idea of the Turing Machine, with the CPU as the box and the memory as the tape.
The Z80 is an example of a Von Neumann machine.
Because the bytes storing instructions and data are essentially indistinguishable in memory, programmers usually work with memory models which keep them separate and organised.
Some computers have completely separate address and data buses, and store their programs in completely separate memories from other data. These are called Harvard Architectures.
How retro cartridges worked
Now that we've seen how computers store their programs in memory and how the CPU will run those programs from memory when it's powered up, you might think there's a bit of a chicken and egg problem here: how is it that we get the program into memory in the first place so that the CPU can find it?
If you're old enough to remember games consoles with cartridges, these used a really simple technique: they were designed with a gap in the memory space.
Remember the Z80 has 16 address pins, so it has a 64 kilobyte address space. The computer I built - much like the Sega Master System which inspired it - has some RAM from address 48k onwards, but nothing at all from address 0: a bit of a problem if we turn the console on without any cartridge inserted!
If we take a cartridge apart we'll see that it has a ROM chip inside it, and this ROM has got all the game instructions pre-programmed into it. The data and address pins of the ROM are connected to the metal contacts on the edge of the cartridge. There's a socket inside the console which receives the cartridge and this is connected to the main address and data buses of the CPU.
When we plug the cartridge in, making sure the power's off first, it completes the connection between the ROM and the address and data buses - perfectly filling that gap in the address space.
Finally, when we turn the power on, the CPU boots and starts fetching instructions from memory address 0 on the cartridge ROM, and our game starts running.
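The console's address decoding can be sketched as a little Python function. The boundaries come from the text above (cartridge ROM filling the gap from address 0, RAM from 48K upwards); the function is illustrative, not the real decoding circuit:

```python
# Which chip answers when the CPU puts an address on the bus?
RAM_START = 48 * 1024            # RAM begins at 48K (= 49,152 = 0xC000)
ADDRESS_SPACE = 64 * 1024        # the Z80's 16 address pins give 64K

def select_chip(address):
    if 0 <= address < RAM_START:
        return "cartridge ROM"   # the cartridge fills the gap from 0
    elif address < ADDRESS_SPACE:
        return "RAM"
    raise ValueError("outside the Z80's 64K address space")

print(select_chip(0x0000))       # cartridge ROM  (where the CPU boots from)
print(select_chip(0xC000))       # RAM
```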
So unfortunately for us nostalgia nerds, we don't use cartridges so much these days: we expect our CPUs to be able to do something useful as soon as we turn them on, without having to plug anything in.
A solution that works better for general purpose computers is to embed a ROM into the computer at memory address 0, and to put a program on that ROM whose job it is to look for other programs to run, usually on a hard disk.
On most computers, these ROM programs try to find some kind of Operating System to hand over to.
These built-in ROM programs are called BIOSes, short for Basic Input/Output System, and because they sit somewhere between hardware and software we often say that they're examples of "firmware".
Because BIOS programs are quite basic, it means that the ROM that contains them doesn't need to be all that big, giving us more space in our computer for RAM compared with the cartridge method.
One thing we haven't covered yet is how computers talk to the outside world.
Devices that CPUs can talk to which aren't memory are called "peripherals" and there are just tons of ways that communication can happen, so we're going to have to revisit it another time...
To find out more about the Retcon project, check out the microsite here!