Initial commit

2024-11-24 19:20:40 +00:00 · 2020-07-12 10:08:51 +01:00 · 2020-07-12 10:08:51 +01:00 · ff4a2907b7
commit ff4a2907b7
parent 51cf0dde5f
20 changed files with 761 additions and 0 deletions
--- a/README.md
+++ b/README.md
@ -0,0 +1,61 @@
+Writing a "bare metal" operating system for Raspberry Pi 4
+==========================================================
+
+Introduction
+------------
+
+As a tech CEO @RealVNC, I don't write code any more. And I've recently realised just how much I miss it.
+
+Currently in the throes of a nationwide "lockdown" due to Covid-19 (and having been spared my usual commute), I've found myself with more hours in the day. I have taken this time for myself and used it to fulfil a childhood ambition - to write a **bare metal** operating system that runs on commercial hardware.
+
+What does bare metal mean?
+--------------------------
+
+When we buy a computer or a tablet/smartphone it typically comes with some basic software pre-installed. You'll likely be familiar with watching Microsoft Windows, Mac OS, iOS, Android or maybe even Linux start up as you power on the device (or **boot** it) for the first time. These are all operating systems - software designed to make computer chips work out of the box for mere mortals like you and me. They help us interact with the machine by drawing to a screen, processing messages from devices like keyboards & mice, working with networking hardware to connect us to the Internet, allowing us to play back sound and much, much more.
+
+Software developers around the world then build applications (apps) that run on top of these operating systems. These apps talk to the hardware via the operating system (**OS**), so this complex code doesn't have to be written over and over again. As a result, it's possible to be a software developer without knowing much about hardware at all! It's the OS that does the hard work that allows us to use apps like Facebook, Instagram, WhatsApp, TikTok etc.
+
+It's fair to say that _computers can't do anything useful without an OS_. They just sit there waiting to be told what to do. So, why is it that only software giants like Microsoft, Apple and Google get to tell the majority of computers what to do as they're being switched on? Why can't we? Well, we can, and that is what bare metal programming is.
+
+Choice of hardware
+------------------
+
+If you're excited by the prospect of telling a computer what to do then you need an interest in hardware. The computer chip that's going to do your bidding is called a **CPU** (Central Processing Unit) - it's the beating heart of every computer device. Lots of companies have designed such CPUs over the years, but two - Intel and Arm - are most widely adopted. These each have their advantages and disadvantages. If you own a smartphone, it's highly likely that it's running on a chip designed by Arm. If you own a laptop running Microsoft Windows then it's likely to be running on an Intel chip. You'll want to develop an understanding of both **architectures** eventually, but I've chosen an Arm device for this project.
+
+The new [Raspberry Pi 4 Model B](https://www.raspberrypi.org/products/raspberry-pi-4-model-b/) is a low-cost computer that runs on a 1.5 GHz 64-bit quad-core Arm Cortex-A72 processor. It's a device that many millions of people worldwide use, and so it's exciting to write bare metal code for it. Imagine that somebody else might one day use your OS! The **RPi4** also has some useful attached hardware that will help us along the way.
+
+Hardware prerequisites
+----------------------
+
+You'll need some hardware to get started with writing your OS:
+
+ * An RPi4 with a dedicated power supply and HDMI lead
+ * A monitor/TV connected to the RPi4 via HDMI
+ * A micro-SD card to boot the RPi4 from
+ * A computer to write your code on e.g. a Windows laptop (the **dev machine**)
+
+You'll need to make sure that you can write to the micro-SD card using your dev machine. For me, that meant buying an SD card adapter, because the micro-SD card was too small for the slot in my laptop. You may need the same, or even a USB SD card reader too if your laptop doesn't have one built-in.
+
+Other incredibly useful hardware that you simply can't do without:
+
+ * A pair of eyebrow tweezers (I borrowed these from my wife!) - useful for inserting the micro-SD card into, and removing it from, the tiny slot on the RPi4
+ * A [USB to serial TTL cable](https://www.amazon.co.uk/gp/product/B01N4X3BJB/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1) - useful to see what your OS is doing long before you can write information to the screen
+
+Software prerequisites
+----------------------
+
+If you can't get someone else's OS running, you likely won't be able to write your own. So I started by flashing the SD card with Raspbian - Raspberry Pi's recommended OS. I used the very neat [Imager tool](https://www.raspberrypi.org/downloads/) that they make available on their website to do this.
+
+Hook up your RPi4 and make sure it boots into Raspbian. There are plenty of resources online to help you achieve/troubleshoot this. Getting Raspbian up will test that your hardware setup is working properly. Note: because I connected my RPi4 to my (not brilliant) TV, I needed to make an edit in the _config.txt_ file on the SD card (setting the `hdmi_safe` parameter to 1) to ensure that I could see the screen. Without that, it was just black.
+
+Don't proceed until you get Raspbian running!
+
+---
+
+The RPi4 runs on an Arm Cortex-A72 processor. Your dev machine is likely running on an Intel processor. You'll therefore need some software that helps you build code to run on a different architecture. This is called a **cross-compiler**.
+
+Download and unpack [Arm's gcc compiler](https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-a/downloads). For reasons that I won't go into here, you'll need to use the "AArch64 ELF bare-metal target". Since I'm using WSL on Windows 10 to run Ubuntu, I downloaded the x86_64 Linux hosted cross-compiler.
+
+I also advocate installing GNU make - you'll need it soon enough. Because I'm using WSL, for me that was simply a matter of typing `sudo apt install make` and entering my password.
+
+_Now you're ready to start writing your OS!_
--- a/part1/README.md
+++ b/part1/README.md
@ -0,0 +1,106 @@
+Writing a "bare metal" operating system for Raspberry Pi 4 (Part 1)
+===================================================================
+
+How do we code?
+---------------
+
+We tell the RPi4 what to do by writing code. You may know that code ultimately ends up as a series of 0's and 1's (binary). You'll be pleased to know, however, that we don't need to write it like this, otherwise we'd easily lose track of what was going on! In fact, it's one of the jobs of the compiler to convert human-readable language into those 0's and 1's.
+
+To get going we need to understand two languages: **assembly language** and **C**. Whilst C will likely be recognisable to most modern software developers, assembly language is "spoken" by fewer folks. It's a lower-level language that most closely resembles how the CPU "thinks" and it therefore gives us a lot of control, whereas C brings us into a higher-level, human-readable world. We lose a little control to the compiler but, for our purposes, I think we can trust it!
+
+We will need to start out in assembly language, but there isn't much to write before we can then pick up in C.
+
+A note about this tutorial
+--------------------------
+
+This tutorial is not intended to teach you how to code in assembly language or C. There are plenty of good resources on these topics and I am not an expert/authority! I will therefore be assuming some knowledge along the way. Please do read around the topics that I introduce if you need/want to.
+
+Bootstrapping
+-------------
+
+The first code that the RPi4 will run will need to be written in assembly language. It makes some checks, does some setup and launches us into our first C program - the **kernel**.
+
+ * The Arm Cortex-A72 has four cores. We only want our code to run on the master core, so we check the processor ID and either run our code (master) or hang in an infinite loop (slave).
+ * We need to tell our OS how to access the **stack**. I think of the stack as temporary storage space used by currently-executing code, like a scratchpad. We need to set memory aside for it and store a pointer to it.
+ * We also need to initialise the BSS section. This is the area in memory where uninitialised variables will be stored. It's more efficient to initialise everything to zero here, rather than take up space in our kernel image doing it explicitly.
+ * Finally, we can jump to our main() routine in C!
+
+Read and understand the code below and save it as _boot.S_. I suggest using the [Arm Programmer's Guide](https://developer.arm.com/documentation/den0024/a/) as a reference.
+
+```c
+.section ".text.boot"  // Make sure the linker puts this at the start of the kernel image
+
+.global _start  // Execution starts here
+
+_start:
+    // Check processor ID is zero (executing on main core), else hang
+    mrs     x1, mpidr_el1
+    and     x1, x1, #3
+    cbz     x1, 2f
+    // We're not on the main core, so hang in an infinite wait loop
+1:  wfe
+    b       1b
+2:  // We're on the main core!
+
+    // Set stack to start below our code
+    ldr     x1, =_start
+    mov     sp, x1
+
+    // Clean the BSS section
+    ldr     x1, =__bss_start     // Start address
+    ldr     w2, =__bss_size      // Size of the section
+3:  cbz     w2, 4f               // Quit loop if zero
+    str     xzr, [x1], #8
+    sub     w2, w2, #1
+    cbnz    w2, 3b               // Loop if non-zero
+
+    // Jump to our main() routine in C (make sure it doesn't return)
+4:  bl      main
+    // In case it does return, halt the master core too
+    b       1b
+```
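
As a reading aid, the two decisions made in _boot.S_ can be modelled in plain C that builds on the dev machine. This is only a sketch: `is_main_core` and `clear_bss` are hypothetical names that do not exist in the project, and the real work is done by the assembly above.

```c
#include <stdint.h>

/* Mirrors `and x1, x1, #3` followed by `cbz`: keep the two lowest bits of
   MPIDR_EL1 and treat a result of zero as "this is the main core". */
int is_main_core(uint64_t mpidr_el1)
{
    return (mpidr_el1 & 3u) == 0;
}

/* Mirrors the loop between labels 3 and 4: store a 64-bit zero, advance by
   8 bytes and count down until no words are left. */
void clear_bss(uint64_t *bss_start, uint32_t bss_size_in_words)
{
    while (bss_size_in_words != 0) {
        *bss_start++ = 0;
        bss_size_in_words--;
    }
}
```

Note that the counter is a number of 8-byte words, not a number of bytes, which is why the linker script shifts the byte length right by 3.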
+
+And now we're in C
+------------------
+
+You will likely note that the `main()` routine is as yet undefined. We can write this in C (save it as _kernel.c_), keeping it very simple for now:
+
+```c
+void main()
+{
+    while (1);
+}
+```
+
+This simply spins us in an infinite loop!
+
+Linking it all together
+-----------------------
+
+We've written code in two different languages. Somehow we need to glue these together, ensuring that the created image will be executed in the way that we intend. We use a **linker script** for this. The linker script will also define our BSS-related labels (perhaps you were already wondering where they get defined?). I suggest you save the following as _link.ld_:
+
+```c
+SECTIONS
+{
+    . = 0x80000;     /* Kernel load address for AArch64 */
+    .text : { KEEP(*(.text.boot)) *(.text .text.* .gnu.linkonce.t*) }
+    .rodata : { *(.rodata .rodata.* .gnu.linkonce.r*) }
+    PROVIDE(_data = .);
+    .data : { *(.data .data.* .gnu.linkonce.d*) }
+    .bss (NOLOAD) : {
+        . = ALIGN(16);
+        __bss_start = .;
+        *(.bss .bss.*)
+        *(COMMON)
+        __bss_end = .;
+    }
+    _end = .;
+
+   /DISCARD/ : { *(.comment) *(.gnu*) *(.note*) *(.eh_frame*) }
+}
+__bss_size = (__bss_end - __bss_start)>>3;
+```
+
+Writing linker scripts is [worth investigating](http://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_mono/ld.html#SEC6) but, for our purposes, all you need to know is that by referencing `.text.boot` first and wrapping it in `KEEP()`, we ensure the `.text` section starts with our assembly code. That means our first instruction starts at 0x80000, which is exactly where the RPi4 will look for it when it boots. Our code will be run.
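
The last line of the linker script, `__bss_size = (__bss_end - __bss_start)>>3;`, is plain arithmetic, so it can be checked on the dev machine. The sketch below uses hypothetical function names and made-up addresses. It also shows one thing worth being aware of: the shift rounds down, so if the section's length were not a multiple of 8, the last few bytes would fall outside the zeroing loop in _boot.S_.

```c
#include <stdint.h>

/* Same arithmetic as `__bss_size = (__bss_end - __bss_start)>>3;` */
uint64_t bss_size_in_words(uint64_t bss_start, uint64_t bss_end)
{
    return (bss_end - bss_start) >> 3;
}

/* Bytes at the end of the section that the word count does not cover. */
uint64_t bss_uncovered_bytes(uint64_t bss_start, uint64_t bss_end)
{
    return (bss_end - bss_start) & 7u;
}
```

For example, a 64-byte section gives 8 words with nothing left over, whereas a 12-byte section gives 1 word and leaves 4 bytes uncovered.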
+
+_Now you're ready to compile and then boot your OS!_
--- a/part1/boot.S
+++ b/part1/boot.S
@ -0,0 +1,30 @@
+.section ".text.boot"  // Make sure the linker puts this at the start of the kernel image
+
+.global _start  // Execution starts here
+
+_start:
+    // Check processor ID is zero (executing on main core), else hang
+    mrs     x1, mpidr_el1
+    and     x1, x1, #3
+    cbz     x1, 2f
+    // We're not on the main core, so hang in an infinite wait loop
+1:  wfe
+    b       1b
+2:  // We're on the main core!
+
+    // Set stack to start below our code
+    ldr     x1, =_start
+    mov     sp, x1
+
+    // Clean the BSS section
+    ldr     x1, =__bss_start     // Start address
+    ldr     w2, =__bss_size      // Size of the section
+3:  cbz     w2, 4f               // Quit loop if zero
+    str     xzr, [x1], #8
+    sub     w2, w2, #1
+    cbnz    w2, 3b               // Loop if non-zero
+
+    // Jump to our main() routine in C (make sure it doesn't return)
+4:  bl      main
+    // In case it does return, halt the master core too
+    b       1b
--- a/part1/kernel.c
+++ b/part1/kernel.c
@ -0,0 +1,4 @@
+void main()
+{
+    while (1);
+}
--- a/part1/link.ld
+++ b/part1/link.ld
@ -0,0 +1,19 @@
+SECTIONS
+{
+    . = 0x80000;     /* Kernel load address for AArch64 */
+    .text : { KEEP(*(.text.boot)) *(.text .text.* .gnu.linkonce.t*) }
+    .rodata : { *(.rodata .rodata.* .gnu.linkonce.r*) }
+    PROVIDE(_data = .);
+    .data : { *(.data .data.* .gnu.linkonce.d*) }
+    .bss (NOLOAD) : {
+        . = ALIGN(16);
+        __bss_start = .;
+        *(.bss .bss.*)
+        *(COMMON)
+        __bss_end = .;
+    }
+    _end = .;
+
+   /DISCARD/ : { *(.comment) *(.gnu*) *(.note*) *(.eh_frame*) }
+}
+__bss_size = (__bss_end - __bss_start)>>3;
--- a/part2/Makefile
+++ b/part2/Makefile
@ -0,0 +1,19 @@
+CFILES = $(wildcard *.c)
+OFILES = $(CFILES:.c=.o)
+GCCFLAGS = -Wall -O2 -ffreestanding -nostdinc -nostdlib -nostartfiles
+GCCPATH = ../../gcc-arm-9.2-2019.12-x86_64-aarch64-none-elf/bin
+
+all: clean kernel8.img
+
+boot.o: boot.S
+	$(GCCPATH)/aarch64-none-elf-gcc $(GCCFLAGS) -c boot.S -o boot.o
+
+%.o: %.c
+	$(GCCPATH)/aarch64-none-elf-gcc $(GCCFLAGS) -c $< -o $@
+
+kernel8.img: boot.o $(OFILES)
+	$(GCCPATH)/aarch64-none-elf-ld -nostdlib -nostartfiles boot.o $(OFILES) -T link.ld -o kernel8.elf
+	$(GCCPATH)/aarch64-none-elf-objcopy -O binary kernel8.elf kernel8.img
+
+clean:
+	/bin/rm kernel8.elf *.o *.img > /dev/null 2> /dev/null || true
--- a/part2/README.md
+++ b/part2/README.md
@ -0,0 +1,70 @@
+Writing a "bare metal" operating system for Raspberry Pi 4 (Part 2)
+===================================================================
+
+Making a makefile
+-----------------
+
+I could now just tell you the commands required to build this very simple kernel one after the other, but let's try to future-proof a little. I anticipate that our kernel will become more complex, with multiple C files needing to be built. It therefore makes sense to craft a **makefile**. A makefile is written in (yet another) language that automates the build process for us.
+
+Save the following as _Makefile_:
+
+```
+CFILES = $(wildcard *.c)
+OFILES = $(CFILES:.c=.o)
+GCCFLAGS = -Wall -O2 -ffreestanding -nostdinc -nostdlib -nostartfiles
+GCCPATH = ../../gcc-arm-9.2-2019.12-x86_64-aarch64-none-elf/bin
+
+all: clean kernel8.img
+
+boot.o: boot.S
+	$(GCCPATH)/aarch64-none-elf-gcc $(GCCFLAGS) -c boot.S -o boot.o
+
+%.o: %.c
+	$(GCCPATH)/aarch64-none-elf-gcc $(GCCFLAGS) -c $< -o $@
+
+kernel8.img: boot.o $(OFILES)
+	$(GCCPATH)/aarch64-none-elf-ld -nostdlib -nostartfiles boot.o $(OFILES) -T link.ld -o kernel8.elf
+	$(GCCPATH)/aarch64-none-elf-objcopy -O binary kernel8.elf kernel8.img
+
+clean:
+	/bin/rm kernel8.elf *.o *.img > /dev/null 2> /dev/null || true
+```
+
+ * CFILES is a list of the _.c_ files already existing in the current directory (our input)
+ * OFILES is that same list but replacing _.c_ with _.o_ in each filename - these will be our **object files** containing the binary code, and they'll be generated by the compiler (our output)
+ * GCCFLAGS is a list of parameters that tell the compiler we're building for bare metal and so it can't rely on any standard libraries that it might normally use to implement simple functions - nothing is for free on bare metal!
+ * GCCPATH is the path to our compiler binaries (the location where you unpacked the Arm tools you downloaded previously)
+
+There then follows a list of targets with their dependencies listed after the colon. The indented commands underneath each target will be executed to build that target. It's hopefully easy to see that to build _boot.o_, we depend on the existence of the source code file _boot.S_. We then run our compiler with the right flags, taking _boot.S_ as our input and generating _boot.o_.
+
+`%` is a matching wildcard character within a makefile. So, when I read the next target, I see that to build any other file that ends in _.o_ we require its similarly-named _.c_ file. The command list underneath is then executed with `$<` being replaced by the _.c_ filename and `$@` being replaced by the _.o_ filename.
+
+Carrying on, to build _kernel8.img_ we must first have built _boot.o_ and also every other _.o_ file in the OFILES list. If we have, we run the `ld` linker to join _boot.o_ with the other object files using our linker script, _link.ld_, to define the layout of the _kernel8.elf_ image we create. Sadly, the ELF format is designed to be run by another operating system so, for a bare metal target, we use `objcopy` to extract the right sections of the ELF file into _kernel8.img_. This is the kernel image that we'll eventually boot our RPi4 from.
+
+I would now hope that the "clean" and "all" targets are self-explanatory.
+
+Building
+--------
+
+Now that we have our _Makefile_ in place, we simply type `make` to build our kernel image. Since "all" is the first target listed in our _Makefile_, `make` will build this unless you tell it otherwise. When building "all", it will first clean up any old builds and then make us a fresh build of _kernel8.img_.
+
+
+Copying our kernel image to the SD card
+---------------------------------------
+
+Hopefully you already have a micro-SD card with the working Raspbian image on it. To boot our kernel instead of Raspbian we need to replace any existing kernel image(s) with our own, whilst taking care to keep the rest of the directory structure intact.
+
+On your dev machine, mount the SD card and delete any files on it that begin with the word _kernel_. A more cautious approach may be to simply move these off the SD card into a backup folder on your local hard drive. You can then restore these easily if needed.
+
+We'll now copy our _kernel8.img_ onto the SD card. This name is meaningful and it signals to the RPi4 that we want it to boot in 64-bit mode. We can also force this by setting `arm_64bit` to a non-zero value in [config.txt](https://www.raspberrypi.org/documentation/configuration/config-txt/boot.md). Booting our OS into 64-bit mode will mean that we can take advantage of the larger memory capacity available to the RPi4.
+
+Booting
+-------
+
+Safely unmount the SD card from your dev machine, put it back into your RPi4 and power it up.
+
+_You've just booted your very own OS!_
+
+As exciting as that sounds, all you're likely to see after the RPi4's own "rainbow splash screen" is an empty, black screen. However, we shouldn't be so surprised: we haven't yet asked it to do anything other than spin in an infinite loop.
+
+The foundations are laid though, and we can start to do exciting things now. **Congratulations for getting this far!**
--- a/part2/boot.S
+++ b/part2/boot.S
@ -0,0 +1,30 @@
+.section ".text.boot"  // Make sure the linker puts this at the start of the kernel image
+
+.global _start  // Execution starts here
+
+_start:
+    // Check processor ID is zero (executing on main core), else hang
+    mrs     x1, mpidr_el1
+    and     x1, x1, #3
+    cbz     x1, 2f
+    // We're not on the main core, so hang in an infinite wait loop
+1:  wfe
+    b       1b
+2:  // We're on the main core!
+
+    // Set stack to start below our code
+    ldr     x1, =_start
+    mov     sp, x1
+
+    // Clean the BSS section
+    ldr     x1, =__bss_start     // Start address
+    ldr     w2, =__bss_size      // Size of the section
+3:  cbz     w2, 4f               // Quit loop if zero
+    str     xzr, [x1], #8
+    sub     w2, w2, #1
+    cbnz    w2, 3b               // Loop if non-zero
+
+    // Jump to our main() routine in C (make sure it doesn't return)
+4:  bl      main
+    // In case it does return, halt the master core too
+    b       1b
--- a/part2/kernel.c
+++ b/part2/kernel.c
@ -0,0 +1,4 @@
+void main()
+{
+    while (1);
+}
--- a/part2/link.ld
+++ b/part2/link.ld
@ -0,0 +1,19 @@
+SECTIONS
+{
+    . = 0x80000;     /* Kernel load address for AArch64 */
+    .text : { KEEP(*(.text.boot)) *(.text .text.* .gnu.linkonce.t*) }
+    .rodata : { *(.rodata .rodata.* .gnu.linkonce.r*) }
+    PROVIDE(_data = .);
+    .data : { *(.data .data.* .gnu.linkonce.d*) }
+    .bss (NOLOAD) : {
+        . = ALIGN(16);
+        __bss_start = .;
+        *(.bss .bss.*)
+        *(COMMON)
+        __bss_end = .;
+    }
+    _end = .;
+
+   /DISCARD/ : { *(.comment) *(.gnu*) *(.note*) *(.eh_frame*) }
+}
+__bss_size = (__bss_end - __bss_start)>>3;
--- a/part3/Makefile
+++ b/part3/Makefile
@ -0,0 +1,19 @@
+CFILES = $(wildcard *.c)
+OFILES = $(CFILES:.c=.o)
+GCCFLAGS = -Wall -O2 -ffreestanding -nostdinc -nostdlib -nostartfiles
+GCCPATH = ../../gcc-arm-9.2-2019.12-x86_64-aarch64-none-elf/bin
+
+all: clean kernel8.img
+
+boot.o: boot.S
+	$(GCCPATH)/aarch64-none-elf-gcc $(GCCFLAGS) -c boot.S -o boot.o
+
+%.o: %.c
+	$(GCCPATH)/aarch64-none-elf-gcc $(GCCFLAGS) -c $< -o $@
+
+kernel8.img: boot.o $(OFILES)
+	$(GCCPATH)/aarch64-none-elf-ld -nostdlib -nostartfiles boot.o $(OFILES) -T link.ld -o kernel8.elf
+	$(GCCPATH)/aarch64-none-elf-objcopy -O binary kernel8.elf kernel8.img
+
+clean:
+	/bin/rm kernel8.elf *.o *.img > /dev/null 2> /dev/null || true
--- a/part3/README.md
+++ b/part3/README.md
@ -0,0 +1,221 @@
+Writing a "bare metal" operating system for Raspberry Pi 4 (Part 3)
+===================================================================
+
+Making something happen
+-----------------------
+
+So far our OS produces only a black screen. How can we be sure that our code is actually running? Let's do something a bit more interesting to really demonstrate that we have control of the hardware.
+
+Usually, the first thing a software developer learns is to print "Hello world!" to the screen. In bare metal development however, printing to the screen can be quite a big challenge, so we're going to do something simpler to start with.
+
+Introducing the UART
+--------------------
+
+Perhaps the easiest way we can "send a message" from our OS is via the **UART** or serial communications circuit. UART stands for Universal Asynchronous Receiver/Transmitter and it's a very old and fairly simple interface that uses just two wires to communicate between two devices. Before USB came along, devices like mice, printers and modems were connected in this way. 
+
+We're going to connect your dev machine directly to your RPi4 and have the RPi4 send the "Hello world!" message to your dev machine! Your dev machine will print it to the screen.
+
+You will need:
+
+ * A [USB to serial TTL cable](https://www.amazon.co.uk/gp/product/B01N4X3BJB/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1)
+ * To [download and install drivers for the cable](https://www.silabs.com/products/development-tools/software/usb-to-uart-bridge-vcp-drivers)
+ * To [download and install PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html) on your dev machine
+
+If you'd like to read up on serial communication before we start, I recommend looking at the [SparkFun website](https://learn.sparkfun.com/tutorials/serial-communication/all).
+
+Connecting the cable
+--------------------
+
+If you have the drivers installed, go ahead and connect the cable to a spare USB port on your dev machine. Very little will happen, but if you now open Control Panel, click on Device Manager and open the Ports section, you should see a "Prolific" entry. That tells us that your cable is working correctly.
+
+Here's what my machine looks like:
+
+![Windows Control Panel with cable installed](images/3-helloworld-ctlpanel.png)
+
+Make a note of the COMx number in brackets after the Prolific entry - in my case, that's **COM5**.
+
+Now we need to look at the RPi4 to identify how to connect the other end of the cable. You'll be looking for the **GPIO pins**, all 40 of them, which are just above the Raspberry Pi copyright notice. 
+
+The diagram below shows where you need to make connections. The BLACK connector hooks over Ground (Pin 6), the WHITE over TXD (GPIO 14/Pin 8) and the GREEN over RXD (GPIO 15/Pin 10). As we are powering the RPi4 using a dedicated power supply, make sure you **don't connect the RED connector**.
+
+![GPIO location](images/3-helloworld-pinloc.png)
+
+Here's my RPi4 with the cable connected correctly:
+
+![GPIO photo with cable connected](images/3-helloworld-cable.jpg)
+
+Setting up PuTTY
+----------------
+
+ * Run PuTTY on your dev machine
+ * Click on the "Session" category in the left-hand pane
+ * Set "Connection type" to Serial
+ * Click on "Serial" under the "Connection" category in the left-hand pane
+ * Set the "Serial line to connect to" to the COMx number we found above, mine was COM5
+ * Set the "Speed (baud)" to 115200
+ * Ensure "Data bits" is 8, "Stop bits" is 1, "Parity" is None and "Flow control" is None
+ * Click back to the "Session" category in the left-hand pane and you should see the changed settings
+ * Save these settings by typing a name e.g. "Raspberry Pi 4" in the textbox under "Saved Sessions" and clicking Save
+ * You can now start the connection by double-clicking on "Raspberry Pi 4" - if you do, all you will see for now is an empty black window
+
+A quick config.txt change
+-------------------------
+
+Do you remember that, back in the first tutorial, I had to edit the _config.txt_ file on the SD card to get Raspbian up on my TV screen? Now we need to add a line to ensure that our UART connection will be reliable.
+
+UART communication has a lot to do with timing, and it's important that both ends agree on the exact speed of data being sent/received. When we set up PuTTY, we told it to communicate at 115200 baud, and we'll need the RPi4 to communicate at the same rate. As it is, we can't be sure that it will: the UART we'll be using derives its baud rate from the core clock, and the firmware may raise or lower that clock while the system is running, so the RPi4 might end up communicating faster or slower than we asked for.
+
+Add this line to your _config.txt_ to resolve this:
+
+```c
+core_freq_min=500
+```
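
To see why a fixed clock matters, here is a small sketch of the baud rate arithmetic that can be run on the dev machine. It assumes the relationship implied by the `AUX_MU_BAUD` macro in _io.c_ later in this part, namely baud = clock / (8 × (register value + 1)). The function names are hypothetical, and the 250 MHz figure is only an illustrative "wrong" clock, not a value taken from the RPi4.

```c
#include <stdint.h>

/* Same arithmetic as the AUX_MU_BAUD macro used later in io.c. */
uint32_t mini_uart_baud_reg(uint32_t clock_hz, uint32_t baud)
{
    return clock_hz / (baud * 8u) - 1u;
}

/* The inverse: the rate actually produced for a given register value. */
uint32_t mini_uart_actual_baud(uint32_t clock_hz, uint32_t baud_reg)
{
    return clock_hz / (8u * (baud_reg + 1u));
}
```

With a 500 MHz clock and a target of 115200 baud the register value works out as 541, which yields roughly 115313 baud, close enough for PuTTY. If the clock were to drop to 250 MHz with the same register value, the rate would halve to roughly 57656 baud and the text would arrive garbled.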
+
+Getting the UART going in code
+------------------------------
+
+First off, let's update _kernel.c_ to make a few new calls:
+
+```c
+#include "io.h"
+
+void main()
+{
+    uart_init();
+    uart_writeText("Hello world!\n");
+    while (1);
+}
+```
+
+We start by including a new **header file**, _io.h_. This allows us to write some new code outside of the _kernel.c_ file, and call it in when we need it.
+
+You'll note that our `main()` routine also has some new lines. First we call a function to initialise the UART, and then we call another function to write "Hello world!" to it. The weird character at the end of the string - `\n` - is how we add a newline to the end of our text, just like pressing Enter in a word processor!
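
One detail worth knowing in advance: the `uart_writeText()` implementation in _io.c_ (shown further down) sends a carriage return before every newline, because a serial terminal typically needs both to start a fresh line at the left margin. The sketch below is a hypothetical host-side stand-in that writes into a buffer instead of the UART, so the behaviour can be tried on the dev machine. `expand_newlines` is not part of the project.

```c
#include <stddef.h>

/* Hypothetical stand-in for uart_writeText(): instead of writing to the
   UART it copies into `out`, inserting '\r' before every '\n'.
   Returns the number of bytes produced, excluding the terminator. */
size_t expand_newlines(const char *text, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0) return 0;

    /* Stop early if there is no room for up to two bytes plus '\0'. */
    while (*text && n + 2 < out_size) {
        if (*text == '\n') out[n++] = '\r';
        out[n++] = *text++;
    }
    out[n] = '\0';
    return n;
}
```

Passing in `"Hello world!\n"` produces the 14 bytes `"Hello world!\r\n"`.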
+
+Let's now create _io.h_ with the following contents:
+
+```c
+void uart_init();
+void uart_writeText(char *buffer);
+```
+
+This is a very short file with two **function declarations**. `uart_init()` is a **void function** with no **parameters**, just like `main()`. This means that it doesn't need any data from the caller to do its job, and it doesn't send any data back to the caller when it's done. You'll note that `uart_writeText` is also a void function, but it does take a parameter since we need to tell it what text to write!
+
+We'll put the actual code for these functions in another new file, _io.c_:
+
+```c
+// GPIO
+
+enum {
+    PERIPHERAL_BASE = 0xFE000000,
+    GPFSEL0         = PERIPHERAL_BASE + 0x200000,
+    GPSET0          = PERIPHERAL_BASE + 0x20001C,
+    GPCLR0          = PERIPHERAL_BASE + 0x200028,
+    GPPUPPDN0       = PERIPHERAL_BASE + 0x2000E4
+};
+
+enum {
+    GPIO_MAX_PIN       = 53,
+    GPIO_FUNCTION_ALT5 = 2,
+};
+
+enum {
+    Pull_None = 0,
+};
+
+void mmio_write(long reg, unsigned int val) { *(volatile unsigned int *)reg = val; }
+unsigned int mmio_read(long reg) { return *(volatile unsigned int *)reg; }
+
+unsigned int gpio_call(unsigned int pin_number, unsigned int value, unsigned int base, unsigned int field_size, unsigned int field_max) {
+    unsigned int field_mask = (1 << field_size) - 1;
+  
+    if (pin_number > field_max) return 0;
+    if (value > field_mask) return 0; 
+
+    unsigned int num_fields = 32 / field_size;
+    unsigned int reg = base + ((pin_number / num_fields) * 4);
+    unsigned int shift = (pin_number % num_fields) * field_size;
+
+    unsigned int curval = mmio_read(reg);
+    curval &= ~(field_mask << shift);
+    curval |= value << shift;
+    mmio_write(reg, curval);
+
+    return 1;
+}
+
+unsigned int gpio_set     (unsigned int pin_number, unsigned int value) { return gpio_call(pin_number, value, GPSET0, 1, GPIO_MAX_PIN); }
+unsigned int gpio_clear   (unsigned int pin_number, unsigned int value) { return gpio_call(pin_number, value, GPCLR0, 1, GPIO_MAX_PIN); }
+unsigned int gpio_pull    (unsigned int pin_number, unsigned int value) { return gpio_call(pin_number, value, GPPUPPDN0, 2, GPIO_MAX_PIN); }
+unsigned int gpio_function(unsigned int pin_number, unsigned int value) { return gpio_call(pin_number, value, GPFSEL0, 3, GPIO_MAX_PIN); }
+
+void gpio_setPinFunction(unsigned int pin_number, unsigned int function) {
+    gpio_function(pin_number, function);
+}
+
+void gpio_useAsAlt5(unsigned int pin_number) {
+    gpio_pull(pin_number, Pull_None);
+    gpio_setPinFunction(pin_number, GPIO_FUNCTION_ALT5);
+}
+
+// UART
+
+enum {
+    AUX_BASE        = PERIPHERAL_BASE + 0x215000,
+    AUX_ENABLES     = AUX_BASE + 4,
+    AUX_MU_IO_REG   = AUX_BASE + 64,
+    AUX_MU_IER_REG  = AUX_BASE + 68,
+    AUX_MU_IIR_REG  = AUX_BASE + 72,
+    AUX_MU_LCR_REG  = AUX_BASE + 76,
+    AUX_MU_MCR_REG  = AUX_BASE + 80,
+    AUX_MU_LSR_REG  = AUX_BASE + 84,
+    AUX_MU_CNTL_REG = AUX_BASE + 96,
+    AUX_MU_BAUD_REG = AUX_BASE + 104,
+    AUX_UART_CLOCK  = 500000000,
+    UART_MAX_QUEUE  = 16 * 1024
+};
+
+#define AUX_MU_BAUD(baud) ((AUX_UART_CLOCK/((baud)*8))-1)
+
+void uart_init() {
+    mmio_write(AUX_ENABLES, 1); //enable UART1
+    mmio_write(AUX_MU_IER_REG, 0);
+    mmio_write(AUX_MU_CNTL_REG, 0);
+    mmio_write(AUX_MU_LCR_REG, 3); //8 bits
+    mmio_write(AUX_MU_MCR_REG, 0);
+    mmio_write(AUX_MU_IER_REG, 0);
+    mmio_write(AUX_MU_IIR_REG, 0xC6); //clear the receive and transmit FIFOs
+    mmio_write(AUX_MU_BAUD_REG, AUX_MU_BAUD(115200));
+    gpio_useAsAlt5(14);
+    gpio_useAsAlt5(15);
+    mmio_write(AUX_MU_CNTL_REG, 3); //enable RX/TX
+}
+
+unsigned int uart_isWriteByteReady() { return mmio_read(AUX_MU_LSR_REG) & 0x20; }
+
+void uart_writeByteBlockingActual(unsigned char ch) {
+    while (!uart_isWriteByteReady()); 
+    mmio_write(AUX_MU_IO_REG, (unsigned int)ch);
+}
+
+void uart_writeText(char *buffer) {
+    while (*buffer) {
+       if (*buffer == '\n') uart_writeByteBlockingActual('\r');
+       uart_writeByteBlockingActual(*buffer++);
+    }
+}
+```
+
+You'll see that the two functions we declared in our _io.h_ header file now have some actual code, along with some other supporting functions. I'll explain what's going on in this code in the next lesson, but let's skip straight to the action now!
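
Ahead of that fuller explanation, the arithmetic at the heart of `gpio_call()` can be tried out on the dev machine. The sketch below lifts the same calculations into two hypothetical helpers, `locate_field` and `replace_field`, which touch no hardware. It assumes, as `gpio_call()` does, that each pin owns a fixed-width field and that fields are packed into consecutive 32-bit registers.

```c
#include <stdint.h>

struct field_location {
    uint32_t reg_offset; /* byte offset from the bank's first register */
    uint32_t shift;      /* bit position of the pin's field in that register */
};

/* Same arithmetic as gpio_call(): each 32-bit register holds
   32 / field_size fields, and consecutive registers are 4 bytes apart. */
struct field_location locate_field(uint32_t pin_number, uint32_t field_size)
{
    uint32_t num_fields = 32u / field_size;
    struct field_location loc = {
        (pin_number / num_fields) * 4u,
        (pin_number % num_fields) * field_size
    };
    return loc;
}

/* The read-modify-write step: clear the field, then OR in the new value. */
uint32_t replace_field(uint32_t current, uint32_t value,
                       uint32_t field_size, uint32_t shift)
{
    uint32_t field_mask = (1u << field_size) - 1u;
    return (current & ~(field_mask << shift)) | (value << shift);
}
```

For GPIO 14 with the 3-bit function fields used by `gpio_function()`, this gives a register offset of 4 and a shift of 12, so the `GPIO_FUNCTION_ALT5` value of 2 lands in bits 12 to 14 while every other pin's field in that register is left untouched.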
+
+With your new _io.c_ and _io.h_ files in place, as well as the changes to _kernel.c_ made, run `make` to build your new OS. 
+
+Then: 
+
+ * Copy the newly built _kernel8.img_ to the SD card, and then put the SD card into your RPi4
+ * Make sure your USB to serial TTL cable is connected correctly
+ * Run PuTTY and double-click your "Raspberry Pi 4" session - you should see an empty black screen and no errors
+ * Power on your RPi4
+
+If you've followed all these instructions, after a few seconds you'll see "Hello world!" appear in your PuTTY window on your dev machine.
+
+_It's a message from your RPi4 to say that your OS is working. Proof at last!_
--- a/part3/boot.S
+++ b/part3/boot.S
@ -0,0 +1,30 @@
+.section ".text.boot"  // Make sure the linker puts this at the start of the kernel image
+
+.global _start  // Execution starts here
+
+_start:
+    // Check processor ID is zero (executing on main core), else hang
+    mrs     x1, mpidr_el1
+    and     x1, x1, #3
+    cbz     x1, 2f
+    // We're not on the main core, so hang in an infinite wait loop
+1:  wfe
+    b       1b
+2:  // We're on the main core!
+
+    // Set stack to start below our code
+    ldr     x1, =_start
+    mov     sp, x1
+
+    // Clean the BSS section
+    ldr     x1, =__bss_start     // Start address
+    ldr     w2, =__bss_size      // Size of the section in 8-byte words
+3:  cbz     w2, 4f               // Quit loop if zero
+    str     xzr, [x1], #8
+    sub     w2, w2, #1
+    cbnz    w2, 3b               // Loop if non-zero
+
+    // Jump to our main() routine in C (make sure it doesn't return)
+4:  bl      main
+    // In case it does return, halt the master core too
+    b       1b
--- a/part3/images/3-helloworld-cable.jpg
+++ b/part3/images/3-helloworld-cable.jpg
--- a/part3/images/3-helloworld-ctlpanel.png
+++ b/part3/images/3-helloworld-ctlpanel.png
--- a/part3/images/3-helloworld-pinloc.png
+++ b/part3/images/3-helloworld-pinloc.png
--- a/part3/io.c
+++ b/part3/io.c
@ -0,0 +1,100 @@
+// GPIO
+
+enum {
+    PERIPHERAL_BASE = 0xFE000000,
+    GPFSEL0         = PERIPHERAL_BASE + 0x200000,
+    GPSET0          = PERIPHERAL_BASE + 0x20001C,
+    GPCLR0          = PERIPHERAL_BASE + 0x200028,
+    GPPUPPDN0       = PERIPHERAL_BASE + 0x2000E4
+};
+
+enum {
+    GPIO_MAX_PIN       = 53,
+    GPIO_FUNCTION_ALT5 = 2,
+};
+
+enum {
+    Pull_None = 0,
+};
+
+void mmio_write(long reg, unsigned int val) { *(volatile unsigned int *)reg = val; }
+unsigned int mmio_read(long reg) { return *(volatile unsigned int *)reg; }
+
+unsigned int gpio_call(unsigned int pin_number, unsigned int value, unsigned int base, unsigned int field_size, unsigned int field_max) {
+    unsigned int field_mask = (1 << field_size) - 1;
+  
+    if (pin_number > field_max) return 0;
+    if (value > field_mask) return 0; 
+
+    unsigned int num_fields = 32 / field_size;
+    unsigned int reg = base + ((pin_number / num_fields) * 4);
+    unsigned int shift = (pin_number % num_fields) * field_size;
+
+    unsigned int curval = mmio_read(reg);
+    curval &= ~(field_mask << shift);
+    curval |= value << shift;
+    mmio_write(reg, curval);
+
+    return 1;
+}
+
+unsigned int gpio_set     (unsigned int pin_number, unsigned int value) { return gpio_call(pin_number, value, GPSET0, 1, GPIO_MAX_PIN); }
+unsigned int gpio_clear   (unsigned int pin_number, unsigned int value) { return gpio_call(pin_number, value, GPCLR0, 1, GPIO_MAX_PIN); }
+unsigned int gpio_pull    (unsigned int pin_number, unsigned int value) { return gpio_call(pin_number, value, GPPUPPDN0, 2, GPIO_MAX_PIN); }
+unsigned int gpio_function(unsigned int pin_number, unsigned int value) { return gpio_call(pin_number, value, GPFSEL0, 3, GPIO_MAX_PIN); }
+
+void gpio_setPinFunction(unsigned int pin_number, unsigned int function) {
+    gpio_function(pin_number, function);
+}
+
+void gpio_useAsAlt5(unsigned int pin_number) {
+    gpio_pull(pin_number, Pull_None);
+    gpio_setPinFunction(pin_number, GPIO_FUNCTION_ALT5);
+}
+
+// UART
+
+enum {
+    AUX_BASE        = PERIPHERAL_BASE + 0x215000,
+    AUX_ENABLES     = AUX_BASE + 4,
+    AUX_MU_IO_REG   = AUX_BASE + 64,
+    AUX_MU_IER_REG  = AUX_BASE + 68,
+    AUX_MU_IIR_REG  = AUX_BASE + 72,
+    AUX_MU_LCR_REG  = AUX_BASE + 76,
+    AUX_MU_MCR_REG  = AUX_BASE + 80,
+    AUX_MU_LSR_REG  = AUX_BASE + 84,
+    AUX_MU_CNTL_REG = AUX_BASE + 96,
+    AUX_MU_BAUD_REG = AUX_BASE + 104,
+    AUX_UART_CLOCK  = 500000000,
+    UART_MAX_QUEUE  = 16 * 1024
+};
+
+#define AUX_MU_BAUD(baud) ((AUX_UART_CLOCK/((baud)*8))-1)
+
+void uart_init() {
+    mmio_write(AUX_ENABLES, 1); //enable UART1
+    mmio_write(AUX_MU_IER_REG, 0);
+    mmio_write(AUX_MU_CNTL_REG, 0);
+    mmio_write(AUX_MU_LCR_REG, 3); //8 bits
+    mmio_write(AUX_MU_MCR_REG, 0);
+    mmio_write(AUX_MU_IER_REG, 0); //disable interrupts
+    mmio_write(AUX_MU_IIR_REG, 0xC6); //clear RX/TX FIFOs
+    mmio_write(AUX_MU_BAUD_REG, AUX_MU_BAUD(115200));
+    gpio_useAsAlt5(14);
+    gpio_useAsAlt5(15);
+    mmio_write(AUX_MU_CNTL_REG, 3); //enable RX/TX
+}
+
+unsigned int uart_isWriteByteReady() { return mmio_read(AUX_MU_LSR_REG) & 0x20; }
+
+void uart_writeByteBlockingActual(unsigned char ch) {
+    while (!uart_isWriteByteReady()); //wait until the transmitter can accept a byte
+    mmio_write(AUX_MU_IO_REG, (unsigned int)ch);
+}
+
+void uart_writeText(char *buffer) {
+    while (*buffer) {
+       if (*buffer == '\n') uart_writeByteBlockingActual('\r');
+       uart_writeByteBlockingActual(*buffer++);
+    }
+}
--- a/part3/io.h
+++ b/part3/io.h
@ -0,0 +1,2 @@
+void uart_init();
+void uart_writeText(char *buffer);
--- a/part3/kernel.c
+++ b/part3/kernel.c
@ -0,0 +1,8 @@
+#include "io.h"
+
+void main()
+{
+    uart_init();
+    uart_writeText("Hello world!\n");
+    while (1);
+}
--- a/part3/link.ld
+++ b/part3/link.ld
@ -0,0 +1,19 @@
+SECTIONS
+{
+    . = 0x80000;     /* Kernel load address for AArch64 */
+    .text : { KEEP(*(.text.boot)) *(.text .text.* .gnu.linkonce.t*) }
+    .rodata : { *(.rodata .rodata.* .gnu.linkonce.r*) }
+    PROVIDE(_data = .);
+    .data : { *(.data .data.* .gnu.linkonce.d*) }
+    .bss (NOLOAD) : {
+        . = ALIGN(16);
+        __bss_start = .;
+        *(.bss .bss.*)
+        *(COMMON)
+        __bss_end = .;
+    }
+    _end = .;
+
+    /DISCARD/ : { *(.comment) *(.gnu*) *(.note*) *(.eh_frame*) }
+}
+__bss_size = (__bss_end - __bss_start)>>3;