In the last blog post, about booting the upstream kernel on Inforce 6640, I mentioned that there was an interesting issue with ftrace that led to a form of debug-by-rebooting approach, and that is worth a blog post. Of course it only took me 1 year+ to write it (!) – so apologies for this huge delay, and let’s debug some early boot failures!
First of all, a reminder of what was the issue: in an attempt to boot the upstream kernel on the Inforce 6640 board, I found that for some reason ftrace doesn’t work in such a setup, starting around kernel 5.7 – bisect wasn’t precise, even talking with the ftrace maintainer on IRC didn’t provide many clues (since more debugging was required).
In order to be accurate and provide up-to-date results, I re-tested this with kernel v6.2-rc7 and…same issue is observed, the kernel just can’t boot with ftrace, and not a single hint is provided in serial console!
A parenthesis here: kernel 6.2-rc7 does boot on inforce 6640, but with no graphics (likely some changes in the msm driver), having some SCSI timeouts on boot time (sd 0:0:0:0: [sda] tag#13 timing out command, waited 120s) and…no USB(!), unless the “old” device-tree (from v5.16) is used – I think it’s related with some dwc3/usb changes merged on qcom device-trees. But anyway, this might be a topic for a whole new blog post, so let’s get back to debugging early boot issues.
Once we face a completely empty serial output after the usual bootloader messages, a basic question comes up: has the kernel even booted? In other words, more technically meaningful: did the bootloader properly load the kernel and jump to the code? Did the kernel code effectively start execution?
The first idea is always “let’s print something”, but how, if I don’t know where I’d put the print statement? Not to mention…that super early code has nothing initialized, so even a printk could be “too much” for such code. Hence, the only alternative seemed to me…to reboot the machine from kernel code! With that, at least I’d validate that kernel code was running. And thinking about that…I even extended this logic: if I can reboot the kernel, I could kinda bisect the code to determine where the failure is. One question remains though – how to reboot a kernel so early?
Rebooting/shutdown is definitely a non-trivial task. By checking the kernel code (here, for example), one can see it’s full of callbacks to the architecture code; naturally, it’s a platform/arch low-level process. Checking machine_restart() under arch/arm64, we can see it goes through EFI calls (if EFI is supported). Noticing I was walking an unknown and potentially daunting path, the decision was to first seek help on IRC, and that brought me gold: Marc Zyngier (the ARM64 maintainer) presented me PSCI (Power State Coordination Interface) – if my device’s firmware supports this specification (and Inforce 6640 does!), I could issue a PSCI reset SMC (Secure Monitor Call) to get a board reset. In other words, with some assembly instructions I could perhaps reboot the kernel! Marc even provided me information about the register I should write, and after some tinkering (and more code study), I came up with this function:
This was first tested as an alternative way for the sysrq-b reset handler, and that worked like a charm! Now, where to plug such code in kernel early path? Entry point seems to make sense, so I tried this hack on head.S:
And voilá! With that, the kernel was bootlooping…meaning the board reached kernel code, so the first question was answered and debug could proceed! I decided to take a lucky step and jump directly to start_kernel(), which is C code and way easier to play with no fears of causing another issue while debugging! And the lucky step paid off: kernel was indeed executing that function. So, through a set of attempts using the arm64_smc_reset() above and commenting the results, I got the following “bisect” from the code:
It was a bit ugly to see the diff above in code, hence I’ve added as an image. Basically, it tells us the issue likely happens in setup_machine_fdt(), and by digging more (with more “bisect” reboots), I found the issue happens in fdt_check_header(). Now, as a next step I really wanted to print some values from the guilty function, but using the kernel command-line earlycon didn’t help – it turned out this is not early enough.
But what if I could write a very simple really early console, or even an easier approach: copy/port the earlycon code for this board to a bit “earlier” point in the code? And that’s what I tried next: by inspecting (and instrumenting) the msm_serial driver, I came up with this following alternative (a big code snippet here would be just terrible to read, so uploaded the patch instead): https://people.igalia.com/gpiccoli/arm64-really-early-console-msm-qcom.patch
Notice the code is quite hacky – it ports the __msm_console_write() function with a pre-configured MMIO address for the serial port to early arch code. But despite the code being merely a “mock-up”, using it with sprintf() (for formatting) allowed me to print stuff, which was quite nice! Unfortunately I didn’t have time to debug the ftrace issue more in this board – I guess the best path forward would be first to submit a proper ifc6640 device-tree, and then follow with the debug. I think it’s likely related to the bootloader version and kernel/DT offsets, so that’s not an easy debug and definitely it’d be time-consuming.
But I hope at least the PSCI reset trick, and maybe this really early serial console prototype/idea could be useful to somebody in the world of ARM boards debugging! Thanks for reading and see you in the next blog post (which I expect takes waaay less than 1 year heheh).
My first task in the Core team at Igalia was to boot a more recent kernel in the Inforce 6640 board , which was very interesting given my very limited experience working in the ARM64 world – also, the work aimed to benefit Igalia’s Graphics team in their Freedreno development. The board itself is pretty powerful and contains a lot of resources; it comes with a quad-core Snapdragon™ 820 processor, Adreno™ 530 GPU, 4G of RAM plus a fast 64G UFS storage. A picture of my unit below:
So, I got a packed Debian image with 4.14 kernel and after some looking, noticed that the kernel was a modified / custom version from Linaro, which they call Qualcomm LT (Landing Tree). Hence, the first idea was to boot this image and see how well it worked. And unsurprisingly, it was working flawlessly – given this was provided by the board vendor, the expectation was that good amount of validation was performed, so we had a starting point here – the working version was the Linaro’s Qualcomm-LT 4.14. In a rush of optimism, I tried to test the latest released mainline kernel at the moment (v5.14) and…it didn’t work! Why would it, right? Where would be the fun?
So, next step was to bisect it and I decided to start by using the Linaro’s LT tree – it failed on 5.7+, but succeeded to boot 5.4, although the board seemed pretty slow! First thought was likely a problem in the storage, like a bad device tree config for UFS (the storage device) or a driver change between 4.14 and 5.4. But before looking into that, we needed to understand why kernels 5.4+ weren’t booting…so how to proceed here, if ssh wasn’t working?
Yeah…exactly: serial output to the rescue! So I got some jumper cables and decided to use a Raspberry Pi as the other end for serial connection. By connecting all the proper cables and using “minicom”, I could boot a 5.7 kernel and check the progress..and guess what? Kernel was booting normally!!! But without PCIe – no network / GPU, for example. Also, it was still very slow. And then, an epiphany: it was the device-tree! By dumping the device-tree (from procfs) in both the working 4.14 and the non-working 5.7, a bunch of stuff were missing, like all the PCI structures. But why is that? Somebody just removed stuff from a working device-tree?
Nah, it was just an obvious thing I didn’t realize before: up to kernel 5.6, the Inforce 6640 board was relying on the their “cousin’s” device-tree, the Dragonboard 820c! A pretty similar board, so relying on its DT made the board working like a charm, but in kernel 5.6 a proper device-tree for Inforce 6640 was added, although it’s very simple and minimal to boot the board. In order to have a fully working system after that finding, all we needed was to force the usage of the “old” db820c.dts, with the following hack patch:
OK, so now we have a working upstream kernel (with network!), so let’s investigate the remaining: why the board is so slow? And what about graphics, is it working? I started with the graphics and again, had some issues: board had graphics working on Linaro’s LT tree (v5.13), but not in the upstream 5.13/5.14. Another round of bisects later plus code inspection, and I found the “culprit”: a downstream patch that was also required in order to make the GPU work. The patch added a regulator with “always-on” property and a higher voltage than found in upstream. So, changed that and voilá: we had now a working 5.14 with graphics but…still slow! Moving then to the next journey: why the newer kernels are so much slower?
In order to investigate that, I began by comparing the db-820c device-trees among versions, and despite there were differences, nothing rang a bell. So, jumped to sysfs and started investigating there…when it comes to my attention that the CPUs were running slow, very slow – CPUs frequency were tiny . Taking a look in the commits and checking cpufreq drivers, the troublemaker was found: after kernel 4.18, a patch added a driver for handling OPP voltage scaling for Kryo CPUs, which is the case for Snapdragon™ 820. After loading this new module, CPU clocks were back to normal/fast rates, the board was as fast (or even more) than running the original Linaro’s LT 4.14 downstream kernel.
But… (yeah, another but)…after some simple CPU benchmarks, just some seconds after the beginning of the tests, board rebooted into a firmware error state. It was like as if it couldn’t handle all the CPU speed in the newer kernels. Here, I used the very powerful ftrace tooling to determine the reason of this behavior. And guess what was found?
Cooling! The responsible for that was kernel lack of throttling. The following trace snippet shows how the governor would reduce the CPU clock/voltage in order to preserve CPUs health:
The key here is the function step_wise_throttle which downscale the CPU frequency to prevent from overheating. In order to have it properly working, cooling maps/trip points are required in the device-tree, it’s present in some downstream patches on Linaro’s kernel – but when that work was upstream’ed, it lacked some portions, hence if we have cpufreq driver and high frequencies, we’re prone to overheating / firmware reboots. The patch that fixes that was re-submitted by another person and I tried to get that accepted, by “lobbying” in the mailing-list and showing our test results, but up to 5.16-rc6 it isn’t present in Linus tree.
Finally, worth to mention 2 things: for some reason, building a kernel with ftrace after v5.7 prevents it from booting. This led to a very interesting debug, like debug-by-rebooting (using a firmware triggered reboot in assembly), (re-)implementing a very early console to show output much before “earlycon” is available – stay tuned for a blog post about that! Second, we still lack that “regulator-always-on” necessary to have a functional GPU – after submitting a patch for that, I’ve discussed that with the maintainer, and it seems we should have the user of such regulator to request it properly – due to other tasks I couldn’t continue this work. So, in summary, if you want to boot an upstream kernel (5.14+) on Inforce 6640, the steps are:
Disable the ifc6640.dts, and rely on db820c DT;
Boot without ftrace and with the Kryo cpufreq driver built-in (ARM_QCOM_CPUFREQ_NVMEM=y);
With that, one should be able to boot a recent kernel in this great board, and even with graphics! Take a look in the Inforce 6640 running glxgears in a recent kernel:
Hope the read was enjoyable and useful, feel free to leave your comment or ping on IRC (gpiccoli @OFTC/Libera) =]