Hi again everyone and welcome to the “Good news” post!
It’s been 3 weeks since I wrote my last blog post, but this is going to be a short update, in large part because I spent the first two weeks sick in bed and thus wasn’t able to do much at all. However, in the last week I did manage to make some big progress, and the result represents such a great milestone that it warrants a blog post of its own. And, well, I doubt many will complain about not having to read through a wall of text for today’s blog post 🙂
So the good news is: coreboot is working on the Librem 13. The laptop boots into Linux and most things are working! The only issue I have found so far is that the M.2 SATA port doesn’t seem to work properly yet (see below for more info).
Getting video output
You may remember that, at the end of my 2nd blog post, I had finally managed to build coreboot with all of the binary blobs included and it should have worked but it didn’t for “some reason”, so I was going to try to enable debugging to see where it froze.
After installing the “Screwdriver” image on the BeagleBone Black and enabling the EHCI debugging in the coreboot config, I was able to get the debug output from coreboot. It was really quite easy to achieve. The screwdriver image for BBB is pretty much a “boot it and it will Just Work™” thing, no configuration or app installs to do on it. As for enabling EHCI debugging, it took me a couple of tries because I had to enable two options in different config menus, not just one (enable EHCI log debugging, then enable the option to send the log via USB), but thankfully the wiki page explained that so once I followed the docs, it was quite simple. And for the curious/future reference, the USB port which outputs the EHCI debugging is the one on the right side of the laptop.
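For future reference, here is a minimal sketch of how one could check that both options ended up in the generated .config before building. It assumes the two options are named CONFIG_USBDEBUG (EHCI debug support) and CONFIG_CONSOLE_USB (sending the console log over USB); those names are an assumption on my part and may differ between coreboot versions, and the helper name is made up for illustration.

```python
def missing_usb_debug_options(config_text):
    """Return the required options that are not set to 'y' in a .config."""
    # Assumed option names; check your own menuconfig if they differ.
    required = ("CONFIG_USBDEBUG", "CONFIG_CONSOLE_USB")
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set"
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    sample = "CONFIG_USBDEBUG=y\n# CONFIG_CONSOLE_USB is not set\n"
    print(missing_usb_debug_options(sample))
```

An empty list means both options are enabled; anything else names the option that still has to be turned on.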
Once I had the boot log from coreboot, I noticed that it hadn’t frozen anywhere: the last line was about coreboot launching the SeaBIOS payload, so coreboot had done everything up to the end. I checked the various steps and it had initialized the RAM, the refcode, the VBIOS, etc. I figured, “Maybe it’s a configuration issue”, so I checked my lspci output from before and saw that the VGA controller PCI ID was “8086,1616”, whereas the coreboot config had it set to “8086,0406”. So I changed that and flashed coreboot, and when I booted the machine, the video controller worked and I saw the SeaBIOS prompt. Hurray!
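This kind of mismatch is easy to check mechanically. The sketch below pulls the vendor and device ID of the VGA controller out of `lspci -nn` output and compares it to a coreboot-style "vendor,device" string. The sample lspci line is illustrative (not copied from the actual machine), and the function names are hypothetical.

```python
import re

# Illustrative line in the format printed by `lspci -nn`; not real captured output.
SAMPLE_LSPCI = (
    "00:02.0 VGA compatible controller [0300]: "
    "Intel Corporation Device [8086:1616] (rev 09)"
)


def vga_ids_from_lspci(lspci_nn_output):
    """Return (vendor, device) hex strings for the first VGA controller, or None."""
    for line in lspci_nn_output.splitlines():
        if "VGA compatible controller" in line:
            # Matches "[vvvv:dddd]" but not the class code "[0300]".
            match = re.search(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]", line)
            if match:
                return match.group(1).lower(), match.group(2).lower()
    return None


def matches_configured_vga_id(lspci_nn_output, configured_id):
    """Compare lspci's VGA ID with a 'vendor,device' string such as '8086,0406'."""
    ids = vga_ids_from_lspci(lspci_nn_output)
    if ids is None:
        return False
    vendor, device = (part.strip().lower() for part in configured_id.split(","))
    return ids == (vendor, device)


if __name__ == "__main__":
    print(matches_configured_vga_id(SAMPLE_LSPCI, "8086,0406"))
    print(matches_configured_vga_id(SAMPLE_LSPCI, "8086,1616"))
```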
The Curious Case of the M.2 SSD
Unfortunately, once I tried booting Linux, it failed with a “Read Error”:
After spending some time trying to figure it out and not being able to (there is no “Read Error” string or anything that could print such a string in the SeaBIOS code, so I couldn’t track down where the error came from, and there is no EHCI debug since coreboot is already done booting and the issue is from SeaBIOS), I tried booting PureOS from the USB installation drive instead, and I was able to boot into the live environment without any problems. Wow, first success! PureOS is booting with coreboot! There was much rejoicing.
To begin investigating the SSD issue, I used the same set of commands from the Motherboard porting guide and started comparing the results. There were a few differences, but I’m still not sure what they mean. Here’s an example of some of the differences between the two lspci outputs (for the problematic SATA controller):
Even after booting into Linux, the internal SSD was not accessible, and ‘dmesg’ was showing errors initializing the SATA controller.
SeaBIOS was sometimes seeing the M.2 SSD (but was never able to boot from it):
Sometimes, it wouldn’t see the M.2 SSD at all:
…and sometimes, it would just show garbage:
However, it had no issues detecting and booting from the USB stick, so I had an idea: I installed a 2.5″ HDD into my Librem 13 and tried that. It was immediately detected by the PureOS liveUSB. So I installed PureOS on the HDD, and rebooted. While SeaBIOS still didn’t detect the SSD, it detected the 2.5″ HDD and was able to boot flawlessly from it. There was still no SSD detection even with PureOS fully booted from the HDD, however, and dmesg still complained about various SATA initialization issues.
I took the opportunity to test the wifi, video card, speakers, and everything seemed to work. I then booted into MemTest86+ and tested the RAM overnight. There were no errors after more than 17 hours of RAM testing.
As I booted Linux again, I noticed the ME PCI device wasn’t in the lspci output, so I wondered if I had somehow messed up the ME partition. I left the computer running for a couple of hours to make sure it wouldn’t shut down (due to the ME watchdog), and then I noticed something weird: I suddenly had a /dev/sdb* set of devices. The output of ‘dmesg’ showed that the kernel had somehow managed to detect the M.2 SSD, and I was now able to access it.
So I did a few more tests, and it seems that after a while (30 minutes to an hour), the M.2 SSD connector will suddenly start responding and Linux will be able to initialize it and detect/access the SSD. It also seems that suspending/resuming the laptop helps trigger the M.2 initialization much faster. I still have no idea why this happens. And once, it managed to initialize the SSD after only 3 seconds instead of the usual 30 minutes, as you can see in this ‘dmesg’ output here:
I have now started reading up on the PCI configuration space in order to understand the differences in the lspci output and hopefully fix the M.2 issues. My current theory is that, since the PCI subsystem ID is different with the vendor BIOS than with coreboot, it’s possible that the subsystem ID somehow tells SeaBIOS/Linux that this specific SATA controller has a quirk that changes the initialization timings. This is only a wild guess for the time being; hopefully in the next few weeks I’ll understand enough about the way PCI initialization works to be able to figure out what goes wrong.
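For readers who want to look at the same fields, the subsystem IDs sit at fixed offsets in a standard (type 0) PCI configuration header: vendor ID at 0x00, device ID at 0x02, subsystem vendor ID at 0x2C and subsystem ID at 0x2E, all little-endian 16-bit values. Here is a minimal sketch that decodes them from the raw bytes Linux exposes in sysfs. The slot used in the example path (00:1f.2) is only an assumption about where the SATA controller sits, and the function name is hypothetical.

```python
import os
import struct


def parse_pci_ids(config_bytes):
    """Extract vendor/device and subsystem IDs from a type 0 PCI config header."""
    if len(config_bytes) < 0x30:
        raise ValueError("need at least 48 bytes of configuration space")
    # The low 7 bits of the header type byte (offset 0x0E) must be 0 for an endpoint.
    if config_bytes[0x0E] & 0x7F != 0:
        raise ValueError("not a type 0 header; subsystem IDs live elsewhere")
    vendor, device = struct.unpack_from("<HH", config_bytes, 0x00)
    sub_vendor, sub_device = struct.unpack_from("<HH", config_bytes, 0x2C)
    return {
        "vendor": vendor,
        "device": device,
        "subsystem_vendor": sub_vendor,
        "subsystem_device": sub_device,
    }


if __name__ == "__main__":
    # Assumed slot of the SATA controller; adjust to match your own lspci output.
    path = "/sys/bus/pci/devices/0000:00:1f.2/config"
    if os.path.exists(path):
        with open(path, "rb") as f:
            ids = parse_pci_ids(f.read(64))
        print({name: f"{value:04x}" for name, value in ids.items()})
```

Running this once under the vendor BIOS and once under coreboot would show whether only the subsystem fields differ for that device.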
My current status is that PureOS boots and is perfectly usable, however the M.2 controller doesn’t work reliably. Also, the MEI PCI device as well as the USB EHCI device have disappeared from the ‘lspci’ output (both USB ports are working though). The lspci output is also different for most of the other devices when compared to the original BIOS.
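Spotting devices that have disappeared is easy to automate. The sketch below takes two plain `lspci` dumps (one saved under the vendor BIOS, one under coreboot) and lists the slots that are present in the first but not the second. The two sample dumps are made up for illustration, including the slot numbers and descriptions, and the function names are hypothetical.

```python
def devices_by_slot(lspci_output):
    """Map each PCI slot (e.g. '00:1f.2') to the rest of its lspci line."""
    devices = {}
    for line in lspci_output.splitlines():
        slot, _, description = line.strip().partition(" ")
        if slot:  # skip blank lines
            devices[slot] = description
    return devices


def missing_devices(vendor_bios_output, coreboot_output):
    """Return the devices listed under the vendor BIOS but absent under coreboot."""
    before = devices_by_slot(vendor_bios_output)
    after = devices_by_slot(coreboot_output)
    return {slot: desc for slot, desc in before.items() if slot not in after}


# Made-up sample dumps, only to show the expected input format.
VENDOR_SAMPLE = """\
00:02.0 VGA compatible controller: Example graphics
00:16.0 Communication controller: Example MEI
00:1f.2 SATA controller: Example SATA
"""
COREBOOT_SAMPLE = """\
00:02.0 VGA compatible controller: Example graphics
00:1f.2 SATA controller: Example SATA
"""

if __name__ == "__main__":
    for slot, desc in missing_devices(VENDOR_SAMPLE, COREBOOT_SAMPLE).items():
        print(slot, desc)
```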
One other thing worth mentioning is that I have stopped using the IC clip already. Since I am able to boot into Linux with coreboot, I can now use flashrom to flash the BIOS directly from Linux and I’ve used it to do my BIOS updates while testing in the last few days. This is great, because not only does it speed up development, but it also confirms/tests the process that existing Librem 13 owners will go through to update their laptops to coreboot.
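As a rough illustration of that workflow (not necessarily the exact commands I ran), here is a sketch that backs up the current flash contents and then writes a new image using flashrom’s internal programmer. It relies on the standard `-p internal`, `-r` and `-w` flashrom options; the file names are placeholders, the function names are hypothetical, and by default it only prints the commands instead of running them.

```python
import subprocess


def flashrom_commands(new_image, backup_path):
    """Build the argv lists: read a backup first, then write the new image."""
    return [
        ["flashrom", "-p", "internal", "-r", backup_path],
        ["flashrom", "-p", "internal", "-w", new_image],
    ]


def flash(new_image, backup_path, dry_run=True):
    """Print the commands (dry run) or execute them in order, stopping on failure."""
    commands = flashrom_commands(new_image, backup_path)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # Needs root privileges; check=True aborts before writing if the backup fails.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder file names; nothing is written unless dry_run=False.
    flash("coreboot.rom", "backup.rom")
```

Reading a backup before writing means a failed read stops the sequence before anything on the chip is changed.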
Here is the Acceptance Test Matrix that I mentioned in my previous article, which I found in an old post on the coreboot blog. Items marked [OK] are the ones I have had time to test and confirm as working; the unmarked items are either not tested yet or known not to work:
- [OK] Cold boot: memory controller works.
- [OK] Cold boot: all installed DRAM is online.
- [OK] Cold boot: graphics controller works.
- Cold boot: SATA controller succeeds.
- Cold boot: EC controller responds ok to init code.
- [OK] Cold boot: LCD backlight turns on.
- [OK] Cold boot: linux boots ok in text mode.
- [OK] Cold boot: linux boots ok in framebuffer (boot splash) mode.
- [OK] Cold boot: X initializes the LCD at full native resolution.
- [OK] Cold boot: X enables hardware acceleration.
- Boot time: Cold boot to grub succeeds in less than a set timeout.
- Boot time: Reboot from linux back to linux succeeds in less than a set timeout.
- Boot time: Power down succeeds in less than a set timeout.
- [OK] SeaBIOS test: keyboard works.
- [OK] Grub test: keyboard works.
- [OK] Grub test: text mode and framebuffer graphics work.
- [OK] Cold boot to USB linux succeeds. (We plan to use SeaBIOS for boot device selection, barring major bugs.)
- [OK] Reboot to USB linux succeeds.
- [OK] EC test: fan spins.
- [OK] EC test: holding power for >5 seconds forces a power down.
- [OK] ACPI test: lid switch works.
- [OK] ACPI test: power button event received ok.
- [OK] ACPI test: AC power on/off event received ok.
- [OK] ACPI+EC+battery test: battery percentage works.
- Media keys on keyboard work in linux.
- Device tests: internal mic, internal speakers, webcam, webcam mic, wifi, bluetooth, hard drive, SSD, SD card, each USB port, headphone jack.
- prime95 (one instance bound to each hyperthread) for a fixed time to test CPU thermal management.
- glxgears for a fixed time to test GPU thermal management.
- During prime95 test, CPU digital thermal sensor should give reasonable results.
- [OK] Linux suspend ok.
- [OK] LCD backlight adjustable in linux.
- [OK] Linux kernel boot messages should not contain too many errors. (Only the SATA errors are appearing.)
As you can see, we have at least 22 out of 32 items that are considered tested and done, which means we’re at least two-thirds there. Most of the other items are probably working as well; I just haven’t had time to test them yet.
I hope to have the M.2 issues fixed within the next couple of weeks. Then, after making sure it is perfectly safe to flash coreboot to any Librem 13, we’ll probably release a beta image for people to test (it will come with plenty of disclaimers though!). After that, I’ll work on disabling the Intel ME (first by using the me_cleaner tool, then testing if it works as expected).
We’ll keep you posted on the progress.