Let me tell you how I made NVMe SSD support work on the first generation Librem laptops. This story is pretty old, from before the Librem 13 version 2 was even released, so it has been simplified and updated to reflect the current state of things as much as possible. The solutions presented here were implemented a long time ago in our coreboot ports, but the technical insights you may derive from this post today should prove interesting nonetheless.

During internal beta testing of the install script a while ago, we realized that coreboot didn’t work with our NVMe SSDs, as all my testing had been done with a SATA M.2 SSD. I spent some time fixing coreboot so that it would initialize the NVMe SSD, fixing SeaBIOS so that it could boot from the NVMe drive, and then figuring out how to fix the NVMe issues I was having after Linux boots.

The story began with my blog post about the interference of the AMI BIOS with coreboot. What I didn’t mention back then is that after I figured out the issue and managed to unbrick François’ Librem, he wasn’t able to boot from his SSD under coreboot because the drive wasn’t being detected. I then realized that he had an NVMe SSD and not a SATA SSD.

  • A SATA drive is controlled by the SATA controller on the motherboard, which talks to the drive over 4 data lines.
  • An NVMe drive is actually a PCIe device all on its own, which can use up to 16 data lines (4 per lane, and the M.2 specification defines up to 4 lanes per device).

If a SATA device is detected, the integrated SATA controller talks to the drive using the SATA protocol. If a PCIe device is detected, the device is initialized like any other PCIe device (the Wi-Fi module, for example), and in the case of NVMe drives, the NVMe protocol is then used directly to communicate with the device, without passing through an onboard controller.
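
To put this in concrete terms (an illustrative snippet, not code from coreboot or Linux): once the root port is up, an NVMe SSD enumerates like any other PCIe function, and it is its class code that identifies it as an NVM Express controller.

/* Illustrative only: an NVMe SSD appears in the PCI bus scan as a regular
 * function whose class code register (config offset 0x08) reads base class
 * 0x01 (mass storage), subclass 0x08 (non-volatile memory), prog-if 0x02 (NVMe). */
#include <stdbool.h>
#include <stdint.h>

static bool is_nvme_function(uint32_t class_code_reg)
{
    /* [31:24] base class, [23:16] subclass, [15:8] prog-if, [7:0] revision */
    return (class_code_reg >> 8) == 0x010802;
}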

With that knowledge, I figured out my simple mistake: the PCIe port used for the M.2 connector is Port #6, a “Flexible I/O” port that can be used for either SATA or PCIe according to the Intel Broadwell datasheet. Unfortunately, in the Librem 13 coreboot configuration, PCIe Port #6 was disabled (it had never been needed, but only because I had only ever tried a SATA drive). So the fix was simple: enable PCIe port #6, and once coreboot initializes that PCIe port, the NVMe drive is initialized and working.

François tested this for me and confirmed he could boot from his drive. I still needed to do my own tests, however, so I ordered an NVMe drive (the Intel 600p Series SSD). Before I received it, a regular Librem user, jsparber, found my script and decided to test it. With a lot of courage and determination, he became the first non-Purism-employee beta tester of the coreboot install script, and unfortunately, it didn’t work for him. While the coreboot install was fine, SeaBIOS wasn’t detecting his NVMe drive (he could boot into a live USB and flash back his factory BIOS, so nothing to be alarmed about). I didn’t know why it worked for François but not for jsparber, our volunteer beta tester. I then realized that SeaBIOS itself didn’t have NVMe support, or more precisely, the NVMe support that had been added to SeaBIOS had never been tested outside of the qemu emulator and was actually disabled for real hardware. So I enabled NVMe support for non-qemu hardware and sent the updated image to our beta tester, who confirmed it was working.

Why was it working for François if SeaBIOS didn’t have NVMe support, though? That’s a bit mysterious, but I think his specific NVMe drive had some sort of SATA-compatibility mode in order to allow booting from older BIOSes that don’t support NVMe devices.

Once I received my own NVMe SSD, I thought getting it to work would be a mere formality, and indeed it was detected by SeaBIOS, but I couldn’t boot from it because it was still blank, so I tried to install PureOS on it. Unfortunately, that failed: I was getting an error halfway through the installation and the NVMe device was disappearing completely. My dmesg output had (among a flood of I/O errors):

nvme 0000:04:00.0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
nvme 0000:04:00.0: Refused to change power state, currently in D3
nvme nvme0: Removing after probe failure status: -19
nvme0n1: detected capacity change from 256060514304 to 0

After searching for a long time, I found a few mentions of this error. (The CSTS=0xffffffff and PCI_STATUS=0xffff values typically mean that reads from the device were returning all-ones, in other words the drive had effectively vanished from the PCI bus.) At first, I thought that this Launchpad bug was the one affecting me, but it was reported as fixed in kernels 4.4.0 and 4.8, and the PureOS live USB was sporting kernel 4.7 back then, so I thought that maybe I needed an even higher kernel version. I tried to install Ubuntu 17.04, but it wouldn’t even boot, and the Fedora 25 live USB had the same issue even though it had kernel 4.9.6. I decided to try the Arch Linux installer, which had the 4.10.6 kernel, and I had the same problems. Then I found this other bug which said the issue might appear because kernel 4.10.0 added support for APST (Autonomous Power State Transitions) and some drives have quirks that make them fail when APST is enabled. The fix there was simple, a kernel option at boot and the bug should disappear, so I tried it, but no luck.

At that point, in order to remove coreboot from the equation, I had flashed back an AMI BIOS on my Librem, but I was still getting all these issues.

I gave up after a while, and reasoned that if it wasn’t a kernel issue, maybe it wasn’t a Linux issue at all, so I tried installing Windows on the NVMe drive, and the same issue happened again! “Well, if the problem happens with Windows and the factory BIOS, then there’s nothing I can do, the problem is with the drive itself, it’s defective! Right?”

While I was trying to find the original packaging to ship it back for a replacement, I had an idea: I put the NVMe drive in the Librem 13 v2 prototype that I had, and it worked! So I figured that the problem was with my own Librem 13 v1, which might have had defective hardware (maybe a scratch on the motherboard or something?).

However, for a week, while working on other things, I kept thinking that there must be something else I could do, that the issue couldn’t be as simple as “it’s a hardware issue”, but I didn’t know what more I could try if Windows plus the AMI BIOS were failing and the SSD itself was fine.

Then François told me that he was having issues with his Librem, where the NVMe device would “disappear”. This looked a lot like the problem I had been having, but for him it wouldn’t happen within the first 5 minutes of use as it did for me; it would happen after 48 hours instead, sometimes after putting the laptop to sleep, sometimes not, very unpredictably. Unfortunately, he had also reinstalled his system at the same time as he flashed coreboot, so this new problem could have been coming from coreboot or from the new OS he installed. But lo and behold, after he flashed his original BIOS back, he was still having the same issue of his NVMe drive disappearing.

Since I don’t believe in coincidences, I decided to start my research again from scratch, forgetting all the various links, explanations and datasheets I had found, and to look at the problem from a blank slate. When I searched again for the error I was getting, I found a post on the Lenovo forums where someone was complaining about the same issue on their ThinkPad X270, and the thread was marked as “SOLVED”, so that was very promising. After reading through it, I found that the solution was a new BIOS update for the Lenovo X270 that fixed the problem. And when I looked at the changelog of that update, this is what it said about NVMe support:

- (New) Disable NVMe L1.2
- (New) Disable NVMe CLKREQ

Now that was interesting… what were this “L1.2” and this “CLKREQ”? I did some more research and found an article explaining that L1.2 is simply a lower-power mode of operation for a PCIe device. Going back to the original dmesg output, I then realized that it said something very interesting about the drive and its power state:

Refused to change power state, currently in D3

According to this MSDN article, the “D3” there refers to a device power state, and more precisely, D3 is the lowest power state of the device. That seems to coincide with the L1.2 PCIe state, which is also the lowest power state, so I decided to do what Lenovo did and disable CLKREQ and L1.2 on the PCIe device. CLKREQ seems to be used by the CPU or by the device to request activation of the clock and to allow exiting L1.2; digging into the PCI Express specification material, I found a document that states:

“The CLKREQ# signal is also used by the L1 PM Substates mechanism. In this case, CLKREQ# can be asserted by either the system or the device to initiate an L1 exit”

The “L1 PM Substates” refers to that L1.2 state (L1.1 and L1.2 are collectively referred to as the L1 substates, and “PM” here means “Power Management”). So my theory was that the drive goes into low power mode and, when it needs to get out of it, CLKREQ should be used but wasn’t working, causing the drive to never know that it needs to wake up. Disabling CLKREQ would fix it because some other mechanism would be used to wake the drive, and disabling L1.2 would also fix it because the drive would never go into that D3 low power mode in the first place.
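
For reference, here is the ladder of device power states that the “D3” above belongs to, written as an illustrative enumeration rather than any project’s actual definitions:

/* Illustrative only: PCI device power states as defined by the PCI PM spec.
 * D0 is fully operational, D1/D2 are optional intermediate states, and
 * D3 is the deepest, lowest-power state a device can be put into. */
enum pci_device_power_state {
    DEV_D0 = 0,   /* fully on, processing commands          */
    DEV_D1,       /* light sleep (optional)                 */
    DEV_D2,       /* deeper sleep (optional)                */
    DEV_D3HOT,    /* lowest power with main power present   */
    DEV_D3COLD    /* main power removed                     */
};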

I looked extensively at the Broadwell LP datasheet and saw that the CLKREQ for PCIe port #6 is multiplexed with GPIO 23. Looking at the gpio.h in coreboot for the Librem 13, I saw GPIO 23 set as “INPUT”, while GPIO 18 and 19 (which are for PCIe ports #1 and #2) were set as “NATIVE”. So I set GPIO 23 to NATIVE and tried it, but this made the NVMe drive undetectable; coreboot was simply unable to detect anything on port #6, and I have no idea why. Not only do I not know what a “native” GPIO means, but I also don’t know why changing it from input to native would cause the PCI bus scan to fail.

Either way, I set it back to “INPUT” and tried to see how to disable CLKREQ from some PCI configuration instead. Unfortunately, the code in soc/intel/broadwell/pcie.c that mentions “CLKREQ” does things I couldn’t understand: it modifies PCI configuration values at offsets that don’t match anything in the datasheet, and I have no idea whether I was reading the datasheet incorrectly or the code is wrong somehow.

One simple example is this code snippet:

/* Per-Port CLKREQ# handling. */
if (gpio_is_native(18 + rp - 1))
    /*
     * In addition to D28Fx PCICFG 420h[30:29] = 11b,
     * set 420h[17] = 0b and 420[0] = 1b for L1 SubState.
     */
    pci_update_config32(dev, 0x420, ~0x20000, (3 << 29) | 1);
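
For those unfamiliar with coreboot’s helpers, pci_update_config32() performs a read-modify-write of a 32-bit configuration register, so the call above is roughly equivalent to the following (expanded here only for illustration):

/* Rough expansion of the pci_update_config32() call above, shown only to
 * make the bit manipulation explicit; this is not code from coreboot. */
u32 reg = pci_read_config32(dev, 0x420);
reg &= ~0x20000;        /* clear bit 17          -> 420h[17] = 0b                   */
reg |= (3 << 29) | 1;   /* set bits 30:29 and 0  -> 420h[30:29] = 11b, 420h[0] = 1b */
pci_write_config32(dev, 0x420, reg);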

First of all, it’s checking whether the GPIO is native (which is what I had done before, without success), and then it sets the PCI configuration at offset 0x420. But the only offset 0x420 I see in the datasheet (page 736) is the “PCI Express* Configuration Register 3”:

Bit    Description
31:1   Reserved
0      PEC3 Field 1—R/W. BIOS may set this bit to 1b

Possibly these 31 “Reserved” bits are only described in a confidential Intel document, but in any case I didn’t know what that code was doing, and I wouldn’t have known what to change to make it behave the way I wanted it to.

I eventually found that this low power mechanism is called “ASPM” (Active State Power Management), and the cbmem output from coreboot had a line that said “ASPM: enabled L1”, which didn’t match any string in that soc/intel/broadwell/pcie.c file. After searching for the “ASPM:” string, I found that there is code in device/pciexp_device.c which is what actually configures ASPM on the device!

The code in pciexp_device.c is rather straightforward, since it does this:

/* Check for and enable Common Clock */
if (IS_ENABLED(CONFIG_PCIEXP_COMMON_CLOCK))
    pciexp_enable_common_clock(root, root_cap, dev, cap);

/* Check if per port CLK req is supported by endpoint*/
if (IS_ENABLED(CONFIG_PCIEXP_CLK_PM))
    pciexp_enable_clock_power_pm(dev, cap);

/* Enable L1 Sub-State when both root port and endpoint support */
if (IS_ENABLED(CONFIG_PCIEXP_L1_SUB_STATE))
    pciexp_config_L1_sub_state(root, dev);

/* Check for and enable ASPM */
if (IS_ENABLED(CONFIG_PCIEXP_ASPM))
    pciexp_enable_aspm(root, root_cap, dev, cap);

Unfortunately, those L1_SUB_STATE and CLK_PM configs are force-enabled in coreboot’s menuconfig, so I couldn’t disable them there (I had already noticed them before but couldn’t turn them off). So I simply changed the code to remove the line that calls the pciexp_enable_clock_power_pm function and tested it. I could then see in the cbmem log that coreboot didn’t enable CLKREQ anymore, but the install was still failing, so I also removed the code that calls pciexp_config_L1_sub_state, tried again, and my installation was successful!
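
As a rough sketch, the experiment amounted to skipping those two calls in device/pciexp_device.c; the fix that was eventually merged upstream is not literally this, but it is the idea I was testing at the time:

/* Experimental hack in device/pciexp_device.c, for testing only:
 * skip per-port CLKREQ power management and the L1 sub-state setup so
 * the NVMe drive never enters the L1.2 / D3 state it cannot exit from. */

/* Check if per port CLK req is supported by endpoint*/
#if 0 /* disabled: CLKREQ# wake-up was not working on the Librem 13 v1 M.2 slot */
if (IS_ENABLED(CONFIG_PCIEXP_CLK_PM))
    pciexp_enable_clock_power_pm(dev, cap);
#endif

/* Enable L1 Sub-State when both root port and endpoint support */
#if 0 /* disabled: keeps the drive out of the L1.2 substate entirely */
if (IS_ENABLED(CONFIG_PCIEXP_L1_SUB_STATE))
    pciexp_config_L1_sub_state(root, dev);
#endif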

I had previously made around 50 installation attempts on that NVMe drive, and it would always crash between 50% and 80% of the way through the installation. With my new changes, I had now done 3 successive installs that went all the way to 100% without crashing a single time. This demonstrated that my changes worked: disabling CLKREQ and the L1.2 substate on the NVMe drive fixed the issue. My “fix” was obviously not the most elegant way of solving the problem, but I was now happy to report that we would be able to use NVMe drives.

Some users might be wondering whether not being able to put their NVMe drives into low power mode would affect battery life, and the answer is, “In theory yes”, but in practice the difference would be a very small percentage. Back then, I doubted anyone would actually notice it, and so far it seems nobody did, so it looks like the issue is fairly minor in the grand scheme of things.

Purism customers now had working NVMe support for their Librem laptops running coreboot, and this solved a big headache for our operations & support team (who had temporarily put on hold all NVMe-based orders because of the bug, favouring the SATA-based laptop configurations as they were more reliable at that time).

Interestingly, this also meant that we had a superior user experience to similar laptops with a proprietary BIOS: users now had NVMe drives working with coreboot, drives that had never worked with the AMI BIOS we compared against!

During my testing of the install script, I had also tweaked some of the coreboot options and we had coreboot booting in about 350 milliseconds, which is a lot faster than the few seconds the AMI BIOS took to boot.

My fix was merged into coreboot in July 2017, via these two patches and this patch to SeaBIOS.

Some additional notes…

One might wonder whether the problem could have been caused by an error in the design of the motherboard on the first-generation Librems, with the CLKREQ signal not being properly routed, but it doesn’t look that way according to the schematics, so I’m not entirely sure why it was happening after all. At least the fix was “simple” enough, and it worked on the Librem 13 v1 I had available to test.

Interestingly enough, François’ NVMe drive kept failing on his Librem 13 v1 after 2 to 3 days of use, even with my final fix. I was unable to figure out why that was still happening back then: why would his NVMe drive go into the D3 power state if coreboot wasn’t enabling the L1.2 substate anymore? We eventually tabled the matter for a while as François switched to the newly released Librem 13 “version 2” a few weeks later. The answer came to me completely by chance, a year or so later, as I was looking through some PCIe code and saw that the PCIe device itself can advertise “L1.2 support” even when it is not enabled, so maybe his Linux kernel was enabling L1.2 when it saw that the device “supported” it. Unfortunately, by then François wasn’t able to reproduce the issue on his NVMe drive anymore, even with his old laptop, so it was impossible to test our hypothesis. The question of why he had started having those issues “all of a sudden” back then (when he hadn’t encountered such issues before) shall remain a mystery!
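
To make that “supported but not enabled” distinction concrete, here is an illustrative snippet; the bit positions follow the PCIe L1 PM Substates extended capability, while the helper names are made up for this example:

/* Illustrative only: in the L1 PM Substates extended capability (ID 0x1E),
 * "supported" lives in the read-only Capabilities register (offset 0x04)
 * while "enabled" lives in the Control 1 register (offset 0x08), so an OS
 * can enable L1.2 on a device whose firmware left it merely advertised. */
#include <stdbool.h>
#include <stdint.h>

#define ASPM_L1_2_SUPPORTED (1u << 2)  /* bit 2 of L1 SS Capabilities */
#define ASPM_L1_2_ENABLED   (1u << 2)  /* bit 2 of L1 SS Control 1    */

static bool device_advertises_aspm_l1_2(uint32_t l1ss_cap_reg)
{
    return l1ss_cap_reg & ASPM_L1_2_SUPPORTED;
}

static bool aspm_l1_2_currently_enabled(uint32_t l1ss_ctl1_reg)
{
    return l1ss_ctl1_reg & ASPM_L1_2_ENABLED;
}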
