Solving the QSPI NOR Flash issue
The following article gives a real-life example of the pitfalls that developers face almost every day and shows how problems can be solved through close cooperation between software and hardware developers.
It all started with a customer-specific design that incorporated a CPU which had not been used before. There was therefore almost no know-how about that CPU available at emtrion from former projects. Nevertheless, the project was almost finished within the given time, and hardware and software were working as demanded by the customer.
The only remaining issue was the startup time of the system. As is often the case, the customer complained that it took too long for the system to be running after power-up. To reduce component costs, a single QSPI NOR Flash chip was used to store all software, including the Linux operating system.
The CPU incorporates a so-called “first-level bootloader” which starts fetching code from the QSPI NOR Flash. This is done in 4-bit-wide QSPI mode at a clock of 41.5 MHz. It was assumed that the startup time could be reduced by increasing the clock of the QSPI NOR Flash after the first-level bootloader has finished.
The CPU's data sheet does not directly specify the maximum clock rate of the QSPI interface. But one can find that the QSPI clock is derived from the internal peripheral clock by a programmable divider. Therefore the only valid values above the default of 41.5 MHz are 83 MHz and 166 MHz. The QSPI NOR Flash used here also supports a clock rate of up to 166 MHz. That looks good.
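The divider constraint can be illustrated with a short Python sketch. The 166 MHz peripheral clock and the divider values 1, 2 and 4 are assumptions chosen to reproduce the three clock rates named above; they are not taken from the CPU's data sheet.

```python
# Assumed values, not taken from the data sheet: a 166 MHz peripheral
# clock and dividers 1, 2 and 4 reproduce the rates named in the text.
PERIPHERAL_CLOCK_MHZ = 166.0
ASSUMED_DIVIDERS = (1, 2, 4)
DEFAULT_QSPI_CLOCK_MHZ = 41.5


def selectable_clocks_mhz(source_mhz, dividers):
    """Return every QSPI clock reachable by dividing the source clock."""
    return sorted(source_mhz / d for d in dividers)


def faster_than_default(clocks_mhz, default_mhz):
    """Keep only the rates above the boot-time default."""
    return [f for f in clocks_mhz if f > default_mhz]


clocks = selectable_clocks_mhz(PERIPHERAL_CLOCK_MHZ, ASSUMED_DIVIDERS)
candidates = faster_than_default(clocks, DEFAULT_QSPI_CLOCK_MHZ)
print(candidates)
```

With these assumed inputs, only 83 MHz and 166 MHz remain as candidates for a faster setting.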
But besides the clock rate setting, the maximum frequency is also limited by the timing behavior of the chips. One can find that the data valid delay of the Flash is at least 5 ns, even at minimum capacitive load, and that the CPU demands a minimum data setup time of 1.4 ns. Both must fit into one clock period, so the period cannot be shorter than 6.4 ns. This results in a maximum clock rate of about 156.2 MHz, which is below 166 MHz. Therefore the maximum clock rate that can be used with the given components is 83 MHz.
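The calculation can be reproduced with a few lines of Python. This is a minimal sketch that uses only the two data sheet values quoted above; the function names are illustrative, and the model deliberately ignores edge rates and board delays.

```python
FLASH_DATA_VALID_NS = 5.0   # flash data valid delay (from the text)
CPU_DATA_SETUP_NS = 1.4     # CPU data setup requirement (from the text)


def max_clock_mhz(data_valid_ns, setup_ns):
    """Highest clock whose period still covers output delay plus setup."""
    min_period_ns = data_valid_ns + setup_ns
    return 1000.0 / min_period_ns   # 1 / ns is GHz, so scale to MHz


def fastest_usable(candidates_mhz, limit_mhz):
    """Pick the highest selectable rate that does not exceed the limit."""
    usable = [f for f in candidates_mhz if f <= limit_mhz]
    return max(usable) if usable else None


limit = max_clock_mhz(FLASH_DATA_VALID_NS, CPU_DATA_SETUP_NS)
choice = fastest_usable([41.5, 83.0, 166.0], limit)
print(f"limit {limit:.2f} MHz, chosen {choice} MHz")
```

The limit comes out at roughly 156 MHz, so 166 MHz is excluded and 83 MHz is the fastest selectable rate.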
After the correct maximum clock rate had been calculated, the clock setting was easily changed in software.
But after the frequency change the NOR Flash failed to work. It was found that the ROM loader still started at 41.5 MHz, but the software failed after switching the clock to 83 MHz. Keeping the clock rate at 41.5 MHz still worked.
Many people were convinced that something must be wrong with the hardware, because the 83 MHz clock had been carefully calculated before. The customer raised a couple of ideas taken from QSPI NOR Flash application notes and other sources of knowledge. Measures like equal trace lengths for all signal lines and impedance control to 60 Ω were demanded, similar to the DDR RAM interface. Problems with power supply decoupling were also suspected.
To distinguish between fact and fiction, the software was reduced to a simple loop that read the NOR Flash ID and signaled wrong results. Measuring the resulting signals showed that all the excitement was unnecessary and that the rumors about hardware problems were wrong.
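Such a test loop could look like the following Python sketch. It is not the original test code: `read_id` is a hypothetical callable standing in for the real QSPI driver, the expected ID bytes are placeholders, and the use of the standard JEDEC Read Identification opcode 0x9F is an assumption, since the article does not name the command. A fake reader replaces the hardware so that the example runs anywhere.

```python
JEDEC_READ_ID = 0x9F  # standard JEDEC "Read Identification" opcode


def count_id_failures(read_id, expected_id, iterations):
    """Read the flash ID repeatedly and count the reads that mismatch.

    read_id is a hypothetical callable that sends the given opcode over
    QSPI and returns the ID bytes; it stands in for the real driver.
    """
    failures = 0
    for _ in range(iterations):
        if read_id(JEDEC_READ_ID) != expected_id:
            failures += 1   # on the target this would toggle a pin or LED
    return failures


# Stand-in for the hardware: every fourth read returns a corrupted ID.
EXPECTED_ID = bytes([0x01, 0x02, 0x03])   # placeholder, not a real device ID
_calls = 0


def fake_read_id(opcode):
    global _calls
    _calls += 1
    return bytes([0x01, 0x02, 0xFF]) if _calls % 4 == 0 else EXPECTED_ID


failures = count_id_failures(fake_read_id, EXPECTED_ID, 100)
print(failures)
```

On real hardware, signaling each mismatch on a spare pin gives the scope a trigger, so the faulty access can be captured directly.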
The scope showed the following behavior of the QSPI clock (yellow) and a data signal from the NOR Flash (green):
One can see that the clock output of the CPU has rise and fall times of almost 5 ns, with reflections on the way up and down, while the data output of the NOR Flash rises and falls within 2 ns. These rise and fall times had not been considered in the theoretical calculation and cause massive problems with data setup times at a clock rate of 83 MHz.
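A rough Python calculation shows why the slow clock edges matter at 83 MHz but not at 41.5 MHz. It is only a sketch: it compares the measured 5 ns edge time with the length of one clock half-period and does not model thresholds or reflections.

```python
CLOCK_EDGE_NS = 5.0   # measured clock rise/fall time (from the text)


def period_ns(clock_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / clock_mhz


def edge_share_of_half_period(edge_ns, clock_mhz):
    """Fraction of one clock half-period spent in a single transition."""
    return edge_ns / (period_ns(clock_mhz) / 2.0)


share_83 = edge_share_of_half_period(CLOCK_EDGE_NS, 83.0)
share_41 = edge_share_of_half_period(CLOCK_EDGE_NS, 41.5)
print(f"{share_83:.0%} at 83 MHz, {share_41:.0%} at 41.5 MHz")
```

At 83 MHz a single edge takes up most of the half-period of about 6 ns, so the clock barely settles before the next transition begins; at 41.5 MHz the same edge uses less than half of it.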
At first glance there was no solution to the problem in sight. But someone remembered that the PIO pins of the CPU can be programmed with three different drive strengths: low, medium and high. The default setting is low. Since the QSPI pins can also be multiplexed to become PIO pins, would it help to increase the drive strength of the QSPI interface as well?
It did! Programming the corresponding PIO pins to medium drive strength halved the rise and fall times, as visible in the next picture. Increasing the drive strength further to high made no sense in our case. It only increased signal overshoot while reducing the switching times only minimally.
After changing the drive strength, reading the NOR Flash ID worked well as long as the scope was connected. Without the scope, reading the ID still failed. It turned out that some other timings within the access cycle also had to be adjusted: the default values that fit at 41.5 MHz have to be revised at 83 MHz. The last picture shows that the chip select (blue) becomes active about 2.6 ns before the rising clock edge (yellow). This is a bit too short for the NOR Flash, which demands more than 3 ns. Since the scope is connected, the access works anyway and the green signal is the correct answer.
The load characteristics of the scope’s probe somehow seem to influence the timing slightly so that it works while the scope is connected.
Increasing the chip select setup time by one clock finally solved all problems. The customer is happy with the improvement achieved without any hardware changes.
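The effect of the extra clock on the chip select timing can be estimated in Python. This is a sketch under the assumption that "one clock" adds one full 83 MHz clock period to the measured 2.6 ns; the exact behavior of the CPU's timing setting is not described here.

```python
CS_SETUP_MEASURED_NS = 2.6   # chip select to rising clock edge (from the text)
CS_SETUP_REQUIRED_NS = 3.0   # flash requirement, more than 3 ns in the text
CLOCK_MHZ = 83.0


def cs_setup_ns(base_ns, extra_clocks, clock_mhz):
    """Chip select setup time after adding whole clock periods of delay."""
    return base_ns + extra_clocks * 1000.0 / clock_mhz


before = cs_setup_ns(CS_SETUP_MEASURED_NS, 0, CLOCK_MHZ)
after = cs_setup_ns(CS_SETUP_MEASURED_NS, 1, CLOCK_MHZ)
print(f"margin before: {before - CS_SETUP_REQUIRED_NS:+.1f} ns")
print(f"margin after:  {after - CS_SETUP_REQUIRED_NS:+.1f} ns")
```

Under this assumption the setup time grows from 2.6 ns to roughly 14.6 ns, which turns a small violation into a comfortable margin.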
The way the QSPI NOR Flash performance issue was solved, by theoretical considerations first and practical measurement and verification afterwards, shows that hardware and software developers often have to work closely together in cases of trouble. Only by combined efforts is it possible to detect the real cause of problems and solve them.