Google describes Lighthouse as an open-source, automated tool for improving the quality of web pages. It isn’t solely a performance tool, but it provides valuable feedback on a page’s performance. Achieving a top mobile performance score in Lighthouse is hard enough to make you question both yourself and the tool. So, let’s dive into Lighthouse to understand the reasons behind this.
Issue 1 – Non-linear scoring scale
Contrary to what you might expect, the performance score in Lighthouse doesn’t follow a linear scale. Each metric is mapped to a score along a curve (a log-normal distribution), so equal improvements in a metric don’t produce equal gains in score. For example, improving the score from 99 to 100 requires about the same metric improvement as moving from 90 to 94. The effort needed therefore depends on where you are on the curve, much like a runner’s effort differs when running downhill, on flat ground, or uphill. A different range or distribution could have made the scoring less surprising.
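The shape of such a curve can be sketched in a few lines. This is a minimal illustration, assuming a log-normal curve pinned by two control points: the metric value that scores 50 (`median`) and the value that scores 90 (`p10`). The function name and the control-point values are assumptions chosen for illustration, not taken from Lighthouse's source.

```python
import math
from statistics import NormalDist


def log_normal_score(value: float, median: float, p10: float) -> float:
    """Map a metric value (lower is better) to a score between 0 and 1.

    The curve is pinned so that `median` scores 0.5 and `p10` scores 0.9.
    Assumes median > p10 > 0.
    """
    if value <= 0:
        return 1.0
    # Spread of the underlying normal distribution in log space.
    sigma = (math.log(median) - math.log(p10)) / NormalDist().inv_cdf(0.9)
    curve = NormalDist(mu=math.log(median), sigma=sigma)
    # The score is the share of the distribution that is slower than `value`.
    return 1.0 - curve.cdf(math.log(value))


# Hypothetical control points for a load-time metric, in milliseconds.
MEDIAN_MS, P10_MS = 4000, 2500

# The same 500 ms improvement, applied at three different starting points.
gains = [
    log_normal_score(start - 500, MEDIAN_MS, P10_MS)
    - log_normal_score(start, MEDIAN_MS, P10_MS)
    for start in (6000, 4000, 2000)
]
```

With these illustrative numbers, the same 500 ms improvement is worth the most near the middle of the curve and the least near the top, which is why the last few points are so expensive.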
Issue 2 – Score variability
Running Lighthouse several times on the same website with the same setup can yield different results. This may seem odd at first, but the variability is often due to external factors such as changes in ads, internet traffic routing, device differences, browser extensions, or antivirus software. The variability can be significant, as a comparison between different hardware setups shows. To reduce its effect, run Lighthouse multiple times and be cautious about drawing conclusions from a single test.
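One way to act on this advice is to script several runs and look at the median and the spread rather than any single score. The sketch below assumes the Lighthouse CLI is installed and available on the PATH as `lighthouse`. The helper names `run_lighthouse` and `summarize` are hypothetical.

```python
import json
import statistics
import subprocess
from typing import Dict, Sequence


def run_lighthouse(url: str) -> float:
    """Run the Lighthouse CLI once and return the performance score (0-100)."""
    result = subprocess.run(
        [
            "lighthouse", url,
            "--output=json", "--output-path=stdout",
            "--only-categories=performance",
            "--quiet", "--chrome-flags=--headless",
        ],
        capture_output=True, text=True, check=True,
    )
    report = json.loads(result.stdout)
    score = report["categories"]["performance"]["score"]
    if score is None:
        # Lighthouse reports null when the category could not be scored.
        raise RuntimeError("Lighthouse did not produce a performance score")
    return score * 100


def summarize(scores: Sequence[float]) -> Dict[str, float]:
    """Summarize repeated runs so that one outlier does not drive decisions."""
    return {
        "median": statistics.median(scores),
        "min": min(scores),
        "max": max(scores),
        "spread": max(scores) - min(scores),
    }
```

A call such as `summarize([run_lighthouse(url) for _ in range(5)])` then gives a median to compare between builds and a spread that shows how noisy the environment is.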
Issue 3 – Majority of websites ranked as not good
When Google introduced the Core Web Vitals in 2020, it aimed to simplify the web performance rating. However, fewer websites were classified as “good” under the Core Web Vitals than under the overall performance score. This raised the question of whether web performance was truly poor or whether the bar was set too high. The thresholds for a “good” rating were based on human perception thresholds and relevant HCI research, which makes them challenging to meet.
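To make the bar concrete, here is a small sketch of how a page would be classified. The threshold values are not stated in this article. They are the boundaries Google published for the 2020 Core Web Vitals (LCP, FID, CLS), assessed at the 75th percentile of page loads. FID has since been replaced by INP. The function names are hypothetical.

```python
from typing import Dict

# (good, poor) boundaries: at or below `good` is "good", above `poor` is "poor".
THRESHOLDS = {
    "LCP": (2500, 4000),  # Largest Contentful Paint, milliseconds
    "FID": (100, 300),    # First Input Delay, milliseconds
    "CLS": (0.1, 0.25),   # Cumulative Layout Shift, unitless
}


def classify(metric: str, p75_value: float) -> str:
    """Classify one metric from its 75th-percentile value."""
    good, poor = THRESHOLDS[metric]
    if p75_value <= good:
        return "good"
    if p75_value <= poor:
        return "needs improvement"
    return "poor"


def passes_core_web_vitals(p75_values: Dict[str, float]) -> bool:
    """A page passes only if every one of its metrics is rated "good"."""
    return all(classify(m, v) == "good" for m, v in p75_values.items())
```

Because every metric has to be “good” at the same time, a single slow metric is enough to fail the assessment.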
Issue 4 – Lab data vs. field data
Lighthouse is a lab-based tool: it collects data in a controlled environment with predefined settings. This makes tests easier to reproduce, but it doesn’t accurately capture real-world bottlenecks. Field data, on the other hand, is collected from real users and gives a more realistic picture. PageSpeed Insights combines lab data from Lighthouse with field data from the Chrome User Experience Report (CrUX) dataset. However, not all websites have enough field data available for accurate analysis.
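The split between lab and field data is visible in a PageSpeed Insights API response. The sketch below assumes the response has already been fetched and parsed into a dictionary, and that it follows the v5 layout, with lab results under `lighthouseResult` and field data under `loadingExperience`. The function name is hypothetical, and the field part is treated as optional because it may be missing.

```python
from typing import Any, Dict


def lab_and_field_lcp(psi_response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the lab score and the field LCP out of a parsed PSI response."""
    # Lab data: the Lighthouse performance score, reported on a 0-1 scale.
    lab_score = psi_response["lighthouseResult"]["categories"]["performance"]["score"]

    # Field data: real-user measurements, absent for sites without enough data.
    field_metrics = psi_response.get("loadingExperience", {}).get("metrics") or {}
    field_lcp = field_metrics.get("LARGEST_CONTENTFUL_PAINT_MS", {})

    return {
        "lab_performance_score": None if lab_score is None else round(lab_score * 100),
        "field_lcp_p75_ms": field_lcp.get("percentile"),
        "field_lcp_category": field_lcp.get("category"),
        "has_field_data": bool(field_metrics),
    }
```

Checking `has_field_data` before comparing the two numbers avoids treating a lab-only result as if it reflected real users.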
Issue 5 – Mobile or Desktop
The Lighthouse report doesn’t clearly indicate whether its results are for mobile or desktop. Mobile runs are tested under stricter conditions (a throttled network and CPU) and scored against different thresholds, so comparing a mobile score with a desktop score can cause confusion and misrepresentation. Adding a visual indicator to distinguish mobile from desktop results has been discussed but hasn’t been implemented in Chrome DevTools yet.
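Until the report shows this visually, the form factor can be read from the JSON report itself. The sketch below assumes a Lighthouse JSON report parsed into a dictionary, with the setting stored under `configSettings.formFactor` (or `emulatedFormFactor` in older reports). Both function names are hypothetical.

```python
from typing import Any, Dict


def form_factor(report: Dict[str, Any]) -> str:
    """Return "mobile", "desktop", or "unknown" for a parsed Lighthouse report."""
    settings = report.get("configSettings", {})
    # Newer reports use `formFactor`; older ones used `emulatedFormFactor`.
    return settings.get("formFactor") or settings.get("emulatedFormFactor") or "unknown"


def comparable(report_a: Dict[str, Any], report_b: Dict[str, Any]) -> bool:
    """Two scores should only be compared if both runs used the same form factor."""
    a, b = form_factor(report_a), form_factor(report_b)
    return a != "unknown" and a == b
```

A guard like `comparable` makes it harder to set a desktop score next to a mobile one by accident.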
Issue 6 – Pursuit of perfect scores
Many people strive to achieve near-perfect performance scores, but the high thresholds set by Lighthouse can make it difficult for certain types of websites and web applications. There is no differentiation between demanding e-commerce platforms, web applications like Google Docs, or personal websites. This has led to discussions about the achievability of a perfect score and whether Lighthouse is suitable for comparing sites in this manner.
In conclusion, measuring web performance is complex due to the diverse nature of web-based products and the challenge of defining what constitutes good performance. Lighthouse, while a valuable tool, has its limitations and can be misleading. Google’s efforts to define good performance through metrics and tools like Lighthouse and the Core Web Vitals have had an impact, but there are still areas where clarity and focus can be improved. It is recommended to use Lighthouse primarily to understand how Google classifies your website and to consider other tools for performance debugging.