Saturday, October 18, 2008

Windows 7 Feature #3: Quick Boot

From MSDN Blog http://blogs.msdn.com/e7/archive/2008/08/29/boot-performance.aspx

For Windows 7, we have a dedicated team focused on startup performance, but in reality the effort extends across the entire Windows division and beyond. Our many hardware and software partners are working closely with us and can rightly be considered an extension to the team.

Startup can be one of three experiences: boot, resume from sleep, or resume from hibernate. Although resume from sleep is the default, and often takes 2 to 5 seconds on common hardware with standard software loads, this post is primarily about boot, as that experience has been commented on frequently. For Windows 7, a top goal is to significantly increase the number of systems that experience very good boot times. In the lab, a very good system is one that boots in under 15 seconds.

For a PC to boot fast a number of tasks need to be performed efficiently and with a high degree of parallelism.

  • Files must be read into memory.
  • System services need to be initialized.
  • Devices need to be identified and started.
  • The user’s credentials need to be authenticated for login.
  • The desktop needs to be constructed and displayed.
  • Startup applications need to be launched.

Because systems and configurations differ, boot times can vary significantly. This is verified by many lab results, but can also be seen in independent analysis, such as that conducted by Ed Bott. Sample data from Ed’s population of systems found that only 35% of boots took less than 30 seconds to give control to the user. Though Ed’s data is from a small population, it is nicely in line with what we’re observing. Windows Vista SP1 data (below) also indicates that roughly 35% of systems boot in 30 seconds or less, and 75% of systems boot in 50 seconds or less. The Vista SP1 data is real-world telemetry data. It comes to us from the very large number of systems (millions) where users have chosen to send anonymous data to Microsoft via the Customer Experience Improvement Program.

[Figure: Histogram distribution of boot times for Vista SP1, as reported through the Microsoft Customer Experience Improvement Program. The paragraph above summarizes the data presented.]
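To make the "35% in 30 seconds or less, 75% in 50 seconds or less" reading of the histogram concrete, here is a minimal sketch of how such shares can be computed from a list of boot times. The `sample` values are invented for illustration and chosen only so that they reproduce the two percentages quoted above; they are not Microsoft's telemetry, and `share_at_or_below` is a hypothetical helper name.

```python
from bisect import bisect_right


def share_at_or_below(boot_times_s, threshold_s):
    """Return the fraction of boots that finished in threshold_s seconds or less."""
    ordered = sorted(boot_times_s)
    # bisect_right counts how many values are <= threshold_s in the sorted list.
    return bisect_right(ordered, threshold_s) / len(ordered)


# Hypothetical boot times in seconds (illustrative only, not real telemetry).
sample = [18, 22, 25, 27, 28, 29, 30, 33, 36, 38,
          41, 44, 46, 48, 50, 55, 62, 75, 90, 140]

for limit_s in (30, 50, 60):
    print(f"boots at or under {limit_s}s: {share_at_or_below(sample, limit_s):.0%}")
```

The long tail in the made-up sample (90 and 140 seconds) stands in for the systems with very long boot times; it barely moves the 30 and 50 second shares, but it would dominate an average.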

From our perspective, too few systems consistently boot fast enough and we have to do much better. Obviously the systems that are greater than 60 seconds have something we need to dramatically improve—whether these are devices, networking, or software issues. As you can see there are some systems experiencing very long boot times. One of the things we see in the PC space is this variability of performance—variability arises from the variety of choices, and also the variety of quality of components of any given PC experience. There are also some system maintenance tasks that can contribute to long boot times. If a user opts to install a large software update, the actual updating of the system may occur during the next boot. Our metrics will capture these and unfortunately they can take minutes to complete. Regardless of the cause, a big part of the work we need to do as members of the PC ecosystem is address long boot times.

In both Ed’s sample and our telemetry data, boot time is meant to reflect when a machine is ready and responsive for the user. It includes logging in to the system and getting to a usable desktop. It is not a perfect metric, but one that does capture the vast majority of issues. On Windows 7 and Vista systems, the metric is captured automatically and stored in the system event log. Ed’s article covers this in depth.

We realize there are other events that users perceive as marking boot time, such as when the disk stops, when their apps are fully responsive, or when the start menu and desktop can be used. Also, “Post Boot” time (when applications in the Startup group run and some delayed services execute), the period before Windows boot is initiated, and BIOS time can be significant. In our efforts, we’ve not lost sight of what users consider representative of boot.

Before discussing some of our Windows 7 efforts, we’d like to point out there is considerable engagement with our partners underway. In scanning dozens of systems, we’ve found plenty of opportunity for improvement and have made changes. Illustrating that, please consider the following data taken from a real system. As the system arrived to us, the off-the-shelf configuration had a ~45 second boot time. Performing a clean install of Vista SP1 on the same system produced a consistent ~23 second boot time. Of course, being a clean install, there were many fewer processes, services and a slightly different set of drivers (mostly the versions were different). However, we were able to take the off-the-shelf configuration and optimize it to produce a consistent boot time of ~21 seconds, ~2 seconds faster than the clean install because some driver/BIOS changes could be made in the optimized configuration.

For this same system, it is worth noting the resume from sleep time is approximately 2 seconds, achieving a nearly instant on experience. We do encourage users to choose sleep as an alternative to boot.

As an example Windows 7 effort, we are working very hard on system services. We aim to dramatically reduce them in number, as well as reduce their CPU, disk and memory demands. Our perspective on this is simple: if a service is not absolutely required, it shouldn’t be starting, and a trigger should exist to handle rare conditions so that the service operates only then.

Of course, services exist to complete user experiences, even rare ones. Consider the case where a new keyboard, mouse or tablet is added to the system while it is off. If this new hardware isn’t detected during startup, and the drivers needed to make it work aren’t installed, then the user may not be able to enter their credentials and log into the machine. For a given user, this may be a very rare or never encountered event. For a population of hundreds of millions of users, this can happen frequently enough to warrant having mechanisms to support it. In Windows 7, we will support this scenario and many others with fewer auto start services because more comprehensive service trigger mechanisms have been created.
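The idea of replacing an auto start service with a trigger can be sketched with a toy model. This is not the Windows service control manager or its API; `ServiceManager`, the service names and the trigger name are all hypothetical, and the sketch only shows the bookkeeping: a triggered service costs nothing at boot and starts only when its condition occurs.

```python
class ServiceManager:
    """Toy model: a service either starts at every boot or waits for a named trigger."""

    def __init__(self):
        self.auto_start = []   # services started on every boot
        self.triggered = {}    # trigger name -> services started when it fires
        self.running = []

    def register(self, name, trigger=None):
        if trigger is None:
            self.auto_start.append(name)
        else:
            self.triggered.setdefault(trigger, []).append(name)

    def boot(self):
        # Only auto start services contribute to boot work.
        self.running = list(self.auto_start)

    def fire(self, trigger):
        # Start any service waiting on this trigger that is not already running.
        for name in self.triggered.get(trigger, []):
            if name not in self.running:
                self.running.append(name)


manager = ServiceManager()
manager.register("event_log")                                       # hypothetical name
manager.register("input_device_setup", trigger="new_input_device")  # hypothetical name

manager.boot()
running_after_boot = list(manager.running)

manager.fire("new_input_device")  # e.g. a keyboard was attached while the PC was off
running_after_trigger = list(manager.running)

print(running_after_boot)
print(running_after_trigger)
```

On the common boot where no new input device is present, the second service never starts at all, which is the saving the paragraph above describes.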

As noted above, device and driver initialization can be a significant contributor as well. In Windows 7, we’ve focused very hard on increasing parallelism of driver initialization. This increased parallelism decreases the likelihood that a few slower devices/drivers will impact the overall boot time.
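A rough way to see why parallel initialization helps: if drivers start one after another, boot pays the sum of their initialization times; if each starts as soon as the drivers it depends on are ready, boot pays only for the longest dependency chain. The sketch below models that arithmetic. The driver names, durations and dependencies are invented, and the model assumes unlimited parallelism and no dependency cycles, which real systems do not guarantee.

```python
def finish_times(durations, depends_on):
    """Finish time of each driver when it starts as soon as its dependencies are done."""
    finish = {}

    def finish_time(name):
        if name not in finish:
            # A driver with no dependencies starts at time 0.
            start = max((finish_time(dep) for dep in depends_on.get(name, [])),
                        default=0.0)
            finish[name] = start + durations[name]
        return finish[name]

    for name in durations:
        finish_time(name)
    return finish


# Hypothetical initialization times in seconds; "network" is the slow driver.
durations = {"bus": 0.5, "disk": 1.0, "network": 3.0, "audio": 0.8, "display": 1.2}
depends_on = {"disk": ["bus"], "network": ["bus"], "audio": ["bus"], "display": ["bus"]}

serial_total = sum(durations.values())
parallel_total = max(finish_times(durations, depends_on).values())

print(f"one at a time: {serial_total:.1f}s")
print(f"in parallel:   {parallel_total:.1f}s")
```

In this made-up configuration the slow network driver overlaps with disk, audio and display instead of adding to them, so the total drops from the sum of all five durations to the length of the bus-then-network chain.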

In terms of reading files from the disk, Windows 7 has improvements in the “prefetching” logic and mechanisms. Prefetching was introduced way back in Windows XP. Since today’s disks have differing performance characteristics, the scheduling logic has undergone some changes to keep pace and stay efficient. As an example, we are evaluating the prefetcher on today’s solid state storage devices, going so far as to question if it is required at all. Ultimately, analysis and performance metrics captured on an individual system will dynamically determine the extent to which we utilize the prefetcher.

There are improved diagnostic experiences in Windows 7 as well. We aim to quickly identify specific issues on individual systems, and provide help to assist in resolving the issues. We believe this is an appropriate way to inform users about some problems, such as having too many startup applications or the presence of lengthy domain-oriented logon scripts. As many users know, having too many startup applications is often the cause of long boot times. Few users, however, are familiar with implications of having problematic boot or logon scripts. In Windows XP, Vista and in Windows 7, the default behavior for Windows is to log the user into the desktop without waiting for potentially lengthy networking initialization or scripts to run. In corporate environments, however, it is possible for IT organizations to change the default and configure client systems to contact servers across the network. Unfortunately, when configuring clients to run scripts, domain administrators typically do so in a synchronous and blocking fashion. As a result, boot and logon can take minutes if networking timeouts or server authentication issues exist. Additionally, those scripts can run very expensive programs that consume CPU, disk and memory resources.
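The cost of synchronous, blocking scripts can be illustrated with a simple additive model. Everything here is an assumption made for illustration: the script names, their durations, the 5 second base logon time and the 30 second timeout are invented, and `seconds_until_desktop` is a hypothetical helper, not a Windows or Group Policy API.

```python
def seconds_until_desktop(base_logon_s, scripts, synchronous, timeout_s=30.0):
    """Estimate time to a usable desktop under a simple additive model.

    scripts is a list of (name, duration_s) pairs; a duration of None means
    the script never returns on its own (for example, its server is unreachable)
    and is only abandoned when the timeout expires.
    """
    if not synchronous:
        # Scripts run in the background, so logon does not wait for them.
        return base_logon_s
    blocked_s = 0.0
    for _name, duration_s in scripts:
        blocked_s += timeout_s if duration_s is None else min(duration_s, timeout_s)
    return base_logon_s + blocked_s


# Hypothetical logon scripts.
scripts = [
    ("map_drives", 2.0),
    ("unreachable_server_check", None),  # hangs until the timeout
    ("inventory_scan", 12.0),
]

blocking_s = seconds_until_desktop(5.0, scripts, synchronous=True)
background_s = seconds_until_desktop(5.0, scripts, synchronous=False)

print(f"synchronous scripts:  {blocking_s:.0f}s to desktop")
print(f"asynchronous scripts: {background_s:.0f}s to desktop")
```

A single script waiting on an unreachable server accounts for most of the delay in the synchronous case, which matches the observation above that networking timeouts can stretch boot and logon to minutes.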

In addition to working on Windows 7 specific features and services, we are sharing tools, tests and data with our partners. The tools are available to enthusiasts as well. The tools we use internally to detect and correct boot issues are freely available today as a part of the Windows Performance Toolkit. While not appropriate for most users, the tools are proving to be very helpful for some.

One of the topics we want to talk about in the future, which we know has been written about a great deal and is the subject of many comments, is the role that additional software beyond the initial Windows installation plays in overall system performance. The sheer breadth and depth of software for Windows means that some software will not have the high quality one would hope, while the vast majority is quite good. Microsoft must continue to provide the tools for developers to write high performance software and the tools for end-users to identify the software on their system that might contribute to performance that isn’t meeting expectations. Windows itself must also continue to improve the defensive tactics it uses to isolate and inform the end-user about software that might contribute to poor performance.

Another potential future topic pertains to configuration changes a user can make on their own system. Many recommended changes aren’t helpful at all. For instance, we’ve found the vast majority of “registry tweak” recommendations to be bogus. Here’s one of my favorites. If you perform a Live search for “Enable Superfetch on XP”, you’ll get a large set of results. I can assure you, on Windows XP there is no Superfetch functionality and no value in setting the registry key noted on these sites. As with that myth, there are many recommendations pertaining to CPU scheduling, memory management and other configuration changes that aren’t helpful to system performance.

Startup is one topic on performance. As described in the previous post we want to continue the discussion around this topic. What are some of the elements you’d like to discuss more?

Michael Fortin
