Executive Summary:
To keep your Microsoft Office SharePoint Server (MOSS) 2007 deployment up and running, remember these points: Use a two-tiered or three-tiered architecture for your MOSS 2007 server farm. The two-tier approach features a clustered Microsoft SQL Server back end and a front-end server for Web content and application services. In the three-tier design, Web servers serve only Web content, and the application services are delegated to their own servers. Use Microsoft Network Load Balancing (NLB) to load balance front-end servers. Use Microsoft Cluster Service (MSCS) to cluster the database back-end servers.

Microsoft Office SharePoint Server (MOSS) 2007 answers many business needs, from document storage and information sharing to centralized project tracking to exposing business intelligence (BI) data. As such, MOSS is considered a business-critical application in most organizations, and therefore you need to ensure that the services provided are available when needed. In this overview of MOSS 2007 high availability, I discuss four key areas that will help you design and deploy a highly available SharePoint environment: selecting the appropriate architecture, understanding core services and their availability options, implementing your high-availability strategy, and planning for failures.

Architecture Selection
There are many ways to design a MOSS farm, but it's important to choose a farm layout that is conducive to high availability. Factors such as budget, availability of hardware, desired performance, and service level agreements (SLAs) will affect the number of servers in your farm and their placement. There are two basic SharePoint architectures that provide high availability: the two-tier architecture and the three-tier architecture. Figure 1 illustrates both architectures.

The Web content tier consists of servers that host the Microsoft IIS Web sites that deliver content to the end user. The application tier hosts all the background services (e.g., Excel Web Access, Search) that are used by Web parts to display information to the end user.

The two-tier approach features a clustered Microsoft SQL Server back end and a Web server front end. In this scenario, the front-end servers host the Web content and the application-tier functionality. The benefit of the two-tier approach is that it's simpler to design and implement than the three-tier setup. The major drawback is a potential loss of performance if the farm relies heavily on application-layer services such as Excel Calculation Services, which performs calculations on Excel workbooks stored in the database.

In the three-tier design, Web servers serve only Web content, and the application services are delegated to their own servers. You need to keep in mind a few caveats, which I discuss later on a per-service basis. The main benefit of the three-tier approach is that it's highly scalable, allowing for easy expansion. On the downside, it's more complex and harder to monitor and maintain.

You also need load-balancing and failover technologies. Network Load Balancing (NLB) and Microsoft Cluster Service (MSCS) are Microsoft's two high-availability technologies: NLB balances load across hosts, and MSCS provides failover clustering. In an NLB architecture, machines host the same data and share an IP address that clients use to access the load-balanced site or service. Requests are divided up between the load-balanced hosts according to rules set by an administrator. In an MSCS environment, hosted services reside in virtual servers. A virtual server is a group of services required to run a clustered application, coupled with an IP address, a network name, and usually a shared physical disk that all nodes in the cluster can access. When one node fails, the next node configured as a possible owner of the service takes the shared resources (IP, network name, physical disk) and starts the necessary services, thereby starting the virtual server. In our example, we'll use NLB on front-end servers and MSCS to cluster the SQL Server back end.
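The failover behavior described above can be sketched as a toy model. This is only an illustration of the idea of possible owners taking over shared resources; it isn't the MSCS API, and the class name, node names, IP address, and drive letter are all hypothetical.

```python
class VirtualServer:
    """Toy model of a clustered virtual server; not the MSCS API."""

    def __init__(self, network_name, ip_address, shared_disk, possible_owners):
        # The resources that move together from node to node.
        self.resources = {
            "network_name": network_name,
            "ip_address": ip_address,
            "shared_disk": shared_disk,
        }
        self.possible_owners = list(possible_owners)  # in failover order
        self.healthy = set(self.possible_owners)
        self.owner = self.possible_owners[0]

    def node_failed(self, node):
        """Mark a node as failed; move the resources if it was the owner."""
        self.healthy.discard(node)
        if node == self.owner:
            # The next healthy possible owner takes the shared resources.
            self.owner = next(
                (n for n in self.possible_owners if n in self.healthy), None
            )
        return self.owner

    def node_repaired(self, node):
        """A repaired node rejoins as a standby (no automatic failback here)."""
        if node in self.possible_owners:
            self.healthy.add(node)
            if self.owner is None:
                # Nothing was running, so the repaired node starts the service.
                self.owner = node
        return self.owner


# Hypothetical two-node SQL Server cluster.
sql = VirtualServer("SQLVS1", "10.0.0.50", "S:", ["SQLNODE1", "SQLNODE2"])
print(sql.owner)                      # SQLNODE1
print(sql.node_failed("SQLNODE1"))    # SQLNODE2
print(sql.node_repaired("SQLNODE1"))  # SQLNODE2
```

Clients only ever see the network name and IP address, which is why they don't need to know which physical node currently owns the virtual server.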

When load-balancing the front-end servers, keep in mind that NLB operating in unicast mode with a single NIC prevents the hosts in the cluster from communicating with one another, which can interfere with the functionality of the farm. In this situation, it's usually best to implement the NLB cluster in Internet Group Management Protocol (IGMP) multicast mode (provided your switch vendor supports this). Alternatively, you can use a third-party hardware load-balancing solution.

Because failover clusters depend on their shared storage, your storage design is important. There are many shared-storage devices available today, taking advantage of different technologies from Fibre Channel to iSCSI. The one consideration that you need to take into account regardless of the technology you use is storage redundancy. It does no good to have redundant servers if your storage device represents a possible single point of failure. If the situation warrants redundant servers, it probably warrants redundant storage devices. For both two-tier and three-tier scenarios, SQL Server must be set up in an active/passive failover cluster. This provides redundancy and ensures that the failure of one node doesn't affect the availability of the database.

Services and Availability Options
Knowing the core SharePoint services, their functions, and methods for providing redundancy, when possible, will help you keep the server farm highly available. MOSS 2007 has five key services:

  • The Web Server serves Web content to end users.

  • The Query service provides query functionality for MOSS 2007 search.

  • Excel Calculation Services performs calculations on Excel workbooks stored in the database.

  • The Index service collects and propagates the results of SharePoint Search crawls. This information is then used by the Query service to return search results.

  • Windows SharePoint Services (WSS) 3.0 Search provides search functionality in the absence of Query and Index services, and provides full text search of SharePoint Help.

Only the first three services in the list can be made redundant in your server farm, and Table 1 shows you how to do so. The remaining two services, WSS Search and the Index service, can't be made redundant.

The WSS Search service isn't required if you're running the Query service and the Index service, unless you want full-text search in SharePoint Help. If you do, you can run WSS Search on the same server farm as the Query and Index services with no change in functionality.

Note that although you can't make these services redundant via load balancing or by installing them on multiple servers, it's possible to make them redundant by installing them on a Microsoft Virtual Server virtual machine (VM) and using MSCS to cluster them. Bear in mind that this redundancy protects only from hardware issues, and might not provide the desired level of performance. For more information on clustering VMs, visit http://www.windowsitpro.com/articles/index.cfm?articleid=45901&feed=articleLink.

You can attain database redundancy by using a clustered SQL Server configuration; you would then configure SharePoint to use the SQL cluster virtual server during installation. For more information about clustering SQL Server, see the SQL Server 2005 Books Online (BOL—http://technet.microsoft.com/en-us/sqlserver/bb428874.aspx) materials and search for "clustering."

Note: During the install of SQL Server 2005 to multiple cluster nodes, keep in mind that the installation must be performed from one of the nodes; however, if you're logged on to one of the other target nodes during the installation, the install on that node will fail.

Implementation
To maintain a highly available SharePoint environment, you need to ensure that the availability options at each tier of your architecture meet your needs. The following procedures relate to the three-tier architectural model: Web servers in one tier, application services in another tier, and the database back end in the third tier. To accomplish the following implementation tasks in a two-tier environment, just add the application server services to the Web servers.

Web servers. To make Web servers highly available, you need two or more servers. You also need to run NLB or use an external load balancer.

The first step is to install MOSS on the servers you'll be using for the Web front end. When you begin the installation, you'll be asked whether you want to perform a Basic or Advanced installation. Because this won't be a standalone installation, select Advanced, and on the next page, select the "Web Front End - Only install components required to render content to users" option, as Figure 2 shows. Then click Install Now. When the installation completes, click Close, which opens the SharePoint Products and Technologies Configuration Wizard. Proceed through the wizard by performing these steps:

  1. Click Next at the Welcome screen and click Yes in the dialog box that advises that you might have to start or reset related services during configuration.

  2. Next, select whether you want to connect to an existing farm or start a new one.

  3. Specify the configuration database server and the name of the database, as Figure 3 shows. Then enter the credentials for the account that the machine will use to connect to the configuration database.

  4. If you want to install the Central Administration Web application on your Web server, select that check box and note the port number (in case you want to load balance it across your Web servers). You'll see a summary of your choices. Confirm that they're correct and click Next. Click Finish.

After you install MOSS on your Web servers, you'll need to configure load balancing. For this example, I show you how to set up NLB with IGMP Multicast on Windows Server 2003. I prefer to use the Network Load Balancing Manager, which you'll find under the Windows 2003 Administrative Tools menu. To set up NLB, perform these steps:

  1. Start the Network Load Balancing Manager on any machine in the domain and click Cluster, New.

  2. On the Cluster Parameters screen, enter the cluster's IP address and Subnet mask. Under Cluster operation mode, select the Multicast option and the IGMP Multicast check box, as Figure 4 shows. Click Next.

  3. You'll be prompted to enter additional cluster IP addresses, which is handy if you plan to host multiple Secure Sockets Layer (SSL) sites and want them to be load balanced. Click Next.

  4. Next, you need to configure port rules. Using the options here, you can specify which ports are load balanced on a per IP address basis. This means that if you're only hosting one protocol in your NLB cluster (e.g., HTTP), you need to open only the related ports. Click Next.

  5. On this screen, you specify the first host to be added to the cluster. Enter the name of one of your Web servers and click Next. The next screen shows the configuration of the host you've selected. It contains the host priority (which is the host ID within the cluster), the dedicated IP information of the host, and the initial host state (the default is Started). Click Finish.

  6. The left panel of the Network Load Balancing Manager shows your first host along with its description and state, as Figure 5 shows.

  7. Click Cluster, Add Host, and enter the name of the next host you want to join to the cluster. Click Next, then click Finish. Repeat this step for each host you want to add.

Don't forget to add DNS records that point to the NLB cluster IP address for the sites you're load-balancing.
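After you add the DNS records, a quick check that each site name resolves to the cluster IP address can catch a record that still points at an individual Web server. The following is a minimal sketch, assuming hypothetical host names and a hypothetical cluster IP address; the function name is mine, and the resolver is passed in as a parameter so the check can be tried without touching a real DNS server.

```python
import socket


def points_at_cluster(hostnames, cluster_ip, resolve=socket.gethostbyname):
    """Return {hostname: True/False} for whether each name resolves to cluster_ip."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolve(name) == cluster_ip
        except OSError:
            # No record (or DNS unreachable): treat as not pointing at the cluster.
            results[name] = False
    return results


if __name__ == "__main__":
    # Hypothetical site names and NLB cluster IP address.
    sites = ["portal.example.com", "teams.example.com"]
    for site, ok in points_at_cluster(sites, "10.0.0.100").items():
        print(site, "OK" if ok else "does not resolve to the cluster IP")
```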

Application servers. You can run the Query service on any number of application servers. However, the Query and Index services can't reside on the same server. If they do, the Index service recognizes that the Query service is installed and it won't propagate the index. If the content you're hosting is relatively static (50 percent or more of the requests for your Web servers are for static content), you can see a potential performance boost by moving the Query service to your Web servers. The resulting performance boost is due to the content caching done by the Query service.
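The placement rules discussed so far lend themselves to a simple sanity check of a planned farm layout. The sketch below is an assumption-laden illustration, not a SharePoint tool: the service labels, server names, and function name are hypothetical, and it checks only two things from this article, namely that Query and Index don't share a server and that each service that can be made redundant runs on at least two servers.

```python
# Services the article says can be made redundant by running on several servers.
REDUNDANT_CAPABLE = {"web", "query", "excel"}


def check_farm(layout):
    """layout maps server name -> set of service labels; returns a list of problems."""
    problems = []
    hosts_by_service = {}
    for server, services in layout.items():
        for service in services:
            hosts_by_service.setdefault(service, []).append(server)
        # The Index service won't propagate the index to a co-located Query service.
        if {"query", "index"} <= set(services):
            problems.append(f"{server}: Query and Index services share a server")
    for service in sorted(REDUNDANT_CAPABLE):
        if len(hosts_by_service.get(service, [])) < 2:
            problems.append(f"{service}: fewer than two servers, no redundancy")
    return problems


# Hypothetical three-tier layout with one mistake on APP2.
layout = {
    "WEB1": {"web"},
    "WEB2": {"web"},
    "APP1": {"query", "excel"},
    "APP2": {"query", "excel", "index"},
}
for problem in check_farm(layout):
    print(problem)  # APP2: Query and Index services share a server
```

Moving the Index service from APP2 to a dedicated server clears the warning in this example.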

Excel Calculation Services provides support for server-side calculation of workbooks hosted through Excel Web Access in MOSS 2007. A request to process a workbook is sent to a server running Excel Calculation Services. The service stores session-state information so that the same server processes the request until the user session ends or the workbook is closed.
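To illustrate what keeping a session on the same server means, here's a small sketch of session-affinity routing. It's a simplified model and not how MOSS implements it internally; the class name, server names, and the round-robin choice for new sessions are assumptions made for the example.

```python
import itertools


class SessionRouter:
    """Toy session-affinity router: a session stays on one server until closed."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)  # round-robin for new sessions
        self._sessions = {}                           # session id -> server

    def route(self, session_id):
        # A new session is assigned a server; an existing one keeps its server.
        if session_id not in self._sessions:
            self._sessions[session_id] = next(self._next_server)
        return self._sessions[session_id]

    def close(self, session_id):
        # Closing the workbook (or ending the session) releases the assignment.
        self._sessions.pop(session_id, None)


router = SessionRouter(["APP1", "APP2"])  # hypothetical Excel Calculation servers
print(router.route("alice"))  # APP1
print(router.route("bob"))    # APP2
print(router.route("alice"))  # APP1
```

One consequence of this kind of affinity is that a user with an open workbook is tied to a single server, so losing that server means losing the session state even though new sessions can go elsewhere.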

Excel Calculation Services is a resource-intensive service, so in large environments with heavy utilization of complex workbooks, you might want to dedicate a couple of high-powered servers solely to this service. I've worked at companies that relied on workbooks so complex that it took a high-end, dual-core machine longer than an hour to do the calculations on them. Cases like that let you see SharePoint's true value. If you upload the workbook and make it accessible through Excel Web Access, the calculations are performed from a central location, and you need to buy only the application servers instead of buying expensive workstations for all employees who need to view the worksheets. Keep in mind, though, that because these operations are so resource-intensive, they might affect other services running on the servers.

Failure Management
Despite all precautions, failures will occur. If a server hosting one of the redundant services fails, that server will be unavailable, but the service will continue to function, so the failure can go unnoticed. For this reason, it's important that you have a monitoring solution in place, such as Microsoft System Center Operations Manager, that will notify administrators in the event of a failure. Here's how to handle a failure, depending on which server fails:

  • Web servers—If a Web server fails, it will no longer answer on the virtual IP address and NLB won't direct requests to it. Repair the server, and bring it back up in the NLB cluster.

  • Application servers—If a server hosting Excel Calculation Services or the Query service fails, that server will no longer respond to requests, and those requests will go to another server hosting the service. If a server hosting the Index service fails, the Query servers will continue to respond using cached information. After the server is recovered, index propagation will resume.

  • SQL Server (database) server—In a clustered environment, SQL Server will fail over to the inactive node in the event of a failure. It's important to repair the failed node and test failover/failback to ensure uptime in the event of future failures.
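Because a redundant service keeps working when one of its servers fails, it helps to probe each server individually and not only the load-balanced address. The following is a minimal sketch of such a probe and is no substitute for a monitoring product; the function names and URLs are hypothetical, and it treats any HTTP status below 500 (including a 401 from a site that requires authentication) as a sign that the Web server is answering.

```python
import urllib.error
import urllib.request


def http_ok(url, timeout=5):
    """Return True if the server at url answers with an HTTP status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success status (e.g., 401).
        return err.code < 500
    except OSError:
        # Connection refused, timeout, name not found, and so on.
        return False


def failed_hosts(urls, probe=http_ok):
    """Return the URLs whose servers did not answer the probe."""
    return [url for url in urls if not probe(url)]


if __name__ == "__main__":
    # Hypothetical dedicated (per-server) addresses, not the NLB cluster address.
    servers = ["http://web1.example.com/", "http://web2.example.com/"]
    for url in failed_hosts(servers):
        print("No response from", url)
```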

It's All About Reliability
SharePoint is a crucial application in most environments, necessitating a high-availability infrastructure. The two-tier and three-tier architectures satisfy the need for high availability by placing services that can be made redundant on multiple hosts, and NLB and MSCS technologies provide continuous access to content in the event of a single cluster node failure. Using the available tools, administrators can enable the necessary reliability to ensure that data and productivity are maintained.