Титульная страница
ISO 9000 ISO 14000 Forum

Creating Large-Scale, Highly Available OLAP Sites: A Step-by-Step Guide

Dave Wickert

Microsoft Corporation

Abstract: This paper introduces various techniques needed to create large-scale, highly available online analytical processing (OLAP) sites using Microsoft® SQL Server™ 2000 Analysis Services and Network Load Balancing in a Microsoft Windows® 2000 Advanced Server infrastructure. High availability refers to the ability of a multiple-server site to withstand hardware or software outages that may affect individual servers within a site. A large-scale site is a site that has hundreds of gigabytes (GB) of data, making normal file copy operations ineffective.

This paper provides step-by-step instructions that demonstrate how to build the servers and the supporting infrastructure for a reference site based on both thin-client and thick-client architectures. The architectures described are designed to:

·         Protect the data of a multiple-server OLAP site and keep the site up and running during unplanned outages.

·         Coordinate the allocation of processing resources during planned outages, such as data refreshes and complex extraction, transformation, and loading (ETL) cycles.

·         Solve many of the staging-to-production operational problems that can occur in large OLAP sites.


The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, email address, logo, person, place or event is intended or should be inferred.

У 2001 Microsoft Corporation. All rights reserved.

Microsoft, PivotTable, Windows, and Windows NT are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.



Introduction

The pace of the business world is steadily increasing, sometimes exponentially. Because of this, knowledge workers are presented with many challenges. Customers expect excellent service, businesses outsource noncore functions, and the market demands great financial performance. In short, knowledge workers are expected to do more with fewer resources. The tools they use for analysis are critical and must be flexible and adaptable in a changing business climate.

In addition to the pace of change, the amount of data continues to increase exponentially. It is not uncommon for decision-support systems to encompass hundreds of gigabytes (GB) of data.

Lastly, the Internet is rapidly changing how information is delivered to end users. While many approaches are possible, end-user access is commonly classified as either thin-client or thick-client. Thin-client means that the only component of an application installed on the end-user's computer is a Web browser. Thick-client means that additional components of the application are installed on the end user's computer. Because both approaches are reasonable and appropriate, the application and network architecture must be flexible enough to implement either approach.

The architecture presented in this white paper is based on a business intelligence application that provides e-commerce data to its clients. The architecture is based on Microsoft® SQL Server™ 2000 Analysis Services and Microsoft Windows® 2000 Advanced Server. Business intelligence solutions differ from and add value to standard operational systems and Internet applications in three ways. These solutions:

·         Provide the ability to extract, cleanse, and aggregate data from multiple operational systems into a separate data mart or data warehouse.

·         Store data, often in a star schema or multidimensional cube (OLAP) format to enable rapid delivery of summarized information and the capability to drill down to details.

·         Deliver personalized, relevant informational views, and querying, reporting, and analysis capabilities that go beyond the standard reporting capabilities of transaction-based systems — a requirement for gaining better business understanding and making better decisions faster.


While not all business intelligence applications use online analytical processing (OLAP) techniques, OLAP solves problems in many systems, and its uses are increasing. As OLAP applications continue to expand and evolve to address the needs of the business intelligence community, it has become increasingly necessary to host applications on a flexible platform that provides scalability, reliability, and availability.

Today, most OLAP applications are not required to meet the scalability and availability standards that mission-critical applications must meet. This is changing rapidly, however. For some companies, the end product is the data itself; in other companies, OLAP data is used to direct the day-to-day operations of the company. This white paper addresses the needs of such companies by presenting a reference site architecture that combines Web site techniques with OLAP applications.

Goals

The reference site architecture described in this paper meets the following goals:

·         It uses flexible data access techniques so information can be gathered from various sources, such as RDBMS, flat files, and Microsoft Excel spreadsheets.

·         It uses an automated process for updating the production cubes that requires little or no operator intervention. The process is customizable. For example, you can add reconciliation steps, allowing administrators to validate the data before it is moved into production.

·         It delivers data in near real-time and in an online format that allows drilldown and analysis, including basic support for pivoting and slicing.

·         It includes Microsoft products such as Microsoft Office 2000 to provide end-user access to the data, but the architecture also allows advanced end users to run tools based on sophisticated, emerging technologies such as data mining and visualization.

·         It supports remote administration in a lights-out environment.

·         It delivers services in a robust, highly available fashion. Users expect services to be available 24 hours a day, seven days a week, 365 days a year.

·         It delivers data over the Internet, but the architecture is flexible enough to allow the same infrastructure to be used for in-house intranet applications.

·         It demonstrates an architecture constructed using common, off-the-shelf hardware from various hardware vendors and current Microsoft software products.


The architecture uses a wide range of products and services to provide:

·         High availability. Network Load Balancing, a feature of Windows 2000 Advanced Server, is used to create clusters of servers without a single point of failure. The database servers are also clustered together for high availability during the staging of end-of-period data processing.

·         Scalability. If additional front-end Analysis servers are required, they can be added quickly and easily without disrupting the existing site.

·         Throughput for a large volume of data. SQL Server is frequently used in line-of-business applications that contain hundreds of GB of data. It includes Data Transformation Services (DTS) and Analysis Services, which transform data, load data, and process OLAP information. All of the components used in the architecture are included in SQL Server.

·         Common database tools. DTS uses open OLE DB interfaces, which allow companies to use a wide range of data sources.

·         End-user access. OLE DB for OLAP is a widely-supported interface that provides access to data stored in SQL Server in any OLAP format. PivotTable® Service, a scaled-down version of SQL Server Analysis Services, acts as an intelligent cache between the client and server. In concert with Excel 2000, PivotTable Service speeds query times and supports offline analysis.

This white paper assumes that SQL Server 2000 Analysis Services is being used. For historical purposes, where appropriate, this paper provides additional information about SQL Server 7.0 OLAP Services. However, the techniques and approaches described in this paper have only been verified using SQL Server 2000 Analysis Services.

Benefits of This Approach

A well-designed solution addresses the current needs of users, providing immediate benefits. It also takes future situations into account, providing future benefits. The future benefits that a solution can deliver is called its flexibility. The more flexible the solution, the greater its potential benefits.

Immediate Benefits

The solution proposed in this white paper provides the following immediate benefits to administrators and users:

·         By providing appropriate information to end users, you can empower them to use and reuse information in different scenarios. This is the real power of OLAP applications.

·         By providing remote administration, you have a true lights-out environment. You can administer your system from anywhere over the Internet.


·         Because there is no single point of failure, the site is robust and reliable. Even if a server fails or is taken offline, the site continues to run.

·         By delivering information using the Internet, you can:

·         Reduce delivery cost.

·         Increase user reach.

·         Reuse the same infrastructure for administrators.

·         Implement new applications more quickly (particularly if you use a thin-client approach).

·         By including a provision for a direct link to the corporate network, you can easily access existing systems in which the source transactional data is located. This reduces cost and improves the speed of developing new interfaces.

Future Benefits

The solution proposed in this white paper provides the following future benefits to administrators and users:

·         Because the solution uses Network Load Balancing, the cluster of Analysis servers have up to 32 computers. This provides additional capacity for growth. You can also add more CPUs to each server.

·         Having multiple database servers in the architecture provides additional capacity in case your data staging needs increase. However, before you start using extra servers, weigh the benefits of extra capacity against the impact of a failover. For example, suppose you have two database servers. If your end-of-month processing increases, you can use both of them for data processing, but you do not have failover protection. You may find that you have to balance the failover protection against the overall system processing requirements.

·         High availability increases end users' acceptance of the data. This is important with decision-support applications because the user always has the capability to add subjective input to the business problem at hand. If the data is unavailable, end users may find it easier to interpret the data on their own rather than relying on the application. If the system is always available, users make decisions using the analysis tools and the available data — with all of the intended benefits.


Document Organization

This document includes the following sections:

·         Design Issues — This section is an overview of the architectural issues that commonly arise when building large-scale, highly available OLAP sites.

·         Building Large-Scale, Highly Available OLAP Sites — This section outlines the architecture necessary for building and maintaining a highly available OLAP site. Windows 2000 Advanced Server provides system services for Windows Clustering, which is the technology that supports the architecture for creating a highly available site.

·         Alternative Configurations — This section provides suggestions for tailoring the reference architecture to specific situations.

·         Finding More Information — This section provides links to additional resources that discuss the issues and technology used in this white paper.

·         Appendix A: Step-by-Step Guide — This appendix provides directions for configuring Network Load Balancing using Windows 2000 Advanced Server, Analysis Services, and its supporting infrastructure for the sample site.

·         Appendix B: Command-Line Tool — This appendix contains documentation for wlbs, a command-line tool for managing Network Load Balancing clusters.

·         Appendix C: Sample Scripts — This appendix provides sample scripts for working with Network Load Balancing and Analysis Services (for example, stopping and starting Network Load Balancing and Analysis Services, changing the Analysis Services Data folder, and moving the repository from the Bin folder to the Data folder).

·         Appendix D: Cluster Service Configuration — This appendix describes how to configure Analysis Services in a server cluster. While this is not a supported feature, it may be an appropriate alternative for some systems.


Design Issues

This document walks you through the steps of building a sample architecture for a highly available and scalable OLAP Internet site using Windows 2000 Advanced Server. The sites described in this document can be used to deliver highly available OLAP services, for either dedicated or shared sites. A similar site design can also be used to host a highly available intranet site.

A highly available architecture enables a multiple-server site to withstand planned and unplanned hardware or software outages that affect individual servers within the site. An example of a planned outage is taking a server offline for maintenance to perform a software update. While the server is offline for the software maintenance operation, the rest of the site stays online, providing service to users. An example of an unplanned outage is a hardware failure. In this case, the rest of the site stays online, providing service to users, because the processes that provide data services for the site fail over to the remaining server clusters. The architecture described in this paper is designed both to protect the data of such a Web site and to keep the site up and running.

This section discusses some common architectural requirements of large-scale, highly-available OLAP sites:

·         Clustering for high availability

·         Building a large-scale staging facility that keeps OLAP data available during cube and partition processing

·         Moving data into production

The solution proposed in this paper addresses these needs.

Using Windows Clustering

Windows Clustering enables a group of independent servers to be managed as a single system for higher availability, easier manageability, and greater scalability. The Windows 2000 Advanced Server operating system includes two clustering technologies designed for this purpose: server clusters, known as Microsoft Cluster Server (MSCS), which primarily provides failover support for critical line-of-business applications such as databases, messaging systems, and file and print services; and Network Load Balancing, which balances incoming IP traffic among multiple-node clusters.

Both technologies are included as part of Windows 2000 Advanced Server. While either technology can be used alone to achieve a high level of service for a site, the scalability, reliability, and availability of the site is maximized when the technologies are used together to build the site's infrastructure. The reference architecture uses Network Load Balancing for the Analysis servers and Windows Cluster service for the file services and back-end database services.


Network Load Balancing was chosen as the primary clustering technology for the reference architecture because it scales well. Network Load Balancing can be configured in two modes: priority mode and multiple host mode. In priority mode, which is usually used when failover is the primary issue, a single server handles all incoming requests and traffic is routed to other servers when that server is unavailable. In multiple host mode, all computers are available to handle incoming requests. For the type of requests that Analysis Services sends, multiple host mode is preferable because it uses all available resources, and balancing is more effective at distributing the load.

Unlike server clusters, Network Load Balancing does not require servers to be identical. You can have any combination of computers with different capacities. In addition, Network Load Balancing does not have a specialized Hardware-Compatibility List (HCL). For more information, see the HCL at (http://www.microsoft.com/hcl/). Any system that is capable of running Windows 2000 Advanced Server can be used in a Network Load Balancing cluster.

Network Load Balancing distributes IP traffic to multiple copies (or instances) of a TCP/IP service, such as a Web server. For Analysis Services, each Analysis server runs on a host within the cluster. Network Load Balancing transparently divides the incoming client requests among the hosts. From the perspective of the client, the cluster appears to be a single server that answers client requests. As enterprise traffic increases, network administrators can add another server to the cluster.

Server clusters are designed specifically to address failover issues. They are not used for scalability. In a server cluster, all systems must be identical (down to the firmware revision level) and the hardware must be chosen from a separate section of the HCL. While Analysis Services can run on server clusters, it is not supported. For more information, see "Appendix D."

This document does not describe the features and functions of Windows Clustering and Network Load Balancing. This paper assumes that you have basic knowledge of the Microsoft technologies used in the high-availability scenarios. For more information about these technologies, see "Finding More Information" later in this paper.

Building a Staging Facility

A staging facility provides a place to work with data before you move it into production, where users have access to it. By using a separate staging facility, you can work on data without taking your production servers offline. The reference architecture discussed in this paper provides an extensive staging facility to support the additional resources that staging activities require. The facility must be gracefully expandable in case processing or storage requirements exceed the initial capacity.

OLAP applications integrate data from many data sources. Obviously, the staging facility must have the disk and CPU resources to handle the data and processing. In addition, the staging facility must often provide resources beyond the sum of the individual data sources.


Many online transactional systems keep only a snapshot of the data that they generate, saving only the information necessary for immediate processing. For example, a distribution control system may store shipping records for the current month only, deleting them after an item is received. But because OLAP systems are often used to investigate trends, they usually have longer retention periods. An OLAP system may keep records for 36 or 48 monthsѕmuch longer than its transactional counterparts. Thus, the staging facility may require considerably more storage space and have extra processing requirements to be able to hold and process historical data.

Because data may come from many disparate sources, the OLAP system must perform many of the processing steps that large-scale data warehousing systems do. For example, an OLAP system may use data coming from a distribution control system to correlate an invoice number used in the billing system with a shipping number issued by an external shipping company. The OLAP system may filter, cleanse, normalize, and merge the data while processing the cube, or OLAP, data. All of these extra processing steps usually require a separate staging facility.

Posting Data into Production

After the data has been scrubbed, transformed, and validated, the data is ready to be posted to the production environment, where users can see and query the data.

Traditionally, OLAP applications have used a simple development methodology:

1.      Developers and end users work to design a series of cubes.

2.      Database administrators locate the appropriate data sources and determine how to filter, cleanse, normalize, and merge the data to build the cubes.

3.      The cubes are built and validated. The validation process is done either internally by database administrators or externally by other end users. For example, the accounting department may validate a general ledger report. Typically, this is a one-time activity, performed before the system is deployed.

4.      After the cubes are built and validated, end users query them.

However, in some applications, more extensive validation must take place before data can be moved from staging to production. There may be reconciliation reports or manual comparisons with different reports, or senior management may need to review financial data and approve its interpretation before the data is posted to the company's books. In this scenario, data must be validated each time it is moved into production (for example, every week or every month).

This additional validation means that there must be at least two copies, sometimes more, of the OLAP data: a copy that is currently being seen or used by end users (what is currently in production), and a second copy that is being reviewed or validated (what is currently in staging). When the time is right, the second copy must be copied to production.


Various techniques are available for copying data from staging to production, such as archiving and restoring, copying data files, or even physically swapping drives. Consider this issue seriously, because Analysis Services does not provide a built-in facility for copying data between systems, and the reference architecture uses multiple systems for failover and scalability. Whatever process you choose should have a minimal impact on end users. In some environments, it is unacceptable to take the system offline while 100-GB files are copied to computers on the network.

Building Large-Scale, Highly Available
OLAP Sites

This section introduces an architecture for building large-scale, highly available OLAP sites. The reference site scenario discussed in this paper runs an application called Commerce that performs business analyses on the products sold by the company. The company makes this analysis information available to its vendors through an Internet site that has been constructed to provide the OLAP facility.

Hardware

The reference site uses seven servers, all of which run Windows 2000 Advanced Server.

The hardware used in the reference site can be considered a baseline for a highly available system. Check with your hardware vendor for more information about hardware solutions for increased availability, such as dual-interface Ethernet adapters and uninterruptible power supplies.

All of the hardware used is readily-available, off-the-shelf hardware that is available from many vendors. This helps reduce the overall cost of the site and ensures that the maintenance and supports costs are well established and can be included in a normal information technology department budget.

When building a highly available site, you should use hardware listed in the Windows HCL at http://www.microsoft.com/hcl/.

Data Storage

Using Network Load Balancing ensures that production data is available when OLAP site users issue queries to the system. However, timely posting of new or updated data is also important. A critical operational issue in a production OLAP site is that the underlying relational data sources must be available when the data is processed. This paper suggests that you use Windows Clustering to ensure that new or updated data is processed within the required time limits.

Relational data storage for the OLAP site is managed by two servers running Windows Clustering with a shared Redundant Array of Independent Disks (RAID) disk array. The server cluster provides availability in the event of a server failure, and the RAID array provides availability in the event of a disk failure.


The disk technology provided in modern servers and arrays can detect potential disk failures before they occur. If a disk failure is predicted by the system, the failing disk can be hot swapped out of the RAID 5 array and replaced without interrupting service. Even if a disk fails unexpectedly, a RAID 5 array brings a standby disk into the array so that availability continues. No manual intervention is required.

Depending on the I/O requirements of the database servers in a site, you may want to use other RAID techniques. An alternative to RAID 5 is RAID 1+0 (disk striped and mirrored). This option is more expensive than RAID 5. Site designers should use the type of RAID architecture that meets their requirements.

RAID arrays can be implemented in software using built-in Windows 2000 Advanced Server services. However, the reference architecture uses a hardware implementation for increased data-access performance.

RAID Configuration and Analysis Services

Consider the following guidelines when you configure RAID for Analysis Services:

·         No single solution works for all data.

·         For the best performance with full recoverability, use RAID 0+1. However, this approach may be expensive.

·         RAID 5 is a good, less expensive alternative for the Analysis Services Data folder (because the data is mostly read-only).

·         Use RAID 0 (just disk striping) for the Analysis Services Temp folder. The Temp folder does not require long-term recoverability. It is used only for cube or partition processing.

·         Use a controller that supports hot standby disks that are automatically added to the RAID set if a disk failure occurs. The RAID set is then automatically rebuilt, remirrored, and so on.

·         Use fibre-channel RAID controllers whenever possible. Small Computer System Interface (SCSI) is acceptable for midrange systems, but not for large RAID or System Area Network (SAN) configurations.

·         Consider using SAN technology to obtain snapshots of data (to move data from staging to production). This greatly reduces time required to copy OLAP database files.

Note   The type of file system used in this configuration is critical. All disks used in this architecture must be formatted to use the NTFS file system format because it provides a higher level of security and data integrity than file allocation table (FAT) file format.


Networking

Each server has two or three 100-Megabits per second (Mbps) Ethernet Network Interface Connectors (NICs). The TCP/IP protocol is used throughout the reference site. In all cases, the private network (10.0.0.x, 11.0.0.x) NIC is used as the dedicated adapter for  Windows Clustering.

In the front-end servers that provide Analysis Services data, one NIC is connected to a 100-Mbps switch that is connected to a network that connects to the Internet. This NIC is bound with a public IP address of 192.168.18.155. The Network Load Balancing cluster heartbeat is sent on this public NIC. The other NIC is connected to the private network (10.0.0.x) through the 100-Mbps switch that connects the servers in the site.

In the back-end servers, one NIC is connected to a private network (10.0.0.x) providing access to the Analysis server cluster through a 100-Mbps switch. The reference site uses a second private network address of the 11.0.0.x range of IP addresses for the Cluster service heartbeat for the staging server.

The virtual private network (VPN) server cluster provides a VPN to the public network and domain and domain name system (DNS) services to the private network. One NIC is connected to a 100-Mbps switch that is connected to a network that connects to the Internet. This NIC is bound with a public IP address of 192.168.18.160. The other NIC is connected to the private network (10.0.0.x) through the 100-Mbps switch that connects the servers in the site.

The firewall is configured so that only HTTP port 80 (or port 433 if Secure Sockets Layer (SSL) is used) can access IP address 192.168.18.155 and only VPN port number can access IP address 192.168.18.160.

Highly Available Reference Site Architecture

Service providers have different preexisting infrastructure and business models. The architecture of the reference site is intended to be sufficiently generic that the core concepts can be used in a variety of scenarios.


Figure 1 shows the architecture of the reference site (thick-client) for the Windows 2000 Advanced Server-based network. The IP addresses and connections for different parts of the network are shown in different colors, as follows:

·         The external network is shown as a solid blue line. This is also the heartbeat for the Analysis server cluster.

·         The internal database network is shown as a dashed green line.

·         The internal database cluster heartbeat network is shown as a dotted black line.

Figure 1: Reference site architecture (thick-client) or thin-client using the Analysis server cluster for both Web services and Analysis Services (Internet only access)


 


Computer

Configuration

VPN1

Windows 2000 Advanced Server with VPN, primary domain controller (PDC), DNS,
Cluster Sentinel,
192.168.18.180,
10.0.0.41

VPN2

Windows 2000 Advanced Server with VPN, backup domain controller (BDC), secondary DNS,
Cluster Sentinel,
192.168.18.181,
10.0.0.42

AS1

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.161,
10.0.0.1

AS2

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.162,
10.0.0.2

AS3

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.163,
10.0.0.3

Db1

Windows 2000 Advanced Server with Windows Clustering,
SQL Server 2000 (relational database) with Analysis Services (staging server),
10.0.0.52,
11.0.0.1

Db2

Windows 2000 Advanced Server with Windows Clustering,
SQL Server 2000 (relational database) with Analysis Services (staging server),
10.0.0.52,
11.0.0.2

 


Note   All of the system diagrams in the white paper have been simplified for readability. For a production site, you should architect the hardware with full redundancy, that is, dual path power supplies, teamed NIC cards, dual switches and firewalls, and so on, so there is no single point of failure in the configuration.

The architecture illustrated in Figure 1 has data access through the Internet only. Data moves into the site through the VPN. Although this is commonly used in medium-to-small companies, large companies usually have dedicated access through a leased-line back door. For security reasons, you should incorporate a firewall so all of the Internet-facing servers are in a perimeter network (also known as DMZ, demilitarized zone, and screened subnet), which is a network off the router or firewall that acts as a buffer between the external network and your secure internal network.

Figure 2: Reference site architecture (thick-client) or thin-client using the Analysis server cluster for both Web services and Analysis Services (dedicated corporate access)


 

Computer

Configuration

AS1

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.161,
10.0.0.1

AS2

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.162,
10.0.0.2

AS3

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.163,
10.0.0.3

Db1

Windows 2000 Advanced Server with Windows Clustering,
SQL Server 2000 (relational database) with Analysis Services (staging server),
10.0.0.52,
11.0.0.1

Db2

Windows 2000 Advanced Server with Windows Clustering,
SQL Server 2000 (relational database) with Analysis Services (staging server),
10.0.0.52,
11.0.0.1

 


Figure 3 shows a third option, which is possible if one of the Analysis servers can be taken offline for cube processing. In this case, you can eliminate the staging servers if the batch windows are longer, or if the data processing steps are not as large. Each server uses its own data storage for the OLAP data.

Figure 3: Reference site architecture (thick-client) or thin-client using the Analysis server cluster for both Web services and Analysis Services (limited processing requirements)


 

Computer

Configuration

AS1

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.161,
10.0.0.1

AS2

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.162,
10.0.0.2

AS3

Windows 2000 Advanced Server with Network Load Balancing,
IIS,
Analysis Services,
192.168.18.163,
10.0.0.3

 

The VPN and the Domain Controller/DNS Server

DNS host entries for the individual servers on the private 10.0.0.x network must be added to the DNS database on the servers VPN1 and VPN2 so that server names are correctly resolved.

For the sake of simplicity, the reference site scenario uses a PDC (VPN1) and a BDC (VPN2). A BDC is provided so that the domain controller is not a potential single point of failure. Note that the server VPN1 is only part of the private 10.0.0.x network. It is not visible to the Internet.

Vendors must provide user credentials when they access the OLAP cubes. The usernames and passwords they supply should match user account information that is kept in the PDC domain accounts.


The VPN server cluster allows administrators to control these accounts and to administer the computers on the private 10.0.0.x network. Administrators can tunnel into the private network and connect to any of the servers, including the PDC, to administer domain accounts. Windows 2000 Terminal Services is running in administrative mode on all servers in the site.

Important   Because this site is set up for clustering with Analysis Services, each installation is configured identically. For example, each server has the same memory allocation and buffer sizes. This allows you to balance connections because Network Load Balancing assumes that each server has the same capacity. If they differ in capacity (for example, more or less memory and CPU speed) or properties, manually adjust the weight in Network Load Balancing to compensate. Server properties are stored in the registry, not the OLAP repository.

Front-End and Back-End Tiers

The reference site scenario discussed in this paper uses a multitiered architecture that provides redundancy and fault tolerance. The architecture is physically divided into two main tiers: the front end and the back end. The front end provides the core Analysis Services facility using HTTP access through Microsoft Internet Information Services (IIS). The back end provides the relational data storage and database services. Database services are provided by SQL Server 2000 Standard Edition. If you choose the thick-client approach, SQL Server Enterprise Edition is required for the front-end Analysis servers because it uses the HTTP connection feature (through IIS) for clients to access cubes on the Analysis server.

The Front End

The reference architecture uses three servers (named AS1, AS2, and AS3) to provide Internet access for the site and respond to requests from users. You can use more servers, but this paper recommends keeping the number as small as possible, based on the anticipated user load and availability requirements. This makes managing the cluster simpler.

The front-end servers provide the OLAP facility for client computers running SQL Server 2000 Enterprise Edition. Client computers access the front-end servers by using OLAP applications from either Microsoft (for example, Microsoft Excel and Microsoft Office Web Components) or vendors. OLAP applications use PivotTable Service on their client computers.

In most cases, PivotTable Service is installed automatically when an OLAP application itself is installed. However, if you want to install or upgrade directly, you can find PivotTable Service on your SQL Server 2000 CD-ROM in the Msolap\Install\Pts subfolder. Two versions are available: PTSFull.exe and PTSLite.exe. PTSLite.exe installs the PivotTable Service files only. PTSFull.exe installs the PivotTable Service files and Microsoft Data Access Components (MDAC). For more information about which to use and the files that will be installed, see "Distributing SQL Server with Applications " in SQL Server Books Online. To install PivotTable Service directly, run one of the executables. It will unpack the software and perform the installation.


PivotTable Service uses three access methods to connect to Analysis servers: XML for Analysis, HTTP, and client-server TCP/IP.

XML for Analysis is a new technology that uses low-overhead Simple Object Access Protocol (SOAP) messages to pass XML containing OLAP data between the server and the client. XML for Analysis is cross-platform, and several vendors are adopting it. Because of these characteristics, XML for Analysis is quickly becoming the access method of choice for many client applications. For more information about XML for Analysis, see "Finding More Information" later in this paper.

The HTTP method uses port 80 and transfers data back and forth as Multipurpose Internet Mail Extensions (MIME)-encoded binary messages. The same messages are passed between the client and the server as in the TCP/IP technique (which is described in the following paragraph), but they use a different transport medium. PivotTable Service determines whether to use the HTTP method or the TCP/IP method based on the connection string. If the server name is a URL, the HTTP method is used.

The TCP/IP method connects to port 2725. This port is not commonly allowed through most corporate firewalls. The network access picture is also complicated, because you don't know the network architecture of the client computers. For example, they could have their own firewalls that limits access to the Internet.

The XML for Analysis and HTTP access methods are new in SQL Server 2000. In SQL Server 7.0, only the TCP/IP method is available. Both the HTTP and TCP/IP access methods require PivotTable Service to be used on the client. XML for Analysis is more lightweight. To use XML for Analysis, the client software is required only to generate XML inside a SOAP message to get data from the Analysis Services cluster.

In the reference architecture, because of a firewall, PivotTable Service is configured to use either the HTTP connection method to the IIS default Web site (port 80) or XML for Analysis. To access the facility, users specify the server name using a URL instead of a regular server name. For example: http://commerce.yourdomain.com/commerce/. This causes PivotTable Service to use the HTTP connection method. Alternatively, the client software, using XML for Analysis, can use this URL to pass SOAP messages and get back OLAP data as an XML stream. Both techniques use HTTP as the underlying protocol and can easily pass through firewalls and other Internet facilities.

Note   In SQL Server 7.0 OLAP Services, PivotTable Service connects to the server using ports 2393 and 2394; thus, the Analysis server still listens on these ports to accept connections from SQL Server 7.0 clients. However, PivotTable Service clients in SQL Server 2000 Analysis Services always connect on a single TCP port, port 2725. To use Network Load Balancing with SQL Server 7.0 OLAP Services, you must use the Single affinity setting so that connections to multiple ports from a single client are load balanced to the same OLAP server. This is not an issue with PivotTable Service clients in SQL Server 2000 Analysis Services, because they use a single TCP port.


Role of Network Load Balancing for the Front End

With Network Load Balancing, up to 32 servers work together in a cluster to handle the load of providing data to a Web site. Network Load Balancing is configured on each server in the cluster to respond to the same virtual IP address and fully qualified domain name.

Network Load Balancing provides scalability and load balancing by directing resource requests among the front-end servers to balance the load for the site. The Network Load Balancing load-balancing algorithm determines which server responds to a user request.

When the traffic on the site increases beyond the capacity of the site, a new front-end server can be configured with the Network Load Balancing settings for the site and loaded with a copy of the latest OLAP data. When the new front-end server is brought online in the network, it dynamically joins the existing Network Load Balancing cluster and immediately begins sharing the load with the other front-end servers.

Availability at the Analysis server level is maintained because Network Load Balancing detects when a server that is not responding to network requests and dynamically removes it from the cluster. The remaining nodes pick up the load of the unresponsive server to keep the site running. When a node in a server cluster joins or leaves a Network Load Balancing cluster, an entry in Event Viewer notes the change.

Figure 4: Network Load Balancing NIC roles


To provide name resolution so that external users can access the site, make a host entry in the Internet DNS for the virtual IP address (of the Network Load Balancing cluster) and cluster name (commerce.yourdomain.com).

The Back End

The servers Db1 and Db2 run Windows Clustering and provide highly available data services (file shares and databases) for the site. The file shares are required for the Data folder of the Analysis servers. The databases are optional and are exposed as a cluster-wide service only if the Analysis servers use Relational OLAP (ROLAP) or Hybrid OLAP (HOLAP) storage modes. In most cases, the Multidimensional OLAP (MOLAP) storage mode is recommended, because it makes all of the data available in the Data folder. However, SQL Server can be used internally for other reasons, such as for part of the processing of OLAP cubes, but there is no need to expose it as a cluster-wide service. For more information about storage modes, see SQL Server Books Online.

This cluster of two servers is referred to as the back-end cluster. The back-end cluster is configured in active-to-passive mode, in which one server provides all of the services and the other waits as a hot backup. (In the other possible mode, active-to-active mode, both servers provide services.)

The computers in the Analysis server cluster are configured to always refer to the files (and databases, if used) on the database cluster by the virtual cluster name, and not the names of the individual servers.

For files located on the common cluster-wide file share, an automated procedure that runs in the background keeps all of the data files up-to-date. It uses the file creation and last modified date to determine which files should be copied from the primary server to the warm backup. You can use Robocopy, a file transfer program that is part of the Windows 2000 Server Resource Kit, for this purpose. You can also schedule simple scripts to run periodically that ensure synchronization between the Data folders on the primary and warm standby servers.

Availability

The back-end cluster provides failover capability for services running on the cluster. If one of the servers becomes unavailable because of a hardware failure, planned maintenance, or any other reason, the other server cluster immediately takes over the services of the server that is unavailable. The failure of a server does not cause failure of the data services or interruption in service. When the server is brought back online, it resumes delivery of data services.


Staging Data

The main purpose of the database cluster is to provide the data services for the Analysis Services database located on each of the front-end Analysis servers. Data processing takes place in two environments: traditional and OLAP. For more information, see "Creating a Data Warehouse" in SQL Server Books Online.

Staging Data (Traditional)

Traditional data updates and deletes are processed on the warm standby server. Data is copied from the application's transactional database on the Internet, or available through the Internet. It is copied to the warm standby server. Depending on the amount of new, updated, or deleted information, this data can be quite large. Typically, this data is transactional and is in a highly normalized schema, Third Normal Form (3NF). It is also typically in various storage formats, such as other RDBMS databases, flat files, and Excel spreadsheets.

SQL Server 2000 includes DTS, which automates the data staging process. After the data update process is complete, a reconciliation report should be run and its status sent to the operations group for validation.

When the end-of-day, -week, -month, -quarter or -year processing runs, SQL Server uses the data from the various data sources outlined earlier and updates the transaction data stored on the primary database server. The data on the primary database server should be in a star or snowflake schema RDBMS.

The traditional staging process includes the following steps:

·         Extracting data from the transactional data sources.

·         Transforming the data (that is, scrubbing and normalizing it). This can involve converting or generating surrogate keys, converting data from various formats such as date/time, and consolidating data (for example, when cost centers merge).

·         Loading the data into the star or snowflake schema that is used as the data source for the OLAP cubes and dimensions.

Staging Data (OLAP)

After the traditional staging process is complete, you need to perform additional OLAP processing to rebuild or incrementally update the OLAP cubes and dimensions in the production cubes. The data source for the OLAP cubes and dimensions should be the star or snowflake schema database. The ultimate destination for this OLAP data is the computers that make up the Analysis server cluster, but the data is first processed on the database cluster.

After the OLAP processing is complete, a reconciliation report should be run and its status sent to the operations group for validation. Depending on the complexity of the validation process, you may want a team of users to connect to the newly processed cubes and verify that the data is correct.


The Analysis server service (MSSQLServerOLAPService) runs only on the primary database server while the OLAP processing is taking place (or if Analysis Manager administration is required). The service is started just before the OLAP processing and stopped immediately afterward. Administrators must start and stop the service manually before and after running Analysis Manager. This ensures that the Msmdrep.mdb repository and the Data folder files are consistent, available, and closed during normal operations.

Location of OLAP Data

In the reference architecture, a master copy of the OLAP data is stored in the data storage space of the database cluster. The local disks on the front-end servers store the operational versions of the data. There are several advantages to this combined approach:

·         It uses a high-speed, high-capacity RAID disk array in the database cluster. This means that data can be processed more quickly on the staging systems.

·         There is a master copy of the data. This means that the process of moving data from staging to production is quicker and data can be copied online without affecting the performance of the front-end servers. After the data is placed on the front-end servers, an update job changes the location of the Data folder. The OLAP data can be copied using several methods, such as:

·         DOS COPY or XCOPY commands.

·         Robocopy.

·         Microsoft Application Center 2000 data replication.

·         Third-party data movement products.

·         For high speed data transfers, SAN technology. Create a disk mirror set (with two or more members), break them, and then change the underlying device ownership. The net effect of this is an that the disk copy process runs extremely quickly. This is particularly effective for large databases of 50 GB or more.   

·         Because a master copy of the data is maintained on the database server, it is easy to add a new Analysis server to the front-end cluster. The Analysis server is required only to copy the data and add the host to the cluster — then the server is up and running.

·         Because every Analysis server contains its own data, they are not dependent on any outside resource. Performance is consistent, and there is no single point of failure anywhere in the cluster.

Note   Consider using Windows 2000 volume sets on the Analysis servers. Volume sets can expand server capacity without requiring you to take the system offline.

·         It is easier to manage site content and keep it synchronized when it is located in one place rather than distributed among the local disks on each front-end server.


For the sake of brevity, this paper describes only one method of organizing data. However there are many other architectural choices for hosting data, such as using SAN technology or network attached devices.

Storage Mode

This paper recommends using the MOLAP storage mode for data in the proposed architecture. MOLAP is recommended because the data is self-contained in the Analysis Services Data folder and can easily be moved from the processing server to the Analysis servers. To ensure 100 percent availability of the OLAP facility, you need a mechanism that allows multiple versions of the data to be spread out among the members of the cluster. The data must be independently configured and controlled along with the server being converged into the Network Load Balancing cluster. If you use ROLAP or HOLAP, a single relational database contains (at minimum) the fact data and possibly the aggregates. This complicates the movement of data, because you must copy both relational databases and data files throughout the cluster.

Regardless of the storage mode used, Analysis Services does not support writeback for either data or dimension data. For more information, see "Analysis Services Restrictions" later in this paper.

OLAP Data Storage

OLAP data is stored in the following locations.

Location

Description

The repository (Msmdrep.mdb)

This is all of the meta data that makes up the OLAP structures on a server (cubes, dimensions, data sources, and so on). For large systems, you should migrate the repository to a SQL Server database.