Executive Summary:
Business Intelligence (BI) enhancements to SQL Server 2008 Integration Services (SSIS), Analysis Services (SSAS) and Reporting Services (SSRS) improve performance and make reporting easier. |
SQL Server 2008, while not a revolutionary
release, provides
rich insight into your data
for your organization’s business intelligence (BI) needs.
I’ll take you on a brief tour of SQL Server Integration
Services (SSIS), SQL Server Analysis Services (SSAS),
and SQL Server Reporting Services (SSRS) enhancements,
so you’re armed with information to help you
make decisions about implementing SQL Server 2008
BI solutions. First, I want to point out two relational
engine (query optimizer) enhancements that set the
stage for better BI—partitioned table parallelism and
star-join query optimizations (Optimized Bitmap Filters).
Even though this tour is just a taste of the many
enhanced BI features found in SQL Server 2008, it
should be enough to give you food for thought when
you consider whether upgrading to SQL Server 2008
BI is a smart move, based on your organization’s BI
architecture and requirements.
Partitioned Table Parallelism
SQL Server 2008 improves performance on partitioned
tables that reside on multi-CPU-based systems.
The query optimizer can elect a parallel query
execution plan on these hard systems to provide improved
performing query and index operations. Fact
tables are often candidates for partitioning in a data
warehouse because they typically contain a few columns
with a very large number of records.
Does your data warehouse have large fact tables
residing on multiple CPUs? You can benefit by upgrading
to SQL Server 2008 because there’s a new
parallel query execution strategy on partitioned
tables. SQL Server 2005 uses a single thread per partition
parallel query execution strategy. In SQL Server
2008, multiple threads can be allocated to a single
partition, thus improving the query’s response time.
As of this writing, you can enable this functionality
by setting the trace flag 2440, although this is expected
to change when the product ships. Note that table
and index partitioning requires SQL Server 2008 Enterprise
Edition.
Star-Join Query Optimizations
(Optimized Bitmap Filters)
The query optimizer uses bitmap filtering to eliminate
rows from a second table based on values taken from
the first table. Bitmap filtering is a common query
filtering technique found in star-schema-based queries.
SQL Server 2008 introduces optimized bitmap
filtering. The query optimizer can now introduce
bitmap filters dynamically in the query plan during
generation, as opposed to just after query plan optimization,
as in SQL Server 2005. Optimized bitmap
filtering results in filtering from multiple dimension
tables and bitmap filters are now applicable to more
query operator types. Optimized bitmap filtering enables
better performing data-aware house queries that
reference the common star-based schemas.
SSIS Enhancements
Any BI solution includes extraction, transformation,
and loading (ETL) of an organization’s data. ETL is
implemented in SQL Server using SSIS. In SQL Server
2005, the SSIS pipeline execution engine doesn’t
scale up to utilize more than one processor in a single
execution tree. The SQL Server 2008 SSIS data flow
engine can execute multiple components (threads) in
a single execution tree. Overall, the 2008 SSIS engine
is more stable and scalable. It eliminates the potential
for deadlocks that occasionally occur in SQL Server
2005 SSIS when you execute packages with complex
user data in large organizations.
Lookup transformation. SSIS can be used in a variety
of scenarios, however, it’s most commonly used
in ETL. One of the most common SSIS components
used in ETL solutions is the Lookup transformation.
The SQL Server 2005 SSIS lookup component used against tables with row counts of over a million rows
occasionally causes a performance slowdown. SQL
Server 2008 no longer has this limitation. You can
perform a lookup against any data source by using
the standard providers, which include ADO.NET,
XML, OLE DB, and other data sources. You can
even perform lookups against other SSIS packages.
The enhanced TxLookup transformation component
of the SSIS package in SQL Server 2008 supports
internal redundancy on the lookup chain. TxLookup
also includes several other improvements over SQL
Server 2005: There’s now a pre-charge query in addition
to the cache-miss query. And for each cache-miss
query, multiple rows can now be returned. The cachemiss
query now has a separate connection manager.
If you use a full or a partial cache query, SQL Server
2008 loads the hash table and uses the pre-charge
query. However, if you use a no cache query, SQL
Server 2008 behaves like SQL Server 2005 and uses
only the cache-miss query. SSIS in SQL Server 2008
improves the performance of lookups to support the
largest tables.
Data profiler. Good news for ETL gurus—SQL
Server 2008 SSIS has a data profiler. Now you’ll have
visibility into the source system data before you build
your ETL solutions, and the ability to code, configure,
and build based upon data patterns. With the
data profiler you can generate source system metadata
statistics, which you can then view using the standalone
Data Profile Viewer. This viewer also displays
candidate keys and data distributions. Data profiling
has long been a requested capability of DTS/SSIS
and the larger SQL Server product. It’s good to see a
formal solution.
SSAS Enhancements
Following the typical progression in a BI solution,
I’ve discussed the first stage—ETL and SSIS—and
now we’re ready to look at creating cubes and mining
models. One of SQL Server 2008’s many improvements
to the SSAS architecture is Cube Designer
enhancements.
Cube Designer enhancements. A critical component
to SSAS is the practice of good cube design. The
ultimate success or failure of your BI rollout depends
on it. I’ll briefly survey what’s new with Personalized
Extensions, Best Practice Alerts, the Dimensional Designer,
the Aggregation Designer, and Named Sets.
You can use Personalized Extensions to create
new SSAS objects and functionality, and then provide
these objects and functionality dynamically in
the context of the user session. You don’t have to create
detailed specifications about where or how to find
the extended functionality. You can share these new
objects and functionality immediately with both end
users and your fellow developers.
The Cube Designer now has a Best-Practice Alert
functionality that spans all objects and is generated
through Analysis Management Objects warnings.
The warnings alert you when you violate design best
practices or make logical errors in database design.
You can detect potential problems with the design in
a non-intrusive way because these warnings are integrated
into real-time designer checks.
New and improved features for the Dimensional
Designer include the Attribute Relationship Designer,
a simplified and enhanced Dimension Wizard, and
the Key Columns dialog box. You can use the new
Attribute Relationship Designer in the Dimension
Editor to easily browse and modify attribute relationships.
The Dimension Wizard, which has been modified
to align output with best practices, auto-detects
parent-child hierarchies, provides safer default error
configuration, and supports specification of member
properties. In the new Key Columns dialog box, the
enhanced Dimension Structure tab works with the
Attribute Relationship Designer, making modifying
attributes and hierarchies easier.
Continued on page 2
A new algorithm in the Aggregation Designer
helps you create initial aggregations. This designer is
optimized to work with usage-driven aggregations.
You can view the created aggregations and add to or
remove them.
Dynamic named sets are a new capability of SSAS
2008. A named set in SQL Server 2005 makes it possible
to define a set of dimension members such as a
set of the top 10 stores by sales. You define this set
statically. You can then refer to this named set wherever
you need to see the top 10 stores by sales. In SQL
Server 2005, set evaluation occurs only at set creation.
In SQL Server 2008, you can create dynamic named
sets and define them to be evaluated every time the
sets are used.
Performance enhancements. A major portion of
the SSAS performance enhancements are in areas
such as subspace computations, Multidimensional
OLAP (MOLAP)-enabled write-back, and backup
and storage.
Cube space is generally sparse, with values existing
only for a small number of dimension intersections.
Although SSAS in SQL Server 2005 evaluates
expressions on complete space, and subspace computation
is included with SP2, in SQL Server 2008 SSAS
subspace computations are significantly improved.
Multidimensional Expressions query performance
has improved; SSAS deals better with cube space by
dividing the space to separate calculated members,
regular members, and empty space to improve evaluation
of cells that need to be included in calculations.
The new MOLAP-enabled writeback capabilities
in SQL Server 2008 SSAS remove the need to
store writeback data in ROLAP storage mode. The
new writeback MOLAP storage mode results in significant
performance gains in cubes that leverage the
writeback capabilities.
Finally with SQL Server 2008 SSAS backup
compression, less storage is required to keep backups
online. The backups also run significantly faster
because less disk I/O is required. There are fewer
restrictions on the size of the database, and the
time required for backup and restore operations is
significantly reduced.
Data mining. The next important SSAS element
for any BI solution is data mining. SQL Server 2008
SSAS enhances data mining models by appending
a new algorithm to the Microsoft Time Series algorithm.
This improves the accuracy and stability of
predictions in the data mining models. The new algorithm
is based on the Auto-Regressive Integrated
Moving Average (ARIMA) algorithm, and provides
better long-term predictions than the Auto Regression
Trees with Cross Predict (ARTxp) algorithm
used in SQL Server 2005 SSAS.
By default, the new implementation of the Microsoft
Time Series algorithm uses the ARTxp algorithm
to train one version of the data mining model and the
ARIMA algorithm to train another version of the
data mining model. The algorithm then weighs the results
of these two data mining models to provide the
prediction characteristics you want. If you don’t want
to use the default implementation, you can specify the
algorithms that the Microsoft Time Series algorithm
must use.
In SQL Server 2008 Enterprise Edition, you can
specify a custom weighting of the algorithms to provide
the best prediction over a variable time span.
The improved Microsoft Time Series algorithm accepts
data during prediction to allow for new business
scenarios. For example, you can create a revenue
prediction model based on averages across products,
regional aggregates, or some other broad data set.
You can then apply that model to the time series that
shows the sales of an individual product. By applying
the general model, you can take advantage of the stability
and availability of aggregate data and customize
prediction to the individual product. You can also
train models by using multiple series, and then apply
the models to new data in forecasting scenarios.
SSRS Enhancements
Now that we’ve covered what’s new with laying the BI
groundwork with SSIS and building cubes and mining
models in SSAS, we’re ready to review the new
features and enhancements found in SSRS in SQL
Server 2008.
Report Server engine. A report server is now
implemented as a Windows-based service that hosts
the Report Manager, the Report Server Web service,
and background processing feature areas. The report
engine improves supportability and the ability to control
server behavior with memory management and
infrastructure consolidation. Consolidating server
applications into a single service reduces configuration
and maintenance tasks. However, the Report
Manager and the Report Server Web service applications
continue to run independently within the single
service. Both the Report Manager and the Report
Server Web service can be accessed through URLs
that provide HTTP access to these applications.
The report server includes an HTTP listener that
handles all authentication requests directed to a URL and a port you define during server configuration. To
provide the ASP.NET and Report Server Web service,
the report server uses the new HTTP.SYS capabilities
of the OS instead of IIS. The report server also has
new management features to set a memory threshold
for background operations and performance counters
for monitoring service activity. I’ll briefly explore
SSRS enhancements for report server deployment
modes, report authoring, and report designing.
SSRS continues to expand its delivery options
with the expansion and enhancement of Rich Text
Format (RTF), Microsoft Office Word, and Microsoft
Office Excel rendering. The improvement of
the RTF component provides a method for users
to define mixed formatting in textboxes and import
marked-up strings of the text into a report generated
from a database or other data sources. The Microsoft
Office Word 2007 rendering extension can be used to
export a report to a Word document without using a
third-party tool. Finally, the Microsoft Office Excel
rendering extension has been enhanced to support
features such as nested data regions and sub-reports.
Continued on page 3
Report authoring. While improved report rendering
is all well and good, better report authoring capabilities
bring SQL Server 2008 SSRS to a new level of usability
for developers, power users, and end users looking
for easier report creation. Microsoft has been touting
the Tablix data region type, which features fixed and
dynamic columns and rows, arbitrary nesting on rows
and columns, optional omission of row or column
headers, and the ability to apply multiple parallel rows
and column members within the same report.
Report authoring data visualizations now have
better visual fidelity between formats and support for
rich report formats, such as tables and matrices. Enhanced
features include:
- Expression placeholder text. Expressions use
placeholder display text in text boxes on the report
design surface or in data regions.
- Expression-based parameter prompts. The
Prompt property for a report parameter can be an
expression.
- Processing-time variables. Variables that are global
throughout the report or local to a particular group
can be declared and referred to in expressions.
The Report Designer has also been upgraded with
features such as new query constructs to return all instances
in a recursive hierarchy. New query constructs
support functions such as Rank and Top N. The tool
has a new UI for obtaining grand totals, and it supports
cross-joins, which are required for common
analytic queries. SSRS business users have wanted
a more user-friendly version of the BI Development
Studio report designer tool. The SQL Server team
responded by creating a separate standalone report
designer outside of this tool.
Other enhanced features to help you design reports
include:
- Entity hierarchies that provide a flattened analyticstyle
metadata browser that presents all entities as
a flattened list.
- Live data in design view that allows display of
live data by using simple iteration of design-time
elements.
- Instances in metadata browser that extend the
metadata browser to include instance data.
- Filtering on the design surface that adds UI
elements for defining basic filter conditions directly
on the design surface.
- An interface that mirrors the Office 2007
products.
- Conditional formatting in response to
customer recommendations.
- Standalone deployment that helps address issues
that occur during Click-Once deployment.
- Built-in forms authentication that enables users to
easily switch between Windows and Forms.
- Report Server application embedding that enables
the URLs in reports and subscriptions to point
back to front-end applications.
SQL Server 2008 BI—Is It for
Your Organization?
As I’ve mentioned earlier, I see scalability and performance
as the most significant areas of improvement
in SQL Server 2008. Reports run faster, various queries
can execute faster, and writebacks in SSAS are
faster. A handful of brand-new capabilities, such as
the Data Profiler in SSIS, may also make you think
seriously about migration. Overall, SQL Server 2008
is an evolutionary upgrade which provides a better
performing BI platform.