Speeding Large Data Loads with Bulk Copy Program

The Bulk Copy Program (BCP) isn't sexy or exciting, and it isn't the subject of many articles. Still, it's the most practical way to handle high-speed data loads in SQL Server. This month I'll explain what BCP is and give some tips to help squeeze every drop of performance from it.

BCP is Microsoft's version of an export/import utility. (For a custom file export utility that combines Visual Basic 4.0's TreeView control with SQL Server 6.5's DMO BulkCopy object, see Mike Otey, "Exporting Data from SQL Server," January 1997.) This command-line program moves data between an existing database table or view and an operating-system file stored in ASCII or machine format. Machine format is useful only for transferring data between SQL Servers running on the same CPU architecture (e.g., Intel, Alpha, PowerPC); people can't read machine format. BCP has many uses, but it's especially helpful for moving large data sets between SQL Server and other database systems or for dealing with extracts from external data feeds or legacy systems.

Sometimes people think BCP is part of the SQL Server engine and try to run commands from ISQL/w or a similar interface. BCP isn't a Transact-SQL command, and it's not part of the server. BCP is a C-language client that communicates with the database using special BCP extensions in the DB-Library (DB-Lib) API (a SQL Server-specific programming interface). Importing or exporting data with an Open Database Connectivity (ODBC)-enabled application such as Microsoft Access is often easier because Microsoft doesn't provide a GUI for BCP, and the command-line syntax can be arcane and difficult to master. ODBC-enabled applications are OK for small data sets, but avoid them when you're bulk loading a lot of data. ODBC applications load data using standard Insert statements, which are many times slower than BCP.

Increasing BCP's Performance
BCP performance can vary significantly with different uses. The following tips can help make BCP fly.

Tip 1: Always use fast BCP.
You load data with BCP in either fast or slow BCP mode. As you can guess, fast BCP is quicker, but you pay a price. Fast BCP forces you to set a database option, Select into/bulk copy, with sp_dboption. This option lets nonlogged operations occur; don't leave it on in production databases because it can prevent you from backing up your transaction log. (See SQL Server Books Online--BOL--Effect of Select into/bulk copy on Transaction Logs for more information.) Fast mode also requires you to drop any indexes on the target table; BCP will revert to slow mode if you forget to remove even one index. You can ignore the BOL and third-party references telling you that triggers on a target table also cause BCP to use slow mode: SQL Server 6.0 fixed this problem.
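For example, assuming a hypothetical target database named Sales, you can flip the option on from ISQL/w before the load and off again afterward with something like the following sketch (the database name is a placeholder; substitute your own):

```sql
-- Enable nonlogged operations so BCP can run in fast mode
-- (remember: while this is on, you can't back up the transaction log)
EXEC sp_dboption 'Sales', 'select into/bulkcopy', true

-- ...drop the target table's indexes and run the fast BCP load here...

-- Turn the option back off so log backups work again
EXEC sp_dboption 'Sales', 'select into/bulkcopy', false
```

After turning the option off, dump the database or transaction log so you have a valid backup that includes the newly loaded data.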

Transaction log activity accounts for the big difference in speed between slow and fast BCP. Slow mode causes almost as much logging as adding data with an Insert statement, whereas fast BCP does not log individual rows but logs new space allocations only when extents are linked into the table's page chain. (Extents are 16KB blocks of eight 2KB pages that SQL Server uses to manage data.) The difference between fast and slow BCP is so great that data loads are usually faster if you drop the indexes, run fast BCP, then re-create the indexes, instead of using slow mode and fully logging all the data insertions. If your table has a clustered index, sort the data before running BCP. Then create the index using the sorted_data option, which tells SQL Server it doesn't need to re-sort the data when it creates the index.
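As a sketch, assuming a hypothetical SalesHistory table whose data file is presorted on OrderDate, the index rebuild after the fast BCP load might look like this:

```sql
-- Re-create the clustered index after the load; sorted_data tells
-- SQL Server the rows are already in order, so it skips the sort step
CREATE CLUSTERED INDEX ix_SalesHistory_OrderDate
ON SalesHistory (OrderDate)
WITH SORTED_DATA
```

The table and index names are illustrative only. If the data turns out not to be in order, SQL Server rejects the sorted_data request, so make sure your extract really is sorted on the index key.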

Tip 2: Run BCP from the server.
Running BCP on the same machine as SQL Server is a great way to boost performance because it eliminates tons of network overhead. Consider the following scenario: Server A runs SQL Server and stores the data files to be loaded. You invoke BCP from Workstation X and load data into Server A. BCP does not run as part of SQL Server, so the data is copied first to Workstation X, which immediately sends it back to Server A to be loaded. You move a large data file around the network twice, once to the BCP client machine and once to the SQL Server, but you can eliminate that traffic by running BCP directly on the server. In most cases, copying the remote data file to the server with NT's ordinary file copy commands and then running BCP locally is faster than BCP'ing across the network because NT's caching and file-transfer mechanisms are more efficient. Even if you don't run BCP from the server, avoid pointing BCP at a remote data file, because doing so moves the file around the network twice.

Tip 3: Use local named pipes.
When BCP runs on the same machine as SQL Server, using local named pipes greatly speeds the process. Local pipes are an interprocess communication (IPC) mechanism and completely bypass the network to optimize processes running on the same machine. Local pipes act like a TCP/IP loopback so data never goes to the NT Redirector; by comparison, network named pipes send data through the Redirector, even if both processes run locally on the same machine. (You can easily see the difference in Redirector activity in Performance Monitor--Perfmon--by looking at the Bytes Total/sec counter in the Redirector object.) BCP automatically uses a local named pipe if you do not provide a server name when you run the command.
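For instance, a fast-mode load of a hypothetical Sales..SalesHistory table from a native-format file on the server might look like the following; note that no -S server name appears, so BCP connects over a local named pipe:

```
bcp Sales..SalesHistory in salehist.dat -n -Usa -P -b 10000 -e bcp.err
```

The table name, file name, and batch size are placeholders for illustration; -n specifies native format, -b sets the batch size, and -e names a file to receive rejected rows.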

Compaq has a great white paper, Configuration and Tuning of Microsoft SQL Server 6.5 for Windows NT on Compaq Servers (http://www.compaq.com/support/techpubs/whitepapers/415a0696.html), full of in-depth tuning information. The paper compares BCP load times using a variety of packet size and IPC configurations; the comparisons show that local pipes are 300 percent faster than network pipes.

Tip 4: Place BCP and SQL Server data on separate disks.
Running BCP through a local pipe reduces network overhead but can introduce disk I/O-related bottlenecks. Open the throttle by writing your BCP data files to a fast RAID array on a different physical drive from your SQL Server devices. If this approach is impractical in your environment, run benchmarks with your data and your servers to determine which is worse: the network overhead involved with a remote BCP or the contention on the disk while BCP and SQL Server fight for the same I/O bandwidth.

Tip 5: Install Service Pack 1 for SQL Server 6.5.
Service Pack 1 (SP1) for SQL Server 6.5 includes two nifty enhancements that improve BCP load times up to 700 percent and provide throughput rates of approximately 3.5MB per second (MBps), according to Microsoft benchmarks. First, an engine optimization lets BCP write an extent at a time using n private buffers rather than a page at a time using a single public buffer area. You can set n between 1 and 10 with

sp_configure 'backup buffer size', n

This setting has a big positive impact on speed, if your disk subsystem can handle the load. Experiment to find the best value.

Enabling the new Table lock on bulk load option in sp_tableoption activates the second improvement, which tells SQL Server to hold one exclusive table lock rather than grab individual locks as each new extent is linked into the page chain. Previously, BCP locked at the extent level, which could exhaust the configured number of locks when you load large data sets. Locking at the table level lets SQL Server acquire a single lock for the duration of the BCP load, so you can load data using much larger batch sizes without running out of locks; in fact, you can set the batch size to the number of rows in the table you're loading. These two enhancements speed data loading with BCP in the fast, nonlogged mode; unfortunately, you won't see much change when you use the slow, logged version of BCP.
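A minimal sketch of enabling the option, again assuming a hypothetical SalesHistory target table:

```sql
-- Hold one exclusive table lock for the whole BCP load instead of
-- acquiring extent locks as each extent is linked in
-- (requires SQL Server 6.5 Service Pack 1)
EXEC sp_tableoption 'SalesHistory', 'table lock on bulk load', 'true'
```

Because the table is exclusively locked for the duration of the load, other users can't touch it until BCP finishes, so this option belongs in off-hours load windows, not on tables with concurrent activity.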

The changes open BCP's floodgates to the I/O subsystem, so the boost is negligible unless your controller and disks can keep up with the increased I/O load. After playing with SP1 and talking with others who have used it, I believe the 700 percent gain Microsoft achieved is optimistic for most sites. A few Microsoft engineers I know have reported 200 percent to 300 percent improvements using fast RAID controllers and lots of disks at corporate sites. In comparison, load times improved by 30 percent on my lowly P120 40MB laptop.

Tip 6: Experiment with packet size.
Before SP1, packet size had little effect as long as it was at least 4KB. A Microsoft engineer familiar with the SP1 BCP enhancements suggests using a 16KB packet size to take full advantage of BCP's new ability to write data in 16KB extents rather than 2KB pages.

My tests show that 16KB packets improve times slightly, but I haven't tested packet sizes exhaustively. Try playing with this option in your environment.
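You set the packet size with BCP's -a switch, which takes a value in bytes. Reusing the hypothetical table and file names from earlier, a 16KB-packet load might read:

```
bcp Sales..SalesHistory in salehist.dat -n -Usa -P -a 16384
```

All names here are placeholders; only the -a 16384 value is the point of the example.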

Tip 7: Use native mode.
BCP can work with data in native SQL Server format or in an ASCII representation, which is more fun to look at. Native format uses slightly less disk space and runs a little faster because the data isn't converted to and from ASCII. Don't expect the same boost as from a local named pipe, but every little bit helps.
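You choose the format on the command line: -n for native and -c for character (ASCII). For example, exporting the same hypothetical table both ways looks like this:

```
bcp Sales..SalesHistory out salehist.dat -n -Usa -P
bcp Sales..SalesHistory out salehist.txt -c -Usa -P
```

Again, the names are placeholders. Stick with -n when both ends are SQL Servers on the same CPU architecture; fall back to -c when another system, or a person, has to read the file.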

Put 'Em All Together
BCP tuning is one part science, one part art, and three parts experimentation, so I can't guarantee success. But I'd be very surprised if these tips don't improve performance at your site.

BCP isn't the most user-friendly interface in the world, and we all hope that one day Microsoft will provide a more attractive, easier-to-use utility. But for now, BCP is the best game in town, so you might as well make the most of it.