Bulk copying SQL Server data from Linux and UNIX
The SQL Server ODBC driver distribution includes a bulk copy program (bcp), which lets you import and export large amounts of data (from a table, view, or result set) in and out of SQL Server databases. The Easysoft bulk copy program is based on the one provided by Microsoft. If you have already used Microsoft's bcp utility, you will recognise the Easysoft bcp's command line arguments and be able to adapt existing Windows bcp scripts for Linux and UNIX platforms.
Contents
- Overview
- Exporting data from a table or view
- Exporting data returned by a query
- Importing data into a table
- bcp format files
- bcp examples
Overview
bcp moves data from tables in the database to files on the file system and vice versa. bcp is used mostly with flat files (where there is no embedded structure information that governs relationships between records), such as tab-delimited text files. bcp also works with a special SQL Server file type that allows the transfer of non-character data.
If you need to copy SQL Server data regularly, you can rerun your bcp commands by running them as a cron job.
bcp enables you to import data exported from client applications directly from Linux and UNIX machines. You do not need to copy the exported data file to a Windows machine or make the file accessible to Windows by copying it to a Samba share.
bcp does not create database objects, so a table must exist before you can use bcp to transfer data into it.
When you use bcp to bulk copy data into a view, the usual rules about adding rows to a view apply: only views in which all columns refer to the same table can be used as a destination.
The bulk copy program is located in installation_dir/easysoft/sqlserver/bcp
where installation_dir
is the Easysoft installation directory, by default /usr/local
.
To display bcp's command line syntax, use the -h
argument. For example:
$ cd /usr/local/easysoft/sqlserver/bcp $ ./bcp -h usage: bcp {[[database_name.][schema].]{table_name | view_name} | "query"} {in | out | queryout | format} data_file options
The source or target table for the bulk copy can be a one-, two-, or three- part table name (or view name). For example, AdventureWorks.HumanResources.myTeam
is a three-part table name, which contains the identifiers database name, schema name, and object name. Use the in
and out
arguments to specify the direction of the bulk copy. To copy data from a file to a database table, use the in
option. To copy data from a database table to a file, use the out
option. data_file
specifies the source or destination file for the bulk copy data. data_file
can include an absolute or a path that is relative to the directory where bcp is being run from. If you do not include a path, bcp looks for or creates data_file
in the same directory as itself. If you are bulk copying data out of SQL Server, bcp will overwrite an existing data_file
if it has the same name and location as one previously specified.
The bcp options
let you specify the SQL Server instance to connect to and the login to use to authenticate the connection. For example, this bcp command connects to a SQL Server Express instance using a Windows user name and password:
$ ./bcp AdventureWorks.Sales.Currency out /tmp/Currency.dat \ -U mydomain\\myuser -P mypassword -S mymachine\\sqlexpress
You do not have to specify a password on the command line. If you omit the -P
argument from the command line, bcp will prompt you for one when you run the command.
For information about each bcp argument, refer to the SQL Server ODBC driver documentation.
Exporting data from a table or view
bcp has two modes of operation: interactive and noninteractive. In interactive mode, a series of prompts asks you to describe each column in the source table or view you are exporting data from (or importing data to). The following bcp command exports data from the HumanResources.myTeam
table in the AdventureWorks sample database to the myTeam.dat
file:
./bcp AdventureWorks.HumanResources.myTeam out /tmp/myTeam.dat \ -U mydomain\\myuser -S myserver\\sqlexpress
The default mode is interactive and so bcp prompts you to describe each column. For example:
Enter the file storage type of field EmployeeID [smallint]: enter the prefix-length of field EmployeeID [0]: enter the field terminator [none]:
To accept the recommended responses (shown in square brackets), press Return.
At the end of the interactive session, bcp prompts you to save this information in a text-based format file for future use. If you do this, re-entering previous format responses interactively is not necessary. For more information about format files, refer to bcp format files.
In noninteractive mode, you are not prompted for format information. To run bcp in noninteractive mode, you include one of the following arguments, which specify in advance how data should be formatted:
Data format | Argument | Description |
---|---|---|
Native | -n |
Use Native format when copying data between SQL Server tables and the data does not contain extended or double-byte character set (DBCS) characters. |
Unicode Native | -N |
Use Native format when copying data between SQL Server tables and the data contains extended or DBCS characters. |
Character | -c |
Use Character format when copying data between SQL Server and other applications. |
Unicode Character | -w |
Use Unicode Character format when copying data between SQL Server tables and the data is Unicode but does not contain extended or double-byte character set (DBCS) characters. |
These commands copy data between SQL Server instances. The character data in the table is ASCII, and so Native format (-n
) is specified. Because bcp is being run in noninteractive mode, it does not prompt for format information.
./bcp AdventureWorks.HumanResources.myTeam out myTeam \ -U mydomain\\myuser -S mymachine\\development -n
./bcp AdventureWorks.HumanResources.myTeam in myTeam \ -U mydomain\\myuser -S mymachine\\production -n
Exporting data returned by a query
If you do not want to export an entire table then you can use bcp's queryout
argument to retrieve data through a query instead. bcp executes the query, which you include in the command line, and writes the results to a file. An example bcp command that executes a query is shown later in this article.
This example bcp command calls a procedure rather than executes a query. The procedure returns a result set, which is written to a file:
./bcp "EXEC AdventureWorks.dbo.uspGetEmployeeManagers 50" queryout \ /tmp/Employees.txt -U mymachine\\myuser -S mymachine\\sqlexpress -c
Importing data into a table
To import data into a table or updatable view, you specify the table or view and the in
argument. The following example command imports comma separated data that has been exported from an OpenOffice Calc spreadsheet:
./bcp AdventureWorks.HumanResources.myTeam in ~/myTeam.csv \ -f ~/myTeam.Fmt -U mydomain\\myuser -S mymachine\\sqlexpress
A format file is used in this command to:
- Indicate that the values in the data file are comma separated.
The default delimiter is a TAB character.
- Strip the double quotation marks ("") that OpenOffice used to delimit text fields when exporting the data:
$ more ~/myTeam.csv 77,"Mia Doppleganger","Administative Assistant","Microsoft Office" 49,"Hirum Mollicat","I.T. Specialist","Report Writing"
The format file includes the quotation marks when specifying the column and row delimiters. For example, the column delimiter specified for the second column is
\",\"
. bcp excludes the column delimiter when importing data, and soMia Doppleganger","
becomesMia Doppleganger
when imported into the target table.11.0 4 1 SQLCHAR 0 7 ",\"" 1 EmployeeID "" 2 SQLCHAR 0 100 "\",\"" 2 Name Latin1_General_CI_AS 3 SQLCHAR 0 100 "\",\"" 3 Title Latin1_General_CI_AS 4 SQLCHAR 0 100 "\"\n" 4 Background Latin1_General_CI_AS
When imported, the data fields are no longer delimited with quotation marks:
$ /usr/local/easysoft/unixODBC/bin/isql.sh -v MY_DSN +---------------------------------------+ | Connected! | | | | sql-statement | | help [tablename] | | quit | | | +---------------------------------------+ SQL> select * from AdventureWorks.HumanResources.myTeam +-----------+------------------+-------------------------+------------------+ | EmployeeID| Name | Title | Background | +-----------+------------------+-------------------------+------------------+ | 77 | Mia Doppleganger | Administative Assistant | Microsoft Office | | 49 | Hirum Mollicat | I.T. Specialist | Report Writing | +-----------+------------------+-------------------------+------------------+
For more about format files, refer to bcp format files.
Notes
- If you get the error "Client unable to establish connection: eof reading input file," the row terminator in the data file you are attempting to import is different to the one bcp expects. You can work around this by specifying the actual row terminator with the
-r
argument or in a format file. For example, the row terminator in this data file is a\r
character (carriage return):$ od -c /tmp/myTeam.txt 0000000 7 7 \t M i a D o p p l e g a n 0000020 g e r \t A d m i n i s t r a t i 0000040 v e A s s i s t a n t \t M i c 0000060 r o s o f t O f f i c e \r 4 9 0000100 \t H i r u m M o l l i c a t \t 0000120 I . T . S p e c i a l i s t \t 0000140 R e p o r t W r i t i n g a 0000160 n d D a t a M i n i n g \r 0000177
To import the data, it's necessary to specify the row terminator on the command line:
./bcp AdventureWorks.HumanResources.myTeam in /tmp/myTeam.txt \ -U mydomain\\myuser -S mymachine\\sqlexpress -c -r \\r
- If you need to import large amounts of data, try breaking the data into smaller chunks. This will help if you have an error during a bulk import. (For example, a primary key error can rollback the entire import operation by default.) To do this, include
-b batch_size
in the bulk import command, wherebatch_size
is the number of rows per batch.
bcp format files
bcp lets you create format files that contain meta information (such as data type and column size), which define how bcp imports and exports data. The format file describes the layout of the data file you want to import or export and defines the correspondence between data file fields and SQL Server table columns. The format file generated by bcp is a template that you modify to allow you to work with data that is in a different format to what the target table or application expects. For example, you have some data in a CSV file that you want to import into SQL Server, but the data does not match up with the number of columns or order of columns in the database.
To create a format file, you specify format
for the mode and nul
for the data file. In addition, you must also include the -f
argument, followed by the full path name of the format file. For example:
$ ./bcp AdventureWorks.HumanResources.myTeam format nul -f /tmp/myTeam.Fmt -n \ -U mydomain\\myuser -S mymachine\\sqlexpress $ more /tmp/myTeam.Fmt 11.0 4 1 SQLSMALLINT 0 7 "\t" 1 EmployeeID "" 2 SQLNCHAR 0 50 "\t" 2 Name Latin1_General_CI_AS 3 SQLNCHAR 0 50 "\t" 3 Title Latin1_General_CI_AS 4 SQLNCHAR 0 50 "\r" 4 Background Latin1_General_CI_AS
The fields in the format file are as follows:
Row | Description |
---|---|
1 | The SQL Server version. For example: 7.0 = SQL Server 7.0 8.0 = SQL Server 2000 9.0 = SQL Server 2005 10.0 = SQL Server 2008 11.0 = SQL Server 2012 12.0 = SQL Server 2014 and later |
2 | The number of fields in the data file. |
Column | Description |
---|---|
1 | The position of the field in the data file. |
2 | The data type of the field in the data file. |
3 | The prefix length of the field in the data file. Valid prefix lengths are 0, 1, 2, 4, and 8. |
4 | The maximum length, in bytes, of the data type shown in column 2. |
5 | The terminator used to delimit the field in the data file. |
6 | The column number of the table in the SQL Server database. For example, if the fourth field in the data file maps to the sixth column in the SQL Server table, the column number for the fourth field is 6. |
7 | The name of the column taken from the SQL table. It is not necessary to use the actual name of the column, but some value must be supplied. |
8 | The collation used to store character and Unicode data in the data file. |
After you create the format file, you can then reference it in your bcp commands by specifying the -f
argument, along with the name of the format file.
$ ./bcp AdventureWorks.HumanResources.myTeam in myTeam -f myTeam.Fmt \ -U mydomain\\myuser -S mymachine\\development
The command no longer needs a format argument (such as -c
) because all the formatting information is included in the format file.
Format files enable you to bulk copy data when:
- There are more columns in the data file than in the table. For example, the target table contains a subset of the columns of the table the data was exported from.
To work around this:
- Skip a data field in the data file by changing the value of the sixth column in the format file to
0
. - Change the value of the sixth column for the remaining data fields to map them to the correct SQL Server column, where necessary.
For example, the following format file:
- Skips the second field in the data file.
- Maps the third field in the data file to the second column in the target table.
11.0 3 1 SQLINT 1 4 "\t" 1 EmployeeID "" 2 SQLDATETIME 1 8 "\t" 0 ExtraField "" 3 SQLDATETIME 1 8 "\n" 2 BirthDate ""
- Skip a data field in the data file by changing the value of the sixth column in the format file to
- There are more columns in the table than in the data file, because, for example, the table has columns with default or computed values.
To work around this:
- Delete any format file row that describes a field that is missing from the data file.
- Change the first column in the format file so that the numbers are sequential.
- Reduce the number shown in the second row of the data file accordingly.
For example, the target table for this format file contains a retirement date column whose value is computed from a date of birth column:
11.0 3 1 SQLINT 1 4 "\t" 1 EmployeeID "" 2 SQLDATETIME 1 8 "\t" 2 RetirementDate "" 3 SQLDATETIME 1 8 "\n" 3 BirthDate ""
The format file is modified to allow imports from data files that contain two fields: the employee ID and data of birth:
11.0 2 1 SQLINT 1 4 "\t" 1 EmployeeID "" 2 SQLDATETIME 1 8 "\n" 3 BirthDate ""
- The column order is different for the data file and table. (Refer to Example: columns in the data file are in a different order to that of the table.)
bcp examples
Example: exporting data from a query to a data file
bcp lets you export a result set returned by a SQL query to a data file. This is useful if you do not want to export an entire table.
Bulk exporting data from a table or view does not guarantee the order in which the data is written to the data file. You can use a query to ensure that a bulk export preserves the order of the table data in the data file.
The following example exports the names from the AdventureWorks Person.Contact
table into a comma separated file. The names are ordered by last name. A second bcp command generates a comma separated file containing the column headings. The two data files are then concatenated. (bcp output does not include column headings, as they are not part of the result set.)
$ ./bcp "SELECT FirstName, LastName FROM AdventureWorks.Person.Contact \ ORDER BY LastName" queryout body.csv -U mydomain\\myuser \ -S mymachine\\sqlexpress -c -t "," $ ./bcp "SELECT 'FirstName','LastName'" queryout heading.csv \ -U mydomain\\myuser -S mymachine\\sqlexpress -c -t "," $ cat heading.csv body.csv > contacts.csv
Example: moving data from one table to another
Transferring data from the source table to the destination table is a two-step process. The first step is to use bcp to output the data in the source table into a text file. The second step then uses the bcp command to insert the records into the destination table from the text file.
In this example, a table named HumanResources.myTeam
exists in the AdventureWorks database on both a development and a production SQL Server instance:
CREATE TABLE HumanResources.myTeam (EmployeeID smallint NOT NULL, Name nvarchar(50) NOT NULL, Title nvarchar(50) NULL, Background nvarchar(50) NOT NULL DEFAULT '' );
The table on the development instance has the following records in it:
INSERT INTO HumanResources.myTeam(EmployeeID,Name,Title,Background) VALUES(77,'Mia Doppleganger','Administrative Assistant','Microsoft Office'); INSERT INTO HumanResources.myTeam(EmployeeID,Name,Title,Background) VALUES(49,'Hirum Mollicat','I.T. Specialist','Data Mining');
This bcp command exports all the records from HumanResources.myTeam
to a flat file named myTeam in the same directory as the bcp binary:
./bcp AdventureWorks.HumanResources.myTeam out myTeam \ -U mydomain\\myuser -S mymachine\\development -n
This bcp command imports the records from the flat file into the empty table on the production server. The HumanResources.myTeam
table on the production server is now a duplicate of the table on the development server.
./bcp AdventureWorks.HumanResources.myTeam in myTeam \ -U mydomain\\myuser -S mymachine\\production -n
Example: columns in the data file are in a different order to that of the table
In some cases, a data file may have fields in an order different to that of the corresponding columns in the destination table. For example, the following data file has the same number of fields as the HumanResources.myTeam
table, however the name and title fields are in reverse order:
1 Administrative Assistant Guy Gilbert Microsoft Office 2 I.T. Specialist Kevin Brown Data Mining
bcp format files contain mapping instructions between data in a source file and columns in a target table. Altering these mapping instructions enables you to import data when there is a mismatch between data fields in a source file and columns in a target table. For example, you can alter the order in which fields in the source file are mapped to columns in the target table. This enables you to import data from a file whose data field order is different to that of the target table.
This bcp command generates a format file that shows the order in which columns are defined in the HumanResources.myTeam
table.
$ ./bcp AdventureWorks.HumanResources.myTeam format nul -f myTeam.Fmt \ -U mydomain\\myuser -S mymachine\\development -c
The sixth field in the format file indicates the column numbers for the table in the SQL Server database:
$ more myTeam.Fmt 11.0 4 1 SQLCHAR 0 7 "\t" 1 EmployeeID "" 2 SQLCHAR 0 50 "\t" 2 Name Latin1_General_CI_AS 3 SQLCHAR 0 50 "\t" 3 Title Latin1_General_CI_AS 4 SQLCHAR 0 50 "\n" 4 Background Latin1_General_CI_AS
The format file shows that in the HumanResources.myTeam
table, Name
is the second column and Title
is the third column. As these fields are in a different order in the data file, we need to alter the format file. We need to map the second field in the data file to the third column in the target table. Similarly, we need to map the third field in the data file to the second column in the target table. To do this, reverse the column order in the format file:
$ more myTeam.Fmt 11.0 4 1 SQLCHAR 0 7 "\t" 1 EmployeeID "" 2 SQLCHAR 0 50 "\t" 3 Name Latin1_General_CI_AS 3 SQLCHAR 0 50 "\t" 2 Title Latin1_General_CI_AS 4 SQLCHAR 0 50 "\n" 4 Background Latin1_General_CI_AS
For these changes to have an effect, specify the format file with the -f
argument when importing the data:
$ ./bcp AdventureWorks.HumanResources.myTeam in myTeam -f myTeam.Fmt \ -U mydomain\\myuser -S mymachine\\development
Example: automatically generating bcp commands with SQL
If you need to copy data from numerous tables in your database, you could save time by using a SQL statement to generate the necessary bcp commands. If, for example, you needed to back up all User tables in a database, you could build the necessary commands by embedding bcp fragments in a SQL statement that queries the sys.tables
view. This example SQL statement generates bcp commands that back up all User tables in the AdventureWorks database:
SELECT '/usr/local/easysoft/sqlserver/bcp/bcp AdventureWorks.' + SCHEMA_NAME(schema_id) + '.' + name + ' out /tmp/' + name + '.dat' + ' -n -U mydomain\\\\myuser -P mypassword -S myserver\\\\sqlexpress' FROM AdventureWorks.Sys.Tables ORDER BY name
This produces a result set, which contains a bcp command that exports the data from each AdventureWorks User table. For example:
/usr/local/easysoft/sqlserver/bcp/bcp AdventureWorks.Person.Address out /tmp/Address.dat -n -U mydomain\\\\myuser -P mypassword -S myserver\\\\sqlexpress /usr/local/easysoft/sqlserver/bcp/bcp AdventureWorks.Person.AddressType out /tmp/AddressType.dat -n -U mydomain\\\\myuser -P mypassword -S myserver\\\\sqlexpress . . .
The following command executes each bcp command in the result set, producing a backup of the AdventureWorks database:
$ /usr/local/easysoft/unixODBC/bin/isql.sh -b -dC -v MY_DSN < /tmp/bcp.sql | \ sed -e '/---/d' -e '/fetched/d' -e '/SQLRowCount/d' | xargs -L 1 | sh Starting copy... 1000 rows successfully bulk-copied to host file. Total received: 1000 . . .
The command passes the SQL shown earlier, which has been saved to a file, to isql running in batch mode and then passes each record in the result set to sh
.