Retrieving data with Perl DBI DBD::ODBC
Contents
- Introduction
- Prerequisites
- Assumptions
- Perl DBI and DBD::ODBC architecture
- Simple methods of retrieving data
- Hash and reference methods of retrieving data
- Special cases
Prerequisites
The prerequisite software for this tutorial is:
-
Perl
We used Perl 5.8, but you only need the minimum version required by the DBI and DBD::ODBC modules, which is currently 5.6. Use
perl --version
to find out what version of Perl you have installed. -
DBI module
We used DBI 1.45 and DBI 1.607, but the samples in this tutorial should work with anything after 1.40. To find out if you have a recent enough version of DBI installed, run:
perl -e 'use DBI 1.40;'
If you get an error like "DBI version 1.40 required--this is only version 1.30 at -e line 1," you need to upgrade DBI.
If you get an error saying "DBI cannot be found in @INC," you have probably not got DBI installed.
Go to CPAN to get an up-to-date version of the DBI module.
-
DBD::ODBC
We used DBD::ODBC 1.11 and DBD::ODBC 1.17. You can use similar methods to the ones shown above to check whether DBD::ODBC is installed and to see what version you have.
To check that you have the DBD::ODBC module installed:
perl -e 'use DBD::ODBC;'
If you have not got DBD::ODBC installed, refer to Enabling ODBC support in Perl with Perl DBI and DBD::ODBC for installation instructions.
To show the DBD::ODBC version:
perl -MDBD::ODBC -e 'print $DBD::ODBC::VERSION;'
To show all drivers DBI knows about and their versions:
perl -MDBI -e 'DBI->installed_versions;'
We also used the Easysoft ODBC-ODBC Bridge as the ODBC driver to access a remote Microsoft SQL Server database from UNIX.
Assumptions
-
Operating system
This tutorial was designed on UNIX and Linux, and we have assumed you are using UNIX or Linux too. However, all the Perl examples should work equally well on Windows as long as some minor alterations to the command line are made.
-
ODBC Driver Manager
We assume that you're using the unixODBC Driver Manager.
Perl DBI and DBD::ODBC architecture
Perl script using DBI methods | v API | v DBI | v DBD::ODBC | v ODBC Driver Manager (for example, unixODBC) | v ODBC driver
Simple methods of retrieving data
Specifying what you want
You specify the data you want to retrieve from the database by using SQL SELECT
statements. The basic form of a SQL SELECT
statement is:
SELECT column_list from table
where:
column_list
is a comma-separated list of columns, or*
(for all columns). In fact, it can be many more things; consult a SQL reference manual.table
is a table or view in the database.
You can qualify the rows you want back by adding a WHERE
clause like:
WHERE column = 'text'
Your SQL SELECT
statement is passed to the DBI prepare method, which, in ODBC terms, ends up in a call to the SQLPrepare
or SQLExecDirect
ODBC APIs.
Be careful when creating SQL from Perl variables, as you can include characters that invalidate the SQL. For example:
my $name = "can't"; my $sql = qq/select * from table where column = '$name'/;
$sql
is invalid because it's now:
select * from table where column = 'can't'
Most databases would require the second '
to be doubled up. You can use the DBI quote
method to do this for you with:
my $name = $dbh->quote("can't"); my $sql = qq/select * from table where column = $name/;
The other way (recommended) is to use bound parameters.
Simple prepare and execute
This code obtains and displays all the rows in a table:
my $dbh = DBI->connect(); # connect my $sql = qq/select * from table/; # the query to execute my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(); # execute the query my @row; while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; }
Here, we prepare the SQL, execute it, and then use the fetchrow_array
method to return an array of column values for each row. Repeatedly calling the fetchrow_array
method until it returns undef
, retrieves all rows in the result set.
Note NULL
fields are returned as undef
in the array returned by fetchrow_array
(more about NULL
values later).
Note Don't use fetchrow_array
in a scalar context unless you really know what you are doing, as it can return the first or last column (dependent on the ODBC driver) which could be a NULL
value and therefore undef
. Also fetchrow_array
can return undef
if there are no more rows or if an error occurs.
There are other methods of obtaining the result set — refer to later sections in this tutorial.
Obtaining metadata from the result set
ODBC can describe the result set for you by listing the number of columns and the names of the columns. You can obtain this information with the statement attributes NUM_OF_FIELDS
, NAME, NAME_uc
, NAME_lc
, NAME_hash
, NAME_lc_hash
, and NAME_uc_HASH
.
Assuming you have the table fred
with columns a
(integer
), b
(char(10)
) and
c
(float
):
my $sth = $dbh->prepare(select * from fred); $sth->{NUM_OF_FIELDS} returns 3. $sth->{NAME}->[0] returns 'a' although it could return 'A' $sth->{NAME_uc}->[0] returns 'A'. $sth->{NAME_lc}->[0] returns 'a'.
Note The NAME
attribute can return lower case or upper case column names depending on the database. Some database always uppercase column names in CREATE
statements if they are unquoted. Some databases retain the case of column names if they are enclosed in the identifier quote character. For this reason, it's best to use NAME_uc
or NAME_lc
.
Note. Some ODBC drivers may be unable to return a column name at all if the column is an expression like 1
or a function. For example, SELECT 1 FROM table
in SQL Server returns an empty string for $sth-{NAME}->[0]
. You can get around this by using column aliases as in SELECT 1 as col1 FROM table
.
NAME_hash
, NAME_lc_hash
, and NAME_uc_hash
are like NAME
, NAME_lc
and NAME_uc
except that the result is a hash instead of an array, with the keys being the column names, and the values being the column index starting at 0.
Similarly, the TYPE
attribute returns an array reference of column types. For the fred
table above:
print join(", ", @{$sth->{TYPE}}), "\n";
returns 4
, 1
, and 6
. The column types are defined by international standards (refer to the DBI manual).
Using parameters
The main reasons for using parameters are:
- You can prepare the SQL once then execute many times with different parameters thus skipping the prepare parsing.
- With bound parameters, you don't need to bother about quoting issues.
Expanding the simple example in Simple prepare and execute, to include a WHERE
clause, we have:
my $dbh = DBI->connect(); # connect my $var = $dbh->quote("value to search for"); my $sql = qq/select * from table where column = $var/; # the query to execute my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(); # execute the query my @row; while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; }
This is fine but what if we want to execute the query multiple times with different values for $var
? We can repeat the $sql
assignment with different $var
values and re-run $dbh->prepare
, but this is inefficient because it causes the ODBC driver and database to re-parse the SQL each time, which is unnecessary.
A better solution is to use the following:
my @row; # row data my $dbh = DBI->connect(); # connect my $var = "value to search for"; my $sql = qq/select * from table where column = ?/; # the query to execute with parameter my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(($var)); # execute the query with parameter while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; } $var = "another value to search for"; $sth->execute(($var)); # execute the query (no need to re-prepare) while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; }
Here, the SQL contains a parameter marker (?
) indicating to the ODBC driver that we will provide the values later. The SQL is passed to the prepare
method and the database will parse the SQL and note a parameter is required. When the execute
method is called, we pass the parameters required. You can use multiple parameter markers, for example:
$sql = q/select * from table where col1 = ? and col2 = ?/; . . $sth->execute(($param1, $param2));
Note You must pass the array of parameters in the order that you want them to match the parameter markers in the SQL. In the above example, $param1
substitutes the first parameter marker in col1 = ?
and $param2
substitutes the parameter marker in col2 = ?
.
Note You can't use parameter markers in place of column names. For example, select ? from table
or select * from table where ? = 'A'
.
When passing the parameters to the execute
method, a default binding type is used. DBD::ODBC attempts to find out what the parameter markers represent and bind the parameters as the same type. If your ODBC driver does not have the ODBC API SQLDescribeParam
, a string binding is used (SQL_VARCHAR
). You can also bind the parameters yourself before the execute
call, and in this way, you can specify how the parameters should be bound:
# import SQL types for use in bind_param method: use DBI qw(:sql_types); $sql = q/select * from table where col1 = ?/; $sth = $dbh->prepare($sql); # uses default bind type: $sth->bind_param(1, $param1); # Specify the parameter should be bound as an SQL_INTEGER $sth->bind_param(1, $param1, {TYPE => SQL_INTEGER});
Note Once the parameters are bound, you can't change the type they are bound as, although you can obviously change the bound values.
Using bound columns
In previous examples in this tutorial, we used the fetchrow_array
method to retrieve the result set. DBI also provides a method to bind the returned column data to Perl variables using the bind_col
and bind_columns
methods:
Going back to the first example, we had:
my $dbh = DBI->connect(); # connect my $sql = qq/select * from table/; # the query to execute my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(); # execute the query my @row; while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; }
which can be rewritten to use bound columns as follows:
my $dbh = DBI->connect(); # connect my $sql = qq/select * from table/; # the query to execute my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(); # execute the query my ($col1, $col2, $col3); # Note columns start at 1 (not 0). $sth->bind_col(1, \$col1); $sth->bind_col(2, \$col2); $sth->bind_col(3, \$col3); while ($sth->fetch) { # retrieve one row print "$col1, $col2, $col3\n"; }
Binding columns and using the fetch
method is usually faster than using methods like fetchrow_array
. As with bind_param
, you can specify the type the column is bound as.
$sth->bind_col(1, \$col1, {TYPE => SQL_INTEGER});
$sth->bind_col(2, \$col2, {TYPE => SQL_CHAR
});$sth->bind_col(3, \$col3, {TYPE => SQL_FLOAT});
In ODBC, there's usually no need to specify the type. Instead of separate calls to bind_col
, you can bind all the columns in the above example in one go with:
$sth->bind_columns(\$col1, \$col2, \$col3);
–Or–
$sth->bind_columns(\($col1, $col2, $col3));
Hash and reference methods of retrieving data
DBI supports a number of methods that return a result set (or part of it) in hash or array references. The following examples assume you have already prepared and executed a SELECT
on a table containing two rows with two columns named a
and b
:
column a | Column b ---------+--------- 1 | one 2 | two
fetchrow_arrayref
my $ref; while($ref = $sth->fetchrow_arrayref) { print join (", ", @{$ref}), "\n"; }
1, one 2, two
fetchrow_hashref
my $ref; while($ref = $sth->fetchrow_hashref) { print join (", ", keys %$ref), "\n"; print join (", ", values %$ref), "\n"; }
a, b 1, one a, b 2, two
fetchall_arrayref
# # You must set RaiseError or check $sth->err because # fetchall_arrayref returns the data fetched so far. # $sth->{RaiseError} = 1; my $ref; $ref = $sth->fetchall_arrayref; print "Number of rows returned is ", 0 + @{$ref}, "\n"; foreach $r (@{$ref}) { print join(", ", @{$r}), "\n"; }
1, one 2, two
You can add a slice to fetchall_arrayref
to specify which columns to return:
- To return all columns as above:
$ref = $sth->fetchall_arrayref([]);
1, one 2, two
- To return only the first column:
$ref = $sth->fetchall_arrayref([0]);
1 2
- To return only the last column:
$ref = $sth->fetchall_arrayref([-1]);
one two
- To return each row as a
hashref
:$ref = $sth->fetchall_arrayref({}); print "Number of rows returned is ", 0 + @{$ref}, "\n"; print join (", ", keys %{$ref->[0]}), "\n"; foreach $r (@{$ref}) { print join(", ", (values %$r)), "\n"; } print "Number of rows returned is ", 0 + @{$ref}, "\n"; print join (", ", keys %{$ref->[0]}), "\n"; foreach $r (@{$ref}) { print join(", ", (values %$r)), "\n"; }
Number of rows returned is 2 a 1 2
Note When specifying the names of the columns to return in this way, the case of keys for the returned hashes always matches the case used in the parameter to
fetchall_hashref
, regardless of whatFetchKeyHashName
is set to.
You can add a max_rows
argument to fetchall_arrayref
to restrict the number of rows:
$ref = $sth->fetchall_arrayref([], 1);
1, one
This method is particularly useful if:
- Your database engine does not support the
top n
SQL syntax. - You want the increased performance of
fetchall_arrayref
. - You want to keep memory down but also need to process all the rows:
fetchall_arrayref
can be called repeatedly on the same result set to get rows in chunks.
A rather elegant way of doing this, shown in the DBI documentation, is:
$sth->{RaiseError} = 1; my $rows = []; # cache for batches of rows while(my $row = (shift(@$rows) || # get row from cache, or reload cache: shift(@{$rows=$sth->fetchall_arrayref(undef,1)||[]}))) { print join(", ", @{$row}), "\n"; }
1, one 2, two
Special cases
SQL_CHAR
types and trailing spaces
Databases store char(n)
columns as exactly n
characters. For example, you insert FRED
into a char(10)
column. When you retrieve the column value, you'll get 'FRED '. This often leads to confusion, especially in tests like $var eq 'FRED'
. With the following table definition and INSERT
statements:
create table tut2_8 (a int, b char(10)) insert into tut2_8 values (1, 'one') insert into tut2_8 values (2, 'two')
the following code never prints out Found 'two'
.
my ($col1, $col2); $sth->bind_columns(\($col1, $col2)); while ($sth->fetch) { # retrieve one row print "$col1, $col2\n"; print "Found 'two'" if ($col2 eq 'two'); }
The ChopBlanks
attribute is provided to help you here. ChopBlanks
may be set on the connection or statement handle and is inherited from connections. The default for ChopBlanks
is false
. If you insert $sth->{ChopBlanks} = 1;
before the call to execute
, the above test now works.
Note ChopBlanks
only works on fixed-length CHAR
columns.
Long columns
Suppose you create a table with the following code:
my $dbh = DBI->connect(); # connect $dbh->do(q/create table tut2_9 (a int, b text)/); my $sth = $dbh->prepare(q/insert into tut2_9 values (1, ?)/); my $p = 'x' x 500; $sth->execute($p);
The text
column type here is a SQL Server long
data type. Other databases have memo
, blob
, and so on.
The following data retrieval code:
$sth = $dbh->prepare(q/select * from tut2_9/); $sth->execute; my ($col1, $col2); $sth->bind_columns(\($col1, $col2)); while ($sth->fetch) { # retrieve one row print "$col1, $col2\n"; }
fails with:
DBD::ODBC::st fetch failed: [unixODBC][][Microsoft] [ODBC SQL Server Driver] String data, right truncation (SQL-01004) (DBD: st_fetch/SQLFetch (long truncated DBI attribute LongTruncOk not set and/or LongReadLen too small) err=-1) at ./tut2_9.pl line 19.
The attributes LongReadLen
and LongTruncOk
control how long
columns are retrieved. By default, DBD::ODBC has LongReadLen
set to 80
and LongTruncOk
is false
. You can find the current settings with:
print "LongReadLen is '", $h->{LongReadLen}, "'\n"; print "LongTruncOk is ", $h->{LongTruncOk}, "\n";
When LongTruncOk
is false
(as above), attempts to retrieve more than LongReadLen
bytes from a column fail. If you're not bothered about the column data being truncated, set LongTrunkOk
to 1
. If you need more than 80
bytes from the column, you need to set LongReadLen
. If you want long
columns ignored and no data fetched from them (undef
returned instead) then you need to set LongReadLen
to 0
and LongTruncOk
to true
.
Handling NULL
data
NULL
in SQL
Columns that have the NULL
value are special. NULL
is not the same as an empty string. You need to be careful using comparisons on columns that are nullable. For instance:
Given the table CREATE TABLE fred (a int, b char(1) null)
:
Col1 | Col2 -----+----- 1 | Y 2 | N 3 | NULL
SELECT * FROM fred WHERE col2 = 'Y'
returns:
col1 | col2 -----+----- 1 | Y
and SELECT * FROM fred WHERE col2 > 'Y'
returns:
col1 | col2 -----+----- 1 | N
Some people expect the result of the second query to include row 3 because they think col2 > 'Y'
includes NULL; it doesn't. In relational operations, unlike binary operations, there are three states that an operation can result in: TRUE
, FALSE
, and UNKNOWN
. The last state is the one that needs a closer look at.
The UNKNOWN
state occurs for any logical operation where the result cannot be evaluated as either TRUE
or FALSE
, and is connected with the NULL
concept.
A NULL
in a SQL table, can be regarded as a placeholder for missing information, it's not a value as such, just a marker to indicate the lack of information. So in the query for all rows where col2
is not equal to Y
, the rows containing a NULL
are not returned. This is because the answer to the question "Does the col2 column contain values not equal to Y" returns FALSE
, as the column contains the unknown NULL
.
The normal way to select NULL
columns in SQL syntax is using column is null
(the opposite being column is not null
). So to find all the rows in table Fred
where col2
has a NULL
value, use:
select * from fred where col2 is null
NULL
in Perl
In Perl DBI, NULL
column values are represented by undef
(the undefined value). As a result, the following Perl run against the table above:
$sth = $dbh->prepare(q/select * from tut2_19/); $sth->execute; my @row; while(@row = $sth->fetchrow_array) { print join(", ", @row), "\n"; }
1, Y 2, N 3,
results in "Use of uninitialised value in join or string at XXX.pl line NNN," if you have warnings turned on (and you should). Use either of the following examples to avoid this:
# print out row print DBI::neat_list(\@row), "\n"; or # change all NULLs to empty strings foreach (@row) {$_ = '' unless defined};
If you're using a parameterised query, you may expect to do:
$sql = q/select * from table where a = ?/;
$stmt = $dbh->prepare($sql);
$stmt->execute(undef); # pass NULL
as parameter
but with some databases this does not work. In ODBC terms, the above code results in a call to the ODBC API SQLBindParameter
with a data pointer of NULL
and an indicator of SQL_NULL_DATA
. Some ODBC drivers and databases do not select all the rows where the column is NULL
when this method is used. As shown above, a standard way of selecting a NULL
column is where column is null
, but you can't parameterise this. This has been the subject of much discussion and one method put forward is:
my $test = undef; $sql = q/select * from table where (col = ? or (col is null and ? = 1))/; $stmt->execute($test,defined($test)?0:1);