The Easysoft ODBC-Apache Spark Driver is installed on the computer where your applications are running. ODBC applications access ODBC drivers through the ODBC Driver Manager and a data source. The data source tells the Driver Manager which ODBC driver to load, which Spark Thrift server to connect to and how to connect to it. This chapter describes how to create data sources, use DSN-less connections and configure the Easysoft ODBC-Apache Spark Driver.
Before setting up a data source, you must have successfully installed the Easysoft ODBC-Apache Spark Driver.
For Easysoft ODBC-Apache Spark Driver installation instructions, see Installation.
This section describes how to configure the Easysoft ODBC-Apache Spark Driver to connect to Apache Spark via a Spark Thrift server by using a data source or a DSN-less connection string. The section assumes you are, or are able to consult with, a database administrator.
There are two ways to set up a data source for your Apache Spark data:
By default, the Easysoft ODBC-Apache Spark Driver installation creates a SYSTEM data source named [SPK_SAMPLE]. If you are using the unixODBC included in the Easysoft ODBC-Apache Spark Driver distribution, the SYSTEM odbc.ini file is in /etc.
If you built unixODBC yourself, or installed it from some other source, SYSTEM data sources are stored in the path specified with the configure option --sysconfdir=directory. If sysconfdir was not specified when unixODBC was configured and built, it defaults to /usr/local/etc.
If you accepted the default choices when installing the Easysoft ODBC-Apache Spark Driver, USER data sources must be created and edited in $HOME/.odbc.ini.
To display the directory where unixODBC stores SYSTEM and USER data sources, type odbcinst -j. By default, you must be logged in as root to edit a SYSTEM data source defined in /etc/odbc.ini.
You can either edit the sample data source or create new data sources.
Each section of the odbc.ini file starts with a data source name in square brackets [ ] followed by a number of attribute=value pairs.
The Driver attribute identifies the ODBC driver in the odbcinst.ini file to use for a data source.
When the Easysoft ODBC-Apache Spark Driver is installed into unixODBC, it places an Easysoft ODBC-Spark entry in odbcinst.ini. For Easysoft ODBC-Apache Spark Driver data sources, therefore, you need to include a Driver = Easysoft ODBC-Spark entry.
To configure an Apache Spark data source in your odbc.ini file, you need to specify the Spark Thrift server and authentication details. For example:
Driver=Easysoft Apache Spark ODBC
Description=Easysoft Apache Spark ODBC driver
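Building on the Driver and Description lines above, a complete data source entry might look like the following sketch. The data source name, server, port and authentication values are illustrative, and the Server, Port and Authentication attribute names are assumed to match the keywords used in the DSN-less connection string shown later in this chapter; see Attribute Fields for the full list of settings and their exact names:

[SPARK_SAMPLE]
Driver=Easysoft Apache Spark ODBC
Description=Easysoft Apache Spark ODBC driver
Server=mythriftserver
Port=10000
Authentication=NONE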
The Easysoft ODBC-Apache Spark Driver must be able to find the following shared objects, which are installed during the Easysoft ODBC-Apache Spark Driver installation:
By default, this is located in /usr/local/easysoft/unixODBC/lib.
By default, this is located in /usr/local/easysoft/lib.
By default, this is located in /usr/local/easysoft/lib.
You may need to set and export LD_LIBRARY_PATH, SHLIB_PATH or LIBPATH (depending on your operating system and run-time linker) to include the directories where libodbcinst.so, libeslicshr.so and libessupp.so are located.
The shared object file extension (.so) may vary depending on the operating system (.so, .a or .sl).
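For example, in a Bourne-compatible shell on a system that uses LD_LIBRARY_PATH, and assuming the default installation directories listed above, you might set:

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/easysoft/lib:/usr/local/easysoft/unixODBC/lib
export LD_LIBRARY_PATH

Add equivalent lines to the relevant shell startup file if you need the setting to persist for the user that runs your ODBC application.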
The isql query tool lets you test your Easysoft ODBC-Apache Spark Driver data sources.
To test the Easysoft ODBC-Apache Spark Driver connection
1. Change directory into /usr/local/easysoft/unixODBC/bin.
2. Type ./isql.sh -v data_source, where data_source is the name of the target data source.
3. At the prompt, type an SQL query. For example:
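SELECT * FROM mytable;

(Here, mytable is a placeholder; query a table that exists in your Spark database.)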
Type help to return a list of tables:
To connect an ODBC application on a Windows machine to an Apache Spark server:
1. Open ODBC Data Source Administrator:
The ODBC Data Source Administrator dialog box is displayed:
2. Select the User DSN tab to set up a data source that only you can access.
Select the System DSN tab to create a data source which is available to anyone who logs on to this Windows machine.
3. Click Add... to add a new data source.
The Create New Data Source dialog box displays a list of drivers:
4. Select Easysoft ODBC-Apache Spark Driver and click Finish.
The DSN Setup dialog box is displayed. For details of the attributes that can be set in this dialog box, see Attribute Fields.
This section lists the attributes which can be set for the Easysoft ODBC-Apache Spark Driver in a table showing:
Attributes which are text fields are displayed as value.
Attributes which are logical fields can contain either 0 (to set to off) or 1 (to set to on) and are displayed as "0|1".
If an attribute can contain one of several specific values, each possible value is shown, separated by a pipe symbol.
For example, if the possible values are shown as "1|2|3", the value entered may be "1", "2" or "3".
The name of the User or System data source to be created, as used by the application when calling the SQLConnect or SQLDriverConnect functions.
Descriptive text that may be retrieved by certain applications to describe the data source.
The host name or IP address of the machine on which your Spark Thrift server is running.
If applicable to the chosen Authentication method, the user name (or LDAP DN) required to gain access to the Spark Thrift server.
The port on which the Spark Thrift Server is listening. For non-HTTP Thrift server transports, the default port is 10000. For HTTP Thrift server transports, the default port is 10001.
The length that the Easysoft ODBC-Apache Spark Driver reports for varchar columns. If you are using the driver under Oracle and get the error "illegal use of long data type", try setting this attribute to 8000. If you are using the driver under SQL Server and get the error "requested conversion is not supported", try setting this attribute to 2048.
Whether to encrypt data passed over the communications channel between the Easysoft ODBC-Apache Spark Driver and the Spark Thrift server.
Whether to bypass validation of the certificate used by the Spark Thrift server. This setting is only applicable if Encrypt is set to Yes. If TrustServerCert is set to No, you need to specify the path to the Thrift server certificate with the Server Cert attribute.
The certificate used by the Spark Thrift server to encrypt connections to it.
If your Spark Thrift server's hive.server2.authentication attribute is set to NONE, set Authentication to NONE.
If your Spark Thrift server's hive.server2.authentication attribute is set to NOSASL, set Authentication to NOSASL.
If your Spark Thrift server's hive.server2.authentication attribute is set to LDAP, set Authentication to LDAP.
If your Spark Thrift server's hive.server2.authentication attribute is set to Kerberos and the Thrift server uses a non-Windows KDC, set Authentication to Kerberos.
If your Spark Thrift server's hive.server2.authentication attribute is set to Kerberos and the Thrift server uses a Windows AD KDC, set Authentication to AD.
If your Spark Thrift server's hive.server2.transport.mode attribute is set to http, set Authentication to HTTP_BASIC.
If your Spark Thrift server's hive.server2.transport.mode attribute is set to http and uses an access token based authentication scheme (for example Databricks), set Authentication to HTTP_OAUTH.
The Kerberos principal for the Spark Thrift server. This setting is only relevant if Authentication is set to Kerberos or AD.
If you are using an HTTP-based Thrift server transport, set this attribute to the HTTP endpoint for the Spark Thrift server. For example, cliservice.
If your Spark Thrift server uses an access token based authentication scheme (for example, Databricks), specify the token with this attribute.
In addition to using a data source, you can also connect to a database by using a DSN-less connection string of the form:
SQLDriverConnect(..."DRIVER={Easysoft Apache Spark ODBC Driver};
SERVER=mythriftserver;PORT=10000;AUTHENTICATION=NONE"...)
In the DRIVER keyword, you need to use Easysoft Apache Spark ODBC Driver (Windows) or Easysoft Apache Spark ODBC (Linux) to identify the Easysoft ODBC-Apache Spark Driver.
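As a minimal sketch of how an application might use such a DSN-less connection string on Linux through unixODBC, the following C program allocates the standard ODBC handles and calls SQLDriverConnect. The server name is a placeholder, the Linux driver name from the paragraph above is assumed, and error handling is reduced to a simple return-code check; compile against unixODBC (for example, with -lodbc).

#include <stdio.h>
#include <sql.h>
#include <sqlext.h>

int main(void)
{
    SQLHENV env = SQL_NULL_HENV;
    SQLHDBC dbc = SQL_NULL_HDBC;
    SQLCHAR outstr[1024];
    SQLSMALLINT outstrlen;
    SQLRETURN ret;

    /* DSN-less connection string: the server name is a placeholder.
       On Windows, the DRIVER value would be Easysoft Apache Spark ODBC Driver. */
    SQLCHAR connstr[] =
        "DRIVER={Easysoft Apache Spark ODBC};"
        "SERVER=mythriftserver;PORT=10000;AUTHENTICATION=NONE";

    /* Allocate an environment handle and request ODBC 3 behaviour. */
    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
    SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0);

    /* Allocate a connection handle and connect without a data source. */
    SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);
    ret = SQLDriverConnect(dbc, NULL, connstr, SQL_NTS,
                           outstr, sizeof(outstr), &outstrlen,
                           SQL_DRIVER_NOPROMPT);

    if (SQL_SUCCEEDED(ret)) {
        printf("Connected: %s\n", (char *)outstr);
        SQLDisconnect(dbc);
    } else {
        printf("Connection failed.\n");
    }

    SQLFreeHandle(SQL_HANDLE_DBC, dbc);
    SQLFreeHandle(SQL_HANDLE_ENV, env);
    return 0;
}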