Cassandra Data Model

Connection String Options

  1. Auto Cache
  2. Cache Connection
  3. Cache Driver
  4. Cache Location
  5. Cache Metadata
  6. Connect On Open
  7. Database
  8. Firewall Password
  9. Firewall Port
  10. Firewall Server
  11. Firewall Type
  12. Firewall User
  13. Flatten Arrays
  14. Flatten Objects
  15. Location
  16. Logfile
  17. Max Log File Size
  18. Offline
  19. Other
  20. Pagesize
  21. Password
  22. Port
  23. Proxy Auth Scheme
  24. Proxy Auto Detect
  25. Proxy Password
  26. Proxy Port
  27. Proxy Server
  28. Proxy SSL Type
  29. Proxy User
  30. Pseudo Columns
  31. Query Cache
  32. Query Passthrough
  33. Readonly
  34. Row Scan Depth
  35. RTK
  36. Server
  37. SSL Server Cert
  38. Support Enhanced SQL
  39. Tables
  40. Timeout
  41. User
  42. Verbosity

Auto Cache

Data Type

bool

Default Value

false

Remarks

When AutoCache is set, the driver automatically caches the results of SELECT queries to a cache specified by the CacheLocation option. CacheLocation defines the path to a simple, file-based cache.

AutoCache is the simplest caching configuration available, however, like any caching scheme, using a cache is not without pitfalls such as reporting on stale data. The driver is designed to be fully functional without relying on caching.

The following sections outline how and when to use AutoCache. Understanding how AutoCache works and its limitations will help you choose an effective caching strategy. For more information on deploying other caching strategies, see Caching Data.

How AutoCaching Works

 

When you execute a SELECT statement with AutoCache set, the driver executes the query to the remote data and persists the results; rows and columns that already exist are overwritten. That is, SELECT statements are used to create and refresh the cache, not to query it.

Non-queries (such as UPDATE/INSERT/DELETE statements) are executed to the remote data as well; these statements will not modify the data in the cache at all, regardless of the value set for AutoCache.

To work with the local data, append #CACHE to the table name. For example:

SELECT * FROM [Account#CACHE]
Limitations of AutoCache

 

In the following scenarios, consider the alternatives listed below:

  • When you need to work with the cache transparently: Because AutoCache requires a special syntax to utilize the cache, it is not suitable for use with BI, analytics, and reporting. Many of these tools generate SQL statements for you; these generated statements will still be executed to Cassandra instead of the cache.

    In these situations, one solution is to use the Offline property. When this is set, all queries are executed locally. See Caching: Best Practices for examples.

    One downside of this approach is that it requires a separate connection. As an alternative, consider using the CData Sync tool to maintain a local database that is kept fresh with scheduled updates.

     

  • When you need more control over cached data: The AutoCache feature does not have the ability to remove rows from the cache that were deleted from the remote data. It also does not support dropping a table from the cache or more advanced cache maintenance such as changing the cached table schemas.

    In this scenario, consider CACHE Statements. CACHE statements can remove cached rows that no longer exist in Cassandra.

    See Caching Explicitly for more information on how to use CACHE statements.

  • When you need to work with an RDBMS: AutoCache can only be used with the default database, JavaDB or SQLite. Many enterprises will need to use an RDBMS to support more concurrent writes or integrate with existing infrastructure. You can specify a database driver with CacheConnection and CacheDriver.

 

 

Cache Connection

Data Type

string

Default Value

""

Remarks

The cache database is determined based on the CacheDriver and CacheConnection properties. The CacheConnection defines the connection properties necessary to connect to the cache database.

Cache Driver=com.microsoft.sqlserver.jdbc.SQLServerDriver;Cache Connection='jdbc:sqlserver://localhost:7437;user=sa;password=123456;databaseName=Cache'

 

 

 

Cache Driver

Data Type

string

Default Value

""

Remarks

You can cache to any database that you have a JDBC driver for. The driver has been tested with SQL Server, Derby and Java DB, MySQL, Oracle, and SQLite.

The cache database is determined based on CacheDriver and the CacheConnection properties. The CacheDriver is the name of the JDBC driver class that you would like to use to cache data. The example below caches to SQL Server:

Cache Driver=com.microsoft.sqlserver.jdbc.SQLServerDriver;Cache Connection='jdbc:sqlserver://localhost:7437;user=sa;password=123456;databaseName=Cache'
Note that the driver JAR must be specified on the classpath.

 

 

Cache Location

Data Type

string

Default Value

""

Remarks

If AutoCache is set but the cache location is not specified, CacheLocation defaults to the cache folder on the directory specified by the Location setting.

The CacheLocation is a simple, file-based cache. See the CacheConnection and CacheDriver properties to cache to other databases.

 

Cache Metadata

Data Type

bool

Default Value

false

Remarks

The cache.db file will be created in the location specified by the CacheConnection or if that is not set, the CacheLocation.

 

Connect On Open

Data Type

bool

Default Value

true

Remarks

When set to 'true', a connection will be made to Cassandra when the connection is opened. This property enables the 'Test Connection' feature available in various database tools.

This feature acts as a NOOP command as it is used to verify a connection can be made to Cassandra and nothing from this initial connection is maintained.

Setting this property to 'false' may provide performance improvements (depending upon the number of times a connection is opened).

 

Database

Data Type

string

Default Value

""

Remarks

The name of the Cassandra database.

 

 

Firewall Password

Data Type

string

Default Value

""

Remarks

If FirewallServer is specified, the FirewallUser and FirewallPassword properties are used to connect and authenticate to the given firewall.

 

Firewall Port

Data Type

string

Default Value

""

Remarks

Note that the driver sets the FirewallPort to the default port associated with the specified FirewallType. See the description of the FirewallType option for details.

 

Firewall Server

Data Type

string

Default Value

""

Remarks

If this property is set to a domain name, a DNS request is initiated and the name is translated to the corresponding IP address.

 

Firewall Type

Data Type

string

Default Value

"NONE"

Remarks

The applicable values are:

 

   
Firewall Type Default FirewallPort
TUNNEL 80
SOCKS4 1080
SOCKS5 1080

 

 

Firewall User

Data Type

string

Default Value

""

Remarks

If the FirewallServer is specified, the FirewallUser and FirewallPassword properties are used to connect and authenticate against the firewall.

 

Flatten Arrays

Data Type

string

Default Value

""

Remarks

By default, nested arrays are returned as strings of JSON. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. This is only recommended for arrays that are expected to be short.

Set FlattenArrays to the number of elements you want to return from nested arrays. The specified elements are returned as columns. The zero-based index is concatenated to the column name. Other elements are ignored.

For example, you can return an arbitrary number of elements from an array of strings:

["FLOW-MATIC","LISP","COBOL"]
When FlattenArrays is set to 1, the preceding array is flattened into the following table:

 

   
Column Name Column Value
languages_0 FLOW-MATIC

 

 

Flatten Objects

Data Type

bool

Default Value

false

Remarks

By default, objects nested in arrays are returned as strings of JSON. When FlattenObjects is set to true, object properties are flattened into columns of their own. The property name is concatenated onto the object name with an underscore to generate the column name.

For example, you can flatten the nested objects below at connection time:

[
     { "grade": "A", "score": 2 },
     { "grade": "A", "score": 6 },
     { "grade": "A", "score": 10 },
     { "grade": "A", "score": 9 },
     { "grade": "B", "score": 14 }
]
When FlattenObjects is set to true and FlattenArrays is set to 1, the preceding array is flattened into the following table:

 

   
Column Name Column Value
grades_0_grade A
grades_0_score 2

 

 

Location

Data Type

string

Default Value

""

Remarks

The path to a directory which contains the schema files for the driver (.rsd files for tables and views, .rsb files for stored procedures). The Location property is only needed if you would like to customize definitions (e.g., change a column name, ignore a column, etc.) or extend the data model with new tables, views, or stored procedures.

The schema files used in your application must be deployed with other assemblies. You must also ensure that Location points to the folder that contains the schema files. The folder location can be a relative path from the location of the executable.

 

Logfile

Data Type

string

Default Value

""

Remarks

For more control over what is written to the log file, take a look at Verbosity.

 

Max Log File Size

Data Type

string

Default Value

""

Remarks

A string specifying the maximum size in bytes for a log file (ex: 10MB). When the limit is hit, a new log is created in the same folder with the date and time appended to the end. There is no limit by default. Values lower than 100kB will use 100kB as the value instead.

 

Offline

Data Type

bool

Default Value

false

Remarks

When Offline is set to TRUE, all queries execute against the cache as opposed to the live data source. In this mode, certain queries like INSERT, UPDATE, DELETE, and CACHE are not allowed.

 

Other

Data Type

string

Default Value

""

Remarks

The Other property is a semicolon-separated list of name-value pairs used in connection parameters specific to a data source.

Caching Configuration

 

   
CachePartial=True Caches only a subset of columns, which you can specify in your query.
QueryPassthrough=True Passes the specified query to the cache database instead of using the SQL parser of the driver.

 

Integration and Formatting

 

   
SupportAccessLinkedMode In Access' linked mode, it is generally a good idea to always use a cache as most data sources do not support multiple Id queries. However if you want to use the driver in Access but not in linked mode, this property must be set to False to avoid using a cache of a SELECT * query for the given table.
ConvertDateTimesToGMT Whether to convert date-time values to GMT, instead of the local time of the machine.
RecordToFile=filename Records the underlying socket data transfer to the specified file.
ClientCulture This property can be used to specify the format of data (e.g., currency values) that is accepted by the client application. This property can be used when the client application does not support the machine's culture settings. For example, Microsoft Access requires 'en-US'.
Culture This setting can be used to specify culture settings that determine how the driver interprets certain data types that are passed into the driver. For example, setting Culture='de-DE' will output German formats even on an American machine.

 

 

Pagesize

Data Type

string

Default Value

"5000"

Remarks

The Pagesize property affects the maximum number of results to return per page from Cassandra when executing a query. A higher value will return more results per page, but may also cause more time consuming.

 

Password

Data Type

string

Default Value

""

Remarks

The password used to authenticate with Cassandra.

 

 

Port

Data Type

string

Default Value

"9042"

Remarks

The port for the Cassandra database.

 

 

Proxy Auth Scheme

Data Type

string

Default Value

"BASIC"

Remarks

This value may be BASIC, DIGEST, NONE, NTLM, NEGOTIATE or PROPRIETARY.

 

Proxy Auto Detect

Data Type

bool

Default Value

true

Remarks

This indicates whether to use the default system proxy settings or not. Set ProxyAutoDetect to FALSE to use custom proxy settings. This takes precedence over other proxy settings.

 

Proxy Password

Data Type

string

Default Value

""

Remarks

If the ProxyServer is specified, the ProxyUser and ProxyPassword properties are used to connect and authenticate against the firewall.

 

Proxy Port

Data Type

string

Default Value

"80"

Remarks

See the description of the ProxyServer field for details.

 

Proxy Server

Data Type

string

Default Value

""

Remarks

If this property is set to a domain name, a DNS request is initiated and the name is translated to the corresponding address.

 

Proxy SSL Type

Data Type

string

Default Value

"AUTO"

Remarks

This value may be AUTO, ALWAYS, NEVER, or TUNNEL.

 

Proxy User

Data Type

string

Default Value

""

Remarks

If a ProxyServer is specified, the ProxyUser and ProxyPassword options are used to connect and authenticate against the firewall.

 

Pseudo Columns

Data Type

string

Default Value

""

Remarks

This setting is particularly helpful in Entity Framework, which does not allow you to set a value for a pseudo column unless it is a table column. The value of this connection setting is of the format "Table1=Column1, Table1=Column2, Table2=Column3". You can use the "*" character to include all tables and all columns; i.e., "*=*".

 

Query Cache

Data Type

string

Default Value

""

Remarks

The QueryCache allows you to cache the results of a query in-memory and use them until the cache expires. Setting the QueryCache can improve performance if the same or similar (see below) queries are executed often. The in-memory query cache is shared across connections, so it can help with performance even if more than one connection is being used.

The cache manager for QueryCache will not only use the results in the cache for exactly the same query, but also for queries that represent a subset of data in the cached query. For example, in the following queries, the cache created while executing Query A will be used to obtain the results for both Query B and Query C.

SELECT * from Account; // Query A
SELECT * from Account WHERE Name LIKE '%John'; // Query B
SELECT Id, Name from from Account LIMIT 10; // Query C

 

Setting the QueryCache to zero disables in-memory caching.

 

Query Passthrough

Data Type

bool

Default Value

false

Remarks

This option passes the query to Cassandra as-is.

 

Readonly

Data Type

bool

Default Value

false

Remarks

If this property is set to true, the driver will allow only SELECT queries. INSERT, UPDATE, DELETE, and stored procedure queries will cause an error to be thrown.

 

Row Scan Depth

Data Type

string

Default Value

"100"

Remarks

The columns in a table must be determined by scanning table rows. This value determines the maximum number of rows that will be scanned. The default value is 100.

Setting a high value may decrease performance. Setting a low value may prevent the data type from being determined properly, especially when there is null data.

 

RTK

Data Type

string

Default Value

""

Remarks

The RTK property may be used to license a build. Please see the included licensing file to see how to set this property. The runtime key is only available if you purchased an OEM license.

 

Server

Data Type

string

Default Value

""

Remarks

The host name or IP address of the server hosting the Cassandra database.

 

 

SSL Server Cert

Data Type

string

Default Value

""

Remarks

If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate to be accepted from the server. Any other certificate that is not trusted by the machine will be rejected. This can take the form of a full PEM certificate, the path to a file containing the certificate, the public key, the MD5 thumbprint, or the SHA1 thumbprint. If not specified, any trusted certificate will be accepted. Use '*' to signify to accept all certificates (not recommended for security concerns).

 

Support Enhanced SQL

Data Type

bool

Default Value

true

Remarks

Use QueryCache to cache in memory for a given time in seconds or to disable in-memory caching.

 

Tables

Data Type

string

Default Value

""

Remarks

Listing the tables from some databases can be expensive. Providing a list of tables in the connection string improves the performance of the driver.

 

Timeout

Data Type

string

Default Value

"60"

Remarks

If the Timeout property is set to 0, operations do not time out: They run until they complete successfully or encounter an error condition.

If Timeout expires and the operation is not yet complete, the driver throws an exception.

 

User

Data Type

string

Default Value

""

Remarks

The username used to authenticate with Cassandra.

 

 

Verbosity

Data Type

string

Default Value

"1"

Remarks

The verbosity level determines the amount of detail that the driver reports to the Logfile. Verbosity levels from 1 to 5 are supported. These are described below:

 

   
1 Setting Verbosity to 1 will log the query, the number of rows returned by it, the start of execution and the time taken, and any errors.
2 Setting Verbosity to 2 will log everything included in Verbosity 1, cache queries, and HTTP headers.
3 Setting Verbosity to 3 will additionally log the body of the HTTP requests.
4 Setting Verbosity to 4 will additionally log transport-level communication with the data source. This includes SSL negotiation.
5 Setting Verbosity to 5 will additionally log communication with the data source and additional details that may be helpful in troubleshooting problems. This includes interface commands.

The Verbosity should not be set to greater than 1 for normal operation. Substantial amounts of data can be logged at higher verbosities, which can delay execution times.