dtl

DTL Performance Tips

Here is a short list of tips to get the best performance out of DTL.
1. In your classes use a fixed length string such as char[x] or tcstring<x> to read or write character data for large data sets. This allows dtl to bind the data directly to your class. If you use std::string in your class then DTL assumes that the string might be arbitrarily long and calls SQLGetData / SQLPutData to read or write the information which is slower than directly binding a fixed length string.
2. If you are reading a large number of records or writing a large number of records at once then use the bulk_copy or bulk_copy_helper methods as these can use ODBC bulk operations to read/write the data which is much faster than working with one record at a time. This tip will give you the biggest improvement for insert operations. Bulk fetch is about 10% faster than a fetch done one row at a time, but bulk insert is often more than twice as fast as inserts that are done one row at a time.
3. If you are going to be executing a SQL statement multiple times, then try to re-use that iterator (and associated DBView). Re-using an existing iterator to re-execute an existing statement is much faster than re-creating the statement every single time.

DTL Performance Benchmarks

In order to get an understanding of how much overhead is incurred in working with a DBView we ran some tests which compare a DBView::select_iterator to a select statement using the raw ODBC API :



// Read the contents of the DB_EXAMPLE table and return a vector of the

// resulting rows

// Use DTL

vector<Example> ReadData()

{

 vector<Example> results;

 DBView<Example> view("DB_EXAMPLE");



 DBView<Example>::select_iterator read_it = view.begin();

 DBView<Example>::select_iterator end_it = view.end();



 for ( ; read_it != end_it; ++read_it)

 {

  results.push_back(*read_it);

 }



 return results;

}



// Read the contents of the DB_EXAMPLE table and return a vector of the

// resulting rows

// Use raw ODBC API

vector<Example> RawReadData()

{

	vector<Example> results;

	Example row;



	SQLHENV	     	V_OD_Env;			// Handle ODBC environment

	RETCODE		V_OD_erg;			// result of functions

	SQLHDBC		V_OD_hdbc;		// Handle connection

	SQLHSTMT	V_OD_hstmt;



	SQLINTEGER		V_OD_status[10];	// Status SQL, NULL or not

	char                	string_buffer[200];

	TIMESTAMP_STRUCT    	timestamp_buffer;

	 

	V_OD_Env = DBConnection::GetDefaultConnection().GetHENV();

	V_OD_hdbc = DBConnection::GetDefaultConnection().GetHDBC();



	V_OD_erg=SQLAllocHandle(SQL_HANDLE_STMT, V_OD_hdbc, &V_OD_hstmt);

	if ((V_OD_erg != SQL_SUCCESS) && (V_OD_erg != SQL_SUCCESS_WITH_INFO))

	{

		cout << "Error. Unable to alloc handle" << endl;

		exit(0);

	}

	SQLBindCol(V_OD_hstmt,1,SQL_C_SSHORT, &row.exampleInt, 0, &V_OD_status[0]);

	SQLBindCol(V_OD_hstmt,2,SQL_C_CHAR, string_buffer, sizeof(string_buffer), &V_OD_status[1]);

	SQLBindCol(V_OD_hstmt,3,SQL_C_DOUBLE, &row.exampleDouble, 0, &V_OD_status[2]);

	SQLBindCol(V_OD_hstmt,4,SQL_C_SLONG, &row.exampleLong, 0, &V_OD_status[3]);

	SQLBindCol(V_OD_hstmt,5,SQL_C_TIMESTAMP, &timestamp_buffer, 0, &V_OD_status[4]);



	V_OD_erg=SQLExecDirect(V_OD_hstmt,(unsigned char *)"SELECT INT_VALUE, STRING_VALUE, DOUBLE_VALUE, EXAMPLE_LONG, "

	 "EXAMPLE_DATE FROM DB_EXAMPLE",SQL_NTS);   

	if ((V_OD_erg != SQL_SUCCESS) && (V_OD_erg != SQL_SUCCESS_WITH_INFO))

	{

	   cout << "Error. Unable to execute statement" << endl;

	   exit(0);

	}



	while(SQLFetch(V_OD_hstmt) != SQL_NO_DATA)

        {

	   row.exampleStr = string_buffer;

	   row.exampleDate = timestamp_buffer;

           results.push_back(row);

    

        }  

    	SQLFreeHandle(SQL_HANDLE_STMT,V_OD_hstmt);



   	return results;

}

void test_read() {

   // read initial data from table

   _ftime( &start );

   examples = RawReadDataDummy();  // RawReadDataDummy() is the same as RawReadData() -- see below for an explanation

  _ftime( &finish );

   cout << "Dummy: " << finish.time*1000+finish.millitm 

	   - start.time*1000 - start.millitm << endl;







   // read data from table using ODBC API

   cout << "Examples read from database using raw ODBC API:" << endl;

   _ftime( &start );

   for (i = 0; i < 10; i++)

     examples = RawReadData();

   _ftime( &finish );

   cout << "Raw: " << finish.time*1000+finish.millitm 

	   - start.time*1000 - start.millitm << endl;



 

   // read data from table using DBView

   cout << "Examples read from database using DBView:" << endl;

   _ftime( &start );

   for (i = 0; i < 10; i++)

     examples = ReadData();

   _ftime( &finish );

   cout << "Read Data: " << finish.time*1000+finish.millitm 

	   - start.time*1000 - start.millitm << endl;

}

In running these tests you will notice that there is a call to a function called DummyRawReadData. The reason for this is that when we first ran our tests versus MS SQL Server 7.0, we noticed that whichever routine was called first would always run more slowly. It turns out that the way the SQL Server 7 ODBC driver works, when the first statement is executed the SQL is parsed and all 32 records from the DB_EXAMPLE table are stored into a buffer. On our test machine, the population of this buffer takes about 150ms, wheras transferring the contents of the buffer to a local vector requires about 2ms. Furthermore any subsequent ODBC call which executes the same SQL select statement is able to simply reuse the driver's data buffer. So, to in order to not distort perfomance metrics we measure the cost of this first fetch separately by a routine called RawReadDataDummy.

Test results from running the program four times using a 333MHZ Pentium computer with SQLServer 7.0 and the above code. All times shown in milliseconds as output:

RawReadDataDummy (called once)	160	150	160	190
RawReadData (called x10)	18.1	19.1	18.0	19.0
ReadData (called x10)	68.1	67.1	68.1	68.1

This gave us a feel for the overall performance difference between the two methods. The next question we asked was how much of this performance difference was a constant factor (initialization of the DBView + iterator) and how much of this difference was a variable overhead (i.e. a cost per row of data read). To answer this question we ran the code through Numega TrueTime (TM) which was able to measure the amount of time spent on a line by line basis in the functions above. Doing the math with this profiling tool led to the result that 72.4% of the function time for RawReadData was spent in the fetch loop whereas 56.3% of the function time for ReadData was in the fetch loop. This gives us:

Function	Fetch loop time in ms	Time outside fetch loop / fixed overhead
RawReadData	18.5 * .724 = 13.40	18.5 * .276 = 5.11
ReadData	68.1 * .563 = 38.34	68.1 * .437 = 29.76

Therefore if we look at the fetch loop the overhead incurred by the ReadData routine is a total 24.94ms. Since ReadData was called 10 times, and the total number of rows fetched was 32, this gives a per row overhead of 0.08ms / row when using DBView versus a the raw ODBC API calls shown above. Similarly, we can compute the fixed / construction time overhead of using a DBView as 2.46ms.

Conclusions:

There is a performance price to be paid for using the DBView code. However, the per row overhead is relatively small, on the order of 0.08ms / row on our 333MHZ test machine. Most of the overhead in using a DBView comes from the constructor calls associated with creating a DBView and a select iterator. The DBView construction overhead is on the order of 2.46ms. This construction overhead is relatively small compared to the time taken to parse a query and read rows from the database which is on the order of 150ms in RawReadDataDummy above. So, the most significant overhead is the construction cost for a DBView, but for the majority of applications this will not be noticeable as it is dwarfed by the time taken to process the underlying query against the database.

Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appears in all copies and that both that copyright notice and this permission notice appear in supporting documentation. Corwin Joy and Michael Gradman make no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.

This site written using the ORB.