Friday, May 23, 2008

Data load speed test

I've run some data load tests with various databases using DBMonster, connecting to the databases through JDBC on a Windows XP personal computer.
Here are the results. In both cases I loaded 100 rows into the parent table and 1000 into the child table, with foreign keys enabled.


Firebird 2.1 with Jaybird 2.1.3 and DBMonster 1.0.3 (and Java 6)

Table structure is:

CREATE TABLE GUYS(
GUY_ID Integer NOT NULL,
GUY_NAME Varchar(45) NOT NULL,
CONSTRAINT PK_GUYS PRIMARY KEY (GUY_ID)
);
GRANT DELETE, INSERT, REFERENCES, SELECT, UPDATE
ON GUYS TO SYSDBA WITH GRANT OPTION;

CREATE TABLE BADS_ATTRIBUTES(
ATTRIBUTE_ID Integer NOT NULL,
GUY_ID Integer NOT NULL,
ATTRIBUTE_NAME Varchar(45) NOT NULL,
CONSTRAINT PK_BADS_ATTRIBUTES PRIMARY KEY (ATTRIBUTE_ID,GUY_ID)
);
ALTER TABLE BADS_ATTRIBUTES ADD CONSTRAINT FK_BADS_ATTRIBUTES_1
FOREIGN KEY (GUY_ID) REFERENCES GUYS (GUY_ID) ON UPDATE CASCADE ON DELETE CASCADE;
GRANT DELETE, INSERT, REFERENCES, SELECT, UPDATE
ON BADS_ATTRIBUTES TO SYSDBA WITH GRANT OPTION;


D:\dbmonster-core-1.0.3\bin>dbmonster --grab -t guys bads_attributes -o d:/firebird_2_1_schema.xml

D:\dbmonster-core-1.0.3\bin>rem Batch file to run dbmonster under Windows

D:\dbmonster-core-1.0.3\bin>rem Contributed by Peter De Bruycker
2008-05-23 22:43:03,203 INFO SchemaGrabber - Grabbing schema from database. 2 tables to grab.
2008-05-23 22:43:03,265 INFO SchemaGrabber - Grabbing table GUYS. 50% done.
2008-05-23 22:43:03,359 INFO SchemaGrabber - Grabbing table BADS_ATTRIBUTES. 100% done.
2008-05-23 22:43:03,359 INFO SchemaGrabber - Grabbing schema from database complete.

D:\dbmonster-core-1.0.3\bin>dbmonster -s d:/firebird_2_1_schema.xml

D:\dbmonster-core-1.0.3\bin>rem Batch file to run dbmonster under Windows

D:\dbmonster-core-1.0.3\bin>rem Contributed by Peter De Bruycker
2008-05-23 22:49:32,828 INFO DBMonster - Let's feed this hungry database.
2008-05-23 22:49:32,984 INFO DBCPConnectionProvider - Today we are feeding: Firebird 2.1 Beta 2=WI-T2.1.0.16780 Firebird 2.1 Beta 2/tcp (xxx)/P10 WI-T2.1.0.16780 Firebird 2.1 Beta 2=WI-T2.1.0.16780 Firebird 2.1 Beta 2/tcp (pm-7071b5d42629)/P10
2008-05-23 22:49:33,187 INFO Schema - Generating schema .
2008-05-23 22:49:33,187 INFO Table - Generating table .
2008-05-23 22:49:33,187 INFO Table - Generating table .
2008-05-23 22:49:33,359 INFO Table - Generation of table finished.
2008-05-23 22:49:39,375 INFO Table - Generation of table finished.
2008-05-23 22:49:39,375 INFO Schema - Generation of schema finished.
2008-05-23 22:49:39,375 INFO DBMonster - Finished in 6 sec. 547 ms.

D:\dbmonster-core-1.0.3\bin>

The same test with MySQL 5.1.23 and Connector/J 5.1.6 (InnoDB tables, of course, since I wanted foreign keys).

Table structure is:

DROP TABLE IF EXISTS `test`.`guys`;
CREATE TABLE `test`.`guys` (
`guy_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`guy_name` varchar(45) NOT NULL,
PRIMARY KEY (`guy_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1001 DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`bads_attributes`;
CREATE TABLE `test`.`bads_attributes` (
`attribute_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`guy_id` int(10) unsigned NOT NULL,
`attribute_name` varchar(45) NOT NULL,
PRIMARY KEY (`attribute_id`,`guy_id`),
KEY `FK_bads_attributes_1` (`guy_id`),
CONSTRAINT `FK_bads_attributes_1` FOREIGN KEY (`guy_id`) REFERENCES `guys` (`guy_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=139 DEFAULT CHARSET=utf8;

D:\dbmonster-core-1.0.3\bin>dbmonster --grab -t guys bads_attributes -o d:/mysql_5_1_23_schema.xml

D:\dbmonster-core-1.0.3\bin>rem Batch file to run dbmonster under Windows

D:\dbmonster-core-1.0.3\bin>rem Contributed by Peter De Bruycker
2008-05-23 22:59:40,515 INFO SchemaGrabber - Grabbing schema from database. 2 tables to grab.
2008-05-23 22:59:40,671 INFO SchemaGrabber - Grabbing table guys. 50% done.
2008-05-23 22:59:40,703 INFO SchemaGrabber - Grabbing table bads_attributes. 100% done.
2008-05-23 22:59:40,703 INFO SchemaGrabber - Grabbing schema from database complete.

D:\dbmonster-core-1.0.3\bin>dbmonster -s d:/mysql_5_1_23_schema.xml

D:\dbmonster-core-1.0.3\bin>rem Batch file to run dbmonster under Windows

D:\dbmonster-core-1.0.3\bin>rem Contributed by Peter De Bruycker
2008-05-23 23:00:02,531 INFO DBMonster - Let's feed this hungry database.
2008-05-23 23:00:02,953 INFO DBCPConnectionProvider - Today we are feeding: MySQL 5.1.23-rc-community
2008-05-23 23:00:03,093 INFO Schema - Generating schema .
2008-05-23 23:00:03,093 INFO Table - Generating table .
2008-05-23 23:00:03,125 INFO Table - Generating table .
2008-05-23 23:00:12,812 INFO Table - Generation of table finished.
2008-05-23 23:00:49,000 INFO Table - Generation of table finished.
2008-05-23 23:00:49,000 INFO Schema - Generation of schema finished.
2008-05-23 23:00:49,000 INFO DBMonster - Finished in 14 sec. 187 ms.

D:\dbmonster-core-1.0.3\bin>

The difference is quite large!
You can compare my results with those obtained by the SQLite team; I hope these numbers make sense to you.
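The shape of the load above can be sketched with a few lines of code. This is a minimal sketch using Python's built-in sqlite3 driver rather than the JDBC setup from the test (so the timings are not comparable to the numbers above), but it reproduces the same structure: 100 parent rows, 1000 child rows, foreign keys enabled, load time measured.

```python
# Sketch of the test shape: parent/child tables with a cascading FK,
# 100 parents and 1000 children, loaded in one transaction and timed.
# Uses SQLite (stdlib), not the Firebird/MySQL setup from the post.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FKs are off by default in SQLite

conn.execute("""CREATE TABLE GUYS(
    GUY_ID   INTEGER NOT NULL,
    GUY_NAME VARCHAR(45) NOT NULL,
    CONSTRAINT PK_GUYS PRIMARY KEY (GUY_ID))""")
conn.execute("""CREATE TABLE BADS_ATTRIBUTES(
    ATTRIBUTE_ID   INTEGER NOT NULL,
    GUY_ID         INTEGER NOT NULL REFERENCES GUYS(GUY_ID)
                   ON UPDATE CASCADE ON DELETE CASCADE,
    ATTRIBUTE_NAME VARCHAR(45) NOT NULL,
    CONSTRAINT PK_BADS_ATTRIBUTES PRIMARY KEY (ATTRIBUTE_ID, GUY_ID))""")

start = time.perf_counter()
with conn:  # one transaction for the whole load
    conn.executemany("INSERT INTO GUYS VALUES (?, ?)",
                     [(i, "guy_%d" % i) for i in range(100)])
    conn.executemany("INSERT INTO BADS_ATTRIBUTES VALUES (?, ?, ?)",
                     [(i, i % 100, "attr_%d" % i) for i in range(1000)])
elapsed = time.perf_counter() - start
print("loaded in %.3f sec" % elapsed)
```

Parents are inserted before children so the foreign key constraint is satisfied at every step.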

I'll try with PostgreSQL too, I just don't know when.

4 comments:

sdirector said...

InnoDB insert speed is going to be largely tied to the innodb_buffer_pool_size and innodb_log_file_size settings. The default values of these settings are not suitable for an insert speed test - did you raise them past their default values?

Mark Robson said...

Insert speed on such tiny tables is largely irrelevant in the real world. Try 10M rows for a proper test; also see the notes above about InnoDB tuning.

What batch size did it use between commits? All in one transaction? This is likely to make a big difference on any database.

Of course, in MySQL with autocommit turned on, InnoDB will commit every single insert statement to disk - you can improve this by using batch inserts (which are done as a single transaction no matter how many rows are added).

Real-world databases have significant numbers of rows.

What kind of hardware did you run it on (was this production-spec hardware? Did you have a battery-backed RAID controller?), and was this over a network or locally?
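The point about commit frequency can be sketched in code. This is a hypothetical illustration using Python's built-in sqlite3 driver rather than MySQL/InnoDB, but the principle is the same: committing after every insert pays the flush cost once per row, while batching all rows into one transaction pays it once per load.

```python
# Commit-per-row vs. one transaction for the whole load.
# Uses SQLite (stdlib) as a stand-in for the MySQL setup in the post.
import sqlite3

def load(conn, one_txn):
    conn.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, name TEXT)")
    rows = [(i, "row_%d" % i) for i in range(1000)]
    if one_txn:
        with conn:  # single transaction, one commit at the end
            conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    else:
        for row in rows:  # autocommit-style: one commit per insert
            conn.execute("INSERT INTO t VALUES (?, ?)", row)
            conn.commit()
    return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

per_row = load(sqlite3.connect(":memory:"), one_txn=False)
batched = load(sqlite3.connect(":memory:"), one_txn=True)
print(per_row, batched)
```

Both paths load the same 1000 rows; on a disk-backed database the per-row-commit path is the one that gets dramatically slower as row counts grow.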

pabloj said...

Thanks to both of you for these comments. I'd just like to point out that I wasn't trying to benchmark a data-warehouse-style massive load process; in that case I wouldn't have used straight inserts with keys enabled and so on, and speed would have been much higher.

pabloj said...

Another take on this: http://dammit.lt/2008/05/26/insert-speed-paramy-auto-inc/