H2 Logo
English Japanese
Search:

Highlight keyword(s)
Home
Quickstart
Installation
Tutorial
Features
Performance
Advanced Topics
JaQu
Download

Reference
SQL Grammar
Functions
Data Types
Javadoc
Docs as PDF (1 MB)
Error Analyzer

Appendix
Build
History & Roadmap
Links
FAQ
License

 

Performance

Performance Comparison
PolePosition Benchmark
Application Profiling
Database Profiling
Performance Tuning

Performance Comparison

In many cases H2 is faster than other (open source and not open source) database engines. Please note this is mostly a single connection benchmark run on one computer.

Embedded

Test Case Unit H2 HSQLDB Derby
Simple: Init ms 547 532 2594
Simple: Query (random) ms 250 391 1515
Simple: Query (sequential) ms 188 313 1406
Simple: Update (random) ms 812 1750 17704
Simple: Delete (sequential) ms 203 250 8843
Simple: Memory Usage MB 7 11 11
BenchA: Init ms 578 719 3328
BenchA: Transactions ms 3047 2406 12907
BenchA: Memory Usage MB 10 15 10
BenchB: Init ms 2141 2406 11562
BenchB: Transactions ms 1125 1375 3625
BenchB: Memory Usage MB 9 11 8
BenchC: Init ms 688 594 4500
BenchC: Transactions ms 1906 64062 6047
BenchC: Memory Usage MB 11 17 11
Executed statements # 322929 322929 322929
Total time ms 11485 74798 74031
Statements per second # 28117 4317 4362

Client-Server

Test Case Unit H2 HSQLDB Derby PostgreSQL MySQL
Simple: Init ms 2782 2656 5625 4563 3484
Simple: Query (random) ms 3093 2703 6688 4812 3860
Simple: Query (sequential) ms 2969 2594 6437 4719 3625
Simple: Update (random) ms 2969 3531 18250 5953 5125
Simple: Delete (sequential) ms 1047 1250 6875 2485 2390
Simple: Memory Usage MB 7 11 14 0 0
BenchA: Init ms 2250 2453 6031 4328 3625
BenchA: Transactions ms 10250 9016 21484 15609 11172
BenchA: Memory Usage MB 10 15 10 0 1
BenchB: Init ms 9500 10672 22609 19609 13406
BenchB: Transactions ms 2734 2656 3875 4688 2531
BenchB: Memory Usage MB 10 11 11 1 1
BenchC: Init ms 1860 1484 6890 2219 3438
BenchC: Transactions ms 9046 63266 18641 11703 7421
BenchC: Memory Usage MB 12 17 13 0 1
Executed statements # 322929 322929 322929 322929 322929
Total time ms 48500 102281 123405 80688 60077
Statements per second # 6658 3157 2616 4002 5375

Benchmark Results and Comments

H2

Version 1.1.114 (2009-06-01) was used for the test. For simpler operations, the performance of H2 is about the same as for HSQLDB. For more complex queries, the query optimizer is very important. However H2 is not very fast in every case, certain kind of queries may still be slow. One situation where is H2 is slow is large result sets, because they are buffered to disk if more than a certain number of records are returned. The advantage of buffering is, there is no limit on the result set size. The open/close time is almost fixed, because of the file locking protocol: The engine waits some time after opening a database to ensure the database files are not opened by another process.

HSQLDB

Version 1.8.0.10 was used for the test. Cached tables are used in this test (hsqldb.default_table_type=cached), and the write delay is 1 second (SET WRITE_DELAY 1). HSQLDB is fast when using simple operations. HSQLDB is very slow in the last test (BenchC: Transactions), probably because is has a bad query optimizer. One query where HSQLDB is slow is a two-table join:

SELECT COUNT(DISTINCT S_I_ID) FROM ORDER_LINE, STOCK
WHERE OL_W_ID=? AND OL_D_ID=? AND OL_O_ID<? AND OL_O_ID>=?
AND S_W_ID=? AND S_I_ID=OL_I_ID AND S_QUANTITY<?

The PolePosition benchmark also shows that the query optimizer does not do a very good job for some queries. Another disadvantage of HSQLDB is the slow startup / shutdown time (currently not listed) when using bigger databases. The reason is, a backup of the whole data is made whenever the database is opened or closed.

Derby

Version 10.4.2.0 was used for the test. Derby is clearly the slowest embedded database in this test. This seems to be a structural problem, because all operations are really slow. It will be hard for the developers of Derby to improve the performance to a reasonable level. A few problems have been identified: Leaving autocommit on is a problem for Derby. If it is switched off during the whole test, the results are about 20% better for Derby. Derby supports a testing mode (system property derby.system.durability=test) where durability is disabled. According to the documentation, this setting should be used for testing only, as the database may not recover after a crash. Enabling this setting improves performance by a factor of 2.6 (embedded mode) or 1.4 (server mode). Even if enabled, Derby is still less than half as fast as H2 in default mode.

PostgreSQL

Version 8.3.7 was used for the test. The following options where changed in postgresql.conf: fsync = off, commit_delay = 1000. PostgreSQL is run in server mode. It looks like the base performance is slower than MySQL, the reason could be the network layer. The memory usage number is incorrect, because only the memory usage of the JDBC driver is measured.

MySQL

Version 5.1.34-community was used for the test. MySQL was run with the InnoDB backend. The setting innodb_flush_log_at_trx_commit (found in the my.ini file) was set to 0. Otherwise (and by default), MySQL is really slow (around 140 statements per second in this test) because it tries to flush the data to disk for each commit. For small transactions (when autocommit is on) this is really slow. But many use cases use small or relatively small transactions. Too bad this setting is not listed in the configuration wizard, and it always overwritten when using the wizard. You need to change this setting manually in the file my.ini, and then restart the service. The memory usage number is incorrect, because only the memory usage of the JDBC driver is measured.

Firebird

Firebird 1.5 (default installation) was tested, but the results are not published currently. It is possible to run the performance test with the Firebird database, and any information on how to configure Firebird for higher performance are welcome.

Why Oracle / MS SQL Server / DB2 are Not Listed

The license of these databases does not allow to publish benchmark results. This doesn't mean that they are fast. They are in fact quite slow, and need a lot of memory. But you will need to test this yourself. SQLite was not tested because the JDBC driver doesn't support transactions.

About this Benchmark

How to Run

This test was executed as follows:

build benchmark

Separate Process per Database

For each database, a new process is started, to ensure the previous test does not impact the current test.

Number of Connections

This is mostly a single-connection benchmark. BenchB uses multiple connections; the other tests use one connection.

Real-World Tests

Good benchmarks emulate real-world use cases. This benchmark includes 3 test cases: A simple test case with one table and many small updates / deletes. BenchA is similar to the TPC-A test, but single connection / single threaded (see also: www.tpc.org). BenchB is similar to the TPC-B test, using multiple connections (one thread per connection). BenchC is similar to the TPC-C test, but single connection / single threaded.

Comparing Embedded with Server Databases

This is mainly a benchmark for embedded databases (where the application runs in the same virtual machine as the database engine). However MySQL and PostgreSQL are not Java databases and cannot be embedded into a Java application. For the Java databases, both embedded and server modes are tested.

Test Platform

This test is run on Windows XP with the virus scanner switched off. The VM used is Sun JDK 1.5.

Multiple Runs

When a Java benchmark is run first, the code is not fully compiled and therefore runs slower than when running multiple times. A benchmark should always run the same test multiple times and ignore the first run(s). This benchmark runs three times, but only the last run is measured.

Memory Usage

It is not enough to measure the time taken, the memory usage is important as well. Performance can be improved by using a bigger cache, but the amount of memory is limited. HSQLDB tables are kept fully in memory by default; this benchmark uses 'disk based' tables for all databases. Unfortunately, it is not so easy to calculate the memory usage of PostgreSQL and MySQL, because they run in a different process than the test. This benchmark currently does not print memory usage of those databases.

Delayed Operations

Some databases delay some operations (for example flushing the buffers) until after the benchmark is run. This benchmark waits between each database tested, and each database runs in a different process (sequentially).

Transaction Commit / Durability

Durability means transaction committed to the database will not be lost. Some databases (for example MySQL) try to enforce this by default by calling fsync() to flush the buffers, but most hard drives don't actually flush all data. Calling fsync() slows down transaction commit a lot, but doesn't always make data durable. When comparing the results, it is important to think about the effect. Many database suggest to 'batch' operations when possible. This benchmark switches off autocommit when loading the data, and calls commit after each 1000 inserts. However many applications need 'short' transactions at runtime (a commit after each update). This benchmark commits after each update / delete in the simple benchmark, and after each business transaction in the other benchmarks. For databases that support delayed commits, a delay of one second is used.

Using Prepared Statements

Wherever possible, the test cases use prepared statements.

Currently Not Tested: Startup Time

The startup time of a database engine is important as well for embedded use. This time is not measured currently. Also, not tested is the time used to create a database and open an existing database. Here, one (wrapper) connection is opened at the start, and for each step a new connection is opened and then closed.


PolePosition Benchmark

The PolePosition is an open source benchmark. The algorithms are all quite simple. It was developed / sponsored by db4o.

Test Case Unit H2 HSQLDB MySQL
Melbourne write ms 369 249 2022
Melbourne read ms 47 49 93
Melbourne read_hot ms 24 43 95
Melbourne delete ms 147 133 176
Sepang write ms 965 1201 3213
Sepang read ms 765 948 3455
Sepang read_hot ms 789 859 3563
Sepang delete ms 1384 1596 6214
Bahrain write ms 1186 1387 6904
Bahrain query_indexed_string ms 336 170 693
Bahrain query_string ms 18064 39703 41243
Bahrain query_indexed_int ms 104 134 678
Bahrain update ms 191 87 159
Bahrain delete ms 1215 729 6812
Imola retrieve ms 198 194 4036
Barcelona write ms 413 832 3191
Barcelona read ms 119 160 1177
Barcelona query ms 20 5169 101
Barcelona delete ms 388 319 3287
Total ms 26724 53962 87112

There are a few problems with the PolePosition test:

  • HSQLDB uses in-memory tables by default while H2 uses persistent tables. The HSQLDB version included in PolePosition does not support changing this, so you need to replace poleposition-0.20/lib/hsqldb.jar with a newer version (for example hsqldb-1.8.0.7.jar), and then use the setting hsqldb.connecturl=jdbc:hsqldb:file:data/hsqldb/dbbench2;hsqldb.default_table_type=cached;sql.enforce_size=true in Jdbc.properties.
  • HSQLDB keeps the database open between tests, while H2 closes the database (losing all the cache). To change that, use the database URL jdbc:h2:file:data/h2/dbbench;DB_CLOSE_DELAY=-1
  • The amount of cache memory is quite important, specially for the PolePosition test. Unfortunately, the PolePosition test does not take this into account.

Application Profiling

Analyze First

Before trying to optimize performance, it is important to understand where the problem is (what part of the application is slow). Blind optimization or optimization based on guesses should be avoided, because usually it is not an efficient strategy. There are various ways to analyze an application. Sometimes two implementations can be compared using System.currentTimeMillis(). But this does not work for complex applications with many modules, and for memory problems.

A good tool to measure both memory usage and performance is the YourKit Java Profiler .

A simple way to profile an application is to use the built-in profiling tool of java. Example:

java -Xrunhprof:cpu=samples,depth=16 com.acme.Test

Unfortunately, it is only possible to profile the application from start to end. Another solution is to create a number of full thread dumps. To do that, first run jps -l to get the process id, and then run jstack <pid> or kill -QUIT <pid> (Linux) or press Ctrl+C (Windows).


Database Profiling

The ConvertTraceFile tool generates SQL statement statistics at the end of the SQL script file. The format used is similar to the profiling data generated when using java -Xrunhprof. As an example, execute the the following script using the H2 Console:

SET TRACE_LEVEL_FILE 3;
DROP TABLE IF EXISTS TEST;
CREATE TABLE TEST(ID INT PRIMARY KEY, NAME VARCHAR(255));
@LOOP 1000 INSERT INTO TEST VALUES(?, ?);
SET TRACE_LEVEL_FILE 0;

Now convert the .trace.db file using the ConvertTraceFile tool:

java -cp h2*.jar org.h2.tools.ConvertTraceFile
    -traceFile "~/test.trace.db" -script "~/test.sql"

The generated file test.sql will contain the SQL statements as well as the following profiling data (results vary):

-----------------------------------------
-- SQL Statement Statistics
-- time: total time in milliseconds (accumulated)
-- count: how many times the statement ran
-- result: total update count or row count
-----------------------------------------
-- self accu    time   count  result sql
--  62%  62%     158    1000    1000 INSERT INTO TEST VALUES(?, ?);
--  37% 100%      93       1       0 CREATE TABLE TEST(ID INT PRIMARY KEY, NAME VARCHAR(255));
--   0% 100%       0       1       0 DROP TABLE IF EXISTS TEST;
--   0% 100%       0       1       0 SET TRACE_LEVEL_FILE 3;

Database Performance Tuning

Virus Scanners

Some virus scanners scan files every time they are accessed. It is very important for performance that database files are not scanned for viruses. The database engine does never interprets the data stored in the files as programs, that means even if somebody would store a virus in a database file, this would be harmless (when the virus does not run, it cannot spread). Some virus scanners allow to exclude files by suffix. Make sure files ending with .db are not scanned.

Using the Trace Options

If the main performance hot spots are in the database engine, in many cases the performance can be optimized by creating additional indexes, or changing the schema. Sometimes the application does not directly generate the SQL statements, for example if an O/R mapping tool is used. To view the SQL statements and JDBC API calls, you can use the trace options. For more information, see Using the Trace Options .

Index Usage

This database uses indexes to improve the performance of SELECT, UPDATE and DELETE statements. If a column is used in the WHERE clause of a query, and if an index exists on this column, then the index can be used. Multi-column indexes are used if all or the first columns of the index are used. Both equality lookup and range scans are supported. Indexes are used to order result sets, but only if the condition uses the same index or no index at all. The results are sorted in memory if required. Indexes are created automatically for primary key and unique constraints. Indexes are also created for foreign key constraints, if required. For other columns, indexes need to be created manually using the CREATE INDEX statement.

Optimizer

This database uses a cost based optimizer. For simple and queries and queries with medium complexity (less than 7 tables in the join), the expected cost (running time) of all possible plans is calculated, and the plan with the lowest cost is used. For more complex queries, the algorithm first tries all possible combinations for the first few tables, and the remaining tables added using a greedy algorithm (this works well for most joins). Afterwards a genetic algorithm is used to test at most 2000 distinct plans. Only left-deep plans are evaluated.

Expression Optimization

After the statement is parsed, all expressions are simplified automatically if possible. Operations are evaluated only once if all parameters are constant. Functions are also optimized, but only if the function is constant (always returns the same result for the same parameter values). If the WHERE clause is always false, then the table is not accessed at all.

COUNT(*) Optimization

If the query only counts all rows of a table, then the data is not accessed. However, this is only possible if no WHERE clause is used, that means it only works for queries of the form SELECT COUNT(*) FROM table.

Updating Optimizer Statistics / Column Selectivity

When executing a query, at most one index per joined table can be used. If the same table is joined multiple times, for each join only one index is used. Example: for the query SELECT * FROM TEST T1, TEST T2 WHERE T1.NAME='A' AND T2.ID=T1.ID, two index can be used, in this case the index on NAME for T1 and the index on ID for T2.

If a table has multiple indexes, sometimes more than one index could be used. Example: if there is a table TEST(ID, NAME, FIRSTNAME) and an index on each column, then two indexes could be used for the query SELECT * FROM TEST WHERE NAME='A' AND FIRSTNAME='B', the index on NAME or the index on FIRSTNAME. It is not possible to use both indexes at the same time. Which index is used depends on the selectivity of the column. The selectivity describes the 'uniqueness' of values in a column. A selectivity of 100 means each value appears only once, and a selectivity of 1 means the same value appears in many or most rows. For the query above, the index on NAME should be used if the table contains more distinct names than first names.

The SQL statement ANALYZE can be used to automatically estimate the selectivity of the columns in the tables. This command should be run from time to time to improve the query plans generated by the optimizer.

Optimization Examples

See src/test/org/h2/samples/optimizations.sql for a few examples of queries that benefit from special optimizations built into the database.

Cache Size and Type

By default the cache size of H2 is quite small. Consider using a larger cache size, or enable the second level soft reference cache. See also Cache Settings .