Feb. 2006 The chief software engineer of Parity Software, Schwieberdingen near Stuttgart,
named Qt, Perl and db++ in an interview as his preferred
development environment, one with which application software can be developed with the highest
level of flexibility and efficiency. He is still thankful today to Concept asa, who
suggested Qt eight years ago when he was facing a decision between Java
2006 Exact Software in Munich distributed,
as usual at the beginning of the year, the update for their Windows payroll
system LohnXL/XXL, which
is based on the database kernel of db++, to more than 10,000
extensive tests with their application simulation programme dbstresser, Parity
Software, Schwieberdingen near Stuttgart, delivered Version 8.11.29 to
their customer base.
Sept. 2005 Distribution of an update tailored to the specific
requirements of dsoftware, Neuss
maintaining compatibility with the ASS development
environment and the db++ version which they deploy. The most important change
was support for accessing tables larger than 2 GB on Linux/Windows platforms with a 32bit
2004/2005 David (Concept asa) brings Goliath (DB AG, German Railways) to
its knees in a long-drawn-out trademark
dispute. Since 1997, Concept asa had sought to protect the use of db++
not only as an image-mark
— this has been the case in both Germany and abroad since 1990 — but also as a
After initial resistance, this status was achieved in February 2001. But
Concept asa's enjoyment of this success did
not last very long. In October 2002, DB AG succeeded in having the
grant of the word-mark
to Concept asa revoked. But in a final hearing before the 27th Senate of the
Federal Patent Court in August 2004, Concept asa won hands down, and the word-mark db++ is now
the exclusive property of Concept asa. We are not vindictive, however, and continue
to travel by train when we are not using our modest business car, a
3-litre Lupo, in keeping with our motto “db++ is the 3-litre car among database systems: slim, fast and efficient”.
Development of db++ versions 8.11.19 to 8.11.31. Below we detail some
of the extensions and optimisations:
The following reports on performance improvements
should not raise doubts that db++ was previously slow: on the contrary, db++
has always enjoyed a reputation for being extremely fast.
Our ambition is to maintain our reputation as one of the fastest database
systems available in the future as well, and the
responsible programmer has a
tireless passion for squeezing out
the last drop of speed.
- Start of the full integration
of the Makefile systems for both Windows and Linux/UNIX, with the goal
that a simple „make“ is all that is required on all platforms.
- Porting and first
implementation on Linux and Windows of the large-file code developed for EADS Airbus on Sun Sparc, to permit tables
- Replacement of the last non-POSIX calls with standard POSIX calls, insofar as efficiency is not compromised. For
instance, for the >2GB version of db++ finalised in 2005, a set of
portable I/O and locking routines has been written for Windows. Externally,
these look very much like the standard POSIX calls. Internally, native
Windows calls are used on that platform. This has been done because the
standard POSIX calls provided under Windows are either unreliable or not
- The mapping mechanism from virtual
relation table names to real directories on different host machines and
db-servers specified in the RELFTAB-file has been partially redesigned to
increase transparency between Linux/UNIX and Windows.
- Extension of the cdbFind() and
cdbCount() functions, in particular:
(a) Both cdbFind() and cdbCount() can be made to return a “buffer” of
tuples. A maximum of 8 Kilobytes of tuples can be buffered in one call but
the maximum number of tuples to be placed in the buffer can be specified
in the arguments to the extended functions cdbFind() and cdbCount().
Subsequent calls to cdbCount() can be made to return tuple buffers until
end-of-scan is reached.
(b) By setting a corresponding flag, a new search operation can be made
which retains the status of the last search (scan). Thus a buffer of the
first 10 tuples in a new scan can be returned, but a subsequent
cdb_next(), cdb_previous(), etc. applies to the old scan.
- On the first handshake between
a client program and a db-server, it is determined whether the client and
server share the same endianness (byte order). If so, data is transferred
as an opaque data block rather than as a series of individually encoded data types. The number of XDR calls is thereby significantly reduced.
- The old ODBC-driver „dbq“ has
been replaced by „dbquery“. The new driver supports not only ODBC but also
native SQL and AQL queries. This removes the need for the undocumented
but often used function aqlexec().
- One optimisation of the
dbserver „browse“ routines (cdb_next() et al.), made in an earlier customer-specific version of db++, had been
inadvertently lost in the latest version. The optimisation was as follows: a search with an
incompletely defined constraint may have to search through the entire
relation. It was undesirable, however, that an additional operation (say
cdb_last()) also searched to the physical end of the table. Now the key
positions (first, last) are marked by the first search/next sequence. These routines
have been restored.
- A simulation model of a typical
commercial application – called dbstresser – has been developed in
cooperation with Parity Software GmbH. This has been the basis for
profiling with the wonderful „kcachegrind“ tool. Many insights into
improving performance have been gained, in particular the avoidance of expensive
system and library calls (avoiding sprintf() alone brought a 10%
improvement in dbserver).
- Additional small optimisations
(a) Use of assembly for string comparison.
(b) Previously, on inserting a tuple, a string length recalculation of
each char field was performed to calculate the maximum print width. Now a
modified Fibonacci function is used so
that initially the maximum print widths are calculated for most insertions,
but as the relation grows in size successively
fewer calculations are made. This can occasionally
lead to inaccuracy in the formatted output
of records, but the performance gain is certainly preferable.
(c) A quick page cache (simple code) for very frequently accessed pages
has been inserted in front of the actual page cache (quite complex code).
(d) Previously, the search for a free page on which
to place new or modified data began at the physical start of
the relation file. The new algorithm tries hard to find a free page close
to the logically adjacent data. The algorithm also tries to keep the
data in primary and secondary indices as physically
separate as possible.
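
The thinning schedule in (b) is not specified in detail in these notes. The following Python sketch only illustrates the general idea, under the assumption that the width is recalculated solely when the insert count reaches the next Fibonacci number. The class and attribute names are hypothetical and are not part of db++.

```python
class PrintWidthTracker:
    """Tracks the maximum print width of one char field (hypothetical helper)."""

    def __init__(self):
        self.inserts = 0          # number of tuples inserted so far
        self.max_width = 0        # widest value seen at a checkpoint
        self.recalculations = 0   # how often the length was actually measured
        # Assumed schedule: checkpoints at insert counts 1, 2, 3, 5, 8, 13, ...
        self._prev, self._next = 1, 1

    def insert(self, value):
        """Register one inserted value; measure it only at a checkpoint."""
        self.inserts += 1
        if self.inserts == self._next:
            self.recalculations += 1
            self.max_width = max(self.max_width, len(value))
            # Advance to the next Fibonacci number.
            self._prev, self._next = self._next, self._prev + self._next


tracker = PrintWidthTracker()
for i in range(1000):
    tracker.insert("x" * (i % 7))
print(tracker.recalculations, "length calculations for", tracker.inserts, "inserts")
```

With such a schedule the gaps between recalculations grow with the relation, which is also why a wide value inserted between two checkpoints can be missed and the stored print width can be too small.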
- Optimisation of the routines
which match a constraint, cdb_next() et al., so that unnecessary
calls to obtain the primary tuple are avoided where possible when
scanning a secondary index. In certain situations, this can improve
performance by an order of magnitude (10 times) or more.
- When flushing modified pages from
the cache to disc, the pages are now presorted into their physical order
before writing, rather than being written in the logical order of the cache. For large files, this reduces the disc head movement
- The db-server attempts to use idle process time
(milliseconds) to perform file flushing operations. Flushing is also triggered
if a weighted count of operations per
table exceeds a tunable value. Two
parameters are available for fine tuning.
- Each of optimisations 1
to 4 was tested with a new module, „dbstresser“, developed by the firm Parity
Software, which simulates a wholesale warehouse application.
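
The presorted flush described above can be illustrated with a minimal Python sketch. It assumes that the cache is a mapping from page number to page contents and that the physical position of a page is simply its page number multiplied by the page size. The function name, the page size and the cache layout are assumptions for illustration, not the db++ implementation.

```python
import io

PAGE_SIZE = 4096  # assumed page size, purely illustrative


def flush_dirty_pages(dirty_pages, fileobj, page_size=PAGE_SIZE):
    """Write modified pages in ascending file-offset order.

    dirty_pages maps page number -> bytes of length page_size, in whatever
    (logical) order the cache happens to hold them. Returns the page numbers
    in the order in which they were written.
    """
    order = sorted(dirty_pages)  # physical order = ascending page number
    for page_no in order:
        fileobj.seek(page_no * page_size)
        fileobj.write(dirty_pages[page_no])
    fileobj.flush()
    return order


# Usage with an in-memory file standing in for the relation file.
cache = {7: b"\x07" * 16, 2: b"\x02" * 16, 5: b"\x05" * 16}
relation = io.BytesIO()
written = flush_dirty_pages(cache, relation, page_size=16)
```

Writing in ascending offset order means the file position only ever moves forward during one flush, which is the property the notes credit for the reduced head movement.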
- dbodbc has been
almost completely rewritten. Occasional
deficiencies in query operations and clumsy interface problems with
MS tools such as MSAccess have been resolved, as have problems with
various versions of Microsoft operating systems, in particular XP.
- Final work on the makefile
system for both Windows and Linux/UNIX. There is now only one Makefile in
each directory so that a simple „make“ is sufficient to build on all
platforms. In addition, revision control/history has been implemented in the
makefile system so that programmes and
libraries carry embedded information on the build date and the major, minor,
mini and build numbers. These can be displayed later so that, for
instance, a program can output this information for itself and for any
shared libraries with which it is linked.
- Large file support for all
platforms is now standard. In doing this, a set of portable file IO
routines was written which simulate POSIX calls on Windows using native
Windows IO operations. This also made it possible to clean up the
locking incompatibilities across heterogeneous
platforms with native, NFS and Samba
extensions of cdb_find()
(a) A new function “cdbFindInScan()” searches for a specific tuple within
an existing scan (potentially thousands of tuples). This function can be
usefully combined with cdbSavePos() and cdbRestorePos() to switch
between various positions within a scan.
(b) cdbFind() was sometimes too clever by half.
The function always tried to select the optimal index (primary or
secondary) to make the search. This has the
consequence that the resulting scan is sorted according to the
index chosen by the program. A new argument permits the user to specify the
index on which the search will be made, so that tuples are retrieved in the
desired sorting sequence and not the sequence chosen by the optimiser.
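
Returning to the client/server handshake described earlier in these notes, the byte-order shortcut can be sketched in Python as follows. The record layout, the function names and the use of big-endian as the canonical wire format (the byte order XDR uses) are assumptions made for illustration. The actual db++ protocol is not documented here.

```python
import struct

PREFIX = {"little": "<", "big": ">"}
FIELDS = "ihd"  # hypothetical record layout: int32, int16, float64


def pack_stored(values, order):
    """Build the bytes of a record as stored on a host with the given byte order."""
    return struct.pack(PREFIX[order] + FIELDS, *values)


def to_wire(stored, server_order, same_order):
    """Server side: prepare a stored record for sending."""
    if same_order:
        return stored  # opaque block, no per-field conversion
    values = struct.unpack(PREFIX[server_order] + FIELDS, stored)
    return struct.pack(">" + FIELDS, *values)  # canonical big-endian encoding


def from_wire(wire, client_order, same_order):
    """Client side: decode a received record."""
    order = client_order if same_order else "big"
    return struct.unpack(PREFIX[order] + FIELDS, wire)


# Handshake result for a little-endian server and a little-endian client.
server_order, client_order = "little", "little"
same_order = server_order == client_order
stored = pack_stored((1234, -5, 2.5), server_order)
record = from_wire(to_wire(stored, server_order, same_order), client_order, same_order)
```

When the byte orders match, the server hands over the stored bytes unchanged. The per-field conversion only runs for mixed-endian pairs.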
Current development work
- Completion of earlier, initial
work on blob and memo fields, so that these are not just available through
the C-programming interface but fully integrated in all of the db++ tools
such as „db“, „dbedit“, „dbfsck“ and „dbodbc“.
- Following considerable rewriting of the dbodbc client-side driver and many changes
to the dbodbc server „dbquery“, these are now in extensive beta testing. „dbodbc“ now
contains a new console utility, „drvsetup“, to install and manage dbodbc
DLLs and Data Sources (DSNs). This utility can do almost everything which
the Microsoft ODBC-Manager can do, but from a console. This makes it the preferred way to write
installation scripts for both new and existing customers of our partners. On conclusion of beta testing,
the new “dbodbc” will be distributed to a select group of our customers.
- A completely new
algorithm (not found in the literature) for sorting the entire B-tree has
been implemented which is almost linear, even for tables > 2G, while
requiring very little main memory (< 64K). But if you are lucky
enough to have several gigabytes of
main memory, then setting DB_MAXSORTBUF to a very high value is still
faster. This experimental code will be integrated in the release version