Database Writing
Whereas the performance of reading from a database is largely dependent on the database setup itself, when writing to a database there are very many FME parameters that can help to fine tune the workspace performance.
Remember that writing to a database incurs network overheads. There has to be a balance between the amount of data being sent (the network traffic), database performance, and the risk of losing uncommitted data.
Each database Writer has a set of parameters for handling the number of features to write for any particular transaction. Not every format supports these, but the two most common parameters are Features per Transaction and Features per Bulk Write.
Features per Bulk Write
The Features per Bulk Write parameter tells FME how many features to send at a time to the database. Features sent to a database Writer will get cached in memory by FME until this number of features is reached, and only then will they be sent to the database. This is also known as chunk size.
This parameter is a way to balance server performance and network traffic with FME performance. The higher the number the more features FME caches, but the fewer requests it needs to make to the database (and therefore less network traffic).
A lower number means FME caches less data, but there are more requests made to the database.
Features per Bulk Write also needs to be considered against the value of Features per Transaction.
Features Per Transaction
Features per Transaction (also sometimes called Transaction Interval) controls how many features are entered into a database before a commit command is issued.
Each commit adds delay to the writing process, so setting this parameter must balance the speed of the translation (a higher number) against the risk that a translation may fail and features need to be rolled-back (a lower number).
For example, if this parameter is set to a value of 1, then each and every feature is committed individually. If the process fails then only the last feature will be lost from the database, but the costs is much reduced performance.
Alternatively, if the parameter is set to a very high value (more than the number of features being written) then only one commit takes place and performance improves. However, if the translation fails, then all features previously written will be rolled-back and lost to the database.
If Features per Transaction is less or equal to Features per Bulk Write, then FME basically caches a number of features and sends them to the database where they are immediately committed.
If Features per Transaction is greater than Features per Bulk Write, then FME is sending features to the database where they will be cached until the transaction commit total is reached.
Jake Speedie says… |
The Transaction and Chunk parameters can differ from format to format. For example, SQL Server has a single Bulk Option flag rather than a numeric setting. Therefore it’s very important that you check out the FME Readers and Writers Manual to confirm what parameters are available for your database, and how exactly they operate. |
Writing and Indexing
Whereas indexes can improve performance for reading data, for writing they can cause a great reduction in the speed of translation.
That’s because the index will often get re-built with every feature (or every transaction) that is committed to the database.
To remedy this it’s suggested that indexes are dropped (deleted) before carrying out bulk inserts of data into a database table.
A Writer feature type also has options to truncate or drop tables when writing to them:
For the above reason, dropping a table is more efficient than truncating it because the drop action also removes the indexes.
For similar reasons, you may want to turn off networking connectivity when writing data to a Geodatabase geometric network.
Jake Speedie says… |
If you think these sort of things don't make much of a difference, consider these two (real-life) log files:
2016-04-05 06:19:04|1069.3| 0.2|INFORM|Transaction #1488 was successfully committed 2016-04-05 06:19:04|1069.7| 0.5|INFORM|Transaction #1489 was successfully committed 2016-04-07 15:19:01|1536.1| 0.5|INFORM|Transaction #1488 was successfully committed 2016-04-07 15:19:09|1536.7| 0.5|INFORM|Transaction #1489 was successfully committedA user had a workspace to write to a database. He ran it on two different machines and complained that, whereas Machine A was fast, Machine B was way slower. As you can see from the first log, Machine A committed transaction 1489 in 0.5 seconds. Machine B also committed transaction 1489 in 0.5 seconds. However, the elapsed time (15:19:01 to 15:19:09) was 8 seconds. In short, a network or database issue with Machine B was causing a non-FME slowdown in each database commit. 8 seconds might not sound a lot, but with 1,500 transactions that's an accumulated delay of over 3 hours! |