Idea
Think about an application that offers to change password and upon checking the new password for strength we consult a dictionary in memory to provide quick response or to read it from disk every time a user changes a password.
At first glance, I don't know if it's worth to preload a dictionary in memory. Changing passwords happens seldom and a user might feel ok if he has to wait a few seconds for this to complete if he get's a nice message saying that his password is checked for strength. We can release this 80 megs of memory in the JVM :-)
Test Setup
An
eclipse project is attached to this page if you want to try it by yourself (see "lasttest.zip" link in the grey box).
- Source: text-file with alphabetically ordered dictionary words (15MB, ca. 1.5 Mio entries)
- Fill lists sequentially from file, therefore ArrayList and LinkedList will have alphabetical order as well
- Search for a string 'zzzz' as this is assumed to be near the end of the dictionarys entries
- Search time measured in JUnit test-setup with System.currentTimeMillis before and after call to Collection.contains("zzzz")
- Measured on: workstation (3GHz/1GB) under WinNT, JDK 1.4.2_04
Results
This is the result of a 2nd run of the test on my machine. The test runs only once (hence there's no warm up phase).
Init with 1450252 entries using a java.util.Vector
Time: 3125 ms
Used Heap Start: 300424 bytes
Used Heap End: 86314992 bytes
Consumed Heap: 86014568 bytes
Time: 0 ms
Time: 62 ms
Init with 1450125 entries using a java.util.HashSet
Time: 7892 ms
Used Heap Start: 442920 bytes
Used Heap End: 115511600 bytes
Consumed Heap: 115068680 bytes
Time: 0 ms
Time: 0 ms
Init with 1450252 entries using a java.util.ArrayList
Time: 1735 ms
Used Heap Start: 4875040 bytes
Used Heap End: 88911368 bytes
Consumed Heap: 84036328 bytes
Time: 0 ms
Time: 62 ms
Init with 1450252 entries using a java.util.LinkedList
Time: 1906 ms
Used Heap Start: 4875568 bytes
Used Heap End: 107856872 bytes
Consumed Heap: 102981304 bytes
Time: 0 ms
Time: 204 ms
Comments
- HashSet takes longest to fill. That's about clear as we have to deal with many hash calculations and - I guess - some collisions because of the nature of the dictionary words.
- Finding the entry "zzzz" in all types of lists is really fast. Consider LinkedList where we have to traverse the whole list up to the end (at least 1.4 Mio entries) before we have a match. This results in 0.15 microseconds per "next-compare" step (204ms / 1'400'000 step). This in turn means that my PC uses roughly 450 CPU cycles for one next-compare step (3*10^9 cycles/sec * 0.15*10^-6 sec/step).
- Is this nice work of the JIT engineers? I say yes!