Published: May 28, 2001
By Richard G. Baldwin
Java Programming, Lecture Notes # 1356
The purpose of this miniseries is to help you learn the essential features of Object-Oriented data structures in Java using the Collections Framework.
Supplementary material
I recommend that you also study the other lessons in my extensive collection of online Java tutorials. You will find those lessons published at Gamelan.com. However, as of the date of this writing, Gamelan doesn't maintain a consolidated index of my Java tutorial lessons, and sometimes they are difficult to locate there. You will find a consolidated index at Baldwin's Java Programming Tutorials.
The index on my site provides links to the lessons at Gamelan.com.
We also learned that the framework provides three incomplete implementations. These classes are available for you to use as a starting point in defining your own implementations. Default implementations of many of the interface methods are provided in the incomplete implementations.
The implementations in the Java Collections Framework are the concrete definitions of the classes that implement the core collection interfaces. For example, as of JDK 1.3, the concrete implementations in the Java Collections Framework are provided by the following nine classes.
These classes are available for immediate use to instantiate collection objects.
As you can see, there are two classes that obviously fall into the set category, two that obviously fall into the list category, and three that obviously fall into the map category. I will have more to say about the details of these classes in subsequent lessons.
This leaves two additional classes whose names don't readily divulge the category in which they belong.
Vector and Hashtable classes
The classes Vector and Hashtable were part of Java even before the Java Collections Framework became available. The Vector class can be used to instantiate objects that fall in the general list category.
The Hashtable class can be used to instantiate objects that fall in the map category.
These two classes have been upgraded to make them compatible with the Collections Framework.
Abstract implementations
In addition to the concrete implementations listed above, the following three classes partially implement the interfaces, but are not intended for instantiation. Rather, they are intended to be extended into new concrete classes that you define.
Purpose of algorithms
Algorithms are methods (not necessarily exposed) that provide useful capabilities, such as searching and sorting. For example, the Collection interface declares an exposed method named contains.
The contains method
You can safely invoke the contains method on any object instantiated from a class that properly implements the Collection interface, even if you don't know the actual type of the collection object.
The manner in which the search will be performed will probably differ from one concrete implementation of the interface to the next. For example, a TreeSet object will perform the search very rapidly with a time cost of only log(n) where n is the number of elements. On the other hand, for the same number of elements, because of a different underlying data structure, a search on an ArrayList object will probably require more time than a search on a TreeSet object. As the number of elements increases, the difference in time cost between the two will also increase.
Consider the sample program shown in Listing 1. This program compares
the search speed of the ArrayList class and the TreeSet class.
A detailed discussion of the program follows Listing 1.
/*File SpeedTest01 Copyright 2001 R.G.Baldwin **************************************/ import java.util.*; public class SpeedTest01{ public static void main( String args[]){ new Worker().doIt(); }//end main() }//end class SpeedTest01 class Worker{ public void doIt(){ int size = 500000; //Create a TreeSet object Collection aTree = new TreeSet(); //Populate the TreeSet object with // random values. The add() method // for a set rejects duplicates. Random rnGen = new Random(); for(int ct = 0; ct < size; ct++){ aTree.add(new Double( rnGen.nextDouble())); }//end for loop //Create and populate an ArrayList // object with the same random // values Collection aList = new ArrayList(aTree); //Extract a value near the center // of the ArrayList object to use // as a test case. Object testVal = ((List)aList).get(size/2); //Search for the test value in each // of the collection objects. // Measure and display the time // required to perform the search // in each case. long start = new Date().getTime(); boolean found = aList.contains(testVal); long stop = new Date().getTime(); System.out.println( found + " " + (stop - start)); start = new Date().getTime(); for(int x = 0; x < 10000; x++){ found = aTree.contains(testVal); }//end for loop stop = new Date().getTime(); System.out.println( found + " " + (stop - start)); }//end doIt() }// end class Worker Listing 1 |
The program begins by instantiating a TreeSet object and populating it with approximately 500,000 elements as shown in Listing 2 below. The values encapsulated in the objects referred to by the elements in the collection are produced by a random number generator.
Recall that the add method of a Set object rejects duplicate
elements, so there may be fewer than 500,000 elements in the object after
it is populated.
public void doIt(){ int size = 500000; Collection aTree = new TreeSet(); Random rnGen = new Random(); for(int ct = 0; ct < size; ct++){ aTree.add(new Double( rnGen.nextDouble())); }//end for loop Listing 2 |
One of the capabilities of the Collection Framework is to create a new Collection object and populate it with the contents of an existing Collection object of a different (or the same) actual type.
The code in Listing 3 instantiates an ArrayList object and populates
it with the contents of the existing TreeSet object. As a
result, we then have two different Collection objects of different
actual types containing the same elements.
Collection aList = new ArrayList(aTree); Listing 3 |
The objective of this program is to compare the times required to search for and find an element in each of the collections. Thus, we need a target element to search for.
The code in Listing 4 below extracts a value near the center of the ArrayList object using an index to find and extract the value. This is a very fast operation on a List object. This value is saved in testVal to be used later for test purposes.
Note that the reference to the ArrayList object was saved as type Collection in Listing 3 above.
Note also that it was necessary to cast that reference to type List
in Listing 4 below in order to invoke the get method on the reference.
This is because the Collection interface does not declare a method
named get. Rather, the get method is added to the List
interface to define a more specialized form of collection.
Object testVal = ((List)aList).get(size/2); Listing 4 |
The code in Listing 5 below invokes the contains method to search
for the test value in each of the collections. It uses the system
clock to measure the time required to find the element in each case.
I will assume that you understand how to use the Date class for
this purpose, and won't provide a detailed explanation.
long start = new Date().getTime(); boolean found = aList.contains(testVal); long stop = new Date().getTime(); System.out.println( found + " " + (stop - start)); start = new Date().getTime(); for(int x = 0; x < 10000; x++){ found = aTree.contains(testVal); }//end for loop stop = new Date().getTime(); System.out.println( found + " " + (stop - start)); }//end doIt() Listing 5 |
The output from the program was:
true 171
true 30
The first line of output applies to the ArrayList object, and the second line applies to the TreeSet object.
As we would expect, the test value was successfully found in both cases; hence the display of true in both cases.
Time required to search ArrayList
The output indicates that approximately 171 milliseconds were required to find the test value in the ArrayList object.
Time required to search TreeSet object
On the other hand, the time required to find the test value in the TreeSet object was so small that it wasn't even measurable within the granularity of the system clock (other experiments have caused me to believe that the granularity of the system clock on this machine is at least ten milliseconds). Hence, the original reported time required to find the test value in the TreeSet object was zero.
In order to get a measurable time value to search the TreeSet object, I had to wrap the invocation of the contains method in a for-loop and search for the same value 10,000 times in succession. Thus, the time required to find the test value in the TreeSet object was approximately 0.0030 milliseconds as compared to 171 milliseconds for the ArrayList object.
(I'll let you do the arithmetic to see if this makes sense in terms of the expected time cost to search the two different types of objects. Don't forget the extra overhead of the for-loop.)
Different implementations
This is a graphic demonstration that even though both objects can be treated as type Collection, and the contains method can be invoked on either object in a polymorphic manner, the actual implementations of the two objects and the implementations of the contains methods in those two objects are different.
Each type of collection has advantages and disadvantages, depending on your needs.
Polymorphic behavior applies
The important point is that if you receive a reference to the collection object as type Collection, you can invoke the contains method on that reference without regard to the underlying structure of the collection object. This is because polymorphic behavior applies.
Very briefly, polymorphic behavior means that the actual method that is executed is the appropriate method for that type of object regardless of the type of the reference to the object. This is one of the great advantages of using the Java Collections Framework and passing collection objects among methods as interface types.
Sorting algorithms
Some of the implementations of the Java Collection Framework maintain their elements in a random order, and other implementations maintain their elements in a sorted order. Thus, the framework also provides sorting algorithms. However, the sorting algorithms used to maintain the order of the collections are not exposed in the way that the search algorithm is exposed (via the contains() method). Rather, the sorting algorithms are implicit in those implementations that need them, and are absent from those implementations that don't need them.
Now for a little quiz
Let's see if you are still awake. Select the words in one pair of parentheses in the following statement that causes the statement to be true.
The interfaces in the Collections Framework make it possible to manipulate the contents of collections in a manner that is (dependent on) (independent of) the underlying implementation of each collection.
And the answer is ...
The interfaces in the Collections Framework make it possible to manipulate the contents of collections in a manner that is independent of the underlying implementation of each collection. That is the beauty of basing the framework on interfaces that declare polymorphic methods.
I placed this question here to drive home the point that the methods declared in the Collection interface can be invoked on collection objects in a polymorphic manner.
That is to say, as a user of an object instantiated from a class that properly implements the Collection interface, you can invoke the methods declared in that interface on a reference to the object and be confident that the actual method that is invoked will be the version that is appropriate for the class from which the object was instantiated. This is polymorphic behavior.
In the event that you need to invoke a method that is not declared in the Collection interface (such as the get() method in Listing 4 above), you can pass the reference as one of the more specialized sub-interfaces of Collection, such as Set.
Benefits of using the Collections Framework
The Java Tutorial from Sun lists and explains the following benefits of using the Java Collections Framework.
The core collection interfaces in the Collections Framework are shown
in Listing 6.
|
The basic purpose of the core collection interfaces in the Java Collections Framework is to allow collections to be manipulated without regard for how the collections are implemented, provided of course that the implementations comply with the contracts.
The framework provides the following nine concrete implementations (classes) of the interfaces shown in Listing 1:
A collection object instantiated from the class TreeSet and a collection object instantiated from the class ArrayList can each be viewed as being of the interface type Collection.
Methods having the same signatures can be used to manipulate either collection with confidence that the behavior of the method will be appropriate for the actual type of collection involved.
The framework also provides the following incomplete implementations of the core interfaces:
Copyright 2000, Richard G. Baldwin. Reproduction in whole or in part in any form or medium without express written permission from Richard Baldwin is prohibited.
Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two. He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas. He is the author of Baldwin's Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.
Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.
-end-