Java Anti-Patterns

This page collects some bad code that may not look so obviously bad to beginners. Beginners often struggle with the language syntax. They also have little knowledge about the standard JDK class library and how to make the best use of it. In fact I have collected all examples from everyday junior code. I have modified the original code to turn it into examples that highlight the problems. Many of these problems can easily be detected by FindBugs, which is available as a simple Eclipse plug-in. I strongly recommend this tool to any beginner programmer. Pros should also run it from time to time on their codebase and review its output carefully. It is an easy-to-use tool and I always find some bugs when I use it.

Some of these may seem like micro-optimization, premature optimization without profiling or constant factor optimizations. But performance and memory wasted in thousands of these small places adds up quickly and will grind an application to a crawl. And when I say application, I mean a server-side application running on an application server. That's what I do for a living. On desktop GUI applications the situation may not be as bad. But then, what's the only relevant platform that runs client-side Java applications? Android. An embedded platform with very limited resources (memory!). Here even constant factor optimizations pay off quickly. Like iterating over arrays instead of lists.

In the end a lot of your application's performance depends on the overall quality of your code. By the way, you should never underestimate the importance of memory footprint. I can't stress that enough. I have seen too many applications with crazy garbage collection overhead and out of memory errors. Even though garbage collection is quite fast, most server-side code's scalability is dominated and limited primarily by its memory use per request/transaction and the request/transaction duration. Improving either of these by a constant factor will directly give you a higher throughput by that factor. If the factor is 10, it can mean the difference between supporting 100 and 1000 users, which can make all the difference to your customer.

Compare these scenarios (assume 100MB young generation):

Scenario     thread pool   tx duration   => max. tx / s   mem / tx   => garbage / min   GC / min
base         30            100 ms        300              50 KB      900 MB             9
slower       30            1000 ms       30               50 KB      90 MB              0.9
more mem     30            100 ms        300              500 KB     9 GB               90
excess mem   30            100 ms        300              5 MB       90 GB              900

In the slower scenario the transaction duration is 10 times longer. This immediately cuts the maximum number of transactions per second by a factor of 10 as well (limited thread pool, limited CPU resources). In the more mem scenario each transaction uses 10 times as much memory. This directly bumps up the number of garbage collections to over one per second, which causes non-negligible overhead. Using even more memory, as in the excess mem scenario, leads to 15 collections per second, leaving 66ms per collection, which is clearly not enough. The system will thrash. Also 66ms is below the transaction duration of 100ms, so many running transactions will still hold onto memory, preventing it from being collected and causing that memory to be promoted to older generations. This means the older generations will start growing and will need a large (slow) collection sooner. The application in that scenario no longer performs. I think this clearly shows how bad excess memory consumption is, compared to just slow code. All your superfast code can't help you when you allocate too much memory.
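The arithmetic behind the table can be reproduced with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it assumes (as the table does) that every thread is always busy, that 1 MB is 1000 KB, and that each young generation fill triggers exactly one collection.

```java
class GcLoadEstimate {
    // Maximum transactions per second with a fully busy thread pool.
    static double maxTxPerSecond(int threads, double txDurationMs) {
        return threads * 1000.0 / txDurationMs;
    }

    // Garbage produced per minute in MB (1 MB = 1000 KB, as in the table).
    static double garbageMbPerMinute(double txPerSecond, double kbPerTx) {
        return txPerSecond * kbPerTx * 60.0 / 1000.0;
    }

    // Young generation collections per minute, assuming one collection per fill.
    static double collectionsPerMinute(double garbageMbPerMinute, double youngGenMb) {
        return garbageMbPerMinute / youngGenMb;
    }

    public static void main(String[] args) {
        // The "base" scenario: 30 threads, 100 ms per tx, 50 KB per tx, 100 MB young generation.
        double tx = maxTxPerSecond(30, 100);
        double garbage = garbageMbPerMinute(tx, 50);
        System.out.println(tx + " tx/s, " + garbage + " MB/min, "
                + collectionsPerMinute(garbage, 100) + " GC/min");
    }
}
```

Plugging in the other rows only changes the transaction duration or the memory per transaction; the collection rate scales linearly with both.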

String concatenation

String s = "";
for (Person p : persons) {
    s += ", " + p.getName();
}
s = s.substring(2); //remove first comma
This is a real performance killer: O(persons.length²). The repeated concatenation of strings in a loop causes excess garbage and array copying. Moreover it is ugly that the resulting string has to be fixed for an extra comma.
StringBuilder sb = new StringBuilder(persons.size() * 16); // well estimated buffer
for (Person p : persons) {
    if (sb.length() > 0) sb.append(", ");
    sb.append(p.getName());
}
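On Java 8 or later the same result can probably be had without a hand-written loop. The following is a minimal sketch under the assumption that the names are already available as a list of strings; the class name JoinNames is hypothetical and the Person class from the example is left out.

```java
import java.util.Arrays;
import java.util.List;

class JoinNames {
    // String.join uses a single builder internally and inserts the
    // separator only between elements, so no leading comma has to be removed.
    static String join(List<String> names) {
        return String.join(", ", names);
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("Ann", "Bob", "Carl")));
    }
}
```

The explicit StringBuilder version remains useful when the buffer size can be estimated well or when more than a plain separator is needed.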

Lost StringBuffer performance

StringBuffer sb = new StringBuffer();
sb.append("Name: ");
sb.append(name + '\n');
sb.append("!");
...
String s = sb.toString();
Despite good intentions the above code is not perfect. The most obvious mistake is the string concatenation in line 3. In line 4 appending a char would be faster than appending a String. Another major omission is the missing length initialization of the buffer, which may incur unnecessary resizing (array copying). In JDK 1.5 and above a StringBuilder should have been used instead of a StringBuffer: because it is only a local variable the implicit synchronization is overkill. Actually, using simple String concatenation compiles to almost perfect byte code: it's only missing the length initialization.
StringBuilder sb = new StringBuilder(100);
sb.append("Name: ");
sb.append(name);
sb.append("\n!");
String s = sb.toString();
String s = "Name: " + name + "\n!";

Testing for string equality

if (name.compareTo("John") == 0) ...
if (name == "John") ...
if (name.equals("John")) ...
if ("".equals(name)) ...
None of the above comparisons is wrong - but neither are they really good. The compareTo method is overkill and too verbose. The == operator tests for object identity which is probably not what you want. The equals method is the way to go, but reversing the constant and variable would give you extra safety if name is null.
if ("John".equals(name)) ...
if (name.length() == 0) ...
if (name.isEmpty()) ...

Converting numbers to Strings

"" + set.size()
new Integer(set.size()).toString() 
The return type of the Set.size() method is int. A conversion to String is wanted. These two examples in fact do the conversion. But the first incurs the penalty of a concatenation operation (it translates to (new StringBuilder()).append(i).toString()). And the second creates an intermediate Integer wrapper. The correct way of doing it is:
Integer.toString(set.size())

Not taking advantage of immutable objects

zero = new Integer(0);
return Boolean.valueOf("true");
Integer as well as Boolean are immutable. Thus it doesn't make sense to create several objects that represent the same value. Those classes have built-in caches for frequently used instances. In the case of Boolean there are even only two possible instances. The programmer can take advantage of this:
zero = Integer.valueOf(0);
return Boolean.TRUE;
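A small sketch makes the shared instances visible. The class name CachedWrappers is made up for illustration; the identity comparisons are used here only to demonstrate the cache and are not a recommended way to compare values.

```java
class CachedWrappers {
    // Integer.valueOf is documented to cache at least the values -128..127,
    // so both calls return the very same object.
    static boolean sameZero() {
        return Integer.valueOf(0) == Integer.valueOf(0);
    }

    // Boolean.valueOf never creates a new object; it returns one of the
    // two constants Boolean.TRUE and Boolean.FALSE.
    static boolean sameTrue() {
        return Boolean.valueOf("true") == Boolean.TRUE;
    }

    public static void main(String[] args) {
        System.out.println(sameZero() + " " + sameTrue());
    }
}
```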

XML parsers are for sissies

int start = xml.indexOf("<name>") + "<name>".length();
int end = xml.indexOf("</name>");
String name = xml.substring(start, end);
This naive XML parsing only works with the most simple XML documents. It will however fail if a) the name element is not unique in the document, b) the content of name is not only character data, c) the text data of name contains escaped characters, d) the text data is specified as a CDATA section, or e) the document uses XML namespaces. XML is way too complex for string operations. There is a reason why XML parsers like Xerces are jar files of over one megabyte! The equivalent with JDOM is:
SAXBuilder builder = new SAXBuilder(false);
Document doc = builder.build(new StringReader(xml));
String name = doc.getRootElement().getChild("name").getText();

Assembling XML with String operations

String name = ...
String attribute = ...
String xml = "<root>"
            +"<name att=\""+ attribute +"\">"+ name +"</name>"
            +"</root>";
Many beginners are tempted to produce XML output as shown above, by using String operations (which they know so well and which are so easy). Indeed it is very simple and almost beautiful code. However it has one severe shortcoming: it fails to escape reserved characters. So if the variables name or attribute contain any of the reserved characters <, >, &, " or ' this code would produce invalid XML. Also as soon as the XML uses namespaces, String operations may quickly become nasty and hard to maintain. XML should be assembled in a DOM instead. The JDOM library is quite nice for that.
Element root = new Element("root");
root.setAttribute("att", attribute);
root.setText(name);
Document doc = new Document();
doc.setRootElement(root);
XMLOutputter out = new XMLOutputter(Format.getPrettyFormat());
String xml = out.outputString(root);

The XML encoding trap

String xml = FileUtils.readTextFile("my.xml");
It is a very bad idea to read an XML file and store it in a String. An XML document specifies its encoding in the XML header. But when reading a file into a String you have to know the encoding beforehand! Also storing an XML file in a String wastes memory. All XML parsers accept an InputStream as a parsing source and they figure out the encoding themselves correctly. So you can feed them an InputStream instead of storing the whole file in memory temporarily. The byte order (big-endian, little-endian) is another trap when a multi-byte encoding (such as UTF-16) is used. XML files may carry a byte order mark at the beginning that specifies the byte order. XML parsers handle them correctly.
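As a rough illustration, here is a sketch that hands raw bytes to a parser and lets it pick the encoding from the XML header. It uses the DOM parser bundled with the JDK (javax.xml.parsers) rather than JDOM so that it runs without extra libraries; the class name XmlFromStream and the sample document are invented for this example, and a ByteArrayInputStream stands in for a FileInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlFromStream {
    // The parser reads the encoding declaration from the bytes itself;
    // the caller never converts the bytes to a String.
    static String rootText(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample document stored as ISO-8859-1 bytes, as declared in its header.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>J\u00f6rg</name>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(new ByteArrayInputStream(bytes)));
    }
}
```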

char is not int

int i = in.read();
char c = (char) i;
The above code assumes that you can create a character from a number. It's true technically: a character's number is the 16 bit Unicode codepoint number. But it is semantic nonsense. In Java the character is a semantic entity of its own. The character's byte representation is completely decoupled from that. If we encounter a char we don't need to worry whether the character is stored in UTF-8, UTF-16, UCS-4 or ISO-8859-1 internally. It simply doesn't matter. We can compare it to other characters and it will always behave as expected. This concept is not known in C for example. In C the char type is just a numeric type. It can contain anything, even invalid data that does not represent characters. In C you have to know exactly which character encoding a char array uses or you may do wrong things when sorting, printing, searching etc. Also C programs may wrongly assume that a char is one byte long and contains values 0-127 or 0-255, which is true for ASCII, but not for many other character encodings (known as "multi-byte" character encodings). Anyway, in Java use Reader/Writer or CharsetEncoder/CharsetDecoder instead to convert between characters and their byte representation (see the section on undefined encoding below).
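The difference shows up as soon as the data is not plain ASCII. The sketch below assumes that in is an InputStream delivering UTF-8 bytes (the original snippet does not say) and uses an invented class name, ReadChars. Casting the first byte yields a wrong character, while a Reader with a declared encoding decodes the two bytes into the intended one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ReadChars {
    // Anti-pattern: treats one raw byte as if it were a character.
    static char castFirstByte(byte[] data) throws IOException {
        return (char) new ByteArrayInputStream(data).read();
    }

    // A Reader decodes bytes into characters using the given encoding.
    // Reader.read() returns the decoded char as an int, so the cast is fine here.
    static char decodeFirstChar(byte[] data) throws IOException {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            return (char) r.read();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "\u00e9".getBytes(StandardCharsets.UTF_8); // two bytes: 0xC3 0xA9
        System.out.println((int) castFirstByte(data) + " vs " + (int) decodeFirstChar(data));
    }
}
```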

Platform dependent filenames

File tmp = new File("C:\\Temp\\1.tmp");
File exp = new File("export-2013-02-01T12:30.txt");
File f = new File(path +'/'+ filename);
Never hard code paths in a filesystem. Different platforms have different conventions, and you can never be sure that a hard coded path is actually available on a random system. Use API calls to create temporary files. Mind that different file systems have different restrictions on what makes a valid file name. Here the exp file contains a colon character, which is illegal on Windows file systems. When you construct absolute or relative paths in the filesystem, be careful of the platform dependent separator character.
File tmp = File.createTempFile("myapp", ".tmp");
File exp = new File("export-2013-02-01_1230.txt");

File f = new File(path + File.separatorChar + filename);
// or even better
File dir = new File(path);
File f = new File(dir, filename);

Undefined encoding

Reader r = new FileReader(file);
Writer w = new FileWriter(file);
Reader r = new InputStreamReader(inputStream);
Writer w = new OutputStreamWriter(outputStream);
String s = new String(byteArray); // byteArray is a byte[]
byte[] a = string.getBytes();
Each line of the above converts between byte and char using the default platform encoding. The code behaves differently depending on the platform it runs on. This is harmful if the data flows from one platform to another. It is considered bad practice to rely on the default platform encoding at all. Conversions should always be performed with a defined encoding.
Reader r = new InputStreamReader(new FileInputStream(file), "ISO-8859-1");
Writer w = new OutputStreamWriter(new FileOutputStream(file), "ISO-8859-1");
Reader r = new InputStreamReader(inputStream, "UTF-8");
Writer w = new OutputStreamWriter(outputStream, "UTF-8");
String s = new String(byteArray, "ASCII");
byte[] a = string.getBytes("ASCII");
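On Java 7 and later the constants in java.nio.charset.StandardCharsets can be used instead of encoding names. This avoids typos in the name and the checked UnsupportedEncodingException. A minimal sketch, with a made-up class name and UTF-8 chosen arbitrarily:

```java
import java.nio.charset.StandardCharsets;

class ExplicitCharset {
    // Both directions name the encoding explicitly, so the result does not
    // depend on the platform default.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = encode("Gr\u00fc\u00dfe");
        System.out.println(bytes.length + " bytes, round trip ok: "
                + decode(bytes).equals("Gr\u00fc\u00dfe"));
    }
}
```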

Unbuffered streams

InputStream in = new FileInputStream(file);
int b;
while ((b = in.read()) != -1) {
   ...
}
The above code reads a file byte by byte. Every read() call on the stream will cause a native (JNI) call to the native implementation of the filesystem. Depending on the implementation this may cause a syscall to the operating system. JNI calls are expensive and so are syscalls. The number of native calls can be reduced dramatically by wrapping the stream into a BufferedInputStream. Reading 1 MB of data from /dev/zero with the above code took about 1 second on my laptop. With the fixed code below it was down to 60 milliseconds! That's a 94% saving. This also applies for output streams of course. And it is true not only for the file system but also for sockets.
InputStream in = new BufferedInputStream(new FileInputStream(file));
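The effect can be made visible without timing anything by counting how often the underlying stream is actually asked for data. In this sketch an in-memory stream stands in for the file, so it shows the reduction in calls, not the cost of a native call; the class names are invented for the example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadCallCounter {
    /** Counts every read request that reaches the wrapped stream. */
    static class CountingStream extends FilterInputStream {
        long calls;

        CountingStream(InputStream in) { super(in); }

        @Override public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads 'size' bytes one at a time and reports how many requests hit the source.
    static long callsFor(int size, boolean buffered) throws IOException {
        CountingStream source = new CountingStream(new ByteArrayInputStream(new byte[size]));
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        while (in.read() != -1) { /* consume byte by byte */ }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered: " + callsFor(1 << 20, false));
        System.out.println("buffered:   " + callsFor(1 << 20, true));
    }
}
```

Unbuffered, every single byte costs one request to the source; with the buffer the source is only asked once per buffer fill.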

Unbuffered writes to an OutputStreamWriter

Writer w = new OutputStreamWriter(os, "UTF-8");
while (...) {
  w.write("something");
}
As demonstrated OutputStreamWriter uses memory for each call to its write() methods. This is very unfortunate and not the behaviour that one would expect! If you do many writes, you should wrap it in a BufferedWriter, which (also unexpectedly) seems to use no memory at all:
Writer w = new BufferedWriter(new OutputStreamWriter(os, "UTF-8"));

Infinite heap

byte[] pdf = toPdf(file);
Here a method creates a PDF file from some input and returns the binary PDF data as a byte array. This code assumes that the generated file is small enough to fit into the available heap memory. If this code can not make this 100% sure then it is vulnerable to an out of memory condition. Especially if this code is run server-side which usually means many parallel threads. Bulk data must never be handled with byte arrays. Streams should be used and the data should be spooled to disk or a database.
File pdf = toPdf(file);
A similar anti-pattern is to buffer streaming input from an "untrusted" (security term) source. Such as buffering data that arrives on a network socket. If the application doesn't know how much data will be arriving it must make sure that it keeps an eye on the size of the data. If the amount of buffered data exceeds sane limits an error condition (exception) should be signalled to the caller, rather than driving the application against the wall by letting it run into an out of memory condition.
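A minimal sketch of such a guard could look as follows. The helper name readAtMost and the choice of IOException for the error condition are assumptions made for this example; the limit itself has to come from the application.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class BoundedRead {
    // Buffers the stream in memory but gives up as soon as the data
    // would exceed maxBytes, instead of running out of heap.
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            if ((long) out.size() + n > maxBytes) {
                throw new IOException("input exceeds limit of " + maxBytes + " bytes");
            }
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] small = readAtMost(new ByteArrayInputStream(new byte[10]), 100);
        System.out.println("read " + small.length + " bytes");
    }
}
```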

Infinite time

Socket socket = ...
socket.connect(remote);
InputStream in = socket.getInputStream();
int i = in.read();
The above code has two blocking calls that use unspecified timeouts. Imagine if the timeout is infinite. That may cause the application to hang forever. Generally it is an extremely stupid idea to have infinite timeouts in the first place. Infinity is extremely long. Even by the time the Sun turns into a red giant (it explodes), it's still a looong way to Infinity. The average programmer dies at 72. There is simply no real-world situation where we want to wait that long. Infinite timeout is just an absurd thing. Use an hour, day, week, month, 1 year, 10 years. But not Infinity. To connect to a remote machine I personally find 20 seconds plenty of timeout. A human is not even that patient and would cancel the operation before then. While there is a nice overload of the connect() method that takes a timeout parameter, there is no such thing for read(). But you can modify a Socket's socket timeout before every blocking call. (Not just once! You can set different timeouts for different situations.) The socket will throw an exception on blocking calls after that timeout. Also frameworks that communicate over the network should provide an API to control these timeouts and use sensible default values. Infinity is not sensible - it's insane and drives you mad. Who came up with this absolutely useless infinity timeout anyway?
Socket socket = ...
socket.connect(remote, 20000); // fail after 20s
InputStream in = socket.getInputStream();
socket.setSoTimeout(15000);
int i = in.read();
Unfortunately the file system API (FileInputStream, FileChannel, FileDescriptor, File) provides no way to set timeouts on file operations. That's very unfortunate. Because these are the most common blocking calls in a Java application: writing to stdout/stderr and reading from stdin are file operations, and writing to log files is common. Operations on the standard input/output streams depend directly on other processes outside of our Java VM. If they decide to block forever, so will reads/writes to these streams in our application. Disk I/O is a limited resource for which all processes on a system compete. There is no guarantee that a simple read/write on a file is quick. It may incur unspecified wait time. Also today remote file systems are ubiquitous. Disks may be on a SAN/NAS, or file systems may be mounted over the network (NFS, AFS, CIFS/Samba). So a filesystem call may actually be a network call: too bad that we don't have the power of the network API here! So if the OS decides that the timeout for the write is 60 seconds you're stuck with it. It is a mistake to assume that any disk/file operation is fast, or even remotely instantaneous. An application can do the user a favour by assuming that a file operation can take seconds. So it's best avoided or done asynchronously (in the background). Solutions to this problem are: adequate buffering and queueing/asynchronous processing.
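One way to sketch the asynchronous approach is to run the blocking operation on another thread and let the caller wait only for a bounded time. Everything here is an assumption made for illustration: the class and method names, the fallback value, and the throwaway single-thread executor (real code would more likely share a pool). Note that this only stops the caller from waiting; a thread stuck in a file read may not react to the interrupt and can stay blocked in the background.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedCall {
    // Runs the task in the background and returns the fallback
    // if no result is available within timeoutMs.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMs, T fallback) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // a stuck worker must not keep the VM alive
            return t;
        });
        Future<T> result = pool.submit(task);
        try {
            return result.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true); // best effort: interrupts the worker
            return fallback;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        String r = callWithTimeout(() -> { Thread.sleep(5000); return "late"; }, 50, "timed out");
        System.out.println(r);
    }
}
```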

Assuming a cheap timer call

for (...) {
  long t = System.currentTimeMillis();
  long n = System.nanoTime();
  Date d = new Date();
  Calendar c = new GregorianCalendar();
}

Creating a new Date or Calendar performs a syscall to obtain the current time. On Unix/Linux this is the syscall gettimeofday which is considered "extremely cheap". Well, extremely cheap only compared to other syscalls! In that it usually doesn't require a switch from userspace to kernelspace but is rather implemented as a read from a memory mapped page. Still calls to gettimeofday are expensive compared to normal code execution. The exact penalty of the call strongly depends on the architecture and even configuration (modern x86 systems have numerous timers that can be used by the OS: HPET, TSC, RTC, ACPI, clock chips etc.). On my Linux-2.6.37-rc7 system the timer calls also seem to be synchronised over the system. That means the total available bandwidth of ~800 calls per ms is shared by all threads/processes. Consequently my dual core running with 2 threads was able to make ~400 calls per ms per thread. (Thanks to J. Davies for that hint.) And last but not least the resolution of this timer is not infinite. At best it is milliseconds, but it may well be something like 25 to 50 milliseconds with a large jitter. Modern Linux systems can easily achieve the full ms resolution in System.currentTimeMillis. But that has not always been the case. System.nanoTime will certainly not have its full theoretical resolution: 1ns = 10^-9 s, which corresponds to 1GHz. So on a CPU with 3GHz this would allow ~3 instructions to execute the call, which is obviously not enough. I measured a large jitter between 800ns and 1000000ns (1ms). Clearly calling gettimeofday every 100 nanoseconds is wasteful.

Most of the time you don't need the current time that precisely. Caching it outside of the loop is trivial. This way you only access the timer once. You can still decide to clone the Date instance, if you really need different objects. Cloning is extremely cheap compared to a timer access (factor 50 on my system).

Date d = new Date();
for (E entity : entities) {
  entity.doSomething();
  entity.setUpdated((Date) d.clone());
}

Caching the time may not be an option if the loop runs for more than a couple of milliseconds. In that case you may set up a timer that periodically updates a timestamp variable with the current time (using interrupts). Set it to the exact granularity that you need. The coarser that granularity is, the better. On my system this loop is 200 times faster than creating a new Date each time.

private volatile long time;

Timer timer = new Timer(true);
try {
  time = System.currentTimeMillis();
  timer.scheduleAtFixedRate(new TimerTask() {
    public void run() {
      time = System.currentTimeMillis();
    }
  }, 0L, 10L); // granularity 10ms
  for (E entity : entities) {
     entity.doSomething();
     entity.setUpdated(new Date(time));
  }
} finally {
  timer.cancel();
}

Catch all: I don't know the right runtime exception

Query q = ...
Person p;
try {
    p = (Person) q.getSingleResult();
} catch(Exception e) {
    p = null;
}
This is an example of a J2EE EJB3 query. The getSingleResult throws runtime exceptions when a) the result is not unique, b) there is no result c) when the query could not be executed due to database failure or so. The code above just catches any exception. A typical catch-all block. Using null as a result may be the right thing for case b) but not for case a) or c). In general one should not catch more exceptions than necessary. The correct exception handling is
Query q = ...
Person p;
try {
    p = (Person) q.getSingleResult();
} catch(NoResultException e) {
    p = null;
}

Exceptions are annoying

try {
    doStuff();
} catch(Exception e) {
    log.fatal("Could not do stuff");
}
doMoreStuff();
There are two problems with this tiny piece of code. First, if this is really a fatal condition then the method should abort and notify the caller of the fatal condition with an appropriate exception (so why is it caught in the first place?). Hardly ever can you just continue after a fatal condition. Second, this code is very hard to debug because the reason for the failure is lost. Exception objects carry detailed information about where the error occurred and what caused it. Individual subclasses may actually carry a lot of extra information that the caller can use to deal with the situation properly. It's a lot more than a simple error code (which is so popular in the C world. Just look at the Linux kernel. return -EINVAL everywhere...). If you catch high-level exceptions then at least log the message and stack trace. You should not see exceptions as a necessary evil. They are a great tool for error handling.
try {
    doStuff();
} catch(Exception e) {
    throw new MyRuntimeException("Could not do stuff because: "+ e.getMessage(), e);
}

Re-wrapping RuntimeException

try {
  doStuff();
} catch(Exception e) {
  throw new RuntimeException(e);
}
Sometimes you really want to re-throw any checked exception as RuntimeException. The above piece of code doesn't take into account however, that RuntimeException extends Exception. The RuntimeException doesn't need to be caught here. Also the exception's message is not propagated properly. A bit better is to catch the RuntimeException separately and not wrap it. Even better is to catch all the checked exceptions individually (even if they are a lot).
try {
  doStuff();
} catch(RuntimeException e) {
  throw e;
} catch(Exception e) {
  throw new RuntimeException(e.getMessage(), e);
}
try {
  doStuff();
} catch(IOException e) {
  throw new RuntimeException(e.getMessage(), e);
} catch(NamingException e) {
  throw new RuntimeException(e.getMessage(), e);
}

Not properly propagating the exception

try {
} catch(ParseException e) {
  throw new RuntimeException();
  throw new RuntimeException(e.toString());
  throw new RuntimeException(e.getMessage());
  throw new RuntimeException(e);
}
This code just wraps a parsing error into a runtime exception in different ways. None of them provides really good information to the caller. The first just loses all information. The second may do anything depending on what information toString() produces. The default toString() implementation lists the fully qualified exception name followed by the message. Nesting many exceptions will produce an unwieldy long and ugly string, unsuitable for a user. The third just preserves the message, which is better than nothing. The last preserves the cause, but sets the message of the runtime exception to toString() of its cause (see above). The most useful and readable version is to propagate only the cause message in the runtime exception and pass the original exception as the cause:
try {
} catch(ParseException e) {
  throw new RuntimeException(e.getMessage(), e);
}
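The difference between the last variant and the recommended one can be checked directly. In this small sketch the class name WrapDemo and the sample message are invented; only standard Throwable behaviour is used.

```java
import java.text.ParseException;

class WrapDemo {
    // Message becomes cause.toString(): class name plus message.
    static RuntimeException causeOnly(Exception e) {
        return new RuntimeException(e);
    }

    // Message is just the cause's message; the cause itself is kept.
    static RuntimeException messageAndCause(Exception e) {
        return new RuntimeException(e.getMessage(), e);
    }

    public static void main(String[] args) {
        ParseException e = new ParseException("bad input", 0);
        System.out.println(causeOnly(e).getMessage());
        System.out.println(messageAndCause(e).getMessage());
    }
}
```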

Catching to log

try {
    ...
} catch(ExceptionA e) {
    log.error(e.getMessage(), e);
    throw e;
} catch(ExceptionB e) {
    log.error(e.getMessage(), e);
    throw e;
}
This code only catches exceptions to write out a log statement and then rethrows the same exception. This is stupid. Let the caller decide if the message is important to log and remove the whole try/catch clause. It's only useful when you know that the caller doesn't log it. That's the case if the method is called by a framework which is not under your control.

Incomplete exception handling

try {
    is = new FileInputStream(inFile);
    os = new FileOutputStream(outFile);
} finally {
    try {
        is.close();
        os.close();
    } catch(IOException e) {
        /* we can't do anything */
    }
}
If streams are not closed, the underlying operating system can't free native resources. This programmer wanted to be careful about closing both streams. So he put the close in a finally clause. But if is.close() throws an IOException then os.close() is not even executed. Both close statements must be wrapped in their own try/catch clause. Moreover, if creating the input stream throws an exception (because the file was not found) then is and os are still null and the close calls will throw a NullPointerException. To make this less verbose I have stripped some newlines.
try {
    is = new FileInputStream(inFile);
    os = new FileOutputStream(outFile);
} finally {
    try { if (is != null) is.close(); } catch(IOException e) {/* we can't do anything */}
    try { if (os != null) os.close(); } catch(IOException e) {/* we can't do anything */}
}
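From Java 7 on, the try-with-resources statement can take over this bookkeeping. The following is a sketch of a file copy under that assumption; the class name CopyFile and the buffer size are arbitrary. Unlike the code above it does not swallow exceptions from close(): they propagate to the caller, or are attached as suppressed exceptions if the body already failed.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class CopyFile {
    // Both streams are closed automatically, in reverse order, even if
    // opening the second one or the copy loop throws.
    static void copy(File inFile, File outFile) throws IOException {
        try (InputStream is = new BufferedInputStream(new FileInputStream(inFile));
             OutputStream os = new BufferedOutputStream(new FileOutputStream(outFile))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = is.read(buf)) != -1) {
                os.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File src = File.createTempFile("copy", ".src");
        File dst = File.createTempFile("copy", ".dst");
        copy(src, dst);
        System.out.println("copied " + dst.length() + " bytes");
        src.delete();
        dst.delete();
    }
}
```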

The exception that never happens

try {
  ... do risky stuff ...
} catch(SomeException e) {
  // never happens
}
... do some more ...
Here the developer executes some code in a try/catch block. He doesn't want to rethrow the exception that one of the called methods declares to his annoyance. As the developer is clever he knows that in his particular situation the exception will never be thrown, so he just inserts an empty catch block. He even puts a nice comment in the empty catch block - but they are famous last words... The problem with this is: how can he be sure? What if the implementation of the called method changes? What if the exception is still thrown in some special case but he just didn't think of it? The code after the try/catch may do the wrong thing in that situation. The exception will go completely unnoticed. The code can be made much more reliable by throwing a runtime exception in the case. This works like an assertion and adheres to the "crash early" principle. The developer will notice if his assumption was wrong. The code after the try/catch will not be executed if the exception occurred against all honest hope and expectation. If the exception really never occurs - fine, nothing changed.
try {
  ... do risky stuff ...
} catch(SomeException e) {
  // never happens hopefully
  throw new IllegalStateException(e.getMessage(), e); // crash early, passing all information
}
... do some more ...

The transient trap

public class A implements Serializable {
    private String someState;
    private transient Log log = LogFactory.getLog(getClass());
    
    public void f() {
        log.debug("enter f");
        ...
    }
}
Log objects are not serializable. The programmer knew this and correctly declared the log field as transient so it is not serialised. However the initialisation of this variable happens in the class' initialiser. Upon deserialization initializers and constructors are not executed! This leaves the deserialized object with a null log variable which subsequently causes a NullPointerException in f(). Rule of thumb: never use class initialization with transient variables. You can either solve this case here by using a static variable or by using a local variable:
public class A implements Serializable {
    private String someState;
    private static final Log log = LogFactory.getLog(A.class);
    
    public void f() {
        log.debug("enter f");
        ...
    }
}

public class A implements Serializable {
    private String someState;
    
    public void f() {
        Log log = LogFactory.getLog(getClass());
        log.debug("enter f");
        ...
    }
}
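The trap is easy to reproduce with an in-memory serialization round trip. In this sketch a plain Object stands in for the Log instance so that no logging library is needed; the class and field names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        String someState = "state";
        // Stands in for the non-serializable Log field of the example.
        transient Object helper = new Object();
    }

    // Serializes the object to a byte array and reads it back.
    static Holder roundTrip(Holder h) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(h);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Holder) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Holder copy = roundTrip(new Holder());
        // The field initializer did not run again for the deserialized copy.
        System.out.println("state=" + copy.someState + ", helper=" + copy.helper);
    }
}
```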

Overkill initialization

public class B {
    private int count = 0;
    private String name = null;
    private boolean important = false;
}
This programmer used to code in C. So naturally he wants to make sure every variable is properly initialized. Here however it is not necessary. The Java language specification guarantees that member variables are initialized with certain values automatically: 0, null, false. By declaring them explicitly the programmer causes a class initializer to be executed before the constructor. This is unnecessary overkill and should be avoided.
public class B {
    private int count;
    private String name;
    private boolean important;
}

Log instances: static or not?

An earlier version of this section suggested not storing log instances in static variables. Turns out I was wrong. Mea culpa. I apologize.
Store the darn log instance in a static final variable and be happy.
private static final Log log = LogFactory.getLog(MyClass.class);
Here is why:

Choosing the wrong class loader

Class clazz = Class.forName(name);
Class clazz = getClass().getClassLoader().loadClass(name);
This code uses the class loader that loaded the current class. getClass() might return something unexpected, like a subclass, or a dynamic proxy. Something out of your control. This is hardly ever what you want when you dynamically load an additional class. Especially in managed environments like Application servers, Servlet engines or Java Webstart this is most certainly wrong. This code will behave very differently depending on the environment it is run in. Environments use the context class loader to provide applications with a class loader they should use to retrieve "their own" classes.
ClassLoader cl = Thread.currentThread().getContextClassLoader();
if (cl == null) cl = MyClass.class.getClassLoader(); // fallback
Class clazz = cl.loadClass(name);

Poor use of reflection

Class beanClass = ...
if (beanClass.newInstance() instanceof TestBean) ...
This programmer is struggling with the reflection API. He needs a way to check for inheritance but didn't find a way to do it. So he just creates a new instance and uses the instanceof operator he is used to. Creating an instance of a class you don't know is dangerous. You never know what this class does. It may be very expensive. Or the default constructor may not even exist. Then this if statement would throw an exception. The correct way of doing this check is to use the Class.isAssignableFrom(Class) method. Its operands are reversed compared to instanceof: the expected supertype is the receiver and the class under test is the argument.
Class beanClass = ...
if (TestBean.class.isAssignableFrom(beanClass)) ...
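Because the reversed direction is easy to get wrong, a tiny helper with named parameters may make the intent clearer. This is only a sketch: the helper isKindOf is hypothetical and JDK classes stand in for TestBean.

```java
class AssignableDemo {
    // True if 'candidate' is 'expected' itself or a subtype of it.
    // No instance of 'candidate' is ever created.
    static boolean isKindOf(Class<?> candidate, Class<?> expected) {
        return expected.isAssignableFrom(candidate);
    }

    public static void main(String[] args) {
        System.out.println(isKindOf(Integer.class, Number.class)); // Integer is a Number
        System.out.println(isKindOf(Number.class, Integer.class)); // but not the other way round
    }
}
```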

Synchronization overkill

Collection l = new Vector();
for (...) {
   l.add(object);
}
Vector is a synchronized ArrayList. And Hashtable is a synchronized HashMap. Both classes should only be used if synchronization is explicitly required. If however those collections are used as local temporary variables the synchronization is complete overkill and degrades performance considerably. I measured a 25% penalty.
Collection l = new ArrayList();
for (...) {
   l.add(object);
}

Wrong list type

Without sample code. Junior developers often have difficulty choosing the right list type. They usually choose quite randomly from Vector, ArrayList and LinkedList. But there are performance considerations to make! The implementations behave quite differently when adding, iterating or accessing objects by index. I'll ignore Vector in this list because it behaves like an ArrayList, just slower. NB: n is the size of the list, not the number of operations!
                  ArrayList                               LinkedList
add (append)      O(1) or ~O(log(n)) if growing           O(1)
insert (middle)   O(n) or ~O(n*log(n)) if growing         O(n)
remove (middle)   O(n) (always performs complete copy)    O(n)
iterate           O(n)                                    O(n)
get by index      O(1)                                    O(n)
The insert performance of the ArrayList depends on whether it has to grow during the insert or whether the initial capacity is reasonably set. The growth is exponential (by a factor of about 1.5 in the Sun JDK; Vector doubles), so a list only has to grow O(log(n)) times on its way to size n. The exponential growth however may use much more memory than you actually need. The sudden need to resize the list also makes the response time sluggish and will probably cause a major garbage collection if the list is large. Iterating over the lists is equally inexpensive. Indexed list element access however is very slow in linked lists of course.
Memory considerations: LinkedList wraps every element into a wrapper object. ArrayList allocates a completely new array each time it needs to grow and performs an array copy on every remove(). All standard Collections can not reuse their Iterator objects, which may cause iterator churn especially when recursively iterating large tree structures.
Personally I almost never use LinkedList. It would really only make sense when you wanted to insert objects in the middle of a list. But without access to the wrapper object this doesn't scale with O(1) but O(n) because you must first traverse the list until you find the insert position. So what exactly is the point of the LinkedList class? I recommend using ArrayLists only.

The HashMap size trap

Map map = new HashMap(collection.size());
for (Object o : collection) {
  map.put(o.key, o.value);
}
This developer had good intentions and wanted to make sure that the HashMap doesn't need to be resized. He thus set its initial capacity to the number of elements he was going to put into it. Unfortunately the HashMap implementation doesn't quite behave like this. It sets its internal threshold to threshold = (int)(capacity * loadFactor). So with the default load factor it will resize after 75% of the capacity has been filled. Since HashMap rounds the capacity up to a power of two, the above code does not resize in every case, but it usually will, and that causes extra garbage.
Map map = new HashMap(1 + (int) (collection.size() / 0.75));
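
If you need this in more than one place, the formula can be wrapped in a small helper. This is only a sketch: MapSizing, capacityFor and newHashMap are hypothetical names, not JDK API, and 0.75 is assumed to be the load factor in use (it is the HashMap default).

```java
import java.util.HashMap;
import java.util.Map;

class MapSizing {
    // Default load factor of java.util.HashMap.
    static final float LOAD_FACTOR = 0.75f;

    // Initial capacity that lets 'expected' entries fit below the resize threshold.
    static int capacityFor(int expected) {
        return 1 + (int) (expected / LOAD_FACTOR);
    }

    // Hypothetical convenience factory, not part of the JDK.
    static <K, V> Map<K, V> newHashMap(int expected) {
        return new HashMap<K, V>(capacityFor(expected), LOAD_FACTOR);
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(12));  // 17
        System.out.println(capacityFor(100)); // 134
    }
}
```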

Hashtable, HashMap and HashSet are overrated

These classes are extremely popular. Because they have great usability for the developer. Unfortunately they are also horribly inefficient. Hashtable and HashMap wrap every key/value pair into an Entry wrapper object. An Entry object is surprisingly large. Not only does it hold a reference to key and value, but also stores the hash code and a forward reference to the next Entry of the hash bucket. When you look at heap dumps with a memory analyzer you will be shocked by how much space is wasted by them in large applications like an application server. When you look at the source code of HashSet you will see that the developers were extremely lazy and just used a HashMap in the backend!
Before using any of these classes, think again. IdentityHashMap can be a viable alternative. But be careful, it intentionally breaks the Map interface. It is much more memory efficient by implementing an open hashtable (no buckets), doesn't need an Entry wrapper and uses a simple Object[] as its backend. Instead of a HashSet a simple ArrayList may do similarly well (you can use contains(Object)) as long as it's small and lookups are rare.
For Sets that contain only a handful of entries the whole hashing is overkill and the memory wasted for the HashMap backend plus the wrapper objects is just nuts. Just use an ArrayList or even an array.
Actually it's a shame that there are no efficient Map and Set implementations in the standard JDK!

Lists are overrated

List implementations are also very popular. But even lists are often not necessary. Simple arrays may do as well. I am not saying that you should not use Lists at all. They are great to work with. But know when to use arrays. The following code illustrates typical indicators that you should be using an array instead of a list:
List<Integer> codes = new ArrayList<Integer>();
codes.add(Integer.valueOf(10));
codes.add(Integer.valueOf(20));
codes.add(Integer.valueOf(30));
codes.add(Integer.valueOf(40));

versus

int[] codes = { 10, 20, 30, 40 };

// horribly slow and a memory waster if l has a few thousand elements (try it yourself!)
List<Mergeable> l = ...;
for (int i=0; i < l.size()-1; i++) {
    Mergeable one = l.get(i);
    Iterator<Mergeable> j = l.listIterator(i+1); // memory allocation!
    while (j.hasNext()) {
        Mergeable other = j.next();
        if (one.canMergeWith(other)) {
            one.merge(other);
            j.remove(); // remove 'other' from the list via the iterator
        }
    }
}

versus

// quite fast and no memory allocation
Mergeable[] l = ...;
for (int i=0; i < l.length-1; i++) {
    Mergeable one = l[i];
    for (int j=i+1; j < l.length; j++) {
        Mergeable other = l[j];
        if (one.canMergeWith(other)) {
            one.merge(other);
            l[j] = null;
        }
    }
}
You save an extra list object (wrapping an array), wrapper objects and possibly lots of iterator instances. Even Sun realized this. That's why Collections.sort() actually copies the list into an array and performs the sort on the array.

Object arrays are soooo flexible

/**
 * @returns [1]: Location, [2]: Customer, [3]: Incident
 */
Object[] getDetails(int id) {...
Even though documented, this kind of passing back values from a method is ugly and error prone. You should really declare a small class that holds the objects together. This is analogous to a struct in C.
Details getDetails(int id) {...}

private class Details {
    public Location location;
    public Customer customer;
    public Incident incident;
}

Premature object decomposition

public void notify(Person p) {
    ...
    sendMail(p.getName(), p.getFirstName(), p.getEmail());
    ...
}
class PhoneBook {
    String lookup(String employeeId) {
        Employee emp = ...
        return emp.getPhone();
    }
}
In the first example it's painful to decompose an object just to pass its state on to a method. In the second example the use of this method is very limited. If the overall design allows it, pass the object itself.
public void notify(Person p) {
    ...
    sendMail(p);
    ...
}
class EmployeeDirectory {
    Employee lookup(String employeeId) {
        Employee emp = ...
        return emp;
    }
}

Modifying setters

private String name;

public void setName(String name) {
    this.name = name.trim();
}

public String getName() {
    return this.name;
}
This poor developer suffered from spaces at the beginning or end of a name entered by the user. He thought he was being clever and just removed the spaces inside the setter method of a bean. But how odd is a bean that modifies its data instead of just holding it? Now the getter returns different data than was set by the setter! If this was done inside an EJB3 entity bean a simple read from the DB would actually modify the data: For every INSERT there would be an UPDATE statement. Let alone how hard it is to debug these side-effects! In general, a bean should not modify its data. It is a data container, not business logic. Do the trimming where it makes sense: in the controller where the input occurs or in the logic where the spaces are not wanted.
person.setName(textInput.getText().trim());

Unnecessary Calendar

Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("Europe/Zurich"));
cal.setTime(date);
cal.add(Calendar.HOUR_OF_DAY, 8);
date = cal.getTime();
A typical mistake by a developer who is confused about date, time, calendars and time zones. To add 8 hours to a Date there is no need for a Calendar. Neither is the time zone of any relevance. (Think about it if you don't understand this!) However if we wanted to add days (not hours) we would need a Calendar, because we don't know the length of a day for sure (on DST change days a day may have 23 or 25 hours).
date = new Date(date.getTime() + 8L * 3600L * 1000L); // add 8 hrs
Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("Europe/Zurich"));
SimpleDateFormat df = new SimpleDateFormat("dd.MM.yyyy HH:mm");
df.setCalendar(cal);
Here the Calendar object is completely unnecessary. The DateFormat object already contains a Calendar instance. Reuse that.
SimpleDateFormat df = new SimpleDateFormat("dd.MM.yyyy HH:mm");
df.setTimeZone(TimeZone.getTimeZone("Europe/Zurich"));
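
The remark above that adding days (unlike hours) does need a Calendar and a time zone can be made concrete. The following is a minimal sketch; AddDays and its methods are hypothetical names. It uses the switch to summer time in Zurich on 29 March 2009, where one calendar day lasts only 23 hours.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class AddDays {
    // Adds calendar days in the given time zone; the Calendar accounts for DST changes.
    static Date addDays(Date date, int days, TimeZone tz) {
        Calendar cal = new GregorianCalendar(tz);
        cal.setTime(date);
        cal.add(Calendar.DAY_OF_MONTH, days);
        return cal.getTime();
    }

    // Noon on 28 March 2009 in the given zone (the day before the DST switch in Zurich).
    static Date noonBeforeDstChange(TimeZone tz) {
        Calendar cal = new GregorianCalendar(tz);
        cal.clear(); // also resets the millisecond field
        cal.set(2009, Calendar.MARCH, 28, 12, 0, 0);
        return cal.getTime();
    }

    public static void main(String[] args) {
        TimeZone zurich = TimeZone.getTimeZone("Europe/Zurich");
        Date start = noonBeforeDstChange(zurich);
        Date next = addDays(start, 1, zurich);
        long hours = (next.getTime() - start.getTime()) / (3600L * 1000L);
        System.out.println(hours); // 23, not 24
    }
}
```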

Relying on the default TimeZone

Calendar cal = new GregorianCalendar();
cal.setTime(date);
cal.set(Calendar.HOUR_OF_DAY, 0);
cal.set(Calendar.MINUTE, 0);
cal.set(Calendar.SECOND, 0);
Date startOfDay = cal.getTime();
The developer wanted to calculate the start of the day (0h00). First he obviously missed the millisecond field of the Calendar. But the real big mistake is not setting the TimeZone of the Calendar object. The Calendar will thus use the default time zone. This may be fine in a desktop application, but in server-side code this is hardly ever what you want: 0h00 in Shanghai is a very different moment than 0h00 in London. The developer needs to check which time zone is relevant for this computation.
Calendar cal = new GregorianCalendar(user.getTimeZone());
cal.setTime(date);
cal.set(Calendar.HOUR_OF_DAY, 0);
cal.set(Calendar.MINUTE, 0);
cal.set(Calendar.SECOND, 0);
cal.set(Calendar.MILLISECOND, 0);
Date startOfDay = cal.getTime();

Time zone "conversion"

public static Date convertTz(Date date, TimeZone tz) {
  Calendar cal = Calendar.getInstance();
  cal.setTimeZone(TimeZone.getTimeZone("UTC"));
  cal.setTime(date);
  cal.setTimeZone(tz);
  return cal.getTime();
}

If you think this method does something useful, please go and read the article about time. This developer had not read the article and was desperately trying to "fix" the time zone of his date. Actually the method does nothing. The returned Date will not have any different value than the input. Because a Date does not carry time zone information. It is always UTC. And the getTime / setTime methods of Calendar always convert between UTC and the actual time zone of the Calendar.
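
A short sketch may help to see why there is nothing to "convert": the same Date instant is merely rendered differently per time zone. DateHasNoZone and format are hypothetical names; the fixed Locale is my addition so that the output does not depend on the default locale.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateHasNoZone {
    // Formats the same instant for a given time zone. The Date itself is never changed.
    static String format(Date instant, String zoneId) {
        SimpleDateFormat df = new SimpleDateFormat("dd.MM.yyyy HH:mm", Locale.US);
        df.setTimeZone(TimeZone.getTimeZone(zoneId));
        return df.format(instant);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // 1970-01-01 00:00 UTC
        System.out.println(format(epoch, "UTC"));           // 01.01.1970 00:00
        System.out.println(format(epoch, "Asia/Shanghai")); // 01.01.1970 08:00
        System.out.println(epoch.getTime());                // still 0
    }
}
```

The time zone belongs to the formatter (or Calendar), never to the Date.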

Using Calendar.getInstance()

Calendar c = Calendar.getInstance();
c.set(2009, Calendar.JANUARY, 15);

This code assumes a Gregorian calendar. But what if the returned Calendar subclass is a Buddhist, Julian, Hebrew, Islamic, Iranian or Discordian calendar? In these the year 2009 has a very different meaning. And a month called January doesn't exist. Calendar.getInstance() uses the current default locale to select an appropriate implementation. It depends on the Java implementation which implementations are available. The utility of Calendar.getInstance() is thus very limited, and its use should be avoided as its result is not well defined.

Calendar c = new GregorianCalendar(timeZone);
c.set(2009, Calendar.JANUARY, 15);

Dangerous Calendar manipulation

GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("Europe/Zurich"));
cal.set(Calendar.SECOND, 0);
cal.set(Calendar.MILLISECOND, 0);
if (cal.before(other)) doSomething();

cal.setTimeZone(TimeZone.getTimeZone("GMT"));
cal.set(Calendar.HOUR_OF_DAY, 23);
Date d = cal.getTime();

This code manipulates a Calendar object in ways that are bound to yield undefined results. Calendar objects have complex inner state: individual fields for day, hour, year etc., a millisecond since epoch value (like Date) and a time zone. Depending on what you change, some of these fields are invalidated and are only recomputed from other values when you call certain methods:

Whenever you change fields with set(), the dependent fields do not get updated until you call get(), getTime(), getTimeInMillis(), add(), or roll(). The first paragraph of the above code calls set() followed by before(). There is no guarantee (according to the API Doc) that before() will see the modified time value.

The second paragraph invalidates all fields and the millisecond since epoch value by calling setTimeZone() and set(), losing the calendar's data completely. See also bug 4827490

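Calendar objects should therefore always be manipulated such that the fields are recomputed (by calling get(), getTime() or getTimeInMillis()) after a set() or setTimeZone() and before the calendar is used further: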

GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("Europe/Zurich"));
cal.set(Calendar.SECOND, 0);
cal.set(Calendar.MILLISECOND, 0);
cal.getTimeInMillis();
if (cal.before(other)) doSomething();

cal.setTimeZone(TimeZone.getTimeZone("GMT"));
cal.get(Calendar.DATE);
cal.set(Calendar.HOUR_OF_DAY, 23);
Date d = cal.getTime();

Calling Date.setTime()

account.changePassword(oldPass, newPass);
Date lastmod = account.getLastModified();
lastmod.setTime(System.currentTimeMillis());

The above code updates the last modified date of the account entity. The programmer wants to be conservative and avoids creating a new Date object. Instead she uses the setTime method to modify the existing Date instance.

There is actually nothing wrong with that. But I just do not recommend this practice. Date objects are usually passed around carelessly. The same Date instance could be passed to numerous objects, which don't make a copy in their setters. Dates are often used like primitives. Thus if you modify a Date instance, other objects that use this instance might behave unexpectedly. Of course it is unclean design if an object exposes its intrinsic Date instance to the outside world, if you write code that strictly adheres to classical OO-principles (which I think is too inconvenient). General everyday Java practice however is to just copy Date references and not clone the object in setters. Thus every programmer should treat Date as immutable and should not modify existing instances. This should only be done for performance reasons in special situations. Even then the use of a simple long is probably equally good.

account.changePassword(oldPass, newPass);
account.setLastModified(new Date());
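
If you cannot rely on every caller treating Date as immutable, the stricter alternative mentioned above is to copy in getters and setters. This is only a sketch of that technique, not a recommendation to do it everywhere; the Account class shown here is hypothetical.

```java
import java.util.Date;

// Hypothetical entity, for illustration only.
class Account {
    private Date lastModified = new Date(0L);

    // Return a copy so callers cannot change our state through the reference.
    Date getLastModified() {
        return new Date(lastModified.getTime());
    }

    // Store a copy so later changes to the caller's Date do not affect us.
    void setLastModified(Date d) {
        this.lastModified = new Date(d.getTime());
    }
}
```

The price is one extra allocation per call, which is exactly the inconvenience the pragmatic "treat Date as immutable" rule avoids.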

Assuming SimpleDateFormat was thread-safe

public class Constants {
    public static final SimpleDateFormat date = new SimpleDateFormat("dd.MM.yyyy");
}

The above code is flawed in several ways. It's broken, because it shares a static instance of a SimpleDateFormat with possibly any number of threads. SimpleDateFormat is not thread-safe. If multiple threads concurrently use this object the results are undefined. You may observe strange output from format and parse or even exceptions. Unfortunately this mistake is very common!

Yes, sharing a SimpleDateFormat requires proper synchronization. Yes that comes at a price (cache flushes, lock contention, etc.). And yes, creating a SimpleDateFormat is not free either (pattern parsing, object allocation). But simply ignoring thread-safety is not a solution, but a sure way to break your code.

Of course this code also doesn't take the time zone into account. And then defining a class called Constants screams of yet another anti-pattern (see next section).
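
One common way to avoid both the sharing and the synchronization is to keep one formatter per thread. The following is a sketch under assumptions: DateFormats and formatDay are hypothetical names, and UTC is just a placeholder for whatever time zone your use case really needs. Creating a new SimpleDateFormat per call is the simpler alternative if it is not on a hot path.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateFormats {
    // One SimpleDateFormat per thread: nothing is shared between threads.
    private static final ThreadLocal<SimpleDateFormat> DAY =
        new ThreadLocal<SimpleDateFormat>() {
            @Override
            protected SimpleDateFormat initialValue() {
                SimpleDateFormat df = new SimpleDateFormat("dd.MM.yyyy", Locale.US);
                // Assumption: UTC stands in for the zone relevant to your use case.
                df.setTimeZone(TimeZone.getTimeZone("UTC"));
                return df;
            }
        };

    // Safe to call from any number of threads concurrently.
    static String formatDay(Date date) {
        return DAY.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(formatDay(new Date(0L))); // 01.01.1970
    }
}
```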

Having a global Configuration/Parameters/Constants class

public interface Constants {
    String version = "1.0";
    String dateFormat = "dd.MM.yyyy";
    String configFile = ".apprc";
    int maxNameLength = 32;
    String someQuery = "SELECT * FROM ...";
}

Often seen in large projects: one class or interface that contains all sorts of constants that are used throughout the application. Why is this bad? Because these constants are unrelated to each other. This class is the only thing that they have in common. And the reference to this class will pollute many again unrelated components of the application. You want to later extract a component and use it in a different application? Or share some classes between a server and a remote client? You may need to ship the constants class as well! This class has introduced a dependency between otherwise unrelated components. This inhibits reuse and loose coupling and gives way to chaos.

Instead put constants where they belong. In no case should constants be used across component boundaries. This is only allowed if the component is a library, on which an explicit dependency is wanted.

Not noticing overflows

public int getFileSize(File f) {
  long l = f.length();
  return (int) l;
}

This developer, for whatever reason, wrapped a call to determine the size of a file into a method that returns an int instead of a long. This code does not support files larger than 2 GB and just returns a wrong length in that case. Code that casts a value to a smaller size type must first check for a possible overflow and throw an exception.

public int getFileSize(File f) {
  long l = f.length();
  if (l > Integer.MAX_VALUE) throw new IllegalStateException("int overflow");
  return (int) l;
}

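Another version of an overflow bug is the following. Note the missing parentheses in the first println statement. The cast binds tighter than the subtraction, so b alone is truncated to an int before a is subtracted from it.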

long a = System.currentTimeMillis();
long b = a + 100;
System.out.println((int) b-a);
System.out.println((int) (b-a)); 

And last, a true gem that I uprooted during code review. Note how the programmer tried to be careful, but then failed so badly by assuming an int could ever become larger than its maximum value.

int a = l.size();
a = a + 100;
if (a > Integer.MAX_VALUE)
    throw new ArithmeticException("int overflow");
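
The check above can never fire, because a has already wrapped around by the time it is compared. A minimal sketch of a check that does work is to do the arithmetic in long and narrow afterwards. SafeNarrowing is a hypothetical helper; since Java 8 the JDK methods Math.toIntExact and Math.addExact do the same job.

```java
class SafeNarrowing {
    // Narrow a long to an int, failing loudly instead of silently truncating.
    static int toInt(long value) {
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            throw new ArithmeticException("int overflow: " + value);
        }
        return (int) value;
    }

    // Add two ints; the sum is computed as long so an overflow is still visible.
    static int add(int a, int b) {
        return toInt((long) a + (long) b);
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2)); // 3
        try {
            add(Integer.MAX_VALUE, 100);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```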

Using == with float or double

for (float f = 10f; f!=0; f-=0.1) {
  System.out.println(f);
}
The above code doesn't behave as expected. It causes an endless loop. Because 0.1 has no finite binary representation, every subtraction introduces a small rounding error and f will never be exactly 0. Generally you should never compare float or double values with the equality operator ==. Always use less than or greater than. Java compilers should be changed to issue a warning in that case. Or even make == an illegal operation for floating point types in the Java Language Spec. It makes really no sense to have this feature.
for (float f = 10f; f>0; f-=0.1) {
  System.out.println(f);
}
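
Two related techniques, sketched here under my own names (FloatLoop, countDown, nearlyEqual are hypothetical): drive the loop with an integer counter so the number of iterations is exact, and compare computed values with a tolerance where a comparison is unavoidable. The tolerance approach is a common alternative, not something the rule above requires.

```java
class FloatLoop {
    // Counts down from 10.0 to 0.1 in steps of 0.1.
    // The loop is controlled by an int, so it runs exactly 100 times.
    static float[] countDown() {
        float[] values = new float[100];
        for (int i = 100; i > 0; i--) {
            values[100 - i] = i / 10f; // derived from the exact counter, no accumulated error
        }
        return values;
    }

    // Tolerance-based comparison for two computed values.
    static boolean nearlyEqual(double a, double b, double epsilon) {
        return Math.abs(a - b) <= epsilon;
    }

    public static void main(String[] args) {
        System.out.println(countDown().length);                 // 100
        System.out.println(0.1 + 0.2 == 0.3);                   // false
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3, 1e-9));  // true
    }
}
```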

Storing money in floating point variables

float total = 0.0f;
for (OrderLine line : lines) {
  total += line.price * line.count;
}
double a = 1.14 * 75; // 85.5 represented as 85.4999...
System.out.println(Math.round(a)); // surprising output: 85
System.out.println(10.0/3); // surprising output: 3.3333333333333335 (precision lost twice during division and on conversion to decimal)
BigDecimal d = new BigDecimal(1.14); // precision has already been lost

I have seen many developers coding such a loop. Including myself in my early days. When this code sums 100 order lines with every line having one 0.30$ item, the resulting total is calculated to exactly 29.999971. The developer notices the strange behaviour and changes the float to the more precise double, only to get the result 30.000001192092896. The somewhat surprising result is of course due to the difference in representation of numbers by humans (in decimal format) and computers (in binary format). It always occurs in its most annoying form when you add fractional amounts of money or calculate the VAT.

Binary representation of floating point numbers was invented for inherently inexact values like measurements. Perfect for engineering! But unusable when you want exact math. Like banks. Or when counting.

There are business cases where you can not afford to lose precision. You lose precision when converting between decimal and binary and when rounding happens in a manner that is not well defined or at indeterminate points. To avoid losing precision you must use fixed point or integer arithmetic. This does not only apply to monetary values, but it is a frequent source of annoyance in business applications and therefore makes a good example. In the second example an unsuspecting user of the program would simply say the computer's calculator is broken. That's of course very embarrassing for the programmer.

Consequently an amount of money should never ever be stored in a floating point data type (float, double). Please note that it is not just any calculation that is inexact. Even a simple multiplication with an integer can already yield an inexact result. It is the mere fact of storing a value in a binary representation (float, double) that may already cause rounding! You simply can not store 0.3 as an exact value in float or double. Because float and double are binary IEEE754 types. If you see a float or double in your financial code base, the code will most likely yield inexact results. Instead either a string or fixed point representation should be chosen. A text representation must be in a well-defined format and is not to be confused with user input/output in a locale specific format. Both representations must define the precision (number of digits before and after the decimal point) that is stored.

For calculations the class BigDecimal provides an excellent facility. The class can be used such that it throws runtime exceptions if precision is unexpectedly lost in an operation. This is very helpful to uproot subtle numerical bugs and enables the developer to correct the calculation.

BigDecimal total = BigDecimal.ZERO;
for (OrderLine line : lines) {
  BigDecimal price = new BigDecimal(line.price);
  BigDecimal count = new BigDecimal(line.count);
  total = total.add(price.multiply(count)); // BigDecimal is immutable!
}
total = total.setScale(2, RoundingMode.HALF_UP);
BigDecimal a = (new BigDecimal("1.14")).multiply(new BigDecimal(75)); // 85.5 exact
a = a.setScale(0, RoundingMode.HALF_UP); // 86
System.out.println(a); // correct output: 86
BigDecimal a = new BigDecimal("1.14");
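
The remark that BigDecimal can throw when precision would be lost refers to RoundingMode.UNNECESSARY. A minimal sketch (ExactMoney and its methods are hypothetical names):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class ExactMoney {
    // Scales to 2 decimals; throws ArithmeticException if that would require rounding.
    static BigDecimal toCentsExact(BigDecimal amount) {
        return amount.setScale(2, RoundingMode.UNNECESSARY);
    }

    // 100 order lines of 0.30 each, summed exactly.
    static BigDecimal sumHundredLines() {
        BigDecimal price = new BigDecimal("0.30"); // String constructor: no binary rounding
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < 100; i++) {
            total = total.add(price); // BigDecimal is immutable
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumHundredLines());                      // 30.00
        System.out.println(toCentsExact(new BigDecimal("85.5")));   // 85.50
        try {
            toCentsExact(new BigDecimal("1.005"));                  // would lose the third decimal
        } catch (ArithmeticException e) {
            System.out.println("precision would be lost");
        }
    }
}
```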

Not freeing resources in a finally block

public void save(File f) throws IOException {
  OutputStream out = new BufferedOutputStream(new FileOutputStream(f));
  out.write(...);
  out.close();
}

public void load(File f) throws IOException {
  InputStream in = new BufferedInputStream(new FileInputStream(f));
  in.read(...);
  in.close();
}

The above code opens an output stream to a file, allocating a file handle in the operating system. File handles are a rare resource and need to be properly freed, by calling close on the FileOutputStream (same for FileInputStream of course). To ensure that even in the case of an exception (the filesystem may become full during the write), closing must happen in a finally block. Here the stream is also wrapped into a buffering stream. That means not all data will have been written to disk by the time we arrive at the close() call. The close call itself will flush the pending data in the buffer to disk and may thus itself fail with an IOException. If that close fails the file on disk is incomplete (truncated) and thus probably corrupt. The method should therefore propagate the IOException in that case. In the case of a FileInputStream we can safely ignore the potential IOException from a close() call. We have read all data that we need, and there is nothing useful that we can do if the underlying close() failed anyway. It's not even worth logging it.

In a perfect world BufferedOutputStream.close() would be implemented correctly. But sadly it has a bug that's not going to be fixed: it loses any IOException from the implicit flush and truncates your file silently. So here we give the proper workaround with an explicit flush before close.

To be exact the corrected code below can leak in one small corner case: when the file stream was allocated but then allocating the buffered stream fails mysteriously (with out of memory for instance). As a pragmatic person I think in such a pathological case we can safely rely on the garbage collector to clean up the mess. It's not worth the hassle to deal with it.

// code for your cookbook
public void save() throws IOException {
  File f = ...
  OutputStream out = new BufferedOutputStream(new FileOutputStream(f));
  try {
    out.write(...);
    out.flush(); // don't lose exception by implicit flush on close
  } finally {
    out.close();
  }
}

public void load(File f) throws IOException {
  InputStream in = new BufferedInputStream(new FileInputStream(f));
  try {
    in.read(...);
  } finally {
    try { in.close(); } catch (IOException e) { }
  }
}

Let me give you also the cook book recipe for another ubiquitous pattern: database access. Again this is the pragmatic approach. Yes, rs.close() could fail with mysterious Errors, except they only occur in your university lecture on Quantum Mechanics and not in The Real World (tm). And only perverts would write the try/finally cascade that no Error neutrino can escape. Forgive my sarcasm. Here once and for all this is how to deal with SQL objects:

Car getCar(DataSource ds, String plate) throws SQLException {
  Car car = null;
  Connection c = null;
  PreparedStatement s = null;
  ResultSet rs = null;
  try {
    c = ds.getConnection();
    s = c.prepareStatement("select make, color from cars where plate=?");
    s.setString(1, plate);
    rs = s.executeQuery();
    if (rs.next()) {
       car = new Car();
       car.make = rs.getString(1);
       car.color = rs.getString(2);
    }
  } finally {
    if (rs != null) try { rs.close(); } catch (SQLException e) { }
    if (s != null) try { s.close(); } catch (SQLException e) { }
    if (c != null) try { c.close(); } catch (SQLException e) { }
  }
  return car;
}
With that said, don't miss the next paragraph.

Abusing finalize()

public class FileBackedCache {
   private RandomAccessFile backingStore; // java.io.File has no close()
   
   ...
   
   protected void finalize() throws IOException {
      if (backingStore != null) {
        backingStore.close();
        backingStore = null;
      }
   }
}

This class uses the finalize method to release a file handle. The problem is that you don't know when the method is called. The method is called by the garbage collector. If you are running out of file handles you want this method to be called sooner rather than later. But the GC will probably only invoke the method when you are about to run out of heap, which is a very different situation. It may take anything from milliseconds to days until GC and finalization runs. The garbage collector manages memory only. It does that very well. But it must not be abused to manage any other resources apart from that. The GC is not a generic resource management mechanism! I find Sun's API Doc of the finalize method very misleading in that respect. It actually suggests using this method to close I/O resources - complete bullshit if you ask me. Again: I/O has nothing to do with memory!

Better code provides a public close method, which must be called by a well-defined lifecycle management, like JBoss MBeans or so.

public class FileBackedCache {
   private RandomAccessFile backingStore; // java.io.File has no close()
   
   ...
   
   public void close() throws IOException {
      if (backingStore != null) {
        backingStore.close();
        backingStore = null;
      }
   }
}
JDK 1.7 (Java 7) introduces the AutoCloseable interface. It enables an automatic call to a close method when the variable (not the object) goes out of scope of a try-with-resources block. It is very different from a finalizer. Its time of execution is well-defined at compile time.
try (Writer w = new FileWriter(f)) { // implements Closeable
  w.write("abc");
  // w goes out of scope here: w.close() is called automatically in ANY case
} catch (IOException e) {
  throw new RuntimeException(e.getMessage(), e);
}
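
Your own classes can take part in this as well. The sketch below uses a hypothetical TrackedResource as a stand-in for something like FileBackedCache; it only records whether close() was called, so the behaviour can be observed without touching the file system.

```java
class TryWithResourcesDemo {
    // Uses the resource, fails with an exception, and returns the resource
    // so the caller can inspect whether it was closed.
    static TrackedResource useAndFail() {
        TrackedResource seen = null;
        try (TrackedResource r = new TrackedResource()) {
            seen = r;
            r.use();
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            // r.close() has already run by the time we get here
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(useAndFail().isOpen()); // false
    }
}

// Hypothetical resource; a real one would release a file handle in close().
class TrackedResource implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() { return open; }

    void use() {
        if (!open) throw new IllegalStateException("already closed");
    }

    @Override
    public void close() { // narrowed signature: throws no checked exception
        open = false;
    }
}
```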

Involuntarily resetting Thread.interrupted

try {
	Thread.sleep(1000);
} catch (InterruptedException e) {
	// ok
}

or 

while (true) {
	if (Thread.interrupted()) break;
}

The above code resets the interrupted flag of the Thread. Subsequent readers will not know that the Thread has been interrupted. If you need to pass on the information about the interrupt, rewrite the code like so.

try {
	Thread.sleep(1000);
} catch (InterruptedException e) {
	Thread.currentThread().interrupt();
}

or 

while (true) {
	if (Thread.currentThread().isInterrupted()) break;
}
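
The difference between the two variants can be observed directly. In this sketch (InterruptDemo and its methods are hypothetical names) the current thread interrupts itself, sleeps, and then checks whether the flag is still visible.

```java
class InterruptDemo {
    // Sleeps, but keeps the interrupt visible to callers if one arrives.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag cleared by the exception
        }
    }

    // Returns the flag state seen after sleepQuietly on an already interrupted thread.
    static boolean flagSurvivesSleep() {
        Thread.currentThread().interrupt();  // simulate an interrupt
        sleepQuietly(10);                    // sleep throws at once, the flag is restored
        boolean stillSet = Thread.currentThread().isInterrupted(); // reads without clearing
        Thread.interrupted();                // clear the flag so the demo leaves no side effect
        return stillSet;
    }

    public static void main(String[] args) {
        System.out.println(flagSurvivesSleep()); // true
    }
}
```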

Spawning thread from static initializers

class Cache {
	private static final Timer evictor = new Timer();
}

java.util.Timer spawns a new thread in its constructor. Therefore the above code spawns a new thread in its static initializer. The new Thread will inherit some properties from its parent: context classloader, inheritable ThreadLocals, and some security properties (access rights). It is rarely desirable to have those properties set in an uncontrolled way. This may for instance prevent GC of a class loader.

The static initializer is executed by the thread that first loads the class (in any given ClassLoader), which may be a totally random thread from a thread pool of a webserver for example. If you want to control these thread properties you will have to start threads in a static method, and take control of who is calling that method.

class Cache {
	private static Timer evictor;

	public static void setupEvictor() {
		evictor = new Timer();
	}
}

Canceled timer tasks that keep state

final MyClass callback = this;
TimerTask task = new TimerTask() {
	public void run() {
		callback.timeout();
	}
};
timer.schedule(task, 300000L);

try {
	doSomething();
} finally {
	task.cancel();
}

The above code uses a timer to enforce a timeout on doSomething(). The TimerTask contains an (implicit) instance reference to the outer class. Thus as long as the TimerTask exists the instance of MyClass may not be GC'ed. Unfortunately the Timer may keep cancelled TimerTasks around until their scheduled timeout has expired! That would leave the program 5 minutes with a dangling reference to the MyClass instance during which it can not get collected! It's a temporary memory leak. A better TimerTask would override the cancel() method and null the reference there. It requires slightly more code.

TimerTask task = new Job(this);
timer.schedule(task, 300000L);

try {
	doSomething();
} finally {
	task.cancel();
}


static class Job extends TimerTask {
	private volatile MyClass callback;

	public Job(MyClass callback) {
		this.callback = callback;
	}

	public boolean cancel() {
		callback = null;
		return super.cancel();
	}

	public void run() {
		MyClass cb = callback;
		if (cb == null) return;
		cb.timeout();
	}
}

Holding strong references to ClassLoaders and unflushable caches

In a dynamic system like an application server or OSGi, you should take good care not to prevent ClassLoaders from being garbage collected. As you undeploy and redeploy individual applications in an application server you create new class loaders for them. The old ones are unused and should be collected. Java isn't going to let that happen if there is a single dangling reference from container code into your application code.

As various libraries are used throughout an enterprise application, that directly means that libraries should do their very best not to hold involuntary strong references to objects (and thus their class loaders).

This is not easy. Classes like java.beans.Introspector from the JDK or org.apache.commons.beanutils.PropertyUtils from Apache BeanUtils or org.springframework.beans.CachedIntrospectionResults from Spring implement caches to speed up their inner workings. They keep strong references to classes you pass them for analysis. Fortunately they provide methods to flush their caches. But finding all classes that may have internal caches and flushing them at the right time is a near to impossible job for the developer.

If you happen to use org.apache.commons.el.BeanInfoManager from Apache Commons EL you probably have a leak. This ancient class keeps a cache of strong references that only ever grows until out of memory. And it has no flush method. Even Tomcat had to implement a workaround involving reflection to clean it.

It would be much better if these libraries just used soft or weak references in the first place. A quick reminder:

Soft and weak references basically differ in the point in time when they are nulled.

Only if the library just caches objects from its own packages (with no external references), it may be fine not to use these special references and just use normal references.

Using soft or weak references also helps the runtime behaviour of your application: if memory gets tight, the last thing you want to spend memory on is caches. So the garbage collector will reclaim the memory used by caches if necessary. A bad example here is JBoss' SQL statement cache: it's completely static and can use a lot of memory, even when that is tight. Another bad example is JBoss' authentication cache.

Also every static cache must always provide a simple way to flush its contents. It's the nature of a (clean) cache (as opposed to e.g. a write cache) that its contents are not valuable and can be safely discarded at any time. The limits of the cache are another trap. Caches should never grow large, and never cache objects for too long. A really bad example here is the default settings for the JDK DNS cache (it completely ignores DNS record lifetimes, and stores negative lookups forever in an unbounded list). Your API documentation should state if and when caching happens. This also helps the user to estimate runtime performance.
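
To illustrate the two requirements (softly referenced values and a flush method), here is a minimal sketch of such a cache. SoftCache is a hypothetical class, not taken from any library. Note that it still holds its keys strongly, so it only suits keys that do not pin a class loader (Strings, for instance), and it does not limit its size.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical cache sketch: values are softly referenced, and flush() empties it.
class SoftCache<K, V> {
    private final ConcurrentMap<K, SoftReference<V>> map =
        new ConcurrentHashMap<K, SoftReference<V>>();

    void put(K key, V value) {
        map.put(key, new SoftReference<V>(value));
    }

    // Returns null if the key is unknown or the GC has cleared the value.
    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) return null;
        V value = ref.get();
        if (value == null) map.remove(key, ref); // drop the stale entry
        return value;
    }

    // The contents of a clean cache are not valuable: discard them at any time.
    void flush() {
        map.clear();
    }

    int size() {
        return map.size();
    }
}
```

Callers must always be prepared for get() to return null and recompute the value in that case.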