[83275 views]

[]

[toggle ads]

Date and time in Java

During the course of my career I have met (too) many developers who are not familiar enough with issues revolving around time and date. I am pointing all developers with knowledge deficiency in that area to this page. This absence of knowledge can manifest in different statements:

As with so many things everything said here is only true to some precision. The precision here ends with leap seconds. I am not taking these into account as in normal IT systems they are not relevant and our clocks are not as accurate. If you work in astronomy, or rocket science this may not be precise enough for you.

What is time?

Time is a physical property. For Physicists it is a dimension tightly coupled with space (spacetime). Time as a physical property (or dimension) passes uniformly. As far as we know. And to any precision that is relevant to us. Even if it did not we were probably unable to tell because we lack something to compare it to. We simply can't define how fast time passes - we live inside of it. Just like a piece of driftwood in a river is unable to change its speed relative to the current we can not change our speed within time. Because to measure speed you compare the difference of what you measure to a difference of time. That sort of measurement is absurd for time itself of course. So we can safely assume that physical time passes uniformly. For physical time there is no notion of "date" or "clock" or "second". These are notions defined by humans.

Measuring time

We don't live inside a particle accelerator. We live on earth. We know day and night. We can see time passing with every day, month, year. These are marked by periodic events: sunrise, full moon, the longest day. That's the way we have been measuring time forever: we use natural periodic events. Even the most scientific definition of today's "second" bases on that: The duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the Caesium-133 atom (Wikipedia). Here the periodic events are the waves of (a very precise) electromagnetic radiation. Note that, lacking any other option, our definition of the second is completely arbitrary.

We live on Earth, which spins around its axis. This is an accelerated motion by the terms of General Relativity, which means that we measure (slightly) shorter seconds than someone with a different acceleration e.g. in a space ship. But as stated above our definition of the second is arbitrary anyway. So as long as you don't work for NASA and as long as SETI hasn't found anything out there, don't worry.

Measuring duration

So we employ the definition of the duration of a second to measure the passing of time. We can notably use seconds (and its fractions) to measure durations: how long it takes us to go shopping, how long it takes to boil an egg. A duration is the answer to the question "how long". You measure durations with a stop watch. Specifically a duration does not answer the question "when".

UTC

If we want an answer to the question "when" other than "in 27 seconds from now" we must have a means to reference any point in time. We can do this simply by laying out a time scale much like a ruler. A ruler has a zero mark and a positive (and maybe negative) part on its scale. We can do the same in time. We define some arbitrary point in time as 0. From that we can make a reference to any point in the future or in the past by telling how many seconds it is away from point 0. That is what the Coordinated Universal Time (UTC) does. It gives every second on that scale a name: "2007-12-31 23:59:59.000 UTC" for example. UTC always increases. It never goes backwards. Some minutes however may not have exactly 60 seconds. Some may have 59 (that has not happened so far), some may have 61 (happens a couple of times a year). They are called leap seconds and are used for corrections.

Computer clocks

Computers contain clocks (also referred to as timers). Several of them usually. With different precisions and capabilities. The operating system gives us access to the "system clock", which in practice uses one of these timers. On a "good" operation system this clock always advances, and never appears to go backwards. That means for consecutive readings a and b we can rely on: a <= b, but not necessarily a < b. Because these timers have limited precision, consecutive readings may return the same value, even though the numerical precision would allow to represent a difference. On modern Linux systems System.currentTime() normally achieves millisecond precision, but System.nanoTime() does not achieve nanosecond precision. Different restrictions are valid for Object.wait(). This method uses interrupts to get a timeout notification. The precision of this timing may vary widely. Please note that these clocks and interrupts may exhibit a lot of jitter: the clock appears to jump wildly even when read in precise intervals.

NTP

Computer clocks tend to be very bad clocks compared to your watch. Their speed varies constantly. They need to be adjusted all the time to avoid going either too fast or too slow. NTP does just that. With some millisecond precision. Don't be scared that NTP would set your clock backwards and you could observe a > b. The operating system will only advance the clock or delay it (slew), but never reset it. That is, only to really adjust a wrong clock for more than 2 seconds.

Time synchronise your servers. Servers whose time is not in sync with the global time cause problems. The UNIX NTP daemon can keep time in sync with millisecond precision. This minimal /etc/ntp.conf is enough to keep your server synced (needs UDP port 123 in your firewall open):

server     pool.ntp.org    iburst
driftfile  /var/lib/ntp/ntp.drift
restrict   default nomodify nopeer
restrict   127.0.0.1

Local time

When you look at your watch, most of you do not read UTC. You read local time. Also known as wall clock time. It is a representation of UTC in a certain calendar and time zone. More on this later.

Ambiguous Day

This may seem obvious, but it is not. A day is defined as 86400 seconds (24 hours). This is the time the earth takes for a single rotation. Exactly it is the time until the sun appears in the same place in the sky again. This is different from the time it takes to see distant stars in the same place which is called a sideral day (the earth advances on its orbit around the sun). As you will see in the paragraph about Daylight Saving Time a day in your local time is not always 24 hours. In many countries there is one day a year with 23 and one with 25 hours. Applications should never assume that a local day has 24 hours. The notion of a "day" is not well-defined without a calendar!

This ambiguity easily manifests in Java Timers. The method java.util.Timer.schedule(TimerTask task, Date firstTime, long period) is unsuitable for executing daily tasks. If you pass 86400000 milliseconds (24 hours) as the period, the timer will fire every 24 hours, but after a DST change the timer appears to be firing one hour off schedule. While the timer sticks exactly to its 24 hour interval, people usually expect timers to fire at the same local time every day.

Representing points in time by UNIX time

In IT systems we sometimes need to refer to precise points in time. Examples are:

These points in time have the characteristics of an event. If two events have the same time stamp they occurred in the very same moment. It doesn't matter if one occurs in Shanghai and the other in London: they still occur in the very same moment. Independent of any time zone. We usually use UNIX time to represent points in time.

UNIX time stamps are the number of seconds that have elapsed since midnight 1.1.1970 UTC. Uh... here appears a "date". Never mind, we'll get to that in a minute. Java's java.util.Date class effectively encapsulates a UNIX time stamp. It represents a point in time by a millisecond counter in a 64 bit long variable. While 32 bit representation on UNIX will overflow in 2038 the 64 bit signed long will last for roughly another 18 billion years. Please note that none of the deprecated methods and constructors of the Date class should ever be used. They are from a time when the Sun people were confused about time themselves. javax.management.timer.Timer has some convenient constants for the most used millisecond values.

// these are equivalent
long now = System.currentTimeMillis();
Date now = new Date();

// long and Date are mutually exchangeable
// Date is actually just a long wrapper.
long ts = (new Date()).getTime();
Date d = new Date(1095379201L);

// simple arithmetics. Be careful of int overflows!
Date now = new Date();
Date in8Hours = new Date(now.getTime() + 8L * Timer.ONE_HOUR);
Date in7Days = new Date(now.getTime() + 7L * Timer.ONE_DAY); // assumes 24h day!

// this is overkill
Date now = new Date(System.currentTimeMillis());

Calendars

Referring to a point in time by its UNIX time stamp may be quite exact. But it is far from usable. If I tell you "let's meet at 1,095,379,201" it'll probably take you a while until you noticed that this point in time already passed several years ago. That's why we use calendars. Not only in the western world today we use the Gregorian Calendar. It defines twelve months a year of different length and inserts a leap day every four years (rule is a little more complex) at the end of February. It is a very exact calendar which means that in some hundred years time it will be snowing in July in Switzerland not because of an aberrant calendar but much more likely because of a change in climate.

Calendar and time zone

Calendar define dates. Without a calendar "31 December 2001" has no meaning. A calendar is tightly coupled to the notion of a "day". Our notion of "day" is also strongly coupled to sunrise and sunset. Sunrise and sunset depend on where on earth you are, don't they. Hence, it is apparent that a Calendar is only well-defined together with a geographical position. This is especially visible on New Year's eve when around the globe people launch fireworks to celebrate the beginning of a new year. The toasting and launching of fireworks happens roughly 24 times (some time zones are not much inhabited, there are "odd" time zones) during a whole day!

Each one of these partying people of course claims the hour 24:00 for them - be they in Australia or Western Europe. It is also apparent that at any given moment there is not just one date around the globe but two. While people in Europe are already drunk and celebrating January 1st, people in New York are just starting to cook dinner on December 31st in the same moment. The two date "halves" of the Earth are separated by the time zone where midnight is and the international date boundary in the Pacific.

This leads to the shocking fact that "31 December 2001" is not well defined. Because there are two dates around the globe at any given moment this date can refer to anything within 48 hours. Not even "31 December 2001 24:00:00" is well defined at all without specifying a time zone. It can refer to any point in time within 24 hours.

All of the following expressions are useless without a time zone!

Time zones

You have certainly heard of time zones. But do you really know exactly what they are? Think of them as a unit of measure for the thing called "time". Comparable with the unit of measure for "length". A stick of 1 meter length is 1000 millimetres, or 3.28 feet. The length of the stick doesn't change whether I say it's 1000 millimetres long or 3.28 feet. Same goes for time. The point in time doesn't change whether I express it in one time zone or another.

The mother of all time zones is Greenwich Mean Time (GMT). This is the time zone used by UTC. Offsets of all other time zones are specified in offsets from GMT. Zones to the east have positive offsets, zones to the west have negative ones. Again this nicely compares to length: the "meter" is the mother of all SI length units of measure (the original prototype is a platinum bar kept in France). Other units are expressed as multiples of the meter: 1 millimeter = 1/1000 meter, 1 kilometer = 1000 meters.

Converting between time zones is easy: 18:00 GMT = 18:00+00:00 = 19:00+01:00 = 17:00-01:00

The suffix is the offset from GMT that has been added to the GMT time.

The Java java.util.TimeZone can represent two things depending on how it is used (and this may surprise and confuse most people):

Time zones are not assigned to locations by astronomers but by politicians. That's why that time is called legal time. These assignments change. That's why TimeZone keeps a database of historical time zone assignments of many cities. Also (and this may again surprise many people) many countries in the world change their time zone twice a year: daylight savings time is nothing else but a time zone. More on DST below. NB: Also England uses DST. So their time zone is not always GMT!

Most of the time applications use a location database like TimeZone.getTimeZone("Europe/Zurich"). This frees the application from most DST issues. See below. Of course an application can also use a simple exact time zone if required: TimeZone.getTimeZone("GMT+04:30")

Using the correct time zone for java.util.Calendar and java.text.DateFormat is crucial to make your application time zone safe. If you don't specify the time zone, Java will use the system default time zone. In a server environment that is hardly the one that's right.

Localizing an application doesn't just mean to support different Locales, but also different TimeZones. So classes in your codebase that are Locale dependent, are probably also TimeZone dependent.

It should be noted that TimeZone.getTimeZone() does not throw an exception when passed an unknown time zone name. Instead it returns GMT. If the name of the time zone comes from an untrusted source, such as a GUI, you need to validate it. Otherwise your application code will behave unexpectedly.

Read on for a lot more detailed information.

Representing a point in time in a Calendar

As we have seen above, in IT systems, points in time do not have to be represented in a calendar. It is enough to use a UNIX time stamp. Representing a point in time in different calendars and time zones has no influence on the point in time that is being referred to (the length of the stick doesn't change). The only time a representation in a calendar is desirable, is for user interfaces and calendar arithmetics (finding the begin/end of a day, adding days, weeks, etc.). When a human needs to read a point in time she expects it in a human readable form in her local time zone.

In Java we use the java.text.DateFormat and it's most used implementation java.text.SimpleDateFormat to convert Date instances to and from user readable time representation. Important here is not to forget to specify the time zone. Otherwise the system default time zone will be used which is only meaningful for Desktop applications. Also note that SimpleDateFormat instances are not thread-safe. You should thus avoid static instances.

// not thread-safe
public static SimpleDateFormat dfm = new SimpleDateFormat("yyyy-MM-dd");

// This is how to initialise a well-known Date (often used in unit test fixtures)
DateFormat dfm = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
dfm.setTimeZone(TimeZone.getTimeZone("Europe/Zurich"));
Date a = dfm.parse("2007-02-26 20:15:00");
Date b = dfm.parse("2007-02-27 08:00:00");

// Don't use the deprecated constructors
Date badhabit = new Date(106, 1, 26, 20, 15, 00);

// output is likewise simple
System.out.println("Result: "+ dfm.format(a));

// if you don't initialise the DateFormat with a TimeZone, include it in the format!
DateFormat dfm = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss Z");
Date a = dfm.parse("2007-02-26 20:15:00 +0200");

Be careful with the symbols of SimpleDateFormat. "hh" corresponds to Calendar.HOUR which is the hour on the 12-hour clock. It is ambiguous without AM/PM. Furthermore the Calendar class doesn't throw an exception when you do calendar.set(Calendar.HOUR, 20), but it will do something completely wrong! Use "HH" and Calendar.HOUR_OF_DAY for the 24-hour clock.

Java also offers the java.util.Calendar and java.util.GregorianCalendar pair. It allows to convert to and from Date with the setTime and getTime methods. It also allows modification of individual fields. This is the class to use when one must determine the start and end of a day. Also here, set the right time zone or what you are doing is undefined.

// simple way to get a specific point in time
GregorianCalendar cal = new GregorianCalendar(tz);
cal.set(2009, Calendar.DECEMBER, 31, 20, 15, 00);
cal.set(Calendar.MILLISECONDS, 0);
Date d = cal.getTime();

/**
 * Calculates midnight of the day in which date lies with respect
 * to a time zone.
 **/
public Date midnight(Date date, TimeZone tz) {
  Calendar cal = new GregorianCalendar(tz);
  cal.setTime(date);
  cal.set(Calendar.HOUR_OF_DAY, 0);
  cal.set(Calendar.MINUTE, 0);
  cal.set(Calendar.SECOND, 0);
  cal.set(Calendar.MILLISECOND, 0);
  return cal.getTime();
}

/**
 * Adds a number of days to a date. DST change dates are handled
 * according to the time zone. That's necessary as these days don't
 * have 24 hours.
 */
public Date addDays(Date date, int days, TimeZone tz) {
  Calendar cal = new GregorianCalendar(tz);
  cal.setTime(date);
  cal.add(Calendar.DATE, days);
  return cal.getTime();
}

AM/PM

Some nations are very comfortable using the 12-hour clock and using AM or PM to clarify which half of the day they mean. But for most of the people the 24-hour clock is far more familiar. Be very careful when using the 12-hour clock. Some people (especially in central Europe) are extremely unfamiliar with it. Imagine you hand out plane tickets with a boarding time like "10. January 12:30 AM". You can be sure that half the passengers will show up half past midnight (which is correct), and the others half past noon (having missed their flight).

As a rule of thumb: There is no zero hour on the 12-hour clock, and a day starts with AM. So 12:00 AM is equal to 0:00 on the 24-hour clock, and 12:00 PM is equal to 12:00 on the 24-hour clock.

I strongly suggest you use the 24-hour clock in applications, and never the AM/PM notion. Or at least let the user choose his preference. On printed documents, such as tickets, you should never use the 12-hour clock. In Java this means to avoid Calendar.HOUR, Calendar.AM_PM and the "hh" symbol of SimpleDateFormat. Instead use Calendar.HOUR_OF_DAY and the "HH" symbol.

Conversion between time zones

As a rule you can not convert a java.util.Date to a different time zone. This is semantically impossible as a Date is always in UTC. Only a Calendar or a textual representation of a date can be converted into a different time zone. In practice time zone conversions are only necessary when dates are read from or written to text. Typically on a GUI. Or in data exchange interfaces. In these cases a java.text.DateFormat should be used with the right time zone set.

DateFormat indfm = new SimpleDateFormat("MM/dd/yyyy HH'h'mm");
indfm.setTimeZone(TimeZone.getTimeZone("Australia/Sydney"));
Date purchaseDate = dfm.parse("12/31/2007 20h15");

DateFormat outdfm = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
outdfm.setTimeZone(TimeZone.getTimeZone("GMT"));
csvfile.println(outdfm.format(purchaseDate) +" GMT");

Please avoid coding the following method. It has no effect and demonstrates a lack of understanding of Date and Calendar.

// BULLSHIT CODE! This method does nothing.
public static Date convertTz(Date date, TimeZone tz) {
  Calendar cal = Calendar.getInstance();
  cal.setTimeZone(TimeZone.getTimeZone("UTC"));
  cal.setTime(date);
  cal.setTimeZone(tz);
  return cal.getTime();
}

When changing the time zone of a Calendar object there is a small catch: you need to call Calendar.get(int) immediately after, or your Calendar object may not be updated correctly!

GregorianCalendar c = new GregorianCalendar(TimeZone.getTimeZone("GMT"));
c.set(2012, Calendar.APRIL, 19, 22, 0, 0);  // 20.04 00:00
c.set(Calendar.MILLISECOND, 0);
c.getTime(); // calculates other fields and ms value after calls to set()
c.setTimeZone(TimeZone.getTimeZone("Europe/Zurich")); // invalidates all fields
c.get(Calendar.DATE); // calculates fields from ms value

Periods of time (intervals)

Imagine the following examples that refer to a period of time:

Each of these are not as precise as a UNIX time stamp. They refer to more than a certain second in time. Some of these designate the duration of a single day. Others several days.

Representing a period

Periods of time can have very different semantics. They may require different representations:

When two UNIX time stamps are used for a period, their general contract should be that the beginning time stamp is considered inside the interval, while the end time stamp is considered outside the interval. This way it is possible to represent consecutive intervals without gaps. For representation to a user it may be necessary to specially handle boundary cases: "Monday 00:00 until 24:00" reads easier than "Monday 00:00 until Tuesday 00:00".

Representing days of the week

In very special occasions it is even necessary to think carefully of how to represent a day of the week in a database. Consider the following situation: A calendaring web application stores recurring calendar entries in the DB. Example: "Monday March 12 2007 01h00 GMT+02:00, recurring every Monday". So the next occurrence is "Monday March 19 2007 01h00 GMT+02:00".

The application chooses to store the start date in GMT without explicitly storing the time zone. Because the user may travel around the globe and wants to see date and time of his calendar entries in the local time zone of his current location. So it makes no sense to store the date in that random time zone that he was in at the time when the calendar entry was created. Thus the start date becomes "2007-03-11 23h00 GMT" which in GMT belongs to a Sunday. If the webapp now stores "recurring every Monday" it will calculate the next occurrence as "2007-03-19 23h00" in GMT (19. is the Monday) which in the user's local time zone is "Tuesday March 20 2007 01h00 GMT+02:00" — one day too late! This happens of course due to the fact that the notion "Monday" has a different meaning in each time zone.

It is apparent that the application also needs to convert the day of the week to GMT. It does that by choosing 0h00 of a nearby Monday in the user's time zone, converting that point in time to GMT and evaluating the day of the week again — resulting in Sunday of course.

Exchanging date and time information

Care must be taken when transferring date and time information between IT systems. As outlined above the time zone information may be an important part of a date which must not be lost or changed. In other situations the data is independent of a time zone.

In text formats like CSV, XML, HTTP or SMTP headers dates are usually formatted in a human readable way (not as UNIX times). Generally such date formats should always incorporate a time zone to avoid any ambiguities. XML schema for instance uses the ISO 8601 standard where the time zone is optional, where SMTP uses RFC 2822 with a mandatory time zone. The time zone for a given point in time must be given as an absolute offset (not a geographical location). Otherwise interpretation of a such a time may be ambiguous with respect to DST (see below).

ISO 8601 date and time: "2007-02-26T21:23:14.250+01:00"
RFC 2822 date and time: "Mon, 26 Feb 2007 21:23:14 +0100" 

Rules of thumb:

For persistence (databases) the same rules apply. A store / load cycle of a UNIX time stamp must not modify the data even when in between any of the following has happened: DST has changed, client has changed time zone, database has changed time zone.

Daylight savings time

During daylight savings time one hour is added to normal time. It usually starts on a Sunday night in spring at 02h00 when the clock jumps to 03h00, skipping one hour. And in autumn the clock jumps back from 03h00 to 02h00, repeating one hour.

As already noted above, DST is nothing else than a time zone. It is important to understand that countries that use DST thus change time zone twice a year. Also Great Britain. Thus just because a guy can see the Greenwich observatory from his window doesn't mean his time zone is GMT (but could be GMT+1 in summer).

For instance in Switzerland normal time is GMT+01:00 and DST is GMT+02:00. DST start and end are a matter of the most absurd political decisionmaking (Australia postponed its DST start in 2006 because of a sports event...). Also some countries like India once used to have DST and now haven't anymore. All this history is recorded in the Java TimeZone database (and also the UNIX time zone databases). Because of the ever changing nature of this data it is important that you keep this DB uptodate. As of 2007 for example the USA will greatly expand the duration of DST (due to some Energy Conservation Act or something).

Please note that while the DST change no such "clock jumping" is observed in GMT: In spring: 02:00+01:00 = 03:00+02:00 = 01:00 GMT and in autumn: 03:00+02:00 = 02:00+01:00 = 01:00 GMT. It is really just a change of the time zone. Our clocks just measure the time in a different zone. But UTC keeps ticking on.

DST causes a bit of a pain for calendaring applications: two days in the year do not have 24 hours in local time. The day in spring has 23 and the one in autumn 25 hours. I don't know of any calendaring application that supports that in its week view...

Sometimes DST is called "summer time" because it is employed during the summer months. But careful, when it's summer on the northern hemisphere, it's winter on the southern hemisphere. Obviously the difference in time zones can vary greatly because of that. Let's compare Zurich, Switzerland and Auckland, New Zealand. During July Zurich is in the GMT+2 and Auckland in the GMT+12 zone, resulting in a clock difference of 10 hours. During December however, Zurich is in the GMT+1 and Auckland in the GMT+13 zone, creating a difference of 12 hours. So better check before skyping with your Kiwi friends!

DST also causes a tiny ambiguity on the end day when users have to enter a time. Not a GUI that I know asks the user for the time zone that he is entering a date and time in. Most GUIs just take the location of the terminal the user is currently on as a reference. Consider the user entering "29.10.2006 02:30" in Zurich. This is the night when Europe switches from DST to normal time. So the hour between 02 and 03 in the morning shows up twice on local clocks. The time entered by the user is therefore not well-defined. It's not clear whether she means 02:30+02:00 or 02:30+01:00 one hour later. Note that the ambiguity is not present in spring. In that night 01:30 is always GMT+01:00, 02:30 does not exist and 03:30 is always GMT+02:00. If this case is important for your application you should check for it. Unfortunately in Java there is no API call that directly tells you the DST end date. But you can test like so (checks if a day has 23 hours):

public boolean isDSTend(Calendar cal) {
  int year = cal.get(Calendar.YEAR);
  int month = cal.get(Calendar.MONTH);
  int date = cal.get(Calendar.DATE);
  TimeZone tz = cal.getTimeZone();
  return isDSTend(year, month, date, tz);
}

public boolean isDSTend(int year, int month, int date, TimeZone tz) {
  java.util.Calendar cal = new java.util.GregorianCalendar(tz);
  cal.set(java.util.Calendar.MILLISECOND, 0);
  cal.set(year, month, date, 00, 30, 00);
  cal.add(java.util.Calendar.HOUR_OF_DAY, 24);
  return (23 == cal.get(java.util.Calendar.HOUR_OF_DAY));
}

Week numbers

Week numbers are popular among managers: "can we arrange a meeting in week 34?". By week numbers they usually mean the number displayed in MS Outlook. What they don't know: ISO-8601 defines the week numbers, Outlook's implementation is not (necessarily) compatible with that. And week numbers have a small ambiguity. Java's Calendar class can be made ISO compatible with setMinimalDaysInFirstWeek(4).

The thing is: 30 December 2002 is in week 1 of the year 2003. But Java's Calendar class only has one YEAR field. This year field has different meanings depending on how the Calendar object is used:

Sun refuses to fix the bug. Use the following method to determine the correct year for the week field:

public int getYearForWeek(GregorianCalendar cal) {
   int year = cal.get(Calendar.YEAR); 
   int week = cal.get(Calendar.WEEK_OF_YEAR);
   int dayOfMonth = cal.get(Calendar.DAY_OF_MONTH);

   if (week == 1 && dayOfMonth > 20)
      return year + 1;

   if (week >= 52 && dayOfMonth < 10)
      return year - 1;

   return year;
}