Apache POI 1904 date

The Apache POI project has classes for reading, writing and manipulating Microsoft Office documents in both the old binary format and also the new open XML formats. Unfortunately there is no easy way to check whether spreadsheets use 1904 or 1900 based dates. This utility class allows code to access this option.

The 1900/1904 date issue is a small GOTCHA when accessing Excel spreadsheets programmatically (and even in older spreadsheets when simply opening on a different platform). Mac OS used to use 1904 based dates as it was easier to calculate leap years. Excel spreadsheets store dates as floating point numbers interpreted as days and fraction of a day. In early Mac Excel spreadsheets this counted from 1st Jan 1904, whereas Windows spreadsheets used days since 1st Jan 1900.

There are two packages dealing with Excel spreadsheets XSSF for .xlsx (XML format) workbooks and HSSF (horrible spreadsheet format!) for the older binary format, but neither exposes this information in public methods.

Internally in XSSF workbooks there is a isDate1904() method, but this is private. In HSSF there is an internal workbook format InternalWorkbook, which exposes the option. This accessible through a variable workbook in HSSFWorkbook and _book in HSSFCell, however both are protected.

Happily in all cases when you set or request a cell as Java Date object, the setCellValue and getDateCellValue methods do the appropriate conversions.

So, the safest option is probably to ignore the 1904 issue and always use Java dates, however if you do need to access date cells as numbers, it is important to know which date system is being used.

The following checks a worksheet for the 1904 date option by creating a ghost cell, setting its value to zero and then getting the value as a date and seeing if it is 1900 or 1904.

import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Cell;

import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Date;

class SsCheck1904 {
	public static boolean isDate1904(Workbook wb) {
        Sheet sheet = wb.createSheet();
		int sheetIndex = wb.getSheetIndex(sheet);
        Row row = sheet.createRow(0);
        Cell cell = row.createCell(0);
		cell.setCellValue(0.0);
		boolean is1994 = isDate1904(cell);
		wb.removeSheetAt(sheetIndex);
		return is1994;
	}
	
    /**
     * throws an exception for non-numeric cells
     */
	public static boolean isDate1904(Cell cell) {
        double value = cell.getNumericCellValue();
		Date date = cell.getDateCellValue();
		Calendar cal = new GregorianCalendar();
		cal.setTime(date);
		long year1900 = cal.get(Calendar.YEAR)-1900;
		long yearEst1900 = Math.round(value/(365.25));
		return year1900 > yearEst1900;
	}
}