You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is related to a few existing issues: #1725, #1716, #1470.
The reason to writing up this one, is that I've tracked the bug to a particular location in the code, and so this issue is with that particular piece of logic. Issue manifests itself either in browser environments or in server environment running in a timezone other than UTC. It is NOT a problem if timezone is UTC (for example, serverless environments, like AWS Lambda).
It is known that Excel does not store timezone information, all dates are timezone-less. It is stored as an offset (serial number) from 1st Jan 1900 (or 1904 in some formats).
var basedate = new Date(1899, 11, 30, 0, 0, 0); // 2209161600000
var dnthresh = basedate.getTime() + (new Date().getTimezoneOffset() - basedate.getTimezoneOffset()) * 60000;
function datenum(v, date1904) {
var epoch = v.getTime();
if(date1904) epoch -= 1462*24*60*60*1000;
return (epoch - dnthresh) / (24 * 60 * 60 * 1000);
}
function numdate(v) {
var out = new Date();
out.setTime(v * 24 * 60 * 60 * 1000 + dnthresh);
return out;
}
The problem is that JS Date is a local date in the timezone of the browser (or system timezone for Node.js). Which means that a straight timezone offset calculated in dnthresh is dependent on whether daylight saving is on or off at the time that the library is loaded (i.e. new Date().getTimezoneOffset()). This leads to unexpected dates being returned after parsing the XLSX - sometimes they are off by an hour, which could then be one day before the expected date.
Since time in a timezone with daylight savings is not serial (i.e. it jumps around daylight savings), the serial formula needs to be corrected.
Logically the fix is something like this (not taking performance into account):
function numdate(v) {
var out = new Date();
out.setTime(v * 24 * 60 * 60 * 1000 + dnthresh);
// DST correction (!)
out.setTime(out.getTime() + (out.getTimezoneOffset() - new Date().getTimezoneOffset()) * 60000);
return out;
}
Obviously there would be some performance optimisation in the actual fix. For example, new Date() and new Date().getTimezoneOffset() should be calculated only once at the init time.
The main idea here is that applying the date serial, it is not clear whether it will fall into daylight saving according to the local timezone. That is why we have to check out.getTimezoneOffset() after setting it, and adjust accordingly.
Noting here - other solutions are also possible. We don't have to play with Unix epoch UTC offsets (i.e. getTime/setTime), and instead apply some math to the date serial to extract the actual year, month, day (logic widely available in popular date libraries). I am not sure of the performance implications here, perhaps setTime was chosen because of its speed? In any case, several approaches can be tried, and benchmarked.
The text was updated successfully, but these errors were encountered:
This is related to a few existing issues: #1725, #1716, #1470.
The reason to writing up this one, is that I've tracked the bug to a particular location in the code, and so this issue is with that particular piece of logic. Issue manifests itself either in browser environments or in server environment running in a timezone other than UTC. It is NOT a problem if timezone is UTC (for example, serverless environments, like AWS Lambda).
It is known that Excel does not store timezone information, all dates are timezone-less. It is stored as an offset (serial number) from 1st Jan 1900 (or 1904 in some formats).
Following code is performing conversion at the moment from the serial number to JS Date object: https://github.com/SheetJS/sheetjs/blob/master/xlsx.js#L2683
The problem is that JS Date is a local date in the timezone of the browser (or system timezone for Node.js). Which means that a straight timezone offset calculated in
dnthresh
is dependent on whether daylight saving is on or off at the time that the library is loaded (i.e.new Date().getTimezoneOffset()
). This leads to unexpected dates being returned after parsing the XLSX - sometimes they are off by an hour, which could then be one day before the expected date.Since time in a timezone with daylight savings is not serial (i.e. it jumps around daylight savings), the serial formula needs to be corrected.
Logically the fix is something like this (not taking performance into account):
Obviously there would be some performance optimisation in the actual fix. For example,
new Date()
andnew Date().getTimezoneOffset()
should be calculated only once at the init time.The main idea here is that applying the date serial, it is not clear whether it will fall into daylight saving according to the local timezone. That is why we have to check
out.getTimezoneOffset()
after setting it, and adjust accordingly.Noting here - other solutions are also possible. We don't have to play with Unix epoch UTC offsets (i.e.
getTime/setTime
), and instead apply some math to the date serial to extract the actual year, month, day (logic widely available in popular date libraries). I am not sure of the performance implications here, perhaps setTime was chosen because of its speed? In any case, several approaches can be tried, and benchmarked.The text was updated successfully, but these errors were encountered: