Re: GDate




Thanks for the nice suggestions!

On Tue, 24 Nov 1998, Josh MacDonald wrote:
> I looked at the gdate module in CVS, it looks nice.  Anyway, I think
> several things are missing: direct manipulations of ISO-8601 dates,
> and unification of these functions with more accurate time divisions
> (hours, minutes, seconds).
> 

What is wrong with time_t and struct tm for h-m-s calculations? E.g., to
add an hour, just get the number of seconds in an hour and add it to a time_t.
To get a date for that time_t call g_date_set_time().  I don't have a
clear idea what GDate could add that would be useful. (I'm open to ideas,
but the only HMS routines in Steffen Beyer's library looked pretty
pointless, and I couldn't think of anything else.) What do you think of
simply providing some time_t manipulators?
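
As a rough sketch of the time_t approach, in standard C only: the helper
name add_hours() is made up for illustration and is not part of GDate, and
the code assumes a POSIX-style time_t that counts seconds.

```c
#include <time.h>

/* Hypothetical helper, not part of GDate: shift a time_t by whole hours.
 * Assumes POSIX, where time_t counts seconds; ISO C alone does not
 * guarantee that. */
static time_t add_hours(time_t t, long hours)
{
    return t + (time_t)(hours * 3600L);
}
```

The result could then be handed to g_date_set_time() to get the date for
that moment.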

GDate is meant to handle facts and calculations about dates. e.g., is this
a leap year; what day of the week is 2/13/2004; how many days in the year
1203; how many days between 1203 and today; what is the date three months
from now; what is the date tomorrow; how many days in February;  etc.
These things are all a big headache with struct tm. 
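
To make the kind of question concrete, here is a minimal sketch of a few of
these calculations in plain C. The function names are hypothetical (they are
not the GDate API), and the code assumes the proleptic Gregorian calendar,
which matters for years like 1203.

```c
#include <stdbool.h>

/* Hypothetical helpers, not the GDate API. Proleptic Gregorian rules. */
static bool is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

static int days_in_year(int year)
{
    return is_leap_year(year) ? 366 : 365;
}

/* month is 1..12 */
static int days_in_month(int year, int month)
{
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    if (month == 2 && is_leap_year(year))
        return 29;
    return days[month - 1];
}

/* Sakamoto's method; returns 0 = Sunday ... 6 = Saturday. */
static int day_of_week(int year, int month, int day)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (month < 3)
        year -= 1;
    return (year + year / 4 - year / 100 + year / 400 + t[month - 1] + day) % 7;
}
```

With these, day_of_week(2004, 2, 13) gives 5 (a Friday) and
days_in_year(1203) gives 365.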

> Formatting other than ISO-8601 should be strongly discouraged.  I don't 
> think a strftime function should be included, rather a function to convert a 
> time-string to a GDate and a function to convert a GDate to an ISO-8601 
> formatted string.  Perhaps RFC-822 is desirable, but anyone who uses 
> anything other than ISO-8601 today should not be, anyway.

GDate's immediate application (i.e. why I'm writing it) is to label plot
axes. I don't think '1998-11-25' is a very good plot tick label. I've also
been thinking of applications such as gnomecal, where 1998-05-12 would be
an equally poor choice. In these cases you want to be able to use the
written-out month name, and also month abbreviations, just the month, just
the day, just the year, etc. (For H-M-S, ISO is not much better;
gnomecal-like programs should not default to 24-hour time). strftime is
the only remotely i18n-friendly choice for these UI applications. 
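
For illustration, a small sketch of producing tick labels with strftime. The
helper format_date_label() is a made-up name, not a GDate function, and it
only fills in year, month, and day, so patterns that need the weekday (%a,
%A) would not be reliable with it.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format a calendar date with a strftime pattern.
 * Output follows the current LC_TIME locale. Only year, month, and day
 * are set, so weekday conversions are not meaningful here. */
static size_t format_date_label(char *buf, size_t n, const char *fmt,
                                int year, int month, int day)
{
    struct tm tm;
    memset(&tm, 0, sizeof tm);
    tm.tm_year = year - 1900;  /* struct tm counts years from 1900 */
    tm.tm_mon = month - 1;     /* and months from 0 */
    tm.tm_mday = day;
    tm.tm_isdst = -1;
    return strftime(buf, n, fmt, &tm);
}
```

In the default "C" locale, the pattern "%B %Y" applied to 1998-11-25 yields
"November 1998" and "%b %d" yields "Nov 25"; other locales substitute their
own month names.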

I agree that for file formats, internet applications, that kind of thing,
ISO-8601 looks nice; it is a precise format when we want to communicate
across locales and time zones. So I agree it should be supported, but
removing strftime sounds bad to me. ISO-8601 is international, but in the
same sense Morse code is international. Not suitable for everyday use.

How about:
g_date_set_iso(GDate* d, const char* iso_string);
g_date_iso    (GDate* d, char* buf, size_t n);

?
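
A rough sketch of what that pair could do, for the extended YYYY-MM-DD form
only. SimpleDate is a stand-in struct invented for this example (the real
GDate layout is different), and the function names are likewise
hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for GDate, invented for this sketch. */
typedef struct { int year, month, day; } SimpleDate;

/* Strictly parse "YYYY-MM-DD"; returns false and leaves *d alone on error. */
static bool simple_date_set_iso(SimpleDate *d, const char *iso)
{
    static const int mdays[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    if (strlen(iso) != 10)
        return false;
    for (int i = 0; i < 10; i++) {
        if (i == 4 || i == 7) {
            if (iso[i] != '-')
                return false;
        } else if (!isdigit((unsigned char)iso[i])) {
            return false;
        }
    }
    int y = (iso[0] - '0') * 1000 + (iso[1] - '0') * 100
          + (iso[2] - '0') * 10 + (iso[3] - '0');
    int m = (iso[5] - '0') * 10 + (iso[6] - '0');
    int day = (iso[8] - '0') * 10 + (iso[9] - '0');
    if (y < 1 || m < 1 || m > 12 || day < 1)
        return false;
    bool leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    int limit = (m == 2 && leap) ? 29 : mdays[m - 1];
    if (day > limit)
        return false;
    d->year = y;
    d->month = m;
    d->day = day;
    return true;
}

/* Write "YYYY-MM-DD" into buf; returns false if it does not fit in n bytes. */
static bool simple_date_iso(const SimpleDate *d, char *buf, size_t n)
{
    int len = snprintf(buf, n, "%04d-%02d-%02d", d->year, d->month, d->day);
    return len > 0 && (size_t)len < n;
}
```

Unlike the heuristic parser, this rejects anything that is not exactly the
ISO form, including impossible dates such as 1998-02-30.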

>  I highly 
> recommend Paul Eggerts date/time parsing functions which can be located 
> in diffutils.  He's very knowledgeable about these things--
> <eggert@twinsun.com>.
> 

I'll look into this, thanks for the tip. Tim Janik also sent me some code
that I haven't had a chance to look at; I'm sure all these date-parsing
functions are better than my first attempt. :-)

> Also, without digging into the implementation, what is:
> 
> 	g_date_set_parse
> 
> Or more precisely, how does it parse?  This is tricky stuff, again
> I recommend Paul Eggert's code.
>

I just wrote the first draft of it today. Right now it is a non-validating
heuristic parser. It is intended to take something a user might type in
and guess what date they mean. The code is a little messy and needs
cleaning up, but basically it does this:

 - tries to deduce the preferred order of M, D, Y from the current 
   locale using some sample strftime output
 - uses strftime to ascertain the current locale's long and short
   month names
 - caches the above until the locale changes
 - extracts up to three numbers and one long or short month name from 
   the input, ignoring anything that is not a number or month name, and
   ignoring all numbers after the first three and all months after the 
   first. (not done yet, but eventually it will look for a fourth number
   to catch typos like 12 31 19 98) Basically this means we permit
   any separator,  / , . - : or whatever. Month names are compared
   case-insensitively.
 - If there are three numbers, it parses them as M-D-Y in the locale-
   specific order
 - If there are two numbers and a month name, it parses D and Y in the 
   locale order, then parses the month name
 - If there are two numbers and no month name, it assumes it's an M and Y 
   in locale order
 - If there is one number and a month name, it assumes the number is a year
 - If there's just a single number it tries YYMMDD then YYYYMMDD
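
The number-extraction step above can be sketched roughly like this. It is
not the actual g_date_set_parse code; extract_numbers() is a hypothetical
name, month-name matching is left out, and there is no overflow guard for
very long digit runs.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical tokenizer: collect up to max_out runs of digits from s,
 * ignoring every other character, so any separator is accepted.
 * Returns how many numbers were stored. No overflow guard. */
static size_t extract_numbers(const char *s, long *out, size_t max_out)
{
    size_t count = 0;
    while (*s != '\0' && count < max_out) {
        if (isdigit((unsigned char)*s)) {
            long value = 0;
            while (isdigit((unsigned char)*s)) {
                value = value * 10 + (*s - '0');
                s++;
            }
            out[count++] = value;
        } else {
            s++;  /* separator or junk: skip it */
        }
    }
    return count;
}
```

Called with max_out = 3 on "12/31/1998" it stores 12, 31, 1998; on
"12 31 19 98" it stores 12, 31, 19 and never looks at the fourth number.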

Given the appropriate initial determination about M-D-Y ordering, 
it will parse ISO strings:
YYYY-MM-DD
YYYYMMDD
YYMMDD
YY-MM-DD
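
For the two compact forms, the single-number case comes down to splitting
by digit count. A minimal sketch, with a hypothetical function name; it
leaves a two-digit year as-is, since how to expand it is a policy decision
not settled here.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "YYMMDD" or "YYYYMMDD" into fields.
 * A two-digit year is returned unexpanded (98, not 1998). */
static bool split_compact_date(const char *digits,
                               int *year, int *month, int *day)
{
    size_t len = strlen(digits);
    if (len != 6 && len != 8)
        return false;
    for (size_t i = 0; i < len; i++) {
        if (!isdigit((unsigned char)digits[i]))
            return false;
    }
    size_t ylen = len - 4;  /* 2 or 4 year digits, then MM and DD */
    int y = 0;
    for (size_t i = 0; i < ylen; i++)
        y = y * 10 + (digits[i] - '0');
    *year = y;
    *month = (digits[ylen] - '0') * 10 + (digits[ylen + 1] - '0');
    *day = (digits[ylen + 2] - '0') * 10 + (digits[ylen + 3] - '0');
    return true;
}
```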

It will do the wrong thing with week-of-year format, like 1998-W03, since
it just ignores the non-number stuff. It will also parse ridiculous
strings like "sdflk;jasd;gj12asjgdkjg18;asdjf;klsjg98", but if you feed
that to the parser you probably deserve what you get.

GDate's week of year functions are non-ISO: they count plain Monday-based
and Sunday-based weeks, whereas ISO weeks start on Monday and week 1 is
the week containing the year's first Thursday. But this is simple to
rectify; g_date_iso_week_of_year() could be added. 
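
As a sketch of what such a function might compute, assuming the ISO 8601
rule that weeks start on Monday and week 1 is the week containing the
year's first Thursday. The names are hypothetical and this takes plain
year/month/day integers rather than a GDate.

```c
#include <stdbool.h>

static bool is_leap(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* ISO weekday: 1 = Monday ... 7 = Sunday (Sakamoto's method underneath). */
static int iso_weekday(int y, int m, int d)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (m < 3)
        y -= 1;
    int dow = (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7; /* 0 = Sun */
    return (dow + 6) % 7 + 1;
}

/* A year has 53 ISO weeks if Jan 1 is a Thursday, or a Wednesday in a
 * leap year; otherwise 52. */
static int iso_weeks_in_year(int y)
{
    int jan1 = iso_weekday(y, 1, 1);
    return (jan1 == 4 || (jan1 == 3 && is_leap(y))) ? 53 : 52;
}

/* Returns the ISO week number (1..53). Dates in early January may belong
 * to the last week of the previous year, and dates in late December to
 * week 1 of the next. */
static int iso_week_of_year(int y, int m, int d)
{
    static const int before[] = {0, 31, 59, 90, 120, 151,
                                 181, 212, 243, 273, 304, 334};
    int ordinal = before[m - 1] + d + ((m > 2 && is_leap(y)) ? 1 : 0);
    int week = (ordinal - iso_weekday(y, m, d) + 10) / 7;
    if (week < 1)
        return iso_weeks_in_year(y - 1);
    if (week > iso_weeks_in_year(y))
        return 1;
    return week;
}
```

For example, 1998-11-25 falls in week 48, while 2005-01-01 belongs to week
53 of 2004.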

I was planning to move all my parse settings (MDY order, whether to allow
2-digit years, etc.) into a single struct and allow the user to force
particular settings - that way one could override the magical heuristics
and get a validating effect.  (Though the magic seems to be pretty nice
for 90% of cases.)
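
Something like the following is the shape I have in mind, as a sketch only;
none of these type or field names exist in the current code, and the real
struct would probably carry more settings.

```c
#include <stdbool.h>

/* Hypothetical settings; names are invented for this sketch. */
typedef enum { ORDER_MDY, ORDER_DMY, ORDER_YMD } DateOrder;

typedef struct {
    DateOrder order;             /* how to read three bare numbers */
    bool allow_two_digit_years;  /* accept a year below 100? */
} ParseSettings;

/* Assign three extracted numbers to fields according to the settings.
 * Returns false if the settings forbid the result. Range checks on
 * month and day are left out to keep the sketch short. */
static bool assign_fields(const ParseSettings *s, const long nums[3],
                          long *year, long *month, long *day)
{
    switch (s->order) {
    case ORDER_MDY: *month = nums[0]; *day = nums[1]; *year = nums[2]; break;
    case ORDER_DMY: *day = nums[0]; *month = nums[1]; *year = nums[2]; break;
    case ORDER_YMD: *year = nums[0]; *month = nums[1]; *day = nums[2]; break;
    default: return false;
    }
    if (*year < 100 && !s->allow_two_digit_years)
        return false;
    return true;
}
```

A caller who fills in the struct explicitly gets deterministic behavior;
the heuristics would only supply defaults.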

For ISO format, where the format is strictly specified, you might not want
this kind of fuzzy parser; thus the _set_iso() function makes sense to me. 

> And support for less-accurate forms of date types seems not very 
> useful.   I like the Java interface here, have you looked at it?  I
> think it might be nicer. 

I hadn't looked at it. java.sun.com claims that Date is deprecated? Their
docs describe yet another parsing algorithm in some detail, so I will look
into that too. 

What aspects of the Java interface do you like in particular?

> Why not include times in the GDate structure?
> A date without a time is useless since it may be Sunday in California
> and Monday morning in Europe.  Time zones impact the date, and therefore
> time impacts the date.  Times must be included for this interface to
> be acceptable.  I think the Java interface should be cloned.
> 

Again, you are assuming an application that needs to send a date/time
across time (e.g. a file format), or across borders (e.g. an internet
protocol), or both. However a UI on screen at a given moment does not need
this information. Also, for many statistical data sets we can just pick
some time zone and use it consistently, and date resolution is sufficient.
(price per day, daily meter readings, etc.)

So I would argue that there is plenty of use for the library without time
information. FWIW our company's application is international and crosses
time zones. And GDate is more or less a replacement for Steffen Beyer's
DateCalc, which seems to be useful.

That said, let me speculate a little on the implications of time
information; thinking out loud, let me know what you think.

First, I guess time_t is broken since it can't portably represent pre-1970
dates. (OK, earlier I asked why time_t wasn't good enough; here's one answer.)
So our representation of time would have to be a 64-bit unsigned integer 
representing seconds since 0001-01-01.
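
A rough sketch of that representation, relating it to Unix time. It assumes
the proleptic Gregorian calendar and ignores leap seconds entirely; the
names are made up for the example.

```c
#include <stdint.h>

#define SECONDS_PER_DAY 86400
/* Days from 0001-01-01 to 1970-01-01 in the proleptic Gregorian calendar:
 * 1969*365 + 1969/4 - 1969/100 + 1969/400. */
#define DAYS_TO_UNIX_EPOCH 719162

/* Hypothetical conversion: Unix seconds (possibly negative) to unsigned
 * seconds since 0001-01-01 00:00:00. The caller must not pass a value
 * earlier than year 1. Leap seconds are ignored. */
static uint64_t seconds_since_year_one(int64_t unix_seconds)
{
    int64_t offset = (int64_t)DAYS_TO_UNIX_EPOCH * SECONDS_PER_DAY;
    return (uint64_t)(unix_seconds + offset);
}
```

With this, the Unix epoch maps to 62135596800, and pre-1970 instants are
simply smaller positive values.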

Second, this would increase the size of a GDate by 4 bytes (from 8 to 12),
assuming we replaced Julian days. Not a huge hit but maybe important
if we have large arrays. (GDate does support static arrays.)

Third, we might as well replace Julian days, otherwise we take a huge
performance hit since the seconds representation must be kept in sync at
all times and is not stored redundantly. We can always use the seconds
representation instead of Julian by converting days to seconds. 

(I should explain that GDate has both Julian and MDY representations 
internally; anytime you set the date, it sets the most convenient one
and marks the other invalid, anytime you access or operate on the date it
updates the nicest one to have for that calculation; so either or both of
the representations can be valid at any given time, and we avoid
recomputing them.)
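
To illustrate the idea (this is not the actual GDate struct or code), here
is a minimal sketch with made-up names. It shows only one direction,
computing the day count lazily from D-M-Y, and assumes day 1 is 0001-01-01.

```c
#include <stdbool.h>
#include <stdint.h>

/* Invented for this sketch; the real GDate layout differs. */
typedef struct {
    uint32_t julian;      /* day count, where 0001-01-01 is day 1 */
    int year, month, day;
    bool julian_valid;
    bool dmy_valid;
} LazyDate;

/* Setting D-M-Y marks the day count stale instead of recomputing it. */
static void lazy_date_set_dmy(LazyDate *d, int year, int month, int day)
{
    d->year = year;
    d->month = month;
    d->day = day;
    d->dmy_valid = true;
    d->julian_valid = false;
}

/* Compute the day count on first use, then reuse the cached value.
 * Assumes the D-M-Y fields are valid. */
static uint32_t lazy_date_julian(LazyDate *d)
{
    static const int before[] = {0, 31, 59, 90, 120, 151,
                                 181, 212, 243, 273, 304, 334};
    if (!d->julian_valid) {
        int y = d->year - 1;  /* whole years already elapsed */
        bool leap = (d->year % 4 == 0 && d->year % 100 != 0)
                 || d->year % 400 == 0;
        int days = y * 365 + y / 4 - y / 100 + y / 400
                 + before[d->month - 1] + d->day;
        if (d->month > 2 && leap)
            days += 1;
        d->julian = (uint32_t)days;
        d->julian_valid = true;
    }
    return d->julian;
}

/* Days from a to b; positive if b is later. */
static int32_t lazy_date_days_between(LazyDate *a, LazyDate *b)
{
    return (int32_t)lazy_date_julian(b) - (int32_t)lazy_date_julian(a);
}
```

Setting a date is cheap, and the conversion cost is paid only when a
calculation such as a day difference actually needs the other form.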


I guess replacing the Julian days with a seconds-since-whenever
representation would work fine and add H-M-S features. However, I don't
have any real idea how seconds work (leap seconds, time zones, etc.). More
importantly I don't have a use for these features or a clear idea how they
would be used. 

I don't think the existing interface would change if H-M-S were added; it
would all continue to be valid, we would simply add mutators and accessors
at a higher resolution, and add time operations. Is there any reason H-M-S
can't be added at a later time in a backward-compatible way?

If you think that's possible, it's my preferred course of action. It looks
to me like the existing interface is indeed a subset of (for example) the
Java Date class, so I would expect it to be compatible with a
finer-resolution one.

Anyway, thanks for the mail, lots of food for thought. I want to work on
the parser more tomorrow, after looking at Paul Eggert's, Tim Janik's, and
the Java Date::parse().

Havoc








