DateList "aggressively" interprets strings, such as "p72h":
DateList["p72h"] gives {1972, 1, 1, 0, 0, 0.}
In TimeML's TIMEX3 specification, strings like "p72h" represent DURATION values ("72 hours"), not DATE values.
In a perfect world this would not matter, since Cases or similar pattern matching could be used to select TIMEX tags based on the type element. Unfortunately, the coding is not perfect; some fraction of DURATION tags are misclassified as DATE.
I don't see an Option to force DateList to be more literal (is there one?). All TIMEX3 DATE values are normalized to one of the following three forms, where Y, M, and D are individual digit characters corresponding to year, month, and day:
"YYYY-MM-DD" or "YYYY-MM" or "YYYY"
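For concreteness, here is a rough and deliberately verbose sketch of the kind of test I have in mind. The helper name timexDateQ is hypothetical, and I am assuming that Repeated[DigitCharacter, {n}] behaves inside a string pattern the way it does in ordinary patterns; I doubt this is the most succinct form.

```mathematica
(* timexDateQ is a hypothetical helper name, not a built-in function. *)
(* y4 matches exactly four digits, d2 exactly two digits. *)
With[{y4 = Repeated[DigitCharacter, {4}], d2 = Repeated[DigitCharacter, {2}]},
  timexDateQ[s_String] := StringMatchQ[s,
    y4 |                                   (* "YYYY" *)
    (y4 ~~ "-" ~~ d2) |                    (* "YYYY-MM" *)
    (y4 ~~ "-" ~~ d2 ~~ "-" ~~ d2)]        (* "YYYY-MM-DD" *)
]

(* Apply the test to the three DATE forms and to a DURATION string. *)
timexDateQ /@ {"2012-03-15", "2012-03", "2012", "p72h"}
```

The intent is that only strings passing this test would be handed to DateList. Note that it checks the shape of the string only, so an impossible date such as "2012-99-99" would still pass.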
What's the most succinct StringExpression to match only these cases and reject DURATION strings? I'm guessing it may involve Alternatives and DigitCharacter but I'm not overly familiar with the subtleties of string expressions. Any help appreciated.