Make parsing log filenames future-proof#4518
Make parsing log filenames future-proof#4518europaul wants to merge 2 commits intolf-edge:masterfrom
Conversation
02876f6 to 58d6ee9 Compare | the yetus warning can probably be ignored since we do the same in other files that import gomega yetus: pkg/pillar/types/newlogtypes_test.go#L10 |
Actually not in all places, like here gomega it's not dot imported: https://github.com/lf-edge/eve/blob/master/pkg/pillar/types/patchenvelopestypes_test.go#L10 I would avoid as much as we can to introduce new Yetus errors and in this case it's really not hard to fix it... |
| @europaul , LGTM and I liked the flexibility added by using regex. However, I would fix the Yetus error in order to avoid YAYEE (Yet Another Yetus Error on EVE 😃 ) |
| } | ||
| | ||
| // GetTimestampFromGzipName - get timestamp from gzip file name | ||
| func GetTimestampFromGzipName(fName string) (time.Time, error) { |
There was a problem hiding this comment.
As far as I see, we still use this func in another package, edgeveiew:
eve/pkg/edgeview/src/log-search.go
Line 152 in 2edd016
eve/pkg/edgeview/src/system.go
Line 163 in 2edd016
So, we should be careful here...
The new functions are (mostly) backward compatible with the old ones. The only problem that I see is with logs that get created before the devices sets the correct time (probably through NTP), so the initial log files look like |
@OhmSpectator yes, I will change them once this is merged and I can update the pillar dependency. |
58d6ee9 to b703eb2 Compare
@europaul , can we affirm that the timestamp will always precede the file extension, like this If so, the following regex could be used |
So, we are expecting a new PR very after this one is merged... Have you tested the changes in edge view somehow? |
it's actually not just 3 digits - the timestamp can be anything between 0 and now microseconds. |
ok, let's postpone this merge until we can execute it really in sync with the PR for edge-view and newlog. I'll convert it to draft for now |
| appUUID = fStr[0] | ||
| appUUID, err := types.GetUUIDFromFileName(fName) | ||
| if err != nil { | ||
| log.Errorf("buildAppUUIDMap: cannot parse app file name %s: %v", fName, err) |
There was a problem hiding this comment.
Could this generate/log an error when it is run on a device log filename?
There was a problem hiding this comment.
it would, because a device log filename doesn't contain a UUID. however this function is only run on app log files
eve/pkg/pillar/cmd/loguploader/loguploader.go
Lines 653 to 655 in b703eb2
pkg/pillar/types/newlogtypes.go Outdated
| // GetTimestampFromFileName extracts a microsecond timestamp from a filename | ||
| func GetTimestampFromFileName(filename string) (time.Time, error) { | ||
| // Regular expression to match a microsecond timestamp (sequence of 13-16 digits) | ||
| timestampRegex := regexp.MustCompile(`\b\d{13,16}\b`) |
There was a problem hiding this comment.
Isn't it better to declare this as a variable in the file so that it is compiled at init time and not for each Get call?
There was a problem hiding this comment.
that absolutely makes sense!
pkg/pillar/types/newlogtypes.go Outdated
| return time.Time{}, fmt.Errorf("getTimestampFromGzipName: invalid log file name %s", fName) | ||
| // GetTimestampFromFileName extracts a microsecond timestamp from a filename | ||
| func GetTimestampFromFileName(filename string) (time.Time, error) { | ||
| // Regular expression to match a microsecond timestamp (sequence of 13-16 digits) |
There was a problem hiding this comment.
Comment below says millisecond and here you said microsecond. Which one is it?
There was a problem hiding this comment.
thanks for noticing! it's always milliseconds
pkg/pillar/types/newlogtypes.go Outdated
| // GetUUIDFromFileName extracts a UUID from a filename with dot-delimited parts | ||
| func GetUUIDFromFileName(filename string) (string, error) { | ||
| // UUID regex pattern (supports v4 UUIDs like "123e4567-e89b-12d3-a456-426614174000") | ||
| uuidRegex := regexp.MustCompile(`[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}`) |
There was a problem hiding this comment.
Define as a local variable in the file?
| | ||
| // Check each part for a UUID match | ||
| for _, part := range parts { | ||
| if uuidRegex.MatchString(part) { |
There was a problem hiding this comment.
Or call the UUID parser function we use elsewhere?
There was a problem hiding this comment.
I think we mostly use uuid.FromString, which returns a UUID type and I'm fine with string
Instead of parsing filenames according to a certain schema, we should use regexp to look for timestamps or UUIDs. This way the file names can change in the future without affecting backward compatibility with the existing code for parsing filenames. Signed-off-by: Paul Gaiduk <paulg@zededa.com>
Since not all logs in persist are to be sent, we rename LogRemainToSendMBytes to MaxGzipLogMBytesInPersist. This parameter represents the maxinum size of all gzipped logs in persist. Signed-off-by: Paul Gaiduk <paulg@zededa.com>
b703eb2 to 82178e5 Compare
I fixed this problem, by using the first digits-only part that we find between dots as a timestamp |
@OhmSpectator should I prepare another PR with changes to edge-view and newlog that would be merged right after this one before we merge this one? |
I don’t actually remember how we update Pillar as a library. Do we first need to get some kind of hash and add it to the go.mod file of another component? If it’s possible to do within the same PR (I’m not sure), I would prefer to do it that way. I always ask @christoph-zededa or @deitch when dealing with such things. |
Yes, that's how we usually do it, we need to first publish pillar before we can update it as a dependency in other modules. Afaik, it's not possible to do both in one PR. |
| | ||
| // Check each part for a timestamp match | ||
| for _, part := range parts { | ||
| if timestampRegex.MatchString(part) { |
There was a problem hiding this comment.
| if timestampRegex.MatchString(part) { | |
not needed, because you check the error of strconv.ParseInt later anyways.
Main commit is a8725c7. Instead of relying on a certain structure of log filenames, we should just use regexp to look for timestamps or UUIDs - this way the filename structure can change in the future, but it will still be backwards compatible with our filename parsing logic.
This PR contains the changes for pillar. In a later PR those changes will also be applied to newlog and edge-view.