When working with ORC data, it is difficult to keep the schema in sync with a model that is still evolving, and creating test data becomes cumbersome. Every model change triggers a matching change to the schema and the test data, so the development team ends up spending a lot of time on repeated iterations.
OrcUtil tries to solve these problems:
- Annotations on the model class specify the metadata for the ORC schema, and OrcUtil generates the schema on the fly. There is no need to update the schema manually every time the model changes. For more information, refer to the test cases in OrcSchemaGeneratorSpec.
- Unit test data can be generated easily with OrcCreator#createOrcStruct. For more information, refer to the test cases in OrcCreatorSpec.
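
To illustrate the idea of annotation-driven schema generation, here is a minimal sketch in plain Java. It is not OrcUtil's actual API: the `OrcColumn` annotation, the `User` model and the `schemaOf` helper are all hypothetical names invented for this example, and the real annotations are best looked up in OrcSchemaGeneratorSpec. The sketch only shows how reflection can turn annotated fields into an ORC type description string.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.StringJoiner;

class OrcSchemaSketch {

    // Hypothetical annotation (an assumption for this sketch);
    // OrcUtil's real annotation names and attributes may differ.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface OrcColumn {
        String type();  // ORC type name, e.g. "int" or "string"
        int order();    // explicit column position, since reflection order is not guaranteed
    }

    // Hypothetical model class; only annotated fields become ORC columns.
    static class User {
        @OrcColumn(type = "int", order = 0) int id;
        @OrcColumn(type = "string", order = 1) String name;
        @OrcColumn(type = "double", order = 2) double score;
        String notPersisted; // no annotation, so it is skipped
    }

    // Builds an ORC type description string such as "struct<id:int,name:string>".
    static String schemaOf(Class<?> model) {
        List<Field> columns = new ArrayList<>();
        for (Field field : model.getDeclaredFields()) {
            if (field.isAnnotationPresent(OrcColumn.class)) {
                columns.add(field);
            }
        }
        // Sort by the declared position so the schema is stable across JVMs.
        columns.sort(Comparator.comparingInt(f -> f.getAnnotation(OrcColumn.class).order()));

        StringJoiner schema = new StringJoiner(",", "struct<", ">");
        for (Field field : columns) {
            schema.add(field.getName() + ":" + field.getAnnotation(OrcColumn.class).type());
        }
        return schema.toString();
    }

    public static void main(String[] args) {
        System.out.println(schemaOf(User.class));
    }
}
```

Because the schema is derived from the model class at run time, adding or renaming an annotated field changes the generated schema without any separate schema file to edit.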
The following ORC types are currently supported:
- PRIMITIVE (INT, DOUBLE, LONG and STRING)
- STRUCT
- LIST
- MAP