We use Trino (https://trino.io/) to connect to HDFS. I discovered that the data in the information_schema tables, for example

    select * from information_schema.columns clz where clz.table_catalog = 'hive' and clz.table_schema = '<schema_name>' and clz.table_name = '<table_name>'

doesn't always match up with what I get if I run

    show tables from [schema]
    show columns in [schema].[table]

etc. The show tables/show columns commands pretty much always match what I see when I run the hadoop command (hadoop fs -ls ...) to list the contents of the HDFS folder.
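To make the mismatch concrete, here is a minimal sketch of how the two views could be compared for one table. It assumes a Python DB-API cursor connected to Trino (for example one obtained from the `trino` client package via `trino.dbapi.connect(...).cursor()`), and it assumes the first column returned by `show columns` is the column name. The helper names `column_names` and `diff_columns` are hypothetical, and the query qualifies `information_schema` with the catalog name, which is my assumption about how the session is set up.

```python
def column_names(cursor, sql):
    """Run sql on a DB-API cursor and return the first field of every row as a set."""
    cursor.execute(sql)
    return {row[0] for row in cursor.fetchall()}


def diff_columns(cursor, schema, table, catalog="hive"):
    """Compare information_schema.columns with SHOW COLUMNS for one table.

    Identifiers are interpolated directly into the SQL text, so this is
    only meant for trusted, hand-typed schema and table names.
    """
    info_sql = (
        f"SELECT column_name FROM {catalog}.information_schema.columns "
        f"WHERE table_schema = '{schema}' AND table_name = '{table}'"
    )
    show_sql = f"SHOW COLUMNS FROM {catalog}.{schema}.{table}"

    from_info = column_names(cursor, info_sql)
    from_show = column_names(cursor, show_sql)

    # Report the names that appear on one side only.
    return {
        "only_in_information_schema": sorted(from_info - from_show),
        "only_in_show_columns": sorted(from_show - from_info),
    }
```

Running `diff_columns(cursor, "<schema_name>", "<table_name>")` on an affected table should give two lists that show exactly which columns differ between the two sources, which may be easier to reason about than comparing the raw output by eye.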
So I’m trying to figure out:
- why the information_schema doesn’t give the same results as show tables/show columns/etc.
- if there is a way to refresh/update information_schema to make it current
Thank you.