raphael75

How are the information_schema tables in Trino updated?

We use Trino (https://trino.io/) to connect to HDFS. I discovered that the data in the information_schema tables, for example:

    select *
    from information_schema.columns clz
    where clz.table_catalog = 'hive'
      and clz.table_schema = '<schema_name>'
      and clz.table_name = '<table_name>'

doesn’t always match up with what I get if I run

    show tables from [schema]
    show columns in [schema].[table]

etc. It seems that the show tables/show columns commands pretty much always match up with what I see if I run the hadoop command (hadoop fs -ls ...) to show the contents of the HDFS folder.
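
To make the mismatch concrete, the two results described above could be diffed for a single table. This is only a minimal Python sketch, assuming a DB-API style cursor (for example one obtained from the `trino` Python client via `trino.dbapi.connect(...)`). The names `diff_columns` and `_check` are hypothetical helpers, not part of Trino, and the sketch queries the catalog's own `information_schema` explicitly.

```python
import re

# Plain identifiers only; this is an assumption of the sketch, not a Trino rule.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name):
    # Names are interpolated into SQL text below, so reject anything unusual.
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def diff_columns(cursor, schema, table, catalog="hive"):
    """Return (only_in_information_schema, only_in_show_columns) as sets."""
    catalog, schema, table = _check(catalog), _check(schema), _check(table)

    # Column names as reported by information_schema.columns.
    cursor.execute(
        f"SELECT column_name FROM {catalog}.information_schema.columns "
        f"WHERE table_schema = '{schema}' AND table_name = '{table}'"
    )
    from_info_schema = {row[0] for row in cursor.fetchall()}

    # Column names as reported by SHOW COLUMNS (first field of each row).
    cursor.execute(f"SHOW COLUMNS FROM {catalog}.{schema}.{table}")
    from_show = {row[0] for row in cursor.fetchall()}

    return from_info_schema - from_show, from_show - from_info_schema
```

Two empty sets would mean both views agree for that table; anything else lists exactly which column names differ, which is easier to report than "doesn't always match".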

So I’m trying to figure out:

  • why the information_schema doesn’t give the same results as show tables/show columns/etc.
  • if there is a way to refresh/update information_schema to make it current

Thank you.