
How to connect to our existing remote HDFS and obtain metadata #5374

Open
yu286063991 opened this issue Oct 30, 2024 · 6 comments

@yu286063991

We attempted to connect to HDFS by entering the IP and port in the location parameter, but we were unable to retrieve catalogs and filesets, and there were no error messages in Gravitino. How should we configure it to retrieve the metadata successfully?
The following screenshot shows our configuration, with which we are unable to obtain HDFS metadata:
[screenshot: HDFS catalog configuration]

@yu286063991 (Author)

We conducted our testing on version 0.6.1-incubating.
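
For context, creating a catalog that points at a remote HDFS typically looks like the sketch below. This is a minimal sketch using the Gravitino Java client; the server URI, metalake name, catalog name, and HDFS address are placeholders (assumptions), not values taken from this thread.

```java
import com.google.common.collect.ImmutableMap;
import java.util.Map;
import org.apache.gravitino.Catalog;
import org.apache.gravitino.client.GravitinoClient;

public class CreateHdfsCatalogSketch {
  public static void main(String[] args) {
    // Placeholder server URI and metalake; replace with your deployment.
    GravitinoClient client = GravitinoClient.builder("http://localhost:8090")
        .withMetalake("test_metalake")
        .build();

    // "location" points the catalog at a directory on the remote HDFS;
    // the namenode host and port below are assumptions.
    Map<String, String> properties =
        ImmutableMap.of("location", "hdfs://namenode-host:8020/user/gravitino");

    // "hadoop" is the provider used for fileset catalogs backed by HDFS.
    Catalog catalog = client.createCatalog(
        "hdfs_catalog", Catalog.Type.FILESET, "hadoop",
        "catalog for an existing remote HDFS", properties);
    System.out.println("Created catalog: " + catalog.name());
  }
}
```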

@FANNG1 (Contributor) commented Oct 31, 2024

A fileset is not meant to manage HDFS metadata; it manages a mapping between a logical directory and a physical directory.

@yu286063991 (Author)

Thank you for your reply.
Do we need to create a mapping between HDFS directories and local directories through the Fileset API, instead of directly querying HDFS metadata?

@FANNG1 (Contributor) commented Oct 31, 2024

Sorry, I couldn't get your point. Generally, you need to create a fileset that maps an HDFS directory, and then you read and write the HDFS data using gvfs://xxx rather than hdfs://xx.
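
To illustrate the gvfs:// access path: below is a minimal sketch using the Hadoop-compatible Gravitino Virtual File System client. The metalake, catalog, schema, and fileset names are placeholders, and the configuration keys assume the GVFS client documented for Gravitino 0.6.x.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GvfsReadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Register the Gravitino Virtual File System implementation.
    conf.set("fs.gvfs.impl",
        "org.apache.gravitino.filesystem.hadoop.GravitinoVirtualFileSystem");
    conf.set("fs.AbstractFileSystem.gvfs.impl",
        "org.apache.gravitino.filesystem.hadoop.Gvfs");
    // Point the client at the Gravitino server and metalake (placeholders).
    conf.set("fs.gravitino.server.uri", "http://localhost:8090");
    conf.set("fs.gravitino.client.metalake", "test_metalake");

    // gvfs paths follow gvfs://fileset/{catalog}/{schema}/{fileset}/...
    Path path = new Path("gvfs://fileset/hdfs_catalog/schema1/fileset1/");
    FileSystem fs = path.getFileSystem(conf);
    for (FileStatus status : fs.listStatus(path)) {
      System.out.println(status.getPath());
    }
  }
}
```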

@yu286063991 (Author)

Does "creating a mapping" refer to calling the API to create a fileset corresponding to the HDFS directory?
For example, the following API:
http://localhost:8090/api/metalakes/:metalake/catalogs/:catalog/schemas/:schema/filesets

@yuqi1129 (Contributor)

Exactly. A fileset maps a logical directory to a physical location, independent of the actual implementation. So, I wonder what problem you're encountering.
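
For completeness, here is a hedged sketch of the Java-client equivalent of that REST call. The schema, fileset, and storage location are placeholders; EXTERNAL is assumed because the HDFS directory already exists.

```java
import com.google.common.collect.ImmutableMap;
import org.apache.gravitino.NameIdentifier;
import org.apache.gravitino.client.GravitinoClient;
import org.apache.gravitino.file.Fileset;
import org.apache.gravitino.file.FilesetCatalog;

public class CreateFilesetSketch {
  public static void main(String[] args) {
    GravitinoClient client = GravitinoClient.builder("http://localhost:8090")
        .withMetalake("test_metalake")
        .build();

    // Load the fileset catalog (placeholder name) and get its fileset API.
    FilesetCatalog filesetCatalog =
        client.loadCatalog("hdfs_catalog").asFilesetCatalog();

    // EXTERNAL records the mapping to an existing HDFS directory without
    // taking ownership of the underlying data.
    Fileset fileset = filesetCatalog.createFileset(
        NameIdentifier.of("schema1", "fileset1"),
        "maps an existing HDFS directory",
        Fileset.Type.EXTERNAL,
        "hdfs://namenode-host:8020/user/existing_data",
        ImmutableMap.of());
    System.out.println("Created fileset: " + fileset.name());
  }
}
```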
