Can't load GeoLite2-City.mmdb from hdfs #3

Open
mshirley opened this issue May 11, 2016 · 2 comments

@mshirley

mshirley commented May 11, 2016

I'd like to load the GeoLite2-City.mmdb file from HDFS, but hive-udfs can't read it because it's not clear which file path to pass to the function. The only way I can get it to work is to run 'list files', copy the tmp directory location, and then use that full path in the function call.

hive> ADD jar hdfs:///resources/jars/hive-geoip-udf-0.1-SNAPSHOT.jar;
converting to local hdfs:///resources/jars/hive-geoip-udf-0.1-SNAPSHOT.jar
Added [/tmp/0fd54f8d-e3eb-4cfe-823f-8d1a0ce7c13a_resources/hive-geoip-udf-0.1-SNAPSHOT.jar] to class path
Added resources: [hdfs:///resources/jars/hive-geoip-udf-0.1-SNAPSHOT.jar]

hive> ADD FILE hdfs:///resources/data/geoip/GeoLite2-City.mmdb;
converting to local hdfs:///resources/data/geoip/GeoLite2-City.mmdb
Added resources: [hdfs:///resources/data/geoip/GeoLite2-City.mmdb]

hive> CREATE TEMPORARY FUNCTION geoip as 'com.spuul.hive.GeoIP2';
OK
Time taken: 0.537 seconds

hive> select geoip('8.8.8.8', 'CITY', 'GeoLite2-City.mmdb');
OK
Time taken: 1.258 seconds, Fetched: 1 row(s)

hive> select geoip('8.8.8.8', 'CITY', './GeoLite2-City.mmdb');
OK
Time taken: 0.165 seconds, Fetched: 1 row(s)

hive> list files;
/tmp/0fd54f8d-e3eb-4cfe-823f-8d1a0ce7c13a_resources/GeoLite2-City.mmdb

hive> select geoip('8.8.8.8', 'CITY', '/tmp/0fd54f8d-e3eb-4cfe-823f-8d1a0ce7c13a_resources/GeoLite2-City.mmdb');
OK
Mountain View
Time taken: 0.253 seconds, Fetched: 1 row(s)
@DanielMuller
Contributor

It seems that the database file path passed to the geoip method is resolved relative to the path of the JAR.
I'm no expert in this; I have only used the files on the local file system, with both of them in the root folder of the user launching Hive.

@weiatwork

From the API, it looks like it only supports reading from a local file. HDFS is not supported.
https://github.com/maxmind/GeoIP2-java
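
The manual workaround from the original report (copying the directory printed by 'list files') could in principle be automated on the UDF side. What follows is only a minimal sketch under stated assumptions, not how hive-geoip-udf or GeoIP2-java actually behave: `ResourceLocator` and `locate` are hypothetical names, and the caller is assumed to already know which local directories might hold the file that `ADD FILE` copied down. The helper tries the path exactly as given, then falls back to looking for the same file name in each candidate directory.

```java
import java.io.File;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of hive-geoip-udf or GeoIP2-java.
final class ResourceLocator {

    private ResourceLocator() {
    }

    // Returns the first existing regular file matching the requested path.
    // The path is tried as given first; if that fails, only its file name
    // is looked up in each candidate directory, in order.
    static Optional<File> locate(String path, List<File> candidateDirs) {
        File direct = new File(path);
        if (direct.isFile()) {
            return Optional.of(direct);
        }
        String baseName = direct.getName();
        for (File dir : candidateDirs) {
            File candidate = new File(dir, baseName);
            if (candidate.isFile()) {
                return Optional.of(candidate);
            }
        }
        // Nothing found: the caller decides how to report the missing database.
        return Optional.empty();
    }
}
```

In the session above, the candidate directory would be the `/tmp/..._resources` directory printed by 'list files'; how a UDF would discover that directory at runtime is left open here.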
