I'd like to load the GeoLite2-City.mmdb file from HDFS, but hive-udfs can't read it because it's not clear what file path to pass. The only way I can get it to work is to run 'list files', copy the tmp directory location, and then use that full path in the function call:
hive> ADD jar hdfs:///resources/jars/hive-geoip-udf-0.1-SNAPSHOT.jar;
converting to local hdfs:///resources/jars/hive-geoip-udf-0.1-SNAPSHOT.jar
Added [/tmp/0fd54f8d-e3eb-4cfe-823f-8d1a0ce7c13a_resources/hive-geoip-udf-0.1-SNAPSHOT.jar] to class path
Added resources: [hdfs:///resources/jars/hive-geoip-udf-0.1-SNAPSHOT.jar]
hive> ADD FILE hdfs:///resources/data/geoip/GeoLite2-City.mmdb;
converting to local hdfs:///resources/data/geoip/GeoLite2-City.mmdb
Added resources: [hdfs:///resources/data/geoip/GeoLite2-City.mmdb]
hive> CREATE TEMPORARY FUNCTION geoip as 'com.spuul.hive.GeoIP2';
OK
Time taken: 0.537 seconds
hive> select geoip('8.8.8.8', 'CITY', 'GeoLite2-City.mmdb');
OK
Time taken: 1.258 seconds, Fetched: 1 row(s)
hive> select geoip('8.8.8.8', 'CITY', './GeoLite2-City.mmdb');
OK
Time taken: 0.165 seconds, Fetched: 1 row(s)
hive> list files;
/tmp/0fd54f8d-e3eb-4cfe-823f-8d1a0ce7c13a_resources/GeoLite2-City.mmdb
hive> select geoip('8.8.8.8', 'CITY', '/tmp/0fd54f8d-e3eb-4cfe-823f-8d1a0ce7c13a_resources/GeoLite2-City.mmdb');
OK
Mountain View
Time taken: 0.253 seconds, Fetched: 1 row(s)
It seems that the database file path passed to the geoip function is resolved relative to the path of the JAR.
I am no expert in this. I have only used files on the local file system, with both the JAR and the database in the root folder of the user launching Hive.
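A possible direction, offered only as a sketch: if the UDF looked up the file name in a list of candidate directories (for example the session resources directory printed by 'list files'), callers would not have to paste the tmp path by hand. The class `DbPathResolver` and its `resolve` method below are hypothetical names for illustration and are not part of hive-geoip-udf. How the UDF would obtain the candidate directories is an open question that this sketch does not answer.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of hive-geoip-udf. */
public final class DbPathResolver {
    private DbPathResolver() {}

    /**
     * Returns the first existing regular file found by trying the name
     * as given, then looking for its file name in each search directory.
     */
    public static Optional<Path> resolve(String name, List<Path> searchDirs) {
        Path given = Paths.get(name);
        // An absolute path, or a relative path that exists from the
        // current working directory, is used unchanged.
        if (Files.isRegularFile(given)) {
            return Optional.of(given);
        }
        // Only the last path element is used for directory lookups, so
        // "./GeoLite2-City.mmdb" and "GeoLite2-City.mmdb" behave the same.
        Path fileName = given.getFileName();
        if (fileName == null) {
            return Optional.empty();
        }
        for (Path dir : searchDirs) {
            Path candidate = dir.resolve(fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With a helper like this, a call such as `DbPathResolver.resolve("GeoLite2-City.mmdb", dirs)` would return the copy under the `_resources` directory if that directory were in `dirs`, and an empty `Optional` if the file were found nowhere, which the UDF could turn into a clear error instead of an empty result.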