29 August 2013

Creating a Basic Hive UDF

Creating and using a basic Hive UDF is pretty simple.

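First locate the hive-exec and hadoop-core jars on your system and add them to the class path (for example, by listing both jars in the CLASSPATH environment variable). Where the jars live depends on your Hadoop and Hive installation.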

Next create a directory structure for the java files:

mkdir -p udf_test/src/com/sodonnel/udf
mkdir -p udf_test/classes

Create the most basic hello world UDF in udf_test/src/com/sodonnel/udf/HelloWorld.java:

package com.sodonnel.udf;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;

public class HelloWorld extends UDF {
  // Hive locates evaluate() by reflection. The argument is ignored here.
  public String evaluate(String v) {
    return "Hello World!";
  }
}

In the src directory, compile the Java class:

javac -d ../classes com/sodonnel/udf/HelloWorld.java

This will create the directories and class file under the classes folder. Now we need to create a JAR out of the class file. In the classes directory run the following command:

jar cf HelloWorld.jar com

The final step is to load this jar file into Hive:

hive> add jar /export/home/sodonnel/udf/src/com/sodonnel/udf_test/classes/HelloWorld.jar;
Added /export/home/sodonnel/udf/src/com/sodonnel/udf_test/classes/HelloWorld.jar to class path
Added resource: /export/home/sodonnel/udf/src/com/sodonnel/udf_test/classes/HelloWorld.jar

hive> create temporary function hello_world as 'com.sodonnel.udf.HelloWorld';
Time taken: 0.0040 seconds

Now call the function when selecting some rows from a table:

hive> select hello_world('any string') from my_table limit 10;
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!

Not a very useful UDF, but it opens the door for more interesting things.
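
As a small step towards something more interesting, here is a rough sketch of an evaluate method that actually uses its argument and passes NULL through untouched. The class name HelloName and the greeting format are made up for illustration. To keep the sketch compilable without the Hive jars, the class does not extend UDF; in a real UDF it would be declared as public class HelloName extends UDF, in the same way as HelloWorld above.

```java
// Hypothetical example: HelloName is not part of Hive or of the post above.
// A real UDF would be "public class HelloName extends UDF" with the same
// imports as HelloWorld; that is left out so this compiles standalone.
class HelloName {

  // Returns a greeting for the given name, or null when the input is null
  // (which is how a NULL column value would arrive for a String argument).
  public String evaluate(String name) {
    if (name == null) {
      return null;
    }
    return "Hello " + name + "!";
  }

  // Quick local check of the logic before packaging it into a jar.
  public static void main(String[] args) {
    HelloName udf = new HelloName();
    System.out.println(udf.evaluate("Hive"));
    System.out.println(udf.evaluate(null));
  }
}
```

Once it extends UDF, the class can be compiled, packaged and registered with create temporary function in the same way as HelloWorld.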
