Apache Pig lower function

Author: 找个陈美嘉那样的女人
Source: 51数据库
2020-09-20

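I recently got a taste of how powerful Hive's UDFs are: besides the many UDFs that already exist, you can define your own to fit a particular business scenario. Below is a short introduction to writing UDF/UDAF/UDTF functions.
First, you need to create a new class that extends UDF, with one or more methods named evaluate.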
package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Lower-cases its input; Hive locates the evaluate method by name.
public final class Lower extends UDF {
    public Text evaluate(final Text s) {
        // A NULL input yields a NULL output.
        if (s == null) { return null; }
        return new Text(s.toString().toLowerCase());
    }
}
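
To make "one or more methods named evaluate" more concrete, here is a minimal plain-Java sketch of picking an evaluate overload by name and argument types through reflection. It is only a simplified model and not Hive's actual method resolver. The names EvaluateDispatchSketch, LowerSketch and call are hypothetical, and String and Integer stand in for Hadoop's writable types so that the sketch runs without Hive on the classpath.

```java
import java.lang.reflect.Method;
import java.util.Locale;

// Hypothetical dispatcher: a simplified model of "find a method named evaluate".
final class EvaluateDispatchSketch {
    // Invokes the first public evaluate method whose parameters accept the arguments.
    // Only reference-typed parameters are handled; primitives would need extra work.
    static Object call(Object udf, Object... args) {
        for (Method m : udf.getClass().getMethods()) {
            if (!m.getName().equals("evaluate") || m.getParameterCount() != args.length) {
                continue;
            }
            Class<?>[] types = m.getParameterTypes();
            boolean matches = true;
            for (int i = 0; i < types.length; i++) {
                // A null argument is accepted by any reference-typed parameter.
                if (args[i] != null && !types[i].isInstance(args[i])) {
                    matches = false;
                    break;
                }
            }
            if (!matches) {
                continue;
            }
            try {
                return m.invoke(udf, args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("evaluate failed", e);
            }
        }
        throw new IllegalArgumentException("no evaluate method accepts these arguments");
    }

    public static void main(String[] args) {
        LowerSketch udf = new LowerSketch();
        System.out.println(call(udf, "HELLO Hive"));    // one-argument overload
        System.out.println(call(udf, "HELLO Hive", 5)); // two-argument overload
    }
}

// Hypothetical stand-in for a UDF class with two evaluate overloads.
final class LowerSketch {
    public String evaluate(String s) {
        if (s == null) { return null; }
        return s.toLowerCase(Locale.ROOT);
    }

    // Second overload: lower-case, then keep at most maxLen characters.
    public String evaluate(String s, Integer maxLen) {
        if (s == null || maxLen == null) { return null; }
        String lower = s.toLowerCase(Locale.ROOT);
        int n = Math.max(0, maxLen);
        return lower.length() <= n ? lower : lower.substring(0, n);
    }
}
```

Each overload corresponds to one way the function could be called from a query. A real Hive UDF would declare the overloads on the class that extends UDF and leave the lookup to Hive.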

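Developing a UDF for Hive is very simple. The UDF described here is registered as a temporary function, so Hive 0.4.0 or later is required. All you need to do is write a class that extends UDF and implements an evaluate method. For example:

package com.hrj.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;

public class helloudf extends UDF {
    public String evaluate(String str) {
        try {
            return "helloworld " + str;
        } catch (Exception e) {
            return null;
        }
    }
}

Compile this Java file and package it as helloudf.jar, then register and call the function from the Hive shell:

hive> add jar helloudf.jar;
hive> create temporary function helloworld as 'com.hrj.hive.udf.helloudf';
hive> select helloworld(t.col1) from t limit 10;
hive> drop temporary function helloworld;

Notes:
1. helloworld is a temporary function, so add jar and create temporary function have to be run again every time you start a new Hive session.
2. A UDF can only handle one-in, one-out operations (one input row produces one output value). For many-in, one-out operations, you need to implement a UDAF.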
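
The difference between one-in, one-out and many-in, one-out can be sketched in plain Java. The method names init, iterate, terminatePartial, merge and terminate follow the shape of Hive's classic UDAF evaluator, but everything here is a hypothetical stand-in with no Hive dependency: UdfVsUdafSketch and MaxLengthAggSketch are invented names, and the two lists play the role of rows handled by two separate tasks.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical driver contrasting a row-wise function with an aggregate.
final class UdfVsUdafSketch {
    // One-in, one-out: each input row produces exactly one output value.
    static String helloworld(String str) {
        return "helloworld " + str;
    }

    // Many-in, one-out: aggregate two partitions separately, then merge them.
    static Integer maxLength(List<String> partA, List<String> partB) {
        MaxLengthAggSketch a = new MaxLengthAggSketch();
        a.init();
        for (String row : partA) { a.iterate(row); }

        MaxLengthAggSketch b = new MaxLengthAggSketch();
        b.init();
        for (String row : partB) { b.iterate(row); }

        a.merge(b.terminatePartial());
        return a.terminate();
    }

    public static void main(String[] args) {
        List<String> partA = Arrays.asList("a", "bcd");
        List<String> partB = Arrays.asList("efghi", null);
        for (String row : partA) {
            System.out.println(helloworld(row)); // one output line per row
        }
        System.out.println(maxLength(partA, partB)); // a single value for all rows
    }
}

// Hypothetical aggregator: longest string length seen, ignoring NULL rows.
final class MaxLengthAggSketch {
    private int max;
    private boolean seen;

    // Resets the aggregation state.
    void init() { max = 0; seen = false; }

    // Called once per input row.
    boolean iterate(String value) {
        if (value != null) {
            max = Math.max(max, value.length());
            seen = true;
        }
        return true;
    }

    // Partial result that one task hands to another.
    Integer terminatePartial() { return seen ? max : null; }

    // Folds another task's partial result into this one.
    boolean merge(Integer other) {
        if (other != null) {
            max = Math.max(max, other);
            seen = true;
        }
        return true;
    }

    // Final result; null when no non-NULL row was seen.
    Integer terminate() { return seen ? max : null; }
}
```

The split into iterate and merge is what lets an aggregate run across several tasks. A row-wise function such as helloworld needs no such state.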