******************Introduction*******************
One significant difference between Azkaban2 and Azkaban (or many other workflow management/scheduling systems) is the plugin-based executor design.
This way, the Azkaban2 core code does not need to change when a specific job type changes, say from hadoop 0.2x to hadoop 1.x, or to YARN clusters.
One can also easily add new job types for Hive, Crunch, etc. Each jobtype plugin can be configured with a different set of classes that will be loaded for the user by default.
Included in this package are job types for legacy java and pig jobs, which are just plugin versions of the java and pig types in the original Azkaban.
There are also hadoopjava and hadooppig job types in this package. The difference between these two sets of job types is that the legacy java and pig types, while they also work with secure hadoop clusters, expose security information such as the proxy user and the azkaban keytab location to the user code. A user can easily use such information to proxy as any other user and do anything on a secured hadoop cluster.
The hadoopjava and hadooppig job types use hadoop security tokens instead, which makes them much safer against malicious users.
Regardless, all of these hadoop job types need the hadoopsecuritymanager plugin, either for getting tokens or for getting proxy users. Included in azkaban-plugins is a hadoopsecuritymanager that should work with hadoop 1.x versions. It should be easy to implement a hadoopsecuritymanager for different versions of hadoop installations.
*****************Plugin Directory Structure*****************
The jobtype plugins should be organized into a flat structure, with one directory per job type directly under the plugin root.
For example:
azkaban-executor-server/
....................... plugins/
................................jobtypes/
.........................................java/
.........................................pig/
.........................................other_job_type/
One needs to set azkaban.jobtype.plugin.dir in azkaban's conf file to the jobtype plugin root directory (in this case plugins/jobtypes).
The Azkaban2 executor server will try to load the job types in that directory on startup.
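The layout and the setting above can be sketched as follows. This is only an illustration: the AZKABAN_HOME variable and the conf/azkaban.properties file name are assumptions that are not stated in this file, and the script falls back to a temporary directory so it can be tried without touching a real install.

```shell
# Sketch only. AZKABAN_HOME and the conf/azkaban.properties file name are
# assumptions; adjust both to your own installation.
AZKABAN_HOME="${AZKABAN_HOME:-$(mktemp -d)/azkaban-executor-server}"
PLUGIN_DIR="plugins/jobtypes"
CONF_FILE="$AZKABAN_HOME/conf/azkaban.properties"

# One directory per job type, all directly under the plugin root (flat structure).
for jobtype in java pig other_job_type; do
    mkdir -p "$AZKABAN_HOME/$PLUGIN_DIR/$jobtype"
done

# Point the executor server at the plugin root, unless the key is already set.
mkdir -p "$AZKABAN_HOME/conf"
touch "$CONF_FILE"
grep -q '^azkaban\.jobtype\.plugin\.dir=' "$CONF_FILE" \
    || echo "azkaban.jobtype.plugin.dir=$PLUGIN_DIR" >> "$CONF_FILE"

# Show the resulting job type directories.
ls "$AZKABAN_HOME/$PLUGIN_DIR"
```

Each directory listed at the end corresponds to one job type that the executor server would try to load.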
One can use common.properties for system-wide properties that will be inherited by all jobs and will be visible to the user. commonprivate.properties should contain system-wide properties that users may not see, such as security settings.
Inside each jobtype directory, Azkaban2 looks for private.properties, which needs at a minimum the following:
jobtype.classpath ---- the job type specific classes that this plugin's classloader should load, and that should be on the individual job's classpath.
jobtype.class ---- the plugin class that implements this job type.
The directory name is used as the job type name in Azkaban2.
One can put other settings that are not exposed to users in private.properties, and those visible to users in plugin.properties. The settings in plugin.properties inherit the global settings in common.properties, and those in private.properties inherit the global settings in commonprivate.properties.
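A minimal sketch of these four files is shown below. The job type name "myjobtype", the class name com.example.MyJobType, the lib/* classpath value and the two example.* keys are all hypothetical placeholders. The sketch also assumes that the two common files sit in the plugin root directory, which this file does not state explicitly.

```shell
# Sketch only. "myjobtype", com.example.MyJobType, lib/* and the example.* keys
# are hypothetical; placing the common files in the plugin root is an assumption.
PLUGIN_ROOT="${PLUGIN_ROOT:-$(mktemp -d)/plugins/jobtypes}"
mkdir -p "$PLUGIN_ROOT/myjobtype/lib"

# System-wide settings that every job inherits and that users can see.
cat > "$PLUGIN_ROOT/common.properties" <<'EOF'
example.visible.setting=value-seen-by-all-jobs
EOF

# System-wide settings that users may not see, such as security settings.
cat > "$PLUGIN_ROOT/commonprivate.properties" <<'EOF'
example.hidden.setting=value-hidden-from-users
EOF

# Minimum required settings for one job type; the directory name is its type name.
cat > "$PLUGIN_ROOT/myjobtype/private.properties" <<'EOF'
jobtype.classpath=lib/*
jobtype.class=com.example.MyJobType
EOF

# Optional settings for this job type that are visible to users (empty here).
: > "$PLUGIN_ROOT/myjobtype/plugin.properties"
```

With this layout, a job of type myjobtype would see the visible setting from common.properties, while the hidden setting stays on the private side together with jobtype.class and jobtype.classpath.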
*****************How to Install:*****************
The job types in this package can be installed as follows:
1. Run ant in the azkaban-plugins directory. Make sure you have first installed the "dustjs-linkedin" module via "npm install -g less dustjs-linkedin".
2. Compile hadoopsecuritymanager (starting from the azkaban-plugins directory):
cd plugins/
cd hadoopsecuritymanager/
ant
## this builds azkaban-hadoopsecuritymanager-0.10.jar
3. Build the jobtype plugins, which are packaged with the aforementioned directory structure (again starting from the azkaban-plugins directory):
cd plugins/
cd jobtype/
ant package
## this builds ../../dist/jobtype/packages/azkaban-jobtype-0.10.tar.gz
Untar this file in your plugin directory, say myplugins/, and rename the extracted directory to match your azkaban.jobtype.plugin.dir setting.
You need to fill in the variables in the .properties files with correct values, including the classpath to your hadoopsecuritymanager jar file.
You need to restart azkaban-executor-server, or start a new instance of the executor server, for the new job types to be picked up.
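The unpacking step can also be sketched as a small shell function. install_jobtypes is a hypothetical helper, not part of this package. Instead of untarring and then renaming, it extracts straight into the target directory, and it assumes the archive contains a single top-level directory.

```shell
# Sketch only. install_jobtypes is a hypothetical helper. It assumes the
# archive has exactly one top-level directory, which is dropped on extraction.
install_jobtypes() {
    tarball="$1"      # e.g. dist/jobtype/packages/azkaban-jobtype-0.10.tar.gz
    plugin_dir="$2"   # must match your azkaban.jobtype.plugin.dir setting

    mkdir -p "$plugin_dir"
    # Strip the archive's top-level directory so the job type directories
    # land directly under the plugin directory.
    tar -xzf "$tarball" -C "$plugin_dir" --strip-components=1
}

# Example call (paths are placeholders):
# install_jobtypes dist/jobtype/packages/azkaban-jobtype-0.10.tar.gz myplugins/jobtypes
```

The .properties files still need to be filled in by hand afterwards, and the executor server restarted.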