Oozie Flow Simple Configurator

  • 13 April 2016, Andriy Kashcheyev

A properly designed Oozie flow does not hardcode its parameters; it operates on variables that are exposed for external configuration. The parameters are passed as Oozie variables in the properties file submitted with the job, with default values specified in config-default.xml. This setup is perfectly fine for developers who work from the shell on low-level action development and debugging, usually on a dedicated or personal Hadoop environment.
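
To make the precedence concrete, here is a minimal sketch of how the effective parameters could be computed: defaults come from config-default.xml, and anything in the submitted properties file wins. It is an illustration and not Oozie's own code; the function names, the parameter names (sourceDir, targetDir, nameNode) and the host name are made up for the example, and the properties parser only handles plain key=value lines.

```python
import xml.etree.ElementTree as ET


def parse_config_default(xml_text):
    """Read <property><name>/<value> pairs from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def parse_job_properties(text):
    """Read simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def effective_parameters(config_default_xml, job_properties_text):
    """Start from the defaults, then let the submitted job properties override them."""
    merged = parse_config_default(config_default_xml)
    merged.update(parse_job_properties(job_properties_text))
    return merged


# Hypothetical defaults shipped with the workflow.
CONFIG_DEFAULT = """<configuration>
  <property><name>sourceDir</name><value>/data/in</value></property>
  <property><name>targetDir</name><value>/data/out</value></property>
</configuration>"""

# Hypothetical properties file submitted with the job.
JOB_PROPERTIES = """# submitted with the job
nameNode=hdfs://example-nn:8020
targetDir=/data/archive
"""

params = effective_parameters(CONFIG_DEFAULT, JOB_PROPERTIES)
for name in sorted(params):
    print(name, "=", params[name])
```

In this example sourceDir keeps its default, while targetDir takes the value from the submitted file.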

The situation becomes completely different when the workflow component is part of a Hadoop application with multiple configurable business parameters, used primarily by people who do not have, and honestly do not need, deep knowledge of the mechanics of Oozie operations. QA engineers are the best-known "victims" of the complexity and versatility of Hadoop application configurations, since they have to deal with the holistic view of an application that might consist of tens of flows with a plethora of actions.

I personally believe that instead of writing documentation that thoroughly describes where the required XML file can be found and which XML attribute should be changed, it is much better to automate everything and/or provide an intuitive UI to manage parameters.

The following use case is a very simplified demonstration that reflects a real production flow: a distributed copy of data between different Hadoop clusters. The core of the flow is the distcp action, but there are a couple of other actions that have their own configuration options, and some of those options are identical parameters. And what you don't want to do in complex enterprise software is duplicate data "management".

For designing, debugging and testing complex flows we use the management console named Hadoopware, and this is how such a flow looks in its web interface:

Extra actions in the flow, such as auditlog-job and job-lock/job-unlock, provide facilities for enterprise-wide audit trail logging and synchronized execution respectively. Synchronized flow execution relies on a distributed lock built on Zookeeper, which was described in the previous article, Hadoop Zookeeper as locking-mechanism.

I want to note that it took me about 5 minutes to create the flow; the rest of the time was spent debugging and tweaking it (some effort was also required to arrange the diagram nicely on the screen). Creating the flow, not arranging it, is essentially what developers would do during ad-hoc integration testing of full flows, and what QA engineers would do during exploratory and manual testing. Creating and visualizing flows in the UI is quick and straightforward. However, the configuration is still obscure because each kind of Oozie action has its own nuances and peculiarities. An inexperienced Hadoop practitioner will have to constantly consult the Oozie manuals to configure everything properly.

Hadoopware provides a configurable specification for Hadoop flows that defines which attributes are visible as global parameters for the whole flow. It is similar to an Oozie coordinator, but has a more global scope: it applies to all components within a Hadoop Application. I plan to cover management of multi-flow Hadoop Application use cases in a separate article.

On the right there is a Properties Panel where flow-wide configuration can be edited. Property names are effectively placeholders for values, and default values can be specified for them as well. Since Hadoopware has a concept of Design time and Runtime management, everything specified in "Design time" flows becomes a default value, with some parameters automatically replaced by environment-specific settings at deployment time (e.g. namenode, but not limited to standard Hadoop properties).
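
The idea of design-time defaults being overridden by environment settings at deployment can be sketched in a few lines. This is only an illustration of the layering, not Hadoopware's actual implementation, which is not shown in this article; the resolve function, the property names and the host names below are all hypothetical.

```python
import re

# Matches simple ${name} placeholders. EL function calls such as ${wf:id()}
# contain characters outside this class, so they are left untouched.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_.-]*)\}")


def resolve(template, design_defaults, environment):
    """Fill placeholders, letting environment settings override design-time defaults."""
    values = {**design_defaults, **environment}

    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Fail loudly instead of deploying a flow with an unresolved parameter.
            raise KeyError("no value for placeholder " + repr(name))
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical values entered once at design time for the whole flow.
DESIGN_DEFAULTS = {"nameNode": "hdfs://dev-nn:8020", "targetDir": "/data/out"}

# Hypothetical settings supplied by the target environment at deployment.
ENVIRONMENT = {"nameNode": "hdfs://prod-nn:8020"}

print(resolve("${nameNode}${targetDir}", DESIGN_DEFAULTS, ENVIRONMENT))
```

Here nameNode comes from the environment while targetDir falls back to its design-time default, so a value is entered once and reused by every action that refers to it.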

Everything I have to know from the functional perspective of the flow is visible and configurable in one place for all actions.

This is how the same flow looks after execution:

Debugging the flow is quite straightforward: execute, check for errors, make modifications directly in the UI, save and run again. All modifications are applied immediately on HDFS and are ready to be executed.

When the flow is fully functional, I can export it from the cluster into the Hadoopware IDE as a "template" for future reference: creating new flows based on it, building more complex flows from it, using just parts of it, or assembling superflows. Eventually I can download it as source code with all included jars and configuration files, so that all fixes can be committed to the source repository (I will touch on the development flow in a subsequent article).

My Hadoop environment for this demo in one screenshot:

Another interesting point about the aforementioned flow is that the data was copied from a Cloudera cluster to a HortonWorks cluster. The window in the top left corner shows all clusters currently managed by Hadoopware (3 CDH clusters: 4.8, 5.3 (Kerberized), 5.5, and 1 HDP).

Last point: CDH 5.5 still has a bug in the Oozie email action, where a missing password causes the action to fail with an NPE, and only the Java methods in the stack trace hint at the origin of the error:

2016-04-11 19:31:17,317 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hed-lab1.ea.intropro.com] USER[oozie] GROUP[-] TOKEN[] APP[emailer] JOB[0000000-160411143739514-oozie-oozi-W] ACTION[0000000-160411143739514-oozie-oozi-W@email_1] Error starting action [email_1]. ErrorType [ERROR], ErrorCode [NullPointerException], Message [NullPointerException: null]
org.apache.oozie.action.ActionExecutorException: NullPointerException: null
	at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
	at org.apache.oozie.action.email.EmailActionExecutor.start(EmailActionExecutor.java:112)
	at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
	at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
	at org.apache.oozie.command.XCommand.call(XCommand.java:286)
	at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
	at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
	at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at java.lang.String.<init>(String.java:166)
	at org.apache.oozie.service.ConfigurationService.getPassword(ConfigurationService.java:556)
	at org.apache.oozie.service.ConfigurationService.getPassword(ConfigurationService.java:571)