• 27 May 2016 Andriy Kashcheyev 3208

A relatively complex Hadoop application usually consists of multiple components and flows. In enterprise environments, an application is often logically encapsulated into a single project at the management level and, more importantly, into a single deployable artifact at the deliverable level. Simply put, a Hadoop application must be released to the implementation or DevOps team as a single deployable package.

  • 13 May 2016 Igor Khotin 4116

Will the Cloud unleash an acid rain on Set Top Boxes (STB)? This is what many people want you to think. Others provide rational arguments why this is not happening. If you take a “long view” though...

The STB is Morphing

By any measure, the Set Top Box (STB) / Customer Premises Equipment (CPE) business is huge. Hearings in the U.S. Senate in 2015 estimated that there are over 220 million STBs in U.S. television households served by Multichannel Video Program Distributors. That’s over 2 STBs per household; 99% of these units are rented, which generates an average of $231 a year per household [1]. That’s $19.5 billion a year, folks. No wonder that ownership of Internet set-top boxes (iSTBs) such as Apple TV and Roku jumped 63% from 2013 to 2014. Today, over 20% of U.S. households own an iSTB [2].

  • 13 April 2016 Andriy Kashcheyev 4096

A properly designed Oozie flow does not hardcode its parameters but operates on variables that are exposed for external configuration. The parameters are passed as Oozie variables in the submitted job properties file, with default values specified in config-default.xml. This setup is perfectly fine for developers who work from the shell on low-level action development and debugging tasks, usually on a dedicated or personal Hadoop environment.
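
The resolution order described above can be sketched in a few lines of Python. This is only an illustration of the idea that values in the submitted job properties override the defaults from config-default.xml; it is not Oozie's own code, and the parameter names (inputDir, queueName) and their values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults, in the Hadoop-style configuration XML
# layout used by config-default.xml.
CONFIG_DEFAULT_XML = """<configuration>
  <property><name>inputDir</name><value>/data/in</value></property>
  <property><name>queueName</name><value>default</value></property>
</configuration>"""

# Hypothetical job properties submitted with the job.
JOB_PROPERTIES = """# values supplied at submission time
queueName=etl
"""


def parse_defaults(xml_text):
    """Read name/value pairs from a config-default.xml style document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.findall("property")}


def parse_properties(text):
    """Read key=value pairs, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve(defaults, job_props):
    """Start from the defaults, then let submitted properties win."""
    merged = dict(defaults)
    merged.update(job_props)
    return merged


params = resolve(parse_defaults(CONFIG_DEFAULT_XML),
                 parse_properties(JOB_PROPERTIES))
print(params)
```

Here queueName takes the submitted value, while inputDir falls back to its default because the job properties do not mention it.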

  • 03 March 2016 Andrii Petruchek 4807

Domain creation is itself a complex and laborious process. Depending on the domain size, this task can take several days. At the same time, most of the settings are the same across all servers. To unify domain configuration and reduce the impact of the “human factor”, we decided to create an automation script for the domain creation procedure.

  • 17 February 2016 Andriy Kashcheyev 5084

I've been simulating a possible data ingestion flow in which incoming messages are funnelled through Kafka and stored in ELK for fast analytical and intelligent insights into data quality. Instead of working with raw messages, which require several stages of data transformation (format conversion, filtering, mapping to business attributes), I've taken post-processed files as the initial input. However, the files are in Avro format, which is not an issue for Kafka but would be a real parsing disappointment for Logstash.

  • 11 December 2015 Danylo Denyshchenko 6313

Nowadays, companies that develop software for Big Data face the problem of competition for computing resources between participants in the development process. At certain stages, developers use their own environment deployed on their computers, which requires installing all the necessary services locally or, more often, running virtual machines.

One of the major players in the Hadoop ecosystem is Cloudera. It provides CDH (Cloudera Distribution Including Apache Hadoop), which includes the Hadoop platform core (HDFS, MapReduce, Hadoop Common) and integrated open source projects such as Apache Spark, Apache HBase, Apache Pig, Apache Hive and others. Cloudera also offers the Cloudera Manager console for administering and managing the Hadoop ecosystem.

Cloudera provides QuickStart virtual machine and Docker images, which include Cloudera Manager and CDH.

  • 02 December 2015 Iurii Zaiarnyi 3288

Within IntroPro's RnD department we design and develop fairly complex prototypes and solutions that usually have mobile clients as the user interaction layer. When such a project is fast paced, has resource constraints and mostly pursues technology exploration and investigation goals, the requirements for mobile application quality are generally rather loose. However, there are situations when the project's emphasis is more on visual representation or on the use of advanced features that can be better demonstrated with a mobile application. In such cases, we have to put in the effort to deliver a high-quality user experience, which is always associated with extensive QA phases. Mobile clients usually mean at least iOS and Android applications, but in our case the situation is “exacerbated” since we sometimes also have a TV-oriented client.

  • 25 August 2015 Yuriy Petyuk 11648

The main reason to set a space quota on HDFS is to limit space consumption by users or applications. A Hadoop cluster that has been in use for some time can run out of free space. The general recommendation is to predict data growth and add new data nodes to the cluster. But if we do not do that in time, we may face numerous issues: the cluster becomes very sluggish and ends up on the verge of collapse because its space is full.

  • 23 July 2015 Anton Zhuk 6712

A successful software project is much more than software development alone. There are always expected development problems, usability problems and, surely, delivery problems.
Often the delivery phase of a software development project is postponed. The product is good enough and the design is fine, the code looks well-written and well-tested, but the final handoff and deployment have been neglected.

The DevOps movement has emerged as a response to these problems — improving deployment and management of software that has already been written.

  • 24 June 2015 Oleg Serdyukov 4172

Our client wanted a solution that automates the installation of secure Cloudera Hadoop on cloud computing platforms (Amazon, OpenStack). The solution also had to automatically reconfigure Hadoop nodes when critical limits of memory, free space, etc. are exceeded. We created the Cloudera Configurator and Cloudera Hadoop Scaleout, which completely meet these expectations.