Job Duties:

Work with the Business Analyst and Project Lead to collect and translate business / user requirements into technical and functional requirements and specifications for the Hadoop (Big Data) redesign system/application;

Design, code, program, develop and implement the Hadoop (Big Data) systems/database architectural redesigns, flow charts and work flow and database models and user interfaces using Visual Studio, HTML, CSS, Java Script, ASP, ASP.NET, SQL Server, PL/SQL, Shell Scripting, XML;

Design, program, and implement software codes and software scripts for data loading and validation using Java, Java Script for validations;

Design, code, program and load database models, tables, views and stored procedures and queries for the application/system to capture and analyze data using SQL Scripts, SQL Developer and SQL Server;

Review and analyze data for importing, uploading and validation;

Data mapping and data migration between database and developing software scripts that correctly capture the data being migrated;

Deploy, migrate, customize and integrate the redesigned /modified database system/application;

Design, program and implement software code and software scripts for Data conversion and migration;

Perform Big data analytics, data validation and stored procedures developing and using SQL queries and MS SQL Server;

Installed and configured multiple Hadoop clusters on the platform as per the requirements;

Configured several Hadoop components on the database clusters such as MapReduce, YARN, Zookeeper, HDFS, Hive, HBase, Spark, Sqoop, Oozie etc. and ensure that they are functioning as expected;

Develop and automate the ETL workflows for importation and analysis of required data from several RDMS sources such as SQL server, Oracle and DB2 into Hadoop (Big Data) using components like Oozie, Sqoop, Hive, UNIX shell scripts and scheduled them to run daily;

Install and configure data analytics tools such as R, SAS on the Hadoop clusters and provide necessary technical support to data scientists to ensure that they work properly;

Monitor multiple clusters on the Hadoop platform and resolving any issue that may occurred, to ensure that the platform is always available for the data scientists to run their jobs;

Develop and implement an automated monitoring application to transmit high alerts if any component of the Hadoop platform is down using Python;

Design, develop and implement HiveQL queries to materialize the existing hive tables to generate data that can directly be used for data analytics;

Configure and administer Spark and provide technical training and support to the Scientist analytics team on use of Spark for use cases;

Review, update, monitor Hadoop (Big Data) system/application performance;

Perform upgrades on the cluster whenever new version of Hadoop components were released and resolved any issues caused due to upgrade by working with users and product vendor;

Perform Hadoop Platform and database administration, monitoring, upgrade, disaster recovery, and technical support to end-users;

Design, code, programming and installation of redesigned database architect and third-party software and hardware;

Perform troubleshooting and system maintenance, backup and disaster recovery;

Establish and monitor user environments, directories and security profiles, and ensure/verify proper functioning of all aspect of the Hadoop platform and database;

Providing 24×7 production, technical support and maintenance.


Minimum Requirements: Bachelor’s degree or equivalent in Computer Science, or Information Systems or related technical fields and 1-2 years of related experiences.


Location:   Herndon, VA and Charlotte, NC and other locations nationwide

Apply for this position

Allowed Type(s): .pdf, .doc, .docx