Teradata Data Engineer in Islamabad, Pakistan

Company Introduction

Think Big is the leading professional services firm focused exclusively on helping companies unlock new insights and value from Big Data. We provide data science and data engineering services to assemble custom applications that deliver business outcomes. We collaborate with leaders to prioritize initiatives and generate value quickly with our proven “test and learn” methodology.

Job Description

We are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge sets of data. You will be a member of a team that develops and implements advanced algorithms and data pipelines that extract, classify, merge, and deliver new insights and business value out of heterogeneous structured and unstructured data sets. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.

You will have a chance to learn and work with multiple technologies and thought leaders in the Big Data space. You will also be responsible for integrating these technologies with the architecture used across the company.

Responsibilities

  • Work with consultant teams on specific customer deliverables as and when required.

  • Integrate any Big Data tools and frameworks required to provide requested capabilities.

  • Design and implement the data lake.

  • Monitor performance and advise on any necessary configuration and infrastructure changes.

  • Debug and resolve Hadoop (YARN, MapReduce, Spark, etc.) issues.

  • Advise on and implement data lake security using Kerberos, Knox, Ranger, SSL, etc.

Skills and Qualifications

  • Proficient understanding of distributed computing principles

  • Management of Hadoop clusters and all included services using Apache Ambari, Cloudera Manager, or MapR Control System

  • Proficiency with Hadoop v2, MapReduce, HDFS, YARN, and Tez

  • Experience building stream-processing systems using solutions such as Storm, Spark Streaming, or NiFi

  • Good knowledge of Big Data querying and workflow tools, such as Pig, Hive, Impala, and Oozie

  • Working knowledge of Apache Spark

  • Experience with integration of data from multiple data sources

  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB

  • Knowledge of various ETL techniques and frameworks, such as Flume

  • Experience with various messaging systems, such as Kafka or RabbitMQ

  • Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O

  • Good understanding of Lambda Architecture, along with its advantages and drawbacks

  • Experience with any of the following Hadoop distributions: Cloudera/MapR/Hortonworks

  • Training/Certification on any Hadoop distribution will be a plus.

  • Completion of any MOOCs will be an advantage.

Location

This position is open in Islamabad and Lahore.