Big Data Architect

12 to 14 years
About the opportunity

Must Have

  • Basics of Distributed computing
  • MapReduce
  • Distributed computing vs RDBMS; scale up vs scale out
  • Hands-on experience in at least one of the following programming languages: Java, Python, Scala
  • Understanding of Linux and Bash scripting
  • Knowledge of SQL
  • Basics of the Hadoop framework and the problem patterns it can solve, such as filtering, aggregation, and joins
  • Understanding of Spark concepts such as RDDs, DataFrames, and closures; has implemented at least one project using Spark and Scala
  • Should have worked on at least 1-2 big data projects (for example ingestion or ETL processing) on the Cloudera platform
  • Understanding of Hive/Pig, concepts like partitioning, bucketing, metastore, schema on read vs schema on write, SerDe
  • Solid programming fundamentals and design concepts
  • In-depth understanding of different batch and stream processing technologies and NoSQL storage
  • Demonstrated work experience in a Sr. Developer/Jr. Architect role on the big data/cloud and open-source technology stack
  • Should be able to articulate and suggest the right use of the technology stack for different use cases, with reasoning
  • Understanding of Lambda and Kappa architectures
  • Should have participated in choosing, or be able to suggest, the right hardware, platform components, distributions, etc.

Good to Have

  • Programming concepts
  • Object oriented vs Functional programming concepts
  • Design patterns (Singleton, Immutable, Factory)
  • MapReduce programming concepts such as Combiner, Partitioner, InputFormat/OutputFormat, and serialization
  • Distributed Computing
  • Scale up vs Scale out
  • Hands-on Scala, Spark SQL, DataFrames, etc.
  • Understanding of different storage formats: Avro, RCFile, ORC, Parquet
  • Has worked or is working on any one of the cloud platforms: AWS, Azure, GCP
  • Has worked or is working on any one of the big data platforms such as Hortonworks, Cloudera, DataStax, Databricks
  • Aware of the latest technology trends in streaming, real-time, and batch processing frameworks (Storm, Apache Beam, Flink, Spark, Kafka Connect, etc.)
  • Certified in any of the big data distributions (Hortonworks/Cloudera/Databricks/DataStax)
Hiring Team
Pradeep Pasupuleti and other hiring team members at Hitachi in Hyderabad, Pune, Bengaluru

Pradeep Pasupuleti

Working for companies like Hitachi Consulting with 15+ years experience

