Site Reliability Engineer

5 - 9 yrs exp
About opportunity

MoEngage Inc. is a leading Marketing Technology Stack provider that helps brands redefine their customer engagement in the mobile era. Brands use MoEngage to drive long-term, personalized and context-based engagement across channels, increasing both customer retention and customer LTV. Sitting at the confluence of diverse technologies like Artificial Intelligence, Big Data, and Web & Mobile platforms, MoEngage technology analyzes billions of data points generated by customers and their devices to predict customer behavior and build marketing campaigns that proactively engage users. In just three years since inception, MoEngage has come to work with leading brands across e-commerce, entertainment, travel, publishing and banking, among other domains. With a global presence spanning 35 countries, MoEngage has offices in San Francisco, Berlin, Jakarta, and Bengaluru. We are a young, fast-paced workplace that fosters a culture of innovation, ownership, freedom, and fun while building tech products of the future. Our teams are made up of self-driven, passionate, smart individuals from top-tier institutes who are young achievers.

Minimum qualifications:

  • Master’s programme in Computer Science or a related technical field.
  • Experience in one or more of the following: Java, Python, Go, Perl, Ruby, shell scripting.
  • Hands-on experience working on AWS or GCP (at least 2 years).
  • Experience with Unix/Linux operating system internals and administration (e.g. filesystems, inodes, system calls, etc.) or networking (e.g. TCP/IP, routing, network topologies and hardware, SDN, etc.).
  • Expertise in analyzing and troubleshooting large-scale distributed systems.
  • Experience maintaining services once they are live by measuring and monitoring availability, latency and overall system health.

Good to have skills:

  • Knowledge of distributed systems such as Kafka, YARN, Elasticsearch, etc.
  • Experience with at least one modern APM tool such as New Relic, Datadog, or Icinga.
  • Experience with at least one logging stack such as Elasticsearch or Graylog.
  • Implementation experience with metrics collectors such as Graphite or Prometheus for distributed systems.
  • Experience in containerized application deployments (using k8s/Docker), monitoring, and tracing/logging on cloud platforms.
  • Proven track record in optimising and monitoring server performance and infrastructure cost.
  • Experience operating and architecting multi-tier distributed systems involving load balancers, caching layers and real-time event processing.
  • Experience with life cycle management of any NoSQL DB cluster.
  • Work experience of at least 4 years.


  • Work at scale and challenge yourself. Free to choose your own tech gear. Work with a smart team that grew up in the mobile-first world.
  • Free lunch, evening snacks, caffeine all day.