Senior Big Data Engineer, ML Operations team



Software Engineering, Operations, Data Science
Kyiv city, Ukraine
Posted on Saturday, November 18, 2023

Our ML teams are part of our core Data Science and Machine Learning group and consist of Online and Offline ML teams. The Online ML (ML Platform) team is responsible for building and operating low-latency and data-intensive systems such as a feature store, feature extraction, ML model serving, and versioning systems. The Offline ML (ML Operations) team is responsible for the ML model release process, ML pipelines, model training, and validation.

The ML Operations team aims to deliver frequent, trusted, and stable model releases to Sift customers by continuously evolving our process and tooling. We serve customers across multiple verticals such as online commerce, delivery services, finance, and travel sites, and we have customers in both developed and developing countries. Our technology helps protect Internet users from ever-evolving online scams, payment fraud, abusive content, account takeover, and more. We are a forward-thinking team that constantly challenges itself and the status quo to push the boundaries of machine learning and data science across multiple product offerings at Sift, and we collaborate with product engineering teams to deliver tangible customer value.

Technical stack: Java, Apache Airflow, Apache Spark, Databricks, Dataproc, Snowflake, GCP (GKE, Bigtable).

Opportunities for you

  • Professional growth: quarterly Growth Cycles instead of performance reviews;

  • Experience: knowledge sharing through biweekly Tech Talk sessions. You will learn how to build projects that handle petabytes of data with low latency and high fault tolerance;

  • Business trips and the annual Sift Summit (in 2022, the Summit took place in California);

  • Remote work approach: you can choose where you work best.

What would make you a strong fit

  • You have a growth mindset and a strong interest in working with real-world machine-learning applications and technologies;

  • Excellent communication skills and collaborative work attitude;

  • 7+ years of professional experience in big data software development;

  • Experience building highly available low-latency systems using Java, Scala, or other object-oriented languages;

  • Experience working with large datasets and data processing technologies for both stream and batch processing, such as Apache Spark, Apache Beam, Flink, and MapReduce;

  • Knowledge of GCP or AWS cloud stack for web services and big data processing;

  • Basic knowledge of MLOps on model release/training/monitoring;

  • Conceptual knowledge of ML techniques;

  • B.S. in Computer Science (or related technical discipline), or related practical experience.

What you’ll do

  • Build massive data pipelines and batch jobs that are part of our model training process using the latest technologies such as Apache Spark, Flink, Airflow, and Beam;

  • Prototype and explore the latest machine learning and analytics technologies, such as model-serving frameworks or workflow orchestration engines;

  • Collaborate with machine learning engineers and data scientists and contribute to internal tooling, experimentation, and prototyping efforts;

  • Simplify, re-architect, and upgrade the existing pipeline technologies.

Benefits and Perks

  • A compensation package that consists of financial compensation, a biannual 5% bonus, and stock options;

  • Medical, dental, and vision coverage;

  • $50 for sports and wellness;

  • Education reimbursement: books, education courses, conferences;

  • Flexible time off: we follow an unlimited vacation approach;

  • Work schedule tuned to the Kyiv time zone despite our US office locations: biweekly demo sessions are optional for our team, and we watch them as recordings;

  • Mental Health Days: additional days off;

  • English courses and in-company social activities help you improve your public speaking and language skills.

Our interview process

  • 45-minute introduction call with a recruiter;

  • 60-minute technical screen with a medium-difficulty LeetCode-style problem-solving task;

  • Virtual onsite with the team, which takes approximately 3.5 hours (system design, coding, deep dive, and values-based interview).

During our sessions, you will have the opportunity to learn about the company culture, meet engineers from your team, and discuss distributed systems problems. You will have time to ask any questions you have and will get a clear picture of your future responsibilities and the project.

Let’s Build It Together

At Sift, we are intentionally building a diverse, equitable, and inclusive workplace. We believe that diversity drives innovation, equity is a fundamental right, and inclusion is a basic human need. We envision a place where all Sifties feel secure sharing their authentic selves and diverse experiences with their teams, their customers, and their community – ultimately using this empowerment and authenticity to build trust and create a safer Internet.

This document provides transparency around the way in which Sift handles personal data of job applicants: https://sift.com/recruitment-privacy