Data Engineer
About Habu
Habu is a startup redefining how companies do data-driven marketing in a privacy-first era. Launched by the team that wrote the book on data-driven marketing and built the market-leading DMP that was acquired by Salesforce, Habu is designed for the 2020s. We future-proofed our system with identity-strong, privacy-by-design foundations and have built software applications for privacy-safe data sharing, analytics, and measurement. In a complex world of distributed systems, our modular technology can deliver intelligence from data wherever it lives.
We firmly believe in the value of company culture, which holds true to our core principles of grit, innovation, and collaboration. These values ring true in the people, products, and passion on display each and every day. We understand what makes experiences transformative in our company and with our clients, and we know Habu can help.
Role Description
We are looking for a highly skilled data engineer who is comfortable wrangling petabytes of data. You will be responsible for the architecture, design, and development of data pipelines, and for optimizing and scaling data systems or building them from the ground up.
Our Tech Stack
Golang / gRPC / Scala / Java / Python
Spark / Kafka / Airflow
Postgres / ScyllaDB / Snowflake
AWS / GCP / Azure
Docker / Kubernetes / Terraform
Desired Skills
Proficiency with big data tools such as Spark, Kafka, Hadoop, and Airflow
Extensive experience building and optimizing big data pipelines, architectures, and datasets
Extensive experience with back-end programming languages such as Scala, Java, and Golang
Experience with cloud APIs and frameworks
Strong analytical skills for working with structured and unstructured datasets
Experience working with AWS, GCP, or Azure
Proficiency with relational databases, SQL, and NoSQL technologies
Knowledge of version control tools such as Git
Experience working in an Agile environment
Responsibilities
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
Build analytics tools that utilize the data pipeline to provide actionable data insights
Build data systems that interact with microservice architecture
Understand and implement data security, quality, and protection standards
Maintain quality and ensure the responsiveness of data-intensive applications
Collaborate with the rest of the engineering team to design and launch new features
Requirements
3+ years of professional experience with the following:
Spark/Kafka/Hadoop/Airflow
Scala/Java/Python; Golang/gRPC a huge plus
SQL and/or NoSQL databases
Continuous Integration/Continuous Deployment (CI/CD)
Strong communication/presentation skills
Nice to Have
Familiarity with the project lifecycle and Agile/Scrum/Kanban methodologies
Bachelor’s degree or higher in Computer Science
Experience in the early to mid stages of a fast-growing company
Benefits
Open PTO and remote work policies
Excellent medical, dental, and vision benefits for you and your family
Life and long-term disability insurance
Flexible Spending Accounts and 401(k)