About Kraken
Kraken is changing the world. Join the revolution!
Our mission is to accelerate the adoption of cryptocurrency so that you and the rest of the world can achieve financial freedom and inclusion. Founded in 2011 and with over 4 million clients, Kraken is one of the world’s largest, most successful bitcoin exchanges, and we are growing faster than ever. Our range of successful products is playing an important role in the mainstream adoption of crypto assets. We attract people who constantly push themselves to think differently and chart exciting new paths in a rapidly growing industry. Kraken is a diverse group of dreamers and doers who see value in being radically transparent.
In our first decade, Kraken has risen to become one of the best and most respected crypto exchanges in the world. We are changing the way the world thinks about money and finance. The crypto industry is experiencing unprecedented growth, and Kraken is leading the charge. We’ve grown from 70 Krakenites in January 2017 to over 1,800 today, and we have no intention of slowing down.
This role is fully remote.
About the role:
As a Site Reliability Engineer in Big Data, you will work within a team of world-class engineers to establish and maintain infrastructure that is critical in enabling Kraken to make data-driven decisions. You’ll be responsible for helping keep our data platform online and operating at full efficiency. The data platform processes hundreds of thousands of records per second and must provide stable and rapid access for all of our internal users and systems. You’ll also have the opportunity to leverage your expertise and help implement best practices for operating data infrastructure in Kubernetes and AWS.
* Monitor and support data infrastructure in UAT and production environments
* Manage infrastructure releases using Kubernetes
* Collaborate with data engineers and data software engineers to improve infrastructure stability, monitoring, and alerting
* Participate in support rotations to help respond to infrastructure issues
Requirements:
* 3+ years in a DevOps role (SRE, DataOps, DevOps, etc.)
* Solid understanding of Infrastructure as Code, Linux, Docker and Kubernetes
* Experience with monitoring tools such as Prometheus and Grafana
* Experience using Git as a version control system
* Previous experience operating one or more of the following tools: Debezium, MirrorMaker, Kafka, Druid, Superset, or Airflow
* Strong understanding of security best practices
* Ability to work autonomously with little supervision
Nice to have:
* Understanding of Terraform
* Experience with Helm and Helm chart customization
* Experience with Go or Python programming languages
* Experience managing EMR or maintaining hosted Jupyter/Zeppelin environments
* Knowledge of AWS best practices
* Understanding of alerting and monitoring best practices with Prometheus and Grafana
* Experience with Slack, JIRA, or Gitlab APIs
* Passion for crypto
This role will help the Big Data team stabilize its infrastructure so it can scale with the growing demand on our existing tools such as Superset and Airflow. It will also help stabilize our data pipelines to ensure tools like Superset and Zeppelin can provide accurate data in a timely manner.
Job Types: Full-time, Permanent