Senior Software Engineer, Distributed Transactional Database
Company: Airbnb
Job description:
The Community You Will Join:
The Transactional Storage Services team sits within Airbnb's Online Data organization, which owns all of Airbnb's online serving stores and databases. The group is responsible for designing, building, and operating a new source-of-truth, open-source NewSQL database running on top of stateful Kubernetes. This database hosts all of Airbnb's critical user, listing, and financial data, with all the essential DB capabilities such as backup and restore, CDC, and multi-tenancy. This stack will also serve as the unified storage backend for Airbnb online data such as MySQL, KVStore, GraphDB, etc. With users around the world, reliability, scalability, efficiency, availability, security, and platform evolution are our team's core concerns. As a member of this team, you will work with talented engineers on a modern distributed database system. Building an entire online data ecosystem around a NewSQL database gives distributed systems and database technologists a front-row seat to how most companies will build their data systems in the future.
The Difference You Will Make:
We're looking to add Senior engineers who are hands-on and capable of solving broad and deep technical challenges in the following areas:
- Control Plane and Operations
- Design frameworks and maintain the general ecosystem around our NewSQL database's monitoring, permissions, service discovery integration, etc.
- Design and automate critical database operations such as a centralized, hierarchical config management system, fully automated image building and release certification for major version upgrades, and zero-downtime Blue/Green deployment.
- Be part of the team that defines and delivers a generalized database platform for the partner KVStore, ORM, and MySQL teams.
- Migration and Adoption
- Deliver a zero-downtime forward and reverse replication pipeline with near-real-time consistency between two transactional databases, with correctness guarantees across transactional boundaries. Deliver a robust failover/failback mechanism to guarantee correctness and continuity during unexpected outages.
- Backup & Restore
- Conduct a case study of all of Airbnb's disaster recovery scenarios; leverage existing open-source software and/or design and implement software that satisfies Airbnb's requirements for database backup and restore, cross-region data resiliency, point-in-time recovery (PiTR), etc.
- Design the right cluster topology, restore logic, and ransomware policy to safeguard Airbnb's business continuity.
- 5+ years of relevant industry experience.
- Solid understanding of distributed systems and infrastructure fundamentals.
- Experience in deep diving and then owning a complex code base.
- Knack for writing clean, readable, testable, maintainable code.
- Ability to decompose large-scale distributed systems, identify monitoring metrics and failure scenarios, and debug them efficiently.
- Strong collaboration and communication skills in a remote-working environment.
- Expertise with a public cloud provider (AWS, GCP, Azure) and its storage, VM, networking, and security offerings, e.g. external-dns, Route 53, EBS, etc.
- Experience in Java, Go, Rust, or C++.
- Experience with writing robust automation frameworks and tooling.
- Experience with Kubernetes, the operator pattern, Helm, etc.; experience with Infrastructure as Code tools such as Chef and Terraform.
Location: USA