About the Role
We are seeking a seasoned Senior PostgreSQL DBA to design, operate, and evolve highly available, scalable database platforms powering a high-volume, customer-facing global travel technology ecosystem. This is a mission-critical role at the heart of production operations — ideal for a hands-on engineer who thrives under pressure, owns problems end-to-end, and brings deep PostgreSQL and Patroni expertise to the table.
If you’ve built HA clusters, led major incident response, and driven automation at scale, this role was built for you.
Key Responsibilities
PostgreSQL Architecture & Design
- Design and maintain scalable, fault-tolerant PostgreSQL architectures aligned with business growth strategy
- Define standards for cluster design, configuration, versioning, and lifecycle management
- Architect and operate high-availability environments using Patroni with distributed consensus systems (etcd / Consul)
- Lead PostgreSQL upgrades, migrations, and platform modernisation initiatives with minimal downtime
- Evaluate and implement PostgreSQL extensions and best practices for large-scale production workloads
Operations & Production Support
- Own day-to-day operational health of PostgreSQL platforms across production and non-production environments
- Lead major incident response — failovers, performance degradation, replication issues, and data consistency problems
- Perform deep Root Cause Analysis (RCA) and drive long-term corrective and preventive actions
- Tune PostgreSQL for performance and stability across:
  - Memory and storage optimisation
  - WAL and checkpoint tuning
  - Autovacuum and bloat management
  - Query optimisation and execution plan analysis
- Manage and validate backup, restore, and point-in-time recovery strategies (pgBackRest, Barman, or equivalent)
- Serve as a senior escalation point during on-call rotations and high-severity production events
Patroni & High Availability
- Design, configure, and operate Patroni-based PostgreSQL clusters in production
- Troubleshoot complex HA scenarios including:
  - Leader election failures
  - Split-brain conditions
  - Network partitions and fencing issues
- Define and tune HA behaviour — failover, switchover, and synchronous replication settings
- Establish safe maintenance, patching, and upgrade procedures for HA environments
Automation, Monitoring & Reliability
- Drive automation for PostgreSQL provisioning, configuration, patching, and deployments
- Implement proactive monitoring and alerting for performance, replication, capacity, and availability
- Partner with SRE and infrastructure teams to improve platform reliability and operational tooling
- Develop and maintain operational documentation, runbooks, and SOPs
Infrastructure & Tooling
- Apply strong Linux fundamentals across production database environments
- Operate PostgreSQL in cloud or hybrid environments (AWS / Azure / GCP)
- Work with monitoring tools such as Prometheus, Grafana, Datadog, or equivalent
- Script in Bash, Python, or similar
Requirements
Must-Have
- Proven, hands-on PostgreSQL DBA experience in large-scale production environments
- Deep expertise in Patroni-based HA cluster design and operations
- Strong understanding of PostgreSQL internals — WAL, autovacuum, query planning, replication
- Experience with backup and recovery tools: pgBackRest or Barman
- Solid Linux systems knowledge and cloud platform exposure (AWS / Azure / GCP)
- Scripting skills in Bash, Python, or equivalent
Good to Have
- MySQL experience in enterprise environments — architecture, performance tuning, replication, and availability
- Experience supporting PostgreSQL and MySQL coexistence, or PostgreSQL-to-MySQL migration strategies
- Familiarity with observability stacks: Prometheus, Grafana, Datadog