Job expired
This job has now expired and is not accepting new applications.
Site Reliability Engineer (DevOps/Kubernetes) in City of London, London
Location
City of London, London
Salary
£500 - £750 per day + Inside IR35 (negotiable rate)
Contract
Contract
Site Reliability Engineer
6 Months
Full time on site in Central London
Negotiable day rate (Inside IR35)
My client, a well-known financial services firm, is looking for a Site Reliability Engineer to join their fast-paced team on an initial 6-month contract.
The Site Reliability Engineering team is responsible for delivering continuous improvement, automation and self-service offerings to operational teams across Bank EMEA and Securities International.
On-the-job details:
* Responsible for the reliability and efficiency of infrastructure through the delivery of common, repeatable tools and processes that greatly reduce the amount of toil operations must perform
* Member of L3 Engineering team providing subject matter expertise and ultimate escalation
* Develop software that makes infrastructure services self-managing, and build self-service dashboards
* Deliver continuous service improvement by developing Infrastructure as Code
* Eliminate manual, repetitive, automatable, tactical tasks that are devoid of value
* Improve system performance, make effective use of resources, distribute load and reduce latency
* Identify SLOs (Service Level Objectives) to meet availability and latency objectives
* Develop proactive monitoring solutions that alert on symptoms and not just on outages
* Perform detailed root cause analyses (RCAs) on incidents and outages to prevent recurrence
* Partner with development teams to improve services via rigorous testing and release procedures
* Identify technical debt and partner with application teams to build remediation plans
* Develop standard operational procedures and produce effective documentation
* Analyse workloads and devise suitable cloud migration strategies where appropriate
* Ensure all project / investment workloads are delivered according to the defined plans and budget
* Liaise with Infrastructure Control and IT Risk teams to satisfy internal and external audit requests
* Deputise for the team lead when required and act up accordingly
* Identify cost saving and optimisation opportunities across the group
* Build strong working relationships across the organisation
* Adhere to the core values of the bank
Essential skills required:
* Exceptional skills in Docker/Kubernetes deployment and configuration, scaling and management of containerized applications.
* Excellent skills in managing and performance-tuning a complex Prometheus, InfluxDB and Grafana monitoring stack.
* Excellent skills in writing/maintaining Grafana dashboards using PromQL, InfluxQL/Flux.
* Experience in distributed technologies like Rook, Ceph, NooBaa, Trino, MariaDB Xpand, Dremio, Kibana, KX platform
* Experience in CI/CD/CT platforms like Git, Ansible, Terraform and TeamCity
* Serena Deployment Automation (SDA) and Jenkins
* "Infrastructure as Code" Principles and practices.
* "Continuous Integration (CI) and Continuous Delivery (CD)" Principles and practices
* Agile, Site Reliability Engineering (SRE) and DevOps Principles and practices
* Scripting and programming languages such as PowerShell, Python, Bash and C#
* Fluent in Backup and Recovery processes and procedures
* Advanced knowledge of Clustering, High-Availability, Replication and Disaster Recovery techniques
* Ability to tune Network, Storage, Server and Virtualisation layers for optimal performance and reliability
* Excellent Performance Tuning skills, in-depth knowledge of system internals
* Ability to interpret and implement CIS security hardening recommendations in a controlled manner
* Acute awareness of Security and Auditing requirements in a regulated environment
Highly Desirable skills (nice to have):
* RHEL, Oracle Linux, Oracle Solaris and related technologies
* Microsoft Windows Server and related technologies
* Microsoft SQL Server, Oracle, Sybase ASE, MongoDB and Snowflake
* Active Directory, LDAP and Kerberos
* IBM Tivoli / Netcool
* Nutanix HCI and VMware ESX
* Networking Protocols (TCP/IP, DNS, DHCP, VLANs)
* Cloud computing - IaaS, PaaS and SaaS offerings across Azure, AWS, GCP and Oracle
* Knowledge of data security governance and regulations such as GDPR and SOX
Disclaimer:
This vacancy is being advertised by either Advanced Resource Managers Limited, Advanced Resource Managers IT Limited or Advanced Resource Managers Engineering Limited ("ARM"). ARM is a specialist talent acquisition and management consultancy. We provide technical contingency recruitment and a portfolio of more complex resource solutions. Our specialist recruitment divisions cover the entire technical arena, including some of the most economically and strategically important industries in the UK and the world today. We will never send your CV without your permission. Where the role is marked as Outside IR35 in the advertisement this is subject to receipt of a final Status Determination Statement from the end Client and may be subject to change.