Job Description
We are seeking a Senior Site Reliability Engineer (SRE). As one of our SREs, you will be 100% hands-on with both infrastructure and software development. We support and implement multi-enclave hybrid-cloud software factories for the DoD that include Kubernetes (K8s) platforms, DevSecOps tools, IaC, cybersecurity tools, and custom software.
You will have the opportunity to work with technical leaders in Software Development, Cybersecurity, and Operations (DevSecOps) to develop the next generation of software factories. You will work both independently and collaboratively with your team to troubleshoot and resolve highly technical issues that customers encounter while using and deploying solutions. You'll partner cross-functionally with product and engineering teams to drive feedback, improve internal and external tooling, launch new products and features, and deliver an exceptional customer experience.
Responsibilities:
- Participate in a collaborative, multi-discipline Kanban team working closely with the customer to accelerate cloud initiatives and improve processes
- Work with the customer to design and build CI/CD pipelines. Develop and integrate toolchain systems to provide a path to production from the development Software Factory
- Enable Continuous Integration/Continuous Delivery through appropriate design guidelines.
- Maintain traceability between requirements, design, and test cases
- Work directly with Development and Operations teams to increase velocity, prioritize tasks, implement requirements, and automate.
- Lead SRE and DevSecOps work initiatives from inception to production
- Knowledge of architecture concepts including microservices, container orchestration, and traditional 3-tier applications
- Design and implement Kubernetes platforms and toolchains
- Implement infrastructure as code using tools such as Ansible Automation Platform and/or VMware vRealize Automation
- Develop and maintain code (Bash, Python, YAML, PowerShell, Ruby, Groovy)
- Experience with observability tools such as Log Insight, Elastic Stack, Splunk, QRadar, or Prisma Cloud
- Design and implement enterprise on-premises and hybrid cloud deployments
- Lead efforts using Agile methodologies
- Ability to work both independently and in a team environment with clients and vendors, demonstrated technical leadership skills, good verbal and written communication skills
- Provide expertise in integrating and administering Kubernetes (K8s) platforms (Tanzu, OpenShift, Konvoy), Elastic, Istio, GitLab, and other DevSecOps products.
- Provide expertise in system integration and development in an agile environment
- Troubleshoot and resolve technical support requests created by our customers spanning a growing range of container products and services, including Managed Kubernetes and Container Registries
- Contribute to internal documentation that gives your team the resources they need to perform in their roles and to external documentation that allows our customers to self-serve
- Engage customers and respond to technical questions received through our community Q&A forum
- Represent the voice of support, speaking on behalf of our customers through direct engagement with our product and engineering teams
- Assist customers on-site during release deployment and with periodic system/application patching
Basic Qualifications:
- 10+ years' experience working with customers to identify their business and technical requirements and to design and/or implement optimal technical solutions for them
- Advanced knowledge and experience with container technologies using Docker/OpenShift/Tanzu and Kubernetes (design, build, configuration)
- Experience analyzing and troubleshooting container performance
- A deep understanding of container tools within the CI/CD pipeline (GitLab, Twistlock, Anchore, etc.)
- Ability to establish a continuous integration (CI) pipeline that fully automates the deployment of containers
- Ability to write sustainable scripts using a language such as Python, Perl, Java, YAML or PowerShell
- Experience with automation, preferably Red Hat Ansible
- Understanding of operating systems, application security configurations, and best practices in Windows and Linux/UNIX environments is required
- Working experience with JIRA
- Knowledge of Agile and iterative development methodologies
- Strong problem-solving skills to assist in issue resolution
- Strong organizational and time management skills
- Experience recommending and implementing systems solutions
- Ability to work in a team environment as well as autonomously
- Ability to multitask for various components of complex projects
Desired Skills:
- Strong experience with the design, implementation, and support of enterprise automation
- Strong experience in creating Senior Level / General Level slide decks to visualize modernization efforts
- Experience troubleshooting basic and advanced Kubernetes issues ranging from pods and deployments to the control plane
- Knowledge of kubectl and community projects such as Helm, Istio, Linkerd, Kuma, Kong, Prometheus, NGINX Ingress Controller, and similar software and utilities used to manage deployments
- Certified Kubernetes Administrator (CKA)
- Experience with Atlassian, VMware, Red Hat, GitLab, Elastic, Rancher, and RKE2
Clearance Level:
- TS/SCI clearance required.
Education Level:
- Minimum Bachelor's Degree in Computer Engineering, Electrical Engineering, or a related Engineering Degree.
- Minimum of 10 years' professional experience in a technical engineering position involving infrastructure design technologies, data management and interchange, system design and/or development for complex applications
- Must obtain/maintain a DoD 8570 IAT Level II certification (Security+, CCNA Security, CySA+, GICSP, GSEC, CND, SSCP) within 120 days of hire.
Job Tags
Work experience placement, Shift work