Job Title: Infrastructure Engineering Engineer 3
Location: Dearborn, MI (hybrid)
Pay range: $65-70/hour
**Note:** No C2C arrangements are possible for this role.
Position Description:
We are seeking a **Site Reliability Engineer (SRE)** to join our Infrastructure Engineering team. This role is critical in ensuring the reliability, scalability, and performance of our container-based microservices. As an SRE, you will collaborate with cross-functional teams to design, build, and maintain robust infrastructure solutions that support our business objectives.
Responsibilities
- Design, build, and maintain reliable container-based microservices.
- Develop and maintain CI/CD pipelines using tools such as Tekton or Cloud Build.
- Write and maintain CLI tools, APIs, and scripts to support infrastructure and application needs.
- Manage and optimize Kubernetes clusters and deployments.
- Work with Google Cloud Platform (GCP) to build scalable and secure infrastructure.
- Collaborate with development teams to ensure best practices for microservices architecture.
- Troubleshoot and resolve issues related to infrastructure, deployments, and performance.
- Advocate for and implement automation to improve system reliability and reduce manual intervention.
Required Qualifications
- **Experience Level**: Mid- or senior-level engineer.
- Proficiency in programming languages: **Java, Go, Python**.
- Hands-on experience with:
  - **Kubernetes**: Managing and scaling containerized applications.
  - **Google Cloud Platform (GCP)**: Infrastructure and services.
  - **GitHub Enterprise**: Source control and collaboration.
  - CI/CD tools such as **Tekton** or **Cloud Build**.
- Strong understanding of building and hosting reliable container-based microservices.
- Knowledge of best practices for microservices architecture, including monitoring, logging, and fault tolerance.
- Ability to write clean, maintainable, and efficient code for infrastructure automation.
Skills Required:
1. Kubernetes (OpenShift) Expectation:
- Deep expertise in deploying, managing, and troubleshooting OpenShift (Kubernetes) clusters on GCP, including both OpenShift Container Platform (OCP) and Google Kubernetes Engine (GKE) where relevant.
- Strong understanding of OpenShift-specific features (Operators, Routes, SCCs, OAuth, etc.) and Kubernetes primitives (Pods, Deployments, Services, ConfigMaps, Secrets, RBAC, CRDs).
- Experience with OpenShift GitOps (Argo CD), Helm, and Kustomize for application delivery.
- Ability to diagnose and resolve complex issues related to networking, storage, and cluster upgrades in OpenShift.
- Proven ability to implement security best practices (SCCs, NetworkPolicies, OpenShift OAuth, etc.) and automate cluster operations.
2. Docker Expectation:
- Proficient in building, optimizing, and securing Docker images for OpenShift and GCP workloads.
- Deep understanding of Dockerfile best practices, multi-stage builds, and image vulnerability scanning (using tools like Clair, Trivy, or OpenShift integrated scanning).
- Experience with Google Artifact Registry and OpenShift internal registries for image storage and lifecycle management.
- Ability to troubleshoot container runtime issues within OpenShift and GCP environments.
3. Python Expectation:
- Strong Python skills for automating GCP and OpenShift operations (using GCP SDK, OpenShift Python client, or custom operators).
- Experience with Python packaging, virtual environments, and dependency management.
- Ability to write robust, maintainable, and testable code for infrastructure automation, monitoring, and custom OpenShift controllers/operators.
- Familiarity with integrating Python scripts with GCP APIs, OpenShift REST APIs, and CI/CD pipelines.
4. Cloud Infrastructure (GCP) Expectation:
- Hands-on experience provisioning and managing GCP resources (GCE, GCS, VPC, IAM, GKE, Cloud SQL, etc.) using Infrastructure as Code (Terraform, Deployment Manager, etc.).
- Deep understanding of GCP security (IAM, service accounts, VPC Service Controls), cost optimization, and monitoring/logging (Stackdriver/Cloud Monitoring).
- Ability to automate the lifecycle of GCP infrastructure and integrate with OpenShift clusters.
5. Cloud Architecture (GCP + OpenShift) Expectation:
- Ability to design cloud-native, highly available, and resilient architectures on GCP using OpenShift as the application platform.
- Experience with hybrid and multi-cloud strategies, disaster recovery, and compliance in a GCP/OpenShift context.
- Strong understanding of architectural trade-offs, cost implications, and performance optimization for containerized workloads on GCP.
Skills Preferred:
- Experience with microservices observability tools (e.g., Prometheus, Grafana).
- Familiarity with service mesh technologies (e.g., Istio, Linkerd).
- Strong problem-solving skills and a proactive approach to identifying and addressing issues.
Experience Required:
- Engineer 3: Practical experience in two coding languages, or advanced practical experience in one language; 6+ years in IT; 4+ years in development.