Senior Observability Engineer

  • Location: McLean, Virginia
  • Type: Contract
  • Job #102443

Job Description: Observability Engineer (W2 Contract)

SRE background required, along with AWS, Python/Java, and expertise in observability tools such as Splunk, New Relic, and Observe (must have).

Working on journey mapping on DFS intent

Job Title: Observability Engineer (Contractor)
Focus: Full-Stack Observability, System Traceability, & Executive Health Scoring

Role Summary
We are seeking a hands-on Observability Specialist to accelerate the adoption of our Observe-based platform. The ideal candidate has an SRE mindset: the ability to explore how complex systems interact and to identify the exact data sets needed to provide a 360-degree view of the environment. You will bridge the gap between disparate Lines of Business (LOBs) to build end-to-end (E2E) traceability and unified “Health Indices” that reduce mean time to detect (MTTD) from hours to minutes.

Technical Skill Requirements
1. Core Observability & Tooling
Platform Expertise: Deep experience with modern observability platforms. While we use Observe, proficiency in New Relic, Splunk, or Databricks is required for rapid ramp-up.

Query & Data Fluency: Expert-level ability to write complex queries (SQL-based or proprietary like NRQL/SPL) to aggregate API success rates, latency, and crash-free session data.

Dashboard Architecture: Proven track record of building “Drill-Down” architectures—moving from high-level user journeys (e.g., Login) directly into microservice-level logs and traces.

2. The Modern Tech Stack
Infrastructure: Hands-on experience with AWS (ECS/Fargate/Lambda) and Docker.

Languages: Ability to navigate and instrument code in Python or Java.

Integrations: Familiarity with GraphQL for data fetching and Jenkins for CI/CD pipeline monitoring.

Instrumentation: Hands-on experience with OpenTelemetry (OTel), and familiarity with New Relic APM or Datadog APM.

3. SRE & Systems Architecture Mindset
Cross-Domain Traceability: Experience monitoring digital customer engagement across disparate system boundaries (e.g., Comms, Phone, and Backend APIs) to expose “silent failures.”

Telemetry Mapping: Ability to map technical metrics to business outcomes, specifically creating Unified Health Indices for Senior Leadership (SLT).

Root Cause Analysis (RCA): Skill in configuring alerts and correlations that enable instant pinpointing of failures within complex user flows.

Must work onsite in McLean, VA; Richmond, VA; or Chicago, IL on a hybrid schedule, three days per week.

