
Senior Enterprise Monitoring Engineer

First Tech Federal Credit Union
paid time off, 401(k)
United States, California, San Jose
Jan 22, 2025
Description
The Senior Enterprise Monitoring Engineer is responsible for building enterprise monitors for large-scale business applications, infrastructure, and network systems (both in-house and cloud).
Here's what you can expect from the job and what you need to be successful:
Job Duties
  • Mature the Application Performance Monitoring (APM) program to include application and platform performance and resiliency, in addition to availability, through real-time performance monitoring, alerting, and telemetry for on-prem and cloud-hosted solutions (e.g., AWS or Azure)
  • Using DevOps tools, lead efforts to automate deployment of monitoring requirements
  • Optimize monitoring by considering tool consolidation and implementation of leading industry solutions, establishing telemetry requirements for alerts and self-healing solutions
  • Create and refine a knowledge base, leveraging a tool such as Confluence, to provide easy-to-consume, user-friendly documentation for interpreting alerts on system/application performance
  • In partnership with IT technical leads, establish verifiable performance benchmarks for core and system components
  • Partner with IT technical teams (such as network, infrastructure, and database) and vendors to build technical availability and/or performance monitoring solutions with customized performance dashboards that facilitate proactive response and/or a shorter mean time to restore service
  • Gather technical requirements, use cases and business KPIs, and translate them to tool specifications for availability and/or performance monitoring and necessary dashboards
  • Use existing tools and processes for monitoring and measuring service performance in production and sandbox environments
  • Collaborate with development and QA to agree on availability and performance benchmarks to ensure sign-off prior to production roll-out
  • Create, maintain, and update documentation for all availability and performance monitors.
  • Provide mentorship to team members to promote understanding of system monitors and alerts
Essential Skills
  • Proven experience in site reliability engineering, DevOps, and cloud-based telemetry (AWS/Azure), including experience with cloud-hosted solutions, primarily in AWS and/or Azure.
  • Minimum 4 years' experience creating and/or maintaining availability and performance monitors and dashboards (with a focus on digital products and single pane of glass), using tools such as Dynatrace, New Relic, SolarWinds, Splunk, and ExtraHop, as well as native tools provided by vendors such as HP, VMware, Microsoft, AWS, etc.
  • Experience with system capacity monitoring and forecasting applications as well as application & infrastructure monitoring for performance, availability, and scalability
  • Strong knowledge of Windows and Linux OS as well as virtualization platforms such as VMware and Microsoft Hyper-V in a hybrid multi-cloud architecture
  • Solid understanding of large-scale applications, network architectures, firewall topology, monitoring performance and fault management, and software design and development methodologies
  • Intermediate scripting proficiency, including regular expressions, Bash, PowerShell, and Python.
  • Knowledge of DevOps tools (e.g., Puppet, Ansible), containerization technologies (e.g., Kubernetes, Docker), and CI/CD tools.
  • Proven track record of automating processes and developing effective QA measures that include performance testing, capacity management, and reporting
  • Ability to create and deliver engaging presentations tailored to diverse audiences
  • Excellent analytical, time management, organizational and problem-solving skills with the ability to multi-task and work in a deadline-driven environment
  • Ability to foster strong relationships with internal and external stakeholders
  • Excellent verbal and written communication skills and the ability to engage with business partners to understand their requirements and express ideas concisely and logically
  • Working knowledge of Microsoft Office Suite
  • Minimum Education: Bachelor's degree in Information Technology or a related field is preferred
Location: Hillsboro, OR 97124 | San Jose, CA 95134 | Rocklin, CA 95765 (HYBRID)
Target Compensation in Hillsboro, OR: $110k - $125k annually + annual bonus
Target Compensation in Rocklin, CA 95765: $120k - $138k annually + annual bonus
Target Compensation in San Jose, CA 95134: $140k - $158k annually + annual bonus
Benefits options include:
  • Traditional medical, dental, and vision coverage
  • 401(k) matching up to 5% per pay period
  • Accrue up to 17 days of Paid Time Off in your first year of employment
  • 11 paid federal holidays
  • Special employee pricing on lending products such as mortgage, auto, and personal loans (eligibility for special employee pricing is subject to standard account requirements and underwriting criteria)
What makes First Tech different? Click here to learn more!
First Tech is not currently offering visa transfer/sponsorship for this position