IT Operations Engineer

Chicago, IL | IT
Job ID: 101915
Listed on 11/9/2020

KellyMitchell matches the best IT and business talent with premier organizations nationwide. Our clients, ranging from Fortune 500 corporations to rapidly growing high-tech companies, are exceptionally served by our 1500+ IT and business consultants. Our industry is growing rapidly, and now is a great time to launch your career with the KellyMitchell team.

Job Summary 

The IT Operations Engineer is responsible for implementing and maintaining monitors, sensors, and scripts for applications, system software, and infrastructure. The role covers enterprise eCommerce and application health monitoring as well as automation tasks. The engineer is the product owner and technical SME for the Dynatrace product. The position supports the IT Monitoring and Event Management Operations Manager in product lifecycle, roadmap, documentation, deployments (e.g., upgrades), incident remediation, product socialization, regular status updates, and tool reporting metrics.

Duties

  • Administer the Dynatrace product
  • Respond to, track and resolve Service Requests and incidents
  • Create custom extensions with PowerShell scripts to automate tasks
  • Collaborate with Digital DevOps team on ongoing monitoring initiatives
  • Inform team of new or updated service standards, systems, procedures, forms and manuals through staff meetings and verbal and written communications
  • Quantify and measure performance of current service against previous periods, benchmarks, and KPIs
  • Add tasks and update sprint Jira stories to plan the content of upcoming sprints
  • Identify areas for improvement and best practices
  • Document processes and maintain the operational knowledge base
  • Implement needed CIs, events, and logic (including thresholds and alerts) following an agile sprint-based methodology
  • Implement needed sensors and instrumentation for systems, services, and applications following a hybrid agile sprint-based methodology
  • Complete needed Change Requests to obtain needed change approval and provide visibility to relevant stakeholders
  • Perform work as planned and produce the required artifacts to document successful completion
  • Evaluate current partners' monitoring performance
  • On a bi-weekly and monthly basis, compile relevant data on the performance of the Monitoring and Event Management service, notable achievements, opportunities for improvement, etc.
  • Highlight what is important to know rather than presenting only pages of charts and numbers
  • Include proposed remediation for identified issues

Ideation Phase 

  • Perform Monitoring and Event Management planning activities as SME
  • Build and prepare prod and non-prod environments based on project needs, including data integrity and integrations
  • Collaborate with enterprise architect to determine and document product lifecycle, roadmap and feature enhancements
  • Evaluate the existing monitoring toolset and recommend potential new tools (PRTG, Dynatrace, JAMs, Application Insights, Canary, Tango, etc.)
  • Regularly work with the IT Operations and IT Monitoring and Event manager to plan and document key goals, objectives, and initiatives for the Monitoring and Event Management function
  • Support manager in coordination efforts within the team and other teams (both internal and external)
  • Document, update and publish environment inventory, availability and schedule

Project Delivery Phase

  • Participate in design reviews with product owners and architecture to ensure monitoring considerations are understood
  • Create a monitoring strategy plan along with a sensor matrix and alerting thresholds
  • Implement new sensors in alignment with Event Management implementation plan and requirements
  • Write scripts to monitor application health; integrate these programs with the enterprise monitoring system
  • Update and configure existing sensors based on updated requirements or needs
  • Integrate and configure monitoring data from various systems and tools (e.g., Dynatrace, PRTG) into ServiceNow
  • Define and configure alert groups and dashboards in ITOM
  • Configure ServiceNow to open and assign tickets based on detected conditions
  • Identify resolution steps that can be automated within toolset
  • Identify 24x7 and off-hours monitoring opportunities
  • Ensure non-production environments are available in a timely manner (including data integrity and integrations)
  • Manage prod and non-prod environments, including overseeing the deletion or update of outdated environments and ensuring data is updated proactively
  • Support coordination efforts with other project teams and vendors
  • Create and document any required change management documentation to support the team prior to deployments

Pilot / Rollout / Post-Production Phases 

  • Continually evaluate the capabilities and effectiveness of existing and potential enterprise-wide monitoring tools
  • Conduct technical tool evaluation
  • Update existing monitoring scripts and configurations
  • Generate raw data reports from systems for analysts to review
  • Provide Level 3 operational support for third-party applications
  • Continually look for automation and improvement opportunities in prod and non-prod environment management
  • Provide reporting and analysis on service, usage, and availability
  • Attend Change Management meetings, document requests/updates/feedback and inform the team accordingly
  • Document and publish required documentation for service operations
  • Analyze data and generate reports on the performance of Monitoring and Event Management functions

Desired Skills/Experience

  • Bachelor’s Degree in Computer Science or related field
  • 3-5 years of professional experience
  • Familiarity with event management concepts and tools such as PRTG, Dynatrace, Nagios, Application Insights or Tango
  • Experience analyzing, supporting, and troubleshooting enterprise applications in a Level 3 function
  • Excellent written and verbal communication skills; comfortable communicating with all levels of the organization
  • Candidate should operate autonomously, requiring little direction from the IT Monitoring and Event Management Operations Manager on daily tasks
  • Candidate should have familiarity with ServiceNow ITSM and ITOM
  • Candidate should have familiarity with Windows 10 and Office 365
  • Demonstrate a high level of accuracy and attention to detail
  • Knowledge of ITIL practices
