What You'll Do
Design and implement solutions that enhance application reliability, performance, scalability, and resilience.
Build and maintain monitoring, alerting, observability, and telemetry to drive proactive detection and incident analysis.
Lead incident management efforts, perform root cause analysis, and implement action-based improvements.
Implement operational workflows using scripting, Infrastructure as Code (IaC), and configuration management tools.
Manage capacity, performance, and scaling solutions to forecast demand and optimize infrastructure.
Collaborate with engineering teams to embed operability, resilience, and security into application architectures.
Build and automate reliable deployments through CI/CD pipelines, release governance, and version control systems.
Maintain clear runbooks, architecture diagrams, and operational documentation that enable efficient production support.
Experience Required
Managing Kubernetes and containerized workloads (EKS, AKS, GKE), including scaling, networking, upgrades, and monitoring.
Experience with cloud platforms (AWS, Azure, or Google Cloud Platform) across compute, storage, networking, IAM, and cost governance.
Using observability and APM tools such as Dynatrace, Splunk, Prometheus, Grafana, Datadog, Elastic/ELK.
Strengthening security and compliance controls in regulated environments (e.g., PCI DSS, SOC 2), including secure management of workloads.
Infrastructure automation experience using Terraform, CloudFormation, Ansible, or similar tools.
Designing and maintaining CI/CD pipelines using Jenkins, GitLab CI, GitHub Actions, or Azure DevOps.
Scripting and automation using Bash, PowerShell, or Python.
Background in the electricity sector, engineering, or the military (preferred).
Good to Have
Certifications such as AWS SysOps Administrator, AWS DevOps Engineer, Google Cloud DevOps Engineer, or CKA.
Experience with legacy applications, IBM iSeries, and/or library systems.
Hands-on database operations and performance tuning (Oracle, SQL Server, PostgreSQL).
Prior experience as a major incident commander, stakeholder communicator, or ops lead/coordinator.
Experience with ITIL and ServiceNow (change, incident, and configuration management).