




Summary: We are seeking a hybrid Support Engineer to operate at the intersection of software development and infrastructure operations, focusing on reliability, observability, and critical live operations.

Highlights:

1. Hybrid role: software development and infrastructure operations support
2. Focus on reliability, observability, and mission-critical scenarios
3. 60% coding for tools and automation, 40% operational support

### **Description**

We are looking for a hybrid Support Engineer who operates at the intersection of software development and infrastructure operations support. This isn't a traditional ticket-closing role; it is an engineering role focused on reliability, observability, and the physical reality of running code in mission-critical scenarios like manufacturing, logistics, and data centers.

We view support as a software problem. You will spend roughly 60% of your time writing code, building internal tools, automating remediation, and fixing platform bugs, and 40% on high-level operational support, incident response, writing troubleshooting guides, and NOC-style monitoring.

You will act as the bridge between the hardware layer and the application layer. You are the person who understands why a server is overheating *and* why the API is throwing 500 errors, and you have the coding skills to fix it. If you are a developer who loves the adrenaline of live operations, or a NOC engineer tired of manual processes, this role is for you.

### **What You’ll Do**

* **Engineer Reliability Tools:** Design and develop automation scripts and internal tools (Python, Go, or Bash) to detect issues before they impact users and to automate repetitive operational tasks.
* **Incident Response & Troubleshooting:** Lead deep-dive investigations into complex production incidents, spanning from bare metal and network switches to application logic and database performance.
* **Mission Critical & NOC Operations:** Monitor the health of distributed infrastructure, coordinate with on-site hands for hardware replacements, and manage escalation policies.
* **Code-Level Support:** Unlike L1 support, you will dive into the codebase to identify root causes of bugs, submit patches, and improve system resilience.
* **Observability:** Build and refine monitoring dashboards (Grafana, Datadog, or similar) and alert rules to separate signal from noise, ensuring the team sleeps through the night unless it’s critical.
* **Environment Management:** Maintain and troubleshoot Linux-based environments, ensuring configuration consistency across development, staging, and production.
* **Feedback Loop:** Act as the voice of reliability within the pod, feeding production insights back to the core engineering team to influence architectural decisions.

### **What We’re Looking For**

* **Hybrid Experience:** 3+ years of professional experience, ideally split between software development and NOC-style operations or Site Reliability Engineering (SRE).
* **Developer Mindset:** Proficiency in at least one scripting or systems language (Python, Go, Ruby, or Bash). You should be comfortable reading other people’s code and writing your own automation.
* **Linux Deep Dive:** Strong Linux systems administration skills. You know your way around the kernel, file systems, systemd, and memory management.
* **Networking Knowledge:** Solid understanding of L2/L3 networking, TCP/IP, DNS, VPNs, and troubleshooting connectivity in a data center environment.
* **Operational Scars:** Experience with monitoring tools (Prometheus, Nagios, ELK Stack) and incident management. You know how to stay calm when production is down.
* **Hardware Awareness:** Familiarity with bare-metal provisioning, server hardware (RAID controllers, BIOS/IPMI), and the physical constraints of data centers.
* **Communication:** Fluent English is non-negotiable.
You must be able to communicate complex technical incidents clearly to US-based stakeholders.

**Nice to Have**

* Experience with container orchestration (Kubernetes/Docker).
* Background in virtualization (KVM, VMware, or Proxmox).
* Experience with Infrastructure as Code (Terraform, Ansible).

### **Benefits**

* Fun team activities every month to build strong bonds.
* We're located in a great area, with plenty to enjoy nearby (El Golf, Las Condes).
* Snacks and drinks are always stocked in the office.
* Weekly team lunches to recharge and connect.
* You’ll work closely with high-caliber US-based companies.
* Comprehensive health and dental insurance to keep you covered.
* Wellness activities reimbursement.

### **About Andes Path**

Andes Path is a Chilean–US engineering and design studio building for the frontier. We work with cutting-edge US startups and F500 innovation teams tackling real-world problems in robotics, AI, simulation, and infrastructure. Our clients are fast-moving, technically ambitious, and building things that haven’t quite been built before.

We operate like product teams, not outsourcing shops, with high-trust relationships, sharp execution, and a deep respect for craft. Our team is small, low-ego, and serious about building exceptional products.

While we have a core office in Santiago, we’re remote-friendly across the Americas. Our culture is fast-paced, collaborative, and grounded in mentorship, clarity, and ownership.


