Responsibilities:

**Infrastructure Architecture & Scalability:**
* Lead the design, implementation, and management of highly available, fault-tolerant, and **auto-scalable infrastructure** on public cloud platforms including **AWS, Google Cloud Platform (GCP), and Azure**.
* Develop and implement advanced strategies for **scaling solutions** (e.g., microservices architecture, serverless) to handle increasing traffic, data loads, and global distribution efficiently.
* Serve as a subject matter expert for container orchestration platforms, primarily **Kubernetes**, including cluster design, advanced deployment strategies (e.g., Helm, Kustomize), service mesh integration, and complex troubleshooting.
* Champion **Docker** containerization best practices, image optimization, and artifact management in enterprise registries.
* Evaluate and integrate new cloud services and technologies to enhance infrastructure capabilities.

**Automation & CI/CD Excellence:**
* Drive end-to-end **automation** across the entire software development and operations lifecycle, from infrastructure provisioning to application deployment, testing, and monitoring.
* Design, implement, and maintain robust, high-performance **CI/CD pipelines** (e.g., Jenkins, GitLab CI/CD, GitHub Actions, Azure DevOps) that support rapid, reliable, and secure software releases at scale.
* Implement and enforce Infrastructure as Code (IaC) principles using advanced configurations with tools like Terraform, CloudFormation, or Ansible for consistent, repeatable, and auditable infrastructure provisioning.
* Develop custom automation scripts and tools to streamline complex operational workflows and reduce manual intervention.

**Networking, Security & Compliance:**
* Design, implement, and manage secure and optimized network architectures, including advanced configurations of **VPNs, VPCs, Subnets**, routing, and peering across multi-cloud environments.
* Implement and enforce stringent **secure infrastructure** practices, including advanced identity and access management (IAM) policies, network security groups, firewalls, Web Application Firewalls (WAFs), and robust security best practices for cloud environments.
* Conduct regular security audits, vulnerability assessments, and penetration testing remediation.
* Ensure infrastructure and processes adhere to relevant industry compliance standards (e.g., SOC2, ISO 27001, GDPR, HIPAA) as required.
* Manage and automate SSL/TLS certificate lifecycles and complex DNS configurations.

**Observability (Monitoring, Logging & Alerting):**
* Design and implement comprehensive observability solutions utilizing the **ELK Stack (Elasticsearch, Logstash, Kibana)** for centralized logging, data ingestion, analysis, and visualization.
* Set up robust monitoring and alerting systems (e.g., Prometheus, Grafana, CloudWatch, Stackdriver) to proactively identify, diagnose, and resolve issues across applications and infrastructure.
* Develop custom dashboards, metrics, and sophisticated alert thresholds to track system performance, application health, and user experience, enabling data-driven operational decisions.
* Implement distributed tracing and anomaly detection for deeper insights.

**Version Control & Collaboration:**
* Lead and enforce best practices for **GitHub versioning** and advanced branching strategies (e.g., GitFlow, Trunk-Based Development), ensuring code integrity and efficient team collaboration.
* Foster a strong DevOps culture, promoting seamless collaboration, shared ownership, and communication between development, QA, and operations teams.
* Conduct technical reviews of infrastructure code and deployment strategies.

**Troubleshooting & Performance Optimization:**
* Serve as a primary escalation point for complex infrastructure and application issues, performing deep-dive root cause analysis and implementing robust preventative measures.
* Proactively identify and resolve performance bottlenecks across the stack (application, database, network, infrastructure).
* Optimize resource utilization and implement advanced cloud cost management strategies to ensure efficiency without compromising performance or reliability.

**Documentation & Knowledge Sharing:**
* Create and maintain comprehensive, high-quality documentation for all infrastructure, deployment processes, operational runbooks, and architectural decisions.
* Lead knowledge-sharing sessions and mentor junior DevOps engineers, elevating the team's overall technical capabilities.

Qualifications:

* **5+ years of hands-on, demonstrable experience as a Senior DevOps Engineer** or in a similar lead role, with significant contributions to large-scale, production-grade systems.
* **Proven expertise in designing, implementing, and optimizing highly scalable, resilient, and secure solutions.**
* **Expert-level proficiency with Kubernetes** for container orchestration, including advanced deployment patterns, cluster management, and troubleshooting.
* **Deep expertise in Docker** for containerization, image optimization, and managing container lifecycles.
* **Extensive, hands-on experience with at least two major cloud providers (AWS, GCP, Azure)**, including in-depth knowledge of their core compute, networking, storage, security, and managed services.
* **Mastery in designing and implementing robust CI/CD pipelines** (e.g., Jenkins, GitLab CI/CD, GitHub Actions, Azure DevOps).
* **Expertise in automation using scripting languages** (e.g., Python, Bash) and advanced Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation, Ansible).
* **Strong, practical experience with the ELK Stack (Elasticsearch, Logstash, Kibana)** for centralized logging, analysis, and visualization.
* **Advanced knowledge of networking concepts** (TCP/IP, HTTP/S, DNS, VPN, VPC, Subnets, Routing) and experience in designing and securing complex cloud network architectures.
* **Demonstrable experience implementing and maintaining secure infrastructure**, including IAM, network security, and compliance frameworks.
* Proficiency with **GitHub versioning** and advanced collaborative development workflows.
* Experience with other monitoring/alerting tools (e.g., Prometheus, Grafana).
* Exceptional analytical and complex problem-solving skills, with a proven ability to debug and resolve critical production issues quickly.
* Excellent communication, interpersonal, and presentation skills, with the ability to effectively articulate complex technical concepts to both technical and non-technical audiences.
* Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field, or equivalent extensive practical experience.
If interested, please send your updated resume to hr5@tasolutions.in or contact us directly at 9056679449. You are also welcome to refer friends and colleagues to us.