Job summary:


Title:
High Performance Computing (HPC) Administrator - Remote

Location:
Remote

Length and terms:
Long term - W2 or C2C


Position created on 01/12/2021 07:31 pm

Job description:


*** Webcam interview *** Long-term project *** Remote *** Due to security requirements, only US citizens will be considered ***

We are seeking a high performance computing (HPC) administrator to support our machine learning (ML) platforms and ecosystems. The candidate will work with a product team, including data scientists and machine learning engineers, to support ML products and services for our Manufacturing and Engineering customers. The role includes running and maintaining the current environment, including incident and change management as well as capacity and release management associated with the development and execution of the ML solutions. A successful candidate will have a solid understanding of operating systems and databases as well as some exposure to HPC environments utilizing graphics processing units (GPUs). Basic programming skills covering UNIX shell scripting and Python are also required.

Deliverables:

  •    Assist in day-to-day operations, including incident management, working with the team to resolve associated issues with the infrastructure and systems.
  •    Develop and maintain automation and orchestration software and scripting to assist with Machine Learning code release management.
  •    Prioritize and efficiently manage deployment and configuration tasks
  •    Support of the containerized ecosystem
  •    Workflow processes for:
       •    Data ingest (Python)
       •    API provisioning service
       •    Scheduling (HTCondor)
       •    Watchdog jobs monitoring for data and performing data ingest
  •    Support of hardware-specific configurations, tuning, and GPU support

Skills

  •    Experience with enterprise-standard database systems (e.g. Oracle, Microsoft SQL Server)
  •    Experience working both independently and in a team-oriented, collaborative environment is essential
  •    Functional understanding of networking concepts: routing, switching, firewalls, load balancers, proxy services, and protocols (TCP, UDP, HTTP, TLS, SIP, SMTP, SNMP, LDAP)
  •    Customer service focused with excellent communication (written and verbal) and interpersonal skills. Must be able to effectively work with customers, coworkers, vendors and management
  •    Experience in Python, PL/SQL
  •    Experience in Unix shell scripting, PowerShell, and Bash
  •    Working knowledge of containers and/or orchestration platforms (e.g. Docker, Singularity, Kubernetes, Rancher)
  •    Prior application development experience using tools such as Jenkins, Gradle/Ant, SVN/Git, Artifactory, Automation
  •    Understanding of computer hardware and architecture, specifically high performance computing (HPC) utilizing GPUs
  •    Familiarity with agile methodology, ideally Scaled Agile Framework (SAFe)

Contact the recruiter working on this position:



The recruiter working on this position is Gowtham Reddy (Shaji Team)
His/her contact number is +(1) (205) 5983015
His/her contact email is gowtham.reddy@msysinc.com

Our recruiters will be more than happy to help you get this contract.