MLOps Engineer (Mid-Senior level)

AIxBlock Anywhere
AIxBlock Remote, Full-time
  1. About AIxBlock

AIxBlock is a comprehensive platform for quickly developing AI models using decentralized resources with flexibility and full privacy control.

Our mission is to enable users to bring their AI models to market quickly, with the freedom to customize and maintain full control over their privacy.


We are now seeking an MLOps Engineer to enhance our platform’s scalability and efficiency through robust infrastructure-as-code (IaC) practices, distributed ML training methodologies, and advanced MLOps capabilities.


Visit our website for more information.


Being globally distributed is in our DNA: you can work fully remotely with flexible working hours, and we work with people from all walks of life, across borders.

 

  2. Job type: Full-time & Remote

Flexible working hours from Monday to Saturday, totaling 44+ hours a week. We need A players who can work like hell. If you prefer a 9-to-5 job, this might not be the right company for you.

 

  3. Location: Work from anywhere.


  4. Reporting line: CTO

 

  5. Your responsibilities


As an MLOps Engineer, you will design, build, and maintain the infrastructure required to support distributed ML training across multiple compute nodes at scale. You will collaborate closely with other teams to streamline workflows, ensure high-performance computing for decentralized training, and deploy production-ready machine learning pipelines. The ideal candidate has hands-on experience with IaC, MLOps best practices, and distributed training frameworks. Key responsibilities are as follows:

MLOps & Infrastructure Management:

  • Develop and maintain machine learning pipelines from data preparation to deployment.

  • Automate infrastructure provisioning and management using IaC tools (e.g., Terraform, CloudFormation).

  • Implement CI/CD pipelines for model training, testing, and deployment.

Distributed ML Training on Multiple Nodes:

  • Architect and optimize systems for distributed ML training, ensuring efficient use of decentralized compute resources.

  • Implement distributed training with frameworks such as PyTorch Distributed and TensorFlow Distributed.

  • Collaborate with the engineering team to integrate distributed training methods with AIxBlock’s decentralized compute marketplace.

Monitoring and Scalability:

  • Design robust monitoring systems to track model performance and infrastructure health.

  • Ensure scalability and fault tolerance in the platform’s ML workflows.

  • Optimize resource allocation for compute-intensive training tasks.

  6. Requirements:

  • Proven experience in MLOps and machine learning infrastructure.

  • Strong knowledge of Infrastructure-as-Code (IaC) tools.

  • Hands-on experience with distributed ML training frameworks.

  • Proficiency with containerization and orchestration tools like Docker and Kubernetes.

  • Familiarity with multiple cloud platforms.

  • Experience with CI/CD tools for ML pipelines.


  7. Compensation & Benefits:

  1. Compensation: 

  • Base salary: negotiable, depending on experience.

  • Token bonus based on performance.


  2. Benefits:

  • Salary review based on performance.

  • Birthday gift.

  • Holiday gift.

  • Year-end performance bonus (cash).

  • Year-end party.


  3. Leaves:

  • Public holidays: take time off to spend with your family during your country’s national, regional, or state public holidays.

  • Annual leave: 12 days/year for full-time staff, pro-rated by the number of months actually worked. Applies after probation.

  • Paid sick leave: up to 6 days/year for full-time staff, on top of the 12 days of annual leave. Applies after probation.

  • Period leave: 1 day each month for female employees.

  • Personal leave policy for special cases.

  4. Training and recognition

  • Performance recognition and promotion opportunities for consistently good performance.

  • External/internal training programs. 


  8. Working environment

  • Fully remote.

  • Flexible working hours: arrange your working hours across the day and week as you see fit.

  • Fast-track professional growth in a fast-paced startup environment.

  • Work with a talented and diverse team in a dynamic environment that encourages continuous learning and professional development.

  • The opportunity to meet and work with professionals around the world and expand your network.


    9. Application process:

  • Resume & Portfolio screening

  • Interview with the TA

  • Interview with the CTO

  • Offer discussion and contract signing


   10. Apply

To apply, please click HERE to fill out your application. 

  • Please note: We are a remote-first company with collaborators based all around the world, and English is our primary language. A CV in English is therefore required.

  • The application process may be slightly modified (shortened or prolonged) when necessary.