Rebellions | 8F, Building 102, 239 Jeongjail-ro, Bundang-gu, Seongnam-si, Gyeonggi-do, Republic of Korea
Responsibilities and Opportunities
Designing an RBLN runtime module that interfaces with the RBLN compiler and driver modules
Designing and implementing user-level APIs, adding support for various language bindings, deploying ML models to the RBLN SDK, and maintaining user documentation
Conducting benchmarking and profiling to evaluate the existing runtime system's performance and implementing optimizations to enhance the overall system performance of RBLN NPU products
Optimizing inference serving using vLLM for NPU and conducting state-of-the-art (SOTA) research to improve model serving performance
Key Qualifications
Bachelor's or higher degree in Computer Science, Electrical Engineering, or a related field
Comprehensive understanding of deep learning models and their applications in vision, natural language processing, speech recognition, and other domains
Familiarity with system software such as compilers, runtimes, drivers, and firmware
Proficiency in programming languages: C++ and Python
Knowledge of data structures, algorithms, and OOP design patterns
Strong written and verbal communication skills
Ideal Qualifications
Hands-on experience with AI accelerator (e.g., GPU) driver APIs and runtimes
Exposure to ML frameworks such as PyTorch, TensorFlow, ONNX Runtime, and TensorRT, and their respective optimization techniques
Solid understanding of operating systems, resource management, and high-performance computing principles
Deep expertise in Python or modern C++ and their advanced features for writing efficient, high-performance code
Experience with multithreading and parallel programming
Experience with serving platforms such as vLLM, TorchServe, and Triton Inference Server
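Employment may be restricted if the candidate does not hold the legal qualifications required for hiring and for performing the job.
Veteran status and disability status will not cause any disadvantage in the hiring process.
The scope of responsibilities may be adjusted in light of the candidate's overall career, experience, and other circumstances. If such a change is needed, it will be communicated to the candidate at an appropriate time before the final offer is issued.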
NPU Runtime Software Engineer