The NPU SDK part of the Software Team owns the end-to-end SDK user experience: modeling work to bring state-of-the-art AI models (LLM, Multi-modal, Vision, Speech, etc.) onto Rebellions NPU, building and operating CI/CD pipelines to ship a production-ready SDK to external users, releasing and distributing the software, and authoring and publishing developer documentation.
Responsibilities and Opportunities
- SDK Ecosystem & Release Pipeline: Design, build, and operate CI/CD pipelines that automate and scale the SDK release process
- Model Deployment & Porting: Port and optimize the latest deep learning models (LLM, Multi-modal, etc.) onto Rebellions NPU, analyze performance bottlenecks, and validate numerical parity against GPU reference implementations when needed
- Framework Integration: Integrate the SDK tightly with the open-source inference and serving ecosystem, including Hugging Face Transformers, Diffusers, and vLLM, to maximize the developer experience
- Verification & Debugging: Design model-feeding frameworks and debugging utilities to ensure functional correctness and long-term stability of the SDK
- Developer Documentation: Author and maintain developer-facing documentation (API references, tutorials, model support matrices, and release notes), and operate the documentation build and publishing pipeline for external developers and partners
Key Qualifications
- Master’s degree or above in Computer Science, Electrical Engineering, or a related field
- Deep working knowledge of CI/CD tools such as GitHub Actions, Airflow, and Buildkite
- Deep understanding of modern deep learning model architectures, including LLMs (Transformer family), Multi-modal, Vision, and Speech
- Strong familiarity with PyTorch internals, including model customization and graph transformations
- High-performance software development skills in Python and Modern C++ (C++17 or later)
- Experience using and integrating major open-source libraries such as Hugging Face (Transformers, Diffusers) and vLLM
- Hands-on experience with Python and Kubernetes-based environments for deploying and troubleshooting AI inference workloads in production
Ideal Qualifications
- 5+ years of relevant industry experience, or equivalent practical expertise
- Experience optimizing and deploying models on specific hardware using CUDA, TensorRT, TensorRT-LLM, MLIR, Triton, or TVM
- Experience building scalable ML infrastructure with Docker and Kubernetes
- Experience in AI Harness Engineering: designing environments in which AI agents operate autonomously within defined constraints, feedback loops, and system contexts
- Contributions to open-source projects as a maintainer or regular contributor
- Understanding of the internals of other frameworks such as JAX or TensorFlow
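Hiring Process
- Document screening > Online interview > On-site interview (including an assignment) > Culture-fit interview > Compensation discussion > Final offer
- The hiring process may differ by role and may change depending on scheduling and circumstances.
- The schedule and results for each stage will be sent individually to the email address provided in your application.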
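Notes
- This posting may close early once the position is filled.
- An offer may be rescinded if any information in the application is found to be false.
- Hiring may be restricted if the candidate does not hold the legal qualifications required for employment or for performing the role.
- Veteran status and disability status will not result in any disadvantage during the hiring process.
- The scope of responsibilities may be adjusted in light of the candidate's overall career, experience, and other relevant circumstances. If such a change is needed, it will be communicated to the candidate at an appropriate time before the final offer notification.
- For hiring inquiries, please contact the email address below.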
- [email protected]