NPU SDK Software Engineer

Job Category

Software

Location

Rebellions | 리벨리온, 3F-8F R-TOWER, 6 Jeongjail-ro 156beon-gil, Bundang-gu, Seongnam-si, Gyeonggi-do

The NPU SDK part of the Software Team owns the end-to-end SDK user experience: modeling work to bring state-of-the-art AI models (LLM, Multi-modal, Vision, Speech, etc.) onto the Rebellions NPU, building and operating the CI/CD pipelines that ship a production-ready SDK to external users, releasing and distributing the software, and authoring and publishing developer documentation.

Responsibilities and Opportunities

SDK Ecosystem & Release Pipeline: Design, build, and operate CI/CD pipelines that automate and scale the SDK release process
Model Deployment & Porting: Port and optimize the latest deep learning models (LLM, Multi-modal, etc.) onto Rebellions NPU, analyze performance bottlenecks, and validate numerical parity against GPU reference implementations when needed
Framework Integration: Integrate the SDK tightly with the open-source inference and serving ecosystem, including Hugging Face Transformers, Diffusers, and vLLM, to maximize the developer experience
Verification & Debugging: Design model-feeding frameworks and debugging utilities to ensure functional correctness and long-term stability of the SDK
Developer Documentation: Author and maintain developer-facing documentation — API references, tutorials, model support matrices, and release notes — and operate the documentation build and publishing pipeline for external developers and partners

Key Qualifications

Master’s degree or above in Computer Science, Electrical Engineering, or a related field
Deep working knowledge of CI/CD tools such as GitHub Actions, Airflow, and Buildkite
Deep understanding of modern deep learning model architectures, including LLMs (Transformer family), Multi-modal, Vision, and Speech
Strong familiarity with PyTorch internals, including model customization and graph transformations
High-performance software development skills in Python and Modern C++ (C++17 or later)
Experience using and integrating major open-source libraries such as Hugging Face (Transformers, Diffusers) and vLLM
Hands-on experience with Python and Kubernetes-based environments for deploying and troubleshooting AI inference workloads in production

Ideal Qualifications

5+ years of relevant industry experience, or equivalent practical expertise
Experience optimizing and deploying models on specific hardware using CUDA, TensorRT, TensorRT-LLM, MLIR, Triton, or TVM
Experience building scalable ML infrastructure with Docker and Kubernetes
Experience in AI Harness Engineering – designing environments where AI agents operate autonomously within defined constraints, feedback loops, and system contexts
Contributions to open-source projects as a maintainer or regular contributor
Understanding of the internals of other frameworks such as JAX or TensorFlow

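Hiring Process

Application review > Online interview > On-site interview (including an assignment) > Culture-fit interview > Compensation discussion > Final offer
The hiring process may differ by role and may change depending on schedule and circumstances.
The schedule and results for each stage will be sent individually to the email address provided in your application.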

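Notes

This posting may close early once the position is filled.
If any information in the application is found to be false, the offer may be withdrawn.
Hiring may be restricted if the candidate does not hold the legal qualifications required for employment or for performing the role.
Status as a person eligible for veterans' benefits or as a person with a disability will not result in any disadvantage during the hiring process.
The scope of responsibilities may be adjusted in consideration of the candidate's overall career, experience, and other circumstances. If such a change is needed, it will be communicated to the candidate at an appropriate time before the final offer notification.
For recruiting inquiries, please contact the email address below.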
[email protected]