NPU SDK Software Engineer
Job Category
Software
Location
Rebellions | R-TOWER 3F ~ 8F, 6 Jeongjail-ro 156beon-gil, Bundang-gu, Seongnam-si, Gyeonggi-do

The NPU SDK part of the Software Team owns the end-to-end SDK user experience: modeling work to bring state-of-the-art AI models (LLM, Multi-modal, Vision, Speech, etc.) onto Rebellions NPU, building and operating CI/CD pipelines to ship a production-ready SDK to external users, releasing and distributing the software, and authoring and publishing developer documentation.


Responsibilities and Opportunities

  • SDK Ecosystem & Release Pipeline: Design, build, and operate CI/CD pipelines that automate and scale the SDK release process
  • Model Deployment & Porting: Port and optimize the latest deep learning models (LLM, Multi-modal, etc.) onto Rebellions NPU, analyze performance bottlenecks, and validate numerical parity against GPU reference implementations when needed
  • Framework Integration: Integrate the SDK tightly with the open-source inference and serving ecosystem, including Hugging Face Transformers, Diffusers, and vLLM, to maximize the developer experience
  • Verification & Debugging: Design model-feeding frameworks and debugging utilities to ensure functional correctness and long-term stability of the SDK
  • Developer Documentation: Author and maintain developer-facing documentation — API references, tutorials, model support matrices, and release notes — and operate the documentation build and publishing pipeline for external developers and partners


Key Qualifications

  • Master’s degree or above in Computer Science, Electrical Engineering, or a related field
  • Deep working knowledge of CI/CD tools such as GitHub Actions, Airflow, and Buildkite
  • Deep understanding of modern deep learning model architectures, including LLMs (Transformer family), Multi-modal, Vision, and Speech
  • Strong familiarity with PyTorch internals, including model customization and graph transformations
  • High-performance software development skills in Python and Modern C++ (C++17 or later)
  • Experience using and integrating major open-source libraries such as Hugging Face (Transformers, Diffusers) and vLLM
  • Hands-on experience with Python and Kubernetes-based environments for deploying and troubleshooting AI inference workloads in production


Ideal Qualifications

  • 5+ years of relevant industry experience, or equivalent practical expertise
  • Experience optimizing and deploying models on specific hardware using CUDA, TensorRT, TensorRT-LLM, MLIR, Triton, or TVM
  • Experience building scalable ML infrastructure with Docker and Kubernetes
  • Experience in AI Harness Engineering – designing environments where AI agents operate autonomously within defined constraints, feedback loops, and system contexts
  • Contributions to open-source projects as a maintainer or regular contributor
  • Understanding of the internals of other frameworks such as JAX or TensorFlow





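Hiring Process

  • Application review > Online interview > On-site interview (including an assignment) > Culture-fit interview > Compensation negotiation > Final offer
  • The hiring process may differ by role and may change depending on scheduling and circumstances.
  • Interview schedules and results will be sent individually to the email address provided in your application.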


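Notes

  • This posting may close early once the position is filled.
  • An offer may be rescinded if any information in the application is found to be false.
  • Hiring may be restricted if the candidate does not hold the legal qualifications required for employment and for performing the job.
  • Veteran status and disability status will not result in any disadvantage in the hiring process.
  • The scope of responsibilities may be adjusted in light of the candidate's overall career, experience, and other circumstances. If such a change is needed, it will be communicated to the candidate at an appropriate time before the final offer notification.
  • For hiring inquiries, please contact the email address below.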
  • [email protected]