Serve as the primary technical contact for customers throughout the engagement lifecycle, providing technical guidance and ongoing support
Drive the adoption of Rebellions' AI solutions through product demos and technical presentations, and design production-ready AI inference infrastructure with high availability and failover resilience
Develop an in-depth understanding of Rebellions' AI solutions, evaluating end-to-end performance (e.g., throughput, latency, energy efficiency), identifying system-level bottlenecks, and optimizing workload scheduling and routing strategies for AI inference
Create clear and concise technical documentation, best practices, and integration guides for customers and partners
Work closely with internal software and hardware engineering teams to relay customer feedback and help shape future product and business direction
Key Qualifications
Bachelor's degree in Computer Science, Electrical Engineering, or a related technical field
Strong problem-solving skills, with a proactive and analytical approach to technical challenges
Deep understanding of AI inference and serving frameworks (e.g., vLLM, TensorRT-LLM) and working knowledge of PyTorch
Hands-on experience with Python and Kubernetes-based environments for deploying and troubleshooting AI inference workloads in production
Strong communication and collaboration skills, with experience working across engineering, business, and strategy teams and engaging directly with customers
Ideal Qualifications
Hands-on experience designing and deploying end-to-end AI inference systems, with a focus on serving, system-level optimization, and performance benchmarking in real-world environments
Solid understanding of hardware acceleration (NPU, GPU, edge AI chips) and practical experience with model optimization techniques such as quantization and pipelining
Prior experience in technical customer-facing roles such as Field Application Engineer, Solutions Engineer, or Sales Engineer
Experience producing technical content such as blog posts, documentation, or whitepapers for AI/ML practitioners