Enhance the functional coverage of each operation by addressing operation-specific constraints, including tensor shape variations, precision management, and numerical stability
Improve utilization of heterogeneous compute resources by optimizing operation-level performance based on workload characteristics
Key Qualifications
Master’s degree or higher in Electrical Engineering, Computer Science, or a related field
In-depth understanding of neural network operations, including both high-level concepts and low-level computational workflows
Strong analytical, troubleshooting, and performance optimization skills
Proficiency in C++ and Python
Ideal Qualifications
Broad knowledge of deep learning models across multiple domains, including computer vision, natural language processing, and speech recognition
Experience in model- and layer-level optimization techniques for computational efficiency, such as sparsity, reduced precision, and layer decomposition
Experience with architecture-specific parallel programming and hardware acceleration frameworks, including SSE/AVX (x86), NEON (AArch64), and CUDA/OpenCL (GPU)
Academic background in computer architecture is preferred