Edge AI Inference Architect: Low-Latency Model Serving
Tether Operations Limited
Tether Operations Limited is seeking an Italy-based expert in AI model serving and inference optimization. The role involves designing high-performance model architectures, keeping inference latency low, and optimizing memory usage on resource-constrained devices. Candidates should hold a PhD in a related field and have extensive experience writing GPU kernels and optimizing inference.
The position requires a deep understanding of cutting-edge AI techniques, with responsibilities that encompass building robust inference pipelines and monitoring performance in live environments. Join us to contribute to innovative AI systems!
Job posting published 2 months ago