Edge AI Inference Architect: Low-Latency Model Serving

Tether Operations Limited

Tether Operations Limited is seeking an Italy-based expert in AI model serving and inference optimization. The role involves designing high-performance model architectures, ensuring low latency, and optimizing memory usage on resource-constrained devices. Candidates should hold a PhD in a related field and have extensive experience in writing GPU kernels and optimizing inference.

The position requires a deep understanding of cutting-edge AI techniques, with responsibilities that encompass building robust inference pipelines and monitoring performance in live environments. Join us to contribute to innovative AI systems!

Job offer posted 2 months ago