AI Agent Evaluation Analyst (Freelance)
$30/hour
Mindrift
Overview
This opportunity is for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.
What We Do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.
Who we’re looking for
We’re looking for curious and intellectually proactive contributors, the kind of person who double-checks assumptions and plays devil’s advocate. Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?
This is a flexible, project-based opportunity well-suited for:
- Analysts, researchers, or consultants with strong critical thinking skills
- Students (senior undergrads / grad students) looking for an intellectually interesting gig
- People open to a part-time and non-permanent opportunity
We’re on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll balance quality assurance, research, and logical problem-solving. This project opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.
You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups. If you’ve excelled in consulting, CHGK, Olympiads, case solving, or systems thinking, you might be a great fit.
What you’ll be doing
- Reviewing evaluation tasks and scenarios for logic, completeness, and realism
- Identifying inconsistencies, missing assumptions, or unclear decision points
- Helping define clear expected behaviors (gold standards) for AI agents
- Annotating cause-effect relationships, reasoning paths, and plausible alternatives
- Thinking through complex systems and policies as a human would to ensure agents are tested properly
- Working closely with QA, writers, or developers to suggest refinements or edge case coverage
Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.
Requirements
- Excellent analytical thinking: can reason about complex systems, scenarios, and logical implications
- Strong attention to detail: can spot contradictions, ambiguities, and vague requirements
- Familiarity with structured data formats: can read (not necessarily write) JSON/YAML
- Ability to assess scenarios holistically: what’s missing, what’s unrealistic, what might break?
- Good communication and clear writing (in English) to document your findings
We also value applicants who have:
- Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
- Background in consulting, academia, olympiads (e.g. logic/math/informatics), or research
- Exposure to LLMs, prompt engineering, or AI-generated content
- Familiarity with QA or test-case thinking (edge cases, failure modes, "what could go wrong")
- Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.)
Benefits
- Get paid for your expertise, with rates that can go up to $30/hour depending on your skills, experience, and project needs
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio
- Influence how future AI models understand and communicate in your field of expertise
