
Magi AI Content Quality Evaluator – En-US (Freelance, Temporary)

Remote, USA | Full-time | Posted 2025-11-03
Job Description:
We are hiring 20 Freelance Evaluators to support a high-priority Magi AI Evaluation Pilot Project for Uber’s AI team. This role involves reviewing and rating AI-generated responses for quality, clarity, accuracy, and factual correctness across varying levels of complexity. All candidates must meet Expertise Level 3 (L3) requirements, the highest evaluator standard, to qualify for this project. Your evaluations will directly shape how AI systems learn, improve, and interact with users.

What is Expertise Level 3 (L3)?
This role requires Expertise Level 3, meaning the evaluator must have:
Advanced analytical and research skills
The ability to handle highly complex content, such as technical or domain-specific materials (e.g., science, legal, medical, or data-heavy text)
Experience evaluating multi-modal data (charts, PDFs, screenshots, etc.)
Capability to conduct line-by-line fact-checking and justify responses with critical reasoning
Comfort working with structured rubrics and independently verifying factual accuracy

Compensation:
Pay Rate: $25 – $31 per hour (USD)
The assessment is paid if the evaluator passes and completes at least 4 hours of work on the project

Key Responsibilities:
Evaluate AI-generated responses across varying task types, using structured guidelines
Identify tone, style, factual, and product-specific issues in output
Perform detailed comparisons, fact-checks, and accuracy reviews
Submit ratings using tools such as dropdowns, screencasts, and feedback forms
Meet tight deadlines (all assigned work must be completed within 4 business days)

Task Complexity & Time Commitment:
Tasks will range from simple to complex, but all evaluators must qualify at the L3 expertise level
Average Handling Time: 75–145 minutes per task
Minimum expectation: 3+ hours per task cycle, with option to take on more work
Work is asynchronous, though task assignment may align with IST time zone

Requirements:
Professional and expert-level English (US) speaker (can reside in or outside of the U.S.)
5+ years of experience in linguistics, research, writing, content evaluation, or technical review
Master’s degree or PhD strongly preferred
High attention to detail, accuracy, and critical thinking
Secure internet connection and workspace
Must complete and pass a qualifying assessment at the Expertise Level 3 standard

Assessment Details:
Approx. 30 minutes to complete
Paid if passed and evaluator completes at least 4 hours of project work

Project Details:
Project Name: Magi AI Evaluation (Uber Special Project)
Work Type: Freelance, Temporary
Start Date: Within 48 hours of onboarding completion (target: April 28, 2025)
Schedule: Flexible, asynchronous
Location: Global (must be a native En-US speaker)
Duration: Initial pilot cycle, with possible future work based on performance and need

Job Types: Part-time, Temporary
Pay: $25.00 - $31.00 per hour
