Open role

Remote | LLM Personal Assistant Evaluation Specialist — $70–$180/hour

Remote · Australia · Full-time

We are sharing a specialised part-time consulting opportunity for advanced LLM power users experienced in personalized AI workflows, rubric-based evaluation, real-world task assessment, personal productivity systems, and high-context decision support. This role supports current and upcoming remote consulting opportunities focused on evaluating how AI systems handle personalized, real-world life tasks across food, health, productivity, career, learning, research, planning, and personal workflow scenarios. Selected professionals will create realistic prompts, complete complex AI-assisted tasks, record workflow execution, design or apply detailed rubrics, and evaluate whether AI outputs are useful, personalized, practical, safe, and successful in real-life contexts.

Key Responsibilities

Professionals in this role may contribute to:

Personalized AI Task Evaluation

Create written responses, prompts, and explanations for complex personal-life tasks
Evaluate whether AI outputs are practical, well-reasoned, personalized, realistic, and successful
Identify where outputs succeed, miss context, overreach, provide generic advice, or fail to account for real constraints
Use hands-on LLM experience to assess real-world usefulness across high-context personal workflows

Rubric Design & Quality Assessment

Apply structured rubrics and quality criteria to evaluate AI system performance
Create detailed evaluation rubrics for complex personal tasks and multi-step workflows
Judge outputs against criteria involving usefulness, personalization, reasoning quality, safety, completeness, and success conditions
Write clear, specific, and well-supported feedback explaining evaluation decisions

Real-World Workflow Execution

Execute AI-assisted tasks while recording screens according to project instructions
Review task performance across tools, prompts, reasoning steps, outputs, and final recommendations
Complete research-intensive personal workflows end-to-end within expected turnaround timelines
Maintain careful documentation of task setup, execution, rubric design, and evaluation results

Ideal Profile

Strong candidates may have:

Heavy personal usage of LLM products and AI tools
Experience using AI for multi-step tasks, planning, research, decision-making, personal workflows, or life administration
Familiarity with tools such as ChatGPT, Claude, Gemini, Perplexity, Cursor, Windsurf, Codex, or other AI agents
Strong ability to explain what makes an AI output useful, incomplete, unsafe, unrealistic, generic, or poorly personalized
Extensive rubric experience, including prior rubric design, evaluation, and quality assessment work
Strong written judgment, attention to detail, and ability to evaluate against structured criteria
Ability to complete tasks within 24 hours when project timing requires

Educational Background

Formal degree requirements may vary based on project needs
Practical experience using LLMs for complex personal workflows, rubric-based evaluation, research, writing, QA, product testing, or AI assessment is highly relevant
Experience in education, research, operations, productivity systems, coaching, writing, product evaluation, user research, or AI workflow design may be especially valuable

Nice to Have

100+ hours of prior rubric-related work involving rubric design, evaluation, model assessment, quality review, or structured judgment
Experience evaluating AI tools across personal productivity, career planning, food recommendations, learning workflows, health-adjacent reasoning, or personal research tasks
Strong familiarity with personal AI workflows involving calendars, reminders, errands, job applications, LinkedIn, resumes, study plans, restaurant selection, or decision support
Ability to record screen-based workflows clearly and follow detailed task instructions
Access to a desktop or laptop computer suitable for project work and screen recording

Why This Opportunity

Apply advanced LLM power-user experience to structured remote project work
Contribute to high-quality evaluation of personalized AI assistant workflows
Work on flexible assignments involving practical, real-world personal tasks across multiple domains
Use your judgment to help assess whether AI systems are truly useful, personalized, realistic, and successful
Remote structure with competitive hourly compensation

Contract Details

Independent contractor role
Fully remote with flexible scheduling
Eligible professionals should be based in the United States depending on project needs
Expected commitment of approximately 15–40 hours per week depending on project availability and scope
Participants may be asked to complete a paid work trial as part of onboarding
Work trial compensation may be approximately $30 upon completion depending on project requirements
Tasks may require 24-hour turnaround depending on assignment timing
Desktop or laptop computer required for project work and screen recording
Competitive rates between $70–$180 per hour depending on expertise, project scope, and task type
Weekly payments via Stripe or Wise
Projects may be extended, shortened, or adjusted depending on scope and performance
Work will not involve access to confidential or proprietary information from any employer, client, or institution

About The Platform

This opportunity is available through 24-MAG LLC. We connect experienced professionals with remote consulting opportunities across technical, evaluation, and project-based workstreams. By submitting this application, you acknowledge that your information may be processed by 24-MAG LLC for recruitment and opportunity matching in accordance with our Privacy Policy: https://www.24-mag.com/privacy-policy.
