[Remote] Senior Research Data Engineer (US)

Remote · South Korea Full-time

Note: This is a remote position open to candidates in the USA. PointClickCare is a leading health tech company focused on empowering providers to deliver exceptional care. The Senior Research Data Engineer will design and build data systems that support AI model development, ensuring data is accurately transformed and documented for effective use in AI research.

Responsibilities

  • Own the gold data layer. Transform messy silver tables into curated, semantically rich, clean, and documented gold datasets suitable for AI model development, including datasets and features reusable across AI projects. Maintain the data as products and needs evolve
  • Reverse-engineer data semantics. Talk with product engineers and clinical and workflow experts to learn how the products are used and how data are created in the field. Read SQL queries, stored procedures, technical data definitions, and other code to understand how products represent and transform data. Learn how data are ingested into the data lake, what silver tables and columns actually represent, and how they behave. Capture provenance, semantics, clinical event sequencing, cross-module record linkage, and known quirks
  • Bridge semantics with AI needs. Understand researchers' data needs, then design and build the gold data product, with documentation that evolves alongside it, as an efficient AI-first foundation for applied research and model R&D
  • Curate datasets across modalities. For generative AI, RAG, predictive modeling, and other techniques, support researcher needs for chunked and tagged unstructured content with rich metadata, point-in-time-correct features, and clean labels. For classical ML and statistical work, deliver model-ready tables
  • Build pipelines for reuse. Develop transformations from silver into gold inside Databricks/Spark as scheduled, observable workloads. Design them so researchers can iterate on new features and data mixes without rebuilding from scratch
  • Automate quality, filtering, and synthesis. Support research needs for programmatic labeling, weak supervision, near-duplicate detection, boilerplate and noise removal, and LLM-API-driven synthetic data generation where ground truth is scarce
  • Version and hand off. Maintain reproducible dataset snapshots. Define clean lineage and semantic definitions so the downstream team can use and re-use gold datasets in AI R&D

Skills

  • 5+ years building production data systems, with at least 2 supporting ML or AI workloads
  • Track record of learning complex new data domains quickly, through reading source code, interviewing experts, and building durable artifacts others rely on
  • Advanced Python, SQL, and PySpark/Databricks for working with large, messy data. Expert SQL specifically: comfortable reading complex stored procedures and reverse-engineering business logic from queries
  • Databricks ecosystem depth: Delta Lake, Unity Catalog, Spark/PySpark tuning, MLflow
  • AI domain literacy: working understanding of embeddings, tokenization, feature engineering, point-in-time correctness, train/validation/test splits, data drift, and the differences between what classical ML and generative models need from data
  • Data wrangling across modalities: transforming unstructured content (text, PDFs, transcripts, logs) and structured tabular data into clean, model-ready forms
  • AI-friendly data formats (Parquet, Hugging Face datasets) and storage layout decisions (partitioning, sharding, caching) that keep researcher workflows responsive in Azure, AWS, or other working environments
  • Data quality, filtering, and synthesis pipelines: support for programmatic labeling and weak supervision (e.g. Snorkel or equivalent), near-duplicate detection (MinHash/LSH), content and quality filters, LLM-API-driven synthetic data generation
  • Pipeline orchestration (e.g. Airflow, Databricks Workflows, Dagster, or Prefect) and dataset versioning, including Unity Catalog and feature-store support
  • Experience handling regulated or sensitive data under controlled access (HIPAA or equivalent). Familiarity with general de-identification concepts
  • Git-based version control and CI/CD for data and code
  • Strong written documentation. Skill in eliciting requirements and tacit knowledge from technical and non-technical experts
  • Bachelor's degree in computer science, data science, engineering, statistics, or related field. Equivalent practical experience considered
  • Hands-on EHR data experience, ideally in skilled nursing, long-term care, post-acute care, or senior living
  • Working knowledge of clinical terminologies (ICD-10, SNOMED CT, LOINC) and data standards (HL7v2, FHIR, CCDA)
  • dbt for transformation and testing
  • Familiarity with training-side ML frameworks (e.g. PyTorch) sufficient to debug data-side bottlenecks; experience supporting LLM or foundation-model training or fine-tuning data pipelines
  • Clinical NLP, OCR, document parsing, or ASR / transcript pipeline experience
  • Data lineage and catalog tools
  • Prior experience embedded inside an AI or ML research team
  • Master's degree in a relevant quantitative or computer science field

Benefits

  • Benefits starting from Day 1!
  • Retirement Plan Matching
  • Flexible Paid Time Off
  • Wellness Support Programs and Resources
  • Parental & Caregiver Leaves
  • Fertility & Adoption Support
  • Continuous Development Support Program
  • Employee Assistance Program
  • Allyship and Inclusion Communities
  • Employee Recognition … and more!

Company Overview

  • PointClickCare develops web-based products and services to help long-term care providers manage the complete lifecycle of resident care. It was founded in 1995, and is headquartered in Mississauga, Ontario, CAN, with a workforce of 1001-5000 employees. Its website is http://www.pointclickcare.com.

Company H1B Sponsorship

  • PointClickCare has a track record of offering H1B sponsorships, with 3 in 2026, 17 in 2025, 11 in 2024, 11 in 2023, 17 in 2022, 4 in 2021. Please note that this does not guarantee sponsorship for this specific role.