Khant Sint Heinn

NLP Engineer & Data Foundation Specialist

Pakokku, Myanmar
kalixlouiis@gmail.com • https://linkedin.com/in/khant-sint-heinn

About

Passionate Machine Learning Engineer focused on NLP and data-centric AI. Experienced in building robust data foundations for low-resource languages, with a proven track record of curating open-source datasets and developing specialized linguistic tools to democratize AI innovation.

Experience

  • -

    Global - Remote

    Summary:

    • Leading DatarrX as a non-profit foundation dedicated to building a high-quality data foundation for the Burmese language in the AI era. Responsible for defining the long-term vision, mission, and technical roadmap while orchestrating a vibrant open-source community.

    Responsibilities:

    • Setting the strategic vision and roadmap to transform Burmese into a data-rich language for AI innovation.
    • Reviewing, validating, and merging community contributions across GitHub, Hugging Face, and Kaggle repositories.
    • Mentoring contributors and maintaining high coding/data quality standards through rigorous code reviews and documentation.
    • Designing and implementing scalable data collection and synthetic generation pipelines to bridge language gaps.
    • Facilitating collaborative workflows between developers, linguists, and tech enthusiasts to foster an inclusive AI ecosystem.

    Achievements:

    • Built and maintained a diverse ecosystem of 30+ open-source datasets and tools.
    • Translated the comprehensive LLM Course and contributed it to the local community, making AI education more accessible.
    • Maintained a transparent and open contribution model that encourages multi-disciplinary participation.

    Skills:

    • Natural Language Processing (NLP)
    • Strategic Planning
    • Python
    • Hugging Face Transformers
    • Open Source Leadership
    • NLP Research
    • Open Source Community Management
    • AI Data Engineering
    • Technical Validation

Projects

Skills

  • Python
  • PyTorch
  • Hugging Face
  • Git
  • GitHub
  • Docker
  • Linux
  • Pandas
  • NumPy

Education

Languages

  • Burmese: Native speaker
  • English: Intermediate

Interests

  • Artificial Intelligence: NLP Research, Large Language Models, Data-Centric AI
  • Open Source: Community Building, Democratizing AI, Collaborative Development
  • Linguistics: Computational Linguistics, Script Morphology, Language Preservation
© 2026 Khant Sint Heinn. All rights reserved.