Shehrozkashif/README.md

👋 Hi, I'm Shehroz Kashif

AI Engineer | Software Engineer | LLM & MLOps Researcher
Research Assistant @ Micro Electronics Research Lab (MERL)
LFX’25 Mentee @ RISC-V International

Open-source contributor focused on production-ready AI systems, LLM evaluation, and reproducible ML pipelines.


🚀 About Me

I’m an AI Engineer and Researcher working at the intersection of LLMs, MLOps, and open-source systems.
I build reliable, testable, and deployment-ready AI pipelines rather than experimental-only models.

🔍 Current Focus

  • 🧠 LLM Evaluation & Benchmarking β€” functional, syntactic, adversarial
  • πŸ›‘οΈ Hallucination Mitigation β€” GAN-based approaches for private LLMs
  • βš™οΈ Reproducible ML Pipelines β€” CI/CD, logging, SLA-aware validation
  • πŸ“Š RISC-V Tooling & Data β€” machine-readable specifications and verification

💡 Making AI systems trustworthy in production is my passion.


🧠 Roles & Affiliations

  • πŸ”Ή Research Assistant β€” MERL
    LLM evaluation pipelines, benchmarking frameworks, RISC-V tooling

  • πŸ”Ή LFX’25 Mentee β€” RISC-V International
    Machine-readable RISC-V specifications, schemas, and CI validation


🧰 Tech Stack

Languages: Python Β· Scala Β· Verilog Β· Java Β· Shell Β· JavaScript Β· HTML Β· CSS
AI / ML: PyTorch Β· TensorFlow Β· Hugging Face Transformers Β· GANs Β· LLM Evaluation Β· NumPy Β· Pandas Β· Scikit-learn
MLOps & Engineering: CI/CD Β· Docker Β· REST/gRPC Β· Logging & Monitoring Β· Reproducible Pipelines Β· Git Β· GitHub Actions Β· Linux Β· pytest
Data & Config: JSON Β· YAML Β· MySQL


💡 Featured Projects

🛡️ AI4org: GAN-based Hallucination Mitigation for Private LLMs

🔗 GitHub Repository

  • Built a privacy-first ML pipeline to detect and mitigate hallucinations in private LLMs
  • Designed a GAN-style generator/discriminator for hallucination detection
  • End-to-end pipeline: ingestion → validation → reproducible training → containerized inference
  • Integrated CI/CD, automated testing, and monitoring for production readiness

📌 Designed for enterprise and on-prem LLM deployments where reliability matters.
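
The stage ordering described for this project (ingestion → validation → training → inference) can be sketched as a chain of plain functions. This is a minimal illustration under stated assumptions, not AI4org's actual code: the stage names `ingest` and `validate`, the `text` field, and the `run_pipeline` helper are all hypothetical, and the training and inference stages are left out.

```python
from typing import Callable

# A stage takes a list of rows and returns a (possibly smaller) list of rows.
Stage = Callable[[list[dict]], list[dict]]


def ingest(rows: list[dict]) -> list[dict]:
    """Hypothetical ingestion stage: normalize whitespace in the text field."""
    return [{**row, "text": str(row.get("text", "")).strip()} for row in rows]


def validate(rows: list[dict]) -> list[dict]:
    """Hypothetical validation stage: drop rows whose text is empty."""
    return [row for row in rows if row["text"]]


def run_pipeline(rows: list[dict], stages: list[Stage]) -> list[dict]:
    """Run stages in order and log the row count after each one."""
    for stage in stages:
        rows = stage(rows)
        print(f"{stage.__name__}: {len(rows)} rows")
    return rows


raw = [{"text": "  What is the refund policy? "}, {"text": "   "}]
clean = run_pipeline(raw, [ingest, validate])
```

Keeping every stage behind the same signature means each one can be unit-tested on its own and re-run with identical inputs, which is one way to approach the reproducibility goal stated above.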

🔬 ArcheV: LLM Benchmark Suite

🔗 GitHub Repository

  • Engineered a reproducible LLM benchmarking framework
  • Standardized JSON I/O and CI-driven evaluation pipelines
  • Validates functional and syntactic correctness for deployment decisions
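
As a rough sketch of what standardized JSON results and a syntactic/functional split could look like, the snippet below parses result records and reports pass rates. The record fields (`task_id`, `syntax_ok`, `functional_ok`) and both helper functions are assumptions made for illustration; they are not taken from the ArcheV repository.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRecord:
    """One benchmark result (hypothetical field names)."""
    task_id: str
    syntax_ok: bool
    functional_ok: bool


def load_records(text: str) -> list[EvalRecord]:
    """Parse a JSON array of result objects into typed records."""
    return [EvalRecord(**item) for item in json.loads(text)]


def summarize(records: list[EvalRecord]) -> dict:
    """Compute pass rates; a functional pass only counts if syntax passed too."""
    total = len(records)
    if total == 0:
        return {"total": 0, "syntax_pass_rate": 0.0, "functional_pass_rate": 0.0}
    syntax = sum(r.syntax_ok for r in records)
    functional = sum(r.syntax_ok and r.functional_ok for r in records)
    return {
        "total": total,
        "syntax_pass_rate": syntax / total,
        "functional_pass_rate": functional / total,
    }


sample = """[
  {"task_id": "alu", "syntax_ok": true, "functional_ok": true},
  {"task_id": "regfile", "syntax_ok": true, "functional_ok": false},
  {"task_id": "decoder", "syntax_ok": false, "functional_ok": false},
  {"task_id": "pc", "syntax_ok": true, "functional_ok": true}
]"""
summary = summarize(load_records(sample))
```

A CI job could compare the resulting rates against a threshold and fail the build when they drop.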

📘 RISC-V Unified Database

🔗 GitHub Repository

  • Maintained versioned YAML/JSON schemas for RISC-V tooling
  • Implemented CI validation to ensure data integrity and observability
  • Improved downstream reliability for tooling and ML pipelines
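
CI validation of schema-governed data can be as simple as a script that returns a list of errors and fails the job when the list is non-empty. The sketch below uses only the standard library and checks required keys and types; the field names in `REQUIRED_FIELDS` are hypothetical, and a real setup would more likely rely on a full JSON Schema validator than on hand-written checks like these.

```python
import json

# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {"schema_version": str, "name": str, "encoding": dict}


def validate_entry(entry: dict) -> list[str]:
    """Return human-readable errors for missing or mistyped fields."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in entry:
            errors.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected):
            actual = type(entry[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    return errors


def validate_document(text: str) -> list[str]:
    """Parse one JSON document and validate it; never raises on bad input."""
    try:
        entry = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(entry, dict):
        return ["top-level value must be an object"]
    return validate_entry(entry)


good = '{"schema_version": "1.0", "name": "add", "encoding": {"opcode": "0110011"}}'
bad = '{"schema_version": 1, "name": "add"}'
good_errors = validate_document(good)
bad_errors = validate_document(bad)
```

In a CI step, the script would exit with a non-zero status whenever any document produces errors, so invalid data never reaches downstream tooling.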

🏆 Highlights & Achievements

  • 🎓 Linux Foundation Mentorship Program (LFX) 2025
  • 🧪 Research Assistant at MERL
  • 📊 Improved LLM benchmarking reliability by ~25%
  • 🧠 Hands-on experience with LLMs, GANs, MLOps, and CI/CD
  • 📝 Contributor to open-source and research-grade tooling

📈 GitHub Stats


📫 Connect With Me


⭐ If you find my work useful, feel free to star a repository.
🤝 Open to collaborations in AI, LLMs, MLOps, and open-source systems.

Pinned

  1. riscv/riscv-unified-db Public

    Monorepo containing a machine-readable database of the RISC-V specification and artifact generation tools

    Ruby 166 128

  2. Vermithor Public

    RISCV RV-32I 5 Stage Pipelined Processor

    Scala

  3. merledu/ArcheV Public

    RISC-V RV-32i RTL Benchmark for evaluating Large Language Models.

    Verilog 3

  4. merledu/ai4org Public

    Hallucination reduction framework for LLMs using RAG, multi-discriminator RL, and automated data pipelines.

    Python 1 5