ZihanWang314/README.md

Pinned

  1. mll-lab-nu/RAGEN Public

    RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

    Python · 2.6k stars · 210 forks

  2. deepseek-ai/ESFT Public

    Expert Specialized Fine-Tuning

    Python · 732 stars · 261 forks

  3. CoE Public

    Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models

    Python · 227 stars · 27 forks

  4. xingyaoww/mint-bench Public

    Official repo for the ICLR 2024 paper "MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback" by Xingyao Wang*, Zihan Wang*, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng and …

    Python · 133 stars · 8 forks

  5. mll-lab-nu/TStar Public

    TStar is a unified temporal search framework for long-form video question answering

    Python · 93 stars · 6 forks

  6. mll-lab-nu/VAGEN Public

    Training VLM agents with multi-turn reinforcement learning

    Python · 433 stars · 50 forks