iBacklight
  • University of Alberta

Pinned

  1. uarm-artemis-official/Robots_Basic_Frame_TypeC Public

    C 2

  2. PipelineLLM Public

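    PipelineLLM is a systematic learning project on post-training for large language models (LLMs), covering the full technical stack from supervised fine-tuning (SFT) to preference optimization (DPO), reinforcement learning (RLHF/PPO/GRPO), and continual learning.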

    Python 19 3

  3. reinforcement-learning Public

    Forked from dennybritz/reinforcement-learning

    Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

    Jupyter Notebook 1

  4. AlbertaSat/ex2_obc_software Public

    Main repository for Athena service & equipment handler implementations

    C 10 7