
Fine-tune language models to automatically generate documentation for different programming languages (Python, Java, Go, etc.)

fastbatchai/docstring-generation


🚀 AutoDoc Course


Learn how to fine-tune language models to automatically generate high-quality docstrings across multiple programming languages.

🎯 What You'll Learn

  • Multi-task Fine-tuning: Train models to generate docstrings across multiple programming languages simultaneously
  • LLM Fine-tuning Techniques: Instruction fine-tuning and RL fine-tuning using GRPO
  • Hands-on Experience: Work with different fine-tuning libraries (PEFT, TRL, Unsloth)
  • Cloud Infrastructure: Deploy scalable training with Modal
  • Performance Evaluation: Compare models using automated metrics and evaluation frameworks
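To make the GRPO item above more concrete, here is a minimal sketch of the group-relative advantage idea behind it: several candidate docstrings are sampled for the same function, each is scored by a reward function, and the rewards are normalized within the group. The `toy_reward` heuristic, the sample completions, and the use of the population standard deviation are illustrative assumptions, not the reward or settings used in this course.

```python
from statistics import mean, pstdev


def toy_reward(docstring: str) -> float:
    """Hypothetical reward: one point each for being non-empty,
    mentioning arguments, and mentioning a return value."""
    text = docstring.lower()
    score = 0.0
    if text.strip():
        score += 1.0
    if "args" in text or "param" in text:
        score += 1.0
    if "return" in text:
        score += 1.0
    return score


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three hypothetical completions for the same function.
completions = [
    "",
    "Adds two numbers.",
    "Adds two numbers.\n\nArgs:\n    a: first.\n    b: second.\n\nReturns:\n    The sum.",
]
rewards = [toy_reward(c) for c in completions]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.2f}")
```

Completions that score above the group mean get a positive advantage and are reinforced; those below the mean are discouraged. No separate value model is needed for this step.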

🚀 Quick Start

# Clone and install
git clone https://github.com/fastbatchai/docstring-generation.git
cd docstring-generation
uv pip install -e .

# Setup Modal
modal setup

# Run training
modal run -i -m autoDoc.train --training-type sft --use-unsloth

📖 Course Lessons

📊 Results

Fine-tuning Performance: CodeGemma vs CodeGemma+LoRA

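| Language   | CodeGemma | CodeGemma+LoRA | Improvement |
|------------|-----------|----------------|-------------|
| Python     | 0.47      | 0.52           | +11%        |
| Java       | 0.57      | 0.55           | -4%         |
| JavaScript | 0.43      | 0.48           | +12%        |
| Go         | 0.49      | 0.54           | +10%        |
| PHP        | 0.42      | 0.63           | +50%        |
| Ruby       | 0.52      | 0.60           | +15%        |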

NOTE: These are preliminary results based on training with a small subset (1K samples for each programming language).
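The Improvement column appears to be the relative change of the LoRA score over the base score, rounded to the nearest percent. The short sketch below recomputes it from the reported scores. The formula is inferred from the numbers, not taken from the course code, and `relative_improvement` is a hypothetical helper name.

```python
# Scores copied from the results table: (CodeGemma, CodeGemma+LoRA).
scores = {
    "Python": (0.47, 0.52),
    "Java": (0.57, 0.55),
    "JavaScript": (0.43, 0.48),
    "Go": (0.49, 0.54),
    "PHP": (0.42, 0.63),
    "Ruby": (0.52, 0.60),
}


def relative_improvement(base: float, tuned: float) -> float:
    """Percentage change of the fine-tuned score relative to the base score."""
    return (tuned - base) / base * 100


for language, (base, tuned) in scores.items():
    print(f"{language:<10} {relative_improvement(base, tuned):+.0f}%")
```

Because the change is relative, PHP's gain from 0.42 to 0.63 counts as +50% even though the absolute difference is 0.21.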

Instruction fine-tuning results

[Figure: Model Comparison Across Different Base Models]
[Figure: LoRA Configuration Impact on Performance]

More results are available in Lesson 5: Evaluation and Comparison

🤝 Community

📄 License

MIT License - see LICENSE file for details.


⭐ Star this repository if you found it helpful!
