Skip to content

ABrain-One/nn-gpt

Repository files navigation

GPT-Driven Neural Network Generator

GitHub release
short alias lmurg

📖 Overview

This Python-based NNGPT project leverages large language models (LLMs) to automate the creation of neural network architectures, streamlining the design process for machine learning practitioners. It leverages various neural networks from the LEMUR Dataset to fine-tune LLMs and provide insights into potential architectures during the creation of new neural network models.

LangGraph Multi-Agent Workflow

NNGPT supports an optional LangGraph-based multi-agent orchestration mode. The agent system integrates directly inside tune() — no separate entry point, no duplicated logic.

Design Principle

All pipeline logic remains in ab/gpt/util/Tune.py as the single source of truth. Agent nodes are thin wrappers only — they read from state and call the existing functions. No logic is reimplemented inside any agent file.

Agent Flow

The professor-specified flow is: Finetuner → Generator → Evaluator → Predictor

  • manager — controls routing, checks epoch stop condition, decides next node
  • generator — calls nn_gen() / trans_gen(); skips if epoch < skip_epoch; skips evaluator if no code generated
  • evaluator — calls _evaluate_epoch(); stores accuracy and all predictor inputs in state
  • finetuner — calls _finetune_epoch(); increments epoch counter, returns to manager
  • predictor — optional; activates after epoch 1 and epoch 2 accuracies are both available

Any future improvement to nn_gen(), trans_gen(), _evaluate_epoch(), or _finetune_epoch() automatically applies to both classic and agent modes.

Crash Recovery

Agent mode uses LangGraph MemorySaver checkpointing. If the pipeline crashes mid-epoch (e.g. GPU OOM), re-running with the same nn_name_prefix resumes from the last completed node — no restart from epoch 0.

Usage

Enable agent mode by adding --use_agents to the standard run command:

python -m ab.gpt.TuneNNGen_7B_code_olympic_channel_alter --use_agents

To also enable the accuracy predictor agent:

python -m ab.gpt.TuneNNGen_7B_code_olympic_channel_alter --use_agents --use_predictor

Without --use_agents, the pipeline runs in the original classic mode — behaviour is identical to the unmodified pipeline.

Agent Files

File Purpose
ab/gpt/agents/run_agent.py Builds and runs the LangGraph StateGraph
ab/gpt/agents/manager.py Routing logic and epoch stop condition
ab/gpt/agents/predictor.py Optional accuracy prediction node
ab/gpt/agents/state.py Shared AgentState TypedDict — field names match LEMUR DB columns
ab/gpt/util/Tune.py Single source of truth: nn_gen, trans_gen, _evaluate_epoch, _finetune_epoch, generate_step, evaluate_step, finetune_step
ab/gpt/util/AccPredictor.py Accuracy predictor interface (to be implemented)

Create and Activate a Virtual Environment (recommended)

For Linux/Mac:

python3 -m venv .venv source .venv/bin/activate python3 -m pip install --upgrade pip

For Windows:

python3 -m venv .venv .venv\Scripts\activate python3 -m pip install --upgrade pip

It is assumed that CUDA 13.0 is installed; otherwise, consider replacing 'cu130' with the appropriate version. Most LLM usage scenarios require GPUs with at least 24 GB of memory.

Environment for NNGPT Developers

Pip package manager

Create a virtual environment, activate it, and run the following command to install all the project dependencies:

python -m pip install --upgrade pip pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu130 pip install -r req-no-isolation.txt --no-build-isolation --extra-index-url https://download.pytorch.org/whl/cu130

If there are installation problems, install the dependencies from the 'requirements.txt' file one by one.

Update of NN Dataset

To get the latest code and statistics, install the most recent version of the LEMUR Dataset from GitHub:

rm -rf db pip uninstall -y nn-dataset pip install --no-cache-dir git+https://github.com/ABrain-One/nn-dataset --extra-index-url https://download.pytorch.org/whl/cu130

Installing the stable version:

pip uninstall -y nn-dataset pip install nn-dataset --extra-index-url https://download.pytorch.org/whl/cu130

Adding functionality to export data to Excel files and generate plots for analyzing neural network performance:

pip install nn-stat --extra-index-url https://download.pytorch.org/whl/cu130

and export/generate:

python -m ab.stat.export

Installation of NNGPT with pip

 pip install nn-gpt --extra-index-url https://download.pytorch.org/whl/cu130 pip install nn-gpt[flash] --no-build-isolation --extra-index-url https://download.pytorch.org/whl/cu130

Use

  • ab.gpt.NNAlter*.py – Generates modified neural network models.
    Use the -e argument to set the number of epochs for the initial CV model generation.

  • ab.gpt.NNEval.py – Evaluates the models generated in the previous step.

  • ab.gpt.TuneNNGen*.py – Performs fine-tuning and evaluation of an LLM. For evaluation purposes, the LLM generates neural network models, which are then trained to assess improvements in the LLM’s performance on this task. The -s flag allows skipping model generation for the specified number of epochs.

Pretrained LLM weights

🐳 Docker

All versions of this project are compatible with AI Linux and can be seamlessly executed within the AI Linux Docker container.

Installing the latest version of the project from GitHub

docker run --rm -u $(id -u):ab -v $(pwd):/a/mm abrainone/ai-linux:llm bash -c "[ -d nn-gpt ] && git -C nn-gpt pull || git -c advice.detachedHead=false clone --depth 1 https://github.com/ABrain-One/nn-gpt"

Running script

docker run --rm -u $(id -u):ab --shm-size=16G -v $(pwd)/nn-gpt:/a/mm abrainone/ai-linux:llm bash -c "python -m ab.gpt.TuneNNGen_8B"

If recently added dependencies are missing in the AI Linux, you can create a container from the Docker image abrainone/ai-linux:llm, install the missing packages (preferably using pip install <package name>), and then create a new image from the container using docker commit <container name> <new image name>. You can use this new image locally or push it to the registry for deployment on the computer cluster.

Citation

The original version of this project was created at the Computer Vision Laboratory of the University of Würzburg by the authors mentioned below. If you find this project to be useful for your research, please consider citing our articles for NNGPT, architecture design and hyperparameter tuning with LLMs:

@article{ABrain.NNGPT, title = {NNGPT: Rethinking AutoML with Large Language Models}, author = {Kochnev, Roman and Khalid, Waleed and Uzun, Tolgay Atinc and Zhang, Xi and Dhameliya, Yashkumar Sanjaybhai and Qin, Furui and Vysyaraju, Chandini and Duvvuri, Raghuvir and Goyal, Avi and Ignatov, Dmitry and Timofte, Radu}, journal = {arXiv preprint}, volume = {arXiv:2511.2033}, url = {https://arxiv.org/pdf/2511.2033}, year = {2025} } @article{ABrain.Architect, title={From Memorization to Creativity: LLM as a Designer of Novel Neural-Architectures}, author={Khalid, Waleed and Ignatov, Dmitry and Timofte, Radu}, journal={arXiv preprint}, volume = {arXiv:2601.02997}, url = {https://arxiv.org/pdf/2601.02997}, year={2026} } @InProceedings{ABrain.HPGPT, title={{Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning?}}, author={Kochnev, Roman and Goodarzi, Arash Torabi and Bentyn, Zofia Antonina and Ignatov, Dmitry and Timofte, Radu}, booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)}, url={https://openaccess.thecvf.com/content/ICCV2025W/AIM/papers/Kochnev_Optuna_vs_Code_Llama_Are_LLMs_a_New_Paradigm_for_ICCVW_2025_paper.pdf}, pages = {5664--5674}, year={2025} } 

Licenses

This project is distributed under the following licensing terms:

  • models with pretrained weights under the legacy DeepSeek LLM V2 license
  • all neural network models and their weights not covered by the above licenses, as well as all other files and assets in this project, are subject to the MIT license

The idea and leadership of Dr. Ignatov

About

No description, website, or topics provided.

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages