short alias lmurg

This Python-based NNGPT project uses large language models (LLMs) to automate the creation of neural network architectures, streamlining the design process for machine learning practitioners. It fine-tunes LLMs on neural networks from the LEMUR Dataset and uses them to suggest candidate architectures when new neural network models are being created.
NNGPT supports an optional LangGraph-based multi-agent orchestration mode. The agent system integrates directly inside tune() — no separate entry point, no duplicated logic.
All pipeline logic remains in ab/gpt/util/Tune.py as the single source of truth. Agent nodes are thin wrappers only — they read from state and call the existing functions. No logic is reimplemented inside any agent file.
The professor-specified flow is: Finetuner → Generator → Evaluator → Predictor
- manager: controls routing, checks the epoch stop condition, decides the next node
- generator: calls `nn_gen()`/`trans_gen()`; skips if `epoch < skip_epoch`; skips the evaluator if no code was generated
- evaluator: calls `_evaluate_epoch()`; stores accuracy and all predictor inputs in state
- finetuner: calls `_finetune_epoch()`; increments the epoch counter, returns to manager
- predictor: optional; activates after epoch 1 and epoch 2 accuracies are both available

Any future improvement to nn_gen(), trans_gen(), _evaluate_epoch(), or _finetune_epoch() automatically applies to both classic and agent modes.
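
The thin-wrapper idea can be illustrated with a minimal, dependency-free sketch. It does not use LangGraph and is not the project's code: the three pipeline functions are hypothetical stand-ins for the real ones in `ab/gpt/util/Tune.py`, the state keys (`epoch`, `skip_epoch`, `max_epochs`, `code`, `accuracies`) are invented for illustration, and the per-epoch node order shown here is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for the real functions in ab/gpt/util/Tune.py.
def nn_gen(epoch: int) -> str:
    return f"# generated model for epoch {epoch}"

def _evaluate_epoch(code: str) -> float:
    return 0.5  # placeholder accuracy

def _finetune_epoch(epoch: int) -> None:
    pass  # the real function would fine-tune the LLM here

# Thin wrappers: read from state, call the existing function, store the result.
def generator(state: dict) -> dict:
    if state["epoch"] < state["skip_epoch"]:
        return {**state, "code": None}  # generation skipped for early epochs
    return {**state, "code": nn_gen(state["epoch"])}

def evaluator(state: dict) -> dict:
    accuracy = _evaluate_epoch(state["code"])
    return {**state, "accuracies": state["accuracies"] + [accuracy]}

def finetuner(state: dict) -> dict:
    _finetune_epoch(state["epoch"])
    return {**state, "epoch": state["epoch"] + 1, "code": None}

NODES: Dict[str, Callable[[dict], dict]] = {
    "generator": generator,
    "evaluator": evaluator,
    "finetuner": finetuner,
}

def manager(state: dict, last: Optional[str]) -> str:
    """Decide the next node; 'end' stops the loop."""
    if last in (None, "finetuner"):
        return "end" if state["epoch"] >= state["max_epochs"] else "generator"
    if last == "generator":
        # Skip the evaluator when no code was generated.
        return "evaluator" if state["code"] else "finetuner"
    return "finetuner"  # after the evaluator

def run(state: dict) -> Tuple[dict, List[str]]:
    """Run nodes until the manager returns 'end'; also return the visit order."""
    trace: List[str] = []
    last: Optional[str] = None
    while (nxt := manager(state, last)) != "end":
        state = NODES[nxt](state)
        trace.append(nxt)
        last = nxt
    return state, trace
```

Because each wrapper only forwards to a pipeline function, replacing a stand-in with an improved implementation changes the behaviour of the loop without touching the wrappers or the routing.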
Agent mode uses LangGraph MemorySaver checkpointing. If the pipeline crashes mid-epoch (e.g. GPU OOM), re-running with the same nn_name_prefix resumes from the last completed node — no restart from epoch 0.
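
The resume behaviour can be pictured with a small sketch. This is not LangGraph's checkpointer API; it is a plain dictionary keyed by a run identifier (playing the role of `nn_name_prefix`), and the step functions and the simulated out-of-memory error are hypothetical. The store lives in process memory, so the simulated crash and the re-run happen in the same process.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical checkpoint store: run id -> (index of next step, saved state).
CHECKPOINTS: Dict[str, Tuple[int, dict]] = {}

def run_with_checkpoints(run_id: str,
                         steps: List[Callable[[dict], dict]],
                         initial_state: dict) -> dict:
    """Run steps in order, saving state after each completed step."""
    start, saved = CHECKPOINTS.get(run_id, (0, initial_state))
    state = dict(saved)
    for index in range(start, len(steps)):
        state = steps[index](state)  # may raise, e.g. on GPU OOM
        CHECKPOINTS[run_id] = (index + 1, dict(state))
    return state

# Illustrative steps; `calls` records which steps actually executed.
calls: List[str] = []
_fail_once = {"armed": True}

def generate(state: dict) -> dict:
    calls.append("generate")
    return {**state, "code": "..."}

def evaluate(state: dict) -> dict:
    calls.append("evaluate")
    if _fail_once["armed"]:
        _fail_once["armed"] = False
        raise RuntimeError("simulated GPU OOM")
    return {**state, "accuracy": 0.5}

def finetune(state: dict) -> dict:
    calls.append("finetune")
    return {**state, "epoch": state["epoch"] + 1}

STEPS = [generate, evaluate, finetune]
```

Calling `run_with_checkpoints("demo", STEPS, {"epoch": 0})` once raises the simulated error inside `evaluate`; calling it again with the same run id continues from `evaluate` and does not repeat `generate`.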
Enable agent mode by adding --use_agents to the standard run command:

    python -m ab.gpt.TuneNNGen_7B_code_olympic_channel_alter --use_agents

To also enable the accuracy predictor agent:

    python -m ab.gpt.TuneNNGen_7B_code_olympic_channel_alter --use_agents --use_predictor

Without `--use_agents`, the pipeline runs in the original classic mode; behaviour is identical to the unmodified pipeline.

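
As a rough illustration of how two opt-in flags like these can select the execution mode, here is a minimal `argparse` sketch. Only the flag names come from the commands above; the helper names, the mode labels, and the choice to ignore `--use_predictor` when `--use_agents` is absent are assumptions, not the project's actual argument handling.

```python
import argparse
from typing import List

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Mode selection sketch")
    # Both flags default to False, so omitting them keeps the classic behaviour.
    parser.add_argument("--use_agents", action="store_true",
                        help="run the pipeline through the agent graph")
    parser.add_argument("--use_predictor", action="store_true",
                        help="also enable the accuracy predictor agent")
    return parser

def select_mode(argv: List[str]) -> str:
    """Return a hypothetical mode label for the given command-line arguments."""
    args = build_parser().parse_args(argv)
    if not args.use_agents:
        return "classic"  # assumption: the predictor flag alone has no effect
    return "agents+predictor" if args.use_predictor else "agents"
```

With `store_true` flags the default path needs no arguments at all, which matches the statement that classic mode is unchanged when `--use_agents` is omitted.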
| File | Purpose |
|---|---|
| `ab/gpt/agents/run_agent.py` | Builds and runs the LangGraph StateGraph |
| `ab/gpt/agents/manager.py` | Routing logic and epoch stop condition |
| `ab/gpt/agents/predictor.py` | Optional accuracy prediction node |
| `ab/gpt/agents/state.py` | Shared `AgentState` TypedDict; field names match LEMUR DB columns |
| `ab/gpt/util/Tune.py` | Single source of truth: `nn_gen`, `trans_gen`, `_evaluate_epoch`, `_finetune_epoch`, `generate_step`, `evaluate_step`, `finetune_step` |
| `ab/gpt/util/AccPredictor.py` | Accuracy predictor interface (to be implemented) |
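
To make the roles of the shared state and the predictor interface more concrete, the following sketch shows one possible shape for them. Every field name, the `AccPredictor` protocol, and the extrapolating placeholder are hypothetical; the real `AgentState` fields follow the LEMUR DB columns, and the real predictor is still to be implemented.

```python
from typing import List, Optional, Protocol, TypedDict

class AgentState(TypedDict):
    # Hypothetical fields; the real ones mirror LEMUR DB column names.
    epoch: int
    nn_name_prefix: str
    nn_code: Optional[str]
    accuracies: List[float]
    predicted_accuracy: Optional[float]

class AccPredictor(Protocol):
    """Assumed interface: map observed accuracies to a predicted accuracy."""
    def predict(self, accuracies: List[float]) -> float: ...

class ExtrapolatingPredictor:
    """Placeholder only: continue the last accuracy change, clipped to [0, 1]."""
    def predict(self, accuracies: List[float]) -> float:
        guess = accuracies[-1] + (accuracies[-1] - accuracies[-2])
        return min(1.0, max(0.0, guess))

def predictor_node(state: AgentState, predictor: AccPredictor) -> AgentState:
    # Activate only once the accuracies of the first two epochs are available.
    if len(state["accuracies"]) < 2:
        return state
    updated: AgentState = {**state,
                           "predicted_accuracy": predictor.predict(state["accuracies"])}
    return updated
```

Keeping the predictor behind a small protocol means the node stays a thin wrapper, and a real predictor can replace the placeholder without changes to the graph.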

For Linux/Mac:

    python3 -m venv .venv
    source .venv/bin/activate
    python3 -m pip install --upgrade pip

For Windows:

    python3 -m venv .venv
    .venv\Scripts\activate
    python3 -m pip install --upgrade pip

It is assumed that CUDA 13.0 is installed; otherwise, consider replacing 'cu130' with the appropriate version. Most LLM usage scenarios require GPUs with at least 24 GB of memory.

Create a virtual environment, activate it, and run the following commands to install all the project dependencies:

    python -m pip install --upgrade pip
    pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu130
    pip install -r req-no-isolation.txt --no-build-isolation --extra-index-url https://download.pytorch.org/whl/cu130

If there are installation problems, install the dependencies from the 'requirements.txt' file one by one.

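
For readers on a different CUDA release, the small helper below sketches how the `cu130` suffix relates to the CUDA version. The function name is hypothetical, and the sketch only builds the URL string; whether wheels are actually published for a given tag has to be checked against the PyTorch installation instructions.

```python
def pytorch_index_url(cuda_version: str) -> str:
    """Build the extra index URL for a CUDA version such as '13.0'."""
    parts = cuda_version.strip().split(".")
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else "0"  # treat '13' as '13.0'
    return f"https://download.pytorch.org/whl/cu{major}{minor}"
```

For example, passing `"13.0"` reproduces the URL used in the commands above, and the result can be substituted after `--extra-index-url`.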
To get the latest code and statistics, install the most recent version of the LEMUR Dataset from GitHub:

    rm -rf db
    pip uninstall -y nn-dataset
    pip install --no-cache-dir git+https://github.com/ABrain-One/nn-dataset --extra-index-url https://download.pytorch.org/whl/cu130

Installing the stable version:

    pip uninstall -y nn-dataset
    pip install nn-dataset --extra-index-url https://download.pytorch.org/whl/cu130

Adding functionality to export data to Excel files and generate plots for analyzing neural network performance:

    pip install nn-stat --extra-index-url https://download.pytorch.org/whl/cu130

and export/generate:

    python -m ab.stat.export

To install NNGPT itself as a pip package, optionally with the `flash` extra:

    pip install nn-gpt --extra-index-url https://download.pytorch.org/whl/cu130
    pip install nn-gpt[flash] --no-build-isolation --extra-index-url https://download.pytorch.org/whl/cu130

The main scripts are:

- `ab.gpt.NNAlter*.py`: generates modified neural network models. Use the `-e` argument to set the number of epochs for the initial CV model generation.
- `ab.gpt.NNEval.py`: evaluates the models generated in the previous step.
- `ab.gpt.TuneNNGen*.py`: performs fine-tuning and evaluation of an LLM. For evaluation purposes, the LLM generates neural network models, which are then trained to assess improvements in the LLM's performance on this task. The `-s` flag allows skipping model generation for the specified number of epochs.

All versions of this project are compatible with AI Linux and can be seamlessly executed within the AI Linux Docker container.
Installing the latest version of the project from GitHub:

    docker run --rm -u $(id -u):ab -v $(pwd):/a/mm abrainone/ai-linux:llm bash -c "[ -d nn-gpt ] && git -C nn-gpt pull || git -c advice.detachedHead=false clone --depth 1 https://github.com/ABrain-One/nn-gpt"

Running a script:

    docker run --rm -u $(id -u):ab --shm-size=16G -v $(pwd)/nn-gpt:/a/mm abrainone/ai-linux:llm bash -c "python -m ab.gpt.TuneNNGen_8B"

If recently added dependencies are missing in AI Linux, you can create a container from the Docker image `abrainone/ai-linux:llm`, install the missing packages (preferably using `pip install <package name>`), and then create a new image from the container using `docker commit <container name> <new image name>`. You can use this new image locally or push it to the registry for deployment on the computer cluster.

The original version of this project was created at the Computer Vision Laboratory of the University of Würzburg by the authors mentioned below. If you find this project to be useful for your research, please consider citing our articles for NNGPT, architecture design and hyperparameter tuning with LLMs:

    @article{ABrain.NNGPT,
      title = {NNGPT: Rethinking AutoML with Large Language Models},
      author = {Kochnev, Roman and Khalid, Waleed and Uzun, Tolgay Atinc and Zhang, Xi and Dhameliya, Yashkumar Sanjaybhai and Qin, Furui and Vysyaraju, Chandini and Duvvuri, Raghuvir and Goyal, Avi and Ignatov, Dmitry and Timofte, Radu},
      journal = {arXiv preprint},
      volume = {arXiv:2511.2033},
      url = {https://arxiv.org/pdf/2511.2033},
      year = {2025}
    }

    @article{ABrain.Architect,
      title = {From Memorization to Creativity: LLM as a Designer of Novel Neural-Architectures},
      author = {Khalid, Waleed and Ignatov, Dmitry and Timofte, Radu},
      journal = {arXiv preprint},
      volume = {arXiv:2601.02997},
      url = {https://arxiv.org/pdf/2601.02997},
      year = {2026}
    }

    @InProceedings{ABrain.HPGPT,
      title = {{Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning?}},
      author = {Kochnev, Roman and Goodarzi, Arash Torabi and Bentyn, Zofia Antonina and Ignatov, Dmitry and Timofte, Radu},
      booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
      url = {https://openaccess.thecvf.com/content/ICCV2025W/AIM/papers/Kochnev_Optuna_vs_Code_Llama_Are_LLMs_a_New_Paradigm_for_ICCVW_2025_paper.pdf},
      pages = {5664--5674},
      year = {2025}
    }

This project is distributed under the following licensing terms:
- models with pretrained weights under the legacy DeepSeek LLM V2 license
- all neural network models and their weights not covered by the above licenses, as well as all other files and assets in this project, are subject to the MIT license