This repository contains the data and code for PolyCL.
- Dependency: You will need only `polycl.py` in this repository and the `torch` and `transformers` packages as the minimum requirement.
- Obtain the polymer embedding: Simply follow the demonstration in `PolyCL_Easy_Usage.ipynb`.
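Before opening the notebook, it can help to confirm that the two minimum packages are importable in the active environment. The snippet below is a minimal sketch of such a check; the helper name `missing_packages` is hypothetical and not part of this repository.

```python
from importlib.util import find_spec

# Minimum packages named in the dependency note above.
MINIMUM_PACKAGES = ("torch", "transformers")


def missing_packages(packages=MINIMUM_PACKAGES):
    """Return the top-level package names that cannot be found for import.

    Hypothetical helper for illustration only; it is not part of PolyCL.
    """
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Minimum requirements are importable.")
```

The check only looks the packages up and does not import them, so it stays fast even when `torch` is installed.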
You might need to configure Git LFS first. Download Git LFS following the instructions at https://git-lfs.com/ , then install it using:

    $ git lfs install

After Git LFS is properly configured:

    $ git clone https://github.com/JiajunZhou96/PolyCL.git
    # create a new environment
    $ conda create --name polycl python=3.9
    $ conda activate polycl
    # install requirements
    #$ pip install numpy==1.26.4
    #$ pip install pandas==1.3.3
    #$ pip install scikit-learn==0.24.2
    $ pip install torch==1.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
    $ pip install transformers==4.20.1
    $ pip install -U torchmetrics
    $ pip install tensorboard
    $ pip install tqdm
    $ conda install -c conda-forge rdkit
    $ pip install torch-geometric==1.7.2 torch-sparse==0.6.18 torch-scatter==2.1.2 -f https://pytorch-geometric.com/whl/torch-1.12.0+cu113.html

Run `train.py` for pretraining, with the key parameters summarized in `config.json`.

Run `transfer_learning.py` for transfer learning, with the sample configurations described in `config_tf_notebook.json`.

Models available for benchmarking are stored in the `./benchmark/` directory.
- polyBERT
  - Run `tf_polybert.py`; the polyBERT model will be downloaded automatically from https://huggingface.co/kuelumbus/polyBERT .
- TransPolymer
  - Download the TransPolymer model folder "pretrain.pt" from https://github.com/ChangwenXu98/TransPolymer/tree/master/ckpt .
  - Place the folder in the directory `"./model/Trasnpolymer/"` so that it can be referenced as `"./model/Trasnpolymer/pretrain.pt"`.
  - Run `tf_transpolymer.py`.
- GNNs
  - Assign "gcn" or "gin" to the key "gnn_type" in `config_graph.json` to select the type of GNN.
  - Run `gnn.py`.
- Neural network, random forest, and XGBoost
  - Run `morgan_nn.py` to use a neural network.
  - Run `rf.py` to use a random forest.
  - Run `xgb.py` to use XGBoost.
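For the GNN benchmark, the "gnn_type" key can also be switched from a script rather than by editing the file by hand. The following is a minimal sketch, assuming `config_graph.json` is a JSON object with "gnn_type" as a top-level key and that the path is given relative to the working directory. The helper `set_gnn_type` is hypothetical and not part of this repository.

```python
import json
from pathlib import Path

# The two values the benchmark instructions mention for "gnn_type".
ALLOWED_GNN_TYPES = ("gcn", "gin")


def set_gnn_type(gnn_type, config_path="config_graph.json"):
    """Set the "gnn_type" key in a JSON config, leaving other keys untouched.

    Hypothetical helper for illustration; assumes "gnn_type" is a top-level key.
    """
    if gnn_type not in ALLOWED_GNN_TYPES:
        raise ValueError(f"gnn_type must be one of {ALLOWED_GNN_TYPES}, got {gnn_type!r}")
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["gnn_type"] = gnn_type
    # Rewrite the whole file; original formatting and comments are not preserved.
    path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

Calling `set_gnn_type("gin")` before running `gnn.py` would select GIN under these assumptions. Adjust `config_path` if the config file lives in another directory.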
