Couldn't find plip/wiki_dev as the validation dataset #16

@yiwang454

Hi @xplip, thanks for sharing your code. I'm currently running the pre-training scripts and have run into a problem locating the validation dataset.
I get an error like "Repository Not Found for url: https://huggingface.co/api/datasets/plip/wiki_dev", which suggests that the "validation_dataset_name": "plip/wiki_dev" entry in the config file points to an evaluation dataset that is not available on the Hugging Face Hub.

The detailed error message is below:

Traceback (most recent call last):
File "/exports/eddie/scratch/s2522559/pixel_project/pixel/modify_running_script.py", line 145, in <module>
main()
File "/exports/eddie/scratch/s2522559/pixel_project/pixel/modify_running_script.py", line 142, in main
trainer()
File "/exports/eddie/scratch/s2522559/pixel_project/pixel/modify_running_script.py", line 106, in __call__
trainer.main(self.config_dict)
File "/exports/eddie/scratch/s2522559/pixel_project/pixel/scripts/training/run_pretraining.py", line 325, in main
validation_dataset = load_dataset(
File "/exports/eddie/scratch/s2522559/conda/envs/pixel/lib/python3.9/site-packages/datasets/load.py", line 1676, in load_dataset
builder_instance = load_dataset_builder(
File "/exports/eddie/scratch/s2522559/conda/envs/pixel/lib/python3.9/site-packages/datasets/load.py", line 1502, in load_dataset_builder
dataset_module = dataset_module_factory(
File "/exports/eddie/scratch/s2522559/conda/envs/pixel/lib/python3.9/site-packages/datasets/load.py", line 1254, in dataset_module_factory
raise e1 from None
File "/exports/eddie/scratch/s2522559/conda/envs/pixel/lib/python3.9/site-packages/datasets/load.py", line 1225, in dataset_module_factory
raise e
File "/exports/eddie/scratch/s2522559/conda/envs/pixel/lib/python3.9/site-packages/datasets/load.py", line 1205, in dataset_module_factory
dataset_info = hf_api.dataset_info(
File "/exports/eddie/scratch/s2522559/conda/envs/pixel/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/exports/eddie/scratch/s2522559/conda/envs/pixel/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1761, in dataset_info
hf_raise_for_status(r)
File "/exports/eddie/scratch/s2522559/conda/envs/pixel/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 293, in hf_raise_for_status
raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-654baa8e-2186f9fe22cc891357596294;926d6ac1-fd23-457c-b19d-8707baa23362)

Repository Not Found for url: https://huggingface.co/api/datasets/plip/wiki_dev.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

Do you have any idea how to get access to the development set?
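
One way to confirm that the failure comes from the repository itself (missing or private) rather than from the local setup is to query the Hub directly before launching training. The snippet below is only a minimal sketch: `dataset_is_reachable` and its `lookup` parameter are hypothetical names introduced for illustration, and it assumes `huggingface_hub` is installed for the live check. It relies on `HfApi.dataset_info` raising `RepositoryNotFoundError`, as in the traceback above.

```python
from typing import Callable, Optional

try:
    from huggingface_hub import HfApi
    from huggingface_hub.utils import RepositoryNotFoundError
except ImportError:  # allows reading/testing the sketch without the library
    HfApi = None

    class RepositoryNotFoundError(Exception):
        """Stand-in used only when huggingface_hub is not installed."""


def dataset_is_reachable(
    repo_id: str,
    lookup: Optional[Callable[[str], object]] = None,
) -> bool:
    """Return True if the dataset repo is visible with the current credentials.

    `lookup` is a hypothetical hook (not part of any library) that defaults to
    HfApi().dataset_info; it exists so the check can be exercised offline.
    """
    if lookup is None:
        if HfApi is None:
            raise RuntimeError("huggingface_hub is required for a live lookup")
        lookup = HfApi().dataset_info
    try:
        lookup(repo_id)
    except RepositoryNotFoundError:
        # The Hub reports missing repos and inaccessible private repos the
        # same way, so False means "not visible", not strictly "nonexistent".
        return False
    return True
```

Calling `dataset_is_reachable("plip/wiki_dev")` without a `lookup` argument performs a live request. If it returns `False`, the dataset is either not published or requires authentication, and the `validation_dataset_name` value in the config would need to point to a dataset that the current account can access.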
