I am trying to load a dataset with the datasets Python module in a local Python notebook. The notebook runs a Python 3.10.13 kernel, the same version as my virtual environment.
I cannot load the dataset used in the tutorial I am following. Here's the error:
    ---------------------------------------------------------------------------
    NotImplementedError                       Traceback (most recent call last)
    /Users/ari/Downloads/00-fine-tuning.ipynb Celda 2 line 3
          1 from datasets import load_dataset
    ----> 3 data = load_dataset(
          4     "jamescalam/agent-conversations-retrieval-tool",
          5     split="train"
          6 )
          7 data

    File ~/Documents/fastapi_language_tutor/env/lib/python3.10/site-packages/datasets/load.py:2149, in load_dataset(path, name, data_dir, data_files, split, cache_dir, features, download_config, download_mode, verification_mode, ignore_verifications, keep_in_memory, save_infos, revision, token, use_auth_token, task, streaming, num_proc, storage_options, **config_kwargs)
       2145 # Build dataset for splits
       2146 keep_in_memory = (
       2147     keep_in_memory if keep_in_memory is not None else is_small_dataset(builder_instance.info.dataset_size)
       2148 )
    -> 2149 ds = builder_instance.as_dataset(split=split, verification_mode=verification_mode, in_memory=keep_in_memory)
       2150 # Rename and cast features to match task schema
       2151 if task is not None:
       2152     # To avoid issuing the same warning twice

    File ~/Documents/fastapi_language_tutor/env/lib/python3.10/site-packages/datasets/builder.py:1173, in DatasetBuilder.as_dataset(self, split, run_post_process, verification_mode, ignore_verifications, in_memory)
       1171 is_local = not is_remote_filesystem(self._fs)
       1172 if not is_local:
    -> 1173     raise NotImplementedError(f"Loading a dataset cached in a {type(self._fs).__name__} is not supported.")
       1174 if not os.path.exists(self._output_dir):
       1175     raise FileNotFoundError(
       1176         f"Dataset {self.dataset_name}: could not find data in {self._output_dir}. Please make sure to call "
       1177         "builder.download_and_prepare(), or use "
       1178         "datasets.load_dataset() before trying to access the Dataset object."
       1179     )

    NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported.

How do I resolve this?
I don't understand how this error is applicable, given that the dataset is something I am fetching and thus cannot be cached in my LocalFileSystem in the first place.
pip install fsspec==2023.9.2 and then try again

Python 3.10.13. What's your Python kernel version?

fsspec version.
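Before pinning anything, it can help to see which versions the notebook kernel is actually using. The snippet below is a minimal sketch that assumes nothing beyond the standard library. The helper names (installed_version, version_tuple, is_newer_than_pin) are made up for illustration, and the only thing it compares against is the 2023.9.2 pin suggested above. It does not prove that a newer fsspec is the cause.

```python
from importlib import metadata

# Version suggested in this thread; used only as a comparison point.
PINNED_FSSPEC = "2023.9.2"


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '2023.10.0' into (2023, 10, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_newer_than_pin(version, pin=PINNED_FSSPEC):
    """True if the given version sorts after the pinned one."""
    return version_tuple(version) > version_tuple(pin)


if __name__ == "__main__":
    for name in ("datasets", "fsspec"):
        print(name, installed_version(name))

    fsspec_version = installed_version("fsspec")
    if fsspec_version is not None and is_newer_than_pin(fsspec_version):
        print(f"fsspec is newer than {PINNED_FSSPEC}; the pin above may be worth trying.")
```

Run it in the same kernel that raises the error, since the notebook may be using a different environment than your terminal. After changing package versions with pip, restart the kernel so the new versions are actually imported.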