I have audio files and I'd like to make a tf.data.Dataset from their audio content (i.e. each audio file in the dataset should be represented as a vector of float values).
Here's my code:

    def convert_audio_file_to_numpy_array(filepath):
        sample_rate = sox.file_info.sample_rate(filepath)
        audio, sr = librosa.load(filepath, sr=sample_rate)
        array = np.asarray(audio)
        return array

    filenames_ds = tf.data.Dataset.from_tensor_slices(input_filepaths)
    waveforms_ds = filenames_ds.map(convert_audio_file_to_numpy_array, num_parallel_calls=tf.data.AUTOTUNE)

This produces this error:

    TypeError: stat: path should be string, bytes, os.PathLike or integer, not Tensor
I'm using Dataset's map function following the pattern in this official tutorial (see the call to files_ds.map). In it, the function that map uses takes a filepath.
What am I doing differently to the official tutorial?
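sox.file_info.sample_rate expects a string or a Path, but the values in your dataset are Tensors (of type string). Dataset.map traces your function as a TensorFlow graph, so filepath arrives as a symbolic string Tensor rather than a Python str, and ordinary Python libraries such as sox and librosa cannot read it. If you want to load things from a string Tensor, you need to use tf.io functions or similar.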
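As a rough sketch of the tf.io route, assuming the files are 16-bit PCM WAV (the only format tf.audio.decode_wav reads): the function below uses only TensorFlow ops, so it can be passed to map directly. The name load_wav and the silent placeholder clips are made up for illustration; in practice input_filepaths would be your own list of paths.

```python
import os
import tempfile

import tensorflow as tf


def load_wav(filepath):
    # filepath is a scalar string tensor here, so only TensorFlow ops are used.
    contents = tf.io.read_file(filepath)
    # Assumption: 16-bit PCM WAV input. Samples come back as float32 in [-1.0, 1.0].
    audio, sample_rate = tf.audio.decode_wav(contents, desired_channels=1)
    # Drop the channel axis so each element is a 1-D vector of floats.
    return tf.squeeze(audio, axis=-1)


# Write two short silent WAV files so the sketch runs end to end
# (placeholders standing in for real audio files).
tmp_dir = tempfile.mkdtemp()
input_filepaths = []
for i, num_samples in enumerate([800, 1600]):
    samples = tf.zeros([num_samples, 1], dtype=tf.float32)
    path = os.path.join(tmp_dir, f"clip_{i}.wav")
    tf.io.write_file(path, tf.audio.encode_wav(samples, sample_rate=16000))
    input_filepaths.append(path)

filenames_ds = tf.data.Dataset.from_tensor_slices(input_filepaths)
waveforms_ds = filenames_ds.map(load_wav, num_parallel_calls=tf.data.AUTOTUNE)

# Iterating eagerly yields one float32 vector per file.
waveform_lengths = [int(w.shape[0]) for w in waveforms_ds]
```

If the files are not WAV, or you need to keep librosa, the other common option is to wrap the original Python function in tf.py_function, where the argument is an eager tensor and filepath.numpy().decode() gives a normal string. That path runs ordinary Python inside the pipeline, so it may parallelize less well than pure TensorFlow ops.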