Timeline for Why does one need Google's WaveNet model to generate audio if it already takes audio as input?

5 events

when	what		by	license	comment
Oct 17, 2020 at 20:30	vote	accept	Joe Black
Jul 15, 2020 at 3:56	comment	added	Tim Mak		In the paper, "operate on audio waveform" does not mean "take audio waveform as input". It simply means that they model the audio waveform directly. Your post is off topic though. Try StackOverflow next time perhaps.
Jun 15, 2020 at 21:58	comment	added	Joe Black		Where's the quote "creates a raw audio waveform from the text it is given" in the paper? I couldn't find it there. I understand WaveNet is supposed to generate audio, and that's exactly why it's unclear to me; it's the reason stated in the title and why I asked this question.
Jun 15, 2020 at 21:55	comment	added	Joe Black		I understand it's supposed to generate audio, but could you reconcile that with what I quoted? How else is one to interpret "operating directly on the raw audio waveform"? What's the input to WaveNet when it's used with Tacotron 2 for text-to-speech, especially the input to the `input_convolution` described in the OP?
Jun 15, 2020 at 20:36	history	answered	sjp	CC BY-SA 4.0