Merging AudioReader, TextReader and ImageReader #166

Zeta36 · 2016-10-26T14:21:15Z

As requested in #162 and #117, I've merged the AudioReader, the TextReader and the ImageReader so anybody can train on and generate not only wav files but also text and images (#117 and #129).

The changes are transparent; the only thing that has to be done is to set two new parameters in wavenet_params.json:
{
....
"raw_type": "Audio",
"file_ext": "*.wav"
}

This way, you can train the model on text by just copying in a folder of text files and then setting the params in wavenet_params.json to:
{
....
"raw_type": "Text",
"file_ext": "*.txt"
}

For image testing, just use:
{
....
"raw_type": "Image",
"file_ext": "*.jpg"
}

The file_ext can be changed to any pattern, such as "*.gif", "*.mp3", etc.
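As a rough illustration of how the two parameters could drive file discovery, here is a minimal sketch. Only the keys "raw_type" and "file_ext" come from the description above; the helper find_files and the throwaway corpus are assumptions made for this example and are not taken from the PR's code.

```python
import fnmatch
import json
import os
import tempfile

# Hypothetical parameter fragment; only "raw_type" and "file_ext"
# are named in the PR description.
params = json.loads('{"raw_type": "Text", "file_ext": "*.txt"}')


def find_files(directory, pattern):
    """Recursively collect files whose names match a glob pattern."""
    matches = []
    for root, _dirs, filenames in os.walk(directory):
        for filename in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(root, filename))
    return sorted(matches)


# Build a small throwaway corpus to exercise the pattern.
corpus = tempfile.mkdtemp()
for name in ("a.txt", "b.txt", "c.wav"):
    open(os.path.join(corpus, name), "w").close()

found = [os.path.basename(p) for p in find_files(corpus, params["file_ext"])]
print(params["raw_type"], found)
```

With "file_ext" set to "*.txt", only the two text files are picked up and the wav file is ignored.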

At generation time the only change is that I've renamed the argument --wav_out_path to --file_out_path. Nothing else.

I've tested everything on my machine and it all works. I don't know whether these changes will pass the test cases. Anyway, I think this is a good thing for the team.

Could somebody help me with this by doing two things?

Update the test cases if this PR fails them because of an outdated test implementation.
Update the Readme file to explain the new changes a little better, including how to set the two new params in wavenet_params.json.

Thank you!!

Regards,
Samu.

Zeta36 · 2016-10-26T14:21:41Z

Merging from the AudioReader, the TextReader and the ImageReader so anybody can train and generate not only wav files but also texts and images (#117 and #129).

jyegerlehner · 2016-10-26T14:56:05Z

Regarding the test failures: you can run these tests locally on your own machine with:

sh ci/test.sh

Zeta36 · 2016-10-26T15:37:10Z

@jyegerlehner, yes, I know. But I don't have much time. I did my best this morning (in Spain) trying to merge all the readers. That's why I asked somebody to help me in these two simple ways:

Changing the test cases if this PR fails them because of an outdated test implementation.
Changing the Readme file to explain the new changes a little better, including how to set the two new params in wavenet_params.json.

I think you could help me with this @jyegerlehner. I'd be very grateful.

Regards.

jyegerlehner · 2016-10-26T16:38:29Z

@Zeta36 I also have demands on my time. There's no rush. You can get to it whenever you can get around to it.

Zeta36 · 2016-10-26T19:39:13Z

All the tests pass in Python 2.7, but I don't know why I get an error in Python 3.5 (https://travis-ci.org/ibab/tensorflow-wavenet/builds/170878554):

All four test files fail with the following error:

  File "/home/travis/build/ibab/tensorflow-wavenet/test/test_model.py", line 11, in <module>
    from wavenet import (WaveNetModel, time_to_batch, batch_to_time, causal_conv,
  File "/home/travis/build/ibab/tensorflow-wavenet/wavenet/__init__.py", line 1, in <module>
    from .model import WaveNetModel
  File "/home/travis/build/ibab/tensorflow-wavenet/wavenet/model.py", line 3, in <module>
    from .ops import causal_conv
  File "/home/travis/build/ibab/tensorflow-wavenet/wavenet/ops.py", line 5, in <module>
    import audio_reader
ImportError: No module named 'audio_reader'

Does anybody know what it could be? Everything works in Python 2.7, and locally the tests pass in 3.5 too.

Regards.

Zeta36 · 2016-10-26T21:00:28Z

No luck. All tests pass locally on my machine with:
sh ci/test.sh

But on the GitHub Travis CI build they pass for Python 2.7 and not for Python 3.5 (https://travis-ci.org/ibab/tensorflow-wavenet/builds/170909383)

I don't know what the problem is.

lemonzi · 2016-10-26T21:00:28Z

This is great, thanks a lot! I had assumed you would have made enough changes to make a merge that supported all cases very difficult.

Regarding the bug, I found this on StackOverflow: http://stackoverflow.com/questions/12172791/changes-in-import-statement-python3. So, from what people say and if I understood correctly, the safest way to proceed would be to always import wavenet.whatever, both in the script and in the internal code. I'm no Python expert though, and using a relative import seemed to work so far.
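A minimal sketch of the import difference described above, using a throwaway package built in a temporary directory. The names demo_pkg, demo_reader and the ops_* modules are hypothetical stand-ins for wavenet, audio_reader.py and ops.py; nothing here is taken from the repository. On Python 3 the bare `import demo_reader` inside the package is expected to fail, while the explicit relative form and the absolute form both succeed.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk. All names are hypothetical stand-ins.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

files = {
    "__init__.py": "",
    "demo_reader.py": "NAME = 'demo_reader'\n",
    # Python 2 style implicit relative import: not resolved on Python 3.
    "ops_implicit.py": "import demo_reader\n",
    # Explicit relative import: works on Python 2.5+ and Python 3.
    "ops_relative.py": "from . import demo_reader\n",
    # Absolute import through the package name: works on both.
    "ops_absolute.py": "from demo_pkg import demo_reader\n",
}
for name, body in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(body)

# Only the parent directory goes on the path, as with an installed package.
sys.path.insert(0, root)
importlib.invalidate_caches()


def try_import(module_name):
    """Return True if the module imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


results = {name: try_import("demo_pkg." + name)
           for name in ("ops_implicit", "ops_relative", "ops_absolute")}
print(results)
```

This matches the traceback earlier in the thread: the module sits inside the package directory, which is not itself on sys.path, so a bare import cannot find it under Python 3.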

Zeta36 · 2016-10-26T21:45:32Z

Thank you @lemonzi. You were right about the problem.

All tests passed now ;).

If these changes are eventually merged into master, I think I could soon have a working version of the global conditioning too.

jyegerlehner · 2016-10-31T22:25:56Z

Thanks for this @Zeta36

I pulled the branch to sanity check that the usual use-cases still work. When I try to train on the usual audio stuff I get this exception:

  File "train.py", line 316, in <module>
    main()
  File "train.py", line 217, in main
    input_batch = reader.dequeue(args.batch_size)
  File "/home/jd/dev/models/tensorflow/tensorflow-wavenet/wavenet/audio_reader.py", line 89, in dequeue
    encode_output = mu_law_encode(output, self.quantization_channels)
NameError: global name 'mu_law_encode' is not defined

whereas with the current master it runs OK.

jyegerlehner · 2016-10-31T22:35:28Z

wavenet/ops.py

+def FileReader(data_dir, coord, sample_rate, sample_size,
+               silence_threshold, quantization_channels,
+               pattern, EPSILON=0.001, raw_type="Audio"):
+    if raw_type == "Audio":


Rather than this if-elif we might prefer a FileReader factory, something like lemonzi had me do here for creating the optimizer.
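One way to read the factory suggestion is a dictionary that maps each raw_type string to a reader class, as sketched below. The reader classes here are empty placeholders with a simplified constructor (the real ones in the diff take coord, sample_rate and other arguments), and READER_FACTORY and create_reader are hypothetical names chosen for this illustration.

```python
class BaseReader(object):
    """Placeholder reader; the real classes take more arguments."""

    def __init__(self, data_dir, pattern):
        self.data_dir = data_dir
        self.pattern = pattern


class AudioReader(BaseReader):
    pass


class TextReader(BaseReader):
    pass


class ImageReader(BaseReader):
    pass


# The string literals appear in exactly one place.
READER_FACTORY = {
    "Audio": AudioReader,
    "Text": TextReader,
    "Image": ImageReader,
}


def create_reader(raw_type, **kwargs):
    """Look up the reader class for raw_type and instantiate it."""
    try:
        reader_cls = READER_FACTORY[raw_type]
    except KeyError:
        raise ValueError("Unknown raw_type %r; expected one of %s"
                         % (raw_type, sorted(READER_FACTORY)))
    return reader_cls(**kwargs)


reader = create_reader("Text", data_dir="./corpus", pattern="*.txt")
print(type(reader).__name__)
```

An unknown raw_type fails at a single lookup with a clear message instead of silently falling through an if-elif chain.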

jyegerlehner · 2016-11-01T01:20:05Z

wavenet/ops.py

+
+
+def write_output(waveform, filename, sample_rate, raw_type="Audio"):
+    if raw_type == "Image":


Taking the above idea a bit further, we could have a dictionary of FileHandler objects with the factory method for creating the right kind of Reader, a write method, bool IsAudio() and so on. That would allow us to eliminate the if statements with magic string literals that are littered across the code. I think it will be more maintainable because it won't force anyone changing the code to know all the places where those string literals appear, and the code would be more compact. For example, this if-else statement could be replaced with a single line:

file_handler[raw_type].write_output(waveform, filename)
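A minimal sketch of what such a handler dictionary might look like. Only the names file_handler, write_output and the idea of an IsAudio-style query come from the comment above; the FileHandler class, the is_audio spelling and both writer functions are assumptions for illustration, and write_raw_bytes is just a placeholder for a real audio or image encoder.

```python
import os
import tempfile


class FileHandler(object):
    """Bundles the behaviour that depends on the data type."""

    def __init__(self, is_audio, writer):
        self._is_audio = is_audio
        self._writer = writer

    def is_audio(self):
        return self._is_audio

    def write_output(self, samples, filename):
        self._writer(samples, filename)


def write_text(samples, filename):
    # Interpret each generated sample as a character code.
    with open(filename, "w") as f:
        f.write("".join(chr(s) for s in samples))


def write_raw_bytes(samples, filename):
    # Placeholder only: dumps samples as bytes, no real encoding.
    with open(filename, "wb") as f:
        f.write(bytearray(samples))


file_handler = {
    "Audio": FileHandler(is_audio=True, writer=write_raw_bytes),
    "Image": FileHandler(is_audio=False, writer=write_raw_bytes),
    "Text": FileHandler(is_audio=False, writer=write_text),
}

raw_type = "Text"
out_path = os.path.join(tempfile.mkdtemp(), "generated.txt")
file_handler[raw_type].write_output([72, 105], out_path)

with open(out_path) as f:
    text = f.read()
print(text)
```

Callers only ever index the dictionary with raw_type, so adding a new data type means adding one entry rather than touching every if statement.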

Samuel Graván added 11 commits October 26, 2016 14:56

Update wavenet_params.json

c021ffe

Update train.py

e33c821

Update requirements.txt

26b7f40

Update generate.py

70e2ce5

Update audio_reader.py

643c3e7

Update __init__.py

44bcfca

Update model.py

56a1824

Update ops.py

332eadc

Add files via upload

7a44106

Update README.md

4a4773e

Update README.md

af530ae

Samuel Graván added 15 commits October 26, 2016 19:21

Update audio_reader.py

45facc0

Update __init__.py

b435a2b

Update image_reader.py

23bdca5

Update text_reader.py

a7bde0f

Update ops.py

7186d10

Update test_model.py

b5b89e9

Update ops.py

ba53667

Update ops.py

2c6c9c7

Update README.md

0d08c4f

Update README.md

27bc0fc

Update README.md

ca1814b

Update README.md

f3c1dab

Update README.md

10a0929

Update README.md

2c094f4

Update README.md

ab3d3f2

Update README.md

07fb070

Zeta36 closed this Oct 26, 2016

Zeta36 reopened this Oct 26, 2016

Samuel Graván added 3 commits October 26, 2016 21:41

Update __init__.py

b3459db

Update __init__.py

a5140c3

Update __init__.py

a202b0e

Samuel Graván added 3 commits October 26, 2016 22:09

Update ops.py

e280e2f

Update audio_reader.py

cf29635

Update audio_reader.py

d686cba

jyegerlehner reviewed Oct 31, 2016


jyegerlehner reviewed Nov 1, 2016
