GitHub link to the code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/text_classification.py

The code linked above is an example of text classification using either an RNN or a bag-of-words model.

I trained the model on Ubuntu's conversation corpus. After training, I modified the code to run predictions with the trained model (the changes are described on another page).

I added the prediction input as follows:

  x_test = pandas.DataFrame(data=["What's the apt-get equivalent to rpm -qi?"])
  y_test = pandas.Series(data=[])

When I then ran the code, it failed with the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/preprocessing/text.py", line 168, in fit_transform
    self.fit(raw_documents)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/preprocessing/text.py", line 150, in fit
    for tokens in self._tokenizer(raw_documents):
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/preprocessing/text.py", line 51, in tokenizer
    yield TOKENIZER_RE.findall(value)
TypeError: buffer size mismatch

After trying different things for a while, I ended up with the small change below, which worked: the input is built as a pandas.Series instead of a pandas.DataFrame.

  x_test = pandas.Series(data=["What's the apt-get equivalent to rpm -qi?"])
  y_test = pandas.Series(data=[])
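
A likely explanation, offered as a sketch rather than a confirmed diagnosis: the tokenizer in the traceback loops over whatever it is given and calls TOKENIZER_RE.findall on each item. Iterating over a pandas.DataFrame yields its column labels (here the integer 0), while iterating over a pandas.Series yields the stored strings. The snippet below reproduces that difference with a simplified stand-in regex; TOKEN_RE and tokenize are hypothetical names for illustration, not the library's own code, and the exact error message depends on the Python version.

```python
import re

import pandas

# Simplified stand-in for the library's tokenizer regex (an assumption;
# the real TOKENIZER_RE in text.py is more elaborate).
TOKEN_RE = re.compile(r"[\w'\-]+", re.UNICODE)


def tokenize(documents):
    """Tokenize every item produced by iterating over `documents`."""
    return [TOKEN_RE.findall(value) for value in documents]


question = "What's the apt-get equivalent to rpm -qi?"

frame = pandas.DataFrame(data=[question])
series = pandas.Series(data=[question])

# Iterating a DataFrame yields its column labels, not its rows.
frame_items = list(frame)

# Iterating a Series yields the stored values.
series_items = list(series)

# The Series works because each item is a string.
series_tokens = tokenize(series)

# The DataFrame fails because the regex receives an integer column label.
try:
    tokenize(frame)
    frame_error = None
except TypeError as exc:
    frame_error = exc

print("DataFrame iteration:", frame_items)
print("Series iteration:", series_items)
print("Tokens from Series:", series_tokens)
print("Error from DataFrame:", repr(frame_error))
```

If the data already lives in a DataFrame, selecting the single column (for example frame[0]) gives a Series that can be passed on in the same way.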
