
NLTK, Reading In Word Numbers To Float Numbers

I've looked at the corpus section of NLTK, but there doesn't seem to be a numbers corpus. I want to change word numbers into text. For example: input: one thousand two hundred fort

Solution 1:

There isn't one. What you need to do is build on the answers to "Is there a way to convert number words to Integers?" or on some other implementation you find more useful or easier to work with.

To start off, you'll need a regex to extract the strings of interest (e.g. one, two, ...), then replace each match using the conversion code from that question.
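A minimal sketch of that extract-then-replace idea follows. It is not the code from the linked question; `words_to_int`, `replace_number_words`, and the word tables are hypothetical names written for illustration, and it only handles whole English cardinals up to the billions (no decimals, fractions, or ordinals).

```python
import re

# Hypothetical lookup tables for this sketch (not part of NLTK).
UNITS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
SCALES = {"hundred": 100, "thousand": 1000,
          "million": 10**6, "billion": 10**9}

NUMWORDS = {word: value for value, word in enumerate(UNITS)}
NUMWORDS.update({word: value * 10 for value, word in enumerate(TENS) if word})


def words_to_int(text):
    """Convert a run of number words such as 'two hundred forty' to an int."""
    total = current = 0
    for word in re.split(r"[\s-]+", text.lower().strip()):
        if word == "and":            # 'one hundred and five'
            continue
        if word in NUMWORDS:
            current += NUMWORDS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word in SCALES:         # thousand, million, billion
            total += max(current, 1) * SCALES[word]
            current = 0
        else:
            raise ValueError(f"not a number word: {word!r}")
    return total + current


# Longest alternatives first so 'seventeen' is tried before 'seven'.
_WORD = "|".join(sorted(list(NUMWORDS) + list(SCALES), key=len, reverse=True))
# A run is one number word followed by more, joined by spaces, hyphens or 'and'.
# Limitation: 'one and two' is treated as a single run.
NUMBER_RUN = re.compile(
    rf"\b(?:{_WORD})\b(?:(?:\s+and\s+|[\s-]+)\b(?:{_WORD})\b)*",
    re.IGNORECASE,
)


def replace_number_words(text):
    """Replace every run of number words in text with its digits."""
    return NUMBER_RUN.sub(lambda m: str(words_to_int(m.group(0))), text)


print(replace_number_words("I paid one thousand two hundred forty dollars"))
```

The regex only finds the candidate spans; all of the arithmetic lives in `words_to_int`, so that function can be swapped for whichever converter you prefer.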

The first example you've given will be the easiest of the three. For the last example, just divide the converted number by 100, since the converter's output is actually an integer. The second one will be a little trickier, as you'll have to modify the code or possibly write a whole new function.

AFAIK, there is no module that will parse the whole text for that.

Another possibility, as I looked further into this, is to use the CD (cardinal number) tag from a part-of-speech tagger or tree parser to help identify numbers. But you'll still need a conversion function similar to the one mentioned above.
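As a rough sketch of the CD-tag idea, the snippet below groups consecutive tokens tagged CD into candidate number spans. `cardinal_spans` is a hypothetical helper, and the tagged list is hand-written for illustration rather than real tagger output. In practice the (word, tag) pairs would come from something like `nltk.pos_tag(nltk.word_tokenize(text))`, which needs the tokenizer and tagger models downloaded first and may not tag every number word as CD.

```python
from itertools import groupby


def cardinal_spans(tagged_tokens):
    """Join each run of consecutive CD-tagged tokens into one string."""
    spans = []
    for is_cd, group in groupby(tagged_tokens, key=lambda pair: pair[1] == "CD"):
        if is_cd:
            spans.append(" ".join(word for word, _tag in group))
    return spans


# Hand-written (word, tag) pairs standing in for tagger output (assumption).
tagged = [("I", "PRP"), ("paid", "VBD"), ("one", "CD"), ("thousand", "CD"),
          ("two", "CD"), ("hundred", "CD"), ("dollars", "NNS")]

print(cardinal_spans(tagged))
```

Each span it returns still has to be passed through a word-to-number converter to get an actual value.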
