How Can I Send A Batch Of Strings To The Google Cloud Natural Language Api?
I have a Pandas dataframe containing a number of social media comments that I want to analyse using Google's NLP API. Google's documentation only discusses (as far as I can see) how to analyse individual strings, not how to send a batch of them.
Solution 1:
If you want to parallelise requests you can have a Spark job do it for you.
Here is a code snippet I tried myself, and it worked:
from pyspark.context import SparkContext
from pyspark import SparkConf
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

def comment_analysis(comment):
    # Score the sentiment of a single comment via the NLP API
    client = language.LanguageServiceClient()
    document = types.Document(
        content=comment,
        type=enums.Document.Type.PLAIN_TEXT)
    annotations = client.analyze_sentiment(document=document)
    total_score = annotations.document_sentiment.score
    return total_score

sc = SparkContext.getOrCreate(SparkConf())
expressions = sc.textFile("sentiment_lines.txt")
mapped_expressions = expressions.map(lambda comment: comment_analysis(comment))
(where sentiment_lines.txt is a plain text document with some comments)
Each element of mapped_expressions would then be the overall sentiment score for the corresponding "comment" in expressions.
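Since your comments live in a Pandas dataframe rather than a text file, one way to feed them into the same pipeline is to pull the column out as a plain Python list and parallelise that instead of using textFile. A minimal sketch, where the dataframe and its "comment" column name are assumptions for illustration:

```python
import pandas as pd

# Hypothetical dataframe standing in for the social media comments
df = pd.DataFrame({"comment": ["I love this!", "Not great.", "It was okay."]})

# Pull the column out as a plain Python list of strings
comments = df["comment"].tolist()

# With the SparkContext above, these could then be scored in parallel:
#   mapped = sc.parallelize(comments).map(comment_analysis)
```

sc.parallelize distributes the list across the cluster just as textFile does for lines of a file, so comment_analysis runs on each element the same way.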
In addition, remember you can have Dataproc run the Spark job so everything stays managed inside Google Cloud.