
How Can I Send A Batch Of Strings To The Google Cloud Natural Language Api?

I have a Pandas dataframe containing a number of social media comments that I want to analyse using Google's NLP API. Google's documentation only discusses (as far as I can see) how to send a single document per request.

Solution 1:

If you want to parallelise requests you can have a Spark job do it for you.

Here's a code snippet I tried myself that worked:

from pyspark.context import SparkContext
from pyspark import SparkConf

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def comment_analysis(comment):
    # Create a client and return the sentiment score for a single comment.
    client = language.LanguageServiceClient()
    document = types.Document(
        content=comment,
        type=enums.Document.Type.PLAIN_TEXT)
    annotations = client.analyze_sentiment(document=document)
    total_score = annotations.document_sentiment.score
    return total_score


sc = SparkContext.getOrCreate(SparkConf())

# Load one comment per line and score each in parallel across the cluster.
expressions = sc.textFile("sentiment_lines.txt")

mapped_expressions = expressions.map(lambda comment: comment_analysis(comment))

(where sentiment_lines.txt is a plain text document with some comments)

Each element of mapped_expressions will be the overall sentiment score for the corresponding "comment" in expressions. Note that map is lazy: the API calls only happen when you run an action such as mapped_expressions.collect().

In addition, remember you can run the Spark job on Dataproc so everything stays managed inside Google Cloud.
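If a Spark cluster is overkill for your dataframe, a lighter-weight option is a thread pool: each request is I/O-bound, so threads parallelise it well. A minimal sketch, assuming a per-comment scoring function like comment_analysis above; a stand-in scorer is used here so the snippet runs without API credentials, and the "comment" column name is an assumption about your dataframe:

```python
from concurrent.futures import ThreadPoolExecutor

def score_comment(comment):
    # Stand-in for comment_analysis(comment): in practice this would call
    # client.analyze_sentiment and return annotations.document_sentiment.score.
    return 1.0 if "good" in comment else -1.0

comments = ["good product", "bad service", "good support"]
# From a dataframe you could use: comments = df["comment"].tolist()

# pool.map preserves input order, so scores line up with comments.
with ThreadPoolExecutor(max_workers=8) as pool:
    scores = list(pool.map(score_comment, comments))

print(scores)
```

You can then attach the results back with df["score"] = scores. Keep max_workers modest so you stay within the API's per-minute quota.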
