How Can I Send A Batch Of Strings To The Google Cloud Natural Language Api?
I have a Pandas dataframe containing a number of social media comments that I want to analyse using Google's NLP API. Google's documentation only discusses (as far as I can see) how to analyse individual strings, not how to send a batch of them.
Solution 1:
If you want to parallelise requests you can have a Spark job do it for you.
Here is a code snippet I tried myself, and it worked:
from pyspark.context import SparkContext
from pyspark import SparkConf
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

def comment_analysis(comment):
    # Score the sentiment of a single comment via the NLP API
    client = language.LanguageServiceClient()
    document = types.Document(
        content=comment,
        type=enums.Document.Type.PLAIN_TEXT)
    annotations = client.analyze_sentiment(document=document)
    total_score = annotations.document_sentiment.score
    return total_score

sc = SparkContext.getOrCreate(SparkConf())
expressions = sc.textFile("sentiment_lines.txt")
mapped_expressions = expressions.map(lambda comment: comment_analysis(comment))
(where sentiment_lines.txt is a plain text document with some comments)
Each element of mapped_expressions would then be the overall sentiment score for the corresponding "comment" in expressions.
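Since your comments live in a Pandas dataframe rather than a text file, one way to feed them into the same pipeline is to pull the column out as a plain Python list and parallelise that instead of using textFile. A minimal sketch, where the dataframe and its "comment" column name are assumptions for illustration:

```python
import pandas as pd

# Hypothetical dataframe standing in for the social media comments
df = pd.DataFrame({"comment": ["I love this!", "Not great.", "It was okay."]})

# Pull the column out as a plain Python list of strings
comments = df["comment"].tolist()

# With the SparkContext above, these could then be scored in parallel:
#   mapped = sc.parallelize(comments).map(comment_analysis)
```

sc.parallelize distributes the list across the cluster just as textFile does for lines of a file, so comment_analysis runs on each element the same way.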
In addition, remember you can have Dataproc run the Spark job so everything stays managed inside Google Cloud.