Extract information using OpenAI action step
  • 11 Jan 2024


This advanced step sends input text to OpenAI together with natural language instructions and returns structured content. Examples include generating summaries of long text and extracting information into specified fields.


How can I activate the feature?

First, activate the OpenAI Integration app in the Dexi Apps section. You can add more than one activation to an account, each with a different OpenAI key.

Find the app and click Edit. Configure it by entering the API key from your own OpenAI account. If the key is valid, the app activates and the new robot step becomes available inside Dexi robots.

How can I use it?

When creating your extractor in the Robot Editor, click the section or text that you want to extract, then select Add Step for Element in the right-hand menu.

Select the extractor step – OpenAI Integration – Extract Data with Open AI. The step will be added to the robot steps diagram.

If you edit this step, the right-hand menu will appear with several customizable options:

Action activation: Choose which activation of the OpenAI app the step should use.

Output field: Choose an output field type and give it a natural language description. You can add more than one output field, and you can add or remove field types.

Additional information with an example is available from the question mark icon on the step, as shown in the screenshot below.

Example: Choose email (string), description: provide a list of emails.
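To make the idea of an output field concrete, the sketch below models a field as a name, a type, and a description, and renders a list of fields as instruction text. This only illustrates the concept: `OutputField`, `build_instructions`, and the text format are hypothetical, and nothing here describes how Dexi actually builds its request to OpenAI.

```python
from dataclasses import dataclass


@dataclass
class OutputField:
    """Hypothetical model of one output field configured on the step."""
    name: str         # meaningful field name, e.g. "email"
    type: str         # field type, e.g. "string"
    description: str  # natural language description of the expected output


def build_instructions(fields):
    """Render each field as one instruction line (assumed, illustrative format)."""
    lines = [f"- {f.name} ({f.type}): {f.description}" for f in fields]
    return "Return these fields from the input text:\n" + "\n".join(lines)


# The example from this article: an email field of type string.
fields = [OutputField("email", "string", "provide a list of emails")]
print(build_instructions(fields))
```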

Test the robot using the Start/Stop button. When it finishes, go to the Results Preview tab, which shows a preview of your data divided into the fields you have chosen. When you have finished editing the robot, click Save and run it through a configuration.

How much does it cost to use the ChatGPT API on robots?

API usage is subject to rate limits on tokens per minute (TPM) and requests per minute or day (RPM/RPD), as well as other model-specific limits. Your organization’s rate limits for gpt-3.5-turbo can be found here.

Costs depend on the size of your requests (the number of tokens), and therefore on how much you use the ChatGPT API on your robots. Requests can have a varying number of tokens, and both input and output tokens count toward the total consumption. For example, if you make a request with 10 tokens in the message input and receive 20 tokens in the model-generated message, you are billed for a total of 30 tokens.
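The token arithmetic can be sketched in a few lines of Python. The function names are hypothetical, and the per-1,000-token prices passed in below are placeholders chosen for the example, not actual OpenAI prices; substitute the current values from the pricing page.

```python
def billed_tokens(input_tokens: int, output_tokens: int) -> int:
    """Input and output tokens both count toward total consumption."""
    return input_tokens + output_tokens


def estimated_cost(input_tokens: int, output_tokens: int,
                   input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimate the cost of one request from per-1,000-token prices."""
    return (input_tokens / 1000) * input_price_per_1k \
        + (output_tokens / 1000) * output_price_per_1k


# The example from the text: 10 input tokens plus 20 output tokens.
print(billed_tokens(10, 20))  # 30

# Placeholder prices only (assumption); real prices change over time.
print(estimated_cost(10, 20, input_price_per_1k=0.001, output_price_per_1k=0.002))
```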

Pricing is subject to change, so check the pricing link for current prices.

Below you can find the pricing on the number of tokens for our model (gpt-3.5-turbo).

When to use AI to extract data?

• When data is not formatted or structured, so it cannot be extracted using CSS/XPath
• To parse dates in varying formats into a single uniform date format
• To produce a short summary of a long paragraph
• To classify items into categories
• To return English versions of text that is in another language
• ...

Tips to get the best results from OpenAI

• Choose a meaningful name for the Output Field, i.e. a name that represents the output expected from OpenAI:

  • email, phoneNumber, firstNameList, lastNames, recipeIngredients, dateOfBirth, birthCountry

• Do not use generic field names. These require detailed descriptions for OpenAI to understand the task, and still might not produce the expected result:

  • test, data, text, variable1, response, result, output

• Difference explained:

Output: phoneNumbers

Description: phone numbers

vs

Output: test

Description: return/extract all phone numbers that you can find in the given text

-

Output: email

Description: email

vs

Output: output1

Description: Retrieve email from this block of information

• Spell the field name correctly in English. If a name is misspelled, OpenAI sometimes tries to correct it, and the result might go unnoticed because the robot still points to the misspelled field.

• If you have trouble retrieving results, rephrasing the description might help. Adding phrases such as "in the given text", "for the provided text", "from the input", "that are mentioned", "that are available", or "that are captured" at the end of the description might also help.
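The spelling tip can be illustrated with a small sketch. It assumes, purely for illustration, that the result comes back as a mapping from field name to value; the `response` dictionary and both helper functions are hypothetical and do not describe Dexi's internals.

```python
def read_field(response: dict, field_name: str):
    """Look up a value by the exact field name configured on the step."""
    return response.get(field_name)


def missing_fields(response: dict, expected: list) -> list:
    """List configured field names that are absent from the response."""
    return [name for name in expected if name not in response]


# Hypothetical response in which the model silently corrected "emial" to "email".
response = {"email": "jane@example.com"}

# The step was configured with the misspelled name, so the lookup finds nothing.
print(read_field(response, "emial"))

# An explicit check makes the mismatch visible instead of letting it go unnoticed.
print(missing_fields(response, ["emial"]))
```

Checking for missing names after a test run is one way to catch this kind of mismatch early.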

