Glossary
  • 05 Nov 2024


Account: Created when you sign up. Multiple users with different roles can be added to your account.

App: Additional functionality that can be added to your account, e.g., more step types/pipe actions or integrations

AutoBot: A type of robot that, given a list of input URLs (on different domains), maps inputs to outputs via an Extractor robot per domain
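The per-domain mapping described for an AutoBot can be sketched roughly as follows. This is only an illustration of the idea, not how the platform implements it: the extractor functions, the `EXTRACTORS_BY_DOMAIN` table and `run_autobot` are hypothetical names, and skipping URLs with no matching extractor is an assumption.

```python
from urllib.parse import urlparse


# Hypothetical stand-ins for per-domain Extractor robots.
def extract_example_com(url):
    return {"url": url, "source": "example.com extractor"}


def extract_example_org(url):
    return {"url": url, "source": "example.org extractor"}


# Hypothetical lookup table: one extractor per domain.
EXTRACTORS_BY_DOMAIN = {
    "example.com": extract_example_com,
    "example.org": extract_example_org,
}


def run_autobot(urls):
    """Route each input URL to the extractor registered for its domain."""
    results = []
    for url in urls:
        domain = urlparse(url).hostname
        extractor = EXTRACTORS_BY_DOMAIN.get(domain)
        if extractor is None:
            # Assumption: inputs without a matching extractor are skipped.
            continue
        results.append(extractor(url))
    return results
```

Because every extractor returns rows of the same shape, the outputs from different domains can be collected into a single result list.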

Configuration: Various properties of a robot, e.g., its concurrency level, inputs, proxy or schedule. Multiple configurations can be defined for a robot. Synonymous with Run.

Crawler: A type of robot that visits links given a starting page and extracts basic information about the pages it visits, e.g., the URL and page title

Dataset: A set or table of rows. Similar to a table in a SQL database, a sheet in a spreadsheet or collection in a NoSQL database. See also Deduplication and Record Linkage.

Data type: Field definition of a row, used in a dataset or dictionary, or as input or output for an Extractor, Pipes or AutoBot robot. Can be useful to standardise input and output when working with many robots.

Deduplication: The process of removing duplicate rows within a single dataset, as defined by the key configuration on that dataset.
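As a minimal sketch of key-based deduplication, assuming rows are plain dictionaries and the key configuration is a list of field names: `deduplicate` is a hypothetical helper, and keeping the first row seen for each key is an assumption rather than documented platform behaviour.

```python
def deduplicate(rows, key_fields):
    """Keep the first row seen for each distinct combination of key field values."""
    seen = set()
    unique_rows = []
    for row in rows:
        # The key is the tuple of values in the configured key fields.
        key = tuple(row.get(field) for field in key_fields)
        if key in seen:
            continue
        seen.add(key)
        unique_rows.append(row)
    return unique_rows
```

With `key_fields=["sku"]`, for example, two rows with the same `sku` but different prices count as duplicates, and only the first is kept.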

Dictionary: A mapping of keys to values.

Element path: A CSS3 selector expression used in steps in Extractors specifying how to traverse the DOM tree of a web page to reach the particular element to interact with

Execution: The process of running a specific configuration of a robot. An execution has one result per input.

Extractor: A type of robot that extracts information from a web page and interacts with the page in various ways, e.g., fills in forms, clicks buttons and much more

Input: The values to be used when executing a robot, e.g., a URL or a search query value. A data type can be used. The input fields are defined in an Extractor, and the actual values are specified in a configuration of the robot. In Pipes robots, input fields are automatically calculated given the starting nodes of the Pipes graph.

Integration: A type of app that specifically integrates (or connects) with an external/3rd-party service, e.g., Amazon S3 or Google Drive

Key configuration: The definition of which fields are used to identify duplicates in a data set

Output: The definition of fields in a robot that should be saved as results. A data type can be used. In an Extractor the output fields are defined. In Pipes robots, output fields are automatically calculated given the exit nodes of the Pipes graph.

Pipes: A type of robot that performs various actions in a sequence or workflow, e.g., reads data from a source, performs some processing/transformation and saves results in a data store

Pipes action: A part of a Pipes robot that performs some action, e.g., executes a robot, iterates rows in a data set, makes HTTP requests or looks up details of a Facebook page by name

Project: An asset on your account, e.g., a robot or data set

Proxy: A server performing requests on behalf of a robot execution

Record Linkage: The process of combining two data sets using the key configuration of the data set that rows are combined into. For more details, see How do I use Data Sets for deduplication/record linkage?
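Record linkage can be illustrated with a small sketch, assuming rows are dictionaries and the key configuration of the target data set is a list of field names. The function `link_records` is hypothetical, and letting incoming values overwrite existing ones on a key match is an assumption, not documented behaviour.

```python
def link_records(target_rows, incoming_rows, key_fields):
    """Merge incoming rows into target rows, matching on the key fields."""

    def key_of(row):
        return tuple(row.get(field) for field in key_fields)

    # Index the target data set by key; copy rows so inputs are not mutated.
    merged = {key_of(row): dict(row) for row in target_rows}
    for row in incoming_rows:
        key = key_of(row)
        if key in merged:
            # Assumption: matching rows are combined field by field.
            merged[key].update(row)
        else:
            merged[key] = dict(row)
    return list(merged.values())
```

Rows whose key matches an existing row are combined with it, and rows with a new key are appended.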

Results: The data, in row format, saved by a robot. See also Execution and Output

Result log: A text file containing all events pertaining to the particular result

Robot: A type of asset that performs an automated process, e.g., extracts information from a web page. See Extractor, Pipes, Crawler and AutoBot.

Run: Synonymous with Configuration

Schedule: The recurrence with which a configuration is executed. Can be expressed in cron syntax.
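For readers unfamiliar with cron syntax, the sketch below splits a conventional five-field cron expression into named parts. It assumes the common five-field layout (minute, hour, day of month, month, day of week); whether the platform accepts exactly this variant is not stated here, and `describe_cron` is a hypothetical helper.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Split a five-field cron expression into a dict of named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# "0 6 * * 1" conventionally means 06:00 every Monday.
fields = describe_cron("0 6 * * 1")
```

In that example, `fields["hour"]` is `"6"` and `fields["day_of_week"]` is `"1"`, while `*` in the remaining positions means "every value".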

Scraper: Deprecated. See Extractor

Step: A part of an Extractor robot that performs some action, e.g., visits a URL, clicks a link, waits for an element or extracts a piece of information

Timetable: The earliest and latest possible times that a configuration can be executed

Trigger: Added to a project asset, e.g., a configuration or data set, to cause an action to be performed when some event occurs, e.g., adding a row to a data set when an execution completes

Webhook: A type of integration that notifies an external endpoint about some event, e.g., when an execution completes

Worker: A part of the dexi.io platform that does the work of a robot, e.g., extracts information from a web page. The number of workers on your account determines how much execution work can be done concurrently.

