- 19 Jul 2021
- 5 Minutes to read
- Updated on 19 Jul 2021
To understand Dexi, you need to understand three major components: robots, runs, and executions.
A robot is the most fundamental part of dexi.io. It automates work on things such as websites or data flows. Robots come in four forms: Extractors, Crawlers, Pipes, and AutoBots.
A Pipe is a super robot. Pipes can control other robots and create robot workflows: a Pipe takes one robot, initiates its run, then automatically moves to the next robot and triggers its run, and so on. Pipes can pull in external information from APIs, databases, and similar sites. Pipes do not extract data from websites themselves; rather, they combine other robots, APIs, and datasets into a single flow for data extraction and processing.
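The chaining idea can be pictured with a small sketch. This is only a conceptual model, not dexi.io code: the robots are plain Python functions, and the names run_pipe, find_product_urls, and extract_products are hypothetical.

```python
def run_pipe(robots, initial_input):
    """Run each robot in order, passing one robot's output to the next."""
    data = initial_input
    for robot in robots:
        data = robot(data)
    return data


def find_product_urls(query):
    # Hypothetical stand-in for a robot that returns URLs for a search term
    return [f"https://shop.example/{query}/{i}" for i in range(1, 3)]


def extract_products(urls):
    # Hypothetical stand-in for a robot that processes each URL it receives
    return [{"url": url, "status": "extracted"} for url in urls]


rows = run_pipe([find_product_urls, extract_products], "lamps")
```

Here the workflow itself extracts nothing; it only decides which robot runs next and what it receives, which mirrors the role the text gives to Pipes.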
An AutoBot always accepts a URL as input and then maps that URL to a list of Extractors for a range of sites. If you request something from an AutoBot that it doesn't know how to compute, it adds the URL to its list of internal sites. This allows you to use a single AutoBot to scrape products from hundreds of sites using URLs for the individual projects.
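The URL-to-Extractor mapping can be sketched as follows. This is a toy model under stated assumptions, not how dexi.io implements AutoBots: it assumes Extractors are chosen by host name, and the class AutoBotSketch, its pending_urls list, and shop_a_extractor are hypothetical.

```python
from urllib.parse import urlparse


class AutoBotSketch:
    """Toy model: route a URL to an extractor chosen by host name."""

    def __init__(self, extractors):
        # Maps a host name to a callable that takes a URL (an assumption)
        self.extractors = dict(extractors)
        # URLs that no known extractor could handle
        self.pending_urls = []

    def run(self, url):
        host = urlparse(url).netloc.lower()
        extractor = self.extractors.get(host)
        if extractor is None:
            # Unknown site: remember the URL instead of failing
            self.pending_urls.append(url)
            return None
        return extractor(url)


def shop_a_extractor(url):
    # Hypothetical extractor for a single site
    return {"source": "shop-a", "url": url}


bot = AutoBotSketch({"shop-a.example": shop_a_extractor})
known = bot.run("https://shop-a.example/product/1")
unknown = bot.run("https://shop-b.example/item/9")
```

After these two calls, known holds the extractor's result and the second URL is stored in bot.pending_urls.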
To execute a robot, you must have at least one run for it. A run is a configuration that describes how you want to execute the robot; it is not an execution itself.
You can have an unlimited number of executions of a single run. You can also have an unlimited number of runs per robot, but for the vast majority of robots, you only need one or two.
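The relationship between robots, runs, and executions can be summarized in a small data model. This is a sketch for illustration only; the class and field names (Run, Execution, inputs, number) are assumptions and do not come from dexi.io.

```python
from dataclasses import dataclass, field


@dataclass
class Execution:
    """One concrete execution of a run (hypothetical fields)."""
    run_name: str
    number: int
    inputs: list


@dataclass
class Run:
    """A configuration describing how to execute a robot."""
    name: str
    robot: str
    inputs: list = field(default_factory=list)
    executions: list = field(default_factory=list)

    def execute(self):
        # Each call creates a new execution; the run itself stays unchanged
        execution = Execution(
            run_name=self.name,
            number=len(self.executions) + 1,
            inputs=list(self.inputs),
        )
        self.executions.append(execution)
        return execution


daily = Run(name="daily", robot="product-extractor", inputs=[{"query": "lamps"}])
first = daily.execute()
second = daily.execute()
```

One run produced two separate executions here, and a second Run object for the same robot could hold a different set of inputs.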
A run configuration includes:
The Integrations tab allows you to select which of your configured integrations this particular run should use. For every selected integration, dexi.io will upload all available formats to that integration upon successful execution.
Inputs are especially important to understand, because they are often used to pass search criteria, login credentials, or other information to the website. If your robot requires input, you must add inputs to the run or the robot will fail.
Adding an input looks like this in the Extractor editor, in the Inputs tab:
Here is what the step looks like:
In the configuration page, add inputs using the Inputs tab:
You will then get a series of results for each individual input:
To import your input values:
- Download the CSV template dexi.io automatically creates based on your robot's input fields.
- Copy your input values into the template.
- Save the CSV file.
- Upload it using the Import CSV button to import the values.
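If your input values already live in a script or another system, the CSV file can also be generated rather than filled in by hand. The sketch below assumes the template's header row uses your robot's input field names; the fields query and max_price are hypothetical, so use the header from the template you downloaded.

```python
import csv
import io


def build_inputs_csv(field_names, rows):
    """Return CSV text with a header row followed by one line per input."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=field_names, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical input fields and values
csv_text = build_inputs_csv(
    ["query", "max_price"],
    [
        {"query": "desk lamp", "max_price": 40},
        {"query": "floor lamp", "max_price": 120},
    ],
)
```

Write csv_text to a file and upload that file using the Import CSV button.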
Watching runs and robots
To be notified via e-mail or push notification when an execution succeeds or fails, you can Watch a run.
To start watching:
- Select the Not watching button when editing a run.
- A drop-down menu opens where you can specify what you want to watch.
To enable push notifications to your smartphone or tablet, you must connect with Pushover.net.
Monitoring your robots
To check that your robots are in good condition, you can set up smaller runs that execute daily and then use watching to alert you if something goes wrong.
This will provide you with an early warning system to keep your robots running smoothly.
Executions are the results of robot configurations after you initiate a run.
Executions contain two tabs: Information and Results.
You can access an execution by completing the following steps:
- Go to Projects.
- Open the relevant project folder.
- Open the robot.
- Double-click on the required Configuration.
- Click the Executions tab.
- Click View.
The following options are available in both screens:
- Connect – This displays when the run that was executed contains Integrations. Select Connect to retry the integrations associated with the run, if needed.
- Retry failed / stopped – Continue running the execution.
- Download – Select from the following options:
  - Excel XML (.xls)
  - Excel spreadsheet (.xlsx)
  - Excel 97-2004 workbook (.xls)
  - Comma separated values (.csv)
  - Semicolon separated values (.scsv)
  - XML (.xml)
  - JSON (.json)
  - Attachments/images (.zip)
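A downloaded file can then be processed outside dexi.io. As a minimal sketch, the snippet below reads semicolon-separated values with Python's csv module. It assumes the file starts with a header row; the column names and sample rows are made up for illustration and are not real dexi.io output.

```python
import csv
import io


def read_semicolon_values(text):
    """Parse semicolon-separated text into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


# Made-up sample standing in for the contents of a downloaded file
sample = "name;price\nDesk lamp;39.00\nFloor lamp;119.00\n"
rows = read_semicolon_values(sample)
```

All values are read as strings, so convert numeric columns yourself if you need to calculate with them.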
The Information tab contains robot ID information, how much time it took to extract the data, how much traffic the site used, the number of errors, a list of events, and other relevant statistics.
The Results tab displays all the results of the execution and whether they've succeeded or not.
For each result, you'll see at least one screenshot, which is the screenshot of the last page of the execution. If your robot has 0 or 1 inputs, all the screenshots will be the same since all the results were retrieved in the same session.
Screenshots are only available for scraping executions.
Results tab icons
– Filter results by the following statuses:
– Refresh results.
– Auto-refresh results.
– Pause auto-refreshing results.
– Retry count.
– Retry count (no values present).
– Open result log.
– Data is good.
– Debug. Opens the robot to fix the error.
– Screenshot of the last page of the execution.
– No screenshot available.