A web crawler (also known as a web spider or web robot) is a program or automated script which browses the World Wide Web in a methodical, automated manner.
Tangiblee's preferred method for collecting product data from your website is using our web crawler. Tangiblee's crawler is designed to simulate single-visitor activity to prevent any disruptions or performance issues with your website.
How the crawler works:
- Our bot will “scrape” data from your website by periodically crawling at an agreed-upon frequency, from every 15 minutes to once a day.
- Our crawler will automatically parse the dimension data from each product detail page (PDP), regardless of its format (text, numbers, units, etc.) or location on the page.
- The crawler reviews the product images and selects the one that combines the highest resolution with the shooting angle that best showcases your product.
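As a rough illustration of the parsing and image-selection steps above, here is a minimal sketch in Python. It is not Tangiblee's actual implementation: the function names, the unit table, and the image dictionary layout are all assumptions made for this example. It only picks up simple "number + unit" patterns and compares images by pixel area; it does not attempt to judge shooting angle.

```python
import re

# Hypothetical unit table: the supported units and factors are assumptions for illustration.
UNIT_TO_CM = {"mm": 0.1, "cm": 1.0, "in": 2.54, "inch": 2.54, "inches": 2.54, '"': 2.54}

# Matches a number followed by an optional space and a unit, e.g. `25.5 cm`, `40mm`, `10"`.
DIMENSION_RE = re.compile(r'(\d+(?:\.\d+)?)\s*(mm|cm|inches|inch|in|")', re.IGNORECASE)


def parse_dimensions_cm(text):
    """Return every dimension found in `text`, converted to centimetres."""
    return [
        round(float(value) * UNIT_TO_CM[unit.lower()], 2)
        for value, unit in DIMENSION_RE.findall(text)
    ]


def pick_highest_resolution(images):
    """Return the image dict with the largest pixel area (width * height)."""
    return max(images, key=lambda img: img["width"] * img["height"])
```

For example, parse_dimensions_cm("Height: 10 in, Width: 25.5 cm, Depth: 40mm") yields the three values in centimetres, in the order they appear in the text. A pattern such as "10 x 20 cm" would only capture the final value, so a real parser would need more rules than this sketch has.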
What does Tangiblee need from me to start crawling my website?
In most cases, Tangiblee does not need anything from you to begin crawling your website. The exception is when your website’s security protocols restrict bot traffic.
What if my website security protocols specifically restrict ‘bot’ traffic?
Tangiblee will provide our IP address for whitelisting on your website. Whitelisting Tangiblee’s IP address will grant access only to Tangiblee’s crawling bot while maintaining your security protocols against “black bot” traffic.
Here is our crawler information for whitelisting:
User-Agent: TangibleeBot/1.0.0.0 (http://tangiblee.com/bot)
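If your bot filtering is handled in application code, the check might look something like the sketch below. This is only an illustration under stated assumptions: the function name is hypothetical, and the IP address shown is a placeholder from a documentation-only range, not Tangiblee's real address. Because a User-Agent header can be sent by any client, the sketch requires both the User-Agent prefix and the whitelisted IP address to match.

```python
import ipaddress

# Prefix taken from the User-Agent string listed above.
CRAWLER_UA_PREFIX = "TangibleeBot/"

# Hypothetical placeholder (documentation-only range);
# replace with the IP address Tangiblee provides to you.
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.10/32")]


def is_allowed_crawler(user_agent, remote_ip):
    """Return True only if both the User-Agent and the source IP match."""
    if not user_agent.startswith(CRAWLER_UA_PREFIX):
        return False
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Malformed IP string: treat as not allowed.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)
```

Many sites apply this kind of rule in a firewall or CDN rather than in application code; the logic is the same either way.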
*NOTE: Tangiblee will never need or request access to any of your website’s original source files. This crawler will NOT impact load time, nor will it create any vulnerabilities in your website or affect its performance.