Posts Tagged: robots

Adding a Method to Check for Previous Crawl Data

I added a new feature to the script that will hopefully make the TOSBack crawls more reliable! The app has had some intermittent problems downloading pages; in some cases, the crawl data would come back blank even if the document had downloaded properly before. I decided to write a couple of methods that would check for… Read more »

Searching Crawl Data for Empty Files

I needed a way to scan the crawl data programmatically and determine whether there were blank policies that I didn’t know about. With around 1,000 rules in the TOSBack app, I need some ways to double-check the data that comes back from the web scraping. I decided to set up a class method that… Read more »
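
As a rough illustration of this kind of check, here is a minimal Python sketch that walks a directory of crawl output and lists files that are empty or contain only whitespace. It is not the TOSBack code: the function name `find_blank_files` and the assumption that each crawled policy is stored as a file under a single directory are both hypothetical.

```python
from pathlib import Path


def find_blank_files(crawl_dir):
    """Return sorted paths of files under crawl_dir that are empty or whitespace-only.

    Hypothetical helper: assumes each crawled policy is saved as one file
    somewhere under crawl_dir.
    """
    blank = []
    for path in Path(crawl_dir).rglob("*"):
        if not path.is_file():
            continue
        # A zero-byte file is blank; so is one holding only ASCII whitespace.
        if path.stat().st_size == 0 or not path.read_bytes().strip():
            blank.append(path)
    return sorted(blank)
```

Calling `find_blank_files("crawl")` on a hypothetical `crawl` directory would return the paths that probably need to be re-crawled or inspected by hand.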