Saturday, December 12, 2015

Comparison of and OutWit

When the tool was released, I was excited. However, the excitement soon disappeared. The reasons follow:
  1. Whenever you define a crawler, you always have to provide at least 5 examples, even when you know that 2 examples would be enough.
  2. The interface is sluggish, even in the offline version.
  3. The crawling is approximately 10 times slower than in OutWit.
  4. The export is not satisfactory. If you export the data to CSV, all commas are stripped from the scraped text. If you need to preserve the commas, you can still export the data as XLS or JSON. But Excel has a limit on the length of text in a cell, and once you exceed that limit, you cannot open the file. JSON is not a workable solution either, because the characters in the text are not always correctly escaped, which makes the JSON invalid. Hence, after several hours of web scraping, you find yourself unable to scrape
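
For comparison, here is a minimal sketch of what a lossless export can look like, using only Python's standard library. It is not how either tool works internally; the helper names `to_csv` and `to_json` and the sample row are hypothetical. It only illustrates that commas can be kept in CSV by quoting the field, and that a JSON serializer takes care of the escaping.

```python
import csv
import io
import json


def to_csv(rows):
    """Serialize rows to CSV text; fields containing commas, quotes or newlines get quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows to JSON text; json.dumps escapes quotes and control characters."""
    return json.dumps(rows, ensure_ascii=False)


# Hypothetical scraped data with characters that commonly break naive exports.
rows = [
    ["title", "text"],
    ["Example", 'Commas, "quotes" and\nnewlines survive'],
]

csv_text = to_csv(rows)
json_text = to_json(rows)

# Reading the exports back yields the original rows, commas included.
assert list(csv.reader(io.StringIO(csv_text, newline=""))) == rows
assert json.loads(json_text) == rows
```

If both round trips succeed, the scraped text has survived the export unchanged.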
While OutWit irritates me with its deep context menus, at least it does its work.

