A tool that I’ve recently started using a lot is the Automator app. For those of you unfamiliar with it, it’s a prepackaged Mac tool that lets you automate time-consuming, repetitive tasks.
Automate Filename Scraping
For example, one of the ways that I help out my developers is by putting together the relational models for the backend of the app. Developer time is better spent actually coding, not organizing information, so anything I can do to reduce barriers helps get our products to market faster.
Putting together a relational model in Excel can be time-consuming, particularly if you are dealing with several different models that interlink in weird ways and have hundreds of data entries. This is where Automator can be incredibly useful.
In the screenshot above, I had to build a model which organized filenames by an index number. Each index also included a small animation, so I needed to add a sequence of image names to the respective row. I did the first few by hand, and it took me at least 2 minutes each time to look up the file, click the name, and copy it over into my spreadsheet. If I had tried to build the entire model by brute force, it would have taken me over 10 hours…
Instead, I built a workflow in Automator that scraped the filenames recursively from my images folder and exported them into a text file. I then imported the text file into Excel, did some data cleanup, and that was it! 10 hours of work completed in less than 30 minutes.
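For anyone who would rather script this step than build it in Automator, here is a minimal Python sketch of the same idea: walk a folder recursively, collect the image filenames, and write them to a text file, one per line. This is not the Automator workflow itself; the function names, the extension list, and the example paths are all assumptions made up for illustration.

```python
from pathlib import Path


def list_filenames(root, extensions=(".png", ".jpg", ".jpeg", ".gif")):
    """Return sorted paths (relative to root) of matching files, found recursively."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")  # rglob walks every subfolder
        if p.is_file() and p.suffix.lower() in extensions
    )


def export_filenames(root, out_path, extensions=(".png", ".jpg", ".jpeg", ".gif")):
    """Write one filename per line to out_path and return the list of names."""
    names = list_filenames(root, extensions)
    Path(out_path).write_text("\n".join(names) + "\n", encoding="utf-8")
    return names


if __name__ == "__main__":
    # Hypothetical paths; point these at your own folder and output file.
    export_filenames("images", "filenames.txt")
```

The resulting text file can be imported into a spreadsheet the same way as the Automator output, with one filename per row ready for cleanup.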
Next, I wanted to include some text data to explain the technical information displayed in the app (it’s a reference tool), so I was interested in scraping dictionary definitions from the internet. Usually, building a web scraper requires access to some type of API or JSON/XML feed. This can be REALLY time-consuming, since you need to properly format the data that you’re scraping into something readable.
I’m not particularly good at working with different database structures, so rather than try to build something from scratch, which would probably have taken me an entire week, I figured: why not just build an Automator workflow that can pull this information for me?
I put together a workflow that grabbed the URLs of the pages I had open in Safari, then scraped the content from those URLs and filtered the text I wanted based on HTML tags in the source code. Presto! Something that would have taken me a ridiculously long time to build and implement actually took a little less than an hour.
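To show what "filtering by HTML tag" looks like outside of Automator, here is a rough Python sketch using only the standard library. It assumes you already have a list of URLs (it does not read them from Safari), and the names `TagTextExtractor`, `extract_tag_text`, and `scrape_urls` are my own inventions for illustration. It also assumes the tags you care about are properly closed in the page source, which is not always true of real-world HTML.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TagTextExtractor(HTMLParser):
    """Collect the text found inside every occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag.lower()  # HTMLParser reports tag names in lowercase
        self.depth = 0          # how many target tags we are currently inside
        self.chunks = []        # text pieces of the element being read
        self.results = []       # finished text, one entry per element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join("".join(self.chunks).split())
                if text:
                    self.results.append(text)
                self.chunks = []

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_tag_text(html, tag):
    """Return the text content of each <tag>...</tag> element in html."""
    parser = TagTextExtractor(tag)
    parser.feed(html)
    parser.close()
    return parser.results


def scrape_urls(urls, tag):
    """Download each URL and return {url: [text of each matching tag]}."""
    results = {}
    for url in urls:
        with urlopen(url) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            html = response.read().decode(charset, errors="replace")
        results[url] = extract_tag_text(html, tag)
    return results
```

Calling `extract_tag_text(page_source, "p")` on a page's source returns a list with the text of each paragraph, which is roughly the filtered output the workflow produced.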
Technology is the best.