Original post by MARC GAYLE via TC
Editor’s note: Marc Gayle is a Rails developer and founder of 5KMVP, where he builds Minimum Viable Products for just $5K. Follow him on Twitter.
A few months ago, I got the idea that one way to get leads for remote freelance gigs was to scour Craigslist. So after doing the manual work of “crawling” through at least 100 job postings by hand, I wrote a Ruby script to do the heavy lifting and filtering for me.
Once I started looking through the data, some interesting things started jumping out at me. Even though I don’t actually live in the Valley (I live in Jamaica), I consume a lot of the news, blog posts, and articles that come from the Valley. Suffice it to say, I am affected by the “Valley echo-chamber.” One side effect of that is an obsession with Ruby and Ruby on Rails as my development stack, and a general expectation that the rest of the world has woken up to its beauty and elegance.
Alas, much to my surprise, that is not the case.
Before diving into the data, let me explain exactly what this script does. Throughout Craigslist, there are two URL subpaths that tend to have the majority of the web development freelance gigs: /cpg/ and /web/. So the script builds a list of all the cities on Craigslist (because CL doesn’t provide a clean, RESTful API that allows you to get this info easily) and then simply appends /cpg/ and /web/ to the end of each city’s URL.
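A minimal sketch of that URL-building step might look like the following in Ruby. This is not the original script: the helper name `gig_urls` and the example hostnames are hypothetical placeholders, and the list of city URLs is assumed to have been collected already.

```ruby
# Hypothetical helper (not from the original script): given the base URL of
# each city, build the two gig-listing URLs described above.
GIG_SUBPATHS = %w[cpg web].freeze

def gig_urls(city_base_urls)
  city_base_urls.flat_map do |base|
    # Drop one trailing slash, if present, so we never produce "//cpg/".
    GIG_SUBPATHS.map { |subpath| "#{base.chomp('/')}/#{subpath}/" }
  end
end

# Placeholder hostnames, used purely for illustration.
urls = gig_urls(["https://kingston.example.org", "https://miami.example.org/"])
puts urls
```

Each city produces exactly two candidate links, one per subpath, which are then checked for local gigs.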
Then, for each link, it checks whether that city actually has gigs posted. This matters because whenever a city has no gigs of its own, CL shows gigs from “Nearby cities” instead. To prevent duplication, the script detects that case and eliminates the cities that don’t have uniquely posted gigs. However, it does not eliminate a gig that has the exact same text and is posted in two different cities, because, well, I hadn’t gotten there yet.
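The filtering step could be sketched roughly as below. This is an illustration rather than the original code. It assumes each listing page has already been fetched into a string, and that a page with no local gigs can be recognized by some marker text. The `NEARBY_MARKER` pattern, the helper name, and the sample HTML are all assumptions, and the real markup on CL may differ.

```ruby
# Assumption: a listing page that falls back to other cities' gigs contains
# this marker text somewhere in its HTML. The real page markup may differ.
NEARBY_MARKER = /nearby cities/i

# Hypothetical helper: `pages` maps a listing URL to its already-fetched HTML.
# Returns only the URLs whose pages do not contain the fallback marker.
def cities_with_local_gigs(pages)
  pages.reject { |_url, html| html.match?(NEARBY_MARKER) }.keys
end

# Made-up sample data for illustration.
pages = {
  "https://kingston.example.org/web/"  => "<p>Rails developer wanted</p>",
  "https://smalltown.example.org/web/" => "<h4>Results from Nearby Cities</h4><p>Rails developer wanted</p>"
}

valid = cities_with_local_gigs(pages)
puts valid
```

Like the script described above, this sketch drops whole cities only. It makes no attempt to detect the same gig text posted in two different cities.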
Once the script has a list of valid cities with gigs posted, it parses each of the links on the first page for those cities (i.e., up to 100 links per city, since CL paginates 100 links at a time) for the keywords I specified. The upside of using only the first 100 links in each city is that those are the most recent. The downside is that in active cities, the 100 most recent links aren’t always a representative sample of the entire population.
For the Rails results, I have the following keywords: rails, (ruby on rails), (ruby on rails 3), (rails 3), (rails 2).
For the Ruby results, I have the following keywords: ruby, (ruby 1.8.7), (ruby 1.9.2), (ruby 1.9.3), ruby187, ruby192, ruby193.
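The keyword matching could be sketched like this. Again, this is an illustration and not the original script. The keyword lists are copied from the text above. The word-boundary matching, the case-insensitivity, the helper names, and the sample posting titles are all assumptions.

```ruby
# Keyword lists as given in the text above.
RAILS_KEYWORDS = ["rails", "ruby on rails", "ruby on rails 3", "rails 3", "rails 2"].freeze
RUBY_KEYWORDS  = ["ruby", "ruby 1.8.7", "ruby 1.9.2", "ruby 1.9.3",
                  "ruby187", "ruby192", "ruby193"].freeze

# Only the first page of a city is scanned, i.e. at most 100 postings.
PAGE_LIMIT = 100

# Assumption: match case-insensitively and only on word boundaries, so that
# "rails" does not match inside an unrelated word such as "Trailside".
def keyword_pattern(keywords)
  alternatives = keywords.sort_by { |k| -k.length }.map { |k| Regexp.escape(k) }
  /\b(?:#{alternatives.join('|')})\b/i
end

# Hypothetical helper: `postings` is an array of posting texts, newest first.
def matching_postings(postings, keywords, limit: PAGE_LIMIT)
  pattern = keyword_pattern(keywords)
  postings.first(limit).select { |text| text.match?(pattern) }
end

# Made-up sample titles for illustration.
sample = [
  "Need a Rails 3 developer",
  "Ruby 1.9.3 scripting gig",
  "WordPress theme tweak",
  "Trailside cafe website"
]

rails_hits = matching_postings(sample, RAILS_KEYWORDS)
ruby_hits  = matching_postings(sample, RUBY_KEYWORDS)
puts rails_hits
puts ruby_hits
```

One consequence of these lists is worth keeping in mind when reading the counts: any posting that mentions "ruby on rails" matches both the Rails and the Ruby keywords, so the two result sets can overlap.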