
How to scrape with OutWit Hub

Data scraping is one of those data journalism phrases that I encountered and thought – yep, never going to be able to do that. But it sounds a lot scarier than it is.

In this how-to guide I’ll run through how to do a really simple scrape using the free version of OutWit Hub.


In honour of 4/20 (unofficial national marijuana day in America), I wrote a piece about GE 2015 candidates from the Cannabis is Safer Than Alcohol (CISTA) Party to practise and refresh my scraping skills.

I’ll walk you through the scraping process I used to get the following data from the CISTA website:

a) Candidate name

b) Candidate constituency

I then mapped the candidates according to where they were standing for election, using the mapping tool CartoDB.

What does OutWit Hub do?

At its most basic, OutWit Hub retrieves text from between two ‘markers’ defined by you, the user. It delves into the source code of a web page and recovers the data you want.
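If it helps to see the idea outside the tool, here is a minimal Python sketch of ‘text between two markers’. It is only an illustration of the concept, not how OutWit Hub is actually built; the function name `text_between` and the tiny page source are invented for the example.

```python
def text_between(source, before, after):
    """Return every piece of text found between `before` and `after`."""
    results = []
    position = 0
    while True:
        # Look for the next 'marker before', starting where we left off.
        start = source.find(before, position)
        if start == -1:
            break
        start += len(before)
        # Look for the first 'marker after' that follows it.
        end = source.find(after, start)
        if end == -1:
            break
        results.append(source[start:end].strip())
        position = end + len(after)
    return results


# Hypothetical page source, invented for illustration only.
page = "<li><b>Apple</b></li><li><b>Pear</b></li>"
fruits = text_between(page, "<b>", "</b>")
print(fruits)
```

Every time the first marker appears, the sketch grabs whatever sits between it and the next occurrence of the second marker, which is why picking distinctive markers matters so much.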

Step 1: Download OutWit Hub
The free version limits you to scraping 100 rows of data, but that is more than enough for the 32 CISTA candidates.

Step 2: Open OutWit Hub, then copy and paste the address of the CISTA candidate web page into the URL bar at the top of the application. This will also show the source code of that particular web page.


Step 3: Click on scrapers, then create a new scraper

Step 4: For the first line, under the ‘Description’ column, put ‘Candidate name’. This line will be where we’ll pull out the candidate name.

Step 5: Go to ‘marker before’, and put in: <div class="col-md-3">

Step 6: Go to ‘marker after’, and put in: </h3>

Step 7: For the second line, under the ‘Description’ column, put ‘Constituency’. This will be where we’ll pull out the constituency the candidate is standing in.

Step 8: Go to ‘marker before’, and put in: </a>

Step 9: Go to ‘marker after’, and put in: </p>

A screenshot probably illustrates the process better than me just writing about it, so your screen should now look something like this:

OutWit Hub showing before and after markers

Step 10: Click ‘execute’

Congratulations! You have successfully scraped using OutWit Hub.

OutWit Hub showing what final scrape should look like

I still find it a bit touch and go deciding exactly which bits of source code to use as markers. In this case, you can see that the scraper pulled out one unwanted result (the top one) that also fit into the before and after markers we specified – it’s not an exact science, but you can delete outliers.
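For the curious, the two-line scraper from Steps 4 to 9 can be roughly sketched in Python. Treat this as a sketch under assumptions: the markup in `SAMPLE`, the names in it and the helper functions are all made up (the real CISTA page will look different), and the way rows are assembled here is my own simplification rather than OutWit Hub’s actual logic. Only the four markers come from the steps above.

```python
import re


def first_between(chunk, before, after):
    """Return the text between the first `before` and the next `after`, or ''."""
    start = chunk.find(before)
    if start == -1:
        return ""
    start += len(before)
    end = chunk.find(after, start)
    if end == -1:
        return ""
    # Drop any leftover tags (such as <h3>) and surrounding whitespace.
    return re.sub(r"<[^>]+>", "", chunk[start:end]).strip()


def scrape(source):
    """Build one (name, constituency) row per block opening with the name marker."""
    block_marker = '<div class="col-md-3">'
    rows = []
    # Everything before the first block marker is skipped.
    for block in source.split(block_marker)[1:]:
        block = block_marker + block
        name = first_between(block, block_marker, "</h3>")
        constituency = first_between(block, "</a>", "</p>")
        rows.append((name, constituency))
    return rows


# Hypothetical markup, invented for illustration; the real page will differ.
SAMPLE = """
<div class="col-md-3"><h3>Meet the team</h3><p><a href="/about">About</a> Read more</p></div>
<div class="col-md-3"><h3>Alex Example</h3><p><a href="/alex">Profile</a> Northtown East</p></div>
<div class="col-md-3"><h3>Sam Sample</h3><p><a href="/sam">Profile</a> Southville</p></div>
"""

rows = scrape(SAMPLE)
# The top row fits both markers but is not a candidate, so delete it by hand.
candidates = rows[1:]
print(candidates)
```

The first block in the made-up sample plays the part of the outlier: it matches both pairs of markers without being a candidate, so it is dropped after a look at the results, just as you would delete the stray row in OutWit Hub.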

You can now export your results (click ‘Export’).

Excel spreadsheet of candidates and their constituencies
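If you ever want to reproduce the export step yourself, a spreadsheet-friendly file is just comma-separated text. The sketch below uses Python’s standard `csv` module; the rows, the `rows_to_csv` helper and the `candidates.csv` file name are hypothetical and only there to show the shape of the output.

```python
import csv
import io


def rows_to_csv(rows):
    """Turn (name, constituency) rows into CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Candidate name", "Constituency"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows, invented for illustration only.
rows = [("Alex Example", "Northtown East"), ("Sam Sample", "Southville")]
csv_text = rows_to_csv(rows)
print(csv_text)

# To save a file that Excel can open (file name is an assumption):
# with open("candidates.csv", "w", newline="", encoding="utf-8") as f:
#     f.write(csv_text)
```

The header names match the two ‘Description’ values used in the scraper, so the columns line up with what you see in OutWit Hub.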

Cannabis Rally in Lincoln, NE (11)

Would you vote to legalise cannabis?

CISTA (Cannabis is Safer Than Alcohol) is a new political party campaigning for drug reform in the UK, starting with cannabis.

It argues that the UK’s “war on drugs” has failed, and points to the increasing number of states in the US which have legalised cannabis as examples of how a more progressive drugs policy can have tangible benefits.

4/20 (the date when marijuana-smokers in the United States traditionally gather together and smoke up) may have been and gone last week.

But for those bemoaning the fact that cannabis is still an illegal drug in the United Kingdom, fear not – you may be one of the lucky ones who happens to live in one of the 32 constituencies in which pro-legalisation political party CISTA is fielding candidates.

MAPPED: Are CISTA fielding a candidate in your constituency?

The party will be fielding candidates in 32 UK constituencies.

But how many people actually support cannabis legalisation?

In their online manifesto, CISTA claims that “84% of the UK population now concede that this so-called ‘War on Drugs’ has failed and cannot be won”. But does acknowledging a failure of policy automatically translate into direct support for drugs legalisation?

According to YouGov, more British adults still oppose legalising marijuana in the UK than support it, with 32% of respondents saying that it should be legalised compared to 49% against. In fact, Britain lags behind Germany and the United States, both of which have higher percentages of the population coming out in favour of legalisation.

Residents in Scotland are the most pro-legalisation (with 39% in favour of legalising marijuana), followed by those living in London (35% in favour). Perhaps unsurprisingly, the younger you are the more pro-legalisation you are likely to be, with 48% of 18-24 year olds in favour of legalising marijuana, compared to just 20% of those over 60.

Support for legalisation of marijuana


Featured image – Flickr: Jonathan Reyes

Article updated to include CartoDB map and plot.ly graph on 27/04/15