Parli-N-Grams is an N-Gram viewer. The N-Grams it displays are extracted from Hansard, the archive of parliamentary debates, and more specifically from the House of Commons debates.
Parli-N-Grams is built from a collection of scripts: a harvesting component that collects the debate files, a parsing component that extracts the N-Grams and builds a database model, and a data visualization component that creates the charts.
To try it, type some words in the field and click on search. You can add multiple N-Grams using the "+" button.
The N-Grams are extracted from the original text using word2phrase and word2vec. The viewer supports 1- through 8-grams.
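To illustrate what a 1- through 8-gram is, here is a minimal sketch in Python that counts every contiguous run of one to eight words in a sentence. It is not the project's extraction code: it uses a plain sliding window rather than word2phrase or word2vec, and the names tokenize, count_ngrams and MAX_N, the tokenization rule, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter

MAX_N = 8  # the viewer supports 1- through 8-grams


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    This tokenization rule is an assumption for the example only.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def count_ngrams(text, max_n=MAX_N):
    """Count every contiguous run of 1..max_n tokens in the text."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the token list.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Made-up sample sentence, not taken from Hansard.
sample = "The honourable member asked the Prime Minister about the budget."
counts = count_ngrams(sample)
print(counts["the"])             # occurrences of the 1-gram "the"
print(counts["prime minister"])  # occurrences of the 2-gram "prime minister"
```

A sliding window keeps every word sequence, whereas a phrase-detection tool such as word2phrase keeps only word combinations that occur together often, so the real pipeline would likely produce a much smaller set of N-Grams.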
The data is refreshed once a day, in the early morning, when new data becomes available.