A pet project of mine, born out of trying to get a meaningful view of Galway City Council meeting minutes (see Council Meetings), is to parse those minutes. The idea is to extract contextual information such as the councillors and officials present, the agenda items, and the chronology and outcome of motions. I haven’t fully considered the end presentation; since I’ve been inspired by KildareStreet.com, I may look at the open source software used there. Of course, the minutes are written for people to read, which makes them a challenge to process automatically. I’ve opted to use Perl, and so far the approach is to convert the document to plain text and split it on a list of pre-defined section headings, i.e. agenda items:

Consideration of Minutes

Reports of Committee Meetings

Consideration of Reports of Officials

Consideration of Reports of Mayor

Business Prescribed by Statute

Notice of Motions

Conferences

Questions

Correspondence

Any Other Business

Anything prior to the first occurrence of one of these section headings is treated as preamble, i.e. a clump of text that I’m not sure what to do with yet. So far the preamble seems to be ad hoc and may contain information about things like presentations to the councillors.
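To make the splitting step concrete, here is a minimal sketch of the idea. It is written in Python purely for illustration (my own code is in Perl, and the regular expression carries over more or less unchanged). The names `HEADINGS`, `HEADING_RE` and `split_sections` are made up for this example, and it assumes each heading sits on a line of its own in the plain-text conversion, which may not hold for every set of minutes.

```python
import re

# Agenda headings from the list above. Matching them only when they occupy
# a whole line is an assumption about how the plain-text conversion looks.
HEADINGS = [
    "Consideration of Minutes",
    "Reports of Committee Meetings",
    "Consideration of Reports of Officials",
    "Consideration of Reports of Mayor",
    "Business Prescribed by Statute",
    "Notice of Motions",
    "Conferences",
    "Questions",
    "Correspondence",
    "Any Other Business",
]

# One alternation of all headings, tolerant of case and of spaces or tabs
# around the heading on its line.
HEADING_RE = re.compile(
    r"^[ \t]*(" + "|".join(re.escape(h) for h in HEADINGS) + r")[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_sections(text):
    """Return (heading, body) pairs in document order.

    Text before the first recognised heading is labelled 'Preamble'.
    Headings are returned as they appear in the document.
    """
    sections = []
    heading, start = "Preamble", 0
    for match in HEADING_RE.finditer(text):
        # Close off the section that was open until this heading.
        sections.append((heading, text[start:match.start()].strip()))
        heading, start = match.group(1), match.end()
    # Whatever follows the last heading belongs to that heading.
    sections.append((heading, text[start:].strip()))
    return sections
```

A heading that appears twice simply produces two sections, so nothing is lost if an agenda item is revisited later in the meeting.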

I’m looking to split out proposals using the delimiters ‘proposed’ and ‘seconded’ (i.e. the sentences containing these words mark the beginning and the end of a proposal). That’s the next task. Relating each proposal to a motion reference will then be a fun challenge. Here’s an example:

Cllr. T. Costello proposed that:
“A Special Meeting be held in two weeks time i.e. 26th January to discuss Lead Contamination in the Public Watersupply.”
This was seconded by Cllr. Brolcháin N.Ó.
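As a rough sketch of how a proposal like this could be pulled apart, the following uses ‘proposed’ and ‘seconded by’ as the start and end markers. Again it is Python for illustration only, not my Perl code. The names `PROPOSAL_RE`, `QUOTE_RE` and `extract_proposals` are hypothetical, and the pattern assumes names are prefixed with ‘Cllr.’ and that the seconder’s name runs to the end of its line, which is true of the example above but not guaranteed elsewhere.

```python
import re

# From a line naming the proposer up to the line naming the seconder.
PROPOSAL_RE = re.compile(
    r"(?P<proposer>Cllr\.[^\n]*?)\s+proposed\b"     # e.g. 'Cllr. X proposed'
    r"(?P<body>.*?)"                                # text in between, across lines
    r"seconded\s+by\s+(?P<seconder>Cllr\.[^\n]*)",  # name runs to end of line
    re.DOTALL,
)

# The motion text, if it is wrapped in curly or straight double quotes.
QUOTE_RE = re.compile(r"[\u201c\"](.+?)[\u201d\"]", re.DOTALL)


def extract_proposals(text):
    """Return a list of dicts with 'proposer', 'motion' and 'seconder' keys."""
    proposals = []
    for match in PROPOSAL_RE.finditer(text):
        body = match.group("body")
        quoted = QUOTE_RE.search(body)
        # Fall back to the raw in-between text when no quoted motion is found.
        motion = quoted.group(1) if quoted else body
        proposals.append({
            "proposer": match.group("proposer").strip(),
            "motion": " ".join(motion.split()),  # collapse line breaks and runs of spaces
            "seconder": match.group("seconder").strip(),
        })
    return proposals
```

One known weakness: a proposal that is never seconded would swallow everything up to the next ‘seconded by’, so a real version would need a way to detect and skip those.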

I’ve opted to use Perl for this. While I’m not proficient in Perl, it has certainly made things a lot easier: easy string handling and accessible regular expressions are key here, I think.

Since I’m not too bright, I’d really appreciate any thoughts people have on this. It may be a fool’s errand, but I see no point in keeping it to myself and giving up easily.