Speedup ideas:
- if patch fails, try parsing without patch, then erase patch if that works
- make it easier to erase patch and force reparse - special patchtool command line option
- Add auto-updating ministerships, so don't have to manually move date ranges for solicitor general et al (see member-aliases.xml)

These are errors found each day when running the scraping job
Out of sequence dates are when the rescraper picked up new stuff?
Dates are quite rough as sometimes I put today's date rather than the error date

2004-07-23 Advocate General disambiguation by office, date moved forward
2004-07-21 Alias for new MP (Parmjit Gill often includes middle name)
2004-07-21 Missing "." in time.
2004-07-20 Missing column/date at start of day's debate
2004-07-20 "Ivor Caplin)" extra close bracket in name
2004-07-16 Beginning headers, missing center tag
2004-07-15 FONT tag in middle of paragraph
2004-07-14 Missing (1) in wrans
2004-07-13 Terry Davis listed as voting when he had resigned, removed him
2004-07-07 Solicitor-General date range moved
2004-06-?? Missing "To ask the" (twice)
2004-06-?? Space in the middle of name "ji m"
2004-06-28 Missing (1) in first para multi-part wrans
2004-06-28 Spurious in wrans
2004-06-17 "Dame" added to list of titles in resolvemembernames.py
2004-06-16 Gareth Thomas disambiguation of office, date moved forward
2004-06-14 Confusing in table
2004-06-14 Style padding attributes in in a wrans
2004-06-11 Four part question broken into two sections
2004-06-10 Heading between two parts of a single written question
2004-06-08 Jim Marshall voted, when he was dead
2004-06-08 Office of name broken on separate line to partial name
2004-06-08 Solicitor-General date range moved
2004-05-25 Advocate General disambiguation of office, date moved forward
2004-05-21 Entirely missing speaker of wrans answer (fixed with broken-name)
2003-10-27 Totally spurious tag
2003-11-06 Extra 4 times
2003-11-06 Mangled "Opik"
2004-05-20 Missing (1) in first para multi-part wrans
2004-05-18 Missing "." in time stamp "4 58 pm"
2004-05-12 Missing "Robertson:" at end of name, limit to ask the
2004-05-17 Missing "to" in "to ask"
2004-04-29 Missing bold on name
2004-05-04 A couple of missing "to ask the"
2004-05-?? Some names needing tags
2004-05-?? Wrans multiple-questions (with one answer) split by heading
2004-05-06 Warning: pargraph numbers not consecutive
2004-05-06 Missing "to ask the"
2004-05-10 Solicitor-General disambiguation of office, date moved forward
2004-10-20 (page changed) Incorrect date heading
2004-05-12 Gareth Thomas disambiguation of office, date moved forward
2004-04-27 Missing bold formatting on name, some wrans without "to ask" at start of question
2004-04-28 Missing k in "to ask", missing "(1)" in multi-part question
2004-04-27 New entity ÷ which is division symbol
2004-04-26 Bad paragraph number in wrans
2004-04-?? Added alias with hyphen in name
2004-04-23 Word from text got into bold round name
2004-04-20 Missing (1) in two wrans, typo of "M.r", missing bold round name
2004-04-19 House of Commons header merged with next
2004-03-31 Gareth Thomas disambiguation of office, date moved forward
2004-03-19 Slightly misspelt wrans major heading CHURCH COMMISSIONER
2004-03-18 Part name inside bold: Multiple matches Gareth Thomas (Clwyd, West) (Lab)
2004-03-16 Wrans with two answers (although really a messup and the questions should be separate)
2004-03-04 Missing (1) in wrans which appears otherwise fine
2004-03-08 Marginal colnum case (missing tag after colnum)
2004-03-10 Slightly misspelt wrans major heading INTERNATIONAL DEVEOPMENT
2004-03-08 Missing "To ask the secretary of...." on wrans
2004-03-05 'all' rule broke, because there is one extra case (also /UL)
2004-03-05 Missing chunk in middle of wrans, detected because numeral (2) missing
2004-03-04 In the morning, 5 pages were repeated, fixed by afternoon
2004-03-03 Gareth Thomas disambiguation of office, date moved forward
2004-02-26 Mangled column number text: Missing formatting
2004-02-25 Mangled multi-part question: Missing (1) marker
2004-02-25 Mangled "to ask": To the Secretary of State for Environment
2004-02-23 Mangled name: Ms Hazel Blears)
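
Sketch for the first speedup idea at the top of this file (if the patch fails, try parsing without it, then erase the patch if that works). This is only a rough illustration, not the scraper's actual code: parse_with_patch_fallback, apply_patch and parse are hypothetical names, and both callables are assumed to raise an exception on failure.

```python
import os


def parse_with_patch_fallback(raw_text, patch_path, apply_patch, parse):
    """Parse a scraped page, preferring the hand-patched version.

    apply_patch(raw_text, patch_path) -> patched text (hypothetical helper)
    parse(text) -> parsed result (hypothetical helper)
    Both are assumed to raise an exception when they fail.
    """
    if not os.path.exists(patch_path):
        # No patch on file for this page: parse it as scraped.
        return parse(raw_text)

    try:
        return parse(apply_patch(raw_text, patch_path))
    except Exception:
        # The patch no longer applies, or the patched text does not parse.
        # Perhaps the source page has since been corrected, so try it bare.
        # If this also raises, the patch file is left alone for manual fixing.
        result = parse(raw_text)

    # The unpatched page parsed cleanly, so the patch is obsolete.
    os.remove(patch_path)
    return result
```

The patch file is erased only after the unpatched parse has succeeded, so a page that is still broken keeps its patch for manual editing.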