Workbench Ballot Files

The workbench reads sets of ballot data from files and analyzes them as if they were the results of an election. These workbench ballot files are described below:

1. File Description

A ballot file is a two-dimensional table of ballot information, with:

  1. The candidate names as column headers, and
  2. Each row of the file thereafter representing a single ballot, where
  3. For each ballot, the voter’s preference for a candidate is given in the column under that candidate’s name in the header. For each candidate:

    1. A preference is a number from 1 (most preferred) up to and including n (least preferred), where n is the number of candidates.
    2. Blank choices are allowed, and are treated as being marked with the least preference, n.
    3. Non-numeric values other than blank, and numeric values less than 1 or more than n, are invalid and invalidate the ballot as a whole.
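
The preference rules above can be sketched in a few lines of Python. This is only an illustration, not the workbench's own (Java) code: the function name `normalize_ballot` is hypothetical, and how a row with too few or too many cells should be handled is not stated here, so the sketch leaves that case alone.

```python
from typing import List, Optional

def normalize_ballot(cells: List[str], n: int) -> Optional[List[int]]:
    """Return the preferences as integers, or None if the ballot is invalid.

    `cells` holds one text value per candidate column; `n` is the number
    of candidates. (Hypothetical helper, for illustration only.)
    """
    prefs = []
    for cell in cells:
        text = cell.strip()
        if text == "":
            prefs.append(n)  # blank is treated as least preference, n
        elif text.isascii() and text.isdigit() and 1 <= int(text) <= n:
            prefs.append(int(text))
        else:
            return None  # one bad value invalidates the whole ballot
    return prefs

print(normalize_ballot(["1", "", "3"], 3))   # blank becomes 3
print(normalize_ballot(["1", "x", "3"], 3))  # invalid ballot
```

Returning `None` for the whole ballot, rather than dropping the one bad cell, mirrors rule 3.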

Such textual data are presented in a ballot file as Comma-Separated-Values (CSV), typically in files having a “.csv” suffix:

  1. Values such as candidate names, which might themselves contain commas, are enclosed in quotes, e.g. “family-name, given-name”.
  2. Blank (empty) rows are ignored in the analysis (but counted as file rows, in order to identify the locations of ballots within the file).
  3. The first non-empty row in the file is taken as the list of candidates, given in the order in which they are to appear on the ballot.
  4. Each subsequent non-empty row is taken as a ballot, and each ballot is identified by its row number within the file. Row numbers are zero-based: the first row in the file is row 0.
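
As a rough sketch of the four reading rules above, the following Python uses the standard `csv` module. The function name `read_ballot_rows` and the sample ballots are made up for illustration; treating a row whose cells are all blank as an "empty" row is also an assumption, since the text only says that empty rows are ignored.

```python
import csv
import io

def read_ballot_rows(text):
    """Return (candidates, ballots); ballots is a list of (row_number, cells).

    Hypothetical helper. Row numbers are zero-based and count blank rows,
    assuming no quoted value spans more than one line.
    """
    candidates = None
    ballots = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row_number, row in enumerate(reader):
        if not any(cell.strip() for cell in row):
            continue  # blank row: ignored, but still counted as a file row
        if candidates is None:
            candidates = [cell.strip() for cell in row]  # first non-empty row
        else:
            ballots.append((row_number, row))
    return candidates, ballots

# Illustrative content only, not one of the bundled scenario files.
sample = '"Duch, Dawn", "Mowz, Mike"\n\n1,2\n2,1\n'
candidates, ballots = read_ballot_rows(sample)
print(candidates)
print(ballots)
```

Here the header is row 0, the blank line is row 1, and the two ballots are reported as rows 2 and 3.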

Such a ballot file might look like this:

“Duch, Dawn”, “Mowz, Mike”, “Samm, Yosem T.”, “Yote, Wally C”, “Ruhner, Rod”

2. Creating Ballot Files

In a real election a ballot file would perhaps be read from an optical-reader.

For our purposes in the workbench, to explore the concepts, one might create a spreadsheet of election data using a tool such as Microsoft Excel, OpenOffice Calc, or LibreOffice Calc.

In LibreOffice Calc the above data would look like this:

“Export” such a set of data as a “Comma-Separated Values” (.csv) file.

3. Accessing Ballot Files

When the workbench application is run, a default “scenario 1” ballot file that is bundled with the software is opened.

To open a different ballot file, use the “File” menu, “Open Ballot-File” option:

  1. There you will see one or more “Scenario” files, which represent various sample ballot files that are bundled with the application, as well as
  2. A “Find Ballot File” option, which invokes a dialog for specifying a different file from your computer to be used instead.

This invokes a dialog to specify the particulars of the ballot-file to be used:

The “Find Ballot-File” button here invokes the actual file-open dialog, with which you can navigate to and select a file from your computer.

Once a suitable file is selected, the ballot-file dialog might look like this:

The dialog displays the column names (Candidates), as acquired from the data, and, on the “Preview (Sanity Check)” tab, a list of the ballot data themselves.

It turns out that there are variations in how such “CSV” files are constructed:

  1. Sometimes the values are indeed separated by commas, but sometimes tab-characters, or even other characters are used.
  2. The file needs to be read in keeping with the specifications used to create it, otherwise you get gibberish.
  3. The “File Specification” tab allows these choices to be specified, and when changed, a “Refresh” button will appear.
  4. Upon clicking the Refresh button the data will be re-read, and you can see if they make better sense.
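
To see why the separator matters, here is a small Python sketch (not the workbench's own code) that reads the same tab-separated text twice, once with the wrong separator and once with the right one. The helper name `parse_rows` and the sample text are hypothetical.

```python
import csv
import io

def parse_rows(text, delimiter=",", quotechar='"'):
    """Split text into rows of cells using the given separator and quote."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    return [row for row in reader]

# Illustrative tab-separated content.
tab_text = "Duch\tMowz\tSamm\n1\t2\t3\n"

wrong = parse_rows(tab_text, delimiter=",")   # each row stays one unsplit cell
right = parse_rows(tab_text, delimiter="\t")  # three cells per row

print(wrong)
print(right)
```

Read with a comma separator, each row comes back as a single cell containing the tab characters, which is the kind of gibberish a refresh with the correct specification clears up.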

When you press “OK,” the data will be transferred into the main application tabs, and analyzed.

4. Scenarios

As noted above, a number of ballot files have been bundled with the application, for user convenience as well as to explicitly demonstrate specific cases. A description of these follows:

Winner loses, Loser wins

This is a small data set, 22 ballots:

  1. It demonstrates that the Condorcet winner (the candidate who wins a majority in every one-to-one match-up against each other candidate, and who therefore comes in first by Condorcet/Ranked-Pairs or any other Condorcet method) can come in dead last by FPTP (and IRV), and in this case does.
  2. Similarly, it shows that the Condorcet loser, who loses every one-to-one match-up against the other candidates, can win a FPTP election on a plurality, and in this case does.
  3. It also shows a second-place Condorcet/Ranked-Pairs tie. This is not a preference cycle, and involves no non-affirmed pairs – which means that these candidates would tie by any Condorcet method.

    It means that after all the preferences have been holistically considered the voters ranked these two the same.

    Arguably, the chances of such a precise equality of preference diminish with a larger data set so as to be considered relatively unlikely, but it is, nevertheless, possible with any size of election.

    As for any system, a tie is possible in any position. When we’re electing a single candidate it is of consequence, of course, only when it is a first-place tie.

    Whether, or how, to resolve such a tie is a matter external to the particular voting system itself, and its handling would be specified by the enabling legislation, which could range from another election to a coin flip.
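
The "winner loses, loser wins" effect can be reproduced with a much smaller, made-up data set than the bundled 22-ballot scenario. The Python below is a minimal sketch, not the workbench's implementation: the function names and the 9 ballots are illustrative, a ballot is assumed to be a mapping from candidate to preference number, and "wins a match-up" is taken to mean "is ranked ahead on more ballots than the opponent".

```python
from itertools import combinations

def pairwise_counts(candidates, ballots):
    """counts[(x, y)] = number of ballots ranking x strictly ahead of y."""
    counts = {(x, y): 0 for x in candidates for y in candidates if x != y}
    for ballot in ballots:
        for x, y in combinations(candidates, 2):
            if ballot[x] < ballot[y]:
                counts[(x, y)] += 1
            elif ballot[y] < ballot[x]:
                counts[(y, x)] += 1
    return counts

def condorcet_winner(candidates, ballots):
    """The candidate who beats every other one-to-one, or None."""
    c = pairwise_counts(candidates, ballots)
    for x in candidates:
        if all(c[(x, y)] > c[(y, x)] for y in candidates if y != x):
            return x
    return None

def condorcet_loser(candidates, ballots):
    """The candidate who loses to every other one-to-one, or None."""
    c = pairwise_counts(candidates, ballots)
    for x in candidates:
        if all(c[(x, y)] < c[(y, x)] for y in candidates if y != x):
            return x
    return None

def fptp_tally(candidates, ballots):
    """Count first preferences (ballots marked 1 for the candidate)."""
    return {x: sum(1 for b in ballots if b[x] == 1) for x in candidates}

# Made-up ballots: 4 rank A>C>B, 3 rank B>C>A, 2 rank C>B>A.
names = ["A", "B", "C"]
votes = ([{"A": 1, "B": 3, "C": 2}] * 4
         + [{"A": 3, "B": 1, "C": 2}] * 3
         + [{"A": 3, "B": 2, "C": 1}] * 2)

print(fptp_tally(names, votes))
print(condorcet_winner(names, votes), condorcet_loser(names, votes))
```

In this toy set, C has the fewest first preferences yet beats both A and B one-to-one, while A leads on first preferences yet loses both of its match-ups.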

All Systems Go

This is a small data set of 5 candidates, 16 ballots, contrived such that:

  1. The Condorcet/Ranked-Pairs winner is also the FPTP (plurality) winner, as well as the IRV winner. It can happen.
  2. We also have no Condorcet winner in this case, because of a preference cycle in the data.
  3. It’s really a setup, though, for the next scenario – to see what happens when we throw in another candidate who is “like” one of the others.

Similar Candidates

Here we take the ballots of the previous scenario and add another candidate “similar” to the former winner:

  1. Basically, we attempt to “slot” the new candidate into the ballots in roughly the same preference as the former winner.
  2. Now, we get the same Condorcet/Ranked-Pairs winner as before, and the new “similar” candidate is, all in all, second in preference to it; but we get a different FPTP winner, and yet a different IRV winner. In fact, the “similar” candidates now are both eliminated in IRV prior to the final round.
  3. The point demonstrated here is that the outcome for FPTP and IRV can be skewed by introducing a similar candidate, but that Ranked Pairs is insensitive to this. A strong FPTP or IRV candidate can be weakened, or even defeated, by having friends on the ballot.

IRV Note:

  • In the case where there are multiple candidates tied for last-place in any round, there are options on how to proceed.
  • One could eliminate them one at a time, in some manner, or just eliminate them all in one fell swoop, which is what is done here.
  • It’s possible, however, that a wholly different outcome could ensue from eliminating them one by one: the ballots of the one eliminated could transfer to the other last-place candidates, which could entirely change their ranking for the next round, and one of them could conceivably even go on to win. The problem, then, is how to decide which to eliminate first.
  • Since we’re not here to write the definitive IRV system, we take the short, simple, approach as sufficient to demonstrate the broad differences among the three systems.
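
The "one fell swoop" approach described in this note might look something like the following Python sketch. It is an assumption-laden illustration rather than the workbench's code: the function name `irv_winner` and the ballots are made up, a ballot is a mapping from candidate to preference number, a ballot whose top remaining preference is shared by several candidates is left out of that round, and a winner needs more than half of the ballots counted in the round.

```python
def irv_winner(candidates, ballots):
    """IRV that eliminates all candidates tied for last place at once.

    Returns the winner, or None if every remaining candidate is tied.
    """
    remaining = list(candidates)
    while remaining:
        tally = {c: 0 for c in remaining}
        for ballot in ballots:
            best = min(ballot[c] for c in remaining)
            top = [c for c in remaining if ballot[c] == best]
            if len(top) == 1:  # ambiguous ballots sit out this round
                tally[top[0]] += 1
        counted = sum(tally.values())
        leader = max(remaining, key=lambda c: tally[c])
        if tally[leader] * 2 > counted:
            return leader  # majority of the ballots counted this round
        lowest = min(tally.values())
        # Drop every candidate tied for last place in one step.
        remaining = [c for c in remaining if tally[c] > lowest]
    return None

# Made-up ballots: C and D tie for last in round one and go out together.
names = ["A", "B", "C", "D"]
votes = ([{"A": 1, "B": 2, "C": 3, "D": 4}] * 4
         + [{"A": 2, "B": 1, "C": 3, "D": 4}] * 3
         + [{"A": 3, "B": 2, "C": 1, "D": 4}] * 1
         + [{"A": 3, "B": 2, "C": 4, "D": 1}] * 1)

print(irv_winner(names, votes))
```

In round one the tally is A 4, B 3, C 1, D 1, so C and D are removed together; their ballots then both transfer to B, who wins the second round 5 to 4.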

FPTP Majority Winner

Another small data set, 20 ballots:

  1. This set is constructed such that there is a FPTP majority winner.
  2. The first-place majority also means that this candidate is preferred over every other candidate in all one-to-one match-ups, and he or she is thus the Condorcet winner. This candidate is therefore also the Condorcet/Ranked-Pairs winner, and would win by any Condorcet method.
  3. This candidate is also the IRV winner since with a FPTP majority, there is no need to eliminate a lowest candidate for IRV. With a FPTP majority winner, there is only a single IRV round.

Large Data Set

The workload for counting Condorcet/Ranked-Pairs is proportional to the number of ballots counted, and to the number of pairs. If there are n candidates, there are n(n-1)/2 pairs. With a lot of candidates, in particular, this can get out of hand for a manual count. It’s fair to wonder about an electronic count.
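
The pair count is easy to check by enumeration. The short Python sketch below uses hypothetical names (`pair_count`, and placeholder candidate labels `C1` to `C10`) and compares the formula against an explicit list of pairs; the 2000-ballot figure in the last lines is just an example workload.

```python
from itertools import combinations

def pair_count(n):
    """Number of distinct candidate pairs among n candidates: n(n-1)/2."""
    return n * (n - 1) // 2

# Placeholder candidate labels, for illustration only.
candidates = [f"C{i}" for i in range(1, 11)]
pairs = list(combinations(candidates, 2))

print(pair_count(5), pair_count(10), len(pairs))

# One pairwise comparison per pair per ballot.
example_ballots = 2000
comparisons = example_ballots * pair_count(10)
print(comparisons)
```

Doubling the candidates from 5 to 10 takes the pair count from 10 to 45, so the work grows faster than the number of candidates does.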

This data set doubles the number of candidates from the previous scenarios to 10 (so there are now 45 pairs), and significantly increases the number of ballots (2000).

  1. The preferences in each ballot are randomly generated in their originating spreadsheet, with no attempt to avoid marking multiple candidates with the same preference.

    These ballots are perfectly valid for Condorcet/Ranked-Pairs, but a ballot that gives its top preference to more than one candidate cannot be allocated to a single candidate in FPTP or IRV; such ballots will be rejected in those cases.

  2. In IRV, as candidates are eliminated in each round, such same-preference cases can become unambiguous, so that the rejected ballots can come back into play in subsequent counting rounds.
  3. With 10 candidates, and preferences more or less randomly determined, the distribution is fairly even. This is a measure of the randomness of the derivation. For the analysis, however, it means, in particular, an extremely weak FPTP winner.

    It demonstrates the point; but in real life, voters tend to second-guess the outcome and cast their votes according to such expectations (Duverger’s law), rather than voting their true preferences. This means that the distribution would likely be not so even but would tend to elevate artificially two or three of the perceived “most-likely” candidates.
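
The rejection of same-preference ballots noted in items 1 and 2 above, and their return to the count in later IRV rounds, can be illustrated with a small Python sketch. The function name `allocate` and the sample ballot are hypothetical, and a ballot is assumed to be a mapping from candidate to preference number.

```python
def allocate(ballot, remaining):
    """Return the single most-preferred remaining candidate, or None.

    None means the ballot is ambiguous for this round: its best
    preference is shared by more than one remaining candidate.
    """
    best = min(ballot[c] for c in remaining)
    top = [c for c in remaining if ballot[c] == best]
    return top[0] if len(top) == 1 else None

# Made-up ballot marking both A and B as first preference.
ballot = {"A": 1, "B": 1, "C": 2}

print(allocate(ballot, ["A", "B", "C"]))  # ambiguous: rejected this round
print(allocate(ballot, ["B", "C"]))       # A eliminated: ballot counts for B
```

Once A is eliminated, the tie at the top of this ballot disappears and it can be allocated to B.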

In terms of bulk numbers, however, consider running such a count for a poll in a BC election. A general poll would have, give or take, 400 ballots (if everyone voted, and they don’t), an advance poll maybe triple that, and there are rarely as many as 10 candidates (though that could change). By that measure, this data set is a reasonable stress test for an implementation of the method.

This Java “workbench” application handles this larger count without breaking a sweat, in an almost inappreciable time frame.

This suggests that performance would be a non-issue for a purpose-built poll application, a polling-station application, or even an electoral-district equivalent, whether written in Java or something faster.

