A Legit Analysis

Willem Vandercat of ROPtimus Prime has posted a fantastic analysis of our 2013 finals game, based on the data we released: A BS Analysis Based on Legit Data. This blog post is the requested response, so read their post first. It's really good!

1. Review

A: As far as we can tell, these flags became completely inaccessible to the teams and were thus effectively destroyed, invalidating the zero-sum assertion.

The plan was to make a list of all the teams that redeemed flags this round, and split up the remainders evenly, holding on to the leftovers for next round:

scoring_teams = Team.
  joins(:redemptions).
  where(redemptions: {round: ending_round}).
  distinct.to_a

Unfortunately, I wasn’t storing which round redemptions were made in:

scorebot=# select round_id, count(*) from redemptions group by round_id;
round_id | count
----------+-------
   NULL   | 18880
(1 row)

This meant that for the purposes of remainder reallocation, nobody ever scored, and we never reallocated those flags.
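To see why an empty list of scoring teams stalls reallocation, here is a minimal sketch in plain Ruby. The method name `reallocate_remainder` and its arguments are hypothetical, not the scorebot's actual code; it only assumes the even-split-with-leftovers plan described above.

```ruby
# Hypothetical sketch: split a pool of remainder flags evenly among the
# teams that scored this round, carrying indivisible leftovers forward.
def reallocate_remainder(pool, scoring_teams)
  # With no scoring teams there is nobody to pay out, so the whole pool
  # is carried over untouched.
  return [{}, pool] if scoring_teams.empty?

  share, leftover = pool.divmod(scoring_teams.size)
  payouts = scoring_teams.to_h { |team| [team, share] }
  [payouts, leftover]
end

payouts, carried = reallocate_remainder(7, %w[alpha bravo charlie])
# payouts => {"alpha"=>2, "bravo"=>2, "charlie"=>2}, carried => 1

_, stuck = reallocate_remainder(7, [])
# stuck => 7: the flags never leave the pool
```

If every redemption has a NULL round, the query for scoring teams comes back empty each round, which corresponds to the second call.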

Admitting you have a problem is the first step towards recovery (or maybe causing the problem is), so this won’t happen again.

B: However, it appears that flags were never initially paired to specific services (see discussion of flags table in the database section below) making it difficult to prove that this in fact was the case throughout the contest.

This was a miscommunication on my part. We debated and discussed this internally, and while it did make it into the rules document, I got mixed up over what we decided and it never made it into the source code.

E: The database provides no transaction records for flags that were redistributed as a result of failed availability checks.

This is something I want to fix for next year. It made many things more difficult for us, and makes a lot of analysis impossible. Ideally, there’ll be a separate table just for flag movements, that can join to either redemptions or availabilities.
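One possible shape for such a table, sketched as plain Ruby rather than a real migration. Every name here (`FlagMovement`, `cause_type`, `cause_id`, the helper methods) is an assumption for illustration; the only requirement taken from the text is that a movement can point at either a redemption or an availability check.

```ruby
# Hypothetical record of a single flag transfer. cause_type/cause_id point
# at whichever event triggered the move.
FlagMovement = Struct.new(:flag_id, :from_team, :to_team, :round,
                          :cause_type, :cause_id, keyword_init: true)

# All movements triggered by one kind of event (:redemption or :availability).
def movements_caused_by(ledger, cause_type)
  ledger.select { |m| m.cause_type == cause_type }
end

# Flags gained minus flags lost for one team across the ledger.
def net_change(ledger, team)
  ledger.count { |m| m.to_team == team } - ledger.count { |m| m.from_team == team }
end

ledger = [
  FlagMovement.new(flag_id: 1, from_team: "alpha", to_team: "bravo",
                   round: 12, cause_type: :redemption, cause_id: 501),
  FlagMovement.new(flag_id: 2, from_team: "alpha", to_team: "charlie",
                   round: 12, cause_type: :availability, cause_id: 77)
]

movements_caused_by(ledger, :availability).size # => 1
net_change(ledger, "alpha")                     # => -2
```

With one row per moved flag, auditing availability transfers becomes a filter over a single table.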

E:  A failed availability check resulted in the loss of 19 flags (presumably from that service's pool of flags, but again we can't tell) which were redistributed among the teams whose same service was deemed to be up in the same scoring period.

When failing an availability check, you lose 19 or all your flags, whichever is less. You can still receive flags lost by other teams.
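The "19 or all your flags, whichever is less" rule is a one-line minimum. The sketch below uses hypothetical names (`AVAILABILITY_PENALTY`, `flags_lost`, `after_failures`) and deliberately leaves out how the lost flags are handed to other teams.

```ruby
AVAILABILITY_PENALTY = 19

# Flags a team forfeits for one failed check: 19, or everything it has.
def flags_lost(flags_held)
  [AVAILABILITY_PENALTY, flags_held].min
end

# Holdings after a number of consecutive failed checks, ignoring any
# flags the team might receive in between (an illustrative simplification).
def after_failures(flags_held, failures)
  failures.times.reduce(flags_held) { |held, _| held - flags_lost(held) }
end

[100, 19, 7, 0].map { |held| flags_lost(held) } # => [19, 19, 7, 0]
after_failures(50, 3)                           # => 0 (50 -> 31 -> 12 -> 0)
```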

2. Description of the database

The final CTF score [Fig 2] appears to be derived directly from [the flags] table.

Yes:

SELECT t.name, COUNT(f.id) AS score
FROM teams AS t
LEFT JOIN flags AS f
  ON f.team_id = t.id
WHERE t.certname != 'legitbs'
GROUP BY t.name
ORDER BY
  score DESC,
  t.name ASC

ASSUMPTION: Lacking any other guidance, we assume for the purposes of this analysis that any availability status other than zero represents a failed availability check.

Correct. Status checks were separate programs (mostly Python and Perl scripts) executed by the main scoring script. A zero exit status indicates success, non-zero indicates failure. For most of the game, we logged stdout and stderr too.
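A rough illustration of that convention, not the actual scoring script: the wrapper below runs a command, keeps its stdout and stderr, and treats exit status 0 as a pass. `CheckResult` and `run_check` are hypothetical names, and the Ruby interpreter stands in for a real check script.

```ruby
require "open3"
require "rbconfig"

# Hypothetical result of one status check: exit status plus captured output.
CheckResult = Struct.new(:status, :stdout, :stderr) do
  # Zero means success; anything else (including a nil status from a
  # signal-killed process) counts as a failed check.
  def passed?
    status == 0
  end
end

# Run a check command and capture its exit status, stdout, and stderr.
def run_check(*command)
  stdout, stderr, process_status = Open3.capture3(*command)
  CheckResult.new(process_status.exitstatus, stdout, stderr)
end

# Stand-in "check scripts": the Ruby interpreter exiting with a chosen code.
ok  = run_check(RbConfig.ruby, "-e", "puts 'service up'")
bad = run_check(RbConfig.ruby, "-e", "$stderr.puts 'timeout'; exit 2")

ok.passed?  # => true
bad.passed? # => false
```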

(the round_id field in the redemptions table is unused)

Whoops!

It is important to understand that the captures table contains the only record of flags transfers within the game and that this record is further limited to flags transfers that resulted from token redemptions. There is no data in the database that allows us to audit the transfer of flags that resulted from failed availability checks.

Yup :(

3. Interesting? data extracts from the database

This section is wonderful!

ASSUMPTION: a redemption with no associated capture record indicates that the team whose token was redeemed had no flags remaining to redistribute.

Yeah. Can’t get water from a stone.

SUBMITTING YOUR OWN TOKENS: According to the published scoring algorithm, this practice should not have increased a team's score, it merely should have stopped their score from decreasing as quickly as it might otherwise have.

This agrees with my interpretation, although the more important effect is that it also stops your opponents’ scores from increasing as quickly as they otherwise might have.

SOME THINGS WE CAN SAY ABOUT AVAILABILITY:  From rounds 1-99, LegitBS was also checked for availability and they failed 16 of these checks [Fig 8]. We have no idea whether there is any significance to this or not.

We maintained our own instance to run availability checks against, unreachable by teams, as a way to make sure our scoring scripts were working correctly. If we failed an availability check, it was because of bugginess on our end, and teams would be forgiven a failed SLA check that round.

Our availability checks only show up for the first day because we did some reëngineering of how they work Friday night. Running 84 or more checks sequentially is very time-consuming, so starting Saturday morning, we ran twenty at a time. In the refactoring to get this feature in, while we kept the result of our check in memory, we never persisted it to the database.
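Running checks twenty at a time could look something like the following. This is a sketch using plain Ruby threads, not the refactored scorebot code; `run_in_batches` and the fake checks are made up for illustration.

```ruby
BATCH_SIZE = 20

# Hypothetical sketch: run checks in concurrent batches of twenty.
# `checks` is a list of [name, callable] pairs; each callable returns an
# exit status (0 for success).
def run_in_batches(checks, batch_size: BATCH_SIZE)
  results = {}
  checks.each_slice(batch_size) do |batch|
    threads = batch.map do |name, check|
      Thread.new { [name, check.call] }
    end
    # Thread#value waits for the thread and returns the block's result.
    threads.each do |thread|
      name, status = thread.value
      results[name] = status
    end
  end
  results
end

# 84 fake checks; every tenth one "fails" with status 1.
checks = (1..84).map { |i| ["check-#{i}", -> { (i % 10).zero? ? 1 : 0 }] }
results = run_in_batches(checks)
results.size                          # => 84
results.count { |_, status| status != 0 } # => 8
```

Returning every result from one place makes it harder to keep a check's outcome in memory and forget to write it to the database.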

4. Replaying the game attempting to validate the published results

This is good, thanks for doing it!

However seven teams finished with zero flags and others may have hovered around zero at times. Any team that possessed fewer than 19 flags at the time of an availability failure lost fewer than 19, perhaps even no flags, so the formula can't be used in all cases and we must rely on a simulation of the game to compute the correct deductions.

Correct. In our internal discussions, we considered any team with fewer than about 50 flags to be "circling the drain." They might fluctuate up and down based on the incidental (not planned or sorted, but not really random) order that services were scored and redemptions were processed. We're still looking at what to do with this in 2014.

One caveat though: we adjusted flags after the game due to a network error on our part:

The misconfigured network caused teams to be incorrectly throttled in their connections to the REST API that redeemed tokens for flag captures. This meant that some teams weren't able to redeem captured tokens due to the busy and hostile network environment. Since this was discovered on Sunday morning, after a long night of discovering new vulnerabilities, it was especially painful.

We have reprocessed those expired tokens based on logs and scorebot data, since they disproportionately and unfairly affected individual teams. They are included in the final results.

5. Conclusions

Thanks for analyzing the data, and pointing out some of the shortcomings that make it difficult to audit, or make the game not match its documentation. 

If you want to run your own analyses, check out our 2013 releases at https://blog.legitbs.net/p/2013-releases.html.

2014 Registration and Other Announcements

Hello

We have two big announcements for you today!

Qualification Registration

Exciting news: registration is open for the 2014 DEF CON CTF season! Based on feedback from many of our international competitors, we have moved the qualifying weekend by a month, to May 17-19. We hope this will help with some of the time-sensitive issues of arranging international travel to Las Vegas, should you qualify.

2014 DEF CON Capture The Flag Qualification Registration

Qualifiers will be played as previously announced, May 17-May 19, Midnight UTC to Midnight UTC. The starting time in your local time zone can be found here.

Changes for 2014

  • We have some new and exciting categories for you this year! Questions will no longer be forced into categories by type, but instead by author, with the exception of...
  • Baby's first! A whole category targeted for CTF beginners. Every question in this category is unlocked at the start of the game. Answering a question from this category will not allow board control. Hints will be available for this category.
  • Quals teams aren't size limited. Qualify with the whole group, and make difficult choices about who competes in finals!
  • Automated password resets
  • Two-factor authentication
  • Less pink!

Not changed for 2014

  • Finals teams are limited to eight total players.
  • Qualifying teams get 8 human badges, 2 hotel rooms.
  • Flags will continue to be captured using computers.

Open Up

One of our goals in building our vision of the DEF CON CTF has been to be “more open”. We have opened up and released our design process, our packet captures, our finals database, and even some of our mistakes. We have made public some of the resources that generally only teams at finals get - namely the rules document and the finals services. This is much more open than any DEF CON CTF in recent memory.

We think we can do better. As our last action for the 2013 DEF CON CTF season, we're releasing many of the quals and finals services, open source, on GitHub. A few pieces remain absent - these involve code we intend to reuse in future events. Please don’t bother us for these last few pieces - they’ll be released when we’re done using them for competitions.