This is part 4 of a series of posts about Building DEF CON Capture the Flag.
Finals is the hardest part of CTF to run. A lot of it is timeboxed (we generally didn’t have access to the CTF room until Wednesday or Thursday), it’s a pain to get stuff in and out of the conference area, and there’s a definite feeling of do-or-die. Why? It’s where your reputation among the CTF community comes from, it’s an important show for DEF CON attendees, and it’s a justification for CTF’s continued existence at DEF CON and in the hacker community.
For us, preparation for finals started before the previous year's game had wrapped. The 2015 announcement about the CGC architecture for 2016 was planned, and the 2016 announcement about the custom architecture for 2017 was already a year in the making.
Preparation and Infrastructure
Designing the game itself is something that also has to get started early. We tended to spend a lot of meeting time on the scoring algorithms, plenty of both meeting and individual time on infrastructure, and a ridiculous amount of individual time on building challenges.
Our scoring algorithm had gotten really good for 2015 and 2017. Each team service instance would start with a pile of virtual flags, which could be stolen by other teams, or lost to other teams after an availability check failure. In addition to competitor teams, we had a secret team that got remainder flags after uneven divisions, and had a private instance of the challenge to verify that availability checks worked as intended. Novelty was rewarded by not sharing stolen flags with other teams, which worked amazingly well, especially with the public patches of 2017.
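To make the remainder rule concrete, here is a minimal sketch of one redistribution step. It is not our actual scoring code: the number of flags at stake, the `SECRET_TEAM` name, and the `redistribute` function are assumptions made up for illustration. It only shows flags leaving one team's service, being split evenly among the teams that earned them, and the leftover going to the secret team so no flags are created or destroyed.

```python
from collections import Counter

SECRET_TEAM = "secret"  # hypothetical name for the organizers' hidden team
FLAGS_AT_STAKE = 19     # assumed per-round amount, not the real game's value


def redistribute(victim, recipients, balances, at_stake=FLAGS_AT_STAKE):
    """Move up to `at_stake` flags from `victim` to `recipients`.

    Flags are split evenly; whatever is left after the integer
    division goes to the secret team.
    """
    if not recipients:
        return balances  # nobody earned flags, nothing moves
    pot = min(at_stake, balances[victim])
    share, remainder = divmod(pot, len(recipients))
    balances[victim] -= pot
    for team in recipients:
        balances[team] += share
    balances[SECRET_TEAM] += remainder
    return balances


balances = Counter({"a": 100, "b": 100, "c": 100, "d": 100})
redistribute("a", ["b", "c", "d"], balances)
```

With 19 flags split three ways, each recipient gets 6 and the secret team gets the remaining 1, so the total number of flags in the game stays constant.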
Infrastructure is where I spent most of my time, along with Selir. The scoreboard “scorebot” for every year but 2016 was a Rails & Postgres app, because that’s what I’m most familiar with running. The most important thing to me was making the database (mostly) append-only, creating all the database constraints and foreign keys necessary, and realizing that you only have like 200 users, tops. After 2013, we realized we needed admin screens to answer team questions. There will be lots of those.
Why append-only for the database? We found a lot of value in being able to re-score the game offline. Teams were affected by hardware faults and scoring problems because the game has to run in the real world. Since teams can't work around these issues, it's unfair to penalize them. We identify these issues during game time, and let teams know we'll reprocess them. Reprocessing is easiest to reason about when you have a separation between player-driven events (successful token redemptions, unsuccessful availability checks) and score change events (capture or penalty flag movements).
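A rough sketch of that separation, in Python rather than the Rails and Postgres we actually ran: player-driven events are immutable records, and scores are derived by replaying them. The `PlayerEvent` fields, the one-flag-per-event deltas, and the `voided_ids` mechanism are all assumptions for illustration, not our schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: events are never edited after insertion
class PlayerEvent:
    id: int
    round: int
    kind: str                     # "redemption" or "availability_failure"
    team: str                     # redeeming team, or team that failed the check
    victim: Optional[str] = None  # only set for redemptions


def rescore(events, voided_ids=frozenset()):
    """Replay every non-voided event in order and derive scores from scratch."""
    scores = {}
    for e in sorted(events, key=lambda e: (e.round, e.id)):
        if e.id in voided_ids:
            continue  # e.g. a failure caused by our hardware, not the team
        if e.kind == "redemption":
            scores[e.team] = scores.get(e.team, 0) + 1
            scores[e.victim] = scores.get(e.victim, 0) - 1
        elif e.kind == "availability_failure":
            scores[e.team] = scores.get(e.team, 0) - 1
    return scores


events = [
    PlayerEvent(1, 1, "redemption", "a", "b"),
    PlayerEvent(2, 1, "availability_failure", "b"),
]
live = rescore(events)
fixed = rescore(events, voided_ids={2})
```

Because the original events are never modified, excusing a bad availability check is just another piece of data fed into the replay, and the live and corrected results can be compared side by side.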
Our network was fantastic. Selir (who is an absolute networking genius) had it rigged up so team tables got untagged game traffic, and DEF CON network traffic (including public internet) on a VLAN. Player game network access was rate limited to about the speed we could capture and store traffic, which also reduced their ability to flood the network. Since network dumps were captured and made available for teams to download in five-minute intervals, this made the dumps less noisy too.
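As a small illustration of the five-minute dump intervals, captured packets can be grouped by the window their timestamp falls in. This is only a sketch: the `(timestamp, payload)` tuples and both function names are hypothetical, and the real capture tooling is not described here.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # five-minute dump interval


def window_start(timestamp):
    """Return the start (in seconds) of the window containing `timestamp`."""
    seconds = int(timestamp)
    return seconds - seconds % WINDOW_SECONDS


def group_by_window(packets):
    """Group (timestamp, payload) pairs into one list per five-minute window."""
    windows = defaultdict(list)
    for timestamp, payload in packets:
        windows[window_start(timestamp)].append(payload)
    return dict(windows)


dumps = group_by_window([(10, b"a"), (299.9, b"b"), (300, b"c")])
```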
We ran vulnerable machines for teams. This meant we could surprise players with ARM, didn’t expect teams to bring more hardware than strictly necessary, and most importantly, allowed us to give teams unprivileged or no access to the machine to prevent unstoppable “superman” defenses.
What do we mean by this? We’ve had teams wrap challenges to forbid them from reading the flag from the filesystem, among other things. It’s possible to design challenges to make this intractable, but it’s way easier to just deny teams that kind of access to “their” machine.
We used "consensus evaluation" for 2016 and 2017. Consensus evaluation comes from the DARPA Cyber Grand Challenge. Teams (or in CGC, autonomous computers) upload replacement service files to the scoring system, which is then responsible for placing the replacements on filesystems to be evaluated, and also sharing the replacements with other teams. One comment I heard after our 2017 game was that consensus evaluation makes it feel more like a battle with another person.
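The flow described above can be sketched as a tiny in-memory store. Everything here is hypothetical (the `ReplacementStore` class, its method names, and the idea that uploads take effect at the next round boundary are assumptions); it only illustrates that the scoring system, not the team, deploys a replacement and then makes it available to everyone.

```python
import hashlib


class ReplacementStore:
    """Toy model of consensus evaluation uploads."""

    def __init__(self):
        self.pending = {}   # (team, service) -> bytes, uploaded but not yet live
        self.deployed = {}  # (team, service) -> bytes, live and shared

    def submit(self, team, service, blob):
        """A team uploads a replacement; return its digest as a receipt."""
        self.pending[(team, service)] = blob
        return hashlib.sha256(blob).hexdigest()

    def start_round(self):
        """Deploy all pending replacements and publish their digests."""
        self.deployed.update(self.pending)
        self.pending.clear()
        return {
            key: hashlib.sha256(blob).hexdigest()
            for key, blob in self.deployed.items()
        }

    def download(self, team, service):
        """Any team may fetch any deployed replacement."""
        return self.deployed[(team, service)]


store = ReplacementStore()
receipt = store.submit("a", "svc", b"patched binary")
published = store.start_round()
```

Sharing deployed replacements is what lets other teams study a patch, look for what it missed, or borrow it for themselves.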
Writing challenges for finals is very different from qualifiers, because they have some features that get availability-tested, and ideally have multiple unrelated vulnerabilities. They need to be both difficult to attack, and difficult to defend. Like quals challenges, that’s another discussion entirely.
Live Operations and Game Management
Game operations and competitor management are a big part of finals: getting everything set up in the room, making sure it works, getting teams in and connected, actually kicking off the scoring system at game start, and stopping the game at end of day. We do a test setup the Thursday before the game starts, making sure the wires to tables work, the firewall rules are all configured, and that there’s enough power on the floor to run the game. Everything that can be scripted is something that you probably won’t mess up the next morning.
Competitors need to know when and where the competition is. Some competitors will need invitations to the US in order to get a visa to travel to Las Vegas. Once competitors are in Vegas, they will need help getting DEF CON badges from you, which is difficult because casinos are intentionally confusing and not every team has proficient English speakers.
Getting teams in the room and on the network is something that can take some coördination. We prefer to have emailed and printed information on when to come in, and what the network will and won’t provide. Emailed and printed documents mean players not fluent in English can get them translated by a teammate ahead of time, and make network setup less unfair to teams that have never competed at finals before.
We allow an hour or so for setup time in the mornings once players are allowed in, with a few game elements network-reachable, but no scoring allowed. In the meantime, we’re coming up with a rough plan of what services we want to release that day. When it’s a minute or so before game start, things get really tense and quiet as we count down on the microphone: I’d be armed and ready to fire the polling/flag redistributing service, and Selir would be armed to change the firewall rules from setup to gameplay.
Scheduling services for release during the day is a complex topic, especially since it involves dropping binaries and pollers in the right places, activating them in the scoring service, and, most of all, having every team start looking for any kind of weakness in the service, intentional or not.
We show full scores and rankings on Friday, rankings on Saturday, and nothing on Sunday. This means there can still be a surprise upset for closing ceremonies, and keeps teams more invested, since they won't know if they're on the verge of moving up or down in the rankings. We've had a few surprise upsets in the past because of this!
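The visibility schedule is simple enough to express as one function. This is a sketch, not our scoreboard code: the `public_scoreboard` name, the lowercase day strings, and breaking ties by team name are all assumptions.

```python
def public_scoreboard(day, scores):
    """Return what spectators and teams may see on a given day.

    friday   -> (rank, team, score) tuples
    saturday -> (rank, team) tuples, scores hidden
    sunday   -> nothing at all
    """
    # Highest score first; ties broken by team name (an assumption).
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    if day == "friday":
        return [(rank, team, score)
                for rank, (team, score) in enumerate(ranked, start=1)]
    if day == "saturday":
        return [(rank, team)
                for rank, (team, _score) in enumerate(ranked, start=1)]
    return []


scores = {"a": 10, "b": 30, "c": 20}
friday = public_scoreboard("friday", scores)
saturday = public_scoreboard("saturday", scores)
sunday = public_scoreboard("sunday", scores)
```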
Make sure you capture backups, and practice restoring them. This isn't just disaster preparedness; it's also a powerful enabler for during-game dev work. Being able to drop a backup, load it on your dev machine, and test score fixup scripts, admin screen changes, and other scoring system changes is extremely valuable in its own right.
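For a Postgres-backed scoreboard, the backup-and-restore loop can be scripted around the standard `pg_dump` and `pg_restore` tools. The sketch below is an assumption about how one might wire it up, not our actual scripts; the database names and file path are made up, and connection options (host, user) are left out.

```python
import subprocess


def dump_command(database, path):
    """Build a pg_dump invocation that writes a custom-format archive."""
    return ["pg_dump", "--format=custom", f"--file={path}", database]


def restore_command(dev_database, path):
    """Build a pg_restore invocation that replaces objects in a dev database."""
    return [
        "pg_restore",
        "--clean",      # drop existing objects before recreating them
        "--if-exists",  # do not error if an object to drop is missing
        "--no-owner",   # ignore production role ownership on the dev box
        f"--dbname={dev_database}",
        path,
    ]


def run(command):
    """Run a command and raise if it exits non-zero."""
    subprocess.run(command, check=True)


# Hypothetical names; nothing runs until run() is called.
backup = dump_command("scorebot_production", "scorebot.dump")
restore = restore_command("scorebot_dev", "scorebot.dump")
```

Calling `run(backup)` on the game server and `run(restore)` on a dev machine is the whole practice drill, and it is worth doing before the game rather than during it.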
Wrapping Things Up
Introducing CTF finals during DEF CON closing ceremonies is absolutely thrilling. You're on stage in front of a massive crowd that (ideally) respects and supports you, and you get to introduce the top three teams from a very intense weekend. How do you get there?
By the end of the game Sunday, you should have a pretty good idea of who the top three teams are, know of any last-minute scoring fixes that need to be run, and make sure your scoring database is backed up elsewhere.
Downtime on Sunday is also a great time to write your speech. This isn't as hard as it sounds; the rules are pretty easy to follow:
- Make the game sound hard and imposing
- All the teams are wonderful competitors
- Thank everyone involved: DT and the rest of DEF CON staff, especially the goons, competitors, DEF CON attendees, and the global CTF community
- Third place, second place, and finally, first place
Before closing ceremonies, know who'll be talking on stage. Don't let them drink too much (there's plenty of time after they're done being in front of a mic). We put speeches on notepads (for reliability), and each speaker transcribed their own part (for legibility).
After closing ceremonies, get trashed. You survived DEF CON CTF!
Thanks Matthew Pancia for proofreading and reviewing.