Human Computers: TDD for GenAI

By Alan Ridlehoover

Elevator Pitch

In 1962, when IBM began calculating flight paths for NASA, human “computers” hand-calculated examples for use in validating the software. Now, as the role of programmer moves from human to machine, we must both prompt the AI and validate the code it generates. We need to bring TDD to GenAI.

Description

In the 1950s, NASA employed a team of African American women known as the West Area Computing Unit to hand-calculate flight paths, fuel consumption, and more. In 1962, IBM began doing the math for manned spaceflight, and with that, the role of “computer” moved from human to machine.

To confirm the machines produced the correct results, human computers pre-calculated examples to specify what the software should return for a given set of parameters. Programmers then used these specifications to validate their code.

Now, we’re on the cusp of a similar revolution. Machines are already coding alongside us. The role of programmer may soon move from human to machine, too. Ensuring machines produce the correct code will require prompting them with self-validating, executable specifications. We need to bring TDD to GenAI.

Notes

Why this talk?

Most programmers we’ve spoken to interact with AI in one of two ways: either they use LLMs to generate blocks of code from natural-language prompts, or they let LLMs type ahead for them in their editors. Both approaches have problems today, including:

  • Natural-language prompts generally don’t end up in source control, making them difficult to repeat if the code ever needs to be regenerated.
  • Generated code often contains subtle errors that can slip by manual inspection.

Furthermore, as AI advances, source code will become less and less relevant. There will be fewer reasons to retain it, since it can be regenerated as long as we retain the prompts. In essence, the prompts, not the code, will become the artifact worth keeping.

Regardless of whether we retain the source code, we will still need to test the behavior of the generated code. In fact, automated specifications offer a structured language we can use both to prompt the LLM and to verify the code it produces.
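For example, a behavior-centric spec like the following could serve both as the prompt handed to an LLM and as the validation run against whatever code comes back. This is a minimal sketch using Ruby’s stdlib Minitest; the `FuelBurn` class, its method signature, and the numbers are all invented for illustration:

```ruby
require "minitest/autorun"

# Placeholder implementation -- the kind of code a human (or an LLM,
# given the spec below as a prompt) might produce.
class FuelBurn
  def self.for(distance_km:, rate_kg_per_km:)
    distance_km * rate_kg_per_km
  end
end

# The spec describes observable behavior only: it names no internals,
# so any correct implementation the LLM generates will pass.
describe FuelBurn do
  it "burns no fuel over zero distance" do
    _(FuelBurn.for(distance_km: 0, rate_kg_per_km: 5)).must_equal 0
  end

  it "scales fuel burn linearly with distance" do
    _(FuelBurn.for(distance_km: 100, rate_kg_per_km: 5)).must_equal 500
  end
end
```

Because the spec is executable, “did the machine produce correct code?” reduces to “does the suite pass?”, exactly the role the West Area Computing Unit’s hand-calculated examples played for IBM’s software.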

Our contention is that we need to grow our skill at describing software through automated specifications now. We benefit immediately from the clarity they provide in our existing, traditional development work. And as more code is generated by AI, we can leverage those specification-writing skills to articulate requirements clearly and validate that they’ve been met.

Why these speakers?

We are seasoned speakers:

  • We’ve spoken together at meetups and conferences across multiple continents to audiences ranging from 50 to 500 people.
  • Our talks attract audiences and are routinely praised by attendees.

We are experts:

  • We are passionate Rubyists, each with 14 years’ experience in Ruby and Rails.
  • We have 14 and 24 years of TDD experience, respectively.
  • We use AI every day to generate tests and code.
  • We write behavior-centric, implementation-agnostic specifications.
  • We know the pain associated with implementation-specific tests.
  • We work in one of the largest, most successful Rails apps on Earth!

Intended Audience

Software developers and managers who are curious about harnessing AI to improve velocity by prompting the machine with executable specifications that can then validate the generated code.

Outline

  • Introduction (5 minutes)
  • Writing good automated specifications (5 minutes)
  • How can we use automated specifications with AI today? (10 minutes)
  • How might we use automated specifications in the future? (5 minutes)
  • Conclusion (5 minutes)