Using regex to find JSON…from a serial port…connecting a robot’s eyes to its brain. And testing all of it with JUnit

Lofty click bait title….check! (we’re actually just using JUnit to test the speed of regex)

We built our robot's eyes from a custom board and an OpenMV camera. Python scripts on the OpenMV find the balls and strips on the game field.

We call our camera the Slime.

So SlimeLight's Python script takes the location data from the ball/strip it finds and sends that information to the roboRIO via serial port. Seeing stuff and then having a robot act on what it sees...that's non-trivial. Well, non-trivial to do it gracefully. When you see a robot moving around like your uncle on a wedding dance floor trying to do...well...The Robot, that's what I'm talking about. Building bad robots isn’t hard. Dancing poorly isn’t hard. Bad robots rarely win FRC contests. Wedding uncles are rarely thought of fondly by the bride and groom….

Eventually we’re all this guy….

So our fancy camera sends JSON down the wire to our roboRIO. We don’t have a measured, proven technique to parse our payload from the serial port. The team was using a Scanner but I suggested something else: regex. You know, the intuitive and user friendly markup for finding text within text…

/^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$/

Groans…

“Have you guys ever searched a document in Word, ya know, pressed ctrl-f?”

“yeah”

“cool, you’ve used regex…stop groaning, this’ll be fun”

FRC is the perfect place to introduce practical science to these students. Have an idea, design an implementation, test that implementation and decide if it's doing what you want, as quickly as you want.

PLUS you get some data to support your decisions.

Maybe regex is a terrible idea...maybe our original Scanner idea was the right way to go; without some form of formal experimentation, we'd never know.

I suggested the regex approach to our programming team and one of them set to work on it. He followed some examples and reported back a couple of days later, "the regex is done". I sometimes forget how quickly young minds take to new ideas.

Fast forward to last night. They upload the code on the Robot and....it's terrible. The Robot just scoots past the ball. They report the Robot taking "half a second" to respond to the camera seeing the ball.

"that's not good, what do you think is causing it?"

"the regex is too slow"

"how do you know that?"

"that's the only change we've made"

"did you change the SlimeLight’s scripts?"

"well, yeah"

"and is this reading off the queue?"

"no"

"when you say half a second, is that actually measured or just what it seems like?"

"not really measured, no"

And I hope I'm not coming across as mean. I was actually thrilled with what these guys were doing. They were walking into the reason Science exists in the first place...to prove a theory. To set up the best tests you can and measure your findings.

So I've been rambling through this introduction to get to what I actually want to show/tell.

Wanna prove something in Java? Write a JUnit test.

The theory presented by our developers was: "regex is causing the processing on the roboRIO to run too slowly, regex is the problem, regex == bad"

So I wrote a quick unit test in JUnit to see how long it takes to find a string within another string.

This test was doing three things:

1. Does the matcher find the pattern within the larger string?

assertTrue(matcher.find());

2. Does the matcher pull out the json we’re looking for?

assertEquals(expectedStringToFind, matcher.group());

3. Does this entire thing take less than 10ms?

long endTime = System.currentTimeMillis();
long totalTime = endTime-startTime;
assertTrue(totalTime <= 10);
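
Put together, those three checks look roughly like the sketch below. This is not our actual test: the pattern, the payload, and the class and method names are all made up for illustration, and the JUnit assertions are swapped for plain Java so the snippet runs on its own. In a real test the three println lines would be assertTrue/assertEquals calls inside an @Test method.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Standalone sketch of the first test's three checks (hypothetical names throughout).
class FirstRegexCheck {

    // Assumed pattern: matches one flat JSON object with no nested braces.
    static final String JSON_REGEX = "\\{[^{}]*\\}";

    // Returns the first JSON-looking chunk in the input, or null if there is none.
    static String extractJson(String serialInput) {
        // Compiled on every call, just like the first test did.
        Pattern pattern = Pattern.compile(JSON_REGEX);
        Matcher matcher = pattern.matcher(serialInput);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // Made-up payload wrapped in serial-port noise.
        String expectedStringToFind = "{\"x\": 120, \"y\": 64}";
        String serialInput = "garbage\r\n" + expectedStringToFind + "\r\nmore garbage";

        long startTime = System.currentTimeMillis();
        String found = extractJson(serialInput);
        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;

        // 1 and 2: did we find a match, and is it the json we expected?
        System.out.println("found expected json: " + expectedStringToFind.equals(found));
        // 3: did the whole thing take 10 ms or less?
        System.out.println("elapsed ms: " + totalTime + ", within budget: " + (totalTime <= 10));
    }
}
```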

Now of course this isn’t a great test:

  1. The Pattern is instantiated every time the test is run. In the wild, this would be a field-level variable, instantiated at boot, so the processing wouldn’t pay that cost every time a message arrived.
  2. It’s not testing it with any load.
  3. It’s not testing one of our classes…it’s just testing Java (normally this would be testing the class doing the regex work and therefore be more in line with the implementation).

To address the load and pattern concerns, I wrote a better test.

See the differences here:

  1. The Pattern was instantiated at a TestClass level rather than within the test method (therefore only pay the cost to make it once)
  2. The String is recreated with Math.random() injected rather than the same string every time
  3. We are running this within a loop, executing it 100,000 times. If the code ever takes more than 10ms to run, the test fails.
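
Here is a rough sketch of what those three differences can look like in code. Same caveats as before: the pattern, payload shape, and names are assumptions for illustration, and it is plain Java rather than a JUnit class so it runs standalone. I'm also reading "ever takes more than 10ms" as a per-iteration limit, and I've swapped in System.nanoTime() because it is designed for measuring elapsed time.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Standalone sketch of the loaded test (hypothetical names throughout).
class LoadedRegexCheck {

    // Assumed pattern, compiled once at class level and reused for every message.
    static final Pattern JSON_PATTERN = Pattern.compile("\\{[^{}]*\\}");

    static final int ITERATIONS = 100_000;
    static final long MAX_MILLIS = 10;

    // Runs the extraction many times and returns the slowest single run in ms.
    // Throws if any run fails to pull out the payload that was injected.
    static long slowestExtractionMillis(int iterations) {
        long slowest = 0;
        for (int i = 0; i < iterations; i++) {
            // Math.random() gives every iteration a different payload.
            String expected = "{\"x\": " + Math.random() + ", \"y\": " + Math.random() + "}";
            String serialInput = "noise\r\n" + expected + "\r\nnoise";

            long start = System.nanoTime();
            Matcher matcher = JSON_PATTERN.matcher(serialInput);
            String json = matcher.find() ? matcher.group() : null;
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;

            if (!expected.equals(json)) {
                throw new IllegalStateException("iteration " + i + " extracted " + json);
            }
            slowest = Math.max(slowest, elapsedMillis);
        }
        return slowest;
    }

    public static void main(String[] args) {
        long slowest = slowestExtractionMillis(ITERATIONS);
        // In a JUnit test this would be assertTrue(slowest <= MAX_MILLIS).
        System.out.println("slowest run (ms): " + slowest + ", within budget: " + (slowest <= MAX_MILLIS));
    }
}
```

Keep in mind that a laptop JVM is not a roboRIO, so numbers from a run like this are a sanity check on the regex itself, not a measurement of the robot.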

Is this perfect? Certainly not. But it’s a measurable result and if we’re going to write successful software, we always need to think in terms of measurable results. If a method or class can’t be measured, refactor it in a way that makes it measurable.

  • Your private methods can’t be tested? Make them protected and use the same package declaration in your tests to expose them to your test suite.
  • You use a private void method for doing some processing? Stop doing that. Your method should return something, anything, that can be studied and measured.
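
To make those two points concrete, here is a minimal sketch of a parser class shaped so it can be measured. SlimeLightParser, its pattern, and its methods are all hypothetical, not our team's code. The work lives in a protected method that returns what it found, so a test class in the same package can call it directly and check both the result and the time it took.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class: the name, pattern, and methods are illustrative only.
class SlimeLightParser {

    // Assumed pattern: one flat JSON object with no nested braces.
    private static final Pattern JSON_PATTERN = Pattern.compile("\\{[^{}]*\\}");

    private String lastPayload = "";

    // Protected instead of private: tests in the same package can call it.
    // Returns the matched JSON text, or an empty string when nothing matches,
    // so there is always a value to study and measure.
    protected String findPayload(String serialInput) {
        Matcher matcher = JSON_PATTERN.matcher(serialInput);
        return matcher.find() ? matcher.group() : "";
    }

    // Public entry point: reports whether a payload was accepted
    // instead of silently doing its work behind a void signature.
    public boolean accept(String serialInput) {
        String payload = findPayload(serialInput);
        if (payload.isEmpty()) {
            return false;
        }
        lastPayload = payload;
        return true;
    }

    public String getLastPayload() {
        return lastPayload;
    }
}
```

A JUnit test declared in the same package could then do something like assertEquals("{\"a\":1}", new SlimeLightParser().findPayload("xx{\"a\":1}yy")) without any reflection tricks.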

The alternative is software/robot/gizmo that works in a way only it knows…

I write software for a living.

I help kids learn about writing software by volunteering with a local FRC team, 4627 Manning Robotics. If you have kids and access to FIRST Robotics, look into it with them. Amazing things come from this program. If you have technical skills and you’re looking to give back to your community, same statement: look into FIRST Robotics. Thank me later.

Husband and Father. Wilderness First-Aid Certified. Terrible at tying knots. I play Squash. I like things that Trade. Leafs fan. FRC and Scouts Canada