Death when it comes will have no sheep

Research scientist Janelle Shane on why she made an artificial neural network write proverbs.

A good anvil does not make the most noise. An ox is never known till needed. A good wine makes the best sermon. These are just some of the bespoke “ancient” proverbs crafted by an artificial neural network.

Janelle Shane, a research scientist who works with laser beams by day, finds artificial neural networks fascinating. Artificial neural networks are essentially a simulated group of interconnected brain cells, or neurons, that are often used for machine learning. Machine learning is just a way of saying that an algorithm teaches itself from data over time.

Shane recently used an artificial neural network, char-rnn, to create a bizarre list of proverbs. This isn’t the first time she’s fed raw data into an artificial neural network; she’s also tried to get networks to invent recipes, come up with pet guinea pig names, and generate Sherwin-Williams-esque names for paint colors.

Shane fed 2,000 proverbs into the neural network, which produced its own list of gems like “A good face is a letter to get out of the fire,” “No wise man ever wishes to be sick,” “A good excuse is as good as a rest,” and “There is no smoke without the best sin.”

To understand why and how Shane made her artificial proverbs, we spoke with her about oxen, training parameters, and drawing meaning from nothing at all.

The Outline: How did you get involved in the neural network stuff?

Janelle Shane: I’ve had an interest in machine learning for a while. Actually, as an undergraduate at Michigan State, I worked with a research group that did some genetic algorithms in machine learning. So, that’s a type of machine learning that imitates evolution.

That kind of led me into doing more laser optics kind of things, but I was always interested in machine learning, and sort of mimicking natural processes of discovering things about the world. And so I came across — and I forget how — but it was a list of recipes that Tom Brewe had put together using a neural network framework, char-rnn, and he trained the neural network on a list of tens of thousands of recipes, and had it generate new recipes, and they were absolutely hysterical. One line that sticks out to me was the “shredded bourbon” — that was my favorite ingredient.

And so because of that, I read through them all, and I wanted there to be more of them, so I had to learn how to generate some more myself.

Why did you decide to make the list of proverbs?

This dataset was collected and provided to me by a guy named Anthony Mandelli. He thought it would be a great dataset; there is another more famous proverb dataset out there that a couple people have used, but that one’s a lot more wide-ranging in scope and also includes quotes from famous literature, and a bunch of things that aren’t really proverbs.

So I’ve never really been tempted to try that dataset, but Anthony’s dataset was great. It was really focused on these old-timey, ancient-sounding proverbs, half of which already sound really familiar. Once he sent me that dataset and said, “Here you go, if you’d like to,” I definitely jumped at the chance. I really found the results to be funny.

What stuck out from the results obviously was the obsession with oxen.

If I were to train the neural network again on that same dataset with a slightly different random seed or only very slightly different training parameters, it’s quite likely it would have seized on something else instead like dogs or cats or something.

So really there’s no telling why it’s obsessed with oxen.

It’s tough to tell. I mean, the word “oxen” does come up sort of a few times, and the word “fox” comes up even more, so it’s possible that it’s having these two “-ox” words in there kind of reinforce the idea that “-ox” is a good combination to be using. It is possible that there is some amplification of “oxen” in that way.

That’s the thing about neural networks and learning algorithms in general. Since they’re training themselves, it’s often tough to figure out what their internal workings are, and how are they arriving at the conclusions they’re arriving at.

When you talk about setting certain training parameters, what are some examples of what you mean?

One of the big things that you can play with is the size of the neural network — that is, the number of brain cells or neurons that it has to work with. And then you can also arrange those in layers so you’ve got one layer of neurons that sees the data first and is passing information to another layer of neurons, and you can set up how many layers of those there are. That all can add to the power of the neural network, but it also means you need a more powerful computer to actually do the calculations.

Another thing you can mess with is its sort of memory. Like, the length of the chunk of text it looks at at one time. So if you give it a really long memory (for the kind of training I’m doing, long would be 75 or 100 characters at a time), then that’s the amount of text it’s looking at at a time when it’s training itself. The longer the chunk that it’s looking at at once, the more likely it is to be able to pick up long-term correlations, rhythms in sentences, things like that.

I think that this one I trained is maybe only a 25-character memory, something like that. When I did that, that was deliberate, because I didn’t want it to be able to memorize the proverbs too easily; I wanted it to be able to get beginnings, and middles, and ends of proverbs and stick ’em together in a way that was a bit newer and fresher.
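To make the idea of a "memory" length concrete, here is a minimal Python sketch of cutting training text into fixed-length character chunks. It is an illustration under assumptions, not Shane's actual setup: the function name `make_chunks` and the parameter name `seq_length` are hypothetical, and the sample text is just the three proverbs quoted at the top of this article.

```python
def make_chunks(text, seq_length):
    """Split text into (inputs, targets) pairs of seq_length characters.

    The targets are the inputs shifted one character ahead, so a model
    trained on these pairs is asked to predict each next character.
    """
    chunks = []
    # Step through the text in non-overlapping windows of seq_length.
    for start in range(0, len(text) - seq_length, seq_length):
        inputs = text[start:start + seq_length]
        targets = text[start + 1:start + seq_length + 1]
        chunks.append((inputs, targets))
    return chunks


# Sample text: three proverbs from this article, one per line.
text = "\n".join([
    "A good anvil does not make the most noise.",
    "An ox is never known till needed.",
    "A good wine makes the best sermon.",
])

short_chunks = make_chunks(text, 25)    # short memory: fragments of proverbs
long_chunks = make_chunks(text, 100)    # long memory: several proverbs at once

for inputs, _ in short_chunks:
    print(repr(inputs))
```

With a 25-character window, each chunk holds only part of a proverb, which fits Shane's point that a short memory makes it harder to memorize whole proverbs. With a 100-character window, a single chunk spans more than one proverb.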

Machine learning can be difficult to explain. How would you describe what’s happening when you feed the data into the neural network?

I’d say it’s discovering patterns. The kinds of patterns it’s discovering would sometimes be ones that we aren’t even aware of ourselves. It can definitely discover patterns like, “The first letter of the sentence is capitalized. The end of the sentence has a period. Words are on average a few characters long, and they tend to have more ‘e’s in them than ‘v’s, for example,” and so on and so forth. “The letter ‘o’ is often followed by an ‘x’ because ‘x’s are important.”

I’m speculating here on the internal rules that it could devise, but I don’t know if these are the rules that it’s using.
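One of the simplest patterns of this kind, which character tends to follow which, can be counted directly. The sketch below is only an illustration of that one statistic: a recurrent network like char-rnn learns its own internal rules rather than keeping a table like this, and the function name `next_char_counts` and the three sample proverbs (taken from the top of this article) are assumptions chosen for the example.

```python
from collections import Counter


def next_char_counts(lines):
    """Count, for each character, which characters directly follow it."""
    follows = {}
    for line in lines:
        # Pair every character with the one immediately after it.
        for current, following in zip(line, line[1:]):
            follows.setdefault(current, Counter())[following] += 1
    return follows


proverbs = [
    "A good anvil does not make the most noise.",
    "An ox is never known till needed.",
    "A good wine makes the best sermon.",
]

counts = next_char_counts(proverbs)

# The most common characters seen right after the letter "o" in this sample.
print(counts["o"].most_common(3))
```

On a sample this small the counts say very little, but on a few thousand proverbs a table like this would show whether "x" really is an unusually common successor to "o".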

Right, because it forms its own as it goes.

Yeah, exactly. It could be something more complicated.

Did you have any favorites out of the proverbs?

I’m pretty fond of one that other people seem to have enjoyed as well. “Death when it comes will have no sheep.”

And what does that mean to you? Or other people?

Part of why I like it is it doesn’t mean anything in particular. It sounds profound and mysterious, and then it’s got sheep at the end of it. It just comes out silly by the time you get to the last word of the phrase.

The network doesn’t really pick up on meaning, but maybe some of that’s transferred here unintentionally.

When you see the neural net working on things like recipes or even paragraph text of stories, you definitely get the sense that it’s categorizing words somehow as similar to one another. If it looked at prose, like a story, and the prose happens to talk about one character’s mother, then somehow it does know — in some sense — that these different words are the names of characters, and it’ll start referring to somebody else’s mother even though that person’s mother never appeared or was mentioned in the story.

There is definitely some categorization going on even in this neural network that’s supposed to be only looking at a character at a time. It’s still somehow being able to categorize entire words as similar to one another.

Why do this? What do you get out of it?

I find them amusing. I love being able to try a new dataset, see what comes out of it. Kind of wonder at what’s going on under the hood… I really like the feeling. To me, it seems to have this kind of fun, almost otherworldly quality. It’s like nothing I could make up on my own. It is a continual pleasure to look at these new datasets and see what the neural network comes up with.