Wednesday, September 27, 2017

Isabella: Programming, by Speech

Previously on Dr. Lambda's blog:

In a previous post I presented my newest pet project: Isabella. Isabella is a voice-controlled personal assistant, like Siri, Alexa, and others. We have decided to investigate how difficult it is to make such a program. In the last post we added support for follow-up queries, so we could have "conversations".

Now, the continuation...

SPIKE

As we mentioned in the very first post, one of the goals was to have Isabella assist with coding, in particular with coding herself. Now that we have added quite a few functions, and features such as wildcards and follow-ups, it is time for another SPIKE to test if we are getting closer.

In the first SPIKE we reached a point where we could call functions, and generally evaluate expressions. This time we want to see if we can implement a function with Isabella. The time-box is a few hours, at most an evening, so tomorrow everything we write from this point on will be deleted again.

The idea

Dictating everything is horrible: it is slow and error-prone. So Isabella should at least have some domain knowledge. It should feel like telling one of my students what to do. My idea is to have her take control of the conversation; this way she can keep track of where we are, which simplifies navigation significantly.

Simply put, we start a series of questions, where each answer is translated into code differently depending on which part of the function we are in. That is, the answer to "what should I call the function" would be translated into an identifier, whereas the answer to "what is this statement" would be parsed as, for example, an assignment.
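To make the idea concrete, here is a minimal sketch of how each question could carry its own translator. This is not the code from the SPIKE (which was deleted); the names toIdentifier, toAssignment, spikeQuestions and translateAnswers are hypothetical, and so is the spoken phrasing "set X to Y" for an assignment.

```javascript
// Hypothetical sketch: every question knows how to turn its answer into code.

// "calculate total price" -> "calculateTotalPrice"
function toIdentifier(answer) {
  const words = answer.trim().toLowerCase().split(/\s+/);
  return words
    .map((w, i) => (i === 0 ? w : w[0].toUpperCase() + w.slice(1)))
    .join("");
}

// Assumed phrasing: "set <name> to <expression>".
// The expression is copied through untouched; a real version would parse it too.
function toAssignment(answer) {
  const match = /^set (.+?) to (.+)$/i.exec(answer.trim());
  if (match === null) {
    throw new Error("Not an assignment: " + answer);
  }
  return toIdentifier(match[1]) + " = " + match[2] + ";";
}

// The same spoken words mean different things depending on the question.
const spikeQuestions = [
  { prompt: "What should I call the function?", translate: toIdentifier },
  { prompt: "What is this statement?", translate: toAssignment },
];

// Pair each answer with the translator of the question it responds to.
function translateAnswers(answers) {
  return spikeQuestions.map((q, i) => q.translate(answers[i]));
}
```

With this sketch, the answers "calculate total price" and "set running total to 5" would become calculateTotalPrice and runningTotal = 5; respectively.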

This "conversation" is much more complex than what we have used before, because there is no set limit to the number of questions. There could be no parameters, or there could be 15. How do we support this?

Counting (arrays)

Our first solution was to have her start an enumeration by asking "how many Xs will there be", and then just loop that many times for answers. This worked.

It did require us to count how many things we wanted up front, which is not natural when coding. Next time you are coding something, try to predict how many statements are going to be in a function. For this reason we abandoned this solution.

If we think about it, this is exactly how arrays work. When we initialize them we have to specify their length. The problems we ran into are also familiar from arrays: it is difficult to change their length later.
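As a rough illustration of the counting approach, assuming a hypothetical ask function that speaks a question and returns the answer as text, the loop could look something like this. The scripted helper only replaces the microphone so the sketch can run on its own.

```javascript
// Hypothetical sketch of the counting approach; `ask` stands in for speech input.
function askCounted(ask, what) {
  // The length has to be known before the first item, just like an array.
  const count = parseInt(ask("How many " + what + "s will there be?"), 10);
  if (Number.isNaN(count) || count < 0) {
    throw new Error("Expected a count");
  }
  const items = [];
  for (let i = 0; i < count; i++) {
    items.push(ask("What is " + what + " number " + (i + 1) + "?"));
  }
  return items;
}

// Returns canned answers in order, ignoring the question that was asked.
function scripted(answers) {
  let next = 0;
  return () => answers[next++];
}
```

Calling askCounted(scripted(["2", "width", "height"]), "parameter") would collect the two parameters. Note that there is no way to add a third once "2" has been said, which is exactly the problem described above.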

Terminating (lazy lists)

Having realized that our first idea was basically arrays, we can use this to come up with an alternative: lazy lists. Our next idea is to reserve a keyword ("done") and just keep asking for more until we say that word. This way we just start saying statements, and when there are no more, we say "done" and the function is complete.

This is much more natural! So this is probably the way to go.
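A sketch of the terminating approach could use a generator, which is about as close as JavaScript gets to a lazy list. Again, this is an illustration under assumptions, not the SPIKE's code: listen is a hypothetical stand-in for speech input, and statementsUntilDone and replay are made-up names.

```javascript
// Hypothetical sketch of the terminating approach; `listen` stands in for speech input.
// The generator produces statements one at a time, without knowing how many will come.
function* statementsUntilDone(listen, stopWord = "done") {
  while (true) {
    const answer = listen("What is the next statement?");
    // Stop on the reserved word, or if the input runs out (undefined).
    if (answer === undefined || answer.trim().toLowerCase() === stopWord) {
      return;
    }
    yield answer;
  }
}

// Plays back canned answers in order, ignoring the question that was asked.
function replay(answers) {
  const queue = answers.slice();
  return () => queue.shift();
}
```

Spreading the generator, as in [...statementsUntilDone(replay(["x = 1", "y = 2", "done"]))], collects every statement said before the stop word. One trade-off of this sketch is that "done" on its own can no longer be dictated as a statement.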

Conclusion

Coming to the end of the SPIKE, what have we learned?

First of all we did succeed in programming a method using speech. And it did not take an unrealistically long time. We also learned a few important lessons for later when we decide to implement the real thing.

Unfortunately, even after an evening of working on it, it was not as smooth as we had hoped. It is clear that to make this useful we basically have to implement an entire language for programming. We did this for "a function" and it was nice, but we should also apply the idea to "statements", and probably "expressions" and so on. If we have to do this anyway, JavaScript might not be the right choice. Maybe we should invent a language which works specifically with speech? We'll have to see.
