Saturday, September 23, 2017

Isabella: Now we're talking

Previously on Dr. Lambda's blog:

In a previous post I presented my newest pet project: Isabella. Isabella is a voice-controlled personal assistant, like Siri, Alexa, and others. We have decided to investigate how difficult it is to make such a program. In the last post we continued adding more APIs: functionality for free.

Now, the continuation...

More APIs

As mentioned, this stage of development is super fun because so much is happening so fast. In keeping with this, we try to add at least two new things at a time.

The small, easy addition this time is a quote API. If we thought the joke API was difficult to find, we did not anticipate how difficult it would be to find a good quote API. In fact, implementing a feature like this takes about 7 minutes, whereas finding a good API takes about 2 hours.

In the end we gave up the search for a "general, inspirational quote API" and settled for a "programming quote API". Again, we can take advantage of the target audience of Isabella: me, and maybe a few of my friends. We are fine with programming quotes.

At this point we have also added:

"Conversations"

A completely different thing is that we eventually want Isabella to have some contextual understanding, like saying "Play Ed Sheeran" and then following that up with "How old is he", or something like that.

The most basic example of contextual understanding comes from saying "what" when you don't hear what she says. In this case we want her to repeat the last thing she said. One characteristic of this follow-up query is that you shouldn't have to say "Isabella" first, like normal, as it comes as part of a "conversation".

We introduced the Isabella name to simplify the command matching algorithm: because we knew we should try to match one of the commands, we could do our "backwards trick". Now we want to remove this simplification, which means that she listens to everything that is said after a command. However, this time not everything that is said should match a command. To facilitate this we needed to add a threshold to our matching algorithm, so that we only register something if it is a sufficiently good match.
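To make the threshold idea concrete, here is a minimal sketch in Python. It is not Isabella's actual matcher: the command list, the 0.75 cut-off, and the use of `difflib.SequenceMatcher` as the similarity score are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical command list; the real commands are not shown in this post.
COMMANDS = ["what", "thank you", "tell me a joke", "tell me a quote"]


def best_match(heard, commands=COMMANDS, threshold=0.75):
    """Return the closest command, or None if nothing is similar enough."""
    heard = heard.lower().strip()
    best, best_score = None, 0.0
    for command in commands:
        # ratio() is a similarity score between 0.0 and 1.0.
        score = SequenceMatcher(None, heard, command).ratio()
        if score > best_score:
            best, best_score = command, score
    # The threshold (an assumed value) is what lets us ignore ordinary speech.
    return best if best_score >= threshold else None
```

With this, a slightly misheard "thanks you" still lands on "thank you", while a sentence that is not meant for her at all falls below the threshold and is ignored.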

First we just added the option to say "what" after anything, and she repeats the last thing she said. Then we added "thank you", to which she replies "you are welcome". Then we took it to the next level.
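These two follow-ups could be sketched roughly like this. The `Conversation` class and its method names are hypothetical, made up for this example; the only behaviour taken from the post is that "what" repeats the last reply and "thank you" gets "you are welcome".

```python
class Conversation:
    """Remembers the last reply so follow-ups need no "Isabella" prefix."""

    def __init__(self):
        self.last_reply = None  # last thing the assistant said, if anything

    def say(self, text):
        # Every spoken reply is remembered so it can be repeated later.
        self.last_reply = text
        return text

    def follow_up(self, heard):
        """Handle a follow-up; return the reply, or None if it is not one."""
        heard = heard.lower().strip()
        if heard == "what" and self.last_reply is not None:
            return self.last_reply  # repeat without changing the memory
        if heard == "thank you":
            return self.say("you are welcome")
        return None
```

A usage example: after `c.say("It is three o'clock")`, calling `c.follow_up("what")` returns the same sentence again, and anything that is not a known follow-up returns `None` so it can be ignored.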

We wanted to add notes. Here we did not want to use our wildcards, as the note might be quite long. Instead we wanted to use a custom follow-up, where we could capture everything and save it. So that is what we did.
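A custom follow-up like that might look something like the sketch below. Everything here is an assumption for illustration: the trigger phrase "take a note", the spoken replies, and the `NoteTaker` class are not from the real code. The idea it shows is that a command can install a one-shot handler which receives the next utterance whole, with no command matching at all.

```python
class NoteTaker:
    """A command can register a one-shot follow-up that captures everything."""

    def __init__(self):
        self.notes = []
        self.pending = None  # custom follow-up handler, if one is waiting

    def handle(self, heard):
        if self.pending is not None:
            # A follow-up is waiting: hand it the raw text and clear it.
            handler, self.pending = self.pending, None
            return handler(heard)
        if heard.lower().strip() == "take a note":  # hypothetical trigger
            self.pending = self._save_note
            return "what should I write?"
        return None

    def _save_note(self, text):
        # No wildcards and no threshold: the whole utterance is the note.
        self.notes.append(text)
        return "noted"
```

Because the handler is cleared before it runs, the conversation falls back to normal command matching right after the note is saved.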

We are probably going to use this "conversation" feature quite a lot for future commands.
