Saturday, January 13, 2018

Isabella: Testing, and a Build Pipe-line

Previously on Dr. Lambda's blog:

In a previous post I presented my newest pet project: Isabella. Isabella is a voice controlled personal assistant, like Siri, Alexa, and others. We have decided to investigate how difficult it is to make such a program. In the last post we made some small changes with big impact.

Now, the continuation...

Strengthening the foundation

As we are in a phase of improvement, this seems an appropriate time to make the foundation more solid: automated testing. On one hand I think I should have done this from the start, but on the other hand...

I am a big fan of Dan North, in particular his [Spike and Stabilize pattern]. With this pattern we treat all new code as a spike: code fast and unstable, then deploy the feature to the users. After a certain amount of time – like a month – come back and see if it is being used. If it is mostly unused, delete it; if it is used a lot, refactor it and write automated tests for it. This way you invest only very little time in code that ends up being unused.

Isabella started as just an experiment, not code I expected to be long-lived, so I opted to prioritize new features over a stable code base – just as Spike and Stabilize prescribes. I recently changed my outlook for Isabella; I now expect that I will use (and work on) her for a long time. Put another way: I have deployed the code, waited, and I now know that the code is being used. So, it is time to stabilize!

Stabilizing the code was an incredibly frustrating process, for primarily two compounding reasons. The first reason requires a bit of explanation. Many non-technical people think that programmers spend most of their time coding. In practice this is far from the truth. Normally we spend most of our time searching for and fixing bugs. This I can easily handle. As a child I loved mazes; now I enjoy being lost in a very complex system, fighting to find my way out. I savour the victory when I finally crack it.

During this process, however, I spent almost all my time searching the internet for answers: which libraries to use, what some syntax meant, and just plain hard problems. Very boring, and time consuming. And when I wasn't searching the internet I was rewriting the same code again and again. I didn't feel like I was making any progress at all. And finally, even when something worked there was very little visible effect, so it didn't feel like a victory either.

Here is the documentation of my journey, including the problems and solutions I encountered.

Jasmine

My first instinct was: I want testing, so I should start writing tests. I had a bit of experience with jasmine-node, so I installed it and started writing tests. The problem was that, because this was client code, the way I combined multiple files was with multiple script tags, so jasmine-node couldn't find any of the dependencies. Adding import statements to all the files was an obvious solution, but that would give errors on the client side.

I did some research and found something called system.js, which would emulate import statements on the client side. This meant having to refer directly to the compiled .js files instead of the .ts files, but in spite of this it seemed like a neat solution.
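As a rough sketch of how this looked – I'm writing the configuration from memory, assuming a 0.19-era SystemJS, so treat the details (the /js folder, the main.js entry point) as assumptions:

// Bootstrap script, run after system.js has been loaded.
// baseURL points at the folder with the compiled .js files.
SystemJS.config({
  baseURL: '/js',
  defaultJSExtensions: true
});
// Load the entry module; its imports pull in the rest.
SystemJS.import('main.js');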

Codeship, Bitbucket, and Git

My next idea was to set up an automatic test-and-deploy cycle. I wanted Heroku to run the app, [Codeship] to test it, and Bitbucket to host the code. So far I had just used Heroku as my code host, so I was faced with an entirely unfamiliar challenge: how do you move from one git repo to another?

I am no git guru, unfortunately. I wish I could tell you exactly the steps I took to make this work, but I have no idea. I pulled one way, then the other, committed, merged, and suddenly I could push to Bitbucket. Codeship quickly picked up the push, and deployed it to Heroku. (I'm skipping a few small issues with RSA keys for deploying from Codeship.)

Gulp

If we think of problems as lianas, software development is like playing Tarzan; we are constantly swinging from one to the next, in a seemingly endless jungle. Usually when I worked on Isabella I would just start tsc -w in the background and forget about it. Sometimes I would forget to start the compiler, which was super annoying, because then I would push the build to the cloud to test it, and nothing would happen. This was bad enough on its own, but having added tests it became much more annoying. First, there were now two things to remember (or forget). Sure, it also takes a bit longer to deploy, but having Codeship reject a deploy because I forgot to test locally was just a slap in the face.

It was time to set up a build tool. I did some research and narrowed the decision down to Gulp and Grunt. To me they seemed fairly equal, and I don't even remember what the deciding factor ended up being. I went with Gulp.

The great advantage of a build tool is that you can add as many post-processing steps as you want. Encouraged by the [Typescript documentation] I suddenly wanted browserify and uglify. I also wanted it to be "watching", so I couldn't forget anything.

Uglify was no problem, watching was easy, browserify was... difficult. As mentioned earlier, my test files used imports. In fact this had been quite tricky to achieve; now it stood in my way, and I was not about to poke that bear. Therefore I abandoned my dream of browserify.
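For illustration, here is a minimal gulpfile along the lines of what I ended up with under Gulp – compile, uglify, and watch, no browserify. The plugin choices (gulp-typescript, gulp-uglify) and the paths are assumptions, not my exact setup:

const gulp = require('gulp');
const ts = require('gulp-typescript');
const uglify = require('gulp-uglify');

// Compile the TypeScript sources and minify the output.
gulp.task('build', () =>
  gulp.src('src/**/*.ts')
    .pipe(ts({ target: 'es5' }))
    .pipe(uglify())
    .pipe(gulp.dest('public/js')));

// Rebuild on every change, so there is nothing to forget (gulp 3 syntax).
gulp.task('watch', ['build'], () =>
  gulp.watch('src/**/*.ts', ['build']));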

Do-over: Grunt!

As I remember it, I was brawling with Gulp's jasmine-node plugin not supporting the later versions of ECMAScript – in particular Promises, which I use heavily – when I suddenly stumbled on a blog describing my dream of a build pipe-line. It had everything: a client part, a server part, and a common part. The server part was tested using jasmine-node; the client part was tested with jasmine and phantomjs. The client was browserify-ed and uglify-ed. There was watching, and the folder structure was beautiful. It was a [fine template] for a project like this.

The only problem was, it used Grunt. I'm not one to be over-confident in my decisions, so if I learn something new I gladly change. Thus I deleted everything I had made up to this point, and tried swapping in Grunt.

This was not problem-free, but it wasn't too bad. I ended up testing both client and server with jasmine-node. Isabella is very light on DOM and very heavy on APIs, which I can test just as easily with jasmine-node.
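As an illustration, an API-heavy command can be tested with a spec like the following – getJoke is a stand-in for one of my actual helpers, not its real name:

describe("the joke command", () => {
  it("resolves with a non-empty joke", (done) => {
    getJoke().then((joke) => {
      expect(typeof joke).toBe("string");
      expect(joke.length).toBeGreaterThan(0);
      done();
    });
  });
});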

Conclusion

Although this was a tough stretch, I did accomplish a few things. My code is now a bit more secure from my students copying it, due to uglify. It also takes up less space, and thus loads faster, also because of uglify. It is browserify-ed, so I can use imports as much as I want, and I can never forget to include a file in the HTML. I have testing up and running, so I can start adding tests whenever I add new features, or fix bugs in current ones. And I have a guarded deploy, so even if I forget to test locally I am guaranteed that the tests will be run before a deploy.

I don't have any general words of wisdom. I won't say that you should always use Grunt, or anything like that. Setting up a good pipe-line is hard, but it is also invaluable. It is also a problem that we don't tackle often. I am familiar with the DevOps saying: if it hurts, do it more. It encourages practicing the skills that we struggle with. If you are afraid of deploying, do it more, so you minimize the risk. If you are afraid of changing some code, delete it and write it again, so you know what's going on. While I agree wholeheartedly with this advice, I don't feel like going through this process again any time soon. If you are about to set up a pipe-line of your own: I wish you the best of luck.

Wednesday, January 10, 2018

Isabella: Change to followup

Previously on Dr. Lambda's blog:

In a previous post I presented my newest pet project: Isabella. Isabella is a voice controlled personal assistant, like Siri, Alexa, and others. We have decided to investigate how difficult it is to make such a program. In the last post we talked about the process and problems of moving her to HTTPS.

Now, the continuation...

Followups

In Isabella there is a concept of a followup, meaning that for certain queries she is able to reply and wait for a response. The most important implication of this is that you do not need to start a followup with "Isabella"; you can just say it. This is what we use for playing games, or taking notes.

Something that often annoyed me was saying "Isabella, turn on the lights", which turns the lights on to their last state. This means that in the morning the lights would be dimmed red, because that's their night setting. Therefore I often had to follow up with "Isabella, bright lights", which switches to their bright setting. I did this sequence, or a similar one, until I realized that finding the right lighting is an experimental process: I would often switch scenes a few times, and maybe try dimming, before I found the exact lighting that I wanted. With this in mind it is annoying to have to say "Isabella" again and again.

Categories

The solution was simple, because I already had the followup system in place. I simply added a category to every command; then, when I execute a command, I push all commands in the same category to the followup database.
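A sketch of the idea, with hypothetical names – my actual command table looks a bit different:

// Every command carries a category (names here are hypothetical).
const commands = [
  { phrase: "turn on the lights", category: "lights", action: turnOnLights },
  { phrase: "bright lights",      category: "lights", action: brightLights },
  { phrase: "dim the lights",     category: "lights", action: dimLights }
];

function execute(command) {
  command.action();
  // Prime every command in the same category as a followup, so the
  // next utterance doesn't need to start with "Isabella".
  const related = commands.filter(c => c.category === command.category);
  followupDatabase.push(...related);
}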

Just like the "refresh" command from an earlier post, this was trivial to implement, yet the effect on the user experience was tremendous. Last time I argued that you should prioritize changes which benefit development, so I won't beat that dead horse anymore.

This time I will point out that it was only by using the product that this became apparent to me. This lesson of usage over theory was, ironically, something I learned while taking a theoretical compilers course at university. The course ended with a competition to see who could make the best peephole optimizer for a byte-code language. To simplify the challenge we were judged only on the number of instructions. Our optimizers were run on several small but realistic applications, to see who did the best.

Gripped by competitive spirit, we thought up hundreds of byte-code patterns which were stupid and could be optimized. Come judgement day, it was revealed that we were the team with the most patterns; one team had only 7 patterns. "They probably slacked off", we thought, and with so many patterns we were confident we were sure to win. However, that was not what happened. Everybody clapped and cheered as the team with 7 patterns accepted their victory.

It took months of pondering before I fully realized what had happened. The other team had understood what the teachers meant by "realistic applications". They had spent their time not coming up with stupid patterns like my team, but instead coding realistic applications and looking at the byte-code. Doing this they had spotted 7 weird, but very common, patterns.

The lesson I learned then has stayed with me since. I already knew to "optimize for the common case", yet I did not know the value of investigating what "the common case" is. My advice this time is: remember to walk in your users' shoes once in a while. Sometimes it will reveal a tiny change with a huge effect.

"Isabella"

Getting back to the followups: I noticed that with Amazon Echo I would say "Alexa", and then wait, to make sure she was listening. If you have a long or complicated command, it is tedious to repeat it because she didn't hear her name. Again this was a nice touch which was simple to add to Isabella. I just added an empty command, which would push all commands to the followup database.
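In the same hypothetical style as the earlier sketch, this is just a command with an empty phrase whose action primes everything:

// Saying just "Isabella" matches this command; it does nothing except
// prime every command as a followup, so she keeps listening.
const justListen = {
  phrase: "",
  category: "meta",
  action: () => followupDatabase.push(...commands)
};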

Here is a video:

Saturday, January 6, 2018

Isabella: HTTPS

Previously on Dr. Lambda's blog:

In a previous post I presented my newest pet project: Isabella. Isabella is a voice controlled personal assistant, like Siri, Alexa, and others. We have decided to investigate how difficult it is to make such a program. In the last post we discussed how we added Spotify and OAuth to Isabella.

Now, the continuation...

Deploy Debriefing

Having put Isabella in the cloud, there are certain things I have struggled with, and some I still do. On localhost we can easily get access to the microphone and speakers. However, online, Chrome won't even ask for permissions if it is not an HTTPS connection. At first this might not seem like such a big deal; Heroku supports HTTPS out of the box, so adding the S should just work. Right?

Server-side protocol

Part of OAuth is to have the third party (Spotify) redirect back to us. At first we just used

redirect_uri: 'http://' + req.get('host') + spotify_conf.redirect_uri,

Now we needed to find out whether we should add the S. Doing that – server-side – turned out to be a bit annoying. Let me save you the trouble:

let protocol = req.headers["x-forwarded-proto"] || "http";
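Putting the two snippets together, the redirect URI can be built per request; the '/login' route name is just for illustration:

app.get('/login', (req, res) => {
  // Heroku's router terminates TLS and reports the original protocol
  // in x-forwarded-proto; on localhost the header is absent.
  let protocol = req.headers["x-forwarded-proto"] || "http";
  let redirect_uri = protocol + '://' + req.get('host') + spotify_conf.redirect_uri;
  // ... include redirect_uri in the authorization request to Spotify
});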

HTTP requests

We also used several APIs in the client which only run on HTTP – to say nothing of the blatant security problem of having all the API keys in the client. Mixed content violates the HTTPS guarantee, so Chrome kindly blocks these calls and warns the user.

Again there is a simple – albeit tedious – solution to both problems: move all the HTTP calls to the server. That way the client only makes HTTPS calls (to the server and to Spotify).

// Proxy the joke API through our own server, so the client only
// ever talks HTTPS to us. 'request' is the npm request package.
app.get('/joke', (req, res) => {
  request.get({ url: "https://icanhazdadjoke.com/",
            headers: { "Accept": "application/json" }},
              (error, response, body) => {
    if (error) return res.status(500).send(error.message);
    res.send(body);
  });
});
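On the client, the call then goes to our own endpoint instead. A minimal example – speak is a stand-in for Isabella's text-to-speech function:

// icanhazdadjoke returns JSON like { id, joke, status }.
fetch('/joke')
  .then(response => response.json())
  .then(data => speak(data.joke));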

We do have to be a bit careful while doing this. E.g. we use an API to look up our location based on our IP. Obviously, if we just move this call to the server, we get the server's location. As luck would have it, this particular API allows us to pass in a specific IP and look that up. Now we only need to find the client's IP and pass it along. Again this was cumbersome because we are sometimes running on localhost, and sometimes not. Anyway, the solution is:

app.get('/ip', (req, res) => {
  // Behind Heroku's router the client's IP is in x-forwarded-for;
  // locally we fall back to the socket's remote address.
  let ip = req.headers['x-forwarded-for'] || req.connection.remoteAddress;
  // "::1" is IPv6 localhost; an empty string makes ip-api use the caller's IP.
  if (ip === "::1") ip = "";
  request.get("http://ip-api.com/json/" + ip, (error, response, body) => {
    res.send(body);
  });
});

Whew... or I mean Hue

Having solved these problems it seemed like we were ready to go full HTTPS. But not without one last problem: Hue. The Philips Hue API runs locally, and uses only HTTP requests.

I don't want to rant about the general implications or irresponsibility of this.

Because the calls go to a local IP, I cannot move them to the server. From my research, I cannot make Hue run HTTPS either. So I'm stuck. If anyone has a solution, or even suggestions on how to solve this, I am all ears.

So, the slightly uncomfortable conclusion is: with the exception of warnings from each Hue call, we have successfully moved to HTTPS!

Wednesday, January 3, 2018

Isabella: Spotify and OAuth

Previously on Dr. Lambda's blog:

In a previous post I presented my newest pet project: Isabella. Isabella is a voice controlled personal assistant (VCPA), like Siri, Alexa, and others. We have decided to investigate how difficult it is to make such a program. In the last post we finally deployed her to the cloud.

Now, the continuation...

Voice Controlled Personal Assistants

As Isabella has grown, I have grown more and more dependent on her, and indeed more attached to her. In the beginning this was just a fun experiment to see how difficult it was to make something like Alexa. At the same time I was strongly considering buying a "real" VCPA like Amazon Echo or Google Home. This doesn't seem reasonable anymore. The other VCPAs do offer a few features that Isabella doesn't have... yet. To balance it out I have decided to add a feature to Isabella that isn't available in the other assistants.

Spotify

I listen to music quite a lot. Whether I'm working or cooking, Spotify is usually playing in the background. Again, I don't want to get into a discussion about which music streaming service is best; I just happen to use Spotify. Unfortunately playing music from Spotify is not supported by Amazon Echo – in my country, at the time of writing. Of course this is due to politics and not technology. However, I still want it.

Research

Spotify has great documentation for their web API. My first idea was just to get some audio stream, pipe it into an audio tag and boom, music from Spotify. Unfortunately this turned out to be impossible: you can only retrieve a 30-second clip of a song.

This was quite the roadblock, and it stumped me for several days. I looked over the API again and again, and it just seemed to have methods for searching, and for "clicking" the different buttons in the interface. In a way the commands in the API could make up a remote control. Then it hit me: a remote control was exactly what I was trying to build. I didn't want to build an entire music streaming platform, I just wanted to control one.

This does have the limitation that Spotify needs to be constantly running in the background. But it does also mean that Isabella can control Spotify playing on other devices like phones or tablets.
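To make the remote-control idea concrete, here is a sketch of "pressing pause" using the Web API's player endpoints. It assumes we already have an access_token in hand – see the OAuth section below:

// Pause playback on whatever device is currently active.
// A 204 No Content response means Spotify accepted the "button press".
request.put({
  url: 'https://api.spotify.com/v1/me/player/pause',
  headers: { 'Authorization': 'Bearer ' + access_token }
}, (error, response, body) => {
  if (error) console.error(error);
});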

OAuth

The first step when working with the Spotify API is to implement their OAuth protocol. Luckily, Spotify's OAuth is super easy to implement thanks to their documentation. Most people know OAuth only from "log in with Facebook" (or Google), but it can do much more. I imagine that we will use this same protocol for many APIs that we add in the future, like calendars, email, etc. Therefore I'll briefly explain the basics of OAuth. In my experience OAuth is difficult to grasp at first sight, so you should not expect to gain a deep understanding from this presentation.

Because repetition is good for understanding, I'll explain it using two metaphors I like. Then I'll explain it with the technical terms, because repetition is good for understanding.

Imagine that we are managers in a warehouse. We have access to many areas; some of them are restricted, meaning only we have access to them. Now, for some reason, we want someone else to solve one of our tasks. But in order to solve this task they need access to some of the restricted areas that we have access to. This is the fundamental problem that OAuth solves.

The protocol states that:

  • you ask the person who should perform the task.
  • the person asks the secretary for a key to the restricted area.
  • the secretary calls you to ask if this person is allowed into this particular restricted area.
  • you confirm.
  • the secretary writes an official form and gives it to the person.
  • the person takes the form to the janitor.
  • the janitor makes the necessary key and gives it to the person.

At this point the person can perform the task. We could imagine the same procedure if you are applying for a job, and the company wants to know your grades, which are usually confidential. The protocol states that:

  • you send a job application.
  • the company asks your school (or university) for your grades.
  • the school calls you to ask if this company is allowed to see your grades.
  • you confirm.
  • the school sends a link to the company.
  • the company opens this link in a browser.
  • the browser shows the company your grades.

Finally let's take the concrete example of Isabella and Spotify. The protocol states that:

  • you want Isabella to control Spotify, so you send a request to Isabella.
  • Isabella redirects this request to Spotify, adding some authentication information, so Spotify knows who "Isabella" is.
  • Spotify then presents you with a "this application wants access to these areas" screen.
  • you click confirm/continue – i.e. send a request to Spotify.
  • Spotify redirects this request back to Isabella, adding a special token.
  • Using this token Isabella sends a request to Spotify.
  • Spotify returns an access_token.

Basically every call in Spotify's API requires this access_token.
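To make the last three bullets concrete, here is a sketch of the callback route that trades the special token for an access_token. The route and variable names are illustrative, not my actual code; the endpoint and parameters follow Spotify's documentation:

app.get('/callback', (req, res) => {
  request.post({
    url: 'https://accounts.spotify.com/api/token',
    form: {
      grant_type: 'authorization_code',
      code: req.query.code,       // the special token from the redirect
      redirect_uri: redirect_uri  // must match the one used initially
    },
    headers: {
      // client_id and client_secret are what identify "Isabella" to Spotify
      'Authorization': 'Basic ' +
        Buffer.from(client_id + ':' + client_secret).toString('base64')
    },
    json: true
  }, (error, response, body) => {
    let access_token = body.access_token;
    // store the token, then send the user back to the client
    res.redirect('/');
  });
});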

The first step

The first step in the protocol is to show that you want Isabella to take control of Spotify. The standard way is to have a button, and that was my first approach too. The first time you click it, it takes you away from Isabella, and you are confronted with a consent screen. From a Human-Computer Interaction viewpoint, this view change is feedback, so it is fitting to have a button. However, any subsequent time you click it, Spotify remembers your consent and just sends you straight back to Isabella without you noticing it. This means that in the subsequent cases we have a button without noticeable feedback – not good.

Common computer science knowledge teaches us that we should optimize for the common case. Imagine that you want Isabella to take control of Spotify often... very often. We only log in "for the first time" once; the common case is clearly the subsequent times, where it does not make sense to have a button. Therefore I decided in the end to remove the button, and instead add a command to "log in to Spotify".

Now this does cause a problem with discovery: the process by which a user learns about features. It is easy to see a button and try to click it. It is harder if there are no visual cues. However, this is a general problem for VCPAs: how do you know what you can do with them? With human interaction we assume that either the receiver knows how to answer our query, or we can teach them. Is this an approach we can take with VCPAs? Start with a broad basis of tasks, and then have the users teach them what they need? How should they teach it? I will certainly look deeper into this in a later post.

For now, here is a video: