Chatbots: where are you?

Creating a chatbot seems easier than ever. Or is it?

There are several frameworks and services that make developers' lives easier, such as the Bot Framework and LUIS. But once you start covering more and more real-life use cases, it turns out that the developer still has a lot to do.

One of the challenges I faced in my chatbot was getting the user's location. In Bot Framework, some channels can provide contextual information along with the user; Cortana, for example, can provide the user's actual location. This comes in very handy when our chatbot needs to know where the user actually is.

Understanding the user

When it comes to other channels such as Facebook's Messenger, life is not that simple anymore: we do not have the same contextual information, yet we still need to know the user's location. One simple solution is to ask the user for their location in a dialog. It seems easy enough: open a new dialog, post a question and wait for the answer. But what kind of answer do you expect: a city name, a full address, a street name or a country? In my app (as it is related to public transport) I am expecting bus stop names. Why bus stops? They have valid names (along with geocoordinates), so you know where to get on and off.

Do you expect someone to type a stop name exactly as it is? Users typically make mistakes, introduce typos and shorten things. For a stop name "Móricz Zsigmond square" you might get "moricz", "móricz" or "moric zsigmon".

User types the location

You want to make a best-effort attempt at parsing the reply here, to provide a good user experience. Having a finite set of stop names helps. Using dynamic programming algorithms, such as an edit distance, you can compute the distance between the typed text and each stop name, and so narrow the candidates down to a few. In the best case you find a single stop name; in a worse case you find a handful. With a handful of names you can at least go back to the user and offer a choice among the most likely matches.
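The matching idea can be sketched as follows. This is only an illustration under assumptions: the article does not say which distance it uses, so the Levenshtein edit distance is assumed here, and the names StopMatcher, EditDistance and ClosestStops are hypothetical. Note that comparing against the full stop name penalizes shortened input such as "moricz", and that accents are not normalized; a real implementation would likely need to handle both.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Bot Framework.
public static class StopMatcher
{
    // Levenshtein edit distance, computed with dynamic programming
    // over two rolling rows instead of a full matrix.
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (var j = 0; j <= b.Length; j++) previous[j] = j;

        for (var i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (var j = 1; j <= b.Length; j++)
            {
                var cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1), // insert / delete
                    previous[j - 1] + cost);                       // substitute or match
            }
            // Swap rows for the next iteration.
            var tmp = previous; previous = current; current = tmp;
        }
        return previous[b.Length];
    }

    // Returns the stop names closest to the typed text, best match first.
    // Comparison is case-insensitive; accents are NOT normalized here.
    public static IList<string> ClosestStops(string typed, IEnumerable<string> stops, int take = 3)
    {
        var needle = typed.Trim().ToLowerInvariant();
        return stops
            .OrderBy(stop => EditDistance(needle, stop.ToLowerInvariant()))
            .Take(take)
            .ToList();
    }
}
```

If the best candidate is clearly closer than the rest, the bot could accept it directly; otherwise the short list returned by ClosestStops is what you would offer back to the user as a choice.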

The user's other option is to share their location.

Sharing location as an entity

In this case the location is not received as text but as JSON attached to the reply as an entity. The attached entity has latitude and longitude properties, which we can use for further calculations.
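One example of such a further calculation is finding the stop nearest to the shared coordinates. The sketch below assumes the haversine great-circle formula and a spherical Earth with a mean radius of 6,371 km; the article does not specify which calculation it performs, and the BusStop and GeoMath types are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stop record: a name plus geocoordinates in degrees.
public sealed class BusStop
{
    public BusStop(string name, double latitude, double longitude)
    {
        Name = name;
        Latitude = latitude;
        Longitude = longitude;
    }

    public string Name { get; }
    public double Latitude { get; }
    public double Longitude { get; }
}

// Hypothetical helper, not part of Bot Framework.
public static class GeoMath
{
    private const double EarthRadiusMeters = 6371000.0; // mean radius, spherical model

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }

    // Haversine great-circle distance between two points, in meters.
    public static double DistanceMeters(double lat1, double lon1, double lat2, double lon2)
    {
        var dLat = ToRadians(lat2 - lat1);
        var dLon = ToRadians(lon2 - lon1);
        var a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
              + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
              * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
        // Clamp to 1.0 to guard against floating-point rounding.
        return 2 * EarthRadiusMeters * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
    }

    // Returns the stop closest to the given point, or null if there are no stops.
    public static BusStop Nearest(double lat, double lon, IEnumerable<BusStop> stops)
    {
        return stops
            .OrderBy(s => DistanceMeters(lat, lon, s.Latitude, s.Longitude))
            .FirstOrDefault();
    }
}
```

A linear scan like this is probably sufficient for a city-sized set of stops; a spatial index would only be worth considering for much larger data sets.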

The user's location on map

Creating a custom PromptDialog

Bot Framework's API offers a simple PromptDialog class which can be used to prompt the user for a Text, Number, Choice, Confirmation etc. In the following I will show a simple way to extend the PromptDialog so you can ask for the user's Location. The current Text method unfortunately captures only typed text, not a location shared as an entity, hence the extension. This is what we want to be able to write:

protected void RequestLocation(IDialogContext context)
{
  PromptDialogEx.Location(context, async (ctx, result) => await LocationPromptReply(ctx, result),
    "Where are you now?", "Tell me your location!");
}

We create a new static method (just so we get a syntax similar to that of the original PromptDialog). I do not prefer static methods (because of testability), but for the sake of this article we will go with this approach.

Inside the static Location method we simply create a new dialog and call it on the dialog context, so it is pushed onto the top of the conversation stack:

var child = new LocationPromptDialog(prompt, retry, attempts);
context.Call<LocationPromptReply>(child, resume);

The only thing left is to add our LocationPromptDialog class. LocationPromptDialog derives from Prompt<LocationPromptReply, string> (where LocationPromptReply is our POCO that carries the user's reply). After adding the required constructors, we need to override the TryParse method:

protected override bool TryParse(IMessageActivity message, out LocationPromptReply result)
{
  if(!string.IsNullOrWhiteSpace(message.Text))
  {
    result = new LocationPromptReply(message.Text);
    return true;
  }
  // Entities may be null when the message carries no attachments of this kind.
  var geoCoordinates = message.Entities?.FirstOrDefault(e => e.Type == "Place");
  if(geoCoordinates != null)
  {
    var entity = geoCoordinates.GetAs<ConversationEntity>();
    if(entity?.Geo != null && entity.Geo.Type == "GeoCoordinates")
    {
      var lat = entity.Geo.Latitude;
      var lon = entity.Geo.Longitude;
      result = new LocationPromptReply(lat, lon);
      return true;
    }
  }
  result = null;
  return false;
}

In this override we wrap the typed text in a LocationPromptReply if the user has typed a reply, or the coordinates if the user has shared a location. Otherwise we set the result to null and return false. Further business logic is not included in this implementation.
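The LocationPromptReply class itself is not listed in this article. A minimal sketch that is consistent with the two constructor calls in TryParse might look like the following; the property names, the nullable coordinates, the HasCoordinates helper and the [Serializable] attribute are all assumptions rather than the original implementation.

```csharp
using System;

// Hypothetical shape of the reply POCO; only the two constructors
// are implied by the TryParse override above.
[Serializable] // assumed, since dialog state is typically serialized between turns
public sealed class LocationPromptReply
{
    // Used when the user typed the location as text.
    public LocationPromptReply(string text)
    {
        Text = text;
    }

    // Used when the user shared a location as an entity.
    public LocationPromptReply(double latitude, double longitude)
    {
        Latitude = latitude;
        Longitude = longitude;
    }

    public string Text { get; }
    public double? Latitude { get; }
    public double? Longitude { get; }

    // True when the reply carries coordinates rather than typed text.
    public bool HasCoordinates
    {
        get { return Latitude.HasValue && Longitude.HasValue; }
    }
}
```

With a shape like this, the resume handler can branch on HasCoordinates: match the typed text against stop names in one case, and work from the coordinates in the other.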

Note that by deriving from Prompt we get the retry logic and the rest of the prompt options for free; we do not need to re-implement them.