
“Alexa, ask Algorand for the latest round on MainNet?”

Overview

Voice assistants are everywhere these days, adding a whole new dimension through which users can interact with applications. The Voice User Interface (VUI), analogous and complementary to the ubiquitous Graphical User Interface (GUI) that has dominated much of our online and mobile experience to date, can provide an even more comprehensive and enriching experience for the end users of any application. This includes, of course, blockchain-based applications.

As a former Alexa employee, I have seen the power of a well-designed voice experience and how it can enhance the overall experience for customers, so building a prototype of an Alexa skill that interacts with the Algorand blockchain was a no-brainer; that prototype is the sample code we provide here. We hope this example demonstrates how simple it is to use the Algorand SDK within an Alexa skill and, in turn, encourages others to create more magical voice-activated experiences that make blockchain-based applications more exciting and accessible to end users.

Application Goal

Our goal was to create a skill that would provide basic information about what is occurring on Algorand’s MainNet, TestNet, or BetaNet when prompted by users. In doing so, we demonstrate a proof-of-concept for an Algorand-based application that others can use to build out their own voice-activated front-ends.

In particular, we wanted the Alexa skill to be able to respond to the following questions or requests for any of the three networks:

  1. What is the current round?
  2. What is the predicted future round at a given clock time (provided by the user)?
  3. What is the predicted future clock time for a given round (provided by the user)?
  4. Tell me about a particular asset (user provides the asset ID).

We present sample code showing how to transform these interactions with the Algorand blockchain into a voice-interactive experience. We use the Alexa Skills Kit to create the skill and its voice interaction model (we will explain what this is later), AWS Lambda to host the skill code, and the Algorand Python SDK to retrieve the requested information about the Algorand network.
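
Under the hood, each of these requests reduces to a simple read-only call through the Algorand Python SDK. The sketch below is not the skill’s exact code; it only illustrates what those calls look like with the v2 algod client. The endpoint address, empty API token, and the rough 4.5-second average block time used for the round/time predictions are assumptions, not values taken from the repo.

    # A minimal sketch of the read-only Algorand queries behind the four requests,
    # using the v2 algod client from py-algorand-sdk. The endpoint, empty token,
    # and ~4.5 s average block time are assumptions, not values from the repo.
    import time

    from algosdk.v2client import algod

    ALGOD_ADDRESS = "https://mainnet-api.algonode.cloud"  # placeholder public endpoint
    ALGOD_TOKEN = ""                                      # public endpoints often need no token
    AVG_BLOCK_SECONDS = 4.5                               # assumed average block time

    client = algod.AlgodClient(ALGOD_TOKEN, ALGOD_ADDRESS)

    def current_round():
        # 1. The latest round on the network
        return client.status()["last-round"]

    def predict_round(seconds_from_now):
        # 2. Predicted round at a future clock time
        return current_round() + int(seconds_from_now / AVG_BLOCK_SECONDS)

    def predict_time(target_round):
        # 3. Predicted clock time (Unix seconds) for a future round
        return time.time() + (target_round - current_round()) * AVG_BLOCK_SECONDS

    def asset_details(asset_id):
        # 4. Details for a given asset ID
        return client.asset_info(asset_id)

    if __name__ == "__main__":
        print("Latest round:", current_round())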

Application Components

The full code for this application can be found here. There are a few things you need to do to get it up and running. We describe them below, along with some tips and tricks we found useful along the way.

Tip

If you are new to Algorand, we recommend checking out the getting started section in the Algorand docs. For getting started with Alexa Skills, we recommend the Develop Your First Skill guide.

Create the Voice Interaction Model

Create an Alexa Developer account and get set up in the Alexa Developer Console, where you will create and deploy your skill’s interaction and language models. Peruse the Alexa developer docs to get familiar with building voice user interfaces. I would recommend reading the Voice Design Getting Started Guide if you are completely new to voice design. The Developer Console provides an intuitive user interface for creating the intents and slots associated with user requests to your skill, and we recommend using it to build out custom models. The result of that process is the interaction_model.json, which you can input directly into the console if you want to replicate the skill exactly.

There is a ton of documentation available in the Alexa Developer docs that will be useful. For our code, we used the Alexa Python SDK and borrowed components from both the Hello World skill and the Super Heros skill, particularly for the built-in intents. We then created our own intents and slots that map to the functionality we defined in the prior section.

Info

For this component, your goal is to map natural language (i.e. the way a real [human] user would request the information we defined above) to a discrete set of Intents and Slots that your code can then handle on the backend.

In our case, the four requests we defined earlier map one-to-one to the intents we created. For example, see how the interaction model for the BlockTimeIntent (Figure 1) looks below. It can take one slot, called NETWORK, whose sample values we define further down in the JSON. The "samples" are example phrases that people might say to ask for the current block. If the user specifies a particular NETWORK, we want to make sure we capture that value so that we retrieve the information from the right network.

Figure 1

            ...
                {
                    "name": "BlockTimeIntent",
                    "slots": [
                        {
                            "name": "NETWORK",
                            "type": "NETWORK"
                        }
                    ],
                    "samples": [
                        "{NETWORK} block",
                        "{NETWORK} time",
                        "get {NETWORK} round",
                        "get round on {NETWORK}",
                        "last round on {NETWORK}",
                        "what is the latest round on {NETWORK}",
                        "the current time on {NETWORK}",
                        "the current time",
                        "the time on {NETWORK}",
                        "the time",
                        "the block time on {NETWORK}",
                        "give me {NETWORK} s current round",
                        "give me the latest round on {NETWORK}",
                        "give me the latest block on {NETWORK}",
                        "what block is {NETWORK} on",
                        "what block are we on on {NETWORK}",
                        "what block are we on",
                        "what is the current block on {NETWORK}",
                        "give me the current time on {NETWORK}",
                        "give me the current block on {NETWORK}",
                        "give me the current time",
                        "give me the current block",
                        "what is the current block",
                        "what is the current round",
                        "what is the latest block",
                        "what is the latest round",
                        "give me the time on {NETWORK}",
                        "what time is it on {NETWORK}",
                        "give me the time",
                        "what time is it"
                    ]
                },
            ...

AWS Lambda Environment

After you create your skill in the console, you must associate it with the AWS Lambda function that will handle the intent and slot values produced when the interaction model processes a request. Note that the Alexa Skills Kit takes care of building and deploying these models for you; you just need to associate the skill with your Lambda-based code and enable the correct trigger. The Develop Your First Skill guide referenced above has all this information and more.

The one hitch to this, which can be frustrating if you have never used Lambda before, is that you must provide your code as a .zip file that contains all required dependencies (except for built-in modules, which Lambda already provides), and the structure of this .zip file matters. How to do this is described briefly in the Alexa docs and in more detail here. The long and short of it is that you should create a Python virtual environment, activate it, install all the packages in requirements.txt, and then create a zip file that contains, at its root level, the main Lambda code and any modules it imports. That means the modules inside the virtual environment’s site-packages folder should end up at the root level of the zip file (i.e. there will be no site-packages folder in your .zip).
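
If you prefer to script that packaging step, a small helper along the following lines does the trick. This is a hypothetical sketch, not part of the original repo: the staging folder, archive name, and lambda_function.py entry point are placeholders you should adapt to your own layout.

    # build_zip.py -- a hypothetical helper script, not part of the original repo.
    # It installs the dependencies from requirements.txt into a staging folder and
    # zips them together with the handler so that everything sits at the archive
    # root, which is the layout Lambda expects.
    import subprocess
    import zipfile
    from pathlib import Path

    STAGING = Path("build")
    ARCHIVE = "skill.zip"
    HANDLER = "lambda_function.py"  # your skill's entry-point module

    STAGING.mkdir(exist_ok=True)
    subprocess.run(
        ["pip", "install", "-r", "requirements.txt", "--target", str(STAGING)],
        check=True,
    )
    with zipfile.ZipFile(ARCHIVE, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in STAGING.rglob("*"):
            # Dependencies land at the zip root, not under a site-packages folder.
            zf.write(path, path.relative_to(STAGING))
        zf.write(HANDLER, HANDLER)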

Tip

AWS Lambda uses a Linux-based runtime, so be sure to use a Linux-based system to install dependencies, especially for the Algorand SDK. If you install the Algorand SDK on a Mac and then deploy it to Lambda, it will not work. If you don’t have Linux, you can spin up a quick EC2 instance to build your code. I also include a .zip with the Python SDK in the repo.

Info

The Lambda function handles the Alexa request after it has been processed by the interaction model. The Lambda function must contain code that processes the incoming intent and slot information, i.e. determines what information it needs to retrieve for the user, and formats it into a fluent natural language response that Alexa will read back to the user.
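
As an illustration, here is a sketch (not the repo’s exact handler) of what a handler for the BlockTimeIntent shown in Figure 1 might look like with the ASK Python SDK. The NETWORKS mapping and its endpoint URLs are hypothetical placeholders for the real lookup code.

    # A sketch of a BlockTimeIntent handler using the ASK Python SDK (ask-sdk-core)
    # and py-algorand-sdk. The NETWORKS mapping and endpoint URLs are hypothetical
    # placeholders, not the original repo's configuration.
    from ask_sdk_core.skill_builder import SkillBuilder
    from ask_sdk_core.dispatch_components import AbstractRequestHandler
    from ask_sdk_core.utils import is_intent_name
    from algosdk.v2client import algod

    NETWORKS = {
        "mainnet": "https://mainnet-api.algonode.cloud",
        "testnet": "https://testnet-api.algonode.cloud",
        "betanet": "https://betanet-api.algonode.cloud",
    }

    class BlockTimeIntentHandler(AbstractRequestHandler):
        def can_handle(self, handler_input):
            return is_intent_name("BlockTimeIntent")(handler_input)

        def handle(self, handler_input):
            # Read the NETWORK slot; fall back to MainNet if the user omitted it.
            slots = handler_input.request_envelope.request.intent.slots
            network = "mainnet"
            if slots and slots.get("NETWORK") and slots["NETWORK"].value:
                network = slots["NETWORK"].value.lower()
            client = algod.AlgodClient("", NETWORKS.get(network, NETWORKS["mainnet"]))
            last_round = client.status()["last-round"]
            speech = f"The latest round on {network} is {last_round}."
            return handler_input.response_builder.speak(speech).response

    sb = SkillBuilder()
    sb.add_request_handler(BlockTimeIntentHandler())
    lambda_handler = sb.lambda_handler()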

Final Thoughts and Inspiration

This particular skill layers a voice-interactive experience onto some very simple read-only requests available as first-order functions in the Algorand Python SDK. This simple implementation enables users to ask Alexa what block just passed on any Algorand network, to predict future blocks and times, or to get information about a specific asset ID. Imagine the possibilities when you take this a step further, extending it to new applications and leveraging powerful backend resources.

Here are a few ideas:

  1. Store tailored views of the Algorand blockchain data in DynamoDB allowing users to ask more complex queries, e.g. “How many transactions occurred in the last 10 days on MainNet?” or “How many transfers of USDt were there last week?”
  2. Introduce a sign-in/wallet experience and allow users to initiate transfers [securely] from their accounts through voice, e.g. “Send 10 Algos to Sarah” or “Buy 10 USDt”.
  3. TEAL provides an opportunity to initiate certain transactions securely without the need to store keys, e.g. “Trigger my recurring payment to {Mobile Provider}”.
  4. Users register specific accounts that they can ask Alexa about or track transactions to and from, e.g. User sees the notification light and Alexa says: “Your recurring payment to {Mobile Provider} was sent.”

These are just a few examples of how you could extend Algorand-based applications into the voice domain, and the example code we provided demonstrates how easy it is to use the Algorand SDKs within this voice experience. We look forward to seeing what other ideas the community can create!

Video Demo