SMX Overtime: Opportunities and challenges for conversational voice search
At the SMX East conference I spoke on the topic of voice, and in particular, I discussed how to go about building a personal assistant app. During the Q&A for that session there were a few questions that went unanswered, and in today’s post I will address five of them. In answering the first of those questions, I will go into depth on my thoughts about the overall level of voice search usage today. Check it out, because you might not see the data that you expect!
1. To compete with others on Google search, do we need to build an app for voice search?
While I’m a fan of building personal assistant apps, my experience suggests that the usage level for most such apps remains relatively small, with some notable exceptions. I do think that it makes sense for a large number of organizations to have a voice app program, but for most of these, it’s about gaining experience in voice while it’s still relatively early days in this new arena.
I believe you should be experimenting with voice today. Not because voice is huge right now, but because it will be and it’s coming fast. In other words, having a personal assistant app is not a requirement to compete in Google search, but in a few years, it likely will be.
So why start now? Simple: voice as a source of input will change user expectations of how the dialogue between humans and machines should work, and learning how to deal with speech recognition and natural language processing is hard. You will need to experiment with this and learn how it works and it will take time to build that expertise.
Many of you are probably thinking to yourselves that you’ve seen many statistics that suggest that voice is already huge, and you’re wondering if I’m simply out of touch with what’s happening. However, the reality is that those statistics are deceptive (and in some cases completely misquoted). There are three basic reasons why this is true:
- Frequently cited metrics about voice search are inaccurate.
- The nature of voice queries is quite different from the makeup of typed search queries.
- Personal assistant usage is only a portion of total voice search.
On the issue of voice search volume, little real data has been published by anyone. We have the oft-quoted prediction that voice search will make up 50% of all search by 2020, which is often attributed to Comscore. First of all, the prediction was actually made by Andrew Ng, who at the time was Chief Scientist at Baidu.
Second, his prediction was for the combination of voice and visual search. And third, the data that we have available suggests that his prediction was greatly exaggerated.
We also have Sundar Pichai’s keynote at Google I/O where he noted:
“in the U.S. on our mobile app and Android, one in five queries, 20% of our queries are voice queries and their share is growing.”
Twenty percent of queries already sounds like a great number, but let’s break that down a bit more:
- This did not include desktop queries, where the voice query share is probably close to zero
- This did not include queries via browser on Android devices, and it may be that significantly more queries happen in regular browsers than in the Google Assistant
- This did not include Google queries that take place on iOS devices, where installations of the Google Assistant are probably quite low (so nearly all Google queries happen via browser)
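To see how much those exclusions can matter, here is a rough sketch of the arithmetic. The traffic mix below is entirely hypothetical (no one publishes these splits), and the function name is mine; the only number taken from the keynote is the 20% voice share within the segment it covered.

```python
def blended_voice_share(segments):
    """Return the overall voice share across all query segments.

    segments: list of (share_of_all_queries, voice_share_within_segment).
    The shares of all queries must sum to 1.
    """
    total_weight = sum(weight for weight, _ in segments)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("segment weights must sum to 1")
    return sum(weight * voice for weight, voice in segments)


# Hypothetical traffic mix, for illustration only (not measured data).
example_segments = [
    (0.25, 0.20),  # segment covered by the keynote figure: 20% voice
    (0.35, 0.00),  # desktop: assumed to have no voice queries
    (0.40, 0.00),  # mobile browsers on Android and iOS: assumed no voice
]

overall = blended_voice_share(example_segments)
print(f"{overall:.1%}")
```

Under these made-up weights, the 20% figure shrinks to about 5% of all queries. Plug in your own assumptions; the point is only that a share measured inside one segment says little about the total.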
Even so, it still seems like it might be a significant number, right? But now let’s dive into the makeup of those queries. We’ll start by looking at Bryson Meunier’s analysis of 3,000 voice queries and Greg Sterling’s report on a consumer survey performed by NPR and Edison Research. First, let’s have a look at data from the usage of a Google Home by Bryson’s family.
Between playing music, telling Google Home to stop, setting timers, home control and adjusting the volume, we have 72% of all the queries. There is not much trace of anything actionable for the great majority of businesses with web sites (unless you run a music service).
Bryson breaks this down further by analyzing the overall query intents. He uses the intents as they are defined in the Google Search Quality Raters Guidelines (the full definitions start at page 71). A brief outline of their definitions is as follows:
- Know – the user is seeking information on a topic
- Know Simple – a special case of a Know query where all the user wants is a simple fact
- Do – queries that indicate that the user wants to do something
- Do Device Action – a special case of Do queries, where the device the user is using is able to complete the action on their behalf (such as Play Music queries)
- Website – locate a specific website or web page
- Visit in Person – when the user wants to go somewhere specifically
Looking at this one family’s data set, we don’t see a ton of opportunity for most organizations to gather in lots of search traffic. The basic reason is that a lot of the usage is for new types of use cases that Google can satisfy without needing third party assistance (i.e. your web site).
One of the interesting data points was the dayparting graphic created by NPR that shows how usage of smart speakers varies throughout the day.
Looking at these queries, we see that most of them do not look like traditional search queries either. Further, this is what Greg Sterling had to say about this data:
It’s a broad and diversified mix, though actions/skills discovery remains a problem on both Google Home and Alexa devices. People are not entirely sure about all the things smart speakers can do, and there’s no great discovery mechanism right now.
For one last source, let’s consider data from a voice usage report from PWC.
Cumulatively, what these three different sources confirm for us is that the voice revolution hasn’t fully taken over yet. We’re seeing a significant number of things being done by voice, but a lot of them are NEW or fall into highly repetitive actions, such as playing music, getting directions, setting timers and the like. So they add a new dimension to total search volume, but not much opportunity for most businesses.
The key obstacles to a broader level of adoption, including shifting more traditional search queries to voice, are:
- While the reported accuracy of speech recognition is the same as it is for humans (around 95%), the types of errors that happen with voice queries are quite different and often very frustrating.
- Speech recognition is not the only issue and in fact the more important one is natural language processing. The major personal assistants still have a LOT of work to do here.
- Users still do not fully understand the wide range of capabilities that personal assistants offer, and as Greg Sterling noted, discovering new capabilities is not easily done.
All three of these areas need to improve for voice to realize its full potential. Those improvements are coming, and we will get there, but it will take some time.
But brands need to work on developing their understanding of voice apps and interactions, and how to build apps to meet the related user needs. This is something that I’d start working on today. Not to compete on Google today, but to be ready to compete on Google tomorrow.
2. How do you think about using speakable schema for the voice contents and its possible results?
I’m glad that this question was asked because there has been a big change recently. It used to be that speakable markup applied only to news sites. In fact, if you go to the page on developers.google.com on speakable markup, you will still see the following:
To be eligible to appear in news results, your site must be a valid news site. Make sure you submit your news site to Google either through the Publisher Center or setting up a valid edition in Google News Publisher Center.
However, as of Dec. 12, Google has let us know that it will look for speakable markup on other types of sites as well, so go ahead and implement it! Of course, this does not guarantee that you will be used in voice search results, but it likely increases the chances that you will.
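For those who have not seen it, speakable markup is structured data that points at the parts of a page best suited to being read aloud. Below is a minimal sketch that generates it as JSON-LD using the schema.org `speakable` property with a `SpeakableSpecification` and `cssSelector`. The page name, URL, CSS selectors and helper function names are hypothetical placeholders, so check Google's current documentation before relying on any of it.

```python
import json


def speakable_jsonld(name, url, css_selectors):
    """Build a schema.org WebPage object with a speakable section."""
    return {
        "@context": "https://schema.org/",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors for the page elements that read well aloud.
            "cssSelector": list(css_selectors),
        },
    }


def as_script_tag(data):
    """Wrap JSON-LD in the script tag that goes in the page's HTML."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical page and selectors, for illustration only.
markup = speakable_jsonld(
    name="Example FAQ: store hours",
    url="https://www.example.com/faq/store-hours",
    css_selectors=[".faq-question", ".faq-answer"],
)
print(as_script_tag(markup))
```

The printed script tag would go in the HTML of the page it describes. Point the selectors at short, self-contained passages, since that is the content an assistant would read back to the user.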
3. What is the strategy for voice for verticals like the weather, where the answer gives no credit to the source or drives brand awareness?
That’s a tough one. If your goal is to earn a featured snippet (and therefore presence in voice results) for a query like “Falmouth Weather,” it’s going to be very tough for you unless you are the publisher of weather.com, accuweather.com or wunderground.com. Also, Google does actually provide attribution when a screen is available:
However, as the question suggests, Google does not provide attribution when delivering the results through voice. I discussed this with Barry Schwartz, and we both believe that Google has a deal with Weather.com for the weather results, so the way this behaves is likely the way it was specified in that deal.
So to be clear, for the verticals that I know of where no attribution is provided, I believe that this is the direct result of a negotiation between the source of the content and Google. These scenarios should not impact your overall voice strategy.
If you are in a position to negotiate a specific deal with Google, you should consider how you structure the deal to get the best result for yourself. However, if you’re not in a position to negotiate such a deal, there is probably not much you can do to rank for voice search on these results, because someone else (like weather.com) will.
4. How hard was the implementation of Google Assistant/Alexa tracking in Google Analytics?
It does require a certain amount of programming skill to set up, since it involves integration via API calls. A reasonably skilled programmer can probably work out how to do this with a few weeks of programming effort (including testing and debugging).
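To give a feel for what "integration via API calls" can look like, here is a minimal sketch that builds one event hit in the format of the Universal Analytics Measurement Protocol. That choice of API is an assumption for illustration, not a description of our actual setup, and Universal Analytics has since been retired, so treat this as a pattern rather than a recipe. The event category, intent name and function name are hypothetical.

```python
from urllib.parse import urlencode
import uuid

# Universal Analytics Measurement Protocol endpoint (assumed here; the
# article does not say which API was used).
GA_COLLECT_URL = "https://www.google-analytics.com/collect"


def build_voice_event(tracking_id, client_id, intent_name, assistant):
    """Return the form-encoded body for one Measurement Protocol event hit."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID, e.g. UA-XXXXX-Y
        "cid": client_id,    # anonymous client ID
        "t": "event",        # hit type
        "ec": "voice_app",   # event category (hypothetical naming)
        "ea": intent_name,   # event action: the intent that was invoked
        "el": assistant,     # event label: which assistant handled it
    }
    return urlencode(params)


# Placeholder values, for illustration only.
body = build_voice_event(
    tracking_id="UA-XXXXX-Y",
    client_id=str(uuid.uuid4()),
    intent_name="store_hours",
    assistant="google_assistant",
)
print(body)
```

The voice app's backend would POST this body to `GA_COLLECT_URL` each time an intent is handled. The sketch stops short of sending anything, and most of the real effort goes into deciding what to log and into testing it across both assistants.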
5. When should you use your own voice app, and when should you use existing tools (like Google My Business) to get your information to users?
I’d phrase the question a bit differently. The real question is when you should do both. If you have physical locations for your business and you want to bring foot traffic, you should be in Google My Business. It’s that simple. Further, if there are opportunities on those same queries to be featured in voice related results, which you can now enhance your chances of doing by earning featured snippets or using speakable markup, you should do that too.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.
About The Author
Eric Enge is General Manager of Perficient Digital, a full-service, award-winning digital agency. Previously Eric was the founder and CEO of Stone Temple, also an award-winning digital marketing agency, which was acquired by Perficient in July 2018. He is the lead co-author of The Art of SEO, a 900 page book that’s known in the industry as “the bible of SEO.” In 2016, Enge was awarded Search Engine Land’s Landy Award for Search Marketer of the Year, and US Search Awards Search Personality of the Year. He is a prolific writer, researcher, teacher and a sought-after keynote speaker and panelist at major industry conferences.