Be sure to see my Watson Analytics First Look for a high level take on the tool and check out the upcoming Cognos Vs Watson series to understand how Natural Language Query compares to similar tools and features in Cognos.
In part one of my Watson Analytics review we’re going to take a deeper look at the Natural Language Query capabilities embedded in the tool. IBM has been hyping Watson Analytics as the next generation, quantum leap, pick-your-superlative Big Advancement to put the BI startups back in their place and reclaim its spot as king atop Mt. Gartner. Read on to see if Big Blue has a shot at the crown.
From Ignorance to Insight, Faster
IBM’s approach with Watson Analytics is to minimize the effort necessary to turn raw data into insight and to experience those insights visually rather than as a chart or crosstab. The hope is to leapfrog current BI darling Tableau with a graphically modern UI that intelligently analyzes data, identifies interesting relationships and allows users to explore them using natural language queries and visualizations. I illustrate how this works in the gif below:
- First I click on an uploaded data set, in this case offensive statistics from the 2014 NFL regular season.
- Watson presents twelve visualizations based on relationships it identified in the data set. No human intelligence was involved in finding or visualizing these relationships!
- I choose “What is the breakdown of Receiving TD by Location and Position” and boom! The Explore tool renders a pretty tree chart showing exactly that.
Two clicks. Two clicks! That’s what it took to go from a freshly uploaded CSV of raw data to visual output. In Cognos Workspace Advanced that is probably one hundred twenty clicks and in Tableau, maybe sixteen? Working with Watson Analytics isn’t always that simple but the fact that it can be is quite promising.
Ask And Ye Shall Receive (A Chart)
So what if Watson, in all his wisdom, fails to anticipate your question with zero input from you? Because Watson (kind of) understands English, there are two options: type a free-form question in the text box, or click “How To Ask A Question” and use the just-launched query builder to guide you to a phrase Watson can understand. Take a look at the options below:
- This is the data set I want to query. A single click will open the “Starting Points” window you see above.
- These are interesting relationships Watson Analytics automatically identified. Clicking on any of these will launch the Explore tool with the visualization described.
- This text box is where you ask Watson questions in plain English. When you submit a query the suggested visualizations below will update to reflect Watson’s best guess at the answer.
- Here you can access the “Ask a question about your data” screen, which provides a guided Natural Language Query building experience.
- Clicking “Create your own” will let you jump directly into the Explore, Predict and Assemble tools using this data set.
Ask Watson Analytics Yourself
Let’s take a look at the free form question option. Watson is best at interpreting language that contains column titles and specific data values from your data set in combination with a keyword telling him what sort of relationship you’re looking for – Compare, Trend, Average, Maximum, Count, Correlation, etc…
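To make that concrete, here is a minimal sketch of the simplest possible version of this idea: look for known keywords and literal column titles inside the question. This is my own illustration, not how Watson Analytics is actually implemented; the `interpret` function and the column list are hypothetical, and only the keyword list comes from the documentation quoted above.

```python
import re

# Keywords from the list above. The matching logic is a guess for
# illustration only, not Watson Analytics' actual parser.
KEYWORDS = ["compare", "trend", "average", "maximum", "count", "correlation"]


def interpret(question, columns):
    """Return the keywords and column titles found literally in the question."""
    text = question.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    found_keywords = [k for k in KEYWORDS if k in words]
    # A column counts as referenced only if its full title appears in the text.
    found_columns = [c for c in columns if c.lower() in text]
    return {"keywords": found_keywords, "columns": found_columns}


# Hypothetical column titles for an NFL offensive statistics file.
columns = ["Player Name", "Position", "Passing TD", "Passing Yards"]
result = interpret("Compare Passing TD by Player Name", columns)
print(result)
```

A literal matcher like this finds nothing in a question that paraphrases the column titles, so whatever Watson does under the hood is evidently more forgiving than this sketch.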
Here I type “Most Touchdowns By Player” into the text box. This is not precisely the kind of language outlined above but it is close to how actual human beings speak. None of these words are exact matches to what Watson expects – there are Total Touchdown, Passing Touchdown and Rushing Touchdown columns in the data but nothing called “Touchdowns”, and “Most” is not one of Watson’s listed keywords. Let’s see the results:
- Clicking on my data set brings up Watson’s suggested visualizations.
- Liking none of them, I type in my desired query, “Most Touchdowns By Player”
- “How do the values of Total Touchdowns compare by Player Name” seems to capture the spirit of it.
- The Explore tool opens with the visualization Watson Analytics feels best represents the answer to our question.
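For readers who want to see what that question amounts to as a computation, the sketch below totals a measure per player and sorts the result so the highest value comes first. The rows are made up for illustration (they are not real 2014 statistics), the `compare_by` helper is hypothetical, and I am assuming the file can hold more than one row per player.

```python
from collections import defaultdict

# Made-up rows for illustration only; not real 2014 NFL statistics.
rows = [
    {"Player Name": "Player A", "Total Touchdowns": 3},
    {"Player Name": "Player B", "Total Touchdowns": 5},
    {"Player Name": "Player A", "Total Touchdowns": 4},
    {"Player Name": "Player C", "Total Touchdowns": 1},
]


def compare_by(rows, measure, dimension):
    """Sum a measure for each value of a dimension, largest total first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = compare_by(rows, "Total Touchdowns", "Player Name")
for player, touchdowns in ranking:
    print(player, touchdowns)
```

The point of the comparison is that Watson gets you to the same ranking from a four-word phrase, without you deciding on the aggregation or the sort order.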
Or Get A Little Help
The example above worked great, but believe me there are plenty of times Watson will have no idea what you are trying to ask him. IBM just launched a feature that allows you to construct your query sentence using column titles from your data set and the keywords Watson is designed to recognize. Instead of typing in the text box, click “How To Ask A Question?” to get to the screen below:
- I choose the “Breakdown” Natural Language Query starting point because I want to see which players scored the most receiving touchdowns against each team in the NFL.
- In the drop down boxes I select “Receiving TD”, “Opposition” and “Player Name”. Then I click “Ask”.
- Watson returns to the “Starting Points” screen and displays the relationships it identified based on my question. I click on the one I am most interested in.
- The Explore tool opens with the visualization I chose populated with data and ready for further analysis.
It’s important to note that no setup is required to use a feature like this. Watson figures out all of the data types when you upload a file and provides these query building starting points, no metadata modeling required. Using the dropdown lists you select from your data items and click “Ask” to see what visualizations Watson suggests.
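IBM does not document how the upload step classifies columns, but the general idea can be sketched in a few lines: treat a column as a measure if every value parses as a number, and as a dimension otherwise. The `infer_column_roles` function, the measure/dimension labels and the sample CSV below are all my own assumptions for illustration.

```python
import csv
import io


def infer_column_roles(csv_text):
    """Label a column 'measure' if all non-empty values are numeric, else 'dimension'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    roles = {name: "measure" for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value is None or value == "":
                continue  # ignore missing values
            try:
                float(value)
            except ValueError:
                roles[name] = "dimension"
    return roles


# Made-up sample file; the column titles mirror the ones used in this post.
sample = """Player Name,Position,Opposition,Receiving TD
Player A,WR,Team X,2
Player B,TE,Team Y,1
"""
roles = infer_column_roles(sample)
print(roles)
```

A real tool presumably also has to handle dates, IDs that look numeric and so on, which is exactly the modeling work you are spared here.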
How Natural Is Natural Though?
These capabilities are impressive. Watson Analytics is targeting business users and executives who don’t have time to wait for someone like me to write a complex SQL statement for them, and also don’t have time to go back to college and learn to do it themselves. Allowing them to query the data in English really is the giant leap forward IBM purports it to be…
… for the most part. The thing is “Natural Language” means “Not SQL or another coding language” more than it means “How I talk about the NFL at the bar.” Let’s explore the limitations of Watson Analytics Natural Language Query as it exists today by trying to determine the best quarterback in the NFL.
“Who is the best quarterback?”
Straightforward, conversational and exactly how you’d phrase it over some wings and beer. Let’s see how Watson Analytics responds:
Watson has no idea what we’re asking. “Best” is not a term computers will understand for a looooong time, and even then our disagreement over the criteria for determining “best” is what makes the question fun to begin with. “Quarterback” Watson could understand but only if it’s spelled out in the “Position” column. It isn’t, it’s “QB” and so this sentence is complete nonsense to Watson.
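The mismatch is easy to reproduce with a toy matcher that only recognizes values exactly as they are stored in the data. This is a sketch under my own assumptions: `match_values` is a hypothetical helper, and apart from “QB” the position codes are placeholders I made up for the example.

```python
# Hypothetical distinct values of the "Position" column; only "QB" is
# confirmed by the data set discussed in this post.
position_values = {"QB", "RB", "WR", "TE"}


def match_values(question, values):
    """Return the stored data values that appear as whole words in the question."""
    words = {w.strip("?.,!").upper() for w in question.split()}
    return sorted(v for v in values if v.upper() in words)


spelled_out = match_values("Who is the best quarterback?", position_values)
abbreviated = match_values("Who are the top ten QB?", position_values)
print(spelled_out, abbreviated)
```

With literal matching, the spelled-out word finds nothing while the stored abbreviation is picked up immediately.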
“Who are the top ten QB?”
Here we’re providing Watson Analytics with the correct reference to the position (“QB” instead of “Quarterback”) and providing it with a specific set we’re concerned with, the top ten. Think we’ll have more success?
Well… sort of. Because we correctly used “QB” Watson Analytics has determined that we are talking about Position, and our use of “Top 10” is leading it to show breakdowns, counts and comparisons. It appears to be connecting “who” to “Player Name”. However we haven’t provided it a measure so it is comparing Player Name and Position.
That it is able to make these connections is impressive but we are no closer to identifying the best quarterback in the NFL. Let’s try again, this time with some more measurable criteria.
“What is the breakdown of player name by passing td and passing yards?”
Now we’re speaking Watson Analytics! Check out these results:
Here we used the keyword “Breakdown” and directly referenced the columns we are interested in comparing, Passing TD and Passing yards. By giving Watson Analytics the measures we want to see rather than a fuzzy human term like “best” we get the results we are seeking. Let’s choose “What is the relationship between Passing TD and Passing Yards by Player Name?” and see the results as rendered in the Explore tool:
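If you want a number to go with the picture, one common way to quantify the relationship between two measures is the Pearson correlation coefficient. I don’t know which statistic Watson Analytics uses internally, so treat this as a rough stand-in; the `pearson` helper is hypothetical and the figures are made up, not real player totals.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up season totals for four hypothetical players.
passing_td = [10, 20, 30, 40]
passing_yards = [1500, 2600, 3400, 4500]
r = pearson(passing_td, passing_yards)
print(round(r, 3))
```

A value close to 1 means players with more passing yards also tend to have more passing touchdowns, which is the pattern you would expect a chart of the two measures to show.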
Explore Your Options From Here
You may be wondering if Watson Analytics forces you to submit query after query until you get the phrasing just right to render the visualization you seek. The answer is thankfully, “No.” Once you get into the Explore tool pictured above you can tweak or completely change the visualization much like you would with Tableau. The Natural Language Query is the starting point of a conversation with your data that provides a first visualization from which to work. I will give more details of the Explore tool in an upcoming post.
Verdict: Not Perfect, But Perfect For The Right Users
As you can see, the Natural Language Query feature of Watson Analytics is a powerful tool for quickly identifying interesting relationships and generating modern looking visualizations without writing code or wrangling with chart axes and dimensional aggregates. BI professionals and advanced business users may wish they could skip it entirely and jump straight into the X – Y graph making with which they have expertise, but these people are not the target audience of Watson Analytics.
The relatively conversational nature of the Natural Language Queries Watson Analytics can interpret and the speed at which it presents and visualizes multiple possible relationships derived from a query make this feature ideal for line-of-business analysts and executives who want a low-cost, self-service experience they can easily understand. That it requires no input from IT is either a blessing or a curse depending on your perspective, but IBM is expected to unveil integration between Watson Analytics and Cognos Business Intelligence in the coming months.
There’s Much More To Watson Analytics
Natural Language Query is really just the gateway to Watson Analytics. There are three fully functional tools, Explore, Predict and Assemble, that I will give the full treatment in standalone reviews before wrapping up with a comprehensive overview of Watson Analytics and how it fits into IBM’s portfolio and the BI marketplace. Overall I feel this tool shows great promise and I’m excited to see what IBM does with it going forward.