I introduced you to Cognos Data Sets in Part 1 of this series, and you recognize some intriguing possibilities. You want the massively improved performance, simple presentation for end users and quick road to Cognos modernization that Data Sets offer, but you’re not sure how to start. Well, I’m here to help you understand when to use Cognos Data Sets – how to recognize each situation and how Data Sets help.
Prepare Data for Advanced Analytics
Advanced Analytics features like forecasting in Cognos often work much better with Data Sets than other data source types. A narrowly focused, in-memory source dramatically enhances the speed, interactivity, accuracy and usefulness of features like Explore or the AI Assistant. This is especially true compared to giant Framework Manager models.
Recognizing poorly prepared data
The need to prepare data is most apparent when the advanced features of Cognos Analytics fail to provide meaningful suggestions or produce garbage output. This manifests in the following ways:
- The AI Assistant cannot understand which instance of ‘customer’ you want and picks it from an incorrect namespace
- The AI Assistant makes very poor suggestions
- AI generated visualizations do not filter properly because they contain different versions of the same data item – ‘customer’ from 3 different tables
- The ‘generate dashboard’ command creates a nonsense dashboard
- The forecasting feature does not appear in line, bar or column charts
- Explore takes a very long time to load or interact with
Using Data Sets for advanced analytics
Data Sets make it easy to simplify the data used for advanced analytics. Because they are quick to make and perform very well, I use them any time I want a great experience for my end users. The goals of using Data Sets for advanced analytics are:
- Remove any duplicates in the data. Each field should occur only once
- Identify a specific subject of analysis and include only measures and fields that help understand that subject. The Explore feature helps immensely with this
- Help the AI Assistant shine by producing meaningful results
- Improve performance across the board, especially in Explore
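Outside of Cognos, the same preparation principles can be sketched in a few lines of pandas. This is a conceptual illustration only – the DataFrame and column names are hypothetical, not a Cognos API:

```python
import pandas as pd

# Hypothetical wide extract with a duplicated 'customer' field pulled
# from several tables, plus a column irrelevant to the analysis subject
raw = pd.DataFrame({
    "customer": ["Acme", "Acme", "Blue Co"],
    "customer_alt": ["Acme", "Acme", "Blue Co"],  # duplicate of 'customer'
    "region": ["East", "East", "West"],
    "revenue": [100.0, 150.0, 200.0],
    "internal_flag": [1, 0, 1],  # not needed for this subject
})

# Keep each field only once, and keep only fields relevant to the subject
subject = raw[["customer", "region", "revenue"]]

# Aggregate to the grain of the analysis: one row per customer and region
subject = subject.groupby(["customer", "region"], as_index=False)["revenue"].sum()
print(subject)
```

The result is a narrow, de-duplicated table – exactly the shape that lets Explore, forecasting and the AI Assistant produce meaningful output.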
Improve Performance of Existing Models
Most long-time Cognos customers have at least some models that perform slowly. Maybe it’s logic processing at run time. Maybe it’s the underlying database. Whatever the cause, you can’t let your end users watch a wheel spin for minutes on end whenever they make a slight change to a dashboard. Oftentimes customers solve this problem by locking Dashboards, Stories, Explore and anything else new and cool away from users. That’s a big mistake.
Recognizing poor performance in Cognos
This is fairly straightforward. You know performance is poor because Cognos is slow, right? Generally yes, but there are some situations where poor performance manifests in surprising ways:
- People call you and say ‘Cognos is slow’
- You check Thrive and it tells you ‘Cognos is slow’
- User adoption for self service features is lacking
- Schedules are frequently late or are challenging to maintain
- Source systems process dozens or hundreds of similar queries
- You just keep staring at that damn spinning wheel
Improving Performance with Data Sets
This is an area where Data Sets shine because you’ve already got a model with all sorts of embedded business logic. It’s extremely easy to generate Data Sets as needed, and they automatically inherit all that Framework Manager logic. Very little data rework results in huge performance gains. Your goals are to:
- Take advantage of in-memory processing and server RAM
- Summarize detailed data to a higher grain to decrease row counts and better target analysis
- Sort data by commonly filtered data items
- Filter out unnecessary records
- Decrease load on underlying databases
- Banish the spinning wheel forever
Imagine a query that processes for 15 minutes and runs 100 times a day. You are spending 1,500 minutes a day processing that data. By moving to a Data Set, the query runs once for 15 minutes to load the data. All subsequent executions load in about a second because the data pulls from memory, not the database. You just saved roughly 1,483 minutes of processing time. And you saved the sanity of your end users.
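The arithmetic is worth spelling out. A quick sketch – the one-second figure for an in-memory read is an estimate, not a benchmark:

```python
runs_per_day = 100
query_minutes = 15

# Before: every run hits the database
before_total = runs_per_day * query_minutes        # 1,500 minutes per day

# After: one 15-minute Data Set load, then ~1 second per in-memory read
load_minutes = query_minutes
read_minutes = runs_per_day * (1 / 60)             # ~1.7 minutes total
after_total = load_minutes + read_minutes

saved = before_total - after_total
print(f"Saved about {saved:.0f} minutes per day")
```

Even if the in-memory reads took ten seconds apiece, the savings would still be over 1,450 minutes a day.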
A real world performance example
My friend Rory Cornelius gave me the following quote about Data Sets. Rory actually did this with one of our clients. It shows how these techniques work to solve all sorts of Cognos problems.
Not your typical use case, but my client has this huge set of scheduled jobs. There was one job with 10 reports that each queried almost the same data. Each report took about 45 minutes to run, and they wanted them run sequentially to limit load. I pointed the reports to a Data Set instead, and they took 2 minutes each to run. The Data Set still takes quite a while to load, but even with that, the total time was cut by at least 5 hours with significantly less load on the database.

– Rory Cornelius, Senior Solution Architect with PMsquare
Combine Data Sources in Cognos
Throughout my career, the number one impediment to analytics delivery has been the struggle to combine data from multiple databases or applications. Data exists at different levels of detail with messy, mismatched keys and incompatible query languages. It’s just tough out there. However, Data Sets radically streamline this process, especially for data already in Cognos. They provide a form of lightweight ETL and query processing that supplements fully featured tools like Incorta or IBM ADP/Trifacta.
Recognizing data source mashup bottlenecks
Whether it’s a lack of clear requirements or an IT bottleneck for ETL, projects often wait for months or years at this stage. Faced with mounting delays, frustrated end users often choose to export data from Cognos and go it alone in Power BI. But you can learn to recognize the signs of data mashup bottlenecks:
- The data warehouse request backlog grows to many multiples of the Cognos backlog
- End users export tons of data to Excel
- Advanced metrics are challenging to build because you are missing key calculation components
- You often make model or data warehouse changes to add just a few columns or tables
Combining data with Cognos Data Sets
The process of combining data sources using Data Sets could hardly be easier, as I outlined in Part 1 of this series. Instead, the challenge lies in working through the logic of how best to combine two sources. The main things you will need to do are:
- Identify the fields required for your analysis and locate them in your data sources
- Create a data set for each source
- Aggregate data at a compatible level of detail
- Perform necessary data cleansing to make joins possible
- Add filters, calculations or other logic at the Data Set level, not in Data Modules or Reports/Dashboards
- Schedule data sets so that they build in the correct order
- Combine them by joining together in a data module
This technique allows you to dramatically simplify complex ETL tasks across large and complex databases by first boiling each source down to just the fields you need. The key is to embed as much logic as possible into the Data Set load process. This minimizes query cost at run time and makes building and maintaining your Data Modules as easy as possible.
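Conceptually, the steps above amount to a lightweight ETL flow, sketched here in pandas. The sources, key cleanup and grain are all hypothetical, and in Cognos the final join happens visually in a Data Module rather than in code:

```python
import pandas as pd

# Two hypothetical sources at different grains, with messy keys
crm = pd.DataFrame({"Cust_ID": [" A1 ", "a2"],
                    "segment": ["SMB", "Enterprise"]})
sales = pd.DataFrame({"cust_id": ["A1", "A1", "A2"],
                      "order_date": ["2024-01-05", "2024-02-09", "2024-01-20"],
                      "amount": [500.0, 250.0, 900.0]})

# Cleanse keys so a join is possible (trim whitespace, normalize case)
crm["cust_id"] = crm["Cust_ID"].str.strip().str.upper()

# Aggregate the detailed source to a compatible grain: one row per customer
sales_by_cust = sales.groupby("cust_id", as_index=False)["amount"].sum()

# Join the two prepared 'Data Sets' on the cleaned key
combined = crm[["cust_id", "segment"]].merge(sales_by_cust,
                                             on="cust_id", how="left")
print(combined)
```

Each source arrives at the join already cleaned, filtered and aggregated, which is exactly why the Data Module work stays simple.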
Simplify Presentation for Self Service
Framework Manager models typically exist for IT and accumulate years or even decades worth of developer focused design decisions. As a consequence they often require crucial yet undocumented context to generate accurate and timely queries, with a host of conditional flags, hidden filters and inscrutable calculations. Self service becomes impossible when end users don’t understand the structure or context of data. This is the number one objection I hear to rolling out Dashboards or Explore in Cognos Analytics.
Recognizing overly complex models
An overly complex model stands between you and the evolution of your BI practice like an unbridgeable chasm. Its calling card is the list of things you cannot accomplish because ‘the data is too complex.’ You know it by:
- End users cannot effectively use the model, or you have locked them out of it due to data quality concerns
- Self-service feature rollouts meet with limited success due to data complications
- Use of your data is always accompanied by caveats: ‘You have to include flags x, y and z to get meaningful results’
- Debugging data problems is extremely confusing or time consuming
- New hires to the BI team require weeks or months to get up to speed with the data
Preparing Data for Self Service
Data Sets are the bridge across this chasm. Because they are so easy to make and inherit all the logic from your Framework Manager source, you can embed and effectively hide the underlying complexity within a well designed Data Set. You will need to build multiple Data Sets from the same model to effectively simplify the presentation – this is a deliberate part of the design. Remember, the goal for Data Sets is to break a ‘one size fits none’ model into smaller, usable components. Let your Data Sets multiply!
- Break your large model into smaller, digestible subject areas based around the types of questions your users need to answer
- Build a Data Set for each subject area
- Don’t be shy about overlapping data in multiple Data Sets. The end goal is to make something easy to use for an individual subject area
- Don’t be shy about building lots of Data Sets
- Remember – the Data Set inherits your Framework Manager logic. You should have a high degree of data consistency across Data Sets as a result
- Always be willing to alter, change, abandon and create new data sets based on evolving user needs.
Your instincts from Framework Manager probably tell you to come up with a grand, cross-Data Set design to ensure consistency and eliminate duplication of fields. Don’t do this. Remember, tailor each Data Set to the needs of its user community and be willing to adapt as those needs change. This is the key to modernizing Cognos to compete with Tableau or Power BI.
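The ‘let your Data Sets multiply’ idea can be pictured as slicing one wide model into tailored, overlapping subject-area extracts. A hypothetical sketch – every column and subject-area name here is invented for illustration:

```python
import pandas as pd

# One wide, 'one size fits none' extract (hypothetical columns)
model = pd.DataFrame(columns=[
    "customer", "region", "product", "order_date",
    "revenue", "cost", "returns", "ship_days",
])

# Tailored, overlapping subject areas -- each would become its own Data Set
subject_areas = {
    "sales":      ["customer", "region", "product", "order_date", "revenue"],
    "margin":     ["product", "revenue", "cost"],  # overlaps with 'sales'
    "fulfilment": ["customer", "order_date", "ship_days", "returns"],
}

data_sets = {name: model[cols] for name, cols in subject_areas.items()}
for name, ds in data_sets.items():
    print(name, list(ds.columns))
```

Note that `revenue` appears in two subject areas. That overlap is fine: because every extract inherits the same upstream logic, the numbers stay consistent while each audience gets only the fields it needs.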
Modernize Your Cognos Practice
By following all these steps you will modernize much of your Cognos Analytics practice without intentionally doing so. A modern BI practice requires two modes of operation, often called ‘Mode 1’ and ‘Mode 2’. Mode 1 is the traditional enterprise BI way of doing things: ETL, ODS, EDW, monolithic Framework Manager models, IT-authored reports. It remains a vital component of our work. However, Mode 2 is equally important: agile data mashup, in-memory processing, collaboration with self-service users and, above all, speed.
The techniques outlined above will get you to Mode 2 rapidly, even if it seems daunting or impossible today. Because you’ve done so much great work building your Framework Manager models, you have an incredible foundation for self service – you just haven’t realized it yet. Using Data Sets in combination with Data Modules and Dashboards will give you the performance, simple data presentation and agility you need. Try it! And as always, if you need some help along the way, reach out to me and PMsquare. The answer to ‘When to use Cognos Data Sets?’ is ‘Now!’