Reliable data and where to find them

This article was contributed by Susan Wu, senior director of marketing research at PubMatic.

Data is the cornerstone of modern business, with the potential to inform well-grounded, forward-looking business decisions. But one dataset can tell many stories, and sometimes those stories do not align with reality. The forecast data released ahead of the 2021 New Jersey gubernatorial election is one example. The data predicted a big lead for the incumbent, Phil Murphy, but in the end he won by a very thin margin.

This is not the first time, and probably not the last, that a dataset has told a misleading story, which raises the question: is the data reliable?

While the answer is not always clear, data can be effective and informative when handled properly. In today’s business environment, data sources are virtually unlimited and constantly evolving. That creates unprecedented opportunities to take advantage of data, but also numerous pitfalls when data is improperly analyzed and applied. To avoid such failures, datasets need to be accurately defined, data limitations identified and reliable data established.

Dataset defined

Quantitative data, or information that can be counted or measured, clearly plays an important role in business decision making. But it should not be seen as a perfect path to success, because numerous disparate abstractions inevitably arise in the analysis and application of data. In other words, relying entirely on quantitative data to reach decisions can lead to disappointing results.

No cookie-cutter method of analyzing data has yet been discovered. However, posing problems explicitly and precisely dramatically increases the chances of resolving data-specific issues. Our team, for example, produces a quarterly industry report that looks at advertising spending by industry category. We set out to understand which advertising categories have been most affected by major events (the global health crisis, which is still ongoing, the US presidential election, the housing boom and, most recently, the economic recovery) and how the market will recover, so that we can anticipate or at least manage potential future effects.

While regression would have been excessive for such research, categorization and segmentation techniques helped us understand seasonal and discretionary spending across categories. The pandemic naturally created discrepancies, which had to be accounted for in the data. Initially, looking at year-over-year changes for a specific month in 2020 showed only that advertising spending was declining. But looking quarter over quarter, we were able to extract the leading category indicators driving the various phases of the recovery, which represented the trend lines more accurately.
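To make the difference between the two views concrete, here is a minimal sketch in Python. The category index values are invented purely for illustration and are not taken from the report; the function names are likewise hypothetical. It shows how a year-over-year comparison can stay negative throughout a downturn year while a quarter-over-quarter comparison already signals a recovery.

```python
# Hypothetical quarterly ad-spend index for a single category.
# These numbers are illustrative only; they are NOT from the report.
spend = {
    "2019Q1": 100.0, "2019Q2": 110.0, "2019Q3": 105.0, "2019Q4": 130.0,
    "2020Q1": 95.0, "2020Q2": 70.0, "2020Q3": 84.0, "2020Q4": 117.0,
}


def pct_change(current, previous):
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


def year_over_year(series):
    """Compare each quarter with the same quarter one year earlier."""
    quarters = sorted(series)  # labels like "2020Q3" sort chronologically
    return {
        q: pct_change(series[q], series[quarters[i - 4]])
        for i, q in enumerate(quarters)
        if i >= 4
    }


def quarter_over_quarter(series):
    """Compare each quarter with the quarter immediately before it."""
    quarters = sorted(series)
    return {
        q: pct_change(series[q], series[quarters[i - 1]])
        for i, q in enumerate(quarters)
        if i >= 1
    }


yoy = year_over_year(spend)
qoq = quarter_over_quarter(spend)

for quarter in ("2020Q1", "2020Q2", "2020Q3", "2020Q4"):
    print(f"{quarter}: YoY {yoy[quarter]:+.1f}%  QoQ {qoq[quarter]:+.1f}%")
```

With these made-up figures, every 2020 quarter is down year over year, yet the quarter-over-quarter change turns positive from the third quarter onward. That is the kind of early recovery signal a purely year-over-year view would hide.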

Data limitations

Data cleanliness is king, although data always comes with limitations. Consistent, high-quality, unbiased data is a source of powerful insights into trends, while compromises in any of these areas introduce bias. To address this concern, it is important to stay constantly and consciously aware of the limitations of the data (for example, by understanding how and where it was collected) and to find ways to control for them.

Trend analysis is used to predict future events based on historical behavior. In the case of our quarterly global digital advertising spending reports, the pandemic made the analysis extremely challenging because of market volatility over an extended period. To create discrete analyses at the industry level, we follow a regimented protocol for raw data: it is pulled from our systems on a regular schedule to create error-free datasets for analysis. Data is also collected from other sources, cross-checked against our own, and then reviewed to make sure there is no inadvertent bias in the data pool. Only then can we begin the analysis, and the results are much more accurate for it.
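The cross-checking step described above could look something like the following sketch. It is an assumption about how such a check might be written, not a description of our actual pipeline: the `reconcile` function, the 2% tolerance and the category totals are all hypothetical.

```python
def reconcile(primary, secondary, tolerance=0.02):
    """Flag categories where two sources disagree.

    A category is flagged when it is missing from either source, or when
    the two values differ by more than `tolerance` (relative difference).
    The 2% default is an arbitrary, illustrative threshold.
    """
    issues = {}
    for category in sorted(set(primary) | set(secondary)):
        if category not in primary or category not in secondary:
            issues[category] = "missing from one source"
            continue
        a, b = primary[category], secondary[category]
        baseline = max(abs(a), abs(b))
        if baseline and abs(a - b) / baseline > tolerance:
            issues[category] = "mismatch"
    return issues


# Hypothetical category totals from an internal system and an outside source.
internal = {"auto": 1000.0, "travel": 400.0, "retail": 250.0}
external = {"auto": 1005.0, "travel": 460.0}

issues = reconcile(internal, external)
for category, problem in issues.items():
    print(f"{category}: {problem}")
```

In this example, "auto" passes because the two figures are within the tolerance, while "travel" is flagged as a mismatch and "retail" as missing from one source. Flagged categories would be investigated before any trend analysis begins.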

Reliable data analysis

Less is more when analyzing and writing about data. Readers generally do not need every detail, and data reliability benefits significantly from clear, focused writing. Objectives rule when writing data-specific content: the purpose of an insight should be to clarify only the essential aspects of the story. With strong written analysis, the reliability of the data increases, as does the likelihood that it will be applied successfully in business.

Another important element of data reliability is continual research and learning from other research and data professionals. Innovative approaches and new data resources are surfacing at a pace never seen before. Keeping up with current trends in a constantly evolving field is a task in itself, yet failing to do so can make all data processes irrelevant and, ultimately, lead the business down the path of the dinosaurs.

Data is ubiquitous. On the one hand, data is essential for making informed business decisions in today’s global business environment. On the other hand, accurately interpreting a specific dataset for any given purpose is a formidable, constant challenge. In the end, the quality of the data analysis is just as valuable as the data itself. The more refined and sophisticated that process, the more valuable a role data can play in everyday decision making.

Susan Wu is senior director of marketing research at PubMatic.


