Is your organization 'Data Driven'?

Does your organization collect data that looks like this?

[Removed and replace barring in the machine took a while I followed the spec but found a loys of issues needed to order part 12345 in place of 12456 due to quicker instalation]

Does this scenario sound familiar?

You are a manufacturer of widgets with 100 machines in plants across the country. Your organization has many unscheduled downtimes for no obvious reason, and these events are costing your organization time, money, and your good reputation with your customers. Your gut is telling you there is a problem with your machines. To validate your hunch, you pull 5 years of sensor data (structured data) from the systems that monitor your machines, and you correlate it with your technicians’ periodic maintenance (PM) data (unstructured data). You give the project to your business analyst, who is trained in pattern matching algorithms. Using some simple Excel tools, such as regression analysis, your business analyst determines that the downtime is due to the technicians’ PM schedule. But you need more information.

It can take a significant investment to become data driven. Building a repository of more than 5 years of data requires significant infrastructure and the ability to orchestrate a well-coordinated capture of data at differing intervals. Decisions about when to increase or decrease the polling intervals of sensors or logs affect the size of the IT infrastructure required to support the resulting volume of data. The downside of less frequent polling is the risk of reducing the insights the data can produce. Other factors include the ease of access to structured data (databases) and unstructured data (logs, technician work orders). Once you have identified the data types and sources needed, you might select one or two projects that will add value to your organization. From there you can measure the outcome of each project and validate the approaches taken.
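
To make the trade-off between polling interval and infrastructure size concrete, here is a rough back-of-the-envelope sketch. The 100 machines and 5 years come from the scenario above; the number of sensors per machine (20) and the size of one stored reading (16 bytes) are assumptions chosen only for illustration, and the estimate ignores compression, indexes, and backups.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def estimate_storage_bytes(machines, sensors_per_machine, polling_interval_s,
                           years, bytes_per_reading):
    """Rough raw-storage estimate for sensor readings (no compression or indexes)."""
    readings_per_sensor = (years * SECONDS_PER_YEAR) // polling_interval_s
    return machines * sensors_per_machine * readings_per_sensor * bytes_per_reading


# 100 machines and 5 years come from the scenario; 20 sensors per machine
# and 16 bytes per reading are assumed values for illustration only.
for interval_s in (1, 10, 60):
    total = estimate_storage_bytes(100, 20, interval_s, 5, 16)
    print(f"{interval_s:>3}s polling: {total / 1e9:,.1f} GB")
```

Under these assumptions, raw storage grows in direct proportion to polling frequency, which is why the choice of interval deserves attention before the hardware is sized.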

In the above scenario we show examples of structured data from a collection of data called a Data Historian (a system that records and retrieves production and process data by time), and unstructured data from a technician work order system. Structured data is typically stored in relational form, which is easy for a computer to read as input. Unstructured data, on the other hand, is non-relational and therefore much harder to read as input. This is because a technician entering a maintenance note typically writes it for their own reference, in their own words and abbreviations. Unstructured data, in this case, is what the technician types into a free-form field.
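
As a small illustration of turning a free-form note into something a computer can use, the sketch below pulls a part substitution out of the sample note quoted at the top of this article. The pattern "part <number> in place of <number>" and the field names `ordered` and `specified` are assumptions made for this example; real technician notes vary far more and would need a more tolerant approach.

```python
import re

# Free-form note quoted earlier in the article, typos included.
NOTE = ("Removed and replace barring in the machine took a while I followed "
        "the spec but found a loys of issues needed to order part 12345 in "
        "place of 12456 due to quicker instalation")

# Assumed phrasing: "part <number> in place of <number>".
SUBSTITUTION = re.compile(r"part\s+(\d+)\s+in\s+place\s+of\s+(\d+)", re.IGNORECASE)


def extract_substitution(note):
    """Return the ordered and specified part numbers, or None if no match."""
    match = SUBSTITUTION.search(note)
    if match is None:
        return None
    return {"ordered": match.group(1), "specified": match.group(2)}


print(extract_substitution(NOTE))
```

Once a note is reduced to fields like these, it can sit in the same table as the sensor readings for the machine and date in question.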

At this point, a data scientist could investigate further and discover a pattern that correlates downtimes with the PM, and determine exactly how often this is happening as well as its impact on the business. Prior to this analysis, the pattern occurred too infrequently to notice or to identify on a hunch. The consequences of such obscure patterns are very costly. This scenario of a manufacturer trying to understand the cause of pesky downtimes is a real-world example of how businesses rely on hunches to solve problems rather than adopting a culture of data.
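
One simple way to quantify how often downtimes follow a PM is to count the share of downtimes that begin within a fixed window after a PM on the same machine. The sketch below does this with made-up records; the 48-hour window, the machine ids, and the timestamps are all hypothetical, and a high share would only suggest a relationship worth investigating, not prove a cause.

```python
from datetime import datetime, timedelta


def share_of_downtimes_after_pm(pm_events, downtimes, window_hours=48):
    """Fraction of downtimes starting within window_hours after a PM on the same machine."""
    if not downtimes:
        return 0.0
    window = timedelta(hours=window_hours)
    hits = 0
    for machine, down_at in downtimes:
        if any(m == machine and timedelta(0) <= down_at - pm_at <= window
               for m, pm_at in pm_events):
            hits += 1
    return hits / len(downtimes)


# Hypothetical records: (machine id, timestamp)
pm_events = [("M1", datetime(2020, 1, 6, 8)), ("M2", datetime(2020, 1, 7, 8))]
downtimes = [
    ("M1", datetime(2020, 1, 6, 20)),   # 12 hours after the PM on M1
    ("M2", datetime(2020, 1, 20, 9)),   # about two weeks after the PM on M2
    ("M1", datetime(2020, 1, 7, 10)),   # 26 hours after the PM on M1
    ("M3", datetime(2020, 1, 6, 9)),    # no PM recorded for M3
]
print(share_of_downtimes_after_pm(pm_events, downtimes))
```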

What lessons did this organization learn from its data?

  • Which variables are important to discovering the cause of the machines’ downtimes.
  • The more data you have, the clearer the outcomes.
  • Which tools are required to view all of the structured and unstructured data.
  • Sensor data must be accessible at a proper polling frequency, with the ability to increase a sensor’s polling based on an established set-point.
  • Your gut was wrong; therefore, you need a data-driven analysis.
  • You need a resource that is trained in the use of methodical tools to properly interpret the data.
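
The idea of increasing polling once a sensor reaches a set-point can be sketched in a few lines. The function name, the 60-second and 5-second intervals, and the 80-degree set-point below are all hypothetical values picked for illustration, not recommendations.

```python
def polling_interval_s(reading, set_point, normal_s=60, fast_s=5):
    """Poll faster once a reading reaches the set-point; otherwise poll normally."""
    return fast_s if reading >= set_point else normal_s


# Hypothetical bearing temperatures in degrees Celsius with a set-point of 80.
for temperature in (65.0, 79.9, 80.0, 92.5):
    print(temperature, polling_interval_s(temperature, set_point=80.0))
```

Polling faster only when a reading crosses the set-point keeps data volume low during normal operation while still capturing detail around the events that matter.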

The following are some of the considerations your organization may need to address before adopting a data-driven culture:

  • Your organization needs to store 5 to 7 or more years of sensor data, which may not always be reliable.
  • Do you have the right number of sensors monitoring your machines, and are they polling at proper intervals?
  • Does your manufacturer provide data baselines and patterns to match?
  • Do you have the ability to correlate the structured and unstructured data to use as a common data set?
  • What are the required methods of capturing unstructured/structured data and archiving the data?
  • Determining which variables are in the common data set will help identify a pattern match.
  • How can you eliminate ‘noise’ within the dataset?
  • Do you know the required algorithms which can be applied given the data you have?
  • Determine whether you need a resource or consultant who can provide the tools and insights needed to make recommendations from the data outcomes.

So where should you start?  Start with a systems approach to your business.  Understand your workflows, inputs and outputs.  Analyze your workflow and data to understand which algorithms can be applied and what kind of value they can produce. Start off with the most important opportunity, and build up from there. This effort will take some time to establish, and a significant investment within your organization will be required.  Your organization may need to make radical changes to become truly data driven.

Author: Sam Marrazzo
With over 25 years of IT experience, with a focus on Manufacturing and Supply Chain.
