Is your organization 'Data Driven'?
Does your organization collect data that looks like this?
[Removed and replace barring in the machine took a while I followed the spec but found a loys of issues needed to order part 12345 in place of 12456 due to quicker instalation]
Does this scenario sound familiar?
You are a manufacturer of widgets with 100 machines in plants across the country. Your organization has many unscheduled downtimes for no obvious reason, and these events are costing your organization time, money, and your good reputation with your customers. Your gut is telling you there is a problem with your machines. To validate your hunch, you pull 5 years of sensor data (structured data) from the systems that monitor your machines, and you correlate it with your technicians' periodic maintenance (PM) records (unstructured data). You give the project to your business analyst, who is trained in pattern-matching algorithms. Using some simple Excel tools, such as regression analysis, the analyst determines that the downtime is due to the technicians' PM schedule. But you need more information.
Becoming data driven can take a significant investment. Building a repository of more than 5 years of data requires significant infrastructure and the ability to coordinate the capture of data at differing intervals. Decisions about when to increase or decrease the polling intervals of sensors or logs affect the size of the IT infrastructure needed to support the resulting volume of data. The downside of less frequent polling is the risk of missing short-lived events, which reduces the quality of the outcomes the data can produce. Other factors include the ease of access to structured data (databases) and unstructured data (logs, technician work orders). Once you have identified the data types and sources needed, you might select one or two projects that will add value to your organization. From there you can measure the outcome of each project and validate the approaches taken.
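The trade-off between polling interval and infrastructure size can be put into rough numbers. The sketch below is a back-of-the-envelope estimate only: the sensor count, the 16 bytes per sample, and the function names are assumptions made for illustration, and real historians add compression, indexes, and metadata that change the totals.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years for simplicity


def estimate_samples(num_sensors, polling_interval_s, years):
    """Number of readings collected if every sensor is polled at a fixed interval."""
    if polling_interval_s <= 0:
        raise ValueError("polling_interval_s must be positive")
    return num_sensors * years * SECONDS_PER_YEAR // polling_interval_s


def estimate_storage_gb(num_sensors, polling_interval_s, years, bytes_per_sample=16):
    """Raw storage in gigabytes; bytes_per_sample is an assumed figure."""
    samples = estimate_samples(num_sensors, polling_interval_s, years)
    return samples * bytes_per_sample / 1e9


if __name__ == "__main__":
    # Hypothetical plant: 100 machines with 10 sensors each, kept for 5 years.
    for interval_s in (60, 10, 1):
        size_gb = estimate_storage_gb(1000, interval_s, 5)
        print(f"polling every {interval_s:>2}s -> about {size_gb:,.1f} GB")
```

Under these assumptions, storage grows in direct proportion to polling frequency, which is why the polling decision belongs in the infrastructure plan from the start.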
The scenario above draws on two kinds of data: structured data from a Data Historian (a system that records and retrieves production and process data by time), and unstructured data from a technician work order system. Structured data is typically stored in a relational form, which is easy for a computer to read as input. Unstructured data, on the other hand, is non-relational and therefore much harder to read as input. In this case, the unstructured data is what the technician types into a free-form field. The difficulty comes from how technicians enter maintenance notes: typically for their own information, in their own words and abbreviations.
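To make the difficulty concrete, here is a minimal sketch of pulling a few structured fields out of the sample technician note shown at the top of this article. The five-digit part number pattern, the spelling-fix table, and the list of component terms are all assumptions for illustration; a real work order system would need far larger vocabularies and probably proper text-mining tools.

```python
import re

# Assumed pattern: part numbers are standalone five-digit numbers.
PART_NUMBER = re.compile(r"\b\d{5}\b")

# Hypothetical lookup tables; a real deployment would maintain much larger ones.
SPELLING_FIXES = {"barring": "bearing", "instalation": "installation"}
COMPONENT_TERMS = {"bearing", "belt", "motor"}


def parse_note(note):
    """Extract a few structured fields from a free-form maintenance note."""
    lowered = note.lower()
    words = re.findall(r"[a-z]+", lowered)
    normalized = [SPELLING_FIXES.get(word, word) for word in words]
    return {
        "part_numbers": PART_NUMBER.findall(note),
        "components": sorted(COMPONENT_TERMS.intersection(normalized)),
        "part_substituted": "in place of" in lowered,
    }


if __name__ == "__main__":
    note = (
        "Removed and replace barring in the machine took a while I followed "
        "the spec but found a loys of issues needed to order part 12345 in "
        "place of 12456 due to quicker instalation"
    )
    print(parse_note(note))
```

Even this small example needs a hand-built table just to recognize that "barring" means "bearing", which is the kind of effort that structured data does not require.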
At this point, a data scientist could investigate further and discover a pattern that correlates the downtimes with the PM, and determine exactly how often it occurs and what it costs the business. Before this analysis, the pattern occurred too infrequently to notice or to pin down on a hunch. Such obscure patterns can be very costly. This scenario of a manufacturer trying to understand the cause of pesky downtimes is a real-world example of how businesses rely on hunches to solve problems rather than adopting a culture of data.
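One simple way to look for such a pattern is to count how many downtimes begin shortly after a maintenance visit on the same machine. The sketch below is illustrative only: the machine IDs, timestamps, and two-day window are made-up assumptions, and a co-occurrence count like this is a starting point for investigation, not a statistical test or proof of cause.

```python
from datetime import datetime, timedelta


def downtimes_following_pm(downtimes, pm_events, window=timedelta(days=2)):
    """Return downtimes that start within `window` after a PM on the same machine.

    Both inputs are lists of (machine_id, datetime) pairs.
    """
    pm_by_machine = {}
    for machine_id, pm_at in pm_events:
        pm_by_machine.setdefault(machine_id, []).append(pm_at)

    flagged = []
    for machine_id, down_at in downtimes:
        for pm_at in pm_by_machine.get(machine_id, []):
            if timedelta(0) <= down_at - pm_at <= window:
                flagged.append((machine_id, down_at))
                break  # one matching PM is enough to flag this downtime
    return flagged


# Hypothetical records, invented for this example.
pm_events = [
    ("M1", datetime(2023, 1, 10, 8, 0)),
    ("M2", datetime(2023, 1, 12, 8, 0)),
]
downtimes = [
    ("M1", datetime(2023, 1, 10, 20, 0)),  # 12 hours after PM on M1
    ("M1", datetime(2023, 2, 1, 9, 0)),    # weeks after any PM
    ("M2", datetime(2023, 1, 13, 9, 0)),   # 25 hours after PM on M2
    ("M3", datetime(2023, 1, 11, 9, 0)),   # machine with no PM on record
]

flagged = downtimes_following_pm(downtimes, pm_events)
share = len(flagged) / len(downtimes)
print(f"{len(flagged)} of {len(downtimes)} downtimes followed a PM ({share:.0%})")
```

Comparing this share against how often a downtime would fall inside such a window by chance is the kind of follow-up a data scientist would perform.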
What lessons did this organization learn from its data?
- What variables are important to the discovery of the machines’ downtimes.
- The more data you have, the clearer the outcomes.
- Which tools are required to view all of the structured and unstructured data.
- Sensor data must be accessible at a proper polling frequency, with the ability to increase the polling of a sensor based on an established set-point.
- Your gut was wrong, so you need a data-driven analysis.
- You need a resource that is trained in the use of methodical tools to properly interpret the data.
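The set-point idea in the list above can be sketched in a few lines: poll slowly while a reading is normal, and switch to faster polling once it reaches the set-point. The function name, the 60-second and 5-second intervals, and the temperature values are assumptions chosen for illustration, not settings from any particular sensor platform.

```python
def next_polling_interval(reading, set_point, normal_interval_s=60, fast_interval_s=5):
    """Return the number of seconds to wait before polling the sensor again.

    Polls faster once the reading reaches or exceeds the set-point.
    """
    if normal_interval_s <= 0 or fast_interval_s <= 0:
        raise ValueError("polling intervals must be positive")
    return fast_interval_s if reading >= set_point else normal_interval_s


if __name__ == "__main__":
    # Hypothetical bearing temperatures in degrees Celsius, with a set-point of 80.
    for temperature in (65.0, 79.9, 80.0, 92.5):
        wait_s = next_polling_interval(temperature, set_point=80.0)
        print(f"{temperature:5.1f} C -> poll again in {wait_s}s")
```

This keeps data volume low during normal operation while still capturing detail around the moments that matter.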
The following are some of the considerations your organization may need to address before it can build a data-driven culture:
- Your organization needs to store 5 to 7 or more years of sensor data, which may not always be reliable.
- Do you have the right number of sensors monitoring your machines, and are they polling at proper intervals?
- Does your manufacturer provide data baselines and patterns to match?
- Do you have the ability to correlate the structured and unstructured data to use as a common data set?
- What are the required methods of capturing unstructured/structured data and archiving the data?
- Determining which variables are in the common data set will help identify a pattern match.
- How can you eliminate ‘noise’ within the dataset?
- Do you know which algorithms can be applied given the data you have?
- Determine whether you need a resource or consultant who can help provide the tools and insights needed to make recommendations from the data outcomes.
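On the question of eliminating noise, one common and simple technique is a rolling median, which replaces each reading with the median of its neighbors so that isolated spikes are suppressed. This is only one option among many, and the three-sample window and the sample readings below are assumptions for illustration.

```python
from statistics import median


def rolling_median(values, window=3):
    """Smooth a series by taking the median of a centered window around each point.

    Windows are truncated at the edges of the series.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


if __name__ == "__main__":
    # Hypothetical temperature readings with one obvious sensor glitch (500).
    readings = [70, 71, 500, 72, 71]
    print(rolling_median(readings))
```

A wider window removes longer glitches but also blurs genuine short events, so the window size should be chosen with the polling interval in mind.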
So where should you start? Start with a systems approach to your business. Understand your workflows, inputs and outputs. Analyze your workflow and data to understand which algorithms can be applied and what kind of value they can produce. Start off with the most important opportunity, and build up from there. This effort will take some time to establish, and a significant investment within your organization will be required. Your organization may need to make radical changes to become truly data driven.
Author: Sam Marrazzo
Over 25 years of IT experience with a focus on Manufacturing and Supply Chain.