ReliabilitySPOT is on temporary hiatus through July due to bereavement. As always, thank you for your readership but also your well wishes and understanding during this time. -Andy

CBM2010 – Condition Monitoring Summit June 8-11

I hope to see you down in beautiful Ft. Myers, Florida at the CBM2010 Condition Monitoring Summit, June 8-11.  The fine folks at Reliabilityweb.com are hosting this learning event at their brand new facility, the Reliability Performance Institute.

Reliabilityweb.com publisher Terrence O’Hanlon has informed me that this event is specifically geared toward entry-level maintenance and reliability managers and technicians interested in establishing a Condition Monitoring program at their site.  Presentations centered around the Predictive Maintenance Technologies will serve to clarify how these tools are utilized to effectively assess machinery health.  Town Hall style meetings each day will give each attendee the opportunity to share their experiences and/or receive answers to their most important CBM program implementation/execution questions.

Come join me as I look forward to meeting many of my ReliabilitySPOT subscribers in person at this event!  The Early Bird Rate with FREE Hotel ends today, so don’t delay!

Apollo 13: a Reliability Perspective – Part II

Have you ever wondered why movies were never made about Apollo 12 or 14?

In Part I, I focused upon the events that led up to the Apollo 13 “successful failure”.  In Part II, let’s consider how dramatic failure episodes are not only generally accepted at some industrial facilities, but even celebrated in a Culture of Failure!

As folks watch the movie Apollo 13, they generally accept and do not question the fact that a catastrophic malfunction, or failure, occurred. After all, it’s what we expect from Hollywood.  It wouldn’t have been much of a show if the mission had gone as planned, now would it?  This is not unlike how many folks react when a critical system or component reaches functional failure in a manufacturing facility.

An event happens that “no one saw coming”, it shuts the place down and gets everyone’s attention.  Life is no longer boring!  “When will it come back up?  What can we do in the meantime?  Expedite those new parts and get them in here as soon as possible!  Can we mitigate?  What about work-arounds?  Can we rob Peter to pay Paul?  We need to do whatever it takes!”  (Did I hear someone say: “Failure is not an option?”)

Capable staff personnel scramble for resources.  They expedite parts or materials to have them made in-house or at a near-by job shop.  They pull together their most knowledgeable engineers and technicians. They acquire or fabricate special equipment and tools to get the job done more efficiently.  Key management personnel receive hourly updates.  The emergency response may go on through the night, or several days and nights until they eventually “save the day”.  This is generally regarded as extremely heroic as “the team really pulled together and went above and beyond the call of duty”.  As relief sets in, the heroes are celebrated, and the resulting hero-worship can be very gratifying!  The resulting attention can be a real boost to maintenance personnel morale, especially if they believe they are generally under-appreciated.  Before long, the maintenance organization begins to clearly understand its true purpose:  emergency breakdown repair.

A Culture of Failure eventually develops within the organization.  If they could hand out Oscars for failure response, they would. This culture places great value on the emergency response to functional failures and almost completely dismisses the notion that perhaps they didn’t need to happen in the first place.  Operations struggles to make schedule between breakdowns, and sometimes tries to run 24/7 to keep up.  Sequels are very popular and have proven to be very lucrative for Hollywood.  In the Culture of Failure, one doesn’t have to wait very long before the next Apollo 13 happens and the cycle repeats itself over and over.

This is so wrong.  Failures can be eliminated.  The consequences of failure can be avoided.  Technologies and processes can be applied to virtually any asset to significantly reduce the risk of failure. But for these to be put into practice, a significant culture change must take place:  The Culture of Failure must be transformed into The Culture of Success!

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next post:  Machinery Healthcare Reform.

Apollo 13: a Reliability Perspective – Part I

“A successful failure…”  This is how their infamous mission was summarized by Commander Jim Lovell.  Whether you are 27, 77,  or somewhere in-between, you are certainly familiar with the amazing story of the Apollo 13 mission.  I wanted to do my part to commemorate the 40th anniversary of this event by touching on a few points of interest.

First, as you can see by the crew photo (left), these brave men were not Tom Hanks, Kevin Bacon and Bill Paxton. Hopefully this does not come as a shock to some of you! ;).  (Although I do know some folks who believe the moon landings were hoaxes!)

Second, the diligent efforts of the support personnel at Mission Control and the Kennedy Space Center were nothing short of heroic.  This episode in human history serves as a model to all of us who occasionally find ourselves in a tight spot; it truly was crisis management at its finest, ladies and gentlemen!  In the now famous words of Gene Kranz (played by Ed Harris in the Ron Howard epic): “Failure is not an option.”

He was of course referring to the rescue effort, but why did this “successful failure” happen in the first place?  As you probably know, Oxygen Tank #2 on board the Service Module exploded.  The explosion damaged Oxygen Tank #1 and crippled the spacecraft’s ability to produce electrical power and breathing oxygen for the crew.  But like many catastrophic events, this one was caused by a series of mistakes.

Reuters Interview with Jim Lovell

It began 5 years earlier in 1965 with a breakdown in the Management of Change process.  When the system voltage was changed from 28 to 65 volts, NASA engineers ordered that all components be redesigned to operate at both voltages, but the manufacturer of the thermostat used to regulate temperature inside the oxygen tanks was not informed of the design change.  The next mistake occurred in 1968, when the tank was dropped a distance of 2 inches during handling at a contractor’s facility in California.  It was inspected, but deemed OK by technicians (manufacturing process and acceptance testing mistakes).  In fact, the tank fill/empty tube was damaged.  A symptom of this problem became evident during a pre-launch test in March of 1970 when the tank could not be emptied.  NASA technicians worked around the problem by energizing the heater in the tank to change the liquid oxygen to a gas state.  This allowed the tank to empty, but the higher voltage applied to the thermostat caused its contacts to fuse in the closed position.  With the thermostat no longer able to cut power to the heater, the temperature in the tank rose to 1000 degrees F, which damaged the stirring fan wiring insulation.  These fans were used to stir the liquid oxygen to enable a more accurate quantity reading.  On the night of April 13, 1970, the crew was instructed by Mission Control to “stir the tanks”.  The fans were energized and the damaged wires short-circuited, igniting the wiring insulation in the oxygen-rich tank and causing the tank to explode.

Make sure and subscribe to ReliabilitySPOT so that you don’t miss “the rest of the story”:  Apollo 13:  a Reliability Perspective – Part II

Cost Avoidance: Show Me the Money!

Previously in ReliabilitySPOT, I mentioned that KPIs can be used to reaffirm the value of the maintenance and reliability program to key decision-makers. In the field of Maintenance and Reliability, there are many different Key Performance Indicators that can be utilized.  A manager can trend lagging indicators designed to track business performance at a high-level, or leading indicators which target the performance of specific elements within an overall reliability strategy.

One example of such a leading indicator would be PM Completion rate.  Generally speaking, leading indicators tend to “measure the tool” rather than judge the results of the handiwork.  This type of data is of great interest to M&R managers who are charged with developing and sustaining an effective reliability program.
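For readers who like to see the arithmetic, here is a minimal sketch of how a PM Completion rate might be computed.  The function name and the work order counts are made up for illustration; your CMMS may define the metric a little differently (for example, by counting only PMs completed on time).

```python
def pm_completion_rate(completed: int, scheduled: int) -> float:
    """Return the share of scheduled PM work orders that were completed, in percent."""
    if scheduled <= 0:
        raise ValueError("scheduled must be a positive number of work orders")
    return 100.0 * completed / scheduled


# Hypothetical month: 188 of 200 scheduled PM work orders were completed.
rate = pm_completion_rate(completed=188, scheduled=200)
print(f"PM completion rate: {rate:.1f}%")
```

Note that this measures the tool, not the handiwork: a high completion rate says the PMs got done, not that they were the right PMs.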

High level business leaders are less concerned with the tools of the trade and more concerned with the results produced by them. One group of lagging indicators that is always of great importance to a reliability program is cost.  Cost can be measured in many different ways and at many different levels within an organization.  But one of the most important cost measurements that can help translate the value of an effective reliability program to a key decision maker is Cost Avoidance.  Cost Avoidance is basically a measure of the monetary costs not incurred due to the proactive restoration of a piece of equipment.
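One common way to put a number on Cost Avoidance is to estimate what the failure would have cost had the equipment run to failure, and subtract what the proactive restoration actually cost.  The sketch below assumes that simple model; the class name, the cost categories and every dollar figure are hypothetical, not data from any real facility.

```python
from dataclasses import dataclass


@dataclass
class RepairScenario:
    """Cost drivers for one repair event (all values are illustrative assumptions)."""
    downtime_hours: float
    downtime_cost_per_hour: float
    parts_cost: float
    labor_cost: float

    def total_cost(self) -> float:
        # Lost production plus the direct cost of the repair itself.
        return (self.downtime_hours * self.downtime_cost_per_hour
                + self.parts_cost + self.labor_cost)


def cost_avoidance(run_to_failure: RepairScenario, proactive: RepairScenario) -> float:
    """Estimated cost not incurred because the restoration was done proactively."""
    return run_to_failure.total_cost() - proactive.total_cost()


# Hypothetical spindle: an unplanned breakdown versus a planned changeout.
unplanned = RepairScenario(downtime_hours=16, downtime_cost_per_hour=2500,
                           parts_cost=12000, labor_cost=3000)
planned = RepairScenario(downtime_hours=4, downtime_cost_per_hour=2500,
                         parts_cost=8000, labor_cost=1500)

print(f"Cost avoidance: ${cost_avoidance(unplanned, planned):,.0f}")
```

The run-to-failure side is always an estimate, so it pays to document the assumptions behind it when the number is shown to decision makers.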

I touched upon the usage of Cost Avoidance data in my 2006 Uptime Magazine article entitled “Soul Mates – Vibration Analysis and Bearing Analysis were made for each other”.  Here is an excerpt:

Last, but not least, my favorite.  In our case study, the spindle was only two months old.  To the skeptic of predictive technology, it would be absolutely ridiculous to replace a two-month old spindle because the “vibration guys” saw a few peaks and valleys on a chart.  It wasn’t making any noise yet, and there were no part quality issues.  There were people (as there are in every organization) who were pretty adamant about letting everyone know that “this spindle doesn’t need to be changed”.  In most organizations, that’s where it ends; the spindle may get changed, and the predictive analysts are criticized for replacing “good” spindles.  Then, if one burns up, they get hammered for that!  (can anyone else relate to this?)  As your reputation is damaged, it becomes harder to justify the resources needed to maintain or grow an effective program.

In our case study, we not only widely published our vibration reports before and after the spindle replacement, but also the Root Cause Bearing Analysis report.  We didn’t just get the word out, we provided a plethora of quantitative and qualitative data which supported our assertions and claims.  Nothing defuses hearsay better than THE FACTS.  And one of my favorite facts to throw around is COST AVOIDANCE.  SHOW ME THE MONEY!!!  That’s what it’s really all about, isn’t it? (bang for the buck, the proof in the pudding) So to aid in changing the mind-set, we maintain a visual display in front of the main office that details our latest successes, and emphasizes the cost avoidance.

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next post:  Apollo 13:  a Reliability Perspective – Part I

The Man in the Arena: Part II

In Part I, I spoke of how President Theodore Roosevelt described “The Man in the Arena” and how Peyton Manning and sometimes The Reliability Engineer can become the favorite subject of critics.  These “Armchair Quarterbacks” can share some common traits, among them:

  • Selective Memory:  They only remember when you fall short, ignore successes
  • Bias:  Tend to quickly gravitate to an assessment that fits their paradigm or is self-serving

Even if one hasn’t followed the NFL during the past 10 years, it doesn’t take much statistical research to find that Peyton Manning is not only one of the best QBs to take the field in the past decade, but of all time.  Along with his stellar career numbers, one can also note that he possesses a Super Bowl ring; that elusive prize that many greats have never attained (including his father, Archie).  So statistics may not be everything, but they can go a long way to shed light on the distorted claims of “Armchair Quarterbacks”.

In the field of Maintenance and Reliability, it is important to establish meaningful Key Performance Indicators.  Not only will KPI analysis help prioritize where the reliability engineer spends his or her continuous improvement energies, but these important measurement tools can also be used to reaffirm the value of the maintenance and reliability program to key decision-makers. This can be critical in terms of attaining funds for maintaining an effective program “that no one knows about”.

Finally, it should be obvious that a team like the Super Bowl Champion New Orleans Saints can be an example to all of us who strive for success, no matter what our endeavor.  Head Coach Sean Payton and his staff developed an effective strategy designed to leverage the talents of his individual players and seize opportunities as they became available.  Executing this strategy to near-perfection, these players excelled collectively as a highly-prepared unit with a common focus on the mission at hand (much to the delight of the Who Dat Nation).

For the first 22 years of their existence, the Saints were the laughing stock of the NFL and never had a winning season.  They were known as “The Ain’ts”.  But Coach Payton would not allow their embarrassing past to define the future (their first-ever Super Bowl performance).  He was there to craft an effective strategy to make the goal attainable.  Coach was there to ensure that his team was fully prepared to execute his strategy.  Coach was there to assess his team’s strengths and weaknesses during execution in order to identify areas to improve upon and adjust the strategy.  He was there to challenge accepted conventional wisdom (on-side kick decision to open the second half).  Coach was there to change paradigms.  He was there to shout “Get up and do it again until you get it right!”

As “The Man in the Arena”, it’s a good thing to have an experienced and knowledgeable Maintenance & Reliability coach on your side; one who is ready to help you forever change the M&R paradigm at your facility and enable measurable Reliability Success that defies the most experienced armchair quarterback!

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next post:  Cost Avoidance: Show Me the Money!

The Man in the Arena: Part I

By now you certainly know that the NFL served up a classic last night as the New Orleans Saints soundly defeated the Indianapolis Colts.  The game itself was exciting to watch, and the outcome was unexpected to say the least.  I guess that’s why they play the games!  There were tremendous efforts on both sides.  Both teams had put countless hours into preparation for what would be the most important challenge of their lives.  But what does this have to do with Predictive or Preventive Maintenance?  What does this have to do with Asset Reliability?

There’s a lesson here as we take a look at “The Losers” as some are referring to Peyton Manning and the Colts. There has been plenty of Peyton bashing going on as Colt fans and Manning haters ignore his past successes and focus their wrath on the one key mistake he made:  that 4th quarter interception. I have read that he is a “loser”, a “crybaby”, and that he ultimately doesn’t measure up with the greatest quarterbacks in NFL history.

Now as a Tennessee Titans fan, I usually enjoy it when Manning and the Colts lose.  But to lay all the blame for this loss on Manning’s shoulders is just not right.  There were plenty of dropped passes, blown assignments, missed tackles and squandered opportunities to go around.  If we really take a close look at what went wrong for the Colts, we can spread the blame around to plenty of other folks along with the QB. But the critics will continue to beat the Manning drum.

As I hear these critics, I am reminded of the powerful words of the 26th president of the United States, Theodore Roosevelt:

“It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again, because there is no effort without error and shortcoming; but who does actually strive to do the deeds; who knows great enthusiasms, the great devotions; who spends himself in a worthy cause; who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly, so that his place shall never be with those cold and timid souls who neither know victory nor defeat.”

These words are as relevant today as they were nearly 100 years ago when they were first spoken.  In many manufacturing facilities, process equipment sometimes fails, and many times it is the operations folks who are quick to point out that the maintenance team “blew it”.  The “Peyton Manning” of the maintenance and reliability team is usually the Reliability Engineer; otherwise known as The Man in the Arena, whose successes are many, but whose failures are most memorable.

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next post:  The Man in the Arena:  Part II.

Doctor… Which Predictive Technology Should I Use?

Whoa, whoa, whoa… hang on there a second!  To answer this question, we first have to answer another question:  What are we looking to find?  Many people have the idea that they are “The Vibration Guy”, or the “Infrared Dude”.  The fact is that if a PdM technician concentrates on a lone technology, he may find the tail “wagging the dog”. By this I mean the reliability strategy should dictate which PdM technology(s) to use, not the other way around.  The technology(s) selected should be driven by the failure mode(s) of interest.  Once the most common failure modes have been identified, a strategy needs to be developed to accurately detect the physical characteristics of the failure mechanism.  This is not unlike what physicians do to diagnose a disease.

In manufacturing, several different predictive technologies can be used to collect data which, when analyzed, can give an indication of the machine or system condition.  In the medical industry, healthcare professionals also collect and analyze data from test instruments to assess a patient’s condition.  An accurate assessment of the patient’s condition is critical when determining the course of action for medical treatment, and the same holds true for machines.

A medical test can take on many forms, but most have something in common.  The majority of medical diagnostic tests attempt to detect a significant change in the human body.  Many times the detection process focuses on a particular byproduct of the disease in question.  For example, an oncologist may suspect that a patient has an early stage of cancer, so he tests a blood sample for the presence of a particular enzyme.  This data is regarded as significant because the presence of the enzyme is many times associated with a particular type of cancer cell.  If the enzyme is found, he can reason that such a cancer is more likely to exist and order follow-up testing to confirm his suspicion.  So, he did not actually see the cancerous tissue itself, but an identifying characteristic or byproduct that is associated with it.

Simply put, predictive testing entails use of test instruments to extend the human senses.  This extension of the senses can give the analyst the ability to spot the early warning signs of a potential failure.  The use of a confirmation technology can serve to reaffirm the diagnosis… a second opinion, if you will. This second opinion can provide additional clarity as to the severity of a potential failure, as well.

It is up to the physician to bridge the gap between raw data and relevant interpretation based on his training, knowledge, and experience.  Likewise, it is the condition monitoring analyst’s task to bridge this gap as well, as it relates to the presence and degree of severity of a failure mechanism.

Your failure mode physical characteristics could include vibration, ultrasonic vibration, thermal energy, wear debris, chemical and physical property changes to reservoir oil, etc.  The presence of significant failure mechanism physical characteristics is identified by analyzing data collected using these predictive technologies.  Therefore, the analyst should be properly trained and proficient at multiple technologies in order to become most effective.

So to answer the original question, there is no general short answer; it really depends on your particular failure modes of interest.  These failure modes will drive which technologies will be most effective.  Training and mentoring with subject matter experts to help develop your strategy will improve your chances of success; and the huge cost savings that come with failure elimination!  Mentoring will also help your people focus on the failure mode rather than the PdM tool, and prevent the tail from wagging the dog!
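To make the “failure mode first, tool second” idea concrete, here is a small sketch that starts from the failure modes of interest and works forward to a list of technologies.  The pairing of each physical characteristic with a technology is my own simplified assumption for illustration, and the two failure modes are hypothetical examples; a real strategy would be built from your own failure history.

```python
from typing import Dict, List

# Assumed, simplified pairing of a physical characteristic with the
# technology typically used to detect it (illustrative only).
TECHNOLOGY_FOR_CHARACTERISTIC: Dict[str, str] = {
    "vibration": "vibration analysis",
    "ultrasonic vibration": "ultrasound",
    "thermal energy": "infrared thermography",
    "wear debris": "wear debris analysis",
    "oil property change": "oil analysis",
}


def select_technologies(failure_modes: Dict[str, List[str]]) -> List[str]:
    """Derive the technology list from the failure modes, not the other way around."""
    selected: List[str] = []
    for characteristics in failure_modes.values():
        for characteristic in characteristics:
            technology = TECHNOLOGY_FOR_CHARACTERISTIC.get(characteristic)
            # Skip characteristics with no known technology and avoid duplicates.
            if technology is not None and technology not in selected:
                selected.append(technology)
    return selected


# Hypothetical failure modes of interest and the characteristics they give off.
failure_modes = {
    "bearing spalling": ["vibration", "wear debris"],
    "loose electrical connection": ["thermal energy"],
}

print(select_technologies(failure_modes))
```

Notice that the function never asks which instruments are already sitting on the shelf; the failure modes drive the answer.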

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next posting:  The Man in the Arena.



Machine Tool Spindle Bearing Analysis – Part II

Part I of this story focused on the actual RCBFA results of a machine tool spindle bearing failure and the resulting cost avoidance.  In Part II, I give you: “The Rest of the Story” (if I may borrow this cliché from the late Mr. Paul Harvey).  Here I will show you the surprising results of the acceptance testing data and what happened as a result.

Does your predictive group conduct acceptance testing following a restoration?  Here are a couple of reasons why they should.  First, you really want to confirm that the work was done correctly.  Second (assuming the restoration was a success), you’ll also want to collect baseline data for future trending and comparative analysis.  The appropriate condition monitoring technology(s) should be selected based upon the physical characteristics of the dominant failure modes.

After the rebuild, new vibration data was collected.  An interesting discrepancy was noted between the digital and analog overall acceleration.  Additional data with an extended Fmax (1200 kcpm) confirmed that significant non-typical vibration was occurring beyond 600 kcpm.
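If you think in Hertz rather than cycles per minute, a quick conversion puts those frequencies in perspective.  This sketch reads kcpm as thousands of cycles per minute; the function name is just for illustration.

```python
def kcpm_to_hz(kcpm: float) -> float:
    """Convert thousands of cycles per minute (kcpm) to cycles per second (Hz)."""
    return kcpm * 1000.0 / 60.0


# The two frequencies mentioned in this case study.
for kcpm in (600, 1200):
    print(f"{kcpm} kcpm = {kcpm_to_hz(kcpm) / 1000:.0f} kHz")
```

In other words, the extended Fmax reached 20 kHz, and the non-typical vibration showed up above 10 kHz.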

This high frequency vibration is beyond the frequency range where typical advanced-stage bearing impacts are generally observed, but is many times an indicator of very early-stage bearing failure (though it may be several months to years away).  It’s most likely either the result of microscopic metal-to-metal impacting between the bearing components, or contaminant particle interference.  Neither of these conditions should have been present immediately following bearing replacement, so the decision was made to take the spindle back apart and determine what went wrong.

Upon careful inspection, it was determined that the bearing-spacer stack-up had been incorrectly calculated, which resulted in insufficient pre-loading of the precision angular-contact bearings. This lack of pre-load created slight looseness and allowed bearing components to impact each other, breach the lubrication film, initiate metal-to-metal contact, and create the high frequency vibration that was observed.

Precision Machine Tool Spindle

After correcting the problem, the high frequency vibration subsided and a significant 10 micron shift in the true position of the bored hole occurred as the spindle shaft was finally properly centered by the bearings.  You might say that 10 microns is not very significant, as this is only 1/7th the diameter of a human hair; however, the true position tolerance was only 50 microns for this finish bore.  While the feature was not out of tolerance, its statistical capability was potentially in jeopardy. The corrected stack-up properly loaded the bearings, located the shaft on the centerline of the spindle housing bore, and corrected the position of this machined feature.  By the way, the true position shifted following the incorrect rebuild because the “loose” spindle shaft was allowed to follow the rough hole location.  (There is always concentricity error between rough holes and finish bores, as there is a more open tolerance for the drilled rough holes.)
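A quick bit of arithmetic shows why a 10 micron shift matters against a 50 micron tolerance.  This is a simplified sketch that just compares the two numbers directly; it does not attempt a real capability study, and the function name is made up for illustration.

```python
def share_of_tolerance(shift_um: float, tolerance_um: float) -> float:
    """Return the shift as a percentage of the total tolerance."""
    if tolerance_um <= 0:
        raise ValueError("tolerance_um must be positive")
    return 100.0 * shift_um / tolerance_um


# Figures from the case study: 10 micron shift, 50 micron true position tolerance.
share = share_of_tolerance(shift_um=10, tolerance_um=50)
print(f"The shift used up {share:.0f}% of the true position tolerance")
```

Giving away a fifth of the tolerance to a rebuild error leaves that much less room for normal process variation.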

So there you have it.  Condition Monitoring, RCBFA, and Acceptance Testing all combined to produce tremendous short and long term cost avoidance at this facility.  Downtime, Quality Spills, and possible Safety Issues can be successfully averted when you have the right reliability strategy to attack your dominant failure modes. What’s your plan of attack?

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next posting:  What Predictive Technology Should I Use?

Machine Tool Spindle Bearing Analysis – Part I

A couple of weeks ago, I stated that by bringing in a PdM coach to mentor an existing PdM program, “the gap between failure mode understanding and PdM data interpretation can be permanently bridged, thus preserving the credibility of the PdM effort” (1-07-10 post).

One way that we can accomplish this is by embedding Root Cause Bearing Failure Analysis at your facility.  RCBFA is a value-added practice for you, the diligent machine tool condition monitoring analyst.  Not only can the understanding of historical predictive data become enhanced through RCBFA, but the failure triggers of chronic failure mechanisms can be better understood as well.  Once understood, you just may even be able to do something about them, maximize your Mean Time Between Failures (MTBF), increase uptime, and generate millions of cost-avoidance dollars as chronic bearing failures become a distant memory.  Sound too good to be true?

Our story today begins with a machine tool spindle that was being used to bore a hole in an aluminum component at an automotive engine manufacturing facility.  Vibration Analysis was used to identify the in-process bearing failure and eventually the rebuild work was planned, scheduled and executed.  After restoration, an RCBFA was conducted on the old bearings.  The front bearing suffered the most severe damage in the form of a deep spall on the raceway surface of the inner race.  This was the primary source of the vibration.

It was determined through RCBFA that this bearing failed due to water contamination of the grease.  The non-contact labyrinth seal did not prevent the ingression of this contamination.  The observed static iron oxide corrosion is direct evidence.  During idle periods, the moisture settled out of the grease and accumulated at the contact points between the rolling elements and the races.  This resulted in iron oxide deposits equally-spaced in the pattern of the rolling elements.

The reduced lubricity of the grease due to water contamination resulted in metal-to-metal contact between the rolling elements and the races.  This contact led to secondary surface damage in the form of microscopic skidding, denting and gouging.  The damaged contact surfaces of the bearing components resulted in surface fatigue, or spalling; the most extreme area observed on the raceway surface of the front bearing inner race (as shown above).

There were over one thousand of these spindles in the plant, and contaminated bearing grease was found to be a very common problem that had not been previously identified.   This type of bearing failure had become accepted as a normal cost of doing business. But once this new evidence was communicated, an investigation revealed several likely ways in which water could be getting beyond the labyrinth seal.  The cost avoidance for just one eliminated spindle failure was in the tens of thousands of dollars!

When a problem goes unrecognized, then nothing gets done to resolve it.  So as you can see, there is a substantial cost to doing nothing:  the long-term cost of allowing chronic failure triggers to exist can easily total millions of dollars. For many organizations, it is an excellent investment to bring in a consultant who can mentor their people to help establish valuable practices such as RCBFA and empower them to go out and find some of this buried treasure (cost avoidance dollars) that could be right under their noses!
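To see how “tens of thousands” per spindle can add up to millions, here is a rough fleet-level sketch.  Only the fleet size (about one thousand spindles) comes from this case study; the MTBF values and the cost per failure are hypothetical numbers chosen for illustration, and the model simply assumes that a fleet averages one failure per spindle every MTBF years.

```python
def failures_per_year(fleet_size: int, mtbf_years: float) -> float:
    """Approximate steady-state failures per year for a fleet with a given MTBF."""
    if mtbf_years <= 0:
        raise ValueError("mtbf_years must be positive")
    return fleet_size / mtbf_years


def annual_cost_avoidance(fleet_size: int, mtbf_before: float,
                          mtbf_after: float, cost_per_failure: float) -> float:
    """Estimated yearly cost avoidance from raising MTBF across the fleet."""
    avoided = (failures_per_year(fleet_size, mtbf_before)
               - failures_per_year(fleet_size, mtbf_after))
    return avoided * cost_per_failure


# Hypothetical: eliminating the water contamination raises MTBF from 5 to 8 years,
# and each avoided spindle failure is worth $30,000.
savings = annual_cost_avoidance(fleet_size=1000, mtbf_before=5.0,
                                mtbf_after=8.0, cost_per_failure=30_000)
print(f"Estimated annual cost avoidance: ${savings:,.0f}")
```

Even with modest assumptions, the cost of doing nothing across a large fleet gets big in a hurry.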

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next posting:  Machine Tool Spindle Bearing Analysis – Part II.  In Part II, I’ll discuss the acceptance testing of the spindle in this case study.

Infrared Leak Detection at JJ’s Barbecue

It’s been a mighty chilly first two weeks of January in Middle Tennessee: the coldest since 1942. The result:  widespread frozen and broken water pipes for both residential and commercial buildings.  One such case occurred last week and shut down a local favorite:  JJ’s Barbecue of Columbia, TN.

Well, the people of Columbia need their BBQ, so this broken line needed to be fixed, and fixed fast!  City workers had shut off the valve at the meter after water was observed to be flowing into the street near the fenced-in dumpster enclosure.  A frost-proof yard hydrant was located in the front corner of the enclosure; however, no surface water had been observed flowing from this location.  As a practicing Infrared Thermographer and friend of the proprietors, the author was requested to help find the location of the underground water leak in order to minimize parking lot damage during the repair process.

JJ's Barbecue (Dumpster Enclosure is at the extreme left side of photo)

Dumpster Enclosure (Hydrant in Left-Front Corner)

Underground leak detection is one of the many ways in which Infrared Thermography can be very useful.  This is especially true when there is a great difference in temperature between the leaking fluid or gas, and the ground, as is the case with underground steam line leaks.  But detecting the location of underground water leaks can be a little more difficult.

We arrived at the scene about an hour before daybreak in order to capitalize on the extreme cold and lack of solar influence on the surface.  The ambient temperature was approximately 12 degrees F with virtually no wind.  The plan of attack was to open the valve and watch for temperature change on the surface of the ground, as this could indicate where the warmer water was flowing underneath.  Within ten minutes after opening the valve, surface water began flowing from a seam in the concrete under the fence to a storm sewer drain located in the center of the enclosure.  This location was about 8 feet from the hydrant.

Left: Digital Photo Outside of Enclosure, Right: Thermal Image (note concrete emitting heat)

After another ten minutes had gone by, water began flowing from the opposite side of the dumpster slab and into the street.

Within minutes, the surface temperature of the concrete along the west seam of the slab was observed to be rising, especially in the corner opposite the hydrant where water was observed to be flowing into the grass.

Left: Digital Photo, Right: Thermal Image

A slight temperature increase was then observed about a foot from the hydrant along this same seam.  Based on this trending, the point of origin was determined to be at the hydrant, even though no surface water was evident.

Left: Digital Photo, Right: Thermal Image

The concrete surrounding the hydrant was then removed to reveal the broken plastic pipe below the surface.

Removal of concrete around hydrant to expose broken pipe

Root cause:  hydrant and supply pipe were not buried deep enough when installed.  Though the line had not frozen for decades, the extreme cold of January 2010 was enough to finally reach the water below the hydrant.

The line was repaired and before long, Julie and Wayne were back in business cranking out the best barbecue in Middle Tennessee!

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next post:  Machine Tool Spindle Bearing Analysis – Part I.

How Does a Maintenance Manager Implement PdM?

As I’ve stated in previous posts, it is more cost effective to employ the predictive technologies to monitor the condition of critical equipment than to just rely on time-based preventive maintenance tasks.  But how does a maintenance manager make this happen?  There is more to condition monitoring than simply throwing money at the effort.  Many organizations fail to execute their PdM programs effectively because they do not understand that achieving and sustaining a successful implementation requires more than purchasing PdM gadgets.

Selecting the right people to do the job is paramount, and training them is essential.  However, the foundation of a sound PdM program relies on something that is often overlooked: experience.  For a new in-house PdM program, the personnel selected should have the will and ability to learn and apply complex technical information.  They should also possess vast experience in dealing with the failure modes of interest.  Unless expert PdM knowledge and experience can be found within your organization, Reliability Consultants experienced at implementing successful PdM programs should be utilized to help develop the overall Reliability Strategy, improve the probability of success, and ensure a significant return on investment.

A deep understanding of the most common failure modes is at the core of the process of developing a refined condition monitoring program.  The physical characteristics of the failure modes are ever-present and they speak a language that is interpretable to the trained and experienced analyst.  So it is up to the analyst to somehow choose the correct predictive tools and apply them appropriately with the proper set-up parameters, based on the perceived progression of the primary failure process.  This can be a very formidable task, but by bringing in a PdM coach, the gap between failure mode understanding and PdM data interpretation can be permanently bridged, thus preserving the credibility of the effort.

In addition, methodology refinements derived from initiatives such as Root Cause Failure Analysis can lead to more accurate machinery health assessments.  This enables the analyst to more confidently communicate his corrective maintenance recommendations so that maintenance planners have ample time to plan and schedule the event proactively, prior to functional failure and downtime.

Functional failure can equal downtime, and the significance of a possible functional failure is determined by considering the operational context of the asset in the facility.  Such considerations would include whether or not there is built-in redundancy in the manufacturing process.  Is this the only machine that performs the operation?  Another would be whether or not this asset is one of many links in a continuous manufacturing chain.  Sometimes the most cost-effective strategy for a piece of non-critical equipment is to allow it to run to failure!
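The operational-context questions above can be sketched as a small decision helper.  This is only an illustration, not a formal criticality method: the `Asset` fields, the `suggest_strategy` function, and the rule ordering are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical description of an asset's operational context."""
    name: str
    has_redundancy: bool       # is there a standby or parallel machine?
    in_continuous_chain: bool  # is it one link in a continuous process?


def suggest_strategy(asset: Asset) -> str:
    """Return a rough strategy suggestion (illustrative rules only)."""
    if asset.has_redundancy:
        # A backup can carry the load, so failure need not mean downtime.
        return "run to failure may be most cost-effective"
    if asset.in_continuous_chain:
        # Sole machine in a continuous chain: failure stops everything.
        return "condition monitoring (high priority)"
    # Sole machine, but other operations can continue while it is down.
    return "condition monitoring"


if __name__ == "__main__":
    examples = (
        Asset("transfer line spindle head", False, True),
        Asset("stand-alone drill press", False, False),
        Asset("duplex sump pump", True, False),
    )
    for asset in examples:
        print(f"{asset.name}: {suggest_strategy(asset)}")
```

A real program would weigh more factors (safety, repair cost, lead time for spares), but even a rough screen like this helps decide where condition monitoring effort should go first.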

Many times it is beneficial to bring in a different set of eyes to help with assessing your current state and mapping out a course for the future to achieve and sustain reliability improvements.  Experts are available and the maintenance manager can bring them in to provide the coaching and knowledge transfer that can develop the expertise needed to empower his people to become successful at monitoring the condition of his critical equipment, improve planning and scheduling, reduce spare parts inventories, and forever avoid the costs associated with critical equipment downtime.

Make sure and subscribe to ReliabilitySPOT so that you don’t miss my next post:  Infrared Leak Detection at JJ’s Barbecue.


Condition Based Maintenance – Saving the Earth?

When discussing industrial maintenance best practices, the first thing that comes to mind for most people is Preventive Maintenance.  Preventive Maintenance has its place in the world of reliability; however, by definition, PMs are “time-based.”  This means that these tasks are performed based upon a regular time interval.  As long as the prescribed time interval is short enough to guarantee that the consumable to be replaced has not reached the end of its life, then all is right with the world.  Or is it?  In reality, there is a tremendous amount of waste that comes along with Preventive Maintenance, and as is true with all waste, there is a tremendous cost associated with it.  However, well-established and universally accepted practices can sometimes go unchallenged because they are considered to be part of the normal cost of doing business.  Nevertheless, the consequences of waste must be paid for by someone.

An excellent example that’s easy to relate to in our everyday lives is the basic automobile engine oil change.  Since the beginning of time (or at least the past 80 years or so), we have been conditioned into the belief that our engine oil must be changed every 3,000 miles or 3 months (whichever comes first) in order to keep our engines protected from premature wear.  This “time-based” maintenance activity entails draining the existing oil and replacing it, along with the filter element.  Of course, this new oil must be purchased along with a new filter, and you’ll have to pay someone to do the work, unless you happen to be part of the minority who still change their own oil.  In either case, this old oil must be disposed of at significant cost to both our pocketbooks and the environment.

A greener alternative would be to replace our engine oil as it approaches functional failure, or the point in time at which it can no longer perform any one of its intended functions to the required standards.  In order for this to be possible, condition monitoring devices would have to be able to accurately measure critical properties of the oil and then call for replacement when analysis of the data indicates that functional failure is imminent.  This technology exists today (click here) and can significantly reduce the costs associated with oil changes.
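The difference between the two triggers can be sketched in a few lines.  This is a minimal illustration, not how any particular sensor works: the property names and alarm limits below are made up for the example, and real limits depend on the oil, the engine, and the measurement method.

```python
from typing import Mapping

# Hypothetical alarm limits, for illustration only.
LIMITS = {
    "viscosity_change_pct": 20.0,  # allowed drift from new-oil viscosity
    "water_ppm": 1000.0,           # allowed water contamination
}


def time_based_due(miles_since_change: float, interval: float = 3000.0) -> bool:
    """Time-based PM: change the oil at a fixed interval, whatever its condition."""
    return miles_since_change >= interval


def condition_based_due(readings: Mapping[str, float]) -> bool:
    """Condition-based: change the oil only when a measured property hits its limit."""
    return any(abs(readings[name]) >= limit for name, limit in LIMITS.items())


if __name__ == "__main__":
    healthy_oil = {"viscosity_change_pct": 6.0, "water_ppm": 150.0}
    print("time-based says change:", time_based_due(3200))
    print("condition-based says change:", condition_based_due(healthy_oil))
```

In the example, the fixed interval calls for a change at 3,200 miles even though the (assumed) readings show the oil is still well within its limits, which is exactly the waste described above.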

How many people today are still changing their oil every 3,000 miles because “that’s just the way my daddy showed me?”  In industry, the over-maintaining of equipment is a safe way of doing business, but it’s very costly.  At a time when manufacturers are looking for every way possible to reduce operating costs, eliminating overkill by replacing obsolete time-based PMs with condition monitoring is a great way to pick some of that low-hanging fruit!  And in the case of oil changes, we might just be saving our world in the process!

Update on 01-17-10:  IntelliStick article link

Machine Tool Condition Monitoring

Here is an excerpt from a presentation that Electrical PdM Consultant Rick Ratz and I gave a few years ago at the Automobile Manufacturing Technical Education Collaborative in Louisville, KY.  It begins by showcasing the major components of a Multispindle Head (commonly used on Machine Tool Transfer Lines) and includes the case study of a gearbox bearing failure that was detected using Vibration Analysis and Wear Debris Analysis.  Also included is an overview of Motor Circuit Analysis with case studies.

Condition Monitoring: Modern Medicine for Machines

In the examination room during an annual checkup, a physician gathers data.  He collects the data by asking questions, using basic human senses (such as sight and touch) and by employing a few simple instruments such as a thermometer or blood pressure monitor.  He also collects a blood sample to send to a laboratory.  There at the lab, the sample is tested to measure certain indicators, such as cholesterol, triglyceride levels, etc.  This additional data further helps the doctor assess the patient’s health.

There may be no significant problems, or the data may reveal the possibility of something more serious.  The doctor may order additional tests (MRI, EKG) that are more specific to eliminate or confirm the possible condition, and/or the degree of severity.  The best, most modern tests provide him with the data that allows him to make the most accurate assessment of the patient’s condition. After the careful analysis of all available data, the doctor makes his recommendation.  Unfortunately in some cases, major surgery is required to correct the problem.

Most of us are familiar with the “Predictive Maintenance” buzz-word, but a better, more accurate term is CONDITION MONITORING.  Just like instruments used in the medical field, predictive technology “gizmos” are used to gather and quantify data from machinery in order to monitor its condition.

This data can detect a symptom generated by a machine problem; a symptom which is not yet evident to the human senses. For instance, Infrared Thermography can allow the slight heat resulting from a loose electrical wire connection to be “seen”.  The impacting forces generated from a deteriorating bearing can be “seen” with Vibration Analysis, and actually “heard” with Ultrasound.

When properly analyzed, this PdM data can allow for an EARLY and accurate assessment of the machine’s condition. This assessment is considered along with other criticality criteria to determine the best course of action for the piece of equipment (whether or not to restore it to its original condition, and when).

Sometimes, a simple up-front corrective action can stop the progression of a potential show-stopper.  Other times, a major teardown is required to fix the problem (major surgery).  In both cases, the patient’s health is restored BEFORE he misses work due to an unplanned hospital stay (machine breakdown related downtime).

So, Machinery Condition Monitoring can be thought of as the industrial equivalent to state-of-the-art diagnostics used in today’s modern medicine.  When properly utilized, predictive technologies can “keep our finger on the pulse” of our machines, and give us the time to perform the appropriate corrective actions before catastrophic failure occurs.

Rolling Element Bearing Lubrication… “How do it work?”

In order for a rolling element bearing to live a long and prosperous life, it must be properly lubricated, kept free of contaminants, and operate within the limits of its design.  There are many different types of rolling element bearings used throughout industry such as deep groove, angular contact precision, tapered roller, etc.  Certainly, the presence of foreign material in a bearing set will almost always cause irreparable damage to component surfaces resulting in a shortened life span.  However, proper lubrication is vital and is sometimes overlooked when analyzing premature bearing failure.

Bearing lubrication serves to keep the surfaces of the raceways and rolling elements from coming in contact with each other at a microscopic level.  This form of lubrication is known as elastohydrodynamic lubrication.  At the point of contact between the rolling element and the races, extreme pressures approaching 1 million psi can exist.  Under ideal conditions, this pressure actually causes the viscosity of the lubricant to increase and keep the surfaces separated.
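The way pressure thickens a lubricant is often approximated with the Barus relation, in which viscosity grows exponentially with pressure.  The sketch below is only an illustration: the base viscosity and the pressure-viscosity coefficient are assumed round numbers, not data for any particular grease or oil, and the Barus relation is generally regarded as overestimating viscosity at very high pressures.

```python
import math

PSI_TO_PA = 6894.757  # pascals per psi


def barus_viscosity(eta0_pa_s: float, alpha_per_pa: float, pressure_pa: float) -> float:
    """Barus approximation: eta(p) = eta0 * exp(alpha * p)."""
    return eta0_pa_s * math.exp(alpha_per_pa * pressure_pa)


if __name__ == "__main__":
    eta0 = 0.1      # Pa*s at atmospheric pressure (assumed value)
    alpha = 2.0e-8  # 1/Pa pressure-viscosity coefficient (assumed value)
    for psi in (0, 10_000, 50_000):
        eta = barus_viscosity(eta0, alpha, psi * PSI_TO_PA)
        print(f"{psi:>6} psi -> {eta:.3g} Pa*s ({eta / eta0:.3g}x base viscosity)")
```

Even with these placeholder numbers, the exponential form shows why the oil trapped in the contact zone behaves far more stiffly than it does in the can.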

There are several variables that are critical in maintaining this lubrication barrier including the physical properties of the grease or oil, the surface finish of the bearing components, and the load forces that the bearing is exposed to.  Transient events, process and environmental contamination, rotor imbalance, and temperature variations can change the conditions that the bearing was designed to operate in, and thus shorten its life by causing a breach in the elastohydrodynamic lubrication film.  This breach causes bearing component contact to occur which results in very high frequency vibration and is detectable using the Ultrasound and/or Vibration Analysis predictive maintenance technologies.
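Whether the film actually keeps the surfaces apart is commonly expressed as the specific film thickness, or lambda ratio: the film thickness divided by the combined roughness of the two surfaces.  The sketch below is a rough illustration under stated assumptions: the regime boundaries of 1 and 3 are commonly quoted rules of thumb that vary between references, and the input values are invented for the example.

```python
import math


def lambda_ratio(film_um: float, rq_race_um: float, rq_element_um: float) -> float:
    """Film thickness divided by the composite RMS roughness (all in micrometers)."""
    composite_roughness = math.hypot(rq_race_um, rq_element_um)
    return film_um / composite_roughness


def lubrication_regime(lam: float) -> str:
    """Classify the regime using rule-of-thumb boundaries (assumed: 1 and 3)."""
    if lam < 1.0:
        return "boundary (surface contact likely)"
    if lam < 3.0:
        return "mixed (occasional asperity contact)"
    return "full film (surfaces separated)"


if __name__ == "__main__":
    # Assumed values: the same bearing surfaces with a healthy film and a thinned film.
    for film in (2.0, 0.2):
        lam = lambda_ratio(film, rq_race_um=0.3, rq_element_um=0.4)
        print(f"film {film} um -> lambda {lam:.2f}: {lubrication_regime(lam)}")
```

Anything that thins the film (heat, overload, the wrong lubricant) or roughens the surfaces (contamination damage) pushes the ratio down toward the contact regime, where Ultrasound and Vibration Analysis begin to pick up the result.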

But if the proper lubricant is used, the bearing is used in an application so as to not exceed its designed operational limits, and contamination barriers work as intended, then one can expect a rolling element bearing to reach its intended life span as designed:  to live long and prosper!

Lake Michigan Thermal Energy Affects Winter Storm

Infrared Thermography is an important predictive maintenance tool used throughout industry to identify equipment failures that are in-process.  The basis for the application of this PdM technology is that the emission of thermal energy, or heat, is associated with many in-process failure modes.  But is thermography the only way to detect heat?  Absolutely not, as is evident in this radar image seen today at Intellicast.com.  The emission of thermal energy from the relatively warm waters of Lake Michigan is causing the frozen precipitation of the storm to change back into rain!   As you can see, this effect diminishes and disappears in the northern half of the lake as the lake temperature is lower due to generally lower ambient temperatures in the northern region.

The Effect of Heat Being Emitted from Lake MI.

Radar Image Taken at 16:15 GMT

The Pulse of Reliability: Coming soon…

The Pulse of Reliability, a recurring poll series posted weekly on Sundays, is brought to you by ReliabilityMAX©.  In addition to the poll for the current week, the results from the previous week will be published as well.