MartenThompson/fault_sim

fault_sim

Experimental tools to simulate, detect, and classify conductive faults. A voltage signal is sent down a length of conductor, and its echo is analyzed for possible faults.

Overview

Experiments are conducted within the imagined, simplified production scenario diagrammed below.

[Figure: imagined deployment flowchart]

I have chosen to separate fault detection (baseline model in green) from classification (in blue) because the business context and purview of each task differ. The binary inference of fault/no fault from an echo forms the bedrock of the modeling task. It needs to be extremely robust, with error rates that are tightly understood and controlled. False positives waste user resources, false negatives risk environmental and social damage, and both degrade the product and trust in it. Classifying faults, on the other hand, is downstream of that modeling task. Misclassifying a signal already known to be a fault is undesirable but not critically so, and if we afford this task flexibility, the model gains the ability to adapt and grow. See the Future Work section for ideas in this direction.

I considered alternatives in which detection and classification were modeled together, and within the framework described above the two models can certainly interact and share information. However, in a joint model the rigid requirements of fault detection limited the expressivity of the distinct task of classifying faults.

Contents

Files within the code/ directory constitute a Python package for creating and modeling data, and experiments/ contains notebook-style scripts for conducting analysis and producing artifacts.

Package

code/echo_simulator.py is used to generate different sampling scenarios.

python3.12 -m code.echo_simulator -n 3 -o data/short.csv -f short -plot

code/baseline_modelers.py contains fault detection models, notably the MahalanobisBaselineModel, which classifies echoes according to their Mahalanobis distance from the training data.
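
As a rough illustration of the idea, the sketch below fits a mean and covariance to no-fault echoes and flags a new echo whose Mahalanobis distance exceeds a threshold. It is not the repository's MahalanobisBaselineModel: the class name MahalanobisSketch, the ridge parameter, and the threshold argument are all assumptions made for this example, and it relies on NumPy.

```python
import numpy as np


class MahalanobisSketch:
    """Illustrative stand-in; NOT the repository's MahalanobisBaselineModel."""

    def fit(self, echoes: np.ndarray, ridge: float = 1e-6) -> "MahalanobisSketch":
        # echoes has shape (n_echoes, n_samples); each row is one no-fault echo.
        self.mean_ = echoes.mean(axis=0)
        cov = np.cov(echoes, rowvar=False)
        # Assumption: a small ridge keeps the covariance invertible when
        # there are fewer training echoes than samples per echo.
        cov = cov + ridge * np.eye(cov.shape[0])
        self.precision_ = np.linalg.inv(cov)
        return self

    def distance(self, echo: np.ndarray) -> float:
        # sqrt((x - mean)^T * inverse_covariance * (x - mean))
        diff = echo - self.mean_
        return float(np.sqrt(diff @ self.precision_ @ diff))

    def is_fault(self, echo: np.ndarray, threshold: float) -> bool:
        # Hypothetical decision rule: a larger distance means less baseline-like.
        return self.distance(echo) > threshold
```

Because the distance accounts for correlations between samples of the echo, a deviation that runs against the usual co-movement of the signal scores higher than a Euclidean distance would suggest.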

The contents of code/fault_modelers.py classify faults using statistical inference and business logic. Currently, they classify hard faults (open and short).
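
The Experimental Results section describes a short fault as an inverted echo and an open fault as a shifted, non-inverted peak. A minimal sketch of a rule built on that description follows. The function name classify_hard_fault and the "sign of the dominant peak" rule are hypothetical and much simpler than whatever code/fault_modelers.py actually does.

```python
def classify_hard_fault(echo):
    """Label an echo already known to be a fault as 'short' or 'open'.

    Hypothetical rule: take the sample with the largest magnitude and
    call the fault a short if that peak is negative (inverted echo),
    otherwise an open.
    """
    peak = max(echo, key=abs)
    return "short" if peak < 0 else "open"
```

For example, classify_hard_fault([0.0, 0.1, -0.9, 0.2]) returns "short" because the dominant peak is negative.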

Experimental Results

Our fault detection model forms the bedrock of the modeling task; it measures the similarity/difference between its training data and a newly encountered echo. A first step is tuning its sensitivity. The basic premise of this experiment was to:

  • Train a baseline fault detection model.
  • Sweep across a range of sensitivities, classifying fault and non-fault echoes.
  • Produce artifacts which enable us to choose the sensitivity that controls error rates.
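
The steps above can be sketched as a sweep over candidate thresholds. The sketch assumes each echo has already been reduced to a single score (for instance a distance from baseline) and that "sensitivity" maps to a threshold on that score. The function name and the tuple layout are illustrative, not taken from the repository.

```python
def sweep_error_rates(baseline_scores, fault_scores, thresholds):
    """Return (threshold, false_positive_rate, false_negative_rate) rows.

    An echo is flagged as a fault when its score exceeds the threshold.
    """
    rows = []
    for t in thresholds:
        # No-fault echoes wrongly flagged as faults.
        fp = sum(s > t for s in baseline_scores) / len(baseline_scores)
        # Fault echoes that slip under the threshold.
        fn = sum(s <= t for s in fault_scores) / len(fault_scores)
        rows.append((t, fp, fn))
    return rows
```

Plotting the two rates against the threshold gives the kind of artifact the last step refers to.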

The figures below show baseline signal, short faults, and open faults.

Originally I had planned on studying both short and open faults, but the former (as conceived in this simulation) was not challenging enough to be interesting, and the fault classifier performed with perfect accuracy. Since Mahalanobis distance operates on correlations, an inverted signal is extremely dissimilar to baseline.

The open fault was more interesting, representing only a shift in echo peak, not inversion. I constructed the problem to be challenging: the baseline peak was located at 0.8 * conductor_length and the open fault occurred at a location uniformly distributed within (0.7 * conductor_length, 0.9 * conductor_length).
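
A minimal sketch of this setup, using only the standard library, is shown below. The peak locations follow the description above. The Gaussian pulse shape, its width, the additive noise, and all function and parameter names are assumptions for illustration and may differ from code/echo_simulator.py.

```python
import math
import random


def echo(peak_location, conductor_length=1.0, n_samples=200,
         width_fraction=0.02, noise_sd=0.0, rng=None):
    """Sample an echo with one peak along the conductor.

    Assumption: the peak is a Gaussian pulse with optional Gaussian noise.
    """
    rng = rng or random.Random()
    width = width_fraction * conductor_length
    positions = [conductor_length * i / (n_samples - 1) for i in range(n_samples)]
    return [
        math.exp(-0.5 * ((x - peak_location) / width) ** 2) + rng.gauss(0.0, noise_sd)
        for x in positions
    ]


def baseline_echo(conductor_length=1.0, **kwargs):
    # Baseline peak sits at 0.8 * conductor_length.
    return echo(0.8 * conductor_length, conductor_length, **kwargs)


def open_fault_echo(conductor_length=1.0, rng=None, **kwargs):
    # Open-fault location is uniform on (0.7, 0.9) * conductor_length.
    rng = rng or random.Random()
    location = rng.uniform(0.7 * conductor_length, 0.9 * conductor_length)
    return location, echo(location, conductor_length, rng=rng, **kwargs)
```

Because the fault interval straddles the baseline peak, some open-fault echoes land very close to baseline, which is what makes detection hard in this scenario.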

Over 500 simulations, the baseline classifier exhibited the following error rates.

Someone implementing this algorithm could then choose the significance threshold to deploy based on the desired error rates.
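
One way to make that choice concrete is sketched below: given rows of (threshold, false positive rate, false negative rate), keep the thresholds that meet a false positive budget and pick the one with the fewest false negatives. The function name, the row layout, and the selection rule are assumptions for illustration. A deployment might weigh the two error types differently.

```python
def pick_threshold(sweep, max_false_positive_rate):
    """Choose a threshold from (threshold, fp_rate, fn_rate) rows.

    Returns None when no threshold satisfies the false positive budget.
    """
    admissible = [row for row in sweep if row[1] <= max_false_positive_rate]
    if not admissible:
        return None
    # Among admissible thresholds, minimise the false negative rate.
    return min(admissible, key=lambda row: row[2])[0]
```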

Future Work

This is a rich problem space that affords many extensions. Both the model architectures and model inputs as described above could be expanded for greater accuracy and richer product features.

The simplified deployment scenario and corresponding model architectures described above are static insofar as the fault detection model and classifier are trained, then deployed, and remain unchanged. In reality, both models would benefit from online learning.

  • The fault detection model currently trains during a short burn-in period. Real-world sites will exhibit seasonality, varied loads, and other changes to the no-fault data generating process that the detection model should also learn.
  • The fault classification model currently leverages a fixed set of business logic to classify hard faults. Future work could develop a continuous process in which known faults are classified while unknown faults are labeled as such and collected. Once the system has encountered sufficiently many unknown faults, it clusters them and graduates any consistent group to a new classification.

Finally, the inputs to the fault detection and classification models were intentionally kept simple (the voltage echo itself). Their accuracy and expressivity may be improved by considering additional inputs like environmental data (temperature, rain) and grid data (other sensing infrastructure).
