Big data hitting the fan? Nyquist-Shannon TOOL SAMPLE can save you

You are working on a big data project that collects data from sensors that can be polled every tenth of a second. But just because you can doesn’t mean you should, so how do we decide how frequently to poll sensors? The tempting answer is to collect it all, every last ping. That way, not only is your back covered but you can …

COMMENTS

This topic is closed for new posts.
  1. Aitor 1

    I can't use it

    I already know the sample frequency I need, and I can't use it. Almost nobody in telcos can.

    Reason: it would bring all the systems to a halt.

    So we are stuck with "good enough" and "hope for the best". We also have better approaches: a mix of data polling and SNMP traps (alerts).

    Anyway, I think what people need is top-down alerts AND checks, so you just know the health of the system; you aren't really interested in individual components.

  2. Arnold Lieberman

    Just occurred to me

    What would happen if a waveform were sampled at non-regular (but defined) intervals? That way, the problem of always measuring a signal just as it crosses zero would be avoided. One for the mathematicians, I think.

    1. Stevey

      Re: Just occurred to me

      If you're measuring something that is periodic and predictable then you may be able to do that. But if it's periodic and predictable why are you measuring it?

      If the signal isn't periodic and predictable, then the longest sample period (lowest sample rate) of your non-regular interval needs to be no more than half the period (i.e. at least twice the frequency) of the signal change that you are interested in, otherwise you will run the risk of missing something.

      1. Anonymous Coward
        Anonymous Coward

        Re: Just occurred to me

        "If the signal isn't periodic and predictable, then the longest sample period ( lowest frequency) of your non-regular interval needs to be half the period ( or twice the frequency ) "

        In theory, yes; in reality you need to sample more than that, because if, for example, you're measuring a sine wave and sampling at exactly twice its frequency, you could hit the point at which the wave crosses zero every time, in which case it will look like there is no signal. In practice you need to measure at 2x the frequency and also at 2x the frequency with a 90-degree phase shift. This is partly how a discrete Fourier transform works: it multiplies by the sine AND the cosine before integrating.
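        A minimal Python sketch of the zero-crossing trap and the sine-plus-cosine correlation described above, assuming NumPy and a hypothetical 50 Hz signal:

        ```python
        import numpy as np

        f = 50.0  # Hz, hypothetical signal frequency

        def sample(fs, duration=1.0, phase=0.0):
            """Sample sin(2*pi*f*t + phase) at rate fs for the given duration."""
            t = np.arange(0.0, duration, 1.0 / fs)
            return t, np.sin(2 * np.pi * f * t + phase)

        # Sampling at exactly 2*f, with every sample landing on a zero crossing:
        _, x_critical = sample(fs=2 * f)
        print(np.max(np.abs(x_critical)))   # ~1e-13: the signal looks absent

        # Sampling a little faster than 2*f makes it visible again:
        _, x_ok = sample(fs=2.5 * f)
        print(np.max(np.abs(x_ok)))         # ~0.95: clearly non-zero

        # One bin of a discrete Fourier transform correlates the samples with
        # BOTH a sine and a cosine at the analysis frequency, so the signal is
        # caught whatever its phase:
        t, x = sample(fs=2.5 * f, phase=1.0)    # arbitrary phase offset
        in_phase   = np.sum(x * np.cos(2 * np.pi * f * t))
        quadrature = np.sum(x * np.sin(2 * np.pi * f * t))
        print(2 * np.hypot(in_phase, quadrature) / len(x))  # ~1.0, the amplitude
        ```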

  3. Martin Gregorie

    Slight case of subject drift

    The article started off talking about stored data volumes, i.e. storing logging data, and then drifted off into sampling rates. That is all very interesting, and must be considered when deciding how to get a true picture of the behaviour over time of the variable being sampled.

    The answer to the storage problem, which I expected to see, is to record only the timestamped new value each time the sampled variable changes. Unless the change rate approaches the sampling rate, the storage saved by logging timestamped changes will easily exceed the overhead of recording the timestamp.
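    A minimal sketch of that change-only logging; read_sensor and write_record are hypothetical callables standing in for whatever the real system provides:

    ```python
    import time

    def log_changes(read_sensor, write_record, poll_interval=0.1):
        """Poll at a fixed rate, but persist only timestamped changes."""
        last = object()  # sentinel, so the very first reading is always logged
        while True:
            value = read_sensor()
            if value != last:
                write_record((time.time(), value))  # store (timestamp, new value)
                last = value
            time.sleep(poll_interval)
    ```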

    1. Bartholomew

      Re: Slight case of subject drift

      That would require intelligence at each sensor to push the data to a central location when an event/change occurs. But if the sensor fails, nothing will ever be logged again. So then you need the central location to periodically poll the intelligent sensors ("You still alive and working?") to check for faults. You also need the network to queue events in case two sensors push data at the exact same instant in time. The more complexity that is added, the more possible failure points are added with it. There are many advantages to K.I.S.S. https://en.wikipedia.org/wiki/KISS_principle

      1. Martin Gregorie

        Re: Slight case of subject drift

        Nope - he was talking about capture, i.e. permanent data storage. IOW it doesn't matter whether all sensors autonomously send in readings or the logging system(s) poll them for data. Once the data arrives at the server that will record it, it's easy to scan through the stream from each device and discard everything except the changes in a sensor reading.

        Think systems don't work that way? Here's a real-life example: the switches in mobile phone cells are polled on a daily basis and their call data pulled down via FTP as a file containing a megabyte or two of data. This is then processed in various ways, e.g. run through fraud detection kit and analysed by the network performance team, before being used to populate one or more databases.
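        A minimal sketch of that server-side scan-and-discard step; the (timestamp, value) record format is an assumption:

        ```python
        def changes_only(readings):
            """Yield (timestamp, value) pairs, dropping repeats of the same value."""
            last = object()  # sentinel: the first reading is always kept
            for timestamp, value in readings:
                if value != last:
                    yield timestamp, value
                    last = value

        # One device's stream, polled every 0.1 s but barely changing:
        stream = [(0.0, 21.5), (0.1, 21.5), (0.2, 21.5), (0.3, 21.6), (0.4, 21.6)]
        print(list(changes_only(stream)))   # [(0.0, 21.5), (0.3, 21.6)]
        ```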

  4. Anonymous Coward
    Anonymous Coward

    Don't forget...

    Statistical sampling. If testing 10 or 100 bullets out of a batch of 10,000 rounds of ammunition was A-OK for the guy who invented it... YMMV.

    ...and paradoxically, redundant and fail-safe sensors. Thermometers tend to fail catastrophically, going to either the top or the bottom of the scale, which would cause your HVAC to freeze or toast everybody; so a voting circuit reading an odd number of thermometers would not just increase your reliability, it would also give you equally reliable fault alarms. Old-school 4-20 mA sensors don't fall prey to silent failure, exactly because of those 4 mA: a dead loop reads 0 mA, which is unambiguously a fault rather than a legitimate low reading. As always, YMMV.
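    A minimal sketch of 2-out-of-3 voting over 4-20 mA temperature transmitters, with out-of-range loop currents treated as fault alarms; the transmitter span and the example values are assumptions:

    ```python
    from statistics import median

    LIVE_MIN_MA, LIVE_MAX_MA = 4.0, 20.0   # 4-20 mA live-zero current loop
    T_MIN_C, T_MAX_C = -20.0, 80.0         # assumed transmitter span

    def to_celsius(milliamps):
        """Map loop current linearly onto the transmitter's temperature span."""
        frac = (milliamps - LIVE_MIN_MA) / (LIVE_MAX_MA - LIVE_MIN_MA)
        return T_MIN_C + frac * (T_MAX_C - T_MIN_C)

    def vote(readings_ma):
        """Median of the healthy channels, plus the indices of faulty ones."""
        faults = [i for i, ma in enumerate(readings_ma)
                  if not LIVE_MIN_MA <= ma <= LIVE_MAX_MA]
        healthy = [to_celsius(ma) for i, ma in enumerate(readings_ma)
                   if i not in faults]
        if not healthy:
            raise RuntimeError("all channels faulty")
        return median(healthy), faults

    print(vote([12.0, 12.1, 0.0]))   # third loop is dead: (30.3125, [2])
    ```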

    I'm not denying anything in the article, but sometimes you don't need perfect sampling. You can combine both, of course, with your datalogger polling a random 10% of your sensors every 100 minutes, coupled with "IRQ" alarms.
