Tips and Tricks

Here are some recipes for using Trouve to its full potential.

Specify Sample Period for Reuse

If you’re looking for multiple events in the same data set, one shortcut is to specify the sample period once. The find_events function is curried via toolz.curry, so you can bind the period in one call and reuse the resulting function for each condition.

>>> import numpy as np
>>> import trouve as tr
>>> x = np.array([1, 1, 2, 0, 2])
>>> period = 1
>>> find_events = tr.find_events(period=period)
>>> events_1 = find_events(x == 1)
>>> events_1.as_array()
array([ 1.,  1.,  0.,  0.,  0.])
>>> events_2 = find_events(x == 2)
>>> events_2.to_array()
array([ 0.,  0.,  1.,  0.,  1.])
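The idea behind currying can be sketched without Trouve. The example below is only an illustration: find_runs is a hypothetical stand-in for an event finder, and functools.partial is used in place of toolz.curry because it binds a keyword argument in a similar way.

```python
from functools import partial


def find_runs(condition, period):
    """Hypothetical stand-in: return (start_time, duration) for each run of True."""
    runs = []
    start = None
    for i, active in enumerate(condition):
        if active and start is None:
            start = i  # a new run begins
        elif not active and start is not None:
            runs.append((start * period, (i - start) * period))
            start = None
    if start is not None:  # a run that lasts until the final sample
        runs.append((start * period, (len(condition) - start) * period))
    return runs


# Bind the sample period once, then reuse the bound function for each condition.
find_runs_1s = partial(find_runs, period=1)

x = [1, 1, 2, 0, 2]
runs_of_1 = find_runs_1s([v == 1 for v in x])
runs_of_2 = find_runs_1s([v == 2 for v in x])
print(runs_of_1)  # [(0, 2)]
print(runs_of_2)  # [(2, 1), (4, 1)]
```

Binding the period in one place means a change to the sample rate only has to be made once.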

Multi-parameter Conditional Array

The condition can be as complicated as necessary, combining multiple inputs with the ampersand (&) for element-wise AND and the pipe (|) for element-wise OR. The following example finds events where x > 0 and y == 2, or z <= 1: ((x > 0) & (y == 2)) | (z <= 1)

When using more than one parameter, you must wrap each expression in its own parentheses, because & and | bind more tightly than comparison operators such as > and ==.

>>> x = np.array([1, 1, 0, 0, 1, 1, 0, 1, 0, 1])
>>> y = np.array([2, 2, 0, 0, 0, 0, 0, 0, 0, 2])
>>> z = np.array([2, 2, 2, 3, 3, 0, 3, 3, 3, 3])
>>> cond = ((x > 0) & (y == 2)) | (z <= 1)
>>> events = tr.find_events(cond, period=1)
>>> events.to_array()
array([ 1.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  1.])
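The need for parentheses comes from Python operator precedence rather than from Trouve itself. The short sketch below uses only NumPy, with shortened versions of the arrays above, to show what happens when the parentheses are left out.

```python
import numpy as np

x = np.array([1, 1, 0, 0, 1])
y = np.array([2, 2, 0, 0, 0])

# Parenthesized: each comparison is evaluated first, then combined element-wise.
cond = (x > 0) & (y == 2)
print(cond)  # [ True  True False False False]

# Without parentheses, & binds tighter than the comparisons, so Python reads
# this as the chained comparison x > (0 & y) == 2, which NumPy rejects because
# the truth value of a multi-element array is ambiguous.
try:
    bad = x > 0 & y == 2
except ValueError as err:
    failed = True
    print("ValueError:", err)
else:
    failed = False
```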


Events and the numpy.ma Module

The Events.as_mask method was developed to integrate directly with numpy.ma.MaskedArray and numpy.ma.masked_where. The numpy.ma module makes it easy to do things like sum an array or find its min/max based on your condition.

>>> x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
>>> cond = x == 1
>>> events = tr.find_events(cond, period=1)
>>> mask = events.as_mask()
>>> np.ma.masked_where(mask, x)
masked_array(data = [-- 1 -- -- 1 1 -- 1 -- 1],
             mask = [ True False  True  True False False  True False  True False],
       fill_value = 999999)

>>> masked_x = np.ma.MaskedArray(x, mask)
>>> masked_x.sum()
5
>>> x.sum()
0
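The same masking approach extends to min, max, and mean of a second signal. The sketch below runs without Trouve: it builds the mask by hand, assuming (as the as_mask output above suggests) that the mask is True wherever no event is active. The array y is illustrative data.

```python
import numpy as np

x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
y = np.array([1, 2, 3, 4, 5, 4, 3, 2, 1, 0])

# Assumption: True means "hide this sample", i.e. no event is active there.
# This mirrors the mask printed above, built here without Trouve.
mask = ~(x == 1)

# Statistics on the masked array only consider samples where x == 1.
masked_y = np.ma.MaskedArray(y, mask)
print(masked_y.min())   # 0
print(masked_y.max())   # 5
print(masked_y.mean())  # 2.6
```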

Getting Events into a pandas.DataFrame

The pandas.DataFrame data structure and Trouve fit nicely together. You can loop through each occurrence and append a statistical description to the dataframe. This is helpful if you’re trying to pull features out of time-series data for a machine learning algorithm, or if you want to describe all events found in a data set and then use pandas idioms to further process them.

>>> import pandas as pd
>>> x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
>>> y = np.array([1, 2, 3, 4, 5, 4, 3, 2, 1, 0])
>>> cond = x == 1
>>> events = tr.find_events(cond, period=1)
>>> columns = ['duration', 'ave_y_value', 'y_value_at_event_start']
>>> df = pd.DataFrame(index=pd.RangeIndex(len(events)), columns=columns)
>>> for i, occurrence in enumerate(events):
...     df.iloc[i] = dict(
...         duration=occurrence.duration,
...         ave_y_value=y[occurrence.slice].mean(),
...         y_value_at_event_start=y[occurrence.start]
...     )
>>> df
  duration ave_y_value y_value_at_event_start
0        1           2                      2
1        2         4.5                      5
2        1           2                      2
3        1           0                      0
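An alternative to filling the frame row by row is to collect one dictionary per occurrence and build the frame at the end. The sketch below shows the collection step using NumPy only; the run detection with np.diff is a hand-written stand-in for iterating over Trouve occurrences, not Trouve's own method.

```python
import numpy as np

x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
y = np.array([1, 2, 3, 4, 5, 4, 3, 2, 1, 0])
cond = x == 1

# Locate run boundaries by hand (stand-in for looping over occurrences).
padded = np.concatenate(([False], cond, [False])).astype(int)
edges = np.diff(padded)
starts = np.flatnonzero(edges == 1)   # first index of each run
stops = np.flatnonzero(edges == -1)   # one past the last index of each run

# One plain dictionary per occurrence, with the same columns as above.
rows = [
    {
        "duration": int(stop - start),
        "ave_y_value": float(y[start:stop].mean()),
        "y_value_at_event_start": int(y[start]),
    }
    for start, stop in zip(starts, stops)
]
for row in rows:
    print(row)
```

Passing rows to pd.DataFrame(rows) should then produce the same table in a single step, with numeric column dtypes instead of the object dtype that an initially empty frame starts with.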

Finding Inverse Events

If you’re interested in when events aren’t active, then you can use the inverse of the condition. This would be helpful if you wanted to know the average, min, or max time between events.

>>> x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
>>> cond = x == 1
>>> events = tr.find_events(cond, period=1)
>>> inv_events = tr.find_events(~cond, period=1)
>>> events.as_array()
array([ 0.,  1.,  0.,  0.,  1.,  1.,  0.,  1.,  0.,  1.])
>>> inv_events.to_array()
array([ 1.,  0.,  1.,  1.,  0.,  0.,  1.,  0.,  1.,  0.])
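To make the "time between events" idea concrete, the sketch below measures the inactive stretches with NumPy alone. It is a hand-written illustration; with Trouve, the equivalent would presumably be statistics on inv_events.durations.

```python
import numpy as np

x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
period = 1  # seconds per sample
inactive = ~(x == 1)

# Find where each inactive stretch starts and stops.
padded = np.concatenate(([False], inactive, [False])).astype(int)
edges = np.diff(padded)
starts = np.flatnonzero(edges == 1)
stops = np.flatnonzero(edges == -1)
gap_durations = (stops - starts) * period

print(gap_durations)                             # [1 2 1 1]
print(gap_durations.mean())                      # 1.25
print(gap_durations.min(), gap_durations.max())  # 1 2
```

Note that the first stretch here precedes the first event, so it is not strictly a gap between two events; drop it if only the spacing between events matters.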

Events.durations Tips

Total time, in seconds, that events are active:

>>> x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
>>> cond = x == 1
>>> events = tr.find_events(cond, period=1)
>>> events.durations.sum()
5

Occurrence rate (occurrences per second):

>>> x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
>>> cond = x == 1
>>> period = 1
>>> events = tr.find_events(cond, period=period)
>>> len(events) / (x.size * period)
0.4

Creating a histogram of event lengths

>>> x = np.array([-1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
>>> cond = x == 1
>>> events = tr.find_events(cond, period=1)
>>> np.histogram(events.durations, [0, 0.5, 1, 1.5, 2, 2.5])
(array([0, 0, 3, 0, 1], dtype=int64), array([ 0. ,  0.5,  1. ,  1.5,  2. ,  2.5]))
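The choice of bin edges matters when durations are whole multiples of the period. The NumPy-only sketch below uses the event lengths implied by the example above (1, 2, 1, and 1 seconds) to compare edges placed between the possible values with edges placed on them.

```python
import numpy as np

durations = np.array([1, 2, 1, 1])  # event lengths in seconds

# Edges halfway between the possible durations keep each length in its own bin.
counts, edges = np.histogram(durations, bins=[0.5, 1.5, 2.5])
print(counts)  # [3 1]

# Edges on the values themselves: bins are half-open, [left, right), except
# the last one, which also includes its right edge, so 1 and 2 share a bin.
merged_counts, _ = np.histogram(durations, bins=[0, 1, 2])
print(merged_counts)  # [0 4]
```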