Thursday 1 December 2011

Poisson Process prediction vs the actual data

In this post, I compare the actual events annotated by a given user with the events predicted by Poisson processes (PPs). I first take 14 training days to calculate the mean number of events per day for each event type. In the given event file, only user ecenergy39 has the maximum of 17 days of historical data, so I use only this user to generate the results.
As you can see from the above graph, with 14 days as the training period, TV and Kettle are predicted to occur in the next 24 hours with a very high probability, approximately 80%. The actual results show that the TV event was annotated 4 times, while the "kettle" event was annotated once.
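
The calculation described above can be sketched roughly as follows. This is only an illustration, not the code used to produce the graphs: it assumes that the probability plotted is the chance of at least one event in the next 24 hours, 1 - exp(-mean), for a Poisson process whose rate is the mean number of events per training day. The function names and the daily counts are made up for the example and are not taken from the ecenergy39 data.

```python
import math


def daily_rate(counts_per_day):
    """Mean number of events per day over the training window."""
    return sum(counts_per_day) / len(counts_per_day)


def prob_at_least_one(rate, days=1.0):
    """P(N >= 1) within `days` for a Poisson process with `rate` events per day."""
    return 1.0 - math.exp(-rate * days)


def expanding_window_probs(counts_per_day, start):
    """Probability for the next day, using the first `start`, `start + 1`, ...
    days as training data (one value per training-window length)."""
    return [
        prob_at_least_one(daily_rate(counts_per_day[:n]))
        for n in range(start, len(counts_per_day) + 1)
    ]


# Hypothetical daily counts for one event type (NOT the real data).
example_counts = [2, 1, 3, 0, 2, 1, 2, 2, 1, 3, 0, 2, 1, 2, 4, 1]

for n, p in zip(range(14, 17), expanding_window_probs(example_counts, 14)):
    print(f"{n} training days -> P(at least one event in next 24h) = {p:.2f}")
```

Under this assumption, adding a training day whose event count is above the current mean raises the estimated rate and therefore the predicted probability, while adding a day with fewer events than the mean lowers it.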

Then, we use 15 days as the training period to predict the events occurring in the next 24 hours. The result is below:

The process is then repeated to predict the 17th day, with the result as follows:


The event data were collected in a real experiment in which users manually annotated the event information, so the data are quite noisy. We need to find a way to increase the accuracy of the event information. One direction could be to recognise event patterns automatically.

2 comments:

  1. Thank you for posting this Henry.

    I am surprised by the following: the probability of the washing machine does not change much between figure 2 and figure 3; in fact, it rises. On the contrary, I was expecting it to decrease, as on day 16 we can see that there is a washing machine event.

    This makes me doubt how good this model is, in this form.

    Henry, does my comment make sense to you?

  2. The Poisson model makes a prediction based on the historical data. The probability of occurrence of an event is P(event, mean). If the mean number of events per day (e.g., Washing Machine) increases, the probability that the event will occur also rises.
    I think the prediction is still correct. The probability of the washing machine decreases from figure 2 to figure 3, then it rises because there is a washing machine event in figure 2.

    Please note that in this graph, the higher the probability, the more likely the event is to occur.
