#Why does Frigate track stationary


rugged kettle
#

Frigate didn't always track stationary objects. In fact, it didn't even track objects at all initially.

#

Let's look at an example use case: I want to record any cars that enter my driveway.

#

One might simply think "Why not just run object detection any time there is motion around the driveway area and notify if the bounding box is in that zone?"

#

With that approach, what video is related to the car that entered the driveway? Did it come from the left or right? One approach is to just record 24/7 or on motion (on any changed pixels) and not attempt to do that at all. This is what most other NVRs do. Just don't even try to identify a start and end for that object since it's hard and you will be wrong some portion of the time.

#

Couldn't you just look at when motion stopped and started? Motion for a video feed is nothing more than looking for pixels that are different than they were in previous frames. If the car entered the driveway while someone was mowing the grass, how would you know which motion was for the car and which for the person when they mow along the driveway or street? What if another car was driving the other direction on the street? Or what if it's a windy day and the bush by your mailbox is blowing around?
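To make the "motion is just changed pixels" point concrete, here is a minimal frame-differencing sketch. It is not Frigate's motion detector. The function names, the tiny grayscale frames, and the threshold of 25 are all made up for illustration.

```python
def motion_mask(prev_frame, curr_frame, threshold=25):
    """Mark each pixel whose grayscale value changed by more than `threshold`.

    `threshold` is an arbitrary illustrative value, not a Frigate default.
    """
    return [
        [abs(curr - prev) > threshold for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def changed_pixel_count(mask):
    """Count how many pixels were flagged as motion."""
    return sum(cell for row in mask for cell in row)


# Two tiny 3x4 grayscale "frames" (hypothetical data).
prev_frame = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
curr_frame = [
    [10, 200, 10, 10],  # e.g. a car's headlights
    [10, 10, 10, 10],
    [10, 10, 10, 90],   # e.g. a bush moving in the wind
]

mask = motion_mask(prev_frame, curr_frame)
print(changed_pixel_count(mask))
```

The mask says where pixels changed, but nothing about what changed them. The headlight pixel and the bush pixel look the same to this check, which is why motion alone cannot give a start and end for a specific car.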

#

If you think Frigate's ability to accurately determine the start and end for an event is limited now, this would be so much worse. In order to do it more accurately, you need to identify objects and track them with a unique id. In each subsequent frame, everything has moved a little and you need to determine which bounding boxes go with each object.

#

Tracking objects across frames is a challenging problem. Especially if you want to do it in real time. There are entire competitions for research algorithms to see which of them can do it the most accurately. Zero of them are accurate 100% of the time. Even the ones that can't do it in realtime. There is always an error rate in the algorithm.

#

Now consider that the car is driving down a street that has other cars parked along it. It will drive behind some of these cars and in front of others. There may even be a car driving the opposite direction.

#

Let's assume for now that we are NOT already tracking two parked cars on the street or the car parked in the driveway, i.e., there is no stationary object tracking.

#

As the car you are tracking approaches an area with 2 cars parked, the headlights reflect off the parked cars and the car parked in your driveway. The pixel values are different in that area, so there is motion detected. In the previous frame, you had a single bounding box from the car you are tracking. Now you have 4. The original object, the 2 cars on the street and the one in your driveway.

#

Now you have to determine which of those bounding boxes in this frame should be matched to the tracking id from the previous frame where you only had one. Remember, you have never seen these additional 3 cars before, so you know nothing about them. The algorithms here are fairly good. They use a Kalman filter to predict the new location using the historical bounding boxes, and the bounding box closest to the predicted location is linked. It's right sometimes, but the error rate is fairly high.
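A rough sketch of the "predict, then link the closest box" step described above. To keep it short, a plain constant-velocity extrapolation stands in for the Kalman filter, so this is a simplification and not what the tracker actually runs. All names and coordinates are hypothetical.

```python
import math


def center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def predict_center(history):
    """Extrapolate the next center from the last two boxes.

    Constant-velocity stand-in for a Kalman filter (assumption for brevity).
    """
    (px, py), (cx, cy) = center(history[-2]), center(history[-1])
    return (2 * cx - px, 2 * cy - py)


def nearest_box(predicted, detections):
    """Pick the detection whose center is closest to the predicted center."""
    return min(detections, key=lambda box: math.dist(predicted, center(box)))


# The tracked car moved 20 px to the right between the last two frames.
history = [(0, 100, 40, 130), (20, 100, 60, 130)]

# New frame: the moving car plus three cars never seen before.
detections = [
    (42, 100, 82, 130),    # the moving car
    (70, 95, 110, 125),    # parked car lit up by headlights
    (200, 50, 240, 80),    # another parked car
    (150, 140, 190, 170),  # car in the driveway
]

predicted = predict_center(history)
match = nearest_box(predicted, detections)
print(predicted, match)
```

Here the match is correct because the moving car's box is clearly the closest. If a parked car's box happened to sit nearer to the predicted point than the real car's box, the same logic would link the wrong one, and nothing in the history could flag it because those parked cars have no history.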

#

Stationary object tracking enters the chat...

#

Now let's assume that those other 3 cars were already being tracked as stationary objects, so the car driving down the street is a new 4th car. The object tracker knows we have had 3 cars and we now have 4. As the new car approaches the parked cars, the bounding boxes for all 4 cars are predicted based on the previous frames. The predicted boxes for the parked cars are pretty much a 100% overlap with the bounding boxes in the new frame. The parked cars are slam dunk matches to the tracking ids they had before and the only one left is the remaining bounding box, which gets assigned to the new car. This results in a much lower error rate. Not perfect, but better.
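The "slam dunk" matches can be sketched with an overlap score. The snippet below uses intersection over union (IoU) and a greedy best-score-first assignment. Both are assumptions chosen for illustration and are not taken from Norfair or Frigate. The track names, boxes, and `min_iou` value are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0


def match_tracks(predicted, detections, min_iou=0.3):
    """Greedy matching: take the highest-overlap pairs first.

    Returns {track_id: index into detections}. `min_iou` is an arbitrary cutoff.
    """
    pairs = sorted(
        ((iou(box, det), track_id, i)
         for track_id, box in predicted.items()
         for i, det in enumerate(detections)),
        reverse=True,
    )
    assigned, used = {}, set()
    for score, track_id, i in pairs:
        if score < min_iou:
            break
        if track_id in assigned or i in used:
            continue
        assigned[track_id] = i
        used.add(i)
    return assigned


# Predicted boxes for three stationary cars and one moving car.
predicted = {
    "parked_a": (100, 50, 140, 80),
    "parked_b": (160, 50, 200, 80),
    "driveway": (300, 120, 350, 160),
    "moving": (80, 60, 120, 90),
}

# Boxes detected in the new frame, in arbitrary order.
detections = [
    (161, 50, 201, 80),    # index 0
    (84, 61, 124, 91),     # index 1
    (100, 50, 140, 80),    # index 2
    (300, 121, 350, 161),  # index 3
]

print(match_tracks(predicted, detections))
```

The stationary cars claim their boxes first with near-perfect overlap, so only one box is left for the moving car even though it partly overlaps a parked car. Without the stationary tracks, that leftover box would have to compete with three unknown boxes.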

#

The most difficult scenario that causes IDs to be assigned incorrectly is when an object completely occludes another object. In this situation, a bounding box disappears and it's a bit of a toss-up when assigning the id since it's difficult to know which one is in front of the other. This happens for cars passing in front of other cars fairly often. It's something that we want to improve in the future.

#

I think your issue is one that we already identified a while back and plan to address in the next version. Tracked objects aren't events. There are several events related to a single tracked object. Tracking objects, stationary or not, is more of a behind-the-scenes functionality of Frigate to determine (the best it can) which segments of video are related to the event. Your issue is with the way information is presented.

#

In your specific situation, you got an event when a car entered your driveway, right? The snapshot was from that moment. If you had a notification, you would have gotten notified only at that specific moment with a snapshot of the car entering the zone. The fact that Frigate's algorithm to track a car across frames incorrectly associated 36 minutes of prior video footage with the same car driving by is a bit of a nitpick. Something that could probably be solved by simply skipping over video where no motion was detected by default when playing back the video.

#

Which alternative solution enables you to get a notification when a car enters your driveway from the street, but not when a car that was already in the driveway leaves? Could you do this with Ring? I haven't seen any other solutions that aim to provide this kind of functionality. Does it have 100% success rate? No. But any other solution that doesn't do it at all has a 0% success rate.

#

Fundamentally though, you have to accept that video analysis is never going to be perfect. The permutations of different pixel values across a stream of frames is nearly infinite. There are no systems that work perfectly. No video analysis algorithm works 100% of the time. They are all going to have errors. One way to eliminate errors is to use a system with fewer features or only use the features with low error rates.

livid bloom
#

Thanks for the explanation. It makes sense and is probably the way to make object tracking the most effective. However, it doesn't explain what I saw in the event timeline, even considering that tracking is difficult and errors will occur. It doesn't explain the time-shifted, out-of-order presentation of the actual sequence of events. I would not have questioned stationary object tracking if the timeline had been presented as it occurred.
To recap: a car came down my street, did a U-turn, and passed through a required zone in front of my driveway, triggering an object detection event as expected. That entire process took less than 30 seconds. The event timeline was as follows. It started with two cars parked across the street alternating with red bounding boxes. That was surprising because those cars aren't in any zone and weren't moving. I watched the clip, which is 36 minutes long. At 12 minutes another car drove by and into a garage. Again, it didn't enter a detection zone and wasn't boxed, and nothing more happened until the moving U-turn car that triggered the event came into view about 30 seconds from the end and was detected.
What should have been on the timeline is the detected moving car passing the parked cars, then the parked cars being redetected with bounding boxes, then the moving car being detected. The entire event was about 30 seconds and that is what should have been recorded. Your explanation does not explain the 36-minute out-of-sequence time shift in the events. This is not an object detection mistake, tracking jitter, or a detection vs. live sync bounding box issue. It has all the markings of a bug. The actual video record didn't show those two parked cars being obscured 36 minutes before it happened. If I hadn't watched the entire clip I would have chalked this up to false object detection. It wasn't. All objects were being tracked correctly; what was incorrect was the time stamping of these events and the resulting timeline.

rugged kettle
#

Having authored the code that does this, it's all totally explained by a single ID swap. The parked car had an ID and was being tracked. During that time, another car drove by and entered a garage. That wasn't saved as a separate event because the car didn't enter a required zone. This is the desired behavior. That just happened to occur while the parked car was being tracked already. At 30 minutes, the other car enters the frame and when it passed the parked car, the tracking IDs were swapped. Now Frigate thinks the car driving in was the one that had been parked. That car enters the zone and the event is created. I have looked at frame-by-frame data for events like this and it's a very plausible explanation for what happened.
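A toy model of why one swap produces a 36-minute event. It assumes, purely for illustration, that an event's start time is inherited from the moment its track ID was first seen. The function, the dictionary layout, and the timestamps (in seconds) are hypothetical and only mirror the numbers from this thread.

```python
def event_for_zone_entry(first_seen, assigned_id, entry_time):
    """Build an event for whichever track ID the zone-entering box was linked to.

    Assumption: the event starts when that track ID was first seen.
    """
    start = first_seen[assigned_id]
    return {"id": assigned_id, "start": start, "duration": entry_time - start}


# The parked car has been tracked since t=0. The moving car appears at
# t=2130 s (35.5 minutes) and enters the zone 30 seconds later.
first_seen = {"parked": 0, "moving": 2130}
zone_entry_time = 2160

# Correct association: the box in the zone belongs to the "moving" track.
correct = event_for_zone_entry(first_seen, "moving", zone_entry_time)

# After an ID swap: the same box is linked to the "parked" track instead.
swapped = event_for_zone_entry(first_seen, "parked", zone_entry_time)

print(correct["duration"], swapped["duration"])
```

Under that assumption, nothing was time-shifted. The zone entry happens at the same moment in both cases. The only difference is which track's history gets attached to it.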

#

At this point, I have invested as much time as I am willing to in explaining what is happening. I have actual improvements to Frigate on my to-do list.

livid bloom
#

Sounds like you know what the bug is, unless ID swaps are a feature 🤣

rugged kettle
#

Again, there isn't a reality where IDs are never swapped. There is going to be an error rate even if I spend a lifetime improving it.

livid bloom
#

If this is in fact true, Frigate will be forever plagued with counterintuitive behavior.

#

I actually don't understand why IDs should ever be swapped as something that's an ID should be immutable to be reliable

rugged kettle
#

I am at a loss for how to more clearly explain how bounding boxes from the next frame are associated with the IDs being tracked than what I provided above. Maybe looking at the docs for the library frigate uses will provide more insight, but I doubt it. https://github.com/tryolabs/norfair

livid bloom
#

I understand objects going out of view and then coming back into view, but not why IDs would be swapped between objects. Furthermore, there should be some hysteresis to prevent objects from disappearing and losing track when briefly occluded, particularly with stationary objects that have been in the same location for a while. It's likely that they are still there after going missing from frames for a short period, especially without any prior movement of that stationary object, as it would be physically unlikely for an object like a car to actually disappear in a few frames. Also, a moving object shouldn't acquire a stationary object's ID and vice versa, as that doesn't make physical sense. It's a recipe for the bizarre timeline that I experienced. Two similar moving objects near each other being mistaken for each other makes sense. Stationary and moving object confusion doesn't make sense. I doubt that examining the Norfair library will clear things up because I'm sure it's how it's being used, but a brief look indicates that the docs are approachable by someone who is not an expert, so I'll give it a look. ... The use of the ReID function seems to be where the problem lies. The example of all objects moving is interesting and tells me that occluded stationary objects should be weighted higher for reacquiring the same ID, and there is no reason for the moving object that wasn't occluded to be ReID'd at all as it presumably didn't lose track. I'll have to watch the ReID video in slow motion to be sure, but it doesn't appear that any objects swapped IDs; they got new IDs.

mental jacinth
#

The example of all objects moving is interesting and tells me that occluded stationary objects should be weighted higher for reacquiring the same ID

this statement shows that you do not understand the problem

#

every frame you are matching bounding boxes to existing tracked object IDs

#

so when you are trying to match 1 object that is stationary to a box and 1 object that is moving to a box, you do not know which box correlates to the moving object and which box correlates to the stationary object

#

also for clarity, like I said previously, frigate does not currently use reID as there has not been an implementation that was both accurate and efficient

livid bloom
#

Update: The exact scenario that initiated my original questions just occurred. A car came down the street, occluded a car across the street (this time there was only one parked car), made a U-turn, passed through a mandatory zone, and an event was created. The event was 28 seconds long and the timeline ONLY showed the moving car being detected, NOT the stationary car being re-detected 36 minutes prior. This is as expected, unlike the prior occurrence that was bizarre.

mental jacinth
#

It would seem to me that one of the most important data items to track about a stationary object is where it's at in frame for a given camera

all objects need this data constantly

livid bloom
#

After all the discussion I still can't rationalize objects swapping IDs. I don't understand why this would ever be performed. Acquiring a new ID yes, swapping IDs No. The utility of that escapes me

young escarp
#

Are you truly being serious Occam? Blake and Nick have taken A LOT of their time explaining and explaining AND explaining how Frigate works. Specifically Blake did an excellent write up for you.

mental jacinth
#

you have two cars, car-1 and car-2. car-1 is parked and car-2 drives right beside and occludes car-1. The bounding boxes are so similar at that moment that car-1 becomes car-2 and vice versa because there is no direct way to know which box belongs to which car.

#

creating a new ID would make no sense, you only ever had 2 cars

#

you wouldn't just assume car-1 left the scene because it was occluded and now there is a car-3

#

it is the same car, the GOAL of the object tracking would be that every object keeps its same ID the entire time it is in the camera frame

#

Like Blake already said there are competitions and many different algorithms that try to solve this problem and have an error rate. If you truly believe it shouldn't be a problem because it is simple then I don't know what else can be said.

livid bloom
#

The fact that nearly the exact same scenario that failed yesterday worked correctly today proves you can do it right. You need to figure out why you did it wrong. BTW, cosmic rays flipping a bit will not be accepted as a cause. I was actually told that once by a so-called electrical engineer with an MS explaining a rare malfunction 🤣. I told him to redesign the product to prevent cosmic ray failure. He decided it would be easier to find the firmware bug and fix it.

rugged kettle
#

Glad to see you are coming around. From the very beginning we have been saying, "That happens because the tracking ids sometimes get swapped when an object is occluded. We know this can be improved and already have plans to improve it and reduce the error rate in the future."

livid bloom
#

When I don't get 36-minute events with 35.5 minutes of nonsense preceding the actual event of interest, I will have come around. BTW, based on what happened and the explanations of what happened, I'm guessing it's conceivable that I could encounter event timelines lasting hours, with the actual event showing up after watching a long clip of grass growing. When I see these extended event timelines, I know to look at the snapshot, view the end of the clip, and make a judgement call on viewing anything else in the clip or clicking through a timeline that's likely to be uninteresting.

rugged kettle
#

I fully expect that to happen sometimes. Not the way I want it to be, but that's how it works right now. There is a convenient icon above the video to jump you to that point in time. The context you are missing is how much of an improvement the changes are in 0.13 over 0.12.

#

And if you are using motion mode for record, then most of that will be skipped over automatically.

#

Yesterday you were willing to die on the hill that "stationary object tracking was the cause of all our problems and we should remove it". It's the whole reason I did this write up.

livid bloom
#

As they say on TV, but wait there's more. My wife pulled into the driveway 8 minutes ago and no event yet ... It's still in progress ! !! Delayed events are effectively non-events

rugged kettle
#

Then there wasn't a frame where the bottom center of the bounding box was in the zone. Probably best to extend it further down the driveway towards the house.
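The bottom-center rule reduces zone membership to a point-in-polygon check. Here is a minimal sketch using the standard ray casting test. The function names, zone coordinates, and boxes are hypothetical, and Frigate's own implementation may differ.

```python
def bottom_center(box):
    """Bottom-center point of an (x1, y1, x2, y2) box, with y growing downward."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, y2)


def point_in_polygon(point, polygon):
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        xa, ya = polygon[i]
        xb, yb = polygon[(i + 1) % n]
        if (ya > y) != (yb > y):  # edge spans the ray's y coordinate
            x_cross = xa + (y - ya) * (xb - xa) / (yb - ya)
            if x < x_cross:
                inside = not inside
    return inside


def in_zone(box, zone):
    """A box counts as in the zone only if its bottom center is inside it."""
    return point_in_polygon(bottom_center(box), zone)


# Hypothetical rectangular zone near the driveway entrance.
zone = [(100, 200), (300, 200), (300, 400), (100, 400)]

overlapping_box = (250, 150, 450, 300)  # overlaps the zone, bottom center (350, 300)
entering_box = (150, 150, 250, 260)     # bottom center (200, 260)

print(in_zone(overlapping_box, zone), in_zone(entering_box, zone))
```

The first box visibly overlaps the zone but does not count, because its bottom-center point falls outside the polygon. That is the situation where extending the zone further down the driveway helps.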

livid bloom
#

Well then looking at the recordings why do I see Car 99.6% duration in progress ... I don't make this stuff up .. still no event after 15 minutes

rugged kettle
#

You are saying there is "no event", but then you are also saying you are looking at an in progress event. How can both of those things be true? Perhaps you mean "no notification"?

livid bloom
#

My wife's car drove right through the entry zone ... no event posted under events .. looking at the recordings list I saw that... Now 17 minutes later an event was posted in the event log of a different car driving through the entry zone, and the original in-progress event of my wife's car pulling into the driveway is still in progress. Not trying to be snide, but there are two buttons under the camera view, Events and Recordings. I see an in-progress event under recordings ... nothing under events for that event

rugged kettle
#

It's hard to explain without seeing the video for myself. I would guess that your entry zone needs some tweaking to make sure a frame is captured where the bottom center of the bounding box is inside of it. Not sure what you are describing with the other car.

#

The recordings and events pull from the same api endpoint, so you should see the same events in both places.

livid bloom
#

And under ongoing events there is a 31-minute in-progress event where the first item in the timeline is an empty red bounding box approximately where my wife's car entered the entry zone. The next item in the timeline is an empty red bounding box where my wife's car is parked.

rugged kettle
#

The timeline icons don't work properly for in progress events. It's a known issue for the beta.

livid bloom
#

Are you saying the bounding box that looks like it is associated with my wife's car entering the detection zone is an illusion? Something triggered its creation, maybe just motion and not detection. That camera angle is certainly not optimal for zone marking. I'll tweak it and chalk this up to the bounding box not being in the zone enough or aligned with the zone enough

#

When I get a camera installed over the garage I'll be able to better define zones .. this looks like an installation/configuration issue

rugged kettle
#

I'm saying that the video doesn't actually seek to the point in time for the bounding box. It draws the box, but the video doesn't seek properly.

#

This happens for in progress events only.

livid bloom
#

I'd really like to be able to replay these scenarios in debug mode by feeding a playback of a recording into the stream, or have all the metadata available when reviewing clips. That would eliminate guessing about what happened. It would be especially helpful to view in slow motion. At 5 frames/second, how resource intensive would it be to capture bounding boxes after an object is detected? You already have the zones, which are static. You could cache that data and only write it out periodically to minimize DB I/O

livid bloom
rugged kettle
livid bloom
#

Great ... that will really help understanding what happened and assist in tuning

#

Question: Does the number of object types to be detected in a zone impact detection efficiency or time? I.e., is it easier to ID one object vs. 20?

#

I'm not sure how object identification works. Is it a serial check for one at a time, or parallel, or "tell me any objects present, then I'll decide if I want a detection event"?

rugged kettle
#

It doesn't impact anything. The number of active objects being tracked would impact total load.

#

It will try and consolidate motion and current objects so they can be updated in a single object detection request. If things are really far apart, it will most likely run two separate detections.

#

Kinda depends on the resolution of the camera. Higher resolutions often result in more regions so they can detect smaller objects.

livid bloom
#

So if you detect motion and submit a region all objects known to "recognizer" will be returned or do you need to specify what objects you're looking for ?

rugged kettle
#

The model will return everything it knows how to recognize, but only the objects listed for the camera in your config are kept. Everything else is ignored.
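As a rough illustration of that filtering step: the detection dictionaries and the `tracked_labels` set below are made-up shapes for the example, not Frigate's internal data structures or config format.

```python
def keep_tracked_labels(detections, tracked_labels):
    """Drop every detection whose label is not in the camera's tracked list."""
    return [d for d in detections if d["label"] in tracked_labels]


# Hypothetical raw model output for one region.
model_output = [
    {"label": "car", "score": 0.91},
    {"label": "bird", "score": 0.80},
    {"label": "person", "score": 0.75},
]

# Hypothetical stand-in for the objects listed for this camera in the config.
tracked_labels = {"car", "person"}

kept = keep_tracked_labels(model_output, tracked_labels)
print([d["label"] for d in kept])
```

The model does the same amount of work either way. The list only decides which results survive, which fits the earlier answer that the number of object types doesn't impact anything.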

livid bloom
#

OK ... thanks

livid bloom
#

I'm trying to refine zones. The fact that only the bottom center of the bounding box of an object needs to be in a zone to be detected means the zones I set up for parked cars only need to be large enough and positioned to include possible locations of the bottom center of the bounding box for cars. And, the bottom of the box will always be parallel to the bottom of the frame. And, the bounding box does not need to be entirely within the zone. So inclusion in a zone reduces to a point being in a zone. Are these true statements?

#

Furthermore, it seems like zones need to accommodate some bounding box jitter for stationary objects, or imprecision in general. And jitter and imprecision are also based only on the location of the bottom center of the bounding box. True?

#

Is continued presence in a zone after zone entry detection also based on the bottom center of the object's bounding box?

livid bloom
#

That will make zone tuning more manageable. The (my) natural tendency was to try to fit objects in zones because that's how you usually see zone examples and zones that don't enclose the object just look wrong 🤣

#

Once it dawned on me that I only need to accommodate a point, things became much more clear

rugged kettle
#

Yea. That's a common assumption. The single point is the best approach I have found and I think it works pretty well.

livid bloom
#

The single point needs more emphasis in the docs. Bottom center doesn't convey the full implication for noobs unskilled in object detection tech.

chilly meadow
#

Thanks for sharing this @rugged kettle, really helpful.
From what I'm seeing, Frigate is above and beyond any other solution when it comes to object tracking, and it's pretty amazing given it's mostly developed by two guys with a day job.
These are the two most common challenges I'm hearing (and feeling), and why I (speaking for myself only) think they're not such a high priority:

#
  1. Occlusion. While this sometimes happens with humans (usually one person walking by as someone is sitting on a bench or sofa), this mostly seems to relate to cars, and more specifically, cars in the street (unless you have a fleet in your driveway). It's also improved significantly in 0.13 compared to 0.12. Personally, the impact is that occasionally events related to cars on the street will have weird ordering. Not such a big deal to me.
#
  2. New events as a result of objects being treated as new when they aren't. I'm seeing this often with objects "disappearing" without moving. While I think it could be improved (for example, stationary objects in the middle of the frame are unlikely to have gone away in one frame), I'm sure there are things I'm overlooking in terms of complexity and/or other complications it could introduce, and in essence, this is mostly a nuisance, and not a real deal breaker. I also hope this improves once I switch over to a Frigate+ model (already labeled about 400 images, just waiting on new HW). Another case I'm often seeing is cars turning around and being treated as a new object. Happens frequently in my cul-de-sac as cars come by, slow down as they turn around, and go the other direction, sometimes even exiting the frame for a second or two. That last example actually became more common in 0.13, probably due to the changes that improved 1, and I much prefer this trade-off over the 0.12 situation. Once again, mostly a nuisance.
#

When I say I believe these are low priority, there are other things that I think are high priority:

  1. Better motion detection. Wind, shadows, leaves, lights turning on-off (I know the last one is better in 0.13, but is still happening) all get detected as motion. These are obviously really hard to solve, and 0.13 is way better than 0.12, but better motion detection would mean better performance, fewer false positive detections, and fewer cases where stationary objects disappear (because there will be no movement to trigger it)
  2. Better recording screen, with proper scrubbing and motion markers. I know this is already planned and am really excited about it. The only reason I don't do 24/7 recordings is that without motion markers I have no practical way to go over them unless I know timestamps.
  3. Authentication. The hot potato in the Frigate world. I'm glad to hear it's no longer taboo, and hope we'll have something soon.