I was trying to read the 600th frame of a video using cv2.VideoCapture. However, I found that the following two methods both successfully read an image, but the images are different. Which is the correct way to read the 600th frame, and why are the resulting images different? Is it related to mp4 encoding? Thanks!
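    # Approach A: seek directly to frame index 600, then read
    cap = cv2.VideoCapture("test.mp4")
    print(cap.get(cv2.CAP_PROP_FRAME_COUNT))  # 1187
    cap.set(1, 600)  # 1 is the numeric value of cv2.CAP_PROP_POS_FRAMES
    ret, frame1 = cap.read()  # Read the frame

    # Approach B: read frames one by one up to frame index 600
    cap = cv2.VideoCapture("test.mp4")
    print(cap.get(cv2.CAP_PROP_FRAME_COUNT))  # 1187
    for i in range(601):
        ret, frame2 = cap.read()  # Read the frame
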
To read the Xth frame of a video, or similarly to determine the number of frames in a video file, there are two methods:
- Method #1: Utilize built-in OpenCV properties to access video file meta information, which is fast and efficient but inaccurate
- Method #2: Manually loop over each frame in the video file with a counter, which is slow and inefficient but accurate
Method #1 is fast and relies on OpenCV’s video property functionality which almost instantaneously determines the frame information in a video file. However, there is an accuracy trade-off since it is dependent on your OpenCV and video codec versions. From the documentation:
Reading / writing properties involves many layers. Some unexpected result might happen along this chain. Effective behavior depends from device hardware, driver and API Backend.
On the other hand, manually counting each frame until we reach the desired frame number will be 100% accurate, although it will be significantly slower. Here’s an example to demonstrate the inconsistent behavior between the two methods. It attempts Method #1 by default; if that fails, it automatically falls back to Method #2.

    import timeit

    import cv2


    def frame_count(video_path, manual=False):
        def manual_count(handler):
            # Decode every frame and count the successful reads
            frames = 0
            while True:
                status, frame = handler.read()
                if not status:
                    break
                frames += 1
            return frames

        cap = cv2.VideoCapture(video_path)
        # Slow, inefficient but 100% accurate method
        if manual:
            frames = manual_count(cap)
        # Fast, efficient but inaccurate method
        else:
            try:
                frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
            except Exception:
                frames = manual_count(cap)
        cap.release()
        return frames


    if __name__ == '__main__':
        # Time Method #1 (metadata lookup)
        start = timeit.default_timer()
        print('frames:', frame_count('testtest.mp4', manual=False))
        print(timeit.default_timer() - start, '(s)')

        # Time Method #2 (manual count)
        start = timeit.default_timer()
        print('frames:', frame_count('testtest.mp4', manual=True))
        print(timeit.default_timer() - start, '(s)')

Method #1 results

    frames: 3671
    0.018054921 (s)

Method #2 results

    frames: 3521
    9.447095287 (s)

Notice how the two methods differ by 150 frames and Method #2 is significantly slower than Method #1. In general, if you need speed and are willing to sacrifice accuracy, use Method #1. In situations where you’re fine with a delay but need the exact frame, use Method #2.
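One practical wrinkle with the fallback idea: the OpenCV documentation says cap.get returns 0 for a property the backend does not support, so a missing frame count usually shows up as a non-positive value rather than as an exception. Below is a minimal sketch of a fallback that checks for that case. The names count_frames and demo are hypothetical, and the helper assumes a freshly opened capture object with cv2.VideoCapture-style get() and read() methods. The property id is passed in so the helper itself does not import cv2.

```python
def count_frames(cap, prop_id):
    """Return (frame_count, source) for a freshly opened capture.

    `cap` is assumed to behave like cv2.VideoCapture (get() and read()).
    `prop_id` is expected to be cv2.CAP_PROP_FRAME_COUNT.
    """
    reported = int(cap.get(prop_id))
    if reported > 0:
        # Fast path: trust the container metadata (may still be inexact).
        return reported, "metadata"

    # Metadata missing or unusable: decode every frame and count.
    frames = 0
    while True:
        status, _ = cap.read()
        if not status:
            break
        frames += 1
    return frames, "manual"


def demo(video_path):
    """Hypothetical usage with a real file; requires OpenCV to be installed."""
    import cv2

    cap = cv2.VideoCapture(video_path)
    try:
        return count_frames(cap, cv2.CAP_PROP_FRAME_COUNT)
    finally:
        cap.release()
```

Calling demo('testtest.mp4') would return the count together with a label saying which path produced it, which makes it easier to tell when the number is only an estimate from metadata.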
So the conclusion is: when you’re using cap.get or any of the built-in VideoCaptureProperties such as cv2.CAP_PROP_FRAME_COUNT, you’re essentially using Method #1, which is fast and efficient but inaccurate. In your first example, when you try to read an exact frame with cap.set, you’re actually getting an "estimated" frame close to the desired Xth frame rather than the exact one.

In contrast, in your second code snippet you are manually going through each frame one by one, so when it lands on the Xth frame, that is guaranteed to be exact. That’s why, when you try to read the same frame number using each of the methods, you may get different images.
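To check this on your own file, the two approaches can be wrapped in small helpers and their results compared. This is only a sketch: read_frame_sequential, read_frame_seek, and frames_match are hypothetical names, the index is zero-based (index 600 matches the range(601) loop in the question), and the comparison step assumes NumPy is installed alongside OpenCV.

```python
def read_frame_sequential(cap, index):
    """Decode frames 0..index in order; return (ok, frame) for frame `index`."""
    if index < 0:
        raise ValueError("index must be non-negative")
    frame = None
    for _ in range(index + 1):
        ok, frame = cap.read()
        if not ok:
            # The video ended before the requested frame was reached.
            return False, None
    return True, frame


def read_frame_seek(cap, index, pos_frames_prop):
    """Seek with cap.set, then read once. Fast, but may land on another frame.

    `pos_frames_prop` is expected to be cv2.CAP_PROP_POS_FRAMES.
    """
    cap.set(pos_frames_prop, index)
    return cap.read()


def frames_match(video_path, index):
    """Hypothetical check: do both approaches return identical pixels?"""
    import cv2
    import numpy as np

    seek_cap = cv2.VideoCapture(video_path)
    ok_seek, by_seek = read_frame_seek(seek_cap, index, cv2.CAP_PROP_POS_FRAMES)
    seek_cap.release()

    seq_cap = cv2.VideoCapture(video_path)
    ok_seq, by_read = read_frame_sequential(seq_cap, index)
    seq_cap.release()

    return bool(ok_seek and ok_seq and np.array_equal(by_seek, by_read))
```

If frames_match("test.mp4", 600) returns False, seeking on that file is not frame-accurate with your OpenCV build, and the sequential read is the one to rely on.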
Answered By – nathancy