
Frame Interpolation: How AI Can Enhance Video Frame Rate


As a video editor and former gamer, I've always paid attention to video frame rates. Whether I was trying to achieve buttery-smooth slow-motion videos or upscale 30 FPS Fortnite clips to 60 FPS, setting frame rates correctly has always been important for conveying a certain experience to an audience. However, a problem that occurs when slowing down footage that already has a low frame rate is choppy playback. The footage simply doesn't have enough frames to fill each second once time is stretched out, so the player holds each original frame on screen longer (which is why your camera's slow-motion mode records at high frame rates like 120 or 240 FPS). This is where AI comes in...


AI has now become a tool video editors can use to generate these missing frames in a process known as frame interpolation. The process predicts and generates new frames between the original frames, allowing slowed-down footage to maintain (or even surpass) its original frame rate.


Frame interpolation: before vs. after on choppy slow-motion footage

One example is RIFE (Real-Time Intermediate Flow Estimation), known for its quality and speed. It uses a convolutional neural network (CNN) to estimate motion between frames and generate new ones. Here is the process:


Input

RIFE takes in two input frames with the goal of generating one (or more) in between.


ex.

Frame 0 at time t = 0

Frame 1 at time t = 1

Generate new frame at time t = 0.5
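To see why this setup needs more than simple blending, here's a toy sketch (not RIFE's actual code) of the naive approach: just cross-fading the two input frames at t = 0.5. On moving objects this produces ghosting, which is exactly the problem the later motion-estimation steps solve.

```python
import numpy as np

# Two tiny grayscale "frames": a bright pixel moves from x=1 to x=3.
frame0 = np.zeros((1, 5)); frame0[0, 1] = 1.0
frame1 = np.zeros((1, 5)); frame1[0, 3] = 1.0

# Naive interpolation at t = 0.5: a plain cross-fade of the two frames.
t = 0.5
naive_mid = (1 - t) * frame0 + t * frame1

print(naive_mid)
# The pixel appears half-bright at BOTH x=1 and x=3 (ghosting),
# instead of a single bright pixel at x=2 where it should be.
```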


Feature Extraction

RIFE analyzes each frame and breaks it down into important features such as shapes, edges, colors, textures, and areas of motion. This is where the CNN comes in: it finds patterns that reveal how objects might be moving. The analysis is done at multiple scales to capture both large movements and finer close-up details.
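To give a feel for what "extracting features" means at the lowest level, here's a minimal sketch, assuming nothing about RIFE's actual architecture: a single hand-written Sobel-style kernel slid over a frame, which lights up on vertical edges. A CNN's early layers learn many kernels like this one automatically.

```python
import numpy as np

def conv2d(img, kernel):
    """Minimal 'valid' 2-D convolution, a stand-in for one CNN layer."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

# A frame with a vertical edge: dark on the left, bright on the right.
frame = np.zeros((5, 6))
frame[:, 3:] = 1.0

# A Sobel-style kernel responds strongly to vertical edges --
# one of the low-level features a CNN's early layers learn to detect.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

edges = conv2d(frame, sobel_x)
print(edges)  # large values only where the edge sits
```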


Motion Estimation

To estimate how objects move from the first frame to the second, RIFE creates a motion map (or flow field). This tells it which direction and how far each object is moving. RIFE predicts this information on its own based on its previous training.


Training involves feeding the network short three-frame clips with a starting frame, an ending frame, and the real in-between frame. The network attempts to recreate the in-between frame on its own, compares its prediction to the real frame, and adjusts itself to reduce the error. This is repeated thousands of times.
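The "compare to the real frame" step above can be sketched with a simple pixel-wise loss. This is a toy illustration, not RIFE's actual training code: the "network" here is just a cross-fade stand-in, and the loss is a plain L1 (mean absolute error). In real training, the gradient of this loss is what updates the CNN's weights.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute pixel error between predicted and real middle frame."""
    return np.mean(np.abs(pred - target))

# One toy training triplet: start frame, TRUE middle frame, end frame.
# A bright pixel moves x=1 -> x=2 -> x=3.
frame0 = np.zeros((1, 5)); frame0[0, 1] = 1.0
true_mid = np.zeros((1, 5)); true_mid[0, 2] = 1.0
frame1 = np.zeros((1, 5)); frame1[0, 3] = 1.0

# Stand-in "network": here just a cross-fade; in RIFE this would be the
# CNN's prediction, and the loss gradient would update its weights.
predicted_mid = 0.5 * frame0 + 0.5 * frame1

loss = l1_loss(predicted_mid, true_mid)
print(loss)  # nonzero: the model is penalized and pushed to improve
```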


Warping and Fusion

The two frames are then warped: their features are stretched and shifted to where they would appear in the in-between frame. Features from the first frame are moved forward in time, and features from the second frame are moved backward. This produces two candidate versions of the new frame, but some parts may be missing or overlapping, so RIFE intelligently blends the two versions, deciding which parts to use from each while filling in missing areas.
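The warp-and-fuse idea can be shown with a toy sketch. This is a simplification of what RIFE actually does: the flow here is a hand-set integer shift (real models predict sub-pixel flow and use bilinear warping), and the fusion is a fixed 50/50 blend (real models predict a per-pixel mask). The point is that, unlike the naive cross-fade, warping by the estimated motion puts the pixel where it belongs.

```python
import numpy as np

def warp(frame, flow_x):
    """Shift each pixel by its (integer) horizontal flow -- a crude
    stand-in for the sub-pixel bilinear warping real models use."""
    h, w = frame.shape
    out = np.zeros_like(frame)
    for y in range(h):
        for x in range(w):
            src = x - flow_x[y, x]      # backward warp: sample the source
            if 0 <= src < w:
                out[y, x] = frame[y, src]
    return out

# Bright pixel moves from x=1 (frame0) to x=3 (frame1).
frame0 = np.zeros((1, 5)); frame0[0, 1] = 1.0
frame1 = np.zeros((1, 5)); frame1[0, 3] = 1.0

# Estimated motion: +2 pixels over the full interval, so +1 / -1
# to reach the halfway point from each side.
flow_to_mid_from_0 = np.full((1, 5), 1)    # move frame0 forward in time
flow_to_mid_from_1 = np.full((1, 5), -1)   # move frame1 backward in time

warped0 = warp(frame0, flow_to_mid_from_0)
warped1 = warp(frame1, flow_to_mid_from_1)

# Fusion: blend the two candidate frames (a real model predicts a
# per-pixel mask; a 50/50 blend is the simplest stand-in).
mid = 0.5 * warped0 + 0.5 * warped1
print(mid)  # the pixel now lands at x=2 -- no ghosting
```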


Frame interpolation example, showing warping artifacts from blending

This process is repeated across every part of the footage that needs frame interpolation, and it can even be used to upscale a video's frame rate for smoother playback. Frame interpolation also has applications outside of filmmaking, in fields like medicine. For example, enhancing the frame rate of MRI, CT, or ultrasound imagery lets patients and doctors see things like the heart's motion more smoothly without the need for longer scans.




