The Slow-Mo Project

CSC 461 Fall 2019

 

Aidan Collins - V00843572

Griffin Storback - V00849885

Jeff Olmstead - V00852585

 

 

Table of Contents

Part 1: Super Slow Motion Video and its History

Research Applications

Commercial Applications

History

High-End Commercially Available Cameras: Phantom

Affordable Slow-Motion Cameras

Part 2: Compression Techniques

Part 3: Intermediate Frame Estimation

Part 4: The Future of Slow-Motion Video

Streak Camera

FRAME

T-CUP

Part 5: Group Member Contributions and References

References

 

 

Part 1: Super Slow Motion Video and its History

For the purposes of this report, we define super slow-motion video as any video recorded at a frame rate of 1,000 frames per second or greater, typically shot on a high-speed camera. This is not an official definition, but it is the one we will use for this project. Super slow-motion video is useful in many areas of research as well as in commercial applications, because the technology can capture and break down actions or events that occur too fast for the human eye to perceive.

 

Research Applications

Systems such as FRAME and T-CUP have been developed for observing physical phenomena in the lab and have led to major breakthroughs both in super slow-motion recording techniques and in many other fields of study. These systems are described in Part 4 of this report and under “The Future” tab on our website.

 

Commercial Applications

There are many commercial applications of super slow-motion video. It is obviously useful in the film industry for action movies and commercials, but it is also used in live sports broadcasts to generate instant replays, in automotive crash tests to improve vehicle safety, and in scientific fields such as nuclear physics and biomechanics.

 

History

Slow-motion video recording has an interesting history. The first major step on the path to slow motion was a breakthrough in high-speed camera technology [1]. In 1878, the British photographer Eadweard Muybridge connected twenty-four cameras with a shutter-release system and used them to capture a high-speed motion sequence of a galloping horse [1]. Before the rise of digital video, film was used to achieve slow-motion effects, but it was limited by the physical speed of the film spools and the exposure speed of the film stock. At the start of the 20th century, motion pictures were becoming commonplace even though the technology was still very primitive. Projectors of the era had to be hand-cranked, resulting in an often variable frame rate. They also relied on intermittent motion: a mechanism held the film temporarily in place before advancing it, and shutters created alternating flashes of light and dark that tricked the mind into perceiving motion.

 

In 1904, a major advance in projector technology was made by August Musger, who invented the first continuous-motion camera, in which the film moved constantly across an open shutter. Shortly afterwards, Musger discovered an unintended side effect of his invention: by capturing film with his new apparatus at 32 frames per second and playing it back at the usual 24 frames per second, he had created the first camera capable of recording high-frame-rate, or slow-motion, video. Musger initially brushed this off as unimportant. The failure of his business and his inability to pay the required fees led him into financial ruin and the dissolution of his patents in 1912. Two years later, Hans Lehmann improved on Musger's idea and developed a new apparatus, marketed specifically as a slow-motion recorder and player. The product was aimed mainly at scientific users rather than cinematographers, as a means to observe the previously unobservable [17].

 

The advent of digital recording made it much easier to film slow-motion and even super slow-motion video. Slow motion refers to the now-commonplace effect in which time appears to move slower than normal. The effect was previously used mostly by the film, scientific, and automotive industries (for example, in crash testing) because of the prohibitive cost of the technology; today, thanks to tremendous advances in camera technology, even smartphones can record in slow motion. There are three ways slow-motion video can be produced. The first is to record video at a higher frame rate than the frame rate at which the video is played back. The second, commonly referred to as “digital slow motion,” is achieved in post-production by interpolating or repeating frames to stretch the original motion over a longer period of time. The third and most straightforward technique is simply to play back normally recorded video at a slower speed; it resembles the first method but produces much less favorable results, and it has been rendered obsolete as the other two methods have become more accessible. It is not uncommon for the first and second methods to be used together to produce a more pronounced slow-motion effect.
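To make the first two methods concrete, here is a minimal Python sketch (our own illustration, with example numbers rather than figures from any particular camera): the slowdown factor from overcranking is simply the ratio of recording to playback frame rate, and the crudest form of digital slow motion just repeats frames.

```python
# Minimal sketch of the first two slow-motion methods described above.
# Frame rates here are illustrative examples, not from a specific camera.

def slowdown_factor(record_fps: float, playback_fps: float) -> float:
    """Method 1: record fast, play back slow. Footage shot at record_fps
    and played at playback_fps appears slowed by this factor."""
    return record_fps / playback_fps

def digital_slow_motion(frames: list, repeats: int) -> list:
    """Method 2, simplest form: stretch the motion over a longer period
    by repeating each frame. Interpolation (see Part 3) looks smoother."""
    return [copy for frame in frames for copy in [frame] * repeats]

print(slowdown_factor(960, 30))                 # 32.0: 0.4 s of action plays for ~12.8 s
print(digital_slow_motion(["A", "B", "C"], 2))  # ['A', 'A', 'B', 'B', 'C', 'C']
```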

 

High-End Commercially Available Cameras: Phantom

Vision Research’s Phantom line includes some of the most popular commercially available ultrahigh-speed cameras, including the Phantom v2640 and v2512. The v2640 offers exceptional image quality from a full 4 Mpx sensor; Vision Research bills it as the fastest 4 Mpx camera available and the most light-sensitive camera in its class. The v2512, on the other hand, is Phantom’s flagship ultrahigh-speed camera. The fastest camera available, it can film at up to 1 million frames per second, with 25 Gpx per second of throughput. The cost of operating at such high speed is paid through a reduction in resolution: the v2512 must drop to a resolution of 256x32 to film at its maximum of 1 million frames per second [2].

 

Fig. 1: A screencap of video shot by the Phantom v2640 [3].

 

Fig. 2: A screencap of video shot by the Phantom v2512 [3].

 

The v2512 can shoot at 25 Gpx per second. Data at this rate does not initially get written to an SSD; it goes to RAM. The camera can be ordered with 72GB, 144GB, or 288GB of dynamic RAM, which is far faster than any SSD. With the 288GB option, you can shoot for about 8 seconds at full resolution. For longer, slower shooting, Vision Research offers the CineMag V, an SSD custom-designed for high-speed video. A CineMag can record directly at about 1.4 Gpx per second. Users record into camera RAM and then “upload” the recording to the CineMag in seconds, which eliminates camera downtime between shots. If an application demands longer record times, the CineMag can be used in Run/Stop (R/S) mode, allowing several minutes of record time at lower frame rates [4].
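The quoted record time can be sanity-checked with simple arithmetic. The sketch below assumes 12-bit raw pixels (1.5 bytes each); Vision Research does not state the internal bit depth in the material above, so that value is our assumption, chosen because it reproduces the quoted figure of about 8 seconds.

```python
# Back-of-the-envelope check of the v2512's full-speed record time.
THROUGHPUT_GPX_S = 25    # quoted throughput, gigapixels per second
BYTES_PER_PIXEL = 1.5    # ASSUMPTION: 12-bit raw samples
RAM_GB = 288             # largest available RAM option

data_rate_gb_s = THROUGHPUT_GPX_S * BYTES_PER_PIXEL  # 37.5 GB/s into RAM
record_time_s = RAM_GB / data_rate_gb_s              # ~7.7 s, i.e. "about 8 seconds"

print(f"{record_time_s:.1f} s of full-speed recording fits in {RAM_GB} GB of RAM")
```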

These high-end ultrahigh-speed cameras, while commercially available, are more expensive than most people can afford: the Phantom v2640 base model with just 72GB of RAM costs $135,000, and the top model costs $175,000. A single 1TB CineMag IV costs almost $12,000 [5].

 

Affordable Slow-Motion Cameras

More affordable slow-motion cameras are available from companies such as GoPro, Sony, and Panasonic, but they typically do not offer ultrahigh-speed capture, topping out at around 1,000 frames per second [6].

The Samsung Galaxy Note10, Galaxy Note10+, and Galaxy Fold smartphones all include a “Super Slow-mo” feature, which lets you record approximately 0.4 seconds of video at 960 fps in 1080p. Super Slow-mo recordings on these phones are actually captured at 480 fps and then digitally enhanced to 960 fps, giving about 12 seconds of playback [7].

 

Part 2: Compression Techniques

Compression is a very important part of super slow-motion video recording. Filming 10 seconds of footage at 100,000 fps produces as much data as over 9 hours of footage filmed at 30 fps (10 s × 100,000 fps = 1,000,000 frames, which at 30 fps takes roughly 9.3 hours to play). Without compression, the files created by slow-motion cameras would be nearly impossible to work with. Luckily, there are many techniques that can compress this data to a much more manageable size. Slow-motion video is created by filming at a very high frame rate and then playing it back at a normal rate such as 30 fps. Because of this, techniques such as frame dropping are not very useful: at playback speed, the viewer would notice the dropped frames. Standard video compression can be applied to footage from super slow-motion cameras, but due to the sheer amount of data it is very processing-intensive and typically cannot be done by the camera’s onboard computer. This creates two main issues: compression must happen externally, and the raw footage must be stored in the meantime. Because the footage cannot be compressed by the camera as it is filmed, the way typical cameras do, it must be stored uncompressed, which forces the camera to use massive amounts of storage until the footage can be compressed by a more powerful computer. Typical operation of these cameras therefore requires them to be connected by SDI cables to a computer with fast SSD storage, which limits their mobility compared to ordinary cinema cameras. Video compression can be done in several ways, but all pursue the same goal: maximize quality while minimizing data. Strategies such as motion vectors and macroblock-based motion estimation are used in super slow-motion video compression, as sketched below.
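The sketch below shows the core of macroblock-based motion estimation in Python/NumPy: for each block of the current frame, it exhaustively searches a small window of the previous frame for the best match under a sum-of-absolute-differences (SAD) criterion and records the displacement as a motion vector. Real encoders use far more sophisticated search strategies; this is only the basic idea.

```python
import numpy as np

def block_match(prev: np.ndarray, curr: np.ndarray,
                block: int = 16, search: int = 8) -> np.ndarray:
    """Exhaustive block-matching motion estimation on grayscale frames.
    Returns one (dy, dx) motion vector per macroblock of curr."""
    h, w = curr.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            target = curr[y:y + block, x:x + block].astype(int)
            best_sad, best_v = None, (0, 0)
            # Search a (2*search+1)^2 window around the block's position.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    py, px = y + dy, x + dx
                    if py < 0 or px < 0 or py + block > h or px + block > w:
                        continue
                    cand = prev[py:py + block, px:px + block].astype(int)
                    sad = np.abs(target - cand).sum()
                    if best_sad is None or sad < best_sad:
                        best_sad, best_v = sad, (dy, dx)
            vectors[by, bx] = best_v
    return vectors
```

The encoder then transmits the vectors plus the (small) residual between each block and its match, instead of the raw pixels.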

 

Part 3: Intermediate Frame Estimation

Filming “true” slow-motion video is difficult and expensive, but it offers a unique new perspective on the world. Given the inaccessibility of high-speed cameras, the technique of slowing down footage shot at a normal frame rate is very appealing. By filming at a normal frame rate of 30 or 60 fps and then applying smart algorithms to the footage, a very convincing effect can be achieved. The slow-motion effect is produced by adding extra frames between the actual frames of the video. These new frames must be estimated by the algorithm and synthesized from the two surrounding reference frames so that the motion stays smooth. One of NVIDIA’s research teams achieved this effect while maintaining the high quality of the original footage, as outlined in a 2018 research paper titled “Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation” [8]. Their approach uses deep learning to train a network to recreate many intermediate frames between real frames. The technique can even be used to slow down footage that is already slow motion, and it works quite well, as NVIDIA showed by transforming footage from “The Slow Mo Guys” into extreme slow motion [9].
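For contrast with NVIDIA’s learned approach, the simplest possible interpolator just cross-fades between the two reference frames. The sketch below is our own baseline illustration, not NVIDIA’s method; linear blending makes anything that moves between the frames appear doubled (“ghosting”), which is exactly the artifact that the optical-flow and deep learning machinery described next is designed to avoid.

```python
import numpy as np

def blend_intermediate(frame0: np.ndarray, frame1: np.ndarray, n: int):
    """Yield n evenly spaced intermediate frames by linear cross-fade.
    Moving objects will ghost; see the flow-based sketch later in this
    section for a motion-aware alternative."""
    f0 = frame0.astype(np.float32)
    f1 = frame1.astype(np.float32)
    for i in range(1, n + 1):
        t = i / (n + 1)                       # fractional time in (0, 1)
        yield ((1 - t) * f0 + t * f1).astype(frame0.dtype)
```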

 

Fig. 3: A visualization of the optical flow technique [10].

 

 

Fig. 4: A network that uses optical flow for improving the accuracy of video action recognition [10].

 

The software developed by NVIDIA takes advantage of optical flow, a technique for estimating an object’s trajectory from the relationship between the observer and the object, first introduced by James Gibson in the 1940s [11]. Optical flow works well for simple videos with little motion, but for high-speed objects or complex scenes it can create artifacts. This is where the deep learning aspect of NVIDIA’s research comes into play, reducing artifacts and allowing many more intermediate frames to be created. NVIDIA had this to say about their technique for minimizing artifacts:

“To address this shortcoming, we employ another UNet to refine the approximated flow and also predict soft visibility maps. Finally, the two input images are warped and linearly fused to form each intermediate frame. By applying the visibility maps to the warped images before fusion, we exclude the contribution of occluded pixels to the interpolated intermediate frame to avoid artifacts. Since none of our learned network parameters are time-dependent, our approach is able to produce as many intermediate frames as needed” [8].

This technique is a huge advantage over plain optical flow, as artifacts can be extremely distracting and can ruin a shot. It is not yet known how processing-intensive the algorithm is, since the network must be trained before it becomes useful. The technology could be valuable in industries such as live sports: broadcasters could slow down instant replays without relying on large, expensive cameras and huge volumes of high-frame-rate video. Integrated into live broadcasting software, it could offer a very user-friendly experience, since in theory footage could be slowed to any speed regardless of the frame rate at which it was shot. This assumes the processing can be done on the fly. A hedged sketch of the underlying flow-based warping step follows.
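To illustrate the warping step, this sketch substitutes classical Farneback optical flow from OpenCV for NVIDIA’s learned flow network, and uses the linear approximation F_t→0 ≈ -t·F_0→1 from the paper. It omits the visibility maps and refinement UNet, so occlusion artifacts remain.

```python
import cv2
import numpy as np

def warp_intermediate(frame0: np.ndarray, frame1: np.ndarray, t: float):
    """Synthesize the frame at fractional time t in (0, 1) by
    backward-warping frame0 along classical (non-learned) optical flow."""
    g0 = cv2.cvtColor(frame0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    # Dense flow F_0->1: per-pixel displacement from frame0 to frame1.
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = g0.shape
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    # Approximate F_t->0 = -t * F_0->1, then sample frame0 there.
    map_x = (gx - t * flow[..., 0]).astype(np.float32)
    map_y = (gy - t * flow[..., 1]).astype(np.float32)
    return cv2.remap(frame0, map_x, map_y, cv2.INTER_LINEAR)
```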

The algorithm could also be useful in consumer electronics, augmenting a phone’s slow-motion camera modes. Rather than adding a “super slow-motion” mode such as the 960 fps offered by some phones, this feature could take footage filmed at a more manageable 120 or 240 fps and slow it down significantly. This would require much less specialized hardware in the phone and could save precious storage space. This feature was suggested by a classmate after reviewing our project demo video, and we think it is a fantastic idea to include in our report.

Part 4: The Future of Slow-Motion Video

Super slow-motion video is an active research field, with new breakthroughs regularly emerging from institutes such as MIT and Caltech. New cameras and algorithms are being introduced, and new systems are now capable of filming at over a trillion frames per second. These incredible speeds make it possible to capture physical phenomena, such as the propagation of light itself, on video. One such system is FRAME (Frequency Recognition Algorithm for Multiple Exposures), introduced in 2017 as an improvement on previous femtosecond streak cameras. Able to capture up to a trillion frames per second, FRAME is an advanced filming technique that, unlike techniques before it, can perform ultrafast spectroscopic videography of dynamic single events. Another technique, introduced in 2018, is T-CUP (Trillion-frame-per-second Compressed Ultrafast Photography), which can film at up to 10 trillion frames per second. Both FRAME and T-CUP improve on previous ultra-slow-motion capture systems, which operated with standard streak cameras. Of course, these systems are extremely complex, and most of their details are omitted here for the sake of a digestible overview. To learn more about exactly how they function, see the FRAME and T-CUP papers cited in the references section.

 

Streak Camera

Streak cameras are used to measure the pulse duration of ultrafast laser systems and for applications such as time-resolved spectroscopy and LIDAR. They operate by transforming the time variations of a light pulse into a spatial profile on a detector: a time-varying field deflects the light across the width of the detector. A light pulse enters the instrument through a narrow slit along one direction and is deflected perpendicularly, so that photons arriving first hit the detector at a different position than photons arriving later. This forms a “streak” of light on the resulting image, from which the duration and other temporal properties of the light pulse can be inferred [12].

 

Fig. 5: Operation of the streak camera [13].
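The time-to-space mapping is simple enough to simulate. The toy sketch below (our own illustration, with made-up numbers) deflects each photon across the detector in proportion to its arrival time, so a pulse’s duration can be read back off the width of the recorded streak.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy streak camera: photons from a light pulse with 2 ps RMS duration.
arrival_times_ps = rng.normal(loc=0.0, scale=2.0, size=100_000)

# The sweep deflects photons across the detector in proportion to their
# arrival time, so later photons land farther along the streak axis.
SWEEP_PX_PER_PS = 25.0
positions_px = arrival_times_ps * SWEEP_PX_PER_PS

# The "streak" is the 1D intensity profile along the sweep direction.
streak, _ = np.histogram(positions_px, bins=200)

# Inverting the time-to-space mapping recovers the pulse duration.
estimated_ps = positions_px.std() / SWEEP_PX_PER_PS
print(f"Recovered RMS pulse duration: {estimated_ps:.2f} ps")
```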


 

FRAME

The FRAME paper’s authors motivate the technique as follows: “Many important scientific questions in physics, chemistry, and biology require effective methodologies to spectroscopically probe ultrafast intra- and inter-atomic/molecular dynamics. However, current methods that extend into the femtosecond regime are capable of only point measurements or single-snapshot visualizations and thus lack the capability to perform ultrafast spectroscopic videography of dynamic single events. Here we present a laser-probe-based method that enables two-dimensional videography at ultrafast timescales (femtosecond and shorter) of single, non-repetitive events” [14].

The novelty of the FRAME concept lies in superimposing a structural code onto the illumination to encrypt a single event, which is then deciphered in the data post-processing. Because each image in the video sequence is extracted by using a unique spatial code, the method does not rely on a specific optical wavelength or laser bandwidth, and hence can be used for spectroscopic measurements [14].

 

Fig. 6: FRAME’s operating principle [14].

In Figure 6 above, (a) shows a uniformly illuminated sample, whose image resides mostly near the origin in the Fourier domain; the outer circle marks the resolution limit of the detector. (b) shows the sample illuminated with a sinusoidal intensity modulation, effectively placing two ‘image copies’ of the object structures in the otherwise unexploited space in the Fourier domain. In (c, d), each ‘image copy’ fills only a fraction of the available reciprocal space, thereby allowing multiple-illumination schemes without signal cross-talk [14]. The short demonstration below makes this concrete.
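This is a toy illustration of the modulation theorem, not the FRAME reconstruction pipeline: multiplying an image by a spatial sinusoid shifts copies of its spectrum to plus and minus the carrier frequency, creating exactly the ‘image copies’ described above.

```python
import numpy as np

# Toy demo: sinusoidal illumination places shifted copies of an image's
# spectrum into otherwise unexploited Fourier space.
h, w = 256, 256
y, x = np.mgrid[0:h, 0:w]
image = np.exp(-((x - w / 2) ** 2 + (y - h / 2) ** 2) / (2 * 20.0 ** 2))

fu = 60                                   # carrier frequency (cycles per frame width)
carrier = np.cos(2 * np.pi * fu * x / w)  # the structural illumination code
encoded = image * carrier                 # "illuminate" the scene with the code

spectrum = np.abs(np.fft.fftshift(np.fft.fft2(encoded)))
peak_y, peak_x = np.unravel_index(spectrum.argmax(), spectrum.shape)
# The spectral peak now sits at +/- fu from the origin instead of at it:
# two 'image copies' that a matched demodulation step can later extract.
print("Peak offset from centre:", (peak_y - h // 2, peak_x - w // 2))
```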

 

T-CUP

T-CUP (Trillion-frame-per-second Compressed Ultrafast Photography), introduced in 2018, is a method of capturing up to 10 trillion frames per second. Previously established ultrafast imaging techniques either struggle to reach the desired exposure time or require repeatable measurements. The paper argues that T-CUP improves on active-illumination techniques like FRAME: “such approaches are incapable of imaging luminescent transient events, so they are precluded from direct imaging of evolving light patterns in temporal focusing. The requirement of active illumination was recently eliminated by CUP, a new single-shot ultrafast imaging modality. Synergizing compressed sensing and streak imaging, CUP works by first compressively recording a three-dimensional scene into a two-dimensional snapshot and then computationally recovering it by solving an optimization problem” [15].

"We knew that by using only a femtosecond streak camera, the image quality would be limited," says Professor Lihong Wang, the Bren Professor of Medical Engineering and Electrical Engineering at Caltech and the Director of Caltech Optical Imaging Laboratory (COIL). "So to improve this, we added another camera that acquires a static image. Combined with the image acquired by the femtosecond streak camera, we can use what is called a Radon transformation to obtain high-quality images while recording ten trillion frames per second" [16].

The operation of T-CUP consists of data acquisition and image reconstruction. During data acquisition, the intensity distribution of a 3D spatiotemporal scene, I[m,n,k], is first split by a beam splitter to form two images. The first image is recorded directly by a 2D imaging sensor via spatiotemporal integration (defined as spatial integration over each pixel and temporal integration over the entire exposure time). This process, which forms a time-unsheared view with an optical energy distribution E_u[m,n], can be expressed as

 

E_u[m,n] = η Σ_k (h_u ∗ I)[m,n,k]

 

where η is a constant, h_u represents the spatial low-pass filtering imposed by the optics in the time-unsheared view, and ∗ denotes the discrete 2D spatial convolution operation [15].
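In code, the time-unsheared view is just a per-frame spatial low-pass filter followed by a sum over the time axis. The sketch below implements the equation above with NumPy/SciPy; the box-blur kernel standing in for h_u and the constant η = 1 are our own placeholder choices.

```python
import numpy as np
from scipy.signal import convolve2d

def time_unsheared_view(I: np.ndarray, h_u: np.ndarray,
                        eta: float = 1.0) -> np.ndarray:
    """E_u[m, n] = eta * sum_k (h_u * I)[m, n, k] for a spatiotemporal
    scene I[m, n, k] (two spatial axes, one time axis). h_u models the
    spatial low-pass filtering of the optics in the time-unsheared arm."""
    frames = [convolve2d(I[:, :, k], h_u, mode="same")
              for k in range(I.shape[2])]
    return eta * np.sum(frames, axis=0)

# Placeholder inputs: a random 64x64x100 datacube and a 5x5 box blur
# standing in for the optics' point-spread function.
I = np.random.rand(64, 64, 100)
h_u = np.full((5, 5), 1 / 25)
E_u = time_unsheared_view(I, h_u)   # the 2D time-unsheared snapshot
```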

 

Fig. 7: Schematic of the T-CUP system [15].

 

By improving the frame rate by two orders of magnitude over the previous state of the art, T-CUP demonstrated that the long-standing pursuit of higher frame rates is far from over. As the only detection solution so far available for passively probing dynamic self-luminescent events at femtosecond timescales in real time, T-CUP was used to reveal spatiotemporal details of transient scattering events that were inaccessible to previous systems. The compressed-sensing-augmented projection extended the Radon transformation to probing spatiotemporal datacubes, a general scheme that could potentially be implemented in other imaging modalities, such as tomographic phase microscopy and time-of-flight volumography. T-CUP’s unprecedented ability to perform real-time, wide-field, femtosecond-level imaging from the visible to the near-infrared will pave the way for microscopic investigations of the time-dependent optical and electronic properties of novel materials under transient, out-of-equilibrium conditions. With continuing improvement in streak camera technologies, future development may enable a 1 quadrillion fps (10^15 fps) frame rate with a wider imaging spectral range, allowing direct visualization and exploration of irreversible chemical reactions and nanostructure dynamics [15].

 

Part 5: Group Member Contributions and References

Group members were Aidan Collins, Griffin Storback, and Jeff Olmstead. Group member contributions to this project were as follows:

Aidan Collins:

  1. Generation of the project idea

  2. Creation and editing of the project website

  3. Research and writing of Intermediate Frame Estimation section

  4. Filming and editing the project video and demonstration

Griffin Storback:

  1. Editing the project website

  2. Research and writing of The Future of Slow Motion section

  3. Voiceover and filming of the project video

  4. Compiling and formatting the final report

Jeff Olmstead:

  1. Editing the project website

  2. Research and writing of the “What is Super-Slow-Motion Video?” section

  3. Voiceover and filming project video

  4. Updating the project proposal


 

References

  1. “High-Speed Photography,” Atomic Heritage Foundation, 10-Jul-2017. [Online]. Available: https://www.atomicheritage.org/history/high-speed-photography. [Accessed: 08-Nov-2019].

  2. “Phantom v2640,” Vision Research. [Online]. Available: https://www.phantomhighspeed.com/products/cameras/ultrahighspeed/v2640. [Accessed: 11-Nov-2019].

  3. “v2512 vs v2640,” YouTube. [Online]. Available: https://www.youtube.com/watch?v=UPpef2Ca1vQ.

  4. “CineMag V and CineStation,” Vision Research. [Online]. Available: https://www.phantomhighspeed.com/products/toolsandaccessories/cinemagandcinestation. [Accessed: 08-Dec-2019].

  5. R. Whitwam, “Phantom v2640 High-Speed Camera Can Film 11,750fps in Full HD,” ExtremeTech, 02-Feb-2018. [Online]. Available: https://www.extremetech.com/electronics/263272-phantom-v2640-high-speed-camera-can-film-11750-fps-full-hd. [Accessed: 08-Dec-2019].

  6. J. Aldredge, “6 Slow Motion Cameras You Can Afford,” The Beat: A Blog by PremiumBeat, 06-May-2019. [Online]. Available: https://www.premiumbeat.com/blog/6-affordable-slow-motion-cameras/. [Accessed: 08-Dec-2019].

  7. “How does Super Slow-mo video work on Galaxy Note10, Galaxy Note10+ and Galaxy Fold?,” The Official Samsung Galaxy Site. [Online]. Available: https://www.samsung.com/global/galaxy/what-is/slow-motion/. [Accessed: 08-Dec-2019].

  8. H. Jiang, D. Sun, V. Jampani, M.-H. Yang, E. Learned-Miller, and J. Kautz, “Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation,” arXiv.org, 13-Jul-2018. [Online]. Available: https://arxiv.org/abs/1712.00080. [Accessed: 08-Dec-2019].

  9. “Transforming Standard Video Into Slow Motion with AI,” NVIDIA Developer News Center, 18-Jun-2018. [Online]. Available: https://news.developer.nvidia.com/transforming-standard-video-into-slow-motion-with-ai/. [Accessed: 08-Dec-2019].

  10. “NVIDIA Optical Flow SDK,” NVIDIA Developer, 06-Dec-2019. [Online]. Available: https://developer.nvidia.com/opticalflow-sdk. [Accessed: 08-Dec-2019].

  11. “Optical flow,” Wikipedia, 30-Nov-2019. [Online]. Available: https://en.wikipedia.org/wiki/Optical_flow. [Accessed: 08-Dec-2019].

  12. “Streak camera,” Wikipedia, 13-Oct-2018. [Online]. Available: https://en.wikipedia.org/wiki/Streak_camera. [Accessed: 08-Dec-2019].

  13. K. Uchiyama, B. Cieslik, T. Ai, F. Niikura, S. Abe, “Various ultra-high-speed imaging and applications by Streak camera,” pdfs.semanticscholar.org. [Online]. Available:  https://pdfs.semanticscholar.org/a3c9/c46100aef3c0d3cb170717ab84577f14df0a.pdf. [Accessed: 16-Nov-2019].

  14. A. Ehn, J. Bood, Z. Li, E. Berrocal, M. Aldén, and E. Kristensson, “FRAME: femtosecond videography for atomic and molecular dynamics,” Light: Science & Applications, 15-Mar-2017. [Online]. Available: https://www.nature.com/articles/lsa201745. [Accessed: 08-Dec-2019].

  15. J. Liang, L. Zhu, and L. V. Wang, “Single-shot real-time femtosecond imaging of temporal focusing,” Light: Science & Applications, 08-Aug-2018. [Online]. Available: https://www.nature.com/articles/s41377-018-0044-7. [Accessed: 08-Dec-2019].

  16. “T-CUP camera captures ten trillion frames per second,” Laser Focus World. [Online]. Available: https://www.laserfocusworld.com/detectors-imaging/article/16571593/tcup-camera-captures-ten-trillion-frames-per-second. [Accessed: 08-Dec-2019].

  17. “August Musger: The Priest and Physicist Who Invented Slow Motion,” Mental Floss. [Online]. Available: https://www.mentalfloss.com/article/86916/august-musger-priest-and-physicist-who-invented-slow-motion. [Accessed: 08-Dec-2019].
