An event camera, also known as a neuromorphic camera,[1] silicon retina[2] or dynamic vision sensor,[3] is an imaging sensor that responds to local changes in brightness. Event cameras do not capture images using a shutter as conventional cameras do. Instead, each pixel inside an event camera operates independently and asynchronously, reporting changes in brightness as they occur, and staying silent otherwise. Modern event cameras have microsecond temporal resolution and a dynamic range of about 120 dB, and they suffer less from under/overexposure and motion blur[4][5] than frame cameras.

Functional description

Event cameras contain pixels that independently respond to changes in brightness as they occur.[4] Each pixel stores a reference brightness level, and continuously compares it to the current level of brightness. If the difference in brightness exceeds a preset threshold, that pixel resets its reference level and generates an event: a discrete packet of information containing the pixel address and timestamp. Events may also contain the polarity (increase or decrease) of a brightness change, or an instantaneous measurement of the current level of illumination.[6] Thus, event cameras output an asynchronous stream of events triggered by changes in scene illumination.
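
The per-pixel behaviour described above can be illustrated with a minimal sketch. It is not taken from any particular sensor: the function name `generate_events`, the threshold value, and the choice to compare logarithms of brightness are assumptions made for illustration only.

```python
import math

def generate_events(samples, threshold=0.2, x=0, y=0):
    """Turn (timestamp, brightness) samples for one pixel into events.

    Each event is a tuple (x, y, timestamp, polarity), where polarity is
    +1 for an increase and -1 for a decrease in brightness.
    Comparing log brightness is an assumption of this sketch.
    """
    events = []
    _, first_brightness = samples[0]
    reference = math.log(first_brightness)   # stored reference level
    for t, brightness in samples[1:]:
        level = math.log(brightness)
        diff = level - reference
        if abs(diff) > threshold:             # change exceeds the preset threshold
            polarity = 1 if diff > 0 else -1
            events.append((x, y, t, polarity))
            reference = level                 # pixel resets its reference level
    return events

# Hypothetical brightness samples for a single pixel.
samples = [(0, 100.0), (10, 105.0), (20, 130.0), (30, 128.0), (40, 90.0)]
events = generate_events(samples)
```

With these sample values the small fluctuations at timestamps 10 and 30 stay below the threshold and produce no output, so the pixel is silent except for the two larger changes.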

Comparison of the data produced by an event camera and a conventional camera.
Typical characteristics of image sensors

Sensor                                       Dynamic range (dB)   Equivalent framerate* (fps)   Spatial resolution (MP)   Power consumption (mW)
Human eye                                    30–40                200–300                       -                         10[7]
High-end DSLR camera (Nikon D850)            44.6[8]              120                           2–8                       -
Ultrahigh-speed camera (Phantom v2640)[9]    64                   12,500                        0.3–4                     -
Event camera[10]                             120                  1,000,000                     0.1–0.2                   30

*Indicates temporal resolution since human eyes and event cameras do not output frames.

Types

While all event cameras respond to local changes in brightness, there are a few variants. Temporal contrast sensors, such as the pioneering DVS[4] (Dynamic Vision Sensor) and the sDVS[11] (sensitive-DVS), produce events that indicate polarity (increase or decrease in brightness), while temporal image sensors[6] indicate the instantaneous intensity with each event. The DAVIS[12] (Dynamic and Active-pixel Vision Sensor) combines a global-shutter active pixel sensor (APS) with a dynamic vision sensor (DVS) that shares the same photosensor array, so it can produce image frames alongside events. Many event cameras additionally carry an inertial measurement unit (IMU).

Event cameras

Name              Event output   Image frames   Color   IMU   Manufacturer       Commercially available
DVS128[4]         Polarity       No             No      No    Inivation          No
sDVS128[11]       Polarity       No             No      No    CSIC               No
DAVIS240[12]      Polarity       Yes            No      Yes   Inivation          Yes
DAVIS346[13]      Polarity       Yes            No      Yes   Inivation          Yes
SEES[14]          Polarity       Yes            No      Yes   Insightness        Yes
SilkyEvCam[15]    Polarity       No             No      No    Century Arks       Yes
Samsung DVS[16]   Polarity       No             No      Yes   Samsung            No
Onboard[6]        Polarity       No             No      Yes   Prophesee          Yes
Celex[17]         Intensity      Yes            No      Yes   CelePixel          Yes
IMX636[18]        Intensity      Yes            No      Yes   Sony / Prophesee   Yes

Retinomorphic sensors

Main article: Retinomorphic sensor

Left: schematic cross-sectional diagram of photosensitive capacitor. Center: circuit diagram of retinomorphic sensor, with photosensitive capacitor at top. Right: Expected transient response of retinomorphic sensor to application of constant illumination.

Another class of event sensors is the so-called retinomorphic sensor. While the term retinomorphic has been used to describe event sensors generally,[19][20] in 2020 it was adopted as the name for a specific sensor design based on a resistor and a photosensitive capacitor in series.[21] These capacitors are distinct from photocapacitors, which are used to store solar energy,[22] and are instead designed to change capacitance under illumination. They are therefore expected to charge or discharge slightly when the capacitance changes, but otherwise to remain in equilibrium. When the photosensitive capacitor is placed in series with a resistor, and an input voltage is applied across the circuit, the result is a sensor which outputs a voltage when the light intensity changes, but otherwise outputs no signal.
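
The transient behaviour of such a circuit can be sketched numerically. The following is an idealised model rather than a description of the published device: it assumes that the output is the voltage across the resistor, that the capacitance steps instantly between a dark and an illuminated value, and that simple Euler integration is adequate. The function name and all component values are hypothetical.

```python
def simulate_retinomorphic(capacitance, v_in=1.0, resistance=1e6, dt=1e-5):
    """Euler-integrate a resistor in series with a time-varying capacitor.

    capacitance: list of capacitance values in farads, one per time step.
    Returns the voltage across the resistor at each time step.
    """
    charge = capacitance[0] * v_in          # start in equilibrium (no current)
    v_out = []
    for c in capacitance:
        v_cap = charge / c                  # capacitor voltage for the current C
        current = (v_in - v_cap) / resistance
        v_out.append(current * resistance)  # voltage dropped across the resistor
        charge += current * dt              # charge flows until equilibrium returns
    return v_out

# Hypothetical values: capacitance doubles when the light is switched on.
dark, light = 1e-9, 2e-9
trace = simulate_retinomorphic([dark] * 100 + [light] * 2000)
```

In this model the output is zero under constant conditions, jumps when the capacitance changes, and then decays back towards zero with time constant R·C as the capacitor recharges.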

Unlike other event sensors (which typically consist of a photodiode and some other circuit elements), these retinomorphic sensors produce the signal inherently by design. They can hence be considered a single device which produces the same result as a small circuit in other event cameras. Retinomorphic sensors have to date only been studied in a research environment, but it is hoped that they will find applications in object recognition, autonomous vehicles, and robotics.[23][24][25][26]


Algorithms

A pedestrian runs in front of car headlights at night. Left: image taken with a conventional camera exhibits severe motion blur and underexposure. Right: image reconstructed by combining the left image with events from an event camera.[27]

Image reconstruction

Image reconstruction from events has the potential to create images and video with high dynamic range, high temporal resolution and minimal motion blur. Image reconstruction can be achieved using temporal smoothing, e.g. with a high-pass or complementary filter.[27] Alternative methods include optimization[28] and gradient estimation[29] followed by Poisson integration.
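
As a rough illustration of temporal smoothing, the sketch below integrates events per pixel while letting each pixel's value decay exponentially, which acts as a high-pass filter on the accumulated brightness changes. It is a simplified sketch and not the method of the cited work; the function name `reconstruct_high_pass` and the parameters `contrast` and `cutoff` are assumptions.

```python
import math

def reconstruct_high_pass(events, shape, contrast=0.1, cutoff=1.0):
    """Estimate a (log) brightness image from events with a leaky integrator.

    events: list of (x, y, timestamp, polarity), sorted by timestamp.
    shape:  (height, width) of the sensor array.
    Returns a nested list of per-pixel values at the time of the last event.
    """
    height, width = shape
    image = [[0.0] * width for _ in range(height)]
    last_update = [[0.0] * width for _ in range(height)]
    for x, y, t, polarity in events:
        # Decay the old value for the time elapsed at this pixel, then add the event.
        decay = math.exp(-cutoff * (t - last_update[y][x]))
        image[y][x] = decay * image[y][x] + polarity * contrast
        last_update[y][x] = t
    # Bring every pixel forward to the time of the last event.
    t_end = events[-1][2] if events else 0.0
    for y in range(height):
        for x in range(width):
            image[y][x] *= math.exp(-cutoff * (t_end - last_update[y][x]))
    return image

# Hypothetical events: two increases at pixel (0, 0), one decrease at pixel (1, 0).
events = [(0, 0, 0.0, 1), (0, 0, 0.0, 1), (1, 0, 1.0, -1)]
image = reconstruct_high_pass(events, shape=(1, 2))
```

Because the state decays, pixels that stop receiving events fade towards zero. A complementary filter addresses this by additionally blending in conventional image frames where they are available.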

Spatial convolutions

The concept of spatial event-driven convolution was first proposed in 1999,[30] before the invention of the DVS. It was later generalized during the EU project CAVIAR,[31] in the course of which the DVS was invented, by projecting an arbitrary convolution kernel, event by event, around the event coordinate in an array of integrate-and-fire pixels.[32] Extending this to multi-kernel event-driven convolutions[33] allows for event-driven deep convolutional neural networks.[34]
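
The event-by-event projection of a kernel onto integrate-and-fire pixels can be sketched in software as follows. This is an illustrative model only, not the cited hardware design: the function name, the single positive firing threshold, and the reset-to-zero rule are assumptions.

```python
def event_driven_convolution(events, kernel, shape, threshold=1.0):
    """Apply a convolution kernel event by event to integrate-and-fire pixels.

    events: list of (x, y, timestamp, polarity).
    kernel: nested list of weights with odd height and width.
    shape:  (height, width) of the output pixel array.
    Returns the list of output events (x, y, timestamp, 1).
    """
    height, width = shape
    k_h, k_w = len(kernel), len(kernel[0])
    membrane = [[0.0] * width for _ in range(height)]
    output = []
    for x, y, t, polarity in events:
        # Project the kernel onto the neighbourhood centred on the event.
        for dy in range(k_h):
            for dx in range(k_w):
                px = x + dx - k_w // 2
                py = y + dy - k_h // 2
                if 0 <= px < width and 0 <= py < height:
                    membrane[py][px] += polarity * kernel[dy][dx]
                    if membrane[py][px] >= threshold:
                        output.append((px, py, t, 1))   # pixel fires
                        membrane[py][px] = 0.0          # and resets
    return output

# Hypothetical example: a uniform 3x3 kernel and two events at the same pixel.
kernel = [[0.5] * 3 for _ in range(3)]
out_events = event_driven_convolution([(1, 1, 0.0, 1), (1, 1, 1.0, 1)], kernel, (3, 3))
```

No frame is ever formed: each input event updates only the pixels under the kernel, and output events are emitted as soon as a pixel crosses its threshold.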

Motion detection and tracking

Segmentation and detection of moving objects viewed by an event camera can seem trivial, since change detection is performed on-chip by the sensor. In practice these tasks are difficult, because each event carries very little information[35] and events lack the visual features, such as texture and color, on which conventional methods rely.[36] The tasks become more challenging still with a moving camera,[35] because events are then triggered everywhere on the image plane, both by moving objects and by the static scene (whose apparent motion is induced by the camera’s ego-motion). Recent approaches to this problem include motion-compensation models[37][38] and traditional clustering algorithms.[39][40][36][41]
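
One way to illustrate the idea of motion compensation is to warp events back along a candidate velocity and measure how sharply they align. The sketch below scores alignment by the variance of the accumulated event image and picks the best of a few candidate velocities. It is a simplified illustration, not the algorithm of the cited works; the function names, the constant-velocity model, and the brute-force search are assumptions.

```python
def contrast_after_warp(events, velocity, shape, t_ref=0.0):
    """Warp events to time t_ref along a constant velocity and return the
    variance of the resulting event-count image (higher means sharper)."""
    vx, vy = velocity
    height, width = shape
    image = [[0] * width for _ in range(height)]
    for x, y, t, _ in events:
        wx = round(x - vx * (t - t_ref))
        wy = round(y - vy * (t - t_ref))
        if 0 <= wx < width and 0 <= wy < height:
            image[wy][wx] += 1
    values = [v for row in image for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_velocity(events, candidates, shape):
    """Return the candidate velocity whose warp gives the sharpest image."""
    return max(candidates, key=lambda v: contrast_after_warp(events, v, shape))

# Hypothetical events from a point moving 2 pixels per time unit along x.
events = [(0, 3, 0, 1), (2, 3, 1, 1), (4, 3, 2, 1), (6, 3, 3, 1), (8, 3, 4, 1)]
candidates = [(0, 0), (1, 0), (2, 0), (3, 0)]
estimate = best_velocity(events, candidates, shape=(8, 16))
```

Events that remain misaligned under the best warp for the background are candidates for belonging to independently moving objects, which is the intuition behind motion-compensation approaches to segmentation.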

References

  1. ^ Li, Hongmin; Liu, Hanchao; Ji, Xiangyang; Li, Guoqi; Shi, Luping (2017). "CIFAR10-DVS: An Event-Stream Dataset for Object Classification". Frontiers in Neuroscience. 11: 309. doi:10.3389/fnins.2017.00309. ISSN 1662-453X. PMC 5447775. PMID 28611582.
  2. ^ Sarmadi, Hamid; Muñoz-Salinas, Rafael; Olivares-Mendez, Miguel A.; Medina-Carnicer, Rafael (2021). "Detection of Binary Square Fiducial Markers Using an Event Camera". IEEE Access. 9: 27813–27826. arXiv:2012.06516. doi:10.1109/ACCESS.2021.3058423. ISSN 2169-3536. S2CID 228375825.
  3. ^ Liu, Min; Delbruck, Tobi (May 2017). "Block-matching optical flow for dynamic vision sensors: Algorithm and FPGA implementation". 2017 IEEE International Symposium on Circuits and Systems (ISCAS). pp. 1–4. arXiv:1706.05415. doi:10.1109/ISCAS.2017.8050295. ISBN 978-1-4673-6853-7. S2CID 2283149. Retrieved 27 June 2021.
  4. ^ a b c d Lichtsteiner, P.; Posch, C.; Delbruck, T. (February 2008). "A 128×128 120 dB 15μs Latency Asynchronous Temporal Contrast Vision Sensor" (PDF). IEEE Journal of Solid-State Circuits. 43 (2): 566–576. Bibcode:2008IJSSC..43..566L. doi:10.1109/JSSC.2007.914337. ISSN 0018-9200. S2CID 6119048.
  5. ^ Longinotti, Luca. "Product Specifications". iniVation. Retrieved 2019-04-21.
  6. ^ a b c Posch, C.; Matolin, D.; Wohlgenannt, R. (January 2011). "A QVGA 143 dB Dynamic Range Frame-Free PWM Image Sensor With Lossless Pixel-Level Video Compression and Time-Domain CDS". IEEE Journal of Solid-State Circuits. 46 (1): 259–275. Bibcode:2011IJSSC..46..259P. doi:10.1109/JSSC.2010.2085952. ISSN 0018-9200. S2CID 21317717.
  7. ^ Skorka, Orit (2011-07-01). "Toward a digital camera to rival the human eye". Journal of Electronic Imaging. 20 (3): 033009–033009–18. Bibcode:2011JEI....20c3009S. doi:10.1117/1.3611015. ISSN 1017-9909.
  8. ^ DxO. "Nikon D850 : Tests and Reviews | DxOMark". www.dxomark.com. Retrieved 2019-04-22.
  9. ^ "Phantom v2640". www.phantomhighspeed.com. Retrieved 2019-04-22.
  10. ^ Longinotti, Luca. "Product Specifications". iniVation. Retrieved 2019-04-22.
  11. ^ a b Serrano-Gotarredona, T.; Linares-Barranco, B. (March 2013). "A 128x128 1.5% Contrast Sensitivity 0.9% FPN 3μs Latency 4mW Asynchronous Frame-Free Dynamic Vision Sensor Using Transimpedance Amplifiers" (PDF). IEEE Journal of Solid-State Circuits. 48 (3): 827–838. Bibcode:2013IJSSC..48..827S. doi:10.1109/JSSC.2012.2230553. ISSN 0018-9200. S2CID 6686013.
  12. ^ a b Brandli, C.; Berner, R.; Yang, M.; Liu, S.; Delbruck, T. (October 2014). "A 240 × 180 130 dB 3 µs Latency Global Shutter Spatiotemporal Vision Sensor". IEEE Journal of Solid-State Circuits. 49 (10): 2333–2341. Bibcode:2014IJSSC..49.2333B. doi:10.1109/JSSC.2014.2342715. ISSN 0018-9200.
  13. ^ Taverni, Gemma; Paul Moeys, Diederik; Li, Chenghan; Cavaco, Celso; Motsnyi, Vasyl; San Segundo Bello, David; Delbruck, Tobi (May 2018). "Front and Back Illuminated Dynamic and Active Pixel Vision Sensors Comparison" (PDF). IEEE Transactions on Circuits and Systems II: Express Briefs. 65 (5): 677–681. doi:10.1109/TCSII.2018.2824899. ISSN 1549-7747. S2CID 19091270.
  14. ^ "Insightness – Sight for your device". Retrieved 2019-04-22.
  15. ^ "Event Based Vision Camera - Century Arks". Retrieved 2021-08-30.
  16. ^ Son, Bongki; Suh, Yunjae; Kim, Sungho; Jung, Heejae; Kim, Jun-Seok; Shin, Changwoo; Park, Keunju; Lee, Kyoobin; Park, Jinman (February 2017). "4.1 A 640×480 dynamic vision sensor with a 9µm pixel and 300Meps address-event representation". 2017 IEEE International Solid-State Circuits Conference (ISSCC). San Francisco, CA, USA: IEEE. pp. 66–67. doi:10.1109/ISSCC.2017.7870263. ISBN 9781509037582. S2CID 19371922.
  17. ^ Chen, Shoushun; Tang, Wei; Zhang, Xiangyu; Culurciello, Eugenio (December 2012). "A 64 × 64 Pixels UWB Wireless Temporal-Difference Digital Image Sensor". IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 20 (12): 2232–2240. doi:10.1109/TVLSI.2011.2172470. ISSN 1063-8210. S2CID 5909607.
  18. ^ "Sony to Release Two Types of Stacked Event-Based Vision Sensors with the Industry's Smallest 4.86μm Pixel Size for Detecting Subject Changes Only Delivering High-Speed, High-Precision Data Acquisition to Improve Industrial Equipment Productivity|News Releases|Sony Semiconductor Solutions Group". Sony Semiconductor Solutions Group (in Japanese). Retrieved 2021-10-28.
  19. ^ Boahen, K. (1996). "Retinomorphic vision systems". Proceedings of Fifth International Conference on Microelectronics for Neural Networks: 2–14. doi:10.1109/MNNFS.1996.493766. ISBN 0-8186-7373-7. S2CID 62609792.
  20. ^ Posch, Christoph; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabe; Delbruck, Tobi (2014). "Retinomorphic Event-Based Vision Sensors: Bioinspired Cameras With Spiking Output". Proceedings of the IEEE. 102 (10): 1470–1484. doi:10.1109/JPROC.2014.2346153. ISSN 1558-2256. S2CID 11513955.
  21. ^ Trujillo Herrera, Cinthya; Labram, John G. (2020-12-07). "A perovskite retinomorphic sensor". Applied Physics Letters. 117 (23): 233501. Bibcode:2020ApPhL.117w3501T. doi:10.1063/5.0030097. ISSN 0003-6951. S2CID 230546095.
  22. ^ Miyasaka, Tsutomu; Murakami, Takurou N. (2004-10-25). "The photocapacitor: An efficient self-charging capacitor for direct storage of solar energy". Applied Physics Letters. 85 (17): 3932–3934. Bibcode:2004ApPhL..85.3932M. doi:10.1063/1.1810630. ISSN 0003-6951.
  23. ^ "Perovskite sensor sees more like the human eye". Physics World. 2021-01-18. Retrieved 2021-10-28.
  24. ^ "Simple Eyelike Sensors Could Make AI Systems More Efficient". Inside Science. Retrieved 2021-10-28.
  25. ^ Hambling, David. "AI vision could be improved with sensors that mimic human eyes". New Scientist. Retrieved 2021-10-28.
  26. ^ "An eye for an AI: Optic device mimics human retina". BBC Science Focus Magazine. Retrieved 2021-10-28.
  27. ^ a b Scheerlinck, Cedric; Barnes, Nick; Mahony, Robert (2019). "Continuous-Time Intensity Estimation Using Event Cameras". Computer Vision – ACCV 2018. Lecture Notes in Computer Science. Springer International Publishing. 11365: 308–324. arXiv:1811.00386. doi:10.1007/978-3-030-20873-8_20. ISBN 9783030208738. S2CID 53182986.
  28. ^ Pan, Liyuan; Scheerlinck, Cedric; Yu, Xin; Hartley, Richard; Liu, Miaomiao; Dai, Yuchao (June 2019). "Bringing a Blurry Frame Alive at High Frame-Rate With an Event Camera". 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE. pp. 6813–6822. arXiv:1811.10180. doi:10.1109/CVPR.2019.00698. ISBN 978-1-7281-3293-8. S2CID 53749928.
  29. ^ Scheerlinck, Cedric; Barnes, Nick; Mahony, Robert (April 2019). "Asynchronous Spatial Image Convolutions for Event Cameras". IEEE Robotics and Automation Letters. 4 (2): 816–822. arXiv:1812.00438. doi:10.1109/LRA.2019.2893427. ISSN 2377-3766. S2CID 59619729.
  30. ^ Serrano-Gotarredona, T.; Andreou, A.; Linares-Barranco, B. (Sep 1999). "AER Image Filtering Architecture for Vision Processing Systems". IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications. 46 (9): 1064–1071. doi:10.1109/81.788808. hdl:11441/76405. ISSN 1057-7122.
  31. ^ Serrano-Gotarredona, R.; et al. (Sep 2009). "CAVIAR: A 45k-Neuron, 5M-Synapse, 12G-connects/sec AER Hardware Sensory-Processing-Learning-Actuating System for High Speed Visual Object Recognition and Tracking". IEEE Transactions on Neural Networks. 20 (9): 1417–1438. doi:10.1109/TNN.2009.2023653. hdl:10261/86527. ISSN 1045-9227. PMID 19635693. S2CID 6537174.
  32. ^ Serrano-Gotarredona, R.; Serrano-Gotarredona, T.; Acosta-Jimenez, A.; Linares-Barranco, B. (Dec 2006). "A Neuromorphic Cortical-Layer Microchip for Spike-Based Event Processing Vision Systems". IEEE Transactions on Circuits and Systems I: Regular Papers. 53 (12): 2548–2566. doi:10.1109/TCSI.2006.883843. hdl:10261/7823. ISSN 1549-8328. S2CID 8287877.
  33. ^ Camuñas-Mesa, L.; et al. (Feb 2012). "An Event-Driven Multi-Kernel Convolution Processor Module for Event-Driven Vision Sensors". IEEE Journal of Solid-State Circuits. 47 (2): 504–517. Bibcode:2012IJSSC..47..504C. doi:10.1109/JSSC.2011.2167409. hdl:11441/93004. ISSN 0018-9200. S2CID 23238741.
  34. ^ Pérez-Carrasco, J.A.; Zhao, B.; Serrano, C.; Acha, B.; Serrano-Gotarredona, T.; Chen, S.; Linares-Barranco, B. (November 2013). "Mapping from Frame-Driven to Frame-Free Event-Driven Vision Systems by Low-Rate Rate-Coding and Coincidence Processing. Application to Feed-Forward ConvNets". IEEE Transactions on Pattern Analysis and Machine Intelligence. 35 (11): 2706–2719. doi:10.1109/TPAMI.2013.71. ISSN 0162-8828. PMID 24051730. S2CID 170040.
  35. ^ a b Gallego, Guillermo; Delbruck, Tobi; Orchard, Garrick Michael; Bartolozzi, Chiara; Taba, Brian; Censi, Andrea; Leutenegger, Stefan; Davison, Andrew; Conradt, Jorg; Daniilidis, Kostas; Scaramuzza, Davide (2020). "Event-based Vision: A Survey". IEEE Transactions on Pattern Analysis and Machine Intelligence. PP: 1. arXiv:1904.08405. doi:10.1109/TPAMI.2020.3008413. ISSN 1939-3539. PMID 32750812. S2CID 234740723.
  36. ^ a b Mondal, Anindya; R, Shashant; Giraldo, Jhony H.; Bouwmans, Thierry; Chowdhury, Ananda S. (2021). "Moving Object Detection for Event-Based Vision Using Graph Spectral Clustering". International Conference on Computer Vision (ICCV) Workshops: 876–884. arXiv:2109.14979. doi:10.1109/ICCVW54120.2021.00103. ISBN 978-1-6654-0191-3. S2CID 238227007 – via IEEE Xplore.
  37. ^ Mitrokhin, Anton; Fermuller, Cornelia; Parameshwara, Chethan; Aloimonos, Yiannis (October 2018). "Event-Based Moving Object Detection and Tracking". 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Madrid: IEEE: 1–9. arXiv:1803.04523. doi:10.1109/IROS.2018.8593805. ISBN 978-1-5386-8094-0. S2CID 3845250.
  38. ^ Stoffregen, Timo; Gallego, Guillermo; Drummond, Tom; Kleeman, Lindsay; Scaramuzza, Davide (2019). "Event-Based Motion Segmentation by Motion Compensation": 7244–7253. arXiv:1904.01293.
  39. ^ Piątkowska, Ewa; Belbachir, Ahmed Nabil; Schraml, Stephan; Gelautz, Margrit (June 2012). "Spatiotemporal multiple persons tracking using Dynamic Vision Sensor". 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops: 35–40. doi:10.1109/CVPRW.2012.6238892. ISBN 978-1-4673-1612-5. S2CID 310741.
  40. ^ Chen, Guang; Cao, Hu; Aafaque, Muhammad; Chen, Jieneng; Ye, Canbo; Röhrbein, Florian; Conradt, Jörg; Chen, Kai; Bing, Zhenshan; Liu, Xingbo; Hinz, Gereon (2018-12-02). "Neuromorphic Vision Based Multivehicle Detection and Tracking for Intelligent Transportation System". Journal of Advanced Transportation. 2018: e4815383. doi:10.1155/2018/4815383. ISSN 0197-6729.
  41. ^ Mondal, Anindya; Das, Mayukhmali (2021-11-08). "Moving Object Detection for Event-based Vision using k-means Clustering". arXiv:2109.01879 [cs.CV].