Biography

I am a Postdoctoral Researcher at the Smart Eyewear Lab (Politecnico di Milano & EssilorLuxottica), working on event-based vision for intelligent wearable devices, with a particular focus on eye tracking and embedded perception systems. Previously, I was a member of the Robotics and Perception Group at the University of Zurich, led by Prof. Davide Scaramuzza, where I worked on visual perception with event-based cameras. My contributions span neural radiance field reconstruction under fast motion, low-power perception with efficient neural networks and neuromorphic hardware, and egocentric and monocular vision for real-time applications. My work increasingly focuses on bridging research and deployment, including the design of real-time perception pipelines and on-device inference systems for resource-constrained platforms. I have also explored integrating differentiable algorithms into end-to-end trainable models, with a focus on improving their robustness and explainability.

Interests
  • Eye Tracking
  • Egocentric Perception
  • Event-Based Cameras
  • Representation Learning
  • Deep Learning
  • Computer Vision
Education
  • Postdoc in Event-Based Cameras and Computer Vision, 2022-2025

    Robotics and Perception Group, University of Zurich

  • PhD Student in Information Technology, 2018-2022

    Politecnico di Milano

  • MSc in Computer Science and Engineering, 2018

    Politecnico di Milano

  • BSc in Computer Science and Engineering, 2015

    Politecnico di Milano

Publications

(2025). GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control. arXiv preprint arXiv:2412.11198.

(2024). Deep Visual Odometry with Events and Frames. 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

(2024). FARSE-CNN: Fully Asynchronous, Recurrent and Sparse Event-Based CNN. European Conference on Computer Vision.

(2024). FaVoR: Features via Voxel Rendering for Camera Relocalization. arXiv preprint arXiv:2409.07571.

(2024). Low-power event-based face detection with asynchronous neuromorphic hardware. 2024 International Joint Conference on Neural Networks (IJCNN).

(2024). Mitigating motion blur in neural radiance fields with events and frames. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

(2024). Monocular Event-Based Vision for Obstacle Avoidance with a Quadrotor. arXiv preprint arXiv:2411.03303.

(2024). Revisiting token pruning for object detection and instance segmentation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision.

(2023). A 5-point minimal solver for event camera relative motion estimation. Proceedings of the IEEE/CVF International Conference on Computer Vision.

(2023). Deep learning for asteroids autonomous terrain relative navigation. Advances in Space Research.

(2023). NeuralPUMA: Learning to Phase Unwrap Through Differentiable Graph Cuts. IEEE Transactions on Signal Processing.

(2023). SoftCut: A Fully Differentiable Relaxed Graph Cut Approach for Deep Learning Image Segmentation. International Conference on Machine Learning, Optimization, and Data Science.

(2022). 6 DoF Pose Regression via Differentiable Rendering. International Conference on Image Analysis and Processing.

(2022). E²(GO)MOTION: Motion Augmented Event Stream for Egocentric Action Recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

(2021). DA4Event: Towards Bridging the Sim-to-Real Gap for Event Cameras Using Domain Adaptation. IEEE Robotics and Automation Letters.

(2021). Neural Weighted A*: Learning Graph Costs and Heuristics with Differentiable Anytime A*. International Conference on Machine Learning, Optimization, and Data Science.

(2021). Skeleton-based action recognition via spatial and temporal transformer networks. Computer Vision and Image Understanding.

(2021). N-ROD: a Neuromorphic Dataset for Synthetic-to-Real Domain Adaptation. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

(2021). Spatial Temporal Transformer Network for Skeleton-Based Action Recognition. Pattern Recognition. ICPR International Workshops and Challenges.

(2020). A Differentiable Recurrent Surface for Asynchronous Event-Based Data. Computer Vision – ECCV 2020.

(2019). Asynchronous Convolutional Networks for Object Detection in Neuromorphic Cameras. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

(2019). Attention Mechanisms for Object Recognition With Event-Based Cameras. 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

Contact

Marco Cannici

Smart Eyewear Lab, Politecnico di Milano and EssilorLuxottica
Via Giovanni Pascoli, 70/3, 20133 Milano MI