Deepayan Das

Hello! I'm a Ph.D. student at the University of Trento, working with Professor Elisa Ricci. My research focuses on compositionality and semantic concept learning in neural networks. Before this, I was a research assistant at the Machine Learning Lab at IIT Hyderabad, where I explored semantic concept grounding under the guidance of Professor Vineeth Balasubramanian. I completed my master's at IIIT-Hyderabad, where I had the privilege of working with Professor CV Jawahar on improving OCR performance under limited supervision. I have also worked in industry as a Data Scientist at Myntra.

Email  /  GitHub  /  Google Scholar  /  LinkedIn  /  Blog

Research

I'm interested in computer vision and machine learning.

Adapting OCR with limited supervision


Deepayan Das and CV Jawahar
DAS, 2020
arxiv / code

We explore the problem of adapting an existing OCR, already trained for a specific collection, to a new collection with minimal supervision or human effort. We explore three popular strategies for this: (i) Fine Tuning, (ii) Self Training, and (iii) Fine Tuning + Self Training, and discuss how these popular Machine Learning approaches can be adapted to the text recognition problem of our interest.

A Cost Efficient Approach to Correct OCR Errors in Large Document Collections


Deepayan Das, Jerin Philip, Minesh Mathew and CV Jawahar
ICDAR, 2019
arxiv / slides

Traditional post-processing schemes look at error words sequentially since OCRs process documents one at a time. We propose a cost-efficient model to address the error words in batches rather than correcting them individually. We demonstrate the efficacy of our solution empirically by reporting a more than 70% reduction in human effort with near-perfect error correction.