Parag Dutta
I am a Ph.D. student in the Dept. of CSA at IISc, Bangalore where I am associated with the StatsML Group working under
the guidance of Prof. Ambedkar Dukkipati. I can be found in Room 205 at CSA.
I work in the domain of Artificial Intelligence, specifically towards learning good representations for Reinforcement Learning.
Although most of my existing work is in the domain of statistical machine learning and its applications in areas like
computer vision and natural language processing, I am currently excited by thinking machines and the corresponding
theoretical aspects of artificial intelligence.
I am also a full stack web development hobbyist, and volunteered with the IISc CSA department web team from September 2019 to March 2021.
Education and Associations
Indian Institute of Science, Bangalore
GPA: 9.0 / 10
Thesis: -
Indian Institute of Science, Bangalore
GPA: 8.6 / 10 (Awarded Distinction)
Thesis: Reducing Annotation Effort in Supervised Machine Learning
[Link]
(Project Grade: 10 / 10)
Government College of Engineering and Leather Technology, Kolkata
GPA: 8.9 / 10
Thesis: Dimensionality reduction of Genome Features for Colon-cancer prediction
(Project Grade: 10 / 10)
DAV Public School, Kolkata
Percentage: 86.4%
DAV Public School, Kolkata
CGPA: 9.0 / 10
Publications
Causal Feature Alignment: Learning to ignore spurious background features
Authors: Rahul V., Parag Dutta, Vikram M. and Ambedkar Dukkipati
Deep Representation Learning for Predicting Temporal Event sets in the continuous time domain
Authors: Parag Dutta, Kawin M., Pratyaksha S., and Ambedkar Dukkipati
CRUSH: Contextually Regularized and User anchored Self-supervised Hate speech Detection
Authors: Souvic Chakraborty, Parag Dutta, Sumegh Roychowdhury and Animesh Mukherjee
Active² Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation
Authors: Rishi Hazra, Parag Dutta, Shubham Gupta, Md. Abdul Qaathir and Ambedkar Dukkipati
Machine Reading Comprehension for ranking search engine query results by relevance - A review
Authors: Parag Dutta and Debayan Ganguly
Experience and Projects
Contextual detection of code-mixed textual hate speech in Social Networks
We are working with graph neural networks and pretrained transformers for detecting code-mixed hate speech in social network posts. We use pretrained transformers to exploit their language understanding capabilities and graphs to leverage the context from the tree structure among the posts/comments/replies, and we train the whole model in an end-to-end manner.
CoviHawkes: District-wise daily Covid-19 case prediction
I was involved in temporal modelling of daily Covid-19 cases for Indian states and districts.
We used inter-district and intra-district mobility features in graphs to predict case counts with low MAPE values.
We validated our model's effectiveness extensively by comparing it against various baselines like SIR,
a 2-layer MLP, LSTMs, GCN, GAT, etc.
I also helped with the website design and poster designs. This project was featured on the IISc website homepage
and the IISc EECS division website homepage.
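As a small illustration of the MAPE metric mentioned above, here is a minimal sketch in Python. The function name `mape`, the sample numbers, and the choice to skip days with zero actual cases are assumptions made for this example, not details of the CoviHawkes evaluation.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    # Assumption: entries with a zero actual count are skipped,
    # because the percentage error is undefined for them.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Hypothetical daily case counts for one district.
actual_cases = [100, 200, 50]
predicted_cases = [110, 180, 50]
error = mape(actual_cases, predicted_cases)
print(f"MAPE: {error:.2f}%")
```

A lower value means the predicted counts are, on average, closer to the observed ones in relative terms.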
Few-shot Transfer Learning in Medical Imaging from Sparse Annotations
I was involved in implementing two methods for medical image segmentation, in particular spine segmentation, in the low-data regime. Our aim was to train a model on data for which annotations are easily and readily available (lumbar spine images) such that it generalises well to similar but previously unseen regions (thoracic/cervical spine). First, we implemented a contour propagation model that uses priors obtained in a human-in-the-loop setting to generalise to regions of the spine not seen during training. We showed that this reduces the effort involved in annotating new data. Second, we implemented a transfer learning approach in which we pretrained a model on a large amount of partially annotated lumbar spine data (source domain) and fine-tuned the network on a limited amount of data with complete whole-spine annotations (target domain). We showed that this method gives a boost in performance with minimal complete-volume supervision at training time and without any supervision required at inference. We achieved Dice similarity coefficients of up to 0.86 using as few as 20 volumes of CT scans.
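The Dice similarity coefficient reported above measures the overlap between a predicted mask and a reference mask as twice the intersection divided by the sum of the two mask sizes. The sketch below is only an illustration on flattened binary masks; the function name `dice_coefficient`, the toy masks, and the empty-mask convention are assumptions, not the project's evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 5 voxels, 2 of which are foreground in both masks.
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 1, 1]
score = dice_coefficient(pred_mask, true_mask)
print(f"Dice: {score:.3f}")
```

A score of 1.0 means the masks coincide exactly and 0.0 means they do not overlap at all.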
Active Learning for Neural Machine Translation
I was involved in developing a deep learning model which can be used to identify and eliminate redundancies in existing Active Learning methods on a variety of tasks. While my co-authors focused on tagging-based tasks, I focused on machine translation. I used Siamese networks pretrained with a similarity loss to get embeddings of the sentences, followed by clustering and selecting only the most relevant sentences for training an NMT model. We showed that our model beats existing active learning methods by performing better after consuming the same amount of data on all the NLP tasks we experimented on.
Explainable classification based on mitochondria treatment from microscope images
I had to develop an end-to-end object classification model from the ground up. Since this was a project in the biological sciences domain, the main focus was to develop a model that was not only accurate, but also highly explainable. Additionally, the model needed to be lightweight, because the target users are not expected to have specialized hardware to run it. I developed a novel convolutional architecture with just 4 million parameters, which not only achieved an accuracy of 93% but also gave highly explainable results with saliency maps. For comparison, a non-explainable model based on a Fast R-CNN architecture achieved more than 96% accuracy.
Deepfake Detection in High Compression Videos
Deepfakes are images or videos which are forged by inserting the facial features of one person onto another person's face using deep generative models like GANs or VAEs. Deepfakes are hard to detect as it is, and even more so when the video is compressed. Most of the videos streamed online are compressed. Hence, I developed a simple CNN based architecture, inspired by the discriminator from DCGANs, to detect facial forgeries in highly compressed videos. It detected 92% of the forged videos by sampling only 6 to 10 frames per video and taking less than 200 ms per frame, thus outperforming all existing methods by over 9% in accuracy. Additional experiments pushed the accuracy to 95% by considering sequential relations in the video frames.
Polymorphic Malware Detection using Deep Learning
Using the Microsoft Malware Classification Dataset and files from my system, I created a CNN based
binary malware/benign classifier after converting the files to their corresponding byte-wise grayscale
images.
Then I also created a simple polymorphic malware in Python, which was then used to
train the classifier using transfer learning. It is just a proof of concept that polymorphic malware
can be detected using a probabilistic deep learning model (because it cannot be detected using deterministic
approaches).
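The byte-wise grayscale conversion described above can be sketched as follows: each byte of a file becomes one pixel with intensity 0 to 255, and the byte stream is wrapped into rows of a fixed width. The function name `bytes_to_grayscale_rows`, the row width, and the zero padding are assumptions for illustration; the original project's image dimensions are not stated.

```python
def bytes_to_grayscale_rows(data: bytes, width: int = 16, pad_value: int = 0):
    """Wrap raw bytes into rows of `width` pixels, one byte per pixel (0-255)."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        # Assumption: the last row is padded with zeros so the image is rectangular.
        row.extend([pad_value] * (width - len(row)))
        rows.append(row)
    return rows


# Hypothetical 5-byte file prefix, wrapped into an image 4 pixels wide.
image = bytes_to_grayscale_rows(b"MZ\x90\x00\x03", width=4)
print(image)  # [[77, 90, 144, 0], [3, 0, 0, 0]]
```

The resulting 2-D list can be handed to any image or array library before being fed to a CNN.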
Prediction of colon cancer based on genome features
A comparative study of the performance of various machine learning models in the low-data regime. A supervised dataset was provided which had around 2000 features but only 62 observations. The work was mostly driven towards dimensionality reduction. I experimented with various dimensionality reduction algorithms like PCA, LDA, ICA, PLS, etc. On the reduced dataset I tried different machine learning algorithms like Random Forest Ensembles, Support Vector Machines, Bayesian Networks, K-Nearest Neighbors, etc. The best results were obtained with a PLS-reduced dataset of 40 features along with a Linear SVM.
Providing relevant answers to questions asked in search engine queries using MRC
The problem was to provide to-the-point, precise answers to questions like 'What is the speed of light?', to which the answer should be '299,792,458 m/s'. The query along with the top search results would be provided, and after sorting them according to our requirements, we needed to extract the relevant text and display it directly to the user if a certain confidence threshold was achieved. To do this I used a multi-hop attention network to generate a ranking over relevant passages given a user query. It was a pipelined model, which used pretrained transformers, and significantly outperformed most of the models in the competition.
Social media sentiment analysis
A labelled dataset of 1.6 million tweets was provided for training a deep learning model which analyses the sentiment of a given tweet on a scale of -1 (negative) to 1 (positive). I started with basic frequency-based classification, followed by RNNs, LSTMs and GRUs. Later I also tried attention and obtained significantly higher accuracy. It was basically a comparison-based project to determine the capabilities of various recurrent neural network models.
Swarm Intelligence among drones for movement and reconnaissance
Proposed an intelligent system to control the movement of a swarm of drones as if it were a single unit. The individual drones would learn how to collaborate with others for information sharing and move stochastically without collision using particle swarm optimization techniques. Moreover, the whole swarm would have a hive-mind-like intelligence that could be used for group reinforcement learning, and the drones would also share computational resources among themselves. The architecture was extremely scalable, and hence additional hardware could be used to implement features like FLIR, real-time identity searches, signal jamming, and gait analysis, which are extremely useful for surveillance.
MERN Full-stack Web Development (Internship)
I was involved in the ground-up development of a dynamic, responsive, and reactive progressive web application. The project was developed from scratch using the MERN stack. A RESTful web API service was developed in the backend, using MongoDB Cloud services for the database, NodeJS as the server-side runtime, and a combination of ExpressJS & Restify as the frameworks for authentication and REST API services respectively. As for the client-side frontend of the system, a Progressive Single-Page Application (P-SPA) was created using ReactJS to handle communication with the server.
Vehicle Numberplate Recognition at security checkpoint
I was involved in the development of a system that could use live footage from the cameras positioned at security gates to automatically read the number plate of a vehicle that entered the premises and log it into a database. After detecting the region of interest using convolutional filters, I used a basic SSD algorithm for OCR, and a modified version of the busy-wait locking protocol for synchronization.
Autonomous ATV movement based on live camera feed
An autonomous All Terrain Vehicle was supposed to traverse a given terrain, visiting some specified locations in an order such that the distance travelled is shortest. On the path between two locations there were some obstacles which it was supposed to avoid. The current position and orientation of the ATV, the locations to be visited, and the obstacles to be avoided had to be detected from the feed of an overhead camera mounted vertically over the terrain. I used the A* algorithm to calculate the shortest path, and various image processing techniques to rectify the image, detect colors in the terrain, recognize shapes and their orientations, etc. As for the hardware, the robot was a 4-wheel drive, in which 32-step stepper motors were used to provide precise movement.
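The A* step mentioned above can be illustrated on a small occupancy grid. This is a minimal sketch assuming 4-connected movement, unit step costs, and a Manhattan-distance heuristic; the grid, the function name `astar`, and these modelling choices are assumptions for illustration and not the original ATV implementation.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid; grid[r][c] == 1 marks an obstacle."""
    def heuristic(cell):
        # Manhattan distance never overestimates with unit step costs.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {start: None}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None  # goal unreachable


# Hypothetical terrain: a wall forces a detour around the right-hand side.
terrain = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = astar(terrain, (0, 0), (2, 0))
print(path)
```

Visiting several locations in the best order would additionally require comparing the pairwise path lengths between the locations, which this sketch does not cover.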
Skills
- Predictive Analytics
- Statistical Analysis
- Machine Learning
- Deep Learning
- Computer Vision
- Natural Language Processing
- Reinforcement Learning
- Automated Software Engineering
- Causal Inference
Interests and Hobbies
Apart from keeping myself updated on the latest innovations and trends in technology, I enjoy spending most of my time indoors. I love reading novels and watching TV sitcoms. Apart from that, I follow a number of sci-fi and fantasy movies and television shows. I am a hardcore PC gamer. I like to explore varieties of music and sketch in my free time. I also like to solve puzzles, do competitive coding, and participate in quizzes.
When forced outside, I enjoy playing basketball and table tennis. I also enjoy teaching and motivating the young generation to pursue their dreams. I like travelling all over the country and studying the cultures and heritage of the places I visit. I must admit that I have a knack for photography, which comes in handy during my travels.
Achievements, Awards & Certifications
- Completed a course on Practical Reinforcement Learning, with honors [certificate] from National Research University, Higher School of Economics - 2020
- Completed a course on Machine Learning [certificate] by Andrew Ng from Stanford University - 2020
- McAfee Best Project Award for systems security course project - 2019
- Secured 99.8 percentile (out of 99932) - Graduate Aptitude Test in Engineering (GATE) - 2019
- Qualified for the final round of Microsoft AI Challenge - 2018
- Regional Finalist (East Zone) - DRDO's Robotics and Unmanned Systems Exposition - 2018
- Semi Finalist - ISRO's National Student Space Challenge (Colonizer Event) - 2017
- Attended a course on "Data Mining" by Sunil Vadera of Salford University, Manchester (arranged by GIAN) and got certified - 2016
- Credited a course on "Robotic Vision" (Image processing for machines) by Peter Corke and received a Statement of Attainment from Queensland University of Technology, Brisbane, Australia - 2016
Extracurricular achievements during 4 years of B.Tech.
- 2nd Place - Table Tennis Tournament - GCELT's intra college sports fest - 2019
- 1st Place - Website Designing - GCELT's inter college techno creative fest Enginerds' - 2018
- 3rd Place - Quiz Competition - GCELT's inter college techno creative fest Enginerds' - 2018
- 2nd Place - Line follower Robot - GCELT's inter college techno creative fest Enginerds' - 2017
- 2nd Place - Line follower Robot - GCELT's inter college techno creative fest Enginerds' - 2016
- 3rd Place - Need For Speed Most Wanted - GCELT's inter college techno creative fest Enginerds' - 2015