Rajiv Sarvepalli
Department of Computer Science • University of Virginia
I am an undergraduate student in the Computer Science Department at the University of Virginia. I am especially interested in the intersection of computer vision and language.
I have worked as an undergraduate researcher under Professor Yonghwi Kwon, designing ways to improve cybersecurity through machine learning techniques and data analysis strategies. One recent project collected data on Docker containers from Docker Hub to analyze Common Vulnerabilities and Exposures (CVEs) that may linger in older software versions. Last summer, I worked as a machine learning intern, developing ways to detect small objects in imagery using PyTorch and state-of-the-art object detection algorithms. I have also done several projects in machine learning more broadly; most recently, I built a Python package for basic hierarchical networks in PyTorch.
news
Attended TECHCON 2019 to present research work from the University of Virginia as a poster publication.
Work on Defending Against Persona Abuse Attacks was accepted as a poster publication by the Semiconductor Research Corporation.
selected papers
A list of interesting papers, all of which are likely unpublished (since I am still an undergraduate).
Image-Caption Geolocation for Privacy
Image geolocation, classifying the location of an input image, is a difficult problem in computer vision with many applications. In recent years, large datasets of geotagged images have become readily available to researchers, and interest in the area has grown. Current state-of-the-art models like img2gps take a deep image classification approach in which the world is split into a quadtree and the model predicts which cell an input image resides in. Unlike these approaches, which focus solely on vision, we propose a model that incorporates not only visual data but also textual data: a multi-modal model that combines an image classifier with a text-based geolocation parser. Our results indicate that differentiation between geographically similar locations is improved by the use of hierarchical models, and that while a text parser can disambiguate explicit locations from text near perfectly, it is more challenging to disambiguate colloquial, misspelled, and more specific locations.

In recent years, "anonymous" social media sources have become increasingly popular with the rise of sites like Reddit, where true identities are typically masked by usernames. With this comes the need to scan the content you post to make sure it does not reveal anything about your identity. Our goal with this project is to create a tool that allows users to scan their desired image and text pairings to see whether they reveal too much personal information. Such a tool could be extended to all forms of social media, allowing users to control the amount of information they share with the world.
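To illustrate the quadtree idea behind img2gps-style geolocation, here is a minimal sketch (my own illustration, not the project's actual code): the globe is recursively split into four cells per level, and the classifier's target is the cell an image's coordinate falls into. This function just computes that cell path for a latitude/longitude pair.

```python
def quadtree_cell(lat, lon, depth):
    """Return the sequence of quadrant indices (0-3) locating (lat, lon)
    in a quadtree over the lat/lon bounding box, down to `depth` levels."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    path = []
    for _ in range(depth):
        lat_mid = (lat_lo + lat_hi) / 2
        lon_mid = (lon_lo + lon_hi) / 2
        # Quadrant encoding: bit 0 set = eastern half, bit 1 set = northern half.
        q = (1 if lon >= lon_mid else 0) | (2 if lat >= lat_mid else 0)
        path.append(q)
        # Shrink the bounding box to the chosen quadrant.
        lat_lo, lat_hi = (lat_mid, lat_hi) if lat >= lat_mid else (lat_lo, lat_mid)
        lon_lo, lon_hi = (lon_mid, lon_hi) if lon >= lon_mid else (lon_lo, lon_mid)
    return path

# Charlottesville, VA is at roughly (38.03, -78.48): northern and western
# hemispheres, so the first quadrant index is 2 (north + west).
print(quadtree_cell(38.03, -78.48, 3))  # → [2, 1, 2]
```

In the actual classification setting, each leaf cell at the chosen depth becomes one class label, and deeper trees trade coarser coverage per cell for more classes; in practice img2gps-style models adapt the splitting so that densely photographed regions get finer cells.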