Research

A Causal Analysis on NYC Public Transportation

Nathaniel del Rosario, Ilya Zaslavsky
report / code upon request

This research presents an introductory analysis of the relationships between transportation methods and socioeconomic factors in New York City. It uses geospatial (GIS) data science as well as basic machine learning approaches.

Multi-Modal LLM Reasoning and Agent Modeling

Nathaniel del Rosario, Zihan Liu, Trevan Nguyen, Aaryan Agrawal, Samuel Zhang, Shibo Hao, Jorge Cheng, Zhiting Hu
report / poster / code

Fall 2024. Ran compute vs. scaling experiments using Monte Carlo Tree Search with the GPT-4o, UI-TARS-7B, and DeepSeek-R1 models on the OSWorld and BrowserGym benchmarks.
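As a rough illustration of the selection step in Monte Carlo Tree Search, the sketch below scores candidate actions with the standard UCT rule. The project description does not say which selection rule or exploration constant was used, so the formula, the constant `c`, and the action names are all assumptions made for this example.

```python
import math

def uct_score(child_value_sum, child_visits, parent_visits, c=1.4):
    """Standard UCT score: mean value plus an exploration bonus.

    Unvisited children get an infinite score so they are tried first.
    """
    if child_visits == 0:
        return math.inf
    exploit = child_value_sum / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

def select_child(children, parent_visits, c=1.4):
    """Pick the action with the highest UCT score.

    `children` maps an action name to a (value_sum, visits) pair.
    """
    return max(
        children,
        key=lambda name: uct_score(*children[name], parent_visits, c),
    )

# Hypothetical agent actions with made-up statistics.
children = {"click_menu": (4.0, 5), "type_query": (0.5, 1)}
greedy_choice = select_child(children, parent_visits=6, c=0.0)
exploring_choice = select_child(children, parent_visits=6, c=1.4)
```

With `c=0.0` the rule is purely greedy and prefers the action with the higher mean value. A larger `c` shifts the choice toward actions that have been tried less often, which is one way extra search compute gets spent.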

Selected Projects

AI / ML

Exploring-CNN-Architecture-for-Semantic-Segmentation

Nathaniel del Rosario, Hargen Zheng, Ziyue Liu, Adam Tran, Chuong Nguyen
report / code

Semantic segmentation on PASCAL VOC 2007 using different CNN architectures, such as U-Net and transfer learning with ResNet-101.

Tokyo Transit Visualization

Nathaniel del Rosario, Trevan Nguyen
report / code

An interactive visualization of population density across Tokyo's 23 wards and of its metro system.

Seq2Seq Language Translation

Nathaniel del Rosario, Hargen Zheng, Ziyue Liu, Adam Tran, Chuong Nguyen
report / code upon request

In this project, we explored several approaches to text-to-text translation, from improving a baseline model through knowledge distillation to applying the pretrained T5 model without additional fine-tuning. T5 had previously been fine-tuned primarily for languages such as French, Romanian, and German. Surprisingly, we found that it remained robust when translating between languages with minimal linguistic relation, such as Chinese to English, even without any fine-tuning. We also used the BARK model to convert the translated text into audio, with no additional training.

Spotify Persona Clustering

Nathaniel del Rosario
code

We ask: how do the audio features of Spotify tracks compare to one another? Are features such as tempo correlated with danceability, energy, or liveness, and if so, how? And how can we use these numeric audio features to cluster songs? We answer these questions by using the Spotify API to collect up-to-date trending songs and running them through a clustering pipeline to produce persona playlists.
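A minimal sketch of what such a clustering step could look like is shown below. It assumes k-means on standardized features; the project description does not name the algorithm actually used, and the feature values here are made up for illustration rather than pulled from the Spotify API.

```python
import math
import random

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical tracks: [tempo (BPM), danceability, energy].
tracks = [
    [128.0, 0.85, 0.90],
    [124.0, 0.80, 0.88],
    [130.0, 0.90, 0.92],
    [72.0, 0.30, 0.20],
    [68.0, 0.25, 0.15],
    [75.0, 0.35, 0.25],
]
labels, centroids = kmeans(standardize(tracks), k=2)
```

Standardizing first matters because tempo is measured in beats per minute while the other features lie between 0 and 1, so unscaled distances would be dominated by tempo alone.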

Healthcare

Alzheimer's Prediction through Computer Vision Applications

Nathaniel del Rosario, Vladimer Em, Yosen Lin
report / code

An introductory attempt at replicating Alzheimer's prediction through fMRI classification on a simpler dataset. We apply fundamental CNN architecture choices to improve our baseline model.