Research

A Causal Analysis on NYC Public Transportation

Nathaniel del Rosario, Ilya Zaslavsky
report / code upon request

This research is an introductory analysis of the relationships between different transportation methods and socioeconomic factors in New York City. It combines geospatial (GIS) data science with basic machine learning approaches.

Multi-Modal LLM Reasoning and Agent Modeling

Nathaniel del Rosario, Zihan Liu, Trevan Nguyen, Aaryan Agrawal, Samuel Zhang, Shibo Hao, Jorge Cheng, Zhiting Hu
report / poster / code

Fall 2024. Ran compute vs. scaling experiments using Monte Carlo Tree Search with the GPT-4o, UI-TARS-7B, and DeepSeek-R1 models on the OSWorld and BrowserGym benchmarks.

Selected Projects

AI / ML

Exploring-CNN-Architecture-for-Semantic-Segmentation

Nathaniel del Rosario, Hargen Zheng, Ziyue Liu, Adam Tran, Chuong Nguyen
report / code

Semantic segmentation on PASCAL VOC 2007 using different CNN architectures, such as U-Net and transfer learning with ResNet-101.

Tokyo Transit Visualization

Nathaniel del Rosario, Trevan Nguyen
report / code

Interactive visualization of population density and the metro system across Tokyo's 23 wards.

Seq2Seq Language Translation

Nathaniel del Rosario, Hargen Zheng, Ziyue Liu, Adam Tran, Chuong Nguyen
report, code upon request

In this project, we explored several approaches to text-to-text translation, from improving a baseline model through knowledge distillation to applying transfer learning with the T5 model without fine-tuning. T5 had previously been fine-tuned primarily for languages such as French, Romanian, and German. Surprisingly, we found that it was remarkably robust when translating between languages with minimal linguistic relation, such as Chinese to English, even without any fine-tuning. We also used the Bark model to convert the translated text into audio without any additional training.

Spotify Persona Clustering

Nathaniel del Rosario
code

We ask: how do the audio features of songs, specifically Spotify tracks, compare to each other? Is there a relationship between some of these features, such as tempo correlating with danceability, energy, or liveness, and, if so, how are they correlated? Additionally, how can we use these features to cluster songs once their audio has been converted to numeric features? We aim to answer these questions by using the Spotify API to scrape up-to-date trending songs and running them through a clustering pipeline to produce persona playlists.
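As a rough illustration of the two Spotify questions (feature correlation and clustering), here is a minimal sketch in plain Python. It assumes Pearson correlation and basic k-means, which may differ from what the project actually uses. The feature values are made-up toy numbers, not Spotify data, and the helper names (`pearson`, `standardize`, `kmeans`) are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def standardize(rows):
    """Rescale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iters=50, seed=0):
    """Basic k-means; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Toy rows of [tempo (BPM), danceability, energy]; values are illustrative only.
tracks = [
    [70, 0.30, 0.20],
    [75, 0.35, 0.25],
    [80, 0.40, 0.30],
    [120, 0.70, 0.80],
    [125, 0.75, 0.85],
    [130, 0.80, 0.90],
]

# Correlation between tempo and energy across the toy tracks.
r = pearson([t[0] for t in tracks], [t[2] for t in tracks])

# Standardize first so tempo (large scale) does not dominate the distances.
labels, centers = kmeans(standardize(tracks), k=2)
print(f"tempo/energy correlation: {r:.2f}")
print("cluster labels:", labels)
```

Standardizing before clustering matters here because tempo is measured in BPM while danceability and energy lie between 0 and 1. Each resulting cluster could then be treated as one candidate "persona" playlist.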

Healthcare

Alzheimer's Prediction through Computer Vision Applications

Nathaniel del Rosario, Vladimer Em, Yosen Lin
code & report

An introductory attempt at replicating Alzheimer's prediction through fMRI classification on a simpler dataset. We apply fundamental CNN architecture choices to improve our baseline model.
