I have worked on several data science projects using XGBoost, LDA, BERTopic, PyTorch, CNNs, RNNs, and LSTMs. During Summer 2024, I worked on the Data Science team at New Atlantis Lab, where I contributed to developing open-source machine learning models for predicting ecosystem health.
Data Science and Deep Learning Projects
Topic Recognition on New York Times (NYT) Articles
- Developed a topic recognition system using 42,000 data points.
- Created a recommendation system based on topic similarity using LDA and BERTopic models.
- Analyzed topic correlations, verified keyword accuracy, and evaluated trends.
- Produced time series data to track the popularity of 400+ emergent topics over the past year.
Deoxygenation Indicators and Forecasting
- Preprocessed a dataset of more than 100,000 data points.
- Identified features correlated with oxygen levels using XGBoost and time series XGBoost models.
- Built an LSTM neural network model to forecast oxygen levels.
Industry Experience
New Atlantis Lab, Los Angeles, CA
Erdos Fellow (06/2024 - 08/2024)
- Contributed to developing open-source machine learning models for predicting ecosystem health.
- Processed large datasets and optimized algorithms for better predictions.
- Applied time series modeling to forecast oxygen levels in the ocean.