Rayan

Hi, I'm Rayan 👋

I'm a data science grad student interested in machine learning and software engineering. I enjoy figuring out how to use AI to solve real-world problems.

About

I'm pursuing an M.S. in Data Science at the University of Michigan-Dearborn, with experience in machine learning, NLP, and data analysis.

Work Experience

Education

University of Michigan-Dearborn

2023 - 2025

Master of Science in Data Science

Mediterranean Institute of Technology

2019 - 2024

Bachelor of Science in Software Engineering

Google

2025 - present

Advanced Data Analytics Professional Certificate

Skills

Python

SAS

SQL

Java

JavaScript

Scikit-learn

TensorFlow

PyTorch

GCP

Spark

Tableau

Power BI

Matplotlib

NumPy

Pandas

My Projects

Check out my latest work

I've worked on a variety of projects, from language translation pipelines to complex machine learning applications. Here are a few of my favorites.

Brain Tumor Classification with CNNs and Transfer Learning

June 2025 - August 2025

● Built and evaluated four deep learning models (Custom CNN, VGG16, EfficientNetB0, MobileNetV2) for classifying brain MRI images into four categories: glioma, meningioma, pituitary tumor, and no tumor.
● Used a labeled dataset of 3,268 training images and 1,052 testing images, preprocessed with grayscale conversion and resizing to 128×128, then augmented to improve generalization.
● MobileNetV2 outperformed the other models, reaching ~92% accuracy and a macro-averaged F1-score above 0.90 with little overfitting, which makes it well suited to deployment in resource-constrained environments.
● Implemented a full ML pipeline: loading and preprocessing data, visualizing class distributions, training models, evaluating with classification reports and confusion matrices, and comparing performance across models.
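
For readers unfamiliar with the metric, here is a minimal sketch of how a macro-averaged F1-score can be computed from a confusion matrix. The function name `macro_f1` and the 4×4 counts are hypothetical and chosen for illustration; they are not the project's code or results.

```python
def macro_f1(confusion):
    """Macro-averaged F1 from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n = len(confusion)
    f1_scores = []
    for k in range(n):
        tp = confusion[k][k]
        # Predicted as class k but actually another class
        fp = sum(confusion[i][k] for i in range(n)) - tp
        # Actually class k but predicted as another class
        fn = sum(confusion[k]) - tp
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    # Macro averaging: every class counts equally, regardless of size
    return sum(f1_scores) / n


# Hypothetical counts for four classes (not results from the project)
toy_confusion = [
    [8, 2, 0, 0],
    [1, 9, 0, 0],
    [0, 0, 10, 0],
    [0, 1, 0, 9],
]
score = macro_f1(toy_confusion)
```

Because each class contributes equally to the average, a model cannot score well by doing well only on the most common class.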

Python

Keras

OpenCV

VGG16

EfficientNetB0

MobileNetV2

CNN

Scikit-learn

Matplotlib

TensorFlow

Infant Mortality Prediction

January 2025 - April 2025

● Conducted exploratory data analysis on a dataset spanning 2,923 country-year health records to assess socioeconomic and healthcare determinants of infant mortality.
● Engineered a predictive log-log linear regression model achieving R² = 0.988 and RMSE = 0.16, outperforming standard linear and stepwise models.
● Diagnosed model fit through residual analysis, Shapiro-Wilk, and Durbin-Watson tests, and recommended hierarchical and panel-data extensions for future modeling.
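
As a rough illustration of the ideas above, the sketch below fits a log-log regression with a single predictor and computes R², RMSE, and the Durbin-Watson statistic. It is a simplification under stated assumptions: the function names and data are hypothetical, and the actual model may have used several predictors.

```python
import math


def fit_log_log(x, y):
    """Least-squares fit of log(y) = a + b * log(x).

    Returns (a, b, r2, rmse), with r2 and rmse measured on the log scale.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [v - (a + b * u) for u, v in zip(lx, ly)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((v - mean_y) ** 2 for v in ly)
    return a, b, 1 - ss_res / ss_tot, math.sqrt(ss_res / n)


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    return num / sum(r ** 2 for r in residuals)


# Hypothetical data that follows y = 3 * x^(-0.5) exactly
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [3.0 * v ** -0.5 for v in x]
a, b, r2, rmse = fit_log_log(x, y)
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])
```

In a log-log model the slope `b` reads as an elasticity: a 1% change in the predictor goes with roughly a `b`% change in the outcome.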

Python

Google Colab Pro

Linear Regression

Logistic Regression

Shapiro-Wilk

Durbin-Watson

EDA

Matplotlib

Fraud Detection Patterns in Insurance Claims

September 2024 - December 2024

This project identifies fraudulent insurance claims using machine learning. After extensive preprocessing and feature engineering on detailed claim data, we developed and compared Random Forest, Logistic Regression, XGBoost, Naive Bayes, and KNN models. Random Forest proved the most effective, achieving high accuracy and AUC scores. The work shows how data-driven approaches can address real-world challenges like fraud detection.
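
AUC is a common choice for fraud problems because fraudulent claims are usually rare, which can make plain accuracy misleading. The sketch below shows one way to compute it directly from its definition; the function name `roc_auc` and the labels and scores are hypothetical, not the project's code or data.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive scores above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = fraud) and model scores
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of examples, so it is only meant to show the definition; library implementations use a sort-based approach.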

Python

PySpark

Scikit-learn

XGBoost

Pandas

NumPy

Apache Spark

Matplotlib

Seaborn

Gaussian Noise Data Augmentation

Source

Diabetes Prediction Using AutoML and BigQuery

June 2023 - Present

This project builds a diabetes prediction system on Google Cloud: BigQuery for data preprocessing, AutoML for model training, Vertex AI for deployment, and Looker Studio for data visualization. The goal is early detection of diabetes to improve patient outcomes and resource efficiency. A Kaggle dataset is cleaned and transformed into a structured format, used to train prediction models, and summarized in dashboards that present actionable insights, with data privacy and ethical implications taken into account.

GCP

AutoML

Vertex AI

BigQuery

Looker Studio

Paper

Translating Tunisian Dialect to French using Machine Learning

January 2024 - June 2024

Developed a machine learning pipeline to automate the translation of Tunisian dialect sentences into French, replacing the inefficient manual translation process at Emrhod Consulting. The project used fine-tuned MarianMT and mBART models, combined with semi-supervised learning and a transliteration module that converts alphanumeric inputs to Arabic script. This approach significantly reduced translation time, improved accuracy, and contributed to NLP solutions for low-resource languages.
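
To give a sense of what a transliteration step involves, here is a deliberately tiny rule-based sketch. The mapping table and the function name `transliterate` are hypothetical and far from complete; the project's actual module is not shown here and may work differently.

```python
# Hypothetical, minimal mapping for illustration only.
# Digits stand in for Arabic sounds that have no Latin letter.
ARABIZI_TO_ARABIC = {
    "ch": "ش", "kh": "خ", "gh": "غ",
    "2": "ء", "3": "ع", "5": "خ", "7": "ح", "9": "ق",
    "b": "ب", "d": "د", "k": "ك", "l": "ل",
    "m": "م", "n": "ن", "r": "ر", "s": "س", "t": "ت",
}


def transliterate(text):
    """Greedy left-to-right conversion, trying two-character rules first."""
    text = text.lower()
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in ARABIZI_TO_ARABIC:
            out.append(ARABIZI_TO_ARABIC[pair])
            i += 2
        elif text[i] in ARABIZI_TO_ARABIC:
            out.append(ARABIZI_TO_ARABIC[text[i]])
            i += 1
        else:
            # Characters without a rule (spaces, punctuation, vowels) pass through
            out.append(text[i])
            i += 1
    return "".join(out)


converted = transliterate("9lb")
```

A rule table like this ignores vowels and context, which is one reason a learned or hybrid approach can be preferable for real text.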

Python

Google Colab Pro

Selenium

Pandas

NumPy

Matplotlib

Scikit-learn

PyTorch

TensorFlow

Report

Contact

Get in Touch

Have a question or want to connect? Feel free to reach out to me on LinkedIn with a direct message and I'll do my best to respond promptly.

Work Experience

Math Learning Center (University of Michigan)

Emrhod Consulting

Vermeg
