Featured Projects

Showcasing innovative solutions that demonstrate expertise in machine learning and data analytics

Machine Learning Projects

Classification, regression, and predictive analytics

Home Loan Default Predictor

Home loan default prediction across 58.44M records in 7 datasets. An XGBoost classifier with 143 engineered features achieves 83% accuracy, 52.8% recall, and 0.785 ROC AUC at a decision threshold of 0.60. Handles class imbalance (8% default rate) and multicollinearity, reduces memory usage by 68.5%, and saves 1,200+ hours of manual review through automated risk assessment.

Python XGBoost Scikit-learn Jupyter
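
To illustrate what scoring a classifier at a 0.60 decision threshold involves, here is a minimal sketch. It is not the project's code: the labels and probabilities are invented toy values, and the helper names classify_at_threshold and recall_and_accuracy are hypothetical.

```python
def classify_at_threshold(probabilities, threshold=0.60):
    """Turn predicted default probabilities into 0/1 labels."""
    return [1 if p >= threshold else 0 for p in probabilities]


def recall_and_accuracy(y_true, y_pred):
    """Return (recall on the positive class, overall accuracy)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    positives = true_pos + false_neg
    recall = true_pos / positives if positives else 0.0
    return recall, correct / len(pairs)


# Toy data (assumed for illustration only): 1 = default, 0 = repaid.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
probs = [0.10, 0.20, 0.65, 0.30, 0.05, 0.40, 0.70, 0.55, 0.15, 0.90]

y_pred = classify_at_threshold(probs, threshold=0.60)
recall, accuracy = recall_and_accuracy(y_true, y_pred)
print(f"recall={recall:.3f} accuracy={accuracy:.3f}")
```

Raising the threshold generally trades recall for fewer false alarms, which is why the threshold is reported alongside the metrics.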

INX Employee Performance Analysis

Employee performance prediction using XGBoost multiclass classification (92.5% accuracy, 93.3% CV F1-score) with SHAP interpretability. Analyzes 1,200 employee records across 28 features, identifies top 3 performance drivers, and provides HR recommendations. Full pipeline includes EDA, feature engineering, model comparison, and deployment-ready inference.

Python XGBoost SHAP Scikit-learn

FicZon Sales Effectiveness

B2B sales lead quality prediction using XGBoost classifier. Achieves 81.06% ROC AUC and 84.74% recall on 7,420 IT sales leads. Handles class imbalance, high-cardinality categoricals, and missing data through frequency encoding and threshold optimization. Includes statistical analysis, cross-validation, feature importance, and actionable business insights.

Python XGBoost Scikit-learn Jupyter
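
Frequency encoding, mentioned above for high-cardinality categoricals, replaces each category with its share of the training rows. The sketch below shows the idea under stated assumptions: the lead-source values are invented, the function names are hypothetical, and the project itself may implement the encoding differently.

```python
from collections import Counter


def fit_frequency_encoding(values):
    """Map each category to its relative frequency in the training data."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def apply_frequency_encoding(values, mapping, default=0.0):
    """Encode values with a fitted mapping; unseen categories get `default`."""
    return [mapping.get(value, default) for value in values]


# Toy training column (assumed): the channel a sales lead came from.
train_sources = ["web", "referral", "web", "call", "web", "referral", "email", "web"]
mapping = fit_frequency_encoding(train_sources)

# "chat" never appears in training, so it falls back to the default.
encoded = apply_frequency_encoding(["web", "call", "chat"], mapping)
print(encoded)
```

Fitting the mapping on training data only, then reusing it at inference time, keeps information from the test set out of the features.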

Portuguese Bank Campaign Predictor

Binary classification model predicting term deposit subscriptions for a Portuguese bank, built with LightGBM on the UCI ML dataset (41,188 records). Handles class imbalance (7.87:1), multicollinearity (VIF > 26k), and data leakage. Test results: 87.8% accuracy, 57.3% recall, and 81.1% ROC AUC, tuned with cross-validation for banking campaign targeting.

Python LightGBM Scikit-learn Jupyter
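
The description does not say which technique was used against the 7.87:1 imbalance. Class weighting is one common option, sketched here with the n_samples / (n_classes * count) heuristic that scikit-learn applies for class_weight="balanced". The label counts are toy values chosen to reproduce the ratio, and the function names are hypothetical.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels (assumed): 787 "no" for every 100 "yes" gives a 7.87:1 ratio.
labels = [0] * 787 + [1] * 100

ratio = imbalance_ratio(labels)
weights = balanced_class_weights(labels)
print(f"ratio={ratio:.2f} weights={weights}")
```

The minority class receives the larger weight, so misclassifying a subscriber costs the model more than misclassifying a non-subscriber.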

Capital Bikeshare Demand Forecaster

Daily bike rental demand prediction using Ridge Regression on Capital Bikeshare data (2011-2012). Addresses multicollinearity, zero-inflated features, and non-normal distributions. Test R²=0.832, CV R²=0.815±0.032. Includes statistical analysis, VIF-based feature removal, and a comparison of 10 regression algorithms.

Python Ridge Regression Scikit-learn Jupyter
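
The variance inflation factor (VIF) behind the multicollinearity checks is 1 / (1 - R²), where R² comes from regressing one feature on all the others. The following is a minimal NumPy sketch of that definition, not the project's implementation: the synthetic weather columns and the function name variance_inflation_factors are assumptions for illustration.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of X, computed as 1 / (1 - R^2)."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    vifs = []
    for j in range(n_features):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept.
        design = np.column_stack([np.ones(n_samples), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(float("inf") if r_squared >= 1.0 else 1.0 / (1.0 - r_squared))
    return vifs


# Synthetic data (assumed): "feels like" temperature nearly duplicates
# temperature, while humidity is generated independently.
rng = np.random.default_rng(0)
temp = rng.normal(20.0, 5.0, size=200)
feels_like = temp + rng.normal(0.0, 0.5, size=200)
humidity = rng.normal(60.0, 10.0, size=200)

X = np.column_stack([temp, feels_like, humidity])
vifs = variance_inflation_factors(X)
print([round(v, 2) for v in vifs])
```

The two temperature columns get large VIFs while humidity stays close to 1, so dropping one of the correlated pair is the usual remedy.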

Automobile Price Predictor

Regression model predicting 1985 automobile prices. The Lasso model achieves 91.7% R² and generalizes better than XGBoost. Addresses extreme multicollinearity (VIF 16,676→8.36), data leakage, and outliers through PCA and domain-driven feature engineering, and compares 10 regression algorithms.

Python Lasso PCA Scikit-learn

Development Tools & Automation

CLI tools, desktop apps, and automation frameworks

SpectraTact

Desktop application management tool with intelligent window orchestration. Manage and arrange multiple applications across monitors with customizable grid and side-by-side layouts. Windows 10/11. Enhances productivity through smart workspace organization, automated application control, and multi-monitor support.

Electron Python Windows Automation

AutoCSV Profiler

Automated CSV data analysis tool with an interactive CLI. Features automatic delimiter detection, memory-efficient processing, statistical profiling, and visualization generation. Supports Python 3.8-3.13. Published on PyPI (v2.0.0) with comprehensive documentation.

Python 3.8+ CLI Pandas Rich
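
Automatic delimiter detection can be approximated with the standard library's csv.Sniffer. This sketch is only one possible approach and is not taken from AutoCSV Profiler's source: the detect_delimiter helper, the candidate delimiter set, and the sample data are all assumptions.

```python
import csv


def detect_delimiter(sample, candidates=",;\t|", fallback=","):
    """Guess the delimiter of a CSV text sample, or return `fallback`."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer raises csv.Error when no consistent delimiter is found.
        return fallback


# Toy samples (assumed for illustration).
semicolon_sample = "name;age;city\nAna;34;Lisbon\nBen;29;Porto\nCara;41;Faro\n"
pipe_sample = "id|score\n1|0.5\n2|0.7\n3|0.9\n"

delimiter = detect_delimiter(semicolon_sample)
rows = list(csv.reader(semicolon_sample.splitlines(), delimiter=delimiter))
print(repr(delimiter), rows[0])
```

Sniffing only the first few kilobytes of a file is usually enough, which keeps detection cheap even for large inputs.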

Web Development

Portfolio sites and web applications

Portfolio Website (dhaneshbb.github.io)

Static portfolio website built with vanilla JavaScript ES6 modules and Bootstrap 5. Component-based architecture with dynamic nav/footer loading, modular CSS organization, and Intersection Observer animations. Uses ESLint, Stylelint, and Prettier for code quality, Lighthouse CI for performance monitoring, and an optimized GitHub Pages deployment.

JavaScript Bootstrap 5 GitHub Pages ESLint
View My Code

Explore More Work

Dive deeper into my data science journey by exploring my Jupyter notebooks, Kaggle competitions, and open-source contributions. Each repository tells a story of problem-solving, innovation, and continuous learning in the field of data science and machine learning.

GitHub Repositories

Complete source code, detailed documentation, and comprehensive project implementations with step-by-step analysis

View GitHub Profile

Kaggle Notebooks

51+ data science notebooks covering machine learning, deep learning, geospatial analysis, time series, and AI fairness with comprehensive exercises and implementations

Explore Kaggle Profile

Jupyter Notebooks

Interactive data analysis notebooks with visualizations, insights, and detailed methodology explanations

View Notebooks