Akshay Pakhle - Data Scientist

Akshay Pakhle

Seattle, WA

A Data Scientist by Qualification, a Problem Solver by Mind and a Tech Enthusiast by Heart. Also, a die-hard FC Barcelona fan.


Experience

Sr. Data Scientist

  • Facilitating end-to-end ML projects as Lead Data Scientist for the Return to Stock (RTS) project, from establishing a proof of concept to delivering and maintaining a production Machine Learning model that predicts Return to Stock prescriptions in CVS pharmacy stores.

  • Contributing to two other projects alongside owning the RTS project; both require a combination of software development and Machine Learning skills to continuously develop and maintain them according to business requirements.

  • Maintaining regular communication with business partners to interpret model performance, analyze false positives, and collaborate on identifying issues and resolving them through carefully designed solutions.

Jun 2021 - Present

Data Scientist

  • Developed an ML lifecycle tracking dashboard using MLflow to version iterated ML models, log model artifacts, and monitor model performance metrics, all in one place for leadership.

  • Contributed to an ongoing ML project for a tech-based client, which required executing distributed queries on RDDs with PySpark and performing big data experimentation with Spark ML on Databricks.

Feb 2021 - Jun 2021

Machine Learning Engineer

Intern at Rho AI

  • Implemented custom NLP tools such as a Document Classifier, Named Entity Recognizer and Text Augmentation tool in the codebase of Rho AI's enterprise Machine Learning platform Sermos, by fine-tuning state-of-the-art models such as Longformer, GPT-2, BERT, etc.

  • Experimented with and analyzed various unsupervised Machine Learning approaches to build a similarity-based recommender system that clusters similar companies using key datapoints, for users searching for specific companies on CRANE's web platform.

May 2020 - Present

Columbia - Data Science Institute Scholar

September 2020 - Present

Graduate Teaching Assistant

Columbia University

  • Graduate teaching assistant for MECEE 4520 - Data Science for Mechanical Systems by Dr. Josh Browne.

  • My primary responsibilities involved grading assignments and projects covering the course syllabus, and holding office hours to help students with foundational concepts of Data Science, spanning Probability Theory, Statistics, Programming and Machine Learning.

January 2020 - April 2020

Graduate Teaching Assistant

Columbia University

  • Graduate teaching assistant for COMS 4995 - Machine Learning for Financial Applications by Dr. German Creamer.

  • Instructed students on foundational supervised ML algorithms such as Naive Bayes, regression, decision trees/random forests, SVMs, HMMs, etc. Also responsible for grading and evaluating assignments designed on Quantopian.

June 2020 - August 2020

Graduate Course Assistant

Columbia University

  • Graduate teaching assistant for COMSE6998 - Fundamentals of Speech Recognition by Dr. Homayoon Beigi.

  • My primary responsibilities involved grading assignments and projects covering the course syllabus, and holding office hours to help students with their questions on the material.

January 2020 - April 2020

Python Developer

Intern at Acadview.com

  • Developed various sample Python projects such as an Instagram Bot, a Genre-based Song Classifier and a Web Crawler, which served as real-world examples of final deliverables for the course 'Introduction to Python Programming' by Acadview.com (now acquired by upGrad.com).

Feb 2018 - April 2018


Education

Columbia University in the City of New York

Master of Science
Data Science

GPA: 3.7/4

Fall 2019 - Dec 2020

University of Mumbai

Bachelor of Engineering
Computer Engineering

GPA: 8.96/10

July 2015 - May 2019

Projects

Covid-19 Country-Wise Modeling and Forecasting

DSI, Columbia University

  • This was our attempt at DSI's Covid-19 forecasting challenge. We set up an automated web-scraping system and pre-processing pipeline to gather daily updated COVID-19 data from various sources (JHU, WHO, OWID, etc.).
  • Fit models to the country-wise spread of the disease, including logistic curves, XGBoost and piecewise linear curves, to model and forecast the number of cases, deaths and recoveries for May 1, 2020.

Check the Colab File on GitHub

April 2020-May 2020

SincNet based Speaker Recognition

Final project for COMSE6998 - Fundamentals of Speech Recognition, Columbia University

  • Python implementation of the paper "Speaker Recognition from raw waveform with SincNet" by Mirco Ravanelli and Yoshua Bengio.
  • I used the VoxCeleb2 recipe (7363 speakers) in Kaldi to generate Sinc-convolution based features, which were then fed through a CNN+DNN ensemble architecture and finally classified with a SoftMax layer.
  • The model achieved a WER of 5.9%, close to the best WER yet of 5.5% on these tasks.

Check the Project Colab File on GitHub

Check Project Report

September 2019 - January 2020

NYC Motor Vehicle Collisions

Final Project for EDAV (STAT W5702), Columbia University

  • Our analysis focuses on the impact of the Vision Zero program introduced by the Mayor of New York, Bill de Blasio, in 2014. We analyze the overall trends in vehicular collisions in the state of New York. By measuring the effect of policies made by the city's administration, we aim to provide insights to substantiate future policy updates. We use the Motor Vehicle Collisions dataset provided by NYC Open Data for our analysis (https://data.cityofnewyork.us/Public-Safety/Motor-Vehicle-Collisions-Crashes/h9gi-nx95).
  • This analysis investigates the most frequent causes of accidents, the vehicle types involved in the most crashes, the intersections most vulnerable to crashes, the increase in severity of crashes, etc.

Check the Project on GitHub

Check Project Report

October 2019-December 2019

Interactive Physiotherapy with Kinect 2.0

University of Mumbai

  • Facilitated the use of Microsoft Kinect 2.0 to sense limb movements of patients undergoing physiotherapy, and designed games targeting specific elbow and wrist movements for physical rehabilitation.
  • Recorded results showed that this method made the recovery process more interactive, and the average recovery time of the tested patients was quite impressive.

Check the Project on GitHub

Check Project Report

June 2018-May 2019

Marketing Network for Agriculture Commodities

Smart India Hackathon 2019

  • Developed an algorithm that dynamically assigns trucks/vehicles to incoming orders on the platform, which enabled farmers to sell their goods directly to wholesale buyers in urban areas.
  • It was a web-based solution to connect farmers directly with wholesalers at APMC Markets, eliminating middlemen. Selected for the Finals of Smart India Hackathon 2018, from over 300 submissions.

Check out the project idea

Check out the project implementation on Github.

February 2018 - March 2018

Student Portal Web App

University of Mumbai

  • This is a Django-based web application for managing student activities online.
  • It's a portal for sharing various college-related files and a hub for notices and important college-related info.

Check the Project on GitHub

October 2018 - December 2018


Programming Languages & Tools

Awards & Certifications