Brian is available for hire

Brian Todd

Verified Expert in Engineering

Computer Vision Developer

Location

Innsbruck, Austria

Toptal Member Since

July 3, 2019

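Brian is an experienced data scientist and machine learning engineer with a strong track record of researching and deploying models for a variety of natural language processing (NLP) tasks, computer vision algorithms, classical machine learning algorithms, Bayesian statistical models, time series analysis models, and large-scale data mining algorithms.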

Portfolio

Parabolica Labs

XGBoost, Natural Language Toolkit (NLTK), SpaCy, Dask, Databricks, Hadoop...

Twosense, Inc.

Amazon Web Services (AWS), Git, SciPy, Scikit-learn, Pandas, NumPy, SQL, Cython...

Skedaddle

SciPy, Scikit-learn, NumPy, Pandas, Snowflake, SQL, Flask, Python

Experience

Python - 8 years
Scikit-learn - 6 years
SQL - 6 years
NumPy - 6 years
Pandas - 6 years
Data Science - 6 years
Generative Pre-trained Transformers (GPT) - 4 years
Natural Language Processing (NLP) - 4 years

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Linux, Scikit-learn, Pandas, SciPy, NumPy, SQL, C++, Cython, Python

The most amazing...

...project I've developed is a patented, real-time deep learning pipeline that detects and classifies user behavior based on sensor data from wrist-worn devices.

Work Experience

Data Scientist | Machine Learning Engineer Contractor

2019 - PRESENT

Parabolica Labs

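Provided machine learning and data science solutions focused on natural language processing (NLP), time series analysis problems, applications of machine learning to sensor data, deep learning models, and Bayesian statistical models.
Developed a production-grade conversational AI/chatbot (Rasa) for a Fortune 500 healthcare company that allows users to interact with their accounts and plan features through conversation.
Performed in-depth analysis (SpaCy, NLTK, Textacy) and created data visualizations on large-scale text datasets to extract semantics, keywords, phrases, intents, and other linguistic features in support of product development and market expansion.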
Developed production models for classifying cohorts of users (Cython, Databricks, SciPy, Scikit-learn, XGBoost) using matrix factorization and prior behavioral data.
Developed distributed data pipelines for processing and transforming large financial time series datasets using Cython, C++, Dask, and Parquet.

Technologies: XGBoost, Natural Language Toolkit (NLTK), SpaCy, Dask, Databricks, Hadoop, Scikit-learn, Pandas, NumPy, SQL, Cython, Python

Machine Learning Engineer

2018 - 2019

Twosense, Inc.

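Researched, developed, and deployed a suite of machine learning models (NumPy, SciPy, scikit-learn, XGBoost) to verify users' identities based on behavioral biometrics collected from sensors on phones and computers.
Wrote large-scale data processing scripts for model training and testing on real-time biometric data using NumPy, AWS Redshift, and Pandas.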
Produced Jupyter notebooks visualizing model validation metrics, data transformations, and critical data analysis.
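Guided best practices, led technical sessions, collaborated on project specifications, and wrote significant amounts of research documentation. I was hired as the first member of the machine learning and data science team.
Wrote unit test suites for data processing, feature extraction, and model validation.
Extracted features from large, disparate datasets for model development.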

Technologies: Amazon Web Services (AWS), Git, SciPy, Scikit-learn, Pandas, NumPy, SQL, Cython, Python

Senior Data Scientist

2017 - 2018

Skedaddle

Developed a production-grade API for pricing algorithms using NumPy, scikit-learn, Lambda, and API Gateway.
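Built and maintained a complete data pipeline platform that reads public APIs using EC2, Lambda, and Snowflake, for creating time series models predicting product demand.
Wrote a serverless web application displaying data visualizations and real-time monitoring of key metrics using Flask, Zappa, and D3.js.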
Provided ad hoc analysis for all domains within the organization, and guided other team members in their analyses.

Technologies: SciPy, Scikit-learn, NumPy, Pandas, Snowflake, SQL, Flask, Python

Senior Data Scientist

2016 - 2017

Whoop, Inc.

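Led team code reviews and collaborated on quarterly team goals and project direction.
Researched, developed, and deployed novel, deep convolutional neural networks for classifying activities based on multidimensional sensor data using PyTorch.
Developed and maintained data pipelines from Redshift and PostgreSQL databases.
Built real-time activity detection algorithms for biometric time series data using NumPy and SciPy.
Researched and developed convolutional autoencoder models for compressed sensing using PyTorch.
Wrote a real-time algorithm based on accelerometer and biometric data to detect how a user is wearing the sensor, using NumPy, SciPy, and scikit-learn.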

Technologies: Amazon Web Services (AWS), SciPy, NumPy, Pandas, Scikit-learn, Redshift, TensorFlow, PyTorch, SQL, Python

Associate Data Scientist

2014 - 2016

Cogo Labs

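Implemented a Python library for A/B testing using Bayesian statistics and other measurement tools.
Developed statistical methods for mining URLs from user clickstream data (Presto) and developed neural networks (TensorFlow, Keras) to model user-level characteristics from browsing history.
Applied NLP techniques (scikit-learn, NumPy) to cluster advertising campaigns based on content similarity.
Wrote algorithms (NumPy, SciPy, Pandas, Scikit-learn) for user campaign selection and developed models based on clickstream data, market intent, and demographics.
Built tools to monitor key metrics and score production models daily using Python, MySQL, PostgreSQL, Presto, and MapReduce.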

Technologies: Scikit-learn, NumPy, SciPy, Pandas, Keras, TensorFlow, SQL, Python

Full-stack Developer

2013 - 2014

Microsoft Project Users Group

Maintained existing infrastructure and developed new back-end features and APIs on a legacy LAMP (PHP) stack.
Performed front-end work using HTML5, CSS, and JavaScript to improve user experience and fix legacy bugs.
Wrote Python scripts to automate the processing of data and reports from third-party APIs and internal database sources.

Technologies: JavaScript, Python, PHP, MySQL, Apache, Linux

Projects

ABayesian

http://github.com/brian-todd/ABayesian

Python package for Bayesian hypothesis testing.

CyDTW

http://github.com/twosense/CyDTW

High performance DTW library written in Cython for Python 3.x.

FX-Data-Loader

http://github.com/brian-todd/fx-data-loader

This project serves as an extensible framework for processing and cleaning historical FX data from Dukascopy data feeds.

CFBBot

http://github.com/brian-todd/CFBBot

A chatbot for answering questions about several college football teams. Built using Rasa, SpaCy, and requests.

Patented Activity Recognition Algorithm

http://patents.google.com/patent/US10548513B2/

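Various techniques are used to automatically collect and classify exercise data gathered by a wearable physiological monitor. The classification process is staged in order to characterize exercise types correctly and efficiently. Initially, motion and heart rate data are used to detect a general exercise event. Then a location of the monitor on a user is determined. An artificial intelligence engine can then be conditionally applied (if an exercise is occurring and a suitable device location is detected) to identify the type of exercise. In addition to improved speed and accuracy, an exercise detection process implemented in this manner can be realized with a sufficiently small computational footprint for deployment on a wearable physiological monitor.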

US Patent US10548513

Education

2009 - 2013

Bachelor of Science Degree in Mathematics

University of Michigan - Ann Arbor, Michigan, USA

Skills

Libraries/APIs

LSTM, NumPy, SciPy, Pandas, Scikit-learn, PyTorch, XGBoost, Matplotlib, SpaCy, Natural Language Toolkit (NLTK), Dask, Keras, TensorFlow, OpenCV, D3.js

Tools

GitHub, Jupyter, Git, Seaborn, Rasa.ai, Sublime Text, Apache, Gensim, Apache Impala

Languages

Python, SQL, C, PHP, JavaScript, C++, Bash, Snowflake

Paradigms

Data Science, Unit Testing, MapReduce

Platforms

Jupyter Notebook, Linux, Amazon Web Services (AWS), Databricks, Docker, Unix, Google Cloud Platform (GCP), NVIDIA CUDA

Storage

PostgreSQL, MySQL, Redshift, SQLite, Apache Hive

Frameworks

Flask, Hadoop, Presto, Spark

Other

Data Analysis, Data Visualization, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNNs), Gated Recurrent Unit (GRU), Data Analytics, Scientific Computing, Numerical Methods, Numerical Modeling, Machine Learning, Computer Vision, Natural Language Processing (NLP), A/B Testing, Mathematics, Statistics, Deep Learning, Data Mining, GPT, Generative Pre-trained Transformers (GPT), Cython, Graphics Processing Unit (GPU), GPU Computing, Google BigQuery
