Pedro Henrique Rocha Moy
Verified Expert in Engineering
Machine Learning Developer
Pedro is a business-oriented, seasoned data scientist and data engineer with experience building and deploying production distributed data pipelines and machine learning models at scale, covering the entire data life cycle from design, construction, optimization, and deployment to the monitoring of data architectures and machine learning models. Pedro's focus is to deliver solutions that are robust to changes in environment and data and flexible enough to address changes in business requirements.
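Preferred Environment
Python, Scala, Amazon Web Services (AWS), Data Engineering, Data Science, Machine Learning, Big Data, Software Architecture
The most amazing...
...system I've built is an algorithmic and probabilistic trading system. With a limited view of the world, probabilities are essential tools in risk management.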
Work Experience
Chief Architect
Rocha Moy Trading Company
- Developed the API for probabilistic and algorithmic options trading with Interactive Brokers and TD Ameritrade. Specialties include data integration, task automation, portfolio simulation, risk mitigation, and strategy validation.
- Integrated many different data sources, from APIs to web scraping.
- Fully automated trade execution, trade scheduling, and the release of funds for trading.
Lead Data Scientist
Self-employed
- Designed, implemented, and deployed different natural language processing models.
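- Worked with stakeholders to understand use cases, the path to product development, and implementations using the deployed models.
- Mentored and supported junior data scientists on the team.
Enterprise Lead Data Architect - Contractor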
Toptal Client
- Handled the architecture, development, and automation of distributed computing pipelines and data storage in the cloud for the enterprise.
- Automated scalable infrastructure in the cloud to respond to development and consumer demand.
- Co-managed and supervised a team of engineers, from designing and delegating tasks to mentoring and overseeing work.
Enterprise Senior ETL and Data Engineer - Contractor
Toptal Client
- Designed, implemented, and deployed to production fully-fledged distributed ETL jobs in Spark/Scala API.
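- Handled a variety of data sources and sinks, including disparate files, Hive tables, Mongo collections, and Kafka brokers.
- Served as the senior engineer and tech lead of the team, strengthening engineering and development processes, improving software quality control, and helping design sprint stories.
Hadoop Proof of Concept for an Atmospheric Sciences Project - Contractor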
Toptal Client
- Built a cluster from scratch, adhering to the client's need to work with a home cluster.
- Designed and implemented generic and specific data architectures meeting the client's query complexity and performance needs.
- Built PySpark and Python software layers of abstraction to allow the client to build on top of the current infrastructure.
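Research Data Engineer
Nicklaus Children's Hospital
- Developed existing analytics and data workflows for users of R, Python, and Impala, establishing engineering best practices.
- Provided ad hoc and systematic development of ETL and big data pipelines, validation, and integration of disparate data sources.
- Served as the research department's liaison to the IT and BI departments, providing guidance and expertise on analytical and data needs.
Technical Advisor
Insight Data Science
- Worked with fellows on their data engineering projects, covering problem definition, system architecture, and execution.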
- Advised on technologies such as Spark, Kafka, Redis, HBase, Cassandra, and PostgreSQL.
- Conducted mock interviews with fellows on scalability concepts, algorithms, and CS fundamentals.
Senior Software Engineer
NexHealth
- Developed and deployed software to the client's site to perform data collection and server sync.
- Performed both database and web-based data integrations of electronic medical records back to NexHealth servers.
- Developed a smart SMS response system allowing the user to interact with NexHealth products via SMS.
Data Scientist
QuaEra Insights
- Served as the lead data scientist in a consulting project overseeing data management and modeling strategy.
- Used natural language processing to transform unstructured data into features and extract business intelligence.
- Built a recommendation engine expressed as business rules, potentially yielding savings on up to 50% of the business.
Data Engineering Fellow
Insight Data Science
- Built a platform designed to discover influencers for global brands on YouTube.
- Deployed Amazon’s EMR Spark with HBase processing and ingesting billions of data tuples.
- Achieved linear scalability in tests with up to 20 nodes.
Data Analyst
Cartesian
- Aided managed analytics efforts, promoting best practices within batch workflows and data management.
- Conducted independent research into big data workflows considering data mining and BI integration.
- Built short data pipelines consuming APIs, transforming and loading data, and exposing data connections to BI tools.
Data Analysis Engineer
Daktari Diagnostics
- Worked as the lead developer of mainstream data processing and data analysis applications in Python for Windows/Mac.
- Developed a calibration model for the Daktari CD4 testing device improving the system's accuracy by 20-30%.
- Deployed machine learning models embedded in standalone applications to end users for data classification.
Experience
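Persistent Edge and Hedged Stock Trading Strategies
http://docs.google.com/presentation/d/1zkbfErfwbJvGBXFj9UWKDvq99wkj6EBvqniA4yFNu68/edit?usp=sharing
Education
Executive MBA in Business Administration
University of Miami - Miami
Master's Degree in Computer Science (Machine Learning)
Georgia Institute of Technology - Atlanta, Georgia
Master's Degree in Earth Science and Engineering (Geophysics)
King Abdullah University of Science and Technology - Saudi Arabia
Bachelor's Degree in Mechanical Engineering
University of Massachusetts Lowell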
Skills
Libraries/APIs
Microsoft HPC, PySpark, TensorFlow, PyTorch, Scikit-learn, XGBoost, Dask, SpaCy
Tools
ChatGPT, Amazon Elastic MapReduce (EMR), Spark SQL, JMP, Impala, Git, Gensim
Languages
Python, Julia, Scala, SQL, R, SAS, JavaScript, Bash, Snowflake
Storage
NoSQL, MongoDB, Oracle SQL, Microsoft SQL Server, Redis, Cassandra, PostgreSQL, HBase, Apache Hive, Data Integration
Industry Expertise
Accounting
Paradigms
Functional Programming, Parallel Programming, Distributed Computing, Data Science
Platforms
Docker, Jupyter Notebook, Apache Kafka, Alteryx, Linux, Amazon Web Services (AWS)
Frameworks
Bootstrap, Ruby on Rails (RoR), Spark, Apache Spark, Flask, Hadoop, Streamlit
Other
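Machine Learning, Distributed Systems, OpenAI GPT-4 API, Financial Modeling, Web App UI, APIs, Data Architecture, Data Modeling, DocumentDB, Dash, Deep Learning, Natural Language Processing (NLP), Data Engineering, Artificial Intelligence (AI), Algorithms, Algorithmic Trading, Optimization, Reinforcement Learning, Time Series Analysis, Forecasting, Cloud, Numerical Optimization, Sentiment Analysis, Neural Networks, Options Trading, Web Scraping, Probability Theory, Simulations, Finance, Law, Entrepreneurship, Leadership, Big Data, Software Architecture, GPT, Generative Pre-trained Transformers (GPT), Data Analytics, Managed Analytics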