Dr. David S. Batista
Employed, Lead Natural Language Processing Engineer, Comtravo
Berlin, Germany
About me
Experienced in both research and industry, I enjoy taking solutions from concept to production and transforming natural language text into structured data. I have tackled problems with strong Machine Learning and Natural Language Processing components, involving tasks such as information extraction, classification, clustering, and information retrieval. I consider myself a practical problem solver and like to deliver production-ready software, not just results.
• Homepage: http://www.davidsbatista.net
• GitHub: https://github.com/davidsbatista
• Publications: http://goo.gl/uihrcx
Career
Professional experience of David S. Batista
• Leading the Automation team, which works on the system that automatically answers incoming email travel requests and assists travel agents in handling them.
• Developed several modularised Python components with type annotations, building algorithms to map input text to the corresponding unique identifiers in a target knowledge base, e.g. airports, train stations, hotels, and geographic locations.
• Trained and evaluated models for text classification and fine-grained NER, increasing the performance of the system.
• Built and maintained several ETLs using PySpark (Apache Spark) and Hive.
• Developed the first prototype for managing ETL pipelines based on Airflow operators, which later went into production and was used by the team.
• Built a classifier using NLTK and linear models from scikit-learn to identify mentions of different types of issues with the meal kits in customer reviews.
• Technologies: Python, NLTK, scikit-learn, PySpark, Hive, Airflow
Education of David S. Batista
2011 - 2015
PhD - Information Extraction and Natural Language Processing
Instituto Superior Técnico
Languages
Portuguese
Native language
English
Fluent
German
Fluent