Zhenxiao leads the Interactive Analytics Team at Uber. Previously, he led the development and operations of Presto at Netflix and worked on big data and Hadoop-related projects at Facebook, Cloudera, and Vertica. Zhenxiao holds a master's degree from the University of Wisconsin-Madison and a bachelor's degree from Fudan University.
Lu is with the Hadoop Infrastructure Team at Uber, mainly working on Presto and geospatial data analysis. Before Uber, Lu worked on the Yahoo Ads and Yahoo Finance teams, where he built Yahoo's user profile system and financial data serving system, respectively. Lu holds a CS master's degree from the University of Southern California and a bachelor's degree from Sun Yat-Sen University.
Tim Spann was a Senior Solutions Architect at AirisData working with Apache Spark and Machine Learning. Previously he was a Senior Software Engineer at SecurityScorecard (http://securityscorecard.com/) helping to build a reactive platform for monitoring real-time 3rd party vendor security risk in Java and Scala. Before that he was a Senior Field Engineer for Pivotal focusing on CloudFoundry, HAWQ and Big Data. He is an avid blogger and the Big Data Zone Leader for DZone (https://dzone.com/users/297029/bunkertor.html).
He runs the very successful Future of Data Princeton meetup with over 830 members at http://www.meetup.com/futureofdata-princeton/.
He is currently a Solutions Engineer at Hortonworks in the Princeton, New Jersey area.
You can find all the source and material behind his talks at his Github and Community blog:
Nagaraj Jay is a Systems Architect at Hortonworks. His main interests are in Spark ML, AI, Flink, Geospatial data replication and data integration.
Carlo A. Curino received a Bachelor's degree in Computer Science from Politecnico di Milano. He participated in a joint project between University of Illinois at Chicago (UIC) and Politecnico di Milano, obtaining a Master's degree in Computer Science from UIC and the Laurea Specialistica (cum laude) from Politecnico di Milano.
During his PhD at Politecnico di Milano, he spent almost two years as a visiting researcher at University of California, Los Angeles (UCLA) working with Prof. Carlo Zaniolo (UCLA) and Prof. Alin Deutsch (UCSD). He then spent two years as a Postdoctoral Associate at CSAIL MIT working with Prof. Samuel Madden and Prof. Hari Balakrishnan.
At MIT he was also the primary lecturer for the course on databases CS630, taught in collaboration with Mike Stonebraker. He spent a year as a Research Scientist at Yahoo! Research. Currently Carlo is a Principal Research Scientist in the Microsoft Cloud and Information Services Lab (CISL).
Carlo's recent research interests include: large scale distributed systems, performance tuning, scheduling. In the past he worked on: mobile+cloud platforms, entity dedup at scale, relational databases and cloud computing, workload management and performance analysis, schema evolution, temporal databases.
Subru Krishnan is a Principal Research Engineer at Microsoft in the Cloud and Information Services Lab (CISL) currently focusing on YARN, specifically scaling it to 100K+ nodes and providing SLA guarantees. He has been working on the Hadoop ecosystem since 2007. Prior to Microsoft, he worked at Yahoo! where he contributed to Oozie's precursor, near real-time stream processing on Hadoop and HBase replication.
Charles Houchin is a computer scientist at the Air Force Technical Applications Center (AFTAC). His experience includes developing systems for government and military customers ranging from enterprise to mobile applications. Most recently, he has created data acquisition and analysis solutions involving technologies from the Hadoop ecosystem, including Apache Accumulo, Apache NiFi, and Apache Spark.
William N. Junek, Charles A. Houchin, Joseph A. Wehlen, John E. Highcock, Marcus Waineo; Acquisition of Seismic, Hydroacoustic, and Infrasonic Data with Hadoop and Accumulo. Seismological Research Letters; 88 (6): 1553–1559. doi: https://doi.org/10.1785/0220170056
John Highcock is a Systems Architect at Hortonworks. Prior to joining Hortonworks, he worked on big data projects at the US Department of Justice.
Saurabh is a Systems Architect with strong expertise in the Hadoop ecosystem and rich field experience. He helps enterprises large and small solve their business problems strategically, functionally and at scale by leveraging big data technologies. He has hands-on experience building, coding and directing successful information technology initiatives.
Saurabh has over 14 years of strong IT experience and has served in key positions as Lead Big Data Solution Architect, Performance Architect, and Technology Architect in multiple large and complex enterprise programs.
He has extensive knowledge of Big Data/NoSQL technologies including Hadoop, YARN, Spark, HBase, Hive, Pig, Storm, Kafka, NiFi etc. and has been working in this space for the last 6+ years.
Saurabh has architected and designed big data platforms and applications that comprise thousands of nodes, tens of petabytes of data, and complex ETL workflow requirements.
Saurabh has provided solutions for GDPR requirements and just-in-time analytics, leveraging co-located datasets at scale to provide insight and pattern detection. He builds data pipelines that produce results in minutes or hours across petabytes of data, and discovers new ways to co-locate, integrate and leverage disparate datasets using Lambda and HTAP big data architectures and IoT applications.
Arun is a Distributed Systems Architect from Hortonworks who has Authored/Implemented/Optimized Big Data Pipelines for many Fortune 500 firms like Apple, Microsoft, SAP, ADP, Fidelity, Expedia, Monsanto, T-Mobile etc.
Enterprise Big Data Pipelines typically solve Analytics/Warehousing Challenges and comprise Streaming/Batch/Hybrid processing paradigms. Comprehending the nuances of these petabyte-throughput pipelines and analyzing their data flow efficiencies requires an in-depth architectural understanding of the frameworks and components used within pipeline applications. Arun’s unique ability to visualize pipeline dataflows horizontally (data partitioning, shuffling, combining, at a distributed framework level) and vertically (application request, cache, page-cache, disk layer) helps design/untangle/unclog/optimize pipelines very efficiently.
Distributed Datasets, Map Reduce based Directed Acyclic Graph solutions, Distributed Stream Processing, Column Oriented storage, Column Family Oriented Stores, Dynamic Partitioned Rings, Machine Learning Algorithms and TFD based Spark/Hadoop ETL workflows are some of the topics within Arun’s area of interest/expertise.
Prior to Hortonworks, Arun spent about a decade with Monsanto, working on biotech research in Hadoop as well as various custom in-house distributed parallel processing/storage platforms. Other previous experience includes stints at ADP, Caliper and Yahoo and research work for NASA. Arun’s expertise includes SOA, enterprise security, distributed computing/caching, and scalable read/write behind caching architectures. Arun holds a master’s degree in computer science from University of Alabama and a bachelor’s degree in engineering from Sri Venkateswara College of Engineering, India.
Venkatesh is a senior data scientist at PayPal where he is working on building state-of-the-art tools for payment fraud detection. He has over 20 years of experience in designing, developing and leading teams to build scalable server-side software. In addition to being an expert in big data technologies, Venkatesh holds a Ph.D. degree in Computer Science with specialization in Machine Learning and Natural Language Processing (NLP) and has worked on various problems in the areas of Anti-Spam, Phishing Detection, and Face Recognition.
Janet Li has over 15 years' experience in the IT industry in the areas of databases, analytics and big data. Janet has managed internal and external teams of architects, database administrators, big data architects and infrastructure providers for large IT projects. Most recently Janet is the lead of the Hortonworks Hadoop Data Lake for HP Inc. Janet has a bachelor’s degree from Wuhan University, China and a Master's Degree in Computer Science from the University of St. Thomas, MN, US. Janet is based in Austin, Texas where she enjoys spending time with her family and hiking with her dog in the Texas Hill Country.
Pranay is an accomplished Hadoop Architect and Engineer with hands-on experience in Hadoop technologies, including installation, maintenance and upgrade of HDP clusters and development using Pig, Hive, Spark, Solr, HBase, Flume, and Storm. Pranay has over 12 years of experience with multiple technologies including server administration, Java technologies, .NET technologies and mainframe applications.
As a principal consultant at T4G with over 20 years’ experience, Darryl believes that ‘better delivery is achieved through better design’ on IT projects. This experience in design and delivery of software solutions cuts across a variety of industries including the retail, insurance, manufacturing, financial, and government sectors. Darryl specializes in application design with a focus on data science platforms under the Hadoop ecosystem and custom application development. In an architect role, he has produced technical design deliverables on many large enterprise projects and recently provided technical guidance on several Hadoop-related projects for national telecom and banking clients.
Kenneth Poon is a Director of Data Engineering in the Data & Analytics (DNA) group, responsible for architecting, building, and delivering solutions to enable RBC to become a data-driven organization. He has built several large-scale products across the enterprise, specializing in real-time streaming applications. He is currently focused on building out the bank's Channel Analytics Platform.
Toni LeTempt is a senior technical expert at Walmart. Toni has 18 years’ IT experience, five of them working with large secure enterprise Hadoop clusters.
I have over 20 years of experience with relational database platforms. I have transitioned to working with the Hadoop ecosystem over the last 3 years.
Nishant is a Druid PMC member and a Software Engineer at Hortonworks. He is part of the Business Intelligence team at Hortonworks. Prior to that he was part of the Metamarkets backend team and was responsible for analytics infrastructure, including real-time analytics in Druid. He holds a B.Tech in Computer Science from National Institute of Technology, Kurukshetra, India.
Joe Olson is senior manager of big data analytics at United Airlines, focusing on running a big data warehouse and streaming data analytics.
Prior to United Airlines, Joe worked at several startup companies in Chicago developing both big data and streaming data architectures built around emerging open source frameworks.
Jon Ingalls is a Software Engineer with Hortonworks with almost 30 years of experience in various IT capacities including Consultant, Programmer, Quality Assurance/Tester, Software Developer, and Software Engineer. As a software engineer over the past 18 years, Jon has worked with several large and small software organizations including Oracle, Endeca, IBM, Cognos, and ParAccel, selling software for MPP Database, Business Intelligence, Search, and Data Discovery. Jon has been focused on the Big Data/Hadoop space since 2014 by bringing solutions to his customers’ real-world use cases leveraging several of the major Hadoop distributions including Hortonworks. Jon holds a B.S. in Computer Sciences from Northern Illinois University (1991).
Ryan Blue works on open source projects, including Iceberg, Spark, Parquet, and Avro, at Netflix.
Shaoxuan Wang is a senior engineering manager at Alibaba, leading the development of Apache Flink SQL. He is a committer of the Apache Flink project. Before Alibaba, Shaoxuan worked on Facebook's TAO project, a data store for the social graph. Shaoxuan received his Ph.D. in Computer Engineering from the University of California, San Diego.
Xiaowei Jiang leads the StreamCompute Platform for AliCloud. The platform provides real-time data processing both internally and externally.
Born in the era of great music, Marc is best known for his ability to cause "Hello World" to segfault. He works on distributed systems with Apache Accumulo and Apache NiFi, but don't let his charm deceive you. At his heart he is still just a de-referenced null pointer. In fact, this biography is better told via assembly language:
mov rbp, rsp
mov QWORD PTR [rbp-8], rdi
mov QWORD PTR [rbp-8], 0
mov rax, QWORD PTR [rbp-8]
mov eax, DWORD PTR [rax]
Timo Walther is a committer and PMC member of the Apache Flink project. He studied Computer Science at TU Berlin. Alongside his studies, he participated in the Database Systems and Information Management Group there and worked at IBM Germany. Timo works as a software engineer at data Artisans. In Flink, he is mainly working on the Table & SQL API.
Ian Downard is a senior developer evangelist and open source ambassador at MapR. He frequently publishes screencasts and technical articles relating to machine learning and data science on the MapR blog and on his personal blog at http://bigendiandata.com. He enjoys connecting with people at meetups and leads the Java User Group in Portland, Oregon.
Will Ochandarena is Senior Director of Product Management at MapR, responsible for Cloud and IoT product strategy. He spends time with customers across several industries, including manufacturing, retail, and energy, helping them use MapR’s converged data fabric at the edge to solve new and interesting business problems. He also writes blogs on MapR.com - https://mapr.com/blog/author/will-ochandarena/.
Edwina Lu is a software engineer on LinkedIn’s Hadoop infrastructure development team, currently focused on supporting Spark on the company’s clusters. Previously, she worked at Oracle on database replication. Edwina holds a Master's degree in Computer Science from Stanford University.
Ye Zhou is a software engineer on LinkedIn’s Hadoop infrastructure development team, mostly focusing on Hadoop YARN and Spark-related projects. Ye holds a Master's degree in computer science from Carnegie Mellon University.
Satya N Ramachandran is the vice president of engineering at Neustar, a provider of real-time, cloud-based information services. He has more than 20 years of experience in distributed computing and large-scale analytics. Prior to Neustar, Satya led engineering at MarketShare and co-founded JovianDATA, a large-scale analytics platform built entirely on the cloud. He has held senior engineering roles in teams that built real-time and distributed analytics engines at Cognos, 3ParData and Sybase.
Satya holds a Master’s in Computer Science from the Indian Institute of Science, with emphasis in databases and compilers. Satya holds several patents in distributed computing and has been a presenter at several conferences.
Gustavo Arocena is a Big Data Architect at the IBM Toronto Lab, with over 15 years of experience in database technology. Recently, Gustavo led the design and implementation of several components of the Big SQL engine, including the Hive-compatible IO layer, the INSERT statement, the integration with Apache Spark and the high-performance ORC ingestion layer.
Gustavo has several publications and has presented at multiple conferences. He holds a Master's degree in Computer Science from the University of Toronto in the area of database language processing.
Sr. Technologist for American Water working with HDP and HDF.
Senior Technologist at American Water working on HDP & HDF.
Conrad Fernandes is a longtime cyber security engineer and architect, and has worked extensively with US Defense agencies and the DoD since the early 2000s while at Booz Allen Hamilton. Conrad currently serves as a senior cyber security engineer at the Johns Hopkins Applied Physics Laboratory (APL), where he leads security and governance practices on emerging cloud technologies, including commercial and US GovCloud (e.g., Amazon Web Services) and Hadoop-based data science platforms from Cloudera and Hortonworks. Conrad recently presented strategies for "Incident Response and Spillage Handling in AWS" at Amazon's 2018 re:Invent Conference. Additionally, Conrad has been researching and implementing security and audit logging and monitoring strategies on data science platforms at Johns Hopkins Medical Institute (JHMI) that utilize various emerging security services found within Hortonworks Data Platform (HDP). Conrad also enjoys sharing security best practices and lessons learned from his experiences with the larger cloud and big data community.
Chao Sun is a Software Engineer at Uber, working on Hadoop Infrastructure, including stacks such as Hive and HDFS. Before that, he was a Software Engineer at Cloudera, where he worked on various projects including Hive on Spark and RecordService.
Inigo works as a research software developer at Microsoft Research in the System Research Group, currently focusing on HDFS, specifically scaling it to 100K+ nodes and making it able to harvest idle resources. He has been working on the Hadoop ecosystem since 2011 and is a committer. Prior to Microsoft, he worked at Rutgers University as a postdoctoral researcher.
Matt Laudato is Director of Product Management at Teradata, where his team focuses on delivering end to end analytics solutions to the Global 500. His product portfolio includes open source tools like Kylo for data lake management, and Presto for fast, interactive queries across multiple data sources. Prior to Teradata, Matt created and ran the Data Science team at Constant Contact, and has held leadership positions in the SaaS, ERP, Telecom and Software Tools industries. His prior employers include Dun and Bradstreet, Openwave, and McKesson. Matt holds degrees in Physics from Stony Brook University and the University of Wisconsin, Madison and lives in Boston, MA.
Peter MacKenzie is Teradata’s Services Director for Artificial Intelligence in America and is based in the San Francisco Bay Area. He is responsible for the successful delivery of Artificial Intelligence projects in America. Previously, Peter was the Director of Services for Think Big Analytics where he led a number of Data Science and Big Data projects. Peter holds a BComm in Management Science and a Master’s in Computer Science from McGill University, where he was also pursuing a PhD. He left the program and founded a startup based on his research in Distributed Computing.
Weiwei Yang is a staff software engineer at Alibaba, where he focuses on evolving the data infrastructure to serve large-scale data processing. He has worked on the Hadoop ecosystem since 2010. He is passionate about open source contributions and is an active Hadoop committer. He works on various improvements to both HDFS and YARN to make them a better fit for internet-scale use cases. Prior to this, he worked at IBM for more than 6 years and was one of the founding members of the BigInsights product team. He received his master's degree from Peking University and his bachelor's degree from Wuhan University, China.
Chunde Ren is an engineering manager, leading the development of Apache YARN at Alibaba.
Hebert Pereyra is a Senior Technical Staff Member (STSM) working in the Hybrid Data Management Division of IBM Analytics. He has been with IBM for over 20 years focusing on the research, development and support of data management systems. He works at the IBM Canada Software Lab and is the Chief Architect for IBM Big SQL which extends the power and performance of enterprise-level SQL solutions to Hadoop ecosystems.
Nagapriya (Priya) Tiruthani is a Product Manager for IBM Db2 Big SQL. Her focus has been on defining the product roadmap, actively engaging with customers to understand their use cases, and making sure their needs are met in the product while keeping up with what's going on in the Hadoop ecosystem. Prior to this role, she was a UI and backend Java developer working on many of the Db2 tools. She holds a Master of Science in Computer Engineering from San Jose State University, San Jose, USA and a Bachelor of Engineering in Electronics from Bharathidasan University in India.
Pan Yuxuan has been a software development engineer in the Big Data products division of China Mobile (Suzhou) Software Technology Co. since 2014.
He focuses on researching and developing the CMH (China Mobile Hadoop) products.
He is the Project Manager of the China Mobile Centralized Business Analysis System and is now also responsible for the Data Management Suite at China Mobile.
Duan Yunfeng is the chief designer of China Mobile's big data system, and he is a postdoctoral researcher in the Information Processing Department of Peking University.
Dr. Duan Yunfeng has undertaken the design, construction and operation of China Mobile's data warehouse and big data center, and has accumulated 16 years of practical experience in the field of big data.
He has led the team from system construction through system operation and maintenance, and has established many big data applications.
He has authored more than 150 technical design documents totaling about 12 million words, covering the data model, data interfaces, system architecture, quality control, business applications, system security and other areas of the big data system.
He has published two books: "Big Data and Big Analysis" and "Big Data Internet Thinking".
China Mobile's big data system currently has 16,000 x86 nodes and more than 500 PB of system capacity, with a data volume of 300 PB covering the active data of 800 million customers.
An InfoSec Generalist. CISSP. My more than a decade of work experience spans all aspects of security, mainly Secure SDLC, Source Code Analysis, Vulnerability Assessment, Penetration Testing for Web Applications, Architecture Review, Incident Response, ISMS Compliance, and conducting and facilitating 3rd Party Audits. Managed multiple Federal Data Center Operations, O/S and Application Hardening, Linux System Administration. Solution Deployment and Integration for Federal and various State Governments. Contributor to Apache Knox, Apache Zeppelin and Apache Spark.
I also have years of experience leading and managing a team for monitoring, securing and ensuring "Availability Round-the-Clock" for National Critical Infrastructure. Solving brain-teasing needle-in-a-haystack production issues (Architecture, Application, System & Network) and incorporating new requirements. Conducting Vulnerability Analysis and analyzing VA reports to suggest corrective and preventive actions (Hotfixes/CVEs/Design Change/Hardening/Patching/Upgrades) to Engineering and Operations teams. Panelist for Big Data Security Work Group.
Designing Solution Architecture and Capacity Planning for highly-available applications on Cloud/Data Centre environment.
As a Premier Support Engineer at Hortonworks I balance my time between resolving issues and proactively identifying ways to deflect issues. I have strong troubleshooting experience based on my foundational Linux background and over 2.5 years of critical enterprise Hadoop support. In addition to troubleshooting, I specialize and have an interest in core security issues revolving around Kerberos, AD/LDAP, SSL/TLS and Identity Management Solutions.
Chris Wojdak is a Sr. Program/Managing Architect leading innovation, digital transformation strategies and next generation analytics at Symcor. He has more than 20 years of experience in designing and implementing modern, secure and advanced solutions for the financial services industry. He is currently focused on helping customers leverage Machine Learning, IoT and Next Generation Analytics to help detect and prevent fraud as customers move to digital channels. He has also received innovation awards for next generation analytics from Informatica. Chris holds a bachelor's degree in Technology from the Faculty of Engineering at McMaster University.
Michael is a Senior Solutions Engineer at Hortonworks on the Public Sector team. He has worked in the public sector space for 19 years in a broad variety of IT roles, with more than 13 years’ experience as a Solutions Architect. Michael has been focused on the Hadoop space for the last 4 years. His other “big data” passion is information retrieval using Solr and Elasticsearch.
Marcus has been helping Federal, State, and Local governments adopt transformative big data technologies at Hortonworks for the last 4 years. Prior to joining Hortonworks, Marcus was a Computer Scientist at the US Department of Justice where he drove the adoption of Open Source Software and big data technologies such as Apache Hadoop, Apache Accumulo, Apache NiFi and related technology to solve analytic problems for agents and analysts in support of the U.S. National Security mission.
Shivinder Singh is a Senior Big Data Enterprise Architect. His focus has been on building systems infrastructure for mission-critical business applications, and his work concentrates on maximizing the return on assets for the database portfolio. This involves developing a set of best practices in database strategy and infrastructure life cycle management, covering B2B, B2C and B2E arenas. His work on strategy is centered on the major themes of consolidation and integration while focusing on Total Cost of Ownership, the critically important discipline of infrastructure maintenance and the need for strong application design. With a sharp focus on innovation, his work has resulted in four patents which are currently being processed at the USPTO. Singh is a frequent speaker at various technology and executive conferences.
Kamil is a technology leader in the large scale data warehousing and analytics space. He is CTO of Starburst, the enterprise Presto company. Prior to co-founding Starburst, Kamil was the Chief Architect at the Teradata Center for Hadoop in Boston, focusing on the open source SQL engine Presto. Previously, he was the co-founder and chief software architect of Hadapt, the first SQL-on-Hadoop company, acquired by Teradata in 2014.
Kamil began his journey with Hadoop and modern MPP SQL architectures about 10 years ago during a doctoral program at Yale University where he co-invented HadoopDB, the original foundation of Hadapt’s technology.
Kamil holds an M.S. in Computer Science from Wroclaw University of Technology as well as an M.S. and an M.Phil. in Computer Science from Yale University.
Animesh Trivedi is a researcher at IBM Research Lab in Zurich. His interests are in anything and everything related to performance, spanning from multi-core CPUs to distributed environments. Currently, he is investigating how modern high-performance network and storage devices can be leveraged in popular data processing frameworks such as Apache Spark, Hadoop, Hive, etc. He is one of the founding members of the Apache Crail (Incubating) project.
Yanwei (Wayne) Zhang is a senior data scientist at Uber Technologies Inc. He has a Master’s degree in statistics and a PhD degree in quantitative marketing. He has published several research papers in top journals in statistics and actuarial science. His interest is in large-scale machine learning, with a focus on applications in driving safety and insurance.
Neil Parker is a software engineer at Uber Technologies Inc. He has a Bachelor’s degree in information science and a Master’s degree in computer science from Cornell University. He has worked on projects ranging from creating visualizations to building efficient real-time systems. He has interests in distributed computing with a focus on driving safety.
Anu Engineer was part of the original Windows Azure team and principal author of VMware Certificate Authority, and is an Apache Hadoop committer and PMC member. He works on HDFS and is one of the contributors to Ozone.
Xiaoyu Yao is an Apache Hadoop PMC member and committer working mainly on Hadoop HDFS and Ozone.
Dr. Vega has been a renewable energy consultant and researcher for the last 10 years. He currently leads a staff of engineers, analysts and data scientists to perform analysis and provide business insights and reporting on the energy market operations of CPS Energy in ERCOT. He is responsible for the analytical skills of the team, as well as the data and associated systems required to effectively provide solid analytics.
Before his current position he led the R&D and technical performance in the subjects of renewable energy forecasting, GIS LiDAR analytics, building load forecasting and grid integration at The University of Texas at San Antonio (UTSA). Dr. Vega has led the development of 3 pending patents in the area of distributed energy forecasting and led the development of technologies for distributed IoT traffic monitoring, cyber abnormality detection and prediction for the electric utility industry.
He started and was responsible for the Renewable Energy consulting business in the US, Mexico, Brazil and China for a global 1700+ employee consulting company. Dr. Vega helped grow the company’s renewable energy consulting annual revenues to about $6M in 3 years. He drives teamwork, effectively draws from the strengths of his team and focuses on innovative ideas and great communication to provide solutions. Dr. Vega’s former clients include top tier global owners, utilities and manufacturers of renewable energy assets and operations. Dr. Vega is a registered Professional Engineer and holds an active NCEES record for licensure in any U.S. state.
Jesús Camacho Rodríguez is a Member of Technical Staff at Hortonworks, and a PMC member of Apache Hive and Apache Calcite. His current work focuses on extending and improving query processing and optimization, ensuring that the increasingly complex workloads supported by Hive are executed quickly and efficiently. Prior to that, Jesús obtained his PhD in Computer Science from Paris-Sud University and Inria, working on large-scale Web data management. Jesús received his Computer Science and Engineering degree from University of Almería, Spain.
Yanbo is a staff software engineer at Hortonworks. His main interests center around implementing effective machine learning and deep learning algorithms or models in the areas of recommendation systems, natural language processing and others. He is an Apache Spark PMC member and contributes to many other open source projects such as TensorFlow and Apache MXNet. He delivered the implementation of some core Spark MLlib algorithms. Prior to Hortonworks, he was a software engineer at Yahoo! and France Telecom working on machine learning and distributed systems.
Mingjie Tang is an engineer at Hortonworks. He is working on SparkSQL, Spark MLlib and Spark Streaming. He has broad research interests in database management systems, similarity query processing, data indexing, big data computation, data mining and machine learning. Mingjie received his PhD in Computer Science from Purdue University.
John Mertic is director of program management for ODPi, R Consortium, and Open Mainframe Project at the Linux Foundation. John comes from a PHP and open source background. Previously, he was director of business development software alliances at Bitnami, a developer, evangelist, and partnership leader at SugarCRM, board member at OW2, president of OpenSocial, and a frequent conference speaker around the world. As an avid writer, John has published articles on IBM Developerworks, Apple Developer Connection, and PHP Architect and authored The Definitive Guide to SugarCRM: Better Business Applications and Building on SugarCRM.
Pavan Surapaneni has over 18 years of experience in the communications industry, specializing in implementations of business support and business intelligence applications. He has led several complex transformation initiatives, from building large billing systems to transforming legacy BI applications to use the right technologies to help the business meet its goals.
Lead Solutions Engineer for several implementations at Cox, including:
- Support Dynamic Ad Insertion - build data analysis pipeline using Hive, Nifi
- Insights in Network Performance - Using Spark, Hbase
Ashutosh has been working on Hive for the last 7 years. He works at Hortonworks, where his focus is on the compiler and optimizer.
Shashank Gugnani is a Ph.D. Student in the Department of Computer Science and Engineering at The Ohio State University. His research is focussed on designing high-performance storage systems for cloud middleware.
Adam Hudson is a software engineer with research and development experience in many diverse industries, including social media, video gaming, finance and online health. He was awarded a PhD from the University of Sydney in 2008 for his research into mobile networking applications. Originally from Sydney, Australia, he recently moved to the San Francisco Bay Area to join Uber on their exciting journey to change the world.
Atul Gupte is a Product Manager on the Product Platform team at Uber. He holds a BS in Computer Science from the University of Illinois at Urbana-Champaign. At Uber, he helps drive product decisions to ensure Uber’s data science teams are able to achieve their full potential, by providing access to foundational infrastructure, stable compute resources & advanced tooling to power Uber’s global ambitions. Previously, at Zynga, he spent time building some of the world’s leading social games and also helped build out the company’s mobile advertising platform.
Konstantinos Karanasos is a Senior Scientist at the Cloud and Information Services Lab (CISL) at Microsoft (based at the Silicon Valley office) and a PMC member of Apache Hadoop. His work at Microsoft has focused on resource management for the company's production analytics clusters and on query optimization for large-scale analytics. Within Apache Hadoop, Konstantinos has worked on adding support to YARN for opportunistic containers and for rich placement constraints. Prior to joining Microsoft, he was a postdoctoral researcher at IBM Almaden Research Center, where he was member of the Big Data analytics group, working on problems related to query optimization. Konstantinos obtained his PhD from Inria and the University Paris-Sud, France. In the context of his PhD, he worked in the areas of view-based query processing and semi-structured data management. He also holds a Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece.
Wangda Tan is a Project Management Committee (PMC) member of Apache Hadoop and a Staff Software Engineer at Hortonworks. His main areas of work are Hadoop YARN GPU isolation and the resource scheduler, and he has participated in features such as node labeling, resource preemption, and container resizing. Before joining Hortonworks, he worked at Pivotal on integrating OpenMPI/GraphLab with Hadoop YARN. Before that, he worked at Alibaba cloud computing, where he participated in creating a large-scale machine learning, matrix, and statistics computation platform using Map-Reduce and MPI.
David is an Architect in the Data Science and Engineering team at GoPro and the creator of their Spark-Kafka streaming data ingestion pipeline. He has been developing scalable data processing pipelines and eCommerce systems for over 20 years in Silicon Valley. David's current big data interests include streaming data as fast as possible from devices to near real-time dashboards and switching his primary programming language to Scala from Java after nearly 20 years. He holds a B.Sc. in Computer Science from The Ohio State University.
Hao joined the Data Science and Engineering team at GoPro in 2016 and immediately started cranking out Java and Scala code for use in both the Spark Streaming and batch data pipelines. Hao continuously supports the data publishing needs of the device and software application development teams at GoPro and assists them in utilizing the most appropriate and efficient ways to stream, store, and access their data. He has an M.Sc. in Computer Science from Northeastern University.
I am an IT professional with varied experience, passionate about faster retrieval of data. After a decade of optimizing queries, building and implementing statistics gathering algorithms and the optimizer at IBM-Informix, I'm now an Apache contributor focused on running benchmarks on Hive and related SQL-on-Hadoop technologies such as Impala and Presto.
Qiyuan Gong, Software Engineer, Intel OTC (Open Source Technology Center). Working on: Deep Learning on Hadoop and SSM (Smart Storage Management for Big Data). Ph.D. on data anonymization (related to GDPR). https://www.linkedin.com/in/qiyuangong/
Joy is a Distributed System Architect with 18+ years of software design and development experience, 10+ years of Java/Scala development experience, 7+ years of work experience in Big Data and Hadoop technologies, and 4+ years of Apache Spark experience, with a special interest in distributed/parallel computing. He is currently working on Kubernetes, Cloud and Big Data technologies. Joy is an open-source contributor for Hadoop and Jupyter Notebook ecosystem products and technologies. He is also an active member of various software architecture organizations. Joy is a frequent speaker at various conferences, user groups and code camps.
Gregory is an expert in programming language runtimes, distributed systems, and big data processing. During his time on the ETA team at Lyft, Gregory transformed data processing from a manual process that took weeks to run to a fully automated process that runs every 10 minutes. As a formative member of the Data Science Platform team, Gregory helped define and deliver the vision for expanded use of machine learning techniques across Lyft. Now as a member of the Streaming Platform team, Gregory is focused on the delivery of high quality data to analytics and machine learning applications at Lyft. Before Lyft, Gregory was the lead architect of Salesforce’s Apex ecosystem — including the definition of the Apex language, compiler, runtime, debugger and other tooling, governance, batch processing, and caching — that services billions of requests a month.
Principal Engineer with 12+ years of IT experience including experience in the areas of Cloud Computing, advance system automation & tools design and System administration. Skilled in management of infrastructure and implementing technology to support large user groups, supporting users at corporate headquarters as well as multiple remote locations, and effectively managing high end Hadoop Clusters. Build, Manage and Support hadoop clusters with Thousands of nodes and petabytes of data, running Hadoop distributed file system (HDFS) and map-reduce framework. Setup Automation framework for Management of user access and data on clusters. Automated Management of Yahoo clusters and Break-fixing of Bad nodes. Completely automated solution for 40+ clusters/45k nodes spanned across 3 colos.
Sailaja Polavarapu is an Apache committer and currently works at Hortonworks in Enterprise Security team. Sailaja is mainly responsible for user management module for Apache Ranger product. Prior to this, Sailaja was at Citrix responsible for development of device management service for XenMobile product and LDAP Authentication for Access Gateway product. Sailaja holds an M.S in Computer Engineering from San Jose State University, CA and B.S in Computer Engineering from Anna University, India.
Velmurugan Periasamy (Vel) is part of Enterprise Security Engineering Team at Hortonworks, contributing to Apache Ranger. He is Apache Ranger Committer and PMC member. He has many years of software industry experience in developing and managing large-scale enterprise systems. He has delivered many technical talks at HadoopSummit, JavaOne, OSCON, Jazoon, etc.
With over twenty five years of Information Technology (IT) industry experience, Avinash specializes in providing information management and business analytics solutions. His well-rounded experience spans across areas of advanced analytics, data strategy, information architecture, data modeling, data integration design, and development. Avinash is business focused, creative in problem solving, and committed to the project quality and success. As a trusted advisor to his clients, Avinash has successfully delivered business intelligence solutions in a variety of industries that include financial services, banking, insurance, retail, and telecommunications. Avinash holds Chartered Property Casualty Underwriter (CPCU) and Associate in Insurance Data Analytics (AIDA) professional designations in property-casualty insurance and risk management, administered by The Institutes and MBA in Finance and Information Systems from Stern School of Business, NYU.
Uli has 18 years’ hands on experience as a consultant, architect, and manager in the data industry. He frequently speaks at conferences. Uli has architected and delivered data warehouses in Europe, North America, and South East Asia. He is a traveler between the worlds of traditional data warehousing and big data technologies.
Uli is a regular contributor to blogs and books, holds an Oracle ACE award, and chairs the Hadoop User Group Ireland. He is also a co-founder and VP of the Irish chapter of DAMA, a not-for-profit global data management organization. He has co-founded the Irish Oracle Big Data User Group.
Last but not least, Uli is the CEO of Sonra, the data liberation company. Sonra develops Flexter, a tool to automate the conversion of complex XML to a database, text, Spark, or Hadoop.
Jordan Martz is the Director of Technology Solutions at Attunity, a leading provider of data integration and data management software. In this role, Jordan works closely with both the alliances team and the product management team at Attunity with Big Data/Hadoop, IoT and cloud solutions. Beyond his work with Attunity, Jordan has taught Data Science, Deep Learning, Spark, Hadoop courses for 10 years.
David Freriks is a Technology Evangelist in the Office of Strategy Management team at Qlik. Dave’s mission is to help spread the word on the power of the Qlik Platform with Big Data systems and integration with partner technologies related to that ecosystem. He has spent 20+ years in the business intelligence space working at Qlik, SAP, IBM, and Cognos helping launch new products to market. Dave has a background in data warehousing and a Mechanical Engineering degree from Texas Tech University. He is married with two kids and two Australian Shepherds.
Enterprise software professional with more than 20 years of experience working with some of the best commercial and open source software companies in the industry.
Attila Kanto is a Principal Engineer at Hortonworks and a committer on the Cloudbreak project. He has over 15 years of experience in designing and developing reliable, secure, scalable distributed computer systems. Currently he is leading the cloud engineering team of Hortonworks in Budapest.
Eric Krenz is a data engineer on the Big Data Platform team at Target. As a student, he presented his research on the integration between OpenStack Swift and Hadoop at the Midwest Instructional Computing Symposium in 2015.
Asher is currently working at Target as Director Data Engineering, where he leads high performing Big Data Platform and Product Engineering teams.
Prior to Target, Asher held leadership roles at UnitedHealth Group and Blue Cross Blue Shield where he directed teams in both Data and Software Engineering.
Asher enjoys learning about new big data technologies and using them to solve complex business problems. Asher has a MS in Software Engineering from the University of St. Thomas and a BS in Computer Science from Minnesota State University, Mankato.
Constantin is a top Hortonworks Community Connection contributor publishing multiple articles around stream analytics and geospatial. He is currently an active member of two SME groups at Hortonworks: Stream Analytics, Geospatial Analytics.
Through his work at Hortonworks, leveraging his vast experience over the last two decades building large-scale data processing systems using a variety of database technologies, he works to build end-to-end solutions involving big data technologies. He holds a Ph.D. in numerical modeling and computer simulation in the field of petroleum engineering. He is also a certified data scientist, PMP and ScrumMaster.
Dr. James Hughes is a mathematician at Commonwealth Computer Research, Inc. in Charlottesville, Virginia. He is a core committer for GeoMesa which leverages Accumulo, HBase and other distributed database systems to provide distributed computation and query engines. He is a LocationTech committer for GeoMesa, SFCurve, and JTS. He serves on the LocationTech Project Management Committee and Steering Committee. Through work with LocationTech and OSGeo projects like GeoTools and GeoServer, he works to build end-to-end solutions for big spatio-temporal problems. He holds a PhD in algebraic topology from the University of Virginia.
I began programming in high school after begging my folks to buy me an Atari 400. I wrote a stock market simulation program using a joystick for input and presented it at the Atari users group in Peoria, Illinois. I went to a junior college to learn programming and got stuck in a Cobol punch card class that I hated and jumped into Bible School. I was OK at being a youth pastor and loved speaking in public, but got a bit bored so I applied and got a job at Caterpillar. Three years in I solved the Y2K problems and finally landed a Java job.
From there I wrote many corporate systems and was known for writing systems that never went down. I personally wrote the third-generation telematics system at Caterpillar on corporate hardware while begging to go to the cloud. Three years ago my requests became a reality when we started using the Azure cloud. We immediately started writing the fourth-generation telematics system at Caterpillar capable of processing messages from 2 million assets. Currently we are up to 500 messages per second and we are preparing for 10x that using technology that includes Storm, HBase, Phoenix, SqlServer, EventHubs and Spark.
Stuart loves storage (208 PB at Criteo) and is part of Criteo's Lake team that runs some small and two rather large Hadoop clusters. He also loves automation with Chef because configuring more than 3000 Hadoop nodes by hand is just too slow. Before discovering Hadoop he developed user interfaces and databases for biotech companies. Stuart has presented at ACM CHI 2000, Devoxx 2016, NABD 2016, Hadoop Summit Tokyo 2016, Apache Big Data Europe 2016, Big Data Tech Warsaw 2017, and Apache Big Data North America 2017.
Andy LoPresto is a Sr. Member of Technical Staff at Hortonworks working on the Hortonworks DataFlow team. In this role he serves as both a Committer and Project Management Committee Member for Apache NiFi, an open source, robust, secure data routing and delivery system. Andy focuses on security concerns within NiFi including identity management, TLS negotiation, data protection, access control, encryption and hashing. Andy is also involved with the sub-project, Apache MiNiFi, which drives edge data collection, including secure command and control and immediate data provenance and governance. He has presented about NiFi at DataWorks Summit Berlin 2018, DataWorks Summit Sydney 2017, Hadoop Summit San Jose 2016, FOSDEM '17 in Brussels, and the OpenIoT Summit 2017.
I am a software engineer at Dremio and a committer to Apache Arrow project. Previously, I was part of database kernel team at Oracle, where I worked on storage, indexing, and the in-memory columnar query processing layers of Oracle RDBMS. I hold an MS in software engineering from CMU and a BS in information systems from BITS Pilani, India. During studies, I focused on distributed systems, databases, and software architecture.
My technical blog: https://loonytek.com/
My technical contributions on Quora: https://www.quora.com/profile/Siddharth-Teotia
Apart from my job as a Software Engineer, I love writing technical content and doing technical presentations about the work I do.
I'm a technical lead at Hortonworks where I concentrate on adding support for operations with transactional semantics to Apache Hive. Prior to that I was a lead engineer on a federated SQL engine at Composite Software. Before that I held various engineering roles at BEA, Oracle and others.
Mukul is currently working with Hortonworks and is an active contributor to the Apache Hadoop and Apache Ratis projects. He received his master's degree from Carnegie Mellon University and his bachelor's degree from Visveswaraya Technological University. He has been working actively on filesystems for the last 8 years and has worked extensively on the Ozone object store, Flash-based filesystems, Shingled Magnetic Recording drives, data replication and disaster recovery solutions.
Lokesh Jain is a software engineer at Hortonworks. He completed a B.E.(Hons.) in Computer Science and an M.Sc.(Hons.) in Mathematics at BITS Pilani. He is one of the early developers of the Apache Ratis project and also contributes to Apache Hadoop. He also worked on a GSOC project for the SageMath organisation in 2017.
Cassandra has 20 years of experience in search and knowledge management. She has been a Lucene/Solr committer since 2013 and a member of the PMC since 2016. As Director of Engineering at Lucidworks, she manages Solr and open source development.
Marcelline Saunders is Director, Global Partner Enablement at Lucidworks and has over 25 years of experience in information management including enterprise search, knowledge management, and e-discovery technologies for both on-premise and cloud solutions. Marcelline recently held senior product management roles at Hitachi Data Systems where she was responsible for global search solutions and strategy. Marcelline currently manages technical partner relationships at Lucidworks, the leader in open source and commercial search solutions.
Chris Douglas has worked in Apache Hadoop since 2007, starting as a frequent contributor to the MapReduce data path. He is one of the original designers of YARN. As a member of the Cloud and Information Services Lab (CISL) at Microsoft, his research focuses on systems for large-scale analytics. His current work builds storage abstractions for big data workloads in cloud settings.
Thomas architects the S3 layer and the Hadoop integration of Western Digital's object storage system 'ActiveScale'. Together with the team, he has contributed multiple improvements to the Apache Hadoop s3a connector and has co-architected the HDFS Provided Storage feature.
He joined WD through the Amplidata acquisition. Previously, he obtained a Computer Science PhD in Queueing Theory at Ghent University, Belgium.
Yahoo Inc/ Oath Inc (Feb 2017 to present)
Intern, Yahoo Inc (June 2016 to Sep 2016)
Cognizant Technology Solutions (July 2014 to May 2015)
University of California, San Diego (Sep 2015 to Dec 2016)
Master in Computer Science
Institute of Technology, Nirma University (Aug 2010 to June 2014)
Bachelor in Computer Science & Engineering
I graduated with a Bachelor's Degree in Computer Engineering from the University of Illinois in Champaign Urbana, during which I did 2 internships at Yahoo. After graduating, I joined Yahoo full-time and have been working on large scale batch processing systems, mostly Hadoop, and data analytics platforms.
Data analyst -> DBA -> Big Data engineer -> Big Data architect
Kyle Cooper has over 20 years of experience at Cox Communications, ranging from Technical Installation, Systems Operations Center/Regional Operations Center Management, and Alarm Management/Problem Management to Network Data Analytics and Network Planning.
I am a Principal Software Engineer focusing on Data Science at Hortonworks. I am also the Vice President and an active committer for the Apache Metron project. In the past, I've worked as an architect and senior engineer at a healthcare informatics startup spun out of the Cleveland Clinic, as a developer at Oracle and as a Research Geophysicist in the Oil & Gas industry. Before that, I was a poor graduate student in Math at Texas A&M.
I primarily work with the Apache Hadoop software stack. I specialize in writing software and solving problems where there are either scalability concerns due to large amounts of traffic or large amounts of data. I have a particular passion for data science problems or anything vaguely mathematical. As a Principal Architect focused on data science, I spend time with a variety of clients, large and small, mentoring and helping them use Hadoop to solve their problems.
Michael Miklavcic is a committer and PMC member for Apache Metron and has been involved with the project for the past two years. He is a software engineer and architect with over ten years of industry experience and worked as a Systems Architect with Hortonworks for three years prior to transitioning to the engineering team for Metron. He has given numerous talks both on the domestic and international stage, including Hadoop Summit San Jose, Apache Con Big Data Europe, and multiple local Hadoop user groups. He is a code contributor to multiple Apache open source projects and has worked directly with clients to implement solutions using Hadoop. Michael has degrees in computer science and computer information systems from Baldwin Wallace in Cleveland, OH.
A core member of R&D Engineering Group in Hortonworks primarily working on HDP (Hortonworks Data Platform) and DPS (Data Plane Service).
An active contributor and committer of the Apache Hive project with major contributions to Hive replication and ACID features. Also has rich experience in distributed systems and in-memory database technologies.
Working on Hive Replication for the past 1.5 years, building an effective, easy-to-use replication/DR solution. In a previous life I built and managed the data platform pipelines for an AdTech company.
Barbara Eckman is a Principal Data Architect at Comcast. She leads data governance for an innovative, division-wide initiative comprising near-real-time ingesting, streaming, transforming, storing, and analyzing Big Data. Barbara is a recognized technical innovator in Big Data architecture and governance, as well as scientific data and model integration. Her experience includes technical leadership positions at a Human Genome Project Center, Merck, GlaxoSmithKline, and IBM. She served on the IBM Academy of Technology, an internal peer-elected organization akin to the National Academy of Sciences.
George Vetticaden is Vice President of Product Management within Emerging Products at Hortonworks. In this role, he is responsible for the strategic vision and concerted delivery across all the products within Emerging Products including Hortonworks DataFlow (HDF) that includes Nifi, Storm, Kafka, Streaming Analytics Manager, Schema Registry as well as solutions built on top of the platform including CyberSecurity/Metron.
Over the last 5 years at Hortonworks, George has spent a lot of time in the field with enterprise customers helping them build big data solutions on top of Hadoop. In his previous role at Hortonworks, George was the Director of Solutions Engineering, where he led a team of 15 Big Data Senior Solution Architects helping large enterprise customers with use case inception, design, architecture, and implementation of use cases monetizing data with Hadoop. In addition, he is a committer on the Apache Metron project. George graduated from Trinity University with a BA in Computer Science.
(LinkedIn Profile: https://www.linkedin.com/in/georgevetticaden)
Product Designer behind Metron Investigator, Streaming Analytics Manager, and the new Kafka product.
Roy is a field data scientist with Interset, working with customers to identify new data sources that can be incorporated into Interset's advanced security analytics software. He holds a PhD in Applied Mathematics from McGill University, and has been applying advanced statistical methods to analyze data for over 10 years: early on as a data miner, and more recently in the capacity of a data scientist.
Have been involved with the design and development of Streaming Analytics platforms at Hortonworks for the last two and a half years. Have also been contributing to Apache Storm (http://storm.apache.org/) and am currently a committer and a PMC member of the project. Prior to Hortonworks, was involved in the development of various streaming and Big Data products at Informatica and multi-tenant distributed systems at Yahoo.
Have 15+ years of Java experience and during these years worked with almost all forms of Java solutions, from low-latency multithreaded applications to highly distributed enterprise applications, as developer, architect and trainer. Currently working with the Apache big data projects and have created various types of containerized solutions for the components of the Hadoop ecosystem.
Founder of the first Hungarian Java User group and regular speaker at meetup events and conferences.
Committer on the Apache Ratis project and working on the dockerization of the Apache Hadoop project.
Kevin Brown is the Big Data Service Platform Engineer for Data & Analytics at ExxonMobil with a team of data architects and engineers focused on a mission to embed world class analytics to solve big data problems across the corporation. Kevin’s previous experience in software development and Linux administration played a critical role in helping pioneer a Big Data platform at ExxonMobil. Kevin holds an Information Technology degree from Brigham Young University.
Magnus Hyttsten is a Developer Advocate for TensorFlow @ Google. He works on developing the TensorFlow product, is a developer fanatic, and an appreciated speaker on machine learning and mobile development at major industry events such as Google I/O, The AI Summit, AI Conference, ODSC, and MWC. Right now, he is focusing on Reinforcement Learning models, as well as making model inference effective on Mobile.
I work as a Mobile Software Engineer at Uber on iOS and fraud related problems. I'm focused on the intersection of mobile and machine learning; I combine both to build capabilities into Uber that create a great, safe, and fraud free user experience. I hold a Bachelor of Science in Physics and a Bachelor of Arts in Economics from UC Davis.
Lenny Evans is a data scientist at Uber focused on the applications of unsupervised methods and deep learning to fraud prevention, specifically developing anomaly detection models to prevent account takeovers and computer vision models for verifying possession of credit cards.
Alan is a founder of Hortonworks and an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan is PMC member on Apache Hive, Pig, and many other Apache projects. As part of the Apache Incubator PMC he has mentored many new Apache communities. Alan has a BS in Mathematics from Oregon State University and a MA in Theology from Fuller Theological Seminary. He is also the author of Programming Pig, a book from O’Reilly Press. Follow Alan on Twitter: @alanfgates.
Arun is a senior member of technical staff at Verizon and is responsible for Video Analytics and BI applications on their Big Data Platform. He works on designing the platform that collects user interactions of millions of STBs and mobile devices in real time and builds business insights. He makes it easier for business units to create metrics and analytics of user's video usage, STB diagnostics and network performance.
Ajay Anand is the Vice President of Products & Marketing at Kyvos Insights. His association with Hadoop goes back to 2007, when he was Director of Product Management at Yahoo and led their initial Hadoop projects and releases, after which he founded Datameer. Ajay also held product management and market development roles at SGI and Sun. Ajay earned an M.B.A. and an M.S. in computer engineering from the University of Texas at Austin, and a BSEE from the Indian Institute of Technology.
I have worked as a software developer at Oath for over 4 years, and I'm currently a member of the audience data team. Our team builds data pipelines which process all user activity data across Oath. I have attended DataWorks Summit for the last several years and have presented talks at other conferences such as XLDB and Tech Pulse (Yahoo's internal conference).
I started working with Druid and its supporting tools in 2016. In 2017, I got the chance to build a SQL interface for Druid in order to allow more consumers to access Druid. Right now, I am still working closely with Druid and developing streaming data systems with Druid. Looking forward to making more progress!
I am a Ranger project committer and PMC member.
Apache Ranger PMC and Committer
Trevor Grant is a PMC Member of the Apache Mahout and Apache Streams projects. In his day job he is an Open Source Evangelist / AI Engineer at IBM. He strongly believes / is interested in: (1) Math at scale (2) machine learning at speed (in the stream) (3) AI on the edge (4) anyone can read code from home- conference talks should be fun and informative. He has various videos, blogs, rants, advanced degrees, and code which can all be easily found online against his common handle “rawkintrevo”.
Scott Cote is a data science evangelist and open source promoter who organized DFW Data Science - a 2400+ member user group focused on promoting knowledge sharing, opportunity, and growth for the Dallas/Ft. Worth Community. During the day, he works as a Senior Software Engineer for Lucidworks to help create new features for Fusion Server. Scott is interested in Math, Music, Running, Diabetes (T2D) Management, and making the world a better place. Follow him on Twitter @scottccote and watch for live-streamed DFW Data Science presentations by following @dfwdatascience
Xiao Li is a software engineer at Databricks. His main interests are Spark SQL, data replication and data integration. Previously, he was an IBM master inventor and an expert on asynchronous database replication. He received his Ph.D. from the University of Florida in 2011. He is a Spark committer and a Spark PMC member.
Wenchen Fan is a Software Engineer at Databricks, working on Spark Core and Spark SQL. He mainly focuses on the open source community and has helped discuss and review many features and fixes in Spark. He is a Spark committer and a Spark PMC member.
Partha Seetala joined Robin Systems in 2015 as Chief Technology Officer, with more than 16 years of technology and product expertise. His previous position was with Symantec’s information management business, known as Veritas, where he was Distinguished Engineer and Senior Director of Engineering. In that capacity he conceived, architected and led engineering teams to take multiple products from concept to market in the scale-out storage, distributed systems, content-aware file systems and information analytics space. He was also an adviser on multimillion-dollar product lines including NetBackup Appliance, Cluster File System, Veritas Cluster Server and Information Fabric. Earlier positions include serving as architect at a Data/Information Management startup acquired by EMC – and architect at Veritas Technologies. He holds a master’s degree in Computer Science and Engineering from the University of Minnesota.
-Sr. Director, Product Management, Hadoop Core, Data Science and Data Management, Hortonworks, 2016-Present
-CEO/Co-Founder, Stealth Mode Startup, 2015-2016
-Head of Product Management & Technical Marketing, Skeyra (acquired by Western Digital), 2013-2015
-Director, Product Management, EMC, 2011-2013
-Staff Engineer->Sr. Product Manager, Brocade, 2001-2011
Data Management Architect in Micron Technology’s IS Enterprise Architecture group overseeing precision manufacturing analysis and Big Data for Micron's multibillion-dollar memory-fabrication facilities.
Avrilia is a Senior Scientist at Microsoft's Cloud and Information Services Lab,
where her research is focused on scalable real-time stream processing systems.
She is also an active contributor to Heron, collaborating with Twitter.
Prior to her current role, she was a research scientist in IBM Research working
on SQL-on-Hadoop systems. She holds a PhD in data management from University of Wisconsin-Madison.
Ashvin Agrawal is a Senior Research Engineer at Microsoft, where he works on streaming systems and contributes to the Apache Heron and Dhalion projects. Ashvin has more than 15 years of software development experience. He specializes in developing large-scale distributed systems. Previously, he worked at VMware, Yahoo, and Mojo Networks. Ashvin holds an M.Tech. in Computer Science from IIT Kanpur, India.
Sanjeev Mohan is a Research Analyst for the Data Management Strategies group within Gartner. He covers the end-to-end data pipeline, including ingestion, persistence, data transformation and advanced analytics. His research includes machine learning, IoT and data governance across the pipeline. His research areas also include storage infrastructure and architectures.
Srikanth Venkat is currently responsible for Security & Governance portfolio of products at Hortonworks which include Apache Knox, Apache Ranger, Apache Atlas, Platform wide security and Hortonworks DataPlane Service. Prior to Hortonworks, Srikanth has held multiple roles in areas of cloud services, marketplaces, security, and business applications. His experience includes leadership across Product Management, Strategy and Operations, and Technical Architecture with broad experience in startups to global organizations including Telefonica, Salesforce.com, Cisco-Webex, Proofpoint, Dataguise, Trilogy Software, and Hewlett-Packard. Srikanth holds a PhD in Engineering with a focus on Artificial Intelligence from University of Pittsburgh, and an MBA in General Management from Indiana University and a Masters in Global Management from Thunderbird School of Global Management. Srikanth is a Data Sciences & Machine Learning hobbyist and enjoys tinkering with Big Data technologies.
Davor is serving as a chair of the Apache Beam Project Management Committee, and is a CEO of Operiant, a company he founded that helps users get Big Data to production. He was previously a software engineer at Google where he worked on Google Cloud Dataflow, the predecessor to Apache Beam, since its beginnings.
As General Manager for Insurance, Cindy Maike is responsible for global insurance strategy and customer engagement for Hortonworks. She works with customers and partners leveraging analytics for current day business growth and exploring the use of new data sources to drive innovation in the evolving world of insurance. She has over 25 years of finance, consulting and advisory services experience in the insurance industry working with clients globally on their business strategy leveraging analytics and technology to further drive business results.
Cindy has deep industry knowledge in both claims and underwriting with a focus on the use of analytics and data to enhance business outcomes. She has held positions with the IBM Watson Solution Group and Carrier Insurance, served as Director of Strategy at ACORD, and was co-founder of Strategy Meets Action Research and Advisory Services. Cindy is also a CPA.
She is passionate about solving business problems, is a firm believer in process improvement, and strongly believes that the next generation of business intelligence, in the form of advanced analytics, will revolutionize the insurance industry. Cindy frequently speaks at business events on the value of business analytics, cognitive computing and the evolution of insurance in a connected world.
I am a software engineer in Big Data
Anant Chintamaneni brings more than 16 years of experience in SaaS, Analytics and Big Data to his role of vice president of Products at BlueData. Anant is responsible for spearheading product development, as well as driving the go-to-market strategy for the BlueData platform. Prior to BlueData, Anant was head of Product Management and Strategy for Pivotal’s Business Data Lake portfolio, which included Pivotal HD, HAWQ SQL-in-Hadoop, GemfireXD In Memory Processing and Machine Learning Runtimes. While at Pivotal, he established Pivotal HD as the Hadoop platform for data-driven applications, streamlined field enablement and grew the installed user base by over 400 percent in just over a year. Prior to Pivotal, Anant was head of Big Data Analytics Solutions at Merced Systems (acquired by NICE Systems), where he created a new Analytics business division that resulted in multi-million dollar annual subscriptions.
Anant holds a Master of Science in Engineering from Stanford University and a Bachelor’s degree in Engineering from the Indian Institute of Technology, Varanasi.
Nanda Vijaydev is director of solution management at BlueData, where she leverages Hadoop, Spark, and Tachyon to build solutions for enterprise analytics use cases. Nanda has 10 years of experience in data management and data science. Previously, she worked on data science and big data projects in multiple industries, including healthcare and media; was a principal solutions architect at Silicon Valley Data Science; and served as director of solutions engineering at Karmasphere. Nanda has an in-depth understanding of the data analytics and data management space, particularly in the areas of data integration, ETL, warehousing, reporting, and Hadoop.
I have close to 17 years of experience in various Data Warehousing and Business Intelligence technologies. At PayPal, I am part of the Merchant team working on building analytical solutions on user behavioral data, building algorithms for efficient consumer targeting and building a Customer Identification Repository.
I am currently focused on Apache Spark, Scala, Hive, HBase and Machine Learning models running on the Hortonworks Hadoop platform. I am also one of the founding members at PayPal to use Druid and build analytical solutions on top of terabytes of data utilizing the existing Hadoop environment at PayPal.
My areas of interest are Spark, Hadoop, HBase, Druid, DruidSQL and Machine Learning models. I am also interested in bringing my BI expertise to help build analytical and reporting solutions on top of Big Data platforms.
Deepika is a seasoned Big Data technologist with a passion for delivering applications that move the needle. She’s been hooked into the Big Data space from the early days of Hadoop and has over a decade of expertise building scalable applications with Hadoop Map Reduce, Hive, HBase, Spark, ES and Druid. At PayPal she is currently leading the Merchant Analytics team and pioneering scalable platforms enabling Merchant Insights, Personalization and Business Growth analytics.
Sriram is an architect in the IBM Analytics group tasked with delivering modern cloud-native offerings such as Data Science Experience (DSX) in Private Clouds, ensuring reliability and scalability using technologies like Docker and Kubernetes. His current focus is on integrating DSX with Hadoop clusters and enabling Machine/Deep Learning at scale. Prior to this he worked on delivery and operations of data services, such as dashDB on IBM Bluemix. He has years of experience developing enterprise-worthy relational database, warehouse, ETL and tools offerings. Sriram started his career at Informix Software where he worked on application server technology, web content management software and database tooling as well as on the Red Brick Data Warehouse suite.
Issac works at LinkedIn in the data management team which is in charge of ingestion, lifecycle, and compliance of most HDFS data, as well as providing tools for the big data ecosystem in LinkedIn. He is a core developer and committer for Apache Gobblin, a distributed big data integration framework for batch and streaming systems. Previous work focused on analytics for video streaming.
Anthony is a Staff Software Engineer working on the Data Management team at LinkedIn, where he works on LinkedIn’s data access layer, Dali, and has contributed to Apache Hive and Pig. He holds a B.S. in Computer Science from Yale University.
In my current role I work on various tools and technologies related to big data, monitoring, messaging, streaming analytics, machine learning, etc., and develop solutions that can be used by multiple teams inside the company. Throughout my career I have helped development teams create highly available and scalable solutions and solved problems related to payment fraud, healthcare and customer loyalty.
Software Engineer at Hortonworks and PMC for Apache Atlas and Apache Lens. Currently working on Apache YARN, a well-known resource management framework for workloads on Hadoop.
Rohith Sharma K S is a member of the Project Management Committee for Apache Hadoop and a Senior Software Engineer at Hortonworks. He is a full-time contributor to YARN, where most of his contributions concern RM restart and HA. Since last year he has been working on making ATSv2 production ready and on auto-spawning of services on YARN. Before joining Hortonworks, he worked at Huawei on the Hadoop team. Rohith is passionate about traveling across the globe and discovering people's cultures and habitats.
Julien Le Dem is the coauthor of Apache Parquet and the PMC chair of the project. He is also a committer and PMC Member on Apache Pig, Apache Arrow and a few others. Julien is a Principal Engineer at WeWork and was previously Architect at Dremio and tech lead for Twitter’s data processing tools, where he also obtained a two-character Twitter handle (@J_). Prior to Twitter, Julien was a principal engineer and tech lead working on content platforms at Yahoo, where he received his Hadoop initiation. His French accent makes his talks particularly attractive.
Aniket Mokashi is a tech lead on the engineering team that prototyped and built the Procella project at YouTube. Throughout his career, he has contributed to the development of large-scale data processing frameworks and platforms. Prior to Google, he worked on data platform teams at Twitter and Netflix. He is also a committer and PMC member on the Apache Parquet and Apache Pig projects. Aniket holds a Master's degree in Information Networking from Carnegie Mellon University.
Vinod Kumar Vavilapalli has been contributing to the Apache Hadoop project full-time since mid-2007. At the Apache Software Foundation, he is V.P. Apache Hadoop, a long-term Hadoop contributor, committer, member of the Project Management Committee, and an ASF member. He is Director of Engineering at Hortonworks Inc and runs the Hadoop compute platform teams there. Before Hortonworks, he was at Yahoo!, working in the Grid team that made Hadoop what it is today, running at large scale, up to tens of thousands of nodes.
Vinod loves reading books of all kinds and is passionate about using computers to change the world for better, bit by bit. He has a bachelor’s degree in computer science and engineering from the Indian Institute of Technology Roorkee. He can be reached at twitter handle @tshooter.
Sunil Govindan has been contributing to the Apache Hadoop project since 2013 in various roles: Hadoop contributor, Hadoop committer and member of the Project Management Committee (PMC). He works as a Staff Software Engineer at Hortonworks on the YARN team. His main contributions are YARN scheduling improvements such as intra-queue resource preemption, support for multiple resource types in YARN with resource profiles, and absolute resource configuration support in queues. He also drove efforts with the community to improve the YARN UI for a better user experience. Before Hortonworks, he worked at Juniper on a custom resource scheduler. Prior to that, he was associated with Huawei and worked on platform and middleware distributed systems, including the Hadoop platform. He loves reading books, is an ardent music lover and is passionate about go-green efforts.
He is a PMC member of Apache Zeppelin and works at LINE. In Apache Zeppelin, he focuses on stabilizing the project so that enterprises can use it in production. At LINE, he is developing a new data pipeline for a new data platform that includes all kinds of data produced by LINE and its family apps. He is really interested in developing and operating scalable and fault-tolerant applications that solve mission-critical problems.
Kat Petre is a technology rebel. Strong open source supporter, early adopter and fully dedicated to being in a state of constant learning, she recently left her Product Specialist role in MapR field organization to join the open source efforts of making Ambari great again. Previously, she served as Solution Architect in Cloudera's professional services team, specializing in building secure yet usable, highly available distributed big data solutions, to detect the scarce meaningful signals from the datalakes of noise.
Always looking for company in tackling the interesting problems with big data technologies, her ultimate goal is to create a simulation of the real world, fed by real time data, and overclock it. Because the people that are crazy enough to think they can change the world, are the ones that do.
Jesus Alvarez is a Technical Evangelist with a passion for Data Science, Security, and Crypto-Currency. Currently an advisory engineer, building integration tools to allow IBM DSX to integrate with open and closed source components.
Pioneer to the Hadoop ecosystem, building installers for "big data accelerators" in 2012 at IBM. Deep understanding for the importance of security, having spent 4 years in Healthcare IT/Software Design. Contributor to Apache Knox, Ambari, and an array of data science notebooks and tools.
Veteran in the realm of Crypto-Currency, Jesus was pushing the limits of home electric breakers in 2012-2014 running GPU and ASIC miners and architecting high frequency cryptocurrency arbitrage bots since the era before Mt-Gox and Cryptsy went dark.
Sumit is a software engineer in the IBM Watson Studio development team. He works on the integration of tools with compute engines for large-scale analytics. His work focuses on machine learning, simplifying AI, and making data science workflows more efficient. As an engineer with data science skills, he helps clients solve their technical and business challenges and realize their data analytics goals. He holds a degree in Automation and Industrial IT. Sumit shares his knowledge through talks at various meetups.
Michael Ger has over 25 years of experience working in industry and Information Technology strategy roles. He has deep cross-industry knowledge in product development, manufacturing, supply chain and customer experience related business processes. As General Manager of Manufacturing and Automotive Industries at Hortonworks, Mike is responsible for driving the solution vision and go-to-market strategies within each industry segment and works with industry leaders to drive next-generation business insights through Big Data Analytics. Prior to joining Hortonworks, Mike worked at Oracle for over 20 years as their Automotive Industry lead, at A.T. Kearney as an Automotive Management Consultant and at General Motors (Saturn Division) as a Product Engineer.
A solutions consultant for over 20 years, Ryan has a proven track record of successfully designing and delivering data processing and management solutions to customers across several vertical industries. In his current role, he is responsible for supporting sales objectives and working with customers to ensure success in their Hadoop adoption and development efforts. His current interests include developing streaming solutions leveraging Apache Storm and Apache Phoenix and applying predictive models to real-time data.
Prior to joining Hortonworks, Ryan was a Lead Architect in the big data division at Actian and Integration Solutions Architect at Pervasive Software.
Ryan has an undergraduate degree in Architecture from Texas A&M University and lives in Pflugerville, Texas with his wife, son and daughter.
Sandeep Chandra is the Director for the SDSC Health CI Division and the Executive Director for Sherlock Cloud. Since joining SDSC in 2003, he has worked on different aspects of infrastructure deployment for scientific data management, as a principal investigator and in other leadership roles across a wide range of cross-disciplinary NSF, DOE, NIH and Foundation initiatives. As the Director of the Health Cyberinfrastructure division, Sandeep Chandra provides strategic vision, direction, management and implementation of concepts and methodologies for building Sherlock’s technology platforms including Cloud computing and Big Data solutions. Sandeep brings strong knowledge of the healthcare ecosystem with deep focus in compliance including NIST, FISMA and HIPAA requirements. He led the deployment of Sherlock’s compliant services in AWS making it the first compliant, hybrid Cloud platform in academia. Sandeep holds an MS in Computer Science from North Carolina State University and has over 15 years of experience providing direct policy, business, operations, and technology advice to the leadership at federal, state and academic institutions.
Dr. Yu is the CTO of AsiaInfo Data, which provides big data and AI solutions to all three telecom carriers in China. Its distributed big data platform processes over 7PB of data daily. Before joining AsiaInfo, Dr. Yu served as VP Engineering and Chief Architect for Mafengwo, the largest online travel community in China, with over 100 mobile and online users. Previously, he was VP Engineering and Chief Architect at OpenX, responsible for the company's data strategy, mobile product line, and overall architecture, consisting of more than 6000 servers and 15PB of data distributed in 5 global data centers. Dr. Yu is also a serial entrepreneur, co-founded two startups, Portaura in social mobile big data and Martsoft in e-commerce search engine.
Early in his career, he spent a number of years in HP (DEC) Systems Research Center, one of the top research labs in the world, where he worked closely with numerous Turing Award recipients in browser technology, search engines, multimedia, and distributed file systems. Dr. Yu holds a PhD in Computer Science and Engineering from UNSW and a BS in Computer Sciences and a BA in Mathematics from UT Austin. He has published papers and given keynote speeches at numerous major international conferences.
Data science expert with expertise in machine learning and big data systems. Leading innovation projects and R&D activities to promote data science best practice in many business verticals (Telco, Finance, Healthcare, etc.). Pushing the cutting-edge application of AI and Data Science. Published and presented research papers and posters at many top-tier conferences and journals, including: ACM Computing Surveys, ACSAC, CEAS, EuroSec, FGCS, HiCoNS, HSCC, IEEE Systems Journal, MASHUPS, PST, SSS, TRUST, and WiVeC. Served as a reviewer for many highly reputable international journals and conferences.
Ray is an architect and technical leader for Comcast's Enterprise Big Data ecosystem and collaborates to define strategic direction for the company's valuable data assets. He brings 25 years of experience in cable and telecommunications in both startups and Fortune 50 companies to deliver next-generation systems that build positive customer experiences and revenue-generating products for the company. He is an avid amateur wildlife and landscape photographer and enjoys traveling and spending time with his family in the beautiful state of Colorado.
Prashant Khanolkar is a Principal Big Data Architect at Comcast with the dX (Data eXperience) team and is focused on key initiatives such as Open Ingest and Open Egress to make Big Data universally available to everyone within the enterprise and make it easy to onboard and consume datasets by building an easy to use self service platform that integrates technologies such as Streamsets, Kafka, Presto and AWS Athena. He has more than 25 years of IT experience around the Cable and Telecommunication industry building mission critical applications using a diverse array of database technologies both relational and NoSQL.
He previously worked with reputed consulting companies such as PriceWaterhouseCoopers (PwC) and Coopers and Lybrand.
Kevin Doran is a Software Developer on the Data Flow Management team at Hortonworks. He is also a committer on the Apache NiFi project, where he contributes to the NiFi Registry flow versioning tools and the security framework for various NiFi projects. At Hortonworks, he is helping design and implement a scalable command and control solution for Apache MiNiFi deployments. Kevin is enthusiastic about open source software and building communities of developers. He enjoys automating workflows and creating simple technology that solves practical problems. Prior to joining Hortonworks, Kevin helped build a global scale optimization platform at Videology and advanced numerous research projects in the areas of cyber security, networking protocols, and secure mobile computing at BBN Technologies. Kevin holds a B.S. in Computer Engineering from the University of Maryland (go Terps!). He enjoys listening to podcasts, reading, and spending time with his two kids.
Masato Asahara (Ph.D.) is currently leading developments of Spark-based machine learning and data analytics systems, which fully automate predictive modeling. Masato received his Ph.D. degree from Keio University, and has worked at NEC for 8 years as a researcher in the field of distributed computing systems and computing resource management technologies.
Yoshiki Takahashi is a student of the master of computer science program at the graduate school of Tokyo Institute of Technology.
His academic research proposal was accepted at SysML 2018, which has attracted attention since its earlier days as a workshop at NIPS.
He worked on development of a Spark-based machine learning platform for automatic predictive modeling in his internship program at NEC Data Science Research Laboratories in 2017.
He received his B.S. degree in 2017 from Tokyo Institute of Technology.
Sr. Product Specialist at Hortonworks Support. Skilled in Kerberos, SSL, PKI, Cryptography, Linux and Hadoop (Contributor in Apache Zeppelin, Core Hadoop projects to name a few).
- Contributing Security related bugs & patches to Apache Zeppelin, Core Hadoop, Apache Atlas
- SME for Kerberos, SSL, Apache Ranger, Apache KMS, Apache Knox, Apache Atlas, Apache Zeppelin
- Evangelist for Kerberos (& Security in general) to Customer, Support and Dev. teams.
Gio is the Vice President of Data Technology at Canadian Tire Corporation. He brings with him 20 years of experience across different industries such as telecommunications, banking, and now retail. Currently, Gio oversees all of the Data Platforms and Technology Roadmaps at Canadian Tire Corporation and its family of companies.
Gio is an innovative leader who is constantly striving to enhance our customers’ end-to-end experience. While he enjoys complex challenges, he also has a strong passion for using a variety of methodologies to transform his deliverables.
Gio graduated from Delta College with a specialization in Network Engineering and holds a certificate of leadership experience from Banff Centre.
Steve Loughran works on Hadoop at Hortonworks, currently on cloud storage integration, including improving integration with Amazon's S3 in Hadoop, Hive and Spark.
He's the author of Ant in Action, a member of the Apache Software Foundation, and a committer on the Hadoop core since 2009. Prior to joining Hortonworks in 2012, he was a Research Scientist at HP Laboratories.
He lives and works in Bristol, England.
Indrajit Poddar is a Senior Technical Staff Member and Master Inventor in IBM Systems. He currently works on cloud enabled and hardware accelerated machine learning and deep learning software. He has 18 years of industry experience in distributed computing and holds a MS in CS from Penn State University and B.Tech in CSE from the Indian Institute of Technology, Kharagpur.
Robert Hryniewicz has over 10 years of experience working on various projects related to Artificial Intelligence, Enterprise Software, IoT, Robotics, Blockchain and more. Currently, he’s a Data Scientist and Evangelist at Hortonworks. Previously, Robert was a CTO at a Singularity Labs startup, Sr. Architect at Cisco, NASA et al. He’s a frequent speaker at DataWorks / Hadoop Summits.
Suneel is a Member of Apache Software Foundation and is a Committer and PMC on Apache Mahout, Apache OpenNLP, Apache Streams. He's presented in the past at Flink Forward, Hadoop Summit, Berlin Buzzwords, Machine Learning Conference, Big Data Tech Warsaw and Apache Big Data.
SDE At Amazon AI
Previously Principal Engineer at GE Healthcare, specializing in Medical Imaging
Dinesh Chandrasekhar is a technology evangelist, a thought leader and a seasoned product marketer with over 23 years of industry experience. He has an impressive track record of taking new integration/mobile/IoT/Big Data products to market with a clear GTM strategy of pre-and-post launch activities. He has extensive experience working on enterprise software as well as SaaS products delivering sophisticated solutions for customers with complex architectures. His areas of expertise include IoT, Application/Data integration, BPM, Analytics, B2B, API management, Microservices and Mobility. He can articulate detailed use cases across multiple industry verticals like retail, manufacturing, utilities and healthcare. He is a prolific speaker, blogger and a weekend coder. He currently works at Hortonworks, managing their HDF product line. He is fascinated by new technology trends including blockchain and deep learning.
Paige Bartley is a Senior Analyst in Ovum's Data and Enterprise Intelligence team specializing in all aspects of the data lifecycle including creation, cleansing, security, privacy, and productivity.
Working across the information management space, Paige researches how data use affects both large organizations and individuals alike. She provides insight and analysis into data ROI and successful organizational strategy.
Paige’s other areas of expertise include regulatory and legal matters, data quality, unstructured data and NLP, master data and records management, and neuroscience and cognitive science.
Prior to joining Ovum in 2016, she worked in research and marketing for ZL Technologies.
I have a great passion for technology. I am always eager to learn new skills and love how fast things change in this industry. I am a technical person and also like to connect with other passionate people in the technology sector. Knowledge sharing is very important to me, and I love the role of mentoring colleagues. My latest challenge is to know everything about Docker.
I am really a hands-on person and love to solve difficult and challenging problems. There is no greater joy than getting things done and having a good working system. I am always there to go the extra mile to deliver a finished and properly implemented project.
Zakeera is a Data Specialist who has worked at many of the leading financial institutions in South Africa. She’s been involved in the full lifecycle around data, from designing and building Enterprise physical and logical data warehouses, to implementing master data management solutions, establishing Enterprise data governance structures, and defining technical roadmaps to meet business strategies. Zakeera played a crucial role in the simplification of the data architecture at the Johannesburg Stock Exchange by combining the implementation of Master Data Management with Data Virtualisation. She’s been invited to speak at the Open Group forum SA and also participated in a webinar on Data Virtualisation. She now leads the Big Data practice within Standard Bank.
Kristel is responsible for managing the delivery of business value from the Big Data stack. Kristel facilitates the prioritisation of business use cases for the Data Lake across all divisions in the Group.
Kristel is an agent for change in an ever-evolving technology landscape and has been instrumental in acceptance and approval of investment cases for open source technology.
Kristel supports the community of Data Science practitioners by facilitating regular Guilds.
Her experience spans the implementation of new technologies and platforms across the Enterprise in support of the Group strategy. Kristel is the Platform Owner for Big Data at Standard Bank.
Software engineer at Google Seattle, working on Cloud Dataflow SDK, focusing on streaming SQL support for Apache Beam.
Ali Bajwa is Principal Partner Solutions Engineer at Hortonworks, where he helps partners learn about and integrate with open source Big Data technologies. He has developed Ambari plugins for NiFi and Zeppelin and training materials related to security/governance. Prior to joining Hortonworks, he worked as a Principal Member of Technical Staff at Oracle.
Krishna Potluri is a Big Data Architect at TMW Systems, a Trimble Company. He is passionate about technology and has immense knowledge and experience designing and architecting Big Data Solutions.
Donnie Wheat is the Senior Big Data Architect at TMW, with a devotion to business intelligence, data integration, and data science. Recently focusing on near-real-time analytics using Apache NiFi, Donnie has experience providing actionable business intelligence with the Hadoop platform, implementing data warehouses, and delivering transportation optimization.
Don Bosco Durai (Bosco) is a thought leader in enterprise security and is a committer in open source projects like Apache Ranger, Apache Ambari, and Apache HAWQ. He has also contributed to the security of most of the Hadoop components. Bosco was the co-founder of XA Secure, which is the genesis of Apache Ranger. Bosco is currently working at Privacera to automate discovery, control, and monitoring of large datasets in Big Data and Cloud.
Madhan Neethiraj is an Apache committer and PMC for Apache Atlas and Apache Ranger projects. He works at Hortonworks as Sr. Director of Engineering in Enterprise Security Team. His contributions include Apache Ranger features like audit framework, stack model, tag-based policies, masking and row-filter policies; and Apache Atlas features like V2 APIs, search enhancements. Prior to Hortonworks, Madhan was at Oracle in development of security access management suite, governance and real-time fraud detection/prevention products. Prior to Oracle, he was with Bharosa Inc. responsible for the development of real-time fraud detection solution for Financial Institutes, HealthCare and eCommerce.
Henry Sowell is Hortonworks Technical Director in the Public Sector.
In this capacity, Mr. Sowell leads an engineering group responsible for the technical architecture and engineering of Big Data solutions supporting missions across the Intelligence Community, Department of Defense, Federal Civilian Government Agencies, and State, Local, and Higher Education institutions, helping improve speed to mission.
Prior to joining Hortonworks, Mr. Sowell used several technologies, including Apache Hadoop, to protect the nation in support of the FBI’s counterterrorism mission. In addition to supporting the counterterrorism mission, he leveraged these technologies to support cross-division law enforcement advancements with the FBI’s Cyber Division. Mr. Sowell enlisted in the United States Marine Corps in 2003. He served with distinction as a decorated combat veteran, having earned the Bronze Star with Valor for his actions in Iraq.
Leo Garciga serves as the Chief of JD-OI6 and Chief Technology Officer for the Joint Improvised-Threat Defeat Organization (JIDO), Defense Threat Reduction Agency (JIDO/DTRA). In these roles, he provides leadership and oversight of Mission Information Technology services and personnel that directly contribute to the implementation of the JIDO/DTRA mission and its support to the warfighter, Department of Defense (DoD), Combatant Commanders, Coalition partners, the intelligence and interagency organizations.
Mr. Garciga is also JIDO’s senior information technology advisor, who discovers and rapidly implements new technology and innovation to counter threat networks, improvised threats and improvised explosive devices to support counter-terrorism and counter-insurgency operations and to prevent battlefield surprise.
He advocates and spearheads efforts across DoD, the Intelligence Community, US Government Agencies, academia and industry to integrate a myriad of Research and Development work to rapidly introduce new information technology that provides immediate operational impacts for the warfighter and the nation. His efforts have resulted in continuous enhancements to Catapult, a rapid response data analytic platform, to improve situational awareness for thousands of users. He made JIDO an early adopter and leader in DoD of the implementation of Secure Dev Ops, which unified security, software development and operations to automate processes for innovation in information technology. He is also a key contributor to DoD's understanding of the potential of artificial intelligence and machine learning for future missions.
Mr. Garciga has a BA in Mechanical Engineering Technology and is a certified Information Technology professional. He has also served in a variety of roles in DoD, including active duty service in the US Navy, the Combatant Commands, and the Intelligence Community.
Stephen Wu is a senior program manager for big data at Microsoft.
Konstantin V. Shvachko is an expert in Big Data technologies, file systems, and storage solutions. He specializes in efficient data structures and algorithms for large-scale distributed storage systems. Konstantin is known as an open source software developer, author, inventor, and entrepreneur. He is a senior staff software engineer at LinkedIn.
Erik is a software engineer with a passion for all things distributed systems. He currently focuses on Big Data storage and analytics at LinkedIn. His work mainly focuses around the scalability of HDFS, both internally and via contributions to open source. Erik is particularly excited by investigating research into new and interesting storage technologies, and is passionate about the promotion of female involvement and empowerment in the technology space.
Kevin Martelli is a Managing Director at KPMG, where he is the U.S. Technology Lead in KPMG’s Lighthouse – Data & Analytics Solution Center. In this capacity, he is responsible for the technology platforms and applications that are leveraged to deliver data and analytic solutions to address client needs in the Artificial Intelligence (AI), Big Data, and Blockchain space. He also supports and leads the build-out of technology solutions and analytical data science use cases at client sites and oversees the Big Data software engineering team.
Balaji Wooputur is a Director in Data Analytics at Freddie Mac, leading business data solutions on the Big Data platform for Single Family Risk. He has extensive leadership experience in data solutions, Diginomics and advanced analytics, building and establishing business data solutions across diverse industries such as Finance, Federal and other verticals. Currently, he is responsible for developing and managing business data solutions on Big Data, AI, Diginomics and advanced analytics using Machine Learning for Single Family Credit Reporting Analytics (CAR).
Ankit Singhal has been a committer and a member of the Apache Phoenix PMC (Project Management Committee) for more than 2 years. He has also been contributing to projects like HBase, Tephra, and Calcite. He specializes in designing and developing big data solutions for different lines of business. With over 7 years of Big Data experience, he has architected and created various analytics products and data warehouse solutions using technologies like Hadoop, Kafka, Hive, HBase, Phoenix, and Spark.
Rajeshbabu Chintaguntla is a Staff Software Engineer on the R&D Data team at Hortonworks. He is an expert in designing and developing big data solutions to meet business needs. Throughout his career, he has contributed to the development of the Hadoop ecosystem; he is a committer for Apache HBase and a Project Management Committee member for Apache Phoenix.
Engineering manager in the Big Data team at Microsoft.
Software engineer in the Big Data team at Microsoft.
Ohad Shacham is a Senior Research Scientist at Yahoo Research. He works on scalable big data and search platforms. Most recently, he focused on extending the Omid transaction processing system with high availability and integrating Omid with Apache Phoenix. Ohad received his PhD in concurrent software verification from Tel-Aviv University CS in 2012. Prior to Yahoo, Ohad led the SAT-based formal verification activities at IBM Research and worked on automatic software vectorization at Intel.
While employed by some of the largest and most recognizable organizations in the health care, financial services, and digital publishing industries, Andy has led major strategic data integration and advanced analytics efforts of critical importance to those firms. Currently, Andy leads the strategic redesign of Health Care Service Corporation’s information architecture and data management activities, heavily leveraging big data technology and open source projects to improve the quality of members’ health care services within a rapidly changing industry.
Dr. Leon Li serves as Software Architect and designer of Northrop Grumman’s Hadoop based enterprise data analytics platform. He is an expert in Hadoop based enterprise system architectures, and advises Northrop Grumman executive leadership on analytics technologies. At Northrop Grumman, Leon previously served as Senior Software Engineer for a national cyber security information sharing program, led a university research effort in cryptography, and led systems engineering efforts on Cloud based big data systems for genomics research. Leon graduated with a PhD in Electrical Engineering from MIT.
Kunal Umrigar, Sr. Director, Engineering at PubMatic, leads the big data & analytics team to design and develop a big data platform to ingest and process terabytes of data. Along with his big data experience, he has also been a lead architect of PubMatic Platform APIs and has designed and developed the Microservice and API infrastructure from the ground up. Prior to PubMatic, Kunal held engineering roles at Unica (now part of IBM) where he worked on designing and developing enterprise applications.
Plamen J. Jeliazkov has been an HDFS contributor for about 6 years and is considered an expert by his peers. He specializes in HDFS, most notably the NameNode internals. He was part of the team that brought truncate functionality to Hadoop. He is currently a senior Hadoop engineer at PayPal. His excitement comes from shining and polishing HDFS clusters to work at their best.
From his Twitter:
"Programmer. Gamer. Nerd. UCSD alumni. I develop Hadoop and HBase. I like computer systems, video games, and crypto."
Russ McElroy is the Big Data Platform Manager at PayPal responsible for availability, architecture, strategy and vision for PayPal's Big Data assets including Hadoop, Druid and others. He has 20+ years in e-commerce and payments DevOps with a special interest in distributed systems. In previous roles, he took on eBay's scaling challenges in transaction databases and storage, search, analytics and cloud environments. His current focus is on enhancing PayPal's data governance capabilities by gaining greater insight into analytics metadata (analytics on analytics) while leveraging and providing open source options. He has a B.S. in Computer Science and Engineering from UC Davis.
Matt Aslett is a Research Director for the Data Platforms and Analytics Channel at 451 Research. Matt has overall responsibility for the data platforms and analytics research coverage, which includes operational and analytic databases, Hadoop, grid/cache, stream processing, search-based data platforms, data integration, data quality, data management, analytics, machine learning and advanced analytics. Matt's own primary area of focus includes data management, reporting and analytics, and exploring how the various data platforms and analytics technology sectors are converging in the form of next-generation data platforms.
Matt is a regular speaker at client and industry events and has delivered keynotes and moderated panels at Strata + Hadoop World, Hadoop Summit, Percona Live MySQL Conference, GraphConnect, Data Leadership and NoSQL Roadshow. Matt has been named by AnalyticsWeek as being among the top 200 Thought Leaders in the field of Big Data and Analytics.
Kiran works as a PM for Apache Knox and platform security in the security and governance team at Hortonworks. His past experience has been in data security for Hadoop and Security Analytics domains.
Sandeep More is a committer and PMC member for the Apache Knox project and has extensive experience in the areas of enterprise security, API Management, XML gateways and Tokenization. Previously, he worked as a security architect on security frameworks for financial institutions. Currently, he works as part of Hortonworks’ team dedicated to the Apache Knox project.
Mark is currently in his fifth year within Hortonworks Partner Engineering and has 29 years of experience working with Advanced Analytic, Distributed Computing and Data platforms. He is currently focused on helping customers leverage Advanced Analytics, IoT and Big Data capabilities to achieve a competitive advantage. Mark has a BS in Computer Science from North Carolina State University and also holds a Six Sigma Black Belt.
Viplava Madasu is a Big Data Systems Engineer at Hewlett Packard Enterprise where he currently works on evaluating emerging big data technologies and creating reference architectures for HPE converged infrastructure platforms. Previously, he worked developing software in different groups at HPE in Application Server Middleware/Java Hotspot JVM/SQL database engine areas. He holds a Masters degree in Computer Science from Indian Institute of Technology, Kharagpur.
As the General Manager, Energy, Kenneth is responsible for establishing and leading the execution of a winning go-to-market strategy for Hortonworks in the energy industry. Duties include development of successful thought leadership, and proactively working with Hortonworks sales & product management, prospects, customers, and partners to identify new solutions and industry specific requirements for the energy market. He also ensures appropriate energy industry expertise is provided and leveraged during the sales and solution implementation cycle, and helps develop successful strategies to drive adoption of Hortonworks products and services in the segment.
Kenneth has over 15 years of proven success as a sales executive across multiple industries, the last seven spent in the energy sector, through the delivery of high-value propositioned products and solutions, with a track record of developing reciprocal, rewarding relationships with his customers, partners, and colleagues.
Alejandro Fernandez has 10 years as a software engineer and 3+ years in the Hadoop ecosystem. He's currently at Airbnb in the Data Infrastructure team working on Hadoop security, Apache Airflow (incubating), and Hive. Previously, he was at Hortonworks for 3 years where he became a PMC for Apache Ambari and contributed to features like Rolling & Express upgrades. He has spoken at major conferences like Dataworks Summit in San Jose and Melbourne, and at the Apache Big Data conference in Miami. He graduated from Carnegie Mellon University, where he got his Bachelor of Science in Computer Science and additional major in Mathematics.
Sarath Subramanian is an Apache Atlas committer/PMC member and currently works for Hortonworks as Staff Software Engineer. He is actively engaged in the development of Apache Atlas. Prior to Hortonworks, Sarath worked at IBM in big data analytics platform using Hadoop and Spark with particular focus on Social Data Analytics.
I'm currently a Sr. Software engineer at Hortonworks primarily working on Apache Atlas. Prior to this I was working at PayPal Inc. on the Auth platform.
Pushkar leads the data and ML engineering platform at Machine Zone. The platform powers performance marketing across 300+ channels. Previously, he led development of a low-latency messaging platform for enterprise social collaboration at salesforce.com.
Software Engineer with 10+ years of experience.
Specialized in building high load distributed platforms, data transformation platforms, machine learning platforms.
Tendü Yoğurtçu, Ph.D., is Syncsort’s Chief Technology Officer (CTO). She has 20+ years of software industry experience, including extensive Big Data and Hadoop industry knowledge. As CTO, Tendü directs the company’s technology strategy and innovation, leading all product research and development programs.
Prior to her CTO role, Tendü served as Syncsort’s General Manager of Big Data, leading the global software business for Data Integration, Hadoop and Cloud, including sales, marketing, engineering and support. Tendü has held several engineering management roles where she directed the development of ETL, Sort, and Application Modernization products for Syncsort’s Data Integration business. She also was an Adjunct Faculty Member at the Computer Science Department at Stevens Institute of Technology.
Tendü is a dedicated advocate for STEM education for women and diversity.
Tendü received her PhD in Computer Science from Stevens Institute of Technology, NJ, a Master’s degree in Industrial Engineering, and a Bachelor’s degree in Computer Engineering from Bosphorus University, Istanbul.
Amit Anand is a senior software developer at Bloomberg on the Hadoop Services/Infrastructure team, where he is involved in designing and developing user-facing tools around the Hadoop platform. Amit is involved on the Hadoop Infrastructure team as well, where he is responsible for deployment and management of Hadoop clusters. He focuses on HDFS, YARN, HBase and Spark. He holds a Bachelors in Commerce and a Masters in Computer Science.
Esther Kundin is a senior software developer at Bloomberg on the Machine Learning Text Analysis team, where she is the lead architect and engineer of the data archival project for the Engineering News department. She focuses on HDFS ingestion pipelines and PySpark integration for news data and analytics. Previously, Esther has worked on the Hadoop Infrastructure team as well as the Portfolio Analytics team in Bloomberg. She holds a BA in Computer Science and Mathematics from New York University and a Masters in Computer Science from Columbia University.
As Managing Director and Head of Machine Learning, Sourav is responsible for the overall delivery of data science and data product services to make clients successful. Before Manifold, Sourav led teams to build data products across the technology stack, from smart thermostats and security cams (Google / Nest) to power grid forecasting (AutoGrid) to wireless communication chips (Qualcomm). He holds patents for his work, has been published in several IEEE journals, and has won numerous awards. He earned his PhD, MS, and BS degrees from MIT in Electrical Engineering and Computer Science.
Abhas, an electrical engineer by training, is a seasoned strategy consultant and a passionate entrepreneur. An ardent innovator, he holds an inimitable experience of incubating a Digital start-up, and successfully growing it as a lean alternative growth engine bringing in revenues in excess of $10M.
Currently he heads up Strategy & Innovation at Hortonworks and helps early stage startups with angel Investments, product prototyping, affiliate partnership strategy, marketing and loyalty biz-stream creation and sustainable 360 degree growth, across both sides of the Atlantic.
He was selected as a Global Shaper by the World Economic Forum, as one of the '100 Visionaries under 30' alongside the likes of Nobel Laureate Malala Yousafzai by Real Leaders Magazine, and as one of the Founders of the Future under 35 by the Founders Forum.
Apache Software Foundation President. Key previous ASF roles: Director (13 years), Secretary, Treasurer, VP Legal, VP Infrastructure, VP Jakarta.
Published author: Agile Web Development with Rails 5 (Pragmatic Programming, 2009-present) and RESTful Web Services (O'Reilly, 2007)
W3C HTML5 WG co-chair (2009-2015)
IETF Atom Working Group Secretary (2004-2007)
Convener of C# and .Net CLI WGs at ECMA (2000-2005)
PHP Core Group (technically 1999-present, but no longer active)
Mr. Menon currently serves as the Vice President of Enterprise Information Management for Hilton. He is responsible for providing strategic direction for data management and business intelligence. He has also held multiple positions within Hilton, ranging from senior architect to director of designing and building OnQ, hospitality’s first integrated platform. Mr. Menon joined Hilton in 1994, prior to which he worked for Tata Consultancy Services supporting numerous commercial clients like General Electric and National Stock Exchange of India. Mr. Menon has a Master’s degree in business administration from University of Memphis and a Master’s in Electronics Engineering from National Institute of Technology, India.
Lisa is currently serving USAA and its members in their Chief Technology Office as a Technical Architect over Data and Analytics Infrastructure. She focuses on setting architectural direction for USAA's big data infrastructure and analytic appliances. Prior to this role, she devoted her expertise to the implementation and support of USAA's data warehousing and analytic systems as a Lead DBA.
Robert is currently a Hadoop Administrator at USAA. He began his big data journey in 2011 when, as part of a team of 4, he helped to bring Hadoop into USAA. He is a certified Cloudera Hadoop Administrator and has experience working with a few different Hadoop distributions including IBM BigInsights and Hortonworks HDP. He also worked a bit with the MapR and Pivotal distributions. Since 2011 he has assisted in building and maintaining the USAA Hadoop clusters. He also assisted in identifying the skills needed and growing the team of Hadoop Administrators, and is always willing to take the time to train others on the job. He has also assisted in developing a chargeback model, led efforts to perform upgrades (not a small effort with BigInsights and GPFS!), led an effort to combine clusters and assisted in identifying and purchasing the next generation hardware for the clusters. When he had spare time he enjoyed setting up his own clusters at home, though with his wife and six kids (including 7-month-old twins) there is little time or budget left for that now.
Pranoop Erasani is a Senior Technical Director for the world’s No. 1 storage operating system, NetApp® ONTAP®. NetApp is widely recognized as the market leader in storage software solutions. Since joining NetApp, Pranoop has led technology innovations in the areas of NAS protocols, scalable filesystems, analytics and caching technology for the NetApp flagship ONTAP operating system. He is passionate about clustering and distributed systems and is a strong advocate of leveraging NAS for analytics storage. He acts as a key technical advisor to technical marketing and product management for design and development of technologies required to make ONTAP a successful data management platform for Hadoop, NoSQL and Machine Learning. Prior to NetApp, Pranoop worked at Sun Microsystems on the Solaris Clustering product. Pranoop holds a Master’s degree in computer science from the University of Minnesota, Minneapolis.
Shankar Pasupathy is NetApp’s Technical Director for Active IQ, a telemetry system that provides data driven insights for NetApp’s customers. He leads both the data science and data engineering teams that are responsible for processing and deriving insights from 70 billion data points a day. Shankar also drives strategy for the company around analytics and machine learning in the hybrid cloud. Previously Shankar was a senior manager and principal architect in NetApp’s Advanced Technology Group. He has published more than 20 research papers, is a co-author on 40 patents, and has won several NetApp innovation awards. He has a Masters in Computer Science from the University of Wisconsin-Madison and dual degrees in Math and Computer Science from BITS, India.
Arindam is a Principal Group Program Manager in the Microsoft Big Data group where he leads the Azure HDInsight team. He comes to big data after extensive experience in Transaction Processing, Authentication & Authorization technologies in Windows Core OS, and designing the Microsoft Dynamics security model. He is currently focused on making it easy for developers to build analytics applications using tools of their choice on cloud platforms that offer best of breed security, compliance, efficiency and cost-control.
Sandip Shah, Director of Data Management, is responsible for managing the Enterprise Information Systems and Services for Rogers Communications. Sandip is a strategic leader with over 15 years’ experience in moving organizations toward a true data-as-an-asset mindset and harnessing database services, Big Data and AI for business insights. Sandip joined the Enterprise IT team of Rogers Communications in 2013 with the responsibility for centralizing all the data across the organization to deliver an enterprise data warehouse and enterprise data lake. Sandip’s major role is to enable the Rogers business units (Consumer, Enterprise and Media) to provide true data insights and analytics. Collaborative partnerships were built inside and outside the company, which enabled the IT organization to reduce the operational cost of the data management team by 45%, i.e. roughly 12M+ within 3 years. These strategic relationships include product partners like Hortonworks, SAS, Informatica, Microstrategy as well as SI partners like TCS, Tech M, Cap Gemini to build high-performance enterprise data warehouse, Big Data analytics, and enterprise reporting teams.
Prior to Rogers, Sandip managed Scotiabank’s enterprise data warehouse and helped the organization build a financial data warehouse for Marketing and Retail Units. Sandip holds a Master of Computer Engineering degree from Canada and a Bachelor’s degree in Computer Science from India.
David Nesarajah, Senior Director and Head of Data Management & Technology at Rogers Communications, is responsible for managing the Enterprise Information Systems and Services. David, leading a department of over 400 onshore and offshore resources, is a strategic leader with over 20 years of experience in Finance and IT. In David’s current role, he has been promoting data as an asset for decision making and strategic investments. Under David’s leadership, capital intensity in Data Management has nearly doubled in a three year period.
David joined the Data Management team with a focus to simplify existing technologies, implement new technologies, and drive improvements in BI delivery. The team pursued an aggressive path to centralize all data across the organization by replatforming systems, reducing data redundancy, and streamlining technologies, leading to a 45% reduction in operating costs over a three year period. Through this journey of simplification the team has driven improved security and access to data to ensure it is treated as a corporate asset.
Speed and quality of program delivery has improved by developing strategic partnerships with product partners such as Hortonworks, SAS, Informatica, Microstrategy as well as SI Partners such as TCS, Tech M, Cap Gemini and Adastra.
In addition to the technology transformation, David has worked with the business units across the organization to implement automated and self-serve reporting, and advanced analytics and modeling capabilities.
Dennis is responsible for the execution of a multi-year business data strategy for Freddie Mac's Single Family (SF) division with a specific focus area on business intelligence and analytics enablement. As a shared data asset owner and data steward he is responsible for the SF Datamart, the Big Data Analytic Platform and several analytical tool implementations. He launched a joint IT and business Big Data CoE and chairs enterprise BI and Analytics CoPs. He was an IT Development Director for ten years where he led the creation of a centralized IT data delivery department before moving to the business in 2014 to help stand-up SF Data Governance and Management. He was a Senior Principal at American Management Systems (now CGI Federal) where he led the development and operations of an internal data warehouse and a global web-based BI application.
Simon is a data scientist, has experience in product management, and has worked for numerous data technology companies, from vendors like Hortonworks to various data users in retail, hedge funds and the web. His focus is on big data, machine learning, and using these technologies to foster results.
Mr. Justin Langseth founded Zoomdata in 2012, and is its Chairman of the Board. Previously, he co-founded Clarabridge, Inc. in 2005 and served as its President and Chief Technology Officer, and co-founded Claraview Inc. and served as its Chief Technology Officer. Prior to Claraview, he was founder and Chief Technology Officer of Strategy.com, a real-time data analysis and alerting subsidiary of MicroStrategy. He holds 12 patents related to big and real-time data. He graduated from the Massachusetts Institute of Technology, where he received an SB in Management of Information Technology from the MIT Sloan School of Management.
I am a Principal Engineer working at Hortonworks, focussed on governance and metadata management products. I lead the Hortonworks DataPlane Services platform and Hortonworks Data Steward Studio. Earlier, I was an active contributor and committer on Apache Atlas. I am interested in building scalable data processing systems and metadata management systems that operate in the Apache Hadoop ecosystem. I have been involved with Apache Hadoop since early days, and was a lead responsible for MapReduce before Hadoop graduated to become a 1.0 product.
Niru Anisetti is the product manager for Data Lifecycle Manager at Hortonworks. She is part of a passionate team building the next generation disaster recovery product to make millions of data managers’ lives easier. Before Hortonworks, she worked at IBM, Intuit and Yahoo among other companies to build products to not only generate revenues but to change lives of people for the better. She can be reached at firstname.lastname@example.org.
Billie Rinaldi is a Principal Software Engineer I at Hortonworks, currently prototyping new features related to long-running services and containers in Apache Hadoop YARN. Prior to August 2012, Billie engaged in big data science and research at the National Security Agency, where she provided early leadership for Apache Accumulo. Billie is a member of the Apache Software Foundation and a committer for Apache Hadoop and a number of other Apache projects in the Hadoop ecosystem. She holds a Ph.D. in applied mathematics from Rensselaer Polytechnic Institute.
Shane Kumpf is a Software Engineer on the Apache Hadoop YARN R&D team at Hortonworks.
Mohammad Kamrul Islam is currently working at Uber on its Data Infrastructure team as a Staff Software Engineer. Previously, he worked at LinkedIn for more than two years as a Staff Software Engineer in their Hadoop Development team. Before that, he worked at Yahoo! for nearly five years as an Oozie architect/technical lead. He has been intimately involved with the Apache Hadoop ecosystem since 2009. Mohammad has a Ph.D. in computer science with a specialization in parallel job scheduling from Ohio State University. He is a Project Management Committee (PMC) member of both Apache Oozie and Apache TEZ and frequently contributes to Apache HDFS/YARN/MapReduce and Apache Hive.
Wei Han is currently working at Uber, managing the Hadoop security team. Before that, he worked on Cherami, Uber's message queue system. Before Uber, he worked at Microsoft Bing for 7 years, mainly focusing on Bing's index generation system and large-scale storage systems.
Siddharth Seth works as a Software Engineer at Hortonworks, and has been involved with various Hadoop ecosystem projects for the past 7 years. He currently works on the Hortonworks cloud effort, and in the past has worked on Apache Hive-LLAP, Apache Tez, and Apache Hadoop with a focus on YARN and MapReduce. He is a Hive committer, a member of the Apache Tez PMC, and the Apache Hadoop PMC. Prior to this he spent several years working on search at Yahoo.
Christopher Crosbie has over fifteen years of experience developing and deploying data technology in enterprise environments. He is currently on the Cloud Partner Engineering team at Google, where he serves as a trusted advisor to software vendors that build Data, Analytics and ML solutions on the Google Cloud platform.
Previous to joining Google, Chris was a development manager at Amazon and before that he headed up the data science team at Memorial Sloan Kettering Cancer Center where he implemented the enterprise Hortonworks architecture and strategy. Chris started his career as a biostatistics application engineer at the NSABP, a not-for-profit clinical trials cooperative group supported by the National Cancer Institute. He holds an MPH in Biostatistics and an MS in Information Science.
Jennifer Nicholson leads the GDPR offering for Accenture Technology's Data Business Group in North America. She works closely with colleagues and clients to develop sustainable regulatory compliance programs for robust data ecosystems. Prior to joining Accenture she had been helping clients comply with data privacy regulations without negatively impacting their bottom line for over 10 years. She has her J.D., but doesn't technically practice law.
Pallavi Galgali is an Offering Manager at IBM and is responsible for IBM Spectrum Storage Suite bundled SDS offering and IBM Spectrum Scale Big Data Solutions. She has 14+ years of industry experience that includes software development, product/function delivery, offering management and people management. Her technical background is in the area of storage management and infrastructure for cloud & analytics. She has 2 patents and has also published articles and papers on technical topics.
Douglas came to IBM with a background that includes several start-ups, a few publicly traded companies and several open source projects. His focus has been on introducing new technology into the marketplace, including x86-64, online game rentals, Linux, and parallel-NFS. A graduate of Yale University in Physics and a former member of IASTE Local 1 and USA 829, he raised his family near Boston and hopes you don’t hold that against him.
Douglas has lived on four continents; worked professionally in three. His only wish would be to speak fluently in something aside from English, but he still practices his Spanish whenever he can.
Bob has broad experience as a Pre-Sales Big Data Solutions Architect (Systems Engineer) for Hewlett-Packard Enterprise, designing computer hardware solutions for Big Data Hadoop environments across the major distributions for large corporations using the leading vendors’ software deployments, including Hortonworks. He focuses on solving customer data problems and frequently collaborates with customers on Hadoop Big Data projects. Bob has specialized in Big Data security for complete end-to-end security of Big Data environments.
Kevin Bates is Fannie Mae’s Vice President for Enterprise Data Strategy Execution. Reporting to the Chief Data Officer, Bates is responsible for leading Enterprise Data’s execution and delivery of end-to-end solutions supporting corporate initiatives. This role includes responsibility for delivery of the Enterprise Data Infrastructure (EDI), the primary vehicle for enterprise-scale data aggregation, integration, and standardization at Fannie Mae.
Anand is a techno-business leader at Impetus Technologies providing product strategy, product marketing and sales leadership for the StreamAnalytix business at Impetus. He is focused on evangelizing and delivering business value from big data and fast data analytics to Fortune 1000 enterprises. Having spoken at numerous big data conferences on a range of topics including big data use cases, ROI, real-time streaming analytics, and the enterprise big data bus, Anand is a well-known thought leader in the big data ecosystem. He brings 22+ years of software technology, architecture and go-to-market experience in hi-tech, telecom, mobile, gaming and enterprise big data and analytics systems.
A former executive at General Motors and CTO/VP of Information Management at US Xpress Inc, Timothy has over 30 years’ experience in configuration management planning for Information Management Systems, Operational Data Stores and Order Entry Systems. His expertise includes delivery and implementation of mechanisms to identify, control, and track changes during project rollouts across functional teams.
A patent holder for an Internet Order Entry System and the author of several articles on effective collaboration techniques, Mr. Leonard has practical experience in all phases of the project lifecycle, from product inception and maturity through steady-state rollout and cohesive integration into mainstream quote-to-cash systems.
Mr. Leonard has spearheaded and deployed proven methods at Fortune 10, 100 and 500 companies to identify, track and evaluate the implications of changes through statistical configuration synthesis and measurements across initiation and release cycles of enterprise solutions. His methods, proven by empirical evidence, have effectively converged on the optimum balance of delivering business solutions through efficient and enabling IT assets.
A technology leader, visionary, and prolific writer and speaker, he has had his accomplishments showcased in over 30 media outlets such as BeyeNETWORK, Bloomberg Business Week, Computer Weekly, ComputerWorld UK, Information Management and TechTarget. He was recognized by Information Week as one of the 2011 Top 25 Information Managers and by Informatica as a winner of two 2012 Innovation Awards in the Best of the Best and Megatrends: Big Data, Cloud, Social Media categories.
In 2017, Leonard received the Hortonworks Data Visionary Award for his work in business intelligence. Leonard pioneered blockchain technology in the transportation industry by identifying business use cases and fully developing the first set of technology addressing smart contracts.
Namrata Ghadi is a Software Development Engineer (ML and Data Science) in Workday’s Syman team. She has been working on ML- and DS-based projects for 2+ years and as a Software Engineer for 6+ years. Namrata has an MS in CS from Carnegie Mellon University.
Adam Baker is a Senior Software Development Engineer (ML and Data Science) in Workday’s Machine Learning team. He worked on computational linguistics in graduate school and has worked in NLP and ML for one year at Workday. He has worked in software development for 7 years. He has an MA in Linguistics from the University of Chicago and a B.S. in CS from Ohio State University.
Artem Ervits is a Solutions Engineer at Hortonworks. Hortonworks is a leading big data software company based in Santa Clara, California. The company develops and supports Apache Hadoop for the distributed processing of large data sets across computer clusters. Artem is an organizer of the NYC Future of Data Meetup and a contributor to Apache Oozie. He works with the Workflow Manager and Oozie product management and engineering teams to shape the future direction of Workflow Manager and Oozie. You may reach him with questions on Oozie, HBase, Phoenix, Pig and Hive.
Clay Baenziger is an architect for the Hadoop Infrastructure Team at Bloomberg. Clay comes from a diverse background in systems infrastructure and analytics. At Sun Microsystems, his team built out an automated bare-metal Solaris deployment tool for Solaris engineering labs, and later his contributions were core to the OpenSolaris Automated Installer. His team at Opera Solutions built out a financial portfolio analytics product, which provided a good introduction to Hadoop. Merging the two, his team at Bloomberg has now openly developed infrastructure for low-latency HBase, Spark, scalable ingest with Kafka and big-data warehousing using much of the Hadoop ecosystem.
Clay is a past leader and presenter at the Front Range OpenSolaris Users Group (FROSUG) and has presented big data ideas at Hadoop Summit North America (2014), the San Francisco Hadoop Users Group (July '14), Chef Conf (2015), HBase Con East (2016), ApacheCon Big Data (2017) and DataWorks Summit San Jose (2017).
Sunitha is a seasoned technology leader in Quality Engineering with a proven history of creating effective test frameworks and tools and of building and developing strong QE teams globally. She is passionate about driving efficiencies in test certification using innovative test strategies and tools.
Sunitha has been part of Hortonworks for the last 5+ years and is responsible for delivering quality releases for Hadoop Storage and Compute, Ambari, Security Governance and all things Platform Certification for the full HDP Stack.
Prior to Hortonworks, she led technical initiatives in the Quality Org at Yahoo!, including but not limited to developing test frameworks, leading the Automation special interest group, and delivering high-quality technical workshops, presentations and consultations that boosted adoption of newer test tools across cross-functional orgs.
Vijay Bommireddipalli leads IBM’s Center for Open-Source Data & AI Technologies (CODAIT – http://codait.org) formerly known as the Spark Technology Center. His team focuses on creation and curation of Open Source Data and AI technologies at IBM. He joined IBM after finishing his MS in Computer Engineering at University of Massachusetts – Dartmouth. He has expertise in various technologies including Apache Spark and the Big Data ecosystem, Data Persistence, Data Management tooling, and Data Warehousing. He has presented extensively on these topics at various conferences worldwide.
Bryan Cutler is a software engineer at IBM’s Center for Open Source Data and AI Technologies where he works on big data analytics and machine learning systems. He is a committer of Apache Spark in the areas of ML, SQL, Core and Python and a committer for the Apache Arrow project. His interests are in pushing the boundaries of software to build high performance tools that are also a snap to use.