Ron is a Cloud Architect on Microsoft’s Azure Global Black Belt team, working with customers from the CxO level to the technical architects helping them develop strategy, architecture, and a DevOps mindset to solve business challenges with the Cloud.
I am a software engineer on the Azure HDInsight Team working on provisioning and operation of open source software on Azure. Specifically I focus on optimizing HBase and Phoenix on Azure as a managed service. In general, my interests lie in algorithms for optimization problems, performance and scale and big data computation. Prior to joining Microsoft I completed my PhD in Computer Science from the University of Iowa.
Cathy O’Neil is the author of the New York Times bestselling Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, which was also a semifinalist for the National Book Award. She is a columnist for Bloomberg View and founded the company ORCAA, an algorithmic auditing company.
She earned a Ph.D. in math from Harvard, was a postdoctoral fellow in the MIT math department, and a professor at Barnard College where she published a number of research papers in arithmetic algebraic geometry. She then switched over to the private sector, working as a quantitative analyst for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She left finance in 2011 and started working as a data scientist in the New York start-up scene, building models that predicted people’s purchases and clicks.
Cathy wrote Doing Data Science in 2013 and launched the Lede Program in Data Journalism at Columbia in 2014.
Hilary is general manager, machine learning, at Cloudera. She was the founder and CEO of Fast Forward Labs, an applied machine learning research company that Cloudera acquired in 2017. She also serves as data scientist in residence at Accel Partners, a leading global venture capital firm. Previously, Hilary was chief scientist at bitly. She co-hosts DataGotham, a conference for New York's home-grown data community, and co-founded HackNY, a non-profit that helps engineering students find opportunities in New York's creative technical economy. She is on the board of the Anita Borg Institute and an advisor to several companies, including Sparkfun Electronics and Wonder. Hilary served on Mayor Bloomberg’s Technology Advisory Board and is a member of the Brooklyn hacker collective NYC Resistor.
Notable expert clinical information systems specialist offering 25-plus years of strategic leadership. Successful architect of healthcare data warehouses, clinical and business intelligence tools, big data ecosystems, and a health information exchange.
Excel at leading the development of long-term systems strategy for major medical organizations and executing plans to select innovative technology, implement systems, and leverage and maximize system functionality to enhance the health care delivery process.
Evangelist for the use of clinical technology to drive daily operations, analysis, and decisioning. Reputation for building consensus among medical, nursing and research leadership, clinical departments, and IT.
Scope of expertise encompasses designing technology-enabled processes, leading the clinical model for transformation, coordinating clinical workflow, ensuring the use of standardized data elements in clinical systems to meet clinical and research data warehouse requirements, leading teams through development and launch, and directing training.
Charles Boicey MS, RN-BC is the chief innovation officer for Clearsense, an outcomes-driven healthcare technology company based in Jacksonville, FL. Previously, Charles was the enterprise analytics architect for Stony Brook Medicine, where he developed the analytics infrastructure to serve the clinical, operational, quality, and research needs of the organization. He was a founding member of the team that developed the Health and Human Services award-winning application NowTrending to assist in the early detection of disease outbreaks by utilizing social media feeds. Charles is a former president of the American Nursing Informatics Association.
Nick Psaki is the Principal, Office of the CTO for Pure Storage Federal and is based in the Washington, DC area. Nick is Pure Storage's senior technical resource for Federal customers, providing deep technical knowledge of flash storage system architectures that enable business and technological transformation for government enterprises.
A 20-year veteran of the United States Army, Nick has extensive experience in designing, developing, deploying and operating information systems for data analysis, sensor integration, and large-scale server virtualization. He was the Intelligence Architectures Chief for the Army G2 (Intelligence), and the Technology and Integration Director for Army G2 Futures Directorate. He has served in multiple peacekeeping and combat operations ranging from the Balkans in the 1990s (Operation Able Sentry VI and Operation Joint Endeavor/Joint Guard) to Iraq and Afghanistan in the post-9/11 era. For the past several years, Nick has been focused on ways in which new and emerging technologies can enable more rapid and cost-efficient analysis of ever-growing bodies of data.
Barbara Eckman is a Senior Principal Software Architect at Comcast and a recognized innovator in Big Data architecture and governance. She leads data discovery and lineage platform architecture for a division-wide initiative comprising streaming, transforming, storing, and analyzing Big Data. Barbara is also the Lead Metadata Architect for the Comcast Privacy Program, an initiative tackling the challenge of legislation like the California Consumer Privacy Act. Her prior experience includes scientific data and model integration at the Human Genome Project, Merck, GlaxoSmithKline, and IBM, where she served on the peer-elected IBM Academy of Technology.
As chief marketing officer, Mick leads Cloudera’s worldwide marketing efforts, including advertising, brand, communications, demand, partner, solutions, and web. Mick has had a successful 25-year career in enterprise and cloud software. Prior to joining Cloudera in 2016, he served as CMO of sales acceleration and machine learning company InsideSales.com. Under Mick’s leadership, InsideSales pioneered a shift to data-driven marketing and sales that has served as a model for organizations around the globe. Prior to InsideSales, Mick served as global vice president of marketing and strategy at Citrix, where he led the company’s push into the high-growth desktop virtualization market. Before Citrix, Mick managed executive marketing at Microsoft and held numerous leadership positions at IBM Software. Mick is an advisory board member for InsideSales and a contributing author on Inc.com. He is also an accomplished public speaker who has shared his insightful messages about the business impact of technology with audiences around the world. Mick graduated from the Georgia Institute of Technology with a bachelor of science degree in management.
Jerry Green is a worldwide leader driving sales and product strategy to ensure IBM is differentiated in the open source marketplace by not relying exclusively on either open source or proprietary tools, but by combining their unique strengths in IBM platforms to form an "Open+" strategy. He is a sales professional with over twenty years of successful sales, business management, and technical experience.
Highly efficient and results-oriented data scientist with strong quantitative skills, development experience, and a strong educational background, including an MSc from Imperial College London (ranked in the QS world top 10). Responsible self-starter with demonstrated experience in statistical programming languages (R, Python, SAS) and in Python for APIs. Highly skilled in visualization with tools such as Tableau, with a good understanding of relational databases such as SQL and Oracle and non-relational databases such as HBase, MongoDB, and Redis. Experienced with machine learning tools such as Hadoop, Spark, H2O, Sparkling Water, PySparkling, and SAS, as well as deep learning tools such as Keras, TensorFlow, Theano, MXNet, and PyTorch, along with GPU CUDA programming and scaling data science. Expert in predictive modeling, including XGBoost, regression, logit, probit, GBM, random forest, neural networks (generative models, GAN, VAE, RNN, CNN, word2vec, etc.), Naive Bayes, k-nearest neighbors, and PCA, across supervised, unsupervised, semi-supervised, and reinforcement learning; and in probabilistic modeling (PyMC3, Edward, Pyro), such as MCMC, HMC, NUTS, Bayesian linear regression, and variational models. Data mining skills include parsing and natural language processing (NLP), with proficiency in language modeling such as topic models, text clustering, word embeddings, Word2Vec, GloVe, text classification, RNNs, and convolutional RNNs. Familiar with development environments such as Hadoop, cloud (AWS, GCP, Azure), GPU, and Spark.
Strong communication and relationship-building skills with diverse parties; fluent in English and Korean.
Tim Spann was a Senior Solutions Architect at AirisData working with Apache Spark and Machine Learning. Previously he was a Senior Software Engineer at SecurityScorecard (http://securityscorecard.com/) helping to build a reactive platform for monitoring real-time 3rd party vendor security risk in Java and Scala. Before that he was a Senior Field Engineer for Pivotal focusing on CloudFoundry, HAWQ and Big Data. He is an avid blogger and the Big Data Zone Leader for DZone (https://dzone.com/users/297029/bunkertor.html).
He runs the very successful Future of Data Princeton meetup, with over 1192 members, at http://www.meetup.com/futureofdata-princeton/.
He is currently a Senior Solutions Engineer at Cloudera in the Princeton, New Jersey area.
You can find all the source and material behind his talks at his Github and Community blog:
Mehul is the founder of Infinity Services Inc, a blockchain development company focused on 'Blockchain for Business' solutions. Mehul is also the inventor of Zippy Logic, a product that implements Fuzzy Logic based AI without the need for developers.
Mehul has 20+ years of hands-on and management experience in implementing complex IT projects across global markets including North America, India, UK and Australia.
Mehul was also a top 5 winner among 1000+ participants in IBM’s blockchain competition.
David is a Director of Solution Architecture at Streamlio and a contributor to the Apache NiFi and Apache Pulsar projects. He was formerly the Practice Director at Hortonworks, where he was responsible for the development of best practices and solutions for the professional services team, with a focus on HDF-related technologies including Kafka, NiFi, and Storm. He is a co-author of “Practical Hive: A Guide to Hadoop’s Data Warehouse System” and holds a B.S. and a Master’s degree in Computer Science from Kent State University.
Akshitha Ramachandran is a junior at Harvard University pursuing a joint degree in Computer Science and Statistics. She was a founding member and Lead Engineer at Harvard Student Agencies - DEV, a start-up focused on developing mobile and web applications for third party clients. She is both a senior developer and board member at ProMazo, a campus organization partnering top students from leading universities with projects at leading companies ranging from Unilever to Whirlpool. She is the Director of the Harvard College Consulting Group, and is responsible for the acquisition, organization and execution of eighteen client projects along with managing organization operations. She directly manages 13 board members who collectively oversee more than 120 members of the group. Additionally, Akshitha attends hackathons and has been on the board of Harvard’s Women Engineers Code (WECode) conference. This past summer she spent time at Novetta expanding their Machine Learning practices, specifically in the Named Entity Resolution space. She has contributed to the company’s internal pipeline, designed a demo for them, and published some of her work (https://www.novetta.com/2018/09/named-entity-recognition-and-graph-visualization/ and https://www.novetta.com/2018/08/evaluating-solutions-for-named-entity-recognition/).
Don Bosco Durai (Bosco) is a thought leader in enterprise security and a committer on open source projects such as Apache Ranger, Apache Ambari, and Apache HAWQ. He has also contributed to the security of most of the Hadoop components. Bosco was the co-founder of XA Secure, which was the genesis of Apache Ranger. Bosco is currently the co-founder of Privacera, where he is tackling the data security challenges of modern data architectures such as Big Data and Cloud, in which large data sets constantly move between different environments and can cause major security breaches or compliance violations if not managed properly. Privacera automates discovery of sensitive data, performs transparent encryption/anonymization, manages access policies, and monitors access.
Madhan Neethiraj is an Apache committer and PMC member for the Apache Atlas and Apache Ranger projects. He works at Hortonworks as Sr. Director of Engineering on the Enterprise Security Team. His contributions include Apache Ranger features such as the audit framework, stack model, tag-based policies, and masking and row-filter policies, and Apache Atlas features such as V2 APIs and search enhancements. Prior to Hortonworks, Madhan was at Oracle, where he worked on the development of a security access management suite and governance and real-time fraud detection/prevention products. Prior to Oracle, he was with Bharosa Inc., responsible for the development of a real-time fraud detection solution for financial institutions, healthcare, and eCommerce.
Dave is an Enterprise Software Architect with over twenty-five years of technical leadership in the telecommunications, financial services, and healthcare domains. Dave’s diverse experience ranges from engineering event-based and rule processing systems at “PaaS” (Platform as a Service) scale to building an autonomous-agent workplace simulation engine. At Comcast, Dave is leading the end-to-end ingest, compute, and machine learning pipeline architectures for supporting Customer Experience Big Data applications.
Jeff is a software engineer and cloud architect. He is a committer and PMC member on Apache OpenNLP. Jeff currently works on natural language processing pipeline projects and resides outside of Morgantown, WV.
At Partners & Co., Eric Wolok specializes in the sale of commercial real estate in Chicago. Partners & Co. uses open source tools such as emacs, sed, awk, Apache NiFi and Apache Spark to identify, track and facilitate unique investment opportunities for their clients.
Dr. Sanjian Chen is a data science expert with deep knowledge in scalable machine learning algorithms. He has developed cutting-edge data-driven modeling techniques and autonomous systems in both academic and industry settings. He designed data-analytics solutions that drove numerous high-impact business decisions for multiple Fortune 500 companies across several industries, including retail, banking, automotive, and telecommunications. He is currently working on building cutting-edge cloud-based AI engines for high-performance distributed database systems that support scalable data analytics in multiple business areas. Dr. Chen is a frequent invited speaker at top international conferences, including the Strata Data Conference (San Francisco, London), the IEEE Cyber-Physical Systems Week (Chicago), the IFAC conference on Analysis and Design of Hybrid Systems (Atlanta), and IEEE International Conference on Healthcare Informatics (Philadelphia, Dallas).
Dr. Chen received his Ph.D. in Computer and Information Science at the University of Pennsylvania. He received two IEEE Best Paper Awards (IEEE RTSS 2012 and IEEE ISORC 2018). He has published over 25 papers in top journals and conferences, including 2 articles published in the Proceedings of IEEE (IF=9.1). He has served as an invited reviewer for numerous top international journals and conferences, e.g., the IEEE Design & Test, IEEE Transactions on Computers, ACM Transactions on Cyber-Physical Systems, IEEE Transactions on Industrial Electronics, IEEE RTSS conferences, and ACM HSCC conference.
Sridhar is an Enterprise Architect delivering high-impact IT solutions with cross-functional execution. He brings many years of application programming experience in diverse industries including Retail, Healthcare, Manufacturing, Utilities and Telco. His experience includes building and managing operations for multi-tenant Hadoop clusters consisting of over 500 nodes and growing, where he focuses on optimized and stable clusters, proactive maintenance and efficient operations.
I am currently working for Hortonworks as a Senior Software Engineer focused on data management products. I actively contribute to the Hortonworks DataPlane Services platform and Hortonworks Data Lifecycle Manager. Prior to Hortonworks, I worked at Informatica on the Intelligent data warehouse and big data platform using Hadoop, Hive, and Teradata connectors. Prior to Informatica, I worked at Teradata on Data Movement products such as Teradata Parallel Transporter and Teradata connector for Hadoop.
Niru Anisetti is the Director of Product Management at American Express, building a true Enterprise Hybrid cloud platform service for a global customer base. In her previous roles, she worked at Hortonworks/Cloudera, IBM, Intuit and Yahoo, among other companies, building products not only to generate revenue but also to change people's lives for the better. She can be reached at email@example.com.
Nandakumar Vadivelu is a Senior Software Engineer at Cloudera and an Apache Hadoop committer. He has been working on Hadoop for over 5 years and has contributed to HDFS and Ozone. Before Cloudera, he worked at Ericsson, where he was part of the team that migrated a telecom data warehouse from Oracle to the Hadoop ecosystem.
Experienced software development professional with strong exposure to various big data technologies. Skilled in Hadoop ecosystem components (HDFS, MapReduce, Pig, Hive, Sqoop), Cassandra, Spark, Core Java, Scala, relational databases and data warehousing, with good skills in various SDLC methodologies.
Experienced software development professional with strong exposure across big data technologies including HDF, Spark, HBase, and Phoenix, and expertise in building end-to-end data warehousing/BI solutions.
Kamil is a technology leader in the large scale data warehousing and analytics space. He is CTO of Starburst, the enterprise Presto company. Prior to co-founding Starburst, Kamil was the Chief Architect at the Teradata Center for Hadoop in Boston, focusing on the open source SQL engine Presto. Previously, he was the co-founder and chief software architect of Hadapt, the first SQL-on-Hadoop company, acquired by Teradata in 2014.
Kamil began his journey with Hadoop and modern MPP SQL architectures about 10 years ago during a doctoral program at Yale University where he co-invented HadoopDB, the original foundation of Hadapt’s technology.
Kamil holds an M.S. in Computer Science from Wroclaw University of Technology as well as an M.S. and an M.Phil. in Computer Science from Yale University.
Dipti Shankar is a Ph.D. Candidate at the Department of Computer Science and Engineering at The Ohio State University. She is currently a Graduate Research Associate at the Network-Based Computing Lab (NOWLAB) working under Dr. Dhabaleswar K. (DK) Panda and Dr. Xiaoyi Lu. Her research interests include high-performance networking and storage media for Big Data middleware, including Remote Direct Memory Access (RDMA)-aware designs, non-volatile memory technologies, and memory-centric storage systems. At NOWLAB, she has been assisting with the research and development of RDMA-based accelerations for Apache Spark, Apache Hadoop, and Memcached, which are publicly available at http://hibd.cse.ohio-state.edu. More details about Dipti are available at http://web.cse.ohio-state.edu/~shankar.50/.
Carolyn Duby is a Solutions Engineer and Cyber Security SME at Hortonworks, where she helps customers harness the power of their data with Apache open source platforms. Previously, she was the architect for cybersecurity event correlation at SecureWorks. A subject-matter expert in cybersecurity and data science, Carolyn is an active leader in the community and frequent speaker at Future of Data meetups in Boston, MA, and Providence, RI, and at conferences such as Strata Data Conference, Dataworks Summit, Open Data Science Conference and Global Data Science Conference. Carolyn holds an ScB (magna cum laude) and ScM from Brown University, both in computer science. She is a lifelong learner and recently completed the Johns Hopkins University Coursera Data Science Specialization.
John has a wide range of experience. He led the team that built the first release of an integrated customer service agent experience (UI), architected a solution to map SOA dependencies by watching network traffic, and helped design an analytics platform as an architect in the Enterprise Data Warehouse for a large travel group. He designed and led the team that developed next generation SEO tooling at scale for the largest office supplier in the country and is now working to help T-Mobile rethink the way they secure their network by automating response to attacks and enabling cyber incident responders to do their job with as little friction as possible.
Terry Padgett is an accomplished Hadoop Systems Architect, with over 8 years of hands-on installation, integration and development with Hadoop technologies. Terry also has extensive experience in the development and application of advanced information technologies, providing software project leadership, software architecture development and assisting the customer in the application of technologies to provide capabilities and solve pressing problems. A seasoned technical lead and software developer, Terry is experienced with multiple programming languages, among them Java and C, with application throughout the entire software development lifecycle.
Dr. Nitin Naik is an Information Technology leader with more than 25 years of experience designing systems to address business problems and deliver services in education, financial and R&D enterprises. He presently serves as the Chief Technology Officer at the US Census Bureau, US Department of Commerce. He is responsible for the development of requirements, architecture, and engineering of the systems that support the data collection, processing, management, and publication of 100+ surveys conducted by the Bureau in the decennial, demographic, and economic arenas nationally and internationally. He is also responsible for developing and maintaining the technology standards, IT roadmaps and the innovation portfolio for the Bureau. One of his primary focuses is designing the next-generation platforms for statistical data modernization and implementing a highly scalable data lake supporting the adoption of massively parallel processing data analytics with disclosure avoidance and security features.
I am currently an Engineering Manager at Uber, where I am a member of the Hadoop Platform team working on large scale data ingestion and dispersal pipelines and libraries leveraging Apache Spark. I was previously the tech lead on the metrics team at Uber Maps, building data pipelines to produce metrics that help analyze the quality of our mapping data. Before joining Uber, I worked at Twitter as an original member of the Core Storage team building Manhattan, a key/value store powering Twitter's use cases. I love learning anything about storage and data platforms and distributed systems at scale.
Surekha Saharan is a Druid Committer and Software Engineer at Imply. Previously, she worked at a cloud startup and at Cisco Systems, where she prototyped, architected and implemented large scale systems. She holds an MS in Computer Science from the University of Southern California and a BS in Computer Engineering from National Institute of Technology, India.
Benjamin Hopp has been involved in architecting big data and streaming data solutions for companies of all sizes. Currently, he is a Solutions Architect with Imply, where he helps organizations deploy and manage Apache Druid solutions. Previously, he worked as a Senior Systems Architect with Hortonworks specializing in streaming data use cases using HDF and Apache NiFi.
Dr. Zhong Wang is a career computational biologist and group leader for genome analysis at the DOE Joint Genome Institute (JGI); he is also an adjunct professor at the University of California at Merced. He received his Ph.D. in Cell Biology from Duke University in 2004. He did his postdoc in the Institute of Genome Science and Policy at Duke University before becoming a research scientist at Yale University in 2008. He joined the DOE Joint Genome Institute in 2009 and established his independent research in transcriptomics, metagenomics, and big data analytics. Dr. Wang has published over 30 high-quality papers, including several in Science and Nature. More information about his research can be found at http://jgi.doe.gov/our-science/scientists-jgi/genome-analysis/
As a former retail and consumer goods executive and more recently as a business strategy consultant and solution provider, Brent has extensive experience working with a variety of retail and consumer goods companies to provide thought leadership and help them to align strategic business objectives with technology and analytic solutions to create a differentiated competitive advantage in the marketplace.
He has an extensive track record of imagining, designing and executing high impact business solutions, driving innovation and transformation for retail and consumer goods organizations. Brent is passionate about analytics, emerging technologies, consumer behavior, collaborative supply chains and retail transformation.
As General Manager of Retail and Consumer Goods Solutions at Hortonworks, Brent is responsible for driving the solution vision and go-to-market strategies with each segment. As industry leaders increasingly invest in Big Data Analytics to help drive transformation within their organizations, Brent engages globally to share, discuss, give keynote talks, and facilitate workshops that help define and create solutions to drive next-generation insights and positive business outcomes across the value chain.
I am a data scientist with Miner & Kasch, a data science consulting firm. I specialize in developing automated solutions for our clients using machine learning, specifically in the domains of computer vision and natural language processing. Additionally, I lead the deep learning training sessions that Miner & Kasch holds.
Across a variety of domains I have successfully applied deep learning to computer vision problems involving image classification, object detection and segmentation. For Natural Language Processing tasks I have created neural information retrieval systems, semantic similarity search engines, and question answering systems. My favorite machine learning techniques are representation learning methods that result in surprising and useful latent variables that facilitate higher level tasks.
Nitin Khandelwal is a Staff Engineer at Qubole. He has worked on a variety of projects, such as adding encrypted communication for ephemeral cluster nodes running in the cloud, providing Hive as a multi-tenant service, and autoscaling. He has contributed significantly to optimizing the Tez engine for ETL workloads by adding features such as workload-aware autoscaling, fault tolerance, and effective use of spot nodes.
Previously, Nitin was working with Microsoft on VPN Site-to-site gateway service which forms the backbone of Microsoft Azure Stack's network.
Nitin completed his Master's in Computer Science at IIIT-Hyderabad. His main areas of focus there were distributed computing, databases and networks.
Shreya Bhatia is a Member of Technical Staff at Qubole. She works on the Hive stack and has been part of projects such as providing Hive as a service on a cloud-agnostic platform, building a metrics and alerting solution for HiveServer2 and stabilizing it under highly concurrent load, and performance analysis of MapReduce on YARN in the Qubole stack.
She completed her Master's in Computer Science at Stony Brook University, New York in 2016. Previously she worked in India with InfoEdge (Naukri.com) as part of the Search Team, building extraction systems such as a resume/email parser and a job crawler.
Ian Brooks holds a Ph.D. in Computer Science from University of North Texas, and his dissertation focused on virtual teams, leadership, and predictive analytics. He is committed to improving his craft, and he has a great passion for science, data, and computing. Currently, Ian is a member of the Public Sector team at Hortonworks, and he recently relocated to Washington DC. When he isn't stressing over the details, Ian enjoys mountain biking, kettlebells, and beer making.
Pradeep is a Senior Big Data Engineer at Hotels.com in London where he builds and manages cloud infrastructure and core services like Apiary. Pradeep has worked in the big data space for the last 7 years, building large scale platforms.
Elliot is a principal engineer at Hotels.com in London where he designs tooling and platforms in the big data space. Prior to this Elliot worked in Last.fm’s data team, developing services for managing large volumes of music metadata.
Kai Liu is a Senior Program Manager in the AI and Research group at Microsoft. He has 8 years of experience in data-driven engineering, big data platforms and AI infrastructure for the Office and Bing product families. He led his team to create a service health portal for SharePoint Online, introduce a distributed log collection and storage system for Exchange Online, publish curated data sets and key business metrics, and enable sub-hour experimentation in Office 365.
Currently he is working on the next generation of Big Data and Deep Learning platform for Bing based on Open Source technologies.
Sujit Somandepalli is a Storage Solutions Engineer with Micron Technology. He worked with various storage platforms, including iSCSI, FC and DAS solutions, before coming to Micron Technology to focus on NoSQL databases, Big Data architectures, and distributed storage systems.
His current focus is performance analysis and tuning for SSD products.
Sujit has a Master of Science degree in Computer Science from North Carolina State University.
Anthony is a Staff Software Engineer working in the Hadoop Dev team at LinkedIn. He currently works on machine learning infrastructure. Previously, he worked on LinkedIn’s data access layer (Dali) and workflow scheduler (Azkaban).
Sanjeev Koranga leads PayPal’s instrumentation and analytics platform team. He is responsible for making PayPal’s behavioral analytics self-serve and for designing and developing systems that turn data into meaningful insights.
Shobana is a Big Data enthusiast, passionate about data analytics and solving complex and interesting problems.
She leads the Tracking Platform team at PayPal, which is responsible for collecting all of PayPal's behavioral server-side events, which are then enriched and processed for various downstream analytics.
She has experience building a real-time analytics platform that collects and processes more than 15 billion events a day using a technology stack that includes open source software such as Squbs, Kafka, Hadoop, Spark, Presto, and Druid, as well as Teradata.
Sunil Govindan has been contributing to the Apache Hadoop project since 2013 in various roles: Hadoop contributor, Hadoop committer, and member of the Project Management Committee (PMC). He is a Staff Software Engineer on the YARN team at Hortonworks. His main contributions are YARN scheduling improvements such as intra-queue resource preemption, support for multiple resource types in YARN with resource profiles, and absolute resource configuration support in queues. He also drove community efforts to improve the YARN UI for a better user experience. Before Hortonworks, he worked at Juniper on a custom resource scheduler. Prior to that, he was with Huawei, where he worked on platform and middleware distributed systems, including the Hadoop platform. He loves reading books, is an ardent music lover, and is passionate about go-green efforts.
Weiwei is a Software Engineer at Cloudera and an Apache Hadoop committer and PMC member. He has been working on Hadoop for over 8 years and has contributed to both HDFS and YARN. His work mainly includes storage features in Ozone and scheduling features in YARN such as placement constraints, async scheduling, and CSI adoption. He is now focused on adding scheduling features to Kubernetes in order to support both batch and service workloads. Before Cloudera, he worked on Alibaba’s data infrastructure team, gaining experience evolving a big data platform at a scale of 10k+ nodes. Prior to that, he worked at IBM for several years as one of the early members of the BigInsights project.
Leader of the HDFS/ZooKeeper project at Xiaomi, focused on distributed filesystems. Six years of experience with large-scale distributed storage systems.
Owen O'Malley is a co-founder and technical fellow at Hortonworks, a rapidly growing company (25 to 1,000 employees in 5 years), which develops the completely open source Hortonworks Data Platform (HDP). HDP includes Hadoop and the large ecosystem of big data tools that enterprises need for their data analytics. Owen has been working on Hadoop since the beginning of 2006 at Yahoo, was the first committer added to the project, and used Hadoop to set the Gray sort benchmark in 2008 and 2009. In the last 8 years, he has been the architect of MapReduce, Security, and now Hive. Recently he has been driving the development of the ORC file format and adding ACID transactions to Hive. Before working on Hadoop, he worked on Yahoo Search's WebMap project, which was the original motivation for Yahoo to work on Hadoop. Prior to Yahoo, he wandered between testing (UCI), static analysis (Reasoning), configuration management (Sun), and software model checking (NASA). He received his PhD in Software Engineering from University of California, Irvine.
Srikanth Venkat is currently responsible for Security & Governance portfolio of products at Hortonworks which include Apache Knox, Apache Ranger, Apache Atlas, Platform wide security and Hortonworks DataPlane Service. Prior to Hortonworks, Srikanth has held multiple roles in areas of cloud services, marketplaces, security, and business applications. His experience includes leadership across Product Management, Strategy and Operations, and Technical Architecture with broad experience in startups to global organizations including Telefonica, Salesforce.com, Cisco-Webex, Proofpoint, Dataguise, Trilogy Software, and Hewlett-Packard. Srikanth holds a PhD in Engineering with a focus on Artificial Intelligence from University of Pittsburgh, and an MBA in General Management from Indiana University and a Masters in Global Management from Thunderbird School of Global Management. Srikanth is a Data Sciences & Machine Learning hobbyist and enjoys tinkering with Big Data technologies.
Naveen is an Engineering Manager with 7+ Years of Data Engineering, Data Science & Analytics experience across Retail, Finance & Marketing industries. In his current role at WalmartLabs, Naveen leads Walmart's Customer Experience and Store Marketing engineering team and one of his key initiatives in the last few months has been to bring all the various data assets at Walmart under one single data lake platform. He has worked across various database technologies throughout his career and has been extensively working on the entire Hadoop stack at WalmartLabs. Over the past few years, he led several teams building end to end data and visualization platforms and also worked on evaluating and implementing multiple query acceleration and SQL on Hadoop layers such as Druid, LLAP, Spark, Kinetica, SAP HANA etc. to power Walmart's BI platforms.
Speaking & Presentation Experience:
@NWA IISE conference: Topic: 'Data Cafe: Enabling Real Time Insights Through Visualization'
@Bentonville Data Science Meetup: ' Data Cafe: Ask Me Anything - Bot Framework using NLP'
Naveen's work at WalmartLabs was featured in Forbes as one of "The Most Practical Big Data Use Cases Of 2016": https://www.forbes.com/sites/bernardmarr/2016/08/25/the-most-practical-big-data-use-cases-of-2016/#1a1206531625
I am Abhishek Gupta, with around 4 years of professional experience in the IT industry, currently working at Walmart Labs as a Software Engineer 3 - Tech. At Walmart, I work on the Data Lake Initiative, practicing the principles of different pillars of data solutions such as Data Architecture, Data Engineering, Metadata Management & Data Governance. From a tools and technologies standpoint, I'm an active user of Hadoop, Hive, Spark, Spring Boot, etc.
Prior to this, I worked for more than 2 years in data warehousing and business intelligence at AIG (American International Group). I earned my Master's in Management Information Systems with a specialization in Data Analytics from the University of Arizona. I enjoy public speaking and am starting to step into knowledge sharing through talks and sessions. Recently, I had the chance to be a host at the "Open Data Science Conference West 2018".
Always have a vision!
Senior Technologist at American Water working on HDP & HDF.
Experienced Technologist with a demonstrated history of working in the utility space.
Expert in Hadoop, Hive, Spark, and NiFi.
"Creating something unique involves a different kind of plan altogether, combining the determination of a brave pioneer, the innovation of a free thinker, and the outlook of a visionary."~H. Rubin.
Pioneer in area of applying data science in well-health; Freethinker in capitalizing on the power of pull; and demonstrated visionary by implementing a Data Science Institute at a small private HBCU medical school.
Entrepreneurial, innovative and dynamic soft skills to foster a vision, establish and achieve strategic initiatives.
Mayank Kejriwal is a research scientist at the University of Southern California's Information Sciences Institute (ISI), and a research assistant professor in the Department of Industrial and Systems Engineering. He received his Ph.D. from the University of Texas at Austin. His dissertation involved Web-scale data linking, and in addition to being published as a book, was recently recognized with an international Best Dissertation award in his field. His research is highly applied and sits at the intersection of knowledge graphs, social networks, Web semantics, network science, data integration and AI for social good. He has contributed to systems used by both DARPA and law enforcement, and has active collaborations across academia and industry. He is currently co-authoring a textbook on knowledge graphs (MIT Press, 2018), and has delivered tutorials and demonstrations at numerous conferences and venues, including top academic venues such as KDD, AAAI, and ISWC, and industrial venues. He is currently serving as general chair of the ACM K-CAP conference in 2019, and is co-editing a special issue on knowledge graphs in the Semantic Web Journal. He was awarded a Key Scientific Challenges award in 2018 by the Allen Institute for Artificial Intelligence, and was recently named a Forbes Under 30 Scholar. He has also been nominated as a 2019 Forbes 30 Under 30 in the Science category.
Sridhar is a technology leader and currently responsible for building a Finance data lake in Walmart. He is Sr Manager II Engineering, Global Data Analytics Platform, Walmart. Before working in the Data Analytics area, Sridhar led multiple HR implementations in Walmart. Previously, he worked in Deloitte Consulting, Hyderabad.
Sridhar has 15+ years of IT experience in Retail, Healthcare and Finance domains.
@NWA Arkansas IISE Chapter Conference on Data Cafe: Enabling Real-Time Insights Through Visualization
2+ years of IT experience in data and retail domain. Currently a Software Engineer - II at Walmart Labs. Completed MS in Computer Science with a specialization in Data Science from UT Dallas.
Michael Ger has over 25 years of experience working in industry and Information Technology strategy roles. He has deep cross-industry knowledge in product development, manufacturing, supply chain and customer experience related business processes. As General Manager of Manufacturing and Automotive Industries at Hortonworks, Mike is responsible for driving the solution vision and go-to-market strategies within each industry segment and works with industry leaders to drive next-generation business insights through Big Data Analytics. Prior to joining Hortonworks, Mike worked at Oracle for over 20 years as their Automotive Industry lead, at A.T. Kearney as an Automotive Management Consultant and at General Motors (Saturn Division) as a Product Engineer.
DeepIQ is a seasoned Cloudera partner and one of Houston's fastest growing startups, accelerating time to value for the Oil, Gas, Chemical and Energy Industries. DeepIQ team members have solved some of the industry's toughest challenges and authored over 100 publications on Machine Learning, Deep Learning and Optimization, creating new opportunities for businesses seeking competitive advantages from advanced analytics and AI.
Jeff has been an IT consultant for Deloitte and Sogeti working at Houston's top companies. He was an architect at Dell and Schlumberger. Jeff currently runs the services practice at DeepIQ.
Sanjay is a telecom industry veteran with extensive experience in the strategy and execution of next generation data-centric industry solutions for enhancing customer experience, optimizing network operations and increasing revenue generation through digital transformation.
Sanjay currently leads the global communications & media business at Hortonworks, helping communication service providers leverage Hadoop and NiFi to transform their data into a force of business growth and competitive differentiation and to drive data-centric solutions for the connected world and for Industrial IoT. Previously, he held executive roles leading the global telecom industry business, solutions, and strategy at VMware, Pivotal, Progress Software, Savvion, and TMNG, and has helped drive business transformation, end-to-end architecture and new business initiatives at Bell Canada, Level3, AT&T Canada, Iowa Telecom, ETB, ATT/Ameritech, Wingcast, and other global service providers.
Paul Gibeault has a B.S. degree in computer science from the University of Idaho and has spent most of his career developing software frameworks and tools to enable the automation of semiconductor equipment. He is currently a Big Data Solution Architect with the IT group at the third largest memory manufacturer in the world, Micron Technology, headquartered in Boise, Idaho. For the past five years he has focused on the design and implementation of an automated approach to create Micron’s Global Data Warehouse.
Focused on architecting ingestion feeds through Apache NiFi and loading data into Hadoop and Teradata.
In my spare time I like to network my house and build fun things with IoT devices.
Lohit is part of the Hadoop and Log Management team at Twitter. He has been concentrating on scaling the Hadoop FileSystem, Hadoop Resource Manager, and Log Ingestion and Processing pipelines at Twitter. Previously he worked at a few startups building scalable file systems and was also part of the Hadoop team at Yahoo! when it was open sourced. He has a Master's degree in Computer Science from Stony Brook University.
Vrushali Channapattan is an active Apache Hadoop Committer & PMC member who currently manages the Product Analytics team at Twitter. She previously worked on the Hadoop team at Twitter, where she focused on ensuring that Hadoop could keep meeting Twitter's rapidly expanding storage and computation needs. In past roles, she has also worked with Intuit, Yahoo!, Oracle, Persistent Systems and Tata Institute of Fundamental Research in India.
Henry Sowell is Cloudera Technical Director in the Public Sector.
In this capacity, Mr. Sowell leads an engineering group responsible for the technical architecture and engineering of Big Data solutions supporting missions across the Intelligence Community, Department of Defense, Federal Civilian Government Agencies, and State, Local, and Higher Education institutions, helping improve speed to mission.
Prior to joining Cloudera, Mr. Sowell used several technologies, including Apache Hadoop, to protect the nation in support of the FBI’s counterterrorism mission. In addition to supporting the counterterrorism mission, he leveraged these technologies to support cross-division law enforcement advancements with the FBI’s Cyber Division. Mr. Sowell enlisted in the United States Marine Corps in 2003. He served with distinction as a decorated combat veteran, having earned the Bronze Star with Valor for his actions in Iraq.
Leo Garciga serves as the Joint Improvised-Threat Defeat Directorate (JD) Chief Technology Officer under the Defense Threat Reduction Agency (DTRA). In his role, he provides leadership and oversight of Mission Information Technology services and personnel that directly contribute to the implementation of the DTRA mission and its support to the warfighter, Department of Defense (DoD), Combatant Commanders, Coalition partners, and the Intelligence & Interagency organizations.
Mr. Garciga is also the DTRA JD senior information technology advisor, who discovers and rapidly implements new technology and innovation to counter threat networks, improvised threats and improvised explosive devices, to support counter-terrorism and counter-insurgency operations, and to prevent battlefield surprise.
He advocates and spearheads efforts across DoD, the Intelligence Community, US Government Agencies, academia and industry to integrate a myriad of Research and Development work to rapidly introduce new information technology that provides immediate operational impacts for the warfighter and the nation. His efforts have resulted in continuous enhancements to Catapult, a rapid response data analytic platform, to improve situational awareness for thousands of users. He made JIDO (JD) an early adopter and leader in DoD in the implementation of Secure Dev Ops, which unified security, software development and operations to automate processes for innovation in information technology. He is also a key contributor to DoD's understanding of the potential of artificial intelligence and machine learning for future missions.
Mr. Garciga has a BA in Mechanical Engineering Technology and is a certified Information Technology professional. He has also served in a variety of roles in DoD, including active duty service in the US Navy, the Combatant Commands, and the Intelligence Community.
Suresh Yadagotti Jayaram is the Senior IT Application Architect for Florida Blue, Florida’s Blue Cross and Blue Shield company, which is the largest health insurance provider in the state. His extensive experience includes software architecture and engineering leadership roles at multiple global firms including HP, PayPal, Tata, and Deloitte. Suresh is passionate about business intelligence and implementing business architecture to reflect strategies that support elite IT departments, regardless of industry. He holds a master’s degree in Innovations and Entrepreneurship from HEC, Paris.
Kishore Huilgol is the Senior IT Manager for Florida Blue, Florida’s Blue Cross and Blue Shield company, which is the largest health insurance provider in the state. His extensive experience includes leading the Innovation team to enable technologies and data for the enterprise, as well as architecture design and implementation of CRM and BPM solutions at Florida Blue.
Michael is a frequent speaker at major conferences, such as HIMSS, the BI Summit, TDWI, Oracle Open World, Independent Oracle Users Group (IOUG) Collaborate, and Oracle Development Tools User Group (ODTUG). He has written articles for the Journal of Management Excellence, produced the white paper Understanding an OLAP Solution from Oracle for Oracle Corporation, and coauthored Oracle Data Warehousing Unleashed (published by Sams). Michael is the lead author for Oracle Essbase & Oracle OLAP: The Guide to Oracle’s Multidimensional Solutions (published by Oracle Press) (9th Best Seller at Oracle Open World - 2011).
Praveen Kanumarlapudi is a Lead Data Engineer with Aetna’s (a CVSHealth company) Global Security team. Prior to his time at Aetna, Praveen worked on big data solutions for Apple and Bank of America.
John has a degree in Physics from Rutgers University, where he conducted research in condensed matter physics. He went on to become the Supervisor of Testing and Assembly for Leonardo Helicopters (then AgustaWestland) Philadelphia. More recently, John held various roles at American Water, where he worked on emerging technology and then transitioned to data engineering, where he was in charge of the HDF and HDP platform. Before joining Cloudera, John transitioned to the Autonomous Intelligence team, where he was in charge of integrating the platforms to allow data scientists to work with various types of data. Currently, John works as a Solutions Engineer for Cloudera supporting the Mid-Atlantic region.
Nisha Muktewar is a Research Engineer at Cloudera Fast Forward Labs, where she researches the latest ideas in machine learning, builds prototypes that showcase these capabilities when applied to real-world use cases, and advises clients in this space. Prior to joining Cloudera, she worked as a Manager in Deloitte’s Actuarial, Advanced Analytics & Modeling practice, leading teams in designing, building, and implementing predictive modeling solutions for pricing, consumer behavior, marketing mix, and customer segmentation use cases for insurance and retail/consumer businesses. She holds a Bachelor of Engineering degree in computer science from University of Pune, India.
Justin leads Cloudera's Fast Forward Labs team. Justin is a career data professional and Data Science leader with experience in multiple industries and companies. Previously, Justin was the head of Applied Machine Learning at Fitbit, the head of Cisco’s Enterprise Data Science Office and a Big Data Systems Engineer with Booz Allen Hamilton after serving as a Marine Corps Officer, with a focus in Systems Analytics and Device Intelligence. Justin is a graduate of the US Naval Academy with a degree in Computer Science and the University of Southern California with a Master’s Degree in Business Administration and Business Analytics.
Alice Albrecht leads our strategic engagements and advising at Cloudera Fast Forward Labs. She is passionate about helping organizations see a return on their investment in data and helping them build the future. Previously she was a research engineer at Cloudera Fast Forward Labs where she spent her days researching the latest and greatest in machine learning and artificial intelligence and bringing that knowledge to working prototypes and delivering concrete advice for clients. Prior to joining Cloudera, Alice worked in both finance and technology companies as a practicing data scientist, data science leader, and a data product manager. In addition to helping organizations harness the power of machine learning, Alice is passionate about mentoring and helping others grow in their careers. Alice holds a PhD from Yale in cognitive neuroscience where she studied how humans summarize sensory information from the world around them.
Ferd Scheepers is the Chief Information Architect of ING. Ferd has been driving ING’s journey to becoming a data-driven company for the last 4 years. He has published on Data Lakes and is a frequent speaker at both major vendor conferences and open source summits. Currently he is championing the Apache Atlas open metadata initiative. Passionate about data, both the opportunities and the risks, Ferd loves to share his vision and ideas on what data will mean for both companies and individuals.
Robert is an AI evangelist at Cloudera and has over 12 years of experience working on various projects related to Artificial Intelligence, Robotics, IoT, Enterprise & Embedded Software. His primary focus at Cloudera is building communities around IoT, Big Data and Data Science, and enabling Enterprises to accelerate adoption of cutting edge open-source technologies (from Edge to AI).
Dave Shuman is Managing Director: Connected Industries & Smart Cities at Cloudera, the Enterprise Data Cloud company. He advises customers globally as they adopt cutting-edge solutions leveraging sensor technologies (IoT), Machine-Learning (ML), and connected data across infrastructure and industries. Prior to joining Cloudera, Dave built and ran Vision Chain, an innovative data warehousing & insights start-up serving as Chief Operations Officer, VP of Field, and VP of Product over his 11 year tenure. Previously Dave developed ecommerce applications and business processes at the dawn of the ecommerce era with enews.com (a Barnes and Noble company) managing software development, operations and customer analytics. Dave has an extensive background in the Apache and Eclipse open source ecosystems, IoT architecture and deployment, business intelligence applications, database architecture, logical and physical database design and data warehousing. He holds an M.B.A. with a concentration in Information Systems from Temple University and a B.A. from Earlham College.
Jim Kapsis is the co-creator and host of Technopolis, the new podcast from CityLab, on how technology is disrupting, remaking and sometimes overrunning our cities. Jim is also the founder of The Ad Hoc Group, an advisory firm that helps startups succeed in complex regulated markets, such as energy, mobility, and smart cities. He has been a Senior Advisor to Sidewalk Labs, Alphabet’s urban venture, and spent six years building and leading the global regulatory team at Opower, Inc. Before entering the private sector, Jim was an Energy Advisor at the U.S. Department of the Treasury where he helped broker the Copenhagen Climate Accord in 2009. Jim has also worked in the State Department, Defense Department and in Congress. Jim earned a B.A. in Political Science from Haverford College and an M.P.A. from the Woodrow Wilson School at Princeton University. Jim lives in Alexandria, Virginia with his wife and three daughters. He chairs the city’s Environmental Policy Commission and serves on the board of its transit authority.
Through a decade of virtualisation and launching two startups, Dan has been nerdy on three continents and in every line of business from UK bulge bracket banking to Australian desert public services.
After being impressed by the Hortonworks commitment to Open Source and integrated Hadoop implementation in his last startup, he came onboard in technical pre-sales and automated presenting the corporate sales pitch using the company’s technology in his first week.
In the following years he’s taken on being the Streaming SME for the EMEA region and TAM on key accounts, contributing a Python client for Apache NiFi, and leading the Field Engineering effort to fully automate the company Demo estate.
Now at Cloudera, Dan leads a global team of specialist Field Engineers focused on all things Data Streaming, based out of London, UK.
Eyad is a Sr. Solutions Engineer at Cloudera. He helps clients define their Data Management strategies and guides them through their Big Data and Streaming Analytics journeys.
Nathan began his career at the Australian Department of Defence, working in big data and using NiFi. In 2018, Nathan moved to the US to begin a new role as a security engineer at Hortonworks on the NiFi Hortonworks Dataflow team.
Alex Zeltov is a Global Blackbelt TSP in Big Data and Advanced Analytics at Microsoft with over 17 years of industry experience in Information Technology and most recently in Big Data and Predictive Analytics. Prior to joining Microsoft Alex worked as a Big Data Solutions Architect at Hortonworks.
Nagapriya (Priya) Tiruthani is a product manager for IBM Db2 Big SQL. Her focus has been on defining the product roadmap, actively engaging with customers to understand their use cases, and making sure their needs are met in the product while keeping up with what's going on in the Hadoop ecosystem. Prior to this role, she was a UI and backend Java developer working on many of the Db2 tools. She holds a Master of Science in Computer Engineering from San Jose State University, San Jose, USA and a Bachelor of Engineering in Electronics from Bharathidasan University in India.
Dave leads alliances for the IBM Storage Business Unit and helps with solution building, sales enablement, and solving customers' data challenges at scale. Prior to this role, Dave was the Global Sales Executive for IBM Spectrum Scale. Dave has spent his career leading business development, sales and marketing teams in emerging technologies like Workstations, Technical Computing, Linux, Cloud, and parallel file systems. Dave resides in Houston, where he moved while working for Hewlett Packard leading HPC and Linux Business Development teams. Dave joined IBM via the acquisition of Platform Computing, where he was leading global alliances for OEMs and ISVs. Dave is a graduate of Queen's Engineering in Economics, Electrical Engineering, and their Executive MBA program.
At Pure, Brian enables customers’ journeys to high performance outcomes with Artificial Intelligence, Advanced Analytics, and Platform as a Service. Brian has served in a variety of roles, including heading up IT departments and creating and fostering technology alliances and business development initiatives for the data storage industry. He has shared his passion for technology as a business enabler through mentoring programs for new hires and by advising startups on IT strategy and value propositions to the business.
His work alter-ego is a labor of love; sharing his intellectual curiosity with the world through The Hot Aisle Podcast which was created to deliver content for the next generation of data center technologists and executives.
Ian McCulloh is the chief data scientist for Accenture Federal Services. His current work focuses on the application of artificial intelligence to improve democracy and government services. He also maintains adjunct faculty appointments at Johns Hopkins University in the Bloomberg School of Public Health and the Whiting School of Engineering. His most recent academic papers have been focused on the neuroscience of persuasion and implementation of artificial intelligence programs. He is the author of “Social Network Analysis with Applications” (Wiley: 2013), “Networks Over Time” (Oxford: forthcoming) and has published 55 peer-reviewed papers. Prior to joining Accenture, McCulloh served as associate professor at the Johns Hopkins University School of Public Health, senior lecturer in Hopkins’ Whiting School of Engineering and senior scientist at the University’s Applied Physics Laboratory. His latest research focused on strategic influence in online networks. McCulloh retired as a Lieutenant Colonel from the U.S. Army after 20 years of service with expertise in social network analysis, special operations and improvised explosive device forensics. During his military tenure, he founded the West Point Network Science Center and created the Army’s Advanced Network Analysis and Targeting (ANAT) program. In addition, he led interdisciplinary teams of scientists at Special Operations Command Central (SOCCENT) and Central Command (CENTCOM) to conduct global, data-driven, social science research to inform strategy for countering extremism and assessing military operations. McCulloh holds a Ph.D. and Master’s degree from the Carnegie Mellon University School of Computer Science, Master’s degrees in Industrial Engineering and Applied Statistics from the Florida State University, and a Bachelor’s degree in Industrial Engineering from the University of Washington.
Technology strategist and business solutions architect with a track record of driving strategic change through tactical execution and optimizing IT performance within start-up, turnaround, and established environments. Extensive and diverse IT background combined with a deep understanding of the intersection between technology, business, and operational needs. Progressive, strategic thinker and astute technology analyst, with a keen eye for the ever-changing technical landscape and evolving trends, helping organizations remain ahead of their competition.