This course introduces and demonstrates the components that make up the Hortonworks Data Platform (HDP) ecosystem. Apache Hive will then be explored at a more detailed level, including hands-on demos.
This is a technical overview of Apache Hadoop and Hive with hands-on exercises. It includes high-level information about the concepts, architecture, operation, and uses of HDP and the Hadoop ecosystem. The course also goes into more depth for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Hive.
Software developers, business and reporting analysts, and technical managers who need to understand the capabilities of Hadoop and build applications for it.
Students should be familiar with programming principles and have experience in software development. SQL knowledge is also helpful. No prior Hadoop knowledge is required.
This course will introduce the big data science workflow. Specifically, it will discuss how to move from working with small datasets to working with big data using Spark, Hive, and Zeppelin.
Big Data Science with HDP will cover all aspects of the data science workflow. Special focus will be given to transitioning from the single-machine Python scientific stack to the big data science stack of Hive, Spark, and Zeppelin.
Topics covered will include how to ingest, store, and munge data; data exploration and visualization; feature engineering and machine learning including supervised and unsupervised model building.
Developers, Analysts, and Data Scientists who are interested in learning how to use big data tools to do data science at scale.
Students should be comfortable with programming principles, have prior experience/exposure to statistical and/or computational modeling concepts, and preferably experience with SQL. No prior Hadoop knowledge is required.
Hortonworks is a leading innovator in creating, distributing and supporting enterprise‐ready open data platforms. Our mission is to manage the world’s data. We have a single‐minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. Our open Connected Data Platforms power Modern Data Applications that deliver actionable intelligence from all data: data‐in‐motion and data‐at‐rest. Along with our 1600+ partners, we provide the expertise, training and services that allow our customers to unlock the transformational value of data across any line of business. We are Powering the Future of Data.
Microsoft believes anyone should be able to get insights from Big Data. So, we bring the power of the cloud to Big Data, making it easier than ever to work with all data types. With Microsoft data solutions, everyone can bring Big Data business insights to life through advanced analytics and stunning visualizations – all powered by our enterprise-grade, flexible, and open cloud.
Hewlett Packard Enterprise is an industry-leading technology company that enables customers to go further, faster. With the industry’s most comprehensive portfolio, spanning the cloud to the data center to workplace applications, our technology and services help customers around the world make IT more efficient, more productive and more secure.
IBM is a globally integrated technology and consulting company headquartered in Armonk, New York. With operations in more than 170 countries, IBM attracts and retains some of the world’s most talented people to help solve technology problems and provide an edge for businesses, governments and non-profits. Innovation is at the core of IBM’s strategy. The company has reinvented itself through multiple technology eras and economic cycles, creating differentiating value for its clients. Today, as the IT industry is fundamentally changing at an unprecedented pace, IBM is much more than a “hardware, software, services” company. IBM is now emerging as a cognitive solutions and cloud platform company. Cognitive solutions powered by analytics and the cloud are the key to clients’ digital transformation. This transformation requires breakthroughs at every level of the enterprise IT foundation, from processors and computer design to storage, applications and analytics tools, networking and the integration layer. IBM solutions are built with open technologies and designed for mission-critical applications, offering a comprehensive platform for cognitive workloads.
The Oracle Cloud delivers hundreds of SaaS applications and enterprise-class PaaS and IaaS services to customers in more than 195 countries and territories while processing 55 billion transactions a day.
Teradata empowers companies to achieve high-impact business outcomes through analytics. With a powerful combination of industry expertise and leading hybrid cloud technologies for data warehousing and big data analytics, Teradata unleashes the potential of great companies. Partnering with top companies around the world, Teradata helps improve customer experience, mitigate risk, drive product innovation, achieve operational excellence, transform finance, and optimize assets. Teradata is recognized by media and industry analysts as a future-focused company for its technological excellence, sustainability, ethics, and business value.
Dell EMC, a part of Dell Inc., enables organizations to modernize, automate and transform their data center using industry-leading converged infrastructure, servers, storage and data protection technologies. This provides a trusted foundation for businesses to transform IT, through the creation of a hybrid cloud, and transform their business through the creation of cloud-native applications and big data solutions. Dell EMC services its customers – including 98 percent of the Fortune 500 – with the industry’s broadest, most innovative infrastructure portfolio from edge to core to cloud.
Attunity, voted Hortonworks ISV Partner of the Year, provides modern data integration software with change data capture technology that efficiently delivers data in real time and with no manual coding. Attunity software, serving half of the Fortune 100, non-disruptively replicates data from production sources such as Oracle, mainframe and SAP across database/data warehouse, data lake, streaming and cloud architectures. Attunity also accelerates data lake pipelines by automating the creation, updates and provisioning of analytics-ready data.
Pentaho, a Hitachi Group Company, is a leading data integration and business analytics company with an enterprise-class, open source-based platform for diverse big data deployments. Pentaho’s unified data integration and analytics platform is comprehensive, completely embeddable and delivers governed data to power any analytics in any environment. Pentaho has over 15,000 product deployments and over 1,500 commercial customers including ABN-AMRO Clearing, BT, Caterpillar Marine Asset Intelligence, EMC, Halliburton, and NASDAQ.
Unlock business potential from your Big Data faster and easier with SAP Vora. SAP Vora is an in-memory, distributed computing solution to run enriched, interactive analytics on both enterprise and Hadoop data, quickly and easily. SAP Vora is a complete, production-ready, fully integrated solution between the SAP HANA® platform and Hadoop environments – enabling high-performance, interactive bi-directional analytics across enterprise data in SAP HANA and data stored in Hadoop.
SAS is the global leader in analytics solutions and services, and the largest privately held software company in the world. Our innovative solutions – driven by a 26% reinvestment into R&D – help more than 83,000 customers around the globe make better decisions faster. Since 1976, SAS has provided businesses and government agencies with industry-leading solutions to help them transform their operations. Simply put, we help organizations turn large amounts of data into knowledge they can use.
Unico is operating on the cutting edge of big data solutions, working directly with major product vendors and enterprises to bring established as well as nascent experimental products to market. We help our customers all the way through their data journey, from initial discovery and exploring the potential value of data being generated by their organisation through to our data scientists generating near real-time.
Established in 1991, VSTECS Holdings Limited (“VSTECS”) is the largest technology product solutions and supply chain services platform in the Asia Pacific. In 2016, VSTECS achieved record high revenue of over USD 6.18 billion. VSTECS has five major business segments: supply chain services, finished products sales, components supply, enterprise systems services and IT value-added services. VSTECS’s product portfolio comprises 12 fields, including cloud computing, mobile devices, system equipment, software, information security, network infrastructure, data storage, computer components, internet of things applications, gaming, drones and virtual reality products. VSTECS also provides supply chain financing services. VSTECS has strategic partnerships with over 240 global top 500 technology companies as upstream vendors and over 40,000 downstream channel partners. VSTECS has 81 offices in nine countries, namely China, Thailand, Malaysia, Singapore, Indonesia, Cambodia, Myanmar, Laos and the Philippines.
We’re out to change the way people build technology. Among data storage companies, we defy industry conventions, and by pairing our partners and customers with our elegantly simple solutions we hope to forever change expectations of what’s possible from a data storage company.
Data is our heritage and has always been at the core of everything we do. Our mission is to enable our customers to use their data and analytics to build competitive advantage. Our expertise in data and analytics strengthens our ability to provide data driven solutions for our Digital and Customer Engagement services, aided by our expertise in Cloud & Technology.
ICC Sydney Convention Centre, Darling Drive, Sydney, New South Wales, Australia
+61 2 9215 7100
Novotel Sydney on Darling Harbour, Murray Street, Pyrmont, New South Wales, Australia
+61 2 9288 7180