Apache Hadoop is a freely distributed set of utilities, libraries, and a framework for developing and running distributed programs on clusters of hundreds or thousands of nodes. It is used to implement search and contextual mechanisms for many high-load websites, including Yahoo! and Facebook. It is written in Java and built around the MapReduce computing paradigm, in which an application is split into a large number of identical elementary tasks that run on the cluster nodes and combine naturally into a final result.
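As a rough illustration of the MapReduce paradigm described above, the sketch below counts words in plain Python on a single machine. It does not use Hadoop itself; the function names (map_task, shuffle, reduce_task, word_count) are hypothetical and chosen only for this example, and the grouping step stands in for the work a real framework would do between the two phases.

```python
from collections import defaultdict


def map_task(chunk):
    """Elementary map task: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key (hypothetical stand-in for the framework's shuffle)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Elementary reduce task: combine all values collected for one key."""
    return key, sum(values)


def word_count(chunks):
    """Run identical map tasks over every chunk, then reduce the grouped results."""
    pairs = [pair for chunk in chunks for pair in map_task(chunk)]
    return dict(reduce_task(key, values) for key, values in shuffle(pairs).items())
```

Each chunk is processed by the same small task, so on a cluster the chunks could be handled on different nodes; here they simply run one after another.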
Apress, 2017. — 140 p. — ISBN13: 9781484223697. Leverage Phoenix as an ANSI SQL engine built on top of the highly distributed and scalable NoSQL framework HBase. Learn the basics and best practices that are being adopted in Phoenix to enable a high write and read throughput in a big data space. This book includes real-world cases such as Internet of Things devices that send...
Addison-Wesley Professional, 2016. — 848 p. — ISBN13: 978-0134597195. In Expert Hadoop Administration, leading Hadoop administrator Sam R. Alapati brings together authoritative knowledge for creating, configuring, securing, managing, and optimizing production Hadoop clusters in any environment. Drawing on his experience with large-scale Hadoop administration, Alapati integrates...
Packt, 2018. — 482 p. Explore big data concepts, platforms, analytics, and their applications using the power of Hadoop 3. Apache Hadoop is the most popular platform for big data processing, and can be combined with a host of other big data tools to build powerful analytics solutions. Big Data Analytics with Hadoop 3 shows you how to do just that, by providing insights into the...
Sams Publishing, 2017. — 744 p. — (Sams Teach Yourself). — ISBN: 978-0672338526. Apache Hadoop is the technology at the heart of the Big Data revolution, and Hadoop skills are in enormous demand. Now, in just 24 lessons of one hour or less, you can learn all the skills and techniques you'll need to deploy each key component of a Hadoop platform in your local environment or in...
O’Reilly, 2016. — 288 p. — ISBN: 978-1-491-91370-3. Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you’ll focus on particular analyses you can build, the data...
Wrox, 2016. — 206 p. — ISBN10: 111926717X. — ISBN13: 978-1119267171. Professional Hadoop is the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings. Written by an expert team of certified Hadoop developers, committers, and Summit speakers, this book details every key aspect of Hadoop technology to enable optimal...
BPB Publications, 2018. — 721 p. The book covers the latest trend in the IT industry, ‘Big Data and Hadoop’. It explains how big ‘Big Data’ is and why everybody is trying to implement it in their IT projects. It includes research work on various topics and both theoretical and practical approaches; each component of the architecture is described along with current industry trends. Big...
Packt Publishing, 2013. — 150 p. — ISBN: 9781783281275. If you have always wanted to crunch billions of rows of raw data on Hadoop in a couple of seconds, then Cloudera Impala is the number one choice for you. Cloudera Impala provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase. In addition to using the same unified storage...
Wiley, 2014. — 411 p. — ISBN: 1118607554, 9781118607558
Let Hadoop For Dummies help harness the power of your data and rein in the information overload
Big data has become big business, and companies and organizations of all sizes are struggling to find ways to retrieve valuable information from their massive data sets without becoming overwhelmed. Enter Hadoop and this...
Packt Publishing, 2017. — 206 p. — ISBN13: 9781787124769. This book will teach you how to deploy large-scale datasets in deep neural networks with Hadoop for optimal performance. Starting with understanding what deep learning is, and what the various models associated with deep neural networks are, this book will then show you how to set up the Hadoop environment for deep...
Addison-Wesley Professional, 2015. — 304 p. — ISBN: 978-0134049946. Get Started Fast with Apache Hadoop 2, YARN, and Today’s Hadoop Ecosystem. With Hadoop 2.x and YARN, Hadoop moves beyond MapReduce to become practical for virtually any type of data processing. Hadoop 2.x and the Data Lake concept represent a radical shift away from conventional approaches to data usage and...
N.Y.: Apress, 2014. — 392 p. Many corporations are finding that the size of their data sets are outgrowing the capability of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system. As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers...
O’Reilly Media, 2015. — 400 p. — e-ISBN: 978-1-4919-0004-8, ISBN10: 1-4919-0004-0. Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through architectural considerations necessary to tie those components together into a...
2nd Edition. — Packt Publishing, 2015. — 322 p.
Starting with installing Hadoop YARN, MapReduce, HDFS, and other Hadoop ecosystem components, with this book, you will soon learn about many exciting topics such as MapReduce patterns, using Hadoop to solve analytics, classifications, online marketing, recommendations, and data indexing and searching. You will learn how to take...
Packt Publishing, 2013. — 368 p. — ISBN: 978-1-78216-516-3. In English. We are facing an avalanche of data. The unstructured data we gather can contain many insights that could hold the key to business success or failure. Harnessing the ability to analyze and process this data with Hadoop is one of the most highly sought after skills in today’s job market. Hadoop, by...
Author unknown. Publication details unknown. Hadoop is an open-source framework that allows users to store and process big data in a distributed environment across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. This brief tutorial provides a quick introduction...
Manning Publications, 2021. — 482 p. — ISBN 978-1617296901. Data Pipelines with Apache Airflow teaches you the ins-and-outs of the Directed Acyclic Graphs (DAGs) that power Airflow, and how to write your own DAGs to meet the needs of your projects. With complete coverage of both foundational and lesser-known features, when you’re done you’ll be set to start using Airflow for...
Manning Publications, 2021. — 482 p. — ISBN 978-1617296901. Code Files Only! Data Pipelines with Apache Airflow teaches you the ins-and-outs of the Directed Acyclic Graphs (DAGs) that power Airflow, and how to write your own DAGs to meet the needs of your projects. With complete coverage of both foundational and lesser-known features, when you’re done you’ll be set to start...
Manning Publications Co, 2021. — 482 p. — ISBN: 978-1617296901. Data Pipelines with Apache Airflow teaches you the ins-and-outs of the Directed Acyclic Graphs (DAGs) that power Airflow, and how to write your own DAGs to meet the needs of your projects. With complete coverage of both foundational and lesser-known features, when you’re done you’ll be set to start using Airflow...
O’Reilly, 2017. — 338 p. Until recently, Hadoop deployments existed on hardware owned and run by organizations. Now, of course, you can acquire the computing resources and network connectivity to run Hadoop clusters in the cloud. But there’s a lot more to deploying Hadoop to the public cloud than simply renting machines. This hands-on guide shows developers and systems...
O’Reilly, 2017. — 300 p. — ISBN: 978-1491959633. Up until recently, Hadoop deployments have existed on hardware owned and run by organizations, often alongside legacy “big-iron” hardware. Today, cloud service providers allow customers to effectively rent hardware and associated network connectivity, along with a variety of other features like databases and bulk storage. But...
Manning Publications, 2012. — 536 p. — ISBN: 1617290238, 9781617290237. Hadoop in Practice collects 85 Hadoop examples and presents them in a problem/solution format. Each technique addresses a specific task you'll face, like querying big data using Pig or writing a log file loader. You'll explore each problem step by step, learning both how to build and deploy that specific...
2nd Edition. — Manning, 2014. — 512 p.
Hadoop in Practice, Second Edition provides over 100 tested, instantly useful techniques that will help you conquer big data, using Hadoop. This revised new edition covers changes and new features in the Hadoop core architecture, including MapReduce 2. Brand new chapters cover YARN and integrating Kafka, Impala, and Spark SQL with Hadoop....
Author unknown. Publication details unknown. Impala is the open source, native analytic database for Apache Hadoop. It is shipped by vendors such as Cloudera, MapR, Oracle, and Amazon. The examples provided in this tutorial have been developed using Cloudera Impala.
Packt Publishing, 2018. — 220 p. — ASIN B07K46H6VV. A fast paced guide that will help you learn about Apache Hadoop 3 and its ecosystem. Key Features: Set up, configure and get started with Hadoop to get useful insights from large data sets. Work with the different components of Hadoop such as MapReduce, HDFS and YARN. Learn about the new features introduced in Hadoop 3. Book...
Packt Publishing, 2018. — 220 p. — ASIN B07K46H6VV. Code files only! A fast paced guide that will help you learn about Apache Hadoop 3 and its ecosystem. Key Features: Set up, configure and get started with Hadoop to get useful insights from large data sets. Work with the different components of Hadoop such as MapReduce, HDFS and YARN. Learn about the new features introduced in...
Packt Publishing, 2014. — 374 p. — ISBN: 978-1-78398-364-3.
Hadoop is synonymous with Big Data processing. Its simple programming model, "code once and deploy at any scale" paradigm, and an ever-growing ecosystem makes Hadoop an all-encompassing platform for programmers with different levels of expertise.
This book explores the industry guidelines to optimize MapReduce jobs...
Apress, 2017. — 304 p. — ISBN: 978-1-4842-1909-6 Learn advanced analytical techniques and leverage existing toolkits to make your analytic applications more powerful, precise, and efficient. This book provides the right combination of architecture, design, and implementation information to create analytical systems which go beyond the basics of classification, clustering, and...
Packt Publishing, 2018. — 394 p. — ISBN: 1787122765. Becoming a Big Data Architect: Expert techniques for architecting end to end Big Data solutions to get valuable insights. A comprehensive guide to design, build and execute effective Big Data strategies using Hadoop. The complex structure of data these days requires sophisticated solutions for data transformation, to make the...
O’Reilly Media, 2017. — 350 p. — ISBN: 978-1-4919-6927-4. Early Release! With Early Release ebooks, you get books in their earliest form—the author's raw and unedited content as he or she writes—so you can take advantage of these technologies long before the official release of these titles. You’ll also receive updates when significant changes are made, new chapters are...
PE Press, 2021. — 120 p. — ISBN: 978-1-716-10839-6. This book provides an alternative approach to getting started with Big Data queries using Apache Impala. It describes how to work with Apache Impala and how to perform queries inside Apache Impala. Apache Impala is a modern, open source, distributed SQL query engine for Apache Hadoop. With Impala, we can query data, whether stored...
Apress, 2016. — 300 p.
Re-architect relational applications to NoSQL, integrate relational database management systems with the Hadoop ecosystem, and transform and migrate relational data to and from Hadoop components. This book covers the best-practice design approaches to re-architecting your relational applications and transforming your relational data for usage with the...
Manning Publications Co., 2012. — 336 p. — ISBN: 978-1-93518219-1. Target audience: experienced developers. In an age of rapidly growing information, an expanding blogosphere, and rising user activity, big data is considered commonplace, and many tools exist for working with it. This book gives a detailed description of the Apache Hadoop project. Apache Hadoop is...
Wrox, 2013. — 504 p. — ISBN: 1118611934, 9781118611937 The go-to guidebook for deploying Big Data solutions with Hadoop. Today's enterprise architects need to understand how the Hadoop frameworks and APIs fit together, and how they can be integrated to deliver real-world solutions. This book is a practical, detailed guide to building and implementing those solutions, with...
Addison-Wesley Professional, 2016. — 256 p. — (Addison-Wesley Data & Analytics). — ISBN10: 0134024141. — ISBN13: 978-0134024141. As adoption of Hadoop accelerates in the enterprise and beyond, there's soaring demand for those who can solve real world problems by applying advanced data science techniques in Hadoop environments. Now Practical Data Science with Hadoop(R) and Spark...
Addison-Wesley Professional, 2016. — 387 p. — (Addison-Wesley Data & Analytics). — ISBN10: 0134024141. — ISBN13: 978-0134024141. As adoption of Hadoop accelerates in the enterprise and beyond, there's soaring demand for those who can solve real world problems by applying advanced data science techniques in Hadoop environments. Now Practical Data Science with Hadoop(R) and Spark...
Addison-Wesley Professional, 2016. — 283 p. — (Addison-Wesley Data & Analytics). — ISBN10: 0134024141. — ISBN13: 978-0134024141. As adoption of Hadoop accelerates in the enterprise and beyond, there's soaring demand for those who can solve real world problems by applying advanced data science techniques in Hadoop environments. Now Practical Data Science with Hadoop(R) and Spark...
Addison-Wesley Professional, 2016. — 381 p. — (Addison-Wesley Data & Analytics). — ISBN10: 0134024141. — ISBN13: 978-0134024141. As adoption of Hadoop accelerates in the enterprise and beyond, there's soaring demand for those who can solve real world problems by applying advanced data science techniques in Hadoop environments. Now Practical Data Science with Hadoop(R) and Spark...
O’Reilly Media, 2013. — 233 p. — ISBN: 1449327176, 978-1449327170. In English. MapReduce is a distributed computing model introduced by Google and used for parallel computation over very large (several petabytes) data sets on computer clusters. MapReduce is a framework for computing certain sets of distributed tasks with...
O’Reilly Media, 2015. — 272 p. Get a solid grounding in Apache Oozie, the workflow scheduler system for managing Hadoop jobs. With this hands-on guide, two experienced Hadoop practitioners walk you through the intricacies of this powerful and flexible platform, with numerous examples and real-world use cases. Once you set up your Oozie server, you’ll dive into techniques for...
Addison-Wesley, 2014. — 336 p. — ISBN: 0321934504. Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. And now in Apache Hadoop YARN, two Hadoop technical leaders show...
Packt Publishing, 2013. — 316 p. Helping developers become more comfortable and proficient with solving problems in the Hadoop space. People will become more familiar with a wide variety of Hadoop related tools and best practices for implementation. Hadoop Real-World Solutions Cookbook will teach readers how to build solutions using tools such as Apache Hive, Pig, MapReduce,...
Packt Publishing, 2013. — 316 p. — ISBN: 978-1-78439-550-6. Helping developers become more comfortable and proficient with solving problems in the Hadoop space. People will become more familiar with a wide variety of Hadoop related tools and best practices for implementation. Hadoop Real-World Solutions Cookbook will teach readers how to build solutions using tools such as...
Ravi Prasad, 2024. — 84 p. — ASIN B0DKDSB9NK. Unlock the power of big data with "Hadoop Essentials", your comprehensive guide to understanding and utilizing Hadoop for data processing and analysis. Designed specifically for beginners, this book breaks down complex concepts into manageable steps, making it easy for anyone to grasp the fundamentals of Hadoop. Preface. Frequently...
New York: O’Reilly, 2016. — 71 p. Hadoop is mostly written in Java, but that doesn't exclude the use of other programming languages with this distributed storage and processing framework, particularly Python. With this concise book, you'll learn how to use Python with the Hadoop Distributed File System (HDFS), MapReduce, the Apache Pig platform and Pig Latin script, and the...
Exelixis Media P.C., 2016. — 149 p. Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common and should be automatically handled by the...
Engineering Science Reference, 2018. — 260 p. — ISBN: 9781522537908. Due to the increasing availability of affordable internet services, the number of users, and the need for a wider range of multimedia-based applications, internet usage is on the rise. With so many users and such a large amount of data, the requirements of analyzing large data sets leads to the need for...
O’Reilly, 2014. — 152 p. — ISBN: 978-1-491-90577-7. Learn how to write, tune, and port SQL queries and other statements for a Big Data environment, using Impala—the massively parallel processing SQL query engine for Apache Hadoop. The best practices in this practical guide help you design database schemas that not only interoperate with other Hadoop components, and are...
O’Reilly Media, 2012. — 298 p. — ISBN: 1449327052, 9781449327057. If you’ve been asked to maintain large and complex Hadoop clusters, this book is a must. Demand for operations-specific material has skyrocketed now that Hadoop is becoming the de facto standard for truly large-scale data processing in the data center. Eric Sammer, Principal Solution Architect at Cloudera, shows...
John Wiley & Sons, Inc., 2012. In this book, we provide you with a solid understanding of key Big Data concepts and trends, as well as related architectures, such as MapReduce and Hadoop. We also present some suggestions about how to implement high-performance Hadoop. + Bonus: Big Data and Hadoop Guide: 8 Essential concepts of Big Data and Hadoop. A quick reference guide....
Packt Publishing, 2015. — 222 p. — ISBN: 978-1-78528-899-9. Integrate Elasticsearch into Hadoop to effectively visualize and analyze your data. The Hadoop ecosystem is a de facto standard for processing terabytes and petabytes of data. Lucene-enabled Elasticsearch is becoming an industry standard for its full-text search and aggregation capabilities. Elasticsearch-Hadoop...
Packt Publishing, 2019. — 531 p. — ISBN: 1788620445. Apache Hadoop is one of the most popular big data solutions for distributed storage and for processing large chunks of data. With Hadoop 3, Apache promises to provide a high-performance, more fault-tolerant, and highly efficient big data processing platform, with a focus on improved scalability and increased efficiency. With...
Packt Publishing, 2019. — 531 p. — ISBN: 1788620445. !Code files only. Apache Hadoop is one of the most popular big data solutions for distributed storage and for processing large chunks of data. With Hadoop 3, Apache promises to provide a high-performance, more fault-tolerant, and highly efficient big data processing platform, with a focus on improved scalability and increased...
Packt Publishing, 2017. — 348 p. — ISBN13: 9781787126732. Over 100 practical recipes to help you become an expert Hadoop administrator. Hadoop enables the distributed storage and processing of large datasets across clusters of computers. Learning how to administer Hadoop is crucial to exploit its unique features. With this book, you will be able to overcome common problems...
Packt Publishing, 2017. — 348 p. — ISBN13: 9781787126732. True PDF. Over 100 practical recipes to help you become an expert Hadoop administrator. Hadoop enables the distributed storage and processing of large datasets across clusters of computers. Learning how to administer Hadoop is crucial to exploit its unique features. With this book, you will be able to overcome common...
Packt Publishing, 2015. — 100 p. — ISBN: 978-1-78328-155-8. Get to grips with the intricacies of Hadoop monitoring using the power of Ganglia and Nagios. With the exponential growth of data and many enterprises crunching more and more data, Hadoop as a data platform has gained a lot of popularity. The Hadoop platform needs to be monitored with respect to how it works and...
Packt Publishing, 2015. — 161 p. — ISBN: 1785880381. Unleash the power of Apache Oozie to create and manage your big data and machine learning pipelines in one go. If you are an expert Hadoop user who wants to use Apache Oozie to handle workflows efficiently, this book is for you. This book will be handy to anyone who is familiar with the basics of Hadoop and wants to automate...
Sams, 2015 2016. — 592 p. — ISBN: 9780672337277. With Microsoft HDInsight, business professionals and data analysts can rapidly leverage the power of Hadoop on a flexible, scalable cloud-based platform, using Microsoft's accessible business intelligence, visualization, and productivity tools. Now, in just 24 lessons of one hour or less, you can learn all the skills and techniques...
O’Reilly Media, 2015. — 132 p. If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections. You'll quickly...
O’Reilly Media, 2015. — 132 p. — ISBN13: 978-1-491-94793-7. Target audience: experienced developers. Hadoop is a popular project used for reliable storage of large volumes of data. If you are going to work with big data, you need to start learning Hadoop and its many components. This guide examines Hadoop using the example of...
O’Reilly, 2018. — 200 p. — ISBN: 1491980257. Fast data ingestion, serving, and analytics in the Hadoop ecosystem have forced developers and architects to choose solutions using the least common denominator–either fast analytics at the cost of slow data ingestion or fast data ingestion at the cost of slow analytics. There is an answer to this problem. With the Apache Kudu...
Author unknown. Publication details unknown. Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL and Oracle into Hadoop HDFS, and to export data from the Hadoop file system to relational databases. This is a brief tutorial that explains how to make use of Sqoop in the Hadoop ecosystem.
Packt Publishing, 2015. — 264 p. Big Data forensics is an important type of digital investigation that involves the identification, collection, and analysis of large-scale Big Data systems. Hadoop is one of the most popular Big Data solutions, and forensically investigating a Hadoop cluster requires specialized tools and techniques. With the explosion of Big Data, forensic...
Syncfusion, 2016. — 83 p. An Apache open source project, Hadoop stores huge amounts of data in safe, reliable storage and runs complex queries over data in an efficient way. It is at the core of a whole host of the most popular Big Data tools. Mastering Hadoop ensures you get the best out of all these tools and better insight from your data. Elton Stoneman’s Hadoop Succinctly...
Knowledge Powerhouse, 2016. — 58 p. Do you want to make a career in data science and data warehousing with Big Data technology? Apache Hadoop is an essential part of Big Data systems. For a career in data science, data analytics, and data warehousing, good knowledge of Hadoop is required. Your career in data science, data analytics, and data warehousing can get a boost with the knowledge...
Packt Publishing, 2013. — 398 p. Data is arriving faster than you can process it and the overall volumes keep growing at a rate that keeps you awake at night. Hadoop can help you tame the data beast. Effective use of Hadoop, however, requires a mixture of programming, design, and system administration skills. The book removes the mystery from Hadoop, presenting Hadoop and related...
Packt Publishing, 2015. — 518 p. — ISBN10: 1783285516, ISBN13: 9781783285518. The book's example code is posted here. This book introduces you to the world of building data-processing applications with the wide variety of tools supported by Hadoop 2. Starting with the core components of the framework—HDFS and YARN—this book will guide you through how to build applications using a...
Packt Publishing, 2015. — Code Only. — ISBN10: 1783285516, ISBN13: 9781783285518. Example code for the book posted here in PDF, EPUB, MOBI, and AZW3 formats. This book introduces you to the world of building data-processing applications with the wide variety of tools supported by Hadoop 2. Starting with the core components of the framework—HDFS and YARN—this book will guide you...
Packt Publishing, 2016. — 979 p. — ISBN: 978-1-78712-516-2. Unlock the power of your data with the Hadoop 2.X ecosystem and its data warehousing techniques across large data sets. As Marc Andreessen has said, “Data is eating the world,” which can be witnessed today, in the age of Big Data: businesses are producing data in huge volumes every day, and this rising tide of data needs to...
Apress, 2016. — 429 p. — ISBN10: 1484221982. — ISBN13: 978-1484221983. This book is a practical guide on using the Apache Hadoop projects including MapReduce, HDFS, Apache Hive, Apache HBase, Apache Kafka, Apache Mahout and Apache Solr. From setting up the environment to running sample applications, each chapter is a practical tutorial on using an Apache Hadoop ecosystem project....
Apress, 2014. — 428 p. — 2nd ed. — ISBN: 9781430248637. Pro Apache Hadoop, Second Edition brings you up to speed on Hadoop – the framework of big data. Revised to cover Hadoop 2.0, the book covers the very latest developments such as YARN (aka MapReduce 2.0), new HDFS high-availability features, and increased scalability in the form of HDFS Federations. All the old content has...
4th Edition. — O’Reilly, 2015. — 805 p. — ISBN: 1491901632. In English. Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want...
2nd Edition. — O’Reilly Media / Yahoo Press, 2010. — 628 p. — Print ISBN: 978-1-4493-8973-4 | ISBN10: 1-4493-8973-2. Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework -- an open source implementation of MapReduce, the algorithm on which...
4th Edition. — O’Reilly, 2015. — 805 p. — ISBN: 1491901632. Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and...
4th Edition. — O’Reilly, 2015. — 756 p. — ISBN: 1491901632. Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and...
Packt Publishing, 2013. — 126 p. — ISBN: 1783281715, 9781783281718. Big Data is the hottest trend in the IT industry at the moment. Companies are realizing the value of collecting, retaining, and analyzing as much data as possible. They are therefore rushing to implement the next generation of data platform, and Hadoop is the centerpiece of these platforms. This practical guide...
McGraw-Hill Companies, 2012. — 176 p. — ISBN: 9780071790543. Big Data represents a new era in data exploration and utilization, and IBM is uniquely positioned to help clients navigate this transformation. This book reveals how IBM is leveraging open source Big Data technology, infused with IBM technologies, to deliver a robust, secure, highly available, enterprise-class Big Data...
Moscow: DMK Press, 2012. — 424 p. — ISBN: 978-5-94074-785-7. Processing large data sets with traditional DBMSs can be difficult. Apache Hadoop is a framework for developing applications designed to run on a distributed cluster, without using SQL. Such applications scale very well and can process enormous data sets. If you...
Moscow: DMK Press, 2015. — 424 p. — ISBN: 978-5-97060-156-3. Processing large data sets with traditional DBMSs can be difficult. Apache Hadoop is a framework for developing applications designed to run on a distributed cluster, without using SQL. Such applications scale very well and can process enormous data sets. If you...
Author and source unknown. A detailed guide to the HDFS file system, in Russian. Covers architecture, cluster and file system configuration, and the use of the basic commands. Everything is accompanied by examples.
Translated from English by E. Matveev. — St. Petersburg: Piter, 2013. — 672 p. — (O’Reilly Bestsellers). — ISBN: 978-5-496-00662-0. Apache Hadoop is an open-source framework that implements the computing paradigm known as MapReduce, which allowed Google to build its empire. This book will show you how to use the full power of Hadoop to build reliable,...
Moscow: DMK Press, 2021 (2022). — 505 p. Data pipelines manage the flow of data from its initial collection through consolidation, cleaning, analysis, visualization, and more. Apache Airflow provides a single platform that can be used to design, implement, monitor, and maintain pipelines. The simplicity of the user interface,...