Course Information

Course name: Hadoop Developer (CCDH) Certification

Open-enrollment and customized classes

Start date: 2024-03-26

Course Description

Hadoop Developer (CCDH) Certification

【Course Overview】

As a core big data technology, Hadoop gives enterprises a highly scalable, highly redundant, fault-tolerant, and cost-effective "data-driven" solution. In response to the widespread shortage of engineers skilled in handling massive data sets, Cloudera introduced a certification for developers: Cloudera Certified Developer for Apache Hadoop (CCDH). In the CCDH training course at 青藍咨詢, you will learn:

* Hadoop core technologies

* How HDFS and MapReduce work

* How to develop MapReduce applications

* How to unit test MapReduce applications

* How to use MapReduce combiners, partitioners, and the distributed cache

* How to develop and debug MapReduce applications

* How to implement input and output in MapReduce applications

* Common MapReduce algorithms

* How to join data sets with MapReduce

* How to integrate Hadoop into an existing enterprise computing environment

* How to use Mahout for machine learning

* How to use Hive and Pig to develop data analysis applications quickly

* How to use Oozie to create and manage workflows

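【Target Audience】

Enterprise managers, CIOs, CTOs, government IT officials, project (development) managers, consultants, IT managers, IT consultants, IT support specialists, systems engineers, data center administrators, cloud computing administrators, anyone looking to join the cloud computing field, and developers who need to use Apache Hadoop to build powerful data analysis applications.

Participants should have programming experience, particularly Java skills and background. No prior Hadoop knowledge or experience is required.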

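【Course Content】

Understand how MapReduce and HDFS fit together to provide a powerful, scalable system.

Learn to write programs against the Hadoop API and master the basic skills needed to write more interesting data processing jobs.

Learn how to deploy Hadoop on data center servers or on Amazon EC2, and how to use Hadoop to extend existing systems.

Learn how to import different types of data into Hadoop for further analysis, and how to import data from existing databases with Sqoop.

Learn how to use Hive, including importing data, creating tables, and running queries.

Learn best practices that make MapReduce programs easier to debug, along with local testing tools and techniques for debugging at scale.

Explore the Hadoop API in depth, including custom data types and file formats, direct HDFS access, partitioning of intermediate data, and other tools such as DistributedCache.

Explore graph algorithms, including PageRank. Understand strategies for performing joins efficiently, and compare the techniques suited to different data models.

Learn how to optimize MapReduce programs for better performance.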

Module

Content

The Motivation for Hadoop

* Problems with Traditional Large-Scale Systems

* Introducing Hadoop

* Hadoopable Problems

Hadoop: Basic Concepts and HDFS

* The Hadoop Project and Hadoop Components

* The Hadoop Distributed File System

Introduction to MapReduce V2

* MapReduce Overview

* Example: WordCount

* Mappers

* Reducers
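
To give a feel for the WordCount example named in this module, here is a minimal sketch in plain Python that imitates the map, shuffle, and reduce steps in a single process. It is only an illustration, not course material: the course itself uses Java and the Hadoop API, and the function names `map_words`, `shuffle`, `reduce_counts`, and `word_count` are hypothetical.

```python
from collections import defaultdict


def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases over an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(w, c) for w, c in shuffle(pairs).items())
```

In a real cluster the mapper and reducer run on different machines and the shuffle moves data across the network; the sketch keeps only the data flow.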

Hadoop Clusters and the Hadoop Ecosystem

* Hadoop Cluster Overview

* Hadoop Jobs and Tasks

* Other Hadoop Ecosystem Components

Writing a MapReduce Program in Java

* Basic MapReduce API Concepts

* Writing MapReduce Drivers, Mappers, and Reducers in Java

* Speeding Up Hadoop Development by Using Eclipse

* Differences Between the Old and New MapReduce APIs

Writing a MapReduce Program Using Streaming

* Writing Mappers and Reducers with the Streaming API

Unit Testing MapReduce Programs

* Unit Testing

* The JUnit and MRUnit Testing Frameworks

* Writing Unit Tests with MRUnit

* Running Unit Tests

Delving Deeper into the Hadoop API

* Using the ToolRunner Class

* Setting Up and Tearing Down Mappers and Reducers

* Decreasing the Amount of Intermediate Data with Combiners

* Accessing HDFS Programmatically

* Using the Distributed Cache

* Using the Hadoop API's Library of Mappers, Reducers, and Partitioners

Practical Development Tips and Techniques

* Strategies for Debugging MapReduce Code

* Testing MapReduce Code Locally by Using LocalJobRunner

* Writing and Viewing Log Files

* Retrieving Job Information with Counters

* Reusing Objects

* Creating Map-Only MapReduce Jobs

Partitioners and Reducers

* How Partitioners and Reducers Work Together

* Determining the Optimal Number of Reducers for a Job

* Writing Custom Partitioners
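
As a rough illustration of how a partitioner decides which reducer receives a key, the sketch below hashes each key and takes the result modulo the number of reducers, which is the same basic idea as Hadoop's default hash partitioning. This is a plain-Python approximation under stated assumptions: `partition_for` and `partition_pairs` are hypothetical names, and `zlib.crc32` stands in for the key's hash code.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_reducers):
    """Pick a reducer index from a stable hash of the key."""
    # crc32 is used because Python's built-in hash() for str is salted per process.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def partition_pairs(pairs, num_reducers):
    """Route every (key, value) pair to the bucket of its reducer."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition_for(key, num_reducers)].append((key, value))
    return buckets
```

Because the index depends only on the key, every value for a given key lands on the same reducer. A custom partitioner replaces the hash with application logic, for example routing by date or by region.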

Data Input and Output

* Creating Custom Writable and WritableComparable Implementations

* Saving Binary Data Using SequenceFile and Avro Data Files

* Issues to Consider When Using File Compression

* Implementing Custom InputFormats and OutputFormats

Common MapReduce Algorithms

* Sorting and Searching Large Data Sets

* Indexing Data

* Computing Term Frequency-Inverse Document Frequency (TF-IDF)

* Calculating Word Co-Occurrence

* Performing Secondary Sort

Joining Data Sets in MapReduce Jobs

* Writing a Map-Side Join

* Writing a Reduce-Side Join
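
The reduce-side join named above can be sketched roughly as follows: the map phase tags each record with its source, the shuffle groups records by join key, and the reducer pairs up records from the two sources. This is a single-process Python approximation of an inner join, not the course's Java implementation; `tag_records`, `reduce_side_join`, and the `"L"`/`"R"` tags are hypothetical.

```python
from collections import defaultdict
from itertools import product


def tag_records(records, source):
    """Map phase: emit (join_key, (source_tag, payload)) for each record."""
    for key, payload in records:
        yield key, (source, payload)


def reduce_side_join(left, right):
    """Group tagged records by key, then pair left and right payloads."""
    groups = defaultdict(lambda: {"L": [], "R": []})
    tagged = list(tag_records(left, "L")) + list(tag_records(right, "R"))
    for key, (source, payload) in tagged:
        groups[key][source].append(payload)
    joined = []
    for key in sorted(groups):
        # Keys present on only one side produce no output (inner join).
        for l_payload, r_payload in product(groups[key]["L"], groups[key]["R"]):
            joined.append((key, l_payload, r_payload))
    return joined
```

A map-side join avoids the shuffle by loading the smaller data set into memory on every mapper, which is one reason the two approaches are taught side by side.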

Integrating Hadoop into the Enterprise Workflow

* Integrating Hadoop into an Existing Enterprise

* Loading Data from an RDBMS into HDFS by Using Sqoop

* Managing Real-Time Data Using Flume

* Accessing HDFS from Legacy Systems with FuseDFS and HttpFS

An Introduction to Hive, Impala, and Pig

* The Motivation for Hive, Impala, and Pig

* Hive Overview

* Impala Overview

* Pig Overview

* Choosing Between Hive, Impala, and Pig

An Introduction to Oozie

* Introduction to Oozie

* Creating Oozie Workflows

Conclusion

* Conclusion

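Note: The actual start date may be adjusted as circumstances require. Please follow the official 青藍咨詢 public account for updates or contact a course advisor.

【Contact 青藍咨詢】

Address: Room 309, 3rd Floor, Block B, TCL Building, No. 06 Gaoxin South 1st Road, Nanshan District, Shenzhen (Bus stop: Dachong; Metro: Line 1, Hi-Tech Park station, Exit C)

Postal code: 518057

Phone: 0755-86950769

Email: peixun@shzhchina.com

Website: http://www.mycalorietracker.com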


