Flume: usage steps and how Flume works

Introduction to Flume

Flume is a collector for log data.

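Flume usage steps

1. Define a source, a channel, and a sink (the destination the data is forwarded to).
2. Start the agent.
3. As soon as data arrives, the agent receives it and forwards it to the sink.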
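As a minimal sketch of these steps, the following agent definition wires one source, one channel, and one sink together. The agent name a1, the component names r1, c1, and k1, and the port 44444 are assumptions chosen for illustration; they do not come from the original text.

```properties
# Hypothetical agent "a1": one source, one channel, one sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen for lines of text on a local TCP port (port is an assumption)
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: write events to the agent log, handy for a first test
a1.sinks.k1.type = logger

# Wiring: a source can feed several channels, a sink reads from exactly one
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Saved as conf/example.conf (a hypothetical file name), this could be started with ./bin/flume-ng agent -n a1 -c conf -f conf/example.conf -Dflume.root.logger=INFO,console. Lines sent to port 44444 should then appear in the console log.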

How Flume works

Flume component types

source type

Avro, Exec, JMS, Spooling Directory, Netcat, HTTP, Syslog, Thrift, Twitter, and others.
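To show how a source type is selected, here is a small sketch of an Exec source, which runs a command and turns each line of its output into an event. The names a1, r1, and c1 and the log path are hypothetical.

```properties
# Hypothetical agent "a1" with an Exec source "r1"
a1.sources = r1
a1.sources.r1.type = exec
# The command and path are assumptions; any long-running command that prints lines works
a1.sources.r1.command = tail -F /var/log/app.log
# "c1" must be declared under a1.channels elsewhere in the same file
a1.sources.r1.channels = c1
```

Switching to another built-in source mostly means changing the type value and supplying that type's own properties.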

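Advanced: you can also write your own source type.

channel type

A channel can hold events in memory, in a database via JDBC, or in files on disk.

sink type

HDFS and HBase are common sinks; Spark Streaming can also act as the destination of a sink.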
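The channel and sink types above are chosen in the same way, through the type property. The following sketch pairs a file channel with an HDFS sink; the names a1, c1, and k1 and all paths are assumptions for illustration.

```properties
# Hypothetical agent "a1" with channel "c1" and sink "k1"
a1.channels = c1
a1.sinks = k1

# File channel: events are kept on disk, so they survive an agent restart
# (both directories are assumed paths)
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /home/hadoop/flume/checkpoint
a1.channels.c1.dataDirs = /home/hadoop/flume/data

# HDFS sink: write the events as plain text files (path is an assumption)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /user/hadoop/events
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.channel = c1
```

A memory channel is faster but loses buffered events if the agent stops, so the choice depends on how much data loss the pipeline can tolerate.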

Flume demo

Install and unpack Flume, then create the config file in its conf directory:
cd /home/hadoop/opt/apache-flume-1.8.0-bin/conf
vi spooldir.conf

========================================================
spooldir.sources=sa
spooldir.channels=ma
spooldir.sinks=ha

spooldir.sources.sa.type=spooldir
spooldir.sources.sa.spoolDir=/home/hadoop/firstdemo/flume_spider
spooldir.sources.sa.fileHeader = true

spooldir.channels.ma.type=memory
spooldir.channels.ma.capacity=10000
# Property names are case-sensitive; transactionCapacity must not exceed capacity
spooldir.channels.ma.transactionCapacity=10000

#spooldir.sinks.ha.type=logger
spooldir.sinks.ha.type=hdfs
spooldir.sinks.ha.hdfs.fileType=DataStream
spooldir.sinks.ha.hdfs.path=/user/hadoop/spider
spooldir.sinks.ha.hdfs.writeFormat=Text
spooldir.sinks.ha.hdfs.batchSize=10000
spooldir.sinks.ha.hdfs.rollCount=1000
spooldir.sinks.ha.hdfs.fileSuffix=.csv
spooldir.sinks.ha.hdfs.filePrefix=test
spooldir.sinks.ha.hdfs.rollSize=0
spooldir.sinks.ha.hdfs.rollInterval=0


spooldir.sources.sa.channels=ma
spooldir.sinks.ha.channel=ma 
========================================================

Start Flume (run from the Flume installation directory):
./bin/flume-ng agent -n spooldir -c conf -f conf/spooldir.conf
Restart Flume with log output printed to the console:
./bin/flume-ng agent -n spooldir -c conf -f conf/spooldir.conf -Dflume.root.logger=INFO,console