Commit b164ff0

Improve set up instructions to make them more standalone.
1 parent d1c3c55 commit b164ff0

2 files changed

+41
-18
lines changed

demo/README.md

Lines changed: 39 additions & 18 deletions
@@ -7,39 +7,60 @@ session data with SQL using Impala or Hive.
 
 ## Pre-requisites
 
-Before trying this example, you need to have enabled Flume user impersonation and
-installed the CDK event serializer module in Flume (this is explained in the `logging`
-example).
+* __Enable Flume user impersonation__ Flume needs to be able to impersonate the owner
+of the dataset it is writing to. (This is like Unix `sudo`, see
+[Configuring Flume's Security Properties](http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Security-Guide/cdh4sg_topic_4_2.html)
+for further information.) In Cloudera Manager, for the [HDFS service](http://localhost:7180/cmf/services/status),
+click "View and Edit" under the Configuration tab then
+search for "Cluster-wide Configuration Safety Valve for core-site.xml"
+and add the following XML snippet, then save changes.
 
-Next, start a Flume agent on the QuickStart VM. You can do this via Cloudera Manager by
+```
+<property>
+<name>hadoop.proxyuser.flume.groups</name>
+<value>*</value>
+</property>
+<property>
+<name>hadoop.proxyuser.flume.hosts</name>
+<value>*</value>
+</property>
+```
+* __Install the CDK event serializer module__ This is necessary
+since Flume 1.3.0 does not come with a HDFS sink that can write Avro data files.
+Note that the HDFS sink in Flume 1.4.0 can write Avro data files so this step is not
+needed for that version of Flume or later.
+
+```bash
+sudo wget https://repository.cloudera.com/artifactory/libs-release-local/com/cloudera/cdk/cdk-flume-avro-event-serializer/0.4.0/cdk-flume-avro-event-serializer-0.4.0.jar \
+-P /usr/lib/flume-ng/lib/
+# or if wget is not available:
+( cd /usr/lib/flume-ng/lib/ ; sudo curl -O https://repository.cloudera.com/artifactory/libs-release-local/com/cloudera/cdk/cdk-flume-avro-event-serializer/0.4.0/cdk-flume-avro-event-serializer-0.4.0.jar ; )
+```
+* __Start a Flume agent__ You can do this via Cloudera Manager by
 selecting "View and Edit" under the Flume service Configuration tab, then clicking on the
 "Agent (Default)" category, and pasting the contents of the `flume.properties` file in
 this project into the text area for the "Configuration File" property.
 
-If you are running this example from you machine and not from a QuickStart VM login,
+If you are running this example from you machine and not from a QuickStart VM login,
 then make sure you change the value of the `proxyUser` setting in the agent
 configuration to the user that you are logged in as. Save changes,
 then start the Flume agent.
 
-To use Impala in this example you need to be running Impala 1.1 or later. If the
-version of QuickStart VM you are running is earlier than Impala 1.1, you can upgrade it
-using [these instructions](http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_upgrading.html)
-Make sure you follow the instructions under the heading "To upgrade Impala in a
-Cloudera Managed environment, using parcels". You should also upgrade CDH at the same
-time.
-
-For Oozie you need to have Oozie's sharelib installed (which is taken care of already in
-the QuickStart VM) and the Oozie service must be running - so start it using Cloudera
-Manager.
-
-Finally add the HCatalog Core JAR to the Hive Oozie sharelib,
-by logging in to the VM and running:
+* __Set up Hive Oozie sharelib__ Add the HCatalog Core JAR to the Hive Oozie
+sharelib, by logging in to the VM and running:
 
 ```bash
 sudo -u oozie hadoop fs -put \
 /usr/lib/hcatalog/share/hcatalog/hcatalog-core-0.5.0-cdh4.3.0.jar \
 /user/oozie/share/lib/hive
 ```
+* __(Optional) Upgrade Impala__ To use Impala in this example you need to be running
+Impala 1.1 or later. If the
+version of QuickStart VM you are running is earlier than Impala 1.1, you can upgrade it
+using [these instructions](http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_upgrading.html)
+Make sure you follow the instructions under the heading "To upgrade Impala in a
+Cloudera Managed environment, using parcels". You should also upgrade CDH at the same
+time.
 
 ## Building
 

logging/README.md

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ needed for that version of Flume or later.
 ```bash
 sudo wget https://repository.cloudera.com/artifactory/libs-release-local/com/cloudera/cdk/cdk-flume-avro-event-serializer/0.4.0/cdk-flume-avro-event-serializer-0.4.0.jar \
 -P /usr/lib/flume-ng/lib/
+# or if wget is not available:
+( cd /usr/lib/flume-ng/lib/ ; sudo curl -O https://repository.cloudera.com/artifactory/libs-release-local/com/cloudera/cdk/cdk-flume-avro-event-serializer/0.4.0/cdk-flume-avro-event-serializer-0.4.0.jar ; )
 ```
 * __Start a Flume agent__ You can do this via Cloudera Manager by
 selecting "View and Edit" under the Flume service Configuration tab, then clicking on the
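
Note on the change: both READMEs now offer `curl` as a fallback "if wget is not available". That choice could also be scripted rather than left to the reader. The following is a minimal sketch, assuming a POSIX shell. The names `choose_downloader`, `download_command`, `JAR_URL`, and `DEST_DIR` are hypothetical and are not part of this commit. The script only prints the command it would run, so it can be tried without `sudo` or network access.

```bash
#!/bin/sh
# Hypothetical helper names; the URL and destination are copied from the README.
JAR_URL="https://repository.cloudera.com/artifactory/libs-release-local/com/cloudera/cdk/cdk-flume-avro-event-serializer/0.4.0/cdk-flume-avro-event-serializer-0.4.0.jar"
DEST_DIR="/usr/lib/flume-ng/lib/"

# Report which downloader is on the PATH, preferring wget as the README does.
choose_downloader() {
  if command -v wget >/dev/null 2>&1; then
    echo wget
  elif command -v curl >/dev/null 2>&1; then
    echo curl
  else
    echo none
  fi
}

# Print (do not run) the README's download command for the given tool.
download_command() {
  case "$1" in
    wget) echo "sudo wget $JAR_URL -P $DEST_DIR" ;;
    curl) echo "( cd $DEST_DIR ; sudo curl -O $JAR_URL ; )" ;;
    *)    echo "echo 'neither wget nor curl is installed' >&2" ;;
  esac
}

download_command "$(choose_downloader)"
```

Printing the command instead of executing it keeps the sketch side-effect free. Piping the output to `sh` would run the download, at which point the `sudo` prompt and write access to `/usr/lib/flume-ng/lib/` apply as in the README.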
