COGROUP works like GROUP, but on two or more data sets at once. It groups the elements of each data set by their common field and returns a set of records, each containing the group key and two separate bags. The first bag holds the records of the first data set that match the key, and the second bag holds the records of the second data set that match the key.
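The grouping behaviour described above can be illustrated outside Pig. The following is a minimal Python sketch, not Pig code: the function name `cogroup`, the sample relations `owners` and `friends`, and the choice of the first field as the common key are all assumptions made for illustration. Pig bags are unordered; lists are used here only to keep the output predictable.

```python
from collections import defaultdict


def cogroup(first, second, key_index=0):
    """Emulate a two-relation cogroup: one output record per key,
    holding the key plus one bag (list) of matching records per input."""
    # Each key maps to a pair of bags: (records from first, records from second).
    groups = defaultdict(lambda: ([], []))
    for record in first:
        groups[record[key_index]][0].append(record)
    for record in second:
        groups[record[key_index]][1].append(record)
    # Keys found in only one input keep an empty bag for the other input.
    return [(key, bag1, bag2) for key, (bag1, bag2) in sorted(groups.items())]


# Hypothetical sample data: (owner, pet) and (person, friend).
owners = [("alice", "cat"), ("bob", "dog"), ("alice", "fish")]
friends = [("alice", "carol"), ("dave", "erin")]

result = cogroup(owners, friends)
for row in result:
    print(row)
```

In this sketch, a key that appears in only one data set still yields a record, with an empty bag for the other data set.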
FOREACH is used to apply transformations to the data and to generate new data items. As the name indicates, the given action is performed for each element of a data bag. Syntax: FOREACH bagname GENERATE expression1, expression2, ... The meaning of this statement is that the expressions mentioned after GENERATE will […]
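As a rough illustration of the FOREACH ... GENERATE pattern, here is a minimal Python sketch rather than Pig code. The helper `foreach_generate`, the `students` bag, and its (name, age, gpa) layout are hypothetical names introduced only for this example; each expression is modelled as a function applied to one record.

```python
def foreach_generate(bag, *expressions):
    """For each record in the bag, evaluate every expression
    and emit the results as one new tuple."""
    return [tuple(expr(record) for expr in expressions) for record in bag]


# Hypothetical bag of (name, age, gpa) records.
students = [("asha", 21, 3.5), ("ben", 19, 3.9)]

# Two expressions: upper-case the name, and add one to the age.
projected = foreach_generate(
    students,
    lambda r: r[0].upper(),
    lambda r: r[1] + 1,
)
print(projected)
```

The output has one tuple per input record and one field per expression, so the gpa field is dropped because no expression refers to it.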
It could mean that the NameNode is not running on your VM.