In this assignment, you will be designing and implementing MapReduce algorithms for a variety of common data processing tasks. Problem 6: Assume you have two matrices A and B in a sparse matrix format, where each record is of the form i, j, value. Design a MapReduce algorithm to compute matrix multiplication: A x B
Map Input
The input to the map function will be matrix row records formatted as lists. Each list will have the format [matrix, i, j, value] where matrix is a string and i, j, and value are integers.
The first item, matrix, is a string that identifies which matrix the record originates from. This field has two possible values:
‘a’ indicates that the record is from matrix A ‘b’ indicates that the record is from matrix B
Reduce Output
The output from the reduce function will also be matrix row records formatted as tuples. Each tuple will have the format (i, j, value) where each element is an integer.
You can test your solution to this problem using matrix.json:
python multiply.py matrix.json
You can verify your solution against multiply.json.
import MapReduce import sys """ Word Count Example in the Simple Python MapReduce Framework """ mr = MapReduce.MapReduce() # ============================= # Do not modify above this line def mapper(record): # key: document identifier # value: document contents matrix, row, col, value = record for n in range(5): if matrix == 'a': cell = (row,n) matr = 'L' col_row = col else: cell = (n,col) matr = 'R' col_row = row mr.emit_intermediate(cell, (matr, col_row, value)) #key = (cell, matrix, col_row, value) def reducer(key, list_of_values): # key: word # value: list of occurrence counts left_matrix = [(item[1],item[2]) for item in list_of_values if item[0] == 'L' ] right_matrix = [(item[1],item[2]) for item in list_of_values if item[0] == 'R' ] result = 0 for item_L in left_matrix: for item_R in right_matrix: if item_L[0] == item_R[0] : result += item_L[1] * item_R[1] #mr.emit((key[0], item[2], result)) if result != 0: mr.emit((key[0],key[1], result)) # Do not modify below this line # ============================= if __name__ == '__main__': inputdata = open(sys.argv[1]) mr.e xecute(inputdata, mapper, reducer)
Leave a Comment