Big Process

[Image: petrochemical plant]

Big Data: it is (still) all the rage. And rightfully so, because we have access to more data than ever before. The "bigness" of data refers mainly to the sheer magnitude, or volume, of data available for our consumption. This volume has increased exponentially and shows no sign of abatement. But data is only one side of the coin. Process is the flip side. Without process, a data repository is just a large pile of undifferentiated pieces. Even a "data structure" is a process in disguise, since the structure reflects a procedure that must read from and write to it, and in doing so creates it.

This idea of process can be seen, in miniature, in programming. Once upon a time, the program and the data occupied the same memory space, and to the unaided eye the computation looked like a collection of hexadecimal bytes: which ones were the data and which were the operations? Who knew? A careful deciphering, and knowledge of the opcodes, showed us the way (the sketch below makes this concrete).

Today, the idea of "process" is essential if one has lots of data. How are the data sensed, processed, merged, diverted, massaged, and transformed? As in a petrochemical plant, the raw materials (i.e., the data) are useless without a clearly engineered, and formally represented, process flow. Big data needs big process and big model.
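To make that "same memory space" point concrete, here is a minimal sketch of a toy von Neumann machine in Python. The instruction set, opcode numbers, and sample program are all invented for illustration; the point is that one flat list of integers holds both the program and the data, and nothing in the list itself tells you which cell is which.

```python
# A toy von Neumann machine: program and data share one memory array.
# The opcodes and the sample program below are hypothetical, purely
# for illustration.

HALT, LOAD, ADD, STORE = 0, 1, 2, 3  # invented opcode numbers

def run(memory):
    """Execute opcodes out of `memory` until HALT; return final memory."""
    pc, acc = 0, 0  # program counter and accumulator
    while True:
        op = memory[pc]
        if op == HALT:
            return memory
        arg = memory[pc + 1]      # the operand sits in the next cell
        if op == LOAD:
            acc = memory[arg]     # read a memory cell into the accumulator
        elif op == ADD:
            acc += memory[arg]
        elif op == STORE:
            memory[arg] = acc     # write the accumulator back into memory
        pc += 2

# Cells 0-6 happen to be the program; cells 7-9 happen to be the data.
# To the unaided eye, it is all just one pile of integers.
memory = [LOAD, 7,    # acc = memory[7]
          ADD,  8,    # acc += memory[8]
          STORE, 9,   # memory[9] = acc
          HALT,
          2, 3, 0]    # the "data": two inputs and an output cell

print(run(memory))    # -> [1, 7, 2, 8, 3, 9, 0, 2, 3, 5]
```

Only the interpreter's conventions, the process, separate operation from operand from data; change where the program counter starts and the same bytes mean something else entirely.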