This is the first in a series of blog posts about Big Data and Analytics. This article lays the groundwork for later posts that outline how Big Data and its implementation can improve performance for both NPOs and Corporate Social Responsibility (CSR) efforts.


What is Big Data Really? Key Definitions

It’s critical to get on the same page about what Big Data is. These definitions from Gartner and Wikipedia offer excellent starting points:

Big Data according to Gartner:

“Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.”

And Wikipedia describes Big Data as having the following characteristics:

  • Volume – The quantity of generated and stored data. The size of the data determines its value and potential insight, and whether it can actually be considered big data at all.
  • Variety – The type and nature of the data. This helps people who analyze it to effectively use the resulting insight.
  • Velocity – In this context, the speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development.
  • Variability – Inconsistency of the data set can hamper processes to handle and manage it.
  • Veracity – The quality of captured data can vary greatly, affecting accurate analysis.

Why all the fuss about Big Data?

Big Data allows for the collection and analysis of disparate types of data such as employee details, social media, and business news. Originally, all of this data existed in separate applications with completely different data models, but by pulling this data together into a single data repository, insightful analytics can be performed.

An interesting aspect of this approach is that the original, loosely bound data can still be related after the fact using ‘late data binding’. With traditional analytic approaches (OLAP), the relationships are generally implemented at design time, which limits the end user to answering only the questions that were anticipated when the model was built. Late data binding allows new questions to be answered as the volume of data grows and/or new data sources are added.
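To make late data binding a little more concrete, here is a minimal Python sketch. The record layout, the `source` and `employee_id` fields, and the `total_giving_by` function are all hypothetical and chosen only for illustration. The point is that the raw records are stored as they arrived, and the relationship between them is chosen when the question is asked.

```python
from collections import defaultdict

# Hypothetical raw records, stored as-is from two separate source systems.
raw_records = [
    {"source": "hr", "employee_id": 7, "name": "Ana", "department": "Finance"},
    {"source": "giving", "employee_id": 7, "amount": 50.0, "cause": "education"},
    {"source": "giving", "employee_id": 7, "amount": 25.0, "cause": "health"},
    {"source": "giving", "employee_id": 9, "amount": 40.0, "cause": "education"},
    {"source": "hr", "employee_id": 9, "name": "Ben", "department": "Sales"},
]


def total_giving_by(records, hr_field):
    """Sum giving amounts grouped by an HR attribute chosen at query time."""
    # The lookup is built when the question is asked, not when data is loaded.
    hr_by_id = {r["employee_id"]: r for r in records if r["source"] == "hr"}
    totals = defaultdict(float)
    for r in records:
        if r["source"] == "giving":
            key = hr_by_id.get(r["employee_id"], {}).get(hr_field, "unknown")
            totals[key] += r["amount"]
    return dict(totals)


print(total_giving_by(raw_records, "department"))
# {'Finance': 75.0, 'Sales': 40.0}
```

Asking a different question later, such as `total_giving_by(raw_records, "name")`, needs no change to how the data was stored.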

Interesting Use Cases

Below are a few example use cases for this data. In my follow-up blog posts I will go into specific details, offering real-world solutions for both non-profits and corporations.

Auditing

Most business applications provide some form of auditing. The auditing function allows users to determine who changed what and when. Depending on the implementation, the level of detail in the audit trail might be limited. For example, you might be able to see who changed a record and when, but not exactly what was changed or why. With Big Data, imagine instead having the full history of an employee’s or donor’s record from the beginning of time.
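As a rough sketch of what a more detailed audit trail could capture, the Python below records one event per changed field, including the old value, the new value, and a reason. The `AuditEvent` class, its field names, and the `diff_to_events` helper are assumptions made for illustration, not the design of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One changed field on one record (hypothetical layout)."""
    record_id: str
    changed_by: str
    changed_at: datetime
    field_name: str
    old_value: object
    new_value: object
    reason: str


def diff_to_events(record_id, before, after, changed_by, reason, changed_at=None):
    """Compare two versions of a record and emit one event per changed field."""
    changed_at = changed_at or datetime.now(timezone.utc)
    events = []
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            events.append(
                AuditEvent(record_id, changed_by, changed_at, name, old, new, reason)
            )
    return events


events = diff_to_events(
    "donor-1",
    {"email": "a@example.org", "city": "Oslo"},
    {"email": "b@example.org", "city": "Oslo"},
    changed_by="jane",
    reason="donor request",
)
for e in events:
    print(e.field_name, e.old_value, "->", e.new_value, "by", e.changed_by)
```

Keeping every such event, rather than only the latest state, is what makes a full record history possible later.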

Click tracking

As a developer, I am always interested in which features of our applications are used the most. I would also like to know which features need a redesign because they confuse our users. By collecting detailed usage information, this analysis can be performed after the fact (or in near real time). This would also open the door to a more proactive support model.

Log Analysis

By collecting and analyzing the wide range of information that applications generate, we can provide interesting value-added services for our customers. For example, we can identify features in our applications that are underperforming or generating errors. By linking this information with our support systems, we can provide much more detailed and proactive support.
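A small Python sketch of this kind of analysis is shown below. The log line format (`LEVEL feature=name message`), the feature names, and the `error_counts` function are invented for the example; real application logs will look different and need their own parsing.

```python
import re
from collections import Counter

# Assumed log format for this sketch: "<LEVEL> feature=<name> <message>"
LOG_PATTERN = re.compile(r"^(?P<level>[A-Z]+) feature=(?P<feature>\S+) (?P<message>.*)$")

sample_lines = [
    "INFO feature=donation_form page loaded",
    "ERROR feature=donation_form payment gateway timeout",
    "ERROR feature=volunteer_signup missing required field",
    "ERROR feature=donation_form payment gateway timeout",
    "this line does not match the assumed format",
]


def error_counts(lines):
    """Count ERROR lines per feature, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[match.group("feature")] += 1
    return counts


for feature, count in error_counts(sample_lines).most_common():
    print(feature, count)
```

A ranking like this could then be matched against support tickets to spot problem areas before customers report them.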

Trends

Combining giving/volunteering information with social media and news allows organizations to identify interesting trends. These trends can improve campaign performance as well as help to predict giving based on external events.

How should I start?

In the beginning, the most important thing is to start collecting data (as much as possible). It’s always easier to delete data than it is to re-create it after the fact. Define interesting data sources and start collecting. Start small, and store as much raw, unprocessed data as feasible.
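One simple way to start is to append each incoming record, untouched, to a file with one JSON object per line. The sketch below assumes a hypothetical file name and envelope layout (`received_at`, `source`, `payload`); it only illustrates the idea of keeping raw data and is not a recommendation of a specific storage system.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_raw(path, source, payload):
    """Append one unprocessed record, wrapped with its source and arrival time."""
    envelope = {
        "received_at": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "payload": payload,  # stored exactly as received, no cleanup
    }
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(envelope) + "\n")


def read_raw(path):
    """Read every stored record back, one JSON object per line."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, calling `append_raw("events.jsonl", "giving", {"donor": "d-1", "amount": 50})` adds one line to the file, and `read_raw("events.jsonl")` returns everything collected so far for later analysis.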

An alternative is to work with vendors that have a vision for leveraging your data, both today and in the future.

Stay tuned for more posts in this series on Big Data.