MEET BUMBLEBEE

Associations cite record duplication and matching across multiple systems as one of their biggest data challenges. They also struggle with cleansing dirty data, identifying missing or incorrect information, and normalizing non-standard data formats. Many associations try to tackle these issues with manual effort but are unable to keep up. This is not the way forward!

BUMBLEBEE

Drawing upon our background in data and automation, we set out to build a tool to help associations - of any size - better manage their data. This tool, Bumblebee, automates data transformation and cleansing activities while using data science to provide fast and accurate duplicate identification and record matching.

HOW DOES IT WORK?

Bumblebee assesses and transforms your data in four key ways:

Duplication ID & Matching

  • Using Bumblebee's proprietary matching technology, your data is analyzed to identify duplicates within a single file and/or matches between multiple files. Identified match candidates are returned to you with a shared group ID, a confidence score and a single record flagged as the "Queen Bee," or primary record. The Queen Bee serves as your guide for merging records quickly and efficiently. No more reading through each duplicate to see which is best.
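Bumblebee's matching technology is proprietary, so the following is only a minimal sketch of what the returned columns might look like, not the actual method. It assumes simple string similarity from Python's standard difflib module, and the field names, the 0.85 threshold and the "most complete record wins" rule for picking the primary record are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

def similarity(a, b, fields=("name", "email")):
    """Average string similarity (0.0 to 1.0) across the compared fields."""
    scores = [
        SequenceMatcher(None, a[f].strip().lower(), b[f].strip().lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)

def filled_fields(record):
    """Number of non-blank fields, used here to pick the primary record."""
    return sum(1 for value in record.values() if value.strip())

def group_duplicates(records, threshold=0.85):
    """Return copies of the records with group_id, confidence and is_primary added."""
    groups = []  # each group is a list of indexes into `records`
    for i, record in enumerate(records):
        for group in groups:
            # Compare against the first member of each existing group.
            if similarity(records[group[0]], record) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])

    labeled = []
    for group_id, group in enumerate(groups, start=1):
        # Assumption: the most complete record becomes the primary ("Queen Bee").
        primary = max(group, key=lambda i: filled_fields(records[i]))
        for i in group:
            labeled.append({
                **records[i],
                "group_id": group_id,
                "confidence": round(similarity(records[primary], records[i]), 2),
                "is_primary": i == primary,
            })
    return labeled

sample = [
    {"name": "Jane Doe", "email": "jane.doe@example.org", "phone": ""},
    {"name": "Jane  Doe", "email": "jane.doe@example.org", "phone": "555-0100"},
    {"name": "Sam Lee", "email": "sam@example.com", "phone": ""},
]
labeled = group_duplicates(sample)
```

In this sample the two Jane Doe rows share a group ID, and the row with the phone number is flagged as primary because it has more fields filled in.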

Completeness

  • All supplied data fields will be assessed for incomplete data: NULL, Blank, #N/A, etc. You'll receive a detailed report of missing data by field.
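As a rough illustration of a missing-data report by field, the sketch below counts blank or placeholder values per column. The marker list covers only the examples named above (NULL, blank, #N/A); the column names and sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

# Values treated as missing, compared after trimming and lowercasing.
# Only the markers named in the text; a real list would likely be longer.
MISSING_MARKERS = {"", "null", "#n/a"}

def missing_by_field(rows):
    """Count missing values per field across a list of dict rows."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            # Adding a bool adds 0 or 1, so every field appears in the report.
            counts[field] += (value or "").strip().lower() in MISSING_MARKERS
    return dict(counts)

sample_csv = (
    "name,email,state\n"
    "Jane Doe,NULL,VA\n"
    "Sam Lee,sam@example.com,\n"
    " ,#N/A,MD\n"
)
rows = list(csv.DictReader(io.StringIO(sample_csv)))
report = missing_by_field(rows)
```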

Standardization

  • Where possible, your data is transformed to conform to the formats most commonly used in your data, including cASe, state, country and phone number.
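The "most commonly used format" idea can be sketched for letter case as follows. This is an illustrative assumption about how such detection could work, not Bumblebee's implementation; the function names are hypothetical.

```python
from collections import Counter

CONVERTERS = {"upper": str.upper, "lower": str.lower, "proper": str.title}

def detect_case(value):
    """Classify a value as 'upper', 'lower' or 'proper'; None if blank or mixed."""
    if not value.strip():
        return None
    if value.isupper():
        return "upper"
    if value.islower():
        return "lower"
    if value == value.title():
        return "proper"
    return None

def standardize_case(values):
    """Convert every value in a field to the field's most common case style."""
    styles = Counter(style for style in map(detect_case, values) if style)
    if not styles:
        return list(values)  # nothing to detect, leave the data unchanged
    dominant = styles.most_common(1)[0][0]
    return [CONVERTERS[dominant](value) for value in values]

cities = ["Springfield", "ARLINGTON", "Reston", "fairfax"]
standardized = standardize_case(cities)
```

Note that str.title is a blunt tool: it would turn "McLean" into "Mclean", so a production version would need exceptions for names like that.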

Validation

  • Leading/trailing white space and leading punctuation are removed, and invalid data (e.g. emails without '@', URLs having 'wwww', phones/zips without required numeric characters, etc.) is flagged.
  • Emails and postal addresses are run through NCOA (National Change of Address), DSF (Delivery Sequence File) and email validation testing and are returned flagged. When available, new or corrected postal addresses are supplied, and emails are assigned a status code with a risk assessment.
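The trimming and flagging checks in the first bullet can be sketched as below. The field names and the 10-digit phone rule (a US-style assumption) are hypothetical, and the NCOA, DSF and email deliverability checks are external services that are not modeled here.

```python
import re
import string

def validate(record):
    """Return a trimmed copy of the record plus a list of validation flags."""
    # Remove leading/trailing whitespace from every field.
    cleaned = {field: value.strip() for field, value in record.items()}
    # Remove leading punctuation from the name only; a phone number may
    # legitimately start with '(' or '+'.
    cleaned["name"] = cleaned["name"].lstrip(string.punctuation).lstrip()

    flags = []
    if "@" not in cleaned["email"]:
        flags.append("email: missing '@'")
    if "wwww" in cleaned["website"].lower():
        flags.append("website: contains 'wwww'")
    digits = re.sub(r"\D", "", cleaned["phone"])
    if len(digits) < 10:  # assumed minimum for a US phone number
        flags.append("phone: fewer than 10 digits")
    return cleaned, flags

cleaned, flags = validate({
    "name": "  .Jane Doe ",
    "email": "jane.doe.example.org",
    "website": "http://wwww.example.org",
    "phone": "555-0100",
})
```

The record comes back with the name trimmed to "Jane Doe" and three flags, one each for the email, website and phone fields. Nothing is overwritten beyond the trimming.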

Once the transformation is complete, we'll return your data to you labeled and cleansed. From there you can load your data back into your database, perform merges or add additional data projects like smart appending, prospect research and more. See "What is included in the Bumblebee Deliverable" for details.

WHEN SHOULD I USE BUMBLEBEE?

Data deduplication and cleansing with Bumblebee can be done at any time, but we highly recommend using it if you are:

  • changing your AMS
  • integrating data analytics
  • implementing a data governance strategy
  • acquiring a new data set of unknown quality

Bumblebee is also available as a subscription service that allows you to schedule quarterly deduplication and cleansing as part of your data governance strategy.

WHAT KIND OF DATA CAN BUMBLEBEE TRANSFORM?

Bumblebee is optimized to transform association membership data but can be used to transform other record types as well. Have a different type of data that you're interested in having transformed? Drop us an email at data@associationtrends.com or give us a call to discuss your specific data challenges.

HOW LONG DOES IT TAKE?

Most projects can be completed in as little as one to two weeks from receipt of data. Additional data projects (appending, research, etc.) are scheduled on a case-by-case basis.

HOW MUCH DOES IT COST?

Bumblebee data transformation services are available on an individual project basis and as an ongoing annual data transformation and cleansing service. Packages start at just $2,000 for a single data source of up to 50,000 records, plus a small per-thousand-records fee for postal and email validation.

DO YOU PUT MY DATA BACK IN MY AMS?

At this time we do not push data back into your live AMS. While we are exploring this as an option, the preference to auto-merge and overwrite data varies greatly by association. Bumblebee is designed to provide you with clean data and a duplicate identification guide that helps you merge faster and more efficiently.

IS MY DATA PROTECTED?

Your data will never be used for any purpose other than its transformation and cleansing. We maintain strict data privacy policies internally and with our third-party vendors.

WHERE DOES THE NAME BUMBLEBEE COME FROM?

AT Bumblebee's name has not one, but two origin stories. First, bees are some of the most organized creatures on earth. A colony of bees can move into an abandoned hive, clean it up and get it back in working order, just like Bumblebee does with your data. Second, we are big fans of the Transformers, and Bumblebee is everyone's favorite good guy!

WHAT IS INCLUDED IN THE BUMBLEBEE DELIVERABLE?

Once Bumblebee has finished transforming your file(s), you will receive three items showing you the work Bumblebee has done, as well as a final transformed and cleaned file. At this time, Bumblebee does not directly push your data back into your AMS or system of record; however, this is something we are exploring.

Bumblebee Process Documents: These documents are meant to provide a comprehensive guide for you to interpret and leverage your transformed data. They also help you to quickly and efficiently make record merging decisions:

  • Results Guide: This document will explain how to interpret your Bumblebee results.
  • Summary Tab: Provides a high-level summary of your file's completeness, standardization, validation and duplication scores and counts.
  • Flagged File: This is your original file with errors flagged and duplicate identification columns added. No data is overwritten in this file.

Transformed and Cleaned File: This is the file that you can use to reload into your AMS or system of record. It will have the following transformations and updates applied:

  • Common Format Auto-Detect and Correct: Bumblebee will auto-detect common formats used in your file and auto-correct data that doesn't conform including:
    • cASe - Detects the most commonly used case (UPPER, lower or Proper) by field and then corrects all non-conforming records
    • State format - Detects the most commonly used state format (state abbreviation or full name) and then corrects all non-conforming records
    • Country format - Detects the most commonly used country format (2- or 3-character abbreviation or full name) and then corrects all non-conforming records
    • Phone format - Detects the most commonly used phone format and then corrects all non-conforming records
  • Field Trimming: Removes leading/trailing whitespace and leading punctuation.
  • Postal Address Validation: Identifies invalid postal addresses and provides, where available, corrected information from the NCOA and DSF databases along with a mailability score.
  • Email Validation and Risk Assessment: Identifies valid, invalid, unknown and trap email addresses. Emails are coded with a deliverability risk score.
  • Phone Field Validation: Identifies invalid phone numbers (typos, letters in the field, incorrect numeric count, etc.).

DO YOU PROVIDE OTHER DATA SERVICES?

Bumblebee is an extremely efficient tool to get your data in shape, but associations often need more help to improve and enhance their data. Below is a list of other data services that our team can provide.

  • Smart-appending empty fields
  • Data merging & mapping
  • Custom prospect research
  • Ongoing cleansing services (annual subscription)
  • Data mining on unstructured data
  • Data science consulting for modeling projects

WANT TO KNOW WHERE YOUR DATA STANDS BEFORE YOU GET STARTED?

GET YOUR DATA REPORT CARD!

Take the first step in data improvement by submitting a sample of your data for review by Bumblebee. You'll receive a Data Report Card with grades for Duplication, Completeness, Validation and Standardization. You'll also receive an action plan to improve your data.

DUPLICATION: F* (15%)

COMPLETENESS: C- (72.4%)

VALIDATION: A (93.6%)

STANDARDIZATION: B+ (86.8%)

SUBMITTING YOUR DATA FOR SCORING

  • For organizational records, sort by organization name alphabetically and send the first or last 1,000 records
  • For person records, sort by last name and select the first or last 1,000 records
  • Excel or any flat file format is acceptable
  • Report cards are typically returned within 5 business days
  • Send files via email to data@associationtrends.com

*Unlike the other areas of grading, duplication does not follow the typical percentage ranges of the grading scale. The following scale is used instead: A (0-1%), B (2-4%), C (5-9%), D (10-14%), F (15%+).
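A minimal sketch of how the duplication scale could be applied to a duplicate rate. It assumes the five ranges map, in order, to the letter grades A, B, C, D and F (consistent with the sample report card's F at 15%), and that fractional rates fall into the range whose lower bound they have reached. Both are assumptions rather than documented rules.

```python
from bisect import bisect_right

# Lower bounds (in percent) at which the grade drops to B, C, D and F.
LOWER_BOUNDS = [2, 5, 10, 15]
GRADES = ["A", "B", "C", "D", "F"]  # assumed letter for each range, in order

def duplication_grade(duplicate_pct):
    """Map a duplicate rate in percent to a letter grade."""
    return GRADES[bisect_right(LOWER_BOUNDS, duplicate_pct)]

sample_grades = [duplication_grade(pct) for pct in (0, 1, 2, 4, 9, 14, 15, 40)]
```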

I Have a Data Challenge... HELP!
