Update: Public Twitter Data up through May 26, 2016 is now available for downloading.
Challenge: The Indy Big Data Conference Visualization Challenge encourages all participating Gigabyte Sponsor Companies to visualize trends in cancer, such as side effects, medications, etc., based on unstructured Twitter data. The data covers more than 500,000 Twitter accounts, each with five or more tweets. Each file represents one user and contains 3 columns: an identification number for the anonymous Twitter account, the date and time each tweet was posted, and the unstructured text of that tweet or retweet. Use only as much of the Twitter data as you deem necessary.
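As a rough illustration of the three-column layout described above, the following Python sketch reads one per-user file into structured records. The file format details are assumptions, not part of the challenge description: the tab delimiter, the `%Y-%m-%d %H:%M:%S` timestamp format, and the names `Tweet` and `read_tweets` are all hypothetical and should be adjusted to match the actual download.

```python
import csv
import io
from datetime import datetime
from typing import Iterator, NamedTuple, TextIO


class Tweet(NamedTuple):
    account_id: str      # anonymous account identification number
    posted_at: datetime  # date and time the tweet was posted
    text: str            # unstructured tweet or retweet text


def read_tweets(
    handle: TextIO,
    delimiter: str = "\t",                    # ASSUMPTION: tab-separated columns
    time_format: str = "%Y-%m-%d %H:%M:%S",   # ASSUMPTION: timestamp layout
) -> Iterator[Tweet]:
    """Yield one Tweet per well-formed three-column row."""
    # QUOTE_NONE keeps quotation marks inside tweets as ordinary characters.
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip rows that do not have exactly three columns
        account_id, posted_at, text = row
        try:
            when = datetime.strptime(posted_at, time_format)
        except ValueError:
            continue  # skip header rows or timestamps in another format
        yield Tweet(account_id, when, text)


# Made-up sample content standing in for one user's file.
sample = io.StringIO(
    "1001\t2016-05-01 09:30:00\tStarting treatment today #cancer\n"
    "this line is malformed\n"
)
tweets = list(read_tweets(sample))
```

With a real file, the same function could be called on `open(path, encoding="utf-8", newline="")` instead of the in-memory sample.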
The challenge requires that you start with a hypothesis and then explore different possibilities, e.g. phrases, words or hashtags. Link this data to the time period in which it was posted to build a trend or pattern, and create your visualization solution.
- Prepare the raw data for visualization. While any software can be used, you may want to investigate using KNIME, Java, Python or OpenNLP. You may also use any proprietary software you have available.
- Preprocess and visualize the prepared data using, for example, Tableau, the D3.js library or KNIME. You may also use any proprietary software you have available.
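One possible way to link a word or hashtag to the time period in which it was posted is to count matching tweets per month. The Python sketch below is only an illustration under stated assumptions: the function name `monthly_term_counts`, the monthly granularity, and the sample rows are all hypothetical, and the input is assumed to be already parsed into (timestamp, text) pairs.

```python
import re
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, Tuple


def monthly_term_counts(
    tweets: Iterable[Tuple[datetime, str]], term: str
) -> Dict[str, int]:
    """Count tweets per calendar month that mention `term` as a whole token."""
    # Case-insensitive match that is not embedded inside a longer word,
    # so "#chemo" does not match "#chemotherapy".
    pattern = re.compile(
        r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE
    )
    counts: Counter = Counter()
    for posted_at, text in tweets:
        if pattern.search(text):
            counts[posted_at.strftime("%Y-%m")] += 1
    # Sort by month key so the result reads as a time series.
    return dict(sorted(counts.items()))


# Made-up rows for illustration only; not taken from the challenge data.
rows = [
    (datetime(2016, 4, 2, 8, 0), "Nausea again after treatment"),
    (datetime(2016, 4, 20, 14, 5), "No side effects so far"),
    (datetime(2016, 5, 3, 19, 45), "nausea is finally easing"),
    (datetime(2016, 5, 9, 7, 10), "NAUSEA all week #chemo"),
]
trend = monthly_term_counts(rows, "nausea")
print(trend)  # {'2016-04': 1, '2016-05': 2}
```

The resulting month-to-count mapping can be exported to whichever visualization tool you choose, for example as a two-column table for a line chart.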
What we are looking for:
Visualizations, animations, maps, numbers or time frames, heat maps or clouds that show executives and decision makers something about the trends in cancer, and anything else interesting that this data may reveal. Each participating company will have up to 10 minutes to present its solution to the general session audience beginning at 4:00 pm on Thursday, September 1, 2016 (the day of the Conference). The session will be guided by a moderator. All participating Gigabyte Sponsor Companies will be asked the same five questions:
- What trends did you detect?
- How much of the data did you use?
- What tools did you use?
- How fast is your solution?
- Describe the ease of use of your solution.
Please contact Josette Jones and Hamed Abedtash with any questions.
Josette Jones
Associate Professor, IU School of Informatics
jofjones@iupui.edu
Hamed Abedtash
PhD Candidate Health Informatics, IU School of Informatics and Computing
abedtash@iupui.edu