Boston Data Festival

2013 Events

Page Down for Event Descriptions and Registration Info


| Date | Day | Venue | Session 1 | Session 2 |
|---|---|---|---|---|
| 11/04 | Mon | Cambridge Hyatt | Keynote: Dr. William Kahn of AIG | Boston Data Panel: VC, Academia, Industry |
| 11/05 | Tue | hack/reduce | Mining Highly Imbalanced Data | 7:30 start. Two talks: (1) Secure, Scalable NoSQL for Real-Time Apps; and (2) Schemaless SQL |
| 11/05 | Tue | DataXu | Using Data & Analytics to Solve Marketing’s Toughest Challenges | Social Hour! |
| 11/05 | Tue | Plastiq | Boston FinTech Tuesday | Boston FinTech Tuesday [cont] |
| 11/06 | Wed | CIC | Data Vis 101: Principles for Design | Humanizing Big Data with Data Visualization |
| 11/06 | Wed | hack/reduce | Converting Text to Data | Predicting Diabetes from a Relational Database of Medical Records |
| 11/07 | Thu | Fidelity | Deep Learning – The Future of Machine Learning and Artificial Intelligence | Predicting Stock Prices with Maximum Accuracy |
| 11/07 | Thu | hack/reduce | Ignite Data Boston: Lightning Talks | Ignite Data Boston [cont] |
| 11/08 | Fri | hack/reduce | Data Science Careers | Startup Showcase |
| 11/08 | Fri | CIC | MATLAB Workshop from 3:30 - 5:30 | |
| 11/08 | Fri | Microsoft NERD | Using API's to Analyze Boston and Silicon Valley's Meetup Ecosystems | API Tutorial and the Programmable Web |

| Date | Day | Venue | Session 1 | Session 2 |
|---|---|---|---|---|
| 11/09 | Sat | hack/reduce | Coder Dojo [12-6] | Coder Dojo [Cont] |
| 11/09 | Sat | CIC | Beginner R workshop | |
| 11/10 | Sun | hack/reduce | MATLAB Workshop [9-10] | |
| 11/10 | Sun | hack/reduce | Hackathon [10-6] | Hackathon [cont]. |
| 11/10 | Sun | CIC | R workshop: Intermediate programming + Matrix Decomposition [10-1] | |

Monday November 4
06:00 PM Dr. William Kahn – Keynote – Hyatt Regency Cambridge

Dr. William Kahn leads the Analytic Capabilities team in the Science function at American International Group, a $70+ Billion financial services company. Previously, he led the statistical team at Capital One. His talk is titled “Ten challenges for the next generation data scientist from a last generation statistician.” [RSVP]

07:00 PM Boston All-Star Data Panel – Hyatt Regency Cambridge

A data all-star panel drawn from the worlds of venture capital, academia, and industry. Panelists include Chris Lynch, Partner, Atlas Venture; Sam Madden, Professor and Lead of BigData@CSAIL & MIT; and Dr. Willard (Bill) Simmons, Co-Founder and CTO of DataXu. [RSVP]

08:00 PM Network Social

Connect with fellow data enthusiasts from across the greater Boston area. [RSVP]

Tuesday November 5
05:45 PM Q&A with Adam Broun, formerly CTO in Residence at Fintech Innovation Lab @ Plastiq

Hosted by the Boston FinTech meetup and featuring a Q&A with Adam Broun, currently of Kensho Finance and formerly CTO in Residence at Fintech Innovation Lab and Managing Director and CIO of Front Office systems at Credit Suisse. [RSVP]

06:00 PM Mining Highly Imbalanced Data @ hack/reduce

David Weisman gives a talk titled “Mining Highly Imbalanced Data.”
Constructing classifiers from imbalanced data is fascinating from both theoretical and practical perspectives. Validating classifiers is also challenging with imbalanced data, as a trivial model that always predicts the majority class will superficially appear accurate. We’ll survey class imbalance from several perspectives, and investigate successful approaches to constructing classifiers from imbalanced data. [RSVP]
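
The accuracy trap described above can be illustrated with a short sketch. The class counts (990 negatives, 10 positives) and the helper names are made up for illustration and are not taken from the talk.

```python
# Illustrative only: the 990/10 class split and the helper names are assumptions.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model identified."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return hits / total if total else 0.0


# 990 majority-class (0) examples and 10 minority-class (1) examples.
y_true = [0] * 990 + [1] * 10

# A trivial "model" that always predicts the majority class.
y_pred = [0] * len(y_true)

majority_accuracy = accuracy(y_true, y_pred)  # high, despite learning nothing
minority_recall = recall(y_true, y_pred)      # no minority example is found

print(f"accuracy: {majority_accuracy:.2f}, minority recall: {minority_recall:.2f}")
```

The trivial model scores 99% accuracy here while finding none of the minority-class examples, which is why accuracy alone is a misleading validation measure on imbalanced data.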

06:00 PM Using Data & Analytics to Solve Marketing’s Toughest Challenges @ DataXu

Join DataXu co-founder and SVP of Analytics and Innovation Sandro Catanzaro as he looks at how data and analytics are disrupting the marketing industry. We’ll look at how these new technological advances turn advertising and marketing into real-time, always-on market research, and how marketers and brands are adapting to this new normal. Stick around following the talk for snacks and networking! [RSVP]

07:20 PM Schemaless SQL: Easily Question All Data Types Through One Familiar Interface @ hack/reduce

In this session you will learn how to dramatically reduce the complexity of multi-structured data analysis on Hadoop and accelerate time to insights. Key concepts in this talk will include:

    – Creating a queryable view of not only traditional structured data, but also non-relational data such as text, documents, and key-value pairs
    – Identifying and presenting dynamically changing attributes within data, thereby dramatically reducing ETL
    – Enabling analysts to run machine learning algorithms over multi-structured data in parallel as SQL functions


08:20 PM Secure, Scalable NoSQL for Real-Time Apps with Apache Accumulo @ hack/reduce

Adam Fuchs presents his talk titled “Secure, Scalable NoSQL for Real-Time Apps with Apache Accumulo.”
Data volumes and security requirements present serious challenges for real-time applications. Apache Accumulo enables online model building and dynamic indexing to support both retrospective analysis and enrichment of streaming data. These mechanisms are built on a foundation of fine-grained access control, supporting a bloom of innovative applications without sacrificing security. This talk will outline the framework that we use to support secure, scalable real-time analysis, as well as dive deeper into many of the supporting features of Accumulo. [RSVP]

Wednesday November 6
06:00 PM Seminar: How Text Becomes Data @ hack/reduce

People and businesses want to make decisions based on large amounts of quantifiable data. This talk by Rob Speer will show you how to create text models that can be built into useful tools such as search engines, recommender systems, and classifiers. [RSVP]

06:30 PM Data Vis 101: Principles for Design @ CIC

Our speaker, Lynn Cherny, has condensed an intro workshop into 45 minutes and will review the principles for successful design with data, including tips on visual encodings, story-finding, and principles for developing exploratory or explanatory visualizations. We’ll look at a couple of redesigns and award winners, plus maybe a few #WTFvis examples along the way. [RSVP]

07:00 PM Predicting Diabetes from a Relational Database @ hack/reduce

Jeremy Achin presents “Predicting Diabetes from a Relational Database of Medical Records.” The theme here is “how to build highly accurate predictive models when your data is messy and spread out across many data sources.” [RSVP]

07:30 PM Humanizing Big Data with Data Visualization @ CIC

Mark Schindler presents his talk – ‘Data visualization is the human front-end of big data’. In order for people to solve problems and make decisions using insights drawn from big data, they need a clear understanding of the stories that are often buried. How can UI designers and data visualization practitioners help make those insights understandable and useful to decision-makers? We all deal with the challenge of how to identify meaningful objects or events in a raw datastream, and present those events to users in a way that provides context and helps them get a qualitative understanding of what is going on. We’ll look at approaches to accomplishing this, and how techniques like visual abstraction, attention-management and metaphor can help. [RSVP]

Thursday November 7
06:00 PM Deep Learning – The Future of Machine Learning and AI @ Fidelity Auditorium

An overview of Deep Learning (the future of machine learning and artificial intelligence).
Speakers: Dallin Akagi and/or Alec Radford. Dallin worked on deep learning at the NSA for the last 2 years, while Alec is an accomplished Kaggler and an expert at applied deep learning. [RSVP]

06:30 PM Ignite Data Boston @ hack/reduce

“Enlighten us, but make it fast”
Featured in various cities all over the country, Ignite presentations give experts, professionals, and just plain geeks the chance to share their passions with an audience. What’s the twist? The presentations only contain 20 slides that auto-advance every 15 seconds, leaving presenters with a strict five-minute presentation. Ignite Data Boston will give attendees the opportunity to see some of the diverse and interesting data projects going on in the Boston area. [RSVP]

07:00 PM Predicting Stock Prices with Maximum Accuracy @ Fidelity Auditorium

This talk will be presented by Sergey Yergenson. Sergey, currently ranked 11th out of over 100,000 Kaggle users, most recently placed 2nd out of 448 competitors in a Kaggle stock price prediction competition. He will show how to predict stock price movement by walking through his 2nd-place solution to the competition. [RSVP]

Friday November 8
03:30 PM MATLAB Workshop @ CIC

Todd Atkins leads this MATLAB Workshop, Friday 3:30 to 5:30 pm at the CIC. [RSVP]

06:00 PM Data Science Careers @ hack/reduce

Data Scientists Panel Event. Panelists include:

  • Andy Palmer, Co-Founder at Data Tamer
  • John Piekos, VP of Engineering at VoltDB
  • Catherine Havasi, Co-Founder and CEO of Luminoso
  • Chris Rocca, VP of Engineering at Hadapt

Network with recruiters from sponsoring companies before and after the panel discussion. [RSVP]

06:00 PM API Tutorial with Use Case (Boston’s Meetup Ecosystem) @ Microsoft NERD

APIs continue to grow in number, as evidenced by the Programmable Web website. The night will have two parts: first, an example use case, Boston’s Meetup Ecosystem (which used API-generated data), will be presented; then a hands-on tutorial will show how to pull data from an API for use in analysis, mashups, etc. [RSVP]

07:00 PM Data-centric Startup Showcase @ hack/reduce

Data-centric startup showcase highlighting innovative companies doing cool things with data: featuring Outbrain, Luminoso, JAZE, and Nutonian. [RSVP]

08:00 PM Data Social @ Hack/Reduce

Data (Speed) Dating: pitch your data-centric company or idea and find potential team members. [RSVP]

Saturday November 9
12:00 PM CoderDojo @ hack/reduce

Hack/Reduce hosts a coding workshop for local high school students.

01:00 PM Beginner R Workshop @ CIC

This 3-hour workshop is focused on learning R. In recent years, R has become an essential tool for data mining, machine learning, predictive analytics, and more traditional statistical methods. Facebook, Google, and the New York Times use R for complex data analysis and information visualization. Numerous employers are actively seeking strong R skills.

This workshop is designed to get folks up and running with R. We will cover basic R usage including data management, simple statistical analyses, and data visualization. After completing this hands-on workshop, you will have a solid foundation for moving on to more extensive analysis in R. [RSVP]

[See also our 2-part Intermediate R workshop on Sunday at 10 AM]

Sunday November 10
09:00 AM Pre-Hackathon MATLAB Workshop @ hack/reduce

Jiro Doke will lead this pre-hackathon MATLAB Workshop, Sunday 9:00 to 10:00am. The Sunday session will be a consolidated version of the Friday one. [RSVP]

10:00 AM Stock Prediction Hackathon @ hack/reduce

Boston Data Festival is hosting a Hackathon on Sunday 11/10/13 from 10 am to 6 pm. The event will take place at Hack/Reduce (275 Third Street, Cambridge, MA). The goal of the Hackathon is to predict the direction of stock price movements as accurately as possible.
The following cash prizes will be awarded!

  1st place: $500
  2nd place: $250
  Best submission using MATLAB: $250
  Self-respect from doing better than a monkey throwing darts: Priceless.

During the Hackathon, every participant can use a free MATLAB license. [RSVP]
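
One common way to score such predictions is directional accuracy: the share of periods in which the predicted direction (up or down) matches the actual move. The sketch below assumes that definition; the Hackathon's actual scoring rule is not stated here, and the prices, predictions, and function names are hypothetical.

```python
# Hypothetical sketch: the scoring rule, prices, and names are assumptions,
# not the Hackathon's official evaluation.

def directions(prices):
    """Return +1, -1, or 0 for each consecutive price change."""
    return [(b > a) - (b < a) for a, b in zip(prices, prices[1:])]


def directional_accuracy(actual_prices, predicted_directions):
    """Share of periods where the predicted direction matches the actual move."""
    actual = directions(actual_prices)
    if len(actual) != len(predicted_directions):
        raise ValueError("need one predicted direction per price change")
    matches = sum(a == p for a, p in zip(actual, predicted_directions))
    return matches / len(actual)


# Made-up closing prices for five days, giving four day-over-day moves.
closing_prices = [100.0, 101.5, 101.0, 102.2, 101.8]

# Made-up predictions for those four moves (+1 = up, -1 = down).
predicted = [1, 1, 1, -1]

score = directional_accuracy(closing_prices, predicted)
print(f"directional accuracy: {score:.2f}")
```

Under this definition, guessing up or down at random would be expected to score about 0.5, which is the dart-throwing baseline to beat.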

10:00 AM Programming with Data: Intermediate R and Matrix Decomposition @ CIC

This workshop consists of two 90-minute segments, “Intermediate R” and “Matrix Decomposition”.

In “Programming with Data: Intermediate R” we will teach you to start thinking like an R programmer. We will cover topics including advanced data I/O, vectorized computation, _apply_ functions and the map-reduce pattern, table reshaping and pivoting, string manipulation, and coding patterns for data visualization with ggplot2. The Intermediate R workshop assumes some experience with R programming. Advanced statistical knowledge is not required for this workshop.

Matrix Decomposition is the follow-up to the Boston Data Mining Meetup talk “Mining the Matrix (Decomposition).” The workshop doesn’t assume any prior knowledge of matrix decomposition and will start with a recap of matrix decomposition and the examples from the talk, making it accessible to a general audience. We will cover how to perform the decompositions in R and focus on how to interpret the resulting factors as actionable information. In addition to a number of toy examples, we will cover a Netflix-style movie genre discovery problem and a stock market trend problem. Requirements: a laptop with a working R install and a basic understanding of the R environment. [RSVP]