Data Tech has ended
Thursday, May 30
 

8:00am CDT

Registration, Networking, & Coffee
Thursday May 30, 2019 8:00am - 9:00am CDT

9:00am CDT

Develop a Data Strategy to Compete with the Giants
For most companies, data is viewed as a problem instead of an asset. Data is often stuck in systems that don’t talk to each other, manual processes affect data quality, and analytics tools aren’t providing clear insights. But those companies who use their data to drive business strategy are out-performing their competitors.

To be more competitive in your industry, you must take advantage of the ever-growing amount of available data – and that starts with a Data Strategy. A documented roadmap that clearly defines company goals and the specifics on how to get there will put you on the path towards data-driven decision making.

In this session, we will discuss:
• Why understanding your data is so important in today’s environment
• The key elements and benefits of a Data Strategy
• How to start developing your own Data Strategy
• How to cut through industry noise and focus on the technologies and approaches that are right for you

Speakers

Dave Williams

Director of Data Strategy, Analytics8
David works with companies of all sizes to help mature their analytics capabilities.


Thursday May 30, 2019 9:00am - 9:30am CDT
(C) A2564 - A2566 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:00am CDT

Connecting Data to Decisions: Where to Start with Modern Analytics
The analytic landscape is changing faster than ever, and companies who don’t embrace that change are quickly being left by the roadside. Join data evangelist Doug Bordonaro to learn how world-class companies are thinking about data, how you can get started, and why in this new analytic world it’s more important to think about people than features.

Speakers

Doug Bordonaro, MBA

Field CTO & Chief Data Evangelist, ThoughtSpot
With over 20 years of experience with cutting-edge BI and DW solutions and roles, Doug is a thought leader in the analytics space and spends much of his time advising enterprises on how they can solve BI pain. 


Thursday May 30, 2019 9:00am - 9:30am CDT
(A) Auditorium Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:00am CDT

Evangelizing Python For Business
Python's simple structure has been vital to the democratization of data science. But as the field rushes forward, making splashy headlines about specialized new jobs, everyday Excel users remain unaware of the value that elementary building blocks of Python for data science can bring them at the office.

Join us for a conversation about bringing Python out of IT and into the business. We'll share challenges and successes from writing tutorials, teaching classes, and advocating adoption among new users.

Speakers

Chris Moffitt

Sr Director Revenue Operations and Analytics, CSI
I have spent the majority of my career in business roles where I try to leverage python to solve real world problems. I believe python is a very powerful tool that can replace a lot of the mundane Excel work that most businesses have.

Katie Kodes

Database Developer
Katie integrates Oracle, SQL Server, & Salesforce databases into web sites and 3rd-party tools. For fun, she teaches Python classes and writes online Pandas tutorials for Salesforce administrators and other Excel users ready to move on from VLOOKUP.



Thursday May 30, 2019 9:00am - 9:30am CDT
(D) P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:00am CDT

How to Model with Millions of Variables
In some fields, big data doesn't just mean lots of rows — it also means lots of variables. For example, in the fields of proteomics and genomics, the analysis of gene expression data or mass spectrometry readings is increasingly important for diagnosing diseases. But there can be hundreds of thousands of SNPs from thousands of patients in a single study. So how do you build predictive models in which the number of independent variables can number in the millions?

In this demonstration we show how one manufacturer used a series of 'wide data' innovations in the TIBCO Data Science platform to build a digital twin model that identified causes of yield loss in a semiconductor fab. The result represents the successful convergence of IoT, big data, and machine learning to build a predictive model on over 6 million features. 

Speakers

Steven Hillion

Sr. Director of Data Science, TIBCO
Steven Hillion has been leading large engineering and analytics projects for over fifteen years. At TIBCO Software, he leads innovations in large-scale machine learning and collaborative analytics. He received his Ph.D. in mathematics. 


Thursday May 30, 2019 9:00am - 9:30am CDT
(H) P1838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:00am CDT

Robotic Process Automation (RPA): The Path to AI
UiPath Robots are learning new skills with increasingly sophisticated Pragmatic AI capabilities to enable automation of progressively complex, cognitive tasks. Paired together, AI and RPA can read, write, listen, and make decisions to work effectively. UiPath is making major strides to provide a delivery mechanism for AI skills by building its own embedded AI offerings, while providing the ability to host custom skills for partners and customers, allowing last-mile delivery of AI capabilities. Together, RPA and AI address major challenges in applying AI to real-world problems.

Speakers

Chirag Halani, MS

Customer Success Manager, UiPath


Thursday May 30, 2019 9:00am - 9:30am CDT
(F) P0838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:00am CDT

Artificial Art - Exploring Methods of Neural Style Transfer
Neural Style Transfer (NST) is the process of using Convolutional Neural Networks (CNNs) to create artistic imagery by separating and recombining content and style.

This interactive session will discuss the architectures used in various methods of NST and will show detailed examples of how this can be done using Python. Users will be able to follow along in a Jupyter notebook, and all code will be made publicly available via GitHub.

Speakers

Nick Morgan

Data Scientist, Korn Ferry
Nick has been working as a Data Scientist at various Fortune-500 companies for the past 5 years. He specializes in Natural Language Processing, Image Processing, and Development-Operations. View his GitHub here...


Thursday May 30, 2019 9:00am - 9:30am CDT
(G) P1808 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:00am CDT

Graph based approach to Healthcare Analytics
This presentation will discuss graph databases, graph algorithms, and where graph-based approaches can be used in Healthcare analytics use cases.

Speakers

Sudeep Vishnumurthy

Distinguished Engineer, Optum
As a technology strategist, architect and developer, I have a fairly broad exposure to healthcare-related solutions in the US. My current focus area is the application of graph-based approaches for healthcare use cases.


Thursday May 30, 2019 9:00am - 9:30am CDT
(B) Garden Room Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:00am CDT

AllenNLP Introduction
This presentation introduces AllenNLP, a deep-learning framework explicitly focused on natural language processing. We'll cover the shortcomings of other deep learning frameworks that AllenNLP attempts to address. We'll also walk through some basic machine-learning tasks that AllenNLP covers.

Speakers

John Hudzina, PhD

Senior Research Scientist, Thomson Reuters
Dr. John Hudzina works at Thomson Reuters' Center for AI & Cognitive Computing (C3). In his role at C3, John specializes in cloud computing services that support natural language processing.


Thursday May 30, 2019 9:00am - 9:30am CDT
(I) K1450 (Fireside) Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:00am CDT

Startup Showcase
The startup showcase features pitches from promising startups addressing a major market need or problem with a solution related to data/analytics, AI, or machine learning. An audience Q&A follows each pitch, so bring your questions!

Session Chair: Graeme Thickins (graeme@minneanalytics.org)

Startups

Click360

An AI-driven analytics platform for marketers, to help businesses understand their customers' journeys and put each customer on the best path to revenue.

Homi

A networking platform designed to build meaningful relationships between students and alumni, allowing students to ask questions, meet, and connect with alumni using career profiles and organizational groups.

Nested Knowledge

A platform that transforms the way clinicians review research, from a static text-based model to a dynamic and interactive visual landscape.

Phenomix_Sciences

A comprehensive, data-driven platform that brings precision medicine to obesity management using innovative diagnostics and AI algorithms developed at Mayo Clinic.  

ProcessBolt

State-of-the-art vendor risk assessment platform, replacing spreadsheets with an automated solution that delivers quantitative and actionable vendor risk data.

SilentMD

A digital recovery platform for total knee replacement patients, utilizing wearable sensors, data analytics, and direct human support to streamline post-surgical recovery.

ZIFF

An AI database for image, audio, and video data with built-in indexing, search, training, and inference capabilities, for the type of data that doesn’t fit into the tidy rows and columns of traditional databases.


Thursday May 30, 2019 9:00am - 11:15am CDT
(E) P0806 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:30am CDT

Break
Thursday May 30, 2019 9:30am - 9:45am CDT

9:45am CDT

Discerning Industry Requirements for Graph Analytics
Earlier this year, Gartner® listed graph processing among the top 10 data and analytics technology trends for 2019: “The application of graph processing and graph DBMSs will grow at 100 percent annually through 2022 to continuously accelerate data preparation and enable more complex and adaptive data science.” Almost ten years ago, Intel launched an academic research program to study parallel algorithms for non-numeric computation. This research gave rise to GraphBLAS, a graph analytics specification based on sparse linear algebra. This approach leverages decades of existing R&D on high-performance, parallel sparse linear algebra. In addition to GraphBLAS, Intel is evaluating several other approaches to graph analytics, including parallel frameworks and domain-specific languages.

The landscape of graph analysis is diverse and highly fragmented. What are the critical graph algorithms? What graph sizes are typical for a given application? Are streaming, dynamic graphs required, or is static analysis enough? When is the graph topology sufficient, and when must vertex and edge attributes be considered? This talk covers Intel’s internal evaluation and benchmarking of various graph solutions in order to answer these questions. Our recent work on truly large graphs (i.e., roughly one trillion edges) will also be presented.

Speakers

Henry A. Gabb

Senior Principal Engineer, Intel
Henry A. Gabb is a Senior Principal Engineer in Intel’s Compute Performance and Developer Products group. Much of his career has been spent promoting the value of parallel computing. He was the program manager for the Universal Parallel Computing Research Centers, a joint Intel/Microsoft...


Thursday May 30, 2019 9:45am - 10:30am CDT
(C) A2564 - A2566 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:45am CDT

MinneAnalytics Open Forum
What's next for MinneAnalytics?

Join us for a discussion about possible future events such as Farm, Ag, Supply Chain, SportCon, or People Analytics.

Thursday May 30, 2019 9:45am - 10:30am CDT
(I) K1450 (Fireside) Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:45am CDT

Founder Stories: Building a Successful Data or AI Startup in MN
Once again, we'll have a panel of founders who pitched at previous Data Tech events, in 2017 and 2018, talking about their successes since then -- including the millions they have raised.

Moderator: Graeme Thickins, Board Member, MinneAnalytics

Speakers

Scott Burns

CEO & Cofounder, Structural

Dan Mallin

Managing Partner & Cofounder, Equals3

Daren Klum

CEO & Founder, Secured2


Thursday May 30, 2019 9:45am - 10:30am CDT
(E) P0806 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:45am CDT

Pure1 Meta: AI Platform to Enable Self-Driving Storage
In this presentation, we will discuss how Pure1 Meta, the Pure AI engine, will help further advance the vision of self-driving storage. We will recap our innovation efforts, including building a global sensor network that collects over 1 trillion data points per day. 

Speakers

Stan Yanitskiy

Technical Marketing Engineer, Pure Storage
Stan is a technical marketing expert who acts as a bridge between engineering and product marketing. He has the ability to simplify complex technical concepts for non-technical audiences. Stan works with cross-functional teams to set strategic direction.


Thursday May 30, 2019 9:45am - 10:30am CDT
(F) P0838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:45am CDT

Deep Dive into the Doughboy’s Data
General Mills is one of the largest food companies in the US, with an ever-widening family of brands that consumers love. Our presentation will focus on how we track, collect and synthesize branded impression data from web, social, email and beyond for Pillsbury.com. We'll demonstrate how we are using Tableau to visualize this impression data in a simple and easy-to-understand format for our brand teams. Join us for a deep dive into the Doughboy's data!

Speakers

Devan Sayles

Data Engineering Manager, General Mills
Devan has been with General Mills for 8 years. She has worked as an SAP Developer, a Digital Marketing analyst, and for the past 3 years as a Data Developer using Hadoop and other Big Data technologies. 

Sammi Crocker

Business Analyst, General Mills
Sammi has been a Business Analyst, overseeing the Digital Analytics product team at General Mills since December of 2017. Prior to General Mills, she spent 5 years in the agency world where she was able to take a deep dive into the world of marketing analytics and data visualization...


Thursday May 30, 2019 9:45am - 10:30am CDT
(B) Garden Room Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:45am CDT

Trust the Algorithm: Recommending Technicians for On-Site Work in a B2B Marketplace
In order to provide a positive user experience, a gig economy marketplace must be able to facilitate optimal matches between contract workers and jobs. In my presentation I will share our learnings from building recommendation systems for the Field Nation online marketplace. I will also further discuss the latest AI technologies included in this work, such as: Pyro/PyTorch for Bayesian inference and text categorization based on universal language models.

Speakers

Alex Smith, PhD

Senior Data Scientist, Field Nation
After getting his PhD in physics, Alex was a researcher in experimental high-energy physics for almost two decades before a career change to data science.  


Thursday May 30, 2019 9:45am - 10:30am CDT
(G) P1808 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:45am CDT

A Data Science Playbook for Explainable AI - Navigating Interpretable vs Predictive Models
This presentation covers the content of my blog posts and gives a broader context to the important issue of what people are starting to call explainable AI.

Speakers

Joshua Poduska

Chief Data Scientist, Domino Labs


Thursday May 30, 2019 9:45am - 10:30am CDT
(D) P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:45am CDT

Building Your Own R Package
Do you have one function or two, or more, that you keep copying into multiple R scripts? If your answer is yes, and you have not put them into an R package yet, then this session is for you. I will be going through how you can build your very own R package. This presentation is suitable for R users at all levels.

Speakers

Kim Eng Ky

Principal Data Scientist, UHC Government Operations


Thursday May 30, 2019 9:45am - 10:30am CDT
(H) P1838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

9:45am CDT

Do-It-Yourself Interactive Data Visualization: Starting Simple (and Keeping It That Way)
A major part of persuading with data is how you present it. For many data scientists and related professionals, the heavy lifting of data crunching is done in Python, R, Julia, Scala, etc. But making interactive visualizations of data and serving up the results in an app yourself means working with JavaScript. If you are like the presenter, starting a second side-career as a front-end developer is not a realistic option. Fortunately, it has gotten easier in the past few years to get the benefits of interactive plotting without having to become a JavaScript expert (and without paying for an expensive licensed product). I will walk through several examples of how to take data from Python code and serve it up for free as interactive plots in easily deployable apps. I will provide examples on GitHub for trying out afterwards.

Speakers

Dinesh Shenoy, PhD

Data Scientist, Veritas Technologies LLC
Dinesh is a data scientist with an academic background (physics and astronomy) and a focus on implementing solutions for data analysis and visualization. He also hosts a twice-monthly meetup group focused on data science and related topics. The code examples for today's presentation...



Thursday May 30, 2019 9:45am - 10:30am CDT
(A) Auditorium Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

10:30am CDT

Break
Thursday May 30, 2019 10:30am - 10:45am CDT

10:45am CDT

Why Marketing Analytics Fails (and How to Do It Right)
Marketers spent $5 billion on analytics tools last year — yet most organizations still struggle to turn data into insight. What’s the best way to invest in your team?

Attendees will learn how to:
  • Correctly identify their team’s current analytics needs and objectives
  • Understand the essential roles required for a healthy analytics practice
  • Determine how to choose the right tools for data management and dashboarding
  • Discover what they need to do to advance to the next level of performance

Speakers

Matt Hertig

CEO / Co-Founder, Alight Analytics
Matt Hertig is the CEO of Alight Analytics, the leader in marketing analytics. For more than a decade, Matt has shown the world’s best-known companies — Maui Jim, AMC Theatres, Mattel, Adidas and more — how to turn data into powerful insights. 


Thursday May 30, 2019 10:45am - 11:15am CDT
(C) A2564 - A2566 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

10:45am CDT

Building a Music Recognition AI to Convince Your Friends You’re Not the Worst at Guitar
Sometimes in life you decide to play a song on your guitar for your friends. Inevitably, the conversation arises in which they have no idea what song you were trying to play, but they’re unsure whether or not they can ask what song it was without offending you. Listen in as Hopper explains in concise, easily understandable, laughable terms how you can forego getting better at guitar and just build a neural network to solve the problem. By letting an AI tell your friends what song you’re trying to play, they can remain “Minnesota nice,” and you can keep your dignity.

Speakers

Stephen Hopper

Senior Software Engineer, Signifyd
Functional programming advocate and machine learning engineer.



Thursday May 30, 2019 10:45am - 11:15am CDT
(I) K1450 (Fireside) Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

10:45am CDT

Building Pipelines with Apache NiFi
In this presentation, Mac Noland (Chief Data Officer) and Safwan Islam (Senior Data Engineer) will provide an overview of the distributed data flow tool - Apache NiFi. They will walk through a commonly used industry pipeline and implement it with the NiFi user interface. Additionally, the talk will touch on popular end points (origins and destinations), appropriate use cases, and comparisons to other similar tools.

Speakers

Mac Noland

Principal Solutions Architect, phData

Safwan Islam

Machine Learning Engineer, phData
I'm a Machine Learning Engineer and have worked at phData for over 2.5 years. Within this time, I've built automated big data pipelines, performed complex data transformations, and contributed to internal software development projects. I graduated from the University of Minnesota...


Thursday May 30, 2019 10:45am - 11:15am CDT
(H) P1838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

10:45am CDT

Data as a trusted partner
AmeriPride's journey to transform data into a trusted partner. 

Speakers

Tony Ordner

Director of Information, AmeriPride Services Inc.
Tony aligns business, people and technology to provide a competitive edge and profitable business opportunities.


Thursday May 30, 2019 10:45am - 11:15am CDT
(F) P0838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

10:45am CDT

From Data Science to Knowledge Engineering: How Graphs Change Everything
In the past, 75% of data science dealt with the "janitorial" work of getting clean, well-connected data. Enterprise Knowledge Graphs are changing this. Now data scientists can get direct access to high-quality connected data sets that have been validated and are continuously groomed by machine learning agents.

Speakers

Dan McCreary, MS, MBA

Distinguished Engineer, Optum
Dan McCreary is a Distinguished Engineer in AI at Optum. He is an author of "Making Sense of NoSQL" and is focused on ML, graph and knowledge representations in AI. He also founded NoSQL Now! and was an event chair for the inaugural Big Data Tech conference.


Thursday May 30, 2019 10:45am - 11:15am CDT
(A) Auditorium Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

10:45am CDT

Power the Personalized Recommendation with Deep Learning and Pre-trained Embeddings
In this session, I will talk about how to use word embedding techniques in E-Commerce, how to get and validate your product embeddings, and how to use embeddings to make personalized recommendations.

Speakers

Yufeng Wang

Data Scientist, Best Buy
Yufeng (Louis) Wang is a data scientist working on personalized recommendation with machine learning and deep learning. He is also a blogger at Medium, player at Kaggle, and coder at GitHub.


Thursday May 30, 2019 10:45am - 11:15am CDT
(G) P1808 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

10:45am CDT

Story of a Data Engineer
This presentation will uncover several challenges that data engineers and other professionals face on a daily basis while processing big data.

Speakers

Jayesh Patel

Sr Data Engineer, Rockstar Games
Jayesh Patel currently works at Rockstar Games as a Senior Data Engineer, focusing on developing data-driven decision-making processes on a Big Data platform. He has successfully built machine learning pipelines and architected big data analytics solutions over the past several years...


Thursday May 30, 2019 10:45am - 11:15am CDT
(D) P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

10:45am CDT

Dynamic Modeling and Forecasting
We introduce new tools for modeling and forecasting multivariate time series data using vector autoregression, and demonstrate the use of the BigVAR and bigtime R packages.

Speakers

David Matteson, PhD

Associate Professor, Cornell University
Associate Professor, Cornell University. Data Science, Statistical Science, Machine Learning, Applied Mathematics, Operations Research and Econometrics. PhD Statistics, U. Chicago; BSB Finance, Mathematics, Statistics, U. of Minnesota.


Thursday May 30, 2019 10:45am - 11:15am CDT
(B) Garden Room Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:15am CDT

Break
Thursday May 30, 2019 11:15am - 11:30am CDT

11:30am CDT

Leadership in Data Science Panel
Most of the talks at Data Tech will lie on a fairly technical spectrum. This panel is about the higher-level inner workings of data science. Topics include how to initially engage clients, scoping projects, managing stakeholder relationships, training team members, etc.

Moderators

Jake Mason

Senior Data Scientist, UnitedHealthcare

Speakers

Ylan Kazi, MHA

VP, Data Science + Machine Learning, UnitedHealthcare

Rebeccah Stay

Global Leader of Data Science, Cargill, Inc
Leader of the data science team at Cargill: you can't spell "Cargill" without "AI"! We are at the intersection of agriculture, manufacturing, R&D, and data science, working in computer vision, optimization, molecular data science, forecasting, generation, and predicting.

Peter Eliason, MS

Director of Analytics and Data Science, Revel Health

Mark Zuchowski

Senior Manager, General Mills


Thursday May 30, 2019 11:30am - 12:15pm CDT
(D) P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:30am CDT

Using deep learning to play COD on the Xbox and what it means for your business
This presentation will walk you through the process of designing an AI system to play Call of Duty on the Xbox. Then the presentation will steer that into the enterprise and what this means for businesses that are interested in applying AI to their human workflows.

Speakers

Ben Taylor

Chief Data Officer, Ziff
Ben Taylor has over 16 years of machine learning experience. He has worked for Intel/Micron, a hedge fund as a quantitative analyst, and he helped build out HireVue’s data science team and AI product. Ben is a recognized expert in deep learning and is currently the chief AI officer...


Thursday May 30, 2019 11:30am - 12:15pm CDT
(E) P0806 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:30am CDT

Data Science Isn't Just A Job
The growing corporate excitement about machine learning and AI has led to a push to professionalize the practice of data science. Unfortunately, because enterprises tend to create technology silos, this threatens to destroy the most valuable aspect of data science itself. Peter will talk about why data science shouldn’t be seen as merely another technical job within the business, and also why open source is such a critical aspect of innovation in the field of data science. 

Speakers

Peter Wang

CTO, Anaconda, Inc.
Peter is co-founder and CTO of Anaconda, Inc., where he leads the open source and community innovation group. He founded the PyData community and conferences, and devotes time and energy to growing the Python data science community around the world. 


Thursday May 30, 2019 11:30am - 12:15pm CDT
(C) A2564 - A2566 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:30am CDT

The Top 4 Hazards to Avoid When Building Your Data Lake
Organizations often underestimate the technical complexities and the resources needed to build an effective data lake, not to mention operating and maintaining the data lake once it’s developed. But you can circumvent these obstacles by learning the most common pitfalls faced in developing and deploying your data lake.

In this session, learn how to manage:

  • Data ingestion issues with managing source side change data capture and subsequent merging and syncing of data
  • Data preparation challenges including effective troubleshooting of data pipelines and how to prep data to scale and support high-speed BI queries
  • Data lake operational hazards including scaling up for agile enterprise-grade performance
  • Constantly changing data platforms and how to deal with hybrid and multi-cloud environments
  • Data lake governance challenges such as managing access control and maintaining regulatory compliance

Speakers

Ramesh Menon

Vice President of Product, Infoworks.io
Ramesh Menon leads the product management team for Infoworks' Autonomous Data Engine. He has been in the data management and big data space for the past 20 years. Previously, he headed products teams at YarcData and Informatica. 


Thursday May 30, 2019 11:30am - 12:15pm CDT
(B) Garden Room Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:30am CDT

Bridging the Skills Gap Between Data Enthusiast and Data Scientist with AutoML
AutoML can automate data preparation, feature extraction, model selection, and model tuning. This can save a Data Scientist loads of time. So instead of hiring four Data Scientists, you may only need two, right?

It’s no secret that there is a shortage of data science talent to help companies produce advanced analytics from their stockpiles of data. There is also a plethora of vendor tools available making promises of turning an analyst into the next great data scientist (which, BTW, is possible).

The hardcore mathematicians, statisticians, and computer scientists (who created this stuff in the first place) have now created more advanced tools that automate the model creation process to help Data Scientists become more efficient and (hopefully) better at our jobs.

I will demo AutoML, discuss some pros and cons, and show what it can do for you.

Speakers

Josh Janzen, MS, MBA

Data Science Manager, CH Robinson
The deck will be posted afterwards on my Data Science blog: www.joshjanzen.com


Thursday May 30, 2019 11:30am - 12:15pm CDT
(A) Auditorium Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:30am CDT

Deliberations on Scientific and Methodological Aspects of Machine Learning
Many diverse fields, such as applied mathematics, statistics, machine learning, data mining, econometrics, bioinformatics, etc., are concerned with estimation of data-analytic models. More recently, due to the abundance of data and cheap computing power, machine learning (ML) algorithms have become very popular in various applications, even though many such algorithms are heuristics vaguely motivated by biological (as opposed to mathematical) arguments. This disconnect (between mathematics and practical applications) may seem strange, given the deep intrinsic connection between mathematics, science and engineering. The purpose of my talk is to explain various reasons for the current disconnect, including (a) conceptual (philosophical) aspects; (b) technical (mathematical) aspects and (c) non-technical (social) aspects. In particular, my talk will elaborate on different interpretations of philosophical concepts (of deductive and inductive reasoning) in classical science, statistics and ML.

Speakers

Vladimir Cherkassky, PhD

Professor, Electrical & Computer Eng., University of Minnesota
Vladimir Cherkassky is Professor of Electrical and Computer Engineering at the University of Minnesota, Twin Cities. He received a PhD in Electrical and Computer Engineering from the University of Texas at Austin in 1985. 


Thursday May 30, 2019 11:30am - 12:15pm CDT
(H) P1838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:30am CDT

Open Source GPU Analytical Database
OmniSci Core natively supports standard SQL and returns query results hundreds of times faster than CPU-only analytical database platforms. Analysts and data scientists can still rely on their existing SQL knowledge, querying data using industry-standard SQL.

OmniSci can operate as a standalone SQL engine using the command line tool mapdql, or the SQL editor that is part of the OmniSci Immerse visual analytics interface. OmniSci query results can output to OmniSci Immerse or to third-party software such as Birst, Power BI, Qlik or Tableau, via a variety of connectors.

Speakers

Paul Wickman

Technology Director, RESPEC
Technologist who needs tangible, real-world evidence demonstrating the results of my work. I develop software products and computational modeling systems for Earth Analytics engineering consulting; currently Director of Applied Technology at RESPEC.


Thursday May 30, 2019 11:30am - 12:15pm CDT
(G) P1808 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:30am CDT

Question Answering in the Legal Domain
We present a non-factoid QA system that provides legally correct, jurisdictionally relevant, and conversationally responsive answers to user-entered questions in the legal domain. This commercially available system is entirely based on NLP and IR, and does not rely on a structured knowledge base. Our system aims to provide concise one-sentence answers for basic questions about the law. It is not restricted in scope to particular topics or jurisdictions. The corpus of potential answers contains approximately 20M documents classified to over 120K legal topics.

Speakers

Tonya Custis

Senior Director, Research, Thomson Reuters
Dr. Tonya Custis is Senior Research Director at Thomson Reuters where she leads a team of scientists performing applied research in Artificial Intelligence technologies, focused on information retrieval, NLP, and machine learning. 


Thursday May 30, 2019 11:30am - 12:15pm CDT
(F) P0838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

11:30am CDT

Systematic Innovation in Data Science
Data science, as a dynamic and emerging discipline, is fraught with contradictions:

- Quantitative / qualitative (linear / non-linear; intuitive / analytical; left brain / right brain)
- Design / development (design / implementation; strategy / execution; strategic / tactical)
- Quality / speed
- Technical perspective / social perspective
- Machine intelligence / human intelligence
- Deterministic domains / non-deterministic domains

Conventional approaches to data science pursue optimization of these contradictions. Optimization instead of resolution limits the effectiveness of data science. Systematic innovation pursues resolution instead of optimization. Systematic innovation is a combination of art plus science; it is a powerful approach that eats contradictions for breakfast.

Let's explore using systematic innovation and systems thinking to resolve these contradictions. Let's make data science more effective.

Data Science 2.0?

Speakers

David Quimby, MBA

Principal, Innovation Radiation
I am a mathematical economist and systems analyst. I have diverse experience in large enterprise and small enterprise as a technology executive and a software entrepreneur. I am a patented inventor in the user experience and Web middleware domain. 


Thursday May 30, 2019 11:30am - 12:15pm CDT
(I) K1450 (Fireside) Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

12:15pm CDT

Lunch
Thursday May 30, 2019 12:15pm - 1:15pm CDT

1:15pm CDT

Overcoming the challenges of Software 2.0 for Healthcare
Software 2.0 is written in the weights of deep neural networks that are trained using vast amounts of labelled data. Data, then, is becoming the new source code, compiled by deep neural networks into weights. Creating a virtuous cycle around this crucial asset differentiates the firms that will survive the current AI revolution from those that won’t. However, managing data across edge, core and cloud environments creates challenges that many healthcare firms fail to address. In this session, NetApp will use case studies to outline how NetApp’s “Data Fabric” empowers customers to innovate using AI at the edge, in the core and in the cloud.

Speakers

David Arnette

Technical Marketing Engineer, NetApp
David Arnette is a Sr. Technical Marketing Engineer focused on NetApp solutions for Artificial Intelligence and Machine Learning. He has published numerous reference architectures for enterprise applications and emerging workloads with NetApp products and has over 20 years’ experience...


Thursday May 30, 2019 1:15pm - 2:00pm CDT
(G) P1808 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

1:15pm CDT

The Pace of Tech is Accelerating
The pace of technology is changing, and the next wave will be bigger than ever. The presentation gives a preview of what's to come in the next 5-10 years.

Speakers

Gene Munster

Managing Partner, Loup Ventures


Thursday May 30, 2019 1:15pm - 2:00pm CDT
(A) Auditorium Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

1:15pm CDT

Three Secrets to Geospatial Analysis
Geospatial analysis has become an increasingly important part of many companies' tool sets, used to understand everything from supply chains to consumer demand. Descartes Labs analyzes petabytes of geospatial data using machine learning and deep learning models, and this session will expose some of our "secrets" to getting the most value from geospatial data.

Speakers

Kristopher Purens

Applied Scientist, Descartes Labs
Kristopher's main focus has been finding out how to measure things that are tough to measure, and he now works for Descartes Labs, the top geospatial AI startup.


Thursday May 30, 2019 1:15pm - 2:00pm CDT
(B) Garden Room Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

1:15pm CDT

Come On, Get SHAPpy: Interpretable Machine Learning with SHAP
Not all machine learning models are created equal. More advanced methods lead to greater accuracy, but at the expense of interpretability. This talk will discuss how to utilize SHAP (SHapley Additive exPlanations) in R and Python to understand and interpret the results of complex machine learning models without sacrificing accuracy or interpretability.
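
As a rough illustration of the idea behind SHAP, the sketch below computes exact Shapley values for a single prediction by averaging each feature's marginal contribution over every feature ordering. It is not the SHAP library's API and not the speaker's code: `shapley_values`, `toy_model`, the instance and the all-zero baseline are hypothetical names and values chosen for the example, and the brute-force enumeration is only practical for a handful of features.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    Averages each feature's marginal contribution over all orderings in
    which features are switched from the baseline to their real values.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)        # every feature starts "absent"
        previous = predict(current)
        for i in order:
            current[i] = instance[i]    # reveal feature i
            value = predict(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orderings) for t in totals]


def toy_model(x):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * x[0] + x[1] * x[2] + 1.0


instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]   # assumed reference point for "absent" features
phi = shapley_values(toy_model, instance, baseline)
print(phi)  # [2.0, 3.0, 3.0]
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features that share it.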

Speakers

Brianna Frederick, MA

Senior Data Scientist, General Mills
Brianna is an experienced Data Scientist working in the consumer goods industry. She is skilled in advanced analytics, data strategies & leadership, global collaboration, and analytic consulting.


Thursday May 30, 2019 1:15pm - 2:00pm CDT
(E) P0806 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

1:15pm CDT

Hyper-Segmentation for Automated Insights Across Industries
Unsupervised learning is still very relevant in the new world of advanced AI, and it is especially helpful when the right answer is unclear, complex or multi-dimensional. In this session we will introduce a methodology called Hyper-Segmentation, an automated clustering approach that aims to create the most meaningful groups across a vast feature space. It can be applied to many industries and problems to deliver novel insights to data owners. We will review examples including targeted marketing for ecommerce customers, predictive asset failure understanding for utilities, and sales support for healthcare payers.

Speakers

Matt Mazzarell

Senior Data Scientist, Teradata
Matt Mazzarell is a Senior Data Scientist for Teradata. Based in New Orleans, LA, Matt has diverse experience working with clients in Banking and Financial Services, Telecom, Manufacturing and Healthcare. He has been a frequent speaker at Teradata Analytics Universe working with clients...


Thursday May 30, 2019 1:15pm - 2:00pm CDT
(D) P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

1:15pm CDT

Machine-Learned School Dropout Early Warning at Scale
Schools use targeted interventions to help students at risk of school dropout or late graduation. Identifying these students takes significant investment in people, process, and technology, and warning signs are often context-specific and scattered across data sources. Extremely high counselor caseloads compound the problem, and are worse in schools serving children with other structural disadvantages. We describe how machine learning technology can improve existing educational systems like the Minnesota Early Indicator and Response System, and provide details on this new technology's statewide implementation in Kentucky. We discuss key technical challenges for early warning data systems and best practices for overcoming them. These challenges specifically include foregrounding fairness, accountability, and transparency, and we offer a discussion of how computational public policy systems can mitigate or worsen equitable access for all students.

Speakers

Daniel Jarratt, MS

Head of learning science technology, Infinite Campus
Data scientist with a decade of experience in education data, software engineering, and product management, with a focus on recommender and decision support systems. 

Thomas Christie

Data Scientist, Infinite Campus
Thomas Christie is a data scientist at Infinite Campus, a student information system for 8 million students across the United States. At Infinite Campus, Thomas applies state-of-the-art machine learning techniques to solve problems in education. Thomas is also a graduate student...


Thursday May 30, 2019 1:15pm - 2:00pm CDT
(F) P0838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

1:15pm CDT

Modern Data Warehousing in the Cloud
In this talk, you'll learn more about:

  1. Key challenges of legacy data warehouses such as data diversity, concurrency, scalability, performance, cost, management, ...
  2. How modern data warehouses in the cloud not only overcome most of these challenges but also, in some cases, bring additional technical innovations and capabilities such as pay-as-you-go service, decoupling of storage and compute, scaling up or down, near-zero management, native support of semi-structured data ...
  3. How the capabilities brought by modern data warehouses in the cloud not only enable new use cases but also help businesses during the phases of their life cycle such as launch, growth, maturity and renewal/decline.
  4. Near-Real-Time Data Warehousing through a use case built on Snowflake with a live demo to showcase ease of use, fast provisioning, continuous data ingestion, support of JSON data ...

Speakers

Slim Baltagi

Director, Big Data & ML, Cervello
Slim Baltagi is a Director of Big Data and Machine Learning at Cervello, an A.T. Kearney company. He has delivered many end-to-end data projects to major Fortune 500 companies. He enjoys speaking at meetups and conferences in the US and abroad.


Thursday May 30, 2019 1:15pm - 2:00pm CDT
(C) A2564 - A2566 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

2:00pm CDT

Break
Thursday May 30, 2019 2:00pm - 2:15pm CDT

2:15pm CDT

Using PyTorch to Classify Traffic Signs
Are you interested in learning and applying an easy-to-use and intuitive machine learning framework? Do you want to know if there is a viable test and development alternative to TensorFlow? Are TensorFlow or other Python tracebacks inscrutable?
We will walk through and discuss a Python 3 notebook. The notebook is a tutorial that shows how to classify the signs in the (cropped) BelgiumTS dataset. The tutorial goes through image processing, classification, and transfer learning functionality available in Pillow, PyTorch, and torchvision. The plan is to use Google’s Colaboratory to run a live demo (in an environment accessible to everyone).

Speakers

Jeffrey Van Voorst, PhD

Principal Software Engineer, Veritas


Thursday May 30, 2019 2:15pm - 3:00pm CDT
(G) P1808 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

2:15pm CDT

Cows Always Eat - Optimizing a network with noisy inputs
Problem
Cargill's Feed team provides "Feed As a Service". This means that the team has to move 3 million tons of feed from the corn milling plants to some of the world's largest feed customers across the Midwest and Texas, and do so in the most efficient way. There is just one rule, "Never run out of feed", which we have to follow while managing our systems for:
  1. Variable Demand - We want to keep our customers at a target inventory while accounting for variable feed usage, shrinkage and degradation
  2. Variable Supply - Corn milling assets provide us with the raw material to prepare feed, which can vary depending on the milling business throughput
  3. Drivers - We want to make the most of drivers' time so that the maximum number of deliveries is completed in the shortest time window.

Solution
The objective is to decouple the feed business from the variability in different components of its supply chain. To accomplish that, the data science team, in collaboration with Digital Labs, has broken down the problem into different optimization modules. Currently we are working on the foundational piece of our solution, which is the KIX (Customer Inventory Execution) module. The KIX optimization module has two key features:
  1. Usage Model - Ingests driver-estimated inventory readings across all 130 feed customers to calculate the current customer feed inventories and usage rates.
  2. Recommendation Model - The recommendation optimization model takes the usage model numbers to create a recommended list of delivery loads for the next 7 days. This list can range from 300-350 truckloads to 70 different customers per day, to optimally fulfill the most urgent customers' needs considering different constraints such as customer closed days and gate hours, target inventory and plant capacity.
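
To make the two pieces concrete, here is a minimal sketch of a usage estimate feeding a delivery recommendation. It is not Cargill's KIX model: the real recommendation step is described as an optimization with constraints such as gate hours and plant capacity, whereas this sketch simply ranks customers by days of feed left. The `Customer` class, the `TRUCK_TONS` truckload size and all readings are hypothetical.

```python
import math
from dataclasses import dataclass

TRUCK_TONS = 25.0  # hypothetical truckload size, not from the talk


@dataclass
class Customer:
    name: str
    readings: list       # (day, estimated tons on hand), oldest first
    target_tons: float   # inventory level the business wants to hold


def usage_rate(readings):
    """Average tons used per day between the first and last reading.

    Assumes at least two readings on different days and no delivery
    arriving between them.
    """
    (day0, tons0), (day1, tons1) = readings[0], readings[-1]
    return max((tons0 - tons1) / (day1 - day0), 0.0)


def recommend(customers):
    """Rank customers by days of feed left and size a top-up for each."""
    plan = []
    for c in customers:
        rate = usage_rate(c.readings)
        on_hand = c.readings[-1][1]
        days_left = on_hand / rate if rate > 0 else math.inf
        loads = max(math.ceil((c.target_tons - on_hand) / TRUCK_TONS), 0)
        plan.append((c.name, days_left, loads))
    return sorted(plan, key=lambda row: row[1])  # most urgent first


customers = [
    Customer("A", [(0, 100.0), (4, 60.0)], target_tons=120.0),
    Customer("B", [(0, 80.0), (2, 30.0)], target_tons=100.0),
    Customer("C", [(0, 50.0), (5, 50.0)], target_tons=50.0),
]
plan = recommend(customers)
for name, days_left, loads in plan:
    print(name, days_left, loads)
```

Customer B, which is burning through feed fastest relative to its inventory, sorts to the top of the plan; a customer with no measured usage sorts last and needs no loads.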

Benefits
  • The business used to spend days to get the right usage and inventory and then use that information to schedule fulfillment orders. By automating all of it using the KIX module, the business now spends more time with customers.
  • The KIX model can run at the snap of a finger at any time of the day to readjust the delivery plan on the basis of changes in customer demand
  • KIX enables the business to respond to changes in demand more accurately and efficiently, and at the same time plan for future events such as snowstorms or planned plant downtimes
  • The business manages variable supply and variable demand more efficiently by raising or reducing the target inventory at our customers automatically using the KIX module. This means that the business can use customers as distributed warehouses in case of excess supply and vice versa
  • KIX has laid the foundation to develop DAX (Driver Assignment Execution), which will maximize driver utilization by dynamically assigning delivery loads to drivers based on driver availability, customer gate hours, plant capacity and other critical constraints in the system

Speakers

Abhishek Roy

North America Data Science Leader, Cargill

Zachary Skalko

Senior Software Engineer, Cargill


Thursday May 30, 2019 2:15pm - 3:00pm CDT
(C) A2564 - A2566 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

2:15pm CDT

Modern Agriculture: Redefining AI
In 2030, the world population is estimated to be 8.3 billion people. Even at our current world population 800 million people go hungry. Malnutrition affects 1 in 3 people. Agriculture is the only occupation in the world that can directly lead to Zero Hunger. It compels us to come together, utilize new technologies, new farming practices and artificial intelligence in ways that will so directly impact our world.

In this talk, we speak to the ways in which modern agriculture is already adopting Artificial Intelligence and what a modern farm looks like through stunning imagery and a virtual immersive agriculture experience. We also speak to the technologies of the future that will continue to change the agricultural landscape, and what farmers still need from artificial intelligence to be able to feed the world.

Speakers

Karen Hildebrand

CTO, FarmFemmes
Karen Hildebrand, PhD co-founded Farm Femmes in late 2017 to grow and extend AI in agriculture, building on her 10+ yrs in AI/ML. Farm Femmes focuses on how modern technology will help to ensure we all have the opportunity to eat.


Thursday May 30, 2019 2:15pm - 3:00pm CDT
(E) P0806 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

2:15pm CDT

Powerful Data from Unexpected Places
Most companies understand the need to develop an analytics and big data strategy to enhance customer or partner experience or to improve business processes and outcomes. Internal data is where most companies start as they build out a data platform. Integrating external data sources such as location, government or weather information can provide new opportunities to create more meaningful outcomes that are greater than simply combining the data would suggest.

James Peterson, a meteorologist from IBM, and Michael Downs, CTO at Evolving Solutions, share some fascinating use cases for combining weather data with internal data to produce high-value and sometimes surprising outcomes.

Speakers

Michael Downs

Chief Technology Officer, Evolving Solutions
Michael brings over 18 years of experience solving complex business problems. His broad technology, business and industry experience provides valuable perspective to our clients working to implement innovative technologies to support their success. 

James Peterson

Meteorologist & Climatologist, IBM Watson Media and Weather


Thursday May 30, 2019 2:15pm - 3:00pm CDT
(B) Garden Room Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

2:15pm CDT

Statistical Optimization of Deep Learning Hyperparameters and Data Augmentation Methods
Training dataset sizes are a critical factor in the creation of accurate neural network image classifiers where the axiom “bigger-is-better” appears to hold true. Image augmentation (creating several instances of an original image by applying image transformations) is often used to increase the size of datasets when the original dataset is too small. Although several augmentation strategies have been developed that have successfully improved neural network performance, little research has been done to study the appropriate ratio of augmented to original data. In this presentation, Tom will introduce a search strategy that uses statistical mixture experiments to identify the optimal blend of several different image augmentation methods. He will also discuss how hyperparameter tuning can be incorporated into this process to simultaneously tune hyperparameters and augmentation strategies for efficient deep learning model optimization. The presentation concludes with a case study where a mixture experiment was used to identify the optimal augmentation strategy for a neural network used for manufacturing visual defect detection, resulting in a significant improvement in performance on a validation dataset.
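
For readers unfamiliar with mixture experiments, the sketch below enumerates a simplex-lattice design, one standard way to lay out candidate blends whose proportions sum to 1. The abstract does not say which design Tom uses, so the choice of a simplex lattice, the function name `simplex_lattice`, and the augmentation method names are all assumptions made for illustration.

```python
from itertools import product


def simplex_lattice(n_components, levels):
    """All blends whose proportions are multiples of 1/levels and sum to 1."""
    points = []
    for combo in product(range(levels + 1), repeat=n_components):
        if sum(combo) == levels:
            points.append(tuple(k / levels for k in combo))
    return points


# Hypothetical data sources to blend: original images plus two augmentations.
methods = ["original", "flip", "rotate"]
design = simplex_lattice(len(methods), levels=4)

for blend in design:
    print(dict(zip(methods, blend)))
```

Each design point is one training-set recipe (for example 50% original, 25% flipped, 25% rotated). In a mixture experiment, a model would be trained and scored at each point and a response surface fitted over the proportions to locate the best blend.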

Speakers

Tom Albrecht

Principal Data Scientist, Boston Scientific
Tom Albrecht specializes in developing deep learning and natural language processing (NLP) models for high dimensional data, including image and free text data. He has also developed several linear and non-linear experimental design techniques. 


Thursday May 30, 2019 2:15pm - 3:00pm CDT
(D) P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

2:15pm CDT

Leveraging Natural Language Processing (NLP) to drive insight from consumer feedback
Incorporating consumer feedback into business processes is essential when prioritizing future work and direction for many companies. While some information from consumers comes in the form of standardized questions and answers that can be easily curated for analysis, a large amount of the information exists solely in free text fields. Application of NLP techniques to free text processing and classification can reduce human bias and improve scalability by minimizing manual efforts. Topic modeling or clustering text can help to understand broad themes. Scoring algorithms can help find text responses that are similar to one another. Supervised machine learning is useful to assign documents to predetermined classifications. We will provide implementation examples of NLP techniques to drive insight from consumer data such as Net Promoter Score (NPS) surveys and other customer feedback.
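
One simple form of the similarity scoring mentioned above is cosine similarity over word counts. The sketch below is an assumption about what such scoring could look like, not the speakers' implementation; the function names and the feedback strings are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a response and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between the word-count vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, responses):
    """Return the stored response that scores highest against the query."""
    return max(responses, key=lambda r: cosine_similarity(query, r))


# Made-up free-text survey responses.
responses = [
    "The seat is uncomfortable on long rides",
    "Dealer service was slow and unhelpful",
    "Love the engine power",
]
query = "seat gets uncomfortable after long rides"
print(most_similar(query, responses))
```

Raw word counts treat every word as equally informative; weighting schemes such as TF-IDF or embedding-based similarity are common refinements.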

Speakers

Anna Mityushina

Data Scientist, Polaris
Data Science professional with over 6 years of work experience in analytics and statistical methods. Currently working on Natural Language Processing.

Tyler Harpole

Data Scientist, Polaris


Thursday May 30, 2019 2:15pm - 3:00pm CDT
(A) Auditorium Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

2:15pm CDT

Litigation Analytics: Extracting legal data from federal court dockets
Dockets for federal court cases contain a wealth of information for lawyers to develop a litigation strategy, but the information is locked up in semi-structured text. Manually deriving the outcomes for each party (e.g., settlement, verdict) would be very labor intensive. We used Natural Language Processing (NLP) techniques and deep learning methods in order to scale the automatic analysis of millions of US federal court dockets. The automatically extracted information is fed into a Litigation Analytics tool that is used by lawyers to plan how they approach concrete litigations. 

Speakers

Frank Schilder, PhD

Sr. Research Director, Thomson Reuters
Frank Schilder is currently a Sr. Research Director in the Thomson Reuters R & D group. He leads a team of researchers who explore new machine learning and artificial intelligence techniques in order to create smart products.


Thursday May 30, 2019 2:15pm - 3:00pm CDT
(F) P0838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

3:00pm CDT

Break
Thursday May 30, 2019 3:00pm - 3:15pm CDT

3:15pm CDT

Developing a Skilled Global Workforce Through Collaboration
In the rapidly growing field of data analytics and AI, it is becoming increasingly difficult to attract and retain qualified workers. Historically, finding talent through immigration provided adequate solutions, but with changes to policies and increased global competition to train and attract skilled workers, that option no longer resolves the challenge. Hear how Canada is trying to address this same challenge now and into the future through a skills development model at the corporate and academic levels coupled with open immigration policies.

Montreal International will discuss how it is collaborating with key players of the Montreal AI ecosystem including Scale.AI, Canada’s lead in the AI Super Cluster, to feed an effective workforce.

Speakers

Gwenaelle Thibaut

Project Director, USA, Montreal International
Gwen is on a mission to help companies expand their activities in Montreal where they can find the right talent. As a lawyer with an MBA, she has an extensive network and a unique background in project management, strategic planning and business development. 


Thursday May 30, 2019 3:15pm - 3:45pm CDT
(F) P0838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

3:15pm CDT

Human-Centered Data Science
We all know the power of data science and predictive modeling to help businesses make strategic and informed decisions. However, all too often the focus begins and ends with the data and the models we fit. At Sprocket, we take an innovative approach to data science, infusing a human-centered collaboration process at the bookends of every project. In this session, we'll walk you through how we approach data science and how you too can improve the predictive capability and implementation success of your data science projects.

Speakers

April Seifert, PhD

Co-Founder | Data Scientist, Sprocket
April Seifert, Ph.D., is a Social Psychologist turned Data Scientist and a co-Founder of Sprocket CX, a data-science-driven customer experience firm based in Minneapolis, MN. April’s passion is using data science to power customer experiences that are engaging for the customer and...


Thursday May 30, 2019 3:15pm - 3:45pm CDT
(A) Auditorium Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

3:15pm CDT

Help! How do I get better prices on flights?!
Buy flights on a Tuesday! Compare multiple websites! Buy months in advance!

These are all the common pieces of advice that you'll hear when you ask Google how to get the best deals on flights. Unfortunately, they're all misleading at best and completely wrong at worst. But what these things really point to is the psychology of human nature -- we want to feel like we got a great deal! Even if the price we're paying is higher than necessary.

But being data people, (hashtag) facts matter! What is a good deal? The answer, as always, is "it depends!". The truth is, most airlines excel at revenue management. Fare pricing is oftentimes an arbitrary exercise based on external requirements instead of a predictable, patterned behavior. That means flight pricing is effectively chaotic in nature and difficult to predict over time.

Using machine learning (ML) and artificial intelligence (AI) on Azure, we've built and proven our ability to find that great deal. A couple of recent deals we've booked using this process (though manual, instead of automated) include the Bahamas for $200, Argentina in Business for $1500, and Portugal for $550.

Come learn what we've accomplished and how we're doing it.

Speakers

Martin Lyness

Principal Engineer, EDC
Software delivery expert.


Thursday May 30, 2019 3:15pm - 3:45pm CDT
(B) Garden Room Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

3:15pm CDT

Modernizing Data and Analytics at Be The Match with the Cloud
Data and Analytics platforms are benefitting from an innovation surge driven by the cloud and organizations are finding new ways to take advantage of these innovations. In this session, Be The Match will discuss how they established a new data platform on the cloud for Analytics. They will discuss the internal challenges they had with the use and proliferation of data, how they identified what they needed, how they chose their preferred platform, and the results of where they are on their implementation journey. The session will close with demonstrations of key capabilities on the platform that were critical to Be The Match. 

Speakers

Kevin McGinley

Field CTO, Snowflake Computing
UiPath Robots are learning new skills with increasingly sophisticated Pragmatic AI capabilities to enable automation of progressively complex, cognitive tasks. AI and RPA, when paired together, can: read, write, listen and make decisions to work effectively. UiPath is making major...

Heidi Perry

Manager, Business Intelligence, Be The Match
Heidi has worked for 10 years in the Data and Business Intelligence space, with the last 6 years focused on running Business Intelligence at Be The Match. She recently led the initiative to establish a new Data & Analytics cloud platform at Be The Match focused on delivering new capabilities...


Thursday May 30, 2019 3:15pm - 3:45pm CDT
(C) A2564 - A2566 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

3:15pm CDT

mlpack: or, how I learned to stop worrying and love C++
mlpack is a general-purpose, flexible, fast machine learning library written in C++. The library aims to provide fast, extensible implementations of cutting-edge machine learning algorithms. These algorithms are provided as simple command-line programs, Python bindings (and bindings to other languages), and also C++ classes which can then be integrated into larger-scale machine learning solutions. In this talk I will introduce mlpack and discuss how it achieves its fast implementations via template metaprogramming and by implementing more asymptotically efficient algorithms. Even though C++ is fairly unpopular for machine learning, I will show that it is possible to have easy, understandable, production-quality C++ machine learning code with mlpack. I'll also give some examples of usage, including how we use mlpack inside of RelationalAI, and also talk about the future goals and development of the library.

Speakers

Ryan Curtin

Computer Scientist, Relational AI
Ryan Curtin is a Computer Scientist at RelationalAI.  His Ph.D. work at Georgia Tech focused on fast machine learning algorithms.  These algorithms are the basis of the mlpack C++ machine learning library, which he has maintained for nearly a decade. 


Thursday May 30, 2019 3:15pm - 3:45pm CDT
(G) P1808 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

3:15pm CDT

Probabilistic programming for investigation and discovery
In several machine learning modeling frameworks, results of simulations can look like nodes of a graph, and it is up to the user to take the inputs and outputs of models and connect them to each other. It is also common for data to be missing, and one would like to link these nodes and treat them in a system as a coherent whole.

Approaching these models as a computational graph allows one to use a probabilistic programming framework. This talk will address the questions of "why?", "what?", and "how?" to use probabilistic programming, and offer two examples of probabilistic programming applied to problems with large data sets to illustrate the efficacy and effectiveness of the approach.

Speakers

Chase Dwelle

Founder, Artesian
Recent PhD graduate, currently completing a post-doctorate, and founder of a firm that analyzes optimal conditions for growing and prototyping agricultural products.


Thursday May 30, 2019 3:15pm - 3:45pm CDT
(D) P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

3:15pm CDT

Obscuring sensitive information with generative adversarial networks
Much of the data collected by corporations and public institutions is too sensitive to share publicly, or even with third parties. Medical records in particular are very difficult to release to outside researchers. Generative Adversarial Networks (GANs) may provide a solution to this problem: a GAN can be trained to generate new data that is representative of real data, but without confidentiality issues. In this talk, I will provide a general introduction to GANs, and present results showing that successful models can be trained using only generated data.

Speakers

Nicole Bridgland, PhD, MS

Data Scientist, World Wide Technology
Nicole Bridgland is a data scientist with World Wide Technology.  She completed her PhD in mathematics at the University of Minnesota in 2018. 


Thursday May 30, 2019 3:15pm - 3:45pm CDT
(E) P0806 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

3:45pm CDT

Break
Thursday May 30, 2019 3:45pm - 4:00pm CDT

4:00pm CDT

Data literacy for my business partners: What is it, why do I want it, and how can I help?
Whether you are a Chief Data/Analytics Officer, data scientist, data analyst, or data engineer you are doing a lot of work to provide value from data in your organization. Yet your business partners still aren't making decisions off your data and analytics assets. Your organization's maturity as a data culture could be holding you back from getting the ROI on your hard work.

As a data person, you can help close this gap. Reducing your partners' data literacy gap will help you spend less time doing basic reporting and answering basic questions, while increasing the whole organization's emphasis on data and on making data-informed decisions.

In this session we will show you what data literacy is and how your business partners can become data literate. We will also explore ways to foster a business-side community to help encourage data literacy.

Speakers
avatar for Dave Mathias

Dave Mathias

Data Coach // Director, Beyond the Data // MinneAnalytics
Dave combines his passions around data, experience, and community to make impact. Founder of Beyond the Data, co-host of Data Able podcast, and part of MinneAnalytics and TC Data Viz Group. 
avatar for Matthew Jesser, MA

Matthew Jesser, MA

Co-Founder and Data Coach, Beyond the Data


Thursday May 30, 2019 4:00pm - 4:30pm CDT
(A) Auditorium Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

4:00pm CDT

Data Science is Dev Ops
Data Science projects are difficult to plan, particularly for those coming from a software development tradition. In this talk, I'll discuss some approaches we use in data science consulting to bring projects from conception to productionalization. The main trick: engage data engineering and DevOps early in the process and produce prototypes that allow them to begin work.

Speakers
avatar for John Chandler

John Chandler

Consultant & Clinical Professor, Data Insights & Univ of MT
John has worked in Data Science for 20 years. He founded a start-up acquired by Microsoft and has been consulting for 7 years for a number of large and small companies.
avatar for Andrew Van Benschoten, Ph.D.

Andrew Van Benschoten, Ph.D.

Senior Manager, Data Science, Ovative Group


Thursday May 30, 2019 4:00pm - 4:30pm CDT
(B) Garden Room Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

4:00pm CDT

Forecasting S&P 500 Price using AutoML
Hussain will discuss whether AutoML can be used to forecast the price of the S&P 500, using various technical analysis indicators together with historical prices.

Speakers
avatar for Hussain Jiwani, MS

Hussain Jiwani, MS

Sr. Merchant Manager, CHS Inc.
Hussain Jiwani is senior merchandising manager and head of the Central Research and Trading group at CHS Inc.  He has held this position since 2014. 


Thursday May 30, 2019 4:00pm - 4:30pm CDT
(E) P0806 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

4:00pm CDT

Apache Sqoop, Spark, and cloud-based computing with Databricks
This presentation will cover using Apache Sqoop for data acquisition and movement, along with the use of PySpark and Databricks in day-to-day data munging activities. Full/finalized abstract to follow.

Speakers
avatar for Rebecca Dysthe

Rebecca Dysthe

Data Scientist, Optum
Rebecca is a data scientist at Optum, specializing in data acquisition and supervised machine learning on PHI data.


Thursday May 30, 2019 4:00pm - 4:30pm CDT
(C) A2564 - A2566 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

4:00pm CDT

Safely Driven: How Trimble Transportation Mobility is applying deep learning to make the roads safer for everyone.
Deep learning certainly has roots in the autonomous vehicle space. However, most trucking companies have a substantial investment in existing class 8 semi-trailer trucks that are not going to be replaced overnight. Trimble Transportation is using deep learning technologies, in conjunction with other advanced analytic techniques and state-of-the-art DevOps approaches, to help ensure the safe operation of trucking fleets. While it may be premature for many trucking fleets to embrace autonomous vehicles, Trimble Transportation has made it possible for those same companies to leverage deep learning as a way to reduce costs and improve safety.

Speakers
avatar for Ryan Wolbeck, MS

Ryan Wolbeck, MS

Data Scientist, Data Scientist
avatar for Miles Porter

Miles Porter

Data Scientist, Trimble, Inc.


Thursday May 30, 2019 4:00pm - 4:30pm CDT
(F) P0838 Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

4:00pm CDT

When Seeing is not Believing: New Machine Learning Algorithms to Detect Image Manipulations
Recent breakthroughs in artificial intelligence (AI) are making it shockingly easy to produce manipulated images and videos that appear nearly indistinguishable from authentic ones. While there will be many beneficial applications of this technology, it has already been used in some very harmful ways, and it threatens to have big implications for our personal lives, the legal system, politics, etc. In this talk, I will give an overview of DARPA’s Media Forensics (MediFor) Project, which is developing new machine learning algorithms to detect manipulations in images and video, and I will highlight some of my team’s recent work for this project.

Speakers
avatar for Michael Albright

Michael Albright

Senior Data Scientist, Honeywell
Michael Albright is a senior data scientist in the Data Science and Video Analytics group in Honeywell Labs in Golden Valley, Minnesota. He earned a Ph.D. in theoretical physics from the University of Minnesota in 2015.


Thursday May 30, 2019 4:00pm - 4:30pm CDT
(D) P0808 A&B Normandale Partnership Center, 9700 France Ave So, Bloomington, MN 55431

4:30pm CDT

Networking Social
Thursday May 30, 2019 4:30pm - 6:00pm CDT
 