Beaker Notebook
Internal Tool to Product in 3 months
Overview
Two Sigma Investments is a computational hedge fund with deep expertise in data science and predictive modeling. Having embraced and used many open-source tools, it was time for them to give back to the data science community by releasing their internal development tool, Beaker Notebook, as a free, open-source project. This project was about getting that tool ready for the public.
My Role
User Research
I was brought in to evaluate the existing prototype and do primary research around validating key assumptions, existing workflows, and potential new features.
Information Architecture
Most of the IA work centered on defining a new hierarchy of user actions and redesigning the extensive IDE menu system.
UX Design
My role was to do all the UX and UI work around the app, marketing website and next version concepts.
Defining Success
Short Term
The near-term focus was to release a functional and stable development tool that could be used by non-experts in the data science field. The feature set should provide enough functionality to be useful, but it was by no means meant to be complete. This would allow us to get real-world usage data and scope the next release.
Long Term
The ultimate long-term success was a vibrant ecosystem of open-source contributors and add-ons to the tool. This would, in turn, create visibility for the client and allow them to recruit top talent to the company.
About notebook-style development
Notebook-style development provides a more exploratory way to write code than traditional IDEs. Notebook interfaces are composed of a series of code blocks, called cells, which can stand alone or act in unison. The development process is one of discovery, where a developer experiments in one cell, then continues writing code in a subsequent cell depending on the results from the first. Particularly when analyzing large datasets, this conversational approach allows researchers to quickly discover patterns or other artifacts of the data.
Code blocks, like Legos, can build on each other to create a notebook
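To make the idea concrete, here is a minimal Python sketch of cells building on one another. It is an illustration only, not how Beaker is implemented: the `run_notebook` helper and the sample data are hypothetical, and a real notebook runs each cell on demand in a language kernel rather than with `exec`.

```python
# Hypothetical model of a notebook: each cell is a snippet of code, and all
# cells run against one shared namespace so later cells see earlier results.
cells = [
    "data = [3, 1, 4, 1, 5, 9, 2, 6]",        # cell 1: load some data
    "mean = sum(data) / len(data)",           # cell 2: take a first look
    "above = [x for x in data if x > mean]",  # cell 3: build on cell 2
]


def run_notebook(cells):
    """Execute cells in order, sharing state between them."""
    namespace = {}
    for source in cells:
        exec(source, namespace)
    return namespace


state = run_notebook(cells)
print(state["mean"], state["above"])
```

The third cell only makes sense because the second one already ran, which is the "conversational" quality described above.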
Starting Discovery
To map the problem space correctly, I like to ask a lot of questions at the beginning of the engagement. In this case key areas I tried to understand were:
- The state of current code base
- Existing goals, plans and commitments
- Attributes of existing and potential customers
- Current key competitors
- Definition of Success
- Existing revenue expectations
- Key risks for the project
- Core functionality pieces we have and/or need to build
- Low-hanging fruit, both UX and development
- Resources available for the project
- Timeline expectations
- Existing project champions and/or detractors
Initial Discovery Conclusions
Notebooks are niche
Notebook development is a niche field, with IPython being the only tool with significant market adoption.
Quick-fix opportunities exist
The existing alpha prototype offered many opportunities to significantly improve the user experience with relatively little effort. Those improvements would create a notable advantage over competitors.
Target users will change
Our existing understanding of user needs was based heavily on very sophisticated internal users, which won’t be the case outside the company.
Runway is short
It was very important to internal stakeholders to show progress as quickly as possible, so our plan should prioritize high visibility work first.
Competitive Landscape
Below are the two main tools, IPython and RStudio, that were most commonly used by our potential users. The look and feel is what you'd expect from complex development tools in an open-source environment: rich in features and poor in usability. Distinguishing ourselves with competent UX that is consistent and straightforward to use was the simplest way to get noticed.
IPython: main competitor
RStudio: secondary competitor
Beaker before redesign. Very much par for the course
Product Big Bets
After looking at the existing plans and talking with stakeholders, it was clear that the success of the project rested on three big bets:
- Data scientists need to use multiple languages in their work.
- The notebook UI model has advantages over other standard development methods.
- Sharing and publication of work is important to data scientists.
User Research Goals
Existing research and usability work was based on highly technical data scientists inside the hedge fund. To gain a deeper understanding of the non-expert target users who are most likely to use the product outside the company, I performed additional research in the form of structured interviews. The goal was to answer the following:
- Who is the potential target audience for Beaker?
- What does their daily workflow and tool usage look like?
- What are the key pain points in their data analysis workflow (and what works well)?
- Will the current Beaker plan and approach address those pain points?
Another major goal was to validate whether our Big Bets were indeed true.
Research workflow used
Participant Profile
Nine participants with various levels of expertise, drawn from various verticals, were recruited and interviewed one-on-one. They were screened to represent an even cross section of data science expertise. Highly skilled participants were purposefully excluded, as we already had a lot of internal data on that type of user.
Participant roles
Qualitative ranking of users
Key Research Findings
Existing tools are too complex
The difficulty of learning a new tool is a major barrier to adoption. Each tool's complexity requires a significant investment of time and effort, so adopting new tools is neither easy nor frequent.
“If the learning curve is too steep, it’s no longer worth my time to do it. Even if it takes longer, I may as well stick with what’s known.” - Julia
First use is key
Getting to the “first look” at data quickly is difficult with existing tools. Concentrating on providing an easy way to visualize data right away can be a significant competitive advantage.
“It’s a huge time waste in my mind, going from raw data to getting the picture of what the story is. I don’t have any idea what the story is, until I get to the very end.” - Alex
Existing data tools are sticky
Getting people to use our new tool will be difficult. Because of the effort required to gain proficiency in any comparable tool, the value proposition needs to be clear and significant.
“Fundamentally, it’s just what camp you subscribe to and sticking with that tool, and becoming a master. As opposed to bouncing around different tools.” - Todd
And some red flags...
One major point of concern was that our Big Bets were only partially validated:
Notebook-style development tools are not very common
Since people were only somewhat familiar with notebooks, significant investment in product education will be required.
Multiple-language support is a nice-to-have
Although seen as useful, it was not perceived as a major benefit. This was mostly due to users having one main language they were expert in.
Sharing is desirable but not always possible
Sharing code and results provided some benefit, but the proprietary nature of models in companies and universities often prevents open sharing between scientists.
So, what should we do?
Research results were very clear: we had to address the complexity of the tool to make it as non-intimidating as possible for new users. However, to fit such a broad scope into a realistic 3-month timeline, we decided to address structural issues first and set up the foundation for future iterations of the tool.
Code Cell Mechanics
Code cells are the key building block of the entire interface and were the top priority. The user spends the most time working within cells, so getting that component right was essential.
Action Hierarchy
Creating a consistent interaction model around various actions and the level of hierarchy they are exposed at (cell, notebook, global etc.) would simplify the interface and decrease the learning curve.
Product Website
To address the lack of familiarity with notebook interfaces, we decided to create a new marketing site that introduced the concept, provided help, and started an initial user community.
Cloud Integration Concepts
With an eye to the future and more complex modeling requirements, we decided it was important to at least explore what a more powerful, cloud-based version of the notebook would look like. This would prevent significant redesign efforts should the MVP find traction and would provide a potential revenue stream for the project.
What are we NOT building?
- The installer, critical for quick adoption, had a lot of issues, but its complexity made it too time-intensive to tackle for v1.
- Publishing was not the priority for users that we initially assumed, so all related features were postponed.
- Additional language support was postponed because the initial set covered most users.
Low Hanging Fruit: Simple Visual System
One of the quickest ways to give the existing prototype consistency and show progress to stakeholders was to create a simple visual system. Establishing a simple grid and color palette, selecting type, and defining a type hierarchy were the basics. Those were then incorporated into the core UX components.
I created 3 alternate looks:
- Glass: a contemporary high-tech, minimalist look.
- Science: a look evoking more traditional, academic feel.
- System: a high-contrast look, commonly adopted by developer tools.
After a review with the Product Manager and stakeholders, we decided to go with the “Glass” look, as it provided the highest differentiation from our competition.
Visual style: Glass (selected for project)
Alternate style: Science
Alternate style: System
Core cell mechanics
The guiding principle for redesigning the cells was to expose only the key functionality to new users, as per the user research findings, and hide the complexity of more advanced features.
The new approach to cell structure was to divide each cell into 3 parts: top for actions, middle for code, bottom for execution. Actions previously available only via shortcuts and right-click menus were now logically grouped above each cell, so new users had easy access to commonly used actions. More advanced features were tucked into a menu, which also showcased the full functionality as well as shortcuts (which were still available for advanced users).
Cell execution options were now exposed at the bottom so users had a logical place to go to run code, see execution status and get runtime errors.
Finally, the new layout moved the emphasis to writing code, which is the primary function of the cell.
New cell design
Cells before redesign
First-use experience, using the product itself instead of help files.
One particularly common use case was inserting cells into the notebook. Providing an easy way to do this was complicated by the fact that all cells were stored hierarchically under the hood. The hierarchy supports more advanced features, like grouping cells so that they can be executed as a group (sequentially or in parallel), and needed to remain part of the product. I created a solution where the hierarchy was initially hidden from the user, to reduce overall interface complexity, while still exposing introductory actions like moving cells up and down and cut/paste. Full hierarchy view and management were available to advanced users via an advanced menu feature.
Inserting cells
New mechanism for managing cell hierarchy within notebook
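The approach of keeping a hierarchy underneath while showing a flat list can be sketched in Python as follows. This is a simplified model under my own assumptions, not Beaker's actual data structures: the `Cell` and `Section` classes and the `flat_view` and `move_up` helpers are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Cell:
    name: str


@dataclass
class Section:
    name: str
    children: List[Union["Section", Cell]] = field(default_factory=list)


def flat_view(node):
    """Return cell names in document order, hiding the section nesting."""
    if isinstance(node, Cell):
        return [node.name]
    names = []
    for child in node.children:
        names.extend(flat_view(child))
    return names


def move_up(section, index):
    """Swap a child with its previous sibling; do nothing at the top."""
    if 0 < index < len(section.children):
        kids = section.children
        kids[index - 1], kids[index] = kids[index], kids[index - 1]


notebook = Section("notebook", [
    Cell("load"),
    Section("analysis", [Cell("clean"), Cell("plot")]),
    Cell("summary"),
])

print(flat_view(notebook))  # what a new user sees: a simple list of cells
move_up(notebook, 2)        # move "summary" above the "analysis" group
print(flat_view(notebook))
```

New users only ever deal with the flat list and simple moves, while the grouped "analysis" section stays intact for advanced features such as running a group of cells together.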
New menu structure
The goal was to create a consistent UX where the user can invoke actions on various UI components in a logical and expected way. This required mapping and reorganizing all menus and the actions within them.
My approach was to first create a menu system that defined a basic set of actions available at every level (notebook, section and cell). For example, a Run action is available at the notebook level (running all cells), the section level (running cells in that group only) or the cell level (running just that cell). Each level would then get its own specific set of actions as needed.
This resulted in a simple and predictable UX, where the user knows where to look (top right corner of the control), what to look for (standard menu icon) and what actions to expect (base set + specific actions).
New menu structure across the whole product
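The "base set + specific actions" rule can be expressed as a small Python sketch. Only the Run action and the three levels come from the project itself; every other action name below is a hypothetical placeholder, and `menu_for` is an illustrative helper rather than actual product code.

```python
# Actions offered at every level. "Run" is from the case study; "Delete" is
# a hypothetical example of another shared action.
BASE_ACTIONS = ["Run", "Delete"]

# Level-specific actions (all hypothetical placeholders).
LEVEL_ACTIONS = {
    "notebook": ["Save", "Close"],
    "section": ["Collapse"],
    "cell": ["Move up", "Move down"],
}


def menu_for(level):
    """Build a menu: the shared base set first, then level-specific actions."""
    if level not in LEVEL_ACTIONS:
        raise ValueError(f"unknown level: {level}")
    return BASE_ACTIONS + LEVEL_ACTIONS[level]


for level in ("notebook", "section", "cell"):
    print(level, menu_for(level))
```

Because every menu starts with the same base actions in the same order, a user who has learned one menu already knows where to find Run in the others.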
Product website
During user research, it became apparent that users were only marginally familiar with notebook-style development. The marketing site would need to explain the concept and its benefits over traditional development methods. It also explained how our multi-language support made the overall tool suite for data scientists simpler and saved development time.
Another research finding was that users gravitate towards solutions that have a healthy community and often look there for help and inspiration. Our marketing website delivered the basics (product pitch, downloads etc.) but also a section where the community can find existing notebooks to use and learn from. The idea was to seed the community with useful example notebooks, and eventually expand it into a rich open-source community as tool adoption increased.
Homepage
Community and example notebooks
Explaining the technical features
A look to the future: Cloud Notebooks
Running notebooks in the cloud has a clear advantage for complex models that take a long time to compute. The cloud provides parallel compute infrastructure that can massively reduce computation time.
The goal was to define a UX that would provide the data scientist with:
- An easy way to leverage cloud capabilities with almost zero configuration
- A simple way to monitor the progress of their computation
- Basic troubleshooting and error handling in case something goes wrong (as it always does)
Notebook in the Cloud
Configuring the cluster
Cell execution status
Results: Foundation Delivered
This project successfully delivered the open-source MVP. Although much work remained, a structurally sound foundation (UX and code) was there to build on. Initial reviews in the community were positive.
“It’s nice when elegance and logic coincide”
Our decisions to invest time in reducing the complexity of the interface and optimizing for an easy initial experience were validated by user response and initial adoption. However, the initial concerns about the value of notebooks and multiple languages remained, and adoption proceeded slowly.
Like most open-source efforts, Beaker has continued to evolve over time. Most recently, it morphed into extensions to Jupyter, the next version of the main competitor tool. As they say, if you can't beat them, join them!
See Beaker in action at NY Tech Meetup