AI-ML Process Documentation

Problem

When my Fortune 100 data science department overhauled its software stack, leadership struggled with the mountain of information it needed to communicate to associates. Images, containers, and Docker; S3, SageMaker, and Kubeflow; Informatica, Spark, and Databricks: where to begin? How should it all be organized? How could process changes be tracked?

Proposals

At first, leadership favored the simplest option: directing associates to publicly available documentation. The option had its appeal. After all, most of the companies producing the parts of our stack already had passable, and in some cases quite good, documentation. But early interviews with our new users revealed the limitations of this approach. Associates were overwhelmed by the options outlined in publicly available documentation; they didn’t know what features we could use in our enterprise-tailored version of the tools. And, of course, none of the public options documented things like internal access requests and the configuration of our custom Git repository template. Given the significant technical changes we were asking associates to undertake, employing a hodgepodge of documentation felt like it would be adding one more hurdle to the learning process.

I proposed an alternative: building a custom documentation site from the ground up using Markdown, Jekyll, and GitHub Pages. This approach, essentially a docs-as-code workflow, had the added benefit of putting changes to the documentation under version control with Git, which wasn’t possible with the tool we had been using (SharePoint). My proposal was accepted, and, despite having little experience with my doc tool stack, I quickly got to work.

Solution: a Jekyll-Based Static Site

The result of my work was a comprehensive, user-friendly, 80+ page documentation site. At the time of this writing, it remains one of the most regularly consulted technical documentation resources not only within my department but across the enterprise.

Below I include a few screenshots from the site to highlight its features. No proprietary information is featured in these images.

In this first image, you can see the site landing page. I was unsatisfied with the out-of-the-box landing page options available from my Jekyll theme, so I built this page from scratch using the Bootstrap framework. Other notable features here include:

  • Header font and color aligned to enterprise standards. I modified the CSS to accomplish this.
  • Buttons built with appropriate icons sourced from Font Awesome
  • Link to the GitHub repository’s Issues page
  • A functional search bar

Code Documentation site landing page

In the next image, you can see one of the second-level landing pages. There are a couple of things worth noting here:

  • On the left, a collapsible side navigation bar. This navigation bar remains fixed in place as a user scrolls down the page. I adjusted the Jekyll theme’s JavaScript to enable this functionality.
  • A next page button. Because our development team explained that these steps happen in sequence, I thought it logical to link each page to the next.

Getting Started Landing Page

Next, you can see a typical content page from the site. A few features evident here include:

  • A functional top navigation bar, complete with dropdown functionality. It was important to me that the site maximize navigability, and the top navigation bar offers but one path for users to access the content they need.
  • An “admonition” call-out box. A “Warning” is used here, but “Tip,” “Note,” and “Caution” call-outs can be found throughout the site, too.
  • Images to match the text description. Because these instructions direct users to navigate to a feature of GitHub most of them were not familiar with, I thought including images would help them feel confident about the tasks they are being asked to perform.

GitHub Webhooks Landing Page