For Excel, I used Office Add-ins to integrate the same ml-classify-text-js library we used earlier for Google Sheets. I followed this quickstart tutorial to create the base Excel task pane add-in. As we are building on top of that, I strongly encourage you to set that up first and come back here. (You are free to skip and browse the code below).
Here’s the task pane add-in we’re writing in Excel:
To start, make sure you finished the quickstart tutorial mentioned above.
Once you’re done with that, install the ml-classify-text-js library by running the command below in the same folder as package.json in the quickstart project:
npm install ml-classify-text
After that, there are two files in the src->taskpane folder we need to change: taskpane.html and taskpane.js.
Task pane UI
In taskpane.html, replace the body element with code below. This changes the task pane to display “Capture training data” and “Predict” buttons.
You will notice that each button is assigned an id. This lets us hook up click events to each. Let’s do that now.
Initializing event handlers + library
In taskpane.js, remove the click event handler for run, and replace it with capture and predict, as below:
Below it, delete the entire function for run. Replace it with these initialization code instead:
Capturing cell data
This part of the function gets the range selected by the user, as well as the range where data can be found.
“Used range” captures the furthest corners of the sheet that contains data. This provides a useful cap when selecting data via the columns, which would otherwise give us rows all the way to the bottom regardless if they have cell data or not.
Another interesting bit is the call to “load”, and consequently context.sync(). “load” queues up the commands to get properties we need from Excel, and context.sync executes those commands.
In the resolveRange function, we update the range value based on whether the user selected cells, or columns. So a column-select such as a “B:D” range would become “B1:D45”. At this point, we’re now ready to get some cell data based on the “resolved” range value.
Notice above how we’re “load”-ing to get “values”, and pulling them using context.sync(). We now have data! We format it using processData, ready to be sent to the classifier later.
Training + Predicting
The snippets below are either called from or is part of function called predictRange. It runs when the user clicks on “Predict”. I broke them out so I can explain each in detail:
This part trains the classifier with the cell data we processed earlier.
This selects the cell range and gets the data that we’ll predict the output for.
Remember that in this selection step we are selecting a column with data, and another empty column to place the predictions:
The code below iterates through the cell data, predicts the category via the trained classifier, and assigns the results back to the empty cells.
There are several other things we can do with this such as topic modeling, entity recognition, etc. We can also hook this up with a backend service to perform intensive ML tasks that require more processing power.
I’m looking to hear from you on those ideas, as well as how to make this code better.