Pipcook — Providing the Frontend with a Complete Intelligent Algorithm Framework

By Queyue

With the development of deep learning, all areas of our lives are undergoing intelligent transformations. As the team positioned closest to users, the frontend also wants to use AI capabilities to improve our efficiency, reduce labor costs, and provide users with a better experience. Intelligent transformation is seen as an important area of growth for the future of frontend development. However, frontend engineers have the following doubts:

Since frontend businesses are mature and provide a good user experience, why do we need machine learning?
Since machine learning requires massive volumes of data and manual labeling, why don’t we use traditional rules (if…else)?
Do we have to master advanced mathematical knowledge to apply machine learning?
Do we have to find the time to learn new languages, such as Python?

To resolve these doubts, we need a solution that lets AI improve frontend efficiency while minimizing the cost and difficulty of using machine learning. Therefore, we came up with the idea of developing a JavaScript (JS) framework that is friendly to frontend engineers. It would allow them to quickly collect and process data and conduct machine learning experiments without having to master advanced mathematics or deep learning. This framework must also be flexible, scalable, and ready for industrial use. To achieve these goals, we launched Pipcook, a frontend algorithm framework based on tfjs-node.

Through communication with frontend engineers and research, we discovered the main reasons that prevent frontend teams from entering the AI field:

Language Barriers: In the traditional machine learning and deep learning fields, Python, R, and C++ are the most common languages. JS, the language of frontend engineers, is rarely involved and is not well suited to intensive computing.
Algorithm Barriers: Mathematical and algorithm knowledge is a huge challenge for frontend personnel.
Scenario Barriers: Frontend scenarios rarely use intelligent technologies because frontend engineers cannot clearly define issues related to intelligent technologies and do not consider intelligent technologies when solving problems.
Data Barriers: It is difficult for the frontend to acquire high-quality data, which is also a common problem in the intelligent field. In addition, the data formats and specifications are not friendly to the frontend.

In the following sections, we will discuss how to solve each of the preceding problems in detail.

A Problem Analysis Diagram

Implementation Scenarios

With the rapid development of AI, intelligence has empowered many industries. We believe that there are some web scenarios where AI can be applied. However, in many cases, non-algorithm engineers cannot effectively identify the scenarios where machine learning can be used. In addition, because they lack an in-depth understanding of models and algorithms, they are not sure to what extent deep learning can solve a problem or whether it would outperform a traditional rule engine. To solve this problem, we can use either of the following methods:

Ask frontend engineers to study algorithm knowledge in depth so that they can understand the principles behind each algorithm and determine the technologies that should be used to solve different types of problems. This method is too demanding of the frontend engineers, and some may not be very enthusiastic about it.
Summarize a set of scenarios that may be encountered in frontend businesses and fields in our framework and classify these scenarios to form a case library. With these cases, frontend engineers can easily find similar scenarios, gain a better understanding of the problems that can be solved by machine learning, and intuitively apply these or similar cases to their businesses. This approach involves a much easier learning curve.

Data Processing

We know that data and models are the core elements of deep learning. If the model is a rocket engine, data is its fuel. Machine learning needs a large amount of high-quality fuel to allow it to realize its full potential. The frontend has accumulated some data over the years, and we also have advantages in data collection because we are the team closest to users.

The frontend possesses the following data:

UI data from design documents and module libraries (though this data has uneven quality)
Code data that is accumulated every day
Log data from online businesses, including performance, error, and other custom data logs
Specific data from other businesses

The data can be classified into computer vision (CV) data and text data. CV and natural language processing (NLP) are also the focus of machine learning. However, frontend engineers often do not know how to process data so that it can be turned into fuel for their models. Our framework must provide fast and simple data processing, as well as convenient capabilities, such as data quality assessment and data visualization.
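As one illustration of the data quality assessment mentioned above, the following sketch counts samples per label and flags under-represented classes. The `LabeledSample` type, the `assessLabelBalance` function, and the `minShare` threshold are hypothetical names chosen for this example. They are not part of Pipcook's API.

```typescript
// Hypothetical types for illustration; not part of Pipcook's API.
interface LabeledSample {
  uri: string;   // where the image or text lives
  label: string; // class assigned by a labeling tool
}

interface BalanceReport {
  total: number;
  counts: Record<string, number>;
  underRepresented: string[]; // labels whose share is below minShare
}

// Count samples per label and flag labels whose share of the dataset
// falls below `minShare` (an assumed threshold, 10% by default).
function assessLabelBalance(samples: LabeledSample[], minShare = 0.1): BalanceReport {
  const counts: Record<string, number> = {};
  for (const s of samples) {
    counts[s.label] = (counts[s.label] ?? 0) + 1;
  }
  const total = samples.length;
  const underRepresented = Object.keys(counts)
    .filter((label) => counts[label] / total < minShare)
    .sort();
  return { total, counts, underRepresented };
}
```

A report like this could be shown to the engineer before training, so that heavily skewed classes are noticed early and more samples are collected for them.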

Algorithms

For non-algorithm engineers, models and algorithms represent another huge obstacle. They always worry that they do not understand the mathematical principles of a model and do not know how to use deep learning frameworks, such as TensorFlow. This problem is both easy and difficult to solve.

It is easy to solve because experience has been accumulated in some traditional deep learning fields over many years, and almost every field has its own popular and mature models with industrial availability. We only need to provide model implementations in the framework. In this way, non-algorithm engineers can use models without any configuration required and do not need to worry about internal implementations. However, this problem is difficult to solve because some non-algorithm engineers think that models are too much like black-boxes and want to slightly adjust them based on their known algorithm knowledge. Therefore, we must also provide intervention and adjustment capabilities in the framework.
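The balance between "works without configuration" and "can still be adjusted" can be sketched as a set of defaults that callers may partially override. The option names and default values below are assumptions made for this example. They do not reflect Pipcook's real configuration schema.

```typescript
// Hypothetical option names and defaults; not Pipcook's real configuration schema.
interface TrainOptions {
  epochs: number;
  batchSize: number;
  learningRate: number;
}

const DEFAULT_TRAIN_OPTIONS: TrainOptions = {
  epochs: 10,
  batchSize: 32,
  learningRate: 0.001,
};

// Merge user overrides on top of the defaults and reject invalid values,
// so a caller can pass nothing at all or tune a single option.
function resolveTrainOptions(overrides: Partial<TrainOptions> = {}): TrainOptions {
  const resolved: TrainOptions = { ...DEFAULT_TRAIN_OPTIONS, ...overrides };
  const names = Object.keys(resolved) as (keyof TrainOptions)[];
  for (const name of names) {
    const value = resolved[name];
    if (typeof value !== "number" || !isFinite(value) || value <= 0) {
      throw new RangeError(`${name} must be a positive number`);
    }
  }
  return resolved;
}
```

An engineer who treats the model as a black box calls `resolveTrainOptions()` with no arguments, while one who wants to experiment passes something like `{ learningRate: 0.01 }` and keeps every other default.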

Language

JS vs. Python

The language problem is both simple and complex. The simple part is to use JS, the language that frontend developers are most familiar with. Therefore, we developed Pipcook purely in TypeScript, provided JS-based APIs, and implemented plugins for data processing and models based on tfjs-node. The complex part is that the JS-based machine learning ecosystem is still developing, and we cannot expect it to become as rich as the Python ecosystem in a short time. A framework that only uses JS is therefore bound to be incomplete to a greater or lesser extent. Our solution is to provide a Python bridge for Node.js, similar to the Python interoperability that Swift offers, so that Python libraries can be called from Node.js to help the frontend team.

Summary

After solving the preceding problems, we know why we need to use machine learning, when it can be used, and how to use it. In addition, we have provided solutions for each problem from the perspective of frontend engineers. As Pipcook and the entire JS-based machine learning ecosystem gradually mature, we believe that frontend engineers will get better at using intelligent capabilities.

A Diagram of the Pipcook Architecture

After we solved the scenario, algorithm, data processing, and language problems, we designed a pipeline-based machine learning framework for the frontend, as shown in the preceding figure. Models and data flow through the pipeline. We can embed plugins in this pipeline to process models and data and pass them downstream. Each plugin is responsible for a specific task in the machine learning cycle. Pipcook defines a series of specifications that allow third-party developers to develop plugins to extend Pipcook’s capabilities. Our framework uses TensorFlow.js for model building and training. We can also use the Python ecosystem through Python bridging. The following sections introduce several key parts of the framework.

Pipelines and Plugins

A Sequence Diagram

Pipcook is a pipeline-based framework that includes data collection, data access, data processing, model configuration, model training, model service deployment, and online training. A specific plugin is responsible for each process. Plugins allow you to customize each process, and pipelines allow you to connect plugins in a series to implement algorithm engineering. The whole process is based on Node.js, and Node Package Manager (NPM) manages and maintains the plugins. The plugins for data processing and model service deployment can be deeply integrated with the existing frontend technical system.
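The idea of connecting plugins in a series can be sketched in a few lines. This is a simplified model, not Pipcook's actual plugin specification: the `PipelineContext` shape, the `Plugin` type, and the three toy plugins are all hypothetical, and real plugins would most likely be asynchronous and installed through NPM.

```typescript
// Hypothetical plugin contract for illustration; Pipcook's actual plugin
// specification is richer and is not reproduced here.
interface PipelineContext {
  samples: number[];        // stand-in for a dataset
  log: string[];            // records which stages ran, in order
  model?: { mean: number }; // stand-in for a trained model
}

type Plugin = (ctx: PipelineContext) => PipelineContext;

// Run plugins in series: each one receives the previous plugin's output.
function runPipeline(plugins: Plugin[], initial: PipelineContext): PipelineContext {
  return plugins.reduce((ctx, plugin) => plugin(ctx), initial);
}

// Three toy plugins, one per stage.
const collectData: Plugin = (ctx) => ({
  ...ctx,
  samples: [2, 4, 6, 8],
  log: [...ctx.log, "data-collect"],
});

const processData: Plugin = (ctx) => ({
  ...ctx,
  samples: ctx.samples.map((x) => x / 2), // a simple rescaling step
  log: [...ctx.log, "data-process"],
});

const trainModel: Plugin = (ctx) => {
  const sum = ctx.samples.reduce((a, b) => a + b, 0);
  return {
    ...ctx,
    model: { mean: sum / ctx.samples.length }, // "training" is just an average here
    log: [...ctx.log, "model-train"],
  };
};

const pipelineResult = runPipeline(
  [collectData, processData, trainModel],
  { samples: [], log: [] },
);
```

Because every plugin has the same signature, a stage can be replaced (for example, a different data processing step) without touching the rest of the pipeline.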

Data Collection, Access, and Processing

Pipcook defines a set of dataset specifications. These avoid the data access and usage costs that arise when plugins for data collection, access, and processing each follow a different dataset standard. They also ensure that data can be shared between different pipelines. By following these specifications, the plugins can produce standard, unified datasets regardless of which labeling tool was used. The data processing plugin makes it easier to understand and optimize datasets.
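To make the benefit of a shared dataset specification concrete, here is a minimal sketch in which two differently shaped exports are converted into one unified format. The `UnifiedSample` type and the two "tool" record shapes are made up for this example. Pipcook's real dataset specification is different and more detailed.

```typescript
// Hypothetical unified sample format; Pipcook's real dataset specification differs.
interface UnifiedSample {
  image: string;
  label: string;
}

// Two made-up export shapes standing in for different labeling tools.
interface ToolARecord { file: string; category: string; }
interface ToolBRecord { path: string; tags: string[]; }

// Each adapter converts one tool's export into the unified format.
function fromToolA(records: ToolARecord[]): UnifiedSample[] {
  return records.map((r) => ({ image: r.file, label: r.category }));
}

function fromToolB(records: ToolBRecord[]): UnifiedSample[] {
  // Assumption: the first tag is the class label; records without tags are skipped.
  return records
    .filter((r) => r.tags.length > 0)
    .map((r) => ({ image: r.path, label: r.tags[0] }));
}

// Downstream code depends only on UnifiedSample, so datasets can be merged
// or shared between pipelines regardless of where they came from.
function labelsOf(dataset: UnifiedSample[]): string[] {
  const seen: Record<string, boolean> = {};
  dataset.forEach((s) => { seen[s.label] = true; });
  return Object.keys(seen).sort();
}
```

Only the adapters know about tool-specific formats. Everything after them, such as `labelsOf`, works on the unified type alone.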

tfjs-node

The underlying models and algorithm capabilities of Pipcook are provided by tfjs-node, the Node.js version of TensorFlow, a well-known machine learning framework. tfjs-node makes it much easier to use JS for machine learning, and our JS-based machine learning platform can use it directly. For example, we can use mature official models (such as MobileNets), use basic operators to build a new model, or use its tensor capabilities to make up for the fact that the JS platform does not have something similar to NumPy.

As a brand new JS-based machine learning platform that has only been open source for a short time, Pipcook still has many imperfections. To push the whole frontend industry towards intelligent development, we will work to continually optimize Pipcook.

Model Capability

Currently, Pipcook’s built-in plugins support a pipeline for image classification and object detection, and the pipeline for object detection uses Python capabilities. In the future, we hope to develop models based on the native tfjs-node to expand the JS-based machine learning ecosystem. In addition, Pipcook will continue to provide more plugins to support popular deep learning tasks, such as NLP and image segmentation. We also welcome third-party developers to contribute to these models.

Distributed Computing

As data volumes and model complexity increase, our computing power may prove insufficient. In the future, we will train models on multiple devices, support parallel, distributed, and asynchronous training, and use clusters to solve computing power problems.

Deployment Optimization

Currently, Pipcook only supports simple solutions, such as local deployment. In the future, Pipcook will cooperate with various cloud service providers, such as Alibaba Cloud, AWS, and Google Cloud, to deploy models to their cloud-based machine learning deployment services directly from the pipeline. This will allow you to start using prediction services as soon as training is completed.

Summary

In the future, we hope to combine the power of Alibaba’s intelligent frontend team and the entire open-source community to continuously optimize Pipcook and the push for intelligent frontend capabilities it represents. This way, we can provide inclusive technical solutions for intelligent frontend capabilities, accumulate more competitive samples and models, provide intelligent code generation services with higher accuracy and availability, and improve frontend R&D efficiency. In addition, frontend engineers will no longer have to do simple and repetitive work, giving them more time to focus on challenging work.
