The Collective Knowledge (CK) concept provides a simple way to manage, connect and reuse artifacts, scripts and workflows on any platform with any software stack. It is motivated by our tedious experience of reproducing research papers on machine learning and systems and validating them in the real world.
Read more about the CK concept:
CK helps researchers and developers turn their scripts and artifacts into a database of portable, reusable, customizable and deterministic components with minimal effort and no changes to their projects.
All such components have a simple, human-friendly and platform-independent CLI, Python API, JSON/YAML meta description, tags, and Unique ID automatically generated by CK.
This approach allows users to automatically plug any ad-hoc scripts and artifacts from the community into their projects, build systems, CI/CD tools, containers, Jupyter/Colab notebooks and any other technology.
The CK runtime system also helps users interconnect scripts and artifacts into portable workflows, applications and web services. They can run natively or inside containers while automatically adapting to the given software and hardware.
Any output of CK components and workflows (CSV/XLS/JSON/YAML files, pre-processed data sets, notes and optimized code) can also be stored as CK components with all related CM dependencies. This database-like organization of projects makes it easier for the community to re-run, reproduce and reuse research results.
We have donated CK to the MLCommons foundation to benefit everyone after it was successfully validated by Qualcomm, Arm, General Motors, OctoML, Krai, HPE, Dell, Lenovo and other organizations.
We continue improving and enhancing CK based on your feedback and requirements - feel free to join our open workgroup to provide your feedback and participate in further developments!
© 2021-2022 MLCommons
Collective Mind workflow automation meta-framework (CM) is the implementation of the CK concept.
CM is being developed by the community within the open MLPerf workgroup to modularize complex AI systems and automate their co-design, benchmarking, optimization and deployment across continuously changing software, hardware and data.
The CM meta-framework is implemented as a small Python class with a CLI and minimal dependencies (Python 3+, PIP and a Git client). It can be installed on any platform using PIP:
pip install cmind
cm version
1.0.3
Please check the CM installation guide to install system dependencies and CM on Linux, Windows, MacOS and other operating systems.
If you encounter any problems, just open a ticket here and we will get back to you!
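Before opening a ticket, it can help to confirm that the package is visible to the Python interpreter you are using. The following is a minimal sketch using only the standard library; `is_module_installed` is a hypothetical helper name, and `cmind` is the module name used in the Python examples in this README:

```python
import importlib.util


def is_module_installed(module_name="cmind"):
    """Return True if Python can locate the named top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if is_module_installed():
    print("cmind is importable")
else:
    print("cmind not found; try: pip install cmind")
```

If the check fails even after installation, PIP may have installed the package for a different Python interpreter than the one running the check.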
CM allows users to add a simple database-like interface to their existing Git repositories to make it easier for the community to reuse their artifacts and automation scripts.
We are working with the community to assemble various artifacts and scripts from reproduced ML and Systems papers and MLPerf benchmarks as cross-platform CM components in this GitHub repository.
These components provide a unified API, CLI and meta to automatically download, install, build and interconnect ML applications, models, data sets, engines, libraries, SDKs and tools for different platforms and operating systems.
You can install this repository (or any other repository with the "CM-compatible logo") as follows:
cm pull repo --url=https://github.com/mlcommons/ck
or
cm pull repo mlcommons@ck
You can then use the CM database interface to list, find, add, remove, update and load any CM component with a native user artifact or script:
cm list script
cm find script --tags=app,image-classification
You can also run any shared script on any platform via the CM runtime natively or inside a container as follows:
cm run script --tags=detect,os --out
You can run the same script as a unified micro-service or standalone app using the CM Python API as follows:
import cmind
r=cmind.access({'action':'run', 'automation':'script', 'tags':'detect,os'})
print (r)
Users can pull multiple public and private repositories compatible with the CM and automatically interconnect shared CM components into portable applications and workflows.
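The Python call shown earlier returns a dictionary `r`. The sketch below shows one way to check it before using the output. It assumes the CK/CM convention that the dictionary carries an integer `return` key (0 on success) and an `error` string when `return` is greater than 0. `check_cm_result` is a hypothetical helper, not part of the `cmind` package, and the sample dictionaries are made up so that the sketch runs without CM installed:

```python
def check_cm_result(r):
    """Return r unchanged on success; raise RuntimeError if CM reported an error."""
    # Assumption: 'return' is 0 on success and > 0 on failure,
    # with a human-readable message stored under 'error'.
    code = r['return']
    if code > 0:
        raise RuntimeError(
            'CM call failed ({}): {}'.format(code, r.get('error', 'unknown error')))
    return r


# Stand-in results (illustrative only, not real CM output).
ok = check_cm_result({'return': 0, 'list': []})

try:
    check_cm_result({'return': 1, 'error': 'script not found'})
except RuntimeError as e:
    message = str(e)
    print(message)
```

With CM installed, the same helper could wrap the real call, for example `check_cm_result(cmind.access({...}))`.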
The following example demonstrates how to run a modular image classification workflow using CM. A CM workflow is itself a unified CM component that chains other cross-platform CM components using a simple, human-friendly JSON/YAML description.
These components update global environments and a CM "state" dictionary similar to native scripts while automatically detecting, downloading, installing and building all related artifacts and tools to adapt this workflow to any user platform:
cm run script --tags=app,image-classification,onnx,python --quiet
cm run script --tags=app,image-classification,onnx,python --input=my-image.jpg
or using Python scripting:
import cmind
r=cmind.access({'action':'run', 'automation':'script',
'tags':'app,image-classification,onnx,python',
'quiet':True})
print (r)
It may take a few minutes to run this workflow for the first time and adapt it to your platform depending on the Internet speed. Note that all the subsequent runs will be much faster because CM automatically caches the output of all components to be quickly reused in this and other CM workflows.
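In the examples in this README, each CLI command and its Python counterpart carry the same information: the first two words become `action` and `automation`, and each `--flag` becomes a dictionary key. The sketch below illustrates that correspondence. It is not CM's own parser; `cm_cli_to_dict` is a hypothetical helper, and storing a third positional word under the key `artifact` is an assumption:

```python
import shlex


def cm_cli_to_dict(command):
    """Translate a 'cm ...' command line into a dictionary (illustrative only)."""
    words = shlex.split(command)
    if words and words[0] == 'cm':
        words = words[1:]

    flags = [w for w in words if w.startswith('--')]
    positional = [w for w in words if not w.startswith('--')]
    if len(positional) < 2:
        raise ValueError('expected: cm <action> <automation> [artifact] [--flags]')

    request = {'action': positional[0], 'automation': positional[1]}
    if len(positional) > 2:
        request['artifact'] = positional[2]  # assumed key name

    for flag in flags:
        key, sep, value = flag[2:].partition('=')
        # A flag without '=' (such as --quiet) is treated as a boolean switch.
        request[key] = value if sep else True
    return request


request = cm_cli_to_dict(
    'cm run script --tags=app,image-classification,onnx,python --quiet')
print(request)
```

The resulting dictionary has the same keys and values as the one passed to `cmind.access` in the Python example above.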
See reusable CM scripts for this workflow:
- Digging into image classification CM workflow
- Understanding CM database and runtime interface
- Understanding cross-platform CM scripts
- Running and reproducing MLPerf inference benchmarks
Please check this guide to plug your own artifacts, scripts and workflows into CM.
You can then share CM components in your Git repository for your own scripts and artifacts to help the community reuse and build upon them (similar to this CM repository).
You can join our open workgroup and/or contribute your own CM components here.
These components must automatically download, install, build and preprocess different applications, models, data sets, frameworks, compilers, SDKs and tools across different platforms and operating systems to help the community modularize, simplify, co-design, benchmark, optimize and deploy efficient AI/ML systems.
You are very welcome to provide feedback and report bugs here.
Thank you for your support. We look forward to collaborating with you!
Please check this page to get the legacy CK framework v2.6.1. Note that we are gradually converting old CK components and workflows to the new CM format - please open a ticket if you need help!
- Open workgroup to automate co-design, benchmarking, optimization and deployment of ML Systems
- Artifact evaluation and reproducibility initiatives at conferences
- CK and CM motivation
- Journal article describing the CK concept
- "Reproducing 150 Research Papers and Testing Them in the Real World" (ACM TechTalk; Feb 2021)
- "MLOps Is a Mess But That's to be Expected" (March 2022)
- "Automating MLPerf design space exploration and production deployment" (HPCA'22)
- "Collaboratively Benchmarking and Optimizing Deep Learning Implementations" (General Motors; Jun 2017)
- MLOps projects, articles and tools
- Grigori Fursin and Arjun Suresh (OctoML and MLCommons)