gmarkall/ck

 
 


CK - enabling and simplifying collaborative and reproducible research, experimentation, knowledge management and predictive analytics

Introduction

Collective Knowledge (CK) is a lightweight, portable, modular, Python-based framework, repository, web service and SDK for organizing, describing, cross-linking and sharing user code, data, experimental setups and meta information as unified and reusable components with a JSON API via standard Git services (such as GitHub or Bitbucket).

Such unification can help researchers assemble experimental setups (workflows, pipelines, etc.) from shared components, like LEGO(TM) bricks, to quickly prototype ideas while automating, preserving, distributing, crowdsourcing and reproducing experiments.

Furthermore, CK simplifies connecting unified experiments to powerful predictive analytics tools such as scikit-learn and R (statistical analysis, data mining, machine learning) to automate and speed up exploration of multi-dimensional experimental choices, analysis of results and decision making.

Finally, a unified and integrated repository of knowledge allows the community to systematize unstructured code and data while collaboratively validating experimental results, improving models, finding missing features and so on.

For example, we use this framework primarily to systematize, open up and crowdsource our research and experimentation on designing faster, smaller, more power-efficient and more reliable software and hardware (by combining autotuning, run-time adaptation, machine learning and a public repository of knowledge).

Further information including documentation, user scenarios and developer guide:

License

  • Permissive simplified 3-clause BSD license (see LICENSE.txt).

Author

Grigori Fursin, non-profit cTuning foundation, France

Extra description

CK is intended to enable open and collaborative research and experimentation, particularly in computer engineering. It can help users preserve, organize, describe, cross-link and share their code, data and experimental results as reusable Python-based components with a very simple API (one function with JSON in and JSON out) and a JSON-based, schema-free meta description.
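The "one function with JSON in and JSON out" idea can be illustrated with a minimal sketch. This is not CK's kernel code: the function name `access`, the `ACTIONS` registry, the two toy actions and the `return`/`error` keys are all assumptions chosen for illustration.

```python
import json


def _add(request):
    # Hypothetical action: sum a list of numbers passed under "values".
    return {"result": sum(request.get("values", []))}


def _echo(request):
    # Hypothetical action: hand back the text passed under "text".
    return {"result": request.get("text", "")}


# Illustrative registry mapping action names to handler functions.
ACTIONS = {"add": _add, "echo": _echo}


def access(request):
    """Single entry point: accepts a dict or a JSON string, returns a dict."""
    if isinstance(request, str):
        request = json.loads(request)
    action = request.get("action")
    handler = ACTIONS.get(action)
    if handler is None:
        # Errors are reported in the returned dict rather than raised.
        return {"return": 1, "error": "unknown action: %r" % (action,)}
    response = {"return": 0}
    response.update(handler(request))
    return response
```

Because both the input and the output are plain JSON-compatible dictionaries, the same call could in principle be issued from a command line, a web service or another language by serializing the request, which is the property the paragraph above describes.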

All components can be transparently indexed and searched via third-party Lucene-based Elasticsearch, shared via Git, and connected together into pipelines (workflows), like LEGO(R) bricks, to quickly prototype various research ideas, crowdsource experiments, reproduce past results, perform statistical analysis and apply predictive analytics (as described in http://hal.inria.fr/hal-01054763).

CK can be invoked in a unified way from the command line, via a web service, or from other programs and tools written in practically any language, including C, C++, Fortran, PHP and Java, using the separately developed OpenME event-based plugin framework (https://github.com/ctuning/openme).

CK has been tested with Python 2.x and 3.x on various platforms including Ubuntu, OpenSUSE, CentOS, Android (partial support through web services and OpenME) and Windows 7.

CK is extensively used in the following projects:

Developers

This software is being developed by the non-profit cTuning foundation and its volunteers.

(C)opyright 2014-2015 Grigori Fursin, cTuning foundation and contributors

Where to get

Minimal requirements

  • Python > 2.6 (Python 2.x may have some issues with Unicode support)
  • Python 3.x is supported

We have been successfully using the Anaconda Python distribution, which includes all scientific packages required for predictive analytics in CK.

Installation

We made a special effort to make installation as simple as possible.

You need to:

  • set the environment variable CK_ROOT to the root CK directory.
  • add CK_ROOT/bin to the PATH environment variable:
      • Linux: export PATH=$CK_ROOT/bin:$PATH
      • Windows: set PATH=%CK_ROOT%/bin;%PATH%
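If the `ck` command is not found afterwards, the two settings above can be checked with a short script. This is a minimal sketch and not part of CK; the helper name `check_ck_environment` is hypothetical, and it only inspects the CK_ROOT and PATH variables described above.

```python
import os


def check_ck_environment(environ=None):
    """Return a list of problems found; an empty list means both settings look fine."""
    environ = os.environ if environ is None else environ
    problems = []

    ck_root = environ.get("CK_ROOT")
    if not ck_root:
        problems.append("CK_ROOT is not set")
        return problems

    # Normalize paths so that trailing or mixed separators do not matter.
    expected = os.path.normpath(os.path.join(ck_root, "bin"))
    entries = [
        os.path.normpath(entry)
        for entry in environ.get("PATH", "").split(os.pathsep)
        if entry
    ]
    if expected not in entries:
        problems.append("%s is not on PATH" % expected)
    return problems


if __name__ == "__main__":
    for problem in check_ck_environment():
        print(problem)
```

Passing a dictionary instead of relying on `os.environ` makes it possible to try the check against a planned configuration before changing the real environment.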

Now you should be able to run CK from the command line:

ck

If you use international characters, you can check your console's UTF-8 Unicode support by viewing the following test entry:

ck load test:unicode

If you want to use CK as a Python module or from IPython and the IPython notebook, run the following command from the CK root directory:

python setup.py install

In that case, if you have IPython installed, you can check CK as follows:

ipython

import ck.kernel as ck

ck.test()

ck.access('load kernel default')
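When calling CK from a script rather than interactively, it can help to check the result of each call. The sketch below assumes that the access function also accepts a dictionary with `action`, `module_uoa` and `data_uoa` keys and that it returns a dictionary with a numeric `return` code and an `error` message on failure; these conventions are assumptions to be verified against the CK documentation. The helper `call_ck` and the stub `fake_access` are hypothetical and exist only to keep the example self-contained.

```python
def call_ck(access, request):
    """Call a CK-style access function and raise if it reports an error."""
    response = access(request)
    # Assumed convention: 'return' is 0 on success and greater than 0 on failure.
    if response.get("return", 0) > 0:
        raise RuntimeError(response.get("error", "unknown CK error"))
    return response


def fake_access(request):
    # Stand-in for ck.access so that the sketch runs without CK installed.
    if request.get("action") == "load":
        return {"return": 0, "dict": {}}
    return {"return": 1, "error": "action not supported by this stub"}


# Assumed dictionary form of the request; key names are not confirmed here.
request = {"action": "load", "module_uoa": "kernel", "data_uoa": "default"}
result = call_ck(fake_access, request)
```

With CK installed, `fake_access` would be replaced by `ck.access` after `import ck.kernel as ck`.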

You can also easily clone or pull various external CK repositories shared via GitHub (see the CK wiki for more details) to build collaborative R&D scenarios:

ck pull repo:ck-env

ck pull repo:ck-web

ck pull repo:ck-auto-tuning

You can update the whole framework and all shared repos as follows:

ck pull all
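The pull commands above can also be scripted, for example when setting up several machines. This is a minimal sketch for Python 3 that only wraps the `ck pull repo:<name>` command line shown above; the helper names `build_pull_commands` and `pull_repos` are hypothetical, and by default nothing is executed.

```python
import shutil
import subprocess


def build_pull_commands(repos):
    """Build one 'ck pull repo:<name>' argument list per repository name."""
    return [["ck", "pull", "repo:%s" % name] for name in repos]


def pull_repos(repos, dry_run=True):
    """Return the commands; run them only if dry_run is False and ck is on PATH."""
    commands = build_pull_commands(repos)
    ck_path = shutil.which("ck")
    if dry_run or ck_path is None:
        return commands
    for command in commands:
        # Use the resolved executable path so wrapper scripts are found as well.
        subprocess.run([ck_path] + command[1:], check=True)
    return commands


if __name__ == "__main__":
    for command in pull_repos(["ck-env", "ck-web", "ck-auto-tuning"]):
        print(" ".join(command))
```

Calling `pull_repos(..., dry_run=False)` would actually invoke CK, so the dry run is a convenient way to review the commands first.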

Collaborative R&D usage scenarios:

Users can assemble various R&D scenarios (workflows, pipelines) from shared code and data, like LEGO(R) bricks, using the simple CK API. Pipelines are also implemented as CK modules and can be shared with all their dependencies via Git (e.g. GitHub). See more details at:

Questions/comments/discussions?

Please use our mailing lists:

Publications

The concepts have been described in the following publications:

If you find CK interesting and possibly useful, you are welcome to cite the above publications in your reports.

Fun

Acknowledgments

This project was initially funded by the EU FP7 609491 TETRACOM project. We would like to thank all volunteers for their valuable feedback and support.
