LLM Output Classifier

The Project

In this project, we conduct a simple experiment to investigate whether an LLM can recognize its own output.

The experiment is divided into two phases:

  • Phase 1

    In this phase, we feed prompt inputs to three different LLMs (OpenAI GPT-4, Llama 2, and Mixtral 8x7B) and save the results to files.
    [Diagram: LLM Compare Phase 1]

  • Phase 2

    We build a classification query asking each LLM to determine which of the Phase 1 outputs is its own; a rough sketch of the full two-phase flow follows this list.
    [Diagram: LLM Compare Phase 2]
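
As a rough illustration, the two-phase flow might look like the sketch below. The prompt wording, file paths, and the ask() helper are hypothetical, not taken from the repository; the real dispatch happens through the LLMCaller wrapper described in the next section.

```python
# Hypothetical sketch of the two-phase experiment (prompt wording, file
# paths, and ask() are illustrative, not the repository's actual code).
MODELS = ["gpt-4", "llama2", "mixtral"]
PROMPT = "Explain photosynthesis in two sentences."

def ask(model: str, prompt: str) -> str:
    """Dispatch a prompt to one model (see the LLMCaller wrapper below)."""
    raise NotImplementedError

# Phase 1: collect one answer per model and save it to disk.
outputs = {}
for model in MODELS:
    outputs[model] = ask(model, PROMPT)
    with open(f"results/{model}.txt", "w") as f:
        f.write(outputs[model])

# Phase 2: show every model all three answers and ask it to pick its own.
numbered = "\n".join(f"{i + 1}. {text}" for i, text in enumerate(outputs.values()))
query = (
    "Three different LLMs answered the same prompt. Which of the following "
    "answers did you write? Reply with the number only.\n" + numbered
)
votes = {model: ask(model, query) for model in MODELS}
```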

LLM wrapper

LLMCaller is a wrapper for calling various LLMs dynamically. Under the hood, the class uses the Ollama platform to connect locally to the Llama 2 and Mixtral models, and LangChain to connect remotely to OpenAI GPT-4.
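
A minimal sketch of what such a wrapper could look like, assuming the ollama Python package and the langchain_openai integration; the class internals and method names below are assumptions, not the repository's actual code.

```python
# Hypothetical LLMCaller sketch; internals and method names are assumed.
import ollama                              # local Llama 2 / Mixtral via Ollama
from langchain_openai import ChatOpenAI    # remote GPT-4 via LangChain

class LLMCaller:
    """Dispatches a prompt to a model chosen by name."""

    LOCAL_MODELS = {"llama2", "mixtral"}

    def __init__(self):
        # ChatOpenAI reads OPENAI_API_KEY from the environment
        # (set via the .env file in the Quickstart below).
        self._gpt4 = ChatOpenAI(model="gpt-4")

    def call(self, model: str, prompt: str) -> str:
        if model in self.LOCAL_MODELS:
            # Requires a running `ollama serve` on localhost.
            return ollama.generate(model=model, prompt=prompt)["response"]
        # Everything else is routed to OpenAI through LangChain.
        return self._gpt4.invoke(prompt).content
```

With a wrapper like this, both phases reduce to a single call("model-name", prompt) regardless of whether the model runs locally or remotely.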

Quickstart

  1. To run this project, you need pipenv installed:

    • On Linux: sudo apt install pipenv
    • On Mac: brew install pipenv
  2. If you don't have Ollama already, you can download it from the Ollama website

  3. Add environment variables:

    • cp .env.example .env to copy the environment file
    • Add your OpenAI API key to the .env file (a sample .env is shown after this list)
  4. Install packages:

    • pipenv shell to enter the virtual environment
    • pipenv install to install the Python packages listed in the Pipfile
  5. Run the Ollama server:

    • ollama serve to run the Ollama server locally and handle Llama 2 and Mixtral prompt requests
  6. In another shell, run the scripts:

    • python3 generate_results.py to run Phase 1
    • python3 classify_results.py to run Phase 2
  7. The classification results can be found in the repository
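
For reference, the .env file presumably needs only the OpenAI key. The variable name below follows the common OpenAI/LangChain convention and is an assumption about what .env.example contains:

```
# Assumed contents of .env; the key name is the standard OpenAI convention
OPENAI_API_KEY=<your-openai-api-key>
```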

A more detailed discussion of this project can be found in the accompanying Medium blog post.
