In this project, we conduct a simple experiment to investigate whether an LLM can recognize its own output.
The experiment is divided into two phases:
- **Phase 1**: We feed prompt inputs to three different LLMs (OpenAI GPT-4, Llama 2, and Mixtral 8x7B) and save the results to files.
- **Phase 2**: We build a classification query asking each LLM to determine which of the outputs from Phase 1 is its own (a sketch of such a query follows this list).
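As a rough illustration of the Phase 2 query, something like the following could assemble the prompt. The exact wording and file handling in `classify_results.py` may differ; the function name and prompt text here are hypothetical:

```python
# Hypothetical sketch of a Phase 2 classification query; the actual prompt
# used in classify_results.py may be worded differently.
def build_classification_query(outputs: list[str]) -> str:
    """List the anonymized Phase 1 outputs and ask the model to pick its own."""
    numbered = "\n\n".join(
        f"Output {i + 1}:\n{text}" for i, text in enumerate(outputs)
    )
    return (
        "Below are responses produced by three different LLMs for the same "
        "prompt. Exactly one of them was written by you. "
        "Answer with the number of the output you believe is yours.\n\n"
        + numbered
    )
```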
LLMCaller is a wrapper for calling the various LLMs dynamically. Under the hood, the class uses the Ollama platform to connect locally with the Llama 2 and Mixtral models, and LangChain to connect remotely with OpenAI GPT-4.
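A minimal sketch of what such a wrapper might look like, assuming the `langchain-openai` and `requests` packages and Ollama's default local endpoint; the class name comes from the repo, but the method names and structure below are illustrative:

```python
import requests
from langchain_openai import ChatOpenAI  # assumes the langchain-openai package

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


class LLMCaller:
    """Illustrative wrapper: routes a prompt to a local Ollama model or to GPT-4."""

    def __init__(self, model: str):
        self.model = model

    def call(self, prompt: str) -> str:
        if self.model == "gpt-4":
            # Remote call via LangChain; reads OPENAI_API_KEY from the environment.
            return ChatOpenAI(model="gpt-4").invoke(prompt).content
        # Local call via the Ollama REST API (e.g. model="llama2" or "mixtral").
        resp = requests.post(
            OLLAMA_URL,
            json={"model": self.model, "prompt": prompt, "stream": False},
            timeout=300,
        )
        resp.raise_for_status()
        return resp.json()["response"]
```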
- To run this project, you need to have `pipenv` installed:
  - On Linux: `sudo apt install pipenv`
  - On Mac: `brew install pipenv`
- If you don't have Ollama already, you can download it here
- Add environment variables:
  - `cp .env.example .env` to copy the environment file
  - Add your OpenAI API key to the `.env` file
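  - The key entry is typically a single line such as `OPENAI_API_KEY=sk-...` (the variable name is assumed from LangChain's convention; check `.env.example` for the exact name)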
- Install packages:
  - `pipenv shell` to enter the virtual environment
  - `pipenv install` to install the Python packages listed in `Pipfile`
- Run the Ollama server:
  - `ollama serve` to run the Ollama server locally and handle Llama 2 and Mixtral prompt requests
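  - If Llama 2 and Mixtral are not already available locally, you may first need to fetch them with `ollama pull llama2` and `ollama pull mixtral`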
- In another shell, run the scripts:
  - `python3 generate_results.py` to run Phase 1
  - `python3 classify_results.py` to run Phase 2
The classification result can be found here.

A more detailed discussion of this project can be found in this Medium blog post.