Objective Personality AI (OPAI) is a project aimed at developing AI models that classify personality types based on video transcripts. It uses datasets gathered from various sources, including YouTube, and applies machine learning techniques to perform the classification.
To run the scripts effectively, especially those that compute embeddings with models such as GritLM/GritLM-7B, ensure your system meets the following requirements:
- GPU Memory: At least 27 GB of GPU memory is required to compute embeddings with the GritLM/GritLM-7B model.
- Processing Time: It takes approximately 5 seconds to compute embeddings per dataset entry.
Note: These requirements are crucial for performance and avoiding runtime errors due to insufficient resources.
For alternative models, see the MTEB leaderboard.
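As a rough illustration of the processing-time figure above, the following sketch estimates the total embedding time for a dataset. The function name `estimate_embedding_time` is hypothetical and not part of this repository; the default of 5 seconds per entry is the approximate figure quoted above, and actual throughput will depend on hardware and transcript length.

```python
def estimate_embedding_time(num_entries: int, seconds_per_entry: float = 5.0) -> str:
    """Return a rough wall-clock estimate formatted as H:MM:SS.

    Hypothetical helper: assumes a constant cost per dataset entry.
    """
    if num_entries < 0:
        raise ValueError("num_entries must be non-negative")
    total_seconds = int(round(num_entries * seconds_per_entry))
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# 1000 entries at about 5 s each is 5000 s in total.
print(estimate_embedding_time(1000))  # 1:23:20
```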
The success of AI typing relies heavily on the quality and variety of data it can access. Currently, there are two methods for gathering data:
- Utilizing the dataset created by Tom Aylott (subtlegradient)
- Scraping data from YouTube videos using this project
The repository expects the dataset as a CSV file whose path is set in the .env file as TRANSCRIPTS_CSV=<path_to_dataset.csv>. The file must have the following columns:
name,ops_type,ModalitySensory,ModalityDe,ObserverDecider,DiDe,OiOe,SN,TF,SleepPlay,BlastConsume,InfoEnergy,IntroExtro,FlexFriends,GeneralisationSpecialisation,transcript_tokens_length,transcript
Field descriptions:
- name: Normalized person's name, using utils#normalise_name(name)
- ops_type: Full ops type (e.g. MF-Ni/Fi-SB/P(C) [2])
- ModalitySensory: 'F' | 'M' | None. Sexual modality of the sensory function
- ModalityDe: 'F' | 'M' | None. Sexual modality of the extroverted decider function
- ObserverDecider: 'Observer' | 'Decider' | None
- DiDe: 'Di' | 'De' | None
- OiOe: 'Oi' | 'Oe' | None
- ...
- transcript_tokens_length: Number of tokens computed with tiktoken.get_encoding("cl100k_base")
- transcript: Transcript of a person
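Before running the scripts, it can help to confirm that a dataset has the expected header. The sketch below uses only the standard library and the column names listed above; the helper `missing_columns` is a hypothetical name, not a function from this repository.

```python
import csv
import io

# Column names taken from the CSV header shown above.
EXPECTED_COLUMNS = [
    "name", "ops_type", "ModalitySensory", "ModalityDe", "ObserverDecider",
    "DiDe", "OiOe", "SN", "TF", "SleepPlay", "BlastConsume", "InfoEnergy",
    "IntroExtro", "FlexFriends", "GeneralisationSpecialisation",
    "transcript_tokens_length", "transcript",
]


def missing_columns(csv_file) -> list:
    """Return the expected columns that are absent from the CSV header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    return [column for column in EXPECTED_COLUMNS if column not in header]


# In-memory example; for a real dataset, pass open(path, newline="") instead.
sample = io.StringIO(",".join(EXPECTED_COLUMNS) + "\n")
print(missing_columns(sample))  # []
```

An empty list means every expected column is present; any names returned are missing from the file.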
Clone the repository and navigate to the project directory:
git clone https://github.com/stanbar/objectivepersonality.ai.git
cd objectivepersonality.ai
Install dependencies:
poetry install
To compute embeddings for the transcripts in TRANSCRIPTS_CSV and write the output to TRANSCRIPTS_WITH_EMBEDDINGS_CSV:
poetry run append_embeddings.py
To run benchmarks for all classifiers:
./benchmark.sh
This project is licensed under the PolyForm Perimeter License 1.0.1 - see the LICENSE file for details.
For support, raise an issue in the GitHub issue tracker or contact the maintainers via hello@objectivepersonality.ai
To compute the values of the people in the interviews:
ollama serve
python3 run values.py
To compute people's demons and saviours:
ollama serve
python3 run saviours_demons.py