Merged
251 commits
0a5b8ac
CODE_OF_CONDUCT.md committed
microsoftopensource Jan 8, 2024
7999376
README.md committed
microsoftopensource Jan 8, 2024
ffe60a2
SECURITY.md committed
microsoftopensource Jan 8, 2024
d5b4496
SUPPORT.md committed
microsoftopensource Jan 8, 2024
7b21ad2
LICENSE committed
microsoftopensource Jan 8, 2024
400c261
init commit
vyokky Feb 6, 2024
c686818
init commit
vyokky Feb 6, 2024
5b54235
init commit
vyokky Feb 6, 2024
6b522a5
init commit
vyokky Feb 6, 2024
ac2a71d
Update README.md
vyokky Feb 6, 2024
b1ab9bb
readme
vyokky Feb 6, 2024
e279b2c
readme
vyokky Feb 6, 2024
bb9c187
readme
vyokky Feb 6, 2024
6ee4508
readme
vyokky Feb 7, 2024
083a11a
readme
vyokky Feb 7, 2024
3de5487
readme
vyokky Feb 7, 2024
5b3c6ba
readme
vyokky Feb 7, 2024
9e7b927
readme
vyokky Feb 7, 2024
787b55c
readme
vyokky Feb 7, 2024
2fe3990
support openai
vyokky Feb 7, 2024
d2f404e
readme
vyokky Feb 7, 2024
faa095c
readme
vyokky Feb 7, 2024
f9542d2
fix api
vyokky Feb 7, 2024
a71cab7
readme
vyokky Feb 7, 2024
4b05b0d
readme
vyokky Feb 7, 2024
3985840
readme
vyokky Feb 7, 2024
cbbd385
readme
vyokky Feb 7, 2024
f5d0f24
readme
vyokky Feb 7, 2024
302cf03
readme
Feb 7, 2024
e57dc48
readme
Feb 7, 2024
9a326f1
readme
Feb 8, 2024
e0f0282
readme
Feb 8, 2024
5469f46
readme
Feb 8, 2024
c6c0a6b
readme
Feb 8, 2024
bb6daef
readme
Feb 8, 2024
e9b850a
readme
Feb 9, 2024
7632142
readme
Feb 9, 2024
5a99ec4
readme
Feb 9, 2024
19f86f8
readme
Feb 9, 2024
0eb3732
readme
Feb 9, 2024
9eafff8
readme
Feb 13, 2024
526a267
readme
Feb 14, 2024
3d1bab3
readme
Feb 14, 2024
85384c6
readme
Feb 14, 2024
7fc3d4c
Update README.md
eltociear Feb 14, 2024
5df907e
readme
Feb 14, 2024
81a4649
Merge pull request #1 from eltociear/patch-1
vyokky Feb 14, 2024
dcbad05
readme
Feb 14, 2024
0c61d06
fix: word spelling in prompt
saifeiLee Feb 19, 2024
4bc22af
Merge pull request #6 from saifeiLee/fix-word-spelling
vyokky Feb 19, 2024
309fbb4
readme
Feb 19, 2024
90fff8f
readme
Feb 19, 2024
2e3cff3
readme
Feb 19, 2024
0be3156
readme
Feb 19, 2024
eb4c8eb
remove print
vyokky Feb 20, 2024
30d0e14
fix aoai config
vyokky Feb 20, 2024
a36175b
plan list type
vyokky Feb 20, 2024
c1cc56d
todo
vyokky Feb 20, 2024
3c60b37
Fix colorama init
lserinol Feb 21, 2024
1d1f3da
Merge pull request #10 from lserinol/fix-colorama-init
vyokky Feb 22, 2024
9ebd313
x
Feb 22, 2024
8239e5b
x
Feb 22, 2024
0717723
x
Feb 22, 2024
ca9eb32
x
Feb 22, 2024
49b0e6a
qwen
vyokky Feb 23, 2024
9ff3868
arxiv
vyokky Feb 23, 2024
eef206e
arxiv
vyokky Feb 23, 2024
2b7a353
arxiv
vyokky Feb 24, 2024
b452fc3
Add config file to ignore, and added a template config file for user …
Feb 26, 2024
f8d8375
Add config file to ignore, and added a template config file for user …
Feb 26, 2024
2bbc799
Update dict format in app_selection.yaml
Feb 26, 2024
82185aa
Update dict format in app_selection.yaml
Feb 26, 2024
27b2949
Update configuration of API base URL for AOAI
Feb 26, 2024
1915b77
Update configuration of API base URL for AOAI
Feb 26, 2024
59b65cb
Merge pull request #12 from kangyu/pre-release
vyokky Feb 26, 2024
31ca841
Merge pull request #12 from kangyu/pre-release
vyokky Feb 26, 2024
71a4261
Remove obsolete config.yaml file
Feb 27, 2024
95ce456
Remove obsolete config.yaml file
Feb 27, 2024
434bc94
help doc processing
vyokky Mar 2, 2024
41b4c34
help doc processing
vyokky Mar 2, 2024
c690a42
help doc build
vyokky Mar 2, 2024
edc0a7a
help doc build
vyokky Mar 2, 2024
edc8439
help doc build
vyokky Mar 2, 2024
83373ce
help doc build
vyokky Mar 2, 2024
1ecf086
help doc build
vyokky Mar 2, 2024
0ebe839
help doc build
vyokky Mar 2, 2024
c0617c8
web retriever
vyokky Mar 3, 2024
bf45ace
web retriever
vyokky Mar 3, 2024
7243570
add retriever
vyokky Mar 3, 2024
4dfa683
add retriever
vyokky Mar 3, 2024
677fc06
get config
vyokky Mar 3, 2024
bc813e4
get config
vyokky Mar 3, 2024
58a1fa0
get config
vyokky Mar 3, 2024
b3dcb34
get config
vyokky Mar 3, 2024
edf5507
get config
vyokky Mar 3, 2024
289042e
get config
vyokky Mar 3, 2024
2cc4814
ignore updating config
vyokky Mar 3, 2024
b843be4
ignore updating config
vyokky Mar 3, 2024
ce77ffd
ignore updating config
vyokky Mar 3, 2024
4afbbde
ignore updating config
vyokky Mar 3, 2024
ee83b33
ignore updating config
vyokky Mar 3, 2024
6fab7a5
ignore updating config
vyokky Mar 3, 2024
ca06bb7
ignore updating config
vyokky Mar 3, 2024
848123c
ignore updating config
vyokky Mar 3, 2024
009b339
ignore updating config
vyokky Mar 3, 2024
4cea066
ignore updating config
vyokky Mar 3, 2024
ee03751
ignore updating config
vyokky Mar 3, 2024
34b7a86
ignore updating config
vyokky Mar 3, 2024
c24048a
ignore updating config
vyokky Mar 3, 2024
ef11790
ignore updating config
vyokky Mar 3, 2024
5726239
ignore updating config
vyokky Mar 3, 2024
dde96cb
ignore updating config
vyokky Mar 3, 2024
ce30854
ignore updating config
vyokky Mar 3, 2024
58c977f
ignore updating config
vyokky Mar 3, 2024
f921a86
rag dev
vyokky Mar 4, 2024
2d830b4
rag dev
vyokky Mar 4, 2024
39c8797
bug fixed
vyokky Mar 4, 2024
0f513ce
bug fixed
vyokky Mar 4, 2024
b5f51af
bug fixed
vyokky Mar 4, 2024
87934c3
bug fixed
vyokky Mar 4, 2024
4079159
rag dev
vyokky Mar 5, 2024
48e6cd2
rag dev
vyokky Mar 5, 2024
3cf2456
LLM error handling
vyokky Mar 5, 2024
405386c
LLM error handling
vyokky Mar 5, 2024
fc3d1b3
config template
vyokky Mar 5, 2024
f1a3d3f
config template
vyokky Mar 5, 2024
1e95a6d
log fixed
vyokky Mar 5, 2024
149a3c1
log fixed
vyokky Mar 5, 2024
540a322
update AAD login
Mac0q Mar 6, 2024
1d28b1f
update AAD login
Mac0q Mar 6, 2024
a575c57
TYPO
Mac0q Mar 6, 2024
bfd0bf6
TYPO
Mac0q Mar 6, 2024
bbbc56b
MS Todo adaption
vyokky Mar 6, 2024
6d1bfb5
MS Todo adaption
vyokky Mar 6, 2024
ae1fc44
Merge branch 'pre-release' into pre-release
vyokky Mar 6, 2024
c480b5a
Merge branch 'pre-release' into pre-release
vyokky Mar 6, 2024
7e0238b
Merge pull request #16 from Mac0q/pre-release
vyokky Mar 6, 2024
050f86b
Merge pull request #16 from Mac0q/pre-release
vyokky Mar 6, 2024
0ef799e
Merge branch 'pre-release' into develop
vyokky Mar 6, 2024
8c3aba8
Merge branch 'pre-release' into develop
vyokky Mar 6, 2024
a6187d6
api version
vyokky Mar 6, 2024
e47a9f9
api version
vyokky Mar 6, 2024
a6b8b39
fixed comment to prerelease
vyokky Mar 8, 2024
6c00eb6
fixed comment to prerelease
vyokky Mar 8, 2024
5dfd302
fixed comment to prerelease
vyokky Mar 8, 2024
686062a
fixed comment to prerelease
vyokky Mar 8, 2024
bdc33b8
fix bug in aad
vyokky Mar 8, 2024
fe3621d
fix bug in aad
vyokky Mar 8, 2024
dedc133
Merge pull request #14 from microsoft/develop
vyokky Mar 8, 2024
d134e07
readme for offline learner
vyokky Mar 9, 2024
4098a75
readme for offline learner
vyokky Mar 9, 2024
485eee0
readme for offline learner
vyokky Mar 9, 2024
c4fa74b
readme for offline learner
vyokky Mar 9, 2024
e703869
offline indexer example
vyokky Mar 9, 2024
3499d09
offline indexer example
vyokky Mar 9, 2024
43e434d
offline indexer example
vyokky Mar 9, 2024
55c1472
offline indexer example
vyokky Mar 9, 2024
94137bb
formating prompt for examples and apis
vyokky Mar 9, 2024
702943a
formating prompt for examples and apis
vyokky Mar 9, 2024
3ac44d7
formating prompt for examples and apis
vyokky Mar 9, 2024
8607c47
formating prompt for examples and apis
vyokky Mar 9, 2024
90b7c5c
restruct LLM call
vyokky Mar 9, 2024
af72e1d
restruct LLM call
vyokky Mar 9, 2024
9af0d83
add log for agent name and app name
vyokky Mar 11, 2024
75b0820
add log for agent name and app name
vyokky Mar 11, 2024
7a7f02a
fix comment
vyokky Mar 12, 2024
044daa8
fix comment
vyokky Mar 12, 2024
d5e53c7
Merge pull request #19 from microsoft/develop
vyokky Mar 12, 2024
3a4fb57
llm call
vyokky Mar 13, 2024
3875ded
Enable non-visual model
yunhao0204 Mar 12, 2024
9bedae6
delete
vyokky Mar 13, 2024
78aaf47
non-visual refine
vyokky Mar 13, 2024
44ba154
readme
vyokky Mar 13, 2024
181919c
readme
vyokky Mar 13, 2024
880bd6b
readme
vyokky Mar 13, 2024
8b0a05d
readme
vyokky Mar 14, 2024
12c4fef
merge new
vyokky Mar 14, 2024
263b23e
merge new
vyokky Mar 14, 2024
3ce7539
merge new
vyokky Mar 14, 2024
ea6e490
merge new
vyokky Mar 14, 2024
daaac52
merge new
vyokky Mar 14, 2024
589d3a6
merge new
vyokky Mar 14, 2024
a1b907d
merge new
vyokky Mar 14, 2024
c068f9f
Merge pull request #24 from microsoft/vyokky/dev
vyokky Mar 14, 2024
027fe78
OpenAI API
Mac0q Mar 14, 2024
5b11c2f
model list
Mac0q Mar 14, 2024
5e57226
API list
Mac0q Mar 15, 2024
8d5864f
check
Mac0q Mar 15, 2024
e8091e7
backup engine
Mac0q Mar 15, 2024
d420cb4
template update
Mac0q Mar 15, 2024
7c61c0d
backup format
Mac0q Mar 16, 2024
273bebf
log parser & bug fixed
vyokky Mar 17, 2024
070b311
add tag
vyokky Mar 17, 2024
4bbf363
abstract prompter
vyokky Mar 17, 2024
0f174c4
openai template
Mac0q Mar 18, 2024
a84749e
template
Mac0q Mar 18, 2024
03bdfb4
AOAI template
Mac0q Mar 18, 2024
8b4f317
revise prompter
vyokky Mar 18, 2024
359c505
prompter refinement
vyokky Mar 18, 2024
b3bb760
prompter refinement
vyokky Mar 18, 2024
fa58ca8
prompter refinement
vyokky Mar 18, 2024
8d714f5
experience prompter
vyokky Mar 18, 2024
5cc83bc
experience summarizer
vyokky Mar 18, 2024
39e7798
experience summarizer
vyokky Mar 18, 2024
239b8d6
Merge pull request #25 from Mac0q/pre-release
vyokky Mar 19, 2024
29171ab
Merge branch 'pre-release' into vyokky/dev
vyokky Mar 19, 2024
bedb7a3
new llm api
vyokky Mar 19, 2024
55aa435
check in openfile.py
Lenny2Liu Mar 19, 2024
38dd946
db saving
vyokky Mar 19, 2024
268a498
experience to the main flow
vyokky Mar 19, 2024
7c48ae8
add more package
Mar 19, 2024
b3602b6
experience rag
vyokky Mar 19, 2024
46af9fb
bug fixed
vyokky Mar 19, 2024
bd79dfe
delete exp
vyokky Mar 19, 2024
e101fb4
Merge pull request #26 from microsoft/addopenfile
vyokky Mar 19, 2024
549f730
config template
vyokky Mar 19, 2024
5978aff
rm redundant prompts
vyokky Mar 19, 2024
6c0247d
rewrite prompt
vyokky Mar 19, 2024
fb5edbf
add type
vyokky Mar 19, 2024
cbffd5a
rm redundancy
vyokky Mar 19, 2024
ef99b13
rm empty lines between tips
vyokky Mar 19, 2024
abf7af9
Merge branch 'microsoft:pre-release' into pre-release
AL-377 Mar 20, 2024
ca03ab2
file name fixed
vyokky Mar 20, 2024
e6c0f2a
resolve comment
vyokky Mar 21, 2024
15739fc
Merge pull request #28 from microsoft/vyokky/dev
vyokky Mar 21, 2024
aa3d154
Add record processor to parse the user behavior record to json file
yunhao0204 Mar 21, 2024
a01f4f3
Merge pull request #31 from yunhao0204/pre-release
vyokky Mar 21, 2024
063c54e
openai config
Mac0q Mar 21, 2024
71738f6
Merge pull request #32 from Mac0q/pre-release
vyokky Mar 21, 2024
2013790
update
Mar 21, 2024
ae94817
update
Mar 21, 2024
9934a42
Merge branch 'pre-release' of https://github.com/AL-377/UFO into pre-…
Mar 21, 2024
be85a65
fix:prompts of action example
Mar 21, 2024
f13abcf
Merge pull request #33 from AL-377/pre-release
vyokky Mar 21, 2024
2dc2f96
execution module
vyokky Mar 21, 2024
113516d
execution module
vyokky Mar 21, 2024
8f43d9f
execution module
vyokky Mar 21, 2024
5b08db3
Merge pull request #34 from microsoft/vyokky/executor
vyokky Mar 21, 2024
4d5a383
readme
vyokky Mar 24, 2024
b74b192
readme
vyokky Mar 24, 2024
1d6d443
new release
vyokky Mar 25, 2024
c09c8bf
new release
vyokky Mar 25, 2024
9f44d98
new release
vyokky Mar 25, 2024
5a8c88a
new release
vyokky Mar 25, 2024
14c5988
new release
vyokky Mar 25, 2024
7c6b17c
new release
vyokky Mar 25, 2024
9dffb3e
new release
vyokky Mar 25, 2024
185daa2
new release
vyokky Mar 25, 2024
89de7df
Merge pull request #36 from microsoft/vyokky/dev
vyokky Mar 25, 2024
1db2fe4
Merge branch 'main' into pre-release
vyokky Mar 25, 2024
20 changes: 20 additions & 0 deletions .gitignore
@@ -1,8 +1,28 @@
# Ignore login file
*.bin

# Ignore Jupyter Notebook checkpoints
.ipynb_checkpoints
/test/*
/deprecated/*
/test/*.ipynb
/logs/*
__pycache__/
**/__pycache__/
*.pyc

# Ignore the config file
ufo/config/config.yaml
ufo/config/config_llm.yaml


# Ignore the helper files
ufo/rag/app_docs/*
learner/records.json
vectordb/docs/*
vectordb/experience/*

# Don't ignore the example files
!vectordb/docs/example/

.vscode
102 changes: 84 additions & 18 deletions README.md
@@ -33,29 +33,38 @@ Both agents leverage the multi-modal capabilities of GPT-Vision to comprehend th


## 📢 News
- 📅 2024-03-25: **New Release for v0.0.1!** Check out our exciting new features:
1. We now support creating your help documents for each Windows application to become an app expert. Check the [README](./learner/README.md) for more details!
2. UFO now supports RAG from offline documents and online Bing search.
3. You can save the task completion trajectory into its memory for UFO's reference, improving its future success rate!
4. You can customize different GPT models for AppAgent and ActAgent. Text-only models (e.g., GPT-4) are now supported!
- 📅 2024-02-14: Our [technical report](https://arxiv.org/abs/2402.07939) is online!
- 📅 2024-02-10: UFO is released on GitHub🎈. Happy Chinese New Year🐉!


## 🌐 Media Coverage

UFO sightings have garnered attention from various media outlets, including:
- [Microsoft's UFO abducts traditional user interfaces for a smarter Windows experience](https://the-decoder.com/microsofts-ufo-abducts-traditional-user-interfaces-for-a-smarter-windows-experience/)
- [🚀 UFO & GPT-4-V: Sit back and relax, mientras GPT lo hace todo🌌](https://www.linkedin.com/posts/gutierrezfrancois_ai-ufo-microsoft-activity-7176819900399652865-pLoo?utm_source=share&utm_medium=member_desktop)
- [The AI PC - The Future of Computers? - Microsoft UFO](https://www.youtube.com/watch?v=1k4LcffCq3E)
- [下一代Windows系统曝光:基于GPT-4V,Agent跨应用调度,代号UFO](https://www.qbitai.com/2024/02/121048.html)
- [下一代智能版 Windows 要来了?微软推出首个 Windows Agent,命名为 UFO!](https://blog.csdn.net/csdnnews/article/details/136161570)
- [Microsoft発のオープンソース版「UFO」登場! Windowsを自動操縦するAIエージェントを試す](https://internet.watch.impress.co.jp/docs/column/shimizu/1570581.html)
- ...

These sources provide insights into the evolving landscape of technology and the implications of UFO phenomena on various platforms.


## 💥 Highlights

- [x] **First Windows Agent** - UFO is the pioneering agent framework capable of translating user requests in natural language into actionable operations on Windows OS.
- [x] **RAG Enhanced** - UFO is enhanced by Retrieval Augmented Generation (RAG) from heterogeneous sources, including offline help documents and online search engines, to improve its capabilities.
- [x] **Interactive Mode** - UFO facilitates multiple sub-requests from users within the same session, enabling the completion of complex tasks seamlessly.
- [x] **Action Safeguard** - UFO incorporates safeguards to prompt user confirmation for sensitive actions, enhancing security and preventing inadvertent operations.
- [x] **Easy Extension** - UFO offers extensibility, allowing for the integration of additional functionalities and control types to tackle diverse and intricate tasks with ease.



## ✨ Getting Started


@@ -74,26 +83,83 @@ pip install -r requirements.txt
```

### ⚙️ Step 2: Configure the LLMs
Before running UFO, you need to provide your LLM configurations. You can configure `ufo/config/config.yaml` file as follows.
Before running UFO, you need to provide your LLM configurations **individually for AppAgent and ActAgent**. You can create your own config file `ufo/config/config.yaml` by copying `ufo/config/config.yaml.template` and editing the config for **APP_AGENT** and **ACTION_AGENT** as follows:

#### OpenAI
```
API_TYPE: "openai"
OPENAI_API_BASE: "https://api.openai.com/v1/chat/completions" # The base URL for the OpenAI API
OPENAI_API_KEY: "YOUR_API_KEY" # Set the value to the openai key for the llm model
OPENAI_API_MODEL: "GPTV_MODEL_NAME" # The only OpenAI model by now that accepts visual input
```bash
VISUAL_MODE: True, # Whether to use the visual mode
API_TYPE: "openai" , # The API type, "openai" for the OpenAI API.
API_BASE: "https://api.openai.com/v1/chat/completions", # The OpenAI API endpoint.
API_KEY: "sk-", # The OpenAI API key, begin with sk-
API_VERSION: "2024-02-15-preview", # "2024-02-15-preview" by default
API_MODEL: "gpt-4-vision-preview", # The only OpenAI model by now that accepts visual input
```

#### Azure OpenAI (AOAI)
```bash
VISUAL_MODE: True, # Whether to use the visual mode
API_TYPE: "aoai" , # The API type, "aoai" for the Azure OpenAI.
API_BASE: "YOUR_ENDPOINT", # The AOAI API address. Format: https://{your-resource-name}.openai.azure.com
API_KEY: "YOUR_KEY", # The aoai API key
API_VERSION: "2024-02-15-preview", # "2024-02-15-preview" by default
API_MODEL: "gpt-4-vision-preview", # The only OpenAI model by now that accepts visual input
API_DEPLOYMENT_ID: "YOUR_AOAI_DEPLOYMENT", # The deployment id for the AOAI API
```
You can also use a non-visual model (e.g., GPT-4) for each agent by setting `VISUAL_MODE: False` and a proper `API_MODEL` (openai) and `API_DEPLOYMENT_ID` (aoai). You can also optionally set a backup LLM engine in the `BACKUP_AGENT` field in case the above engines fail during inference.


#### Non-Visual Model Configuration
You can utilize non-visual models (e.g., GPT-4) for each agent by configuring the following settings in the config.yaml file:

- ```VISUAL_MODE: False # To enable non-visual mode.```
- Specify the appropriate `API_MODEL` (OpenAI) and `API_DEPLOYMENT_ID` (AOAI) for each agent.

Optionally, you can set a backup language model (LLM) engine in the `BACKUP_AGENT` field to handle cases where the primary engines fail during inference. Ensure you configure these settings accurately to leverage non-visual models effectively.
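
As a rough illustration of the per-agent settings described above, the following sketch checks one agent's settings before UFO is started. It is not part of UFO: the function name `check_agent_config`, the choice of which keys count as required, and the restriction to the two API types shown in this section are all assumptions made for the example.

```python
from typing import Any, Dict, List

# Assumption: these are the keys shown in the templates above; UFO itself
# may treat some of them as optional.
REQUIRED_KEYS = ["VISUAL_MODE", "API_TYPE", "API_BASE", "API_KEY", "API_VERSION", "API_MODEL"]


def check_agent_config(name: str, cfg: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in one agent's settings."""
    problems = [f"{name}: missing {key}" for key in REQUIRED_KEYS if key not in cfg]
    api_type = cfg.get("API_TYPE")
    # Assumption: only the two API types documented in this section are checked.
    if api_type not in ("openai", "aoai"):
        problems.append(f"{name}: API_TYPE must be 'openai' or 'aoai', got {api_type!r}")
    # The AOAI template additionally lists a deployment id.
    if api_type == "aoai" and not cfg.get("API_DEPLOYMENT_ID"):
        problems.append(f"{name}: API_DEPLOYMENT_ID is required for aoai")
    return problems


# Hypothetical settings mirroring the templates above (all values are placeholders).
config = {
    "APP_AGENT": {
        "VISUAL_MODE": True,
        "API_TYPE": "openai",
        "API_BASE": "https://api.openai.com/v1/chat/completions",
        "API_KEY": "sk-",
        "API_VERSION": "2024-02-15-preview",
        "API_MODEL": "gpt-4-vision-preview",
    },
    "ACTION_AGENT": {
        "VISUAL_MODE": False,  # text-only model for this agent
        "API_TYPE": "aoai",
        "API_BASE": "YOUR_ENDPOINT",
        "API_KEY": "YOUR_KEY",
        "API_VERSION": "2024-02-15-preview",
        "API_MODEL": "gpt-4",
        # API_DEPLOYMENT_ID deliberately left out to show a reported problem
    },
}

for agent_name in ("APP_AGENT", "ACTION_AGENT"):
    for problem in check_agent_config(agent_name, config[agent_name]):
        print(problem)
```

Checking each agent separately matches the way the two agents are configured individually, so a mistake in one block does not hide a mistake in the other.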


### 📔 Step 3: Additional Setting for RAG (optional).
If you want to enhance UFO's ability with external knowledge, you can optionally configure it with an external database for retrieval augmented generation (RAG) in the `ufo/config/config.yaml` file.

#### RAG from Offline Help Document
Before enabling this function, you need to create an offline indexer for your help document. Please refer to the [README](./learner/README.md) to learn how to create an offline vectored database for retrieval. You can enable this function by setting the following configuration:
```bash
## RAG Configuration for the offline docs
RAG_OFFLINE_DOCS: True # Whether to use the offline RAG.
RAG_OFFLINE_DOCS_RETRIEVED_TOPK: 1 # The topk for the offline retrieved documents
```
API_TYPE: "aoai"
OPENAI_API_BASE: "YOUR_ENDPOINT" # The AOAI API address. Format: https://{your-resource-name}.openai.azure.com/openai/deployments/{deployment-id}/chat/completions?api-version={api-version}
OPENAI_API_KEY: "YOUR_API_KEY" # Set the value to the openai key for the llm model
OPENAI_API_MODEL: "GPTV_MODEL_NAME" # The only OpenAI model by now that accepts visual input
Adjust `RAG_OFFLINE_DOCS_RETRIEVED_TOPK` to optimize performance.


#### RAG from Online Bing Search Engine
Enhance UFO's ability by utilizing the most up-to-date online search results! To use this function, you need to obtain a Bing search API key. Activate this feature by setting the following configuration:
```bash
## RAG Configuration for the Bing search
BING_API_KEY: "YOUR_BING_SEARCH_API_KEY" # The Bing search API key
RAG_ONLINE_SEARCH: True # Whether to use the online search for the RAG.
RAG_ONLINE_SEARCH_TOPK: 5 # The topk for the online search
RAG_ONLINE_RETRIEVED_TOPK: 1 # The topk for the online retrieved documents
```
Adjust `RAG_ONLINE_SEARCH_TOPK` and `RAG_ONLINE_RETRIEVED_TOPK` to get better performance.
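
To make the interplay of the two top-k settings more concrete, here is a minimal sketch of a two-stage selection: first keep the top `search_topk` search hits, then keep the `retrieved_topk` most relevant of those. This is an assumption about how the two values relate, not UFO's actual code; the word-overlap score is a toy stand-in for a real similarity measure, and the function names and sample results are invented for the example.

```python
from typing import List


def overlap_score(query: str, text: str) -> float:
    """Fraction of query words that also appear in the text (toy relevance score)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def select_context(query: str, search_results: List[str],
                   search_topk: int, retrieved_topk: int) -> List[str]:
    """Keep the first search_topk hits, then the retrieved_topk best-scoring ones."""
    candidates = search_results[:search_topk]
    ranked = sorted(candidates, key=lambda text: overlap_score(query, text), reverse=True)
    return ranked[:retrieved_topk]


# Hypothetical search hits, in the order a search engine might return them.
results = [
    "Insert a table in Word",
    "How to change the slide layout in PowerPoint",
    "Add a new slide in PowerPoint and change its theme",
    "Excel pivot table basics",
    "Send an email in Outlook",
    "PowerPoint change slide background color",
]

print(select_context("change slide layout", results, search_topk=5, retrieved_topk=1))
```

In this reading, a larger `search_topk` widens the candidate pool, while a larger `retrieved_topk` passes more text to the model, so the two can be tuned independently.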


#### RAG from Self-Demonstration
Save task completion trajectories into UFO's memory for future reference. This can improve its future success rates based on its previous experiences!

After completing a task, you'll see the following message:
```
Would you like to save the current conversation flow for future reference by the agent?
[Y] for yes, any other key for no.
```
Press `Y` to save it into its memory and enable memory retrieval via the following configuration:
```bash
## RAG Configuration for experience
RAG_EXPERIENCE: True # Whether to use the RAG from its self-experience.
RAG_EXPERIENCE_RETRIEVED_TOPK: 5 # The topk for the retrieved experience
```



### 🎉 Step 3: Start UFO
### 🎉 Step 4: Start UFO

#### ⌨️ You can execute the following on your Windows command Line (CLI):

@@ -119,7 +185,7 @@ Please enter your request to be completed🛸:
- The GPT-V accepts screenshots of your desktop and application GUI as input. Please ensure that no sensitive or confidential information is visible or captured during the execution process. For further information, refer to [DISCLAIMER.md](./DISCLAIMER.md).


### Step 4 🎥: Execution Logs
### Step 5 🎥: Execution Logs

You can find the screenshots taken and request & response logs in the following folder:
```
@@ -178,11 +244,11 @@ If you use UFO in your research, please cite our paper:
```

## 📝 Todo List
- ⏩ Documentation.
- ⏩ Support local host GUI interaction model.
- Support more control using Win32 API.
- ⏩ RAG enhanced UFO.
- Chatbox GUI for UFO.
- [x] RAG enhanced UFO.
- [ ] Documentation.
- [ ] Support local host GUI interaction model.
- [ ] Support more control using Win32 API.
- [ ] Chatbox GUI for UFO.



32 changes: 32 additions & 0 deletions learner/README.md
@@ -0,0 +1,32 @@

# Enhancing UFO with RAG using Offline Help Documents


## How to Prepare Your Help Documents ❓

### Step 1: Prepare Your Help Doc and Metadata

UFO currently supports processing help documents in XML format, as this is the default format for official help documents of Microsoft apps. More formats will be supported in the future.

You can write a dedicated document for a specific task of an app in a file named, for example, `task.xml`. Note that it should be accompanied by a metadata file with the same prefix, but with the `.meta` extension, i.e., `task.xml.meta`. This metadata file should have a `title` describing the task at a high level and a `Content-Summary` field summarizing the content of the help document. These two files are used for similarity search with user requests, so please write them carefully. The [ppt-copilot.xml](./doc_example/ppt-copilot.xml) and [ppt-copilot.xml.meta](./doc_example/ppt-copilot.xml.meta) are examples of a help document and its metadata.

### Step 2: Prepare Your Help Document Set

Once you have all help documents and metadata ready, put all of them into a folder. There can be sub-folders for the help documents, but please ensure that each help document and its corresponding metadata **are placed in the same directory**.
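
Before building an index, it may help to confirm that every help document really has its metadata file next to it. The sketch below is a hypothetical helper, not part of the learner package: the function name `find_docs_missing_metadata` is invented, and the demo files contain placeholder text only.

```python
import tempfile
from pathlib import Path
from typing import List


def find_docs_missing_metadata(docs_root: str) -> List[Path]:
    """Return every .xml help document without a sibling <name>.xml.meta file."""
    missing = []
    # rglob also walks sub-folders, which the document set is allowed to have.
    for doc in sorted(Path(docs_root).rglob("*.xml")):
        meta = doc.with_name(doc.name + ".meta")  # e.g. task.xml -> task.xml.meta
        if doc.is_file() and not meta.is_file():
            missing.append(doc)
    return missing


# Small demo on a temporary folder; file contents are placeholders.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root) / "powerpoint"
    folder.mkdir()
    (folder / "task.xml").write_text("<doc/>")
    (folder / "task.xml.meta").write_text("placeholder metadata")
    (folder / "orphan.xml").write_text("<doc/>")  # no metadata on purpose
    for doc in find_docs_missing_metadata(root):
        print(f"missing metadata for: {doc.name}")
```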


## How to Create an Indexer for Your Help Document Set ❓


Once you have all documents ready in a folder named `path_of_the_docs`, you can easily create an offline indexer to support RAG for UFO. Follow these steps:

```console
# assume you are in the cloned UFO folder
python -m learner --app <app_name> --docs <path_of_the_docs>
```
Replace `app_name` with the name of the application, such as PowerPoint or WeChat.
> Note: Ensure the `app_name` is accurately defined as it is used to match the offline indexer in online RAG.

Replace `path_of_the_docs` with the full path to the folder containing all your documents.

This command will create an offline indexer for all documents in the `path_of_the_docs` folder using Faiss and embedding with sentence transformer (more embeddings will be supported soon). The created index by default will be placed [here](../vectordb/docs/).
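
The idea behind the indexer can be sketched in a few lines. The example below is a toy stand-in that uses only the standard library: bag-of-words vectors replace sentence-transformer embeddings, and a plain sorted list replaces the Faiss index. The file names and summaries are invented, and none of the function names come from the learner package.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts (stand-in for a sentence embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)  # Counter yields 0 for absent words
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(summaries: Dict[str, str]) -> List[Tuple[str, Counter]]:
    """Embed each document summary once, ahead of time (the 'offline' step)."""
    return [(name, embed(summary)) for name, summary in summaries.items()]


def search(index: List[Tuple[str, Counter]], query: str, topk: int = 1) -> List[str]:
    """Return the names of the topk documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [name for name, _ in ranked[:topk]]


# Hypothetical documents, keyed by file name, with their metadata summaries.
index = build_index({
    "slides-create.xml": "Create a presentation with a design template in PowerPoint",
    "word-table.xml": "Insert and format a table in Word",
})
print(search(index, "insert a table", topk=1))
```

Because matching works on the text stored with each document, the wording of the metadata directly affects which document is retrieved for a given request.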
2 changes: 2 additions & 0 deletions learner/__init__.py
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
8 changes: 8 additions & 0 deletions learner/__main__.py
@@ -0,0 +1,8 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

from . import learn

if __name__ == "__main__":
    # Execute the main script
    learn.main()
39 changes: 39 additions & 0 deletions learner/basic.py
@@ -0,0 +1,39 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from . import utils


class BasicDocumentLoader:
    """
    A class to load documents from a directory, filtered by a given extension list.
    """

    def __init__(self, extensions: str = None, directory: str = None):
        """
        Create a new BasicDocumentLoader.
        :param extensions: The extensions to load.
        :param directory: The directory to load from.
        """
        self.extensions = extensions
        self.directory = directory


    def load_file_name(self):
        """
        Load the documents from self.directory that match self.extensions.
        :return: The list of loaded documents.
        """
        return utils.find_files_with_extension(self.directory, self.extensions)


    def construct_document_list(self):
        """
        Load the metadata from self.directory.
        Not implemented in this base class.
        :return: The list of metadata for the loaded documents.
        """
        pass



